rof File Format
STATUS: Likely trashed
I couldn't get the regex-only parsing to work. I needed a config format for bash, but that turned out to be kind of... complex. I thought I could find a really simple, straightforward way to handle key:value pairs in multiple programming languages, but... I dunno. I'm very likely giving up on this project.
a.key: A value
Another_key-whoo: A
multiline
Value
A.Key.again: \ <- Keep leading whitespace
Escape the next line, because it looks like a key
\not.really-a_key: Needs escaping
\\ <- To have an actual backslash
The next key has an empty value, but that's okay.
This key wants to keep trailing whitespace \
last.key:
Under Development
This idea is kind of a flop atm. But I have a crude version that kind of works, detailed just below.
What's working:
The Regex
- ^([a-zA-Z0-9_\-\.]+):(?:\s|\r|\n)*((?:(?:.|\n|\r)(?!^[a-zA-Z0-9_\-\.]+:))+)
- Requires the multi-line & global flags/modifiers
Rules:
- Keys contain a-z, A-Z, 0-9, dash (-), underscore (_), and dot (.)
- Keys are terminated by a colon (:)
- Values contain any characters
- Values can be multi-line
- A value line that matches keypattern: will be parsed as a key & a new value will start
- White-space is trimmed from the beginning of a value
- White-space is NOT trimmed from the end of a value
  - I desperately want to fix this
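To make the rules above concrete, here is a minimal JavaScript sketch that runs the regex with the m and g flags over a small sample. The helper name parseRof and the sample text are made up for illustration; this is just one way to wire it up, not a finished parser.

```javascript
// The README's regex, with the multi-line (m) and global (g) flags it requires.
const ROF = /^([a-zA-Z0-9_\-\.]+):(?:\s|\r|\n)*((?:(?:.|\n|\r)(?!^[a-zA-Z0-9_\-\.]+:))+)/gm;

// parseRof is a hypothetical helper name, not part of any library.
function parseRof(text) {
  const out = {};
  for (const [, key, value] of text.matchAll(ROF)) {
    out[key] = value; // $1 is the key, $2 is the (untrimmed) value
  }
  return out;
}

const sample = [
  "a.key: A value",
  "Another_key-whoo: A",
  "multiline",
  "Value   ", // trailing spaces on purpose
  "b.key: last",
].join("\n");

const parsed = parseRof(sample);
console.log(JSON.stringify(parsed["Another_key-whoo"]));
// → "A\nmultiline\nValue   "  (leading space trimmed, trailing spaces kept)
```

The newline right before the next key is left out of the value (the lookahead rejects it), but any other trailing white-space survives, which is the problem noted in the last rule.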
Decompressed version of the regex
^([a-zA-Z0-9_\-\.]+) # key
:
(?:\s|\r|\n)*
((?:
(?:.|\n|\r) # characters we want
(?!^[a-zA-Z0-9_\-\.]+:) # But NOT if those characters make up a key
)+)
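JavaScript has no free-spacing flag for writing a regex in this commented layout, so a rough equivalent is to join fragments together. This is only a sketch; KEY_CHARS, regexSource and rofRegex are illustrative names.

```javascript
// Emulate the commented, "decompressed" layout by joining String.raw
// fragments. All names here are illustrative.
const KEY_CHARS = String.raw`[a-zA-Z0-9_\-\.]+`;

const regexSource = [
  String.raw`^(${KEY_CHARS})`,    // key
  String.raw`:`,
  String.raw`(?:\s|\r|\n)*`,      // white-space after the colon, not captured
  String.raw`((?:`,
  String.raw`(?:.|\n|\r)`,        // characters we want
  String.raw`(?!^${KEY_CHARS}:)`, // but NOT if a key starts right after them
  String.raw`)+)`,
].join("");

// Same flags the compact version needs: global + multi-line.
const rofRegex = new RegExp(regexSource, "gm");

const found = [..."k: v\nj: w".matchAll(rofRegex)].map((m) => [m[1], m[2]]);
console.log(found); // → [ [ 'k', 'v' ], [ 'j', 'w' ] ]
```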
TODO
- Refine the regex so all keys are $1 & all values are $2
- Trim whitespace surrounding values
- Expand the keys to allow for ([a-zA-Z\-\_0-9\.]+) & possibly other characters.
  - I'm testing with just a-z keys because that's wayyyy simpler
- Create multi-lingual examples, at least: bash, PHP, javascript (because I know how to use those lol)
- And never use (.|\r|\n)*, use (?s).*
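A few of these TODO items could be tackled together. The sketch below is one possible shape, not a tested solution: it uses JavaScript's s flag in the role of (?s), a lazy value, and a lookahead for "next key or end of input" so the value comes out trimmed on both sides. REFINED and parseTrimmed are hypothetical names.

```javascript
// A sketch only, assuming the full key character set from the Rules.
// Flags: g (all matches), m (^ anchors at line starts), s (. matches
// newlines, replacing the (.|\r|\n) workaround).
const REFINED = /^([a-zA-Z0-9_.-]+):\s*(.*?)\s*(?=^[a-zA-Z0-9_.-]+:|(?![\s\S]))/gms;

// parseTrimmed is a hypothetical helper name.
function parseTrimmed(text) {
  // $1 is always the key, $2 is always the value (possibly empty).
  return [...text.matchAll(REFINED)].map((m) => [m[1], m[2]]);
}

const trimmedPairs = parseTrimmed(
  "a.key: A value  \nAnother_key-whoo: A\nmultiline\nValue\nlast.key:\n"
);
console.log(trimmedPairs);
// → [ [ 'a.key', 'A value' ],
//     [ 'Another_key-whoo', 'A\nmultiline\nValue' ],
//     [ 'last.key', '' ] ]
```

Because the value is lazy and the surrounding \s* runs sit outside the capture group, an empty value like last.key: comes out as an empty string. The escaping rules from the sample file (leading backslash and so on) are not handled here.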
Notes & Wishful thinking
The target format is
key: value 1
nightmare:DELIM:
notakey:
obviously not a key
notakey:
:DELIM:
abc: value 2
new line
anotherkey:: value
nostring: on this one
::
Which would yield these key/value pairs (values indented under their key):
key
    value 1
nightmare
    notakey:
    obviously not a key
    notakey:
abc
    value 2
    new line
anotherkey
    value
    nostring: on this one
What is working
- Correctly matches non-delimited keys & values
  - ([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:))+)
  - ([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:(?![A-Z]*:)))+)
- Correctly matches delimited keys & values
  - ([a-z]+):([A-Z]*:)((.|\r|\n)*)^:\2
- Matches everything correctly, BUT references are not right
  - (?:(?:([a-z]+):([A-Z]*:)((.|\r|\n)*)^:\2)|([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:))+))
  - The 'list' feature on regexr, using ($1|$5) = --$3 || $6 --\n shows everything with clear differentiation between delim & non-delim
    - (?:(?:([a-z]+):([A-Z]*:)((.|\r|\n)*)^:\2)|([a-z]+):()((?:(?:.|\n|\r)(?!^[a-z]+:))+)) to make it 1/5 & 3/7
    - # $1$5\n$3$6\n shows everything cleanly
- Matches everything and gives me victory:
  - (?|(?:([a-z]+):([A-Z]*:)((.|\r|\n)*)^:\2)|([a-z]+):()((?:(?:.|\n|\r)(?!^[a-z]+:))+))
  - $1 = $3\n\n does a nice print of it
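The (?|...) branch-reset group is a PCRE feature, and JavaScript's regex engine does not have it, so a JavaScript version has to merge the two alternatives by hand. Here is a rough sketch of that, run against the target format from the notes above. It differs from the patterns listed here in a few assumed ways: the key is anchored with ^, the delimited body is lazy, and values are trimmed in code. ENTRY and parseTarget are hypothetical names.

```javascript
// Branch 1: delimited   key:DELIM: ... :DELIM:   (groups 1, 2, 3)
// Branch 2: plain       key: value               (groups 4, 5)
const ENTRY = /^([a-z]+):([A-Z]*:)([\s\S]*?)^:\2|^([a-z]+):((?:[\s\S](?!^[a-z]+:))+)/gm;

// parseTarget is a hypothetical helper name.
function parseTarget(text) {
  const pairs = [];
  for (const m of text.matchAll(ENTRY)) {
    // Groups from the branch that did not match are undefined,
    // so ?? picks whichever branch actually matched.
    const key = m[1] ?? m[4];
    const value = m[3] ?? m[5];
    pairs.push([key, value.trim()]);
  }
  return pairs;
}

const targetSample = [
  "key: value 1",
  "nightmare:DELIM:",
  "notakey:",
  "obviously not a key",
  "notakey:",
  ":DELIM:",
  "abc: value 2",
  "new line",
  "anotherkey:: value",
  "nostring: on this one",
  "::",
].join("\n");

const targetPairs = parseTarget(targetSample);
for (const [key, value] of targetPairs) {
  console.log(`${key} = ${value}\n`); // same idea as the "$1 = $3\n\n" print
}
```

The delimited branch has to come first in the alternation, otherwise the plain branch would swallow nightmare:DELIM: as an ordinary key.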