Notes
- Sure: 1) to match any character, use [^] or [\s\S]; 2) branch reset groups are not supported, so keep your current approach with non-capturing groups and adjust the group values once you get a match.
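A small check of point 1, sketched in Python (an assumption; the notes do not name the target engine). Python's re module does not accept [^], so only [\s\S] and the (?s) flag are shown, and the sample text is made up.

```python
import re

# Made-up two-line sample text.
text = "first line\nsecond line"

# Without a dot-all flag, "." stops at the newline.
dot_match = re.match(r".*", text).group(0)

# [\s\S] ("white space or not white space") spans the newline as well.
any_match = re.match(r"[\s\S]*", text).group(0)

# The inline (?s) flag makes "." behave the same way.
flag_match = re.match(r"(?s).*", text).group(0)

print(repr(dot_match))
print(repr(any_match))
print(repr(flag_match))
```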
This works:
(?|(?:([a-z]+):([A-Z]*:)((.|\r|\n)*)^:\2)|([a-z]+):()((?:(?:.|\n|\r)(?!^[a-z]+:))+))
This is the expanded (free-spacing) version, and it works.
It still needs:
- White space removed from values
- Key regex made complete
- Delimiter regex possibly allow different characters
- Port to work on all regex engines
(?| # Match everything,
(?: # Non-Capture Match delimited key/value pairs,
([a-z]+): # Key, Capture Group #1
([A-Z]*:) # Delimiter, Capture Group #2
( # Match everything after the delimiter
(.|\r|\n)* # Value, Capture Group #3
)
^:\2 # Delimiter, ending value capture
)
|
(?: # Match non-delimited values
([a-z]+): # Key, Capture Group #1
() # Capture Group #2, absence of delimiter
( # Value, Capture Group #3
(?:
(?:.|\n|\r)
(?!^[a-z]+:) # Look ahead to make sure there is NOT a key
)+
)
)
)
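For engines without branch reset (Python's re module is one), the expanded pattern above could be ported roughly as follows. This is only a sketch under a few assumptions: the names PAIRS and fold_groups and the sample text are invented here, [\s\S] stands in for (.|\r|\n), and multiline mode is switched on so that ^ matches at each line start. The two alternatives get separate group numbers, which are folded back into (key, delimiter, value) after each match.

```python
import re

# Port of the expanded pattern without branch reset: alternative one uses
# groups 1-3, alternative two uses groups 4-5.
PAIRS = re.compile(r"""
    (?:                      # delimited key/value pair
        ([a-z]+):            # key, group 1
        ([A-Z]*:)            # delimiter, group 2
        ([\s\S]*)            # value, group 3 (any character, greedy)
        ^:\2                 # closing delimiter at the start of a line
    )
  |
    (?:                      # non-delimited value
        ([a-z]+):            # key, group 4
        ((?:[\s\S](?!^[a-z]+:))+)   # value, group 5, stops before the next key
    )
""", re.MULTILINE | re.VERBOSE)


def fold_groups(match):
    """Return (key, delimiter, value) no matter which alternative matched."""
    if match.group(1) is not None:
        return match.group(1), match.group(2), match.group(3)
    # Non-delimited pair: report an empty delimiter, like the () group does.
    return match.group(4), "", match.group(5)


# Hypothetical sample input, invented for this sketch.
sample = "name: Ada\nbio:EOT:\nline one\nline two\n:EOT:\ncity: Paris\n"
pairs = [fold_groups(m) for m in PAIRS.finditer(sample)]
for pair in pairs:
    print(pair)
```

The values still carry their surrounding white space here, which is the first item on the list of things this version needs.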
This is a work-in-progress
Working on:
- DONE: Replace (.|\r|\n)* with (?s).*
- White space removed from values
  - New lines at the beginning are removed
  - Need to remove new lines at the end
- Key regex made complete
- Delimiter regex possibly allow different characters
(?s)
(?| # Balance the capture groups
(?: # Non-Capture Match delimited key/value pairs,
([a-z]+): # Key, Capture Group #1
([A-Z]*:) # Delimiter, Capture Group #2
\s*
( # Match everything after the delimiter, Capture Group #3
(?:
. # Value, Capture Group #3
(?!\s+:\2) # Do NOT match trailing white space
)*
)
\s* # Match the trailing white space
^:\2 # Delimiter, ending value capture
)
|
(?: # Match non-delimited values
([a-z]+): # Key, Capture Group #1
() # Capture Group #2, absence of delimiter
\s*
( # Value, Capture Group #3
(?:
.
(?!^[a-z]+:) # Look ahead to make sure there is NOT a key
)+
)
)
)
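One possible way to handle the pending "remove new lines at the end" item is sketched below in Python. It is a variation on the pattern above, not the same pattern. The lookahead is moved in front of the dot, so the value stops before any trailing white space. The non-delimited branch also stops before trailing white space at the end of the input (the \Z part, which is an addition). TRIMMED, fold_groups and the sample text are hypothetical names and data, and the groups are numbered separately because Python's re module has no branch reset.

```python
import re

TRIMMED = re.compile(r"""
    (?:                          # delimited key/value pair
        ([a-z]+):                # key, group 1
        ([A-Z]*:)                # delimiter, group 2
        \s*                      # skip leading white space
        ((?:(?!\s*^:\2).)*)      # value, group 3, stops before trailing white space
        \s*                      # skip trailing white space
        ^:\2                     # closing delimiter at the start of a line
    )
  |
    (?:                          # non-delimited value
        ([a-z]+):                # key, group 4
        \s*                      # skip leading white space
        ((?:(?!\s*(?:^[a-z]+:|\Z)).)+)   # value, group 5, stops before next key or end
    )
""", re.DOTALL | re.MULTILINE | re.VERBOSE)


def fold_groups(match):
    """Return (key, delimiter, value) no matter which alternative matched."""
    if match.group(1) is not None:
        return match.group(1), match.group(2), match.group(3)
    return match.group(4), "", match.group(5)


# Hypothetical sample with blank lines around the delimited value.
sample = "name: Ada\nbio:EOT:\n\nline one\nline two\n\n:EOT:\ncity: Paris\n"
trimmed = [fold_groups(m) for m in TRIMMED.finditer(sample)]
for item in trimmed:
    print(item)
```

White space inside a value is kept, because the lookahead only fires when the white space runs straight into the closing delimiter, the next key, or the end of the input.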
Non-delimited values only
([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:(?![A-Z]*:)))+)
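A quick way to see what this pattern does, sketched in Python with multiline mode assumed so that ^ matches at each line start. NON_DELIMITED and both sample strings are made up for the check. One thing it shows: because of the inner (?![A-Z]*:), the lookahead only stops at non-delimited keys, so a delimited block that follows gets swallowed into the preceding value.

```python
import re

NON_DELIMITED = re.compile(
    r"([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:(?![A-Z]*:)))+)",
    re.MULTILINE,
)

# Hypothetical inputs: one with plain pairs only, one with a delimited block.
plain = "name: Ada\ncity: Paris\n"
mixed = "name: Ada\nbio:EOT:\nx\n:EOT:\ncity: Paris\n"

plain_pairs = NON_DELIMITED.findall(plain)
mixed_pairs = NON_DELIMITED.findall(mixed)

print(plain_pairs)
print(mixed_pairs)
```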