Notes

  • Sure: 1) to match any character, use [^] or [\s\S]; 2) branch reset groups are not supported, so keep your current approach with a non-capturing group and rearrange the group values after each match.
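A rough sketch of point 2, assuming Python's standard `re` module as a stand-in for an engine without branch reset. The names `PATTERN`, `parse_pairs`, and `sample` are made up for illustration, and `[\s\S]` replaces the "any character" group. Each alternative keeps its own group numbers, and the values are merged into one (key, delimiter, value) shape after the match.

```python
import re

# Same two alternatives as the branch-reset pattern, but with ordinary
# numbering: groups 1-3 belong to the delimited form, groups 4-5 to the
# non-delimited form. [\s\S] stands in for "any character".
PATTERN = re.compile(
    r"([a-z]+):([A-Z]*:)([\s\S]*)^:\2"        # key, delimiter, value
    r"|([a-z]+):((?:[\s\S](?!^[a-z]+:))+)",   # key, value
    re.MULTILINE,                             # ^ must match at line starts
)


def parse_pairs(text):
    """Return (key, delimiter, value) tuples, merging both alternatives."""
    pairs = []
    for m in PATTERN.finditer(text):
        if m.group(1) is not None:            # delimited alternative matched
            pairs.append((m.group(1), m.group(2), m.group(3)))
        else:                                 # non-delimited alternative
            pairs.append((m.group(4), "", m.group(5)))
    return pairs


# Hypothetical input: two plain pairs around one delimited block.
sample = "name: Alice\nbody:EOF:\nline one\nline two\n:EOF:\ncity: Paris\n"
for pair in parse_pairs(sample):
    print(pair)
```

The values come back with their surrounding white space intact; trimming is a separate step.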

This works:

(?|(?:([a-z]+):([A-Z]*:)((.|\r|\n)*)^:\2)|([a-z]+):()((?:(?:.|\n|\r)(?!^[a-z]+:))+))

This is the expanded version, and it also works.

It still needs:

  • White space removed from values
  • Key regex made complete
  • Delimiter regex possibly allowing different characters
  • Port to work on all regex engines

(?| # Match everything,

(?: # Non-Capture Match delimited key/value pairs, 

    ([a-z]+): # Key, Capture Group #1
        ([A-Z]*:) # Delimiter, Capture Group #2
    ( # Match everything after the delimiter
        (.|\r|\n)*  # Value, Capture Group #3
    )
    ^:\2   # Delimiter, ending value capture

)

| (?: # Match non-delimited values

    ([a-z]+): # Key, Capture Group #1
    () # Capture Group #2, absence of delimiter
    ( # Value, Capture Group #3
        (?:
            (?:.|\n|\r)
            (?!^[a-z]+:) # look ahead to make sure there is NOT a key
        )+

    )
)

)

This is a work-in-progress

Working on:

  • DONE-- Replace (.|\r|\n)* with (?s).*

  • White space removed from values

    • NEW LINES at beginning are removed
    • Need to remove new lines at the end
  • Key regex made complete

  • Delimiter regex possibly allow different characters

(?s) (?| # Balance the capture groups

    (?: # Non-Capture Match delimited key/value pairs,

      ([a-z]+): # Key, Capture Group #1
          ([A-Z]*:) # Delimiter, Capture Group #2
      \s*
      ( # Match everything after the delimiter, Capture Group #3
          (?:
              .  # Value, Capture Group #3
              (?!\s+:\2) # Do NOT match trailing white space
          )*
      )
      \s* # Match the trailing white space
      ^:\2   # Delimiter, ending value capture
    

    )

| (?: # Match non-delimited values

    ([a-z]+): # Key, Capture Group #1
    () # Capture Group #2, absence of delimiter
    \s*
    ( # Value, Capture Group #3
        (?:
            .
            (?!^[a-z]+:) # look ahead to make sure there is NOT a key
        )+

    )
)

)
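One possible way to tackle the "remove new lines at the end" item, sketched in Python's `re` for the delimited branch only (Python has no branch reset). The assumption here is that the negative lookahead can sit in front of the dot instead of after it, so the value stops right before the white space that precedes the closing delimiter. `DELIMITED` and `sample` are hypothetical names, not part of the pattern above.

```python
import re

# Delimited branch only. The lookahead is tested before each character is
# consumed, so the value ends before the trailing white space run.
DELIMITED = re.compile(
    r"""
    ^([a-z]+):                # key, group 1
    ([A-Z]*:)                 # delimiter, group 2
    \s*                       # leading white space, not captured
    ((?:(?!\s+:\2).)*)        # value, group 3
    \s*                       # trailing white space, not captured
    ^:\2                      # closing delimiter at the start of a line
    """,
    re.VERBOSE | re.MULTILINE | re.DOTALL,
)

# Hypothetical input with blank lines on both sides of the value.
sample = "body:EOF:\n\nline one\nline two\n\n:EOF:\n"
match = DELIMITED.search(sample)
if match:
    print(repr(match.group(3)))
```

Interior newlines stay in the value; only the white space next to the delimiters is left out of group 3.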

Non-delimited values only

([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:(?![A-Z]*:)))+)
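A quick way to try this pattern, sketched with Python's `re` module since it needs no branch reset. It assumes multiline mode so that `^` matches at line starts. `NON_DELIMITED`, `non_delimited_pairs`, and `sample` are illustrative names, and the `strip()` call is a post-match shortcut for the white space item, not part of the regex.

```python
import re

# The pattern is copied unchanged; MULTILINE lets ^ match after each newline.
NON_DELIMITED = re.compile(
    r"([a-z]+):((?:(?:.|\n|\r)(?!^[a-z]+:(?![A-Z]*:)))+)",
    re.MULTILINE,
)


def non_delimited_pairs(text):
    """Return (key, value) tuples with surrounding white space stripped."""
    return [
        (m.group(1), m.group(2).strip())
        for m in NON_DELIMITED.finditer(text)
    ]


# Hypothetical input containing only non-delimited pairs.
sample = "name: Alice\ncity: Paris\n"
print(non_delimited_pairs(sample))
```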