Identify substrings between delimiters with regex
Why is group 3 empty?
Because the empty string is a match for it, and there is nothing in the pattern after that group, so the empty string suffices for the regex to succeed. Be aware that the trailing ? makes .* lazy. If you had dropped that last ?, making the .* greedy, the third group would contain all remaining characters in that line. That would not be what you want either, because it then captures too much, including all the other _s_ and _e_ delimiters.
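To illustrate the lazy versus greedy behaviour, here is a minimal Python sketch. The original pattern is not shown in this post, so both patterns and the sample input below are assumptions made up for the demonstration, with a trailing group named `after` standing in for your group 3.

```python
import re

# Hypothetical input: "s" opens a section, "e" closes it.
text = "s111e222s333e444"

# Assumed pattern with a lazy trailing group. Nothing follows the group,
# so matching zero characters is enough for the whole regex to succeed.
lazy = re.match(r"s(?P<between>.*?)e(?P<after>.*?)", text)

# Same pattern with a greedy trailing group: it swallows the rest of the
# line, including the later "s" and "e" delimiters.
greedy = re.match(r"s(?P<between>.*?)e(?P<after>.*)", text)

print(repr(lazy.group("after")))    # ''
print(repr(greedy.group("after")))  # '222s333e444'
```

Neither variant isolates the text right after the first _e_, which is why the pattern needs something to match after the group.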
How do I get the text after the ending delimiter “e”?
By:
- repeating the execution of the regex as many times as there are matches. Your programming language is likely to have a function for such repetition. For instance, PHP has preg_match_all; and
- allowing a match (in capture group 1) to be followed by either _s_ or by the end of the input ($).
Any idea how to update the pattern?
Drop the third capture group, as you want successive matches to be captured by the first capture group.
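As a rough sketch of that idea in Python (not necessarily the regex proposed in the original answer), `re.findall` plays the role of PHP's preg_match_all. The pattern, the sample input, and the helper name `text_after_e` are assumptions for illustration. A lookahead is used here so the next _s_ is checked but not consumed; that is one possible design choice, not the only one.

```python
import re

# Assumed pattern: after an "e", lazily capture text (group 1) until the
# next "s" or the end of the input. The lookahead leaves the "s" in place.
AFTER_E = re.compile(r"e(.*?)(?=s|$)")

def text_after_e(line: str) -> list[str]:
    """Return every group-1 capture, one per successive match."""
    return AFTER_E.findall(line)

# Hypothetical input: "s" opens a section, "e" closes it.
sample = "s111e222s333e444"
print(text_after_e(sample))  # ['222', '444']
```

Because the pattern has a single capture group, `findall` returns a flat list of group 1 values, one per match.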
Proposed regex:
