Confusing behavior of a capturing group in a positive lookbehind in a Java regex with Pattern.matcher
The following issue is observed only in Java and not in other regex flavors (e.g. PCRE).
I have the following regex: (?:(?<=([A-Za-z\d]))|\b)(MyString). There’s a capturing group on [A-Za-z\d] in the lookbehind.
And I’m trying to match the following string: string.MyString (through Pattern.compile(regex).matcher(input); to be precise, I’m calling replaceAll).
On PCRE, I will match MyString, and it will be the second group in the match. On Java, however, I will match the g in string as group 1, and MyString as group 2.
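A minimal sketch to reproduce this, assuming a plain find() is enough to show the effect (replaceAll uses the same matching underneath). The class name LookbehindCaptureDemo and the helper describe are made up for illustration. It prints every group together with its start and end offsets, which makes it easy to see whether group 1 actually ends where group 2 begins. What group 1 reports for string.MyString may depend on the JDK version, so no expected output is given for that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindCaptureDemo {
    // The regex from the question, escaped for a Java string literal.
    static final Pattern PATTERN =
            Pattern.compile("(?:(?<=([A-Za-z\\d]))|\\b)(MyString)");

    // Runs a single find() and lists each group with its [start, end) offsets.
    // An unmatched group shows up as "null [-1, -1)".
    static String describe(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            return "no match\n";
        }
        StringBuilder sb = new StringBuilder();
        for (int g = 0; g <= m.groupCount(); g++) {
            sb.append("group ").append(g).append(" = ").append(m.group(g))
              .append(" [").append(m.start(g)).append(", ")
              .append(m.end(g)).append(")\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("string.MyString:");
        System.out.print(describe("string.MyString"));
        System.out.println("stringMyString:");
        System.out.print(describe("stringMyString"));
    }
}
```

In string.MyString the match itself starts at offset 7, while the g sits at offsets 5 to 6, so a group 1 reported there cannot be the character directly before MyString.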
- Why does Java do that? To me this regex implies that a character matching [A-Za-z\d] should only be matched if it directly precedes MyString, which is not the case here.
- How can I avoid that and not match this g? I want to keep the capturing group in case I have to match a string like stringMyString, in which case I do need that g as group 1.
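One possible way to guard against this without changing the regex, offered as a sketch rather than a definitive fix: only trust group 1 when it ends exactly where group 2 starts. The names AdjacentCaptureCheck, precedingChar and firstPrecedingChar are hypothetical helpers, not part of java.util.regex. The check relies only on Matcher.group, Matcher.start and Matcher.end, so it should behave the same whether or not a given JDK reports the extra g.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AdjacentCaptureCheck {
    // Same regex as in the question, escaped for a Java string literal.
    static final Pattern PATTERN =
            Pattern.compile("(?:(?<=([A-Za-z\\d]))|\\b)(MyString)");

    // Returns group 1 only if it directly precedes group 2; otherwise null.
    // Must be called after a successful find() or matches().
    static String precedingChar(Matcher m) {
        if (m.group(1) != null && m.end(1) == m.start(2)) {
            return m.group(1);
        }
        return null;
    }

    // Convenience wrapper: checks the first match in the input, if any.
    static String firstPrecedingChar(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? precedingChar(m) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstPrecedingChar("string.MyString")); // prints "null"
        System.out.println(firstPrecedingChar("stringMyString"));  // prints "g"
    }
}
```

With replaceAll, the same adjacency test could be applied inside a find() loop using appendReplacement and appendTail, instead of referring to $1 directly in the replacement string.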
