Python regex capture in sliding pairs
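You can use \w or \S in place of ., which is too broad. Capturing the second item inside a lookahead leaves it unconsumed, so it can start the next pair: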
re.findall(r'(\S+)(?= -> (\S+))', r'a -> b -> c')
re.findall(r'(\w+)(?= -> (\w+))', r'a -> b -> c')
Output: [('a', 'b'), ('b', 'c')]
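To see why the lookahead matters, here is a small comparison sketch (the variable names are only illustrative). It runs the same input through a pattern that consumes both items and through the lookahead version shown above:

```python
import re

text = 'a -> b -> c'

# Consuming both sides: the match for ('a', 'b') swallows 'b',
# so 'b' cannot start the next pair.
consumed = re.findall(r'(\w+) -> (\w+)', text)

# Lookahead: only the left item is consumed; the right item is captured
# inside (?=...) and stays available as the start of the next match.
overlapping = re.findall(r'(\w+)(?= -> (\w+))', text)

print(consumed)     # [('a', 'b')]
print(overlapping)  # [('a', 'b'), ('b', 'c')]
```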
Another, more generic, example:
re.findall(r'\b((?:(?!->).)+)\b(?=\s*->\s*\b((?:(?!->).)+)\b)',
'a1 -> b1 b2-b3->c1 c2')
Output: [('a1', 'b1 b2-b3'), ('b1 b2-b3', 'c1 c2')]
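The generic pattern is dense, so here is the same regex spelled out with re.VERBOSE and a comment on each part. The name PAIR is just a label chosen for this sketch. The pattern itself is unchanged:

```python
import re

PAIR = re.compile(r'''
    \b((?:(?!->).)+)\b      # group 1: characters that do not begin "->", ending on a word boundary
    (?=                     # lookahead: peek at what follows without consuming it
        \s*->\s*            # the arrow, with optional surrounding whitespace
        \b((?:(?!->).)+)\b  # group 2: the next item, matched the same way
    )
''', re.VERBOSE)

pairs = PAIR.findall('a1 -> b1 b2-b3->c1 c2')
print(pairs)  # [('a1', 'b1 b2-b3'), ('b1 b2-b3', 'c1 c2')]
```

The tempered dot (?:(?!->).)+ lets an item contain spaces and single hyphens, while the \b anchors trim the whitespace around the arrow.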
Alternatively, as suggested in the comments, here is a solution that is not pure regex: post-process the output of re.split with itertools.pairwise (available since Python 3.10):
import re
from itertools import pairwise
out = list(pairwise(re.split(r'\s*->\s*', 'a1 -> b1 b2-b3->c1 c2')))
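On Python versions before 3.10, where itertools.pairwise is not available, zipping the list with itself shifted by one should give the same pairs. A minimal sketch, where sliding_pairs and its sep parameter are hypothetical names introduced for illustration:

```python
import re

def sliding_pairs(text, sep=r'\s*->\s*'):
    """Split text on the separator regex and return adjacent pairs."""
    parts = re.split(sep, text)
    # zip stops at the shorter sequence, so a single item yields no pairs.
    return list(zip(parts, parts[1:]))

print(sliding_pairs('a1 -> b1 b2-b3->c1 c2'))
# [('a1', 'b1 b2-b3'), ('b1 b2-b3', 'c1 c2')]
```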
