python - How to remove '-' but ignore '-->' using regex

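import re

lines = ['00:00:34,021 --> 00:00:37,562',
         '-What is the cost of lies?',
         "- It's not that we'll mistake",
         '- them for the truth.']
s = "\n".join(lines)
print(s)

print('-' * 10)
# Remove every '-' that is not followed by another '-' or by '>',
# so the '-->' in the timestamp line survives.
# re.MULTILINE only affects ^ and $, so it changes nothing for this pattern.
result = re.sub(r"-(?!-)(?!>)", '', s, count=0, flags=re.MULTILINE)
print(result)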

Whilst anchoring on the start of each line is a viable option, an alternative is to match the hyphen itself, -, combined with two negative lookaheads: (?!-) rejects a hyphen that is followed by another - and (?!>) rejects one that is followed by a > symbol. (Negative in each case, since you don't want to match the hyphen when it is part of the --> arrow.)
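For comparison, the line-anchor option mentioned above could look roughly like this. It is only a sketch: the names subtitle_lines and LEADING_DASH are mine, and I am assuming you also want the blanks after the leading hyphen removed, which the lookahead version does not do.

```python
import re

subtitle_lines = [
    "00:00:34,021 --> 00:00:37,562",
    "-What is the cost of lies?",
    "- It's not that we'll mistake",
    "- them for the truth.",
]
text = "\n".join(subtitle_lines)

# With re.MULTILINE, ^ matches at the start of every line, not just the string.
# [ \t]* additionally consumes any spaces or tabs right after the hyphen.
LEADING_DASH = re.compile(r"^-[ \t]*", flags=re.MULTILINE)

cleaned = LEADING_DASH.sub("", text)
print(cleaned)
```

Here the timestamp line is left alone simply because it does not start with a hyphen, so no lookaround is needed.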

The re.sub() (substitute) arguments are:

  1. pattern (the regex)
  2. replacement
  3. source string
  4. count: maximum number of replacements, 0 = all
  5. flags: here re.MULTILINE, which only changes how ^ and $ match
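A small illustration of the count argument, using a made-up three-line string. I pass count by keyword here because, as far as I know, recent Python versions (3.13 onwards) warn when count and flags are given positionally.

```python
import re

text = "-a\n-b\n-c"          # hypothetical sample input
pattern = r"-(?!-)(?!>)"

# count=0 (the default) replaces every match
all_removed = re.sub(pattern, "", text, count=0)

# count=1 stops after the first replacement
first_only = re.sub(pattern, "", text, count=1)

print(repr(all_removed))
print(repr(first_only))
```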

Output – tested with programiz.pro/ide/python


00:00:34,021 --> 00:00:37,562
-What is the cost of lies?
- It's not that we'll mistake
- them for the truth.
----------
00:00:34,021 --> 00:00:37,562
What is the cost of lies?
 It's not that we'll mistake
 them for the truth.
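
One thing to be aware of: the lookahead pattern removes every lone hyphen, not only the one at the start of a line, so hyphenated words are affected too. The sample line below is invented for illustration, and the anchored pattern is just one possible way around it, not part of the original answer.

```python
import re

line = "- A well-known --> marker"   # hypothetical input

# Lookahead version: strips the leading '-' and the one inside "well-known"
lookahead_result = re.sub(r"-(?!-)(?!>)", "", line)

# Anchored version: only touches a hyphen at the start of a line
anchored_result = re.sub(r"^-[ \t]*", "", line, flags=re.MULTILINE)

print(repr(lookahead_result))
print(repr(anchored_result))
```

If your subtitles never contain hyphenated words, either pattern will do.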
