regexp in spark sql to extract numeric from a string
I have a string
norethindrone (norethindrone 5 mg oral tablet)
from which I need to match 5
I have used the below query in oracle and got output as “5”
select regexp_replace(regexp_SUBSTR(lower('norethindrone (norethindrone 5 mg oral tablet)')
,'[0-9,./]+ ?(mc?g|ml|meq|(intl )?unit.?s?.?|g|each|dose|%)+([-0-9/ ]+(ml|mc?g))?'),'[a-z% ]+','')
from dual
But the same code in spark SQL output null
.
What would be the reason?
Note: The string I gave here is a sample and I have millions of records with different possible combinations.
Read more here: Source link