I have a string
norethindrone (norethindrone 5 mg oral tablet)
from which I need to match the numeric dose (5 here).
I used the query below in Oracle and got the output "5":
select regexp_replace(
         regexp_substr(
           lower('norethindrone (norethindrone 5 mg oral tablet)'),
           '[0-9,./]+ ?(mc?g|ml|meq|(intl )?unit.?s?.?|g|each|dose|%)+([-0-9/ ]+(ml|mc?g))?'
         ),
         '[a-z% ]+',
         ''
       )
from dual
But the same query in Spark SQL gives a different output.
What could be the reason?
Note: The string above is only a sample; I have millions of records with many different combinations.
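One possible cause, offered as a guess rather than a confirmed diagnosis: if the Spark version of the query uses regexp_extract in place of regexp_substr, its group index defaults to 1, so it returns the first capture group instead of the whole match. The sketch below uses Python's re module as a stand-in (an assumption: it is neither Oracle's nor Spark's Java-based engine, although it treats this particular pattern the same way) to show how the two readings differ on the sample string. The helper name strip_units is hypothetical.

```python
import re

# Pattern copied verbatim from the query in the question.
DOSE_PATTERN = re.compile(
    r"[0-9,./]+ ?(mc?g|ml|meq|(intl )?unit.?s?.?|g|each|dose|%)+"
    r"([-0-9/ ]+(ml|mc?g))?"
)


def strip_units(fragment: str) -> str:
    """Mimic the outer regexp_replace: drop lowercase letters, '%' and spaces."""
    return re.sub(r"[a-z% ]+", "", fragment)


sample = "norethindrone (norethindrone 5 mg oral tablet)".lower()
match = DOSE_PATTERN.search(sample)

whole_match = match.group(0)  # what a substring-style function returns
first_group = match.group(1)  # what a "capture group 1" style function returns

print(repr(whole_match), "->", repr(strip_units(whole_match)))  # '5 mg' -> '5'
print(repr(first_group), "->", repr(strip_units(first_group)))  # 'mg' -> ''
```

If this is what is happening, passing 0 as the third argument to regexp_extract (the whole match) should bring the Spark result in line with Oracle's. Should the Spark output differ in some other way, the cause likely lies elsewhere, for example in how the two engines choose between alternatives in the pattern.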