R: replace multiple occurrences of regex-matched strings in dataframe fields by looking them up in another dataframe
I have two dataframes:
df lookup:
oldId <- c(123, 456, 567, 789)
newId <- c(1, 2, 3, 4)
lookup <- data.frame(oldId, newId)
df data:
descr <- c("description with no match",
           "description with one 123 match",
           "description with again no match",
           "description 456 with two 789 matches")
data <- data.frame(descr, stringsAsFactors = FALSE)
Goal:
I want a new dataframe:
- same structure as the data df
- same field values, except that every number in the text (e.g. 123, 456, 789) is looked up in lookup$oldId and replaced by the corresponding lookup$newId.
The resulting dataframe will thus look like this:
- “description with no match”
- “description with one 1 match”
- “description with again no match”
- “description 2 with two 4 matches”
So, each text in the descr column may contain many numbers that need to be replaced. Of course, this is a stripped-down example; my real-life dataframes are much bigger.
I do have the regex part working:
fx <- function(x) {gsub("([[:digit:]]{3})", "TESTTEST", x)}
data$descr <- fx(data$descr)  # gsub() is vectorized; lapply() would turn the column into a list
But I have no idea how to make the function loop over all matches in a row, look up each number, and replace it.
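For reference, here is a minimal base R sketch of one way the per-match lookup could work. It is not necessarily the best approach for very large data. It uses gregexpr() to find every match in each row and the regmatches<- replacement form to swap the matches. The names id_map, replace_ids and result are made up for illustration. It also assumes that the pattern [[:digit:]]+ (whole runs of digits rather than exactly three) is acceptable, and that numbers missing from lookup should stay unchanged.

```r
# Example data, rebuilt here so the block runs on its own
oldId <- c(123, 456, 567, 789)
newId <- c(1, 2, 3, 4)
lookup <- data.frame(oldId, newId)

descr <- c("description with no match",
           "description with one 123 match",
           "description with again no match",
           "description 456 with two 789 matches")
data <- data.frame(descr, stringsAsFactors = FALSE)

# Named character vector: names are the old IDs, values are the new IDs
id_map <- setNames(as.character(lookup$newId), as.character(lookup$oldId))

# Hypothetical helper: replace every digit run in x via the lookup vector
replace_ids <- function(x, map) {
  m <- gregexpr("[[:digit:]]+", x)        # positions of all matches, per element
  found <- regmatches(x, m)               # list of matched strings, per element
  regmatches(x, m) <- lapply(found, function(ids) {
    repl <- unname(map[ids])              # NA where an ID is not in the lookup
    missing <- is.na(repl)
    repl[missing] <- ids[missing]         # assumption: keep unknown numbers as-is
    repl
  })
  x
}

result <- data                            # same structure as the data df
result$descr <- replace_ids(data$descr, id_map)
print(result$descr)
```

Because the lookup is a plain named vector, each match is resolved by name indexing rather than by scanning the lookup dataframe, and rows without any match pass through untouched.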