Please deliver links to an R Markdown file (on GitHub and rpubs.com) with solutions to problems 3 and 4 from chapter 8 of Automated Data Collection with R.
raw.data <- "555-1239Moe Szyslak(636) 555-0113Burns, C. Montgomery555-6542Rev. Timothy Lovejoy555 8904Ned Flanders636-555-3226Simpson, Homer5553642Dr. Julius Hibbert"
library(stringr)
name <- unlist(str_extract_all(raw.data, "[[:alpha:]., ]{2,}"))
name
## [1] "Moe Szyslak" "Burns, C. Montgomery" "Rev. Timothy Lovejoy"
## [4] "Ned Flanders" "Simpson, Homer" "Dr. Julius Hibbert"
My first attempt kept only the first initial and dropped the middle name that follows it, because \\w+ stops at the period.
unlist(str_extract_all(name, "(\\w+),\\s(\\w+)"))
## [1] "Burns, C" "Simpson, Homer"
Making the period and the middle name an optional group keeps the middle name as well (the period is escaped so that it only matches a literal "."):
unlist(str_extract_all(name, "(\\w+),\\s(\\w+)?(\\.\\s(\\w+))?"))
## [1] "Burns, C. Montgomery" "Simpson, Homer"
newname <- str_replace_all(name, "(\\w+),\\s(\\w+)?(\\.\\s(\\w+))?", "\\2\\3 \\1")
newname
## [1] "Moe Szyslak" "C. Montgomery Burns" "Rev. Timothy Lovejoy"
## [4] "Ned Flanders" "Homer Simpson" "Dr. Julius Hibbert"
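To see why the replacement string "\\2\\3 \\1" puts the names in the right order, it can help to look at what each capture group holds. The sketch below uses `str_match()` from stringr on one name. The variable names `pattern` and `groups` are mine, and the period is written escaped here.

```r
library(stringr)

# Reordering pattern, with the period escaped so it only matches a literal "."
pattern <- "(\\w+),\\s(\\w+)?(\\.\\s(\\w+))?"

# str_match() returns a character matrix:
# column 1 is the full match, columns 2 to 5 are capture groups 1 to 4
groups <- str_match("Burns, C. Montgomery", pattern)
groups
```

Group 1 holds "Burns", group 2 holds "C", and group 3 holds ". Montgomery", so "\\2\\3 \\1" gives "C. Montgomery Burns".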
source: https://stackoverflow.com/questions/33826650/last-name-first-name-to-first-name-last-name
str_extract(string = newname, pattern = "(Rev|Dr)\\.")
## [1] NA NA "Rev." NA NA "Dr."
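If a TRUE/FALSE vector is wanted instead of the matched title, `str_detect()` is one option. This is only a sketch. It assumes the only titles are "Rev." and "Dr." and that a title always comes first in the string, which is why I added the `^` anchor. `has_title` is a name I made up for the result.

```r
library(stringr)

# Names in "first last" order, copied from the output above
newname <- c("Moe Szyslak", "C. Montgomery Burns", "Rev. Timothy Lovejoy",
             "Ned Flanders", "Homer Simpson", "Dr. Julius Hibbert")

# str_detect() returns TRUE/FALSE per element instead of the matched text.
# Assumption: a title is "Rev." or "Dr." at the very start of the name.
has_title <- str_detect(newname, "^(Rev|Dr)\\.")
has_title
## [1] FALSE FALSE  TRUE FALSE FALSE  TRUE
```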
str_extract(string = newname, pattern = "\\s(\\w+)(.\\s(\\w+))")
## [1] NA " Montgomery Burns" " Timothy Lovejoy"
## [4] NA NA " Julius Hibbert"
Unfortunately, I could not figure out how to do this one…
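One possible approach to the second-name check, as a sketch only: remove any title first, then count the remaining name parts. This assumes that a "second name" means an extra name part besides first and last name (so the initial "C." counts as a name part), and that the only titles are "Rev." and "Dr.". The names `no_title`, `n_parts` and `has_second_name` are mine.

```r
library(stringr)

# Names in "first last" order, copied from the output above
newname <- c("Moe Szyslak", "C. Montgomery Burns", "Rev. Timothy Lovejoy",
             "Ned Flanders", "Homer Simpson", "Dr. Julius Hibbert")

# Assumption: titles are "Rev." or "Dr." and only appear at the start
no_title <- str_trim(str_replace(newname, "^(Rev|Dr)\\.\\s*", ""))

# Count whitespace-separated parts; more than two is treated as
# "has a second name" under the assumption stated above
n_parts <- str_count(no_title, "\\S+")
has_second_name <- n_parts > 2
has_second_name
## [1] FALSE  TRUE FALSE FALSE FALSE FALSE
```

Removing the title first matters: otherwise "Rev. Timothy Lovejoy" and "Dr. Julius Hibbert" would also count three parts.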