library(stringr)
Load raw data
raw.data <- "555-1239Moe Szyslak(636) 555-0113Burns, C. Montgomery555-6542Rev. Timothy Lovejoy555 8904Ned Flanders636-555-3226Simpson, Homer5553642Dr. Julius Hibbert"
Extract all names
name <- unlist(str_extract_all(raw.data, "[[:alpha:]., ]{2,}"))
name
## [1] "Moe Szyslak" "Burns, C. Montgomery" "Rev. Timothy Lovejoy"
## [4] "Ned Flanders" "Simpson, Homer" "Dr. Julius Hibbert"
Rearrange vector to First Name then Last Name
name
## [1] "Moe Szyslak" "Burns, C. Montgomery" "Rev. Timothy Lovejoy"
## [4] "Ned Flanders" "Simpson, Homer" "Dr. Julius Hibbert"
# Split "Last, First" on the comma and trim the space left in front of the first name
fl_extract <- str_trim(unlist(str_split(name[5], ",")))
fl_extract
## [1] "Simpson" "Homer"
new_name <- str_c(fl_extract[2], fl_extract[1], sep = " ")
new_name
## [1] "Homer Simpson"
name[5] <- new_name
# Same steps for the second entry
fl_extract <- str_trim(unlist(str_split(name[2], ",")))
fl_extract
## [1] "Burns" "C. Montgomery"
new_name <- str_c(fl_extract[2], fl_extract[1], sep = " ")
new_name
## [1] "C. Montgomery Burns"
name[2] <- new_name
Show all new names
name
## [1] "Moe Szyslak" "C. Montgomery Burns" "Rev. Timothy Lovejoy"
## [4] "Ned Flanders" "Homer Simpson" "Dr. Julius Hibbert"
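The two manual swaps above could also be done in one vectorized step. The following is a minimal sketch, not the approach used in this document: it assumes every reversed name has the form "Last, First" with a single comma, and the variable name `first_last` is made up for illustration.

```r
library(stringr)

# Names as extracted from raw.data earlier in this document
name <- c("Moe Szyslak", "Burns, C. Montgomery", "Rev. Timothy Lovejoy",
          "Ned Flanders", "Simpson, Homer", "Dr. Julius Hibbert")

# Capture the parts before and after the comma and swap them.
# Names without a comma do not match the pattern and pass through unchanged.
first_last <- str_replace(name, "^(.+),\\s*(.+)$", "\\2 \\1")
first_last
```

Because `\\s*` consumes the space after the comma, no separate trimming step is needed.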
Positions of the characters whose names begin with a title (pmatch returns indices rather than a logical vector)
pmatch(c("Dr.", "Rev."), name)
## [1] 6 3
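If a logical vector is wanted instead of positions, `str_detect` is one option. This is a sketch under the assumption that "Dr." and "Rev." are the only titles in the data; `has_title` is a hypothetical name.

```r
library(stringr)

# Names in first-name-then-last-name order
name <- c("Moe Szyslak", "C. Montgomery Burns", "Rev. Timothy Lovejoy",
          "Ned Flanders", "Homer Simpson", "Dr. Julius Hibbert")

# TRUE where the name starts with "Dr." or "Rev."
# (leading whitespace is tolerated; only these two titles are assumed)
has_title <- str_detect(name, "^\\s*(Dr|Rev)\\.")
has_title
```

With these names, only the third and sixth elements are `TRUE`.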
Vector checking if character has a second name (the pattern below only returns the first letter-space-letter sequence found in each name)
str_extract(name, "[[:alpha:]][[:blank:]][[:alpha:]]")
## [1] "e S" "y B" "y L" "d F" "r S" "s H"
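One possible way to get a logical answer is to remove any title and then count the remaining words. This is a rough sketch that assumes titles are limited to "Dr." and "Rev." and that an initial such as "C." counts as a name; `no_title` and `has_second_name` are hypothetical names.

```r
library(stringr)

# Names in first-name-then-last-name order
name <- c("Moe Szyslak", "C. Montgomery Burns", "Rev. Timothy Lovejoy",
          "Ned Flanders", "Homer Simpson", "Dr. Julius Hibbert")

# Drop a leading title so it is not counted as a name
no_title <- str_trim(str_remove(name, "^\\s*(Dr|Rev)\\.\\s*"))

# More than two whitespace-separated words implies a second (middle) name
has_second_name <- str_count(no_title, "\\S+") > 2
has_second_name
```

Under these assumptions only "C. Montgomery Burns" is flagged.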
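Question 4 - Describe the types of strings that match each regular expression and construct an example
4.1 One or more digits followed by a literal dollar sign (the escaped \\$ is a plain character here, not the end-of-string anchor). ex: “87$”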
#[0-9]+\\$
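A quick check can show the difference between the escaped `\\$` and the unescaped `$` anchor. The test strings and variable names below are made up for illustration.

```r
library(stringr)

x <- c("Street number 87", "Price: 87$")

# Escaped dollar sign: digits followed by a literal "$"
literal_dollar <- str_detect(x, "[0-9]+\\$")

# Unescaped dollar sign: digits at the very end of the string
end_anchor <- str_detect(x, "[0-9]+$")

literal_dollar
end_anchor
```

The two patterns flag opposite strings: the escaped form matches only "Price: 87$", while the anchored form matches only "Street number 87".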
4.2 A word boundary, then one to four lowercase letters (a-z), then another word boundary; that is, a standalone word of one to four lowercase letters. ex: “book”
#\\b[a-z]{1,4}\\b
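As a small illustration, the pattern can be run over a made-up sentence to see which words qualify; `sentence` and `short_words` are hypothetical names.

```r
library(stringr)

sentence <- "a book about regular expressions in R"

# Keep only standalone words of one to four lowercase letters
short_words <- str_extract_all(sentence, "\\b[a-z]{1,4}\\b")[[1]]
short_words
```

Longer words such as "about" are skipped entirely rather than truncated, because the trailing `\\b` must fall at the end of the word, and "R" is skipped because it is uppercase.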
4.3 Any sequence of characters (possibly empty, matched lazily), followed by a literal “.txt” at the end of the string. ex: “file.txt”
#.*?\\.txt$
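The pattern can be tried on a few invented file names to see where the `$` anchor and the escaped dot matter; `files` and `is_txt` are hypothetical names.

```r
library(stringr)

files <- c("file.txt", ".txt", "notes.txt.bak", "filetxt")

# TRUE only when the string ends in a literal ".txt"
is_txt <- str_detect(files, ".*?\\.txt$")
is_txt
```

The bare ".txt" matches because `.*?` may match nothing, "notes.txt.bak" fails because ".txt" is not at the end, and "filetxt" fails because the dot is escaped and must appear literally.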
4.4 Two digits, a forward slash, two more digits, another forward slash, then four digits, like a date. ex: 02/01/2018
#\\d{2}/\\d{2}/\\d{4}
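A short check on made-up strings shows that the pattern is not anchored and tests only the shape of the date, not its validity; `dates` and `found` are hypothetical names.

```r
library(stringr)

dates <- c("02/01/2018", "due 12/31/1999 at noon", "2/1/2018", "99/99/9999")

# Extract the first dd/dd/dddd sequence from each string (NA if none)
found <- str_extract(dates, "\\d{2}/\\d{2}/\\d{4}")
found
```

The date embedded in a longer sentence is still found, single-digit day and month values are rejected, and "99/99/9999" is accepted even though it is not a real date.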
4.5 An opening tag in angle brackets whose name (one or more characters, captured lazily) is followed by at least one character of content and then a closing tag that repeats the captured name through the backreference \\1, like an HTML or XML element. ex: <b>text</b>
#<(.+?)>.+?</\\1>
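A quick check of what this pattern matches, using invented strings; `tags` and `is_element` are hypothetical names.

```r
library(stringr)

tags <- c("<b>bold</b>", "<b>bold</i>", "<p></p>")

# \\1 must repeat whatever the first group captured between "<" and ">"
is_element <- str_detect(tags, "<(.+?)>.+?</\\1>")
is_element
```

Only the first string matches: the second has a closing tag that differs from the opening tag, and the third has no content between the tags, which `.+?` requires.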