Question 3 Load Data
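
The code in this question relies on a character vector called name that is never defined in the write-up. As a minimal sketch, assuming the vector holds exactly the six strings printed in the name column of the tables further down, it could be created like this (loading stringr is also an assumption, based on the str_* functions used throughout):

```r
library(stringr)

# Assumed input: the six names as they appear in the printed tables
name <- c("Moe Szyslak", "Burns, C. Montgomery", "Rev. Timothy Lovejoy",
          "Ned Flanders", "Simpson, Homer", "Dr. Julius Hibbert")

length(name)
```

If the vector was instead extracted from a larger raw string, the same six elements would be expected after that extraction step.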

  1. Use the tools of this chapter to rearrange the vector so that all elements conform to the standard first_name last_name.

Remove the titles “Rev.” and “Dr.”:

Name_cleartitle<-unlist(str_replace(name,"[[:alpha:]]{2,}[.]",""))
Name_cleartitle
## [1] "Moe Szyslak"          "Burns, C. Montgomery" " Timothy Lovejoy"    
## [4] "Ned Flanders"         "Simpson, Homer"       " Julius Hibbert"

First Name:

firstName <- unlist(str_extract(Name_cleartitle,"[[:alpha:]]{2,}[[:space:]]{1,}|[[:space:]]{1,}[[:alpha:]]{2,}"))
firstName
## [1] "Moe "        " Montgomery" " Timothy"    "Ned "        " Homer"     
## [6] " Julius"

Last Name:

lastName <- unlist(str_extract(Name_cleartitle,"[^[:punct:]][[:space:]][[:alpha:]]{2,}|[[:alpha:]]{2,}[[:punct:]]"))
lastName <- unlist(str_replace(lastName,"[[:alpha:]][[:space:]]", ""))
lastName <- unlist(str_replace(lastName,"[[:punct:]]", ""))
lastName
## [1] "Szyslak"  "Burns"    "Lovejoy"  "Flanders" "Simpson"  "Hibbert"

data.frame(firstName, lastName)
##     firstName lastName
## 1        Moe   Szyslak
## 2  Montgomery    Burns
## 3     Timothy  Lovejoy
## 4        Ned  Flanders
## 5       Homer  Simpson
## 6      Julius  Hibbert
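
The extracted first names still carry leading or trailing spaces, and the task asks for a single vector in first_name last_name form rather than a data frame. A possible last step, sketched here with the values copied from the printed output above (the variable full_name is introduced only for illustration):

```r
library(stringr)

# Values copied from the printed results above, including stray spaces
firstName <- c("Moe ", " Montgomery", " Timothy", "Ned ", " Homer", " Julius")
lastName  <- c("Szyslak", "Burns", "Lovejoy", "Flanders", "Simpson", "Hibbert")

# Trim the whitespace, then join first and last name with one space
full_name <- str_c(str_trim(firstName), lastName, sep = " ")
full_name
```

Note that this drops the middle initial "C." of Montgomery Burns, which matches the first_name last_name standard but loses information.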
  2. Construct a logical vector indicating whether a character has a title (i.e., Rev. and Dr.).

TRUE in the logical vector indicates that the character has a title:

title <- unlist(str_detect(name,"[[:alpha:]]{2,}[.]"))
title
## [1] FALSE FALSE  TRUE FALSE FALSE  TRUE

Table of the names alongside the logical vector:

Table1<-data.frame(name,title)
Table1
##                   name title
## 1          Moe Szyslak FALSE
## 2 Burns, C. Montgomery FALSE
## 3 Rev. Timothy Lovejoy  TRUE
## 4         Ned Flanders FALSE
## 5       Simpson, Homer FALSE
## 6   Dr. Julius Hibbert  TRUE
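
The pattern used above treats any run of two or more letters followed by a period as a title, so it would also flag suffixes such as "Jr.". A stricter alternative, sketched under the assumption that only the two titles named in the task can occur, lists them explicitly (has_title is a hypothetical name):

```r
library(stringr)

name <- c("Moe Szyslak", "Burns, C. Montgomery", "Rev. Timothy Lovejoy",
          "Ned Flanders", "Simpson, Homer", "Dr. Julius Hibbert")

# Match "Rev." or "Dr." only at the start of the string
has_title <- str_detect(name, "^(Rev|Dr)\\.")
has_title
```

For this data both patterns give the same result; the explicit version is just less likely to produce false positives on other names.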
  3. Construct a logical vector indicating whether a character has a second name.

Logical vector indicating whether the character has a second name:

secondname <- unlist(str_detect(Name_cleartitle,"[[:alpha:]]{1,}[.]"))
secondname
## [1] FALSE  TRUE FALSE FALSE FALSE FALSE

Table of the names alongside the logical vector:

Table2<-data.frame(name,secondname)
Table2
##                   name secondname
## 1          Moe Szyslak      FALSE
## 2 Burns, C. Montgomery       TRUE
## 3 Rev. Timothy Lovejoy      FALSE
## 4         Ned Flanders      FALSE
## 5       Simpson, Homer      FALSE
## 6   Dr. Julius Hibbert      FALSE
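
The check above depends on the titles having been removed first. As an alternative sketch, assuming a second name always appears as a single capital initial followed by a period, the original vector can be tested directly (has_initial is a hypothetical name):

```r
library(stringr)

name <- c("Moe Szyslak", "Burns, C. Montgomery", "Rev. Timothy Lovejoy",
          "Ned Flanders", "Simpson, Homer", "Dr. Julius Hibbert")

# A word boundary, one capital letter, then a literal period (e.g. "C.")
has_initial <- str_detect(name, "\\b[A-Z]\\.")
has_initial
```

"Rev." and "Dr." are not matched because the letter directly before their period is lowercase.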

Question 4 Describe the types of strings that conform to the following regular expressions and construct an example that is matched by the regular expression.

  1. [0-9]+\$

example1 <-c ("123$","0","$123","0$","123$ab","123")
example1 <- str_detect(example1,"[0-9]+\\$")
example1
## [1]  TRUE FALSE FALSE  TRUE  TRUE FALSE
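
This expression matches one or more digits immediately followed by a literal dollar sign, anywhere in the string. To see which part of each string is matched, str_extract can be used instead of str_detect; the example strings below are made up for illustration:

```r
library(stringr)

prices <- c("123$", "$123", "costs 40$ today", "123")

# Returns the matched substring, or NA when there is no match
matched <- str_extract(prices, "[0-9]+\\$")
matched
```

"$123" fails because the dollar sign comes before the digits, and "123" fails because it has no dollar sign at all.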
  2. \b[a-z]{1,4}\b
example2 <-c ("b4b","123tow","abfg123bb","skdfj&","dkk45b","123qef")
example2 <- str_detect(example2,"\\b[a-z]{1,4}\\b")
example2
## [1] FALSE FALSE FALSE FALSE FALSE FALSE
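
None of the strings above match, because digits count as word characters (so there is no word boundary between a letter and a digit) and "skdfj" has five letters. The expression matches a stand-alone word of one to four lowercase letters. A short sketch with made-up strings that do contain such words:

```r
library(stringr)

pattern <- "\\b[a-z]{1,4}\\b"
words <- c("tow", "a word", "skdfj", "b4b")

# TRUE where the string contains a lowercase word of 1 to 4 letters
has_short_word <- str_detect(words, pattern)
has_short_word

# Every short lowercase word in a sentence
short_words <- str_extract_all("a word here too", pattern)[[1]]
short_words
```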
  3. .*?\.txt$
example3 <-c ("*?txt$","af.txt","abfg123bb","*adfd5b","dff")
example3 <- str_detect(example3,".*?\\.txt$")
example3
## [1] FALSE  TRUE FALSE FALSE FALSE
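
This expression matches any string that ends in the literal characters ".txt", with any (possibly empty) text in front. A small sketch with hypothetical file names shows the edge cases:

```r
library(stringr)

files <- c("notes.txt", ".txt", "notes.txt.bak", "notestxt")

# TRUE only when the string ends with a literal ".txt"
is_txt <- str_detect(files, ".*?\\.txt$")
is_txt
```

".txt" on its own matches because .*? may match zero characters, while "notes.txt.bak" fails because of the end-of-string anchor.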
  4. \d{2}/\d{2}/\d{4}
example4 <-c ("adf","af/123","12/12/1212","13/qw/1235","1/1/2032")
example4 <- str_detect(example4,"\\d{2}/\\d{2}/\\d{4}")
example4
## [1] FALSE FALSE  TRUE FALSE FALSE
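
This expression matches two digits, a slash, two digits, a slash, and four digits, which is the shape of a date such as dd/mm/yyyy or mm/dd/yyyy. It checks only the shape, not whether the date is valid, as the sketch below with made-up strings illustrates:

```r
library(stringr)

dates <- c("Due on 09/30/2015 at noon", "99/99/9999", "9/30/2015")

# Extract the first date-shaped substring, or NA if there is none
found <- str_extract(dates, "\\d{2}/\\d{2}/\\d{4}")
found
```

"99/99/9999" is accepted even though it is not a real date, and "9/30/2015" is rejected because the first part has only one digit.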
  5. <(.+?)>.+?</\1>
example5 <-c ("<abc>abc</abc>","</\fgf>sdfd<s\12>","12/12/1212","<344>dfd<232>","<dfd>")
example5 <- str_detect(example5,"<(.+?)>.+?</\\1>")
example5
## [1]  TRUE FALSE FALSE FALSE FALSE
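
This expression matches an opening tag, at least one character of content, and a closing tag with the same name; the backreference \1 repeats whatever the first group captured. A sketch with hypothetical HTML-like strings, using str_match to show the captured tag name:

```r
library(stringr)

pattern <- "<(.+?)>.+?</\\1>"
tags <- c("<b>bold</b>", "<b>bold</i>", "<b></b>")

# TRUE only when the closing tag repeats the opening tag name
is_pair <- str_detect(tags, pattern)
is_pair

# Column 2 of the str_match result holds the first capture group
tag_name <- str_match(tags[1], pattern)[1, 2]
tag_name
```

"<b></b>" fails because .+? requires at least one character between the two tags.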
  1. The following code hides a secret message. Crack it with R and regular expressions. Hint: Some of the characters are more revealing than others! The code snippet is also available in the materials at www.r-datacollection.com.
message<-"clcopCow1zmstc0d87wnkig7OvdicpNuggvhryn92Gjuwczi8hqrfpRxs5Aj5dwpn0Tanwo
Uwisdij7Lj8kpf03AT5Idr3coc0bt7yczjatOaootj55t3Nj3ne6c4Sfek.r1w1YwwojigO
d6vrfUrbz2.2bkAnbhzgv4R9i05zEcrop.wAgnb.SqoU65fPa1otfb7wEm24k6t3sR9zqe5
fy89n6Nd5t9kc4fE905gmc4Rgxo5nhDk!gr"
message<-unlist(str_extract_all(message, "[[:upper:].]{1,}"))
message <- str_c(message,collapse = "")  # keep the decoded string; cat() itself returns NULL
cat(message)
## CONGRATULATIONS.YOU.ARE.A.SUPERNERD
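
The same result can also be reached by deleting everything that is not an uppercase letter or a period, rather than extracting and pasting. The sketch below applies this to the first line of the coded text only, and then replaces the periods in the decoded message with spaces for readability (the variable names are hypothetical):

```r
library(stringr)

# First line of the coded text from above
first_line <- "clcopCow1zmstc0d87wnkig7OvdicpNuggvhryn92Gjuwczi8hqrfpRxs5Aj5dwpn0Tanwo"

# Drop every character that is not an uppercase letter or a period
kept <- str_replace_all(first_line, "[^A-Z.]", "")
kept

# Turn the period separators of the decoded message into spaces
decoded <- "CONGRATULATIONS.YOU.ARE.A.SUPERNERD"
readable <- str_replace_all(decoded, fixed("."), " ")
readable
```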