Capofari_Week_4

Example 4

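The regular expression [0-9]+\$ matches one or more digits immediately followed by a literal $ symbol. With str_extract_all, every such occurrence is returned, not just the first. In "314.15$" only "15$" matches, because the period is not a digit.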

library(stringr)
unlist(str_extract_all("Bobby won 314.15$ and Alice won 271$.", "[0-9]+\\$"))

## [1] "15$"  "271$"
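If the goal were to capture the whole amount rather than only the digits next to the $, one possible variant adds an optional decimal part. This is a minimal sketch; the extended pattern and the name amounts are illustrative assumptions, not part of the original exercise.

```r
library(stringr)

# Hypothetical variant: "(\\.[0-9]+)?" optionally allows a period
# followed by more digits before the literal dollar sign.
amounts <- unlist(str_extract_all(
  "Bobby won 314.15$ and Alice won 271$.",
  "[0-9]+(\\.[0-9]+)?\\$"
))
amounts
# [1] "314.15$" "271$"
```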

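The regular expression \b[a-z]{1,4}\b matches whole words of 1 to 4 characters made up only of lower case letters. The \b word boundaries keep it from matching inside longer words. "In" is skipped because of its capital letter, and "ve" is returned from "I've" because the apostrophe creates a word boundary.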

unlist(str_extract_all("In my younger and more vulnerable years my father gave me some advice that I've been turning over in my mind ever since.", "\\b[a-z]{1,4}\\b"))

##  [1] "my"   "and"  "more" "my"   "gave" "me"   "some" "that" "ve"   "been"
## [11] "over" "in"   "my"   "mind" "ever"

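The regular expression .*?\.txt$ matches any string that ends with .txt. Because the pattern is anchored to the end of the string with $, the whole string is returned when it matches and NA when it does not.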

sample_text <- c("my_text_file.txt", "my_python_file.py", "my other text file.txt")
str_extract(sample_text, ".*?\\.txt$")

## [1] "my_text_file.txt"       NA                      
## [3] "my other text file.txt"
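Since the pattern returns the whole string whenever it matches, the same file names could also be selected by testing for the suffix alone. A minimal sketch, assuming str_detect is an acceptable substitute here; the name txt_files is illustrative.

```r
library(stringr)

sample_text <- c("my_text_file.txt", "my_python_file.py", "my other text file.txt")

# Keep only the elements whose name ends in a literal ".txt"
txt_files <- sample_text[str_detect(sample_text, "\\.txt$")]
txt_files
# returns "my_text_file.txt" and "my other text file.txt"
```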

The regular expression \d{2}/\d{2}/\d{4} matches dates in the form ##/##/####.

sample_text <- c("03/14/2015", "1/10/2011", "08/25/81")
str_extract(sample_text, "\\d{2}/\\d{2}/\\d{4}")

## [1] "03/14/2015" NA           NA
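The last two dates are rejected because one has a one-digit first field and the other a two-digit year. A more permissive pattern, sketched below as an assumption about what one might want rather than as part of the exercise, accepts 1 or 2 digits in the first two fields and a 4- or 2-digit year. The names dates and loose_dates are illustrative.

```r
library(stringr)

dates <- c("03/14/2015", "1/10/2011", "08/25/81")

# Hypothetical looser pattern: the alternation tries a 4-digit year
# first, then falls back to a 2-digit year.
loose_dates <- str_extract(dates, "\\d{1,2}/\\d{1,2}/(\\d{4}|\\d{2})")
loose_dates
# all three input strings are returned in full
```

Neither pattern checks that the numbers form a valid calendar date.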

Example 5

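[0-9]+\$ can be rewritten as \d(\d)*\${1}. Here \d(\d)* is one digit followed by zero or more digits, which is the same as [0-9]+, and \${1} is exactly one $ symbol. These two regular expressions perform the same task.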

unlist(str_extract_all("Bobby won 271$ and Alice won 314.15$ and Chris won 1$.", "\\d(\\d)*\\${1}"))

## [1] "271$" "15$"  "1$"
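One way to check the equivalence on this sample string is to run both patterns and compare the results. This is only a sketch; the names winnings, original, and rewritten are illustrative, and agreement on one string is not a proof for all inputs.

```r
library(stringr)

winnings <- "Bobby won 271$ and Alice won 314.15$ and Chris won 1$."

# Extract with the original pattern and with the rewritten one
original  <- unlist(str_extract_all(winnings, "[0-9]+\\$"))
rewritten <- unlist(str_extract_all(winnings, "\\d(\\d)*\\${1}"))

identical(original, rewritten)
# [1] TRUE
```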

Example 6

chunkylover53[at]aol[dot]com

e_mail <- "chunkylover53[at]aol[dot]com"
e_mail <- str_replace(str_replace(e_mail, "\\[dot\\]", "."), "\\[at\\]", "@")
e_mail

## [1] "chunkylover53@aol.com"
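The square brackets have to be escaped above because they are special characters in a regular expression. As an alternative sketch, stringr's fixed() treats the pattern as a literal string, so no escaping is needed. The names obfuscated and cleaned are illustrative.

```r
library(stringr)

obfuscated <- "chunkylover53[at]aol[dot]com"

# fixed() matches "[dot]" and "[at]" literally, so "[" and "]"
# are not interpreted as a character class.
cleaned <- str_replace(
  str_replace(obfuscated, fixed("[dot]"), "."),
  fixed("[at]"), "@"
)
cleaned
# [1] "chunkylover53@aol.com"
```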

unlist(str_extract_all(e_mail, "[[:digit:]]"))

## [1] "5" "3"

The expression \D matches every character that is not a digit, so it cannot be used to extract the digits. To extract the digits we should use \d instead.

unlist(str_extract_all(e_mail, "\\D"))

##  [1] "c" "h" "u" "n" "k" "y" "l" "o" "v" "e" "r" "@" "a" "o" "l" "." "c"
## [18] "o" "m"

unlist(str_extract_all(e_mail, "\\d"))

## [1] "5" "3"

Capofari_Week_4_IS607

Nicholas Capofari

September 18, 2015
