IS607_wk4_assmnt

p. 217

Match a string containing one or more digits followed by a literal $.

grep("[0-9]+\\$","100$")

## [1] 1

grep("[0-9]+\\$","foo$") # fails

## integer(0)

grep("[0-9]+\\$","100") # fails

## integer(0)

grep("[0-9]+\\$","foo2$")

## [1] 1
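As a rough sketch of why "foo2$" matches, the same pattern can be run over all four inputs at once, extracting the matched substring. The names `pattern`, `inputs`, `matched` and `hits` are made up for this illustration.

```r
# Same pattern as above; the input vector collects the four test strings.
pattern <- "[0-9]+\\$"
inputs <- c("100$", "foo$", "100", "foo2$")

# grepl() returns one logical per input rather than the indices of the matches.
matched <- grepl(pattern, inputs)

# regexpr() + regmatches() return the matched substring for each matching input.
hits <- regmatches(inputs, regexpr(pattern, inputs))

print(matched)
print(hits)
```

The pattern is unanchored, so "foo2$" matches on the substring "2$".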

Match whole words made up of 1 to 4 lowercase alphabetic characters.

grep("\\b[a-z]{1,4}\\b","a b")

## [1] 1

grep("\\b[a-z]{1,4}\\b","5 abcd 5")

## [1] 1

grep("\\b[a-z]{1,4}\\b","5 abcde 5") # fails

## integer(0)
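To see which words the pattern picks out, a minimal sketch with gregexpr() follows. The input string `s` is hypothetical, and `perl = TRUE` is an assumption made here only to pin the regex engine; the calls above use R's default engine.

```r
# Same pattern as above.
pattern <- "\\b[a-z]{1,4}\\b"

# Hypothetical input mixing short words, a long word, and digits.
s <- "5 abcd 5 abcde ab"

# gregexpr() finds every match; regmatches() pulls out the matched text.
# perl = TRUE is an assumption of this sketch, not part of the original calls.
words <- regmatches(s, gregexpr(pattern, s, perl = TRUE))[[1]]

print(words)
```

"abcde" is skipped because \\b requires a word boundary after at most four letters.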

Match a string ending in .txt (e.g., to find text files).

grep(".*?\\.txt$","hello world.txt")

## [1] 1

grep(".*?\\.txt$","hello world.dat") # fails

## integer(0)

grep(".*?\\.txt$","hello world.dat.txt")

## [1] 1

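The ? is redundant: with the $ anchor, lazy .*? and greedy .* accept exactly the same strings. The whole .*? prefix could be dropped too, since grep() searches anywhere in the string and \\.txt$ alone matches the same inputs.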
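A quick way to check the redundancy is to compare the lazy, greedy, and prefix-free variants on the same inputs. The vector `files` and the variable names are made up for this sketch; "notes.txt.bak" is an extra hypothetical case.

```r
# Test inputs from above plus one hypothetical non-matching name.
files <- c("hello world.txt", "hello world.dat",
           "hello world.dat.txt", "notes.txt.bak")

lazy   <- grepl(".*?\\.txt$", files)  # original pattern
greedy <- grepl(".*\\.txt$", files)   # without the ?
bare   <- grepl("\\.txt$", files)     # without the .*? prefix

print(rbind(lazy, greedy, bare))
```

All three rows agree, which suggests the shorter pattern is enough when only a yes/no answer is needed.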

Match a date string in the format dd/dd/dddd (two digits, two digits, four digits, separated by slashes).

grep("\\d{2}/\\d{2}/\\d{4}","11/22/2015")

## [1] 1

grep("\\d{2}/\\d{2}/\\d{4}","11/22/15") # fails: two-digit year

## integer(0)
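The pattern is unanchored, so it also accepts longer strings that merely contain a date-shaped substring. A minimal sketch comparing it with an anchored variant follows; the anchored pattern and the extra inputs are assumptions added for illustration.

```r
unanchored <- "\\d{2}/\\d{2}/\\d{4}"    # pattern used above
anchored   <- "^\\d{2}/\\d{2}/\\d{4}$"  # assumed variant: whole string must match

# First two inputs are from above; the last two are hypothetical.
dates <- c("11/22/2015", "11/22/15", "111/22/20155", "99/99/9999")

loose_match <- grepl(unanchored, dates)
exact_match <- grepl(anchored, dates)

print(loose_match)
print(exact_match)
```

Either way the pattern only checks the shape of the string: "99/99/9999" passes even though it is not a real date.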

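Match a pair of balanced XML-style tags <tag>content</tag> with a non-empty tag name and non-empty content, using a backreference so the closing tag must repeat the opening tag name. Note that . also matches whitespace, so this pattern does not by itself exclude whitespace in or between the tags.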

grep("<(.+?)>.+?</\\1>","<f>foo</f>")

## [1] 1

grep("<(.+?)>.+?</\\1>","<f>foo</g>") # fails: closing tag differs

## integer(0)
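If whitespace really should be excluded, one possible tightening is to replace . with a negated character class. The sketch below compares the two; the `strict` pattern, the extra inputs, and the use of `perl = TRUE` are all assumptions of this illustration rather than part of the original exercise.

```r
loose  <- "<(.+?)>.+?</\\1>"                          # pattern used above
strict <- "<([^<>[:space:]]+)>[^<>[:space:]]+</\\1>"  # assumed whitespace-free variant

# First two inputs are from above; the last two are hypothetical.
tags <- c("<f>foo</f>", "<f>foo</g>", "<f>foo bar</f>", "<f></f>")

# perl = TRUE is an assumption made here to pin the regex engine.
loose_match  <- grepl(loose, tags, perl = TRUE)
strict_match <- grepl(strict, tags, perl = TRUE)

print(loose_match)
print(strict_match)
```

Both variants reject the mismatched and the empty element; only the strict one rejects content containing a space.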

[^a-zA-Z]{1,}\\$

library(stringr)
s <- "chunkylover53[at]aol[dot]com"
str_replace(s, "(.+)\\[at\\](.+)\\[dot\\](com)", "\\1@\\2.\\3")

## [1] "chunkylover53@aol.com"
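For comparison, the same cleanup could be sketched without capture groups by replacing each placeholder literally with stringr's fixed(). This is an alternative under the assumption that "[at]" and "[dot]" each occur once; the name `email` is made up here.

```r
library(stringr)

s <- "chunkylover53[at]aol[dot]com"

# fixed() treats the pattern as a literal string, so the brackets need no escaping.
email <- str_replace(s, fixed("[at]"), "@")
email <- str_replace(email, fixed("[dot]"), ".")

print(email)
```

The backreference version above does the rewrite in one call; this version trades that for simpler patterns.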

str_extract(s, "[:digit:]")

## [1] "5"

str_extract(s, "[:digit:]+")

## [1] "53"

str_extract(s, "\\D")

## [1] "c"

str_extract(s, "\\d+")

## [1] "53"
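str_extract() returns only the first match, which is why "[:digit:]" gives "5" and "\\D" gives "c". A short sketch with str_extract_all(), which returns every match, makes the difference visible; the variable names are made up for this illustration.

```r
library(stringr)

s <- "chunkylover53[at]aol[dot]com"

# str_extract_all() returns a list with one character vector per input string.
single_digits <- str_extract_all(s, "\\d")[[1]]   # each digit separately
digit_runs    <- str_extract_all(s, "\\d+")[[1]]  # maximal runs of digits

print(single_digits)
print(digit_runs)
```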