Question 1

1. Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS”

library(dplyr)

majorsList <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv")

# Flag majors whose name contains either "DATA" or "STATISTICS"
majorsList <- majorsList %>%
    mutate(search_flag = grepl("DATA|STATISTICS", Major))

# Keep only the flagged majors
filter(majorsList, search_flag)
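The matching logic can be checked offline on a small vector. The sketch below uses base R only; `sample_majors` is a hand-made illustrative sample, not the full dataset.

```r
# Illustrative sample of major names (an assumption, not the full dataset)
sample_majors <- c(
  "COMPUTER SCIENCE",
  "STATISTICS AND DECISION SCIENCE",
  "BIOLOGY",
  "COMPUTER PROGRAMMING AND DATA PROCESSING"
)

# The alternation "DATA|STATISTICS" matches a name containing either word
matches <- grepl("DATA|STATISTICS", sample_majors)

# Keep only the names that matched
selected <- sample_majors[matches]
print(selected)
```

Using a single alternation pattern avoids combining two separate `grep()` calls, which would list a major twice if its name contained both words.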

Question 2

**2 Write code that transforms the data below:**

[1] "bell pepper"  "bilberry"     "blackberry"   "blood orange"
[5] "blueberry"    "cantaloupe"   "chili pepper" "cloudberry"
[9] "elderberry"   "lime"         "lychee"       "mulberry"
[13] "olive"        "salal berry"

**Into a format like this:**

c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")

str <- '[1] "bell pepper"  "bilberry"     "blackberry"   "blood orange"
[5] "blueberry"    "cantaloupe"   "chili pepper" "cloudberry"  
[9] "elderberry"   "lime"         "lychee"       "mulberry"    
[13] "olive"        "salal berry"'

writeLines(str)
## [1] "bell pepper"  "bilberry"     "blackberry"   "blood orange"
## [5] "blueberry"    "cantaloupe"   "chili pepper" "cloudberry"  
## [9] "elderberry"   "lime"         "lychee"       "mulberry"    
## [13] "olive"        "salal berry"
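library(stringr)

# Drop the "[n]" index markers that R prints at the start of each output line
str <- str_replace_all(str, "\\[\\d+\\]\\s*", "")
# Replace the whitespace (including newlines) between adjacent items with ", ",
# leaving the spaces inside names such as "bell pepper" untouched
str <- str_replace_all(str, '"\\s+"', '", "')
# Trim any stray whitespace at either end
str <- str_trim(str)

str <- str_c("c(",str,")")
writeLines(str)
## c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")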
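An alternative way to approach the same transformation is to extract the quoted items directly instead of deleting everything around them. The sketch below uses base R only and a shortened, made-up input (`raw`) to keep it small; it is one possible approach, not the only one.

```r
# Shortened illustrative input in the same printed-vector layout (an assumption)
raw <- '[1] "bell pepper"  "bilberry"\n[3] "lime"'

# Pull out every double-quoted item, quotes included
items <- regmatches(raw, gregexpr('"[^"]+"', raw))[[1]]

# Join the items with ", " and wrap them in c( ... )
result <- paste0("c(", paste(items, collapse = ", "), ")")
writeLines(result)
## c("bell pepper", "bilberry", "lime")
```

Because only the text between quotes is extracted, spaces inside an item are preserved without any special handling.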

Question 3

3 Describe, in words, what these expressions will match:

(.)\1\1 - (xxx) matches the same character three times in a row, e.g. "aaa"

"(.)(.)\2\1" - (xyyx) matches a pair of characters followed by the same pair in reverse order: a 1st character (x), two instances of a 2nd character (yy), then the 1st character again (x), e.g. "abba"

(..)\1 - (xyxy) matches any two characters immediately repeated, so the 1st character equals the 3rd and the 2nd equals the 4th, e.g. "abab"

"(.).\1.\1" - (x.x.x) matches five characters in which the 1st, 3rd and 5th are the same and the 2nd and 4th can be anything, e.g. "abaca"

**"(.)(.)(.).*\3\2\1"** - (xyz...zyx) matches three characters, followed by zero or more characters of any kind, followed by the same three characters in reverse order (the 3rd character, the 2nd character and then the 1st character), e.g. "abccba" or "abc123cba"
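The descriptions can be sanity-checked against small test strings. The sketch below uses base R `grepl()` with `perl = TRUE`; the patterns are written as R strings, so each backslash is doubled, and the test strings and the helper `has_match` are made up for illustration.

```r
# Helper (hypothetical name): does the string contain a match for the pattern?
has_match <- function(pattern, x) grepl(pattern, x, perl = TRUE)

# Same character three times in a row
r1 <- has_match("(.)\\1\\1", c("aaa", "aab"))

# A pair of characters followed by the same pair reversed (xyyx)
r2 <- has_match("(.)(.)\\2\\1", c("abba", "abab"))

# Two characters immediately repeated (xyxy)
r3 <- has_match("(..)\\1", c("abab", "abba"))

# 1st, 3rd and 5th characters equal, 2nd and 4th arbitrary
r4 <- has_match("(.).\\1.\\1", c("abaca", "abcde"))

# Three characters, anything (possibly nothing), then the three reversed
r5 <- has_match("(.)(.)(.).*\\3\\2\\1", c("abccba", "abc123cba", "abcabc"))

print(list(r1, r2, r3, r4, r5))
```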

Question 4

4 Construct regular expressions to match words that:

Start and end with the same character.

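**- "^(.).*\\1$"** (the backslash is doubled because the pattern is written as an R string; this form requires at least two characters, so a one-letter word is not matched)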

Contain a repeated pair of letters (e.g. “church” contains “ch” repeated twice.)

**- "(..).*\\1"** (no anchors, because the repeated pair may appear anywhere in the word, not only at its start and end)

Contain one letter repeated in at least three places (e.g. “eleven” contains three “e”s.)

**- "(.).*\\1.*\\1"** (`.*` rather than `.+`, so that repeats sitting next to each other also count)
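One possible set of patterns for the three conditions can be tried on a few words with base R. The patterns are written as R strings with doubled backslashes, and the word list is made up for illustration.

```r
# Illustrative test words (an assumption, not taken from a real word list)
words <- c("kayak", "church", "banana", "eleven", "apple")

# Start and end with the same character
same_ends <- grepl("^(.).*\\1$", words, perl = TRUE)

# Contain a repeated pair of letters anywhere in the word
repeated_pair <- grepl("(..).*\\1", words, perl = TRUE)

# Contain one letter in at least three places
three_times <- grepl("(.).*\\1.*\\1", words, perl = TRUE)

print(words[same_ends])      # "kayak"
print(words[repeated_pair])  # "church" "banana"
print(words[three_times])    # "banana" "eleven"
```

"banana" shows why the repeated-pair pattern should not be anchored: its repeated "an" sits in the middle of the word, not at both ends.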

This document is available at [RPubs] and on [Github]