Question #1

Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors whose names contain either “DATA” or “STATISTICS”.

library(tidyverse)  # provides read_csv(), filter(), str_detect() and %>%
college_majors <- read_csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv',
                           show_col_types = FALSE)
# Keep the majors whose name contains either keyword
college_majors %>% filter(str_detect(Major, "DATA|STATISTICS"))
## # A tibble: 3 × 3
##   FOD1P Major                                         Major_Category         
##   <chr> <chr>                                         <chr>                  
## 1 6212  MANAGEMENT INFORMATION SYSTEMS AND STATISTICS Business               
## 2 2101  COMPUTER PROGRAMMING AND DATA PROCESSING      Computers & Mathematics
## 3 3702  STATISTICS AND DECISION SCIENCE               Computers & Mathematics
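
The same filter can be tried without downloading the dataset. The following is a minimal base R sketch, swapping `str_detect()` for `grepl()`; the vector `sample_majors` is a hypothetical hand-typed sample, not the full list of 173 majors.

```r
# Hypothetical sample of major names (not the full dataset)
sample_majors <- c("STATISTICS AND DECISION SCIENCE",
                   "COMPUTER PROGRAMMING AND DATA PROCESSING",
                   "GENERAL AGRICULTURE")

# grepl() returns TRUE for each name that contains either keyword
matches <- sample_majors[grepl("DATA|STATISTICS", sample_majors)]
print(matches)
```

The match is case-sensitive, which suits this dataset because the major names are stored in upper case.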

Question #2

Write code that transforms the data below:

[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"

Into a format like this:

c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")

## [1] "[1] \"bell pepper\"  \"bilberry\"     \"blackberry\"   \"blood orange\"\n[5] \"blueberry\"    \"cantaloupe\"   \"chili pepper\" \"cloudberry\" \n[9] \"elderberry\"   \"lime\"         \"lychee\"       \"mulberry\" \n[13] \"olive\"        \"salal berry\""
## [1] "c(\"bell pepper\", \"bilberry\", \"blackberry\", \"blood orange\", \"blueberry\", \"cantaloupe\", \"chili pepper\", \"cloudberry\", \"elderberry\", \"lime\", \"lychee\", \"mulberry\", \"olive\", \"salal berry\")"
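
The code that produced the two outputs above is not shown. One way the transformation could be done is sketched below, assuming the printed vector is stored as a single string; the names `raw_data`, `items` and `result` are illustrative only.

```r
library(stringr)

# The printed vector from the question, held as one raw string (assumed input form)
raw_data <- '[1] "bell pepper"  "bilberry"     "blackberry"   "blood orange"
[5] "blueberry"    "cantaloupe"   "chili pepper" "cloudberry"
[9] "elderberry"   "lime"         "lychee"       "mulberry"
[13] "olive"        "salal berry"'

# Extract every double-quoted item, keeping its quotes;
# the [1], [5], ... index markers are never quoted, so they are skipped
items <- str_extract_all(raw_data, '"[^"]+"')[[1]]

# Join the quoted items with commas and wrap the result in c( ... )
result <- str_c("c(", str_c(items, collapse = ", "), ")")
writeLines(result)
```

Matching on the quotes rather than on whitespace keeps two-word items such as "bell pepper" together.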

Question #3

Describe, in words, what these expressions will match:

This matches the same character three times in a row, e.g. in the string “422-3777” it would match 777.

This matches a pair of characters followed by the same pair in reverse order, e.g. in the string “gamma” it would match amma.

This matches any two characters that are immediately repeated, e.g. in the string “cucumber” it would match cucu.

This matches a character, followed by any character, followed by the first character again, followed by any character, followed by the first character a third time, e.g. in the string “707372456” it would match 70737.

This matches any three characters, followed by zero or more characters, followed by the same three characters in reverse order (3rd, then 2nd, then 1st), e.g. in the string “4563abc45643cba1231” it would match abc45643cba.
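
The expressions being described are not reproduced above, so the check below uses patterns inferred from the five descriptions. Treat them as an assumption rather than the original question text. Each pattern is run against the example string from the matching answer.

```r
library(stringr)

# Example strings from the answers, paired with patterns inferred
# from the descriptions (assumed, not copied from the question)
examples <- c(
  "422-3777"            = "(.)\\1\\1",
  "gamma"               = "(.)(.)\\2\\1",
  "cucumber"            = "(..)\\1",
  "707372456"           = "(.).\\1.\\1",
  "4563abc45643cba1231" = "(.)(.)(.).*\\3\\2\\1"
)

# str_extract() is vectorised over both the strings and the patterns
matches <- str_extract(names(examples), unname(examples))
print(setNames(matches, names(examples)))
```

The backslashes are doubled because each pattern is written as an R string; the regular expression itself contains a single backslash before each group number.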

Question #4

Construct regular expressions to match words that:

^(.).*\1$

(..).*\1

(.).\1.\1
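
The three patterns can be tried on a few words to see what each one accepts. This is a rough sketch: `words_sample` is a hypothetical word list, and the last line adds a variant pattern, `(.).*\1.*\1`, which is not one of the answers above and is included only for comparison.

```r
library(stringr)

# Hypothetical sample words chosen to exercise each pattern
words_sample <- c("area", "church", "eleven", "banana", "pepper", "dog")

# First and last character are the same
same_ends <- str_subset(words_sample, "^(.).*\\1$")

# Some pair of characters occurs again later in the word
repeated_pair <- str_subset(words_sample, "(..).*\\1")

# One character appears three times with exactly one character between occurrences
spaced_triple <- str_subset(words_sample, "(.).\\1.\\1")

# Variant (not from the answers): the same character three times with any gaps
any_triple <- str_subset(words_sample, "(.).*\\1.*\\1")

print(list(same_ends, repeated_pair, spaced_triple, any_triple))
```

On this sample the third pattern accepts "eleven" and "banana" but not "pepper", because the three p's in "pepper" are not each separated by exactly one character; the variant with `.*` accepts all three words.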