1.Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS”
library(readr)
majors_list <- read_csv("~/Desktop/majors-list.csv")
## Rows: 174 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): FOD1P, Major, Major_Category
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(majors_list)
## # A tibble: 6 × 3
## FOD1P Major Major_Category
## <chr> <chr> <chr>
## 1 1100 GENERAL AGRICULTURE Agriculture & Natural Resources
## 2 1101 AGRICULTURE PRODUCTION AND MANAGEMENT Agriculture & Natural Resources
## 3 1102 AGRICULTURAL ECONOMICS Agriculture & Natural Resources
## 4 1103 ANIMAL SCIENCES Agriculture & Natural Resources
## 5 1104 FOOD SCIENCE Agriculture & Natural Resources
## 6 1105 PLANT SCIENCE AND AGRONOMY Agriculture & Natural Resources
FOD1P Major Major_Category
52 2101 COMPUTER PROGRAMMING AND DATA PROCESSING Computers & Mathematics
FOD1P Major Major_Category
44 6212 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS Business
59 3702 STATISTICS AND DECISION SCIENCE Computers & Mathematics
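The code that produced the three matches above does not appear in this write-up. Here is a minimal sketch of one way to get them, assuming base R's grepl() and a small stand-in data frame (the hypothetical `majors`, built only from rows shown in this document) instead of the full CSV:

```r
# Hypothetical stand-in for majors_list, using rows shown in this document
majors <- data.frame(
  FOD1P = c("1100", "2101", "6212", "3702"),
  Major = c("GENERAL AGRICULTURE",
            "COMPUTER PROGRAMMING AND DATA PROCESSING",
            "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
            "STATISTICS AND DECISION SCIENCE"),
  Major_Category = c("Agriculture & Natural Resources",
                     "Computers & Mathematics",
                     "Business",
                     "Computers & Mathematics"),
  stringsAsFactors = FALSE
)

# "DATA|STATISTICS" matches a Major that contains either keyword
matches <- majors[grepl("DATA|STATISTICS", majors$Major), ]
print(matches$Major)
```

With the full dataset loaded as above, the same filter would read majors_list[grepl("DATA|STATISTICS", majors_list$Major), ].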
2.Write code that transforms the data below:
[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"
Into a format like this:
c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")
no1 <- c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")
dput(as.character(no1))
## c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry",
## "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime",
## "lychee", "mulberry", "olive", "salal berry")
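Typing the vector by hand works, but the task can also be read as parsing the printed text itself. Here is a sketch of that reading, assuming the console output has been pasted into a single string (the hypothetical `raw`) with straight double quotes:

```r
# Printed console output pasted in as one string (hypothetical input)
raw <- '[1] "bell pepper"  "bilberry"     "blackberry"   "blood orange"
[5] "blueberry"    "cantaloupe"   "chili pepper" "cloudberry"
[9] "elderberry"   "lime"         "lychee"       "mulberry"
[13] "olive"        "salal berry"'

# Extract every double-quoted run, then strip the quote characters
quoted <- regmatches(raw, gregexpr('"[^"]*"', raw))[[1]]
fruits <- gsub('"', "", quoted, fixed = TRUE)

# dput() prints the vector in c("...", "...") form
dput(fruits)
```

The index markers such as [1] and [5] are skipped automatically because only quoted text is extracted.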
3.Describe, in words, what these expressions will match:
(.)\1\1
Answer: The same character three times in a row, e.g. “aaa”.
"(.)(.)\2\1"
Answer: Two characters followed by the same two characters in reverse order, e.g. “abba”.
(..)\1
Answer: Any two characters repeated immediately, e.g. “abab”.
"(.).\1.\1"
Answer: The same character three times, with exactly one arbitrary character between consecutive occurrences, e.g. “abaca”.
"(.)(.)(.).*\3\2\1"
Answer: Three characters, then zero or more characters, then the same three characters in reverse order, e.g. “abcxyzcba”.
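These descriptions can be checked against a few sample strings. The sketch below treats each expression as a regular expression (so every backslash is doubled inside the R string) and uses made-up pattern names and sample words that are not part of the original exercise:

```r
# The five expressions, written as R strings (names are illustrative only)
patterns <- c(
  triple      = "(.)\\1\\1",
  palindrome4 = "(.)(.)\\2\\1",
  pair_twice  = "(..)\\1",
  alternate   = "(.).\\1.\\1",
  mirror3     = "(.)(.)(.).*\\3\\2\\1"
)

# Hypothetical sample strings, one intended hit per pattern plus a non-match
samples <- c("aaa", "abba", "abab", "abaca", "abcxyzcba", "abc")

# Logical matrix: one row per sample, one column per pattern
hits <- sapply(patterns, function(p) grepl(p, samples))
rownames(hits) <- samples
print(hits)
```

Each sample string should match only the pattern it was written for, and “abc” should match none of them.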
4.Construct regular expressions to match words that:
-Start and end with the same character. “^(.).+\1$”
-Contain a repeated pair of letters (e.g. “church” contains “ch” repeated twice.) "\b\w*(\w{2})\w*\1"
-Contain one letter repeated in at least three places (e.g. “eleven” contains three “e”s.) "([a-z]).*\1.*\1"
library(tidyr)
test_words = list("banana", "peep", "strawberry", "cucumber", "olive", "test")
regex1 = "^(.).+\\1$"
Filter(function(x) any(grepl(regex1, x)), test_words)
## [[1]]
## [1] "peep"
##
## [[2]]
## [1] "test"
regex2 = "\\b\\w*(\\w{2})\\w*\\1"
Filter(function(x) any(grepl(regex2, x)), test_words)
## [[1]]
## [1] "banana"
##
## [[2]]
## [1] "cucumber"
# A captured letter must occur twice more later in the word (not necessarily adjacent)
regex3 = "([a-z]).*\\1.*\\1"
Filter(function(x) any(grepl(regex3, x)), test_words)
## [[1]]
## [1] "banana"
##
## [[2]]
## [1] "strawberry"
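As a cross-check, the same three tasks can be run on a plain character vector with grep(value = TRUE), adding the two example words from the prompts (“church” and “eleven”). This is only a sketch: the variable names are made up, and the third pattern, ([a-z]).*\1.*\1, is one possible way to express "the same letter in three places":

```r
# Test words from above plus the two examples named in the prompts
words <- c("banana", "peep", "strawberry", "cucumber", "olive", "test",
           "church", "eleven")

same_ends     <- "^(.).+\\1$"              # first and last character equal
repeated_pair <- "\\b\\w*(\\w{2})\\w*\\1"  # some two-character pair occurs twice
three_places  <- "([a-z]).*\\1.*\\1"       # some letter occurs at least three times

# value = TRUE returns the matching words instead of their positions
ends_hits  <- grep(same_ends, words, value = TRUE)
pair_hits  <- grep(repeated_pair, words, value = TRUE)
three_hits <- grep(three_places, words, value = TRUE)

print(ends_hits)
print(pair_hits)
print(three_hits)
```

A character vector avoids the list wrapper and the Filter() call, since grep() is already vectorised over its input.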
https://github.com/Gunduzhazal/https-rpubs.com-gunduzhazal-808190