library(tidyverse)
library(stringr)
Major <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv",header=TRUE,sep=",")
head(Major)
## FOD1P Major Major_Category
## 1 1100 GENERAL AGRICULTURE Agriculture & Natural Resources
## 2 1101 AGRICULTURE PRODUCTION AND MANAGEMENT Agriculture & Natural Resources
## 3 1102 AGRICULTURAL ECONOMICS Agriculture & Natural Resources
## 4 1103 ANIMAL SCIENCES Agriculture & Natural Resources
## 5 1104 FOOD SCIENCE Agriculture & Natural Resources
## 6 1105 PLANT SCIENCE AND AGRONOMY Agriculture & Natural Resources
Using str_detect() to filter the majors list for either keyword, I found that DATA appears in one major and STATISTICS appears in two.
Major1 <- Major %>%
  select(Major) %>%
  filter(str_detect(Major, "(DATA|STATISTICS)"))
Major1
## Major
## 1 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS
## 2 COMPUTER PROGRAMMING AND DATA PROCESSING
## 3 STATISTICS AND DECISION SCIENCE
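As a rough check of those counts, each keyword can be tallied on its own. The sketch below uses a small hand-typed vector, majors_sample, which is a hypothetical stand-in for the Major column (the three matching rows above plus two rows from head(Major)), so it runs without downloading the CSV.

```r
library(stringr)

# Hypothetical sample: the three matching majors plus two non-matching rows
majors_sample <- c(
  "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
  "COMPUTER PROGRAMMING AND DATA PROCESSING",
  "STATISTICS AND DECISION SCIENCE",
  "GENERAL AGRICULTURE",
  "FOOD SCIENCE"
)

# str_detect() returns one TRUE/FALSE per element, so sum() counts the matches
n_data   <- sum(str_detect(majors_sample, "DATA"))
n_stats  <- sum(str_detect(majors_sample, "STATISTICS"))
n_either <- sum(str_detect(majors_sample, "DATA|STATISTICS"))
```

On this sample n_data is 1, n_stats is 2, and n_either is 3, which agrees with the filtered output above.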
Fruits <- c("bell pepper","bilberry","blackberry","blood orange","blueberry","cantaloupe","chili pepper","cloudberry","elderberry","lime","lychee","mulberry","olive","salal berry")
dput(Fruits)
## c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry",
## "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime",
## "lychee", "mulberry", "olive", "salal berry")
What each backreference pattern matches (patterns are shown as plain regular expressions; inside an R string every backslash is doubled, e.g. "(.)\\1\\1"):
(.)\1\1 matches any single character repeated three times in a row, e.g. "aaa".
(.)(.)\2\1 matches a pair of characters followed by the same pair in reverse order, e.g. "eppe" in "peppers".
(..)\1 matches any two characters that are immediately repeated, e.g. "anan" in "banana".
(.).\1.\1 matches a character, then any character, then the original character, then any character, then the original character again, e.g. "acada".
(.)(.)(.).*\3\2\1 matches three characters, followed by zero or more characters, followed by the same three characters in reverse order, e.g. "abccba".
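The descriptions above can be checked with a small sketch. The pattern names (triple, mirrored_pair, and so on) and the test strings are made up here for illustration; str_detect() is the only stringr call used.

```r
library(stringr)

# Illustrative names for the five backreference patterns (backslashes doubled for R)
patterns <- c(
  triple        = "(.)\\1\\1",
  mirrored_pair = "(.)(.)\\2\\1",
  repeated_pair = "(..)\\1",
  every_other   = "(.).\\1.\\1",
  mirrored_trio = "(.)(.)(.).*\\3\\2\\1"
)
test_strings <- c("aaa", "peppers", "banana", "acada", "abccba", "lime")

# Rows are test strings, columns are patterns; TRUE means the pattern matches
hits <- sapply(patterns, function(p) str_detect(test_strings, p))
rownames(hits) <- test_strings
```

Printing hits gives a logical matrix in which, for instance, hits["peppers", "mirrored_pair"] is TRUE and every entry in the "lime" row is FALSE.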
str_subset(words,"^(.)((.*\\1$)|\\1$)")
## [1] "america" "area" "dad" "dead" "depend"
## [6] "educate" "else" "encourage" "engine" "europe"
## [11] "evidence" "example" "excuse" "exercise" "expense"
## [16] "experience" "eye" "health" "high" "knock"
## [21] "level" "local" "nation" "non" "rather"
## [26] "refer" "remember" "serious" "stairs" "test"
## [31] "tonight" "transport" "treat" "trust" "window"
## [36] "yesterday"
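The pattern above selects words that start and end with the same character. One observation, offered as a sketch and not as a correction: because .* can match zero characters, the second alternative \\1$ appears to be covered by the first, so a shorter pattern should select the same words. The names original, simpler, same_result, and ends_like_start are introduced here for illustration; stringr::words is the word list bundled with stringr.

```r
library(stringr)

original <- "^(.)((.*\\1$)|\\1$)"
simpler  <- "^(.).*\\1$"   # assumption: equivalent, since .* may match nothing

# Compare the two patterns on the bundled word list
same_result <- identical(
  str_subset(stringr::words, original),
  str_subset(stringr::words, simpler)
)

# A single-character string does not match: the backreference needs a second copy
ends_like_start <- str_subset(c("dad", "aa", "dog", "a"), simpler)
```

Here same_result is TRUE and ends_like_start contains only "dad" and "aa".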
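# "ch" occurs twice in "church"
str_view("church", "([A-Za-z][A-Za-z]).*\\1")
# "e" occurs three times in "eleven"; the trailing . requires one more character after the third "e"
str_view("eleven","([a-z]).*\\1.*\\1.")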
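str_view() only highlights matches, so a quick sketch with str_extract() and str_detect() can make the behaviour easier to inspect. The variable names below, and the variant pattern without the trailing dot, are assumptions added for illustration; "elevate" is a made-up test word whose third "e" is its last letter.

```r
library(stringr)

repeated_pair   <- "([A-Za-z][A-Za-z]).*\\1"
three_then_more <- "([a-z]).*\\1.*\\1."   # pattern as written above
three_times     <- "([a-z]).*\\1.*\\1"    # hypothetical variant without the trailing "."

# The match runs from the first "ch" to the second one
pair_match <- str_extract("church", repeated_pair)

# The trailing "." needs a character after the third repeat
with_dot    <- str_detect(c("eleven", "elevate"), three_then_more)
without_dot <- str_detect(c("eleven", "elevate"), three_times)
```

pair_match is "church". with_dot is TRUE for "eleven" but FALSE for "elevate", while without_dot is TRUE for both, so the variant without the dot may be the better fit if the goal is any word with a letter in three places.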