library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
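# readr is already attached by library(tidyverse); read.csv() below is base R.
library(readr)
# Load the majors list from GitHub.
data <- read.csv("https://raw.githubusercontent.com/sphill12/DATA607/main/majors-list%20data%20607.csv")
# Keep majors whose name contains "STATISTICS" or "DATA" (case-sensitive).
major_filtered <- data %>% filter(str_detect(Major, regex("STATISTICS|DATA")))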
major_filtered
##   FOD1P                                         Major          Major_Category
## 1  6212 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS                Business
## 2  2101      COMPUTER PROGRAMMING AND DATA PROCESSING Computers & Mathematics
## 3  3702               STATISTICS AND DECISION SCIENCE Computers & Mathematics
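To check the filter without downloading the file, here is a minimal sketch on a small hand-made data frame. The `toy` data frame, its non-matching row, and its lower-case row are assumptions for illustration; only the three matching majors come from the output above. The sketch also shows `ignore_case = TRUE`, which would catch lower-case entries if the column were not all upper case.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the downloaded data; only the Major column matters here.
toy <- data.frame(
  Major = c(
    "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
    "COMPUTER PROGRAMMING AND DATA PROCESSING",
    "STATISTICS AND DECISION SCIENCE",
    "GENERAL AGRICULTURE",
    "data science"
  ),
  stringsAsFactors = FALSE
)

# Case-sensitive match, as in the code above: keeps only the upper-case matches.
exact <- toy %>% filter(str_detect(Major, "STATISTICS|DATA"))

# Case-insensitive variant: also keeps the lower-case "data science" row.
loose <- toy %>% filter(str_detect(Major, regex("STATISTICS|DATA", ignore_case = TRUE)))

nrow(exact)
nrow(loose)
```

The case-sensitive version is enough for this file because every entry in `Major` is upper case.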
raw_str <- r"([1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry")"
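# Match each double-quoted item; the capture group is the text between the quotes.
pattern <- '"([^"]+)"'
# str_extract_all() returns a list with one character vector per input string.
find_match <- str_extract_all(raw_str, pattern)
# Drop the surrounding quote characters from every match.
final <- lapply(find_match, function(x) substr(x, 2, nchar(x) - 1))
print(final)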
## [[1]]
## [1] "bell pepper" "bilberry" "blackberry" "blood orange" "blueberry"
## [6] "cantaloupe" "chili pepper" "cloudberry" "elderberry" "lime"
## [11] "lychee" "mulberry" "olive" "salal berry"
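Since the pattern already contains a capture group, the quotes could also be dropped without `substr()`. The following is a sketch of that alternative, not the approach used above; `raw_small` is a hypothetical shortened input in the same printed-vector format.

```r
library(stringr)

# Hypothetical shortened input in the same printed-vector format as above.
raw_small <- '[1] "bell pepper" "bilberry"\n[3] "lime" "salal berry"'

# str_match_all() returns one matrix per input string.
# Column 1 is the full match (with quotes);
# column 2 is the first capture group (the text between the quotes).
matches <- str_match_all(raw_small, '"([^"]+)"')[[1]]
fruits <- matches[, 2]
fruits
```

Here `fruits` is a plain character vector rather than a list, which can be convenient if only one input string is being parsed.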
“(.)\1\1” The “(.)” captures any single character as group 1, and each “\1” then matches the same text that group 1 captured. The expression therefore matches strings containing the same character three times in a row, such as “aaa”.
“(.)(.)\2\1” The “(.)(.)” captures two characters as groups 1 and 2. The “\2\1” then requires the text of the second group followed by the text of the first group, so the pair must be followed by its reverse. This matches strings such as “abba”.
“(..)\1” The “(..)” captures any two characters as a single group, and the “\1” requires the same two characters again immediately afterwards. This matches strings such as “abab”.
“(.).\1.\1”
The “(.)” captures one character as group 1, and the “.” allows any character to follow it. The next character must equal group 1, followed by any character, and finally group 1 again. This matches strings such as “abaca”.
“(.)(.)(.).*\3\2\1*”
This expression captures three characters as groups 1, 2, and 3 with “(.)(.)(.)”. The “.*” then matches zero or more characters after those three. The string must then contain the third group followed by the second group, and the “\1*” matches zero or more occurrences of the first group. Several kinds of string fit this pattern: “abccb”, “abcxcba”, and “abccba” all match.
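The back-reference explanations above can be checked with a short sketch using `str_detect()`. In an R string each backslash has to be doubled, so the regex `\1` is written `"\\1"`. The fifth pattern is written here as the explanation describes it, with `.*` after the three groups and a trailing `\1*`; that reading, and all the test strings, are assumptions for illustration.

```r
library(stringr)

# Each entry tests one example string against the pattern it is meant to match.
matches <- c(
  triple        = str_detect("aaa",    "(.)\\1\\1"),      # same character three times
  mirrored_pair = str_detect("abba",   "(.)(.)\\2\\1"),   # pair followed by its reverse
  doubled_pair  = str_detect("abab",   "(..)\\1"),        # two characters repeated at once
  alternating   = str_detect("abaca",  "(.).\\1.\\1"),    # same character at positions 1, 3, 5
  reversed_trio = str_detect("abccba", "(.)(.)(.).*\\3\\2\\1*")
)

# Strings that should NOT match, as a sanity check.
non_matches <- c(
  triple        = str_detect("abc",  "(.)\\1\\1"),
  mirrored_pair = str_detect("abca", "(.)(.)\\2\\1")
)

matches
non_matches
```

None of these patterns is anchored, so each one matches any string that contains the described sequence anywhere, not only strings that consist of it entirely.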
A regex that matches strings starting and ending with the same character is “^(.).*\1$”
A regex that matches strings containing a repeated pair of characters is “(..).*\1”
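As a quick check of the two expressions above, here is a minimal sketch using `str_subset()`. The `test_words` vector is hypothetical and chosen only for illustration.

```r
library(stringr)

# Hypothetical test words, chosen for illustration.
test_words <- c("area", "church", "banana", "test", "dog")

# Starts and ends with the same character.
same_ends <- str_subset(test_words, "^(.).*\\1$")

# Contains some pair of characters that appears again later in the word.
repeated_pair <- str_subset(test_words, "(..).*\\1")

same_ends
repeated_pair
```

Note that “^(.).*\1$” needs both a first character and a later copy of it, so a one-letter string such as “a” is not matched.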
I was not able to get this one