This assignment will use the following packages:
library(readr)
library(stringr)
library(dplyr)
library(tidyr)
library(tidyverse)
library(ggplot2)
This section uses the list of 173 college majors published by FiveThirtyEight. The dataset is linked from this GitHub page, and a copy will also be uploaded to my GitHub page. Using the data from the CSV file, the following code identifies the majors whose names contain either “Data” or “Statistics”. The Major column is stored in uppercase, so the search pattern is uppercase as well.
# Assumes majors-list.csv is in the current working directory (see getwd())
majors <- read.csv("majors-list.csv")
head(majors)
## FOD1P Major Major_Category
## 1 1100 GENERAL AGRICULTURE Agriculture & Natural Resources
## 2 1101 AGRICULTURE PRODUCTION AND MANAGEMENT Agriculture & Natural Resources
## 3 1102 AGRICULTURAL ECONOMICS Agriculture & Natural Resources
## 4 1103 ANIMAL SCIENCES Agriculture & Natural Resources
## 5 1104 FOOD SCIENCE Agriculture & Natural Resources
## 6 1105 PLANT SCIENCE AND AGRONOMY Agriculture & Natural Resources
grep("DATA|STATISTICS", majors$Major, value = TRUE)
## [1] "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS"
## [2] "COMPUTER PROGRAMMING AND DATA PROCESSING"
## [3] "STATISTICS AND DECISION SCIENCE"
As the output above shows, only 3 of the 173 majors contain “Data” or “Statistics”.
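The search above relies on the Major column being entirely uppercase. As a minimal sketch of a case-insensitive variant, the block below runs the same kind of search on a small hypothetical vector (`sample_majors` is made up for illustration and is not the real dataset):

```r
# Hypothetical sample standing in for majors$Major; not the real dataset
sample_majors <- c(
  "GENERAL AGRICULTURE",
  "Statistics and Decision Science",
  "Computer Programming and Data Processing",
  "PLANT SCIENCE AND AGRONOMY"
)

# ignore.case = TRUE lets the pattern match "DATA", "Data", and "data" alike
hits <- grep("data|statistics", sample_majors, ignore.case = TRUE, value = TRUE)
hits
length(hits)
```

With `ignore.case = TRUE` the pattern no longer depends on how the source file capitalizes its text.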
The next code block converts the following printed vector output:
[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"
to
c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe",
"chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")
# Initialization
fruitsMain <- '[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"'
# Remove the bracketed index markers such as [1] and [5]
fruitsMod <- gsub('\\[\\d+\\]', '', fruitsMain)
# Split at the double quotes
fruitsModded <- unlist(strsplit(fruitsMod, '"'))
# Trim whitespace, then drop the empty pieces left between the quoted names
fruitsModded <- trimws(fruitsModded)
fruitsModded <- fruitsModded[fruitsModded != ""]
# Rebuild as a c(...) call; type = "cmd" gives double quotes on every platform
fruitsModded <- paste0('c(', paste(shQuote(fruitsModded, type = "cmd"), collapse = ", "), ')')
# Print the final product
cat(fruitsModded)
## c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")
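As an alternative sketch, not the approach used above, the quoted names can be captured directly with `str_match_all()` instead of splitting and filtering. The names `fruits_raw`, `fruits_vec`, and `fruits_call` are introduced here only for illustration. The last line checks that evaluating the generated string gives back the same character vector.

```r
library(stringr)

fruits_raw <- '[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"'

# Capture the text between each pair of double quotes (column 2 = first group)
fruits_vec <- str_match_all(fruits_raw, '"([^"]*)"')[[1]][, 2]

# Rebuild the c(...) call as a single string
fruits_call <- paste0("c(", paste0('"', fruits_vec, '"', collapse = ", "), ")")
cat(fruits_call)

# Round-trip check: evaluating the string should return the same vector
round_trip_ok <- identical(eval(parse(text = fruits_call)), fruits_vec)
round_trip_ok
```

Because the pattern consumes one opening and one closing quote per match, the whitespace and index markers between names are never captured.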
Each expression below is followed by a test against a sample vector and an explanation of what it matches.
(.)\\1\\1
alpha <- c("aaabcdef", "cheese", "banana", "wawawawaw", "Starlette", "Aurora", "Thalassa", "Apollo", "Bobobo", "haroldlorah")
str_view(alpha, "(.)\\1\\1", match=TRUE)
## [1] │ <aaa>bcdef
This matches any single character that appears three times in a row, such as “aaa”.
"(.)(.)\\2\\1"
str_view(alpha, "(.)(.)\\2\\1", match=TRUE)
## [5] │ Starl<ette>
## [7] │ Thal<assa>
## [8] │ Ap<ollo>
This matches two characters followed by the same two characters in reverse order, i.e. the pattern xyyx, such as “ette” in “Starlette”.
(..)\\1
str_view(alpha, "(..)\\1", match=TRUE)
## [3] │ b<anan>a
## [4] │ <wawa><wawa>w
## [9] │ B<obob>o
This matches any two-character sequence that is immediately repeated, i.e. the pattern xyxy, such as “anan” in “banana”.
"(.).\\1.\\1"
str_view(alpha, "(.).\\1.\\1", match=TRUE)
## [3] │ b<anana>
## [4] │ <wawaw>awaw
## [9] │ B<obobo>
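This matches a character that appears three times with exactly one arbitrary character between each occurrence, i.e. the pattern x.x.x, such as “anana” in “banana”. Unlike the first expression, the three occurrences are not consecutive.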
"(.)(.)(.).*\\3\\2\\1"
str_view(alpha, "(.)(.)(.).*\\3\\2\\1", match=TRUE)
## [4] │ <wawawawaw>
## [10] │ <haroldlorah>
This matches any three characters that are followed, after zero or more other characters, by the same three characters in reverse order. Palindromes of six or more characters, such as “haroldlorah”, match in full.
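The highlighted output of the five expressions above can also be checked programmatically. The following is a minimal sketch using `str_subset()`, which returns only the strings that contain a match. The labels in `patterns` are names chosen here for illustration.

```r
library(stringr)

alpha <- c("aaabcdef", "cheese", "banana", "wawawawaw", "Starlette",
           "Aurora", "Thalassa", "Apollo", "Bobobo", "haroldlorah")

# Illustrative labels for the five backreference patterns
patterns <- c(
  triple   = "(.)\\1\\1",
  xyyx     = "(.)(.)\\2\\1",
  xyxy     = "(..)\\1",
  spaced   = "(.).\\1.\\1",
  mirrored = "(.)(.)(.).*\\3\\2\\1"
)

# For each pattern, keep only the strings that contain a match
matches <- lapply(patterns, function(p) str_subset(alpha, p))
matches
```

Each element of `matches` should list the same strings that `str_view()` highlights for that pattern.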
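The remaining expressions match, in order: strings that start and end with the same character, strings that contain a two-character sequence that occurs again later, and strings in which one character appears at least three times.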
^(.).*\\1$
str_view(alpha, "^(.).*\\1$", match=TRUE)
## [4] │ <wawawawaw>
## [10] │ <haroldlorah>
(..).*\\1
str_view(alpha, "(..).*\\1", match=TRUE)
## [3] │ b<anan>a
## [4] │ <wawawawa>w
## [9] │ B<obob>o
(.).*\\1.*\\1
str_view(alpha, "(.).*\\1.*\\1", match=TRUE)
## [1] │ <aaa>bcdef
## [2] │ ch<eese>
## [3] │ b<anana>
## [4] │ <wawawawaw>
## [5] │ S<tarlett>e
## [7] │ Th<alassa>
## [9] │ B<obobo>
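All of these patterns are case-sensitive, which is why “Aurora” is not matched by `^(.).*\\1$` even though it starts and ends with the letter a. One possible workaround, sketched below as an assumption rather than part of the original exercise, is to lowercase the strings before testing them and then index the original vector.

```r
library(stringr)

alpha <- c("aaabcdef", "cheese", "banana", "wawawawaw", "Starlette",
           "Aurora", "Thalassa", "Apollo", "Bobobo", "haroldlorah")

# Test the pattern against lowercased copies, but return the original strings
same_ends <- alpha[str_detect(str_to_lower(alpha), "^(.).*\\1$")]
same_ends
```

Lowercasing first keeps the pattern itself unchanged while treating “A” and “a” as the same character.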