Question #1

knitr::opts_chunk$set(echo = TRUE)
#install.packages("tidyverse")
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# cvs import 
major_list <- read.csv("C:/Users/tiffh/Downloads/4 Questions HW/majors-list.csv")
head(major_list)
##   FOD1P                                 Major                  Major_Category
## 1  1100                   GENERAL AGRICULTURE Agriculture & Natural Resources
## 2  1101 AGRICULTURE PRODUCTION AND MANAGEMENT Agriculture & Natural Resources
## 3  1102                AGRICULTURAL ECONOMICS Agriculture & Natural Resources
## 4  1103                       ANIMAL SCIENCES Agriculture & Natural Resources
## 5  1104                          FOOD SCIENCE Agriculture & Natural Resources
## 6  1105            PLANT SCIENCE AND AGRONOMY Agriculture & Natural Resources
# use str to filter DATA + STATA 
data_stat_major <- major_list %>%
  filter(str_detect(Major, "DATA") | str_detect(Major, "STATISTICS"))
# View which majors have data or stats in name 
data_stat_major
##   FOD1P                                         Major          Major_Category
## 1  6212 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS                Business
## 2  2101      COMPUTER PROGRAMMING AND DATA PROCESSING Computers & Mathematics
## 3  3702               STATISTICS AND DECISION SCIENCE Computers & Mathematics
Management information systems and statistics,Computer programming and data processing, and Statistics and decision science where the three majors that contained data or statistics when filtered. They belong to the major category of business and computers & mathematics.

Question #2

knitr::opts_chunk$set(echo = TRUE)
# Fruit name vector 
fruit_name <- c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")

print(fruit_name)
##  [1] "bell pepper"  "bilberry"     "blackberry"   "blood orange" "blueberry"   
##  [6] "cantaloupe"   "chili pepper" "cloudberry"   "elderberry"   "lime"        
## [11] "lychee"       "mulberry"     "olive"        "salal berry"
# Fruit list assigned the fruit a number 
fruit_number <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14)

fruit_name <- c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")

fruit_list = list(fruit_number,fruit_name)
print(fruit_list)
## [[1]]
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14
## 
## [[2]]
##  [1] "bell pepper"  "bilberry"     "blackberry"   "blood orange" "blueberry"   
##  [6] "cantaloupe"   "chili pepper" "cloudberry"   "elderberry"   "lime"        
## [11] "lychee"       "mulberry"     "olive"        "salal berry"

I initially made a vector with just the fruit name, but you can also make a list just by assigning the fruit to a number.

Question #3

For the questions lets assume that there a vector C(“q”,“s”,“t”,“r”) which is a string of ” q s t r”

  1. (.)\1\1 (.) = q /1 = refers to the first variable So (.)\1\1 = qqq

  2. (.)(.)\2\1 (.) = q (.) = s \2 = refers to the second variables \1 = refers to first variable Which means (.)(.)\2\1 = qssq

  3. (..)\1 (..) = qs \1 = refers to first variables Meaning (..)\1 = qsqs

  4. (.).\1.\1 (.) = q . = s \1 = refers to first variable So (.).\1.\1 = qsqq

  5. (.)(.)(.).\3\2\1 (.) = q (.) = s (.) = t . = r \3 = refers to third variable \2 = refers to second variable \1 = refers to first variable So (.)(.)(.).*\3\2\1 = qstrtsq

Question #4

knitr::opts_chunk$set(echo = TRUE)
library(stringr)
library(dplyr)
# create a vector of 50 states 
 
states <- c("alabama", "alaska", "arizona", "arkansas", "california", "colorado", 
            "connecticut", "delaware", "florida", "georgia", "hawaii", "idaho", 
            "illinois", "indiana", "iowa", "kansas", "kentucky", "louisiana", 
            "maine", "maryland", "massachusetts", "michigan", "minnesota", 
            "mississippi", "missouri", "montana", "nebraska", "nevada", 
            "new hampshire", "new jersey", "new mexico", "new york", 
            "north carolina", "north dakota", "ohio", "oklahoma", 
            "oregon", "pennsylvania", "rhode island", "south carolina", 
            "south dakota", "tennessee", "texas", "utah", "vermont", 
            "virginia", "washington", "west virginia", "wisconsin", "wyoming")

#Start and end with the same character
str_subset(states, "^(.)((.*\\1$)|\\1?$)")
## [1] "alabama" "alaska"  "arizona" "ohio"
# Contain a repeated pair of letters
str_subset("mississippi", "([A-Za-z][A-Za-z]).*\\1")
## [1] "mississippi"
str_subset(states, "([A-Za-z][A-Za-z]).*\\1")
## [1] "mississippi"
#Contain one letter repeated in at least three places
str_subset("alabama", "([a-z]).*\\1.*\\1")
## [1] "alabama"
str_subset(states, "([a-z]).*\\1.*\\1")
##  [1] "alabama"       "alaska"        "arkansas"      "colorado"     
##  [5] "connecticut"   "illinois"      "massachusetts" "mississippi"  
##  [9] "new jersey"    "pennsylvania"  "tennessee"     "virginia"     
## [13] "west virginia"

For the last question I created a vector of the 50 states as word pool. alabama, alaska, arizona, and ohio are the four states that all begin and end with the same letter. And mississippi is the only state with repeating letter “ss”. Then there are 13 states(AL,AK,AR,CO,CT,IL,MA,MS,NJ,PA, TN,VA,WV) that have one letter that repeats at least three places.

Reference Q1. https://brshallo.github.io/r4ds_solutions/14-strings.html

Q2. https://www.geeksforgeeks.org/r-lists/

Q3/4.https://jrnold.github.io/r4ds-exercise-solutions/strings.html