Introduction

This homework assignment consists of four problems.

Load packages

library(tidyverse)  # also attaches dplyr, so a separate library(dplyr) call is not needed
library(openintro)

Exercise 1

The first problem is to identify the majors whose names contain either “DATA” or “STATISTICS” in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/].

majors <- read.csv('https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/college-majors/majors-list.csv')

majors_data_stat <- majors %>%
  filter(grepl("DATA|STATISTICS", Major))

print(majors_data_stat)
##   FOD1P                                         Major          Major_Category
## 1  6212 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS                Business
## 2  2101      COMPUTER PROGRAMMING AND DATA PROCESSING Computers & Mathematics
## 3  3702               STATISTICS AND DECISION SCIENCE Computers & Mathematics
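
To see how the alternation in the pattern behaves without downloading the dataset, here is a minimal base R sketch. The vector sample_majors is a hypothetical stand-in for the Major column: the first two names are taken from the output above, and the last two are made up for contrast.

```r
# Hypothetical stand-in for the Major column (last two entries are illustrative only)
sample_majors <- c(
  "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
  "COMPUTER PROGRAMMING AND DATA PROCESSING",
  "GENERAL BUSINESS",
  "Applied Statistics"
)

# "|" means either alternative may match; matching is case-sensitive by default
case_sensitive <- grepl("DATA|STATISTICS", sample_majors)

# ignore.case = TRUE also picks up mixed-case spellings
case_insensitive <- grepl("DATA|STATISTICS", sample_majors, ignore.case = TRUE)

print(case_sensitive)
print(case_insensitive)
```

The dataset stores major names in upper case, so the case-sensitive form is sufficient there; the second call only matters if the input might be mixed case.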

Exercise 2

The second problem is to transform the fruit data into the given format.

fruits <- c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")

print(fruits)
##  [1] "bell pepper"  "bilberry"     "blackberry"   "blood orange" "blueberry"   
##  [6] "cantaloupe"   "chili pepper" "cloudberry"   "elderberry"   "lime"        
## [11] "lychee"       "mulberry"     "olive"        "salal berry"
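
The target format is not shown in this write-up. Assuming it is a c("...", "...") expression built from the printed console output, one possible sketch of the transformation looks like this. The names raw_output, quoted_items, fruit_expr, and fruits_rebuilt are illustrative and not part of the assignment.

```r
# Printed console output used as the starting point (assumed input format)
raw_output <- '[1] "bell pepper"  "bilberry"     "blackberry"   "blood orange" "blueberry"
[6] "cantaloupe"   "chili pepper" "cloudberry"   "elderberry"   "lime"
[11] "lychee"       "mulberry"     "olive"        "salal berry"'

# Extract every double-quoted item, keeping the quotes
quoted_items <- regmatches(raw_output, gregexpr('"[^"]+"', raw_output))[[1]]

# Join the quoted items into the text of a single c(...) call
fruit_expr <- paste0("c(", paste(quoted_items, collapse = ", "), ")")
cat(fruit_expr, "\n")

# Parsing and evaluating that text yields an ordinary character vector
fruits_rebuilt <- eval(parse(text = fruit_expr))
```

Extracting the quoted items directly avoids having to strip the [1], [6], and [11] index markers and the padding spaces separately.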

Exercise 3

Describe, in words, what these expressions will match:

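(.)\1\1   Read as a regular expression (written "(.)\\1\\1" as an R string), this matches the same character three times in a row, e.g. "aaa".

"(.)(.)\\2\\1"  This matches two characters followed by the same two characters in reverse order, i.e. a four-character palindrome such as "abba".

(..)\1    Read as a regular expression (written "(..)\\1" as an R string), this matches any two characters that are immediately repeated, e.g. "abab".

"(.).\\1.\\1" This matches a character that appears three times with exactly one arbitrary character between consecutive occurrences, e.g. "abaca". The characters in between need not differ from it, so "aaaaa" also matches.

"(.)(.)(.).*\\3\\2\\1" This matches three characters, followed by zero or more arbitrary characters, followed by the same three characters in reverse order, e.g. "abccba" or "abc123cba".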

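The descriptions above can be checked against a few test strings. The sketch below is one way to do that in base R; the test strings and the helper name matches are illustrative choices, and perl = TRUE is used only to make the backreference behaviour explicit.

```r
# The five patterns from this exercise, written as R strings
patterns <- c(
  triple      = "(.)\\1\\1",
  palindrome4 = "(.)(.)\\2\\1",
  double_pair = "(..)\\1",
  alternating = "(.).\\1.\\1",
  mirrored3   = "(.)(.)(.).*\\3\\2\\1"
)

# Hypothetical helper: does the named pattern match the string?
matches <- function(name, string) grepl(patterns[[name]], string, perl = TRUE)

matches("triple", "aaa")            # same character three times in a row
matches("palindrome4", "abba")      # two characters, then the same two reversed
matches("double_pair", "abab")      # a pair that is immediately repeated
matches("alternating", "abaca")     # "a" at positions 1, 3, and 5
matches("alternating", "aaaaa")     # the characters in between may be the same
matches("mirrored3", "abc123cba")   # "abc" ... "cba"
matches("mirrored3", "abcabc")      # no reversed ending, so no match
```
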
Exercise 4

Construct regular expressions to match words that:

Start and end with the same character.

"^(.).*\\1$"

Contain a repeated pair of letters (e.g. "church" contains "ch" repeated twice.)

"(..).*\\1"

Contain one letter repeated in at least three places (e.g. "eleven" contains three "e"s.)

"(.).*\\1.*\\1"

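As a quick check, the three expressions can be run against a handful of words. This is a minimal sketch; the word list and the variable names are illustrative and not part of the assignment.

```r
# Illustrative test words
words <- c("dad", "church", "eleven", "banana", "apple", "a")

# Patterns from this exercise
same_ends     <- "^(.).*\\1$"
repeated_pair <- "(..).*\\1"
three_times   <- "(.).*\\1.*\\1"

# Keep only the words each pattern matches
words_same_ends     <- words[grepl(same_ends, words, perl = TRUE)]
words_repeated_pair <- words[grepl(repeated_pair, words, perl = TRUE)]
words_three_times   <- words[grepl(three_times, words, perl = TRUE)]

print(words_same_ends)
print(words_repeated_pair)
print(words_three_times)
```

Note that the first pattern needs at least two characters, so the one-letter word "a" is not matched. If single-letter words should count, a variant such as "^(.)(.*\\1)?$" would cover them.
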
End