R Markdown

Task 1

Provide code that identifies the majors that contain either “DATA” or “STATISTICS”.

Load the dataset (replace 'recent-grads.csv' with the path to your own copy of the file)

your_data <- read.csv("recent-grads.csv")

Extract the ‘Major’ column from your dataset

majors <- your_data$Major

Define a regular expression pattern to match “DATA” or “STATISTICS”

pattern <- "(DATA|STATISTICS)"

Use grepl() to check if each major matches the pattern

matches <- grepl(pattern, majors, ignore.case = TRUE)

Extract the rows (majors) that match the pattern

matching_majors <- your_data[matches, ]
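The filtering steps above can be tried without the CSV file. The following is a minimal, self-contained sketch that assumes a small hypothetical data frame, `sample_data`, standing in for recent-grads.csv; the major names in it are illustrative values only, and `is_match` and `matching_names` are names introduced here.

```r
# Hypothetical stand-in for recent-grads.csv (illustrative values only)
sample_data <- data.frame(
  Major = c("PETROLEUM ENGINEERING",
            "STATISTICS AND DECISION SCIENCE",
            "COMPUTER PROGRAMMING AND DATA PROCESSING",
            "ART HISTORY"),
  stringsAsFactors = FALSE
)

# TRUE for each major whose name contains "DATA" or "STATISTICS"
is_match <- grepl("DATA|STATISTICS", sample_data$Major, ignore.case = TRUE)

# Keep only the names of the matching majors
matching_names <- sample_data$Major[is_match]
print(matching_names)
```

Note that grepl() matches substrings, so a name containing a longer word such as "DATABASE" would also be selected by this pattern.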

Task 2

Given list of items

items <- c("bell pepper", "bilberry", "blackberry", "blood orange", "blueberry", "cantaloupe", "chili pepper", "cloudberry", "elderberry", "lime", "lychee", "mulberry", "olive", "salal berry")

Convert the vector to a single string formatted as an R c(...) call

formatted_items <- paste0('c("', paste(items, collapse = '", "'), '")')
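To see what the paste-based formatting produces, the sketch below runs it on a shortened copy of the list and then checks that the resulting string parses back into the same vector. The shortened list and the name `round_trip` are assumptions made for illustration.

```r
# Shortened list for illustration
items <- c("bell pepper", "bilberry", "blackberry")

# Build the text of a c(...) call: opening c(", items joined by ", ", closing ")
formatted_items <- paste0('c("', paste(items, collapse = '", "'), '")')

# writeLines() prints the string without escaping the inner quotes
writeLines(formatted_items)

# Round-trip check: parsing and evaluating the string should give back the vector
round_trip <- eval(parse(text = formatted_items))
```

Single-quoted string literals are used so that the double quotes inside them need no backslash escapes.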

Task 3

Describe, in words, what these expressions will match:

(.)\1\1
"(.)(.)\\2\\1"
(..)\1
"(.).\\1.\\1"
"(.)(.)(.).*\\3\\2\\1"

Sample text to match against

text <- c("baaaab", "abbab", "ababab", "ababa", "abcddcba")

Regular expressions

pattern1 <- "(.)\\1\\1"
pattern2 <- "(.)(.)\\2\\1"
pattern3 <- "(..)\\1"
pattern4 <- "(.).\\1.\\1"
pattern5 <- "(.)(.)(.).*\\3\\2\\1"

Apply the regular expressions and print matches

cat("Pattern 1 matches:", grep(pattern1, text, value = TRUE), "\n")
cat("Pattern 2 matches:", grep(pattern2, text, value = TRUE), "\n")
cat("Pattern 3 matches:", grep(pattern3, text, value = TRUE), "\n")
cat("Pattern 4 matches:", grep(pattern4, text, value = TRUE), "\n")
cat("Pattern 5 matches:", grep(pattern5, text, value = TRUE), "\n")
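The task asks for a description in words, so the sketch below pairs each pattern with a plain-language reading (given as comments, and offered as one interpretation) and tabulates which sample strings match. The names `patterns` and `match_table` are introduced here for illustration; backslashes are doubled because the patterns are written as R string literals.

```r
text <- c("baaaab", "abbab", "ababab", "ababa", "abcddcba")

patterns <- c(
  "(.)\\1\\1",             # the same character three times in a row, e.g. "aaa"
  "(.)(.)\\2\\1",          # two characters followed by the same two reversed, e.g. "abba"
  "(..)\\1",               # a pair of characters repeated immediately, e.g. "abab"
  "(.).\\1.\\1",           # the same character in places 1, 3 and 5 of a five-character run, e.g. "ababa"
  "(.)(.)(.).*\\3\\2\\1"   # three characters, then anything, then the same three reversed
)

# Rows are sample strings, columns are patterns; TRUE marks a match
match_table <- sapply(patterns, function(p) grepl(p, text))
rownames(match_table) <- text
print(match_table)
```

Reading the table column by column shows, for each pattern, which of the sample strings contain the described structure somewhere inside them.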

Task 4

Construct regular expressions to match words that:

1. Start and end with the same character.

Regular expression to match words that start and end with the same character

start_end_same_char <- "^([a-zA-Z]).*\\1$"

Sample words to match against

words <- c("racecar", "hello", "apple", "banana", "civic")

Apply the regular expression and print matches

matching_words <- grep(start_end_same_char, words, value = TRUE)
cat("Words that start and end with the same character:", matching_words, "\n")
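One edge case is worth noting: a pattern of the form letter, anything, same letter needs at least two characters, so a one-letter word such as "a" is not matched even though it trivially starts and ends with the same character. A possible variant, sketched below under the assumption that one-letter words should count, makes the tail optional. `same_ends` is a name introduced here, and `perl = TRUE` is used so that the optional group containing a backreference is handled by the PCRE engine.

```r
# Either a single letter, or a letter followed by anything and the same letter
same_ends <- "^([a-zA-Z])(.*\\1)?$"

# Sample words, with a one-letter word added for illustration
words <- c("racecar", "hello", "apple", "banana", "civic", "a")

matching_words <- grep(same_ends, words, value = TRUE, perl = TRUE)
print(matching_words)
```

Whether one-letter words should qualify depends on how the task is read; if they should not, the stricter two-character pattern is the right choice.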

2. Contain a repeated pair of letters (e.g. “church” contains “ch” repeated twice.)

Regular expression to match words that contain a repeated pair of letters

repeated_pair <- ".*([a-zA-Z]{2}).*\\1.*"

Sample words to match against

words <- c("church", "apple", "banana", "successful", "book")

Apply the regular expression and print matches

matching_words <- grep(repeated_pair, words, value = TRUE)
cat("Words that contain a repeated pair of letters:", matching_words, "\n")
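To check which pair of letters is actually repeated in each word, the capture group can be extracted with regexec() and regmatches(). This is a minimal sketch; it uses a shorter pattern without surrounding wildcards, since the search already looks anywhere in the string, and the names `hits` and `pairs` are introduced here for illustration.

```r
# Two letters, anything in between, then the same two letters again
repeated_pair <- "([a-zA-Z]{2}).*\\1"
words <- c("church", "apple", "banana", "successful", "book")

# For each word: the full match followed by the capture group,
# or an empty character vector when the word does not match
hits <- regmatches(words, regexec(repeated_pair, words))

# Take the capture group (the repeated pair), or NA when there is no match
pairs <- vapply(hits,
                function(h) if (length(h) >= 2) h[2] else NA_character_,
                character(1))
names(pairs) <- words
print(pairs)
```

Words with an NA entry contain no repeated pair under this pattern, for example "book", where "oo" is a doubled letter but the pair "oo" occurs only once.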

3. Contain one letter repeated in at least three places (e.g. “eleven” contains three “e”s.)

Define a list of words

word_list <- c("eleven", "apple", "banana", "committee", "success", "address")

Function to check if a word contains one letter repeated in at least three places

contains_repeated_letter <- function(word) {
  # Convert the word to lowercase for case insensitivity
  word <- tolower(word)
  # Split the word into individual characters
  chars <- strsplit(word, "")[[1]]
  # Create a table of letter frequencies
  letter_counts <- table(chars)
  # Check if any letter occurs at least three times
  any(letter_counts >= 3)
}

Find words that meet the criteria

result <- Filter(contains_repeated_letter, word_list)
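The task asks for regular expressions, while the function above counts letters instead. A regex-based alternative is sketched below as one possible approach: a captured letter must reappear twice more, with anything in between. The names `three_times` and `regex_result` are introduced here for illustration.

```r
word_list <- c("eleven", "apple", "banana", "committee", "success", "address")

# A letter, then the same letter again, then once more, with anything in between
three_times <- "([a-zA-Z]).*\\1.*\\1"

# Lowercase the words before matching so that "E" and "e" count as the same letter
regex_result <- word_list[grepl(three_times, tolower(word_list))]
print(regex_result)
```

On this word list the regex selects the same words as a count-based check: those in which some letter occurs three or more times.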