R Character Manipulation

Question 1

Start by reading the csv into a data frame. Then use str_view to see all the majors that contain either DATA or STATISTICS.

library(tidyverse)

## -- Attaching packages ------------------------------------------------ tidyverse 1.3.0 --

## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.0     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0

## -- Conflicts --------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

majors <- read.csv("majors.csv", stringsAsFactors = FALSE)
str_view(majors$Major, "(DATA)|(STATISTICS)", match = TRUE)

You could also return the matching majors as a character vector by indexing with str_detect instead of viewing them with str_view.

majors$Major[str_detect(majors$Major, "(DATA)|(STATISTICS)")]

## [1] "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS"
## [2] "COMPUTER PROGRAMMING AND DATA PROCESSING"     
## [3] "STATISTICS AND DECISION SCIENCE"
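
As a further variation, stringr's str_subset folds the detect-and-index step into a single call. The sketch below assumes a small hand-made vector, majors_sample, standing in for majors$Major so that it runs without majors.csv; the fourth entry is only an illustrative non-match.

```r
library(stringr)

# Hypothetical stand-in for majors$Major (assumption, so the example is self-contained)
majors_sample <- c(
  "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
  "COMPUTER PROGRAMMING AND DATA PROCESSING",
  "STATISTICS AND DECISION SCIENCE",
  "GENERAL AGRICULTURE"
)

# str_subset() returns only the elements that match the pattern
matches <- str_subset(majors_sample, "DATA|STATISTICS")
print(matches)
```

The parentheses around DATA and STATISTICS are optional here, because the alternation already covers each whole word.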

Question 2

First, load the string into variable x.

x <- '[1] "bell pepper"  "bilberry"     "blackberry"   "blood orange"
[5] "blueberry"    "cantaloupe"   "chili pepper" "cloudberry"  
[9] "elderberry"   "lime"         "lychee"       "mulberry"    
[13] "olive"        "salal berry"'

Next, split the string on double quotation marks, since each fruit name begins and ends with one.

y <- str_split(x, '"')

Finally, take every other element of the split result, starting with the second, and store those elements in the character vector z.

z <- y[[1]][c(FALSE, TRUE)]

# Examples
z[1]

## [1] "bell pepper"

z[7]

## [1] "chili pepper"

z[12]

## [1] "mulberry"
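
To see why taking every other element works, it may help to run the same steps on a shorter string. The sketch below assumes a hypothetical three-fruit input, x_small. It also shows one possible alternative that captures the text between quotation marks with str_match_all; this is offered as a cross-check, not as the method used above.

```r
library(stringr)

# Shortened, hypothetical input in the same printed-vector layout
x_small <- '[1] "bell pepper"  "bilberry"\n[3] "lime"'

# Splitting on the double quote alternates between text outside and inside the quotes
pieces <- str_split(x_small, '"')[[1]]
print(pieces)

# c(FALSE, TRUE) is recycled along the vector, so elements 2, 4, 6, ... are kept
fruits <- pieces[c(FALSE, TRUE)]
print(fruits)

# Alternative: capture whatever sits between each pair of double quotes
fruits_alt <- str_match_all(x_small, '"([^"]+)"')[[1]][, 2]
print(fruits_alt)
```

Both approaches rely on the fruit names themselves containing no double quotation marks.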

Question 3

This will match any character followed by \1\1, such as “a\1\1”.
This will match any expression, wrapped in quotation marks, whose last two characters are the first two characters in reverse order, such as ‘“anna”’.
This will match any two characters followed by \1, such as “ab\1”.
This will match any expression, wrapped in quotation marks, in which the same character appears in every other position three times, such as ‘“abaca”’.
This will match any expression, wrapped in quotation marks, in which three characters reappear later in reverse order with zero or more characters in between, such as ‘“abccba”’.
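
The backreference behavior described above can be checked with str_detect. The expressions themselves are not reproduced in this write-up, so the three patterns below are assumptions chosen to fit the descriptions, and the test words are illustrative only. The last line shows why the backslash must be doubled in an R string for \1 to act as a backreference.

```r
library(stringr)

# Assumed patterns; the original expressions are not shown in the text
reversed_pair   <- "(.)(.)\\2\\1"          # two characters, then the same two reversed
every_other     <- "(.).\\1.\\1"           # same character at positions 1, 3 and 5
reversed_triple <- "(.)(.)(.).*\\3\\2\\1"  # three characters, anything, then reversed

pair_hits   <- str_detect(c("anna", "abab"), reversed_pair)
other_hits  <- str_detect(c("abaca", "abcde"), every_other)
triple_hits <- str_detect(c("abccba", "abcxcba", "abcabc"), reversed_triple)
print(pair_hits)
print(other_hits)
print(triple_hits)

# With a single backslash, "\1" in an R string is one control character (code 1),
# not a reference to the first capture group
single_backslash_code <- utf8ToInt("\1")
print(single_backslash_code)
```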

Question 4

Start and end with the same character.

str_view_all("anna","^(.).*\\1$")
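
A quick check with str_detect shows how the anchored pattern behaves on other inputs. The test words and the single-character variant below are assumptions added for illustration. As written, the pattern needs at least two characters, so a one-character string does not match unless the tail is made optional.

```r
library(stringr)

words <- c("anna", "a", "abc")  # illustrative inputs

# Pattern from the answer: first character captured, same character required at the end
same_ends <- "^(.).*\\1$"
strict_hits <- str_detect(words, same_ends)
print(strict_hits)

# Hypothetical variant: making the tail optional also accepts one-character strings
same_ends_or_single <- "^(.)(.*\\1)?$"
relaxed_hits <- str_detect(words, same_ends_or_single)
print(relaxed_hits)
```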

Contain a repeated pair of letters (e.g. “church” contains “ch” repeated twice.)

str_view_all("church","(..)[^ ]*\\1")
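
Because (..) captures any two characters, the pattern above also accepts repeated punctuation. The sketch below compares it with a letters-only variant; both the variant and the test strings are assumptions for illustration, not part of the original answer.

```r
library(stringr)

words <- c("church", "banana", "apple", "a--b--")  # illustrative inputs

# Pattern from the answer: any two characters repeated later within the same word
any_pair <- "(..)[^ ]*\\1"
any_pair_hits <- str_detect(words, any_pair)
print(any_pair_hits)

# Hypothetical stricter variant: the repeated pair must consist of ASCII letters
letter_pair <- "([A-Za-z]{2})[A-Za-z]*\\1"
letter_pair_hits <- str_detect(words, letter_pair)
print(letter_pair_hits)
```

The two versions differ only on the last input, where the repeated pair is "--".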

Contain one letter repeated in at least three places (e.g. “eleven” contains three “e”s.)

str_view_all("eleven", "(.)[^ ]*\\1[^ ]*\\1")
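
The same pattern can be applied to several words at once with str_subset, which makes it easier to confirm that two occurrences of a letter are not enough. The word list is an assumption for illustration.

```r
library(stringr)

words <- c("eleven", "banana", "church", "apple")  # illustrative inputs

# Pattern from the answer: one character that appears at least three times in a word
three_times <- "(.)[^ ]*\\1[^ ]*\\1"

# str_subset() keeps only the words that match
three_hits <- str_subset(words, three_times)
print(three_hits)
```

"church" and "apple" are dropped because no single letter occurs more than twice in them.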

David Moste

2/10/2020
