This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
require(dplyr)
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## 1. Get the rows of collegemajors.csv where the Major column contains "STATISTICS" or "DATA"
setwd("C:/RData")
MajorsList <- read.csv("collegemajors.csv",header=TRUE)
subset(MajorsList,(grepl("Data|STATISTICS", Major, ignore.case = TRUE)))
##    FOD1P                                         Major          Major_Category
## 44  6212 MANAGEMENT INFORMATION SYSTEMS AND STATISTICS                Business
## 52  2101      COMPUTER PROGRAMMING AND DATA PROCESSING Computers & Mathematics
## 59  3702               STATISTICS AND DECISION SCIENCE Computers & Mathematics
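The filtering step can be tried without the CSV file. The sketch below is only an illustration: it rebuilds a tiny data frame from the three rows shown in the output and adds one made-up row (code 9999, a hypothetical placeholder) that should not match. The names `majors_demo` and `hits` are assumptions introduced for this example.

```r
# Toy stand-in for collegemajors.csv: three rows copied from the output above,
# plus one hypothetical row (9999) that should be filtered out.
majors_demo <- data.frame(
  FOD1P = c(6212, 2101, 3702, 9999),
  Major = c("MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
            "COMPUTER PROGRAMMING AND DATA PROCESSING",
            "STATISTICS AND DECISION SCIENCE",
            "EXAMPLE MAJOR WITHOUT A MATCH"),
  stringsAsFactors = FALSE
)

# "Data|STATISTICS" means "Data" OR "STATISTICS"; ignore.case = TRUE makes
# the comparison case-insensitive, so "DATA" in the file still matches "Data".
hits <- subset(majors_demo, grepl("Data|STATISTICS", Major, ignore.case = TRUE))
hits$FOD1P
```

Because `grepl()` returns one logical value per row, `subset()` keeps exactly the rows where the pattern is found somewhere in `Major`.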
## 2. Write code that transforms the data below:
##[1] "bell pepper" "bilberry" "blackberry" "blood orange"
##[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
##[9] "elderberry" "lime" "lychee" "mulberry"
##[13] "olive" "salal berry"
testdata <- '[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"'
library(stringr)
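# Each match is a double-quoted run of lowercase letters. The unescaped "." matches
# any single character, which lets the pattern span the space in two-word names.
testdata_split <- unlist(str_extract_all(testdata, pattern = "\"([a-z]+.[a-z]+)\""))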
testdata_split
## [1] "\"bell pepper\"" "\"bilberry\"" "\"blackberry\"" "\"blood orange\""
## [5] "\"blueberry\"" "\"cantaloupe\"" "\"chili pepper\"" "\"cloudberry\""
## [9] "\"elderberry\"" "\"lime\"" "\"lychee\"" "\"mulberry\""
## [13] "\"olive\"" "\"salal berry\""
testdata_final <- str_remove_all(testdata_split, "\"")
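As a cross-check, the same extraction can be sketched in base R without stringr. This is one possible approach, not the only one: the pattern `"[^"]+"` (a quote, then everything up to the next quote) is swapped in for the letter-based pattern, and the names `quoted`, `fruits`, and `as_call` are introduced here for illustration. The last step assumes the goal is the text of a `c(...)` call, which the exercise statement above does not spell out.

```r
# Same input as above: the printed form of a character vector, held in one string.
testdata <- '[1] "bell pepper" "bilberry" "blackberry" "blood orange"
[5] "blueberry" "cantaloupe" "chili pepper" "cloudberry"
[9] "elderberry" "lime" "lychee" "mulberry"
[13] "olive" "salal berry"'

# Find every double-quoted run of characters, then drop the quotes.
quoted <- regmatches(testdata, gregexpr('"[^"]+"', testdata))[[1]]
fruits <- gsub('"', "", quoted, fixed = TRUE)

# Rebuild the text of a c(...) call from the cleaned names (assumed target format).
as_call <- paste0("c(", paste0('"', fruits, '"', collapse = ", "), ")")

fruits
as_call
```

Matching on the quotes rather than on letters keeps the pattern independent of word length and of how many words a name contains.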
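## 3. Describe, in words, what these expressions will match:
## "(.)\\1\\1"
# This matches any single character repeated three times in a row, like "ccc" or "666". (In an R string the backreference must be written \\1; a bare \1 is read as an escape sequence, not a backreference.)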
## "(.)(.)\\2\\1"
# This matches two characters followed by the same two characters in reverse order, like "cddc" or "2552".
## "(..)\\1"
# This matches any two characters immediately repeated once, like "dada" or "6767".
## "(.).\\1.\\1"
# This matches five characters in which the first, third, and fifth are the same, like "71727".
## "(.)(.)(.).*\\3\\2\\1"
# This matches three characters, then any run of characters (possibly empty), then the same three characters in reverse order, like "6547113456".
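The descriptions in this section can be checked against their own examples. The sketch below uses base R's `grepl()` with `perl = TRUE`; the labels in `patterns` and the variable names are assumptions made for this illustration, and every backreference is written with a doubled backslash so that it survives R's string parsing.

```r
# One pattern per expression discussed in this section (labels are illustrative).
patterns <- c(
  triple      = "(.)\\1\\1",
  mirrored    = "(.)(.)\\2\\1",
  pair_twice  = "(..)\\1",
  alternating = "(.).\\1.\\1",
  reversed    = "(.)(.)(.).*\\3\\2\\1"
)

# The example strings quoted in the answers, in the same order as the patterns.
examples <- c("ccc", "cddc", "dada", "71727", "6547113456")

# Test each pattern against its own example.
hits <- mapply(function(p, x) grepl(p, x, perl = TRUE), patterns, examples)
hits

# A near miss for comparison: only two repeated characters, so no match.
grepl(patterns[["triple"]], "cc", perl = TRUE)
```

None of these patterns is anchored, so each can also match in the middle of a longer string.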
## 4. Construct regular expressions to match words that:
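## Start and end with the same character.
# "^(.).*\\1$"
# The anchors ^ and $ tie the match to the whole word; without them the pattern matches any word that merely contains a repeated character.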
## Contain a repeated pair of letters (e.g. "church" contains "ch" repeated twice.)
# "([A-Za-z][A-Za-z]).*\\1"
## Contain one letter repeated in at least three places (e.g. "eleven" contains three "e"s.)
# "([A-Za-z]).*\\1.*\\1"
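One way to try patterns of this kind is to run them over a short word list. The following is a minimal sketch under stated assumptions: the `words` vector is made up for illustration, the variable names are hypothetical, and the patterns are written as R strings with doubled backslashes and with `^`/`$` anchors where the whole word must be covered.

```r
# A small, made-up word list for testing.
words <- c("church", "eleven", "level", "banana", "area", "cat")

# Start and end with the same character (anchored to the whole word).
same_ends <- "^(.).*\\1$"
# Contain a repeated pair of letters.
repeated_pair <- "([A-Za-z][A-Za-z]).*\\1"
# Contain one letter in at least three places.
three_times <- "([A-Za-z]).*\\1.*\\1"

words[grepl(same_ends, words, perl = TRUE)]      # "level" "area"
words[grepl(repeated_pair, words, perl = TRUE)]  # "church" "banana"
words[grepl(three_times, words, perl = TRUE)]    # "eleven" "banana"
```

Note that `same_ends` as written needs at least two characters; a variant such as `"^(.)(.*\\1)?$"` would also accept one-letter words.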