- if-else
- for
- while
- function
An if statement consists of a logic condition (TRUE or FALSE) followed by one or more statements.
# Template in words
if(a logic condition) {
Get inside the curly brackets and
run this block when the condition is true
}
# Example
x = 1
if(x == 1) {
print("x equals 1")
}
## [1] "x equals 1"
An if statement can be followed by an optional else statement, which executes when the previous logic expression is false.
name="Tim Cook" ; role = "CEO"
if(role == "CEO") {
print(paste(name, "is a CEO.", sep=" "))
} else {
print(paste(name, "is not a CEO.", sep=" "))
}
## [1] "Tim Cook is a CEO."
name = "Jeff Williams" ; role = "COO"
if(role == "CEO") {
print(paste(name, "is a CEO.", sep=" "))
} else {
print(paste(name, "is not a CEO.", sep=" "))
}
## [1] "Jeff Williams is not a CEO."
The logic expression in the previous example is role == "CEO".
Alternatively, you can just use a logic variable (TRUE or FALSE):
name="Tim Cook"
is_ceo = TRUE
if(is_ceo) {
print(paste(name, "is a CEO.", sep=" "))
} else {
print(paste(name, "is not a CEO.", sep=" "))
}
## [1] "Tim Cook is a CEO."
You can have more than two decision points:
age = 15
if(age < 13) {
print("Kid")
} else if (age < 20) {
# Go into this block if the previous logic expression is FALSE
# and the current logic expression is TRUE
print("Teenager")
} else {
# Go into this block if all previous logic expressions are FALSE
print("Adult")
}
## [1] "Teenager"
Create a variable grade = "A" and use if-else statements to print its respective value in a 4.0 scale based on the following conversion table. Try different letter grades to see if your if-else statements are correct.
(Hint: use print(paste("point:", point)) to print the variable with a string.)
| Grade Letter | Point |
|---|---|
| A | 4 |
| B | 3 |
| C | 2 |
| D | 1 |
| All others | 0 |
You often encounter situations when you need to perform the same statements multiple times, potentially over a set of data objects.
Motivating example:
print( paste("The year was", 2001) )
print( paste("The year was", 2002) )
print( paste("The year was", 2003) )
print( paste("The year was", 2004) )
A for loop iterates through every element in a vector.
# Template in words
for (element in vector){
Use the element to do something
}
# Example
for (year in 2001:2004){
print( paste("The year was", year) )
}
## [1] "The year was 2001"
## [1] "The year was 2002"
## [1] "The year was 2003"
## [1] "The year was 2004"
In words: for each year in the sequence 2001:2004, execute the code chunk print( paste("The year was", year) ).
# you could define the vector outside the for-loop
x = c(1, 3, 5)
for(i in x) {
print(i)
}
## [1] 1
## [1] 3
## [1] 5
# it works regardless of the data type of the vector
for(i in c("A", "B", "C", "D")) {
print(i)
}
## [1] "A"
## [1] "B"
## [1] "C"
## [1] "D"
If you want to skip some items in a for-loop, use the keyword next.
for (i in 1:5) {
if (i == 2){
next
}
print(i)
}
## [1] 1
## [1] 3
## [1] 4
## [1] 5
Continue with our previous exercise on converting letter grades to grade points.
Put letter grades (A, A, C, B, B) in a vector and use for-loop to print out their respective points.
The while loop continually executes a block of statements while a particular condition is true.
# Template in words
while(condition to check) {
statements to run when the condition is true
}
# Example
x = 1
while(x <= 3) {
print(x)
x = x + 1 # what would happen if you skip this line?
}
## [1] 1
## [1] 2
## [1] 3
for() is better when you want to iterate over a set of elements that you know in advance
while() is better if you find it easy to specify when to run and when to stop.
Note: Every for() could be replaced with a while()
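To make the note above concrete, here is a minimal sketch of the same iteration written both ways. The variable names years and i are chosen for illustration only.

```r
# A for loop over a vector that is known in advance
years <- 2001:2004
for (year in years) {
  print(paste("The year was", year))
}

# The same iteration as a while loop with an explicit index
i <- 1
while (i <= length(years)) {
  print(paste("The year was", years[i]))
  i <- i + 1  # advance the index; without this line the loop never ends
}
```

The while version has to manage the index by hand, which is why for() is usually the more convenient choice when the elements are known up front.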
1. Use a for loop to get the sum of all numbers from 1 to 100
2. Use a while loop to get the sum of all numbers from 1 to 100
Hint: define a variable total outside the for/while loop, and change the value of total as you loop through all numbers from 1 to 100.
Check your answer with the sum() function: sum(1:100)
A function is a procedure or routine which takes optional inputs and produces an optional output.
So far we have already seen many built-in functions:
- seq(), rep(), mean(), length(), …
- colnames(), rownames()
- paste()
Data structures tie related values into one object
Functions tie related commands into one object
In both cases: easier to understand, easier to work with, easier to build into larger things
In R, you can create your own functions using the following syntax:
my_function <- function(input1, input2, ...) {
# Use the input to do something
return(output) # return a result
}
Here is a working example:
hello_world <- function() {
# this particular function requires no inputs
print("Hello world!")
# No return() statement: R then returns the value of the last expression (here, what print() returns).
}
hello_world() # call the function; don't forget the parentheses
## [1] "Hello world!"
Another example:
add_one <- function(num) {
num = num+1 # be sure to match the input variable name
return(num) # return() says what the output is
}
a = add_one(10)
print(a)
## [1] 11
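As a side note, an R function also returns the value of its last evaluated expression when return() is omitted. A minimal sketch, where add_one_implicit is a hypothetical name that does not appear in the original notes:

```r
# Hypothetical variant of add_one() without an explicit return()
add_one_implicit <- function(num) {
  num + 1  # the last evaluated expression becomes the return value
}

b <- add_one_implicit(10)
print(b)
## [1] 11
```

Writing return() explicitly, as in add_one(), is often easier to read while you are learning, so either style is fine here.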
greeting <- function(your_name, course_name) {
print(paste0("Hello, ", your_name, ". This is ", course_name))
}
greeting(your_name="Smith", course_name="CIS 4730")
## [1] "Hello, Smith. This is CIS 4730"
greeting("Alice", "CIS 4950")
## [1] "Hello, Alice. This is CIS 4950"
Create a vector stop_words to store the following stop words: a, an, and, the, that. Then write a function detect_stop_word that takes a word as input and detects whether the word is a stop word.
Hint: use the %in% operator from lab-02
detect_stop_word("atlanta")
## [1] FALSE
detect_stop_word("that")
## [1] TRUE
shakespeare<-c("Et tu, Brute? — Then fall, Caesar!",
"Romans, countrymen, and lovers, hear me for my cause,
and be silent, that you may hear.",
"Believe me for mine honor, and have respect to mine honor,
that you may believe.",
"Censure me in your wisdom, and awake your senses,
that you may the better judge.",
"If there be any in this assembly, any dear friend of Caesar's,
to him I say that Brutus' love to Caesar was no less than his.",
"If then that friend demand why Brutus rose against Caesar,
this is my answer:",
"not that I loved Caesar less, but that I loved Rome more.")
library(dplyr) # provides tibble(), %>%, count(), arrange() and anti_join()
#text_df <- data.frame(text = shakespeare, stringsAsFactors=FALSE)
text_df <- tibble(line=1:7, text = shakespeare)
text_df
## # A tibble: 7 x 2
##    line text
##   <int> <chr>
## 1     1 "Et tu, Brute? — Then fall, Caesar!"
## 2     2 "Romans, countrymen, and lovers, hear me for my cause, \nand be silent,~
## 3     3 "Believe me for mine honor, and have respect to mine honor, \nthat you ~
## 4     4 "Censure me in your wisdom, and awake your senses, \nthat you may the b~
## 5     5 "If there be any in this assembly, any dear friend of Caesar's, \nto hi~
## 6     6 "If then that friend demand why Brutus rose against Caesar, \nthis is m~
## 7     7 "not that I loved Caesar less, but that I loved Rome more."
unnest_tokens(df, word, text), in the tidytext package, splits a column (text, in this case) into tokens and flattens the table into one token per row (using word as the new column's name).
library(tidytext)
tokens <- text_df %>%
  unnest_tokens(word, text)
#tokens <- unnest_tokens(text_df, word, text)
tokens %>%
  # word count
  count(word) %>%
  # Arrange the counts in descending order
  arrange(desc(n))
## # A tibble: 65 x 2
##    word       n
##    <chr>  <int>
##  1 that       7
##  2 and        4
##  3 caesar     4
##  4 i          3
##  5 may        3
##  6 me         3
##  7 to         3
##  8 you        3
##  9 any        2
## 10 be         2
## # ... with 55 more rows
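To see what unnest_tokens() does on a smaller input, here is a minimal sketch. The data frame toy_df and its contents are made up for illustration and are not part of the Shakespeare data.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-line input
toy_df <- tibble(line = 1:2, text = c("Hello, World!", "Hello again"))

# One row per token; the text column is replaced by the word column
toy_tokens <- toy_df %>%
  unnest_tokens(word, text)

toy_tokens$word
## [1] "hello" "world" "hello" "again"
```

Note that, with its default settings, unnest_tokens() also lowercases the tokens and drops punctuation, which is why "that" and "That" are counted together.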
Stop words can be removed using anti_join(stop_words). anti_join(x, y) takes two data frames and returns all rows from x that have no match in y. (Notice the change in the number of rows.)
sw_tokens <- text_df %>%
  # Tokenize the lines data
  unnest_tokens(word, text) %>%
  anti_join(stop_words)
## Joining, by = "word"
sw_tokens %>%
  # word count
  count(word) %>%
  # Arrange the counts in descending order
  arrange(desc(n))
## # A tibble: 28 x 2
##    word         n
##    <chr>    <int>
##  1 caesar       4
##  2 brutus       2
##  3 friend       2
##  4 hear         2
##  5 honor        2
##  6 loved        2
##  7 mine         2
##  8 answer       1
##  9 assembly     1
## 10 awake        1
## # ... with 18 more rows
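The behavior of anti_join() is easier to see on two tiny tables. The sketch below uses made-up data; toy_words and my_stop_words are hypothetical names, chosen so that the stop_words data set from tidytext is not overwritten.

```r
library(dplyr)

# Hypothetical toy tables
toy_words <- tibble(word = c("the", "cat", "and", "dog"))
my_stop_words <- tibble(word = c("the", "and"))

# Keep only the rows of toy_words whose word does not appear in my_stop_words
kept <- anti_join(toy_words, my_stop_words, by = "word")
kept$word
## [1] "cat" "dog"
```

Passing by = "word" explicitly also suppresses the Joining, by = "word" message.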
Using library(tm) (if the package is not installed, run install.packages("tm") first):
library(tm) #text mining package that utilizes NLP package
shakespeare_lower <- tolower(shakespeare)
shakespeare_lower
## [1] "et tu, brute? — then fall, caesar!"
## [2] "romans, countrymen, and lovers, hear me for my cause, \nand be silent, that you may hear."
## [3] "believe me for mine honor, and have respect to mine honor, \nthat you may believe."
## [4] "censure me in your wisdom, and awake your senses, \nthat you may the better judge."
## [5] "if there be any in this assembly, any dear friend of caesar's, \nto him i say that brutus' love to caesar was no less than his."
## [6] "if then that friend demand why brutus rose against caesar, \nthis is my answer:"
## [7] "not that i loved caesar less, but that i loved rome more."
# Remove punctuation
shakespeare_lower_np <- gsub('[[:punct:]]', '', shakespeare_lower)
shakespeare_lower_np
## [1] "et tu brute — then fall caesar"
## [2] "romans countrymen and lovers hear me for my cause \nand be silent that you may hear"
## [3] "believe me for mine honor and have respect to mine honor \nthat you may believe"
## [4] "censure me in your wisdom and awake your senses \nthat you may the better judge"
## [5] "if there be any in this assembly any dear friend of caesars \nto him i say that brutus love to caesar was no less than his"
## [6] "if then that friend demand why brutus rose against caesar \nthis is my answer"
## [7] "not that i loved caesar less but that i loved rome more"
# Remove stopwords
shakespeare_lower_np_sw <- removeWords(shakespeare_lower_np, stopwords())
shakespeare_lower_np_sw
## [1] "et tu brute — fall caesar"
## [2] "romans countrymen lovers hear cause \n silent may hear"
## [3] "believe mine honor respect mine honor \n may believe"
## [4] "censure wisdom awake senses \n may better judge"
## [5] " assembly dear friend caesars \n say brutus love caesar less "
## [6] " friend demand brutus rose caesar \n answer"
## [7] " loved caesar less loved rome "
# Remove extra whitespace
library(stringr) # provides str_trim()
shakespeare_lower_np_sw_nw <- gsub(' +',' ', shakespeare_lower_np_sw) %>%
  str_trim(side="both")
shakespeare_lower_np_sw_nw
## [1] "et tu brute — fall caesar"
## [2] "romans countrymen lovers hear cause \n silent may hear"
## [3] "believe mine honor respect mine honor \n may believe"
## [4] "censure wisdom awake senses \n may better judge"
## [5] "assembly dear friend caesars \n say brutus love caesar less"
## [6] "friend demand brutus rose caesar \n answer"
## [7] "loved caesar less loved rome"
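The cleaning steps above (lowercase, remove punctuation, remove stop words, tidy whitespace) can also be bundled into one function, tying this back to the earlier section on functions. The sketch below is only one possible way to do it. The function clean_text and its argument stop_words_vec are hypothetical, and it uses base R only, so tm's removeWords() is replaced by a plain regular expression with a short, made-up stop word list.

```r
# Hypothetical helper, base R only (not the tm-based pipeline above)
clean_text <- function(x, stop_words_vec = c("a", "an", "and", "the", "that")) {
  x <- tolower(x)                          # lowercase
  x <- gsub("[[:punct:]]", "", x)          # remove punctuation
  # build a pattern that matches any stop word as a whole word
  pattern <- paste0("\\b(", paste(stop_words_vec, collapse = "|"), ")\\b")
  x <- gsub(pattern, "", x, perl = TRUE)   # remove stop words
  x <- gsub(" +", " ", x)                  # collapse repeated spaces
  trimws(x)                                # trim leading and trailing whitespace
}

clean_text("The cat, and that DOG!")
## [1] "cat dog"
```

Because the steps live in one function, the same cleaning can be applied to any character vector, for example clean_text(shakespeare), without repeating the intermediate variables.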
library(textstem) # provides stem_words(), lemmatize_words(), stem_strings(), lemmatize_strings()
dw <- c('driver', 'drive', 'drove', 'driven', 'drives', 'driving')
stem_words(dw)
## [1] "driver" "drive" "drove" "driven" "drive" "drive"
lemmatize_words(dw)
## [1] "driver" "drive" "drive" "drive" "drive" "drive"
bw <- c('are', 'am', 'being', 'been', 'be')
stem_words(bw)
## [1] "ar" "am" "be" "been" "be"
lemmatize_words(bw)
## [1] "be" "be" "be" "be" "be"
reference: https://cran.r-project.org/web/packages/textstem/README.html
st_shakespeare <- tibble(line=1:7,
text=stem_strings(shakespeare))
lm_shakespeare <- tibble(line=1:7,
text=lemmatize_strings(shakespeare))
st_shakespeare
## # A tibble: 7 x 2
##    line text
##   <int> <chr>
## 1     1 Et tu, Brute? — Then fall, Caesar!
## 2     2 Roman, countrymen, and lover, hear me for my caus, and be silent, that ~
## 3     3 Believ me for mine honor, and have respect to mine honor, that you mai ~
## 4     4 Censur me in your wisdom, and awak your sens, that you mai the better j~
## 5     5 If there be ani in thi assembli, ani dear friend of Caesar', to him I s~
## 6     6 If then that friend demand why Brutu rose against Caesar, thi i my answ~
## 7     7 not that I love Caesar less, but that I love Rome more.
lm_shakespeare
## # A tibble: 7 x 2
##    line text
##   <int> <chr>
## 1     1 Et tu, Brute? — Then fall, Caesar!
## 2     2 roman, countryman, and lover, hear me for my cause, and be silent, that~
## 3     3 Believe me for mine honor, and have respect to mine honor, that you may~
## 4     4 Censure me in your wisdom, and awake your sense, that you may the good ~
## 5     5 If there be any in this assembly, any dear friend of Caesar's, to him I~
## 6     6 If then that friend demand why Brutus rise against Caesar, this be my a~
## 7     7 not that I love Caesar little, but that I love Rome much.
stemmed_tokens <- st_shakespeare %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words) %>%
  count(word) %>%
  arrange(desc(n))
## Joining, by = "word"
stemmed_tokens
## # A tibble: 34 x 2
##    word       n
##    <chr>  <int>
##  1 caesar     5
##  2 love       3
##  3 mai        3
##  4 ani        2
##  5 believ     2
##  6 friend     2
##  7 hear       2
##  8 honor      2
##  9 mine       2
## 10 thi        2
## # ... with 24 more rows
lemmatized_tokens <- lm_shakespeare %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words) %>%
  count(word) %>%
  arrange(desc(n))
## Joining, by = "word"
lemmatized_tokens
## # A tibble: 27 x 2
##    word         n
##    <chr>    <int>
##  1 caesar       4
##  2 love         3
##  3 brutus       2
##  4 friend       2
##  5 hear         2
##  6 honor        2
##  7 mine         2
##  8 answer       1
##  9 assembly     1
## 10 awake        1
## # ... with 17 more rows