This list of exercises seeks to review material discussed so far and to introduce new notions that will be useful as we move forward. It reviews (1) logical expressions, (2) sapply(), (3) how to create functions and (4) data ordering. It also introduces if…else statements.
Dataset1.csv contains more than 9000 transactions from a bakery called “The Bread Basket” between October 30th, 2016 and April 9th, 2017. We begin by loading it using the read.csv() function.
# Loading the dataset
BakeryData <- read.csv("/Users/evelynebrie/Dropbox/TA/PSCI_107_Fall2018/Recitation/Week7/Dataset1.csv")
Here is what this dataset looks like: its first column contains the transaction number, and the other 91 columns are the different products sold at this bakery, ordered alphabetically.
BakeryData[1:10,1:10]
## Transaction Alfajores Argentina.Night Art.Tray Baguette Bakewell
## 1 1 0 0 0 0 0
## 2 2 0 0 0 0 0
## 3 3 0 0 0 0 0
## 4 4 0 0 0 0 0
## 5 5 0 0 0 0 0
## 6 6 0 0 0 0 0
## 7 7 0 0 0 0 0
## 8 8 0 0 0 0 0
## 9 9 0 0 0 0 0
## 10 10 0 0 0 0 0
## Bare.Popcorn Basket Bowl.Nic.Pitt Bread
## 1 0 0 0 1
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 1
## 6 0 0 0 0
## 7 0 0 0 0
## 8 0 0 0 1
## 9 0 0 0 1
## 10 0 0 0 0
myVector <- sapply(______,______)
______(myVector, decreasing=TRUE)
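As a rough illustration of the pattern above, the sketch below applies a function to every product column of a small made-up data frame and then sorts the result. The data frame `toy`, its column values, and the object names are assumptions for illustration only; they are not taken from Dataset1.csv.

```r
# Toy data standing in for a few bakery columns (hypothetical values)
toy <- data.frame(
  Transaction = 1:5,
  Bread  = c(1, 0, 1, 1, 0),
  Coffee = c(1, 1, 1, 1, 0),
  Scone  = c(0, 0, 1, 0, 0)
)

# Apply sum() to every column except the first (the transaction number)
totals <- sapply(toy[, -1], sum)

# Sort the resulting named vector from largest to smallest
sorted_totals <- sort(totals, decreasing = TRUE)
sorted_totals
```

Because sapply() keeps the column names, the sorted vector shows which product corresponds to each total.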
Please find the mistakes in the following code chunks.
Imagine a dataset called Data that contains a continuous numeric variable named age. Someone decides to create a dichotomous variable (a dummy) identifying people who are 18 years old or older. What are the 2 mistakes in the following code?
Data$adult <- NA
Data$adult[Data$age => 18] <- 1
Data$adult[Data$age < 18] <- 0
Data$adult[Data$age = "Refused to answer"] <- NA
Imagine a dataset called Data that contains a categorical variable named educ5Cat with 5 categories: “Primary School”, “High School”, “University”, “No education” and “Refused to answer”. Someone decides to create a dichotomous variable (a dummy) identifying people who don’t have a high school diploma. What are the 4 mistakes in the following code?
Data$educUniv <- NA
Data$educUniv[Data$educCat=="Primary School" & Data$educCat=="No education"] <- 1
Data$educUniv[Data$educCat!="Primary School" | Data$educCat!="No education"] <- 0
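When reviewing logical expressions such as the ones above, it can help to see how & and | behave on a small vector. The sketch below uses a made-up character vector named `educ` (an assumption for illustration, not the Data object from the exercise).

```r
# Hypothetical vector of education categories
educ <- c("Primary School", "High School", "University", "No education")

# & is TRUE only when both comparisons are TRUE for the same element
both <- educ == "Primary School" & educ == "No education"

# | is TRUE when at least one comparison is TRUE
either <- educ == "Primary School" | educ == "No education"

# %in% is a compact way to write the same "either" test
either_in <- educ %in% c("Primary School", "No education")

both
either
either_in
```

A single element cannot be equal to two different strings at once, so `both` contains only FALSE values, while `either` and `either_in` flag the first and last elements.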
Imagine a vector called myCards (displayed below) which encompasses all cards from a standard 52-card deck.
myCards
## [1] "Ace Diamonds" "Deuce Diamonds" "Three Diamonds" "Four Diamonds"
## [5] "Five Diamonds" "Six Diamonds" "Seven Diamonds" "Eight Diamonds"
## [9] "Nine Diamonds" "Ten Diamonds" "Jack Diamonds" "Queen Diamonds"
## [13] "King Diamonds" "Ace Clubs" "Deuce Clubs" "Three Clubs"
## [17] "Four Clubs" "Five Clubs" "Six Clubs" "Seven Clubs"
## [21] "Eight Clubs" "Nine Clubs" "Ten Clubs" "Jack Clubs"
## [25] "Queen Clubs" "King Clubs" "Ace Hearts" "Deuce Hearts"
## [29] "Three Hearts" "Four Hearts" "Five Hearts" "Six Hearts"
## [33] "Seven Hearts" "Eight Hearts" "Nine Hearts" "Ten Hearts"
## [37] "Jack Hearts" "Queen Hearts" "King Hearts" "Ace Spades"
## [41] "Deuce Spades" "Three Spades" "Four Spades" "Five Spades"
## [45] "Six Spades" "Seven Spades" "Eight Spades" "Nine Spades"
## [49] "Ten Spades" "Jack Spades" "Queen Spades" "King Spades"
Someone wants to create a function that randomly selects n cards for one player to keep during a game. What is the main mistake within the following code?
card <- myCards
selected_cards <- function(n) {
sample(card, size = n, replace = TRUE))
}
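For reference, the replace argument of sample() controls whether the same element can be drawn more than once. The sketch below compares both settings on a small made-up deck; the object `deck` and the seed value are assumptions chosen for illustration.

```r
# A hypothetical six-card deck built with paste() and rep()
deck <- paste(
  rep(c("Ace", "King", "Queen"), times = 2),
  rep(c("Hearts", "Spades"), each = 3)
)

set.seed(107)  # Arbitrary seed so the draws are reproducible

# With replacement: the same card may appear several times
with_replacement <- sample(deck, size = 6, replace = TRUE)

# Without replacement: each card can be drawn at most once
without_replacement <- sample(deck, size = 6, replace = FALSE)

# anyDuplicated() returns 0 when a vector contains no repeated values
anyDuplicated(without_replacement)
```

Drawing all six cards without replacement simply shuffles the deck, whereas drawing with replacement may return duplicates.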
If…else statements can be very useful in R. They are conditional statements: if the expression is TRUE, the first command gets executed. If it is FALSE, the command placed after else gets executed instead (or nothing happens, if you leave out the else part).
if (EXPRESSION) {
# FIRST COMMAND
} else {
# SECOND COMMAND
}
Here are two ways of writing an if…else statement that prints “Negative number” if x has a value smaller than 0, and “Not a negative number” otherwise.
x <- 1 # Giving a value to the x object
# First technique
if (x < 0) { # Opening the if...else statement
print("Negative number") # Printing out a string of characters
} else { # If the expression is not true, the following code will be executed
print("Not a negative number") # Printing out a string of characters
}
## [1] "Not a negative number"
# Second technique
ifelse(x < 0,"Negative number","Not a negative number") # Doing the same thing in a condensed format
## [1] "Not a negative number"
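The two techniques differ once x holds more than one value: ifelse() works element by element on a whole vector, whereas if…else evaluates a single condition. The sketch below illustrates this under assumed names (`values`, `describe_sign`, `sign_labels`) that do not appear in the exercise.

```r
values <- c(-2, 0, 3)  # Hypothetical vector of numbers

# ifelse() tests every element and returns a vector of the same length
sign_labels <- ifelse(values < 0, "Negative number", "Not a negative number")

# if...else handles one value at a time, so it is wrapped in a function
describe_sign <- function(x) {
  if (x < 0) {
    "Negative number"
  } else {
    "Not a negative number"
  }
}

# sapply() calls the function once per element
sign_labels_2 <- sapply(values, describe_sign)

identical(sign_labels, sign_labels_2)
```

Both approaches give the same labels here; ifelse() is shorter for simple recoding, while if…else inside a function leaves room for longer blocks of code.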
To answer the following question, you will need the %% operator, which returns the remainder of a division. Here are a few examples.
10 %% 2 # Example of %% syntax (no remainder of division for 10 ÷ 2)
## [1] 0
!10 %% 2 # Confirmed with a logical statement
## [1] TRUE
10 %% 3 # Example of %% syntax (remainder of division for 10 ÷ 3 is 1)
## [1] 1
!10 %% 3 # Confirmed with a logical statement
## [1] FALSE
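The %% operator also works on whole vectors, and its result can be compared with 0 to obtain a logical vector. The sketch below checks divisibility by 3 on a made-up vector named `numbers` (an assumption for illustration).

```r
numbers <- 1:6  # Hypothetical vector of integers

# Remainder of each element after division by 3
remainders <- numbers %% 3

# %% is evaluated before ==, so this tests whether each remainder equals 0
divisible_by_3 <- numbers %% 3 == 0

remainders
divisible_by_3
```

Writing the comparison with == 0 states the test explicitly, which is often easier to read than negating the remainder with the ! operator.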
even_number <- function(n){
# ... your code here ...
}
The World Happiness Report is a landmark survey of the state of global happiness. The happiness scores and rankings use data from the Gallup World Poll. The scores are based on answers to the main life evaluation question asked in the poll. This question, known as the Cantril ladder, asks respondents to think of a ladder with the best possible life for them being a 10 and the worst possible life being a 0 and to rate their own current lives on that scale. More information on this data can be found here: http://worldhappiness.report/.
Let’s begin by loading Dataset2.csv using the read.csv() function.
HappinessData <- read.csv("/Users/evelynebrie/Dropbox/TA/PSCI_107_Fall2018/Recitation/Week7/Dataset2.csv")
Just for fun, let’s look at the ten happiest countries on Earth. The United States ranks 14th out of 155 countries.
HappinessData[1:10,2] # Displaying the first 10 rows of the second column (row number = ranking within this dataset)
## [1] Norway Denmark Iceland Switzerland Finland
## [6] Netherlands Canada New Zealand Sweden Australia
## 155 Levels: Afghanistan Albania Algeria Angola Argentina ... Zimbabwe
which(HappinessData$Country=="United States") # Displaying the row number of the United States
## [1] 14
Freedom to make life choices is the national average of binary responses to the Gallup question “Are you satisfied or dissatisfied with your freedom to choose what you do with your life?”
Relevant functions: order(), which().
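As a reminder of how these two functions fit together, the sketch below reorders a small made-up data frame and then looks up the position of one row. The data frame `toy`, its column names (`Country`, `Freedom`) and its values are assumptions for illustration; they are not taken from Dataset2.csv.

```r
# Hypothetical data frame with a score for four fictional countries
toy <- data.frame(
  Country = c("A", "B", "C", "D"),
  Freedom = c(0.40, 0.62, 0.55, 0.30),
  stringsAsFactors = FALSE
)

# order() returns the row indices that sort the score from highest to lowest
ranked <- toy[order(toy$Freedom, decreasing = TRUE), ]

# which() returns the position of the row that satisfies the condition
position_c <- which(ranked$Country == "C")

ranked$Country
position_c
```

After reordering, a row's position in the data frame corresponds to its rank on the chosen variable.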