Exercise 6.5.1

I’ve created a dataset about presidential candidates for the 2020 US election.

(a) Re-code the Gender column to have Male and Female levels. Similarly convert the party variable to be Democratic or Republican.

Candidate         Gender  Birthday    Party       AgeOnElection
Pete Buttigieg    Male    1982-01-19  Democrat    38
Andrew Yang       Male    1975-01-13  Democrat    45
Julian Castro     Male    1976-09-16  Democrat    44
Beto O’Rourke     Male    1972-09-26  Democrat    48
Cory Booker       Male    1969-04-27  Democrat    51
Kamala Harris     Female  1964-10-20  Democrat    56
Amy Klobuchar     Female  1960-05-25  Democrat    60
Elizabeth Warren  Female  1949-06-22  Democrat    71
Donald Trump      Male    1946-06-14  Republican  74
Joe Biden         Male    1942-11-20  Democrat    77
Bernie Sanders    Male    1941-09-08  Democrat    79
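
One way part (a) might be approached in base R is with factor() and its labels argument. The sketch below assumes the raw data codes gender as "M"/"F" and party as "D"/"R"; those codes, and the data frame name candidates, are assumptions for illustration, since the original coding is not shown here. Only three rows of the table are reproduced.

```r
# Hypothetical raw coding ("M"/"F", "D"/"R"); the real dataset may differ.
candidates <- data.frame(
  Candidate = c("Kamala Harris", "Donald Trump", "Joe Biden"),
  Gender    = c("F", "M", "M"),
  Party     = c("D", "R", "D"),
  stringsAsFactors = FALSE
)

# factor() maps each value in `levels` to the matching entry in `labels`.
candidates$Gender <- factor(candidates$Gender,
                            levels = c("M", "F"),
                            labels = c("Male", "Female"))
candidates$Party  <- factor(candidates$Party,
                            levels = c("D", "R"),
                            labels = c("Democrat", "Republican"))

print(candidates)
```

The labels here follow the spelling used in the table ("Democrat"); any raw value not listed in levels would become NA, which makes unexpected codes easy to spot.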

(b) Bernie Sanders was registered as an Independent up until his 2016 presidential run. Change his political party value into ‘Independent’.

Candidate         Gender  Birthday    Party        AgeOnElection
Pete Buttigieg    Male    1982-01-19  Democrat     38
Andrew Yang       Male    1975-01-13  Democrat     45
Julian Castro     Male    1976-09-16  Democrat     44
Beto O’Rourke     Male    1972-09-26  Democrat     48
Cory Booker       Male    1969-04-27  Democrat     51
Kamala Harris     Female  1964-10-20  Democrat     56
Amy Klobuchar     Female  1960-05-25  Democrat     60
Elizabeth Warren  Female  1949-06-22  Democrat     71
Donald Trump      Male    1946-06-14  Republican   74
Joe Biden         Male    1942-11-20  Democrat     77
Bernie Sanders    Male    1941-09-08  Independent  79
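
If Party is stored as a factor, assigning a value that is not already one of its levels produces NA with a warning, so a sketch of part (b) might add the new level first. The data frame below is a hypothetical three-row subset of the table, and the name candidates is an assumption.

```r
# Hypothetical subset of the table, with Party already stored as a factor.
candidates <- data.frame(
  Candidate = c("Donald Trump", "Joe Biden", "Bernie Sanders"),
  Party     = factor(c("Republican", "Democrat", "Democrat")),
  stringsAsFactors = FALSE
)

# Add the new level before assigning it, otherwise the value becomes NA.
levels(candidates$Party) <- c(levels(candidates$Party), "Independent")

# Logical indexing selects only the row for Bernie Sanders.
candidates$Party[candidates$Candidate == "Bernie Sanders"] <- "Independent"

print(candidates)
```

If Party were a plain character column instead, the last assignment alone would be enough.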

Exercise 6.5.2

Let’s write our own uniform distribution function! We will write a sequence of statements that uses if statements to calculate the density at x, assuming that a, b, and x are given to you, but your code won’t know whether x is between a and b. That is, your code needs to determine whether it is and return either 1/(b-a) or 0.

(a) Write a series of if/else statements.

## [1] "x= 2.481   result= 0"
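
A minimal sketch of part (a) with chained if/else statements. The endpoints a = 4 and b = 10 are an assumption: they are consistent with the outputs shown (1/(10-4) is about 0.167) but are not stated in the exercise. Because x is drawn at random, the printed values will vary from run to run.

```r
a <- 4    # assumed lower endpoint (not given in the exercise)
b <- 10   # assumed upper endpoint (not given in the exercise)
x <- runif(1, 0, 10)   # one random x to test the logic

if (x < a) {
  result <- 0              # x lies below the interval
} else if (x <= b) {
  result <- 1 / (b - a)    # x lies inside [a, b]
} else {
  result <- 0              # x lies above the interval
}

print(paste("x=", round(x, digits = 3), "  result=", round(result, digits = 3)))
```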

(b) Perform the logical comparison all in one.

(With if, else)
## [1] "x= 6.103  result= 0.167"
(With “or” operator)
## [1] "x= 0.331  result= 0"
(With “ifelse”)
## [1] "x= 4.906  result= 0.167"
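
The three variants of part (b) might look like the sketch below. The function names density_and, density_or, and density_ifelse are hypothetical, and the endpoints a = 4 and b = 10 are assumed rather than given in the exercise.

```r
a <- 4    # assumed endpoints (not given in the exercise)
b <- 10

# if/else with a single "and" comparison
density_and <- function(x, a, b) {
  if (a <= x && x <= b) 1 / (b - a) else 0
}

# "or" operator: x is outside [a, b] if it is below a or above b
density_or <- function(x, a, b) {
  if (x < a || x > b) 0 else 1 / (b - a)
}

# ifelse() is vectorized, so it can handle many x values at once
density_ifelse <- function(x, a, b) {
  ifelse(a <= x & x <= b, 1 / (b - a), 0)
}

x <- runif(1, 0, 10)
print(c(density_and(x, a, b), density_or(x, a, b), density_ifelse(x, a, b)))
```

The scalar versions use && and ||, which expect single TRUE/FALSE values, while ifelse() needs the elementwise & so that a whole vector of x values can be evaluated in one call.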

Exercise 6.5.3

I often want to repeat some section of code some number of times. For example, I might want to create a bunch of plots that compare the density of a t-distribution with specified degrees of freedom to a standard normal distribution.

(a) Use a for loop to create similar graphs for degrees of freedom 2, 3, 4, …, 29, 30.

(b) In retrospect, perhaps we didn’t need to produce all of those. Rewrite your loop so that we only produce graphs for {2, 3, 4, 5, 10, 15, 20, 25, 30} degrees of freedom. Hint: you can just modify the vector in the for statement to include the desired degrees of freedom.
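
A sketch of the loop using base graphics. The original plots may have been drawn with another package, so curve() here is a swapped-in choice, and the plotting range of -4 to 4 and the loop variable name nu are assumptions.

```r
# Degrees of freedom for part (b); replace with 2:30 for part (a).
dfs <- c(2, 3, 4, 5, 10, 15, 20, 25, 30)

for (nu in dfs) {
  # Standard normal density as the dashed reference curve
  curve(dnorm(x), from = -4, to = 4, lty = 2,
        xlab = "x", ylab = "density",
        main = paste("t-distribution with", nu, "degrees of freedom"))
  # Overlay the t density for the current degrees of freedom
  curve(dt(x, df = nu), from = -4, to = 4, add = TRUE)
  legend("topright", legend = c("Normal", "t"), lty = c(2, 1))
}
```

Only the vector in the for statement changes between parts (a) and (b); the loop body stays the same.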

Exercise 6.5.4

The game is to roll a pair of 6-sided dice 24 times. If “double sixes” comes up on any of the 24 rolls, the player wins. What is the probability of winning?

(a) We can simulate rolling two 6-sided dice using the sample() function with the replace=TRUE option. Read the help file on sample() to see how to sample from the numbers 1, 2,…,6. Sum those two die rolls and save the result as throw.

throw <- sum(sample(x=1:6, size=2, replace=TRUE))
throw
## [1] 10

(b) Write a for(){} loop that wraps your code from part (a) and then tests if any of the throws of dice summed to 12. Read the help files for the functions any() and all().

throws <- NULL  # assigning to throws[i] below grows this into a numeric vector
for (i in 1:24) {
  throws[i] <- sum(sample(x=1:6, size=2, replace=TRUE))  # sum of one roll of two dice
}
game <- any(throws==12)
game
## [1] TRUE

(c) Wrap all of your code from part (b) in another for(){} loop that you run 10,000 times. Save the result of each game in a games vector that will have 10,000 elements that are either TRUE/FALSE depending on the outcome of that game. You’ll need to initialize the games vector to NULL and modify your part (b) code to save the result into some location in the vector games.

games <- NULL  # grows to hold one TRUE/FALSE result per game
for (j in 1:10000) {
  for (i in 1:24) {  # inner loop: one game of 24 throws
    throws[i] <- sum(sample(x=1:6, size=2, replace=TRUE))
  }
  games[j] <- any(throws == 12)  # TRUE if any throw was double sixes
}
str(games)
##  logi [1:10000] FALSE TRUE FALSE FALSE TRUE FALSE ...

(d) Finally, calculate the win proportion by taking the average of the games vector.

Game Average.Wins
Games 1-10,000 0.4973
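
The simulated value can be checked against the exact answer. A single roll fails to be double sixes with probability 35/36, so all 24 rolls fail with probability (35/36)^24, and the player wins with the complement. The sketch below computes that value and, as an alternative to nested loops, reruns the simulation with replicate(); the function name play_game is hypothetical.

```r
# Exact probability of at least one double six in 24 rolls
exact <- 1 - (35 / 36)^24
print(round(exact, 4))   # about 0.4914

# One game: TRUE if any of the 24 throws sums to 12
play_game <- function(n_throws = 24) {
  throws <- replicate(n_throws, sum(sample(x = 1:6, size = 2, replace = TRUE)))
  any(throws == 12)
}

# Repeat the game 10,000 times and average the TRUE/FALSE results
games <- replicate(10000, play_game())
print(mean(games))   # varies by run; should land near the exact value
```

With 10,000 games the standard error of the simulated proportion is roughly 0.005, so a result such as 0.4973 is consistent with the exact value.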