1: Filtering for Probabilities

The following code will load the Titanic data set as a training set and a testing set.

library("tidyverse")
library("titanic")

train_df <- titanic_train #training set of Titanic data
test_df <- titanic_test #testing set of Titanic data

Use R to compute the following probabilities.

  1. Based on the training set sample, what is the probability that a randomly selected passenger was in first class or had an age \(\leq\) 35?
N <- nrow(train_df) # number of passengers in the data set

train_df %>%
  # first class OR age at most 35; rows where the condition is NA are dropped
  filter(Pclass == 1 | Age <= 35) %>%
  summarize(prob = n() / N) # matching rows divided by all rows
  2. Based on the testing set sample, what is the probability that a randomly selected passenger was both in second class and had an age \(>\) 35?
N <- nrow(test_df) # number of passengers in the data set

test_df %>%
  # second class AND older than 35; rows where the condition is NA are dropped
  filter(Pclass == 2 & Age > 35) %>%
  summarize(prob = n() / N) # matching rows divided by all rows
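
One detail worth keeping in mind for compound conditions like the ones above is missing data: the Age column in both titanic_train and titanic_test contains NA values, and filter() keeps only the rows where the condition is TRUE. The sketch below uses a small made-up data frame (`toy`, not the Titanic data) and base R only, to show how `|` and `&` behave when one side is NA. All names in it are illustrative.

```r
# Toy data, invented for illustration (not the Titanic data)
toy <- data.frame(
  Pclass = c(1, 1, 2, 3, 3, 2),
  Age    = c(20, NA, 30, NA, 50, 35)
)

in_first  <- toy$Pclass == 1
age_le_35 <- toy$Age <= 35        # NA wherever Age is missing

either <- in_first | age_le_35    # TRUE | NA is TRUE, FALSE | NA is NA
both   <- in_first & age_le_35    # TRUE & NA is NA,   FALSE & NA is FALSE

# Count only rows known to satisfy the condition, which mirrors how
# filter() drops rows whose condition evaluates to NA
p_either <- sum(either, na.rm = TRUE) / nrow(toy)
p_both   <- sum(both,   na.rm = TRUE) / nrow(toy)
```

Because the denominator is the full row count, passengers with an unknown age are treated as not satisfying the age condition. Whether that is the intended reading of the question is an assumption to state in the answer.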

2: Coin Simulations

Starting with the following code, run a simulation and make a ggplot bar plot for each of the following.

a. distribution of one coin flip

library("ggplot2")
coin <-  c("heads", "tails")

one_coin <- sample(coin, 1, replace = TRUE) # a single flip
simulation <- replicate(10000, sample(coin, 1, replace = TRUE)) # 10,000 independent flips
df <- data.frame(simulation)

ggplot(df, aes(x = simulation)) +
  geom_bar() +
  ggtitle("Distribution of one coin flip") +
  theme(axis.text.x = element_text(angle = 270, vjust = 0.33))

b. distribution of two coin flips

two_coins <- paste(sample(coin, 2, replace = TRUE), collapse = " ") # one pair of flips, e.g. "heads tails"
simulation <- replicate(10000, paste(sample(coin, 2, replace = TRUE), collapse = " "))
df <- data.frame(simulation)

ggplot(df, aes(x = simulation)) +
  geom_bar() +
  ggtitle("Distribution of two coin flips") +
  theme(axis.text.x = element_text(angle = 270, vjust = 0.33))

c. distribution of three coin flips

three_coins <- paste(sample(coin, 3, replace = TRUE), collapse = " ") # one triple of flips
simulation <- replicate(10000, paste(sample(coin, 3, replace = TRUE), collapse = " "))
df <- data.frame(simulation)

ggplot(df, aes(x = simulation)) +
  geom_bar() +
  ggtitle("Distribution of three coin flips") +
  theme(axis.text.x = element_text(angle = 270, vjust = 0.33))
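
As a rough check on the bar plots, the simulated proportions can be compared with the theoretical value: for a fair coin, each ordered sequence of n flips has probability (1/2)^n. The sketch below wraps the replicate() call in a hypothetical helper, `simulate_flips()` (a name introduced here, not part of the assignment), and uses base R only. The seed is arbitrary and is set just to make the run reproducible.

```r
coin <- c("heads", "tails")

# Hypothetical helper: run n_reps experiments of n_coins flips each and
# return one string per experiment, e.g. "heads tails heads"
simulate_flips <- function(n_coins, n_reps = 10000) {
  replicate(n_reps, paste(sample(coin, n_coins, replace = TRUE), collapse = " "))
}

set.seed(1)                        # arbitrary seed, for reproducibility only
sims  <- simulate_flips(3)
props <- prop.table(table(sims))   # observed relative frequency of each outcome

theory  <- (1 / 2)^3               # probability of each ordered 3-flip outcome
max_gap <- max(abs(props - theory))  # largest deviation from theory
```

With 10,000 repetitions the bars should be close to equal in height, so `max_gap` is expected to be small. Passing `data.frame(simulation = sims)` to ggplot() would give the same kind of bar plot as above.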

library("ggplot2")
coin <- c("heads", "tails")


#one_coin <- sample(coin, 1)
#two_coins <- paste(sample(coin, 2, replace = TRUE), collapse = " ")
#three_coins <- paste(sample(coin, 3, replace = TRUE), collapse = " ")

Be sure to use ggtitle to put a title on each graph, and add a layer such as theme(axis.text.x = element_text(angle = 270, vjust = 0.33)) to rotate the x-axis labels so that they do not overlap.


When you are done, be sure that your name is in the file name and on the top of this markdown document. Find the HTML file on your computer and upload the HTML file back into our CatCourses page for this homework assignment.