Download and review the Titanic Data.csv data file, which records passengers on board the Titanic.
Use the read.csv() function in R to read the data and store it in a dataframe called “titanic.df”.
Use the View() function in R to view the dataframe.
titanic.df <- read.csv("Titanic Data.csv")
View(titanic.df)
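Before counting anything, it can help to confirm that the file was parsed as expected. The following is a minimal sketch that writes a small made-up data frame to a temporary file and reads it back, standing in for Titanic Data.csv. The toy rows and the names `toy`, `tmp` and `toy.df` are illustrative assumptions, not values from the real data set.

```r
# Toy stand-in for the real file; these rows are invented for illustration.
toy <- data.frame(
  Survived = c(0, 1, 1, 0),
  Pclass   = c(3, 1, 2, 3),
  Sex      = c("male", "female", "female", "male")
)
tmp <- tempfile(fileext = ".csv")
write.csv(toy, tmp, row.names = FALSE)

# read.csv() already assumes a comma separator and a header row.
toy.df <- read.csv(tmp)

str(toy.df)    # column names and types
head(toy.df)   # first few rows
dim(toy.df)    # number of rows and columns
```

The same three calls can be run on titanic.df; unlike View(), they also work outside an interactive session.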
Count the total number of passengers on board the Titanic.
nrow(titanic.df)
## [1] 889
Count the number of passengers who survived the sinking of the Titanic.
nrow(subset(titanic.df,Survived==1))
## [1] 340
Measure the percentage of passengers who survived the sinking of the Titanic.
(prop.table(table(titanic.df$Survived))*100)[2]
## 1
## 38.24522
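Because Survived is coded 0/1 (as the Survived==1 filter above implies), the count and the percentage can also be obtained with sum() and mean(). This is a sketch on an invented vector; the name `survived` and its values are assumptions for illustration only.

```r
# Invented 0/1 survival flags, for illustration only.
survived <- c(0, 1, 1, 0, 0, 1, 0, 0)

n_survived   <- sum(survived == 1)          # count of survivors
pct_survived <- mean(survived == 1) * 100   # share of survivors, in percent

# Same percentage via table() and prop.table(),
# selected by level name rather than by position.
pct_by_table <- (prop.table(table(survived)) * 100)[["1"]]
```

Selecting the cell by its name ("1") keeps the result correct even if the order of the levels changes.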
Count the number of first-class passengers who survived the sinking of the Titanic.
mytable <- xtabs(~Survived+Pclass,data=titanic.df)
mytable[2]  # second cell in column-major order: Survived = 1, first Pclass level
## [1] 134
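Measure the percentage of first-class passengers who survived the sinking of the Titanic. Called without a margin, prop.table() divides each cell by the grand total, so the figure below is first-class survivors as a share of all passengers; prop.table(mytable, 2) would instead give the share within first class.
(prop.table(mytable)*100)[2]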
## [1] 15.07312
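The effect of the margin argument of prop.table() can be seen on a small example. This is a sketch on invented records; the data frame `toy` and the names `tab`, `share_of_all` and `share_within_class` are assumptions for illustration, and Pclass is assumed to be coded 1, 2, 3.

```r
# Invented records, for illustration only.
toy <- data.frame(
  Survived = c(1, 1, 0, 1, 0, 0, 0, 1),
  Pclass   = c(1, 1, 1, 2, 2, 3, 3, 3)
)
tab <- xtabs(~ Survived + Pclass, data = toy)

# No margin: each cell is divided by the grand total (8 records).
share_of_all <- (prop.table(tab) * 100)["1", "1"]

# margin = 2: each cell is divided by its column (Pclass) total.
share_within_class <- (prop.table(tab, 2) * 100)["1", "1"]
```

In the toy data, two of the eight passengers are first-class survivors, while two of the three first-class passengers survived, so the two quantities differ.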
Count the number of first-class females who survived the sinking of the Titanic.
female <- xtabs(~Survived+Pclass+Sex,data=titanic.df)
(ftable(female))[4]  # row 4 (Survived = 1, first Pclass level), column 1 (female)
## [1] 89
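Positional indexing into a flattened table is easy to get wrong, so it may be worth cross-checking the cell by name. This is a sketch on invented records; it assumes Survived is coded 0/1, Pclass is coded 1/2/3 and Sex takes the values "female" and "male", which should be confirmed against the real file.

```r
# Invented records, for illustration only.
toy <- data.frame(
  Survived = c(1, 1, 0, 1, 0, 1),
  Pclass   = c(1, 1, 1, 2, 3, 1),
  Sex      = c("female", "female", "male", "female", "male", "male")
)
tab3 <- xtabs(~ Survived + Pclass + Sex, data = toy)

# Select the cell by level names instead of by position.
by_name <- tab3["1", "1", "female"]

# Positional lookup in the flattened table, as in the text above.
by_position <- ftable(tab3)[4]

# Independent check: filter the rows directly and count them.
by_filter <- nrow(subset(toy, Survived == 1 & Pclass == 1 & Sex == "female"))
```

All three values should agree; a mismatch would mean that the positional index does not point at the intended cell.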
Measure the percentage of survivors who were female.
mytable <- xtabs(~Survived+Sex,data=titanic.df)
(prop.table(mytable,1)*100)[2,1]
## [1] 67.94118
Measure the percentage of females on board the Titanic who survived.
(prop.table(mytable,2)*100)[2,1]
## [1] 74.03846
Run a Pearson’s Chi-squared test to test the following hypothesis:
Hypothesis: The proportion of females on board who survived the sinking of the Titanic was higher than the proportion of males on board who survived the sinking of the Titanic.
chisq.test(mytable)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16
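Since the p-value is below 0.05, we reject the null hypothesis that survival was independent of sex. The chi-squared test is not directional; the direction comes from the proportions above, where 74.04% of females survived against 38.25% of passengers overall, so the data support the hypothesis.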
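Since the stated hypothesis is one-sided, a directional test of two proportions is one possible complement to the chi-squared test. The sketch below uses prop.test() from base R's stats package with alternative = "greater". This is a different technique from the one run above, and the counts in `survivors` and `totals` are invented for illustration.

```r
# Invented counts, for illustration only: survivors and totals by sex.
survivors <- c(female = 30, male = 10)
totals    <- c(female = 40, male = 50)

# H0: p_female = p_male
# H1: p_female > p_male (first group greater than second)
res <- prop.test(x = survivors, n = totals, alternative = "greater")

res$estimate   # observed proportion in each group
res$p.value    # one-sided p-value
```

With the real data, the counts would be taken from the Survived by Sex table, listing females first so that "greater" matches the direction of the hypothesis.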