The analysis is based on the dataset Titanic Data.csv.
This file explains the meaning of each column in the given dataset.
Using the read.csv() function in R, the data is read and stored in a data frame called titanic.df.
titanic.df <- read.csv("Titanic Data.csv")
Using the View() function, the data frame is viewed.
View(titanic.df)
Counting the total number of passengers on board the Titanic.
dim(titanic.df)
## [1] 889 8
length(titanic.df$Survived) # alternative: number of entries in one column
## [1] 889
Therefore, the total number of passengers recorded in the dataset = 889 (889 rows, 8 columns).
Counting the number of passengers who survived the sinking of the Titanic. Survival {0 = No, 1 = Yes}
mytable <- with(titanic.df, table(Survived))
mytable # frequencies
## Survived
##   0   1
## 549 340
Therefore, the number of passengers who survived the sinking of the Titanic = 340
Measuring the percentage of passengers who survived the sinking of the Titanic.
prop.table(mytable)*100
## Survived
##        0        1
## 61.75478 38.24522
Therefore, the percentage of passengers who survived the sinking of the Titanic = 38.24522
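Because Survived is coded 0/1, the same percentage can also be obtained as the mean of that column. The sketch below is self-contained: rather than reading the CSV, it rebuilds a 0/1 vector from the counts reported above (549 and 340). The name survived is an illustrative stand-in for titanic.df$Survived, not something defined in the analysis.

```r
# Rebuild a 0/1 survival vector from the reported frequencies.
# 'survived' is a hypothetical stand-in for titanic.df$Survived.
survived <- rep(c(0, 1), times = c(549, 340))

# The mean of a 0/1 indicator equals the proportion of 1s.
survival_pct <- mean(survived) * 100
print(survival_pct)
```

With the real data frame, the equivalent call would be mean(titanic.df$Survived) * 100, assuming the column contains no missing values.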
Counting the number of first-class passengers who survived the sinking of the Titanic.
mytable <- xtabs(~ Pclass+Survived, data=titanic.df)
mytable # frequencies
##       Survived
## Pclass   0   1
##      1  80 134
##      2  97  87
##      3 372 119
Therefore, the number of first-class passengers who survived the sinking of the Titanic = 134
Measuring the percentage of first-class passengers who survived the sinking of the Titanic.
prop.table(mytable, 1)*100
##       Survived
## Pclass        0        1
##      1 37.38318 62.61682
##      2 52.71739 47.28261
##      3 75.76375 24.23625
Therefore, the percentage of first-class passengers who survived the sinking of the Titanic = 62.61682
Counting the number of first-class females who survived the sinking of the Titanic.
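Rather than reading the value off the printed table, a single cell can be pulled out by its dimension names. This is a minimal sketch that rebuilds the Pclass by Survived table from the counts shown above so that it runs without the CSV; the names class_tab, row_pct and first_class_pct are assumptions made for illustration.

```r
# Rebuild the Pclass x Survived contingency table from the reported counts.
# (Hypothetical stand-in for xtabs(~ Pclass + Survived, data = titanic.df).)
class_tab <- as.table(matrix(
  c(80, 97, 372,     # Survived = 0 for Pclass 1, 2, 3
    134, 87, 119),   # Survived = 1 for Pclass 1, 2, 3
  nrow = 3,
  dimnames = list(Pclass = c("1", "2", "3"), Survived = c("0", "1"))
))

# margin = 1 normalises each row, so every class sums to 100%.
row_pct <- prop.table(class_tab, 1) * 100

# Index by name: row "1" (first class), column "1" (survived).
first_class_pct <- row_pct["1", "1"]
print(first_class_pct)
```

Indexing by name keeps the result tied to the category labels, so it does not depend on the order in which the levels happen to be printed.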
mytable <- xtabs(~ Sex+Pclass+Survived, data=titanic.df)
ftable(mytable)
##               Survived   0   1
## Sex    Pclass
## female 1                 3  89
##        2                 6  70
##        3                72  72
## male   1                77  45
##        2                91  17
##        3               300  47
Therefore, the number of first-class females who survived the sinking of the Titanic = 89
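The three-way table can be indexed by name in the same way, and collapsed over one dimension with margin.table(). The sketch below rebuilds the Sex by Pclass by Survived counts shown above as an array so it can run on its own; tab3 and the other variable names are illustrative assumptions, not objects from the analysis.

```r
# Counts in column-major order: Sex varies fastest, then Pclass, then Survived.
counts <- c(3, 77, 6, 91, 72, 300,     # Survived = 0
            89, 45, 70, 17, 72, 47)    # Survived = 1

# Hypothetical stand-in for xtabs(~ Sex + Pclass + Survived, data = titanic.df).
tab3 <- as.table(array(
  counts,
  dim = c(2, 3, 2),
  dimnames = list(Sex = c("female", "male"),
                  Pclass = c("1", "2", "3"),
                  Survived = c("0", "1"))
))

# One cell: female, first class, survived.
female_first_survived <- tab3["female", "1", "1"]
print(female_first_survived)

# Sum over Pclass, keeping dimensions 1 (Sex) and 3 (Survived).
sex_by_survival <- margin.table(tab3, c(1, 3))
print(sex_by_survival)
```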
Measuring the percentage of survivors who were female
margin.table(mytable, c(1,3))
##         Survived
## Sex        0   1
##   female  81 231
##   male   468 109
prop.table(margin.table(mytable, c(1,3)), 2)*100
##         Survived
## Sex             0        1
##   female 14.75410 67.94118
##   male   85.24590 32.05882
Therefore, the percentage of survivors who were female = 67.94118 (231 of the 340 survivors)
Measuring the percentage of females on board the Titanic who survived
margin.table(mytable, c(1,3))
##         Survived
## Sex        0   1
##   female  81 231
##   male   468 109
prop.table(margin.table(mytable, c(1,3)), 1)*100
##         Survived
## Sex             0        1
##   female 25.96154 74.03846
##   male   81.10919 18.89081
Therefore, the percentage of females on board the Titanic who survived = 74.03846 (231 of the 312 females)
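The last two results use the same cell (231 surviving females) but different denominators, which is what the margin argument of prop.table() controls. The sketch below sets the two side by side, using a table rebuilt from the counts shown above; sex_tab and the result names are illustrative assumptions.

```r
# Rebuild the Sex x Survived table from the reported counts.
# (Hypothetical stand-in for margin.table(mytable, c(1, 3)).)
sex_tab <- as.table(matrix(
  c(81, 468, 231, 109),
  nrow = 2,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
))

# margin = 2: each column sums to 100%, so the denominator is all survivors.
pct_survivors_female <- prop.table(sex_tab, 2)["female", "1"] * 100

# margin = 1: each row sums to 100%, so the denominator is all females.
pct_females_survived <- prop.table(sex_tab, 1)["female", "1"] * 100

# The same figures by direct arithmetic.
print(c(pct_survivors_female, 231 / (231 + 109) * 100))
print(c(pct_females_survived, 231 / (81 + 231) * 100))
```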
Run a Pearson’s Chi-squared test of the following hypothesis: the proportion of females on board who survived the sinking of the Titanic was higher than the proportion of males on board who survived.
mytable <- xtabs(~ Sex+Survived, data=titanic.df)
chisq.test(mytable)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16
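Since the p-value is very small (p < 2.2e-16, well below 0.01), we reject the null hypothesis that sex and survival are independent. The Chi-squared test itself is not directional; the direction comes from the proportions computed earlier, where 74.04% of females survived compared with 18.89% of males, which supports the hypothesis.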
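Since the stated hypothesis is directional, one option is a one-sided two-sample test of proportions with prop.test(). This is a different technique from the chisq.test() call used above and is offered only as a sketch; it works from the counts already reported (231 of 312 females and 109 of 577 males survived), and the names survivors, totals and res are assumptions.

```r
# Survivors and group sizes taken from the Sex x Survived table above.
survivors <- c(female = 231, male = 109)
totals    <- c(female = 312, male = 577)

# alternative = "greater" tests whether the first group's proportion
# (females) exceeds the second group's (males).
res <- prop.test(x = survivors, n = totals, alternative = "greater")

print(res$estimate)   # sample survival proportions for the two groups
print(res$p.value)    # one-sided p-value
```

A small one-sided p-value here would point in the same direction as the Chi-squared result, with the alternative stated explicitly as females surviving at a higher rate than males.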