Sinking of the Titanic dataset

Read the CSV file into the data frame titanic.df and view it.

setwd("C:/Users/Kalyan/Downloads")
titanic.df<-read.csv("Titanic Data.csv")
View(titanic.df)

Total number of passengers aboard the Titanic

dim(titanic.df)
## [1] 889   8

dim() returns the number of rows and columns: the dataset has 889 passengers (rows) and 8 variables (columns).

Total number of passengers who survived the sinking

sum(titanic.df$Survived==1)
## [1] 340

Total number of passengers who survived the sinking is 340.

Percentage of passengers who survived the sinking

survivaltable<-table(titanic.df$Survived)
survivaltable
## 
##   0   1 
## 549 340
prop.table(survivaltable)*100
## 
##        0        1 
## 61.75478 38.24522

38.25% of the passengers survived the sinking.
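
The percentages follow directly from the counts: each count is divided by the total and multiplied by 100. A minimal sketch of that arithmetic, using the counts printed in the table above (the names survival_counts and survival_pct are illustrative and not part of the original analysis):

```r
# Counts copied from the survivaltable output above (0 = died, 1 = survived).
survival_counts <- c("0" = 549, "1" = 340)

# Each count as a percentage of all passengers; this is the calculation
# prop.table() performs on a one-way table before the multiplication by 100.
survival_pct <- survival_counts / sum(survival_counts) * 100
round(survival_pct, 2)
```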

Total number of first class passengers who survived the sinking

firstclasssurvivors<-xtabs(~Survived+Pclass,data=titanic.df)
firstclasssurvivors
##         Pclass
## Survived   1   2   3
##        0  80  97 372
##        1 134  87 119

134 first class passengers survived the sinking.

Percentage of first class passengers who survived the sinking

prop.table(firstclasssurvivors,2)*100
##         Pclass
## Survived        1        2        3
##        0 37.38318 52.71739 75.76375
##        1 62.61682 47.28261 24.23625

62.62% of the first class passengers survived the sinking. This percentage is computed within each passenger class (column-wise), so it depends only on the class the passengers were travelling in.

Total number of female first class passengers who survived the sinking

femalefirst<-xtabs(~Survived+Sex+Pclass,data=titanic.df)
femalefirst
## , , Pclass = 1
## 
##         Sex
## Survived female male
##        0      3   77
##        1     89   45
## 
## , , Pclass = 2
## 
##         Sex
## Survived female male
##        0      6   91
##        1     70   17
## 
## , , Pclass = 3
## 
##         Sex
## Survived female male
##        0     72  300
##        1     72   47

89 female first class passengers survived the sinking.
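
Instead of reading the value off the printed output, a single cell of a three-way table can be selected by its dimension labels. The sketch below rebuilds the table from the counts printed above so that it runs without the CSV file; the names counts and tab3 are illustrative assumptions, and with the real data the same indexing should apply to femalefirst.

```r
# Cell counts copied from the three-way table above.
# expand.grid() varies Survived fastest, then Sex, then Pclass.
counts <- expand.grid(Survived = c(0, 1),
                      Sex = c("female", "male"),
                      Pclass = 1:3)
counts$n <- c(3, 89, 77, 45,    # Pclass 1
              6, 70, 91, 17,    # Pclass 2
              72, 72, 300, 47)  # Pclass 3

# Rebuild the Survived x Sex x Pclass table from the counts.
tab3 <- xtabs(n ~ Survived + Sex + Pclass, data = counts)

# Select one cell by label: survived, female, first class.
tab3["1", "female", "1"]

# Collapse over Sex to recover the Survived x Pclass table shown earlier.
margin.table(tab3, c(1, 3))
```

Indexing by label rather than by position keeps the selection readable and guards against picking the wrong cell if the order of the levels changes.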

Percentage of survivors who were female

femalesurvivors<-xtabs(~Survived+Sex,data=titanic.df)
addmargins(femalesurvivors)
##         Sex
## Survived female male Sum
##      0       81  468 549
##      1      231  109 340
##      Sum    312  577 889
prop.table(femalesurvivors,1)*100
##         Sex
## Survived   female     male
##        0 14.75410 85.24590
##        1 67.94118 32.05882

231 of the 340 survivors were female, so 67.94% of the survivors were female.

Percentage of females who survived

addmargins(femalesurvivors)
##         Sex
## Survived female male Sum
##      0       81  468 549
##      1      231  109 340
##      Sum    312  577 889
prop.table(femalesurvivors,2)*100
##         Sex
## Survived   female     male
##        0 25.96154 81.10919
##        1 74.03846 18.89081

231 of the 312 females survived, so 74.04% of the females were survivors.
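
The last two results use the same table and differ only in the margin passed to prop.table(): margin 1 makes each row sum to 100 (share of each sex among survivors), while margin 2 makes each column sum to 100 (share of survivors within each sex). A small sketch of the difference, with the counts copied from the table above (sex_tab, row_pct and col_pct are illustrative names):

```r
# Survived x Sex counts copied from the table above (matrix fills by column).
sex_tab <- matrix(c(81, 231, 468, 109), nrow = 2,
                  dimnames = list(Survived = c("0", "1"),
                                  Sex = c("female", "male")))

# Row percentages: within each survival outcome, the split by sex.
row_pct <- prop.table(sex_tab, margin = 1) * 100

# Column percentages: within each sex, the split by survival outcome.
col_pct <- prop.table(sex_tab, margin = 2) * 100

# Check which direction sums to 100 in each case.
rowSums(row_pct)
colSums(col_pct)
```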

Pearson’s Chi-squared test

Here we check the hypothesis that the proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived. The null hypothesis is that survival and sex are independent.

proportiontable<-xtabs(~Survived+Sex,data=titanic.df)
proportiontable
##         Sex
## Survived female male
##        0     81  468
##        1    231  109
chisq.test(proportiontable)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  proportiontable
## X-squared = 258.43, df = 1, p-value < 2.2e-16

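The p-value is below 2.2e-16, far smaller than 0.05, so we reject the null hypothesis that survival and sex are independent. The chi-squared test itself does not indicate the direction of the association; the direction comes from the proportions computed earlier, where 74.04% of females survived compared with 18.89% of males, which is consistent with the hypothesis.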
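
Because the stated hypothesis is directional, one possible complement to the chi-squared test is a one-sided two-sample test of proportions. The sketch below uses prop.test() from base R's stats package with the counts from the table above; it is an alternative technique offered for illustration, not the method used in the original analysis, and the names survived, onboard and one_sided are assumptions.

```r
# Survivors and totals by sex, copied from the tables above.
survived <- c(female = 231, male = 109)
onboard  <- c(female = 312, male = 577)

# One-sided test: is the female survival proportion greater than the male one?
one_sided <- prop.test(x = survived, n = onboard, alternative = "greater")

# Estimated survival proportions (female first, then male) and the p-value.
one_sided$estimate
one_sided$p.value
```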