Reading the Titanic dataset and taking a first look:
titanic <- read.csv("Titanic Data.csv")
head(titanic)
## Survived Pclass Sex Age SibSp Parch Fare Embarked
## 1 0 3 male 22.0 1 0 7.2500 S
## 2 1 1 female 38.0 1 0 71.2833 C
## 3 1 3 female 26.0 0 0 7.9250 S
## 4 1 1 female 35.0 1 0 53.1000 S
## 5 0 3 male 35.0 0 0 8.0500 S
## 6 0 3 male 29.7 0 0 8.4583 Q
Counting the total number of passengers
sum(table(titanic$Survived))
## [1] 889
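One caveat with counting passengers this way: `table()` drops missing values by default, so `sum(table(x))` counts only the non-missing entries, while `nrow()` counts every row. The sketch below uses a small made-up data frame (`toy` is hypothetical, not part of the Titanic data) to show the difference.

```r
# Hypothetical toy data with one missing Survived value (illustration only)
toy <- data.frame(Survived = c(0, 1, NA, 1))

n_rows  <- nrow(toy)                  # counts all rows, including the NA row
n_known <- sum(table(toy$Survived))   # table() excludes NA by default

n_rows
n_known
```

If the two numbers agree on the real data, the column has no missing values and either count can be used.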
Counting the number of people who survived the sinking of the Titanic:
t=data.frame(table(titanic$Survived))
names(t)[1]= 'Survived'
t
## Survived Freq
## 1 0 549
## 2 1 340
As shown, 340 people survived the tragedy. Calculating the percentage of people who survived:
y=table(titanic$Survived)
prop.table(y)*100
##
## 0 1
## 61.75478 38.24522
As seen, about 38.2% of the passengers survived.
Now let us use R to count the number of first-class passengers who survived the sinking of the Titanic.
mytable <- xtabs(~Survived + Pclass,data= titanic[titanic$Pclass=="1", ])
mytable
## Pclass
## Survived 1
## 0 80
## 1 134
Thus 134 first-class passengers survived.
Let's use R to measure the percentage of first-class passengers who survived the sinking of the Titanic.
prop.table(mytable)*100
## Pclass
## Survived 1
## 0 37.38318
## 1 62.61682
Thus about 62.6% of first-class passengers survived and about 37.4% did not.
Let's use R to count the number of first-class females who survived the sinking of the Titanic.
mytable1=(subset(titanic,Pclass=='1' & Survived=='1',select=c(Pclass,Survived,Sex)))
ftable(mytable1)
## Sex female male
## Pclass Survived
## 1 1 89 45
Thus 89 females from first class survived the sinking of the Titanic.
Let's use R to measure the percentage of survivors who were female, i.e. the share of females among all survivors, men and women together.
q = subset(titanic,Survived=='1',select=c(Sex,Survived))
mytable2 <- xtabs(~Survived + Sex, data=q)
prop.table(mytable2)*100
## Sex
## Survived female male
## 1 67.94118 32.05882
Thus about 67.9% of the survivors were female.
Let's use R to measure the percentage of females on board the Titanic who survived, i.e. out of all female passengers on board.
p=xtabs(~Survived + Sex,data = titanic)
prop.table(p)*100
## Sex
## Survived female male
## 0 9.111361 52.643420
## 1 25.984252 12.260967
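Note that prop.table(p) without a margin expresses each cell as a share of all passengers: females who survived make up about 26.0% of everyone on board, and males who survived about 12.3%. To get the share of females who survived, divide by the female column total: 25.98 / (9.11 + 25.98) is about 74.0%. For males the same calculation gives 12.26 / (52.64 + 12.26), about 18.9%.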
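The table above reports each cell as a percentage of all passengers. To express survival as a percentage within each sex, `prop.table()` can be given `margin = 2` so that every column sums to 100. A minimal sketch follows; the counts in `p_counts` are an assumption, reconstructed from the printed percentages and the total of 889 passengers, so that the block runs without the CSV file. With the data loaded, `prop.table(p, margin = 2) * 100` should give the same result.

```r
# Counts reconstructed from the printed percentages and the 889 total
# (assumed values; with the data loaded, use the table p directly).
p_counts <- as.table(matrix(
  c(81, 231,     # female: died, survived
    468, 109),   # male:   died, survived
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
))

# margin = 2 divides each cell by its column (Sex) total,
# so each column shows the outcome split within one sex.
by_sex <- prop.table(p_counts, margin = 2) * 100
round(by_sex, 2)
```

Using `margin = 1` instead would divide by the row totals and give the split by sex within each survival outcome.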
Let's run a Pearson’s Chi-squared test to test the following hypothesis:
Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.
chisq.test(p)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: p
## X-squared = 258.43, df = 1, p-value < 2.2e-16
library(vcd)  # assocstats() comes from the vcd package
assocstats(p)
## X^2 df P(> X^2)
## Likelihood Ratio 266.21 1 0
## Pearson 260.76 1 0
##
## Phi-Coefficient : 0.542
## Contingency Coeff.: 0.476
## Cramer's V : 0.542
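Since the p-value is very small (< 2.2e-16), we reject the null hypothesis that survival was independent of sex. The chi-squared test itself is not directional, but the table p gives the direction: about 74% of females on board survived against about 19% of males. We therefore conclude that the proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.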
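Because the hypothesis is directional (females higher than males) while the chi-squared test of independence is not, a one-sided two-sample test of proportions is one way to test it directly. The sketch below uses `prop.test()` from base R with `alternative = "greater"`. The survivor and passenger counts are assumptions reconstructed from the tables above, not read from the CSV file.

```r
# Survivors and totals per sex, reconstructed from the tables above
# (assumed counts; the first group is the one tested as "greater").
survived <- c(female = 231, male = 109)
onboard  <- c(female = 312, male = 577)

# H0: p_female = p_male    H1: p_female > p_male
res <- prop.test(x = survived, n = onboard, alternative = "greater")

res$estimate   # sample survival proportion in each group
res$p.value    # one-sided p-value
```

A very small one-sided p-value here supports the directional claim, in line with the conclusion drawn from the chi-squared test and the observed proportions.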