Read the Titanic dataset.
titanic <- read.csv("Titanic Data.csv")
head(titanic) # first few rows of the data frame
## Survived Pclass Sex Age SibSp Parch Fare Embarked
## 1 0 3 male 22.0 1 0 7.2500 S
## 2 1 1 female 38.0 1 0 71.2833 C
## 3 1 3 female 26.0 0 0 7.9250 S
## 4 1 1 female 35.0 1 0 53.1000 S
## 5 0 3 male 35.0 0 0 8.0500 S
## 6 0 3 male 29.7 0 0 8.4583 Q
View the data to confirm that it matches what we saw in the Excel file.
View(titanic)
totalpassenger<-table(titanic$Sex)
totalpassenger
##
## female male
## 312 577
addmargins(totalpassenger)
##
## female male Sum
## 312 577 889
Second method:
dim(titanic)[1]
## [1] 889
So the total number of passengers on board was 889.
3b Use R to count the number of passengers who survived the sinking of the Titanic.
survivedtable<-table(titanic$Survived)
survivedtable[2]
## 1
## 340
Second method:
nrow(subset(titanic, Survived == 1))
## [1] 340
# or use the length function (compare against the number 1, not the string "1")
length(titanic$Survived[titanic$Survived == 1])
## [1] 340
So the number of passengers who survived the sinking of the Titanic was 340.
3c Use R to measure the percentage of passengers who survived the sinking of the Titanic.
prop.table(survivedtable)
##
## 0 1
## 0.6175478 0.3824522
(100*prop.table(survivedtable))[2]
## 1
## 38.24522
Second method:
x<-dim(titanic)[1]
y<-survivedtable[2]
z<-y/x
100*z
## 1
## 38.24522
So the percentage of passengers who survived the sinking of the Titanic is 38.245%.
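As a possible shortcut, the same percentage can be obtained as the mean of a logical vector, because R counts TRUE as 1 and FALSE as 0. The sketch below does not read the CSV file. It assumes a stand-in vector, `survived` (a hypothetical name), rebuilt from the counts found earlier: 340 survivors out of 889 passengers.

```r
# Stand-in for titanic$Survived, rebuilt from the counts found earlier
# (340 survivors out of 889 passengers). This is an assumption for
# illustration; the real column is not in this order.
survived <- rep(c(0, 1), times = c(889 - 340, 340))

# The mean of a logical vector is the proportion of TRUE values.
pct_survived <- 100 * mean(survived == 1)
round(pct_survived, 3)
```

With the real data frame, the equivalent call would be `100 * mean(titanic$Survived == 1)`.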
3d Use R to count the number of first-class passengers who survived the sinking of the Titanic.
psurvived<-xtabs(~Survived+Pclass,data = titanic)
addmargins(psurvived)
## Pclass
## Survived 1 2 3 Sum
## 0 80 97 372 549
## 1 134 87 119 340
## Sum 214 184 491 889
addmargins(psurvived)[2,1]
## [1] 134
So the number of first-class passengers who survived the sinking of the Titanic is 134.
3e Use R to measure the percentage of first-class passengers who survived the sinking of the Titanic.
prop.table(psurvived)
## Pclass
## Survived 1 2 3
## 0 0.08998875 0.10911136 0.41844769
## 1 0.15073116 0.09786277 0.13385827
(100*prop.table(psurvived))[2,1]
## [1] 15.07312
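So first-class survivors make up 15.073% of all 889 passengers. Note that this is a share of everyone on board. Read as "what percentage of first-class passengers survived", the question calls for 134 out of the 214 first-class passengers, which is about 62.62%.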
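The figure printed by prop.table(psurvived) divides every cell by the grand total of 889. If the question is read as conditional on class, the column totals are the right denominators. A minimal sketch, assuming the counts from the addmargins() output above are typed in by hand instead of being read from the CSV:

```r
# Survived-by-class counts copied from the addmargins() output above.
psurvived <- matrix(c(80, 134, 97, 87, 372, 119), nrow = 2,
                    dimnames = list(Survived = c("0", "1"),
                                    Pclass = c("1", "2", "3")))

# margin = 2 divides each column by its column total, giving
# survival proportions within each passenger class.
pct_by_class <- 100 * prop.table(psurvived, margin = 2)
round(pct_by_class["1", "1"], 2)  # 134 / 214, about 62.62
```

On the real xtabs object the same call, `prop.table(psurvived, margin = 2)`, should work unchanged.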
3f Use R to count the number of females from First-Class who survived the sinking of the Titanic.
femalesurvivor<-xtabs(~Survived+Sex+Pclass,data = titanic)
femalesurvivor
## , , Pclass = 1
##
## Sex
## Survived female male
## 0 3 77
## 1 89 45
##
## , , Pclass = 2
##
## Sex
## Survived female male
## 0 6 91
## 1 70 17
##
## , , Pclass = 3
##
## Sex
## Survived female male
## 0 72 300
## 1 72 47
ftable(femalesurvivor)
## Pclass 1 2 3
## Survived Sex
## 0 female 3 6 72
## male 77 91 300
## 1 female 89 70 72
## male 45 17 47
ftable(femalesurvivor)[3,1]
## [1] 89
So the number of females from first class who survived the sinking of the Titanic is 89.
Note: we used the ftable function to give a compact view of the table.
3g Use R to measure the percentage of survivors who were female first class
prop.table(ftable(femalesurvivor))
## Pclass 1 2 3
## Survived Sex
## 0 female 0.003374578 0.006749156 0.080989876
## male 0.086614173 0.102362205 0.337457818
## 1 female 0.100112486 0.078740157 0.080989876
## male 0.050618673 0.019122610 0.052868391
(100*prop.table(ftable(femalesurvivor)))
## Pclass 1 2 3
## Survived Sex
## 0 female 0.3374578 0.6749156 8.0989876
## male 8.6614173 10.2362205 33.7457818
## 1 female 10.0112486 7.8740157 8.0989876
## male 5.0618673 1.9122610 5.2868391
(100*prop.table(ftable(femalesurvivor)))[3,1]
## [1] 10.01125
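So first-class female survivors make up 10.01125% of all 889 passengers. Note that this is a share of everyone on board. As a percentage of the 340 survivors only, the 89 first-class female survivors are about 26.18%.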
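The table above divides each cell by all 889 passengers. To express first-class female survivors as a share of survivors only, the proportions can be taken within each level of Survived. A minimal sketch, assuming the three-way counts printed in 3f are typed in by hand (the array name `counts` is hypothetical):

```r
# Three-way counts copied from the xtabs() output in 3f, filled in
# column-major order: Survived varies fastest, then Sex, then Pclass.
counts <- array(c(3, 89, 77, 45,     # Pclass 1
                  6, 70, 91, 17,     # Pclass 2
                  72, 72, 300, 47),  # Pclass 3
                dim = c(2, 2, 3),
                dimnames = list(Survived = c("0", "1"),
                                Sex = c("female", "male"),
                                Pclass = c("1", "2", "3")))

# margin = 1 divides every cell by the total for its Survived level,
# so the "1" slice sums to 100% of survivors.
pct_within_survival <- 100 * prop.table(counts, margin = 1)
round(pct_within_survival["1", "female", "1"], 2)  # 89 / 340, about 26.18
```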
3h Use R to measure the percentage of females on board the Titanic who survived
femsurv<-xtabs(~Survived+Sex,data = titanic)
prop.table(femsurv)
## Sex
## Survived female male
## 0 0.09111361 0.52643420
## 1 0.25984252 0.12260967
(100*prop.table(femsurv))
## Sex
## Survived female male
## 0 9.111361 52.643420
## 1 25.984252 12.260967
(100*prop.table(femsurv))[2,1]
## [1] 25.98425
So surviving females make up 25.98425% of all 889 passengers. Note that this is a share of everyone on board. As a percentage of the 312 females on board, the 231 who survived are about 74.04%.
3i Run a Pearson’s Chi-squared test to test the following hypothesis. Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.
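To get survival rates within each sex, the proportions can be computed by column instead of over the whole table. A minimal sketch, assuming the counts are typed in by hand: they are the sums over class of the three-way table in 3f (81 and 231 for females, 468 and 109 for males).

```r
# Survived-by-sex counts, summed over class from the table in 3f.
femsurv <- matrix(c(81, 231, 468, 109), nrow = 2,
                  dimnames = list(Survived = c("0", "1"),
                                  Sex = c("female", "male")))

# margin = 2 divides each column by its column total, giving the
# percentage of each sex that did or did not survive.
pct_by_sex <- 100 * prop.table(femsurv, margin = 2)
round(pct_by_sex["1", ], 2)  # female about 74.04, male about 18.89
```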
chisq.test(femsurv)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: femsurv
## X-squared = 258.43, df = 1, p-value < 2.2e-16
chisq.test(femsurv)$p.value
## [1] 3.77991e-58
Here we can see that the p-value is far below 0.05.
So we reject the null hypothesis that survival is independent of sex. The chi-squared test is two-sided: on its own it shows only that the survival proportions of females and males differ. The direction comes from the table: 231 of 312 females (74.04%) survived, against 109 of 577 males (18.89%). The data therefore support the hypothesis that the proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived.
Important note: rejecting the null hypothesis does not prove it false. It says that, given the data, the null hypothesis does not seem plausible.
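Because the stated hypothesis is directional, a one-sided test of two proportions is one way to check the direction formally. The sketch below swaps in prop.test with alternative = "greater" in place of chisq.test, and it assumes the survivor and passenger counts per sex are typed in by hand from the tables above (231 of 312 females, 109 of 577 males).

```r
# Survivors and totals per sex, taken from the tables above.
survivors <- c(female = 231, male = 109)
totals    <- c(female = 312, male = 577)

# alternative = "greater" tests whether the first proportion (female)
# is greater than the second (male).
one_sided <- prop.test(x = survivors, n = totals, alternative = "greater")

one_sided$estimate  # sample survival proportions for the two groups
one_sided$p.value   # one-sided p-value
```

A small one-sided p-value here would be evidence for the directional claim, which the two-sided chi-squared test alone does not establish.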