setwd("C:/Users/Akshay/Desktop/R Books/Udemy")
titanic <- read.csv("TitanicData.csv")
View(titanic)
table1<- table(titanic$Survived)
table1
##
## 0 1
## 549 340
So the data set contains 889 passengers in total (549 + 340), and of these only 340 survived.
prop.table(table1)*100
##
## 0 1
## 61.75478 38.24522
So only about 38.25% of the passengers in the data set survived.
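The percentages above can be checked by hand from the two counts. This is a minimal sketch that copies the counts from the `table1` output instead of reading the CSV; the names `survived_counts`, `total` and `pct` are made up for illustration.

```r
# Counts copied from the table1 output above (0 = died, 1 = survived)
survived_counts <- c("0" = 549, "1" = 340)

# Total number of passengers in the data set
total <- sum(survived_counts)

# Share of each outcome as a percentage, rounded to two decimals
pct <- round(100 * survived_counts / total, 2)
print(total)
print(pct)
```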
table2 <- xtabs(~ Pclass + Survived, data = titanic)
table2
## Survived
## Pclass 0 1
## 1 80 134
## 2 97 87
## 3 372 119
So from the table it is clear that 134 first-class passengers survived.
prop.table(table2, 1)*100
## Survived
## Pclass 0 1
## 1 37.38318 62.61682
## 2 52.71739 47.28261
## 3 75.76375 24.23625
So, as per the table, around 62.62 percent of first-class passengers survived.
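The second argument of `prop.table()` decides which question the percentages answer. With margin 1 each row (class) sums to 100, giving the survival rate within a class; with margin 2 each column sums to 100, giving the class mix among survivors. The sketch below rebuilds the table from the counts printed above, so it runs without the CSV; `class_counts`, `row_pct` and `col_pct` are illustrative names.

```r
# Counts copied from the table2 output above; matrix() fills column by column
class_counts <- as.table(matrix(
  c(80, 97, 372,     # Survived = 0 for Pclass 1, 2, 3
    134, 87, 119),   # Survived = 1 for Pclass 1, 2, 3
  nrow = 3,
  dimnames = list(Pclass = c("1", "2", "3"), Survived = c("0", "1"))
))

# Margin 1: percentages within each class (rows sum to 100)
row_pct <- prop.table(class_counts, 1) * 100

# Margin 2: percentages within each outcome (columns sum to 100)
col_pct <- prop.table(class_counts, 2) * 100

print(round(row_pct, 2))
print(round(col_pct, 2))
```

Reading across the first row of `row_pct` gives the first-class survival rate, while the first row of `col_pct` gives the share of all survivors who travelled first class. These are different quantities.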
table3 <- xtabs(~ Survived + Sex + Pclass, data = titanic)
table3
## , , Pclass = 1
##
## Sex
## Survived female male
## 0 3 77
## 1 89 45
##
## , , Pclass = 2
##
## Sex
## Survived female male
## 0 6 91
## 1 70 17
##
## , , Pclass = 3
##
## Sex
## Survived female male
## 0 72 300
## 1 72 47
From the above table it is clear that 89 first-class females survived, while only 3 did not.
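The three-way table can also be turned into survival rates per sex and class, or collapsed over class. The sketch below rebuilds the array from the printed counts, so no CSV is needed; `counts`, `sex_class`, `cell_pct` and `by_sex` are illustrative names.

```r
# Counts copied from the table3 output above.
# array() fills the first dimension fastest: Survived, then Sex, then Pclass.
counts <- c(3, 89, 77, 45,     # Pclass 1: female (0, 1), male (0, 1)
            6, 70, 91, 17,     # Pclass 2
            72, 72, 300, 47)   # Pclass 3
sex_class <- as.table(array(
  counts,
  dim = c(2, 2, 3),
  dimnames = list(Survived = c("0", "1"),
                  Sex = c("female", "male"),
                  Pclass = c("1", "2", "3"))
))

# Percentages within each Sex x Pclass cell (the two outcomes sum to 100)
cell_pct <- prop.table(sex_class, c(2, 3)) * 100
print(round(cell_pct["1", , ], 1))   # survival rate by sex and class

# Sum over Pclass to get the two-way Survived x Sex counts
by_sex <- margin.table(sex_class, c(1, 2))
print(by_sex)
```

The first-class female entry of `cell_pct` is 89 out of 92, which puts the count of 89 survivors in context.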
table4 <- xtabs(~ Survived + Sex, data = titanic)
prop.table(table4, 1)*100
## Sex
## Survived female male
## 0 14.75410 85.24590
## 1 67.94118 32.05882
So 67.94% of all survivors were female.
prop.table(table4, 2)*100
## Sex
## Survived female male
## 0 25.96154 81.10919
## 1 74.03846 18.89081
So around 74% of all females survived.
chisq.test(table4)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: table4
## X-squared = 258.43, df = 1, p-value < 2.2e-16
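The p-value is far below any usual significance level, so the hypothesis that survival is independent of sex is rejected. A minimal sketch of the same test on hand-entered counts follows. The counts are obtained by summing the three class panels of the three-way table above (for example 3 + 6 + 72 = 81 females who died); `sex_counts` and `res` are illustrative names.

```r
# Survived x Sex counts, summed over Pclass from the three-way table above
sex_counts <- as.table(matrix(
  c(81, 231,     # female: died, survived
    468, 109),   # male:   died, survived
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
))

# chisq.test() applies Yates' continuity correction to 2 x 2 tables by default
res <- chisq.test(sex_counts)

print(res$expected)    # counts expected if survival were independent of sex
print(res$statistic)   # X-squared
print(res$p.value)
```

Comparing `res$expected` with the observed counts shows where the departure from independence comes from: far more women survived than independence would predict.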
by(titanic$Age, titanic$Survived, mean)
## titanic$Survived: 0
## [1] 30.4153
## --------------------------------------------------------
## titanic$Survived: 1
## [1] 28.42382
Thus the mean age of survivors was less than the mean age of people who died.
t.test(Age ~ Survived, alternative = "less", var.equal = TRUE, data = titanic)
##
## Two Sample t-test
##
## data: Age by Survived
## t = 2.2302, df = 887, p-value = 0.987
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
## -Inf 3.461823
## sample estimates:
## mean in group 0 mean in group 1
## 30.41530 28.42382
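The p-value of 0.987 is far larger than 5% or 10%, so we do not reject the null hypothesis in this test. Note, however, that with alternative = "less" the alternative hypothesis is that the mean age of those who died (group 0) is less than the mean age of survivors (group 1), which is the opposite of what the sample means show. So this result does not establish that age made no difference to survival. The positive t statistic (2.2302) points towards survivors being younger, and the complementary one-sided test (alternative = "greater") would give a p-value of about 1 - 0.987 = 0.013.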
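The effect of the `alternative` argument can be seen without the raw data, because each p-value is just a tail area of the t distribution. The sketch below uses only the t statistic and degrees of freedom reported above; `t_stat`, `deg_free` and the `p_*` names are illustrative.

```r
# Values copied from the t.test output above
t_stat <- 2.2302
deg_free <- 887

# alternative = "less": probability of a t value at or below the observed one
p_less <- pt(t_stat, deg_free)

# alternative = "greater": probability of a t value at or above the observed one
p_greater <- pt(t_stat, deg_free, lower.tail = FALSE)

# alternative = "two.sided": twice the smaller tail
p_two_sided <- 2 * min(p_less, p_greater)

print(round(c(less = p_less, greater = p_greater, two.sided = p_two_sided), 3))
```

The "less" value reproduces the 0.987 reported above. The other two values show that the conclusion depends on which direction is tested, so the direction should be chosen to match the question being asked (here, whether survivors were younger).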