titanic <- read.csv("Titanic Data.csv", stringsAsFactors = TRUE)
View(titanic)
summary(titanic)
## Survived Pclass Sex Age
## Min. :0.0000 Min. :1.000 female:312 Min. : 0.40
## 1st Qu.:0.0000 1st Qu.:2.000 male :577 1st Qu.:22.00
## Median :0.0000 Median :3.000 Median :29.70
## Mean :0.3825 Mean :2.312 Mean :29.65
## 3rd Qu.:1.0000 3rd Qu.:3.000 3rd Qu.:35.00
## Max. :1.0000 Max. :3.000 Max. :80.00
## SibSp Parch Fare Embarked
## Min. :0.0000 Min. :0.0000 Min. : 0.000 C:168
## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.: 7.896 Q: 77
## Median :0.0000 Median :0.0000 Median : 14.454 S:644
## Mean :0.5242 Mean :0.3825 Mean : 32.097
## 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.: 31.000
## Max. :8.0000 Max. :6.0000 Max. :512.329
library(psych)
describe(titanic)
## vars n mean sd median trimmed mad min max range
## Survived 1 889 0.38 0.49 0.00 0.35 0.00 0.0 1.00 1.00
## Pclass 2 889 2.31 0.83 3.00 2.39 0.00 1.0 3.00 2.00
## Sex* 3 889 1.65 0.48 2.00 1.69 0.00 1.0 2.00 1.00
## Age 4 889 29.65 12.97 29.70 29.22 9.34 0.4 80.00 79.60
## SibSp 5 889 0.52 1.10 0.00 0.27 0.00 0.0 8.00 8.00
## Parch 6 889 0.38 0.81 0.00 0.19 0.00 0.0 6.00 6.00
## Fare 7 889 32.10 49.70 14.45 21.28 10.24 0.0 512.33 512.33
## Embarked* 8 889 2.54 0.79 3.00 2.67 0.00 1.0 3.00 2.00
## skew kurtosis se
## Survived 0.48 -1.77 0.02
## Pclass -0.63 -1.27 0.03
## Sex* -0.62 -1.61 0.02
## Age 0.43 0.96 0.43
## SibSp 3.68 17.69 0.04
## Parch 2.74 9.66 0.03
## Fare 4.79 33.23 1.67
## Embarked* -1.26 -0.23 0.03
Total number of passengers on board the Titanic.
length(titanic$Survived)
## [1] 889
The answer is 889.
Number of passengers who survived the sinking of the Titanic.
table(titanic$Survived)
##
## 0 1
## 549 340
mytable<-table(titanic$Survived==1)
mytable
##
## FALSE TRUE
## 549 340
The answer is 340.
Percentage of passengers who survived the sinking of the Titanic.
prop.table(mytable)*100
##
## FALSE TRUE
## 61.75478 38.24522
The answer is approximately 38.25 percent.
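As a cross-check, the same percentage can be computed without building a table, because the mean of a 0/1 vector is the proportion of ones. The sketch below does not read the CSV; it rebuilds a stand-in vector (the hypothetical name `survived`) from the counts printed above, 549 and 340.

```r
# Stand-in for titanic$Survived, rebuilt from the counts shown above (assumption)
survived <- c(rep(0, 549), rep(1, 340))

# The mean of a logical/0-1 indicator equals the proportion of ones
pct_survived <- round(mean(survived == 1) * 100, 2)
pct_survived
```

With the real data frame, `mean(titanic$Survived == 1) * 100` should give the same figure.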
Number of first-class passengers who survived the sinking of the Titanic.
The percentage of first-class passengers who survived the sinking of the Titanic.
mytable <- with(titanic,table(Pclass))
mytable
## Pclass
## 1 2 3
## 214 184 491
mytable <- xtabs(~Survived + Pclass, data= titanic)
mytable
## Pclass
## Survived 1 2 3
## 0 80 97 372
## 1 134 87 119
prop.table(mytable,1)
## Pclass
## Survived 1 2 3
## 0 0.1457195 0.1766849 0.6775956
## 1 0.3941176 0.2558824 0.3500000
prop.table(mytable,2)
## Pclass
## Survived 1 2 3
## 0 0.3738318 0.5271739 0.7576375
## 1 0.6261682 0.4728261 0.2423625
prop.table(mytable,2)*100
## Pclass
## Survived 1 2 3
## 0 37.38318 52.71739 75.76375
## 1 62.61682 47.28261 24.23625
The number of first-class passengers who survived is 134.
The percentage of first-class passengers who survived the sinking of the Titanic is approximately 62.62 percent.
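Rather than reading the two answers off the printed tables, the relevant cells can be pulled out by name. This is a minimal sketch: `class_tab` is a hypothetical stand-in for the `xtabs(~Survived + Pclass)` result, rebuilt from the counts shown above so that the snippet runs without the CSV.

```r
# Survived x Pclass counts copied from the xtabs output above (assumption: stand-in table)
class_tab <- as.table(matrix(
  c(80, 134, 97, 87, 372, 119),   # filled column by column: class 1, class 2, class 3
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Pclass = c("1", "2", "3"))
))

# Row "1" = survived, column "1" = first class
n_first_survived <- class_tab["1", "1"]

# Margin 2 normalises each class column, giving the survival rate within that class
pct_first_survived <- prop.table(class_tab, 2)["1", "1"] * 100

n_first_survived
round(pct_first_survived, 2)
```

Margin 2 is the one needed here because the question conditions on class; margin 1 would instead give the share of survivors who travelled first class.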
Number of females from First-Class who survived the sinking of the Titanic
mytable <- xtabs(~Sex + Survived + Pclass, data= titanic)
mytable
## , , Pclass = 1
##
## Survived
## Sex 0 1
## female 3 89
## male 77 45
##
## , , Pclass = 2
##
## Survived
## Sex 0 1
## female 6 70
## male 91 17
##
## , , Pclass = 3
##
## Survived
## Sex 0 1
## female 72 72
## male 300 47
Number of females from First-Class who survived the sinking of the Titanic is 89.
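The same cell can be extracted from the three-way table by naming one level per dimension. The sketch below assumes a stand-in array, `sex_class_tab` (a hypothetical name), filled with the counts printed above in the order Sex, Survived, Pclass.

```r
# Sex x Survived x Pclass counts copied from the output above (assumption: stand-in array)
sex_class_tab <- array(
  c(3, 77, 89, 45,      # Pclass 1: female/0, male/0, female/1, male/1
    6, 91, 70, 17,      # Pclass 2
    72, 300, 72, 47),   # Pclass 3
  dim = c(2, 2, 3),
  dimnames = list(Sex = c("female", "male"),
                  Survived = c("0", "1"),
                  Pclass = c("1", "2", "3"))
)

# Index order follows the formula: Sex, then Survived, then Pclass
first_class_female_survivors <- sex_class_tab["female", "1", "1"]
first_class_female_survivors

# Summing over Pclass collapses the array back to a Sex x Survived table
sex_survived <- apply(sex_class_tab, c(1, 2), sum)
sex_survived
```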
Percentage of survivors who were female
mytable <- xtabs(~Sex + Survived, data= titanic)
mytable
## Survived
## Sex 0 1
## female 81 231
## male 468 109
prop.table(mytable,2)
## Survived
## Sex 0 1
## female 0.1475410 0.6794118
## male 0.8524590 0.3205882
prop.table(mytable,2)*100
## Survived
## Sex 0 1
## female 14.75410 67.94118
## male 85.24590 32.05882
prop.table(mytable,1)
## Survived
## Sex 0 1
## female 0.2596154 0.7403846
## male 0.8110919 0.1889081
prop.table(mytable,1)*100
## Survived
## Sex 0 1
## female 25.96154 74.03846
## male 81.10919 18.89081
The percentage of survivors who were female is approximately 67.94 percent.
The percentage of females on board the Titanic who survived is approximately 74.04 percent.
Run a Pearson’s Chi-squared test to test the following hypothesis:
Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.
mytable <- xtabs(~Sex + Survived, data= titanic)
chisq.test(mytable)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16
The p-value is the probability of obtaining results at least as extreme as those observed, assuming that the row and column variables are independent in the population. Since this probability is very small (p < 0.01), we reject the null hypothesis that Sex and Survival are independent.
The chi-squared test itself does not indicate a direction, but combined with the observed survival rates (about 74.04 percent of females versus about 18.89 percent of males), the data support the hypothesis that the proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived.
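Because `chisq.test` only tests for independence, one way to test the directional hypothesis directly is a one-sided two-sample test of proportions with `prop.test`. This is a sketch of an alternative approach, not part of the original analysis; the vectors `survivors` and `totals` are hypothetical names filled in from the Sex by Survived counts above.

```r
# Survivors and group sizes taken from the Sex x Survived table above
survivors <- c(female = 231, male = 109)
totals    <- c(female = 312, male = 577)

# Alternative hypothesis: P(survive | female) > P(survive | male).
# "greater" compares the first group against the second.
one_sided <- prop.test(x = survivors, n = totals, alternative = "greater")

one_sided$estimate   # observed survival proportions for the two groups
one_sided$p.value    # one-sided p-value
```

A small one-sided p-value here would support the stated hypothesis without needing to inspect the direction separately.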