titanic.df <- read.csv("Titanic Data.csv")
head(titanic.df)
##   Survived Pclass    Sex  Age SibSp Parch    Fare Embarked
## 1        0      3   male 22.0     1     0  7.2500        S
## 2        1      1 female 38.0     1     0 71.2833        C
## 3        1      3 female 26.0     0     0  7.9250        S
## 4        1      1 female 35.0     1     0 53.1000        S
## 5        0      3   male 35.0     0     0  8.0500        S
## 6        0      3   male 29.7     0     0  8.4583        Q
library(psych)
describe(titanic.df)
##           vars   n  mean    sd median trimmed   mad min    max  range
## Survived     1 889  0.38  0.49   0.00    0.35  0.00 0.0   1.00   1.00
## Pclass       2 889  2.31  0.83   3.00    2.39  0.00 1.0   3.00   2.00
## Sex*         3 889  1.65  0.48   2.00    1.69  0.00 1.0   2.00   1.00
## Age          4 889 29.65 12.97  29.70   29.22  9.34 0.4  80.00  79.60
## SibSp        5 889  0.52  1.10   0.00    0.27  0.00 0.0   8.00   8.00
## Parch        6 889  0.38  0.81   0.00    0.19  0.00 0.0   6.00   6.00
## Fare         7 889 32.10 49.70  14.45   21.28 10.24 0.0 512.33 512.33
## Embarked*    8 889  2.54  0.79   3.00    2.67  0.00 1.0   3.00   2.00
##            skew kurtosis   se
## Survived   0.48    -1.77 0.02
## Pclass    -0.63    -1.27 0.03
## Sex*      -0.62    -1.61 0.02
## Age        0.43     0.96 0.43
## SibSp      3.68    17.69 0.04
## Parch      2.74     9.66 0.03
## Fare       4.79    33.23 1.67
## Embarked* -1.26    -0.23 0.03
dim(titanic.df)
## [1] 889 8
The dataset contains 889 passengers.
apply(titanic.df[1:1], 2, function(x){mean(x)*100})
## Survived
## 38.24522
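Because `Survived` is coded 0/1, its mean is simply the share of survivors. A minimal sketch of the same calculation without `apply`, assuming the counts of 549 non-survivors and 340 survivors reported in the cross-tabulations of this document (the vector `survived` is rebuilt here only so the snippet runs on its own; with the data loaded you would use `titanic.df$Survived`):

```r
# Rebuild the 0/1 survival indicator from the reported counts
# (assumption: 549 zeros and 340 ones, as in the tables of this report).
survived <- rep(c(0, 1), times = c(549, 340))

# The mean of a 0/1 vector is the proportion of ones.
survival_pct <- mean(survived) * 100
round(survival_pct, 2)
```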
mytable <- xtabs(~ Survived+Pclass,titanic.df)
mytable
##         Pclass
## Survived   1   2   3
##        0  80  97 372
##        1 134  87 119
addmargins(mytable)
##         Pclass
## Survived   1   2   3 Sum
##      0    80  97 372 549
##      1   134  87 119 340
##      Sum 214 184 491 889
The table shows that 134 of the 214 first-class passengers survived.
prop.table(mytable,1)*100
##         Pclass
## Survived        1        2        3
##        0 14.57195 17.66849 67.75956
##        1 39.41176 25.58824 35.00000
About 39 percent of the survivors were first-class passengers.
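The row percentages above answer "which class did the survivors come from?". The complementary question, "what share of each class survived?", uses column percentages (`margin = 2`). A small sketch, rebuilding the table from the counts printed above so that it runs on its own (the names `class_tab` and `survival_by_class` are illustrative, not part of the original analysis):

```r
# Survived x Pclass counts copied from the xtabs output above.
# matrix() fills column by column: (80, 134) is class 1, and so on.
class_tab <- as.table(matrix(
  c(80, 134, 97, 87, 372, 119),
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Pclass = c("1", "2", "3"))
))

# Column percentages: within each class, the share that survived.
survival_by_class <- prop.table(class_tab, margin = 2)["1", ] * 100
round(survival_by_class, 1)
```

Read this way, the survival rate falls from roughly 63 percent in first class to roughly 24 percent in third class.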
tablenew <- xtabs(~ Survived+Pclass+Sex ,titanic.df)
ftable(tablenew)
##                 Sex female male
## Survived Pclass
## 0        1              3   77
##          2              6   91
##          3             72  300
## 1        1             89   45
##          2             70   17
##          3             72   47
89 first-class female passengers survived the sinking of the RMS Titanic.
table1 <- xtabs(~ Survived+Sex,titanic.df)
addmargins(table1)
##         Sex
## Survived female male Sum
##      0       81  468 549
##      1      231  109 340
##      Sum    312  577 889
prop.table(table1,1)
##         Sex
## Survived    female      male
##        0 0.1475410 0.8524590
##        1 0.6794118 0.3205882
About 68 percent of the survivors were female.
prop.table(table1,2)
##         Sex
## Survived    female      male
##        0 0.2596154 0.8110919
##        1 0.7403846 0.1889081
About 74 percent of female passengers survived, compared with about 19 percent of male passengers.
chisq.test(table1)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: table1
## X-squared = 258.43, df = 1, p-value < 2.2e-16
The p-value is far below the conventional 0.05 significance level, so we reject the null hypothesis of independence: a passenger's sex and survival are associated.
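To see what the test compares, it can help to look at the counts expected under independence and at the effect of the continuity correction. The sketch below rebuilds the Survived x Sex table from the counts printed earlier (the names `sex_tab`, `test_yates` and `test_plain` are illustrative). For 2x2 tables `chisq.test` applies Yates' correction by default; switching it off with `correct = FALSE` should give the uncorrected Pearson statistic, which is the value `assocstats` reports.

```r
# Survived x Sex counts copied from the addmargins output above.
sex_tab <- as.table(matrix(
  c(81, 231, 468, 109),
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
))

test_yates <- chisq.test(sex_tab)                   # default: correct = TRUE
test_plain <- chisq.test(sex_tab, correct = FALSE)  # uncorrected Pearson

# Counts we would expect if sex and survival were independent.
round(test_yates$expected, 1)

# The two statistics side by side.
c(yates   = unname(test_yates$statistic),
  pearson = unname(test_plain$statistic))
```

Under independence only about 119 female survivors would be expected, against 231 observed, which is what drives the large statistic.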
library(vcd)
## Loading required package: grid
assocstats(table1)
##                     X^2 df P(> X^2)
## Likelihood Ratio 266.21  1        0
## Pearson          260.76  1        0
##
## Phi-Coefficient   : 0.542
## Contingency Coeff.: 0.476
## Cramer's V        : 0.542
A phi coefficient of 0.542 indicates a fairly strong association between sex and survival.
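These measures are simple functions of the Pearson statistic and the sample size. As a rough check, assuming the standard definitions (phi = sqrt(X²/n), contingency coefficient = sqrt(X²/(X² + n)), and Cramér's V = sqrt(X²/(n(k − 1))) with k the smaller table dimension), the reported values can be recovered by hand:

```r
x2 <- 260.76  # Pearson statistic from the assocstats output
n  <- 889     # number of passengers
k  <- 2       # smaller dimension of the 2x2 table

phi         <- sqrt(x2 / n)
contingency <- sqrt(x2 / (x2 + n))
cramers_v   <- sqrt(x2 / (n * (k - 1)))

round(c(phi = phi, contingency = contingency, cramers_v = cramers_v), 3)
```

For a 2x2 table k − 1 = 1, which is why phi and Cramér's V coincide.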
The data therefore support our hypothesis that the proportion of female passengers who survived is higher than the proportion of male passengers who survived.
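The chi-squared test only establishes that sex and survival are associated; it does not test the direction. One way to test the directional hypothesis explicitly is a one-sided two-sample test of proportions with `prop.test`. This is a sketch of an additional check, not part of the original analysis, using the survivor counts and group sizes from the table above (231 of 312 women, 109 of 577 men):

```r
# One-sided test: is the female survival proportion greater than the male one?
res <- prop.test(
  x = c(231, 109),        # survivors: female, male
  n = c(312, 577),        # passengers: female, male
  alternative = "greater"
)

res$estimate  # sample proportions for the two groups
res$p.value   # one-sided p-value
```

A p-value below 0.05 here would support the claim that women survived at a higher rate than men.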