library(vcd)
Task 2b
Setting the working directory
setwd("C:/Users/rishu/Downloads")
Reading the Dataset
titanic.df <- read.csv("Titanic Data.csv")
Viewing the Dataset
View(titanic.df)
Task 3a
Counting the total number of passengers on board the Titanic
library(psych)
describe(titanic.df) # here n gives the total number of passengers -> 889
##           vars   n  mean    sd median trimmed   mad min    max  range
## Survived     1 889  0.38  0.49   0.00    0.35  0.00 0.0   1.00   1.00
## Pclass       2 889  2.31  0.83   3.00    2.39  0.00 1.0   3.00   2.00
## Sex*         3 889  1.65  0.48   2.00    1.69  0.00 1.0   2.00   1.00
## Age          4 889 29.65 12.97  29.70   29.22  9.34 0.4  80.00  79.60
## SibSp        5 889  0.52  1.10   0.00    0.27  0.00 0.0   8.00   8.00
## Parch        6 889  0.38  0.81   0.00    0.19  0.00 0.0   6.00   6.00
## Fare         7 889 32.10 49.70  14.45   21.28 10.24 0.0 512.33 512.33
## Embarked*    8 889  2.54  0.79   3.00    2.67  0.00 1.0   3.00   2.00
##            skew kurtosis   se
## Survived   0.48    -1.77 0.02
## Pclass    -0.63    -1.27 0.03
## Sex*      -0.62    -1.61 0.02
## Age        0.43     0.96 0.43
## SibSp      3.68    17.69 0.04
## Parch      2.74     9.66 0.03
## Fare       4.79    33.23 1.67
## Embarked* -1.26    -0.23 0.03
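In describe(), n is the number of non-missing values per variable, so it matches the passenger count only when a column has no missing entries. A more direct count is nrow(). The sketch below uses a small made-up data frame (toy.df and its values are hypothetical, not the Titanic data) to show the difference.

```r
# Toy data frame with hypothetical values (NOT the Titanic data);
# one Age value is deliberately missing.
toy.df <- data.frame(
  Survived = c(0, 1, 1, 0),
  Age      = c(22, 38, NA, 35)
)

total_rows  <- nrow(toy.df)             # counts every row, missing values or not
valid_ages  <- sum(!is.na(toy.df$Age))  # counts only non-missing Age values

total_rows
valid_ages
```

On the real data, nrow(titanic.df) would give the passenger count without depending on any single column being complete.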
Task 3b
Counting the number of passengers who survived the sinking of the Titanic
titanic_table <- table(titanic.df$Survived)
titanic_table # here 1 means Survival -> 340
##
## 0 1
## 549 340
Task 3c
Measuring the percentage of passengers who survived the sinking of the Titanic
prop.table(titanic_table)*100 # here 1 means Survival -> 38.25%
##
## 0 1
## 61.75478 38.24522
Task 3d
Counting the number of first-class passengers who survived the sinking of the Titanic
titanic_table <- xtabs(~Survived+Pclass,data = titanic.df)
titanic_table # row 1 = survived, column 1 = first class -> 134
## Pclass
## Survived 1 2 3
## 0 80 97 372
## 1 134 87 119
Task 3e
Measuring the percentage of first-class passengers who survived the sinking of the Titanic
prop.table(titanic_table)*100 # cell [1,1] is the share of ALL passengers who were first-class survivors -> 15.07%; among first-class passengers only, 134/214 = 62.62% survived
## Pclass
## Survived 1 2 3
## 0 8.998875 10.911136 41.844769
## 1 15.073116 9.786277 13.385827
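The unconditioned prop.table() call above divides every cell by the grand total (889). To get the survival rate within each class, the proportions need to be taken column-wise with margin = 2. A minimal sketch, rebuilding the Survived x Pclass table from the counts printed in Task 3d so that it runs without the CSV file (class_pct is a name introduced here for illustration):

```r
# Rebuild the Survived x Pclass table from the counts shown in Task 3d.
# matrix() fills column by column: (80, 134) is first class, and so on.
titanic_table <- as.table(matrix(
  c(80, 134, 97, 87, 372, 119),
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Pclass = c("1", "2", "3"))
))

# margin = 2 makes each column (passenger class) sum to 100%.
class_pct <- prop.table(titanic_table, margin = 2) * 100

# Percentage of first-class passengers who survived: 134 / (80 + 134).
first_class_survival <- class_pct["1", "1"]
round(first_class_survival, 2)
```

With the real data, the equivalent call would be prop.table(xtabs(~Survived+Pclass, data = titanic.df), 2)*100.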
Task 3f
Counting the number of females from first-class who survived the sinking of the Titanic
titanic_table <- xtabs(~Survived+Pclass+Sex,data = titanic.df)
ftable(titanic_table) # Survived = 1, Pclass = 1, Sex = female -> 89
##                 Sex female male
## Survived Pclass
## 0        1               3   77
##          2               6   91
##          3              72  300
## 1        1              89   45
##          2              70   17
##          3              72   47
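Rather than reading the value off the printed table, the cell can be pulled out by name. A small sketch, assuming the three-way table is rebuilt from the counts printed above so that it runs without the CSV file (counts and female_first_survivors are names introduced here for illustration):

```r
# Rebuild the Survived x Pclass x Sex table from the ftable() output.
# array() fills the first dimension fastest: Survived, then Pclass, then Sex.
counts <- array(
  c(3, 89, 6, 70, 72, 72,       # female: (Survived 0, 1) for Pclass 1, 2, 3
    77, 45, 91, 17, 300, 47),   # male:   (Survived 0, 1) for Pclass 1, 2, 3
  dim = c(2, 3, 2),
  dimnames = list(
    Survived = c("0", "1"),
    Pclass   = c("1", "2", "3"),
    Sex      = c("female", "male")
  )
)
titanic_table <- as.table(counts)

# Index by dimension names: survived, first class, female.
female_first_survivors <- titanic_table["1", "1", "female"]
female_first_survivors
```

The same indexing works on the table returned by xtabs(~Survived+Pclass+Sex, data = titanic.df).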
Task 3g
Measuring the percentage of survivors who were female
titanic_table <- xtabs(~Survived+Sex,data = titanic.df)
titanic_table
## Sex
## Survived female male
## 0 81 468
## 1 231 109
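prop.table(titanic_table)*100 # cell [1,female] is the share of ALL passengers who were female survivors -> 25.98%; among survivors only, 231/340 = 67.94% were female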
## Sex
## Survived female male
## 0 9.111361 52.643420
## 1 25.984252 12.260967
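The percentage of survivors who were female is a row-wise proportion (conditioning on Survived), which prop.table() gives with margin = 1. A minimal sketch, rebuilding the Survived x Sex table from the counts printed above so that it runs without the CSV file (survivor_pct is a name introduced here for illustration):

```r
# Rebuild the Survived x Sex table from the counts shown above.
# matrix() fills column by column: (81, 231) is female, (468, 109) is male.
titanic_table <- as.table(matrix(
  c(81, 231, 468, 109),
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
))

# margin = 1 makes each row (survival outcome) sum to 100%.
survivor_pct <- prop.table(titanic_table, margin = 1) * 100

# Percentage of survivors who were female: 231 / (231 + 109).
female_share_of_survivors <- survivor_pct["1", "female"]
round(female_share_of_survivors, 2)
```

This is the mirror image of Task 3h, which uses margin = 2 to condition on Sex instead.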
Task 3h
Measuring the percentage of females on board the Titanic who survived
prop.table(titanic_table,2)*100 # columns sum to 100; Survived = 1 for females -> 74.04%
## Sex
## Survived female male
## 0 25.96154 81.10919
## 1 74.03846 18.89081
Task 3i
Pearson’s Chi-squared test of the given hypothesis
Hypothesis: The proportion of females on board who survived the sinking of the Titanic was higher than the proportion of males on board who survived.
titanic_table <- xtabs(~Survived+Sex,data = titanic.df)
titanic_table
## Sex
## Survived female male
## 0 81 468
## 1 231 109
addmargins(titanic_table)
## Sex
## Survived female male Sum
## 0 81 468 549
## 1 231 109 340
## Sum 312 577 889
chisq.test(titanic_table)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: titanic_table
## X-squared = 258.43, df = 1, p-value < 2.2e-16
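Pearson’s Chi-squared test gives a p-value < 2.2e-16, far below 0.01, so we reject the null hypothesis that survival is independent of sex. The test itself is two-sided and says nothing about direction; the direction comes from the observed proportions (74.04% of females survived versus 18.89% of males). Taken together, the data support the given hypothesis.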
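Because the hypothesis is directional, a one-sided two-sample test of proportions is one way to test it directly. The sketch below uses base R's prop.test() with alternative = "greater" and the counts from the margins table above; choosing this test is an assumption made here for illustration and is not part of the original analysis.

```r
# Counts taken from the addmargins() output above.
survivors <- c(female = 231, male = 109)   # passengers who survived, by sex
totals    <- c(female = 312, male = 577)   # all passengers, by sex

# H0: p_female <= p_male   versus   H1: p_female > p_male
res <- prop.test(x = survivors, n = totals, alternative = "greater")

res$estimate   # observed survival proportions (female first, then male)
res$p.value    # one-sided p-value
```

A p-value below the chosen significance level (0.01 in the text above) would support the claim that the female survival proportion was higher than the male one.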