Case Study

The sinking of the RMS Titanic occurred on the night of 14 April through to the morning of 15 April 1912 in the North Atlantic Ocean, four days into the ship’s maiden voyage from Southampton to New York City. The largest passenger liner in service at the time, Titanic had an estimated 2,224 people on board when she struck an iceberg at around 23:40 (ship’s time) on Sunday, 14 April 1912. Her sinking two hours and forty minutes later at 02:20 (05:18 GMT) on Monday, 15 April resulted in the deaths of more than 1,500 people, which made it one of the deadliest peacetime maritime disasters in history.

Task 2b: Reading the data set and viewing it

setwd("C:/Users/Internship")
titanic.df <- read.csv("Titanic Data.csv")
View(titanic.df)
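
View() opens a spreadsheet-style viewer, so it is mainly useful in an interactive session. As a minimal sketch, the structure of a data frame can also be checked with str() and head(), which print to the console. The three-row CSV written to a temporary file below is hypothetical and only stands in for "Titanic Data.csv"; its column names are the ones used in the later tasks.

```r
# Hypothetical stand-in for "Titanic Data.csv": a tiny CSV in a temp file.
sample.path <- tempfile(fileext = ".csv")
writeLines(c("Survived,Pclass,Sex",
             "0,3,male",
             "1,1,female",
             "1,3,female"), sample.path)

sample.df <- read.csv(sample.path)

str(sample.df)      # column names and types
head(sample.df, 2)  # first two rows
```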

Task 3a: Count of total number of passengers

dim(titanic.df)
## [1] 889   8

The first value, 889, is the number of rows, which equals the number of passengers recorded in the data set.

Task 3b: The number of passengers who survived the sinking of the Titanic

sum(titanic.df$Survived)
## [1] 340

Because Survived is coded 1 for passengers who survived and 0 otherwise, the sum counts the survivors: 340.
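
The same 0/1 coding means that the mean of the column is the proportion of survivors. The sketch below illustrates this on a vector rebuilt from the counts reported in this case study (549 non-survivors and 340 survivors); the variable names are illustrative and not part of the original analysis.

```r
# Survival flags rebuilt from the reported counts:
# 549 passengers coded 0 (did not survive) and 340 coded 1 (survived).
survived <- rep(c(0, 1), times = c(549, 340))

n.survived   <- sum(survived)         # counts the 1s
pct.survived <- mean(survived) * 100  # share of 1s, as a percentage

n.survived
round(pct.survived, 2)
```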

Task 3c: Percentage of passengers who survived the sinking of the Titanic

prop.table(table(titanic.df$Survived))*100
## 
##        0        1 
## 61.75478 38.24522

Since 1 signifies passengers who survived, the answer is 38.24522%.

Task 3d: Number of first-class passengers who survived the sinking of the Titanic

xtabs(~Survived+Pclass, data=titanic.df)
##         Pclass
## Survived   1   2   3
##        0  80  97 372
##        1 134  87 119

The cell where Survived = 1 and Pclass = 1 (first class) gives the answer: 134.
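
Rather than reading the value off the printed table, a single cell can be extracted by its row and column labels. The following is a sketch on a table rebuilt from the counts printed above; the name class.tab is an assumption made for illustration.

```r
# Cell counts copied from the Survived x Pclass table above.
# matrix() fills column by column: (80, 134), (97, 87), (372, 119).
class.tab <- as.table(matrix(c(80, 134, 97, 87, 372, 119), nrow = 2,
                             dimnames = list(Survived = c("0", "1"),
                                             Pclass = c("1", "2", "3"))))

# Index by label (character strings), not by position.
first.class.survivors <- class.tab["1", "1"]
first.class.survivors

# Row and column totals, useful as a sanity check.
addmargins(class.tab)
```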

Task 3e: Percentage of first-class passengers who survived the sinking of the Titanic

prop.table(xtabs(~Survived+Pclass, data=titanic.df),2)*100
##         Pclass
## Survived        1        2        3
##        0 37.38318 52.71739 75.76375
##        1 62.61682 47.28261 24.23625

The second argument, 2, computes percentages within each passenger class, so every column sums to 100. The cell where Survived = 1 and Pclass = 1 gives the answer: 62.61682%.

Task 3f: The number of females from First-Class who survived the sinking of the Titanic

xtabs(~ Survived+Pclass+Sex, data=titanic.df)
## , , Sex = female
## 
##         Pclass
## Survived   1   2   3
##        0   3   6  72
##        1  89  70  72
## 
## , , Sex = male
## 
##         Pclass
## Survived   1   2   3
##        0  77  91 300
##        1  45  17  47

Therefore, the number of female first-class survivors is 89.

Task 3g: The percentage of survivors who were female

prop.table(xtabs(~ Survived+Sex, data=titanic.df),1)*100
##         Sex
## Survived   female     male
##        0 14.75410 85.24590
##        1 67.94118 32.05882

Of the passengers who survived, 67.94118% were female.

Task 3h: The percentage of females on board the Titanic who survived

prop.table(xtabs(~ Survived+Sex, data=titanic.df),2)*100
##         Sex
## Survived   female     male
##        0 25.96154 81.10919
##        1 74.03846 18.89081

Of the females on board, 74.03846% survived.
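
Tasks 3g and 3h differ only in the margin passed to prop.table(): 1 divides each cell by its row total, 2 by its column total. The sketch below shows both on a table rebuilt from the counts in Task 3f (female: 81 died, 231 survived; male: 468 died, 109 survived); the object names are illustrative.

```r
# Survived x Sex counts, summed over passenger class from the Task 3f table.
sex.tab <- as.table(matrix(c(81, 231, 468, 109), nrow = 2,
                           dimnames = list(Survived = c("0", "1"),
                                           Sex = c("female", "male"))))

by.row <- prop.table(sex.tab, 1) * 100  # each row sums to 100
by.col <- prop.table(sex.tab, 2) * 100  # each column sums to 100

by.row["1", "female"]  # share of survivors who were female (Task 3g)
by.col["1", "female"]  # share of females who survived (Task 3h)
```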

Task 3i: Pearson’s Chi-squared test

Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

chisq.test(xtabs(~ Sex+Survived, data=titanic.df))
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  xtabs(~Sex + Survived, data = titanic.df)
## X-squared = 258.43, df = 1, p-value < 2.2e-16

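The p-value is far below 0.05, so the null hypothesis that survival is independent of sex is rejected. Together with the proportions from Task 3h (74.04% of females survived versus 18.89% of males), this supports the hypothesis.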
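
The chi-squared test is a two-sided test of independence, whereas the hypothesis is directional. As a sketch, and not the method used in this case study, prop.test() from base R can test the one-sided alternative directly. The counts below are derived from the Task 3f table, and the variable names are assumptions made for illustration.

```r
# Survivors and group sizes, derived from the Task 3f table.
survivors <- c(female = 231, male = 109)
totals    <- c(female = 312, male = 577)

# Alternative hypothesis: the female survival proportion is greater
# than the male one. Yates' continuity correction is on by default.
one.sided <- prop.test(x = survivors, n = totals, alternative = "greater")

one.sided$estimate  # sample proportions for females and males
one.sided$p.value   # one-sided p-value
```

With two groups, the test statistic is the same continuity-corrected X-squared that chisq.test() reports; only the p-value changes with the choice of alternative.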