RMS Titanic

This R Markdown document presents an analysis of the Titanic dataset case study.

Create a data frame from the Titanic dataset

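# Set the working directory to the folder that holds the CSV file
setwd("~/Desktop/Data Analytics Internship")
# Read the Titanic data into a data frame
titanic.df <- read.csv(file = "Titanic Data.csv")
# Open the data frame in the viewer (interactive sessions only)
View(titanic.df)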
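
A hard-coded `setwd()` ties the script to one machine, and `View()` only works in an interactive session. The following is a minimal sketch of a more portable loading pattern, not the method used in this analysis. The names `data_dir`, `csv_path`, `demo_rows` and `demo.df` are hypothetical, and the three rows written to the temporary file are made-up placeholders so that the block runs on its own; they are not records from the actual dataset.

```r
# Hypothetical folder; in the real analysis this would be the directory
# that holds "Titanic Data.csv".
data_dir <- tempdir()
csv_path <- file.path(data_dir, "Titanic Data.csv")

# Placeholder rows (made up for illustration only) so the sketch is runnable.
demo_rows <- data.frame(
  Survived = c(0, 1, 1),
  Pclass   = c(3, 1, 2),
  Sex      = c("male", "female", "female")
)
write.csv(demo_rows, csv_path, row.names = FALSE)

# Fail early with a clear error if the file is missing.
stopifnot(file.exists(csv_path))
demo.df <- read.csv(file = csv_path)

# str() and head() also work when the document is knitted, unlike View().
str(demo.df)
head(demo.df)
```

Building the path with `file.path()` keeps the folder in one place, so moving the data only requires changing `data_dir`.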

R Code to count the total number of passengers on board the Titanic.

mytab <- with(titanic.df, table(Sex))
addmargins(mytab)
## Sex
## female   male    Sum 
##    312    577    889

Hence, the dataset records 889 passengers on board the Titanic.
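
The total can also be obtained directly, without building a margin table. The sketch below re-enters the two counts from the output above so that it runs without the CSV file; `sex_counts` and `total_passengers` are names introduced here for illustration.

```r
# Counts by sex, copied from the table output above.
sex_counts <- c(female = 312, male = 577)

# The total number of passengers is the sum of the two groups.
total_passengers <- sum(sex_counts)
total_passengers

# With the full data frame loaded, nrow(titanic.df) should give the same
# number, assuming one row per passenger and no missing values in Sex.
```

This gives 889, matching the `Sum` column printed by `addmargins()`.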

R Code to count the number of passengers who survived the sinking of the Titanic.

mytab1 <- xtabs(~Sex+Survived,data = titanic.df)
addmargins(mytab1)
##         Survived
## Sex        0   1 Sum
##   female  81 231 312
##   male   468 109 577
##   Sum    549 340 889

Therefore, 340 passengers survived the sinking of the Titanic.

R Code to find the percentage of passengers who survived

prop.table(mytab1)*100
##         Survived
## Sex              0         1
##   female  9.111361 25.984252
##   male   52.643420 12.260967

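Hence, about 38% of all passengers survived (340 of 889). These percentages are shares of all passengers: surviving females make up about 26% of everyone on board and surviving males about 12%.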
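
`prop.table()` called without a `margin` argument divides every cell by the grand total, so each entry above is a share of all 889 passengers rather than a survival rate. A minimal sketch of one way to get the overall survival percentage follows. The counts are re-entered from the `addmargins()` output so that the block is self-contained; `sex_survival`, `survival_totals` and `overall_pct` are illustrative names.

```r
# Sex-by-survival counts, copied from the addmargins() output above.
sex_survival <- matrix(
  c( 81, 231,
    468, 109),
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
)

# Collapse over Sex to get the total number of deaths and survivors.
survival_totals <- colSums(sex_survival)

# Share of all passengers who died (0) or survived (1), in percent.
overall_pct <- survival_totals / sum(survival_totals) * 100
round(overall_pct, 1)
```

The entry for `1` is 340 / 889, about 38%.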

R Code to count the number of first-class passengers who survived the sinking of the Titanic.

mytab2 <- xtabs(~Pclass+Survived,data = titanic.df)
addmargins(mytab2)
##       Survived
## Pclass   0   1 Sum
##    1    80 134 214
##    2    97  87 184
##    3   372 119 491
##    Sum 549 340 889

Hence, 134 passengers of first class survived.

R Code to find the percentage of first-class passengers who survived

prop.table(mytab2)*100
##       Survived
## Pclass         0         1
##      1  8.998875 15.073116
##      2 10.911136  9.786277
##      3 41.844769 13.385827

Hence, first-class survivors make up about 15% of all passengers. Within first class itself, 134 of 214 passengers (about 63%) survived.
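
To compare survival between classes, the percentages need to be taken within each class rather than over all passengers. A sketch using the `margin` argument of `prop.table()` follows, with the counts re-entered from the table above; `class_survival` and `class_pct` are names chosen for this example.

```r
# Class-by-survival counts, copied from the addmargins() output above.
class_survival <- matrix(
  c( 80, 134,
     97,  87,
    372, 119),
  nrow = 3, byrow = TRUE,
  dimnames = list(Pclass = c("1", "2", "3"), Survived = c("0", "1"))
)

# margin = 1 divides each cell by its row total, so every row sums to 100.
class_pct <- prop.table(class_survival, margin = 1) * 100
round(class_pct, 1)
```

Each row now reads as the fate of one class, which makes the classes directly comparable.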

R Code to find the number of female first-class passengers who survived

mytab3 <- xtabs(~Sex+Pclass+Survived,data = titanic.df)
mytab3
## , , Survived = 0
## 
##         Pclass
## Sex        1   2   3
##   female   3   6  72
##   male    77  91 300
## 
## , , Survived = 1
## 
##         Pclass
## Sex        1   2   3
##   female  89  70  72
##   male    45  17  47

Hence, 89 females from first class survived, and only 3 did not.
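
Reading a single cell off a three-way table by eye is error-prone, because the `Survived = 0` and `Survived = 1` slices look alike. A sketch of indexing the cell by name instead follows. The array is rebuilt from the counts printed above, and `counts` is an illustrative name; with the data loaded, the same indexing should work on `mytab3`.

```r
# Sex x Pclass x Survived counts, copied from the output above.
# array() fills the first dimension fastest, so values are listed
# female, male for class 1, then class 2, then class 3.
counts <- array(
  c( 3, 77,  6, 91, 72, 300,    # Survived = 0
    89, 45, 70, 17, 72,  47),   # Survived = 1
  dim = c(2, 3, 2),
  dimnames = list(Sex      = c("female", "male"),
                  Pclass   = c("1", "2", "3"),
                  Survived = c("0", "1"))
)

# First-class females who survived.
counts["female", "1", "1"]

# First-class females who did not survive.
counts["female", "1", "0"]
```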

R Code to find the percentage of female survivors from first class

prop.table(mytab3)*100
## , , Survived = 0
## 
##         Pclass
## Sex               1          2          3
##   female  0.3374578  0.6749156  8.0989876
##   male    8.6614173 10.2362205 33.7457818
## 
## , , Survived = 1
## 
##         Pclass
## Sex               1          2          3
##   female 10.0112486  7.8740157  8.0989876
##   male    5.0618673  1.9122610  5.2868391

Hence, female first-class survivors make up approx. 10% of all passengers. Of the 92 first-class females, 89 (about 97%) survived.
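
The phrase "percentage of female survivors from first class" can be read in more than one way, and each reading uses a different denominator. The sketch below computes two of them from the counts printed above. All variable names are illustrative.

```r
# Sex x Pclass x Survived counts, copied from the output above.
counts <- array(
  c( 3, 77,  6, 91, 72, 300,    # Survived = 0
    89, 45, 70, 17, 72,  47),   # Survived = 1
  dim = c(2, 3, 2),
  dimnames = list(Sex      = c("female", "male"),
                  Pclass   = c("1", "2", "3"),
                  Survived = c("0", "1"))
)

# Reading 1: survival rate within each Sex x Pclass group.
# margin = c(1, 2) divides by the group total (survivors plus deaths).
within_group_pct <- prop.table(counts, margin = c(1, 2)) * 100
first_class_female_rate <- within_group_pct["female", "1", "1"]
round(first_class_female_rate, 1)

# Reading 2: of all female survivors, the share who travelled first class.
female_survivors <- counts["female", , "1"]
share_first_class <- female_survivors[["1"]] / sum(female_survivors) * 100
round(share_first_class, 1)
```

The first value is 89 / 92 and the second is 89 / 231, so stating the denominator alongside the percentage avoids ambiguity.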

R Code to measure the percentage of females onboard who survived the sinking

mytab4 <- xtabs(~Sex+Survived,data = titanic.df)
prop.table(mytab4)*100
##         Survived
## Sex              0         1
##   female  9.111361 25.984252
##   male   52.643420 12.260967

Hence, surviving females make up approx. 26% of all passengers. Relative to the 312 females onboard, 231 (approx. 74%) survived the sinking.
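
The percentage of females who survived is a row-wise quantity: female survivors divided by all females. A minimal sketch with `margin = 1` follows, again re-entering the counts so that it runs without the data file; `sex_survival` and `by_sex_pct` are illustrative names.

```r
# Sex-by-survival counts, copied from the tables above.
sex_survival <- matrix(
  c( 81, 231,
    468, 109),
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
)

# Divide each cell by its row total: percentages within each sex.
by_sex_pct <- prop.table(sex_survival, margin = 1) * 100
round(by_sex_pct, 1)
```

With the data loaded, `prop.table(mytab4, margin = 1) * 100` should give the same table.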

R Code to run Pearson’s Chi-squared test for the following hypothesis:

Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

chisq.test(mytab4)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  mytab4
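## X-squared = 258.43, df = 1, p-value < 2.2e-16

Hence, since the p-value is far below 0.05, sex and survival are not independent. Together with the counts above (231 of 312 females survived, approx. 74%, versus 109 of 577 males, approx. 19%), this supports the hypothesis.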
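
The chi-squared test checks for an association in either direction, while the hypothesis is directional (females higher than males). One possible complement, sketched here as an assumption rather than as part of the original analysis, is a one-sided two-sample test of proportions with `prop.test()`. The counts are re-entered from the tables above, and `assoc_test` and `one_sided` are illustrative names.

```r
# Sex-by-survival counts, copied from the tables above.
sex_survival <- matrix(
  c( 81, 231,
    468, 109),
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
)

# Test of association; Yates' continuity correction is applied by
# default for 2 x 2 tables.
assoc_test <- chisq.test(sex_survival)
assoc_test$statistic

# One-sided test that the female survival proportion exceeds the male one.
# x = number of survivors, n = group size, in the order female, male.
one_sided <- prop.test(
  x = c(231, 109),
  n = c(312, 577),
  alternative = "greater"
)
one_sided$estimate
one_sided$p.value
```

The first call should reproduce the statistic reported above, and the second reports the two sample proportions along with a one-sided p-value.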