Predicting death on the Titanic

This R Markdown document analyses the deaths and survival of the passengers in the sinking of the Titanic, based on the data given in the RMS Titanic case study.

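  # Set the working directory to the folder that holds the data file
  setwd("G:/INTERNSHIP IIM/titanic death data")
  # stringsAsFactors = TRUE keeps Sex and Embarked as factors on R >= 4.0
  titanic.df <- read.csv("Titanic Data.csv", stringsAsFactors = TRUE)
  View(titanic.df)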

Task 3a: Use R to count the total number of passengers on board the Titanic.

  str(titanic.df)
## 'data.frame':    889 obs. of  8 variables:
##  $ Survived: int  0 1 1 1 0 0 0 0 1 1 ...
##  $ Pclass  : int  3 1 3 1 3 3 1 3 3 2 ...
##  $ Sex     : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
##  $ Age     : num  22 38 26 35 35 29.7 54 2 27 14 ...
##  $ SibSp   : int  1 1 0 1 0 0 0 3 0 1 ...
##  $ Parch   : int  0 0 0 0 0 0 0 1 2 0 ...
##  $ Fare    : num  7.25 71.28 7.92 53.1 8.05 ...
##  $ Embarked: Factor w/ 3 levels "C","Q","S": 3 1 3 3 3 2 3 3 3 1 ...

Hence, from the output above, the data set records a total of 889 passengers on board the Titanic.

Task 3b: Use R to count the number of passengers who survived the sinking of the Titanic.

  table(titanic.df$Survived==1)
## 
## FALSE  TRUE 
##   549   340

A total of 340 passengers survived.

Task 3c: Use R to measure the percentage of passengers who survived the sinking of the Titanic.

  per_survival <- prop.table(table(titanic.df$Survived))*100
  per_survival[2]
##        1 
## 38.24522

38.25% of the passengers survived the sinking.
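As a cross-check, the same percentage can be recomputed from the two counts printed in Task 3b. The sketch below does not read the data file; it rebuilds a 0/1 vector from those counts, and the names `survival_counts`, `survived`, and `pct_survived` are illustrative only.

```r
# Counts copied from the table(titanic.df$Survived == 1) output above
survival_counts <- c(died = 549, survived = 340)

# Rebuild a 0/1 vector with the same totals (row order is irrelevant here)
survived <- rep(c(0, 1), times = survival_counts)

# The mean of a 0/1 vector is the proportion of ones
pct_survived <- mean(survived == 1) * 100
round(pct_survived, 2)
```

On the real data frame the equivalent one-liner would be `mean(titanic.df$Survived == 1) * 100`, which avoids building a table first.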

Task 3d: Use R to count the number of first-class passengers who survived the sinking of the Titanic.

  fcsur <- xtabs(~Pclass+Survived,data =  titanic.df)
  fcsur
##       Survived
## Pclass   0   1
##      1  80 134
##      2  97  87
##      3 372 119
  fcsur[1,2]
## [1] 134

134 first-class passengers survived.

Task 3e: Use R to measure the percentage of first-class passengers who survived the sinking of the Titanic.

  per_fcsur <- prop.table(fcsur,1)*100
  per_fcsur[1,2]
## [1] 62.61682

62.62% of first-class passengers survived the sinking.
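The second argument of `prop.table()` decides which total each cell is divided by, and that choice changes the question being answered. The sketch below rebuilds the class table from the counts printed in Task 3d (it does not read the data file) and compares the two margins; the object names are illustrative.

```r
# Counts copied from the Pclass x Survived table printed above
class_counts <- matrix(
  c( 80, 134,
     97,  87,
    372, 119),
  nrow = 3, byrow = TRUE,
  dimnames = list(Pclass = c("1", "2", "3"), Survived = c("0", "1"))
)

# margin = 1: each row (class) sums to 100, i.e. the survival rate within a class
by_class <- prop.table(class_counts, margin = 1) * 100

# margin = 2: each column sums to 100, i.e. the class mix among survivors
by_outcome <- prop.table(class_counts, margin = 2) * 100

round(by_class["1", "1"], 2)    # share of first-class passengers who survived
round(by_outcome["1", "1"], 2)  # share of survivors who travelled first class
```

Indexing by dimension name (`["1", "1"]`) rather than by position makes it harder to pick the wrong cell when a table has several dimensions.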

Task 3f: Use R to count the number of females from first class who survived the sinking of the Titanic.

  fsur <- xtabs(~Pclass+Sex+Survived,data = titanic.df)
  fsur1 <- ftable(fsur)
  fsur1[1,2]
## [1] 89

89 females from first class survived the sinking of the Titanic.

Task 3g: Use R to measure the percentage of survivors who were female.

  # Column-wise proportions (margin = 2): each Survived column sums to 100
  per_fsur1 <- prop.table(xtabs(~Sex+Survived,data = titanic.df),2)*100
  per_fsur1[1,2]
## [1] 67.94118

67.94% of the survivors were female (231 of 340). Dividing by the grand total instead gives 25.98%, which is the share of all passengers who were female survivors, a different quantity.

Task 3h: Use R to measure the percentage of females on board the Titanic who survived.

  fm <- prop.table(xtabs(~Sex+Survived,data= titanic.df),1)*100
  fm[1,2]
## [1] 74.03846

74.04% of the females on board the Titanic survived.

Task 3i: Run a Pearson’s Chi-squared test to test the following hypothesis:

Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

  # chisq.test() needs the table of counts; proportions make the statistic meaningless
  chi <- xtabs(~Sex+Survived, data=titanic.df)
  chi
##         Survived
## Sex        0   1
##   female  81 231
##   male   468 109
  chisq.test(chi)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  chi
## X-squared = 258.43, df = 1, p-value < 2.2e-16

The p-value is far below 0.05, so we reject the null hypothesis that survival is independent of sex. Since 74.04% of the females on board survived against 18.89% of the males (109 of 577), the data support our hypothesis.
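The hypothesis is directional (female survival higher than male survival), whereas Pearson's chi-squared test only checks for an association in either direction. One way to test the direction explicitly is a one-sided two-proportion test with `prop.test()`. The sketch below is an addition to the original task, not part of it; the counts are recovered by multiplying the proportions printed in Task 3g by the 889 passengers, which is an assumption of this sketch, and the object names are illustrative.

```r
# Counts recovered from the printed proportions times 889 passengers
# (an assumption of this sketch; it does not read the data file)
survivors <- c(female = 231, male = 109)
on_board  <- c(female = 312, male = 577)

# One-sided test: the alternative is that the first group (female)
# has a greater survival proportion than the second group (male)
one_sided <- prop.test(x = survivors, n = on_board, alternative = "greater")

one_sided$estimate  # sample survival proportions, female then male
one_sided$p.value   # a small value supports the directional hypothesis
```

For a 2 x 2 table the two-sided version of this test is equivalent to the chi-squared test on the counts, so the two approaches should agree on whether an association exists.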