R Markdown

This is an R Markdown document containing an analysis of passenger survival in the sinking of the Titanic.

Reading the Titanic data set into R:

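# Load the passenger records; the CSV is assumed to sit in the working directory
titanic.df <- read.csv("Titanic Data.csv")
# Open the data frame in the interactive viewer (only useful in an interactive session)
View(titanic.df)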

The total number of passengers recorded in the Titanic data set (the number of rows):

## [1] 889

The number of passengers who survived the sinking of the Titanic (the number of 1’s in the Survived column):

## 
##   0   1 
## 549 340

The proportion of passengers who survived the sinking of the Titanic (about 38.2% survived):

## 
##         0         1 
## 0.6175478 0.3824522
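The three results above can be reproduced with a few base R calls. The following is a minimal sketch, not necessarily the code hidden in the original chunks: the vector `survived` is a hypothetical stand-in for `titanic.df$Survived`, rebuilt from the printed counts so the example runs without the CSV file.

```r
# Stand-in for titanic.df$Survived, rebuilt from the counts printed above
# (549 passengers coded 0, 340 passengers coded 1)
survived <- rep(c(0, 1), times = c(549, 340))

# Number of passengers; with the real data this would be nrow(titanic.df)
n_passengers <- length(survived)

# Counts of each outcome: 0 = did not survive, 1 = survived
survival_counts <- table(survived)

# Proportions that sum to 1, and the same values expressed as percentages
survival_props <- prop.table(survival_counts)
survival_pct <- round(100 * survival_props, 2)

print(n_passengers)
print(survival_counts)
print(survival_pct)
```

Multiplying the output of `prop.table()` by 100 is what turns the proportions shown above into percentages.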

The number of first-class passengers who survived the sinking of the Titanic (Pclass = 1 and Survived = 1):

##       Survived
## Pclass   0   1
##      1  80 134
##      2  97  87
##      3 372 119

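Percentage of all passengers who were first-class survivors (Pclass = 1 and Survived = 1). Each cell is a share of all 889 passengers, so the 15.07% below is not the survival rate within first class, which is 134 of 214, or about 62.6%: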

##       Survived
## Pclass         0         1
##      1  8.998875 15.073116
##      2 10.911136  9.786277
##      3 41.844769 13.385827
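The difference between cell percentages and within-class percentages comes down to the `margin` argument of `prop.table()`. The sketch below illustrates this under the assumption that the table was built with something like `xtabs(~ Pclass + Survived, data = titanic.df)`; here the counts are typed in from the output above, and the name `class_by_survival` is hypothetical.

```r
# Pclass x Survived counts copied from the output above
# (matrix() fills column by column, so the Survived = 0 column comes first)
class_by_survival <- as.table(matrix(
  c(80, 97, 372,     # Survived = 0 for classes 1, 2, 3
    134, 87, 119),   # Survived = 1 for classes 1, 2, 3
  nrow = 3,
  dimnames = list(Pclass = c("1", "2", "3"), Survived = c("0", "1"))
))

# Cell percentages: each cell as a share of all passengers
cell_pct <- prop.table(class_by_survival) * 100

# Row percentages: each cell as a share of its own passenger class
row_pct <- prop.table(class_by_survival, margin = 1) * 100

print(round(cell_pct, 2))
print(round(row_pct, 2))
```

The first call reproduces the table above; the second gives the survival rate within each class.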

The number of females from first class who survived:

## , , Pclass = 1
## 
##         Survived
## Sex        0   1
##   female   3  89
##   male    77  45
## 
## , , Pclass = 2
## 
##         Survived
## Sex        0   1
##   female   6  70
##   male    91  17
## 
## , , Pclass = 3
## 
##         Survived
## Sex        0   1
##   female  72  72
##   male   300  47
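A three-way table like the one above can be produced with `xtabs()` and then indexed by level name to pull out a single count. This is a sketch under assumptions: `passengers` is a hypothetical stand-in for `titanic.df`, rebuilt with one row per passenger from the printed counts, and it contains only the three columns used here.

```r
# Stand-in for titanic.df, rebuilt from the counts printed above.
# expand.grid() varies Sex fastest, then Survived, then Pclass.
counts <- expand.grid(Sex = c("female", "male"), Survived = c(0, 1), Pclass = 1:3)
counts$n <- c(3, 77, 89, 45,    # Pclass 1
              6, 91, 70, 17,    # Pclass 2
              72, 300, 72, 47)  # Pclass 3
passengers <- counts[rep(seq_len(nrow(counts)), counts$n),
                     c("Sex", "Survived", "Pclass")]

# Sex x Survived counts, one slice per passenger class
three_way <- xtabs(~ Sex + Survived + Pclass, data = passengers)
print(three_way)

# Females in first class who survived, extracted by level name
first_class_female_survivors <- three_way["female", "1", "1"]

# The same count obtained by filtering rows directly
same_count <- nrow(subset(passengers, Sex == "female" & Pclass == 1 & Survived == 1))
print(first_class_female_survivors)
```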

The percentage of survivors who were female:

##         Survived
## Sex             0        1
##   female 14.75410 67.94118
##   male   85.24590 32.05882

The percentage of females who survived:

##         Survived
## Sex             0        1
##   female 25.96154 74.03846
##   male   81.10919 18.89081
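The two tables above use the same counts and differ only in which margin the percentages are taken over. A minimal sketch follows, with the Sex by Survived counts obtained by summing the three class slices shown earlier; the name `sex_by_survival` is hypothetical.

```r
# Sex x Survived counts, summed over the three passenger classes:
# female: 3 + 6 + 72 = 81 died,    89 + 70 + 72 = 231 survived
# male:   77 + 91 + 300 = 468 died, 45 + 17 + 47 = 109 survived
sex_by_survival <- as.table(matrix(
  c(81, 468,     # Survived = 0 (female, male)
    231, 109),   # Survived = 1 (female, male)
  nrow = 2,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
))

# Column percentages: of those who survived, what share was female?
col_pct <- prop.table(sex_by_survival, margin = 2) * 100

# Row percentages: of all females, what share survived?
row_pct <- prop.table(sex_by_survival, margin = 1) * 100

print(col_pct)
print(row_pct)
```

With `margin = 2` each column sums to 100, which answers "what share of survivors were female"; with `margin = 1` each row sums to 100, which answers "what share of females survived".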

Pearson’s chi-squared test of the following hypothesis:

The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16
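The chi-squared test checks whether survival and sex are associated; it does not by itself indicate direction, so the conclusion that females fared better also relies on the row percentages (74.0% against 18.9%). The sketch below rebuilds `mytable` from the counts shown earlier, which is an assumption about how the original object was defined, and adds a one-sided two-proportion test as one possible way to address the directional hypothesis.

```r
# Sex x Survived counts summed over passenger class (assumed contents of mytable)
mytable <- matrix(
  c(81, 468,     # Survived = 0 (female, male)
    231, 109),   # Survived = 1 (female, male)
  nrow = 2,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
)

# Test of independence; for a 2 x 2 table chisq.test() applies
# Yates' continuity correction by default
result <- chisq.test(mytable)
print(result)

# One-sided alternative: the proportion of survivors among females (231 of 312)
# is greater than the proportion among males (109 of 577)
one_sided <- prop.test(x = c(231, 109), n = c(312, 577), alternative = "greater")
print(one_sided$p.value)
```

A p-value this small leads to rejecting the null hypothesis that survival was independent of sex.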

Note that the echo = FALSE parameter was added to the code chunks to prevent printing of the R code that generated the output above.