Task 2a

The analysis is based on the dataset Titanic Data.csv.

This file explains the meaning of each column in the given dataset.

Task 2b

The data is read with the read.csv() function in R and stored in a data frame called titanic.df.

titanic.df <- read.csv("Titanic Data.csv")

The data frame is then inspected with the View() function.

View(titanic.df)
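View() opens a spreadsheet-style viewer, so it is mainly useful in an interactive session. For a look at the data that prints to the console, str(), head() and summary() are common choices. The sketch below is only illustrative: the three rows are hypothetical toy values, not taken from the real dataset, and toy, toy_path and toy.df are made-up names. Only the column names Survived, Pclass and Sex come from this report.

```r
# Hypothetical toy rows, NOT taken from the real dataset; only the column
# names Survived, Pclass and Sex appear in the report.
toy <- data.frame(
  Survived = c(1, 0, 1),
  Pclass   = c(1, 3, 2),
  Sex      = c("female", "male", "female")
)

# Round-trip through a temporary CSV so the sketch runs without "Titanic Data.csv".
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)
toy.df <- read.csv(toy_path)

str(toy.df)      # column types and a preview of the values
head(toy.df, 2)  # first two rows
summary(toy.df)  # per-column summaries
```

With the real data, the same three calls could be applied to titanic.df.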

Task 3a

Counting the total number of passengers on board the Titanic.

dim(titanic.df)
## [1] 889   8
length(titanic.df$Survived)  # alternative: length of a single column
## [1] 889

Therefore, the total number of passengers on board the Titanic recorded in the dataset = 889

Task 3b

Counting the number of passengers who survived the sinking of the Titanic. Survival {0 = No, 1 = Yes}

mytable <- with(titanic.df, table(Survived))
mytable  # frequencies
## Survived
##   0   1 
## 549 340

Therefore, the number of passengers who survived the sinking of the Titanic = 340

Task 3c

Measuring the percentage of passengers who survived the sinking of the Titanic.

prop.table(mytable)*100
## Survived
##        0        1 
## 61.75478 38.24522

Therefore, the percentage of passengers who survived the sinking of the Titanic = 38.24522
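Because Survived is coded 0/1, the same percentage can also be obtained as the mean of the column multiplied by 100. The following is a minimal sketch that rebuilds the vector from the counts reported in Task 3b instead of reading the CSV. The names survived and pct_survived are illustrative.

```r
# Rebuild a 0/1 survival vector from the counts in the frequency table
# (549 did not survive, 340 survived); this stands in for titanic.df$Survived.
survived <- rep(c(0, 1), times = c(549, 340))

# The mean of a 0/1 vector is the proportion of ones.
pct_survived <- mean(survived) * 100
round(pct_survived, 2)
```

On the real data frame this would be mean(titanic.df$Survived) * 100, assuming the column has no missing values.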

Task 3d

Counting the number of first-class passengers who survived the sinking of the Titanic.

mytable <- xtabs(~ Pclass+Survived, data=titanic.df)
mytable # frequencies
##       Survived
## Pclass   0   1
##      1  80 134
##      2  97  87
##      3 372 119

Therefore, the number of first-class passengers who survived the sinking of the Titanic = 134

Task 3e

Measuring the percentage of first-class passengers who survived the sinking of the Titanic.

prop.table(mytable, 1)*100
##       Survived
## Pclass        0        1
##      1 37.38318 62.61682
##      2 52.71739 47.28261
##      3 75.76375 24.23625

Therefore, the percentage of first-class passengers who survived the sinking of the Titanic = 62.61682

Task 3f

Counting the number of first-class female passengers who survived the sinking of the Titanic.

mytable <- xtabs(~ Sex+Pclass+Survived, data=titanic.df)
ftable(mytable)
##               Survived   0   1
## Sex    Pclass                 
## female 1                 3  89
##        2                 6  70
##        3                72  72
## male   1                77  45
##        2                91  17
##        3               300  47

Therefore, the number of first-class female passengers who survived the sinking of the Titanic = 89
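The cell of interest can also be pulled out of a three-way table by name instead of being read off the printed output. The sketch below rebuilds the table from the counts shown in the ftable output above, so it runs without the CSV. The names counts, tab3 and first_class_female_survivors are illustrative.

```r
# Counts copied from the ftable output above; array() fills with the first
# dimension (Sex) varying fastest, then Pclass, then Survived.
counts <- array(
  c(3, 77, 6, 91, 72, 300,    # Survived = 0
    89, 45, 70, 17, 72, 47),  # Survived = 1
  dim = c(2, 3, 2),
  dimnames = list(Sex = c("female", "male"),
                  Pclass = c("1", "2", "3"),
                  Survived = c("0", "1"))
)
tab3 <- as.table(counts)

# Index by dimension names instead of reading the printed table.
first_class_female_survivors <- tab3["female", "1", "1"]
first_class_female_survivors
```

With the table built by xtabs() in this task, the equivalent lookup would be mytable["female", "1", "1"].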

Task 3g

Measuring the percentage of survivors who were female.

margin.table(mytable, c(1,3))
##         Survived
## Sex        0   1
##   female  81 231
##   male   468 109
prop.table(margin.table(mytable, c(1,3)), 2)*100
##         Survived
## Sex             0        1
##   female 14.75410 67.94118
##   male   85.24590 32.05882

Therefore, the percentage of survivors who were female = 67.94118

Task 3h

Measuring the percentage of females on board the Titanic who survived.

margin.table(mytable, c(1,3))
##         Survived
## Sex        0   1
##   female  81 231
##   male   468 109
prop.table(margin.table(mytable, c(1,3)), 1)*100
##         Survived
## Sex             0        1
##   female 25.96154 74.03846
##   male   81.10919 18.89081

Therefore, the percentage of females on board the Titanic who survived = 74.03846
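Tasks 3g and 3h differ only in the margin passed to prop.table(): margin 1 normalises within each row (each sex), and margin 2 normalises within each column (each outcome). The sketch below makes that contrast explicit, using the Sex-by-Survived counts shown above. The names sex_surv, by_row and by_col are illustrative.

```r
# Sex-by-Survived counts copied from the margin.table output above
# (matrix() fills column by column: Survived = 0 first, then Survived = 1).
sex_surv <- as.table(matrix(
  c(81, 468, 231, 109), nrow = 2,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
))

by_row <- prop.table(sex_surv, 1) * 100  # each row (sex) sums to 100
by_col <- prop.table(sex_surv, 2) * 100  # each column (outcome) sums to 100

by_row["female", "1"]  # share of females who survived (Task 3h)
by_col["female", "1"]  # share of survivors who were female (Task 3g)
```

Checking rowSums(by_row) and colSums(by_col) is a quick way to confirm which margin was used.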

Task 3i

Run a Pearson’s Chi-squared test to test the following hypothesis: the proportion of females on board who survived the sinking of the Titanic was higher than the proportion of males on board who survived the sinking of the Titanic.

mytable <- xtabs(~ Sex+Survived, data=titanic.df)
chisq.test(mytable)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16

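Since the p-value is very small (p < 2.2e-16, far below 0.01), we reject the null hypothesis that gender and survival are independent. The chi-squared test itself is non-directional; the direction comes from the row percentages in Task 3h, where 74.04% of females survived compared with 18.89% of males, so the data support the hypothesis.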
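chisq.test() checks whether the two variables are independent, but it does not test a direction. One way to test the directional hypothesis directly is a one-sided two-sample test of proportions with prop.test(). This is an alternative offered as a sketch, not the test used in this report. The counts come from the Sex-by-Survived table in Task 3h, and the names survivors, totals and res are illustrative.

```r
# Survivors and group sizes taken from the Sex-by-Survived table:
# 231 of 312 females survived, 109 of 577 males survived.
survivors <- c(female = 231, male = 109)
totals    <- c(female = 312, male = 577)

# alternative = "greater" tests whether the first group's survival
# proportion (female) exceeds the second group's (male).
res <- prop.test(x = survivors, n = totals, alternative = "greater")
res$estimate  # sample proportions for the two groups
res$p.value   # one-sided p-value
```

A small one-sided p-value here would point the same way as the chi-squared result, while also accounting for the direction stated in the hypothesis.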