First, read a CSV file containing data on passengers aboard the Titanic. This data set contains 8 variables and has no missing data.

Survived–Survival {0 = No, 1 = Yes}

Pclass–Ticket Class {1 = 1st, 2 = 2nd, 3 = 3rd}

Sex–Sex {Male, Female}

Age–Age in years

SibSp–Number of Siblings / spouses aboard the Titanic

Parch–Number of Parents / children aboard the Titanic

Fare–Passenger fare

Embarked–Port of Embarkation {C = Cherbourg, Q = Queenstown, S = Southampton}

titanic.df <- read.csv("TitanicData.csv")
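
The variable count and the absence of missing values can be checked directly. The sketch below runs on a small made-up data frame, since the CSV itself is not reproduced here; `toy.df` and every value in it are hypothetical stand-ins for `titanic.df`, and only the column names come from the list above.

```r
# Hypothetical stand-in for titanic.df: three made-up rows, same eight columns
toy.df <- data.frame(
  Survived = c(0, 1, 1),
  Pclass   = c(3, 1, 3),
  Sex      = c("male", "female", "female"),
  Age      = c(22, 38, 26),
  SibSp    = c(1, 1, 0),
  Parch    = c(0, 0, 0),
  Fare     = c(7.25, 71.28, 7.93),
  Embarked = c("S", "C", "S")
)

n_vars    <- ncol(toy.df)            # number of variables
n_missing <- colSums(is.na(toy.df))  # count of NA values in each column
has_na    <- anyNA(toy.df)           # TRUE if any value at all is NA

n_vars
n_missing
has_na
```

Running the same three calls on `titanic.df` would be one way to confirm the "8 variables, no missing data" statement.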

[TASK 3a]

Total number of passengers on board the Titanic=889

library(psych)
describe(titanic.df)
##           vars   n  mean    sd median trimmed   mad min    max  range
## Survived     1 889  0.38  0.49   0.00    0.35  0.00 0.0   1.00   1.00
## Pclass       2 889  2.31  0.83   3.00    2.39  0.00 1.0   3.00   2.00
## Sex*         3 889  1.65  0.48   2.00    1.69  0.00 1.0   2.00   1.00
## Age          4 889 29.65 12.97  29.70   29.22  9.34 0.4  80.00  79.60
## SibSp        5 889  0.52  1.10   0.00    0.27  0.00 0.0   8.00   8.00
## Parch        6 889  0.38  0.81   0.00    0.19  0.00 0.0   6.00   6.00
## Fare         7 889 32.10 49.70  14.45   21.28 10.24 0.0 512.33 512.33
## Embarked*    8 889  2.54  0.79   3.00    2.67  0.00 1.0   3.00   2.00
##            skew kurtosis   se
## Survived   0.48    -1.77 0.02
## Pclass    -0.63    -1.27 0.03
## Sex*      -0.62    -1.61 0.02
## Age        0.43     0.96 0.43
## SibSp      3.68    17.69 0.04
## Parch      2.74     9.66 0.03
## Fare       4.79    33.23 1.67
## Embarked* -1.26    -0.23 0.03

[TASK 3b]

Number of passengers who survived the sinking of the Titanic=340

(The value 1 in the Survived variable denotes a passenger who survived.)

table(titanic.df$Survived)
## 
##   0   1 
## 549 340

[TASK 3c]

The percentage of passengers who survived the sinking of the Titanic=38%

survived1 <- with(titanic.df, table(Survived))
prop.table(survived1)
## Survived
##         0         1 
## 0.6175478 0.3824522
prop.table(survived1)*100
## Survived
##        0        1 
## 61.75478 38.24522

[TASK 3d]

The number of first-class passengers who survived the sinking of the Titanic=134.

task3d <- xtabs(~ Survived+Pclass, data=titanic.df)
addmargins(task3d)
##         Pclass
## Survived   1   2   3 Sum
##      0    80  97 372 549
##      1   134  87 119 340
##      Sum 214 184 491 889

[TASK 3e]

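The percentage of first-class passengers who survived the sinking of the Titanic=63% (134 of the 214 first-class passengers). The 15% in the table below is a different quantity: first-class survivors as a share of all 889 passengers.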

task3e <- with(titanic.df, table(Survived,Pclass))
prop.table(task3e)
##         Pclass
## Survived          1          2          3
##        0 0.08998875 0.10911136 0.41844769
##        1 0.15073116 0.09786277 0.13385827
prop.table(task3e)*100
##         Pclass
## Survived         1         2         3
##        0  8.998875 10.911136 41.844769
##        1 15.073116  9.786277 13.385827
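
The joint table above divides every cell by all 889 passengers. A percentage of first-class passengers conditions on class, which means dividing by the column totals instead. A minimal sketch, assuming the counts are re-entered by hand from the Task 3d table (`class_tab` and `by_class` are illustrative names, not objects from the original script):

```r
# Counts copied from the Survived-by-Pclass table in Task 3d.
# matrix() fills column by column, so each pair is (died, survived) for one class.
class_tab <- as.table(matrix(
  c( 80, 134,    # Pclass 1
     97,  87,    # Pclass 2
    372, 119),   # Pclass 3
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Pclass = c("1", "2", "3"))
))

# margin = 2 divides each cell by its column (class) total,
# so every column sums to 100
by_class <- prop.table(class_tab, margin = 2) * 100
round(by_class, 1)
```

With the original data loaded, `prop.table(task3e, 2)` is the equivalent call. The first-class survival cell is 134/214, roughly 62.6%.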

[TASK 3f]

The number of first-class females who survived the sinking of the Titanic=89

task3f <- xtabs(~ Pclass+Sex+Survived, data=titanic.df)
task3f
## , , Survived = 0
## 
##       Sex
## Pclass female male
##      1      3   77
##      2      6   91
##      3     72  300
## 
## , , Survived = 1
## 
##       Sex
## Pclass female male
##      1     89   45
##      2     70   17
##      3     72   47
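
Rather than reading the answer off the printed slices, a single cell of a three-way table can be pulled out by its dimension names. The sketch below rebuilds the table from the counts printed above so that it runs without the CSV; `three_way` is an illustrative name standing in for `task3f`.

```r
# Counts copied from the three-way table above.
# array() fills the first dimension fastest: Pclass, then Sex, then Survived.
counts <- array(
  c( 3,  6,  72,     # Survived = 0, female, classes 1..3
    77, 91, 300,     # Survived = 0, male,   classes 1..3
    89, 70,  72,     # Survived = 1, female, classes 1..3
    45, 17,  47),    # Survived = 1, male,   classes 1..3
  dim = c(3, 2, 2),
  dimnames = list(Pclass   = c("1", "2", "3"),
                  Sex      = c("female", "male"),
                  Survived = c("0", "1"))
)
three_way <- as.table(counts)

# Index one cell by name: first class, female, survived
first_class_female_survivors <- three_way["1", "female", "1"]
first_class_female_survivors

# ftable() prints all three dimensions in one flat layout
ftable(three_way)
```

On the original object the same lookup would be `task3f["1", "female", "1"]`.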

[TASK 3g]

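The percentage of survivors who were female=68% (231 of the 340 survivors). The 26% in the table below is a different quantity: female survivors as a share of all 889 passengers.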

task3g <- xtabs(~ Survived+Sex, data=titanic.df)
prop.table(task3g)
##         Sex
## Survived     female       male
##        0 0.09111361 0.52643420
##        1 0.25984252 0.12260967
prop.table(task3g)*100
##         Sex
## Survived    female      male
##        0  9.111361 52.643420
##        1 25.984252 12.260967
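
A percentage of survivors conditions on the survival outcome, so each cell is divided by its row total rather than by the grand total. A minimal sketch, assuming the sex-by-survival counts are obtained by summing the Task 3f table over class (`sex_tab` and `by_outcome` are illustrative names):

```r
# Counts derived from the Task 3f table by summing over Pclass:
# females: 3 + 6 + 72 = 81 died, 89 + 70 + 72 = 231 survived
# males:   77 + 91 + 300 = 468 died, 45 + 17 + 47 = 109 survived
sex_tab <- as.table(matrix(
  c( 81, 231,    # female: died, survived
    468, 109),   # male:   died, survived
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
))

# margin = 1 divides each cell by its row (outcome) total,
# so every row sums to 100
by_outcome <- prop.table(sex_tab, margin = 1) * 100
round(by_outcome, 1)
```

With the original data loaded, `prop.table(task3g, 1)` is the equivalent call. The female share of survivors is 231/340, roughly 67.9%.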

[TASK 3h]

The percentage of females on board the Titanic who survived=74%

prop.table(task3g,2)
##         Sex
## Survived    female      male
##        0 0.2596154 0.8110919
##        1 0.7403846 0.1889081
prop.table(task3g,2)*100
##         Sex
## Survived   female     male
##        0 25.96154 81.10919
##        1 74.03846 18.89081

[TASK 3i]

Run a Pearson’s Chi-squared test to test the following hypothesis:

Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

chisq.test(task3g)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  task3g
## X-squared = 258.43, df = 1, p-value < 2.2e-16

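As the p-value is less than 0.01, the null hypothesis that survival is independent of sex is rejected. The chi-squared test does not indicate direction by itself; the direction comes from the proportions in Task 3h, where 74% of females survived against 19% of males, which supports the hypothesis.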
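
Because the hypothesis is directional, a one-sided two-sample test of proportions is a natural companion to the chi-squared test. The sketch below swaps in `prop.test()` from base R's stats package with `alternative = "greater"`; it is not part of the original analysis, and the counts are entered by hand from the tables above (231 of 312 females and 109 of 577 males survived).

```r
# Survivors and totals per sex, taken from the tables in Tasks 3f and 3h
survived <- c(female = 231, male = 109)
onboard  <- c(female = 312, male = 577)

# H0: the female survival proportion is not greater than the male one
# H1: the female survival proportion is greater
one_sided <- prop.test(x = survived, n = onboard, alternative = "greater")

one_sided$estimate   # sample proportions for the two groups
one_sided$p.value    # one-sided p-value
```

The order of the groups matters here: `alternative = "greater"` tests whether the first group's proportion exceeds the second's, so females are listed first.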