Sinking of the RMS Titanic

Time: 23:40 - 02:20
Date: 14-15 April 1912
Location: North Atlantic Ocean
Cause: Collision with iceberg on 14 April 1912
Outcome: Between 1,490 and 1,635 deaths; improvements to navigational safety; cultural impact

titanic.df <- read.csv("Titanic Data.csv")

A sample view of the Titanic dataset.

head(titanic.df)
##   Survived Pclass    Sex  Age SibSp Parch    Fare Embarked
## 1        0      3   male 22.0     1     0  7.2500        S
## 2        1      1 female 38.0     1     0 71.2833        C
## 3        1      3 female 26.0     0     0  7.9250        S
## 4        1      1 female 35.0     1     0 53.1000        S
## 5        0      3   male 35.0     0     0  8.0500        S
## 6        0      3   male 29.7     0     0  8.4583        Q

Use R to create a table showing the average age of the survivors and the average age of the people who died.

aggregate(titanic.df$Age, by=list(titanic.df$Survived), FUN=mean)
##   Group.1        x
## 1       0 30.41530
## 2       1 28.42382

Here 0 denotes passengers who did not survive and 1 denotes those who survived.
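The same table can also be produced with the formula interface of aggregate(), which keeps the column names Survived and Age instead of Group.1 and x. The sketch below is only an illustration: it uses a hypothetical data frame, sample.df, built from the six rows shown by head() above, so its averages will not match those of the full dataset.

```r
# Six rows copied from the head() output above; this is an illustrative
# sample only, not the full "Titanic Data.csv" file.
sample.df <- data.frame(
  Survived = c(0, 1, 1, 1, 0, 0),
  Age      = c(22.0, 38.0, 26.0, 35.0, 35.0, 29.7)
)

# Formula interface: mean of Age within each level of Survived.
# The result keeps the original column names (Survived, Age).
age.by.survival <- aggregate(Age ~ Survived, data = sample.df, FUN = mean)
print(age.by.survival)
```

On the full data, replacing sample.df with titanic.df should give the same two averages as the call above, only with clearer column labels.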

Use R to run a t-test to test the following hypothesis:

H2: The Titanic survivors were younger than the passengers who died

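# Split the passengers by outcome (Survived is coded 0 = died, 1 = survived)
notsurvived.df <- titanic.df[which(titanic.df$Survived == 0), ]
survived.df <- titanic.df[which(titanic.df$Survived == 1), ]
# Age vectors for the two groups
serv <- survived.df$Age
notserv <- notsurvived.df$Age
# Welch two-sample t-test (two-sided by default)
t.test(serv, notserv)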
## 
##  Welch Two Sample t-test
## 
## data:  serv and notserv
## t = -2.1816, df = 667.56, p-value = 0.02949
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.7838912 -0.1990628
## sample estimates:
## mean of x mean of y 
##  28.42382  30.41530
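The test above is two-sided, while H2 is directional (survivors younger than those who died). A one-sided version can be requested with the alternative argument of t.test(). The sketch below is a minimal illustration under assumptions: sample.df, surv.age and died.age are hypothetical names, and the data are only the six rows shown by head(), so the numbers it prints say nothing about the full dataset.

```r
# Six rows copied from the head() output above; illustrative sample only.
sample.df <- data.frame(
  Survived = c(0, 1, 1, 1, 0, 0),
  Age      = c(22.0, 38.0, 26.0, 35.0, 35.0, 29.7)
)

surv.age <- sample.df$Age[sample.df$Survived == 1]  # ages of survivors
died.age <- sample.df$Age[sample.df$Survived == 0]  # ages of those who died

# One-sided Welch test. Alternative hypothesis: mean(surv.age) < mean(died.age)
one.sided <- t.test(surv.age, died.age, alternative = "less")

# Default two-sided test, for comparison
two.sided <- t.test(surv.age, died.age)

print(one.sided$p.value)
print(two.sided$p.value)
```

On the full data the equivalent call would be t.test(serv, notserv, alternative = "less"). Because the reported t statistic is negative (the survivors' mean age is the lower one), the one-sided p-value would be half of the two-sided value shown above, which supports H2 at the 5% level.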