This R Markdown file contains the t-test analysis for the Titanic survivors case study.

4a. Reading the ‘Titanic.csv’ file into R

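# Read the data; stringsAsFactors = TRUE keeps Sex and Embarked as factors, matching the str() output below
Titanic <- read.csv("Titanic.csv", stringsAsFactors = TRUE)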
attach(Titanic)
str(Titanic)
## 'data.frame':    889 obs. of  8 variables:
##  $ Survived: int  0 1 1 1 0 0 0 0 1 1 ...
##  $ Pclass  : int  3 1 3 1 3 3 1 3 3 2 ...
##  $ Sex     : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
##  $ Age     : num  22 38 26 35 35 29.7 54 2 27 14 ...
##  $ SibSp   : int  1 1 0 1 0 0 0 3 0 1 ...
##  $ Parch   : int  0 0 0 0 0 0 0 1 2 0 ...
##  $ Fare    : num  7.25 71.28 7.92 53.1 8.05 ...
##  $ Embarked: Factor w/ 3 levels "C","Q","S": 3 1 3 3 3 2 3 3 3 1 ...

4b. Use R to create a table showing the average age of the survivors and the average age of the people who died.

Titanic$Survived <- factor(Titanic$Survived, levels = c(0, 1), labels = c("Died", "Survived"))
aggregate(Titanic$Age,list(Titanic$Survived),mean)
##    Group.1        x
## 1     Died 30.41530
## 2 Survived 28.42382
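
The same table can also be built with the formula interface of aggregate(), which keeps the column names and makes it easy to add the group size and standard deviation. This is a minimal sketch: `age_summary` is a hypothetical helper name, and it assumes a data frame with `Age` and `Survived` columns as read in 4a.

```r
# Hypothetical helper (not part of the original analysis).
age_summary <- function(df) {
  # One row per survival group: n, mean and standard deviation of Age.
  out <- aggregate(Age ~ Survived, data = df,
                   FUN = function(x) c(n = length(x), mean = mean(x), sd = sd(x)))
  # aggregate() stores the three statistics in a matrix column; flatten it
  # into ordinary columns named Age.n, Age.mean and Age.sd.
  do.call(data.frame, out)
}
```

Calling `age_summary(Titanic)` would return one row for each level of `Survived`.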

4c. Use R to run a t-test to test the following hypothesis:

H2: The Titanic survivors were younger than the passengers who died.

Performing a dependent (paired) t-test, treating the two variables as dependent

Assumptions made: the variables are normally distributed, and the variances in each group are equal.
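
The two assumptions listed above can be checked on the data rather than taken as given. The following is a minimal sketch using the base R functions shapiro.test() and var.test(); `check_assumptions` is a hypothetical helper name, and the 0.05 cut-off usually applied to its p-values is a convention, not something fixed by the case study.

```r
# Hypothetical helper (not part of the original analysis).
check_assumptions <- function(age, survived) {
  list(
    # Shapiro-Wilk p-value per group (needs 3 to 5000 values in each group).
    normality_p = tapply(age, survived, function(x) shapiro.test(x)$p.value),
    # F test for equal variances between the two survival groups.
    equal_var_p = var.test(age ~ survived)$p.value
  )
}
```

For example, `check_assumptions(Titanic$Age, Titanic$Survived)` would return the per-group normality p-values and the equal-variance p-value; small p-values suggest the corresponding assumption is doubtful.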

t.test(Age,Survived,paired=TRUE,data=Titanic)
## 
##  Paired t-test
## 
## data:  Age and Survived
## t = 67.065, df = 888, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  28.41458 30.12782
## sample estimates:
## mean of the differences 
##                 29.2712

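As the obtained p-value is below 0.05, we can reject the null hypothesis of this test. However, the paired call compares Age with the 0/1 Survived indicator row by row, so the reported mean difference of 29.27 is the mean age minus the proportion of survivors, not a difference between the two groups. The group means in 4b (28.42 years for survivors, 30.42 years for those who died) are in the direction of H2, but this test alone does not establish that the survivors were younger than the passengers who died.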
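
Because H2 compares two separate groups of passengers, an independent two-sample test is one natural way to check it. The following is a minimal sketch of a one-sided Welch t-test of Age by survival status; `test_h2` is a hypothetical helper name, and the choice of the Welch variant (which does not assume equal variances) is an assumption of this sketch rather than part of the original analysis.

```r
# Hypothetical helper (not part of the original analysis).
test_h2 <- function(age, survived) {
  # Accept either the raw 0/1 coding or the labelled factor created in 4b.
  if (!is.factor(survived)) {
    survived <- factor(survived, levels = c(0, 1), labels = c("Died", "Survived"))
  }
  # "Died" is the first factor level, so alternative = "greater" tests
  # mean(Age | Died) > mean(Age | Survived), which is H2.
  t.test(age ~ survived, alternative = "greater", var.equal = FALSE)
}
```

Running `test_h2(Titanic$Age, Titanic$Survived)` would report the two group means together with a one-sided p-value for H2.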