Task 4a

Load the Titanic data set into a data frame

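# Set the working directory to the folder that contains the CSV file.
setwd("C:/Users/hp/Desktop/IIML/My Project files")
# Read the Titanic data into a data frame.
titanic.df <- read.csv("Titanic Data.csv")
# Open the data frame in the data viewer.
View(titanic.df)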

Task 4b

aggregate(titanic.df$Age, by = list(Survived = titanic.df$Survived), FUN = mean)
##   Survived        x
## 1        0 30.41530
## 2        1 28.42382
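
The grouped means above rely on Age having no missing values, since mean() returns NA when any input is NA. As a minimal sketch of how the same summary could be made robust to missing ages, the example below uses a small made-up data frame (toy.df and its values are hypothetical, not taken from the Titanic file):

```r
# Toy data frame (hypothetical values, not taken from the Titanic file).
toy.df <- data.frame(
  Survived = c(0, 0, 0, 1, 1, 1),
  Age      = c(22, 40, NA, 30, 26, 19)
)

# The formula interface drops rows with NA by default (na.action = na.omit).
by_formula <- aggregate(Age ~ Survived, data = toy.df, FUN = mean)

# tapply() needs na.rm = TRUE passed through to mean().
by_tapply <- tapply(toy.df$Age, toy.df$Survived, mean, na.rm = TRUE)

print(by_formula)
print(by_tapply)
```

Both calls should give the same group means; the formula version also names the result column Age instead of x.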

Task 4c

t.test(Age ~ Survived, data = titanic.df)
## 
##  Welch Two Sample t-test
## 
## data:  Age by Survived
## t = 2.1816, df = 667.56, p-value = 0.02949
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1990628 3.7838912
## sample estimates:
## mean in group 0 mean in group 1 
##        30.41530        28.42382
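
To make the reported t, df, and p-value less of a black box, here is a rough sketch of how a Welch two-sample t-test can be computed by hand and checked against t.test(). It runs on simulated ages (the vectors died and survived, their sizes, and their distributions are assumptions for illustration, not the Titanic data):

```r
# Simulated stand-in data (NOT the Titanic file): ages for two groups.
set.seed(1)
died     <- rnorm(200, mean = 30, sd = 14)
survived <- rnorm(150, mean = 28, sd = 15)

# Per-group variance of the sample mean.
v1 <- var(died) / length(died)
v2 <- var(survived) / length(survived)

# Welch t statistic: difference in means over the unpooled standard error.
t_stat <- (mean(died) - mean(survived)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (v1 + v2)^2 /
  (v1^2 / (length(died) - 1) + v2^2 / (length(survived) - 1))

# Two-sided p-value from the t distribution.
p_val <- 2 * pt(-abs(t_stat), df)

# Compare with the built-in test (Welch is the default, var.equal = FALSE).
res <- t.test(died, survived)
print(c(t = t_stat, df = df, p = p_val))
print(res)
```

The hand-computed values should match the statistic, parameter, and p.value components of the t.test() result.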

The p-value (0.02949) is below the 0.05 significance level, so we reject the null hypothesis that passengers who survived and passengers who died have the same mean age.

Because the p-value is less than 0.05, the t-test indicates a statistically significant association between a passenger's age and whether they survived.

The group means show that passengers who died were older on average (30.4 years) than passengers who survived (28.4 years), a difference of about 2 years. Thus, the Titanic survivors were, on average, younger than the passengers who died.