Load the Titanic data set into a data frame
setwd("C:/Users/hp/Desktop/IIML/My Project files")
titanic.df <- read.csv("Titanic Data.csv")
View(titanic.df)
aggregate(titanic.df$Age,by=list(Survived = titanic.df$Survived),mean)
## Survived x
## 1 0 30.41530
## 2 1 28.42382
t.test(Age ~ Survived, data = titanic.df)
##
## Welch Two Sample t-test
##
## data: Age by Survived
## t = 2.1816, df = 667.56, p-value = 0.02949
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.1990628 3.7838912
## sample estimates:
## mean in group 0 mean in group 1
## 30.41530 28.42382
The p-value is 0.02949, which is below the 0.05 significance level.
We can therefore reject the null hypothesis that the mean age of passengers who survived equals the mean age of passengers who died.
The t-test thus indicates a statistically significant association between a passenger's age and whether they survived, although the difference in means is small (about 2 years; 95% confidence interval 0.2 to 3.8 years).
The group means show that passengers who died were older on average (30.4 years) than those who survived (28.4 years). Thus, the Titanic survivors were, on average, younger than the passengers who died.
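To make the test statistic less of a black box, here is a minimal sketch of how Welch's t, its degrees of freedom, and the two-sided p-value can be computed by hand. The helper name `welch_t` and the simulated `died` and `survived` vectors are assumptions made for illustration only; they are not part of the original analysis and do not reproduce the Titanic figures.

```r
# Hypothetical helper: Welch two-sample t-test computed from first principles.
welch_t <- function(x, y) {
  x <- x[!is.na(x)]                      # drop missing values, if any
  y <- y[!is.na(y)]
  vx <- var(x) / length(x)               # squared standard error of mean(x)
  vy <- var(y) / length(y)               # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  p_value <- 2 * pt(-abs(t_stat), df)    # two-sided p-value
  list(t = t_stat, df = df, p.value = p_value)
}

# Simulated ages (illustrative only, NOT the Titanic data)
set.seed(1)
died     <- rnorm(50, mean = 30, sd = 14)
survived <- rnorm(40, mean = 28, sd = 15)

manual    <- welch_t(died, survived)
reference <- t.test(died, survived)      # Welch test is the default (var.equal = FALSE)

# The hand computation should agree with t.test()
all.equal(manual$t, unname(reference$statistic))
all.equal(manual$df, unname(reference$parameter))
all.equal(manual$p.value, reference$p.value)
```

On the real data, the same helper could be called as `welch_t(titanic.df$Age[titanic.df$Survived == 0], titanic.df$Age[titanic.df$Survived == 1])`, assuming `Survived` is coded 0/1 as in the output above.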