setwd("C:/Users/CJ With HP/Desktop/IIM Lucknow/Datasets")
titan.df <- read.csv("Titanic Data.csv")
aggregate(Age~Survived,data=titan.df,mean)
## Survived Age
## 1 0 30.41530
## 2 1 28.42382
boxplot(Age~Survived,data=titan.df, main="Survival v/s Age Boxplot")
Now consider the hypothesis that “the Titanic survivors were younger than the passengers who died.” Null hypothesis: the mean age of the survivors equals the mean age of the passengers who died, i.e. age is unrelated to survival.
t.test(Age~Survived,data = titan.df)
##
## Welch Two Sample t-test
##
## data: Age by Survived
## t = 2.1816, df = 667.56, p-value = 0.02949
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.1990628 3.7838912
## sample estimates:
## mean in group 0 mean in group 1
## 30.41530 28.42382
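Conclusion: Since the p-value (0.029) is below 0.05, we reject the null hypothesis at the 5% level. The mean age of the passengers who died (30.4) is higher than that of the survivors (28.4), and the 95% confidence interval for the difference (0.20 to 3.78 years) lies entirely above zero, so the data support the claim that “the Titanic survivors were younger than the passengers who died.”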
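The t-test above is two-sided (its alternative is "not equal to 0"), whereas the stated hypothesis is directional. A minimal sketch of a one-sided version follows, assuming the same titan.df data frame with Age and Survived columns; the helper name one_sided_age_test is hypothetical and not part of the original analysis.

```r
# Hypothetical helper (an illustration, not the original author's code):
# runs a one-sided Welch t-test of Age by Survived.
one_sided_age_test <- function(df) {
  # With the formula interface the groups are ordered by level (0, then 1),
  # so alternative = "greater" tests whether the mean age in group 0 (died)
  # is greater than the mean age in group 1 (survived).
  t.test(Age ~ Survived, data = df, alternative = "greater")
}

# Run only if the data frame loaded in the steps above is available.
if (exists("titan.df")) {
  print(one_sided_age_test(titan.df))
}
```

Because the observed t statistic is positive, the one-sided p-value should be about half of the two-sided value reported above (roughly 0.0147), which would lead to the same conclusion.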