setwd("C:/Users/Taiyyab Ali/Desktop/R language")
library(readr)
Titanic_Data <- read_csv("Titanic Data.csv")
## Parsed with column specification:
## cols(
## Survived = col_integer(),
## Pclass = col_integer(),
## Sex = col_character(),
## Age = col_double(),
## SibSp = col_integer(),
## Parch = col_integer(),
## Fare = col_double(),
## Embarked = col_character()
## )
# checking if file is accessible
#View(Titanic_Data)
aggregate(Titanic_Data$Age, by=list(Titanic_Data$Survived),mean)
## Group.1 x
## 1 0 30.41530
## 2 1 28.42382
H2: The Titanic survivors were younger than the passengers who died.
t.test(Titanic_Data$Age~Titanic_Data$Survived)
##
## Welch Two Sample t-test
##
## data: Titanic_Data$Age by Titanic_Data$Survived
## t = 2.1816, df = 667.56, p-value = 0.02949
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.1990628 3.7838912
## sample estimates:
## mean in group 0 mean in group 1
## 30.41530 28.42382
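The p-value of 0.02949 (approximately 0.03) is below the 0.05 significance level, so we reject the null hypothesis that the mean ages of survivors and non-survivors are equal. Because the test is two-sided, the direction comes from the sample means: survivors averaged 28.42 years versus 30.42 years for those who died. The data therefore support H2: survivors were, on average, younger than the passengers who died.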
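Since H2 is directional, a one-sided Welch test matches it more closely than the two-sided default. The sketch below is only an illustration on a small made-up data frame (`toy` and its values are hypothetical and are not taken from the Titanic file). It relies on `t.test()` ordering the groups by factor level, so with `Survived` coded 0/1, `alternative = "greater"` tests whether the mean age in group 0 exceeds the mean age in group 1.

```r
# Hypothetical toy data (NOT the Titanic file): ages for two groups.
toy <- data.frame(
  Survived = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
  Age      = c(40, 35, 50, 28, 45, 22, 30, 19, 27, 33)
)

# Two-sided Welch test (the default), as used in the analysis above.
two_sided <- t.test(Age ~ Survived, data = toy)

# One-sided Welch test: groups are ordered by level (0, then 1), so
# "greater" means mean(Age | Survived = 0) > mean(Age | Survived = 1),
# i.e. non-survivors are older, survivors are younger.
one_sided <- t.test(Age ~ Survived, data = toy, alternative = "greater")

two_sided$p.value
one_sided$p.value
```

When the t statistic points in the direction of the alternative, the one-sided p-value is half the two-sided one. Applied to the Titanic result above (t = 2.1816, two-sided p = 0.02949), the same call with `alternative = "greater"` would be expected to give a p-value of about 0.015.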