TASK 4a

Recall the Titanic Data.csv data associated with the “Sinking of the RMS Titanic” that you analyzed on WEEK 1, DAY 5

First, we download the file and place it in the working directory. We then read it into a data frame.

The original data contains 889 rows and 8 columns.

titanic.df <- read.csv("G:/R Intern/Titanic Data.csv")   # read the CSV into a data frame
View(titanic.df)   # view the data set
dim(titanic.df)    # confirm the data frame has the same dimensions as the Titanic data
## [1] 889   8

The dimensions match the expected 889 rows and 8 columns, so the data was read correctly.
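The read-and-verify step above can also be checked programmatically instead of by eye. The following is a minimal sketch, assuming a small made-up CSV written to a temporary file; `toy.df` and its values are hypothetical stand-ins, not the Titanic data.

```r
# Write a tiny, made-up CSV to a temporary file (hypothetical stand-in data)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Survived = c(0, 1, 1), Age = c(30, 25, 40)),
          tmp, row.names = FALSE)

# Read it back into a data frame
toy.df <- read.csv(tmp)

# Stop with an error if the dimensions are not the expected 3 rows by 2 columns
expected.dim <- c(3L, 2L)
stopifnot(identical(dim(toy.df), expected.dim))
```

For the real file, the expected dimensions would be 889 rows and 8 columns.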

TASK 4b

Use R to create a table showing the average age of the survivors and the average age of the people who died.

mytable <- aggregate(titanic.df$Age, list(titanic.df$Survived), mean)
mytable
##   Group.1        x
## 1       0 30.41530
## 2       1 28.42382

The average age of survivors was 28.42382.

The average age of people who died was 30.41530.
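In the table above, `Group.1` holds the `Survived` code (0 = died, 1 = survived) and `x` holds the mean age. As a minimal sketch, the formula interface of `aggregate` gives the same kind of table with self-describing column names. The data frame `toy.df` and its values are made up for illustration.

```r
# Hypothetical toy data standing in for titanic.df (values are made up)
toy.df <- data.frame(
  Survived = c(0, 0, 0, 1, 1, 1),
  Age      = c(40, 30, 20, 35, 25, 15)
)

# Mean of Age within each level of Survived; columns keep their original names
age.means <- aggregate(Age ~ Survived, data = toy.df, FUN = mean)
age.means
```

The result has one row per `Survived` value, with the columns named `Survived` and `Age` rather than `Group.1` and `x`.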

TASK 4c

Use R to run a t-test to test the following hypothesis: H2: The Titanic survivors were younger than the passengers who died.

The following call was run:

t.test(mytable,var.equal = TRUE)
## 
##  One Sample t-test
## 
## data:  mytable
## t = 1.7893, df = 3, p-value = 0.1715
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -11.64783  41.56739
## sample estimates:
## mean of x 
##  14.95978

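This output is a one-sample t-test on the four numbers stored in mytable (the group codes 0 and 1 and the two mean ages), testing whether their mean differs from 0. It does not compare the ages of the survivors with the ages of the passengers who died, so its p-value of 0.1715 neither supports nor refutes H2. In any case, a p-value above 0.05 would mean that the null hypothesis cannot be rejected, which is not evidence that H2 is true. Testing H2 requires a two-sample t-test on the Age values in titanic.df, grouped by Survived.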
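Note that the output above is labelled "One Sample t-test" and was computed on mytable, not on the individual ages. A more direct check of H2 is a one-sided two-sample t-test of Age by Survived. The following is a minimal sketch on made-up data: `toy.df` and its values are hypothetical, and `var.equal = TRUE` is carried over from the original call as an assumption.

```r
# Hypothetical toy data standing in for titanic.df (ages are made up)
toy.df <- data.frame(
  Survived = rep(c(0, 1), each = 6),
  Age      = c(22, 35, 41, 29, 54, 38,   # died (Survived = 0)
               19, 27, 33, 24, 45, 30)   # survived (Survived = 1)
)

# With the formula interface, the first group is Survived = 0 (died).
# alternative = "greater" tests: mean age of those who died > mean age of survivors,
# which is H2 stated from the side of the passengers who died.
result <- t.test(Age ~ Survived, data = toy.df,
                 alternative = "greater", var.equal = TRUE)
result
```

On the real data the call would use `data = titanic.df`. H2 would be supported at the 5% level only if the reported p-value is below 0.05. Dropping `var.equal = TRUE` gives Welch's test, which does not assume equal variances in the two groups.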

Thus this report is concluded.