TASK 2b - Reading the dataset

This section describes the Titanic data set and the operations performed on it.

First we download the file and place it in the working directory. Then we read it into a data frame.

We know that the original data contains 889 rows and 8 columns.

titanic.df<-read.csv("G:/R Intern/Titanic Data.csv")
View(titanic.df)        # view the data set
dim(titanic.df)         # confirm the data frame has the expected number of rows and columns
## [1] 889   8

Thus we confirm that the data frame has the same dimensions as the data given: 889 rows and 8 columns.
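The dimension check can also be made explicit in code, so that the script stops if the file does not have the expected shape. This is only a minimal sketch: the helper name read_checked is hypothetical (it is not part of the original script), and the demonstration uses a tiny temporary file so that it runs without the Titanic CSV.

```r
# Hypothetical helper: read a CSV and stop with an error if its
# dimensions differ from the expected ones.
read_checked <- function(path, expected_dim) {
  df <- read.csv(path)
  stopifnot(identical(dim(df), as.integer(expected_dim)))
  df
}

# Self-contained demonstration on a small temporary file.
demo_path <- tempfile(fileext = ".csv")
write.csv(data.frame(Survived = c(0, 1, 1), Pclass = c(3, 1, 2)),
          demo_path, row.names = FALSE)
demo.df <- read_checked(demo_path, c(3, 2))
dim(demo.df)
```

For the Titanic file the call would be read_checked("G:/R Intern/Titanic Data.csv", c(889, 8)).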

TASK 3a

Use R to count the total number of passengers on board the Titanic. We also count how many people died and how many survived. We do this in R:

mytable<-table(titanic.df$Survived)
margin.table(mytable)    # total number of passengers, whether they survived or not
## [1] 889
mytable     # 0 = died, 1 = survived
## 
##   0   1 
## 549 340

TASK 3b

Use R to count the number of passengers who survived the sinking of the Titanic. To count those who survived:

titanic.survived<-sum(titanic.df$Survived==1)
titanic.survived
## [1] 340

TASK 3c

Use R to measure the percentage of passengers who survived the sinking of the Titanic.

Percentage of those who survived (1) and those who died (0):

prop.table(mytable)*100
## 
##        0        1 
## 61.75478 38.24522

TASK 3d

Use R to count the number of first-class passengers who survived the sinking of the Titanic.

titanic.first.survived<-sum(titanic.df$Pclass==1 & titanic.df$Survived==1)
titanic.first.survived
## [1] 134

TASK 3e

Use R to measure the percentage of first-class passengers who survived the sinking of the Titanic. (Hint: You could use prop.table())

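titanic.table<-xtabs(~Survived+Pclass,data=titanic.df)
prop.table(titanic.table,margin=2)*100    # column percentages: each class sums to 100
##         Pclass
## Survived        1        2        3
##        0 37.38318 52.71739 75.76375
##        1 62.61682 47.28261 24.23625

So, as we see, 62.61682 percent of the first-class passengers survived (134 of 214). Note that margin=1 would instead give the percentage of survivors who were first class (134 of 340, or 39.41176 percent).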
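The margin argument of prop.table() decides which question the table answers: margin=1 makes each row (survival outcome) sum to 100, while margin=2 makes each column (passenger class) sum to 100. The sketch below shows the difference. It does not read the CSV; the counts are copied from the Survived by Pclass by Sex table printed under TASK 3f and summed over Sex, and the names class.counts, row.pct and col.pct are illustrative only.

```r
# Counts summed over Sex from the table under TASK 3f,
# e.g. first class, survived: 89 + 45 = 134; first class, died: 3 + 77 = 80.
class.counts <- matrix(c(80, 134, 97, 87, 372, 119), nrow = 2,
                       dimnames = list(Survived = c("0", "1"),
                                       Pclass = c("1", "2", "3")))

row.pct <- prop.table(class.counts, margin = 1) * 100  # each row sums to 100
col.pct <- prop.table(class.counts, margin = 2) * 100  # each column sums to 100

row.pct["1", "1"]  # percent of survivors who were first class
col.pct["1", "1"]  # percent of first-class passengers who survived
```

The second value is the one TASK 3e asks for, since the denominator is the number of first-class passengers rather than the number of survivors.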

TASK 3f

Use R to count the number of females from First-Class who survived the sinking of the Titanic

mytable<-xtabs(~Survived+Pclass+Sex,data=titanic.df)
ftable(mytable)
##                 Sex female male
## Survived Pclass                
## 0        1               3   77
##          2               6   91
##          3              72  300
## 1        1              89   45
##          2              70   17
##          3              72   47

We see that 89 women from first class survived.
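Instead of reading the number off the printed table, a single cell can be extracted by its dimension names. The sketch below rebuilds the three-way table from the counts printed above so that it runs on its own; the name counts is illustrative.

```r
# Counts copied from the ftable output above. Array values fill with the
# first dimension (Survived) varying fastest, then Pclass, then Sex.
counts <- array(c(3, 89, 6, 70, 72, 72,       # female: died/survived in class 1, 2, 3
                  77, 45, 91, 17, 300, 47),   # male:   died/survived in class 1, 2, 3
                dim = c(2, 3, 2),
                dimnames = list(Survived = c("0", "1"),
                                Pclass = c("1", "2", "3"),
                                Sex = c("female", "male")))

# Survived = 1, Pclass = 1, Sex = female
counts["1", "1", "female"]
```

On the real xtabs object the same indexing applies, i.e. mytable["1", "1", "female"]. An equivalent count straight from the data frame would be sum(titanic.df$Survived==1 & titanic.df$Pclass==1 & titanic.df$Sex=="female").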

TASK 3g and TASK 3h

Use R to measure the percentage of survivors who were female

Use R to measure the percentage of females on board the Titanic who survived

mytable<-table(titanic.df$Survived,titanic.df$Sex)
titanic.survivers<-prop.table(mytable,margin=1)*100     # row percentages: row 1 gives the percent of survivors who were female
titanic.survivers
##    
##       female     male
##   0 14.75410 85.24590
##   1 67.94118 32.05882
prop.table(mytable,margin=2)*100    # column percentages: the female column gives the percent of females who survived
##    
##       female     male
##   0 25.96154 81.10919
##   1 74.03846 18.89081

67.94% of the survivors were female.

74.04% of the females survived.
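As a cross-check, both percentages can be computed without prop.table() by taking the mean of a logical vector, which makes the denominator of each question explicit. The sketch rebuilds one row per passenger from the counts implied by the table under TASK 3f (231 female and 109 male survivors, 81 female and 468 male deaths); sex.counts, long and passengers are illustrative names.

```r
# Survived x Sex counts, summed over Pclass from the table under TASK 3f.
sex.counts <- as.table(matrix(c(81, 231, 468, 109), nrow = 2,
                              dimnames = list(Survived = c("0", "1"),
                                              Sex = c("female", "male"))))

# Long format (one row per cell), then one row per passenger.
long <- as.data.frame(sex.counts)
passengers <- long[rep(seq_len(nrow(long)), long$Freq), c("Survived", "Sex")]

# Denominator = survivors: percent of survivors who were female.
pct.survivors.female <- mean(passengers$Sex[passengers$Survived == "1"] == "female") * 100
# Denominator = females: percent of females who survived.
pct.females.survived <- mean(passengers$Survived[passengers$Sex == "female"] == "1") * 100

round(c(pct.survivors.female, pct.females.survived), 2)
```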

TASK 3i

Run a Pearson’s Chi-squared test to test the following hypothesis:

Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

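chisq.test(mytable)    # run the test on the counts, not on the percentage table
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16

As p < 0.01, we reject the null hypothesis that survival was independent of sex. Together with the percentages from TASK 3h (74.04% of females survived against 18.89% of males), this supports the hypothesis that the proportion of females onboard who survived was higher than the proportion of males onboard who survived.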
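Two points about the test are worth illustrating. Pearson's chi-squared test expects counts rather than percentages, and it is two-sided, so on its own it shows an association but not its direction. The sketch below runs the test on counts and then, as one possible way to test the direction "higher than", uses prop.test() with alternative="greater". That second test is an addition for illustration, not the method the task asks for. The counts are the Survived by Sex totals implied by the table under TASK 3f, and the names sex.counts, chi and one.sided are illustrative.

```r
# Survived x Sex counts (not percentages).
sex.counts <- matrix(c(81, 231, 468, 109), nrow = 2,
                     dimnames = list(Survived = c("0", "1"),
                                     Sex = c("female", "male")))

# Two-sided test of association between survival and sex.
chi <- chisq.test(sex.counts)

# One-sided test of two proportions.
# H1: survival proportion of females > survival proportion of males.
one.sided <- prop.test(x = sex.counts["1", ],     # survivors per sex
                       n = colSums(sex.counts),   # passengers per sex
                       alternative = "greater")

chi$statistic
one.sided$estimate   # estimated survival proportions: female, then male
one.sided$p.value
```

A small one-sided p-value, together with a larger estimated proportion for females, is what supports the directional hypothesis.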