This report walks through some basic operations on the Titanic data set.
First, download the file and place it in the working directory. Then read it into a data frame.
The original data contains 889 rows and 8 columns.
titanic.df <- read.csv("G:/R Intern/Titanic Data.csv") # read the CSV into a data frame
View(titanic.df) # open the data set in the viewer
dim(titanic.df) # confirm the dimensions match the original data
## [1] 889 8
This confirms that the data frame has the same dimensions as the original data.
Use R to count the total number of passengers on board the Titanic. We also count how many people died and how many survived:
mytable <- table(titanic.df$Survived)
margin.table(mytable) # total number of passengers, whether they survived or not
## [1] 889
mytable # 0 = died, 1 = survived
##
##   0   1
## 549 340
Use R to count the number of passengers who survived the sinking of the Titanic. Survivors are coded as 1, so we count the rows where Survived equals 1:
titanic.survived<-sum(titanic.df$Survived==1)
titanic.survived
## [1] 340
Use R to measure the percentage of passengers who survived the sinking of the Titanic.
Percentage of those who died (0) and those who survived (1):
prop.table(mytable)*100
##
## 0 1
## 61.75478 38.24522
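The survival percentage can also be computed without building a table, since the mean of a logical vector is the proportion of TRUE values. The sketch below is self-contained: it rebuilds a stand-in for the Survived column from the counts shown above, and the names `survived` and `pct.survived` are illustrative only.

```r
# Rebuild a stand-in for titanic.df$Survived from the counts above
# (549 died, 340 survived); this vector is an assumption for illustration.
survived <- rep(c(0, 1), times = c(549, 340))

# The mean of a logical vector is the proportion of TRUE values
pct.survived <- mean(survived == 1) * 100
round(pct.survived, 2)
```

On the real data frame the same idea would read `mean(titanic.df$Survived == 1) * 100`.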
Use R to count the number of first-class passengers who survived the sinking of the Titanic.
titanic.first.survived<-sum(titanic.df$Pclass==1 & titanic.df$Survived==1)
titanic.first.survived
## [1] 134
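Use R to measure the percentage of first-class passengers who survived the sinking of the Titanic. (Hint: You could use prop.table().)
titanic.table <- xtabs(~Survived+Pclass, data=titanic.df)
prop.table(titanic.table, margin=2)*100 # column percentages: each class sums to 100
##         Pclass
## Survived        1        2        3
##        0 37.38318 52.71739 75.76375
##        1 62.61682 47.28261 24.23625
So 62.62 percent of first-class passengers survived (134 of 214). Note that margin=1 would instead give the share of survivors who travelled first class (39.41 percent), which answers a different question.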
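To make the role of the margin argument concrete, here is a minimal sketch that types in the Survived by Pclass counts by hand (they follow from the tables in this report) and computes both row and column percentages. The names `counts`, `by.row` and `by.col` are illustrative and not part of the original script.

```r
# Survived (rows) by Pclass (columns), entered by hand from the report's
# tables; matrix() fills column by column.
counts <- matrix(c(80, 134, 97, 87, 372, 119), nrow = 2,
                 dimnames = list(Survived = c("0", "1"),
                                 Pclass = c("1", "2", "3")))

by.row <- prop.table(counts, margin = 1) * 100  # each row sums to 100
by.col <- prop.table(counts, margin = 2) * 100  # each column sums to 100

by.row["1", "1"]  # percent of survivors who travelled first class
by.col["1", "1"]  # percent of first-class passengers who survived
```

Row percentages describe the make-up of the survivors, while column percentages describe the survival rate within each class.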
Use R to count the number of females from First-Class who survived the sinking of the Titanic
mytable<-xtabs(~Survived+Pclass+Sex,data=titanic.df)
ftable(mytable)
##                 Sex female male
## Survived Pclass
## 0        1               3   77
##          2               6   91
##          3              72  300
## 1        1              89   45
##          2              70   17
##          3              72   47
We see that 89 female passengers from first class survived.
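Rather than reading the number off the printed table, a single cell can be pulled out of a three-way table by indexing with the level names. The sketch below rebuilds the counts as a plain array so that it runs on its own; `counts3` and `first.female.survived` are illustrative names, and the level labels are assumed to match those printed by ftable().

```r
# Survived x Pclass x Sex counts copied from the ftable output.
# array() fills with the first index (Survived) varying fastest, then Pclass.
counts3 <- array(c(3, 89, 6, 70, 72, 72,      # female
                   77, 45, 91, 17, 300, 47),  # male
                 dim = c(2, 3, 2),
                 dimnames = list(Survived = c("0", "1"),
                                 Pclass = c("1", "2", "3"),
                                 Sex = c("female", "male")))

# Index by level name: survived = "1", class = "1", sex = "female"
first.female.survived <- counts3["1", "1", "female"]
first.female.survived
```

The same indexing should work on the xtabs object itself, for example `mytable["1", "1", "female"]`.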
Use R to measure the percentage of survivors who were female
Use R to measure the percentage of females on board the Titanic who survived
mytable<-table(titanic.df$Survived,titanic.df$Sex)
titanic.survivers<-prop.table(mytable,margin=1)*100 # percent of survivors who were females
titanic.survivers
##
## female male
## 0 14.75410 85.24590
## 1 67.94118 32.05882
prop.table(mytable,margin=2)*100 #percent of females who survived
##
## female male
## 0 25.96154 81.10919
## 1 74.03846 18.89081
67.94% of the survivors were female.
74.04% of the females on board survived.
Run a Pearson’s Chi-squared test to test the following hypothesis:
Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.
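chisq.test(mytable) # the test must be run on the table of counts, not on the percentage table
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16
As p < 0.01, we reject the null hypothesis that survival is independent of sex. The chi-squared test itself is not directional; the direction comes from the proportions computed earlier: 74.04% of females survived against 18.89% of males. The data therefore support the hypothesis that the proportion of females on board who survived was higher than the proportion of males who survived.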
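Because the hypothesis is directional while Pearson's chi-squared test is not, one option is a one-sided two-sample test of proportions with prop.test(). This is offered as a sketch and an assumption about how the hypothesis could be tested, not as part of the original analysis. The counts are taken from the tables in this report, and `survived`, `total` and `result` are illustrative names.

```r
# Survivors and group sizes by sex, taken from the report's tables
survived <- c(female = 231, male = 109)
total    <- c(female = 312, male = 577)

# Alternative hypothesis: P(survive | female) > P(survive | male)
result <- prop.test(x = survived, n = total, alternative = "greater")

result$estimate  # sample survival proportion in each group
result$p.value   # one-sided p-value
```

A small p-value here supports the directional claim directly, whereas the chi-squared test only indicates that survival and sex are associated.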