This report calculates several survival statistics for the sinking of the Titanic.
First, I loaded the required packages and the data set. The output is shown below:
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
##
## The following objects are masked from 'package:stats':
##
## filter, lag
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
##
## Loading required package: ggvis
## Loading required package: magrittr
## Source: local data frame [2,201 x 4]
##
## Class Age Sex Survive
## 1 1 1 1 1
## 2 1 1 1 1
## 3 1 1 1 1
## 4 1 1 1 1
## 5 1 1 1 1
## 6 1 1 1 1
## 7 1 1 1 1
## 8 1 1 1 1
## 9 1 1 1 1
## 10 1 1 1 1
## .. ... ... ... ...
Then I wrote the code needed to calculate each of the following:
1. The total number of passengers in the set. I selected the Class category and counted the rows, one per passenger.
totalpass <- count(select(ttsv1, Class))
totalpass
## Source: local data frame [1 x 1]
##
## n
## 1 2201
2. The proportion of passengers surviving. I grouped the data by the Survive category, counted the rows in each group, and created a new variable to calculate the proportion.
finpropsurv <- group_by(ttsv1, Survive) %>% summarise(n = n()) %>% mutate(prop = n / sum(n))
finpropsurv
## Source: local data frame [2 x 3]
##
## Survive n prop
## 1 0 1490 0.676965
## 2 1 711 0.323035
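As a cross-check on the proportion above, the same figure can be reproduced in base R from the two counts in the output. This is a minimal sketch, not the code used for the report: the vector `survive` is a hypothetical stand-in for the Survive column of `ttsv1`, rebuilt from the printed counts.

```r
# Hypothetical stand-in for ttsv1$Survive, rebuilt from the printed counts:
# 1490 rows coded 0 (did not survive) and 711 rows coded 1 (survived).
survive <- rep(c(0, 1), times = c(1490, 711))

# Total number of people in the data set
total <- length(survive)

# The mean of a logical vector is the share of TRUE values,
# so this is the proportion of survivors.
prop_survived <- mean(survive == 1)

print(total)
print(round(prop_survived, 6))
```

Taking the mean of `survive == 1` is a common shortcut for a single proportion; the grouped approach used in the report is more convenient once several categories are involved.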
3. The proportion of passengers surviving for each class. I grouped the data by the Class and Survive categories, then summarized it and created a new variable to calculate the proportion. Because summarise() drops the last grouping level, sum(n) is taken within each class, so the two proportions for a class add up to 1.
propsurvclass <- group_by(ttsv1, Class, Survive == 1) %>% summarise(n = n()) %>% mutate(prop = n / sum(n))
propsurvclass
## Source: local data frame [8 x 4]
## Groups: Class
##
## Class Survive == 1 n prop
## 1 0 FALSE 673 0.7604520
## 2 0 TRUE 212 0.2395480
## 3 1 FALSE 122 0.3753846
## 4 1 TRUE 203 0.6246154
## 5 2 FALSE 167 0.5859649
## 6 2 TRUE 118 0.4140351
## 7 3 FALSE 528 0.7478754
## 8 3 TRUE 178 0.2521246
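The within-class proportions above can be reproduced in base R from the printed counts alone. The sketch below is an illustration under stated assumptions rather than the report's own code: the data frame `counts` and its column name `Survived` are hypothetical, and the values are copied from the output above.

```r
# Counts copied from the Class x Survive output above.
# 'counts' and the column name 'Survived' are illustrative, not from the report.
counts <- data.frame(
  Class    = c(0, 0, 1, 1, 2, 2, 3, 3),
  Survived = c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE),
  n        = c(673, 212, 122, 203, 167, 118, 528, 178)
)

# ave() applies sum() separately within each Class and returns a vector
# aligned with the rows, so each n is divided by its own class total.
counts$prop <- counts$n / ave(counts$n, counts$Class, FUN = sum)

print(counts)
```

Dividing by the class total rather than the overall total is what makes the two rows for each class sum to 1.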
4. The proportion of passengers surviving for each sex. I grouped the data by the Sex and Survive categories, then summarized it and created a new variable to calculate the proportion. The data below indicates that women had the higher survival rate (about 73%, versus about 21% for men).
propsurvsex <- group_by(ttsv1, Sex, Survive == 1) %>% summarise(n = n()) %>% mutate(prop = n / sum(n))
propsurvsex
## Source: local data frame [4 x 4]
## Groups: Sex
##
## Sex Survive == 1 n prop
## 1 0 FALSE 126 0.2680851
## 2 0 TRUE 344 0.7319149
## 3 1 FALSE 1364 0.7879838
## 4 1 TRUE 367 0.2120162
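Survivor counts and survival rates can point in different directions when the groups differ in size, as they do here. The base R sketch below separates the two, using the counts printed above. It assumes that Sex 0 codes female and Sex 1 codes male, which is inferred from the interpretation given in the text; the data frame `sex_counts` is hypothetical.

```r
# Counts copied from the Sex x Survive output above.
# Assumption: Sex 0 = female, Sex 1 = male (inferred from the report's text).
sex_counts <- data.frame(
  Sex      = c(0, 0, 1, 1),
  Survived = c(FALSE, TRUE, FALSE, TRUE),
  n        = c(126, 344, 1364, 367)
)

# Number of survivors per sex: n * Survived keeps n for survivors, 0 otherwise.
survivors <- with(sex_counts, tapply(n * Survived, Sex, sum))

# Group size per sex, and the survival rate as survivors / group size.
group_size <- with(sex_counts, tapply(n, Sex, sum))
rate <- survivors / group_size

print(survivors)
print(round(rate, 4))

# Ratio of the female survival rate to the male survival rate.
print(round(rate[["0"]] / rate[["1"]], 2))
```

The counts of survivors are close for the two sexes, while the rates differ widely because far more men than women were on board.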
5. The proportion of passengers surviving for each age group. I grouped the data by the Age and Survive categories, then summarized it and created a new variable to calculate the proportion. The data below indicates that adults had the lower survival rate (about 31%, versus about 52% for children).
propsurvage <- group_by(ttsv1, Age, Survive == 1) %>% summarise(n = n()) %>% mutate(prop = n / sum(n))
propsurvage
## Source: local data frame [4 x 4]
## Groups: Age
##
## Age Survive == 1 n prop
## 1 0 FALSE 52 0.4770642
## 2 0 TRUE 57 0.5229358
## 3 1 FALSE 1438 0.6873805
## 4 1 TRUE 654 0.3126195
6. The proportion of passengers surviving for each age/sex category. I grouped the data by the Age, Sex and Survive categories, then summarized it and created a new variable to calculate the proportion. The data below indicates that adult females were most likely to survive (about 74%) and adult males were least likely to survive (about 20%).
propsurvagesex <- group_by(ttsv1, Age, Sex, Survive == 1) %>% summarise(n = n()) %>% mutate(prop = n / sum(n))
propsurvagesex
## Source: local data frame [8 x 5]
## Groups: Age, Sex
##
## Age Sex Survive == 1 n prop
## 1 0 0 FALSE 17 0.3777778
## 2 0 0 TRUE 28 0.6222222
## 3 0 1 FALSE 35 0.5468750
## 4 0 1 TRUE 29 0.4531250
## 5 1 0 FALSE 109 0.2564706
## 6 1 0 TRUE 316 0.7435294
## 7 1 1 FALSE 1329 0.7972406
## 8 1 1 TRUE 338 0.2027594
7. The proportion of passengers surviving for each age/sex/class category. I grouped the data by the Class, Age, Sex and Survive categories, then summarized it and created a new variable to calculate the proportion. The data below indicates that adult male crew members (Class 0) account for the largest number of deaths (670), while adult males in second class had the highest mortality rate (about 92%). Financial status, working status and gender may all have contributed to these differences.
propsurvagesexclass <- group_by(ttsv1, Class, Age, Sex, Survive == 1) %>% summarise(n = n()) %>% mutate(prop = n / sum(n))
propsurvagesexclass
## Source: local data frame [24 x 6]
## Groups: Class, Age, Sex
##
## Class Age Sex Survive == 1 n prop
## 1 0 1 0 FALSE 3 0.13043478
## 2 0 1 0 TRUE 20 0.86956522
## 3 0 1 1 FALSE 670 0.77726218
## 4 0 1 1 TRUE 192 0.22273782
## 5 1 0 0 TRUE 1 1.00000000
## 6 1 0 1 TRUE 5 1.00000000
## 7 1 1 0 FALSE 4 0.02777778
## 8 1 1 0 TRUE 140 0.97222222
## 9 1 1 1 FALSE 118 0.67428571
## 10 1 1 1 TRUE 57 0.32571429
## .. ... ... ... ... ... ...
print(propsurvagesexclass, n = 50)
## Source: local data frame [24 x 6]
## Groups: Class, Age, Sex
##
## Class Age Sex Survive == 1 n prop
## 1 0 1 0 FALSE 3 0.13043478
## 2 0 1 0 TRUE 20 0.86956522
## 3 0 1 1 FALSE 670 0.77726218
## 4 0 1 1 TRUE 192 0.22273782
## 5 1 0 0 TRUE 1 1.00000000
## 6 1 0 1 TRUE 5 1.00000000
## 7 1 1 0 FALSE 4 0.02777778
## 8 1 1 0 TRUE 140 0.97222222
## 9 1 1 1 FALSE 118 0.67428571
## 10 1 1 1 TRUE 57 0.32571429
## 11 2 0 0 TRUE 13 1.00000000
## 12 2 0 1 TRUE 11 1.00000000
## 13 2 1 0 FALSE 13 0.13978495
## 14 2 1 0 TRUE 80 0.86021505
## 15 2 1 1 FALSE 154 0.91666667
## 16 2 1 1 TRUE 14 0.08333333
## 17 3 0 0 FALSE 17 0.54838710
## 18 3 0 0 TRUE 14 0.45161290
## 19 3 0 1 FALSE 35 0.72916667
## 20 3 0 1 TRUE 13 0.27083333
## 21 3 1 0 FALSE 89 0.53939394
## 22 3 1 0 TRUE 76 0.46060606
## 23 3 1 1 FALSE 387 0.83766234
## 24 3 1 1 TRUE 75 0.16233766
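To compare the age/sex/class groups directly, the 24 rows printed above can be re-entered and ranked by mortality rate and by number of deaths. This is a base R sketch under stated assumptions, not part of the original analysis: the data frame `tab` and the columns `Survived`, `died` and `mortality` are hypothetical names, and the counts are copied from the full output above.

```r
# Counts copied row by row from the full 24-row output above.
# 'tab' and its derived columns are illustrative names, not from the report.
tab <- data.frame(
  Class = c(0, 0, 0, 0,  1, 1, 1, 1, 1, 1,  2, 2, 2, 2, 2, 2,
            3, 3, 3, 3, 3, 3, 3, 3),
  Age   = c(1, 1, 1, 1,  0, 0, 1, 1, 1, 1,  0, 0, 1, 1, 1, 1,
            0, 0, 0, 0, 1, 1, 1, 1),
  Sex   = c(0, 0, 1, 1,  0, 1, 0, 0, 1, 1,  0, 1, 0, 0, 1, 1,
            0, 0, 1, 1, 0, 0, 1, 1),
  Survived = c(FALSE, TRUE, FALSE, TRUE,
               TRUE, TRUE, FALSE, TRUE, FALSE, TRUE,
               TRUE, TRUE, FALSE, TRUE, FALSE, TRUE,
               FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE),
  n = c(3, 20, 670, 192,  1, 5, 4, 140, 118, 57,  13, 11, 13, 80, 154, 14,
        17, 14, 35, 13, 89, 76, 387, 75)
)

# Deaths per row: n for rows with Survived == FALSE, 0 otherwise.
tab$died <- ifelse(tab$Survived, 0, tab$n)

# Collapse to one row per Class/Age/Sex group with group size and deaths.
groups <- aggregate(cbind(n, died) ~ Class + Age + Sex, data = tab, FUN = sum)
groups$mortality <- groups$died / groups$n

# Rank groups from highest to lowest mortality rate.
ranked <- groups[order(-groups$mortality), ]
print(ranked)

# Group with the most deaths in absolute terms.
print(groups[which.max(groups$died), ])
```

Ranking by rate and ranking by count answer different questions, so it is worth looking at both before drawing conclusions about which group fared worst.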
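H. In summary, according to the data there were 711 survivors of the Titanic, approximately 32% of the 2,201 people in the data set. Slightly more males (367) than females (344) survived in absolute numbers, but the survival rate for females (about 73%) was roughly three and a half times that for males (about 21%). Furthermore, the higher a passenger's class, the greater the chance of survival: about 62% in first class, 41% in second class and 25% in third class.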