## Loading required package: ggvis
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## 
## Loading required package: magrittr

1. Downloaded CSV file

The full dataset had to be downloaded before any of the questions could be analyzed. The file was read with the read.csv function, and the resulting data frame was converted to a dplyr table with tbl_df and stored as titanic1.
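A minimal sketch of this loading step follows. It assumes the column names Class, Age, Sex, and Survive that appear in the output later in this report. The temporary file and its three rows are hypothetical stand-ins for the downloaded CSV, and as_tibble is used because tbl_df is deprecated in recent dplyr releases.

```r
library(dplyr)

# Hypothetical stand-in for the downloaded CSV (three made-up rows).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Class,Age,Sex,Survive",
             "1,1,0,1",
             "3,1,1,0",
             "0,1,1,0"), csv_path)

# Read the file into a plain data frame, then convert it to a dplyr table.
titanic_raw <- read.csv(csv_path)
titanic1 <- as_tibble(titanic_raw)  # older dplyr: tbl_df(titanic_raw)
```

With the real file, csv_path would be replaced by the local path or URL of the dataset.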

2. Total Number of Passengers in the Dataset

summarise(titanic1, count=n())
##   count
## 1  2201

The remaining questions all rely on relative frequencies, so the total number of passengers (n = 2201) is needed to obtain the proportion in each category. For each question, the data are grouped by the relevant variables, the rows in each group are counted, and the mutate function adds the relative frequency as a new variable in the summarised result.
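The group, count, and mutate pattern described above can be sketched on a reconstructed column. The data frame toy is a hypothetical stand-in that rebuilds only the Survive variable from the counts reported in this analysis (711 survivors out of 2201); it is not the original dataset.

```r
library(dplyr)

# Rebuild only the Survive column from the reported counts:
# 711 survivors out of 2201 people.
toy <- data.frame(Survive = rep(c(1, 0), times = c(711, 2201 - 711)))

rel_freq <- toy %>%
  group_by(Survive) %>%        # one group per category
  summarise(n = n()) %>%       # count the rows in each group
  mutate(freq = n / sum(n))    # divide each count by the overall total
```

Because there is only one grouping variable here, the summarised result is no longer grouped, so sum(n) is the overall total of 2201.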

3. Calculate the total proportion of passengers surviving

titanic1 %>% group_by(Survive) %>% summarise(n = n()) %>% mutate(freq = n / sum(n)) %>% filter(Survive == 1)
## Source: local data frame [1 x 3]
## 
##   Survive   n     freq
## 1       1 711 0.323035

About 32% of the passengers (711 of 2201) survived.

4. Calculate the total proportion of passengers surviving for each class of passenger

titanic1 %>% group_by(Class, Survive) %>% summarise(n = n()) %>% mutate(freq = n / sum(n)) %>% filter(Survive == 1)
## Source: local data frame [4 x 4]
## Groups: Class
## 
##   Class Survive   n      freq
## 1     0       1 212 0.2395480
## 2     1       1 203 0.6246154
## 3     2       1 118 0.4140351
## 4     3       1 178 0.2521246

From this analysis, first class (Class 1) had the highest proportion of people surviving, at 62%.
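The per-class proportions depend on how summarise treats groups: with two grouping variables it drops only the last one, so the result is still grouped by Class and sum(n) is the class total rather than 2201. The sketch below makes that explicit. The data frame by_class is a hypothetical reconstruction from the counts above (each class total is n / freq, giving 885, 325, 285, and 706), and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Counts reconstructed from the output above; class totals are n / freq.
by_class <- data.frame(
  Class   = rep(0:3, each = 2),
  Survive = rep(c(0, 1), times = 4),
  n       = c(885 - 212, 212, 325 - 203, 203, 285 - 118, 118, 706 - 178, 178)
)

class_rates <- by_class %>%
  group_by(Class, Survive) %>%
  # sum(n) replaces n() because these rows are already counts
  summarise(n = sum(n), .groups = "drop_last") %>%  # still grouped by Class
  mutate(freq = n / sum(n)) %>%                     # sum(n) is the class total
  ungroup() %>%
  filter(Survive == 1)
```

Dropping the last grouping variable is also the default behaviour, which is why the one-line pipeline above produces within-class proportions.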

5. Calculate the proportion of passengers surviving for each sex category

titanic1 %>% group_by(Sex, Survive) %>% summarise(n = n()) %>% mutate(freq = n / sum(n)) %>% filter(Survive == 1)
## Source: local data frame [2 x 4]
## Groups: Sex
## 
##   Sex Survive   n      freq
## 1   0       1 344 0.7319149
## 2   1       1 367 0.2120162

In this analysis, women (Sex 0) had the higher survival rate: 73% of women survived, compared with 21% of men.
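The 73% figure is a survival rate within each sex, not a share of all survivors. The sketch below contrasts the two quantities. The data frame by_sex is a hypothetical reconstruction from the output above (344 / 0.7319149 gives 470 women and 367 / 0.2120162 gives 1731 men), and the column names rate_within_sex and share_of_survivors are introduced here for illustration.

```r
library(dplyr)

# Counts reconstructed from the output above; Sex 0 = female, Sex 1 = male.
by_sex <- data.frame(
  Sex     = c(0, 0, 1, 1),
  Survive = c(1, 0, 1, 0),
  n       = c(344, 470 - 344, 367, 1731 - 367)
)

rates <- by_sex %>%
  group_by(Sex) %>%
  mutate(rate_within_sex = n / sum(n)) %>%     # survivors / everyone of that sex
  ungroup() %>%
  filter(Survive == 1) %>%
  mutate(share_of_survivors = n / sum(n))      # survivors / all 711 survivors
```

Under these counts, women have the higher rate within their group, while men make up slightly more than half of all survivors because there were far more men on board.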

6. Calculate the proportion of passengers surviving for each age category

titanic1 %>% group_by(Age, Survive) %>% summarise(n = n()) %>% mutate(freq = n / sum(n)) %>% filter(Survive == 1)
## Source: local data frame [2 x 4]
## Groups: Age
## 
##   Age Survive   n      freq
## 1   0       1  57 0.5229358
## 2   1       1 654 0.3126195

From this analysis, children (Age 0) had a higher chance of survival than adults: 52% of children survived, compared with 31% of adults.

7. Calculate the proportion of passengers surviving for each age/sex category

titanic1 %>% group_by(Age, Sex, Survive) %>% summarise(n = n()) %>% mutate(freq = n / sum(n)) %>% filter(Survive == 1)
## Source: local data frame [4 x 5]
## Groups: Age, Sex
## 
##   Age Sex Survive   n      freq
## 1   0   0       1  28 0.6222222
## 2   0   1       1  29 0.4531250
## 3   1   0       1 316 0.7435294
## 4   1   1       1 338 0.2027594

From this analysis, adult females had the highest survival rate on the Titanic (74%), followed by female children (62%) and then male children (45%). Adult males had the lowest chance of surviving, at 20%.

8. Calculate the proportion of passengers surviving for each age/sex/class category

titanic1 %>% group_by(Class, Age, Sex, Survive) %>% summarise(n = n()) %>% mutate(freq = n / sum(n)) %>% filter(Survive == 1)
## Source: local data frame [14 x 6]
## Groups: Class, Age, Sex
## 
##    Class Age Sex Survive   n       freq
## 1      0   1   0       1  20 0.86956522
## 2      0   1   1       1 192 0.22273782
## 3      1   0   0       1   1 1.00000000
## 4      1   0   1       1   5 1.00000000
## 5      1   1   0       1 140 0.97222222
## 6      1   1   1       1  57 0.32571429
## 7      2   0   0       1  13 1.00000000
## 8      2   0   1       1  11 1.00000000
## 9      2   1   0       1  80 0.86021505
## 10     2   1   1       1  14 0.08333333
## 11     3   0   0       1  14 0.45161290
## 12     3   0   1       1  13 0.27083333
## 13     3   1   0       1  76 0.46060606
## 14     3   1   1       1  75 0.16233766

These data show that adult males in second class had the highest mortality rate on board the Titanic: only 8% of that group (14 of 168) survived.
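The ranking of adult male groups can be checked by converting survival proportions to mortality and sorting. In this sketch the data frame adult_males holds the four adult male rows (Age 1, Sex 1) copied from the table above; the column name mortality is introduced here for illustration.

```r
library(dplyr)

# Adult male rows (Age 1, Sex 1) copied from the table above.
adult_males <- data.frame(
  Class = c(0, 1, 2, 3),
  n     = c(192, 57, 14, 75),
  freq  = c(0.22273782, 0.32571429, 0.08333333, 0.16233766)
)

ranked <- adult_males %>%
  mutate(mortality = 1 - freq) %>%  # proportion who did not survive
  arrange(desc(mortality))          # highest mortality first
```

The first row of ranked is Class 2 and the second is Class 3, matching the ordering discussed in this report.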

Summary

In summary, this analysis suggests that the decisions made by leadership as the ship was sinking favored women and children, particularly those in first class. Second class adult males had the highest mortality rate on the ship (92%), and third class adult males had the second highest (84%).

Sources

The dataset for this analysis was provided by the professor and can be found here: http://www.personal.psu.edu/dlp/w540/datasets/titanicsurvival.csv

Per the assignment, the data come from two different reports:

* “Report on the Loss of the ‘Titanic’ (S.S.)” (1990), British Board of Trade Inquiry Report (reprint), Gloucester, UK: Allan Sutton Publishing.
* Dawson, R. J. M. (1995). The ‘unusual episode’ data revisited. Journal of Statistics Education [on-line] 3(3). (http://www.amstat.org/publications/jse/v3n3/datasets.dawson.html)

I also utilized the following website to assist in the mutate and summarise functions to complete the assignment: