1. Read a .csv file containing data elements about Titanic travelers from http://www.personal.psu.edu/dlp/w540/datasets/titanicsurvival.csv into an R dataset. These data might be available in other locations, but you must read this .csv file from the source provided here. This dataset contains four variables and has no missing data:

Class - (0 = crew, 1 = first class, 2 = second class, 3 = third class); the passenger classes pertain to the quality and types of cabins on the Titanic

Age - (1 = adult, 0 = child)

Sex - (1 = male, 0 = female)

Survived - (1 = yes, 0 = no); the column is named Survive in the file

require(dplyr)
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
titanic<-read.csv(url("http://www.personal.psu.edu/dlp/w540/datasets/titanicsurvival.csv"))
Titanic<-tbl_df(titanic)
Titanic
## Source: local data frame [2,201 x 4]
## 
##    Class Age Sex Survive
## 1      1   1   1       1
## 2      1   1   1       1
## 3      1   1   1       1
## 4      1   1   1       1
## 5      1   1   1       1
## 6      1   1   1       1
## 7      1   1   1       1
## 8      1   1   1       1
## 9      1   1   1       1
## 10     1   1   1       1
## ..   ... ... ...     ...
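If the course URL is unreachable, the coding scheme can still be cross-checked offline. The sketch below assumes that R's built-in `datasets::Titanic` table holds the same 2,201 records (an assumption, not something the assignment states) and expands it into one row per person using the 0/1 codes described above. The name `titanic_local` is hypothetical, and this does not replace reading the required .csv file.

```r
# Assumption: datasets::Titanic matches the course CSV (same 2,201 records).
counts <- as.data.frame(datasets::Titanic)  # columns: Class, Sex, Age, Survived, Freq

# Repeat each combination Freq times to get one row per person
rows <- counts[rep(seq_len(nrow(counts)), counts$Freq), ]

# Recode the labelled factors into the numeric codes used in this assignment
titanic_local <- data.frame(
  Class   = ifelse(rows$Class == "Crew", 0L, as.integer(rows$Class)),  # 1st/2nd/3rd -> 1/2/3
  Age     = as.integer(rows$Age == "Adult"),     # 1 = adult, 0 = child
  Sex     = as.integer(rows$Sex == "Male"),      # 1 = male, 0 = female
  Survive = as.integer(rows$Survived == "Yes")   # 1 = yes, 0 = no
)

nrow(titanic_local)
anyNA(titanic_local)  # checks the "no missing data" claim for this local copy
```

`datasets::Titanic` is written with its namespace because the code above assigns its own object called `Titanic`, which would otherwise mask the built-in table.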
  1. Calculate the total number of passengers in the dataset. 2201
glimpse(Titanic) 
## Observations: 2201
## Variables:
## $ Class   (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Age     (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Sex     (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Survive (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
count(Titanic) # I think this syntax is better.
## Source: local data frame [1 x 1]
## 
##      n
## 1 2201
  1. Calculate the total proportion of passengers surviving. 32.3% (0.323035)
survivedpassengers<-filter(titanic,Survive==1)
glimpse(survivedpassengers)
## Observations: 711
## Variables:
## $ Class   (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Age     (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Sex     (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ Survive (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
prop.table(table(titanic$Survive))
## 
##        0        1 
## 0.676965 0.323035
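Because `Survive` is coded 0/1, its mean is the proportion surviving. A minimal sketch using only the counts reported above (711 survivors among 2,201 records); the names `survive` and `overall_rate` are illustrative:

```r
# Rebuild a 0/1 indicator from the counts shown in the glimpse() output above
n_total    <- 2201
n_survived <- 711
survive <- c(rep(1L, n_survived), rep(0L, n_total - n_survived))

# The mean of a 0/1 variable equals the share of 1s
overall_rate <- mean(survive)
round(overall_rate, 6)
```

With the data frame read from the .csv file, `mean(titanic$Survive)` should give the same figure without building a table first.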
  1. Calculate the proportion of passengers surviving for each class of passenger.

Crew: 24.0% (0.2395480)

First class: 62.5% (0.6246154)

Second class: 41.4% (0.4140351)

Third class: 25.2% (0.2521246)

prop.table(table(titanic$Class, titanic$Survive),1)
##    
##             0         1
##   0 0.7604520 0.2395480
##   1 0.3753846 0.6246154
##   2 0.5859649 0.4140351
##   3 0.7478754 0.2521246
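The row codes 0 to 3 in this output are easy to misread, so a labelled cross-check can help. This sketch assumes the built-in `datasets::Titanic` table matches the course file (an assumption); `class_by_survival` and `by_class` are illustrative names.

```r
# datasets::Titanic is a 4-way table with dimensions Class, Sex, Age, Survived
class_by_survival <- margin.table(datasets::Titanic, c(1, 4))  # sum over Sex and Age

# Row proportions: each class sums to 1 across No/Yes
by_class <- prop.table(class_by_survival, 1)
round(by_class, 4)
```

The rows appear in the order 1st, 2nd, 3rd, Crew, so the crew row comes last here rather than first.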
  1. Calculate the proportion of passengers surviving for each sex category.

Female 73.2% (0.7319149)

Male 21.2% (0.2120162)

Which sex had the highest survival rate? Female

prop.table(table(titanic$Sex, titanic$Survive),1)
##    
##             0         1
##   0 0.2680851 0.7319149
##   1 0.7879838 0.2120162
  1. Calculate the proportion of passengers surviving for each age category.

Child 52.3% (0.5229358)

Adult 31.3% (0.3126195)

Which age had the lowest survival rate? Adult

prop.table(table(titanic$Age, titanic$Survive),1)
##    
##             0         1
##   0 0.4770642 0.5229358
##   1 0.6873805 0.3126195
  1. Calculate the proportion of passengers surviving for each age/sex category (i.e., for adult males, child males, adult females, child females).

Adult male 20.3% (0.2027594)

Child male 45.3% (0.453125)

Adult female 74.4% (0.7435294)

Child female 62.2% (0.6222222)

Which group was most likely to survive? Adult female

Least likely? Adult male

cf<-filter(titanic,Age==0, Sex==0) # Child females
cfs<-filter(titanic,Age==0, Sex==0, Survive==1) # Survived child females
cm<-filter(titanic,Age==0, Sex==1) # Child males
cms<-filter(titanic,Age==0, Sex==1, Survive==1) # Survived child males
am<-filter(titanic,Age==1, Sex==1) # Adult males
ams<-filter(titanic,Age==1, Sex==1, Survive==1) # Survived adult males
af<-filter(titanic,Age==1, Sex==0) # Adult females
afs<-filter(titanic,Age==1, Sex==0, Survive==1) # Survived adult females
nrow(cfs)/nrow(cf)
## [1] 0.6222222
nrow(afs)/nrow(af)
## [1] 0.7435294
nrow(ams)/nrow(am)
## [1] 0.2027594
nrow(cms)/nrow(cm)
## [1] 0.453125
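The eight `filter()` calls above can be condensed into one grouped computation. A sketch in base R, assuming the built-in `datasets::Titanic` counts match the course file (an assumption); the names `counts`, `Survivors`, `Rate`, and `by_age_sex` are illustrative:

```r
counts <- as.data.frame(datasets::Titanic)  # columns: Class, Sex, Age, Survived, Freq

# Survivors per cell: Freq where Survived == "Yes", otherwise 0
counts$Survivors <- ifelse(counts$Survived == "Yes", counts$Freq, 0)

# Sum survivors and totals within each age/sex group, then divide
by_age_sex <- aggregate(cbind(Survivors, Freq) ~ Age + Sex, data = counts, FUN = sum)
by_age_sex$Rate <- by_age_sex$Survivors / by_age_sex$Freq
by_age_sex
```

With the row-level data and dplyr, `summarise(group_by(titanic, Age, Sex), rate = mean(Survive))` should give the same four rates in a single call.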
  1. Calculate the proportion of passengers surviving for each age/sex/class category.

Crew, Adult male: 22.3% (0.222738)

Crew, Child male: none (no children among the crew)

Crew, Adult female: 87.0% (0.869565)

Crew, Child female: none (no children among the crew)

First class, Adult male: 32.6% (0.325714)

First class, Child male: 100.0%

First class, Adult female: 97.2% (0.972222)

First class, Child female: 100.0%

Second class, Adult male: 8.3% (0.083333)

Second class, Child male: 100.0%

Second class, Adult female: 86.0% (0.860215)

Second class, Child female: 100.0%

Third class, Adult male: 16.2% (0.162338)

Third class, Child male: 27.1% (0.270833)

Third class, Adult female: 46.1% (0.460606)

Third class, Child female: 45.2% (0.451613)

Which group had the highest mortality in this disaster? Second class, adult males

Why? I think that adult men sacrificed themselves in order to save the women and children.

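Category <- group_by(titanic, Class, Age, Sex) # group by class, age and sex
Survived<-filter(Category,Survive==1) # survivors only; the grouping is kept
total<-summarise(Category, n=n()) # number of people in each group
Sur<-summarise(Survived, n=n()) # number of survivors in each group
# Row-wise division is valid here only because all 14 groups have at least one survivor, so Sur and total list the same groups in the same order
proportion<-Sur[,4]/total[,4]
summarise(Category)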
## Source: local data frame [14 x 3]
## Groups: Class, Age
## 
##    Class Age Sex
## 1      0   1   0
## 2      0   1   1
## 3      1   0   0
## 4      1   0   1
## 5      1   1   0
## 6      1   1   1
## 7      2   0   0
## 8      2   0   1
## 9      2   1   0
## 10     2   1   1
## 11     3   0   0
## 12     3   0   1
## 13     3   1   0
## 14     3   1   1
proportion
##             n
## 1  0.86956522
## 2  0.22273782
## 3  1.00000000
## 4  1.00000000
## 5  0.97222222
## 6  0.32571429
## 7  1.00000000
## 8  1.00000000
## 9  0.86021505
## 10 0.08333333
## 11 0.45161290
## 12 0.27083333
## 13 0.46060606
## 14 0.16233766
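Dividing two separately summarised tables works only when both list the same groups in the same order. A sketch that keeps survivors and totals in one table avoids that dependency; it assumes the built-in `datasets::Titanic` counts match the course file (an assumption), and the names `counts`, `Survivors`, `Rate`, and `by_group` are illustrative:

```r
counts <- as.data.frame(datasets::Titanic)  # columns: Class, Sex, Age, Survived, Freq
counts$Survivors <- ifelse(counts$Survived == "Yes", counts$Freq, 0)

# Survivors and totals are summed side by side, so rows cannot fall out of alignment
by_group <- aggregate(cbind(Survivors, Freq) ~ Class + Age + Sex, data = counts, FUN = sum)

# Drop combinations with nobody in them (there were no children among the crew)
by_group <- by_group[by_group$Freq > 0, ]
by_group$Rate <- by_group$Survivors / by_group$Freq

# Sort so the lowest survival rate (highest mortality) comes first
by_group[order(by_group$Rate), ]
```

A group with zero survivors would stay in this table with a rate of 0, whereas filtering to survivors before summarising would drop it.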
  1. Write a summary of your findings. Your summary may contain no more than 60 words.

The analysis shows that the overall survival rate in the Titanic disaster was 32.3%. Specifically, the survival rate was higher among females (73.2%) than males (21.2%), and higher among children (52.3%) than adults (31.3%). By class, first class had the highest survival rate (62.5%). In every class, adult males had the lowest survival rate; I think this is because they helped to rescue the women and children first. The survival rate in third class, on the other hand, was relatively low (25.2%). Given that the survival rates of women and children in third class were also not high, I think that third-class passengers may have been rescued later than those in the other classes.