{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(tidyverse)
titanic <- read.csv("train.csv")
# Analysis 1

I would like to analyze the distribution of ages among the passengers on the Titanic: where the majority of the ages fall and what the range of ages was.
ggplot(titanic, aes(x = Age)) +
  theme_bw() +
  geom_histogram(binwidth = 5) +
  labs(y = "Passenger Count",
       x = "Age (binwidth = 5)",
       title = "Titanic Age Distribution")
summary(titanic$Age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   0.42   20.12   28.00   29.70   38.00   80.00     178
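The histogram and summary above indicate that half of the passengers with a recorded age were between roughly 20 and 38 years old (the first and third quartiles), with ages ranging from 0.42 to 80 years and a median age of 28 years. Age is missing for 178 passengers, who do not appear in the histogram.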
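The reading of the histogram can be cross-checked numerically. What follows is only a minimal sketch: `age_spread()` is a hypothetical helper that is not part of the original analysis, and the `ages` vector is made-up toy data rather than the Titanic ages.

```r
# Hypothetical helper: describe where the middle half of the ages lies,
# ignoring missing values.
age_spread <- function(age) {
  known <- age[!is.na(age)]
  q <- quantile(known, probs = c(0.25, 0.5, 0.75), names = FALSE)
  list(
    n_missing    = sum(is.na(age)),   # how many ages are NA
    range        = range(known),      # youngest and oldest known age
    iqr_bounds   = q[c(1, 3)],        # first and third quartile
    median       = q[2],
    # share of known ages lying inside the quartile bounds
    share_in_iqr = mean(known >= q[1] & known <= q[3])
  )
}

# Made-up toy ages, for illustration only
ages <- c(2, 19, 22, 24, 28, 30, 35, 40, 61, NA, NA)
spread <- age_spread(ages)
print(spread)
```

On the real data the call would presumably be `age_spread(titanic$Age)`, assuming the `Age` column is numeric as in the chunks above.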
# Analysis 2

I would like to analyze the survival rate by gender and see whether the “women and children first” scenario played out during the evacuation.
ggplot(titanic, aes(x = Age)) +
geom_histogram(binwidth = 3) +
facet_grid(Sex ~ Survived,
labeller = labeller(Survived = c("0" = "Deceased", "1" = "Survived"))) +
labs(x = "Age in years", y = "Count",
title = "Histogram of Titanic Passengers' Ages by Sex and Survival Status") +
theme_bw()
prop.table(table(Child = titanic$Age < 14, Survived = titanic$Survived), 1)
prop.table(table(Sex = titanic$Sex, Survived = titanic$Survived), 1)
The graph indicates that a majority of the female passengers survived, whereas most of the male passengers did not. Further, the survival rate of children under 14 years old appears to be higher than that of their adult counterparts, for both males and females. It seems the “women and children first” scenario likely occurred, as both women and children had higher survival rates than men.
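The same comparison can be tabulated for the four sex-by-age groups at once, which makes it easier to check the claim separately for males and females. This is a sketch under stated assumptions: `survival_by_group()` is a hypothetical helper, the column names `Sex`, `Age`, and `Survived` and the cutoff of 14 years are taken from the code above, and the `toy` data frame is invented for illustration.

```r
# Hypothetical helper: survival rate for each combination of sex and
# child/adult status. Rows with a missing age are dropped, which mirrors
# the default behaviour of table() in the code above.
survival_by_group <- function(df, child_cutoff = 14) {
  known <- df[!is.na(df$Age), ]
  known$Group <- ifelse(known$Age < child_cutoff, "child", "adult")
  # The mean of a 0/1 column is the proportion that survived
  aggregate(Survived ~ Sex + Group, data = known, FUN = mean)
}

# Invented toy data, for illustration only (not the Titanic passengers)
toy <- data.frame(
  Sex      = c("female", "female", "male", "male", "male", "female", "male", "male"),
  Age      = c(30, 8, 40, 5, 25, NA, 10, 33),
  Survived = c(1, 1, 0, 1, 0, 1, 0, 0)
)

rates <- survival_by_group(toy)
print(rates)
```

With the real data, `survival_by_group(titanic)` would give one survival rate per group, so the child and adult rates can be compared within each sex rather than read off the histogram panels.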