Analysis of Titanic Data

Author

KB

Introduction

Using the data set from the sinking of the Titanic in 1912, I am going to perform an analysis to understand more clearly the individual factors that increased the likelihood of survival. The data record information about individuals who were passengers on the Titanic; each row is an individual and each column describes something about that individual. To learn more, see the Titanic data at this link: TITANIC.CSV.csv.

library(tidyverse)  # provides %>%, the dplyr verbs, and ggplot2 used below
titanic <- read.csv("https://myxavier-my.sharepoint.com/:x:/g/personal/bartak1_xavier_edu/EeociK8efltNt-qhLklUxOQBcp5UtRA7tL7YgY9z_9sdig?download=1")

Research Question

Some may associate the phrase “women and children first” directly with the 1997 film Titanic. After reviewing the data set, I would like to explore just how much more likely women and children were to survive than men:

titanic %>%
  # keep rows with a known outcome, a known age group, and a recorded sex
  filter(!is.na(survived), !is.na(age.group), sex %in% c("female", "male")) %>%
  group_by(sex, age.group) %>%
  # survived is assumed to be coded 0/1, so its mean is the survival rate
  summarise(survival_rate = mean(survived), .groups = "drop") %>%
  ggplot(aes(x = sex, y = survival_rate, fill = age.group)) +
  geom_col(position = "dodge") +
  labs(title = "Survival Rate by Gender and Age Group",
       x = "Gender",
       y = "Survival Rate") +
  scale_fill_manual(values = c("orange", "purple", "red", "blue", "yellow"))

As seen in the bar graph, it is clear that women were much more likely than men to survive. Among women, seniors and adults had the highest survival rates, while among men, children had the highest survival rate. Overall, women made up about 80% of the surviving passengers, while men made up only about 20%.
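The share of survivors who were women is a different quantity from the survival rate shown in the graph, so it may help to compute it directly. The following is a minimal sketch, assuming `survived` is coded 0/1 and `sex` holds "female" or "male"; the helper name `survivor_share_by_sex` and the toy rows are hypothetical and only illustrate the shape of the calculation, not the actual Titanic figures.

```r
library(dplyr)

# Hypothetical helper (not part of the original analysis): the fraction of
# all survivors that each sex accounts for. Assumes `survived` is coded 0/1.
survivor_share_by_sex <- function(df) {
  df %>%
    filter(!is.na(survived), sex %in% c("female", "male")) %>%
    group_by(sex) %>%
    summarise(n_survived = sum(survived == 1), .groups = "drop") %>%
    mutate(share_of_survivors = n_survived / sum(n_survived))
}

# Made-up toy rows, used only to show the output format; not Titanic data.
toy <- data.frame(
  sex      = c("female", "female", "female", "male", "male", "male"),
  survived = c(1, 1, 0, 1, 0, 0)
)
survivor_share_by_sex(toy)
```

Passing `titanic` in place of `toy` would give the shares for the real data, which can then be compared with the percentages quoted above.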
