DATA 110 Week 8 Homework

Author

Emilio Difilippantonio

I chose the data set “death_prob,” which is described as giving the “probability of death within 1 year by age and sex in the United States in 2015” (source). I built the graph with age on the x-axis and the probability of dying within 1 year on the y-axis, and I colored the data points by sex, with cyan representing males and orange representing females. As the graph shows, the death probability starts rising steeply after about age 75, and it rises more quickly for males. Shortly before age 100, the increase for males slows, and the male curve eventually meets the female curve again at around age 110.

# Loading in the appropriate packages
library(tidyverse)
library(ggthemes)
library(ggrepel)
library(dslabs)
# Loading in the data set
data("death_prob")
# Creating the scatterplot
  # Assigning the plot and piping in the data set
plot <- death_prob |>
  # Assigning variables to the x-axis, y-axis, and colors
  ggplot(aes(x = age, y = prob, color = sex)) +
  # Setting the color palette (I made sure it worked this time)
  scale_color_brewer(palette = "Set2") +
  # Adding the data points and assigning a transparency level
  geom_point(alpha = 0.3) +
  # Creating a title and a caption, labeling the legend, and labeling the x- and y-axes
  labs(title = "Probability of Death at Each Age From\n0 to 119 for Males and Females",
       caption = "Source: DSLabs",
       color = "Sex",
       x = "Age in Years",
       y = "Probability of Dying") +
  # Changing the theme of the graph
  theme_minimal() +
  # Setting the domain of the graph (x-axis)
  xlim(0, 119) +
  # Setting the range of the graph (y-axis)
  ylim(0, 1) +
  # Moving the legend to the top left of the graph to use space more effectively
  theme(legend.position = c(0.2, 0.7)) +
  # Making the colors in the legend non-transparent
  guides(color = guide_legend(override.aes = list(alpha = 1)))
# Displaying the graph
plot
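
The trends described in the opening paragraph are read off the graph by eye, so a quick numeric check may help. The sketch below is not part of the original assignment. It assumes the `sex` column uses the labels "Male" and "Female", and the names `by_age` and `gap` are made up for illustration. It puts the two sexes side by side for each age and computes the difference between them.

```r
# Sketch: compare male and female death probabilities at each age.
# Assumes death_prob has columns age, sex, and prob, and that sex
# takes the values "Male" and "Female".
library(dplyr)
library(tidyr)
library(dslabs)

data("death_prob")

# One row per age, with one column per sex
by_age <- death_prob |>
  pivot_wider(names_from = sex, values_from = prob) |>
  # Difference between the sexes (positive means males are higher)
  mutate(gap = Male - Female)

# Inspecting a few of the ages mentioned in the description
by_age |>
  filter(age %in% c(50, 75, 100, 110, 119))
```

If the gap widens after age 75 and shrinks toward zero near age 110, that would agree with what the scatterplot appears to show.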