I chose the data set named “death_prob,” which is described as giving the “probability of death within 1 year by age and sex in the United States in 2015” (source). I built my graph with the person’s age on the x-axis and their probability of dying within 1 year on the y-axis. I also colored the data points by sex, with cyan representing males and orange representing females. As the graph shows, the death probability starts rising sharply after about 75 years old, and it rises more quickly for males. At almost 100 years old, the male death probability stops increasing as quickly, and it eventually overlaps with the female death probability again at around 110 years old.
# Loading in the appropriate packages
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggthemes)
library(ggrepel)
library(dslabs)

# Loading in the data set
data("death_prob")
# Creating the scatterplot
# Assigning the plot and piping in the data set
plot <- death_prob |>
  # Assigning variables to the x-axis, y-axis, and colors
  ggplot(aes(x = age, y = prob, color = sex)) +
  # Setting the color palette (I made sure it worked this time)
  scale_color_brewer(palette = "Set2") +
  # Loading in the data points and assigning a transparency level
  geom_point(alpha = 0.3) +
  # Creating a title and a caption, labeling the legend, and labeling the x- and y-axes
  labs(title = "Probability of Death at Each Age From\n0 to 119 for Males and Females",
       caption = "Source: DSLabs",
       color = "Sex",
       x = "Age in Years",
       y = "Probability of Dying") +
  # Changing the theme of the graph
  theme_minimal() +
  # Setting the domain of the graph (x-axis)
  xlim(0, 119) +
  # Setting the range of the graph (y-axis)
  ylim(0, 1) +
  # Moving the legend to the top left of the graph to use space more effectively
  theme(legend.position = c(0.2, 0.7)) +
  # Making the colors in the legend non-transparent
  guides(color = guide_legend(override.aes = list(alpha = 1)))

# Displaying the graph
plot
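The pattern I described from the graph (the male curve pulling ahead after about 75 and meeting the female curve again near 110) could also be checked numerically. The code below is only a minimal sketch of one way to do that. It assumes the sex column holds the values "Male" and "Female", and the names by_age and gap are ones I made up for this example.

```r
# Loading only the packages this check needs
library(dplyr)
library(tidyr)
library(dslabs)

# Loading in the data set
data("death_prob")

# Reshaping to one row per age with one probability column per sex
# (assumes the sex values are "Male" and "Female")
by_age <- death_prob |>
  pivot_wider(names_from = sex, values_from = prob) |>
  # Difference between the male and female death probabilities
  mutate(gap = Male - Female)

# Ages where the male-female gap is widest
by_age |>
  slice_max(gap, n = 5)

# The gap at a few reference ages mentioned in the text
by_age |>
  filter(age %in% c(50, 75, 100, 110, 119))
```

If the reading of the graph is right, the largest gaps should fall somewhere in the later ages, and the gap should shrink toward zero by about age 110.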