Samuel Goon DS Labs Assignment

Load libraries and my selected dataset

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
Warning: package 'dslabs' was built under R version 4.4.1
data(package = "dslabs") # list the datasets that ship with dslabs
data("death_prob") # load the dataset I selected

Create a graph based on the death_prob dataset

ggplot(death_prob, aes(x = age, y = sex, fill = prob)) +
  geom_tile()

Refining the aesthetics of the heatmap

ggplot(death_prob, aes(x = age, y = sex, fill = prob)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "black") + # white = low probability, black = high probability
  xlim(75, 125) + # show only the older ages, where the probabilities are large enough to see; rows outside this range are dropped (see the warning below)
  labs(x = "Age",
       y = "Sex",
       fill = "Probability of death within\none year of recorded age",
       title = "Comparing death probabilities based on sex") # label each variable so the reader can tell what they are looking at
Warning: Removed 152 rows containing missing values or values outside the scale range
(`geom_tile()`).
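
The warning above appears because xlim() discards every row whose tile does not fit entirely inside the limits before the plot is drawn. Each tile is one year wide, so the tiles centered on age 75 extend past the lower limit and are dropped as well. A minimal sketch of one alternative, assuming the goal is only to zoom in on the older ages: coord_cartesian() changes the visible window without removing any rows. The upper limit of 120 is my assumption, chosen to sit just above the oldest tile.

```r
library(ggplot2)
library(dslabs)
data("death_prob")

# Zoom the x-axis with coord_cartesian() instead of xlim(), so no rows
# are removed and the tiles at the edge of the window are still drawn.
p <- ggplot(death_prob, aes(x = age, y = sex, fill = prob)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "black") +
  coord_cartesian(xlim = c(75, 120)) + # assumed window; adjust as needed
  labs(x = "Age",
       y = "Sex",
       fill = "Probability of death within\none year of recorded age",
       title = "Comparing death probabilities based on sex")

p
```

Because nothing is filtered out, the fill scale here is computed from the full dataset rather than only the visible ages.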

What did you just look at?

The dataset I used gives the probability of dying within one year, broken down by a person’s sex and age. I used my graph to compare these probabilities between males and females and determine which sex is more likely to die at an earlier age. I found that males have a slightly higher probability of dying at an earlier age.
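
Differences in shading on a heatmap can be hard to judge by eye, so the comparison above can also be checked numerically. This is a minimal sketch, assuming the sex column takes the values "Male" and "Female". The name sex_gap, the two derived columns, and the ages picked for display are my own choices for illustration.

```r
library(dplyr)
library(tidyr)
library(dslabs)
data("death_prob")

# Put the male and female probabilities side by side, one row per age,
# then compute the difference and the ratio between them.
sex_gap <- death_prob %>%
  pivot_wider(names_from = sex, values_from = prob) %>%
  mutate(male_minus_female = Male - Female,
         male_to_female = Male / Female)

# Look at a few illustrative ages (arbitrary picks).
sex_gap %>%
  filter(age %in% c(50, 75, 90, 100))

# Share of ages at which the male probability is higher than the female one.
sex_gap %>%
  summarise(share_male_higher = mean(Male > Female))
```

A positive male_minus_female at a given age means males are more likely than females to die within a year at that age.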