Harold Nelson
2025-11-25
Get Packages and Data
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.2.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
Plot the probability of death for males over time for the USA. Use ggplotly.
Infant Mortality for the USA and Canada
USA and Canada
All Countries, Age 80, Year = 2019. Do a Cleveland plot.
The pandemic began in 2020. Take the years 2015 - 2019 as a base period for comparison.
Create the dataframe Base. There is a row for every Country and Age. Base_Male_Prob and Base_Female_Prob are the mean values of Male_Prob and Female_Prob for the years 2015 - 2019.
Base = Analysis %>%
filter(Year > 2014, Year < 2020) %>%
group_by(Country, Age) %>%
summarize(Base_Male_Prob = mean(Male_Prob),
Base_Female_Prob = mean(Female_Prob)) %>%
ungroup()## `summarise()` has grouped output by 'Country'. You can override using the
## `.groups` argument.
## tibble [4,140 × 4] (S3: tbl_df/tbl/data.frame)
## $ Country : chr [1:4140] "AUS" "AUS" "AUS" "AUS" ...
## $ Age : num [1:4140] 0 1 2 3 4 5 6 7 8 9 ...
## $ Base_Male_Prob : num [1:4140] 0.003488 0.000257 0.00016 0.000126 0.000105 ...
## $ Base_Female_Prob: num [1:4140] 2.86e-03 2.16e-04 1.43e-04 9.66e-05 7.20e-05 ...
Create Covid_Analysis. Keep years > 2020
## Joining with `by = join_by(Country, Age)`
## tibble [8,730 × 13] (S3: tbl_df/tbl/data.frame)
## $ Country : chr [1:8730] "AUS" "AUS" "AUS" "AUS" ...
## $ Year : num [1:8730] 2021 2021 2021 2021 2021 ...
## $ Age : num [1:8730] 0 1 2 3 4 5 6 7 8 9 ...
## $ Female_Deaths : num [1:8730] 455 24 16 12 11 14 12 10 12 7 ...
## $ Male_Deaths : num [1:8730] 537 29 22 22 15 16 11 9 16 15 ...
## $ Female_Pop : num [1:8730] 144842 144860 146749 149118 153649 ...
## $ Male_Pop : num [1:8730] 152819 153464 155759 157955 162682 ...
## $ Male_Prob : num [1:8730] 3.51e-03 1.89e-04 1.41e-04 1.39e-04 9.22e-05 ...
## $ Female_Prob : num [1:8730] 3.14e-03 1.66e-04 1.09e-04 8.05e-05 7.16e-05 ...
## $ MFRatio : num [1:8730] 1.12 1.14 1.3 1.73 1.29 ...
## $ MFPopRatio : num [1:8730] 1.06 1.06 1.06 1.06 1.06 ...
## $ Base_Male_Prob : num [1:8730] 0.003488 0.000257 0.00016 0.000126 0.000105 ...
## $ Base_Female_Prob: num [1:8730] 2.86e-03 2.16e-04 1.43e-04 9.66e-05 7.20e-05 ...
Create Expected Deaths for Males and Female using Base_Male_Prob and Base_Female_Prob.
Then create Ratio and Difference of Actual and Expected Deaths for Males and Females.
Covid_Analysis = Covid_Analysis %>%
mutate(Expected_Deaths_Male = Base_Male_Prob * Male_Pop,
Expected_Deaths_Female = Base_Female_Prob * Female_Pop,
Excess_Male_Ratio = Male_Deaths/Expected_Deaths_Male,
Excess_Male_Diff = Male_Deaths - Expected_Deaths_Male,
Excess_Female_Ratio = Female_Deaths/Expected_Deaths_Female,
Excess_Female_Diff = Female_Deaths - Expected_Deaths_Female)for the USA males in 2021.