Note for reviewers. This is the full-code reference solution for the Track B lab, not the learner-facing skeleton. The learner version replaces the marked analysis blocks with TODO stubs but is otherwise identical and knits as-is. Every object and column name below was checked against the wpp2024 package data (PPgp/wpp2024); the profile uses only indicators distributed in the package, so the report knits with no external data files.
wpp2024 is distributed through GitHub, not
CRAN, because the package data exceed CRAN size limits. Install
it once with remotes (or pak).
install.packages(c("remotes", "data.table", "dplyr", "tidyr", "ggplot2", "scales"))
remotes::install_github("PPgp/wpp2024") # required
remotes::install_github("PPgp/wpp2024extra") # optional companion indicators
library(wpp2024)
library(data.table)
library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)
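Because wpp2024 is not on CRAN, a missing install is a likely reason for the report failing to knit. The sketch below is one way to surface that early with a readable message. `missing_packages()` is a hypothetical helper name introduced here for illustration; it is not part of any package.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages attached by this lab.
needed <- c("wpp2024", "data.table", "dplyr", "tidyr", "ggplot2", "scales")
absent <- missing_packages(needed)
if (length(absent) > 0) {
  message("Install before knitting: ", paste(absent, collapse = ", "))
}
```

Swapping `message()` for `stop()` would turn the hint into a hard failure at the top of the report, which may be preferable for the learner skeleton.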
Goal. Load the annual wpp2024 indicator
tables, use UNlocations to resolve country names to M49
codes (the same M49 standard introduced in Lesson 1.1), and plot total
fertility, life expectancy at birth, and total population for three
countries at different stages of the demographic transition.
wpp2024 exposes its annual estimates as
long-format data.tables whose names end in
1dt (the 1 means single-year). Column maps, as
verified against the package:
| Object | Columns (after lazy-load) | Years |
|---|---|---|
| tfr1dt | country_code, name, year, tfr | 1950–2023 |
| e01dt | country_code, name, year, e0M, e0F, e0B | 1950–2023 |
| pop1dt | country_code, name, year, popM, popF, pop | 1950–2023 |
| UNlocations | name, country_code, reg_code, area_name, ... | n/a |
The combined-sex life expectancy is e0B (not
e0), and the annual estimate series ends in
2023; 2024 is the first projection year (tables
tfrproj1dt, e0proj1dt,
popproj1dt).
data(tfr1dt) # total fertility rate
data(e01dt) # life expectancy at birth, by sex (M/F) and combined (B)
data(pop1dt) # total population (summed over ages)
data(UNlocations) # M49 location reference table
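The column map can also be re-checked at knit time rather than taken on trust. A minimal sketch, assuming a hypothetical helper `has_columns()`: the toy data frame stands in for tfr1dt so the snippet runs without the package, and its values are NA placeholders, not WPP estimates.

```r
# Hypothetical helper: TRUE when every expected column is present in `x`.
has_columns <- function(x, expected) all(expected %in% names(x))

# Toy stand-in with the same column names as tfr1dt (values are placeholders).
toy_tfr <- data.frame(country_code = 392L, name = "Japan",
                      year = 2022:2023, tfr = c(NA_real_, NA_real_))

has_columns(toy_tfr, c("country_code", "name", "year", "tfr"))  # TRUE
has_columns(toy_tfr, c("country_code", "year", "e0"))           # FALSE

# In the report itself the same check could guard each table, e.g.
# stopifnot(has_columns(e01dt, c("country_code", "name", "year", "e0B")))
```

A guard like this would catch the e0 versus e0B naming slip at the top of the report instead of inside a later pipeline.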
UNlocations ships as a data.frame; we
coerce it to a data.table to use the DT[i, j] filter-and-select
idiom, then look up the three comparison countries.
comparison <- c("Niger", "India", "Japan") # early-, mid-, and post-transition
loc <- as.data.table(UNlocations)
codes <- loc[name %in% comparison, .(name, country_code)]
codes
keep <- codes$country_code # M49 codes: Niger 562, India 356, Japan 392
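A name lookup with `%in%` silently drops any name that does not match exactly (case, spelling, or a different official form), so the plot would simply show fewer countries. The sketch below shows one way to detect that. `toy_loc` is a stand-in for UNlocations holding only the three codes quoted above, and `unmatched_names()` is a hypothetical helper.

```r
# Toy stand-in for UNlocations with only the columns used here.
# The three M49 codes are the ones quoted in the lab text.
toy_loc <- data.frame(
  name = c("Niger", "India", "Japan"),
  country_code = c(562L, 356L, 392L)
)

# Hypothetical helper: names requested but not found in the lookup table.
unmatched_names <- function(wanted, lookup) setdiff(wanted, lookup$name)

unmatched_names(c("Niger", "India", "Japan"), toy_loc)   # character(0)
unmatched_names(c("Niger", "india", "Japan"), toy_loc)   # "india"
```

In the report, `stopifnot(length(setdiff(comparison, codes$name)) == 0)` placed right after the lookup would serve the same purpose.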
We pull one value column from each table, label it, stack the three
indicators into one long frame, and facet. Keeping everything long means
a single ggplot call draws all three panels for all three
countries.
tfr <- as_tibble(tfr1dt) |>
  filter(country_code %in% keep) |>
  transmute(name, year, indicator = "Total fertility rate", value = tfr)
e0 <- as_tibble(e01dt) |>
  filter(country_code %in% keep) |>
  transmute(name, year, indicator = "Life expectancy at birth (e0, both sexes)", value = e0B)
pop <- as_tibble(pop1dt) |>
  filter(country_code %in% keep) |>
  transmute(name, year, indicator = "Total population (thousands)", value = pop)
trends <- bind_rows(tfr, e0, pop) |>
  filter(year >= 1950, year <= 2023) |>
  mutate(indicator = factor(indicator, levels = c(
    "Total fertility rate",
    "Life expectancy at birth (e0, both sexes)",
    "Total population (thousands)"
  )))
ggplot(trends, aes(year, value, colour = name)) +
  geom_line(linewidth = 0.8) +
  facet_wrap(~ indicator, ncol = 1, scales = "free_y") +
  scale_y_continuous(labels = label_comma()) +
  scale_x_continuous(breaks = seq(1950, 2020, 10)) +
  labs(
    title = "Three countries at different transition stages, 1950–2023",
    subtitle = "Annual WPP 2024 estimates",
    x = NULL, y = NULL, colour = "Country",
    caption = "Source: United Nations, WPP 2024 (wpp2024 R package)."
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", strip.text = element_text(face = "bold"))
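A quick shape check on the stacked frame can catch a dropped country or a truncated series before it shows up as a missing line in the figure. The sketch assumes each of the three tables covers every year from 1950 to 2023 for each country, as the column map states. `toy_trends` is a placeholder frame with the same structure as `trends`, holding NA values rather than WPP data.

```r
# Expected shape of the stacked frame: one row per country, indicator and year.
n_countries  <- 3
n_indicators <- 3
n_years      <- length(1950:2023)   # 74 single years
expected_rows <- n_countries * n_indicators * n_years
expected_rows                        # 666

# Toy long frame with that structure (values are NA placeholders, not WPP data).
toy_trends <- expand.grid(
  name = c("Niger", "India", "Japan"),
  indicator = c("tfr", "e0", "pop"),
  year = 1950:2023,
  stringsAsFactors = FALSE
)
toy_trends$value <- NA_real_

# Every country-indicator pair should contribute exactly n_years rows.
counts <- table(toy_trends$name, toy_trends$indicator)
all(counts == n_years)               # TRUE
nrow(toy_trends) == expected_rows    # TRUE
```

Applied to the real data, the equivalent check would be `stopifnot(nrow(trends) == 666)`.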
The three panels should show the textbook sequence: Niger still early in fertility decline with the fastest population growth, India mid-transition, and Japan post-transition with sub-replacement fertility, the highest life expectancy, and a population that has peaked and begun to decline.
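"Sub-replacement" can be made concrete by finding the first year a series drops below a threshold. The sketch assumes the conventional approximation of about 2.1 children per woman; the exact replacement level depends on mortality and the sex ratio at birth. `first_year_below()` is a hypothetical helper, and the toy series is invented for illustration, not taken from WPP.

```r
# Conventional approximation of replacement-level fertility (an assumption;
# the exact level depends on mortality and the sex ratio at birth).
replacement_tfr <- 2.1

# Hypothetical helper: first year in which `tfr` is below `threshold`,
# or NA if it never is. Assumes `year` is sorted ascending.
first_year_below <- function(year, tfr, threshold = replacement_tfr) {
  below <- which(tfr < threshold)
  if (length(below) == 0) NA_integer_ else year[below[1]]
}

# Invented series for illustration only (not WPP values).
toy_year <- 2000:2005
toy_tfr  <- c(2.6, 2.4, 2.2, 2.0, 1.9, 2.05)

first_year_below(toy_year, toy_tfr)        # 2003
first_year_below(toy_year, toy_tfr, 1.5)   # NA
```

Run per country on the `tfr` frame, this would give a one-number summary of where each country sits in the sequence described above.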
Goal. For the reference country (Uganda, M49 800),
compile a tidy annual profile — TFR, CBR, e0, CDR — for
1970–2023, draw a multi-panel summary, and annotate national events that
plausibly left marks in the series. All four indicators are distributed
in wpp2024, so this exercise needs no external data
files.
| Indicator | Source object | Column | Package |
|---|---|---|---|
| TFR | tfr1dt | tfr | wpp2024 |
| e0 | e01dt | e0B | wpp2024 |
| CBR | misc1dt | cbr | wpp2024 |
| CDR | misc1dt | cdr | wpp2024 |
misc1dt also carries cnmr (crude
net-migration rate), growthrate, and
births/deaths counts, so a fifth panel
(e.g. net migration) can be added by pulling one more column.
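Adding a fifth panel in practice takes two changes: the extra column in the `select()` call, and an extra level in the `factor()` call that orders the panels in the plotting code. The base R sketch below shows why the second step matters. The label "CNMR" is a hypothetical choice for the new panel name.

```r
indicators <- c("TFR", "CBR", "CDR", "e0", "CNMR")

# With the four original levels, the new indicator is not recognised ...
four <- factor(indicators, levels = c("TFR", "CBR", "CDR", "e0"))
is.na(four)      # FALSE FALSE FALSE FALSE  TRUE

# ... so the level list has to grow along with the data.
five <- factor(indicators, levels = c("TFR", "CBR", "CDR", "e0", "CNMR"))
anyNA(five)      # FALSE
```

If the level is left out, the new indicator becomes NA, and the faceted plot would most likely show it under a panel labelled NA rather than under its own name.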
data(misc1dt) # country_code, name, year, births, cbr, cdr, deaths, cnmr, growthrate, ...
cc <- params$cc
yrs <- params$y_min:params$y_max
profile <- as_tibble(tfr1dt) |>
  filter(country_code == cc, year %in% yrs) |>
  select(year, TFR = tfr) |>
  left_join(
    as_tibble(e01dt) |> filter(country_code == cc) |> select(year, e0 = e0B),
    by = "year"
  ) |>
  left_join(
    as_tibble(misc1dt) |> filter(country_code == cc) |> select(year, CBR = cbr, CDR = cdr),
    by = "year"
  )
profile
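The code above reads `params$cc`, `params$y_min`, and `params$y_max`, which exist only while a parameterised report is being knitted. For stepping through the chunks interactively, a fallback like the sketch below could go in a setup chunk. The field names are the ones used in this document; the values mirror the reference country and year range stated in the goal, and the report's YAML header (not shown in this section) is assumed to declare the same fields.

```r
# Fallback for interactive use: `params` only exists while knitting a
# parameterised report. Values mirror the reference country in the lab text.
if (!exists("params")) {
  params <- list(country = "Uganda", cc = 800, y_min = 1970, y_max = 2023)
}

yrs <- params$y_min:params$y_max
length(yrs)   # 54 years when the defaults above are used
```

The `exists()` guard leaves the knitted report untouched, since `params` is already defined there.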
National events are illustrative placeholders for Uganda; reviewers and learners should confirm or replace them. The plot reshapes the profile to long form so each indicator gets its own free-scaled panel, with one vertical reference line per event.
events <- tibble::tribble(
~year, ~label,
1986, "1986 — political stabilization",
1992, "1992 — peak HIV/AIDS mortality",
2004, "2004 — national ART scale-up"
)
profile_long <- profile |>
  pivot_longer(-year, names_to = "indicator", values_to = "value") |>
  filter(!is.na(value)) |>
  mutate(indicator = factor(indicator, levels = c("TFR", "CBR", "CDR", "e0")))
ggplot(profile_long, aes(year, value)) +
  geom_vline(data = events, aes(xintercept = year),
             linetype = "dashed", colour = "grey60") +
  geom_line(linewidth = 0.8, colour = "#1f4e79") +
  facet_wrap(~ indicator, scales = "free_y") +
  labs(
    title = paste0("Demographic profile: ", params$country,
                   ", ", params$y_min, "–", params$y_max),
    x = NULL, y = NULL,
    caption = "Source: WPP 2024 (wpp2024 R package): tfr1dt, e01dt, misc1dt."
  ) +
  theme_minimal(base_size = 12) +
  theme(strip.text = element_text(face = "bold"))
| Year | Annotated event |
|---|---|
| 1986 | 1986 — political stabilization |
| 1992 | 1992 — peak HIV/AIDS mortality |
| 2004 | 2004 — national ART scale-up |
Learner deliverable: 5–10 sentences. Reference text below.
For Uganda, the profile shows a classic but incomplete demographic transition. The crude death rate falls steadily from the 1970s while the crude birth rate stays high into the 2000s, so the gap between them — natural increase — widens before it narrows, which is exactly the engine of rapid population growth. Total fertility begins its decline only later and from a high level, lagging the mortality improvement by a generation. Life expectancy at birth rises overall but carries a visible setback around the early 1990s that coincides with peak HIV/AIDS mortality, the clearest case in the profile of a national event leaving a mark in a demographic series. Reading the series against the annotated events is the habit the lab is meant to build: demographic indicators are not free-floating curves but the population’s record of its own history.
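The gap between the two crude rates that the reference text describes can be computed directly. The sketch below assumes both rates are expressed per 1,000 population, the usual convention for crude rates; the unit of the cbr and cdr columns should be confirmed against the package documentation. The toy values are invented to show the widen-then-narrow pattern and are not WPP estimates for any country.

```r
# Rate of natural increase: births minus deaths per 1,000 population.
natural_increase <- function(cbr, cdr) cbr - cdr

# Invented rates for illustration only (per 1,000; not WPP values).
toy <- data.frame(year = c(1970, 1990, 2010),
                  CBR  = c(50, 50, 40),
                  CDR  = c(20, 15, 10))
toy$RNI     <- natural_increase(toy$CBR, toy$CDR)
toy$RNI_pct <- toy$RNI / 10   # per 1,000 converted to percent per year

toy$RNI       # 30 35 30
toy$RNI_pct   # 3.0 3.5 3.0
```

On the real profile the same column could be added with `mutate(RNI = CBR - CDR)` and given its own panel, subject to the factor-level caveat that applies to any extra indicator.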