Note for reviewers. This is the full-code reference solution for the Track B lab, not the learner-facing skeleton. The learner version replaces the marked analysis blocks with TODO stubs but is otherwise identical and knits as-is. Every object and column name below was checked against the wpp2024 package data (PPgp/wpp2024); the profile uses only indicators distributed in the package, so the report knits with no external data files.

1 Setup and installation

wpp2024 is distributed through GitHub, not CRAN, because the package data exceed CRAN size limits. Install it once with remotes (or pak).

install.packages(c("remotes", "data.table", "dplyr", "tidyr", "ggplot2", "scales"))
remotes::install_github("PPgp/wpp2024")        # required
remotes::install_github("PPgp/wpp2024extra")   # optional companion indicators
library(wpp2024)
library(data.table)
library(dplyr)
library(tidyr)
library(ggplot2)
library(scales)

2 Exercise B1 — Load, look up M49 codes, and compare three countries

Goal. Load the annual wpp2024 indicator tables, use UNlocations to resolve country names to M49 codes (the same M49 standard introduced in Lesson 1.1), and plot total fertility, life expectancy at birth, and total population for three countries at different stages of the demographic transition.

2.1 Load the data

wpp2024 exposes its annual estimates as long-format data.tables whose names end in 1dt (the 1 means single-year). Column maps, as verified against the package:

Object       Columns (after lazy-load)                   Years
tfr1dt       country_code, name, year, tfr               1950–2023
e01dt        country_code, name, year, e0M, e0F, e0B     1950–2023
pop1dt       country_code, name, year, popM, popF, pop   1950–2023
UNlocations  name, country_code, reg_code, area_name, ...

The combined-sex life expectancy is e0B (not e0), and the annual estimate series ends in 2023; 2024 is the first projection year (tables tfrproj1dt, e0proj1dt, popproj1dt).

data(tfr1dt)        # total fertility rate
data(e01dt)         # life expectancy at birth, by sex (M/F) and combined (B)
data(pop1dt)        # total population (summed over ages)
data(UNlocations)   # M49 location reference table
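
Before building on these tables, it can help to confirm that the column map above matches what is actually loaded in your installed version. The helper below is a minimal sketch: check_columns() is a hypothetical name introduced here, not part of wpp2024, and the package check is guarded so the snippet also runs where wpp2024 is not installed.

```r
# Hypothetical helper (not part of wpp2024): list the expected columns that
# are absent from a table. Returns character(0) when all are present.
check_columns <- function(tab, expected) {
  setdiff(expected, names(tab))
}

# Column map copied from the table above, keyed by object name.
id_cols  <- c("country_code", "name", "year")
expected <- list(
  tfr1dt = c(id_cols, "tfr"),
  e01dt  = c(id_cols, "e0M", "e0F", "e0B"),
  pop1dt = c(id_cols, "popM", "popF", "pop")
)

if (requireNamespace("wpp2024", quietly = TRUE)) {
  env <- new.env()
  data(list = names(expected), package = "wpp2024", envir = env)
  for (obj in names(expected)) {
    if (!exists(obj, envir = env, inherits = FALSE)) {
      warning(obj, " was not found in wpp2024")
      next
    }
    missing_cols <- check_columns(get(obj, envir = env), expected[[obj]])
    if (length(missing_cols) > 0) {
      warning(obj, " is missing: ", paste(missing_cols, collapse = ", "))
    }
  }
}
```

A silent run means every expected column was found; any warning names the object and the columns to recheck.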

2.2 Look up M49 codes

UNlocations ships as a data.frame; we coerce it to a data.table to use the filter-and-select idiom, then look up the three comparison countries.

comparison <- c("Niger", "India", "Japan")   # early-, mid-, and post-transition

loc   <- as.data.table(UNlocations)
codes <- loc[name %in% comparison, .(name, country_code)]
codes
keep <- codes$country_code   # M49 codes: Niger 562, India 356, Japan 392
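
One caveat with %in%: rows come back in the order they appear in UNlocations, not in the order of comparison, and a misspelled name is dropped silently. A minimal sketch of an order-preserving lookup that fails loudly is shown below. resolve_m49() is a hypothetical helper, and the three-row loc_demo table is a stand-in for UNlocations that uses only the codes quoted above.

```r
# Hypothetical helper: look up M49 codes in the order requested and stop
# if any name has no match in the location table.
resolve_m49 <- function(loc, wanted) {
  idx <- match(wanted, loc$name)
  if (anyNA(idx)) {
    stop("No match in location table for: ",
         paste(wanted[is.na(idx)], collapse = ", "))
  }
  setNames(loc$country_code[idx], wanted)
}

# Stand-in for UNlocations (assumption: same name/country_code columns).
loc_demo <- data.frame(
  name         = c("India", "Japan", "Niger"),
  country_code = c(356, 392, 562)
)

keep_demo <- resolve_m49(loc_demo, c("Niger", "India", "Japan"))
keep_demo
```

The same call should work on the real table, e.g. resolve_m49(loc, comparison), since it only relies on the name and country_code columns.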

2.3 Assemble a tidy long frame and plot

We pull one value column from each table, label it, stack the three indicators into one long frame, and facet. Keeping everything long means a single ggplot call draws all three panels for all three countries.

tfr <- as_tibble(tfr1dt) |>
  filter(country_code %in% keep) |>
  transmute(name, year, indicator = "Total fertility rate", value = tfr)

e0 <- as_tibble(e01dt) |>
  filter(country_code %in% keep) |>
  transmute(name, year, indicator = "Life expectancy at birth (e0, both sexes)", value = e0B)

pop <- as_tibble(pop1dt) |>
  filter(country_code %in% keep) |>
  transmute(name, year, indicator = "Total population (thousands)", value = pop)

trends <- bind_rows(tfr, e0, pop) |>
  filter(year >= 1950, year <= 2023) |>
  mutate(indicator = factor(indicator, levels = c(
    "Total fertility rate",
    "Life expectancy at birth (e0, both sexes)",
    "Total population (thousands)"
  )))

# One panel per indicator; y-scales are freed because the indicators use different units
ggplot(trends, aes(year, value, colour = name)) +
  geom_line(linewidth = 0.8) +
  facet_wrap(~ indicator, ncol = 1, scales = "free_y") +
  scale_y_continuous(labels = label_comma()) +
  scale_x_continuous(breaks = seq(1950, 2020, 10)) +
  labs(
    title = "Three countries at different transition stages, 1950–2023",
    subtitle = "Annual WPP 2024 estimates",
    x = NULL, y = NULL, colour = "Country",
    caption = "Source: United Nations, WPP 2024 (wpp2024 R package)."
  ) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "top", strip.text = element_text(face = "bold"))

The three panels should show the textbook sequence: Niger still early in its fertility decline with the fastest population growth, India mid-transition, and Japan post-transition with sub-replacement fertility, the highest life expectancy, and a population that peaked around 2010 and has since been declining.
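
The growth comparison can be made concrete with the average annual exponential growth rate, r = ln(P_end / P_start) / (t_end - t_start). The sketch below is illustrative only: avg_growth_pct() is a hypothetical helper, and the second half assumes the trends frame built above is present in the session.

```r
# Hypothetical helper: average annual exponential growth rate, in percent.
avg_growth_pct <- function(pop_start, pop_end, n_years) {
  100 * log(pop_end / pop_start) / n_years
}

# A population that doubles in 35 years grows at roughly 2% per year.
avg_growth_pct(1, 2, 35)

# Apply to each country's population series, if `trends` exists.
if (exists("trends")) {
  pop_rows   <- trends[trends$indicator == "Total population (thousands)", ]
  by_country <- split(pop_rows, pop_rows$name)
  sapply(by_country, function(d) {
    d <- d[order(d$year), ]
    n <- nrow(d)
    avg_growth_pct(d$value[1], d$value[n], d$year[n] - d$year[1])
  })
}
```

Because the rate is a ratio of populations, the units of pop (thousands) cancel out.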


3 Exercise B2 — Single-country demographic profile as a reproducible report

Goal. For the reference country (Uganda, M49 800), compile a tidy annual profile — TFR, CBR, e0, CDR — for 1970–2023, draw a multi-panel summary, and annotate national events that plausibly left marks in the series. All four indicators are distributed in wpp2024, so this exercise needs no external data files.

3.1 Indicator sources

Indicator  Source object  Column  Package
TFR        tfr1dt         tfr     wpp2024
e0         e01dt          e0B     wpp2024
CBR        misc1dt        cbr     wpp2024
CDR        misc1dt        cdr     wpp2024

misc1dt also carries cnmr (crude net-migration rate), growthrate, and births/deaths counts, so a fifth panel (e.g. net migration) can be added by pulling one more column.

3.2 Assemble the profile

data(misc1dt)   # country_code, name, year, births, cbr, cdr, deaths, cnmr, growthrate, ...
cc  <- params$cc
yrs <- params$y_min:params$y_max

# Start from TFR for the requested years, then join e0 and the crude rates by year
profile <- as_tibble(tfr1dt) |>
  filter(country_code == cc, year %in% yrs) |>
  select(year, TFR = tfr) |>
  left_join(as_tibble(e01dt)   |> filter(country_code == cc) |> select(year, e0 = e0B),  by = "year") |>
  left_join(as_tibble(misc1dt) |> filter(country_code == cc) |> select(year, CBR = cbr, CDR = cdr), by = "year")
profile
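
The chunk above reads cc, y_min, and y_max from params, which is presumably supplied by a params block in the report's YAML header and therefore exists only while knitting. For running the chunks interactively, a fallback along these lines is one option. It assumes the field names used in the code (country, cc, y_min, y_max) and the Uganda defaults stated in the goal; adjust it if the header differs.

```r
# Fallback for interactive use only; when knitting, `params` already exists
# and this block leaves it untouched.
if (!exists("params")) {
  params <- list(country = "Uganda", cc = 800, y_min = 1970, y_max = 2023)
}

# Fail early on an unusable parameter set.
stopifnot(
  is.numeric(params$cc),
  params$y_min <= params$y_max
)

yrs <- params$y_min:params$y_max
range(yrs)
```

Run it before the profile chunk so that cc and yrs resolve the same way in both modes.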

3.3 Multi-panel profile with event annotations

National events are illustrative placeholders for Uganda; reviewers and learners should confirm or replace them. The plot reshapes the profile to long form so each indicator gets its own free-scaled panel, with one vertical reference line per event.

events <- tibble::tribble(
  ~year, ~label,
  1986,  "1986 — political stabilization",
  1992,  "1992 — peak HIV/AIDS mortality",
  2004,  "2004 — national ART scale-up"
)
profile_long <- profile |>
  pivot_longer(-year, names_to = "indicator", values_to = "value") |>
  filter(!is.na(value)) |>
  mutate(indicator = factor(indicator, levels = c("TFR", "CBR", "CDR", "e0")))

ggplot(profile_long, aes(year, value)) +
  geom_vline(data = events, aes(xintercept = year),
             linetype = "dashed", colour = "grey60") +
  geom_line(linewidth = 0.8, colour = "#1f4e79") +
  facet_wrap(~ indicator, scales = "free_y") +
  labs(
    title = paste0("Demographic profile: ", params$country,
                   ", ", params$y_min, "–", params$y_max),
    x = NULL, y = NULL,
    caption = "Source: WPP 2024 (wpp2024 R package): tfr1dt, e01dt, misc1dt."
  ) +
  theme_minimal(base_size = 12) +
  theme(strip.text = element_text(face = "bold"))
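
The plot above draws the event lines but not their labels, which appear only in the table. One way to put the labels on the panels is a rotated text layer pinned to the top of each facet. This is a sketch rather than part of the required output: add_event_labels() is a hypothetical helper, the justification values are a starting point to tune by eye, and the usage at the end assumes profile_long and events from the chunk above.

```r
library(ggplot2)

# Hypothetical helper: add one rotated label per event at the top of each panel.
# `events` needs the columns `year` and `label`.
add_event_labels <- function(p, events, size = 3) {
  p +
    geom_text(
      data = events,
      aes(x = year, y = Inf, label = label),
      inherit.aes = FALSE,          # events has no `value` column
      angle = 90, hjust = 1.05, vjust = -0.4,
      size = size, colour = "grey40"
    )
}

# Usage, if the objects from the chunk above are present.
if (exists("profile_long") && exists("events")) {
  p <- ggplot(profile_long, aes(year, value)) +
    geom_line() +
    facet_wrap(~ indicator, scales = "free_y")
  add_event_labels(p, events)
}
```

Because events has no indicator column, the labels repeat in every panel, matching the behaviour of the reference lines.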

Illustrative national events (confirm/replace per country).
Year  Annotated event
1986  political stabilization
1992  peak HIV/AIDS mortality
2004  national ART scale-up

3.4 Narrative interpretation

Learner deliverable: 5–10 sentences. Reference text below.

For Uganda, the profile shows a classic but incomplete demographic transition. The crude death rate falls over the period as a whole while the crude birth rate stays high into the 2000s, so the gap between them (natural increase) widens before it narrows; that widening gap is the engine of rapid population growth. Total fertility begins its decline only later and from a high level, lagging the mortality improvement by a generation. Life expectancy at birth rises overall but carries a visible setback around the early 1990s that coincides with peak HIV/AIDS mortality, the clearest case in the profile of a national event leaving a mark in a demographic series. Reading the series against the annotated events is the habit the lab is meant to build: demographic indicators are not free-floating curves but the population's record of its own history.
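
The gap described above is the rate of natural increase, RNI = CBR - CDR. WPP reports crude rates per 1,000 population, so RNI is in the same units and dividing by 10 gives percent per year. A minimal sketch, assuming the CBR and CDR columns of the profile from section 3.2 (add_natural_increase() is a hypothetical helper):

```r
# Hypothetical helper: append the rate of natural increase to a profile
# that has CBR and CDR columns.
add_natural_increase <- function(profile) {
  stopifnot(all(c("CBR", "CDR") %in% names(profile)))
  profile$RNI     <- profile$CBR - profile$CDR   # per 1,000 population
  profile$RNI_pct <- profile$RNI / 10            # percent per year
  profile
}

# Year in which natural increase was largest, if `profile` exists.
if (exists("profile")) {
  profile_rni <- add_natural_increase(profile)
  profile_rni[which.max(profile_rni$RNI), c("year", "CBR", "CDR", "RNI")]
}
```

Locating the year of maximum RNI gives learners a concrete number to anchor the "widens before it narrows" claim in their own narrative.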