Fertility Rates and Longevity From 1960 to 2016

Author

Rin Hwang

library(tidyverse)
library(dslabs)
data(package = "dslabs")  # list the datasets that ship with dslabs
data("gapminder")         # load the gapminder data frame
names(gapminder)
[1] "country"          "year"             "infant_mortality" "life_expectancy" 
[5] "fertility"        "population"       "gdp"              "continent"       
[9] "region"          
countries_inuse <- c("Canada", "Germany", "Japan", "Ethiopia", "Cambodia", "Peru")

filtered_data <- gapminder %>% 
  filter(country %in% countries_inuse)
ggplot(filtered_data, aes(x = fertility, y = life_expectancy, color = country)) +
  geom_point(size = 2, alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE, lty = 2, linewidth = 0.3) +
  labs(
    x = "Fertility Rate (Births per Woman)", 
    y = "Life Expectancy (Years)", 
    title = "Comparing Fertility Rates and Life Expectancy From 1960 to 2016", 
    caption = "Source: DS Labs"
  ) +
  scale_color_brewer(name = "Country", palette = "Set1") +
  theme_minimal(base_family = "serif")
`geom_smooth()` using formula = 'y ~ x'
Warning: Removed 6 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 6 rows containing missing values or values outside the scale range
(`geom_point()`).
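
The warnings above say that six rows were dropped before plotting. As a quick check, the sketch below lists the country and year combinations where either plotted variable is missing. It rebuilds the filtered data so that it runs on its own; the name `missing_rows` is introduced here for illustration only.

```r
library(dplyr)
library(dslabs)

data("gapminder")

countries_inuse <- c("Canada", "Germany", "Japan", "Ethiopia", "Cambodia", "Peru")

# Keep only the rows that ggplot2 cannot draw: those where
# fertility or life expectancy is NA for the selected countries.
missing_rows <- gapminder %>%
  filter(country %in% countries_inuse) %>%
  filter(is.na(fertility) | is.na(life_expectancy)) %>%
  select(country, year, fertility, life_expectancy)

print(missing_rows)
```

If the row count matches the number in the warnings, the dropped points come from gaps in the source data, not from any axis limits.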

In this visualization, I used the “gapminder” dataset from the dslabs package, which records health and income outcomes for 184 countries from 1960 to 2016. I wanted to compare fertility rate and life expectancy across different world regions, so I filtered the data to show only Cambodia, Canada, Ethiopia, Germany, Japan, and Peru, and plotted the two variables as a scatterplot. geom_point draws one point per country and year, geom_smooth adds a dashed linear trend line for each country, and scale_color_brewer gives each country a distinct color and titles the legend. The general trend is a negative relationship: the lower the fertility rate, the higher the life expectancy. Cambodia stands out with a steep dip in life expectancy at high fertility rates. Many factors could explain this dip, the most likely being the Cambodian genocide of 1975 to 1979.
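
To put a number on the negative relationship described above, one option is to compute the correlation and the fitted linear slope for each country. This is a minimal sketch and not part of the original analysis; the name `trend_summary` and the choice of Pearson correlation are assumptions made for illustration.

```r
library(dplyr)
library(dslabs)

data("gapminder")

countries_inuse <- c("Canada", "Germany", "Japan", "Ethiopia", "Cambodia", "Peru")

# For each country, summarise the fertility / life expectancy relationship
# using the same linear model that geom_smooth(method = "lm") fits.
trend_summary <- gapminder %>%
  filter(country %in% countries_inuse) %>%
  filter(!is.na(fertility), !is.na(life_expectancy)) %>%  # drop incomplete rows
  group_by(country) %>%
  summarise(
    n_years     = n(),
    correlation = cor(fertility, life_expectancy),
    slope       = coef(lm(life_expectancy ~ fertility))[["fertility"]],
    .groups     = "drop"
  )

print(trend_summary)
```

A negative `slope` means that life expectancy tends to be higher in years when fertility is lower for that country. The slope is in years of life expectancy per additional birth per woman, so it describes an association over time, not a causal effect.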