---
title: "FINAL PROJECT"
author: "Mukwaya Kasimu"
date: "2025-04-08"
output:
  html_document:
    toc: true
    toc_float: true
    self_contained: true
---

## Dataset Description
For this analysis, I've chosen the gapminder dataset, which contains demographic and economic indicators for countries over time. The dataset is particularly interesting because its time series component makes it well suited to the animated visualization later in this report.
if (!dir.exists("animation_frames")) {
  dir.create("animation_frames")
}
library(ggplot2)
library(gapminder)
library(gganimate)
library(gifski)
Variables in the gapminder dataset:

- `country`: Factor with 142 levels, representing different countries
- `continent`: Factor with 5 levels (Africa, Americas, Asia, Europe, Oceania)
- `year`: Integer (1952-2007 in 5-year increments)
- `lifeExp`: Numeric, life expectancy at birth in years
- `pop`: Integer, total population
- `gdpPercap`: Numeric, GDP per capita (US$, inflation-adjusted)

This dataset tracks the development of countries across several decades, making it ideal for exploring relationships between economic indicators and life expectancy over time.
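The variable descriptions above can be checked directly against the data. A minimal sketch, assuming only that the gapminder package is installed; the object names are introduced here for illustration:

```r
library(gapminder)

# Inspect column types and the first few values of each variable
str(gapminder)

# Recompute the counts quoted in the variable list
n_countries  <- nlevels(gapminder$country)
n_continents <- nlevels(gapminder$continent)
survey_years <- sort(unique(gapminder$year))

n_countries
n_continents
survey_years
```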
The first visualization is a static scatter plot showing the relationship between GDP per capita and life expectancy, with points colored by continent and sized by population.
static_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  # Map size and color on the points only, so the regression is fitted once to all countries
  geom_point(aes(size = pop, color = continent), alpha = 0.7) +
  # Show the population legend in millions, matching its title
  scale_size(range = c(2, 12), name = "Population (M)",
             labels = scales::label_comma(scale = 1e-6)) +
  scale_x_log10(labels = scales::dollar) +
  # Single dashed trend line; the explicit formula avoids the geom_smooth() message
  geom_smooth(method = "lm", formula = y ~ x, color = "black",
              linetype = "dashed", se = FALSE) +
  labs(title = "Life Expectancy vs GDP per Capita (1952-2007)",
       x = "GDP per Capita (log scale)",
       y = "Life Expectancy (years)",
       color = "Continent") +
  theme_minimal() +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5))
print(static_plot)
Visualization Description: This static visualization explores the relationship between a country's wealth (GDP per capita) and the health of its population (life expectancy). Key features include:

- Each point represents a country in a specific year
- Points are colored by continent to show regional patterns
- Point size represents population size (larger points = more populous countries)
- Transparency (alpha = 0.7) helps with overlapping points
- A dashed regression line shows the overall positive relationship between wealth and longevity
- The x-axis (GDP) is on a log scale to better visualize the wide range of economic conditions
- The plot uses a professional theme with clear labels and a centered title

The plot clearly shows that higher GDP per capita is generally associated with longer life expectancy, though the relationship appears stronger at lower income levels (the curve flattens as countries get richer). There are also visible continental patterns, with African countries generally having lower life expectancy and GDP, while European countries cluster at the higher end of both measures.
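One way to put a number on the pattern described above is to correlate life expectancy with the logarithm of GDP per capita, and to fit the kind of straight line that `geom_smooth(method = "lm")` draws on a log-scaled x-axis. This is a rough sketch rather than a formal model; the names `log_gdp_cor` and `log_fit` are chosen here for illustration.

```r
library(gapminder)

# Correlation between life expectancy and log10 of GDP per capita
log_gdp_cor <- cor(log10(gapminder$gdpPercap), gapminder$lifeExp)

# Straight-line fit on the log scale, pooled over all countries and years
log_fit <- lm(lifeExp ~ log10(gdpPercap), data = gapminder)

log_gdp_cor
coef(log_fit)
```

The slope is the estimated change in life expectancy (in years) associated with a tenfold increase in GDP per capita. A straight line on the log scale corresponds to a curve that flattens when income is plotted on a linear axis.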
library(plotly)
From here, I convert the static ggplot2 visualization into an interactive plot using plotly.
interactive_plot <- ggplotly(static_plot, tooltip = c("country", "lifeExp", "gdpPercap", "pop"))
interactive_plot
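`ggplotly()` builds its tooltips from the aesthetics mapped in the plot, and `country` is not mapped anywhere in `static_plot`, so it may not appear in the hover box. A common pattern is to pass a custom label through the unofficial `text` aesthetic and request only that in the tooltip. The sketch below is one possible way to do this, not the only one; `tooltip_plot`, `tooltip_widget`, and the label wording are illustrative, and ggplot2 will warn that it ignores the unknown `text` aesthetic.

```r
library(ggplot2)
library(gapminder)
library(plotly)

# Pass a custom label through the "text" aesthetic (ignored by ggplot2, read by plotly)
tooltip_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(
    aes(size = pop, color = continent,
        text = paste0(country, " (", year, ")",
                      "<br>Life expectancy: ", round(lifeExp, 1),
                      "<br>GDP per capita: ", scales::dollar(gdpPercap))),
    alpha = 0.7
  ) +
  scale_x_log10(labels = scales::dollar) +
  theme_minimal()

# Show only the custom text in the hover box
tooltip_widget <- ggplotly(tooltip_plot, tooltip = "text")
tooltip_widget
```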
Finally, I create an animated version that shows how the relationship between GDP and life expectancy has evolved over time.
# Setting up directory for animation files
if(!dir.exists("animations")) dir.create("animations")
knitr::opts_chunk$set(fig.path = "animations/")
library(ggplot2)
library(gapminder)
library(gganimate)
library(gifski)
library(scales)
# Create the base plot
base_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                                   size = pop, color = continent)) +
  geom_point(alpha = 0.7) +
  # Show the population legend in millions, matching its title
  scale_size(range = c(2, 12), name = "Population (M)",
             labels = scales::label_comma(scale = 1e-6)) +
  scale_x_log10(labels = scales::dollar) +
  theme_minimal()
# Create and render the animation
animated_plot <- base_plot +
  labs(title = "Year: {frame_time}",
       x = "GDP per Capita (log scale)",
       y = "Life Expectancy (years)",
       color = "Continent") +
  transition_time(year) +
  ease_aes("linear")
animate(
  animated_plot,
  fps = 10,
  width = 6,
  height = 5,
  units = "in",
  res = 150,
  renderer = gifski_renderer()
)
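The chunk above renders the animation inside the report but does not write a standalone file. If a GIF on disk is wanted, `anim_save()` from gganimate can write a rendered animation. A minimal sketch, assuming the gifski package is installed; the file name `gapminder.gif`, the object names, and the reduced frame count and size are assumptions made here to keep the example quick:

```r
library(ggplot2)
library(gapminder)
library(gganimate)

# Minimal animated plot that moves through the survey years
anim_plot <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                                   size = pop, color = continent)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +
  labs(title = "Year: {frame_time}") +
  transition_time(year)

# Render a short, small GIF (hypothetical settings for a fast preview)
rendered <- animate(anim_plot, nframes = 24, fps = 8,
                    width = 480, height = 400,
                    renderer = gifski_renderer())

# Write the rendered animation to disk (hypothetical file name)
if (!dir.exists("animations")) dir.create("animations")
anim_save("animations/gapminder.gif", animation = rendered)
```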
## Analysis of the Animated Results

The animation reveals several important trends:

Global Improvement Over Time:

- There is a clear upward movement in both life expectancy and GDP per capita across all continents from 1952 to 2007.
- The cloud of points shifts right and upward, indicating that countries are generally getting richer and healthier over time.

Regional Divergence:

- Asian countries show dramatic improvement, particularly after 1980 (likely reflecting rapid growth in China and India).
- African countries improve, but at a slower pace, which widens the gap with other regions.
- European and American countries maintain their lead throughout the period.

Historical Events:

- The HIV/AIDS epidemic is visible in the 1990s and 2000s, when some African countries experience declines in life expectancy.
- The collapse of the Soviet Union appears to cause temporary setbacks for some Eastern European countries in the 1990s.

Changing Relationship:

- The positive correlation between GDP and life expectancy remains strong throughout.
- The relationship appears slightly less steep in later years, suggesting diminishing returns of wealth on health as countries develop.

Population Shifts:

- The increasing size of Asian points reflects rapid population growth in countries like China and India.
- Some European points shrink slightly, reflecting stagnant or declining populations.

This animation powerfully demonstrates both the overall progress in global health and wealth, as well as the persistent inequalities between regions. It suggests that while economic growth is associated with health improvements, other factors (like healthcare systems, education, and governance) likely play important roles in determining how effectively wealth translates into longer lives.
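The "Changing Relationship" observation can be checked numerically by fitting a separate line for each survey year. A rough sketch using base R; `slope_by_year` is a name introduced here, and each slope is measured in years of life expectancy per tenfold increase in GDP per capita:

```r
library(gapminder)

# Fit lifeExp ~ log10(gdpPercap) separately for each survey year
slope_by_year <- sapply(split(gapminder, gapminder$year), function(d) {
  coef(lm(lifeExp ~ log10(gdpPercap), data = d))[["log10(gdpPercap)"]]
})

round(slope_by_year, 2)
```

A sequence of slopes that falls in later years would be consistent with the diminishing-returns reading, while a roughly constant sequence would suggest that the relationship on the log scale has been stable.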
## Conclusion

This analysis has demonstrated:

- Data exploration and visualization techniques in R
- The power of ggplot2 for creating sophisticated static visualizations
- How to enhance exploration with interactive plotly charts
- The unique insights that can be gained from animated visualizations of time series data

The gapminder dataset reveals both encouraging global trends and persistent inequalities that warrant further investigation. The tools used here - ggplot2, plotly, and gganimate - provide a powerful toolkit for exploring and communicating such data stories.