Assignment — LA1
What this assignment explores:
Load required libraries
The three-package toolkit for this analysis is:
ggplot2 - The standard grammar-of-graphics engine for data visualization in R. Powers the bubble chart.
dplyr - Provides pipe-friendly verbs — filter(), mutate() — for cleaning and reshaping the dataset.
countrycode - Converts country names into standardized continent labels — the categorical grouping for our chart.
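A minimal sketch of the loading step, assuming all three packages are already installed (the install line is left commented out). Wrapping dplyr in suppressPackageStartupMessages() is an optional choice made here to keep its masking notices out of a rendered report; it is not something the assignment requires.

```r
# One-time setup, only if the packages are not installed yet:
# install.packages(c("ggplot2", "dplyr", "countrycode"))

library(ggplot2)                                 # plotting engine for the bubble chart
suppressPackageStartupMessages(library(dplyr))   # data verbs; hides masking notices
library(countrycode)                             # country name -> continent lookup
```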
Data extraction
Pulling live data from Our World in Data
Source - Our World in Data (Life Expectancy vs GDP per Capita dataset), loaded directly via url() with no manual download.
Column standardization - All column names are lowercased and special characters are replaced with underscores, which prevents “object not found” errors during plotting.
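The two steps above could be sketched roughly as follows. The helper names standardize_names() and load_life_gdp() are hypothetical, and csv_url is a placeholder argument because the actual dataset address is not given here. The demonstration at the bottom runs offline on made-up column headers rather than on the real file.

```r
# Hypothetical helper: lowercase every name and turn runs of
# non-alphanumeric characters into a single underscore.
standardize_names <- function(df) {
  nm <- tolower(names(df))
  nm <- gsub("[^a-z0-9]+", "_", nm)   # special characters -> underscore
  nm <- gsub("^_+|_+$", "", nm)       # trim leading/trailing underscores
  names(df) <- nm
  df
}

# Hypothetical loader: csv_url is a placeholder, not the real dataset address.
load_life_gdp <- function(csv_url) {
  con <- url(csv_url)
  # check.names = FALSE keeps the original headers so we control the cleanup
  raw <- read.csv(con, check.names = FALSE, stringsAsFactors = FALSE)
  standardize_names(raw)
}

# Offline demonstration on made-up headers (assumed, not the real column names).
demo <- data.frame(a = 1, b = 2, c = 3)
names(demo) <- c("Entity", "GDP per capita", "Life expectancy (years)")
clean <- standardize_names(demo)
print(names(clean))
```

Doing the cleanup once, right after loading, means every later step can refer to columns by plain names without backticks.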
Data cleaning & categorical grouping
Filtering, renaming, and mapping countries to continents
Why filter? - The raw data includes rows like “World” and “High Income”, which are regional aggregates, not countries. These are removed by checking that the code column is non-empty.
Continent mapping - countrycode() translates country names into Africa, Americas, Asia, Europe, Oceania, which become the color grouping in the chart.
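A minimal sketch of the filter, rename, and mapping steps described above. The rows and the column names entity and code are assumptions made for illustration; the real dataset may name them differently. The final filter() on missing continents is an extra safety step added here, not something stated in the original workflow.

```r
suppressPackageStartupMessages(library(dplyr))
library(countrycode)

# Made-up rows for illustration only; column names are assumed.
raw <- data.frame(
  entity = c("India", "Germany", "Brazil", "World", "High income"),
  code   = c("IND", "DEU", "BRA", "", ""),
  stringsAsFactors = FALSE
)

countries <- raw %>%
  filter(!is.na(code), code != "") %>%    # drop aggregates with no country code
  rename(country = entity) %>%
  mutate(continent = countrycode(country,
                                 origin = "country.name",
                                 destination = "continent")) %>%
  filter(!is.na(continent))               # drop names countrycode could not match

print(countries)
```

Filtering before calling countrycode() keeps aggregate names such as "World" from triggering unmatched-name warnings.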
The bubble chart - Visualizing three numeric dimensions and one categorical grouping simultaneously
Aesthetic mappings
x axis — GDP per capita (log scale)
y axis — Life expectancy in years
size — Population of the country
color — Continent category
Why log scale on x? - GDP per capita ranges from ~$800 to ~$100,000. Without scale_x_log10(), most points would cluster against the left axis and the chart would be hard to read.
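The effect of the log scale can be checked with a little arithmetic. The sketch below uses five placeholder GDP values spanning the range quoted above (not real country figures) and computes how far across the axis each one would be drawn on a linear versus a log10 scale. The helper axis_position() is a hypothetical name introduced only for this illustration.

```r
# Placeholder GDP-per-capita values spanning the quoted range.
gdp <- c(800, 3000, 10000, 30000, 100000)

# Fraction of the axis width at which a value is drawn
# (0 = left edge, 1 = right edge).
axis_position <- function(x) (x - min(x)) / (max(x) - min(x))

linear_pos <- axis_position(gdp)          # positions on a plain linear axis
log_pos    <- axis_position(log10(gdp))   # positions after a log10 transform

print(round(data.frame(gdp, linear_pos, log_pos), 3))
```

On the linear axis, $10,000 sits less than a tenth of the way across, while on the log axis it sits just past the midpoint, so the lower-income values get room to spread out.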
ggplot2 code - Full plotting block with chunk options
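The plotting block itself is not reproduced here, so what follows is a sketch of what such a block could look like, not the author's exact code. The data frame holds placeholder rows (not real statistics), the column names are assumed, and the Quarto chunk options on the #| lines are example values. scale_size_area() is a choice made for this sketch so that bubble area, rather than radius, tracks population.

```r
#| label: fig-bubble
#| fig-width: 10
#| fig-height: 6
#| warning: false

library(ggplot2)

# Placeholder rows for illustration only; not real statistics.
plot_data <- data.frame(
  country         = c("A", "B", "C", "D"),
  gdp_per_capita  = c(1200, 6500, 18000, 52000),
  life_expectancy = c(58, 69, 75, 82),
  population      = c(2.0e7, 1.3e9, 2.1e8, 8.0e7),
  continent       = c("Africa", "Asia", "Americas", "Europe")
)

p <- ggplot(plot_data,
            aes(x = gdp_per_capita, y = life_expectancy,
                size = population, color = continent)) +
  geom_point(alpha = 0.6) +          # transparency keeps overlapping bubbles visible
  scale_x_log10() +                  # spread out the low-GDP end of the axis
  scale_size_area(max_size = 18) +   # bubble area proportional to population
  labs(title = "Life Expectancy vs GDP per Capita",
       x = "GDP per capita (log scale)",
       y = "Life expectancy (years)",
       size = "Population",
       color = "Continent") +
  theme_minimal()

print(p)
```

When run outside Quarto, the #| lines are treated as ordinary R comments, so the same block works in a plain script.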
What the chart reveals about global health and wealth
Regional breakdown
AF - Africa — predominantly lower-left quadrant; significant room for growth in both wealth and life expectancy
AS - Asia — largest bubbles (India, China); a middle-ground transition with high population and rising longevity despite varied GDP
AM - Americas — wide spread; North America leads in GDP while Latin America shows strong health outcomes relative to income
EU - Europe — clustered in the top-right corner, representing high longevity and economic stability
What we showed
A raw CSV of thousands of rows was transformed into a single intuitive visual narrative — highlighting global inequality, population scale, and the diminishing returns of wealth on health.
Techniques used
live data extraction
dplyr pipes
countrycode mapping
log scale
alpha transparency
size encoding
Quarto Report (RPubs): https://rpubs.com/anushayk29/1418835
Quarto Presentation (RPubs): https://rpubs.com/anushayk29/1419167
Demonstration Video: https://drive.google.com/file/d/1mcxeVGJ19T1rPfvEs8zdApKSVGqH6Oyr/view?usp=sharing
Life Expectancy vs GDP Analysis