Dataset Overview and Source

Gapminder Dataset

This analysis examines 1,704 observations across 142 countries and 5 continents, spanning 1952–2007.

Data Source: Gapminder Foundation — gapminder.org

Key Variables:

  • Country / Continent: Geographic identifiers across 5 continents
  • Year: Observations every 5 years from 1952 to 2007
  • lifeExp: Life expectancy at birth, in years
  • pop: Total population
  • gdpPercap: GDP per capita, inflation-adjusted USD
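
The counts above can be checked directly. This is a minimal sketch that assumes `gap` (the data frame used in the models later in this document) is the `gapminder` data frame from the CRAN package of the same name; the source does not show how `gap` was created.

```r
# Assumption: `gap` is the data frame shipped with the CRAN package "gapminder".
library(gapminder)
gap <- gapminder

n_obs        <- nrow(gap)                               # number of observations
n_countries  <- length(unique(gap$country))             # distinct countries
n_continents <- length(unique(gap$continent))           # distinct continents
year_range   <- range(gap$year)                         # first and last survey year
year_step    <- unique(diff(sort(unique(gap$year))))    # spacing between survey years

c(n_obs = n_obs, n_countries = n_countries, n_continents = n_continents)
year_range
year_step
```

If `gap` was built some other way, for example filtered or merged with other data, these figures may differ.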

Life Expectancy by Continent

Europe and Oceania show the highest and most consistent life expectancy, with medians above 70 years. Asia and the Americas sit in the mid-range but with notably wider spreads. Africa has the lowest median (~49 years) and the greatest variability, reflecting deep inequality in health outcomes across the continent.
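
The medians and spreads described above can be tabulated as a numeric companion to the plot. This is a sketch under the assumption that `gap` is the `gapminder` data frame from the CRAN package; the interquartile range is used here as one possible measure of spread, which may not be the measure the plot conveys.

```r
# Assumption: `gap` is the data frame shipped with the CRAN package "gapminder".
library(gapminder)
gap <- gapminder

# Median and interquartile range of life expectancy per continent,
# pooled over all years.
med <- tapply(gap$lifeExp, gap$continent, median)
iqr <- tapply(gap$lifeExp, gap$continent, IQR)

round(cbind(median = med, IQR = iqr), 1)
```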

GDP per Capita vs Life Expectancy

Higher GDP per capita is a strong predictor of longer life expectancy on every continent. The log-linear trend implies diminishing returns at the top of the income range: each additional dollar buys more life expectancy between $10K and $30K than between $30K and $100K. African nations cluster in the low-GDP, low-lifeExp segment, while European nations sit at the top right. The two largest bubbles, China and India, occupy the middle of both axes, showing that population size is unrelated to average wealth.
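
The diminishing returns can be made concrete with a little arithmetic. This sketch borrows the slope on log10(gdpPercap) from the regression output reported later in this document (11.57) and ignores the other terms, which cancel when comparing two GDP levels for the same year and continent. The helper name `gain_years` is hypothetical.

```r
# Slope on log10(gdpPercap), copied from the regression output in this document.
b_log_gdp <- 11.57

# Predicted change in life expectancy (years) when GDP per capita
# moves from `from` to `to`, everything else held fixed.
gain_years <- function(from, to, slope = b_log_gdp) slope * log10(to / from)

gain_low  <- gain_years(10000, 30000)     # $10K -> $30K
gain_high <- gain_years(30000, 100000)    # $30K -> $100K

# Gain per additional $1,000 of GDP per capita in each range.
per_1k_low  <- gain_low  / ((30000 - 10000) / 1000)
per_1k_high <- gain_high / ((100000 - 30000) / 1000)

round(c(gain_low, gain_high), 2)        # 5.52 6.05
round(c(per_1k_low, per_1k_high), 3)    # 0.276 0.086
```

The two steps add a similar number of years in total, but the second costs $70K rather than $20K, so the return per dollar is roughly a third as large.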

Life Expectancy Trends Over Time

The graph shows that life expectancy rose steadily on every continent between 1952 and 2007. Oceania and Europe had the highest life expectancy throughout the period, and Africa the lowest. The Americas and Asia both improved markedly, with Asia showing one of the fastest gains. Overall, the trend reflects worldwide progress in healthcare, living standards, and technology, although large differences between regions remain.
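
The trend can also be summarized numerically. This sketch assumes `gap` is the `gapminder` data frame from the CRAN package and uses unweighted country means per continent and year, which may differ from the aggregation used in the graph.

```r
# Assumption: `gap` is the data frame shipped with the CRAN package "gapminder".
library(gapminder)
gap <- gapminder

# Mean life expectancy for every continent (rows) and year (columns).
means <- tapply(gap$lifeExp, list(gap$continent, gap$year), mean)

# Total gain per continent between the first and last survey year.
gain <- means[, "2007"] - means[, "1952"]

round(means[, c("1952", "2007")], 1)
round(sort(gain, decreasing = TRUE), 1)
```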

GDP, Life Expectancy & Population in 3D

This graph confirms that GDP and life expectancy move together, while population size is largely independent of both. European and Oceanian countries cluster tightly in the high-GDP, high-lifeExp corner at moderate population sizes. African nations scatter across the low-GDP, low-lifeExp region at varying populations. China and India are the clear outliers on the population axis, with log10 values above 8.5, while sitting at mid-range GDP and life expectancy.
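
The population outliers can be identified without the 3D view. This sketch assumes `gap` is the `gapminder` data frame from the CRAN package. The 8.5 threshold is the one quoted above, and the correlations are a rough check of the independence claim, not a formal test.

```r
# Assumption: `gap` is the data frame shipped with the CRAN package "gapminder".
library(gapminder)
gap <- gapminder

# Countries whose log10(population) exceeds 8.5 in at least one year.
big <- unique(as.character(gap$country[log10(gap$pop) > 8.5]))
big

# Pairwise correlations on the scales used in the plot.
cor_gdp_life <- cor(log10(gap$gdpPercap), gap$lifeExp)
cor_pop_life <- cor(log10(gap$pop), gap$lifeExp)
cor_pop_gdp  <- cor(log10(gap$pop), log10(gap$gdpPercap))
round(c(gdp_life = cor_gdp_life, pop_life = cor_pop_life, pop_gdp = cor_pop_gdp), 2)
```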

Statistical Analysis: Linear Regression

model <- lm(lifeExp ~ log10(gdpPercap) + year + continent,
            data = gap)
summary(model)
## 
## Call:
## lm(formula = lifeExp ~ log10(gdpPercap) + year + continent, data = gap)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.0433  -3.2175   0.3482   3.6657  15.1321 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -4.659e+02  1.667e+01  -27.94   <2e-16 ***
## log10(gdpPercap)   1.157e+01  3.672e-01   31.50   <2e-16 ***
## year               2.416e-01  8.586e-03   28.14   <2e-16 ***
## continentAmericas  8.926e+00  4.630e-01   19.28   <2e-16 ***
## continentAsia      7.063e+00  3.959e-01   17.84   <2e-16 ***
## continentEurope    1.251e+01  5.097e-01   24.54   <2e-16 ***
## continentOceania   1.275e+01  1.275e+00   10.00   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.813 on 1697 degrees of freedom
## Multiple R-squared:  0.7982, Adjusted R-squared:  0.7975 
## F-statistic:  1119 on 6 and 1697 DF,  p-value: < 2.2e-16

Regression Interpretation

Detailed Findings:

  • Strong model fit: The model explains 79.8% of variance in life expectancy (R² = 0.798), with all predictors significant at p < 0.001.

  • GDP effect: A 10× increase in GDP per capita is associated with +11.6 years of life expectancy, making it the strongest predictor in the model by t-statistic (31.5).

  • Time trend: Each additional calendar year adds +0.24 years at fixed GDP per capita and continent, reflecting steady global health improvements independent of wealth.

  • Continent offsets: Relative to Africa (baseline), Oceania and Europe lead with +12.8 and +12.5 years respectively, while the Americas and Asia gain +8.9 and +7.1 years.

  • Residual error: A residual standard error of 5.8 years indicates a reasonable fit but leaves room for country-level factors not captured by GDP, year, and continent.
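
To see how the coefficients combine, the fitted equation can be evaluated by hand. This sketch uses the rounded estimates printed in the summary above, so results can differ slightly from `predict(model, ...)`. The helper `predict_life_exp` and the two example inputs are hypothetical and chosen only for illustration.

```r
# Rounded coefficients copied from the regression output above.
coefs <- c(intercept = -465.9, log_gdp = 11.57, year = 0.2416,
           Americas = 8.926, Asia = 7.063, Europe = 12.51, Oceania = 12.75)

# Hypothetical helper: predicted life expectancy from the fitted equation.
# Africa is the baseline continent, so its offset is zero.
predict_life_exp <- function(gdp_percap, year, continent = "Africa") {
  offset <- if (continent == "Africa") 0 else coefs[[continent]]
  coefs[["intercept"]] +
    coefs[["log_gdp"]] * log10(gdp_percap) +
    coefs[["year"]] * year +
    offset
}

africa_2007 <- predict_life_exp(1000, 2007, "Africa")    # about 53.7 years
europe_2007 <- predict_life_exp(10000, 2007, "Europe")   # about 77.8 years

round(c(africa_2007, europe_2007, europe_2007 - africa_2007), 1)
```

The difference between the two predictions is 11.57 + 12.51 = 24.08 years: one 10× step in GDP per capita plus the Europe offset.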

ANOVA: Life Expectancy Across Continents

anova_model <- aov(lifeExp ~ continent, data = gap)
summary(anova_model)
##               Df Sum Sq Mean Sq F value Pr(>F)    
## continent      4 139343   34836   408.7 <2e-16 ***
## Residuals   1699 144805      85                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ANOVA Interpretation

Detailed Findings:

  • Strong significance: The ANOVA produces F = 408.7 (p < 0.001), confirming that mean life expectancy differs across continents. A result this extreme would be very unlikely if all continents shared the same mean.

  • Between-group variance: The continent factor accounts for a sum of squares of 139,343, nearly matching the residual sum of 144,805, so continent alone explains close to half (about 49%) of all variation in life expectancy.

  • Within-group variance: The mean squared residual of 85 years² reflects that considerable spread exists within continents, particularly in Africa and Asia, where country-level outcomes vary widely.

  • Continent as a key predictor: An F-statistic of 408.7 means the between-continent mean square (34,836) is roughly 409 times the within-continent mean square (85), a remarkably strong grouping effect for a single categorical variable.
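
The quantities in these findings follow directly from the ANOVA table. This sketch recomputes them from the printed sums of squares and degrees of freedom, with no access to the data, so small rounding differences from the R output are expected.

```r
# Values copied from the ANOVA table above.
ss_between <- 139343; df_between <- 4
ss_within  <- 144805; df_within  <- 1699

# Mean squares: sum of squares divided by degrees of freedom.
ms_between <- ss_between / df_between   # 34835.75
ms_within  <- ss_within  / df_within    # about 85.2

# F is the ratio of mean squares, not of raw sums of squares.
f_stat <- ms_between / ms_within        # about 408.7

# Share of total variation attributable to continent (eta squared).
eta_sq <- ss_between / (ss_between + ss_within)   # about 0.49

round(c(ms_between = ms_between, ms_within = ms_within,
        F = f_stat, eta_sq = eta_sq), 3)
```

The two ratios answer different questions: F compares variance estimates per degree of freedom, while eta squared gives the share of total variation explained.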

Key Insights and Conclusions

Major Findings:

First: GDP per capita is the strongest single predictor of life expectancy, with each 10× increase associated with +11.6 years after adjusting for continent and year.

Second: Continent of residence is a powerful determinant in its own right, with the ANOVA showing a between-continent mean square about 409 times the within-continent mean square (F = 408.7, p < 0.001).

Third: Time matters independently of wealth. Life expectancy rose by roughly +0.24 years per calendar year at fixed GDP per capita and continent, consistent with improvements in medicine, sanitation, and public health that are not tied to income.

Thank You