Analysis of Gapminder Data

Description: This analysis explores key economic and health indicators (GDP per capita, population, and life expectancy) from the Gapminder dataset.

Objective: Identify key trends in these indicators, visualize the findings with appropriate plots, and interpret the results.

Key Questions:

  1. Which countries have the highest and lowest life expectancy since 2000, and how has it changed over time?
  2. How do GDP per capita and life expectancy correlate across continents and countries?
  3. Are there observable patterns or trends in economic development at the continental level?
  4. Which countries and regions have made the most significant improvements in life expectancy and GDP since 2000?
  5. How can data visualization be used to illustrate these global disparities and changes in development?
library(dplyr)    
library(ggplot2)  
library(gapminder)
# Data Wrangling
# Select the relevant columns - country, continent, year, lifeExp, gdpPercap, and pop
gapminder_selected <- gapminder %>% 
  select(country, continent, year, lifeExp, gdpPercap, pop)

# Filter the data for the years 2000 and onwards
gapminder_filtered <- gapminder_selected %>% 
  filter(year >= 2000)

# Create a new column for GDP (Gross Domestic Product) by multiplying population with GDP per capita
gapminder_transformed <- gapminder_filtered %>% 
  mutate(GDP = pop * gdpPercap)

# Table Output
# Generate a summary table showing the top 20 countries with the highest average life expectancy from 2000 onwards
life_exp_summary <- gapminder_transformed %>% 
  group_by(country) %>% 
  summarise(avg_lifeExp = mean(lifeExp, na.rm = TRUE)) %>% 
  arrange(desc(avg_lifeExp)) %>% 
  head(20)

print("Top 20 Countries with Highest Average Life Expectancy (2000 Onwards)")
## [1] "Top 20 Countries with Highest Average Life Expectancy (2000 Onwards)"
print(life_exp_summary)
## # A tibble: 20 × 2
##    country          avg_lifeExp
##    <fct>                  <dbl>
##  1 Japan                   82.3
##  2 Hong Kong, China        81.9
##  3 Switzerland             81.2
##  4 Iceland                 81.1
##  5 Australia               80.8
##  6 Sweden                  80.5
##  7 Italy                   80.4
##  8 Spain                   80.4
##  9 Israel                  80.2
## 10 Canada                  80.2
## 11 France                  80.1
## 12 New Zealand             79.7
## 13 Norway                  79.6
## 14 Austria                 79.4
## 15 Singapore               79.4
## 16 Netherlands             79.1
## 17 Germany                 79.0
## 18 United Kingdom          78.9
## 19 Belgium                 78.9
## 20 Greece                  78.9
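
The first key question also asks about the lowest life expectancy, which the table above does not cover. The following is a minimal sketch of the mirror-image summary, assuming the same averaging over 2000 onwards; the name `life_exp_lowest` is illustrative and not part of the original analysis.

```r
library(dplyr)
library(gapminder)

# Bottom 20 countries by average life expectancy from 2000 onwards.
# `life_exp_lowest` is a hypothetical name introduced for this sketch.
life_exp_lowest <- gapminder %>%
  filter(year >= 2000) %>%
  group_by(country) %>%
  summarise(avg_lifeExp = mean(lifeExp, na.rm = TRUE)) %>%
  arrange(avg_lifeExp) %>%   # ascending, so the lowest averages come first
  head(20)

print(life_exp_lowest)
```

Sorting in ascending order and reusing `head(20)` keeps this summary directly comparable with the top-20 table.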

Data Visualization

Create a scatter plot of GDP per capita vs Life Expectancy, colored by continent
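
A minimal sketch of such a scatter plot is given here, assuming the data are restricted to 2000 onwards as in the wrangling step. The log-scaled x-axis, the point transparency, and the names `scatter_data` and `scatter_plot` are assumptions made for this sketch, not choices taken from the original analysis.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

# Rebuild the filtered data so the block runs on its own.
scatter_data <- gapminder %>%
  filter(year >= 2000)

# GDP per capita vs life expectancy, one point per country-year, colored by continent.
scatter_plot <- ggplot(scatter_data, aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point(alpha = 0.7) +
  scale_x_log10() +  # assumption: a log scale spreads out the low-income countries
  labs(title = "GDP per Capita vs Life Expectancy (2000 Onwards)",
       x = "GDP per Capita (log scale)",
       y = "Life Expectancy",
       color = "Continent") +
  theme_minimal()

print(scatter_plot)
```

GDP per capita spans several orders of magnitude, so a linear x-axis would compress most countries against the left edge; that is the reason for the log scale here.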

# Line plot for life expectancy trends of specific countries
line_plot <- function(data, countries) {
  data %>% 
    filter(country %in% countries) %>% 
    ggplot(aes(x = year, y = lifeExp, color = country)) +
    geom_line(linewidth = 1) +  # `linewidth` replaces the deprecated `size` for lines (ggplot2 >= 3.4.0)
    labs(title = "Life Expectancy Trend Over Time", 
         x = "Year", 
         y = "Life Expectancy") +
    theme_minimal()
}

# Call the function for specific countries
line_plot(gapminder, c("India", "United States", "China", "South Africa"))

# Facet plot of life expectancy from 2000 onwards, one panel per continent,
# for side-by-side comparison. `group = country` draws one line per country;
# without it, geom_line() would join all countries in a panel into a single zigzag.
facet_plot <- ggplot(gapminder_transformed, aes(x = year, y = lifeExp, group = country, color = continent)) + 
  geom_line() + 
  facet_wrap(~continent) + 
  labs(title = "Life Expectancy Over Time by Continent (2000 Onwards)",
       x = "Year", 
       y = "Life Expectancy", 
       color = "Continent") + 
  theme_minimal()

# Print the facet plot
print(facet_plot)

# Heatmap for life expectancy across countries and years
heatmap_plot <- function(data, countries) {
  data %>% 
    filter(country %in% countries) %>% 
    ggplot(aes(x = year, y = country, fill = lifeExp)) +
    geom_tile(color = "white") +
    labs(title = "Heatmap of Life Expectancy Over Time", 
         x = "Year", 
         y = "Country", 
         fill = "Life Expectancy") +
    theme_minimal()
}

# Call the function for specific countries
heatmap_plot(gapminder, c("India", "United States", "China", "Brazil", "South Africa"))
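
The fourth key question, on which countries improved most since 2000, can be approached by comparing the two survey years that fall in that window (2002 and 2007). The sketch below is one possible way to do this; the column names `lifeExp_change` and `gdp_change_pct` and the object name `improvement` are assumptions introduced here.

```r
library(dplyr)
library(gapminder)

# Change between 2002 and 2007, the only two gapminder years from 2000 onwards.
# All new names here are hypothetical and chosen for illustration.
improvement <- gapminder %>%
  filter(year %in% c(2002, 2007)) %>%
  mutate(GDP = pop * gdpPercap) %>%   # total GDP, as in the wrangling step
  group_by(country, continent) %>%
  summarise(
    lifeExp_change = lifeExp[year == 2007] - lifeExp[year == 2002],
    gdp_change_pct = 100 * (GDP[year == 2007] / GDP[year == 2002] - 1),
    .groups = "drop"
  ) %>%
  arrange(desc(lifeExp_change))

# Ten countries with the largest gain in life expectancy.
print(head(improvement, 10))
```

Sorting by `gdp_change_pct` instead gives the corresponding ranking for economic growth.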

Summary: The countries with the highest average life expectancy since 2000 are predominantly high-income nations. GDP per capita and life expectancy are positively related: wealthier countries generally have higher life expectancy.
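
The relationship between GDP per capita and life expectancy can be checked numerically for each continent. The sketch below uses the Pearson correlation on log-transformed GDP per capita; both that choice and the names `cor_by_continent` and `cor_log_gdp_lifeExp` are assumptions of this sketch rather than part of the original analysis.

```r
library(dplyr)
library(gapminder)

# Pearson correlation between log10(GDP per capita) and life expectancy,
# computed separately for each continent from 2000 onwards.
cor_by_continent <- gapminder %>%
  filter(year >= 2000) %>%
  group_by(continent) %>%
  summarise(
    n_obs = n(),                                            # country-year observations
    cor_log_gdp_lifeExp = cor(log10(gdpPercap), lifeExp),   # assumption: log scale for GDP
    .groups = "drop"
  )

print(cor_by_continent)
```

Oceania contributes only four observations (two countries, two years), so its coefficient should be read with caution.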

Key Findings:

The global increase in life expectancy from 1952 to 2007 is consistent with improvements in healthcare, nutrition, and education. However, disparities persist: Oceania and Europe have significantly higher life expectancy than Africa, reflecting ongoing healthcare challenges in Africa.

GDP Per Capita Reflects Wealth Inequality

Africa lags significantly behind other continents in GDP per capita. Oceania and Europe have seen consistent economic growth, while Asia is experiencing rapid growth driven by countries like China and India. Some countries (such as Norway, Switzerland, and the USA) have a GDP per capita more than 100 times that of the poorest countries.

Population Growth is Exponential

The global population increased from roughly 2.5 billion (1952) to more than 6 billion (2007), driven by population surges in India, China, and other developing nations. While population growth in developed regions such as Europe is stable, developing countries, especially in Asia and Africa, are growing much faster.

Continental Disparities Are Clear

Oceania and Europe have the highest life expectancy and GDP per capita, reflecting higher standards of living and better access to healthcare. Africa has the lowest GDP per capita, the lowest life expectancy, and the most health-related disparities. Asia shows substantial growth, driven by countries like China, Japan, and South Korea, but still includes countries like Afghanistan and Yemen with much lower health and economic indicators.
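
The continental comparisons above can be reproduced with a grouped summary of the most recent year in the dataset. This is a minimal sketch: using 2007 only, using the median for GDP per capita, and the names `continent_summary`, `mean_lifeExp`, `median_gdpPercap`, and `total_pop` are all assumptions made here.

```r
library(dplyr)
library(gapminder)

# Per-continent snapshot for 2007, the latest year in gapminder.
continent_summary <- gapminder %>%
  filter(year == 2007) %>%
  group_by(continent) %>%
  summarise(
    n_countries = n(),
    mean_lifeExp = mean(lifeExp),
    median_gdpPercap = median(gdpPercap),   # assumption: median is less sensitive to outliers
    total_pop = sum(as.numeric(pop)),       # convert first: integer sums can overflow
    .groups = "drop"
  ) %>%
  arrange(desc(mean_lifeExp))

print(continent_summary)
```

The `as.numeric()` conversion matters because `pop` is stored as an integer, and continental totals can exceed R's 32-bit integer limit.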