Introduction

Global economic performance varies widely across countries, with clear differences in output levels, living standards, and regional economic strength. The year 2023 provides a useful snapshot of these disparities, showing how countries differ in Gross Domestic Product (GDP), GDP per capita, and overall contribution to the world economy. Examining GDP patterns within a single year also makes it possible to compare regions, assess inequality in global output, and identify which economies dominate world production. This study analyzes global GDP data for 2023 obtained from Kaggle, focusing on cross-country differences, regional distributions, and the concentration of global economic activity. The main variables in the data are nominal GDP, GDP growth, population, GDP per capita, and share of world GDP.

Research Aim

To assess global GDP distribution across countries and regions in 2023.

Research Questions

  1. How do GDP and GDP per capita vary across countries in 2023?

  2. Are there significant regional or income-group differences in GDP distribution in 2023?

  3. How concentrated is global economic output in 2023, and what share is held by the top economies?

  4. What patterns characterize global GDP in 2023?

Research Objectives

  1. Examine the differences in GDP and GDP per capita across countries in 2023.

  2. Compare GDP levels across regions or income groups in 2023.

  3. Assess the concentration of global GDP in 2023 by identifying the share held by top-performing economies.

  4. Examine global GDP patterns in 2023.

Definition of terms

  1. Gross Domestic Product (GDP): GDP is the total monetary value of all final goods and services produced within a country’s borders over a specified period, usually one year. It is a broad measure of a country’s overall economic activity and output.

  2. GDP per capita: GDP per capita is GDP divided by a country’s population. It measures average economic output per person and is commonly used as an indicator of living standards and economic well-being.

  3. Share of World GDP: The share of world GDP refers to the proportion of total global GDP that is produced by a particular country. It indicates a country’s relative economic size and importance in the global economy.
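
As a quick illustration of how these three measures relate, the sketch below recomputes GDP per capita from the United States figures reported in the dataset and backs out the world total implied by its share. The variable names are chosen here for illustration and are not part of the original script.

```r
# Figures for the United States as they appear in the 2023 dataset
gdp_nominal <- 27720700000000   # nominal GDP in US dollars
population  <- 343477335        # number of people
share_pct   <- 26.11            # share of world GDP, in percent

# GDP per capita: total output divided by population
gdp_per_capita <- gdp_nominal / population
round(gdp_per_capita)

# World GDP implied by the country's GDP and its share of the total
implied_world_gdp <- gdp_nominal / (share_pct / 100)
implied_world_gdp / 1e12        # expressed in trillions of US dollars
```

The rounded per-capita figure agrees with the 80,706 listed for the United States in the data.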

Loading the dataset

library(readr)
Global_GDP <- read_csv("Global GDP Explorer 2025 (World Bank  UN Data).csv")

# Having an overview of my data
library(tidyverse)
sum(is.na(Global_GDP))
## [1] 0
glimpse(Global_GDP)
## Rows: 181
## Columns: 8
## $ ...1                  <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14…
## $ Country               <chr> "United States", "China", "Germany", "Japan", "I…
## $ `GDP (nominal, 2023)` <chr> "$27,720,700,000,000", "$17,794,800,000,000", "$…
## $ `GDP (abbrev.)`       <chr> "27.721 trillion", "17.795 trillion", "4.526 tri…
## $ `GDP Growth`          <chr> "2.89%", "5.25%", "−0.27%", "1.68%", "8.15%", "0…
## $ `Population 2023`     <dbl> 343477335, 1422584933, 84548231, 124370947, 1438…
## $ `GDP per capita`      <chr> "$80,706", "$12,509", "$53,528", "$33,806", "$2,…
## $ `Share of World GDP`  <chr> "26.11%", "16.76%", "4.26%", "3.96%", "3.36%", "…
Global_GDP <- Global_GDP %>%
  rename(
    GDP_Nominal      = 'GDP (nominal, 2023)',
    GDP              = 'GDP (abbrev.)',
    GDP_Growth       = 'GDP Growth',
    Population       = 'Population 2023',
    GDP_per_capita   = 'GDP per capita',
    Share_of_World_GDP = 'Share of World GDP'
  )
Global_GDP[,1] <- NULL

# Cleaning the data
# GDP_Nominal
Global_GDP$GDP_Nominal <- str_replace(Global_GDP$GDP_Nominal, "\\$", "")  # to remove the dollar sign
Global_GDP$GDP_Nominal <- str_replace_all(Global_GDP$GDP_Nominal, ",", "") # remove thousand comma
# The function str_replace replaces only the first occurrence of the pattern in 
# each string while str_replace_all replaces all occurrences of the pattern in each string

# GDP Growth
unique(Global_GDP$GDP_Growth[grepl("-", Global_GDP$GDP_Growth)])
## character(0)
unique(Global_GDP$GDP_Growth[grepl("−", Global_GDP$GDP_Growth)])  # Unicode minus
##  [1] "−0.27%"  "−0.76%"  "−1.61%"  "−0.31%"  "−5.53%"  "−0.95%"  "−0.09%" 
##  [8] "−0.04%"  "−1.16%"  "−0.55%"  "−2.94%"  "−0.91%"  "−3.64%"  "−20.11%"
## [15] "−1.1%"   "−2.3%"   "−3.02%"  "−1.86%"  "−5.41%"  "−5.09%"  "−18.12%"
## [22] "−3.93%"
unique(Global_GDP$GDP_Growth[grepl("\\(", Global_GDP$GDP_Growth)]) # Parentheses negatives
## character(0)
# Standardize the negatives
# Case A: Unicode minus. Some datasets use a Unicode minus (−) instead of the
# standard ASCII hyphen-minus (-); as.numeric() does not parse − as a negative sign.
Global_GDP$GDP_Growth <- str_replace_all(Global_GDP$GDP_Growth, "−", "-")

# Case B: Parentheses
Global_GDP$GDP_Growth <- str_replace_all(Global_GDP$GDP_Growth, "\\(([^)]+)\\)", "-\\1")

# Case C: Spaces before minus
Global_GDP$GDP_Growth <- str_trim(Global_GDP$GDP_Growth)


Global_GDP$GDP_Growth <- str_replace_all(Global_GDP$GDP_Growth, "[^0-9.-]", "")
Global_GDP$GDP_Growth <- as.numeric(Global_GDP$GDP_Growth)

# GDP per capita
Global_GDP$GDP_per_capita <- str_replace(Global_GDP$GDP_per_capita, "\\$", "")  # to remove the dollar sign
# str_replace_all, so that values with more than one thousands separator are also handled
Global_GDP$GDP_per_capita <- str_replace_all(Global_GDP$GDP_per_capita, ",", "") # remove thousand comma

# Share of the World
Global_GDP$Share_of_World_GDP <- str_replace(Global_GDP$Share_of_World_GDP, "\\%", "")

# Converting the variables to the right data type
Global_GDP <- Global_GDP %>%
  mutate(
    GDP_Nominal = as.numeric(GDP_Nominal),
    GDP_per_capita = as.numeric(GDP_per_capita),
    Share_of_World_GDP = as.numeric(Share_of_World_GDP)
  )
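
The cleaning steps above repeat the same pattern for each column: normalise the minus sign, strip symbols, convert to numeric. A minimal sketch of how they could be folded into one helper follows; `parse_money_pct` is a hypothetical name introduced here, written in base R, and is not part of the original script.

```r
# Hypothetical helper (not in the original script): turn strings such as
# "$27,720,700,000,000" or "−0.27%" into plain numbers.
parse_money_pct <- function(x) {
  x <- trimws(x)                              # drop surrounding whitespace
  x <- gsub("\u2212", "-", x, fixed = TRUE)   # Unicode minus -> ASCII hyphen-minus
  x <- gsub("^\\((.*)\\)$", "-\\1", x)        # accounting style "(1.5%)" -> "-1.5%"
  x <- gsub("[^0-9.-]", "", x)                # remove "$", ",", "%" and other symbols
  as.numeric(x)
}

# Example inputs in the formats seen in the raw data, plus one parenthesised value
parsed <- parse_money_pct(c("$27,720,700,000,000", "\u22120.27%", "$80,706", "(1.5%)", "26.11%"))
parsed
```

Keeping the rules in one function means every column is cleaned the same way, which avoids the situation where one column uses `str_replace` and another `str_replace_all`.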

library(skimr)
skim(Global_GDP)
Data summary
Name Global_GDP
Number of rows 181
Number of columns 7
_______________________
Column type frequency:
character 2
numeric 5
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Country 0 1 4 24 0 181 0
GDP 0 1 11 15 0 180 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
GDP_Nominal 0 1 5.777519e+11 2.516689e+12 62280312.00 1.464452e+10 4.852960e+10 3.355330e+11 2.77207e+13 ▇▁▁▁▁
GDP_Growth 0 1 3.280000e+00 6.860000e+00 -20.11 1.190000e+00 2.940000e+00 5.040000e+00 7.50600e+01 ▁▇▁▁▁
Population 0 1 4.358453e+07 1.555468e+08 9816.00 2.311472e+06 9.130429e+06 3.363516e+07 1.43807e+09 ▇▁▁▁▁
GDP_per_capita 0 1 1.771129e+04 2.330149e+04 193.00 2.478000e+03 7.182000e+03 2.279800e+04 1.28936e+05 ▇▁▁▁▁
Share_of_World_GDP 0 1 5.400000e-01 2.370000e+00 0.00 1.000000e-02 5.000000e-02 3.200000e-01 2.61100e+01 ▇▁▁▁▁
glimpse(Global_GDP)
## Rows: 181
## Columns: 7
## $ Country            <chr> "United States", "China", "Germany", "Japan", "Indi…
## $ GDP_Nominal        <dbl> 2.77207e+13, 1.77948e+13, 4.52570e+12, 4.20449e+12,…
## $ GDP                <chr> "27.721 trillion", "17.795 trillion", "4.526 trilli…
## $ GDP_Growth         <dbl> 2.89, 5.25, -0.27, 1.68, 8.15, 0.34, 0.94, 0.70, 2.…
## $ Population         <dbl> 343477335, 1422584933, 84548231, 124370947, 1438069…
## $ GDP_per_capita     <dbl> 80706, 12509, 53528, 33806, 2481, 49224, 45934, 386…
## $ Share_of_World_GDP <dbl> 26.11, 16.76, 4.26, 3.96, 3.36, 3.18, 2.87, 2.17, 2…
# Importing country metadata so that I can add the IncomeGroup and Region columns
library(readr)
CountryData <- read_csv("Country.txt")

Global_GDP <- left_join(Global_GDP, CountryData, by = "Country")

fix_regions <- data.frame(
  Country = c(
    "Hong Kong",
    "Czech Republic (Czechia)",
    "Côte d'Ivoire",
    "DR Congo",
    "Macao",
    "State of Palestine",
    "Brunei",
    "Aruba",
    "Saint Lucia",
    "St. Vincent & Grenadines",
    "Saint Kitts & Nevis",
    "Sao Tome & Principe",
    "Palau",
    "Marshall Islands",
    "Tuvalu"
  ),
  Region = c(
    "East Asia & Pacific",
    "Europe & Central Asia",
    "Sub-Saharan Africa",
    "Sub-Saharan Africa",
    "East Asia & Pacific",
    "Middle East & North Africa",
    "East Asia & Pacific",
    "Latin America & Caribbean",
    "Latin America & Caribbean",
    "Latin America & Caribbean",
    "Latin America & Caribbean",
    "Sub-Saharan Africa",
    "East Asia & Pacific",
    "East Asia & Pacific",
    "East Asia & Pacific"
  ),
  IncomeGroup = c(
    "High income",
    "High income",
    "Lower-middle income",
    "Low income",
    "High income",
    "Lower-middle income",
    "High income",
    "High income",
    "Upper-middle income",
    "Upper-middle income",
    "High income",
    "Lower-middle income",
    "Upper-middle income",
    "Upper-middle income",
    "Upper-middle income"
  ),
  stringsAsFactors = FALSE
)

Global_GDP <- Global_GDP %>%
  left_join(fix_regions, by = "Country")

Global_GDP <- Global_GDP %>% 
  mutate(
    Region = coalesce(Region.x, Region.y),
    IncomeGroup = coalesce(IncomeGroup.x, IncomeGroup.y)
  ) %>%
  select(-Region.x, -Region.y, -IncomeGroup.x, -IncomeGroup.y)
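
After a manual patch like this, it can be worth checking whether any country is still without a region. The sketch below shows the same fill-the-gaps logic on toy data in base R; the data frames, the placeholder "Country X", and the variable names are illustrative assumptions, not values from the dataset.

```r
# Toy stand-ins for the joined data and the manual patch table (illustrative values)
gdp <- data.frame(
  Country = c("Germany", "Hong Kong", "Tuvalu", "Country X"),
  Region  = c("Europe & Central Asia", NA, NA, NA),
  stringsAsFactors = FALSE
)
patch <- data.frame(
  Country = c("Hong Kong", "Tuvalu"),
  Region  = c("East Asia & Pacific", "East Asia & Pacific"),
  stringsAsFactors = FALSE
)

# Fill gaps only: keep an existing Region, otherwise take the value from the patch
idx <- match(gdp$Country, patch$Country)
gdp$Region <- ifelse(is.na(gdp$Region), patch$Region[idx], gdp$Region)

# Countries that remain unmatched after patching
still_missing <- gdp$Country[is.na(gdp$Region)]
still_missing
```

An empty result from a check of this kind would indicate that every country received a region.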

Visualizing the data

The histogram has three panels representing GDP Growth, Nominal GDP and GDP per capita respectively.

GDP Growth: Each bar shows how many countries had a particular growth rate. Bars to the right mean faster growth. Bars to the left mean economic slowdown or recession.

Nominal GDP: Most countries cluster on the left, meaning their economies are relatively small. A few countries appear far to the right; these are the global economic giants.

GDP per capita: Countries on the right generally have higher living standards. Countries on the left have lower average incomes.

Global_GDP_long <- Global_GDP %>%
  pivot_longer(cols = c(GDP_Nominal, GDP_per_capita, GDP_Growth),
                names_to = "Variable",
                values_to = "Value")

# Histogram faceted by variable and colored by Region

ggplot(Global_GDP_long, aes(x = Value, fill = Region)) +
  geom_histogram(color = "black", alpha = 0.6) +
  facet_wrap(~Variable, scales = "free") +
  labs(title = "Histograms of GDP Variables by Region") +
  theme_minimal()

Objective 1

Examine the differences in GDP and GDP per capita across countries in 2023.

summary_stats <- Global_GDP %>% 
  summarise(
    n = n(), 
    mean_GDP = mean(GDP_Nominal, na.rm = TRUE),
    median_GDP = median(GDP_Nominal, na.rm = TRUE),
    sd_GDP = sd(GDP_Nominal, na.rm = TRUE),
    min_GDP = min(GDP_Nominal, na.rm = TRUE),
    max_GDP = max(GDP_Nominal, na.rm = TRUE),
    mean_GDPpc = mean(GDP_per_capita, na.rm = TRUE),
    median_GDPpc = median(GDP_per_capita, na.rm = TRUE),
    sd_GDPpc = sd(GDP_per_capita, na.rm = TRUE),
    min_GDPpc = min(GDP_per_capita, na.rm = TRUE),
    max_GDPpc = max(GDP_per_capita, na.rm = TRUE)
    )
library(knitr)
kable(summary_stats)
n mean_GDP median_GDP sd_GDP min_GDP max_GDP mean_GDPpc median_GDPpc sd_GDPpc min_GDPpc max_GDPpc
181 577751901977 48529595417 2.516689e+12 62280312 2.77207e+13 17711.29 7182 23301.49 193 128936
GDP_Nominal
top10_GDP <- Global_GDP %>% arrange(desc(GDP_Nominal)) %>% slice(1:10)
top10_GDP
## # A tibble: 10 × 9
##    Country        GDP_Nominal GDP           GDP_Growth Population GDP_per_capita
##    <chr>                <dbl> <chr>              <dbl>      <dbl>          <dbl>
##  1 United States      2.77e13 27.721 trill…       2.89  343477335          80706
##  2 China              1.78e13 17.795 trill…       5.25 1422584933          12509
##  3 Germany            4.53e12 4.526 trilli…      -0.27   84548231          53528
##  4 Japan              4.20e12 4.204 trilli…       1.68  124370947          33806
##  5 India              3.57e12 3.568 trilli…       8.15 1438069596           2481
##  6 United Kingdom     3.38e12 3.381 trilli…       0.34   68682962          49224
##  7 France             3.05e12 3.052 trilli…       0.94   66438822          45934
##  8 Italy              2.30e12 2.301 trilli…       0.7    59499453          38672
##  9 Brazil             2.17e12 2.174 trilli…       2.91  211140729          10295
## 10 Canada             2.14e12 2.142 trilli…       1.25   39299105          54517
## # ℹ 3 more variables: Share_of_World_GDP <dbl>, Region <chr>, IncomeGroup <chr>
bottom10_GDP <- Global_GDP %>% arrange(GDP_Nominal) %>% slice(1:10)
bottom10_GDP
## # A tibble: 10 × 9
##    Country                GDP_Nominal GDP   GDP_Growth Population GDP_per_capita
##    <chr>                        <dbl> <chr>      <dbl>      <dbl>          <dbl>
##  1 Tuvalu                    62280312 62.2…       3.85       9816           6345
##  2 Marshall Islands         259300000 259.…      -3.93      38827           6678
##  3 Kiribati                 279208903 279.…       4.12     132530           2107
##  4 Palau                    281849063 281.…       1.88      17727          15899
##  5 Micronesia               460000000 460 …       0.78     112630           4084
##  6 Dominica                 653992593 653.…       4.71      66510           9833
##  7 Sao Tome & Principe      678976265 678.…       0.37     230871           2941
##  8 Samoa                    938189444 938.…       8.58     216663           4330
##  9 Saint Kitts & Nevis     1055499778 1.05…       2.28      46758          22574
## 10 St. Vincent & Grenadi…  1065962963 1.06…       6.02     101323          10520
## # ℹ 3 more variables: Share_of_World_GDP <dbl>, Region <chr>, IncomeGroup <chr>
GDP per capita
top10_GDPpc <- Global_GDP %>% arrange(desc(GDP_per_capita)) %>% slice(1:10)
top10_GDPpc
## # A tibble: 10 × 9
##    Country       GDP_Nominal GDP            GDP_Growth Population GDP_per_capita
##    <chr>               <dbl> <chr>               <dbl>      <dbl>          <dbl>
##  1 Luxembourg        8.58e10 85.755 billion      -1.1      665098         128936
##  2 Ireland           5.51e11 551.395 billi…      -5.53    5196630         106106
##  3 Switzerland       8.85e11 884.94 billion       0.72    8870561          99761
##  4 Norway            4.85e11 485.311 billi…       0.48    5519167          87932
##  5 Singapore         5.01e11 501.428 billi…       1.07    5789090          86616
##  6 Iceland           3.13e10 31.325 billion       5.04     387558          80827
##  7 United States     2.77e13 27.721 trilli…       2.89  343477335          80706
##  8 Qatar             2.13e11 213.003 billi…       1.19    2979082          71500
##  9 Denmark           4.07e11 407.092 billi…       2.5     5948136          68440
## 10 Australia         1.73e12 1.728 trillion       3.44   26451124          65330
## # ℹ 3 more variables: Share_of_World_GDP <dbl>, Region <chr>, IncomeGroup <chr>

These are the ten countries with the highest GDP per capita, that is, the highest average income. Apart from the United States, none of them is among the ten largest economies by nominal GDP, so a high standard of living does not require a large share of world GDP (as of 2023).

bottom10_GDPpc <- Global_GDP %>% arrange(GDP_per_capita) %>% slice(1:10)
bottom10_GDPpc
## # A tibble: 10 × 9
##    Country                GDP_Nominal GDP   GDP_Growth Population GDP_per_capita
##    <chr>                        <dbl> <chr>      <dbl>      <dbl>          <dbl>
##  1 Burundi                 2642161669 2.64…       2.7    13689450            193
##  2 Afghanistan            17233051620 17.2…       2.71   41454761            416
##  3 Central African Repub…  2555492085 2.55…       0.87    5152421            496
##  4 Madagascar             15790113247 15.7…       3.8    31195932            506
##  5 Malawi                 12712150082 12.7…       1.89   21104482            602
##  6 Mozambique             20954220984 20.9…       5.44   33635160            623
##  7 DR Congo               66383287003 66.3…       8.56  105789731            628
##  8 Niger                  16819170421 16.8…       2.5    26159867            643
##  9 Chad                   13149325362 13.1…       4.12   19319064            681
## 10 Sierra Leone            6411869546 6.41…       5.71    8460512            758
## # ℹ 3 more variables: Share_of_World_GDP <dbl>, Region <chr>, IncomeGroup <chr>

Still under GDP per capita, these ten countries are at the bottom of the list of income per head in the world.

Population
top10_pop <- Global_GDP %>% arrange(desc(Population)) %>% slice(1:10)
top10_pop
## # A tibble: 10 × 9
##    Country       GDP_Nominal GDP            GDP_Growth Population GDP_per_capita
##    <chr>               <dbl> <chr>               <dbl>      <dbl>          <dbl>
##  1 India             3.57e12 3.568 trillion       8.15 1438069596           2481
##  2 China             1.78e13 17.795 trilli…       5.25 1422584933          12509
##  3 United States     2.77e13 27.721 trilli…       2.89  343477335          80706
##  4 Indonesia         1.37e12 1.371 trillion       5.05  281190067           4876
##  5 Pakistan          3.38e11 337.912 billi…      -0.04  247504495           1365
##  6 Nigeria           3.64e11 363.846 billi…       2.86  227882945           1597
##  7 Brazil            2.17e12 2.174 trillion       2.91  211140729          10295
##  8 Bangladesh        4.37e11 437.415 billi…       5.78  171466990           2551
##  9 Russia            2.02e12 2.021 trillion       3.6   145440500          13899
## 10 Mexico            1.79e12 1.789 trillion       3.2   129739759          13790
## # ℹ 3 more variables: Share_of_World_GDP <dbl>, Region <chr>, IncomeGroup <chr>
bottom10_pop <- Global_GDP %>% arrange(Population) %>% slice(1:10)
bottom10_pop
## # A tibble: 10 × 9
##    Country                GDP_Nominal GDP   GDP_Growth Population GDP_per_capita
##    <chr>                        <dbl> <chr>      <dbl>      <dbl>          <dbl>
##  1 Tuvalu                    62280312 62.2…       3.85       9816           6345
##  2 Palau                    281849063 281.…       1.88      17727          15899
##  3 Marshall Islands         259300000 259.…      -3.93      38827           6678
##  4 Saint Kitts & Nevis     1055499778 1.05…       2.28      46758          22574
##  5 Dominica                 653992593 653.…       4.71      66510           9833
##  6 Andorra                 3785067332 3.78…       2.58      80856          46812
##  7 Antigua and Barbuda     2033085185 2.03…       3.86      93316          21787
##  8 St. Vincent & Grenadi…  1065962963 1.06…       6.02     101323          10520
##  9 Aruba                   3648573136 3.64…       4.26     107939          33802
## 10 Micronesia               460000000 460 …       0.78     112630           4084
## # ℹ 3 more variables: Share_of_World_GDP <dbl>, Region <chr>, IncomeGroup <chr>
Global_GDP$IncomeGroup <- factor(
  Global_GDP$IncomeGroup,
  levels = c(
    "Low income",
    "Lower middle income",
    "Upper middle income",
    "High income"
  ),
  ordered = TRUE
)

Income_Summary <- Global_GDP %>%
  group_by(IncomeGroup) %>%
  summarise(
    Countries = n(),
    Mean_GDP_pc = mean(GDP_per_capita, na.rm = TRUE),
    Median_GDP_pc = median(GDP_per_capita, na.rm = TRUE),
    Mean_Total_GDP = mean(GDP_Nominal, na.rm = TRUE)
  )

Income_Summary
## # A tibble: 5 × 5
##   IncomeGroup         Countries Mean_GDP_pc Median_GDP_pc Mean_Total_GDP
##   <ord>                   <int>       <dbl>         <dbl>          <dbl>
## 1 Low income                 23        922.          869         2.66e10
## 2 Lower middle income        45       2771.         2478         1.97e11
## 3 Upper middle income        48       9340.         8133         6.29e11
## 4 High income                57      44734.        33832         1.14e12
## 5 <NA>                        8       7711.         6512.        1.26e10
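
The `<NA>` row above does not appear to be missing data. Eight of the manually patched countries carry hyphenated labels ("Lower-middle income", "Upper-middle income"), while the factor levels are spelled without hyphens, and `factor()` converts any label outside its levels to `NA`. The sketch below reproduces the effect on toy labels and shows one possible fix; the object names are illustrative.

```r
# Labels as they might arrive after the join: one is spelled with a hyphen
labels <- c("Low income", "Lower-middle income", "Upper middle income", "High income")
lv     <- c("Low income", "Lower middle income", "Upper middle income", "High income")

# Any label that is not among the levels silently becomes NA
f_raw <- factor(labels, levels = lv, ordered = TRUE)
is.na(f_raw)

# One possible fix: harmonise the spelling before converting to a factor
labels_fixed <- gsub("-middle", " middle", labels, fixed = TRUE)
f_fixed <- factor(labels_fixed, levels = lv, ordered = TRUE)
anyNA(f_fixed)
```

Applying such a harmonisation before the `factor()` call would move those eight countries into their lower-middle and upper-middle groups, which would change the group counts and means reported above.
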
Top10_by_GDPNominal <- Global_GDP %>%
  group_by(IncomeGroup) %>%
  arrange(desc(GDP_Nominal), .by_group = TRUE) %>%
  slice_head(n = 10) %>%
  ungroup() %>%
  select(IncomeGroup, Country)

Top10_by_GDPNominal
## # A tibble: 48 × 2
##    IncomeGroup Country     
##    <ord>       <chr>       
##  1 Low income  Ethiopia    
##  2 Low income  Sudan       
##  3 Low income  DR Congo    
##  4 Low income  Uganda      
##  5 Low income  Guinea      
##  6 Low income  Mozambique  
##  7 Low income  Mali        
##  8 Low income  Burkina Faso
##  9 Low income  Haiti       
## 10 Low income  Afghanistan 
## # ℹ 38 more rows
Top10_by_GDPpc <- Global_GDP %>%
  group_by(IncomeGroup) %>%
  arrange(desc(GDP_per_capita), .by_group = TRUE) %>%
  slice_head(n = 10) %>%
  ungroup() %>%
  select(IncomeGroup, Country)

Top10_by_GDPpc
## # A tibble: 48 × 2
##    IncomeGroup Country      
##    <ord>       <chr>        
##  1 Low income  Sudan        
##  2 Low income  Haiti        
##  3 Low income  Comoros      
##  4 Low income  Guinea       
##  5 Low income  Ethiopia     
##  6 Low income  Rwanda       
##  7 Low income  Uganda       
##  8 Low income  Togo         
##  9 Low income  Guinea-Bissau
## 10 Low income  Gambia       
## # ℹ 38 more rows

Objective 2

Compare GDP levels across regions or income groups in 2023.

North America exhibits extreme economic dominance. With only two countries (n = 2), so that each median equals the corresponding mean, it has the highest median GDP ($14.9 trillion) and the highest mean GDP per capita ($67,611.5). The mean share of 14.06% per country means that these two nations together account for about 28% of the global total, far more per country than in any other region.

Europe & Central Asia: This region has the highest count of countries (50). Its median GDP per capita ($23,929) is the second highest, and the gap between the median and the mean ($33,065) suggests significant internal inequality, likely driven by wealthy Western European nations pulling the average above the level of lower-income Central Asian and Eastern European countries.

East Asia & Pacific: This region shows a large gap between its median GDP per capita ($5,922) and its mean ($18,169), a ratio of roughly three to one. This suggests that a few very high-income economies (such as Singapore, Australia, and Japan) skew the average, while many other nations remain at much lower income levels.

Middle East & North Africa: This region has a high median GDP ($188 billion), yet its median GDP per capita ($5,869) is relatively low, and its mean GDP per capita ($20,790) is about 3.5 times the median, the widest such gap of any region. This often indicates high-output economies (like oil-producing nations) where wealth is concentrated, alongside countries whose population is large relative to total output.

Sub-Saharan Africa: This region shows the lowest economic indicators across the board, with a median GDP per capita of just $1,541 and a median global share of 0.016%.

South Asia: Despite having a higher median GDP ($84 billion) than Sub-Saharan Africa ($17 billion), it has the second-lowest mean GDP per capita ($3,484), reflecting large populations that dilute per-capita output.

# group summaries
group_summary <- Global_GDP %>%
  group_by(Region) %>%
  summarise(
    n = n(),
    median_GDP = median(GDP_Nominal, na.rm=TRUE),
    median_GDPpc = median(GDP_per_capita, na.rm=TRUE),
    mean_GDPpc = mean(GDP_per_capita, na.rm=TRUE),
    mean_shares = mean(Share_of_World_GDP, na.rm=TRUE),
    median_shares = median(Share_of_World_GDP, na.rm=TRUE)
  ) %>% arrange(desc(median_GDPpc))
kable(group_summary)
Region n median_GDP median_GDPpc mean_GDPpc mean_shares median_shares
North America 2 1.493158e+13 67611.5 67611.500 14.0650000 14.065
Europe & Central Asia 50 1.176580e+11 23929.0 33065.600 0.5181940 0.113
Latin America & Caribbean 32 3.069395e+10 10619.0 13012.656 0.1903409 0.029
East Asia & Pacific 29 4.233565e+10 5922.0 18169.724 0.9736151 0.040
Middle East & North Africa 16 1.883540e+11 5869.0 20790.000 0.2484375 0.175
South Asia 7 8.435686e+10 2481.0 3484.714 0.6043143 0.079
Sub-Saharan Africa 45 1.681917e+10 1541.0 2597.356 0.0421609 0.016

GDP per capita varies by several orders of magnitude across countries. Using a logarithmic scale allows meaningful comparison of distributions across regions by compressing extreme values while preserving relative differences.
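
To make the effect of the logarithmic scale concrete, the sketch below takes four GDP per capita values from the dataset and compares their spread on the raw and log10 scales. The selection of countries and the variable names are illustrative choices.

```r
# GDP per capita (US dollars) for four countries, taken from the dataset
gdp_pc <- c(Burundi = 193, India = 2481, China = 12509, Luxembourg = 128936)

# On the raw scale the largest value is several hundred times the smallest
ratio_raw <- max(gdp_pc) / min(gdp_pc)
ratio_raw

# On the log10 scale each unit corresponds to a tenfold difference
log_pc <- log10(gdp_pc)
log_pc

# The full range now spans fewer than three units
log_span <- diff(range(log_pc))
log_span
```

Equal distances on the log axis correspond to equal ratios, so a country ten times richer than another sits one unit higher regardless of the absolute income level.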

Interpretation of the plot

North America: The region has very high GDP per capita and an extremely narrow distribution, which reflects the fact that it contains only two countries. High incomes are thus concentrated in very few countries, which widens the gap with other regions.

Europe & Central Asia: This region is characterized by a wide spread across middle- to high-income levels, with the second-highest median of any region. The long right tail reflects very rich Western European economies. This shows strong heterogeneity within the region.

Latin America & Caribbean: The region consists mostly of upper-middle income nations. It has a relatively compact distribution, with few extremely poor or extremely rich countries. This indicates moderate dispersion.

East Asia & Pacific: This region shows a very wide distribution. It includes low-income, middle-income, and very high-income countries, which explains the long right tail.

Middle East & North Africa: This region has a median higher than Sub-Saharan Africa and South Asia. It has a very long right tail due to oil-rich economies. This indicates high inequality within the region.

South Asia: It shows a very low median GDP per capita, with tight clustering at low income levels and few high-income outliers.

Sub-Saharan Africa: This region has the lowest median GDP per capita. It has a strong right skew with a large number of low-income countries.

# violin plot with an overlaid boxplot of GDP per capita by region (log scale)
ggplot(Global_GDP, aes(x = reorder(Region, GDP_per_capita, FUN = median), y = GDP_per_capita)) +
  geom_violin(alpha = 0.6) +
  geom_boxplot(width = 0.1) +
  scale_y_log10() +
  coord_flip() + 
  labs(title = "GDP per capita by Region (2023)", x = "Region", y = "GDP per capita (log scale)")

Objective 3

Assess the concentration of global GDP in 2023 by identifying the share held by top-performing economies.

In 2023, global economic output was highly concentrated among a small number of countries. The largest economy, the United States, accounted for approximately 26.11% of world GDP, while the top five and top ten economies jointly contributed about 54.45% and 66.74%, respectively. This reflects a significant imbalance in global production capacity. The top ten countries are the United States, China, Germany, Japan, India, the United Kingdom, France, Italy, Brazil, and Canada.

# Sort by nominal GDP first so the shares do not depend on the file's row order
ranked_GDP <- Global_GDP %>% arrange(desc(GDP_Nominal))
top1_share <- ranked_GDP %>% slice(1) %>% pull(Share_of_World_GDP)
top1_share
## [1] 26.11
top5_share <- ranked_GDP %>% slice(1:5) %>% summarise(sum = sum(Share_of_World_GDP)) %>% pull(sum)
top5_share
## [1] 54.45
top10_share <- ranked_GDP %>% slice(1:10) %>% summarise(sum = sum(Share_of_World_GDP)) %>% pull(sum)
top10_share
## [1] 66.74
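
The top-k shares are running totals of the individual shares once countries are ranked by size. A minimal base R sketch using the five largest shares reported in the data is given below; the vector name is illustrative.

```r
# Share of world GDP (percent) for the five largest economies in the dataset
shares <- c("United States" = 26.11, "China" = 16.76, "Germany" = 4.26,
            "Japan" = 3.96, "India" = 3.36)

# Rank from largest to smallest, then accumulate
shares <- sort(shares, decreasing = TRUE)
cum_share <- cumsum(shares)
cum_share
```

The last element of `cum_share` is the combined share of the top five, and each earlier element gives the total for the top one, two, three, and four economies.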

A Gini coefficient of 0.865 indicates extreme concentration, where a very small number of countries account for the majority of global output, while most countries contribute only marginally. This confirms that global economic production is highly concentrated among a few large economies, with substantial disparities between high-output and low-output countries.
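
For readers unfamiliar with the measure, the sketch below computes a Gini coefficient from one common rank-based definition, using base R and toy values. The function name `gini_sketch` is hypothetical, and this is not necessarily the exact routine that produced the 0.865 reported above.

```r
# Rank-based Gini coefficient (no small-sample correction); a sketch only
gini_sketch <- function(x) {
  x <- sort(x[!is.na(x)])   # order values from smallest to largest
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Perfect equality: every unit produces the same output
g_equal <- gini_sketch(c(5, 5, 5, 5))
g_equal

# Extreme concentration: one unit produces everything
g_concentrated <- gini_sketch(c(0, 0, 0, 10))
g_concentrated
```

With this definition the coefficient is 0 under perfect equality and approaches 1 as output concentrates in a single unit, so a value near 0.865 sits close to the concentrated end of the range.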

library(ineq) # for Lorenz curve and Gini
# top shares of world GDP %
# Lorenz + Gini (use GDP values)
Lorenz_Curve <- Lc(Global_GDP$GDP_Nominal)
Lorenz_Curve
## $p
##   [1] 0.000000000 0.005524862 0.011049724 0.016574586 0.022099448 0.027624309
##   [7] 0.033149171 0.038674033 0.044198895 0.049723757 0.055248619 0.060773481
##  [13] 0.066298343 0.071823204 0.077348066 0.082872928 0.088397790 0.093922652
##  [19] 0.099447514 0.104972376 0.110497238 0.116022099 0.121546961 0.127071823
##  [25] 0.132596685 0.138121547 0.143646409 0.149171271 0.154696133 0.160220994
##  [31] 0.165745856 0.171270718 0.176795580 0.182320442 0.187845304 0.193370166
##  [37] 0.198895028 0.204419890 0.209944751 0.215469613 0.220994475 0.226519337
##  [43] 0.232044199 0.237569061 0.243093923 0.248618785 0.254143646 0.259668508
##  [49] 0.265193370 0.270718232 0.276243094 0.281767956 0.287292818 0.292817680
##  [55] 0.298342541 0.303867403 0.309392265 0.314917127 0.320441989 0.325966851
##  [61] 0.331491713 0.337016575 0.342541436 0.348066298 0.353591160 0.359116022
##  [67] 0.364640884 0.370165746 0.375690608 0.381215470 0.386740331 0.392265193
##  [73] 0.397790055 0.403314917 0.408839779 0.414364641 0.419889503 0.425414365
##  [79] 0.430939227 0.436464088 0.441988950 0.447513812 0.453038674 0.458563536
##  [85] 0.464088398 0.469613260 0.475138122 0.480662983 0.486187845 0.491712707
##  [91] 0.497237569 0.502762431 0.508287293 0.513812155 0.519337017 0.524861878
##  [97] 0.530386740 0.535911602 0.541436464 0.546961326 0.552486188 0.558011050
## [103] 0.563535912 0.569060773 0.574585635 0.580110497 0.585635359 0.591160221
## [109] 0.596685083 0.602209945 0.607734807 0.613259669 0.618784530 0.624309392
## [115] 0.629834254 0.635359116 0.640883978 0.646408840 0.651933702 0.657458564
## [121] 0.662983425 0.668508287 0.674033149 0.679558011 0.685082873 0.690607735
## [127] 0.696132597 0.701657459 0.707182320 0.712707182 0.718232044 0.723756906
## [133] 0.729281768 0.734806630 0.740331492 0.745856354 0.751381215 0.756906077
## [139] 0.762430939 0.767955801 0.773480663 0.779005525 0.784530387 0.790055249
## [145] 0.795580110 0.801104972 0.806629834 0.812154696 0.817679558 0.823204420
## [151] 0.828729282 0.834254144 0.839779006 0.845303867 0.850828729 0.856353591
## [157] 0.861878453 0.867403315 0.872928177 0.878453039 0.883977901 0.889502762
## [163] 0.895027624 0.900552486 0.906077348 0.911602210 0.917127072 0.922651934
## [169] 0.928176796 0.933701657 0.939226519 0.944751381 0.950276243 0.955801105
## [175] 0.961325967 0.966850829 0.972375691 0.977900552 0.983425414 0.988950276
## [181] 0.994475138 1.000000000
## 
## $L
##   [1] 0.000000e+00 5.955673e-07 3.075173e-06 5.745161e-06 8.440396e-06
##   [6] 1.283923e-05 1.909316e-05 2.558600e-05 3.455761e-05 4.465103e-05
##  [11] 5.484450e-05 6.561509e-05 7.820660e-05 9.113900e-05 1.067579e-04
##  [16] 1.261997e-04 1.457874e-04 1.656770e-04 1.859304e-04 2.064084e-04
##  [21] 2.293217e-04 2.525605e-04 2.767906e-04 3.012280e-04 3.264941e-04
##  [26] 3.558215e-04 3.888620e-04 4.237521e-04 4.599475e-04 5.004934e-04
##  [31] 5.429792e-04 5.950198e-04 6.563345e-04 7.193612e-04 7.836295e-04
##  [36] 8.556422e-04 9.433441e-04 1.045203e-03 1.160535e-03 1.278515e-03
##  [41] 1.396624e-03 1.518186e-03 1.643929e-03 1.777689e-03 1.912501e-03
##  [46] 2.049616e-03 2.189657e-03 2.334324e-03 2.480835e-03 2.631577e-03
##  [51] 2.782573e-03 2.934076e-03 3.092238e-03 3.253074e-03 3.417165e-03
##  [56] 3.581960e-03 3.748549e-03 3.919045e-03 4.104450e-03 4.289929e-03
##  [61] 4.475668e-03 4.663824e-03 4.853651e-03 5.048009e-03 5.242372e-03
##  [66] 5.439955e-03 5.640333e-03 5.852619e-03 6.066141e-03 6.291315e-03
##  [71] 6.521640e-03 6.783393e-03 7.046508e-03 7.310227e-03 7.604082e-03
##  [76] 7.898400e-03 8.193394e-03 8.492946e-03 8.816996e-03 9.142277e-03
##  [81] 9.471238e-03 9.808145e-03 1.019934e-02 1.059419e-02 1.099819e-02
##  [86] 1.140304e-02 1.181381e-02 1.224506e-02 1.267667e-02 1.311467e-02
##  [91] 1.355532e-02 1.401940e-02 1.448576e-02 1.495700e-02 1.544439e-02
##  [96] 1.602416e-02 1.665896e-02 1.729735e-02 1.795859e-02 1.864574e-02
## [101] 1.933766e-02 2.006797e-02 2.080660e-02 2.156086e-02 2.231691e-02
## [106] 2.307992e-02 2.385777e-02 2.465452e-02 2.546119e-02 2.626823e-02
## [111] 2.707938e-02 2.789943e-02 2.872658e-02 2.969807e-02 3.067737e-02
## [116] 3.167619e-02 3.270933e-02 3.374986e-02 3.479474e-02 3.593121e-02
## [121] 3.709255e-02 3.836350e-02 3.974452e-02 4.130991e-02 4.287537e-02
## [126] 4.458477e-02 4.661578e-02 4.865266e-02 5.098116e-02 5.334913e-02
## [131] 5.574786e-02 5.815934e-02 6.067090e-02 6.322991e-02 6.599462e-02
## [136] 6.882070e-02 7.202930e-02 7.526064e-02 7.854263e-02 8.189700e-02
## [141] 8.537298e-02 8.885232e-02 9.249283e-02 9.613442e-02 9.992126e-02
## [146] 1.037435e-01 1.076128e-01 1.115057e-01 1.156150e-01 1.197953e-01
## [151] 1.239781e-01 1.286190e-01 1.334140e-01 1.383071e-01 1.432186e-01
## [156] 1.481351e-01 1.530595e-01 1.583324e-01 1.639262e-01 1.700920e-01
## [161] 1.762702e-01 1.840084e-01 1.924708e-01 2.026797e-01 2.133732e-01
## [166] 2.244120e-01 2.375240e-01 2.530165e-01 2.693953e-01 2.859202e-01
## [171] 3.030289e-01 3.223592e-01 3.428469e-01 3.636331e-01 3.856362e-01
## [176] 4.148199e-01 4.471500e-01 4.812653e-01 5.214716e-01 5.647494e-01
## [181] 7.349156e-01 1.000000e+00
## 
## $L.general
##   [1] 0.000000e+00 3.440901e+05 1.776687e+06 3.319277e+06 4.876455e+06
##   [6] 7.417891e+06 1.103111e+07 1.478236e+07 1.996573e+07 2.579722e+07
##  [11] 3.168652e+07 3.790924e+07 4.518401e+07 5.265573e+07 6.167959e+07
##  [16] 7.291211e+07 8.422895e+07 9.572020e+07 1.074217e+08 1.192529e+08
##  [21] 1.324911e+08 1.459173e+08 1.599163e+08 1.740350e+08 1.886326e+08
##  [26] 2.055765e+08 2.246657e+08 2.448236e+08 2.657356e+08 2.891610e+08
##  [31] 3.137073e+08 3.437738e+08 3.791985e+08 4.156123e+08 4.527434e+08
##  [36] 4.943489e+08 5.450189e+08 6.038681e+08 6.705012e+08 7.386645e+08
##  [41] 8.069022e+08 8.771351e+08 9.497833e+08 1.027063e+09 1.104951e+09
##  [46] 1.184169e+09 1.265078e+09 1.348660e+09 1.433307e+09 1.520399e+09
##  [51] 1.607637e+09 1.695168e+09 1.786546e+09 1.879470e+09 1.974274e+09
##  [56] 2.069484e+09 2.165732e+09 2.264235e+09 2.371354e+09 2.478514e+09
##  [61] 2.585826e+09 2.694533e+09 2.804206e+09 2.916497e+09 3.028791e+09
##  [66] 3.142944e+09 3.258713e+09 3.381362e+09 3.504725e+09 3.634819e+09
##  [71] 3.767890e+09 3.919118e+09 4.071133e+09 4.223498e+09 4.393273e+09
##  [76] 4.563316e+09 4.733749e+09 4.906816e+09 5.094036e+09 5.281968e+09
##  [81] 5.472026e+09 5.666674e+09 5.892686e+09 6.120814e+09 6.354228e+09
##  [86] 6.588126e+09 6.825454e+09 7.074606e+09 7.323972e+09 7.577028e+09
##  [91] 7.831613e+09 8.099732e+09 8.369174e+09 8.641436e+09 8.923024e+09
##  [96] 9.257990e+09 9.624749e+09 9.993575e+09 1.037561e+10 1.077261e+10
## [101] 1.117237e+10 1.159431e+10 1.202105e+10 1.245683e+10 1.289364e+10
## [106] 1.333446e+10 1.378387e+10 1.424419e+10 1.471025e+10 1.517652e+10
## [111] 1.564516e+10 1.611895e+10 1.659684e+10 1.715812e+10 1.772391e+10
## [116] 1.830098e+10 1.889788e+10 1.949905e+10 2.010273e+10 2.075933e+10
## [121] 2.143029e+10 2.216459e+10 2.296247e+10 2.386688e+10 2.477133e+10
## [126] 2.575894e+10 2.693236e+10 2.810917e+10 2.945446e+10 3.082256e+10
## [131] 3.220843e+10 3.360167e+10 3.505273e+10 3.653120e+10 3.812852e+10
## [136] 3.976129e+10 4.161506e+10 4.348198e+10 4.537816e+10 4.731615e+10
## [141] 4.932440e+10 5.133460e+10 5.343791e+10 5.554184e+10 5.772970e+10
## [146] 5.993801e+10 6.217352e+10 6.442264e+10 6.679677e+10 6.921194e+10
## [151] 7.162860e+10 7.430987e+10 7.708019e+10 7.990718e+10 8.274481e+10
## [156] 8.558531e+10 8.843044e+10 9.147682e+10 9.470865e+10 9.827098e+10
## [161] 1.018405e+11 1.063112e+11 1.112004e+11 1.170986e+11 1.232768e+11
## [166] 1.296544e+11 1.372300e+11 1.461807e+11 1.556437e+11 1.651910e+11
## [171] 1.750755e+11 1.862436e+11 1.980805e+11 2.100897e+11 2.228021e+11
## [176] 2.396630e+11 2.583417e+11 2.780520e+11 3.012812e+11 3.262851e+11
## [181] 4.245989e+11 5.777519e+11
## 
## attr(,"class")
## [1] "Lc"
gini_val <- Gini(Global_GDP$GDP_Nominal, na.rm=TRUE)
gini_val
## [1] 0.8647664

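The Lorenz curve further illustrates this concentration, as it lies far below the line of perfect equality. In the Lorenz ordinates printed above, the smaller half of countries together account for only about 1.4% of total GDP, while the single largest economy holds about 26.5% and the two largest about 43.5%. Consistent with this, the Gini coefficient of 0.865 indicates severe inequality in the distribution of GDP across countries, confirming that global output is dominated by a few large economies.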

# plot Lorenz
plot(Lorenz_Curve, main = "Lorenz Curve of Global GDP (2023)", xlab = "Cumulative share of countries", ylab = "Cumulative share of GDP")
legend("topleft", paste0("Gini = ", round(gini_val, 3)))
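
As a cross-check on the Gini coefficient reported above, the statistic can be recomputed by hand. The sketch below is illustrative only: the helper names `gini_manual` and `gini_from_lorenz` are hypothetical, and it assumes the uncorrected (population) form of the Gini coefficient. Whether it reproduces 0.865 to the last digit depends on which package's `Gini()` was used and on its finite-sample correction setting.

```r
# Gini coefficient from the sorted-values formula (no finite-sample correction)
gini_manual <- function(x) {
  x <- sort(x[!is.na(x)])          # drop missing values, sort ascending
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# The same quantity as one minus twice the area under the Lorenz curve,
# with the area obtained by the trapezoid rule
gini_from_lorenz <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  L <- c(0, cumsum(x) / sum(x))    # Lorenz ordinates, starting at 0
  area <- sum((L[-1] + L[-(n + 1)]) / 2) / n
  1 - 2 * area
}

# Two small sanity checks with made-up values
equal   <- gini_manual(c(5, 5, 5, 5))   # perfect equality: 0
one_all <- gini_manual(c(0, 0, 0, 1))   # one unit holds everything: (n - 1) / n = 0.75
```

Applied to the data in this report, the call would be `gini_manual(Global_GDP$GDP_Nominal)`. The two helpers agree with each other by construction, which makes the link between the plotted curve and the reported coefficient explicit.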

Objective 4

Examine global GDP patterns in 2023

Choropleth maps are used to show the spatial pattern of output. Absolute GDP is concentrated in North America, Western Europe, and East Asia, whereas GDP per capita is highest in small high-income states and other developed economies.

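library(rnaturalearth)
library(rnaturalearthdata)
library(sf)
library(countrycode)
library(dplyr)    # for %>% and left_join()
library(ggplot2)  # for the choropleths below

# convert country names to ISO 3166-1 alpha-3 codes; names that cannot be matched become NA
Global_GDP$iso3 <- countrycode(Global_GDP$Country, 
                               origin = "country.name",
                               destination = "iso3c")

# get world map
world <- ne_countries(scale = "medium", returnclass = "sf")
# join the GDP data to the map polygons using the iso3 column created above
# note: some Natural Earth releases code a few countries (e.g. France, Norway)
# as "-99" in iso_a3, so unmatched polygons are worth checking after the join
mapdata <- world %>%
  left_join(Global_GDP, by = c("iso_a3" = "iso3"))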

# choropleth GDP (log scale)
ggplot(mapdata) +
  geom_sf(aes(fill = log10(GDP_Nominal))) +
  labs(title = "World map of log10(GDP) (2023)", fill = "log10 GDP") +
  theme_minimal()

# choropleth GDP per capita
ggplot(mapdata) +
  geom_sf(aes(fill = log10(GDP_per_capita))) +
  labs(title = "World map of log10(GDP per capita) (2023)", fill = "log10 GDPpc") +
  theme_minimal()
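
Because the maps depend on matching country names to ISO codes, it can help to list what fails to match before interpreting blank areas. The following is a minimal sketch on hypothetical toy data: the data frame `toy`, its made-up GDP values, and the vector `map_codes` (a stand-in for `world$iso_a3`) are all assumptions for illustration, not part of the analysis above.

```r
library(countrycode)

# Hypothetical toy input; "Atlantis" is deliberately not a real country
toy <- data.frame(
  Country     = c("France", "Norway", "Atlantis"),
  GDP_Nominal = c(3.0e12, 4.9e11, 1.0e9),   # made-up values for illustration
  stringsAsFactors = FALSE
)

# Same conversion as in the report; warn = FALSE suppresses the warning so
# that unmatched names can be inspected explicitly below
toy$iso3 <- countrycode(toy$Country,
                        origin = "country.name",
                        destination = "iso3c",
                        warn = FALSE)

# 1. Names that could not be converted to an ISO3 code
unmatched_names <- toy$Country[is.na(toy$iso3)]

# 2. Valid codes with no counterpart in the map layer
map_codes <- c("FRA", "SWE", "-99")          # hypothetical stand-in for world$iso_a3
valid_codes <- toy$iso3[!is.na(toy$iso3)]
unmatched_codes <- setdiff(valid_codes, map_codes)
```

With the real objects, the same two checks would use `Global_GDP$Country`, `Global_GDP$iso3`, and `world$iso_a3`. Any country appearing in either list would be drawn with a missing fill value on the choropleths.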

Conclusion

This study examined the global distribution of economic output using nominal GDP, GDP per capita, income group classifications, and regional comparisons. The findings reveal a highly concentrated global economic structure, where a small number of countries account for a disproportionate share of total world GDP. The Lorenz curve and Gini coefficient provide strong empirical evidence of this concentration, indicating extreme dispersion in economic output across countries.

While high-income and upper-middle-income countries dominate global production, low-income countries contribute only marginally, reflecting deep structural differences in productive capacity, technological advancement, and institutional development. Regional analysis further underscores these disparities, with significant variation in income levels both across and within regions.

Importantly, the results show that a large share of global GDP does not necessarily translate into a higher standard of living: GDP per capita divides output by population, so populous economies with large total GDP can still have modest output per person. Overall, the analysis highlights the persistent imbalance in global economic participation and the structural challenges faced by lower-income economies in achieving sustainable growth.