Introduction

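In this analysis, we examine Olympic medal counts alongside GDP and population data to look for patterns in Olympic performance across countries and regions. The report covers the top 10 medal-winning countries, the distribution of medals by region, medals per capita, the correlation between GDP and medal count, and average medals by GDP bracket.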

Data Overview

We use the Olympic dataset containing information about medals won, GDP, population, and region data for each participating country. You can find the dataset here.

Top 10 Medal-Winning Countries

To identify which countries perform best in the Olympics, we start by examining the top 10 countries by total medal count.

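library(dplyr)    # %>%, arrange(), group_by(), summarise(), mutate()
library(tidyr)    # pivot_longer()
library(ggplot2)  # plotting
# Ten countries with the highest total medal count (olympics is assumed to be loaded)
top_medal_winners <- olympics %>%
  arrange(desc(total)) %>%
  head(10)
top_medal_winners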
##          country country_code        region gold silver bronze total      gdp
## 1  United States          USA North America   40     44     42   126 81695.19
## 2          China          CHN          Asia   40     27     24    91 12614.06
## 3  Great Britain          GBR        Europe   14     22     29    65 48866.60
## 4         France          FRA        Europe   16     26     22    64 44460.82
## 5      Australia          AUS       Oceania   18     19     16    53 64711.77
## 6          Japan          JPN          Asia   20     12     13    45 33834.39
## 7          Italy          ITA        Europe   12     13     15    40 38373.17
## 8    Netherlands          NLD        Europe   15      7     12    34 62536.73
## 9        Germany          DEU        Europe   12     13      8    33 52745.76
## 10   South Korea          KOR          Asia   13      9     10    32 33121.37
##    gdp_year population
## 1      2023      334.9
## 2      2023     1410.7
## 3      2023       68.3
## 4      2023       68.2
## 5      2023       26.6
## 6      2023      124.5
## 7      2023       58.8
## 8      2023       17.9
## 9      2023       84.5
## 10     2023       51.7

Bar plot of total medals for top 10 countries

ggplot(top_medal_winners, aes(x = reorder(country, -total), y = total, fill = country)) +
  geom_bar(stat = "identity", color = "black", alpha = 0.8) +
  labs(
    title = "Top 10 Medal-Winning Countries",
    subtitle = "Countries with the highest total medal counts",
    x = "Country",
    y = "Total Medals"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, color = "black"),
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14),
    legend.position = "none"
  ) +
  scale_fill_brewer(palette = "Set3")

Interpretation

This bar plot displays the top 10 countries by total medals. The United States (126) and China (91) lead by a wide margin, and five of the ten countries are European. Because the table holds one set of medal counts per country, it shows who leads in this dataset rather than who has dominated over time. Factors such as investment in athletic programs or population size could contribute to these rankings.
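
One way to quantify how concentrated the medals are is the share won by the top 10. The sketch below uses base R and totals copied from the top-10 table above and from the regional summary in the next section; the variable names are illustrative and not part of the original analysis.

```r
# Totals copied from the printed top-10 table
top10_totals <- c(126, 91, 65, 64, 53, 45, 40, 34, 33, 32)

# Regional totals copied from the printed regional summary
region_totals <- c(Africa = 39, Asia = 282, Caribbean = 7, Europe = 422,
                   `North America` = 178, Oceania = 74, `South America` = 36)

all_medals  <- sum(region_totals)               # every medal in the dataset
top10_share <- sum(top10_totals) / all_medals   # fraction won by the top 10

round(100 * top10_share, 1)
```

With these figures the top 10 countries account for 583 of 1,038 medals, about 56%.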

Medal Distribution by Region

Here, we analyze the distribution of gold, silver, and bronze medals across different regions. This analysis allows us to observe if certain regions excel in specific medal categories.

medal_distribution_by_region <- olympics %>%
  group_by(region) %>%
  summarise(
    gold = sum(gold, na.rm = TRUE),
    silver = sum(silver, na.rm = TRUE),
    bronze = sum(bronze, na.rm = TRUE),
    total = sum(total, na.rm = TRUE)
  )
medal_distribution_by_region
## # A tibble: 7 × 5
##   region         gold silver bronze total
##   <chr>         <int>  <int>  <int> <int>
## 1 Africa           13     12     14    39
## 2 Asia            100     83     99   282
## 3 Caribbean         2      1      4     7
## 4 Europe          125    131    166   422
## 5 North America    54     58     66   178
## 6 Oceania          28     27     19    74
## 7 South America     6     15     15    36

Stacked bar chart for medal types by region

medal_distribution_long <- medal_distribution_by_region %>%
  pivot_longer(cols = c("gold", "silver", "bronze"), names_to = "medal_type", values_to = "count")

ggplot(medal_distribution_long, aes(x = reorder(region, -total), y = count, fill = medal_type)) +
  geom_bar(stat = "identity", color = "black", alpha = 0.8) +
  labs(
    title = "Medal Distribution by Region",
    subtitle = "Stacked by medal type",
    x = "Region",
    y = "Medals"
  ) +
  theme_minimal(base_size = 15) +
  scale_fill_manual(values = c("gold" = "gold", "silver" = "grey", "bronze" = "saddlebrown")) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14)
  )

Interpretation

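The stacked bar chart shows Europe with the most medals (422), followed by Asia (282) and North America (178). The mix of medal types also varies: Europe and North America won more bronze than gold, while Oceania won more gold (28) than bronze (19). Such differences may reflect cultural or institutional focus on certain sports, although regional totals also depend on how many countries each region contains.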
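
Raw counts make it hard to compare the medal mix between regions of very different size, so it can help to look at shares within each region. A minimal sketch in base R, using the counts from the regional table above; the object names medals, shares, and medal_shares are illustrative.

```r
# Counts copied from the printed regional summary
medals <- data.frame(
  region = c("Africa", "Asia", "Caribbean", "Europe",
             "North America", "Oceania", "South America"),
  gold   = c(13, 100, 2, 125, 54, 28, 6),
  silver = c(12, 83, 1, 131, 58, 27, 15),
  bronze = c(14, 99, 4, 166, 66, 19, 15)
)

medal_cols <- c("gold", "silver", "bronze")

# Row-wise proportions: each region's medals split by type (rows sum to 1)
shares <- prop.table(as.matrix(medals[medal_cols]), margin = 1)

# Same information as percentages, with the region label attached
medal_shares <- data.frame(region = medals$region, round(100 * shares, 1))
medal_shares
```

For example, gold makes up about 38% of Oceania's medals but about 17% of South America's.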

Medals per Capita

We calculate medals per capita (per 10 million people) and analyze its relationship with GDP per capita (log-transformed) to understand how efficiently countries translate resources into Olympic success.

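# population is in millions, so population / 10 is the population in tens of millions.
# Caution: the gdp values look like per-capita USD already (e.g. 81695.19 for the
# United States); if so, log10(gdp) alone is the log GDP per capita, and the
# expression below divides by population a second time.
olympics_with_medals_per_capita <- olympics %>%
  mutate(
    log_gdp_per_capita = log10(gdp * 1e9 / population),
    medals_per_10_million = total / (population / 10)
  )
olympics_with_medals_per_capita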
##               country country_code        region gold silver bronze total
## 1       United States          USA North America   40     44     42   126
## 2               China          CHN          Asia   40     27     24    91
## 3               Japan          JPN          Asia   20     12     13    45
## 4           Australia          AUS       Oceania   18     19     16    53
## 5              France          FRA        Europe   16     26     22    64
## 6         Netherlands          NLD        Europe   15      7     12    34
## 7       Great Britain          GBR        Europe   14     22     29    65
## 8         South Korea          KOR          Asia   13      9     10    32
## 9               Italy          ITA        Europe   12     13     15    40
## 10            Germany          DEU        Europe   12     13      8    33
## 11        New Zealand          NZL       Oceania   10      7      3    20
## 12             Canada          CAN North America    9      7     11    27
## 13         Uzbekistan          UZB          Asia    8      2      3    13
## 14            Hungary          HUN        Europe    6      7      6    19
## 15              Spain          ESP        Europe    5      4      9    18
## 16             Sweden          SWE        Europe    4      4      3    11
## 17              Kenya          KEN        Africa    4      2      5    11
## 18             Norway          NOR        Europe    4      1      3     8
## 19            Ireland          IRL        Europe    4      0      3     7
## 20             Brazil          BRA South America    3      7     10    20
## 21               Iran          IRN          Asia    3      6      3    12
## 22            Ukraine          UKR        Europe    3      5      4    12
## 23            Romania          ROU        Europe    3      4      2     9
## 24            Georgia          GEO        Europe    3      3      1     7
## 25            Belgium          BEL        Europe    3      1      6    10
## 26           Bulgaria          BGR        Europe    3      1      3     7
## 27             Serbia          SRB        Europe    3      1      1     5
## 28     Czech Republic          CZE        Europe    3      0      2     5
## 29            Denmark          DNK        Europe    2      2      5     9
## 30         Azerbaijan          AZE          Asia    2      2      3     7
## 31            Croatia          HRV        Europe    2      2      3     7
## 32               Cuba          CUB North America    2      1      6     9
## 33            Bahrain          BHR          Asia    2      1      1     4
## 34           Slovenia          SVN        Europe    2      1      0     3
## 35             Taiwan          TPE          Asia    2      0      5     7
## 36            Austria          AUT        Europe    2      0      3     5
## 37          Hong Kong          HKG          Asia    2      0      2     4
## 38        Philippines          PHL          Asia    2      0      2     4
## 39            Algeria          DZA        Africa    2      0      1     3
## 40          Indonesia          IDN          Asia    2      0      1     3
## 41             Israel          ISR          Asia    1      5      1     7
## 42             Poland          POL        Europe    1      4      5    10
## 43         Kazakhstan          KAZ          Asia    1      3      3     7
## 44            Jamaica          JAM North America    1      3      2     6
## 45       South Africa          ZAF        Africa    1      3      2     6
## 46           Thailand          THA          Asia    1      3      2     6
## 47           Ethiopia          ETH        Africa    1      3      0     4
## 48        Switzerland          CHE        Europe    1      2      5     8
## 49            Ecuador          ECU South America    1      2      2     5
## 50           Portugal          PRT        Europe    1      2      1     4
## 51             Greece          GRC        Europe    1      1      6     8
## 52          Argentina          ARG South America    1      1      1     3
## 53              Egypt          EGY        Africa    1      1      1     3
## 54            Tunisia          TUN        Africa    1      1      1     3
## 55           Botswana          BWA        Africa    1      1      0     2
## 56              Chile          CHL South America    1      1      0     2
## 57           St Lucia          LCA     Caribbean    1      1      0     2
## 58             Uganda          UGA        Africa    1      1      0     2
## 59 Dominican Republic          DOM North America    1      0      2     3
## 60          Guatemala          GTM North America    1      0      1     2
## 61            Morocco          MAR        Africa    1      0      1     2
## 62           Dominica          DMA     Caribbean    1      0      0     1
## 63           Pakistan          PAK          Asia    1      0      0     1
## 64             Turkey          TUR          Asia    0      3      5     8
## 65             Mexico          MEX North America    0      3      2     5
## 66            Armenia          ARM          Asia    0      3      1     4
## 67           Colombia          COL South America    0      3      1     4
## 68        North Korea          PRK          Asia    0      2      4     6
## 69         Kyrgyzstan          KGZ          Asia    0      2      4     6
## 70          Lithuania          LTU        Europe    0      2      2     4
## 71              India          IND          Asia    0      1      5     6
## 72            Moldova          MDA        Europe    0      1      3     4
## 73             Kosovo          XKX        Europe    0      1      1     2
## 74             Cyprus          CYP        Europe    0      1      0     1
## 75               Fiji          FJI       Oceania    0      1      0     1
## 76             Jordan          JOR          Asia    0      1      0     1
## 77           Mongolia          MNG          Asia    0      1      0     1
## 78             Panama          PAN South America    0      1      0     1
## 79         Tajikistan          TJK          Asia    0      0      3     3
## 80            Albania          ALB        Europe    0      0      2     2
## 81            Grenada          GRD     Caribbean    0      0      2     2
## 82           Malaysia          MYS          Asia    0      0      2     2
## 83        Puerto Rico          PRI     Caribbean    0      0      2     2
## 84         Cape Verde          CPV        Africa    0      0      1     1
## 85        Ivory Coast          CIV        Africa    0      0      1     1
## 86               Peru          PER South America    0      0      1     1
## 87              Qatar          QAT          Asia    0      0      1     1
## 88          Singapore          SGP          Asia    0      0      1     1
## 89           Slovakia          SVK        Europe    0      0      1     1
## 90             Zambia          ZMB        Africa    0      0      1     1
##          gdp gdp_year population log_gdp_per_capita medals_per_10_million
## 1   81695.19     2023      334.9          11.387281            3.76231711
## 2   12614.06     2023     1410.7           9.951420            0.64506982
## 3   33834.39     2023      124.5          11.434189            3.61445783
## 4   64711.77     2023       26.6          12.386102           19.92481203
## 5   44460.82     2023       68.2          11.814193            9.38416422
## 6   62536.73     2023       17.9          12.543282           18.99441341
## 7   48866.60     2023       68.3          11.854591            9.51683748
## 8   33121.37     2023       51.7          11.806618            6.18955513
## 9   38373.17     2023       58.8          11.814650            6.80272109
## 10  52745.76     2023       84.5          11.795331            3.90532544
## 11  48527.83     2023        5.2          12.969988           38.46153846
## 12  53371.70     2023       40.1          12.124167            6.73316708
## 13   2496.11     2023       36.4          10.836162            3.57142857
## 14  22147.21     2023        9.6          12.363048           19.79166667
## 15  32676.98     2023       48.4          11.829397            3.71900826
## 16  56305.25     2023       10.5          12.729360           10.47619048
## 17   1949.90     2023       55.1          10.548861            1.99637024
## 18  87961.78     2023        5.5          13.203931           14.54545455
## 19 103684.88     2023        5.3          13.291440           13.20754717
## 20  10043.62     2023      216.4          10.666633            0.92421442
## 21   4502.55     2023       89.2          10.703094            1.34529148
## 22   5181.36     2023       37.0          11.146242            3.24324324
## 23  18419.42     2023       19.1          11.984243            4.71204188
## 24   8120.36     2023        3.8          12.329792           18.42105263
## 25  53475.29     2023       11.8          12.656271            8.47457627
## 26  15797.60     2023        6.4          12.392411           10.93750000
## 27  11360.96     2023        6.6          12.235871            7.57575758
## 28  30427.42     2023       10.9          12.445839            4.58715596
## 29  67967.38     2023        5.9          13.061449           15.25423729
## 30   7155.08     2023       10.1          11.850293            6.93069307
## 31  21459.78     2023        3.9          12.740561           17.94871795
## 32  56495.85     2022       11.2          12.702799            8.03571429
## 33  29084.31     2023        1.5          13.287568           26.66666667
## 34  32163.51     2023        2.1          13.185144           14.28571429
## 35  32443.71     2023       23.2          12.145643            3.01724138
## 36  56505.97     2023        9.1          12.793053            5.49450549
## 37  50696.59     2023        7.5          12.829917            5.33333333
## 38   3725.55     2023      117.3          10.501892            0.34100597
## 39   5260.21     2023       45.6          11.062038            0.65789474
## 40   4940.55     2023      277.5          10.250512            0.10810811
## 41  52261.68     2023        9.8          12.726957            7.14285714
## 42  22112.86     2023       36.7          11.779979            2.72479564
## 43  13136.62     2023       19.9          11.819631            3.51758794
## 44   6874.20     2023        2.8          12.390064           21.42857143
## 45   6253.16     2023       60.4          11.015063            0.99337748
## 46   7171.81     2023       71.8          10.999504            0.83565460
## 47   1293.78     2023      126.5          10.009770            0.31620553
## 48  99994.94     2023        8.8          13.055495            9.09090909
## 49   6533.35     2023       18.2          11.555065            2.74725275
## 50  27275.11     2023       10.5          12.414577            3.80952381
## 51  22990.01     2023       10.4          12.344506            7.69230769
## 52  13730.51     2023       46.7          11.468370            0.64239829
## 53   3512.58     2023      112.7          10.493702            0.26619343
## 54   3895.39     2023       12.5          11.493641            2.40000000
## 55   7249.80     2023        2.7          12.428962            7.40740741
## 56  17093.24     2023       19.6          11.940568            1.02040816
## 57  13980.09     2023        0.2          13.844480          100.00000000
## 58   1014.21     2023       48.6          10.319492            0.41152263
## 59  10716.01     2023       11.3          11.976955            2.65486726
## 60   5797.52     2023       17.6          11.517730            1.13636364
## 61   3672.11     2023       37.8          10.987424            0.52910053
## 62   8953.90     2023        0.1          13.952012          100.00000000
## 63   1407.02     2023      240.5           9.767185            0.04158004
## 64  12985.75     2023       85.3          11.182518            0.93786635
## 65  13926.11     2023      128.5          11.034927            0.38910506
## 66   8715.77     2023        2.8          12.493148           14.28571429
## 67   6979.73     2023       52.1          11.127001            0.76775432
## 68   1217.00     2023       26.2          10.666989            2.29007634
## 69   1969.87     2023        7.1          11.443179            8.45070423
## 70  27102.78     2023        2.9          12.970616           13.79310345
## 71   2484.85     2023     1428.6           9.240390            0.04199916
## 72   6650.65     2023        2.5          12.424924           16.00000000
## 73   5943.13     2023        1.8          12.518743           11.11111111
## 74  34701.44     2023        1.3          13.426404            7.69230769
## 75   5868.16     2023        0.9          12.814259           11.11111111
## 76   4482.09     2023       11.3          11.598402            0.88495575
## 77   5764.80     2023        3.4          12.229305            2.94117647
## 78  18661.77     2023        4.5          12.617740            2.22222222
## 79   1188.99     2023       10.1          11.070857            2.97029703
## 80   8367.78     2023        2.7          12.491246            7.40740741
## 81  10463.65     2023        0.1          14.019683          200.00000000
## 82  11648.67     2023       34.3          11.530982            0.58309038
## 83  36779.06     2023        3.2          13.060451            6.25000000
## 84   4321.58     2023        0.6          12.857491           16.66666667
## 85   2728.80     2023       28.9          10.975074            0.34602076
## 86   7789.87     2023       34.4          11.354972            0.29069767
## 87  87480.42     2022        2.7          13.510547            3.70370370
## 88  84734.26     2023        5.9          13.157207            1.69491525
## 89  24470.24     2023        5.4          12.656244            1.85185185
## 90   1369.13     2023       20.6          10.822577            0.48543689
ggplot(olympics_with_medals_per_capita, aes(x = log_gdp_per_capita, y = medals_per_10_million, color = region)) +
  geom_point(size = 3, alpha = 0.7) +
  labs(
    title = "Log of GDP per Capita vs. Medals per 10 Million People",
    subtitle = "An insight into wealth and Olympic success",
    x = "Log of GDP per Capita (USD)",
    y = "Medals per 10 Million People"
  ) +
  theme_minimal(base_size = 15) +
  scale_color_brewer(palette = "Set1") +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14),
    legend.position = "bottom"
  )

Interpretation

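The scatter plot suggests that wealthier countries, represented by higher GDP per capita, tend to win more medals per capita. The highest per-capita values, however, belong to very small countries: Grenada (200 medals per 10 million people), Dominica (100), and St Lucia (100) each won one or two medals with populations of 0.2 million or fewer, so a handful of points dominates the vertical scale. The positive relationship between wealth and Olympic success should therefore be read as a tendency rather than a firm result.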
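
The per-capita arithmetic can be checked by hand on a few rows copied from the printed table. The sketch below uses base R only. It also computes a log GDP per capita under the assumption that the gdp column is already in per-capita USD (values such as 81695.19 for the United States suggest this); under that assumption no further division by population is needed. The names sample_rows and log10_gdp_per_capita are illustrative and not part of the original analysis.

```r
# A few rows copied from the printed table; population is in millions
sample_rows <- data.frame(
  country    = c("United States", "China", "Australia", "Grenada"),
  total      = c(126, 91, 53, 2),
  gdp        = c(81695.19, 12614.06, 64711.77, 10463.65),
  population = c(334.9, 1410.7, 26.6, 0.1)
)

# Medals per 10 million people: population / 10 converts millions
# into tens of millions
sample_rows$medals_per_10_million <-
  sample_rows$total / (sample_rows$population / 10)

# Assumption: gdp is already GDP per capita in USD, so the log is taken directly
sample_rows$log10_gdp_per_capita <- log10(sample_rows$gdp)

sample_rows
```

The medals_per_10_million values reproduce the printed column (about 3.76 for the United States and 200 for Grenada). If the per-capita reading of gdp is right, the log values fall between roughly 4 and 5 for these rows instead of the 10 to 14 range shown in the output above.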

GDP and Medal Count Correlation

To understand the direct relationship between economic power and Olympic success, we calculate the correlation between a country’s GDP and total medal count.

gdp_medal_correlation <- olympics %>%
  summarise(correlation = cor(gdp, total, use = "complete.obs"))
gdp_medal_correlation
##   correlation
## 1   0.3479416
ggplot(olympics, aes(x = gdp, y = total)) +
  geom_point(color = "darkgreen", size = 2, alpha = 0.6) +
  labs(
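    # paste0() avoids the stray space before the closing parenthesis
    title = paste0("GDP per Capita vs. Total Medals (Correlation: ", round(gdp_medal_correlation$correlation, 2), ")"),
    subtitle = "Examining economic impact on medal count",
    # gdp values (e.g. 81695.19 for the United States) are on a per-capita USD scale, not billions
    x = "GDP per Capita (USD)",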
    y = "Total Medals"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14)
  )

Interpretation

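This plot shows a weak to moderate positive correlation (about 0.35) between GDP and total medals. Since the gdp values are on a per-capita scale (81695.19 for the United States, for example), the result indicates that wealthier countries tend to win more medals, not that countries with larger total economies do. A correlation does not establish cause, but the pattern is consistent with higher GDP enabling more investment in sports and training infrastructure.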
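
To put the correlation in context, the sketch below converts it into a share of variance explained and a t-test, using base R. It assumes n = 90, the number of rows in the printed table, all of which have a gdp value. It also takes the usual assumptions of the Pearson test for granted, which is doubtful for skewed medal counts, so the p-value is indicative only. The variable names are illustrative.

```r
r <- 0.3479416   # Pearson correlation printed above
n <- 90          # assumption: number of countries with both gdp and total

# Share of variance in total medals associated with gdp
r_squared <- r^2

# t statistic and two-sided p-value for H0: true correlation is zero
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

round(c(r_squared = r_squared, t = t_stat, p = p_value), 4)
```

An r of 0.35 corresponds to roughly 12% of the variance in medal totals, which leaves most of the variation to other factors.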

Average Medals by GDP Brackets

To further investigate the impact of economic status on Olympic performance, we categorize countries into GDP brackets and examine the average medals won by each bracket.

avg_medals_by_gdp_bracket <- olympics %>%
  mutate(gdp_bracket = cut(gdp, breaks = c(0, 20000, 50000, 100000, Inf),
                           labels = c("Low", "Medium", "High", "Very High"))) %>%
  group_by(gdp_bracket) %>%
  summarise(avg_medals = mean(total, na.rm = TRUE))
avg_medals_by_gdp_bracket
## # A tibble: 4 × 2
##   gdp_bracket avg_medals
##   <fct>            <dbl>
## 1 Low               6.15
## 2 Medium           18.0 
## 3 High             21.6 
## 4 Very High         7
ggplot(avg_medals_by_gdp_bracket, aes(x = gdp_bracket, y = avg_medals, fill = gdp_bracket)) +
  geom_bar(stat = "identity", color = "black", alpha = 0.8) +
  labs(
    title = "Average Medals by GDP Bracket",
    subtitle = "Comparing Olympic success across economic categories",
    x = "GDP Bracket",
    y = "Average Medals"
  ) +
  theme_minimal(base_size = 15) +
  scale_fill_brewer(palette = "Pastel1") +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14),
    legend.position = "none"
  )

Interpretation

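The bar plot shows average medals rising from the Low bracket (6.15) through Medium (18.0) to High (21.6), which is consistent with an association between a country’s wealth and its success in the Olympics, possibly due to greater investment in athletic development. The Very High bracket breaks the pattern with an average of 7, but in the table above only Ireland (gdp of 103684.88) exceeds the 100000 threshold, so that bar reflects a single country rather than a trend.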
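
Group averages are easier to judge alongside group sizes. The sketch below rebuilds the brackets from the gdp and total columns as printed in the per-capita table (row order preserved) and adds a count and a median per bracket. It uses base R; the names gdp, total, and bracket_summary are illustrative and not part of the original analysis.

```r
# gdp and total copied from the printed table, rows 1 to 90 in order
gdp <- c(
  81695.19, 12614.06, 33834.39, 64711.77, 44460.82, 62536.73,
  48866.60, 33121.37, 38373.17, 52745.76, 48527.83, 53371.70,
  2496.11, 22147.21, 32676.98, 56305.25, 1949.90, 87961.78,
  103684.88, 10043.62, 4502.55, 5181.36, 18419.42, 8120.36,
  53475.29, 15797.60, 11360.96, 30427.42, 67967.38, 7155.08,
  21459.78, 56495.85, 29084.31, 32163.51, 32443.71, 56505.97,
  50696.59, 3725.55, 5260.21, 4940.55, 52261.68, 22112.86,
  13136.62, 6874.20, 6253.16, 7171.81, 1293.78, 99994.94,
  6533.35, 27275.11, 22990.01, 13730.51, 3512.58, 3895.39,
  7249.80, 17093.24, 13980.09, 1014.21, 10716.01, 5797.52,
  3672.11, 8953.90, 1407.02, 12985.75, 13926.11, 8715.77,
  6979.73, 1217.00, 1969.87, 27102.78, 2484.85, 6650.65,
  5943.13, 34701.44, 5868.16, 4482.09, 5764.80, 18661.77,
  1188.99, 8367.78, 10463.65, 11648.67, 36779.06, 4321.58,
  2728.80, 7789.87, 87480.42, 84734.26, 24470.24, 1369.13
)
total <- c(
  126, 91, 45, 53, 64, 34, 65, 32, 40, 33,
  20, 27, 13, 19, 18, 11, 11, 8, 7, 20,
  12, 12, 9, 7, 10, 7, 5, 5, 9, 7,
  7, 9, 4, 3, 7, 5, 4, 4, 3, 3,
  7, 10, 7, 6, 6, 6, 4, 8, 5, 4,
  8, 3, 3, 3, 2, 2, 2, 2, 3, 2,
  2, 1, 1, 8, 5, 4, 4, 6, 6, 4,
  6, 4, 2, 1, 1, 1, 1, 1, 3, 2,
  2, 2, 2, 1, 1, 1, 1, 1, 1, 1
)

# Same breaks and labels as the analysis above
bracket <- cut(gdp, breaks = c(0, 20000, 50000, 100000, Inf),
               labels = c("Low", "Medium", "High", "Very High"))

# Count, mean, and median of total medals per bracket (in factor-level order)
bracket_summary <- data.frame(
  n_countries   = as.vector(table(bracket)),
  avg_medals    = as.vector(tapply(total, bracket, mean)),
  median_medals = as.vector(tapply(total, bracket, median)),
  row.names     = levels(bracket)
)
bracket_summary
```

With these values the brackets hold 53, 20, 16, and 1 countries respectively, so the Very High average rests on a single observation and is not comparable with the others. The medians are included because a few large medal totals can pull the means upward.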

Conclusion

This analysis suggests that economic factors, such as GDP and GDP per capita, are associated with Olympic success. Wealthier countries generally win more medals, possibly due to better infrastructure, training, and investment in sports. The association is modest, however (a correlation of about 0.35 between GDP and total medals), so other factors like population, cultural emphasis on sports, and training programs are likely to play important roles as well.