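In this analysis, we examine Olympic medal counts and their relationship with GDP and population data to uncover patterns in Olympic performance across countries and regions. The report covers five points: the top medal-winning countries, the distribution of medals by region, medals per capita against GDP per capita, the correlation between GDP and total medals, and average medals by GDP bracket.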
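We use the Olympic dataset containing information about medals won, GDP, population, and region data for each participating country. Judging by the values, the gdp column is a per-capita figure in US dollars (for example, 81,695.19 for the United States) and the population column is in millions. You can find the dataset here.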
To identify which countries performed best, we start by examining the top 10 countries by total medal count.
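# Load dplyr, tidyr and ggplot2; the olympics data frame is assumed to be loaded already
library(tidyverse)

top_medal_winners <- olympics %>%
  arrange(desc(total)) %>%
  head(10)
top_medal_winners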
## country country_code region gold silver bronze total gdp
## 1 United States USA North America 40 44 42 126 81695.19
## 2 China CHN Asia 40 27 24 91 12614.06
## 3 Great Britain GBR Europe 14 22 29 65 48866.60
## 4 France FRA Europe 16 26 22 64 44460.82
## 5 Australia AUS Oceania 18 19 16 53 64711.77
## 6 Japan JPN Asia 20 12 13 45 33834.39
## 7 Italy ITA Europe 12 13 15 40 38373.17
## 8 Netherlands NLD Europe 15 7 12 34 62536.73
## 9 Germany DEU Europe 12 13 8 33 52745.76
## 10 South Korea KOR Asia 13 9 10 32 33121.37
## gdp_year population
## 1 2023 334.9
## 2 2023 1410.7
## 3 2023 68.3
## 4 2023 68.2
## 5 2023 26.6
## 6 2023 124.5
## 7 2023 58.8
## 8 2023 17.9
## 9 2023 84.5
## 10 2023 51.7
ggplot(top_medal_winners, aes(x = reorder(country, -total), y = total, fill = country)) +
  geom_col(color = "black", alpha = 0.8) +
  labs(
    title = "Top 10 Medal-Winning Countries",
    subtitle = "Countries with the highest total medal counts",
    x = "Country",
    y = "Total Medals"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, color = "black"),
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14),
    legend.position = "none"
  ) +
  scale_fill_brewer(palette = "Set3")
This bar plot displays the top 10 countries by total medals. The United States (126) and China (91) lead by a wide margin, followed by Great Britain (65) and France (64). Factors such as investment in athletic programs or population size could contribute to these results, although the chart alone cannot separate them.
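To put the concentration at the top into numbers, the sketch below computes the share of all medals won by these ten countries. It uses only figures printed in this report: the ten totals from the table above and the seven regional totals from the regional summary, which sum to the overall medal count. The vector names are illustrative and not part of the original analysis.

```r
# Totals for the top 10 countries, copied from the table above
top10_totals <- c(
  "United States" = 126, "China" = 91, "Great Britain" = 65, "France" = 64,
  "Australia" = 53, "Japan" = 45, "Italy" = 40, "Netherlands" = 34,
  "Germany" = 33, "South Korea" = 32
)

# Regional totals from the regional summary table; their sum is the overall count
region_totals <- c(
  "Africa" = 39, "Asia" = 282, "Caribbean" = 7, "Europe" = 422,
  "North America" = 178, "Oceania" = 74, "South America" = 36
)

all_medals  <- sum(region_totals)
top10_share <- sum(top10_totals) / all_medals

# Cumulative share of all medals as each country is added in rank order
cumulative_share <- cumsum(top10_totals) / all_medals
round(100 * cumulative_share, 1)
```

With these figures, the ten countries account for 583 of 1,038 medals, about 56%.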
Here, we analyze the distribution of gold, silver, and bronze medals across different regions. This analysis allows us to observe if certain regions excel in specific medal categories.
medal_distribution_by_region <- olympics %>%
  group_by(region) %>%
  summarise(
    gold = sum(gold, na.rm = TRUE),
    silver = sum(silver, na.rm = TRUE),
    bronze = sum(bronze, na.rm = TRUE),
    total = sum(total, na.rm = TRUE)
  )
medal_distribution_by_region
## # A tibble: 7 × 5
## region gold silver bronze total
## <chr> <int> <int> <int> <int>
## 1 Africa 13 12 14 39
## 2 Asia 100 83 99 282
## 3 Caribbean 2 1 4 7
## 4 Europe 125 131 166 422
## 5 North America 54 58 66 178
## 6 Oceania 28 27 19 74
## 7 South America 6 15 15 36
medal_distribution_long <- medal_distribution_by_region %>%
  pivot_longer(cols = c("gold", "silver", "bronze"), names_to = "medal_type", values_to = "count")
ggplot(medal_distribution_long, aes(x = reorder(region, -total), y = count, fill = medal_type)) +
  geom_col(color = "black", alpha = 0.8) +
  labs(
    title = "Medal Distribution by Region",
    subtitle = "Stacked by medal type",
    x = "Region",
    y = "Medals"
  ) +
  theme_minimal(base_size = 15) +
  scale_fill_manual(values = c("gold" = "gold", "silver" = "grey", "bronze" = "saddlebrown")) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14)
  )
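The stacked bar chart shows Europe with the most medals (422), followed by Asia (282) and North America (178). The mix of medal types also varies by region: bronze is the largest category for Europe (166 of 422), while Oceania won more gold (28) than bronze (19). Such differences may reflect cultural or institutional focus on certain sports.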
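Raw counts make regions of very different size hard to compare, so one option is to express each medal type as a share of the region's total. The sketch below does this with the counts from the summary table above; the data frame and the share columns are illustrative additions, not part of the original analysis.

```r
library(dplyr)

# Regional medal counts, copied from the summary table above
medals_by_region <- data.frame(
  region = c("Africa", "Asia", "Caribbean", "Europe",
             "North America", "Oceania", "South America"),
  gold   = c(13, 100, 2, 125, 54, 28, 6),
  silver = c(12, 83, 1, 131, 58, 27, 15),
  bronze = c(14, 99, 4, 166, 66, 19, 15)
)

# Share of each medal type within a region, sorted by gold share
medal_shares <- medals_by_region %>%
  mutate(
    total = gold + silver + bronze,
    gold_share = gold / total,
    silver_share = silver / total,
    bronze_share = bronze / total
  ) %>%
  arrange(desc(gold_share))

# Round only for display
medal_shares %>%
  mutate(across(ends_with("_share"), ~ round(.x, 3)))
```

Sorting by gold share rather than total puts Oceania first (28 of 74 medals were gold), which the stacked counts alone do not make obvious.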
We calculate medals per capita (per 10 million people) and analyze its relationship with GDP per capita (log-transformed) to understand how efficiently countries translate resources into Olympic success.
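# Note: population is in millions, so population / 10 is tens of millions.
# The log transform below treats gdp as total GDP in billions; the gdp values
# look like per-capita USD already, in which case log10(gdp) is the direct transform.
olympics_with_medals_per_capita <- olympics %>%
  mutate(
    log_gdp_per_capita = log10(gdp * 1e9 / population),
    medals_per_10_million = total / (population / 10)
  )
olympics_with_medals_per_capita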
## country country_code region gold silver bronze total
## 1 United States USA North America 40 44 42 126
## 2 China CHN Asia 40 27 24 91
## 3 Japan JPN Asia 20 12 13 45
## 4 Australia AUS Oceania 18 19 16 53
## 5 France FRA Europe 16 26 22 64
## 6 Netherlands NLD Europe 15 7 12 34
## 7 Great Britain GBR Europe 14 22 29 65
## 8 South Korea KOR Asia 13 9 10 32
## 9 Italy ITA Europe 12 13 15 40
## 10 Germany DEU Europe 12 13 8 33
## 11 New Zealand NZL Oceania 10 7 3 20
## 12 Canada CAN North America 9 7 11 27
## 13 Uzbekistan UZB Asia 8 2 3 13
## 14 Hungary HUN Europe 6 7 6 19
## 15 Spain ESP Europe 5 4 9 18
## 16 Sweden SWE Europe 4 4 3 11
## 17 Kenya KEN Africa 4 2 5 11
## 18 Norway NOR Europe 4 1 3 8
## 19 Ireland IRL Europe 4 0 3 7
## 20 Brazil BRA South America 3 7 10 20
## 21 Iran IRN Asia 3 6 3 12
## 22 Ukraine UKR Europe 3 5 4 12
## 23 Romania ROU Europe 3 4 2 9
## 24 Georgia GEO Europe 3 3 1 7
## 25 Belgium BEL Europe 3 1 6 10
## 26 Bulgaria BGR Europe 3 1 3 7
## 27 Serbia SRB Europe 3 1 1 5
## 28 Czech Republic CZE Europe 3 0 2 5
## 29 Denmark DNK Europe 2 2 5 9
## 30 Azerbaijan AZE Asia 2 2 3 7
## 31 Croatia HRV Europe 2 2 3 7
## 32 Cuba CUB North America 2 1 6 9
## 33 Bahrain BHR Asia 2 1 1 4
## 34 Slovenia SVN Europe 2 1 0 3
## 35 Taiwan TPE Asia 2 0 5 7
## 36 Austria AUT Europe 2 0 3 5
## 37 Hong Kong HKG Asia 2 0 2 4
## 38 Philippines PHL Asia 2 0 2 4
## 39 Algeria DZA Africa 2 0 1 3
## 40 Indonesia IDN Asia 2 0 1 3
## 41 Israel ISR Asia 1 5 1 7
## 42 Poland POL Europe 1 4 5 10
## 43 Kazakhstan KAZ Asia 1 3 3 7
## 44 Jamaica JAM North America 1 3 2 6
## 45 South Africa ZAF Africa 1 3 2 6
## 46 Thailand THA Asia 1 3 2 6
## 47 Ethiopia ETH Africa 1 3 0 4
## 48 Switzerland CHE Europe 1 2 5 8
## 49 Ecuador ECU South America 1 2 2 5
## 50 Portugal PRT Europe 1 2 1 4
## 51 Greece GRC Europe 1 1 6 8
## 52 Argentina ARG South America 1 1 1 3
## 53 Egypt EGY Africa 1 1 1 3
## 54 Tunisia TUN Africa 1 1 1 3
## 55 Botswana BWA Africa 1 1 0 2
## 56 Chile CHL South America 1 1 0 2
## 57 St Lucia LCA Caribbean 1 1 0 2
## 58 Uganda UGA Africa 1 1 0 2
## 59 Dominican Republic DOM North America 1 0 2 3
## 60 Guatemala GTM North America 1 0 1 2
## 61 Morocco MAR Africa 1 0 1 2
## 62 Dominica DMA Caribbean 1 0 0 1
## 63 Pakistan PAK Asia 1 0 0 1
## 64 Turkey TUR Asia 0 3 5 8
## 65 Mexico MEX North America 0 3 2 5
## 66 Armenia ARM Asia 0 3 1 4
## 67 Colombia COL South America 0 3 1 4
## 68 North Korea PRK Asia 0 2 4 6
## 69 Kyrgyzstan KGZ Asia 0 2 4 6
## 70 Lithuania LTU Europe 0 2 2 4
## 71 India IND Asia 0 1 5 6
## 72 Moldova MDA Europe 0 1 3 4
## 73 Kosovo XKX Europe 0 1 1 2
## 74 Cyprus CYP Europe 0 1 0 1
## 75 Fiji FJI Oceania 0 1 0 1
## 76 Jordan JOR Asia 0 1 0 1
## 77 Mongolia MNG Asia 0 1 0 1
## 78 Panama PAN South America 0 1 0 1
## 79 Tajikistan TJK Asia 0 0 3 3
## 80 Albania ALB Europe 0 0 2 2
## 81 Grenada GRD Caribbean 0 0 2 2
## 82 Malaysia MYS Asia 0 0 2 2
## 83 Puerto Rico PRI Caribbean 0 0 2 2
## 84 Cape Verde CPV Africa 0 0 1 1
## 85 Ivory Coast CIV Africa 0 0 1 1
## 86 Peru PER South America 0 0 1 1
## 87 Qatar QAT Asia 0 0 1 1
## 88 Singapore SGP Asia 0 0 1 1
## 89 Slovakia SVK Europe 0 0 1 1
## 90 Zambia ZMB Africa 0 0 1 1
## gdp gdp_year population log_gdp_per_capita medals_per_10_million
## 1 81695.19 2023 334.9 11.387281 3.76231711
## 2 12614.06 2023 1410.7 9.951420 0.64506982
## 3 33834.39 2023 124.5 11.434189 3.61445783
## 4 64711.77 2023 26.6 12.386102 19.92481203
## 5 44460.82 2023 68.2 11.814193 9.38416422
## 6 62536.73 2023 17.9 12.543282 18.99441341
## 7 48866.60 2023 68.3 11.854591 9.51683748
## 8 33121.37 2023 51.7 11.806618 6.18955513
## 9 38373.17 2023 58.8 11.814650 6.80272109
## 10 52745.76 2023 84.5 11.795331 3.90532544
## 11 48527.83 2023 5.2 12.969988 38.46153846
## 12 53371.70 2023 40.1 12.124167 6.73316708
## 13 2496.11 2023 36.4 10.836162 3.57142857
## 14 22147.21 2023 9.6 12.363048 19.79166667
## 15 32676.98 2023 48.4 11.829397 3.71900826
## 16 56305.25 2023 10.5 12.729360 10.47619048
## 17 1949.90 2023 55.1 10.548861 1.99637024
## 18 87961.78 2023 5.5 13.203931 14.54545455
## 19 103684.88 2023 5.3 13.291440 13.20754717
## 20 10043.62 2023 216.4 10.666633 0.92421442
## 21 4502.55 2023 89.2 10.703094 1.34529148
## 22 5181.36 2023 37.0 11.146242 3.24324324
## 23 18419.42 2023 19.1 11.984243 4.71204188
## 24 8120.36 2023 3.8 12.329792 18.42105263
## 25 53475.29 2023 11.8 12.656271 8.47457627
## 26 15797.60 2023 6.4 12.392411 10.93750000
## 27 11360.96 2023 6.6 12.235871 7.57575758
## 28 30427.42 2023 10.9 12.445839 4.58715596
## 29 67967.38 2023 5.9 13.061449 15.25423729
## 30 7155.08 2023 10.1 11.850293 6.93069307
## 31 21459.78 2023 3.9 12.740561 17.94871795
## 32 56495.85 2022 11.2 12.702799 8.03571429
## 33 29084.31 2023 1.5 13.287568 26.66666667
## 34 32163.51 2023 2.1 13.185144 14.28571429
## 35 32443.71 2023 23.2 12.145643 3.01724138
## 36 56505.97 2023 9.1 12.793053 5.49450549
## 37 50696.59 2023 7.5 12.829917 5.33333333
## 38 3725.55 2023 117.3 10.501892 0.34100597
## 39 5260.21 2023 45.6 11.062038 0.65789474
## 40 4940.55 2023 277.5 10.250512 0.10810811
## 41 52261.68 2023 9.8 12.726957 7.14285714
## 42 22112.86 2023 36.7 11.779979 2.72479564
## 43 13136.62 2023 19.9 11.819631 3.51758794
## 44 6874.20 2023 2.8 12.390064 21.42857143
## 45 6253.16 2023 60.4 11.015063 0.99337748
## 46 7171.81 2023 71.8 10.999504 0.83565460
## 47 1293.78 2023 126.5 10.009770 0.31620553
## 48 99994.94 2023 8.8 13.055495 9.09090909
## 49 6533.35 2023 18.2 11.555065 2.74725275
## 50 27275.11 2023 10.5 12.414577 3.80952381
## 51 22990.01 2023 10.4 12.344506 7.69230769
## 52 13730.51 2023 46.7 11.468370 0.64239829
## 53 3512.58 2023 112.7 10.493702 0.26619343
## 54 3895.39 2023 12.5 11.493641 2.40000000
## 55 7249.80 2023 2.7 12.428962 7.40740741
## 56 17093.24 2023 19.6 11.940568 1.02040816
## 57 13980.09 2023 0.2 13.844480 100.00000000
## 58 1014.21 2023 48.6 10.319492 0.41152263
## 59 10716.01 2023 11.3 11.976955 2.65486726
## 60 5797.52 2023 17.6 11.517730 1.13636364
## 61 3672.11 2023 37.8 10.987424 0.52910053
## 62 8953.90 2023 0.1 13.952012 100.00000000
## 63 1407.02 2023 240.5 9.767185 0.04158004
## 64 12985.75 2023 85.3 11.182518 0.93786635
## 65 13926.11 2023 128.5 11.034927 0.38910506
## 66 8715.77 2023 2.8 12.493148 14.28571429
## 67 6979.73 2023 52.1 11.127001 0.76775432
## 68 1217.00 2023 26.2 10.666989 2.29007634
## 69 1969.87 2023 7.1 11.443179 8.45070423
## 70 27102.78 2023 2.9 12.970616 13.79310345
## 71 2484.85 2023 1428.6 9.240390 0.04199916
## 72 6650.65 2023 2.5 12.424924 16.00000000
## 73 5943.13 2023 1.8 12.518743 11.11111111
## 74 34701.44 2023 1.3 13.426404 7.69230769
## 75 5868.16 2023 0.9 12.814259 11.11111111
## 76 4482.09 2023 11.3 11.598402 0.88495575
## 77 5764.80 2023 3.4 12.229305 2.94117647
## 78 18661.77 2023 4.5 12.617740 2.22222222
## 79 1188.99 2023 10.1 11.070857 2.97029703
## 80 8367.78 2023 2.7 12.491246 7.40740741
## 81 10463.65 2023 0.1 14.019683 200.00000000
## 82 11648.67 2023 34.3 11.530982 0.58309038
## 83 36779.06 2023 3.2 13.060451 6.25000000
## 84 4321.58 2023 0.6 12.857491 16.66666667
## 85 2728.80 2023 28.9 10.975074 0.34602076
## 86 7789.87 2023 34.4 11.354972 0.29069767
## 87 87480.42 2022 2.7 13.510547 3.70370370
## 88 84734.26 2023 5.9 13.157207 1.69491525
## 89 24470.24 2023 5.4 12.656244 1.85185185
## 90 1369.13 2023 20.6 10.822577 0.48543689
ggplot(olympics_with_medals_per_capita, aes(x = log_gdp_per_capita, y = medals_per_10_million, color = region)) +
  geom_point(size = 3, alpha = 0.7) +
  labs(
    title = "Log of GDP per Capita vs. Medals per 10 Million People",
    subtitle = "An insight into wealth and Olympic success",
    x = "Log of GDP per Capita (USD)",
    y = "Medals per 10 Million People"
  ) +
  theme_minimal(base_size = 15) +
  scale_color_brewer(palette = "Set1") +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14),
    legend.position = "bottom"
  )
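In the scatter plot, countries further right on the x-axis tend to win more medals per 10 million people, which suggests a positive relationship between a country's wealth and its per-capita Olympic success. Two caveats apply. First, the x-axis values (roughly 9 to 14 on a log10 scale) are too large to be GDP per capita in dollars: the gdp column already looks like a per-capita figure, so dividing it by population again pushes small countries to the right. Second, the highest y-values belong to very small countries, namely Grenada (200) and St Lucia and Dominica (100 each), where a listed population of 0.1 to 0.2 million means one or two medals dominate the ratio.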
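As a worked check of the two formulas, the sketch below reproduces the United States row from the table above. The last transform is an assumption rather than the report's method: it treats the gdp column as already per capita, which the magnitude of the values suggests but the dataset description does not state.

```r
# Values for the United States, copied from the tables above
gdp        <- 81695.19  # assumed here to be GDP per capita in USD
population <- 334.9     # in millions
total      <- 126

# Medals per 10 million people: population / 10 converts millions to tens of millions
medals_per_10_million <- total / (population / 10)

# The transform used in this report
log_as_computed <- log10(gdp * 1e9 / population)

# Alternative (assumption): if gdp is already per capita, take its log directly
log_gdp_per_capita_alt <- log10(gdp)

c(
  medals_per_10_million = medals_per_10_million,
  as_computed = log_as_computed,
  alternative = log_gdp_per_capita_alt
)
```

The first two values match the printed row (3.76 medals per 10 million and 11.39), while the alternative lands just under 5, the order of magnitude expected for a per-capita income in the tens of thousands of dollars.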
To understand the direct relationship between economic power and Olympic success, we calculate the correlation between a country’s GDP and total medal count.
gdp_medal_correlation <- olympics %>%
  summarise(correlation = cor(gdp, total, use = "complete.obs"))
gdp_medal_correlation
## correlation
## 1 0.3479416
ggplot(olympics, aes(x = gdp, y = total)) +
  geom_point(color = "darkgreen", size = 2, alpha = 0.6) +
  labs(
    # paste0 avoids the stray space that paste() would leave before ")"
    title = paste0("GDP vs. Total Medals (Correlation: ", round(gdp_medal_correlation$correlation, 2), ")"),
    subtitle = "Examining economic impact on medal count",
    x = "GDP per capita (USD)",  # the gdp values look per-capita, not billions
    y = "Total Medals"
  ) +
  theme_minimal(base_size = 15) +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14)
  )
This plot shows a positive but modest correlation between GDP and total medals (r ≈ 0.35): countries with higher GDP figures tend to win more medals, but GDP alone leaves most of the variation in medal counts unexplained (r² ≈ 0.12). One possible reading is that higher GDP enables more investment in sports and training infrastructure.
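A single Pearson coefficient can be sensitive to a few extreme countries, so it may help to compare it with a rank-based (Spearman) coefficient and to look at a confidence interval. The sketch below uses only the ten rows from the top-10 table as an illustrative sample, so its numbers will differ from the full-data value of 0.35; on the full data the same calls would take olympics$gdp and olympics$total.

```r
# Ten rows copied from the top-10 table above; a small sample for illustration only
sample_df <- data.frame(
  country = c("United States", "China", "Great Britain", "France", "Australia",
              "Japan", "Italy", "Netherlands", "Germany", "South Korea"),
  gdp = c(81695.19, 12614.06, 48866.60, 44460.82, 64711.77,
          33834.39, 38373.17, 62536.73, 52745.76, 33121.37),
  total = c(126, 91, 65, 64, 53, 45, 40, 34, 33, 32)
)

# Pearson measures linear association; Spearman uses ranks and is less
# affected by outliers such as one very large medal count
pearson  <- cor(sample_df$gdp, sample_df$total, method = "pearson")
spearman <- cor(sample_df$gdp, sample_df$total, method = "spearman")

# cor.test adds a confidence interval and p-value for the Pearson coefficient
pearson_test <- cor.test(sample_df$gdp, sample_df$total)

c(pearson = pearson, spearman = spearman)
pearson_test$conf.int
```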
To further investigate the impact of economic status on Olympic performance, we categorize countries into GDP brackets and examine the average medals won by each bracket.
avg_medals_by_gdp_bracket <- olympics %>%
  mutate(gdp_bracket = cut(gdp, breaks = c(0, 20000, 50000, 100000, Inf),
                           labels = c("Low", "Medium", "High", "Very High"))) %>%
  group_by(gdp_bracket) %>%
  summarise(avg_medals = mean(total, na.rm = TRUE))
avg_medals_by_gdp_bracket
## # A tibble: 4 × 2
## gdp_bracket avg_medals
## <fct> <dbl>
## 1 Low 6.15
## 2 Medium 18.0
## 3 High 21.6
## 4 Very High 7
ggplot(avg_medals_by_gdp_bracket, aes(x = gdp_bracket, y = avg_medals, fill = gdp_bracket)) +
  geom_col(color = "black", alpha = 0.8) +
  labs(
    title = "Average Medals by GDP Bracket",
    subtitle = "Comparing Olympic success across economic categories",
    x = "GDP Bracket",
    y = "Average Medals"
  ) +
  theme_minimal(base_size = 15) +
  scale_fill_brewer(palette = "Pastel1") +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(size = 14),
    legend.position = "none"
  )
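The bar plot indicates that average medals rise from the Low bracket (6.15) through Medium (18.0) to High (21.6), but the pattern does not continue: the Very High bracket averages only 7. In the country listing above, only Ireland (GDP figure 103,684.88, 7 medals) exceeds the 100,000 threshold, so that bar reflects a single country. The first three brackets are consistent with a link between a country's wealth and its Olympic success, possibly due to greater investment in athletic development.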
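Because a bracket average can rest on very few countries, reporting the group size and a median next to the mean makes the comparison easier to judge. The sketch below applies the same cut points to six rows copied from the tables above; the subset and the extra summary columns are illustrative, not part of the original analysis.

```r
library(dplyr)

# A handful of rows copied from the tables above (illustrative subset, not the full data)
sample_df <- data.frame(
  country = c("United States", "China", "Ireland", "Switzerland", "Kenya", "Japan"),
  gdp     = c(81695.19, 12614.06, 103684.88, 99994.94, 1949.90, 33834.39),
  total   = c(126, 91, 7, 8, 11, 45)
)

bracket_summary <- sample_df %>%
  mutate(gdp_bracket = cut(gdp,
                           breaks = c(0, 20000, 50000, 100000, Inf),
                           labels = c("Low", "Medium", "High", "Very High"))) %>%
  group_by(gdp_bracket) %>%
  summarise(
    n = n(),                        # number of countries behind each average
    avg_medals = mean(total),
    median_medals = median(total)   # less sensitive to one dominant country
  )
bracket_summary
```

Even in this small subset, the Very High bracket contains only Ireland, and the mean of the Low bracket is pulled up by China.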
This analysis points to an association between economic factors, such as GDP and GDP per capita, and Olympic success. Wealthier countries generally perform better in the Olympics, possibly due to better infrastructure, training, and investment in sports. The relationship is modest, however (r ≈ 0.35 between GDP and total medals), and other factors like population, cultural emphasis on sports, and training programs also play crucial roles in achieving Olympic success.