Introduction

As people around the globe gather to watch the 2022 World Cup, international attention turns to the participating countries, their cultures, and their economic stability. The 2022 World Cup takes place in Qatar during November and December of 2022. The tournament features national soccer teams from around the world, whose players have the honor of representing their countries on the World Cup stage. For many of these players this is the highest level of play they will reach, and the tournament often brings feelings of unity and national celebration in each country, similar to the Olympics.

Hypothesis

This project compares the countries participating in the 2022 World Cup with their Gross Domestic Product (GDP) and their share of the global economy. Because countries with a higher GDP may have access to more resources and training, we predict that they may also have a better FIFA ranking, which could in turn influence a country’s success in the World Cup. That being said, the 2022 World Cup will not wrap up until mid-December 2022, so we cannot yet know whether a higher GDP is related to a country’s success in the tournament itself. We can, however, check whether the countries with a high GDP also tend to have a better FIFA ranking.

Process

First, I searched for data on the GDP of countries around the world. This would let me learn more about each country’s economic growth or decline and its contribution to the overall global economy. Once I had GDP data by country, I found data about the 2022 World Cup participants, past winners and host countries, and current and previous rankings by the Fédération Internationale de Football Association (FIFA), which is responsible for overseeing international soccer teams, matches, rankings, and the World Cup. FIFA also oversees the qualifying matches that countries must play to reach the World Cup, so this data was crucial to understanding the ability of each team.

Before working with my data, I loaded the required libraries.

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.1
## ✔ readr   2.1.2     ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(tidytext)
library(ggthemes)
library(readr)
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     combine

Then, I imported the data I found to be most relevant to my hypothesis.

gdp_1960_2020 <- read_csv("gdp_1960_2020.csv")
## Rows: 10134 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): country, state
## dbl (4): year, rank, gdp, gdp_percent
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
fifa_ranking_2022_10_06 <- read_csv("fifa_ranking-2022-10-06.csv")
## Rows: 63916 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (3): country_full, country_abrv, confederation
## dbl  (4): rank, total_points, previous_points, rank_change
## date (1): rank_date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

After loading the data, I started working with it. I first looked at the ten countries with the highest GDP in 2000 and in 2020. Comparing these two years shows how the top 10 changed over a span of twenty years.

gdp_1960_2020 %>% 
  filter(year == 2000)
## # A tibble: 198 × 6
##     year  rank country           state       gdp gdp_percent
##    <dbl> <dbl> <chr>             <chr>     <dbl>       <dbl>
##  1  2000     1 the United States America 1.03e13      0.308 
##  2  2000     2 Japan             Asia    4.89e12      0.147 
##  3  2000     3 Germany           Europe  1.94e12      0.0584
##  4  2000     4 United Kingdom    Europe  1.66e12      0.0498
##  5  2000     5 France            Europe  1.36e12      0.0409
##  6  2000     6 China             Asia    1.21e12      0.0364
##  7  2000     7 Italy             Europe  1.14e12      0.0344
##  8  2000     8 Canada            America 7.45e11      0.0224
##  9  2000     9 Mexico            America 7.08e11      0.0213
## 10  2000    10 Brazil            America 6.55e11      0.0197
## # … with 188 more rows
gdp_1960_2020 %>% 
  filter(year == 2020)
## # A tibble: 175 × 6
##     year  rank country           state       gdp gdp_percent
##    <dbl> <dbl> <chr>             <chr>     <dbl>       <dbl>
##  1  2020     1 the United States America 2.09e13      0.270 
##  2  2020     2 China             Asia    1.47e13      0.190 
##  3  2020     3 Germany           Europe  3.81e12      0.0491
##  4  2020     4 United Kingdom    Europe  2.71e12      0.0349
##  5  2020     5 India             Asia    2.62e12      0.0338
##  6  2020     6 France            Europe  2.60e12      0.0336
##  7  2020     7 Italy             Europe  1.89e12      0.0243
##  8  2020     8 Canada            America 1.64e12      0.0212
##  9  2020     9 South Korea       Asia    1.63e12      0.0210
## 10  2020    10 Russia            Europe  1.48e12      0.0191
## # … with 165 more rows
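
The two top-10 lists above can also be compared with set operations instead of by eye. The following is a minimal sketch, assuming the country names are typed in exactly as they appear in the output above; the variable names (`top10_2000`, `top10_2020`, and so on) are my own and are not part of the original analysis. It uses only base R, so it runs without the CSV files.

```r
# Top-10 country names copied from the two filtered tables above
top10_2000 <- c("the United States", "Japan", "Germany", "United Kingdom",
                "France", "China", "Italy", "Canada", "Mexico", "Brazil")
top10_2020 <- c("the United States", "China", "Germany", "United Kingdom",
                "India", "France", "Italy", "Canada", "South Korea", "Russia")

in_both <- intersect(top10_2000, top10_2020)  # on the list in both years
dropped <- setdiff(top10_2000, top10_2020)    # on the 2000 list only
joined  <- setdiff(top10_2020, top10_2000)    # on the 2020 list only
either  <- union(top10_2000, top10_2020)      # on at least one list

print(dropped)
print(joined)
print(length(either))
```

The `either` vector could then be passed to `filter(country %in% either)` rather than typing the country names out a second time.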

As we can see from these two lists, the top 10 countries with the highest GDP changed between 2000 and 2020. Some countries were on the list in both 2000 and 2020, while others dropped off the list or joined the list in 2020. Now, let’s graph the GDP of the thirteen countries that appeared on the list either in 2000 or 2020.

gdp_1960_2020 %>% 
  filter(year >= 2000 & year <= 2020) %>% 
  filter(country %in% c('the United States', 'Japan', 'Germany', 'United Kingdom', 'France', 'China', 'Italy', 'Canada', 'Mexico', 'Brazil', 'India', 'South Korea', 'Russia')) %>%
  ggplot(aes(year, gdp, color = country)) + geom_line()

This graph shows that the United States consistently had the highest GDP between 2000 and 2020. Japan, Mexico, and Brazil were on the 2000 list but were replaced in 2020 by India, South Korea, and Russia. Next, let’s look at the FIFA rankings of ten highly ranked national teams, using the ranking data through October 2022. This will allow us to see whether any of the countries with the highest GDPs between 2000 and 2020 are among them.

fifa_ranking_2022_10_06 %>% 
  filter(country_full %in% c('Brazil', 'Belgium', 'Argentina', 'France', 'England', 'Spain', 'Netherlands', 'Portugal', 'Denmark', 'Germany')) %>%
  ggplot(aes(rank_date, rank, color = country_full)) + geom_line()

This graph shows the change in FIFA rankings from early 1993 to 2022; a lower rank number means a better ranking. If we look just at the years 2000 to 2020, the teams with the best overall FIFA rankings were Spain, Portugal, Brazil, and France. However, only two of these four, Brazil and France, are also on the list of 13 countries with the highest GDP between 2000 and 2020. Therefore, this comparison suggests little to no correlation between a good FIFA ranking and a high national GDP, although it is a visual comparison and not a computed statistic.
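
The visual comparison above could be backed by an actual statistic. The following is a hedged sketch of one way to do that: a Spearman rank correlation between GDP rank and FIFA rank for the countries present in both tables. The function name `rank_correlation` and the toy data frames are hypothetical; only the column names (`country`, `year`, `rank`, `country_full`, `rank_date`) come from the data loaded earlier. The toy values are made up purely to show the mechanics and say nothing about real countries.

```r
# Spearman correlation between GDP rank and FIFA rank (base R only).
# Assumes `gdp` has columns year, rank, country and `fifa` has columns
# country_full, rank, rank_date, as in the data sets used in this project.
rank_correlation <- function(gdp, fifa, gdp_year, fifa_date) {
  gdp_y  <- gdp[gdp$year == gdp_year, c("country", "rank")]
  fifa_d <- fifa[fifa$rank_date == fifa_date, c("country_full", "rank")]
  names(gdp_y)  <- c("country", "gdp_rank")
  names(fifa_d) <- c("country", "fifa_rank")
  # Inner join: only countries whose names match exactly in both tables
  both <- merge(gdp_y, fifa_d, by = "country")
  # Returns NA if fewer than two countries match
  cor(both$gdp_rank, both$fifa_rank, method = "spearman")
}

# Made-up toy data, NOT real rankings
toy_gdp <- data.frame(
  year    = 2020,
  rank    = c(1, 2, 3, 4, 5),
  country = c("A", "B", "C", "D", "E")
)
toy_fifa <- data.frame(
  country_full = c("A", "B", "C", "D", "E"),
  rank         = c(3, 1, 5, 2, 4),
  rank_date    = as.Date("2022-10-06")
)

rho <- rank_correlation(toy_gdp, toy_fifa, 2020, as.Date("2022-10-06"))
print(rho)
```

On the real data, the join would silently drop any country whose name is spelled differently in the two files (for example 'the United States' in the GDP data), and the United Kingdom would not match England, so the names would need to be harmonized first.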

Geomapping with GDP and World Cup Hosts

The second piece of this project asks whether there is any relationship between a country hosting the World Cup and its GDP. Because the World Cup is an international event, it has the power to bring attention to countries that may not always be acknowledged for their contribution to the global economy. Hosting the World Cup is a huge undertaking, but it can also bring a lot of business and tourism to a country. Therefore, I predict there will be little to no overlap between the countries that have hosted the World Cup and the thirteen highest-GDP countries mentioned above.

In order to best view this data, I decided to build two maps using Tableau. The first world map below shows the countries that hosted the World Cup between 1930 and 2022.

[Interactive Tableau map: World Cup host countries, 1930 to 2022]

The map above shows all the countries that hosted World Cup tournaments between 1930 and 2022. We can see right away that some of these countries, such as Russia, the United States, and France, appear on the list of countries with the highest GDP between 2000 and 2020. However, the overlap is easier to see with a map that shows the GDP of each country.

[Interactive Tableau map: GDP by country, shown as a color gradient]

This map uses a color gradient to show each country’s GDP around the world. It shows the most recent GDP, so it doesn’t take into account how each country’s economy has grown over time. However, it is useful for cross-referencing which World Cup hosts also have a high GDP and vice versa. At least six of the countries that appeared on the highest-GDP list (the United States, Mexico, Brazil, France, Italy, and Russia) have hosted the World Cup at some point. Although this was not what was originally predicted, one plausible explanation is that these countries have strong, steady economies; while they may not need to host the tournament to boost their economy, they are able to host it because of that economic stability.
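
The cross-reference described above can also be done in code instead of by reading it off the two maps. The following is a minimal sketch, assuming the host names are available as a character vector spelled the same way as in the GDP data. The vector `hosts_named` below contains only the hosts named in this write-up, not the full list behind the Tableau map, and both variable names are my own.

```r
# The thirteen countries that were in the top 10 by GDP in 2000 or 2020
top_gdp <- c("the United States", "Japan", "Germany", "United Kingdom",
             "France", "China", "Italy", "Canada", "Mexico", "Brazil",
             "India", "South Korea", "Russia")

# Hosts named in this write-up only (an incomplete, illustrative list)
hosts_named <- c("Russia", "the United States", "France", "Mexico",
                 "Brazil", "Italy", "Qatar")

# Countries that appear in both vectors (exact string match)
overlap <- intersect(top_gdp, hosts_named)
print(overlap)
print(length(overlap))
```

Because `intersect()` matches strings exactly, a host stored as 'United States' would not match 'the United States', so the spellings would need to be aligned before running this on the full host list.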

Conclusion

The first part of this project analyzed whether countries with a higher GDP tended to have a better or worse FIFA ranking. We predicted that the countries with a higher GDP would have a better FIFA ranking. However, this was not necessarily the case. While there was some overlap, many of the countries with a good FIFA ranking were in South America or Europe and did not appear on the list of countries with a high GDP. The second part analyzed the countries that have hosted the World Cup throughout its history and whether they appeared on the list of countries with the highest GDPs. Although we predicted there would be little overlap, on the assumption that less economically successful countries would want to host in order to increase tourism, the data showed the opposite. In fact, at least six of the countries that had hosted the World Cup also appeared on the list of countries with the highest GDP. One likely reason is that these countries can balance preparation and construction for the World Cup while also maintaining a healthy national economy.

The results of this project do not state that a high GDP causes a better FIFA rank or vice versa. They also do not claim that countries that host the World Cup tend to have a higher or lower GDP. This study only aims to discover any correlation between World Cup participation, hosting, and national GDP.

We also recognize the limitations of this study. The years examined for GDP and for World Cup participation or hosting did not overlap exactly, and some data had been reported for longer and in more depth than other data. For example, GDP was reported for the United Kingdom as a whole, but the FIFA rankings cover the individual countries that make up the UK: England, Scotland, Wales, and Northern Ireland. That being said, a more in-depth study could help explain the factors that contribute to a country’s national GDP and how they relate to FIFA rankings over a long period of time, maybe even since the 1930s when the first World Cup was held. Although my predictions did not necessarily ring true, this study can inform others about the relationship between World Cup teams and different countries’ GDP.