Overview

We will look at a dataset of chess players who transferred federations between 2000 and 2017. The data were published alongside the FiveThirtyEight article ‘American Chess Is Great Again’ (https://fivethirtyeight.com/features/american-chess-is-great-again/). The article's main point is that the USA has built a very strong chess presence by attracting strong players from other federations, leading to what is currently known as the holy trinity of Nakamura, Caruana, and So. We will try to recreate some of the article's findings.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

# Download the data if it is not already present in the working directory.
url <- 'https://raw.githubusercontent.com/fivethirtyeight/data/master/chess-transfers/transfers.csv'
download_file <- 'chesstransfers.csv'
# file.exists() checks the file system; exists() would only look for an R object with that name.
if (!file.exists(download_file)) {
  download.file(url, download_file)
}
#Read in the data
df <- read.csv(download_file)

#Remove the url column and also remove duplicate rows (there are quite a few)
df <- df %>% select(-url) %>% distinct()

#Rename columns
names(df) <- c('playerid', 'newfederation', 'oldfederation', 'transferdate')
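
To illustrate what the deduplication step does, here is a minimal sketch on a made-up table. The player IDs, federation codes, and dates are hypothetical and not taken from the dataset. It uses base R's `duplicated()` so that it runs without extra packages; for rows that are identical in every column, dropping the flagged rows should give the same result as `distinct()`.

```r
# Toy table with hypothetical values; row 3 repeats row 1 exactly.
toy <- data.frame(
  playerid      = c(101, 102, 101, 103),
  newfederation = c("USA", "GER", "USA", "USA"),
  oldfederation = c("PHI", "RUS", "PHI", "ITA"),
  transferdate  = c("2014-01-01", "2010-05-05", "2014-01-01", "2015-06-06"),
  stringsAsFactors = FALSE
)

# duplicated() marks every row that is identical to an earlier row.
is_dup <- duplicated(toy)
n_duplicates <- sum(is_dup)

# Keep only the first occurrence of each row.
toy_unique <- toy[!is_dup, ]

cat("Duplicate rows removed:", n_duplicates, "\n")
cat("Rows remaining:", nrow(toy_unique), "\n")
```

Running the same count on the real data before deduplicating would show how many rows were dropped.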

Questions

Now that we have our data, let’s answer two questions:

How many players have transferred to the US?

How many players have transferred from the US?

old <- df %>% filter(oldfederation == 'USA')
cat('Transfers from the US:', nrow(old))
## Transfers from the US: 28
new <- df %>% filter(newfederation == 'USA')
cat('Transfers to the US:', nrow(new))
## Transfers to the US: 66

We see that there are indeed many more transfers to the US federation than from it.
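
The imbalance can be quantified with simple arithmetic on the two counts printed above (66 and 28). The sketch below hard-codes those counts, and the variable names are illustrative only.

```r
# Counts taken from the output above.
transfers_to_us   <- 66
transfers_from_us <- 28

# Net gain: arrivals minus departures.
net_gain <- transfers_to_us - transfers_from_us

# Ratio of arrivals to departures.
ratio <- transfers_to_us / transfers_from_us

cat("Net gain for the US:", net_gain, "\n")
cat("Arrivals per departure:", round(ratio, 2), "\n")
```

That is a net gain of 38 players, or a little more than two arrivals for every departure.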

Visualization

Let’s also plot the top 10 countries by number of outgoing transfers, and the top 10 by number of incoming transfers.

# Count transfers per former federation and keep the 10 largest
old10 <- df %>%
  group_by(oldfederation) %>%
  summarize(count = n()) %>%
  arrange(desc(count)) %>%
  head(10)

old10 %>%
  ggplot(aes(x = reorder(oldfederation, -count), y = count)) +
  geom_bar(stat = 'identity') +
  xlab('Country Code') +
  ylab('Number of Transfers') +
  ggtitle('Countries with most transfers FROM')

# Count transfers per new federation and keep the 10 largest
new10 <- df %>%
  group_by(newfederation) %>%
  summarize(count = n()) %>%
  arrange(desc(count)) %>%
  head(10)

new10 %>%
  ggplot(aes(x = reorder(newfederation, -count), y = count)) +
  geom_bar(stat = 'identity') +
  xlab('Country Code') +
  ylab('Number of Transfers') +
  ggtitle('Countries with most transfers TO')
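
Both charts rely on the same ranking pattern: count the rows per federation, sort in descending order, and keep the first n. The following is a minimal base R sketch of that pattern on made-up data. The helper name `top_federations` is hypothetical, and the codes are not taken from the dataset.

```r
# Hypothetical helper: the n most frequent values of a character vector.
top_federations <- function(codes, n = 10) {
  counts <- sort(table(codes), decreasing = TRUE)
  head(counts, n)
}

# Made-up federation codes for illustration only.
codes <- c("USA", "USA", "USA", "GER", "GER", "FRA")
top2 <- top_federations(codes, n = 2)
print(top2)
```

On the real data, passing `df$newfederation` or `df$oldfederation` should reproduce the counts behind the two bar charts, up to the ordering of ties.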

Results

The data are fairly clear: the USA is far and away receiving the most player transfers of all countries. Further analysis and much more data would be needed to determine why this is the case. We also see an exodus from Eastern European countries, namely Russia and Ukraine, which would be interesting to examine more closely.
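
As a possible starting point for that closer look, the sketch below tabulates where players leaving a given federation end up. It runs on a made-up table with the same column names as `df`. The rows and the helper name `destinations` are hypothetical, so the output says nothing about the real data.

```r
# Made-up transfers; rows are hypothetical, not taken from the dataset.
toy <- data.frame(
  oldfederation = c("RUS", "RUS", "UKR", "RUS", "UKR"),
  newfederation = c("USA", "GER", "USA", "USA", "RUS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count the destinations of players leaving `federation`.
destinations <- function(data, federation) {
  leaving <- data[data$oldfederation == federation, ]
  sort(table(leaving$newfederation), decreasing = TRUE)
}

rus_dest <- destinations(toy, "RUS")
print(rus_dest)
```

Calling `destinations(df, 'RUS')` or `destinations(df, 'UKR')` on the cleaned data would show whether those departures are concentrated in a few federations or spread across many.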