library(tidyverse)

Assignment

Based on difference in ratings between the chess players and each of their opponents in our Project 1 tournament, calculate each player’s expected score (e.g. 4.3) and the difference from their actual score (e.g 4.0). List the five players who most overperformed relative to their expected score, and the five players that most underperformed relative to their expected score.

You’ll find some small differences in different implementation of ELO formulas. You may use any reasonably sourced formula, but please cite your source.

Read in data

I’ll begin by reading in the csv that I created as part of the Project 1 assignment.

df <- read_csv('https://raw.githubusercontent.com/kac624/cuny/main/D607/output/proj1_chessRatings.csv')

glimpse(df)
## Rows: 64
## Columns: 12
## $ name          <chr> "GARY HUA", "DAKSHESH DARURI", "ADITYA BAJAJ", "PATRICK …
## $ state         <chr> "ON", "MI", "MI", "MI", "MI", "OH", "MI", "MI", "ON", "M…
## $ total_pts     <dbl> 6.0, 6.0, 6.0, 5.5, 5.5, 5.0, 5.0, 5.0, 5.0, 5.0, 4.5, 4…
## $ pre_rating    <dbl> 1794, 1553, 1384, 1716, 1655, 1686, 1649, 1641, 1411, 13…
## $ oppnt1        <dbl> 39, 63, 8, 23, 45, 34, 57, 3, 25, 16, 38, 42, 36, 54, 19…
## $ oppnt2        <dbl> 21, 58, 61, 28, 37, 29, 46, 32, 18, 19, 56, 33, 27, 44, …
## $ oppnt3        <dbl> 18, 4, 25, 2, 12, 11, 13, 14, 59, 55, 6, 5, 7, 8, 30, NA…
## $ oppnt4        <dbl> 14, 17, 21, 26, 13, 35, 11, 9, 8, 31, 7, 38, 5, 1, 22, 3…
## $ oppnt5        <dbl> 7, 16, 11, 5, 4, 10, 1, 47, 26, 6, 3, NA, 33, 27, 54, 2,…
## $ oppnt6        <dbl> 12, 20, 13, 19, 14, 27, 9, 28, 7, 25, 34, 1, 3, 5, 33, 3…
## $ oppnt7        <dbl> 4, 7, 12, 1, 17, 21, 2, 19, 20, 18, 26, 3, 32, 31, 38, N…
## $ avg_oppnt_rtg <dbl> 1605.286, 1469.286, 1563.571, 1573.571, 1500.857, 1518.7…

Calculating expected score

The formula I’ll use for expected score is as follows: \[ E_{A} = \frac{1}{1 + 10^{(R_B - R_A)/400}} \] where \(E\) is the expected score, \(R\) is the rating, and subscripts \(A\) and \(B\) refer to two players, A and B.
Sources: https://medium.com/purple-theory/what-is-elo-rating-c4eb7a9061e0 & https://en.wikipedia.org/wiki/Elo_rating_system

I begin by defining a function to calculate the expected score.

expected_score <- function(player_rating, oppnt_rating) {
  result <- 1 / (1 + 10^((oppnt_rating - player_rating)/400))
  return(result)
}

I then use that function in a loop that (i) gathers ratings for all opponents, then (ii) calculates the expected score against each, then (iii) sums them and adds the total expected score as a new column in the dataframe.

for (i in 1:nrow(df)) {
  # initiate variables, and gather index for each opponent 
  oppnts <- df[i,5:11]
  ratings <- c()
  player_rating <- df[i,'pre_rating']
  total_exp_score <- 0
  
  # loop through opponent indexes to gather ratings in list
  for (j in 1:length(oppnts)) {
    if (!is.na(oppnts[[j]])) {
      ratings <- c(ratings, df[oppnts[[j]], 'pre_rating'])
    }
  }
  
  # loop through ratings to calculate expected score for each and sum
  for (j in 1:length(ratings)) {
    if (!is.na(ratings[[j]])) {
      exp_score <- expected_score(player_rating, ratings[[j]])
      total_exp_score <- total_exp_score + exp_score
    }
  }
  
  # add score to df
  df[i, 'total_exp_score'] <- total_exp_score
}

I then add a new column to the dataframe that shows the difference between actual and expected scores. This column is calculated as \(Actual - Expected\) so that a positive number represents overperformance and a negative represents underperformance.

df <- df %>%
  mutate(score_diff = total_pts - total_exp_score)

To respond to the assignment’s primary question, I arrange our dataframe to highlight the five players that most over and underperformed relative to their expected scores.

df %>%
  arrange(desc(score_diff)) %>%
  select(!matches('oppnt[0-9]')) %>%
  head(5)
## # A tibble: 5 × 7
##   name                     state total_pts pre_rating avg_oppn…¹ total…² score…³
##   <chr>                    <chr>     <dbl>      <dbl>      <dbl>   <dbl>   <dbl>
## 1 ADITYA BAJAJ             MI          6         1384      1564.  1.95      4.05
## 2 ZACHARY JAMES HOUGHTON   MI          4.5       1220      1484.  1.37      3.13
## 3 ANVIT RAO                MI          5         1365      1554.  1.94      3.06
## 4 JACOB ALEXANDER LAVALLEY MI          3          377      1358.  0.0432    2.96
## 5 AMIYATOSH PWNANANDAM     MI          3.5        980      1385.  0.773     2.73
## # … with abbreviated variable names ¹​avg_oppnt_rtg, ²​total_exp_score,
## #   ³​score_diff
df %>%
  arrange(score_diff) %>%
  select(!matches('oppnt[0-9]')) %>%
  head(5)
## # A tibble: 5 × 7
##   name               state total_pts pre_rating avg_oppnt_rtg total_ex…¹ score…²
##   <chr>              <chr>     <dbl>      <dbl>         <dbl>      <dbl>   <dbl>
## 1 LOREN SCHWIEBERT   MI          3.5       1745         1363.       6.28   -2.78
## 2 GEORGE AVERY JONES ON          3.5       1522         1144.       6.02   -2.52
## 3 JARED GE           MI          3         1332         1150.       5.01   -2.01
## 4 RISHI SHETTY       MI          3.5       1494         1260.       5.09   -1.59
## 5 JOSHUA DAVID LEE   MI          3.5       1438         1150.       4.96   -1.46
## # … with abbreviated variable names ¹​total_exp_score, ²​score_diff

Finally, I put together a plot to visualize actual versus expected scores. I’ve added both linear and loess lines to get a sense of the trend. Neither, however, appears to provide a very good fit. There is a clear positive trend, but it’s very noisy. From this, we can conclude that, at least for this particular tournament, ELO ratings served as a relatively poor predictor of success.

ggplot(df, aes(total_pts,total_exp_score)) +
  geom_smooth(method = 'lm', formula = 'y~x', color = 'darkslategray1', alpha = 0.5) +
  geom_smooth(method = 'loess', formula = 'y~x', color = 'darksalmon', alpha = 0.5) +
  geom_point()