Assignment 5B – ELO Calculations Codebase

Author

Muhammad Suffyan Khan

Published

March 1, 2026

Objective

The objective of this assignment is to compute each player’s expected tournament score using the Elo rating model and compare it with their actual score from the Project 1 tournament.

We will then identify:

The five players who most overperformed relative to expectation
The five players who most underperformed relative to expectation

Data Source

This analysis builds directly on the structured dataset created in Project 1.

The dataset contains exactly one row per player with the following variables:

Player Name
Player State
Total Points (Actual Score)
Pre-Tournament Rating
Average Pre-Tournament Rating of Opponents

This dataset was exported as a CSV file and uploaded to GitHub to ensure reproducibility. The file will be loaded into R using the GitHub raw link.

Elo Expected Score Formula

The Elo rating system models the probability that Player A defeats Player B using a logistic function based on rating differences.

The expected score formula used in this assignment is:

\[ E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}} \]

Where:

\(R_A\) = Player A’s pre-tournament rating
\(R_B\) = Opponent’s pre-tournament rating
400 = scaling constant in the Elo system

This formulation implies that a player rated 400 points above an opponent has an expected score of 10/11 (about 0.91), which corresponds to 10-to-1 odds in the stronger player’s favor.
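As a quick check of the 400-point property, the formula can be evaluated directly. This is a minimal sketch: the function name `elo_expected_demo` and the ratings 1600 and 1200 are assumptions chosen for illustration, not values from the tournament data.

```r
# Elo expected score for a player rated r_a against an opponent rated r_b.
# (Function name and ratings are illustrative, not taken from the dataset.)
elo_expected_demo <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

e_strong <- elo_expected_demo(1600, 1200)  # 400 points stronger
e_weak   <- elo_expected_demo(1200, 1600)  # 400 points weaker

e_strong           # 10/11, about 0.909
e_weak             # 1/11, about 0.091
e_strong / e_weak  # ratio of the two expected scores: 10
```

The two expected scores sum to 1, as they should for a single game.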

Sources:
- Glickman, M. E. A Comprehensive Guide to Chess Ratings: https://www.glicko.net/research/acjpaper.pdf
- Elo rating system (Wikipedia): https://en.wikipedia.org/wiki/Elo_rating_system
- The Elo Rating System for Chess and Beyond (YouTube, 2019): https://www.youtube.com/watch?v=AsYfbmp0To0

Approximation Strategy

In the standard Elo system, expected score is computed separately for each game using each opponent’s rating, and then summed across games.

However, the Project 1 dataset retains only the average pre-tournament rating of opponents, not the full match-level opponent list. This analysis therefore approximates the expected tournament score by applying the Elo expected score formula to the player’s rating versus their average opponent rating, and then scaling by the number of rounds.

The expected per-game score is computed as:

\[ E = \frac{1}{1 + 10^{(\overline{R_{opp}} - R_{player})/400}} \]

The expected tournament score is then:

\[ \text{Expected Tournament Score} = n \times E \]

Where:

\(n\) = number of rounds (games) in the tournament
\(\overline{R_{opp}}\) = average opponent pre-rating
\(R_{player}\) = the player’s pre-tournament rating

This provides a transparent approximation consistent with the Elo logistic framework given the available Project 1 output.
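Because the Elo curve is nonlinear, the expected score against the average opponent generally differs from the sum of per-game expected scores. The sketch below compares the two for a single hypothetical player; the ratings are invented for illustration and the helper name `elo_e` is an assumption.

```r
# Hypothetical player and opponents (not from the Project 1 data).
r_player <- 1700
r_opps   <- c(1100, 1300, 1500, 1700, 1900, 1400, 1600)

elo_e <- function(r_player, r_opp) {
  1 / (1 + 10^((r_opp - r_player) / 400))
}

# Standard approach: one expected score per game, then sum.
exact_total <- sum(elo_e(r_player, r_opps))

# Approximation: expected score versus the mean opponent rating,
# scaled by the number of games.
approx_total <- length(r_opps) * elo_e(r_player, mean(r_opps))

c(exact = exact_total, approx = approx_total, gap = exact_total - approx_total)
```

For these made-up ratings the per-game sum comes out about 0.45 points below the average-based figure. The size and sign of the gap depend on how the opponent ratings are spread around the player’s own rating.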

Implementation Steps

The analysis will proceed as follows:

Load Project 1 dataset
- Load Project1_Chess_Summary.csv directly from the GitHub raw link using read.csv() to ensure reproducibility.
Compute expected per-game score
- For each player, compute the expected per-game score using the Elo formula with the player’s pre-rating and the average opponent pre-rating.
Multiply by number of rounds
- The tournament consists of seven rounds, so the expected per-game score is multiplied by 7 to estimate the player’s total expected tournament score.
Compute performance difference
- Compute:

\[ \text{Performance Difference} = \text{Actual Score} - \text{Expected Score} \]

Positive values indicate overperformance relative to rating.
Negative values indicate underperformance.

Rank players
- Sort players by performance difference.
- Report the top five overperformers and bottom five underperformers.

Final Output

The final output will include:

Player Name
Pre-Tournament Rating
Actual Score
Expected Score (approximate)
Performance Difference

Players will be ranked to identify:

Top 5 Overperformers
Top 5 Underperformers

This approach builds directly upon the structured output from Project 1 and applies the Elo rating model to evaluate tournament performance relative to expectation.

Codebase

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.2.0
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(knitr)

1. Load Project 1 dataset

This analysis uses the cleaned Project 1 summary dataset (one row per player) containing:

Player Name
Player State
Total Points (Actual Score)
Pre-Tournament Rating
Average Pre-Tournament Rating of Opponents

chess <- read.csv(
  "https://raw.githubusercontent.com/suffyankhan77/Assignment5B-DATA-607/main/Project1_Chess_Summary.csv"
)

chess <- as_tibble(chess)

glimpse(chess)

Rows: 64
Columns: 5
$ Player_Name                 <chr> "Gary Hua", "Dakshesh Daruri", "Aditya Baj…
$ State                       <chr> "ON", "MI", "MI", "MI", "MI", "OH", "MI", …
$ Total_Points                <dbl> 6.0, 6.0, 6.0, 5.5, 5.5, 5.0, 5.0, 5.0, 5.…
$ Pre_Rating                  <int> 1794, 1553, 1384, 1716, 1655, 1686, 1649, …
$ Average_Opponent_Pre_Rating <int> 1605, 1469, 1564, 1574, 1501, 1519, 1372, …

colnames(chess)

[1] "Player_Name"                 "State"                      
[3] "Total_Points"                "Pre_Rating"                 
[5] "Average_Opponent_Pre_Rating"

2. Elo expected score formula (per-game)

We use the standard Elo expected score formula:

\[ E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}} \]

See the Elo Expected Score Formula section above for the explanation and sources.

3. Compute expected tournament score (approximation)

Because the Project 1 output retains only the average opponent pre-rating (not the full opponent list), we approximate the expected tournament score by computing the expected per-game score against the average opponent rating and multiplying by the number of rounds.

The tournament has 7 rounds, so we set n_rounds = 7 for every player. This assumes each player played all seven rounds.

n_rounds <- 7

elo_expected <- function(r_player, r_opp_avg) {
  1 / (1 + 10^((r_opp_avg - r_player) / 400))
}

results <- chess %>%
  mutate(
    expected_per_game = elo_expected(Pre_Rating, Average_Opponent_Pre_Rating),
    expected_score = n_rounds * expected_per_game,
    actual_score = Total_Points,
    performance_diff = actual_score - expected_score
  )
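The chunk above applies the same multiplier, n_rounds = 7, to every player. If a per-player count of games were available, the expected score could be scaled by that count instead. The sketch below illustrates the difference on toy data; `Games_Played` is a hypothetical column that the Project 1 CSV does not contain, and the two players and their ratings are invented.

```r
# Toy data: Games_Played is a hypothetical column, not part of the Project 1 CSV.
toy <- data.frame(
  Player_Name                 = c("Player A", "Player B"),
  Pre_Rating                  = c(1500, 1500),
  Average_Opponent_Pre_Rating = c(1300, 1300),
  Total_Points                = c(5.0, 1.0),
  Games_Played                = c(7, 2)
)

elo_expected <- function(r_player, r_opp_avg) {
  1 / (1 + 10^((r_opp_avg - r_player) / 400))
}

per_game <- elo_expected(toy$Pre_Rating, toy$Average_Opponent_Pre_Rating)

# Fixed multiplier (as in the analysis) versus a per-player multiplier.
toy$expected_fixed  <- 7 * per_game
toy$expected_played <- toy$Games_Played * per_game
toy$diff_fixed      <- toy$Total_Points - toy$expected_fixed
toy$diff_played     <- toy$Total_Points - toy$expected_played

toy[, c("Player_Name", "Games_Played", "diff_fixed", "diff_played")]
```

For Player A, who played all seven games, the two differences are identical. For Player B, who played only two, the fixed multiplier produces a much larger negative difference than the per-player multiplier.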

results %>%
  select(Player_Name, State, Pre_Rating, Average_Opponent_Pre_Rating, actual_score, expected_score, performance_diff) %>%
  arrange(desc(performance_diff)) %>%
  head(10) %>%
  kable(digits = 3, caption = "Top 10 players by performance difference (Actual - Expected)")

Top 10 players by performance difference (Actual - Expected)
Player_Name	State	Pre_Rating	Average_Opponent_Pre_Rating	actual_score	expected_score	performance_diff
Aditya Bajaj	MI	1384	1564	6.0	1.833	4.167
Zachary James Houghton	MI	1220	1484	4.5	1.257	3.243
Anvit Rao	MI	1365	1554	5.0	1.764	3.236
Jacob Alexander Lavalley	MI	377	1358	3.0	0.025	2.975
Amiyatosh Pwnanandam	MI	980	1385	3.5	0.620	2.880
Stefano Lee	ON	1411	1523	5.0	2.409	2.591
Ethan Guo	MI	935	1495	2.5	0.268	2.232
Michael R Aldrich	MI	1229	1357	4.0	2.266	1.734
Dakshesh Daruri	MI	1553	1469	6.0	4.330	1.670
Tejas Ayyagari	MI	1011	1356	2.5	0.845	1.655
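As a spot-check, two rows of this table can be reproduced by hand from the ratings shown in the glimpse() output. This is a minimal base R sketch; the vector names are assumptions made for the example.

```r
# Ratings and scores copied from the glimpse() output and the table above.
pre_rating <- c("Aditya Bajaj" = 1384, "Dakshesh Daruri" = 1553)
opp_avg    <- c("Aditya Bajaj" = 1564, "Dakshesh Daruri" = 1469)
actual     <- c("Aditya Bajaj" = 6.0,  "Dakshesh Daruri" = 6.0)

# Expected tournament score: 7 rounds times the per-game Elo expectation.
expected <- 7 / (1 + 10^((opp_avg - pre_rating) / 400))

round(cbind(expected, diff = actual - expected), 3)
# Aditya Bajaj:    expected 1.833, diff 4.167
# Dakshesh Daruri: expected 4.330, diff 1.670
```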

4. Identify overperformers and underperformers

We rank players by: \[ \text{Performance Difference} = \text{Actual Score} - \text{Expected Score} \]

Positive values indicate overperformance
Negative values indicate underperformance

Top 5 Overperformers

top_5_over <- results %>%
  arrange(desc(performance_diff)) %>%
  select(Player_Name, State, Pre_Rating, actual_score, expected_score, performance_diff) %>%
  slice(1:5)

kable(top_5_over, digits = 3, caption = "Top 5 Overperformers (Actual - Expected)")

Top 5 Overperformers (Actual - Expected)
Player_Name	State	Pre_Rating	actual_score	expected_score	performance_diff
Aditya Bajaj	MI	1384	6.0	1.833	4.167
Zachary James Houghton	MI	1220	4.5	1.257	3.243
Anvit Rao	MI	1365	5.0	1.764	3.236
Jacob Alexander Lavalley	MI	377	3.0	0.025	2.975
Amiyatosh Pwnanandam	MI	980	3.5	0.620	2.880

Top 5 Underperformers

top_5_under <- results %>%
  arrange(performance_diff) %>%
  select(Player_Name, State, Pre_Rating, actual_score, expected_score, performance_diff) %>%
  slice(1:5)

kable(top_5_under, digits = 3, caption = "Top 5 Underperformers (Actual - Expected)")

Top 5 Underperformers (Actual - Expected)
Player_Name	State	Pre_Rating	actual_score	expected_score	performance_diff
Ashwin Balaji	MI	1530	1.0	6.151	-5.151
Loren Schwiebert	MI	1745	3.5	6.301	-2.801
George Avery Jones	ON	1522	3.5	6.286	-2.786
Gaurav Gidwani	MI	1552	3.5	6.089	-2.589
Chiedozie Okorie	MI	1602	3.5	5.880	-2.380

5. Visualization

ggplot(results, aes(x = performance_diff)) +
  geom_histogram(bins = 15) +
  labs(
    title = "Distribution of Performance Difference (Actual - Expected)",
    x = "Performance Difference",
    y = "Number of Players"
  )

6. Brief Interpretation

Players with the largest positive performance differences scored well above what their pre-tournament rating and average opponent strength predicted. Aditya Bajaj, for example, scored 6.0 points against an expected 1.833, a difference of 4.167. All five top overperformers were rated below their average opponent.

Conversely, the largest negative differences belong to players with expected scores between 5.880 and 6.301 who scored far fewer points. Ashwin Balaji scored 1.0 against an expected 6.151, and the other four underperformers each scored 3.5.

Because expected scores were calculated from average opponent ratings rather than game by game, and with seven rounds assumed for every player, these results approximate true Elo expectations. The ranking still highlights the players whose tournament performance diverged most from rating-based predictions.