week_5B_assignment

Author

Brandon Chanderban

Introduction/Approach

The objective of this Week 5B assignment is to evaluate how each player in the Project One chess tournament dataset performed relative to what the ELO rating differences between them and their opponents would predict. Using each player’s pre-tournament rating and the pre-tournament ratings of their opponents, we compute an expected score across the whole tournament (for instance, 4.3 points across 7 games), compare it to the points the player actually obtained, and then identify the five players who most overperformed and the five who most underperformed relative to expectation.

The ELO expected score formula is taken from a standard reference, namely the video provided by Professor Catlin: The Elo Rating System for Chess and Beyond. Feb 15, 2019. [video, 7m]

Data Structure

This assignment requires a player-to-opponent mapping, because the expected score is computed per opponent and then summed. Based on the completed Project One, the two key structures to be used are:

- A player-level table comprising features such as pair number, player name, total points (the player’s actual score), and the player’s pre-rating.
- An opponent-level table containing multiple rows per player, one for each opponent faced across the played rounds. This table includes the player’s pair number, the opponent’s pair number, and the opponent’s pre-rating.

Using the above data, the expected scores can be calculated per match-up and then aggregated to determine the expected score per player (in a one-row-per-player summary table).

Proposed Plan

The analytical approach will likely follow the steps outlined below:

Reuse the cleaned Project One outputs rather than re-parsing the raw tournament text file.
In doing so, we must ensure that we have:
- each player’s pre-rating and total points, as well as
- each player’s list of opponents with opponent pre-ratings.
Compute the expected score for each match-up using the standard ELO expected score formula (singingbanana, 2019; Glickman, n.d.):

$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}$$

Here, $R_A$ is the player’s pre-rating, $R_B$ is the opponent’s pre-rating, and $E_A$ is the player’s expected score. The formula outputs an expected value per match-up strictly between 0 and 1.
Once the above has been calculated for each match-up, we can then group by the player’s pair number and sum the expected match-up scores across all rounds to obtain their expected points.
The resulting one-row-per-player summary of expected points can then be joined back to the player-level table, allowing us to calculate the difference between total points (actual performance) and expected points.
We will then sort by these performance differences to determine the top 5 overperformers and the top 5 underperformers.
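The steps above can be sketched end to end on a toy example. The three-player round robin below is hypothetical data (not the tournament data), and the names `players`, `opponents`, `elo_expected`, and `toy_result` are illustrative only; the sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data: three players in a round robin (NOT the tournament data)
players <- data.frame(
  pair_number  = c(1, 2, 3),
  total_points = c(1.0, 1.5, 0.5),
  pre_rating   = c(1800, 1600, 1400)
)

# One row per player-opponent pairing
opponents <- data.frame(
  pair_number = c(1, 1, 2, 2, 3, 3),
  opponent    = c(2, 3, 1, 3, 1, 2)
)

# ELO expected score for a player rated r_a against an opponent rated r_b
elo_expected <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

toy_result <- opponents %>%
  # attach the player's own rating
  left_join(players %>% select(pair_number, player_rating = pre_rating),
            by = "pair_number") %>%
  # attach the opponent's rating
  left_join(players %>% select(opponent = pair_number, opponent_rating = pre_rating),
            by = "opponent") %>%
  # expected score per match-up, then summed per player
  mutate(expected_score = elo_expected(player_rating, opponent_rating)) %>%
  group_by(pair_number) %>%
  summarise(expected_points = sum(expected_score), .groups = "drop") %>%
  # join back to the player table and compare actual with expected
  left_join(players, by = "pair_number") %>%
  mutate(difference = total_points - expected_points) %>%
  arrange(desc(difference))

toy_result
```

In this toy case player 2 scores 1.5 against an expectation of 1.0 and therefore sorts to the top, while the highest-rated player sorts to the bottom.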

Anticipated Challenges

One expected challenge lies in ensuring that the join logic is correct and that each opponent pre-rating corresponds to the right player-round entry. Even small join errors can produce plausible-looking outputs, so explicit validation checks are warranted.
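A minimal sketch of what such validation checks could look like is given below. The miniature tables `player_tbl` and `opponent_tbl` are hypothetical stand-ins for the player-level and opponent-level tables, and the particular checks are one reasonable selection rather than an exhaustive list; dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical miniature versions of the player-level and opponent-level tables
player_tbl <- data.frame(
  pair_number = c(1, 2, 3),
  pre_rating  = c(1794, 1553, 1384)
)
opponent_tbl <- data.frame(
  pair_number         = c(1, 1, 2, 2, 3, 3),
  opponents           = c(2, 3, 1, 3, 1, 2),
  opponent_pre_rating = c(1553, 1384, 1794, 1384, 1794, 1553)
)

joined <- opponent_tbl %>%
  left_join(player_tbl, by = "pair_number")

# Check 1: the join key is unique in the player table
stopifnot(!any(duplicated(player_tbl$pair_number)))

# Check 2: a left join on a unique key should neither add nor drop rows
stopifnot(nrow(joined) == nrow(opponent_tbl))

# Check 3: every matchup row found a player rating
stopifnot(!any(is.na(joined$pre_rating)))

# Check 4: each opponent rating agrees with that opponent's own row
# in the player table
rating_mismatches <- opponent_tbl %>%
  left_join(
    player_tbl %>% rename(opponents = pair_number, lookup_rating = pre_rating),
    by = "opponents"
  ) %>%
  filter(is.na(lookup_rating) | lookup_rating != opponent_pre_rating)

stopifnot(nrow(rating_mismatches) == 0)
```

Because `stopifnot()` halts on the first failed condition, a block like this can sit directly after the join and fail loudly rather than letting a bad join flow into the results.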

Optional Endeavors

Include plots exhibiting the distribution of performance differences for the top and bottom performers.
Include hand-calculated validations for one or two players, ensuring that the expected score calculations match the R outputs.

Code Base/Body

As outlined in the previous section, the analysis draws on two tables from Project One, both extracted from the raw chess tournament results text file: the player-level table and the opponent-level table.

To bring these tables into this working directory, the Project One Quarto document was revisited and the two tables were written to CSV files, which were then pushed to my personal GitHub repository. After loading the required library, they are read back in as dataframes as follows.

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.6.0
✔ ggplot2   4.0.1     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

player_url <- "https://raw.githubusercontent.com/bkchanderban/CUNY_SPS/refs/heads/main/DATA607/DATA607/week_5B_assignment/player_level.csv"

opponent_url <- "https://raw.githubusercontent.com/bkchanderban/CUNY_SPS/refs/heads/main/DATA607/DATA607/week_5B_assignment/opponent_level.csv"

player_df <- read_csv(player_url, show_col_types = FALSE)
opp_df <- read_csv(opponent_url, show_col_types = FALSE)

glimpse(player_df)

Rows: 64
Columns: 5
$ pair_number  <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17…
$ player_name  <chr> "GARY HUA", "DAKSHESH DARURI", "ADITYA BAJAJ", "PATRICK H…
$ player_state <chr> "ON", "MI", "MI", "MI", "MI", "OH", "MI", "MI", "ON", "MI…
$ total_points <dbl> 6.0, 6.0, 6.0, 5.5, 5.5, 5.0, 5.0, 5.0, 5.0, 5.0, 4.5, 4.…
$ pre_rating   <dbl> 1794, 1553, 1384, 1716, 1655, 1686, 1649, 1641, 1411, 136…

glimpse(opp_df)

Rows: 408
Columns: 3
$ pair_number <dbl> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3…
$ opponents   <dbl> 39, 21, 18, 14, 7, 12, 4, 63, 58, 4, 17, 16, 20, 7, 8, 61,…
$ pre_rating  <dbl> 1436, 1563, 1600, 1610, 1649, 1663, 1716, 1175, 917, 1716,…

Now that we have the required player and opponent dataframes, the next step is some light data preparation: renaming the pre_rating column in opp_df to opponent_pre_rating, so that it is not confused with the player’s own pre_rating after the join.

opp_df <- opp_df %>%
  rename(opponent_pre_rating = pre_rating)
head(opp_df)

# A tibble: 6 × 3
  pair_number opponents opponent_pre_rating
        <dbl>     <dbl>               <dbl>
1           1        39                1436
2           1        21                1563
3           1        18                1600
4           1        14                1610
5           1         7                1649
6           1        12                1663

Next, we attach each player’s own pre-rating to opp_df by joining on pair_number.

matchups <- opp_df %>%
  left_join(
    player_df %>% select(pair_number, pre_rating),
    by = "pair_number"
  ) %>%
  rename(player_pre_rating = pre_rating)

glimpse(matchups)

Rows: 408
Columns: 4
$ pair_number         <dbl> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3,…
$ opponents           <dbl> 39, 21, 18, 14, 7, 12, 4, 63, 58, 4, 17, 16, 20, 7…
$ opponent_pre_rating <dbl> 1436, 1563, 1600, 1610, 1649, 1663, 1716, 1175, 91…
$ player_pre_rating   <dbl> 1794, 1794, 1794, 1794, 1794, 1794, 1794, 1553, 15…

head(matchups, 9)

# A tibble: 9 × 4
  pair_number opponents opponent_pre_rating player_pre_rating
        <dbl>     <dbl>               <dbl>             <dbl>
1           1        39                1436              1794
2           1        21                1563              1794
3           1        18                1600              1794
4           1        14                1610              1794
5           1         7                1649              1794
6           1        12                1663              1794
7           1         4                1716              1794
8           2        63                1175              1553
9           2        58                 917              1553

Next, we compute the expected score for each matchup using the ELO formula (singingbanana, 2019; Glickman, n.d.).

matchups <- matchups %>%
  mutate(expected_score = 1 / (1 + 10^((opponent_pre_rating - player_pre_rating) / 400)))
  
matchups %>%
  select(pair_number, opponents, player_pre_rating, opponent_pre_rating, expected_score) %>%
  head(n = 9)

# A tibble: 9 × 5
  pair_number opponents player_pre_rating opponent_pre_rating expected_score
        <dbl>     <dbl>             <dbl>               <dbl>          <dbl>
1           1        39              1794                1436          0.887
2           1        21              1794                1563          0.791
3           1        18              1794                1600          0.753
4           1        14              1794                1610          0.743
5           1         7              1794                1649          0.697
6           1        12              1794                1663          0.680
7           1         4              1794                1716          0.610
8           2        63              1553                1175          0.898
9           2        58              1553                 917          0.975
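As a quick sanity check on the formula itself, the sketch below wraps it in a helper (the name `elo_expected` is illustrative and does not appear in the code above) and confirms a few properties that should hold for any correct implementation: equal ratings give 0.5, a 400-point advantage gives 10/11, and the two sides of a matchup sum to 1.

```r
# ELO expected score for a player rated r_a against an opponent rated r_b
# (helper name is hypothetical; the logic mirrors the mutate() call above)
elo_expected <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

# Equal ratings: each side is expected to score half a point
elo_expected(1500, 1500)

# A 400-point advantage corresponds to 10:1 odds, i.e. 10/11
elo_expected(1900, 1500)

# First row of the table above: 1794 vs 1436, approximately 0.887
first_row <- elo_expected(1794, 1436)
round(first_row, 3)

# The two sides of any matchup are complementary
both_sides <- elo_expected(1794, 1436) + elo_expected(1436, 1794)
both_sides
```

The complementarity property is the most useful of the three in practice, since swapping the two rating columns by mistake turns every expected score into one minus its correct value.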

Having calculated the expected score for each matchup a player took part in, the next step is to sum these expected scores to obtain each player’s expected tournament score.

expected_by_player <- matchups %>%
  group_by(pair_number) %>%
  summarise(expected_points = sum(expected_score), .groups = "drop")

head(expected_by_player, 9)

# A tibble: 9 × 2
  pair_number expected_points
        <dbl>           <dbl>
1           1            5.16
2           2            3.78
3           3            1.95
4           4            4.74
5           5            4.38
6           6            4.94
7           7            4.58
8           8            5.03
9           9            2.29
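One further consistency check, offered as a sketch: if the matchup table lists every game from both players’ sides (an assumption about how the opponent-level table was built, not something verified here), then each game contributes exactly one expected point in total, and the expected scores should sum to half the number of rows. For a 408-row table that would be 204, provided the assumption holds. The toy table below is hypothetical.

```r
# ELO expected score (helper name is illustrative)
elo_expected <- function(r_a, r_b) {
  1 / (1 + 10^((r_b - r_a) / 400))
}

# Hypothetical matchup table in which every game appears twice,
# once from each player's perspective
toy_matchups <- data.frame(
  player_pre_rating   = c(1794, 1436, 1553, 1175, 1384, 1641),
  opponent_pre_rating = c(1436, 1794, 1175, 1553, 1641, 1384)
)

toy_matchups$expected_score <- elo_expected(
  toy_matchups$player_pre_rating,
  toy_matchups$opponent_pre_rating
)

n_games        <- nrow(toy_matchups) / 2
total_expected <- sum(toy_matchups$expected_score)

# Expected points are zero-sum per game, so the totals should agree
isTRUE(all.equal(total_expected, n_games))
```

A total that differs from the number of games would suggest either a one-sided row (for example, a game recorded for only one of the two players) or a mismatched rating somewhere in the join.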

With each player’s expected points for the whole tournament in hand, the next goal is to compute the difference between actual and expected scores. From this, we can identify the top 5 overperformers and the top 5 underperformers.

performance <- player_df %>%
  left_join(expected_by_player, by = "pair_number") %>%
  mutate(
    difference = total_points - expected_points
  ) %>% arrange(desc(difference))

performance %>%
  select(pair_number, player_name, total_points, expected_points, difference) %>%
  head(n = 9)

# A tibble: 9 × 5
  pair_number player_name              total_points expected_points difference
        <dbl> <chr>                           <dbl>           <dbl>      <dbl>
1           3 ADITYA BAJAJ                      6            1.95         4.05
2          15 ZACHARY JAMES HOUGHTON            4.5          1.37         3.13
3          10 ANVIT RAO                         5            1.94         3.06
4          46 JACOB ALEXANDER LAVALLEY          3            0.0432       2.96
5          37 AMIYATOSH PWNANANDAM              3.5          0.773        2.73
6           9 STEFANO LEE                       5            2.29         2.71
7           2 DAKSHESH DARURI                   6            3.78         2.22
8          52 ETHAN GUO                         2.5          0.295        2.20
9          59 SEAN M MC CORMICK                 2            0.415        1.59

Top 5 Overperformers

To determine the top 5 overperformers in the tournament, we sort the performance dataframe in descending order of the difference column (that is, the player’s total points minus their expected points) and keep the first five rows.

top_overperformers <- performance %>%
  arrange(desc(difference)) %>%
  slice_head(n = 5) %>%
  select(pair_number, player_name, total_points, expected_points, difference)

top_overperformers

# A tibble: 5 × 5
  pair_number player_name              total_points expected_points difference
        <dbl> <chr>                           <dbl>           <dbl>      <dbl>
1           3 ADITYA BAJAJ                      6            1.95         4.05
2          15 ZACHARY JAMES HOUGHTON            4.5          1.37         3.13
3          10 ANVIT RAO                         5            1.94         3.06
4          46 JACOB ALEXANDER LAVALLEY          3            0.0432       2.96
5          37 AMIYATOSH PWNANANDAM              3.5          0.773        2.73

Top 5 Underperformers

Conversely, to obtain the top 5 underperformers, we sort the performance dataframe by the difference column in ascending order and keep the first five rows.

top_underperformers <- performance %>%
  arrange(difference) %>%
  slice_head(n = 5) %>%
  select(pair_number, player_name, total_points, expected_points, difference)

top_underperformers

# A tibble: 5 × 5
  pair_number player_name        total_points expected_points difference
        <dbl> <chr>                     <dbl>           <dbl>      <dbl>
1          25 LOREN SCHWIEBERT            3.5            6.28      -2.78
2          30 GEORGE AVERY JONES          3.5            6.02      -2.52
3          42 JARED GE                    3              5.01      -2.01
4          31 RISHI SHETTY                3.5            5.09      -1.59
5          35 JOSHUA DAVID LEE            3.5            4.96      -1.46
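The sort-then-slice pattern used for both lists could also be written with dplyr’s `slice_max()` and `slice_min()`, which make the handling of ties explicit. This is an alternative sketch, not the approach used above; the `toy_performance` table is hypothetical and includes a deliberate tie at the fifth position.

```r
library(dplyr)

# Hypothetical performance table with a tie at the fifth-largest difference
toy_performance <- data.frame(
  player_name = c("A", "B", "C", "D", "E", "F", "G"),
  difference  = c(2.5, 2.0, 1.5, 1.0, 0.5, 0.5, -1.0)
)

# with_ties = TRUE (the default) keeps every row tied at the cutoff,
# so more than five rows can come back
over_with_ties <- toy_performance %>%
  slice_max(difference, n = 5)

# with_ties = FALSE returns exactly five rows
over_exact <- toy_performance %>%
  slice_max(difference, n = 5, with_ties = FALSE)

# The same idea for the lowest differences
under_exact <- toy_performance %>%
  slice_min(difference, n = 5, with_ties = FALSE)

nrow(over_with_ties)
nrow(over_exact)
```

With continuous expected scores an exact tie is unlikely, but choosing the tie behaviour deliberately avoids an arbitrary cutoff if one does occur.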

Now that we have the names of the top 5 over- and underperforming participants, I will perform a hand calculation for one player to confirm that the derived expected points (and thus the difference between actual and expected points) are accurate. This validation is done for the top underperformer, Loren Schwiebert (pair number 25).

Using the data in opp_df, we can see that Loren played the following matchups:

opp_df %>%
  filter(pair_number == 25)

# A tibble: 7 × 3
  pair_number opponents opponent_pre_rating
        <dbl>     <dbl>               <dbl>
1          25         9                1411
2          25        53                1393
3          25         3                1384
4          25        24                1229
5          25        34                1399
6          25        10                1365
7          25        47                1362

Using the ELO expected-score formula (singingbanana, 2019; Glickman, n.d.), Loren’s expected scores for each round, in the above order, are: 0.8724, 0.8835, 0.8888, 0.9512, 0.8799, 0.8991, and 0.9007. Adding these figures gives his total expected score for the tournament, approximately 6.2757. This matches the expected_points value for Loren in the top 5 underperformers table (6.275650 at full precision, displayed as 6.28), and subtracting it from his 3.5 actual points reproduces the difference of about -2.78.
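The hand calculation can be reproduced in a few lines of base R. The opponent ratings are taken from the table above; the player pre-rating of 1745 is an assumption, since it is not shown in the excerpted output, but it is the value consistent with the per-round figures quoted above.

```r
# Assumption: pre-rating of 1745 for pair number 25 (not shown in the
# output above; inferred from the quoted per-round expected scores)
player_rating <- 1745

# Opponent pre-ratings for pair number 25, in the order listed above
opponent_ratings <- c(1411, 1393, 1384, 1229, 1399, 1365, 1362)

# Expected score per round
per_round <- 1 / (1 + 10^((opponent_ratings - player_rating) / 400))
round(per_round, 4)

# Total expected points across the seven rounds
total_expected <- sum(per_round)
round(total_expected, 2)

# Actual minus expected, using the 3.5 total points from the table
actual_points   <- 3.5
hand_difference <- actual_points - total_expected
round(hand_difference, 2)
```

Rounded to two decimals, the total and the difference agree with the 6.28 and -2.78 shown in the underperformers table.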

Visualize the Distribution of Performance Differences for the Top and Bottom Performers

Now that we have validated our findings regarding the performance differences of players within the chess tournament, one final step is to visualize these differences between expected points and the total/actual points obtained.

top_bottom_10 <- bind_rows(
  top_underperformers %>% mutate(group = "Underperformers"),
  top_overperformers %>% mutate(group = "Overperformers")
) %>%
  mutate(player_name = forcats::fct_reorder(player_name, difference))

ggplot(top_bottom_10, aes(x = player_name, y = difference, fill = group)) +
  geom_col() +
  geom_hline(yintercept = 0) +
  coord_flip() +
  scale_fill_manual(values = c(
    "Overperformers" = "seagreen3",
    "Underperformers" = "pink2"
  )) +
  labs(
    title = "Top 5 Underperformers and Top 5 Overperformers (Actual - Expected)",
    x = "Player",
    y = "Performance difference (points)",
    fill = ""
  ) +
  theme(plot.title = element_text(hjust = 0.5))

Conclusion

In this Week 5B ELO calculations assignment, the Project One tournament data were used to compare each player’s actual performance against what would be expected based on pre-tournament rating differences across the matchups they played (up to seven). Using the standard ELO expected-score formula (singingbanana, 2019; Glickman, n.d.), an expected score was calculated for each player-opponent pairing, summed to produce each player’s total expected points, and then compared to the player’s actual total points to obtain a performance difference (actual minus expected).

From this, the five players who most overperformed relative to their expectations and the five who most underperformed were identified, demonstrating how tournament outcomes can differ from rating-based predictions because of the variability within individual matchups. Overall, although ELO provides a useful baseline for expected performance, individual tournament play can still produce notable over- and underperformance relative to that baseline.

References

Glickman, M. E. (n.d.). A comprehensive guide to chess ratings [PDF]. Glicko.net.
OpenAI. (2026). ChatGPT (Version 4o) [Large language model]. https://chat.openai.com. Accessed February 28, 2026.
singingbanana. (2019, February 15). The Elo rating system for chess and beyond [Video]. YouTube.