Chess.com: Exploratory Data Analysis

Manipulating & Visualizing chess.com Game Data

Author

Reece Iriye

Introduction

Chess.com is a website where players are matched against one another to play games of chess. Players are typically paired based on their skill level, and the website’s welcoming interface has boosted its popularity in recent years. Ordinary people are deciding to learn chess because of chess.com’s convenience, and it has even become popular to watch high-rated players face similarly ranked strangers on streaming sites like Twitch.

Because of the newfound hype around chess.com, I decided to analyze a dataset of high-rated players’ games and how well they performed. Throughout my analysis, I focus on the performance of the highest-rated players in the dataset, and I also measure how players performed overall depending on whether they were playing as White or Black.

Data Overview

Issues with the Data

The original dataset had converted the win indicator 1-0 into a date-like value, Jan-00, so I cleaned those entries by converting them back to 1-0. Additionally, because each row of the game data has a specific player of interest, I added a column that records the outcome of the game from that player’s perspective. Below is a table of the players in the dataset who won the most games.
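A minimal sketch of the two cleaning steps described above, in plain Python. The field names (white, black, result, player) and the helper names are assumptions made for illustration; they are not taken from the original dataset or analysis code.

```python
def clean_result(raw):
    """Undo the date coercion described above ("Jan-00" back to "1-0")."""
    return "1-0" if raw == "Jan-00" else raw


def player_result(row):
    """Return "Win", "Loss", or "Draw" from the perspective of row["player"]."""
    result = clean_result(row["result"])
    if result == "1/2-1/2":
        return "Draw"
    if result not in ("1-0", "0-1"):
        raise ValueError(f"Unrecognized result: {result!r}")
    white_won = result == "1-0"
    is_white = row["player"] == row["white"]
    # The player wins when the winning color matches the color they played.
    return "Win" if white_won == is_white else "Loss"


# Hypothetical rows; the schema is an assumption, not the dataset's own.
games = [
    {"white": "A", "black": "B", "result": "Jan-00", "player": "A"},
    {"white": "A", "black": "B", "result": "0-1", "player": "A"},
    {"white": "A", "black": "B", "result": "1/2-1/2", "player": "B"},
]
outcomes = [player_result(g) for g in games]
print(outcomes)
```

Raising an error on unrecognized values, rather than guessing, makes it easier to spot any other entries that were coerced in the same way.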

Player Results

While the original dataset is intuitive on its own, given the stories its columns can already tell, I believe a Result column is essential for displaying the outcome of each game from the perspective of the player of interest.

With this information, I compiled a list of the top 10 players with the most wins in our dataset.

Name                       Win Count
Hikaru Nakamura            234
Baadur Jobava              145
Christopher Woojin Yoo     136
Arjun Erigaisi             131
Alexey Vasilyevich Sarana  118
Denis Lazavik              115
Sergei Zhigalko            115
Nodirbek Abdusattorov      112
Levon Aronian              111
Wesley So                  111

Hikaru Nakamura outperforms all other players by a long shot with 234 wins, well ahead of the second-best performer in our dataset, Baadur Jobava, who has 145 wins. Nakamura’s win count is likely this high in part because he is frequently active on chess.com.

I believe we should now explore chess.com’s internal rating system, so that we can understand how the website ranks players based on the metrics it uses to evaluate them. Observing whether Nakamura has one of the higher ratings may help us determine whether he is one of the best players in our dataset.

ELO Ratings

Each player is assigned an “ELO Rating”. In short, an ELO rating is meant to estimate a player’s skill level: the higher a player’s rating, the better they are estimated to be at chess. Chess.com uses multiple variables to calculate ELO, but to explain the ranking system simply, a player’s ELO generally increases when they win, or when they draw against players ranked higher than them, and it decreases otherwise. More information about Chess ELO and how chess.com specifically calculates player ratings can be found here.
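To make the idea concrete, here is a small sketch of the classical Elo update rule. This is the textbook formula, not necessarily what chess.com computes, and the K-factor of 32 and the function names are assumptions chosen for illustration.

```python
def expected_score(rating, opponent_rating):
    """Classical Elo expected score, a value between 0 and 1."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


def updated_rating(rating, opponent_rating, score, k=32):
    """Rating after one game; score is 1 (win), 0.5 (draw), or 0 (loss).

    k=32 is an illustrative assumption, not a chess.com setting.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


# A 2500-rated player facing a 2600-rated opponent (hypothetical ratings).
print(expected_score(2500, 2600))       # roughly 0.36
print(updated_rating(2500, 2600, 0.5))  # a draw nudges the rating up
print(updated_rating(2500, 2600, 0))    # a loss lowers it
```

Because the lower-rated player is expected to score less than half a point, even a draw against a stronger opponent raises their rating under this rule.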

Here is a distribution of each individual player’s average rating within the dataset.

It shows that most players’ average ratings were above 2000, meaning that most players highlighted are at the Master level or higher. Around half of the players appear to have a rating of 2400 or higher, which suggests that our dataset consists primarily of games from grandmasters. Players with these ELOs are exceptional at chess, so it may be interesting to observe whether chess.com’s ELO system accurately measures strength for players at this skill level and above.

Game Outcomes

Within our data, a total of 10336 unique matches have been played, 1920 unique players are included in the whole dataset, and 2964 games have resulted in draws. Here is a barplot indicating the results summary in our dataset.

This barplot shows that most games resulted in White winning. The dataset may be limited, because not every player rated Master or above is necessarily included. Because of that, it may not tell the full story of high-rated players. From what I can see so far, however, I assume that the small inherent advantage of making the first move gives high-rated players a better chance of winning as White than as Black. This claim is plausible, especially given that high-ranking players rarely make huge blunders (major mistakes) in their games.
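The tally behind a results summary like this one can be sketched in a few lines. The list of result strings below is hypothetical; only the totals of 2964 draws and 10336 games come from the figures reported above.

```python
from collections import Counter

# Hypothetical result strings, one per game, in standard chess notation.
results = ["1-0", "1/2-1/2", "0-1", "1-0", "1-0", "1/2-1/2"]

labels = {"1-0": "White wins", "0-1": "Black wins", "1/2-1/2": "Draw"}
tally = Counter(labels[r] for r in results)

# Print each outcome with its count and share, most common first.
for outcome, count in tally.most_common():
    print(f"{outcome}: {count} ({count / len(results):.1%})")

# Draw share implied by the counts reported above.
draw_share = 2964 / 10336
print(f"Draw share in the dataset: {draw_share:.1%}")
```

By the reported counts, a little under 29% of games were drawn, which leaves the remaining games to be split between White and Black wins.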

Results

Player Summary

To understand factors like Player ELO and player outcomes, it is important to observe overall results from the perspective of players themselves. More specifically, I believe that creating a player summary table and observing how top players performed on an overall scale can help us understand this data on a more individual level. Below is a summary table from the player perspective, in descending order of Total Wins.

Name                            Games Played  Average Moves per Game  Total Wins  Win Percentage (%)  Average ELO Score  Average Opponent ELO Score
Hikaru Nakamura                 408           48.0                    234         57.35               2853               2650
Baadur Jobava                   253           45.0                    145         57.31               2697               2553
Christopher Woojin Yoo          253           46.6                    136         53.75               2152               2505
Arjun Erigaisi                  284           47.5                    131         46.13               2573               2580
Alexey Vasilyevich Sarana       208           50.5                    118         56.73               2659               2484
Denis Lazavik                   214           51.7                    115         53.74               2437               2466
Sergei Zhigalko                 181           45.4                    115         63.54               2663               2422
Nodirbek Abdusattorov           196           49.8                    112         57.14               2631               2482
Levon Aronian                   318           49.7                    111         34.91               2768               2709
Wesley So                       274           44.9                    111         40.51               2781               2745
Sanan Sjugirov                  197           45.3                    109         55.33               2573               2515
Aleksandar Indjic               184           46.3                    108         58.70               2631               2469
Dmitry Vladimirovich Andreikin  157           42.7                    108         68.79               2762               2508
Andrew Tang                     185           45.5                    105         56.76               2586               2457
Parham Maghsoodloo              197           46.6                    105         53.30               2610               2529
Magnus Carlsen                  263           48.7                    103         39.16               2865               2742
Vladislav Artemiev              235           48.5                    101         42.98               2745               2674
Volodar Murzin                  194           46.3                    96          49.48               2509               2417
Awonder Liang                   194           47.2                    95          48.97               2399               2479
Vincent Keymer                  203           50.6                    94          46.31               2540               2464

In this table, we can see that Hikaru Nakamura won the most games in part because he played more games than the rest of the top players in the dataset. Nevertheless, he still has a pretty high win percentage, which suggests that he consistently performs extremely well against high-ranking players. His average opponent’s rating is lower than his own average rating, but the average rating of the players he competes against is still extremely high at 2650.
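A summary like the one above could be produced with an aggregation along these lines. This is a sketch under assumed field names (player, outcome, elo, opponent_elo), not the code actually used for the table, and it leaves out the average move count for brevity.

```python
from collections import defaultdict


def summarize(rows):
    """Aggregate per-game rows into one summary record per player."""
    totals = defaultdict(lambda: {"games": 0, "wins": 0, "elo": 0, "opp_elo": 0})
    for row in rows:
        t = totals[row["player"]]
        t["games"] += 1
        t["wins"] += row["outcome"] == "Win"  # True counts as 1
        t["elo"] += row["elo"]
        t["opp_elo"] += row["opponent_elo"]

    summary = [
        {
            "player": name,
            "games": t["games"],
            "wins": t["wins"],
            "win_pct": round(100 * t["wins"] / t["games"], 2),
            "avg_elo": round(t["elo"] / t["games"]),
            "avg_opp_elo": round(t["opp_elo"] / t["games"]),
        }
        for name, t in totals.items()
    ]
    # Descending order of total wins, as in the table above.
    return sorted(summary, key=lambda s: s["wins"], reverse=True)


# Hypothetical per-game rows for two players.
rows = [
    {"player": "A", "outcome": "Win", "elo": 2800, "opponent_elo": 2700},
    {"player": "A", "outcome": "Loss", "elo": 2790, "opponent_elo": 2750},
    {"player": "B", "outcome": "Win", "elo": 2600, "opponent_elo": 2500},
    {"player": "B", "outcome": "Win", "elo": 2610, "opponent_elo": 2550},
]
table = summarize(rows)
for record in table:
    print(record)
```

The win percentage is simply wins divided by games played, so Nakamura’s 234 wins in 408 games correspond to the 57.35% shown in the table.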

Focusing on the five players with the most wins in our dataset, let’s use a violin plot to visualize the ratings of the opponents our top competitors tend to play against.

The vast majority of the opponents these players face are rated between 2400 and 2900. We can see that Nakamura plays higher-rated competitors on average than the rest of these players. Grischuk plays against opponents with ELOs similar to those Nakamura faces, and the other players compete against opponents with slightly lower ratings, which are still extremely high and competitive nevertheless.

Individual Highest ELO

Identifying exactly who had the highest ELO is also important in determining whether the ELO system accurately ranks players at the grandmaster level and above. Here is a table of players’ highest single-moment ELOs.

Game ID   Player with Max ELO             Maximum Player ELO
16076986  Hikaru Nakamura                 3223
16117900  Alexander Grischuk              3075
16165329  David Paravyan                  3029
16130874  Dmitry Vladimirovich Andreikin  3027
16076010  David Wei Liang Howell          3012
16130748  Nodirbek Abdusattorov           3004
16093258  Samuel Sevian                   3000
16062436  Grigoriy Alekseyevich Oparin    2992
16124234  Sanan Sjugirov                  2988
16123090  Andrew Tang                     2986
16124424  Anton Korobov                   2984
16164551  Nodirbek Yakubboev              2972
16130768  Baadur Jobava                   2969
16123062  Sergei Zhigalko                 2968
16131586  Christopher Woojin Yoo          2968

Nakamura once again comes out on top of our dataset. His peak rating of 3223 suggests that win count may be a reasonable measure of competitiveness and skill on chess.com.
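One way to find each player’s peak rating is to keep only the game in which their rating was highest. The sketch below assumes per-game rows with game_id, player, and elo fields, which are illustrative names rather than the dataset’s actual columns.

```python
def peak_ratings(rows):
    """Keep, for each player, the row with their highest single-game rating."""
    best = {}
    for row in rows:
        current = best.get(row["player"])
        if current is None or row["elo"] > current["elo"]:
            best[row["player"]] = row
    # Highest peak first.
    return sorted(best.values(), key=lambda r: r["elo"], reverse=True)


# Hypothetical rows: two games for player A, one for player B.
rows = [
    {"game_id": 1, "player": "A", "elo": 2900},
    {"game_id": 2, "player": "A", "elo": 3000},
    {"game_id": 3, "player": "B", "elo": 2950},
]
peaks = peak_ratings(rows)
for row in peaks:
    print(row["game_id"], row["player"], row["elo"])
```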

Lowest Move Count

Chess games typically last around 40 moves (https://www.alexcrompton.com/blog/time-thoughts-chess); however, some games last much longer or much shorter. Below is a summary table of the games with the lowest number of moves.

Game ID   Date             Move Count
16064010  2021.06.21       5
16118788  2021.09.18       5
16153863  2021.11.08       5
16129536  2021.10.02       6
16152763  2021.11.08       6
16025283  Mon Mar 15 2021  6
16044044  Sun May 23 2021  6
16093332  2021.08.11       7
16160339  2021.11.16       7
16162079  2021.12.04       7
15793131  Sun Feb 14 2021  7
16050844  Sun May 30 2021  7
16046924  Thu May 27 2021  7
16063154  2021.06.19       8
16072304  2021.07.06       8

Here, we can clearly see that some of the games involving grandmasters in our dataset were exceedingly short. Some opening traps exist in chess where a player is checkmated within the first few moves; however, that is not likely the case here. On chess.com, players have the option to resign at any point, so these games may have ended quickly because one of the players had to leave. We can see, however, that there are not many of these games, suggesting that players with high ratings are reluctant to leave games early because it could lower their ELO and, in turn, hurt their reputation.
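Selecting the shortest games is a simple "n smallest" query. The sketch below uses heapq.nsmallest from the standard library on hypothetical rows; the field names and game IDs are made up for illustration.

```python
import heapq

# Hypothetical games with their move counts.
games = [
    {"game_id": 101, "moves": 42},
    {"game_id": 102, "moves": 5},
    {"game_id": 103, "moves": 61},
    {"game_id": 104, "moves": 7},
    {"game_id": 105, "moves": 38},
]

# The three games with the fewest moves, shortest first.
shortest = heapq.nsmallest(3, games, key=lambda g: g["moves"])
for game in shortest:
    print(game["game_id"], game["moves"])
```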

Biggest Upsets

It is pretty rare for players to defeat opponents rated far higher than they are, because ELO is an indicator of skill. However, it still happens occasionally. Here is a table of the biggest upsets that have occurred on chess.com for these grandmasters.

Game ID   Date             Underdog                     Player ELO  Opponent ELO  ELO Difference
15790435  Sat Jan 09 2021  Sanan Sjugirov               26          2649          2623
15791483  Sat Jan 09 2021  Sanan Sjugirov               26          2649          2623
15791957  Thu Jan 07 2021  Julien Song                  23          2619          2596
15791981  Wed Jan 06 2021  Max Warmerdam                25          2552          2527
15792057  Wed Jan 06 2021  Max Warmerdam                25          2386          2361
15791955  Fri Jan 15 2021  Atulya Shetty                24          2357          2333
16137734  2021.10.15       Leon Luke Mendonca           1391        2542          1151
16060066  2021.06.10       Leon Luke Mendonca           1391        2497          1106
16127070  2021.09.25       L M S T De Silva             1621        2581          960
16126672  2021.09.25       Enkh-Amgalan Amgalantengis   1716        2662          946

For a player to have a double-digit rating, they would have to be pretty terrible at the game, especially considering that each player starts at a rating of 800 immediately upon creating an account. Players with widely differing ELOs aren’t normally matched with each other either, meaning that these players either chose to play each other or faced each other in a chess.com tournament. I believe the games in which double-digit-rated players defeated grandmasters were either played accidentally or coordinated as an experiment between the two players, and they are therefore outliers. It is also worth noting that Sanan Sjugirov appears in the player summary with an average ELO of 2573, so a rating of 26 may simply be a recording error. Either way, these games should not be factored into our overall analysis of ELO’s impact on game outcomes, as they likely did not occur under normal circumstances.
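A sketch of how upsets might be ranked while setting aside the implausible ratings discussed above. The cutoff of 100 (min_plausible_elo) is an arbitrary assumption for illustration, as are the field names; the three sample rows reuse ratings from the table.

```python
def biggest_upsets(rows, min_plausible_elo=100, top=10):
    """Rank wins by rating gap, skipping rows with implausibly low ratings.

    min_plausible_elo is an assumed cutoff, not a chess.com rule.
    """
    upsets = [
        {**r, "elo_diff": r["opponent_elo"] - r["elo"]}
        for r in rows
        if r["outcome"] == "Win"
        and r["opponent_elo"] > r["elo"]
        and r["elo"] >= min_plausible_elo
    ]
    return sorted(upsets, key=lambda r: r["elo_diff"], reverse=True)[:top]


# Ratings taken from the upsets table; the outcome field is assumed.
rows = [
    {"player": "Sanan Sjugirov", "elo": 26, "opponent_elo": 2649, "outcome": "Win"},
    {"player": "Leon Luke Mendonca", "elo": 1391, "opponent_elo": 2542, "outcome": "Win"},
    {"player": "L M S T De Silva", "elo": 1621, "opponent_elo": 2581, "outcome": "Win"},
]
filtered = biggest_upsets(rows)
for r in filtered:
    print(r["player"], r["elo_diff"])
```

With the cutoff applied, the double-digit row drops out and the largest remaining gap is the 1151-point upset by Leon Luke Mendonca.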

Game Outcome Classification Based on Ratings

I think we have enough data to observe how the ratings of White and Black directly affect each side’s chances of winning. I plotted White ELO versus Black ELO to see how the two ratings affect who wins and who loses. I excluded major outliers.

Based on our data, we can see that ELO is a pretty good predictor of whether White or Black will win. Our plot is overwhelmingly green because White simply wins more often. However, higher-rated Black players do tend to defeat lower-rated White players, and the opposite is true as well. Where White and Black are similarly rated, the scatterplot shows many draws and many games won by White.
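One rough way to quantify what the scatterplot suggests is to check how often the higher-rated side wins among decisive games. This is a sketch on hypothetical data with assumed field names, not a result computed from the dataset.

```python
def higher_rated_win_rate(games):
    """Share of decisive games won by the higher-rated side.

    Draws and games between equally rated players are left out.
    """
    decisive = [
        g for g in games
        if g["result"] in ("1-0", "0-1") and g["white_elo"] != g["black_elo"]
    ]
    if not decisive:
        return None
    hits = sum(
        (g["white_elo"] > g["black_elo"]) == (g["result"] == "1-0")
        for g in decisive
    )
    return hits / len(decisive)


# Hypothetical games.
games = [
    {"white_elo": 2700, "black_elo": 2500, "result": "1-0"},
    {"white_elo": 2500, "black_elo": 2700, "result": "0-1"},
    {"white_elo": 2600, "black_elo": 2550, "result": "0-1"},
    {"white_elo": 2600, "black_elo": 2600, "result": "1/2-1/2"},
]
rate = higher_rated_win_rate(games)
print(f"Higher-rated side won {rate:.0%} of decisive games")
```

A rate well above 50% on the real data would support the claim that ELO predicts outcomes, while a rate near 50% would suggest it adds little at this skill level.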

Conclusion

I can conclude that ELO is a decently good measure of how skilled players are. White, however, tends to defeat Black when high-rated players are playing against one another. There are clear limitations to this dataset. For example, we only have a subset of game data for players in these rating categories. Nevertheless, I do believe my conclusions are pretty solid.