Chess.com: Exploratory Data Analysis
Manipulating & Visualizing chess.com Game Data
Introduction
Chess.com is a website where players are matched against one another to play games of chess. Players are typically paired by skill level, and the website’s welcoming interface has helped it surge in popularity in recent years. Ordinary people are deciding to learn chess because of chess.com’s convenience, and it has even become popular to watch high-rated players face similarly ranked strangers on streaming sites like Twitch.
Because of the newfound hype around chess.com, I decided to analyze a dataset of high-rated players’ games and how well they performed. Throughout my analysis, I focus on the performance of the highest-rated players in the dataset, and I also measure how players performed overall depending on whether they were playing as White or Black.
Data Overview
Issues with the Data
In the original dataset, wins recorded as 1-0 had been converted to the date-like value Jan-00, so I cleaned those entries by converting them back to 1-0. Additionally, for the player of interest in each row of the game data, I added a column that indicates the outcome of the game from that player’s perspective. The Player Results section below lists the players in the dataset who won the most games.
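The cleaning step can be sketched in a few lines of plain Python. This is only an illustration: it assumes the mangled value appears exactly as the string `Jan-00`, and the function name `clean_result` is hypothetical rather than taken from the original analysis.

```python
# Assumption: the mangled White-win marker shows up as the literal string "Jan-00".
RESULT_FIXES = {"Jan-00": "1-0"}


def clean_result(raw: str) -> str:
    """Restore a mangled result string; leave every other value unchanged."""
    value = raw.strip()
    return RESULT_FIXES.get(value, value)


# Hypothetical sample of raw result values.
raw_results = ["Jan-00", "0-1", "1/2-1/2", " Jan-00 "]
cleaned = [clean_result(r) for r in raw_results]
print(cleaned)
```

Keeping the fixes in a dictionary makes it easy to add further repairs if other mangled values turn up.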
Player Results
While the columns of the original dataset already tell a story on their own, I believe a Result column is essential for displaying the outcome of each game from the perspective of the player of interest.
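A minimal sketch of how such a Result column could be derived is shown here. The function name, its arguments, and the labels `Win`, `Loss`, and `Draw` are assumptions made for illustration; the original column may use different values.

```python
def player_result(player: str, white: str, black: str, result: str) -> str:
    """Classify a game as 'Win', 'Loss', or 'Draw' for the given player.

    `result` is assumed to use the standard notation "1-0", "0-1", or "1/2-1/2".
    """
    if player not in (white, black):
        raise ValueError(f"{player!r} did not play in this game")
    if result == "1/2-1/2":
        return "Draw"
    if result not in ("1-0", "0-1"):
        raise ValueError(f"unexpected result: {result!r}")
    white_won = result == "1-0"
    played_white = player == white
    # The player wins when the winning colour matches the colour they played.
    return "Win" if white_won == played_white else "Loss"


print(player_result("Player A", white="Player A", black="Player B", result="1-0"))
print(player_result("Player A", white="Player B", black="Player A", result="1-0"))
```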
With this information, I compiled a list of the top 10 players with the most wins in our dataset.
| Name | Win Count |
|---|---|
| Hikaru Nakamura | 234 |
| Baadur Jobava | 145 |
| Christopher Woojin Yoo | 136 |
| Arjun Erigaisi | 131 |
| Alexey Vasilyevich Sarana | 118 |
| Denis Lazavik | 115 |
| Sergei Zhigalko | 115 |
| Nodirbek Abdusattorov | 112 |
| Levon Aronian | 111 |
| Wesley So | 111 |
Hikaru Nakamura outperforms all other players by a wide margin with 234 wins; the second-best performer in our dataset, Baadur Jobava, has 145. Nakamura likely has this win count in part because he is frequently active on chess.com.
Next, let’s explore chess.com’s internal rating system to understand how the website ranks players. Checking whether Nakamura also has one of the higher ratings may help us determine whether he is one of the best players in our dataset.
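A win-count ranking like the one in the table can be sketched with `collections.Counter`. The rows below are hypothetical placeholders (player of interest plus the result from that player's perspective), not values from the dataset.

```python
from collections import Counter

# Hypothetical rows: (player of interest, result from that player's perspective).
rows = [
    ("Player A", "Win"),
    ("Player B", "Win"),
    ("Player A", "Loss"),
    ("Player A", "Win"),
    ("Player C", "Draw"),
]

# Count only the wins, then keep the ten players with the most of them.
win_counts = Counter(name for name, result in rows if result == "Win")
top_players = win_counts.most_common(10)

for name, wins in top_players:
    print(f"{name}: {wins}")
```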
ELO Ratings
Each player is assigned an “ELO Rating”. In short, an ELO rating is meant to estimate a player’s skill level: the higher a player’s rating, the better they are estimated to be at chess. Chess.com uses multiple variables to calculate ELO, but put simply, a player’s ELO rises when they win, or when they draw against a higher-rated player, and it falls when they lose, or when they draw against a lower-rated player. More information about chess ELO and how chess.com specifically calculates player ratings can be found here.
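To make the idea concrete, here is a sketch of the classic Elo expected-score and update formulas. It is a textbook illustration only: chess.com's actual calculation uses additional variables and may differ, and the K-factor of 32 is an assumed value.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score (between 0 and 1) under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def updated_rating(rating: float, opponent: float, score: float, k: float = 32) -> float:
    """New rating after one game; score is 1 for a win, 0.5 for a draw, 0 for a loss.

    k=32 is an assumption made for this sketch, not chess.com's setting.
    """
    return rating + k * (score - expected_score(rating, opponent))


# Equal ratings: each side is expected to score 0.5, so a win gains k/2 points.
print(updated_rating(2500, 2500, 1.0))
# A draw against a much stronger opponent still raises the rating.
print(updated_rating(2400, 2800, 0.5) > 2400)
```

The rating moves in proportion to how much the actual score differs from the expected one, which is why an upset win shifts ratings far more than an expected win.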
Here is a distribution of each individual player’s average rating within the dataset.
It shows that most players’ average ratings were above 2000, meaning that most players highlighted are at the Master level or higher. Around half of the players appear to have a rating of 2400 or higher, which suggests that our dataset consists primarily of games from Grandmasters. Players with these ELOs are exceptional at chess, so it may be interesting to observe whether chess.com’s ELO system accurately measures strength for players at this skill level and above.
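The per-player averages behind such a distribution could be computed roughly as follows. The field names (`white`, `white_elo`, `black`, `black_elo`) and the sample games are hypothetical, and the 200-point bins are just one way to summarize the distribution without a plot.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical games; each one contributes a rating for both players.
games = [
    {"white": "A", "white_elo": 2600, "black": "B", "black_elo": 2400},
    {"white": "B", "white_elo": 2420, "black": "A", "black_elo": 2610},
    {"white": "C", "white_elo": 1900, "black": "A", "black_elo": 2620},
]

ratings = defaultdict(list)
for game in games:
    ratings[game["white"]].append(game["white_elo"])
    ratings[game["black"]].append(game["black_elo"])

# One average rating per player, regardless of colour played.
avg_rating = {player: mean(values) for player, values in ratings.items()}

# Share of players averaging 2400 or higher, and counts per 200-point bin.
share_2400_plus = sum(avg >= 2400 for avg in avg_rating.values()) / len(avg_rating)
bins = Counter(int(avg // 200) * 200 for avg in avg_rating.values())

print(avg_rating, share_2400_plus, dict(bins))
```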
Game Outcomes
Within our data, a total of 10336 unique matches have been played, 1920 unique players are included in the whole dataset, and 2964 games have resulted in draws. Here is a barplot indicating the results summary in our dataset.
This barplot shows that the most common result was White winning the game. Our dataset may be limited, because not every player rated Master and above may be included, so it may not tell the full story of high-rated players. From what I can see so far, though, my working assumption is that the small inherent advantage of making the first move as White gives high-rated players a better chance of winning as White than as Black. This claim is plausible, especially given that high-ranking players rarely make huge blunders (major mistakes) that would cancel out that advantage.
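The counts behind a results barplot can be tallied as in the sketch below. The result strings are hypothetical sample data in standard notation, and the label mapping is an assumption about how the bars are named.

```python
from collections import Counter

LABELS = {"1-0": "White wins", "0-1": "Black wins", "1/2-1/2": "Draw"}

# Hypothetical cleaned result column.
results = ["1-0", "1/2-1/2", "0-1", "1-0", "1-0", "1/2-1/2"]

counts = Counter(LABELS[r] for r in results)
shares = {label: n / len(results) for label, n in counts.items()}

for label, n in counts.most_common():
    print(f"{label}: {n} ({shares[label]:.1%})")
```

Reporting shares alongside raw counts makes it easier to compare White's and Black's win rates once draws are taken into account.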
Results
Player Summary
To understand factors like Player ELO and player outcomes, it is important to observe overall results from the perspective of players themselves. More specifically, I believe that creating a player summary table and observing how top players performed on an overall scale can help us understand this data on a more individual level. Below is a summary table from the player perspective, in descending order of Total Wins.
| Name | Games Played | Average Moves per Game | Total Wins | Win Percentage (%) | Average ELO Score | Average Opponent ELO Score |
|---|---|---|---|---|---|---|
| Hikaru Nakamura | 408 | 48.0 | 234 | 57.35 | 2853 | 2650 |
| Baadur Jobava | 253 | 45.0 | 145 | 57.31 | 2697 | 2553 |
| Christopher Woojin Yoo | 253 | 46.6 | 136 | 53.75 | 2152 | 2505 |
| Arjun Erigaisi | 284 | 47.5 | 131 | 46.13 | 2573 | 2580 |
| Alexey Vasilyevich Sarana | 208 | 50.5 | 118 | 56.73 | 2659 | 2484 |
| Denis Lazavik | 214 | 51.7 | 115 | 53.74 | 2437 | 2466 |
| Sergei Zhigalko | 181 | 45.4 | 115 | 63.54 | 2663 | 2422 |
| Nodirbek Abdusattorov | 196 | 49.8 | 112 | 57.14 | 2631 | 2482 |
| Levon Aronian | 318 | 49.7 | 111 | 34.91 | 2768 | 2709 |
| Wesley So | 274 | 44.9 | 111 | 40.51 | 2781 | 2745 |
| Sanan Sjugirov | 197 | 45.3 | 109 | 55.33 | 2573 | 2515 |
| Aleksandar Indjic | 184 | 46.3 | 108 | 58.70 | 2631 | 2469 |
| Dmitry Vladimirovich Andreikin | 157 | 42.7 | 108 | 68.79 | 2762 | 2508 |
| Andrew Tang | 185 | 45.5 | 105 | 56.76 | 2586 | 2457 |
| Parham Maghsoodloo | 197 | 46.6 | 105 | 53.30 | 2610 | 2529 |
| Magnus Carlsen | 263 | 48.7 | 103 | 39.16 | 2865 | 2742 |
| Vladislav Artemiev | 235 | 48.5 | 101 | 42.98 | 2745 | 2674 |
| Volodar Murzin | 194 | 46.3 | 96 | 49.48 | 2509 | 2417 |
| Awonder Liang | 194 | 47.2 | 95 | 48.97 | 2399 | 2479 |
| Vincent Keymer | 203 | 50.6 | 94 | 46.31 | 2540 | 2464 |
In this table, we can see that Hikaru Nakamura won the most games in part because he played more games than the rest of the top players in the dataset. Nevertheless, he still has a high win percentage (57.35%), which suggests that he consistently performs extremely well against high-ranking players. His opponents’ average rating is lower than his own, but at 2650 it is still extremely high.
Focusing on the five players with the most wins in our dataset, let’s use a violin plot to visualize the ratings of the opponents they tend to play against.
The vast majority of these players’ opponents are rated between 2400 and 2900. We can see that Nakamura plays the highest-rated competitors on average. Grischuk plays opponents with ELOs similar to Nakamura’s, and the other players compete against opponents with slightly lower ratings that are nevertheless extremely high and competitive.
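A summary table with these columns could be built along the following lines. This is a sketch under assumptions: the row fields (`name`, `result`, `moves`, `elo`, `opp_elo`) and the sample values are hypothetical, and win percentage is taken to be wins divided by games played, which matches the figures in the table (for example, 234 / 408 ≈ 57.35%).

```python
from statistics import mean

# Hypothetical rows, one per game from the player of interest's perspective.
rows = [
    {"name": "Player A", "result": "Win", "moves": 40, "elo": 2700, "opp_elo": 2600},
    {"name": "Player A", "result": "Loss", "moves": 60, "elo": 2710, "opp_elo": 2650},
    {"name": "Player B", "result": "Win", "moves": 30, "elo": 2500, "opp_elo": 2550},
    {"name": "Player A", "result": "Win", "moves": 50, "elo": 2720, "opp_elo": 2640},
]


def summarize(rows):
    """Aggregate per-player statistics, sorted by total wins (descending)."""
    by_player = {}
    for row in rows:
        by_player.setdefault(row["name"], []).append(row)
    summary = []
    for name, games in by_player.items():
        wins = sum(g["result"] == "Win" for g in games)
        summary.append({
            "name": name,
            "games": len(games),
            "avg_moves": round(mean([g["moves"] for g in games]), 1),
            "wins": wins,
            "win_pct": round(100 * wins / len(games), 2),
            "avg_elo": round(mean([g["elo"] for g in games])),
            "avg_opp_elo": round(mean([g["opp_elo"] for g in games])),
        })
    return sorted(summary, key=lambda s: s["wins"], reverse=True)


summary = summarize(rows)
for line in summary:
    print(line)
```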
Individual Highest ELO
Observing exactly who had the highest ELO is also important in determining whether the ELO system accurately ranks players at the Grandmaster level and above. Here is a table of players’ highest single-game ELOs.
| Game ID | Player with Max ELO | Maximum Player ELO |
|---|---|---|
| 16076986 | Hikaru Nakamura | 3223 |
| 16117900 | Alexander Grischuk | 3075 |
| 16165329 | David Paravyan | 3029 |
| 16130874 | Dmitry Vladimirovich Andreikin | 3027 |
| 16076010 | David Wei Liang Howell | 3012 |
| 16130748 | Nodirbek Abdusattorov | 3004 |
| 16093258 | Samuel Sevian | 3000 |
| 16062436 | Grigoriy Alekseyevich Oparin | 2992 |
| 16124234 | Sanan Sjugirov | 2988 |
| 16123090 | Andrew Tang | 2986 |
| 16124424 | Anton Korobov | 2984 |
| 16164551 | Nodirbek Yakubboev | 2972 |
| 16130768 | Baadur Jobava | 2969 |
| 16123062 | Sergei Zhigalko | 2968 |
| 16131586 | Christopher Woojin Yoo | 2968 |
Nakamura once again tops our dataset. He reached a peak rating of 3223, which supports the idea that win count is a reasonable measure of competitiveness and skill on chess.com.
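Finding each player's highest single-game rating, together with the game in which it occurred, can be sketched as below. The field names and sample games are hypothetical.

```python
# Hypothetical games with both players' ratings at the time of the game.
games = [
    {"game_id": 1, "white": "A", "white_elo": 3000, "black": "B", "black_elo": 2900},
    {"game_id": 2, "white": "B", "white_elo": 2950, "black": "A", "black_elo": 2980},
    {"game_id": 3, "white": "C", "white_elo": 2800, "black": "A", "black_elo": 3010},
]

# Map each player to (game_id, rating) for their highest rating seen so far.
peak = {}
for game in games:
    for side in ("white", "black"):
        name, elo = game[side], game[f"{side}_elo"]
        if name not in peak or elo > peak[name][1]:
            peak[name] = (game["game_id"], elo)

# Rank players by peak rating, highest first.
ranking = sorted(
    ((name, game_id, elo) for name, (game_id, elo) in peak.items()),
    key=lambda entry: entry[2],
    reverse=True,
)
print(ranking)
```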
Lowest Move Count
Chess games typically last around <a href="https://www.alexcrompton.com/blog/time-thoughts-chess">40 moves</a>; however, some games last much longer or much shorter. Below is a summary table of the games with the lowest number of moves.
| Game ID | Date | Move Count |
|---|---|---|
| 16064010 | 2021.06.21 | 5 |
| 16118788 | 2021.09.18 | 5 |
| 16153863 | 2021.11.08 | 5 |
| 16129536 | 2021.10.02 | 6 |
| 16152763 | 2021.11.08 | 6 |
| 16025283 | 2021.03.15 | 6 |
| 16044044 | 2021.05.23 | 6 |
| 16093332 | 2021.08.11 | 7 |
| 16160339 | 2021.11.16 | 7 |
| 16162079 | 2021.12.04 | 7 |
| 15793131 | 2021.02.14 | 7 |
| 16050844 | 2021.05.30 | 7 |
| 16046924 | 2021.05.27 | 7 |
| 16063154 | 2021.06.19 | 8 |
| 16072304 | 2021.07.06 | 8 |
Here, we can clearly see that some of the grandmaster games in our dataset were exceedingly short. Some opening traps allow a player to checkmate the opponent within the first few moves; however, that is unlikely to be the case here. On chess.com, players can resign at any point, so these games may have ended quickly because one of the players had to leave. There are not many of these games, however, which suggests that high-rated players are reluctant to leave games early because doing so could hurt their ELO and, in turn, their reputation.
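The date column in this dataset appears in two different layouts (for example `2021.06.21` and `Mon Mar 15 2021`), so a table of shortest games is easier to build once the dates are parsed into one type. The sketch below assumes those are the only two layouts and that weekday and month names are in English; the helper name `parse_date` is hypothetical.

```python
from datetime import date, datetime

# Assumption: these two layouts cover every date string in the data.
DATE_FORMATS = ("%Y.%m.%d", "%a %b %d %Y")


def parse_date(text: str) -> date:
    """Parse a date string written in any of the known layouts."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


# (game ID, raw date, move count) taken from rows of the table above.
games = [
    (16093332, "2021.08.11", 7),
    (16025283, "Mon Mar 15 2021", 6),
    (16064010, "2021.06.21", 5),
]

# Sort by move count and keep the shortest games, with normalized dates.
shortest = [
    (game_id, parse_date(raw), moves)
    for game_id, raw, moves in sorted(games, key=lambda g: g[2])[:2]
]
print(shortest)
```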
Biggest Upsets
It is pretty rare for a player to defeat someone rated far above them, because ELO is an indicator of skill. However, it still happens occasionally. Here is a table of the biggest upsets in these grandmasters’ chess.com games.
| Game ID | Date | Underdog | Player ELO | Opponent ELO | ELO Difference |
|---|---|---|---|---|---|
| 15790435 | 2021.01.09 | Sanan Sjugirov | 26 | 2649 | 2623 |
| 15791483 | 2021.01.09 | Sanan Sjugirov | 26 | 2649 | 2623 |
| 15791957 | 2021.01.07 | Julien Song | 23 | 2619 | 2596 |
| 15791981 | 2021.01.06 | Max Warmerdam | 25 | 2552 | 2527 |
| 15792057 | 2021.01.06 | Max Warmerdam | 25 | 2386 | 2361 |
| 15791955 | 2021.01.15 | Atulya Shetty | 24 | 2357 | 2333 |
| 16137734 | 2021.10.15 | Leon Luke Mendonca | 1391 | 2542 | 1151 |
| 16060066 | 2021.06.10 | Leon Luke Mendonca | 1391 | 2497 | 1106 |
| 16127070 | 2021.09.25 | L M S T De Silva | 1621 | 2581 | 960 |
| 16126672 | 2021.09.25 | Enkh-Amgalan Amgalantengis | 1716 | 2662 | 946 |
For a player to have a double-digit rating, they would have to be pretty terrible at the game, especially considering that each player starts at a rating of 800 immediately upon creating an account. Players with widely differing ELOs aren’t normally matched with each other either, meaning that these players either willingly chose to play each other or faced each other in a chess.com tournament. I believe the games in which double-digit-rated players defeated Grandmasters were either played accidentally or coordinated as an experiment between the two players, and they are therefore outliers. Either way, these games should not be factored into our overall analysis of ELO’s impact on game outcomes, as they likely did not occur under normal circumstances.
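One way to list upsets while setting aside the implausibly low ratings is sketched below. The cutoff of 100 is an assumed threshold chosen only to exclude double-digit ratings, and the field names and sample games are hypothetical.

```python
# Assumption: ratings below this value are treated as outliers and skipped.
MIN_PLAUSIBLE_ELO = 100


def find_upsets(games, min_elo=MIN_PLAUSIBLE_ELO):
    """Return (game_id, winner_elo, loser_elo, gap) for wins by the lower-rated side."""
    upsets = []
    for g in games:
        if g["result"] == "1-0":
            winner, loser = g["white_elo"], g["black_elo"]
        elif g["result"] == "0-1":
            winner, loser = g["black_elo"], g["white_elo"]
        else:
            continue  # draws are not upsets in this sketch
        if min(winner, loser) < min_elo:
            continue  # implausible rating, excluded as an outlier
        if winner < loser:
            upsets.append((g["game_id"], winner, loser, loser - winner))
    # Largest rating gap first.
    return sorted(upsets, key=lambda u: u[3], reverse=True)


games = [
    {"game_id": 1, "result": "1-0", "white_elo": 1391, "black_elo": 2542},
    {"game_id": 2, "result": "0-1", "white_elo": 2649, "black_elo": 26},
    {"game_id": 3, "result": "1-0", "white_elo": 2700, "black_elo": 2500},
    {"game_id": 4, "result": "1/2-1/2", "white_elo": 2000, "black_elo": 2600},
    {"game_id": 5, "result": "0-1", "white_elo": 2581, "black_elo": 1621},
]
upsets = find_upsets(games)
print(upsets)
```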
Game Outcome Classification Based on Ratings
I think we have enough data to observe how the ratings of White and Black affect each side’s chances of winning. I plotted White ELO against Black ELO to see how the two ratings relate to who wins and who loses. I excluded major outliers.
Based on our data, we can see that ELO is a pretty good indicator of whether White or Black will win. The plot is overwhelmingly green because White simply wins more often. However, higher-rated Black players do tend to defeat lower-rated White players, and the opposite is true as well. Where White and Black are similarly rated, the scatterplot shows many draws and many White wins.
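The same pattern can be checked numerically by grouping games according to which side is rated higher and counting outcomes in each group. This is a rough sketch: the 100-point margin that separates "favored" from "evenly matched" is an arbitrary assumption, and the sample games are hypothetical.

```python
from collections import Counter, defaultdict


def rating_bucket(white_elo: int, black_elo: int, margin: int = 100) -> str:
    """Label a game by which side is rated higher (margin is an assumed cutoff)."""
    diff = white_elo - black_elo
    if diff >= margin:
        return "White favored"
    if diff <= -margin:
        return "Black favored"
    return "Evenly matched"


# Hypothetical games: (White ELO, Black ELO, result).
games = [
    (2700, 2500, "1-0"),
    (2500, 2700, "0-1"),
    (2600, 2610, "1/2-1/2"),
    (2650, 2600, "1-0"),
    (2400, 2650, "1-0"),
]

outcomes = defaultdict(Counter)
for white_elo, black_elo, result in games:
    outcomes[rating_bucket(white_elo, black_elo)][result] += 1

for bucket, counter in outcomes.items():
    print(bucket, dict(counter))
```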
Conclusion
I can conclude that ELO is a reasonably good measure of how skilled players are. White, however, tends to defeat Black when high-rated players face one another. There are clear limitations to this dataset; for example, we only have a subset of game data for players in these rating categories. Nevertheless, I believe my conclusions are pretty solid.