Chess Analysis

Introduction

The game of chess has been around for 1500 years and is still one of the greatest puzzles known to man. The endless possibilities and combinations make it impossible to “solve”, but players can certainly give themselves an advantage by knowing which situations help and hurt them. I’ve been playing chess since I was in fourth grade and have always loved the strategy behind it. I wanted to use this project as an opportunity to improve my abilities and become a better player. The main thing I want to identify is how to set myself up for success. What sorts of moves, openings, and game types give me the best chance of winning the game?

Background

For those unfamiliar with the game of chess, the object of the game is to move your pieces around and “capture” the opponent’s king. It’s a game that requires a lot of critical thinking and, in higher-level games, knowledge of chess theory. Due to the nearly limitless combinations of moves a chess game can involve, knowing all the textbook patterns and their variations is next to impossible. An important piece of information to know, since I will be focusing a lot of my analysis on it, is that there is a white side and a black side. White always moves first and black follows. Since white always goes first, it is more of the attacking position, whereas black is more of the defensive position.

Data Set

The data I am using for this project is a data frame that contains roughly 500,000 games of chess. It is pulled from lichess.org and contains games played during July 2016.

Data Dictionary

Event- Game type
White- The player ID of white
Black- The player ID of black
Result- 1-0 indicates that white won and 0-1 indicates that black won
UTCDate- Universal date game was played
UTCTime- Universal time game was played
WhiteElo- Elo rating of white (chess rating system, higher is better)
BlackElo- Elo rating of black
WhiteRatingDiff- How many Elo points white won/lost after the game
BlackRatingDiff- How many Elo points black won/lost after the game
ECO- Opening in ECO coding
Opening- Name of opening
TimeControl- Time of game in seconds
Termination- Reason the game ended
AN- Sequence of game moves in movetext format
first_move- The first move of the game
move_count- How many moves were made in the game
simple_opening- The base opening, without the variation of the opening
winner- The color of the winner
player_rating- The two players’ Elo ratings averaged
Category- What level of player the player_rating is according to the US Chess Federation (USCF) standards
Elo_Difference- The difference in the Elo rating of the two players
Upset- Game is classified as an upset if the lower-rated player wins when the Elo_Difference is 75 or greater
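To make the derived columns concrete, here is a minimal Python sketch of how winner, player_rating, Elo_Difference, and Upset could be computed from the raw fields. The function name derive_columns is hypothetical, and the sketch assumes that Elo_Difference is an absolute gap and that an upset means the lower-rated side won. It is not necessarily how the columns were built for this project. Category is left out because it depends on the USCF class cutoffs.

```python
def derive_columns(game, upset_threshold=75):
    """Return a copy of one game record with the derived columns added."""
    white_elo, black_elo = game["WhiteElo"], game["BlackElo"]

    # Anything other than a decisive result is treated as a draw.
    winner = {"1-0": "white", "0-1": "black"}.get(game["Result"], "draw")

    # The lower-rated side, or None when the ratings are equal.
    if white_elo < black_elo:
        underdog = "white"
    elif black_elo < white_elo:
        underdog = "black"
    else:
        underdog = None

    # Assumption: the difference is stored as an absolute value.
    elo_difference = abs(white_elo - black_elo)
    return {
        **game,
        "winner": winner,
        "player_rating": (white_elo + black_elo) / 2,
        "Elo_Difference": elo_difference,
        # Assumption: an upset is a win by the lower-rated side.
        "Upset": elo_difference >= upset_threshold and winner == underdog,
    }


# Made-up record for illustration only, not a row from the data set.
example = derive_columns({"Result": "0-1", "WhiteElo": 1600, "BlackElo": 1500})
print(example["winner"], example["player_rating"], example["Upset"])
```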

White vs. Black

Now I want to see how the different sides stack up against each other. Is one side statistically better than the other? What different strategies must one use when playing each color?

Most of the time, the side with the higher Elo wins. However, when black is the lower-rated side, it beats a higher-rated white at a slightly higher rate than white beats a higher-rated black.

Black is pulling off upsets a little more often than white is.
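One way to measure this comparison is to count, for each color, how often it was the lower-rated side by at least some rating gap, and how many of those games it went on to win. The sketch below is a rough illustration, not the exact calculation behind the graphs. The function name upset_rates is hypothetical, and the handful of games are made up purely to show the mechanics.

```python
from collections import Counter


def upset_rates(games, threshold=75):
    """Share of games each color wins when it is the underdog by >= threshold Elo."""
    chances = Counter()  # games where this color was the lower-rated side
    upsets = Counter()   # of those games, the ones the lower-rated side won
    for game in games:
        gap = game["WhiteElo"] - game["BlackElo"]
        if gap == 0 or abs(gap) < threshold:
            continue  # no clear underdog in this game
        underdog = "white" if gap < 0 else "black"
        chances[underdog] += 1
        winning_result = "1-0" if underdog == "white" else "0-1"
        if game["Result"] == winning_result:
            upsets[underdog] += 1
    return {color: upsets[color] / chances[color] for color in chances}


# Made-up games for illustration only.
toy_games = [
    {"Result": "0-1", "WhiteElo": 1700, "BlackElo": 1500},  # black underdog wins
    {"Result": "1-0", "WhiteElo": 1700, "BlackElo": 1500},  # black underdog loses
    {"Result": "1-0", "WhiteElo": 1400, "BlackElo": 1600},  # white underdog wins
    {"Result": "0-1", "WhiteElo": 1400, "BlackElo": 1600},  # white underdog loses
    {"Result": "0-1", "WhiteElo": 1400, "BlackElo": 1600},  # white underdog loses
    {"Result": "1-0", "WhiteElo": 1500, "BlackElo": 1510},  # gap below threshold
]
rates = upset_rates(toy_games)
print(rates)
```

Passing threshold=1 counts any win by the lower-rated side, which corresponds to the higher-Elo comparison rather than the 75-point upset definition.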

When a game ends due to a player running out of time, more often than not, it is white that ran out of time.
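The time-out comparison can be sketched the same way: keep only the games that ended on time and look at who lost them. This assumes the Termination column marks those games with the value "Time forfeit", which is the label Lichess uses in its game exports, and that the loser is the side that ran out of time. The function name timeout_losers and the sample games are hypothetical.

```python
from collections import Counter


def timeout_losers(games, label="Time forfeit"):
    """Share of decisive time-forfeit games lost by each color."""
    losers = Counter()
    for game in games:
        if game["Termination"] != label:
            continue
        if game["Result"] == "1-0":
            losers["black"] += 1  # white won, so black's clock ran out
        elif game["Result"] == "0-1":
            losers["white"] += 1  # black won, so white's clock ran out
        # Other results (for example draws) are skipped.
    total = sum(losers.values())
    return {color: count / total for color, count in losers.items()} if total else {}


# Made-up games for illustration only.
toy_timeouts = [
    {"Result": "0-1", "Termination": "Time forfeit"},
    {"Result": "0-1", "Termination": "Time forfeit"},
    {"Result": "1-0", "Termination": "Time forfeit"},
    {"Result": "1-0", "Termination": "Normal"},
    {"Result": "1/2-1/2", "Termination": "Time forfeit"},
]
shares = timeout_losers(toy_timeouts)
print(shares)
```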

Playstyle

So far, everything has come back in favor of black. While the margins may not be huge, black is consistently winning when stacked up against white in different categories. So what gives black this advantage? I mentioned earlier that white and black naturally have two different play styles due to white always going first. So, I want to look at how the style of game affects the outcome.

A clear trend is shown: the longer the game gets, the more black is favored. The graph starts in favor of white, which makes sense. Since white goes first, they have the ability to immediately push all out to try to get an early victory. However, the graph then flips in favor of black, because an early all-out push leaves white vulnerable. If black can withstand the initial push, they can launch a quick counterattack to get the early victory. After this, it levels out and we enter the mid game, where neither side has much of an advantage. But as the game progresses, the odds shift in favor of black.
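A trend like this can be computed by grouping games into buckets of move_count and taking black’s share of the decisive games in each bucket. The sketch below assumes the winner and move_count columns from the data dictionary. The bucket width of 20 moves and the function name black_share_by_length are arbitrary choices for illustration, and the games are made up.

```python
from collections import defaultdict


def black_share_by_length(games, bin_width=20):
    """Black's share of decisive games, grouped into buckets of move_count."""
    tallies = defaultdict(lambda: {"white": 0, "black": 0})
    for game in games:
        if game["winner"] not in ("white", "black"):
            continue  # draws say nothing about which side is favored
        bucket_start = (game["move_count"] // bin_width) * bin_width
        tallies[bucket_start][game["winner"]] += 1
    return {
        start: tally["black"] / (tally["white"] + tally["black"])
        for start, tally in sorted(tallies.items())
    }


# Made-up games for illustration only.
toy_lengths = [
    {"move_count": 10, "winner": "white"},
    {"move_count": 15, "winner": "white"},
    {"move_count": 18, "winner": "black"},
    {"move_count": 45, "winner": "black"},
    {"move_count": 50, "winner": "white"},
    {"move_count": 90, "winner": "black"},
    {"move_count": 95, "winner": "draw"},
]
share = black_share_by_length(toy_lengths)
print(share)
```

Narrow buckets show more detail but get noisy where few games fall in them, so it is worth checking how many games sit in each bucket before reading much into the long-game end of the curve.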

How to Play White

Now that we know the strengths and weaknesses of each side, it’s all about utilizing them. So, if we are playing as white, it’s in our best interest to not let the game drag on. To incorporate this into our plan of attack, we need to look for strategies that result in checkmates earlier in the game.

The first graph looks at the openings that, on average, have the fewest moves, meaning the games end quickly. However, we don’t want to be reckless and just look for the games with the fewest moves, because that could result in us losing just as fast. We mainly want to look at the games where white is winning. The second graph looks at the same parameters, except it is filtered on the games that white won.
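The ranking behind both graphs can be sketched as an average of move_count per simple_opening, optionally restricted to games won by one color. The function name shortest_openings and the min_games cutoff are assumptions added here so that openings with only a few games do not top the list. The numbers are made up and are not results from the data set.

```python
from collections import defaultdict
from statistics import mean


def shortest_openings(games, winner=None, min_games=2, top=3):
    """Openings ranked by average move_count, shortest first."""
    move_counts = defaultdict(list)
    for game in games:
        if winner is not None and game["winner"] != winner:
            continue  # keep only games won by the requested color
        move_counts[game["simple_opening"]].append(game["move_count"])
    averages = {
        name: mean(counts)
        for name, counts in move_counts.items()
        if len(counts) >= min_games  # drop openings with too few games
    }
    return sorted(averages.items(), key=lambda item: item[1])[:top]


# Made-up games for illustration only.
toy_openings = [
    {"simple_opening": "King's Pawn", "move_count": 20, "winner": "white"},
    {"simple_opening": "King's Pawn", "move_count": 30, "winner": "white"},
    {"simple_opening": "King's Pawn", "move_count": 10, "winner": "black"},
    {"simple_opening": "Sicilian Defense", "move_count": 60, "winner": "black"},
    {"simple_opening": "Sicilian Defense", "move_count": 40, "winner": "white"},
    {"simple_opening": "Sicilian Defense", "move_count": 50, "winner": "white"},
    {"simple_opening": "Queen's Gambit", "move_count": 35, "winner": "white"},
]
all_games = shortest_openings(toy_openings)
white_wins = shortest_openings(toy_openings, winner="white")
print(all_games)
print(white_wins)
```

Sorting in the opposite order and filtering on winner="black" gives the long-game version of the same ranking.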

I am taking a closer look at the King’s Pawn opening because it’s an opening from that list that I know will have a large sample size. It proves to be an extremely effective opening for white.

How to Play Black

For black, we know we want to play conservatively and stretch the game out. We want to identify strategies that force a lot of moves and make our opponents think.

One way of going about this is to implement intricate defenses that make our opponent think and eat up their clock. Among games that ended because a side ran out of time, these were the top openings.

This graph is similar to the one we looked at for white, except instead of looking for the fewest moves, it looks for the most moves. The second graph is filtered on only the games that black won.

While this opening does not have a great sample size, we can still see black dominates this particular opening.

Chess.com Scrape

                 word
opening_name      abilities advanced advantages c4 lead pawn positionally
  Catalan Opening         1        1          1  1    1    1            1
                 word
opening_name      pressure requires subtle technical term vulnerable win
  Catalan Opening        1        1      1         1    1          1   1

Chess.com is the most popular chess website in the world. They offer games, lessons, forums, and all other things chess related. We can use their resources on the different chess openings to help inform our decision. By scraping the pros and cons section for each opening, we can get a feel for how that opening functions. For example, let’s look at the Catalan Opening. Some of the key words pulled from that scrape were “lead”, “pressure”, “vulnerable”, and “win”. Judging by those key words, one can assume it’s a more offensive-minded opening, which is perfect for the white side. It seems to be an aggressive opening that puts a lot of pressure on your opponent early in the game; however, it can leave you vulnerable if the attack fails. This is just what we’re looking for in an opening for white.
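The key-word step can be sketched without the scraping itself: once the pros and cons text for an opening is in hand, split it into lowercase words, drop common filler words, and count what is left. The function name key_words and the small stop-word set are assumptions for illustration, and the sample sentence is placeholder text written for this example, not text taken from Chess.com.

```python
import re
from collections import Counter

# A tiny hand-picked stop-word list (an assumption; a real list would be longer).
STOP_WORDS = {"a", "an", "and", "can", "if", "in", "is", "it",
              "of", "on", "the", "to", "you", "your"}


def key_words(text, stop_words=STOP_WORDS):
    """Count the words in a block of text, ignoring case and stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(token for token in tokens if token not in stop_words)


# Placeholder text for illustration only.
sample_text = (
    "Puts early pressure on the opponent and can lead to a lasting advantage. "
    "Can leave you vulnerable if the pressure fails."
)
counts = key_words(sample_text)
print(counts.most_common(3))
```

Keeping digits in the token pattern lets move names such as “c4” survive as key words, as they do in the table above.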

Conclusion

Using this analysis, I can now use statistics to put myself in the best situations and give myself the best chance of winning. I know there is a statistical advantage to playing as black, and I know how to utilize that advantage. To recap what I have learned: when given the choice, always choose black. People inherently want to play fast and aggressive because it’s more fun that way, and black can take advantage of that. Use black to set up a strong defense and weather the storm. Drag the game out, and wait for white to make the mistake. If you’re white, focus on attacking but don’t be reckless. Set up an attack early and control the board. Look for holes in black’s defense and exploit them.