Required packages

library(dplyr)
library(tidyr)
library(readr)
library(knitr)
library(kableExtra)
library(stringr)
library(lubridate)
library(ggplot2)
library(outliers)
library(caret)
setwd("C:\\Users\\Parikshit\\Documents\\Master Of Analytics\\Semester 3\\Data Preprocessing")

Executive Summary

Having played professional chess for quite some time now, it was one of the motivations for me to take up two chess data sets, merge them and preprocess them thereby making it amenable for further processing and analysis. The end goal is also to put together a win summary of all the players (who have won atleast one game) in the data who took part in the 2014 world rapid chess championships.

In this assignment, the data is at the game level and I have gone forward merging it with the opening table to get a sense of opening classification a game came under. Importantly, to keep ambiguity at bay, I have prefixed the column names of white and black moves with the letter “op”. Also, I have converted the data to a tidy format by gathering the columns with chess moves into two columns namely ‘color_turn’ and ‘move’.

I went on to split the results column into white player’s and black player’s results. Even the round and table number features are created from the round column based on the delimiter “.”

In addition, I have also examined for outliers in the numeric variables present in the game.opening (merged) data set.

Also, I have range normalised the ELO ratings of the players to facilitate talent comparison among the players.

Also, some of the challenges that I faced in this exercise are with regards to dealing with NA values. NA values were a result of left joining games table with the opening table. Though the presence of NA values were justified, for representational purposes, I have replaced it with a hyphen(“-”).

Data

The data set used here is of the 2014 world rapid chess championships. Having played chess professionally for quite some time now has motivated me enough to pick this data set and get it to a form where I can quickly calculate the number of points a player has gained or scored in this tournament.

The data set resided on a remote SQL server database named Chess [https://relational.fit.cvut.cz/dataset/Chess]. The data was accessed through a MySQL workbench installed on my desktop machine and later exported and stored as a csv file in my working directory.

With regards to the data sets: They were already relational in nature with two data sets namely game and opening. game table consists of game level details and a clear description of the data is as below:

The opening table consists of chess opening level details and a clear description of the same is as below:

I left join the game table with the opening table to get the type of opening played in every game at the same time retaining all the entries in the game table. Also, before doing so, to keep ambiguity at bay, I prefix the column names of the first two pairs of white player’s and black player’s moves (in the opening table) with the letter “op”.

game<-read.csv("chess_game.csv") #Reading game table
opening<-read.csv("chess_opening.csv") #Reading opening table

kable(head(game), caption = "Game Table")
Game Table
game_id opening_id event site event_date round white black game_result ECO whiteElo BlackElo opening w1 b1 w2 b2 w3 b3 w4 b4 w5 b5 w6 b6 w7 b7 w8 b8 w9 b9 w10 b10
1 713 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2.1 Van Wely, Loek Caruana, Fabiano 1/2-1/2 A41 2654 2791 Rat Defense d4 d6 Nf3 Nf6 c4 g6 g3 Bg7 Bg2 O-O O-O Bf5 Nh4 Be4 f3 Bc6 Nc3 e5 e4 Ne8
2 1019 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2.2 Guseinov, Gadir Carlsen, Magnus 1/2-1/2 C08 2613 2881 King’s Pawn Opening e4 e6 d4 d5 Nd2 c5 exd5 exd5 Bb5+ Nc6 Ngf3 cxd4 O-O Bd6 Re1+ Nge7 Nxd4 O-O h3 a6
3 513 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2.3 Mamedyarov, Shakhriyar Sargissian, Gabriel 1-0 A45 2743 2686 Trompowsky Attack d4 Nf6 Bg5 e6 e4 h6 Bxf6 Qxf6 Nc3 Bb4 Qd2 d6 a3 Bxc3 Qxc3 Nc6 Nf3 O-O Bd3 dxe5
4 152 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2.4 Yu, Yangyi Svidler, Peter 1-0 B48 2675 2753 Sicilian Defense e4 c5 Nf3 e6 d4 cxd4 Nxd4 Nc6 Nc3 Qc7 Be3 a6 Qd2 Nf6 O-O-O Be7 f4 b5 exf6 bxc3
5 818 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2.5 Milov, Vadim Karjakin, Sergey 0-1 E46 2622 2771 Indian Game d4 Nf6 c4 e6 Nc3 Bb4 e3 O-O Nge2 d5 a3 Be7 cxd5 exd5 b4 c6 Ng3 Be6 O-O b5
6 526 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2.6 Safarli, Eltaj Korobov, Anton 1/2-1/2 C25 2649 2680 Vienna Game e4 e5 Nc3 Bc5 Nf3 d6 Na4 Bb6 Nxb6 axb6 d4 exd4 Qxd4 Nf6 e5 Nc6 Bb5 dxe5 Bf4 O-O
kable(head(opening), caption = "Opening Table")
Opening Table
opening_id name code w1 b1 w2 b2 w3 b3 w4 b4 variation
9 Scandinavian Defense - e4 d5 exd5 Qxd5 Nc3 Qa5 d4 e5 Anderssen Counterattack
10 Scandinavian Defense - e4 d5 exd5 Qxd5 Nc3 Qd6 d4 Nf6 Bronstein Variation
11 Scandinavian Defense - e4 d5 exd5 Qxd5 Nc3 Qa5 d4 Nf6 Classical Variation
12 Scandinavian Defense - e4 d5 exd5 Qxd5 Nc3 Qd6 - - Gubinsky-Melts Defense
13 Scandinavian Defense - e4 d5 exd5 Nf6 c4 e6 - - Icelandic-Palme Gambit
14 Scandinavian Defense - e4 d5 exd5 Nf6 d4 Nxd5 c4 Nb4 Kiel Variation
#Padding the opening moves with a prefix "op"
names(opening)[4:11]<-str_pad(names(opening)[4:11],width=3,side="left",pad=c("p"))
names(opening)[4:11]<-str_pad(names(opening)[4:11],width=4,side="left",pad=c("o"))

#Left joining game table with opening table on opening_id
game.opening<-merge(game,opening,by.x="opening_id",by.y="opening_id",all.x=TRUE)

kable(head(game.opening[c(1:3,37:40)]),caption = "Merged Table - Game table left joined with Opening table")
Merged Table - Game table left joined with Opening table
opening_id game_id event opb1 opw2 opb2 opw3
19 47 FIDE World Rapid 2014 d5 exd5 Qxd5 -
28 7 FIDE World Rapid 2014 e5 Nf3 Nc6 d4
30 109 FIDE World Rapid 2014 e5 Nf3 Nc6 d4
41 285 FIDE World Rapid 2014 e5 Nf3 Nc6 d4
43 162 FIDE World Rapid 2014 e5 Nf3 Nc6 d4
43 211 FIDE World Rapid 2014 e5 Nf3 Nc6 d4

Understand

Here I check for the structure of my data set and apply appropriate type conversions. The structure of game.opening data set is an R data frame.

#Checking the structure of the data frame game.opening
str(game.opening)
## 'data.frame':    295 obs. of  44 variables:
##  $ opening_id : int  19 28 30 41 43 43 86 89 94 94 ...
##  $ game_id    : int  47 7 109 285 162 211 13 118 94 54 ...
##  $ event      : Factor w/ 1 level "FIDE World Rapid 2014": 1 1 1 1 1 1 1 1 1 1 ...
##  $ site       : Factor w/ 1 level "Dubai UAE": 1 1 1 1 1 1 1 1 1 1 ...
##  $ event_date : Factor w/ 2 levels "2014-06-16","2014-06-17": 1 1 2 2 2 2 1 2 2 1 ...
##  $ round      : num  4.13 2.7 6.41 10.4 7.44 9.16 2.13 6.51 6.26 5.3 ...
##  $ white      : Factor w/ 104 levels "Adly, Ahmed",..: 102 69 4 63 26 101 55 15 39 74 ...
##  $ black      : Factor w/ 105 levels "Adly, Ahmed",..: 50 51 6 52 4 104 74 58 3 43 ...
##  $ game_result: Factor w/ 3 levels "0-1","1-0","1/2-1/2": 1 2 1 3 2 3 1 3 2 1 ...
##  $ ECO        : Factor w/ 137 levels "A01","A04","A05",..: 75 75 75 106 107 102 39 39 59 59 ...
##  $ whiteElo   : int  2675 2730 2396 2586 2662 2715 2659 2299 2682 2685 ...
##  $ BlackElo   : int  2712 2659 2635 2546 2396 2546 2685 2632 2505 2771 ...
##  $ opening    : Factor w/ 27 levels "English Opening",..: 18 18 18 19 19 19 20 20 20 20 ...
##  $ w1         : Factor w/ 5 levels "b3","c4","d4",..: 4 4 4 3 3 3 4 4 4 4 ...
##  $ b1         : Factor w/ 9 levels "c5","c6","d5",..: 5 5 5 3 3 3 1 1 1 1 ...
##  $ w2         : Factor w/ 13 levels "b3","Bb2","Bc4",..: 13 13 13 6 6 6 5 5 13 13 ...
##  $ b2         : Factor w/ 17 levels "a6","b6","Bc5",..: 16 16 16 12 12 12 8 17 9 9 ...
##  $ w3         : Factor w/ 29 levels "b3","b4","Bb2",..: 15 15 15 25 25 25 21 20 15 15 ...
##  $ b3         : Factor w/ 35 levels "a6","b5","b6",..: 23 23 23 32 15 15 35 30 16 16 ...
##  $ w4         : Factor w/ 45 levels "a3","a4","b3",..: 42 42 42 38 24 24 22 22 42 42 ...
##  $ b4         : Factor w/ 52 levels "a5","a6","b5",..: 10 7 38 17 37 11 35 18 38 38 ...
##  $ w5         : Factor w/ 60 levels "a3","a4","b3",..: 49 19 49 16 44 11 44 44 42 42 ...
##  $ b5         : Factor w/ 62 levels "a5","a6","axb6",..: 23 10 23 32 53 39 17 35 2 2 ...
##  $ w6         : Factor w/ 71 levels "a3","a4","b3",..: 11 13 34 2 63 38 12 9 41 13 ...
##  $ b6         : Factor w/ 70 levels "a5","a6","b5",..: 56 7 67 6 10 57 30 49 40 39 ...
##  $ w7         : Factor w/ 81 levels "a3","a4","b3",..: 67 57 75 37 13 56 31 7 45 56 ...
##  $ b7         : Factor w/ 76 levels "a5","a6","b4",..: 63 39 48 45 32 63 35 31 54 15 ...
##  $ w8         : Factor w/ 99 levels "a3","a4","b4",..: 58 23 31 25 25 78 57 72 46 12 ...
##  $ b8         : Factor w/ 91 levels "a5","a6","axb6",..: 1 3 7 78 2 6 8 37 5 77 ...
##  $ w9         : Factor w/ 105 levels "a3","a4","a5",..: 95 65 4 84 84 13 84 40 1 84 ...
##  $ b9         : Factor w/ 107 levels "a4","a5","a6",..: 42 77 54 24 90 6 91 28 8 92 ...
##  $ w10        : Factor w/ 116 levels "a3","a4","axb5",..: 102 33 51 112 41 109 68 97 47 74 ...
##  $ b10        : Factor w/ 115 levels "a4","a5","a5+",..: 22 105 97 98 53 77 76 106 45 90 ...
##  $ name       : Factor w/ 41 levels "Gruenfeld Defense",..: 24 25 25 25 25 25 27 27 27 27 ...
##  $ code       : Factor w/ 1 level "-": 1 1 1 1 1 1 1 1 1 1 ...
##  $ opw1       : Factor w/ 8 levels "-","c3","d4",..: 5 5 5 5 5 5 5 5 5 5 ...
##  $ opb1       : Factor w/ 14 levels "-","a6","b5",..: 7 9 9 9 9 9 5 5 5 5 ...
##  $ opw2       : Factor w/ 20 levels "-","a3","b3",..: 14 20 20 20 20 20 20 20 8 8 ...
##  $ opb2       : Factor w/ 28 levels "-","a6","b5",..: 28 22 22 22 22 22 22 22 14 14 ...
##  $ opw3       : Factor w/ 33 levels "-","a3","b3",..: 1 18 18 18 18 18 18 18 24 24 ...
##  $ opb3       : Factor w/ 45 levels "-","a6","b5",..: 1 26 26 26 26 26 16 16 45 45 ...
##  $ opw4       : Factor w/ 44 levels "-","a3","b4",..: 1 37 37 37 6 6 37 37 1 1 ...
##  $ opb4       : Factor w/ 55 levels "-","a6","b5",..: 1 10 10 10 1 1 36 36 1 1 ...
##  $ variation  : Factor w/ 921 levels "-"," Accelerated Gurgenidze ",..: 502 160 507 740 782 782 7 472 33 33 ...
#Coercing opening_id to a factor data type
game.opening$opening_id<-as.factor(game.opening$opening_id)

#Splitting the round column into round and table number column

game.opening<-game.opening%>%separate(round,into=c("round","table_number"),sep="\\.")

#Converting the round and table number column to a factor type
#Ordering levels of round column
game.opening$round<-factor(game.opening$round,
                           levels=c("1","2","3","4","5",                                      "6","7","8","9","10"),
                           ordered = TRUE)

game.opening$table_number<-factor(game.opening$table_number)

#Converting column: players who played white to a character column
game.opening$white<-as.character(game.opening$white)

#similarly with black
game.opening$black<-as.character(game.opening$black)

#Coercing game_id to a character data type
game.opening$game_id<-as.character(game.opening$game_id)

#Coercing event to a character data type
game.opening$event<-as.character(game.opening$event)

#Coercing site to a character data type
game.opening$site<-as.character(game.opening$site)

#Coercing the moves columns to a character vector
game.opening[15:34]<-sapply(game.opening[15:34],as.character)
game.opening[37:44]<-sapply(game.opening[37:44],as.character)

#Coercing ELO ratings of players to a numeric data type
game.opening$whiteElo<-as.numeric(game.opening$whiteElo)
game.opening$BlackElo<-as.numeric(game.opening$BlackElo)

kable(head(game.opening[c('round','table_number')]), 'latex',caption = "Game.Opening table after manipulating the round column")

Tidy & Manipulate Data I

The first few pairs of moves (white side and black side) that unfolded during the game (columns 15 to 34) and as well as the first few pairs of opening moves according to the classical chess books (columns 37 to 44) are all in separate columns of the game.opening data frame. This contributes to the unitidiness of the data set.

Keeping in mind the above and to make the data set amenable for further analyses, I have gathered the above mentioned columns into two columns (‘color_turn’ and ‘move’).

To understand and see the result of the above exercise, I would like to print out the first 10 rows of the ‘color_turn’ and ‘move’ column from the game.opening data frame.

game.opening<-game.opening%>%gather(c(15:34,37:44),key="color_turn",value="move")

kable(head(game.opening,10),
      caption="Tidy Game.Opening Table")
Tidy Game.Opening Table
opening_id game_id event site event_date round table_number white black game_result ECO whiteElo BlackElo opening name code variation color_turn move
19 47 FIDE World Rapid 2014 Dubai UAE 2014-06-16 4 13 Yu, Yangyi Le, Quang Liem 0-1 C45 2675 2712 Scotch Game Scandinavian Defense - Mieses-Kotroc Variation w1 e4
28 7 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2 7 Nepomniachtchi, Ian Lenic, Luka 1-0 C45 2730 2659 Scotch Game Scotch Game - Classical Variation w1 e4
30 109 FIDE World Rapid 2014 Dubai UAE 2014-06-17 6 41 Al Qudaimi, Basheer Amin, Bassem 0-1 C45 2396 2635 Scotch Game Scotch Game - Millennium Variation w1 e4
41 285 FIDE World Rapid 2014 Dubai UAE 2014-06-17 10 4 Moradiabadi, Elshan Lu, Shanglei 1/2-1/2 D44 2586 2546 Semi-Slav Defense Scotch Game - Romanishin Variation w1 d4
43 162 FIDE World Rapid 2014 Dubai UAE 2014-06-17 7 44 Fedoseev, Vladimir3 Al Qudaimi, Basheer 1-0 D45 2662 2396 Semi-Slav Defense Scotch Game - Scotch Gambit w1 d4
43 211 FIDE World Rapid 2014 Dubai UAE 2014-06-17 9 16 Wojtaszek, Radoslaw Yudin, Sergei 1/2-1/2 D31 2715 2546 Semi-Slav Defense Scotch Game - Scotch Gambit w1 d4
86 13 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2 13 Mamedov, Rauf Polgar, Judit 0-1 B22 2659 2685 Sicilian Defense Sicilian Defense - Accelerated Dragon, Exchange Variation w1 e4
89 118 FIDE World Rapid 2014 Dubai UAE 2014-06-17 6 51 Breckenridge, Steven James Mchedlishvili, Mikheil 1/2-1/2 B22 2299 2632 Sicilian Defense Sicilian Defense - Maroczy Bind, Breyer Variation w1 e4
94 94 FIDE World Rapid 2014 Dubai UAE 2014-06-17 6 26 Inarkiev, Ernesto Al-Sayed, Mohammed 1-0 B90 2682 2505 Sicilian Defense Sicilian Defense - Alapin Variation, Barmen Defense w1 e4
94 54 FIDE World Rapid 2014 Dubai UAE 2014-06-16 5 3 Polgar, Judit Karjakin, Sergey 0-1 B90 2685 2771 Sicilian Defense Sicilian Defense - Alapin Variation, Barmen Defense w1 e4

Tidy & Manipulate Data II

I would like to split the results column into white player result (wp_rslt) and black player result (bp_rslt) by separating it on the delimiter “-”. wp_rslt and bp_rslt will be coerced to a numeric data type.

Also, from this, I would like to create a column (winner) with the name of the winning player and if the game was drawn, I would like to flag the respective cell of the column as “DRAWN”.

game.opening<-game.opening%>%separate(game_result,into=c("wp_rslt","bp_rslt"),sep="-")

game.opening$wp_rslt<-ifelse(game.opening$wp_rslt=="1/2",0.5,game.opening$wp_rslt)

game.opening$bp_rslt<-ifelse(game.opening$bp_rslt=="1/2",0.5,game.opening$bp_rslt)

game.opening$wp_rslt<-as.numeric(game.opening$wp_rslt)
game.opening$bp_rslt<-as.numeric(game.opening$bp_rslt)

game.opening$winner<-ifelse(game.opening$wp_rslt==1.0,
                            game.opening$white,
                      ifelse(game.opening$bp_rslt==1.0,
                             game.opening$black,"DRAWN"))

kable(head(game.opening), caption = "game.opening table after manipulation of game result column")
game.opening table after manipulation of game result column
opening_id game_id event site event_date round table_number white black wp_rslt bp_rslt ECO whiteElo BlackElo opening name code variation color_turn move winner
19 47 FIDE World Rapid 2014 Dubai UAE 2014-06-16 4 13 Yu, Yangyi Le, Quang Liem 0.0 1.0 C45 2675 2712 Scotch Game Scandinavian Defense - Mieses-Kotroc Variation w1 e4 Le, Quang Liem
28 7 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2 7 Nepomniachtchi, Ian Lenic, Luka 1.0 0.0 C45 2730 2659 Scotch Game Scotch Game - Classical Variation w1 e4 Nepomniachtchi, Ian
30 109 FIDE World Rapid 2014 Dubai UAE 2014-06-17 6 41 Al Qudaimi, Basheer Amin, Bassem 0.0 1.0 C45 2396 2635 Scotch Game Scotch Game - Millennium Variation w1 e4 Amin, Bassem
41 285 FIDE World Rapid 2014 Dubai UAE 2014-06-17 10 4 Moradiabadi, Elshan Lu, Shanglei 0.5 0.5 D44 2586 2546 Semi-Slav Defense Scotch Game - Romanishin Variation w1 d4 DRAWN
43 162 FIDE World Rapid 2014 Dubai UAE 2014-06-17 7 44 Fedoseev, Vladimir3 Al Qudaimi, Basheer 1.0 0.0 D45 2662 2396 Semi-Slav Defense Scotch Game - Scotch Gambit w1 d4 Fedoseev, Vladimir3
43 211 FIDE World Rapid 2014 Dubai UAE 2014-06-17 9 16 Wojtaszek, Radoslaw Yudin, Sergei 0.5 0.5 D31 2715 2546 Semi-Slav Defense Scotch Game - Scotch Gambit w1 d4 DRAWN

Scan I

I scan for missing values in the game.opening data frame and I find that there are NA entries corresponding to a few opening id’s in the game table. These NA’s are present in the code column and the first four pairs of opening moves as present in the opening table.

Also, there are some hyphens (-) under the code and opening moves columns. This is completely fine and I felt no need of any modification.

For representational purposes, I corece the NA cells to appear as “-”.

sapply(game.opening,function(x){sum(is.na(x))})
##   opening_id      game_id        event         site   event_date 
##            0            0            0            0            0 
##        round table_number        white        black      wp_rslt 
##            0            0            0            0            0 
##      bp_rslt          ECO     whiteElo     BlackElo      opening 
##            0            0            0            0            0 
##         name         code    variation   color_turn         move 
##         1764         1764         1764            0          504 
##       winner 
##            0
game.opening[is.na(game.opening)]<-"-"
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "-"): invalid factor
## level, NA generated
kable(head(game.opening), caption = "game.opening table after scanning for missing values")
game.opening table after scanning for missing values
opening_id game_id event site event_date round table_number white black wp_rslt bp_rslt ECO whiteElo BlackElo opening name code variation color_turn move winner
19 47 FIDE World Rapid 2014 Dubai UAE 2014-06-16 4 13 Yu, Yangyi Le, Quang Liem 0.0 1.0 C45 2675 2712 Scotch Game Scandinavian Defense - Mieses-Kotroc Variation w1 e4 Le, Quang Liem
28 7 FIDE World Rapid 2014 Dubai UAE 2014-06-16 2 7 Nepomniachtchi, Ian Lenic, Luka 1.0 0.0 C45 2730 2659 Scotch Game Scotch Game - Classical Variation w1 e4 Nepomniachtchi, Ian
30 109 FIDE World Rapid 2014 Dubai UAE 2014-06-17 6 41 Al Qudaimi, Basheer Amin, Bassem 0.0 1.0 C45 2396 2635 Scotch Game Scotch Game - Millennium Variation w1 e4 Amin, Bassem
41 285 FIDE World Rapid 2014 Dubai UAE 2014-06-17 10 4 Moradiabadi, Elshan Lu, Shanglei 0.5 0.5 D44 2586 2546 Semi-Slav Defense Scotch Game - Romanishin Variation w1 d4 DRAWN
43 162 FIDE World Rapid 2014 Dubai UAE 2014-06-17 7 44 Fedoseev, Vladimir3 Al Qudaimi, Basheer 1.0 0.0 D45 2662 2396 Semi-Slav Defense Scotch Game - Scotch Gambit w1 d4 Fedoseev, Vladimir3
43 211 FIDE World Rapid 2014 Dubai UAE 2014-06-17 9 16 Wojtaszek, Radoslaw Yudin, Sergei 0.5 0.5 D31 2715 2546 Semi-Slav Defense Scotch Game - Scotch Gambit w1 d4 DRAWN

Scan II

In this section, I scan the ELO ratings column and game results columns (white and black) for possibe outliers.

par(mfrow=c(3,3))
hist(game.opening$whiteElo,main="Histogram of White Player ELO Ratings",xlab="White - ELO Ratings")
boxplot(game.opening$whiteElo,main="Box plot of White player ELO Ratings ")

hist(game.opening$BlackElo,main="Histogram of Black Player ELO Ratings",xlab="Black - ELO Ratings")
boxplot(game.opening$BlackElo,main="Box plot of black player ELO Ratings ")

boxplot(game.opening$wp_rslt,main="Box plot of white players' results")

boxplot(game.opening$bp_rslt,main="Box plot of balck players' results")

From the above, though the ELO ratings box plots indicate the presence of outliers, it is quite natural to have a wide spectrum of ELO rated players participating in a rapid championship. More importantly, FIDE chess ratings are not to be tinkered and hence I leave them as it is.

Transform

Though we should not tamper with the FIDE ratings, I fell to enable comparison and to discern the strength of the players, I create two new columns with the range normalised ELO ratings (For White and Black).

For this, I use a custom defined function that carries out the range normalisation.

#Custom defined function
rangenormal<-function(x){(x-min(x))/(max(x)-min(x))}

game.opening$whiteElo_trfm<-rangenormal(game.opening$whiteElo)
game.opening$BlackElo_trfm<-rangenormal(game.opening$BlackElo)
kable(head(game.opening[22:23]),caption = "Transformed ELO ratings of the players")
Transformed ELO ratings of the players
whiteElo_trfm BlackElo_trfm
0.6628478 0.7415902
0.7528642 0.6605505
0.2062193 0.6238532
0.5171849 0.4877676
0.6415712 0.2584098
0.7283142 0.4877676

Win Summary

I would like to summarise the number of wins by each player (who have won atleast one game) in the 2014 world rapid chess championships. For representational purpose, I would like to print out just the first 5 rows of the winner_summary table.

winner_summary<-game.opening%>%group_by(winner)%>%summarise(wins=n_distinct(game_id))


names(winner_summary)[1]<-"Winners"
names(winner_summary)[2]<-"# Wins"
kable(head(winner_summary),caption = "Table of Player Wins")
Table of Player Wins
Winners # Wins
Adly, Ahmed 2
Al-Modiahki, Mohamad 2
Aleksandrov, Aleksej 1
Amin, Bassem 2
Anand, Viswanathan 3
Andriasian, Zaven 3

References

Relational.fit.cvut.cz. (2018). Dataset. [online] Available at: https://relational.fit.cvut.cz/dataset/Chess [Accessed 3 Jun. 2018].