Advanced R Programming - Final Project - Chess Toolkit - Khon Hoang Nguyen

Introduction

This project is part of the course Advanced R Programming. In it, the author uses R to manipulate chess data from lichess.org. The programs in this project help users get to know the lichess database and pre-process the data for more sophisticated machine learning jobs. They also help users create their own collection of players, whose data can be recorded and updated with different methods. This makes the toolkit suitable for organizers of chess clubs or chess tournaments. There are five tools in this project.
1. Summarize data of players in the database
  
  The input data comes from the lichess database (https://database.lichess.org/). In this project, the author creates a demo with the data for ‘2013 - January’. When the lichess data is loaded with the read.pgn function from the bigchess package, it has 27 variables. Some of the important variables are:
  - Event: Type of the game
  - Site: Link of the game
  - White: Nickname of white player
  - Black: Nickname of black player
  - Result: Result of the game
  - Movetext: The record of the game
  The remaining variables provide additional information, such as how many moves each piece of Black and White makes. It is also important to note that the data are stored at the game level: each row is a game. In this first tool, the author transforms the data from game level to player level. After that, it is summarized so that each row represents one unique player in the dataset. tidyr and dplyr are the two libraries used in this tool.
2. Create valid moves list in chess
  
  This part is all about creating the list of valid moves in chess. It was not a main part of the project initially. However, when searching for a list of legal chess moves on the internet, the author could not find a suitable one, so this part is dedicated to building it. The list of valid moves is used later in tool 3 to check whether a move received from the user is valid. In this part, I use functions and vectorization to achieve the goal.
3. Create vector of moves until a specific move happens for the first time
  
  In a project for a previous course at the University of Warsaw, I investigated the question “What do players often play before they castle?” In that project, I had to create a dataset of move sequences that start from the beginning of the game and end when the castling move (O-O) happens. So, in this project, I write code so that people can extract the sequences that end with any target move they want to investigate. Similar to tool 2, functions and vectorization are used to complete the task.
4. Create class player to record information about players and methods so that players can interact with others
  
  This is a separate part that organizers of chess tournaments or chess clubs can use. Each player is an object, and when players play against each other, their score and Elo change according to the result of the game. In this part, I use an S4 class and methods to tackle the problem.
5. Create package for other chess organizer usage
  
  For this part, I aim to create a package so that other people can use it too. It will include the materials and code presented in part 4.
Tool 1 - Summarize data of players in the database

Firstly, my goal is to transform the data from game level to player level. One game has 2 players, so the goal is to transform n rows of games into 2n rows of players. The Result column should be split into two columns, one containing the score for White and one containing the score for Black. In addition, I replace the value 1/2 with 0.5 to keep it consistent across the later parts; “0.5” is also more convenient to convert to numeric.

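#Load the libraries used in this tool
library(tidyr)
library(dplyr)
#Split the Result column
x <- separate(x, col = Result, into = c("Result_White", "Result_Black"), sep = "-")
#Replace "1/2" with 0.5 in both new columns
x$Result_White[x$Result_White == "1/2"] <- 0.5
x$Result_Black[x$Result_Black == "1/2"] <- 0.5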

After that, I split x into two separate parts, one for Black and one for White. Later, after processing them separately, I combine them again.

#Split the data into player level instead of match level, x1 is for white and x2 is for black
x1 <- select(x, Event, White, Result_White)
head(x1)

##                  Event            White Result_White
## 1 Rated Classical game            BFG9k            1
## 2 Rated Classical game   Desmond_Wilson            1
## 3 Rated Classical game    Kozakmamay007            1
## 4    Rated Bullet game Naitero_Nagasaki            0
## 5    Rated Bullet game     nichiren1967            0
## 6     Rated Blitz game            sport            1

#Same procedure for Black part
x2 <- select(x, Event, Black, Result_Black)
head(x2)

##                  Event             Black Result_Black
## 1 Rated Classical game           mamalak            0
## 2 Rated Classical game         savinka59            0
## 3 Rated Classical game VanillaShamanilla            0
## 4    Rated Bullet game               800            1
## 5    Rated Bullet game  Naitero_Nagasaki            1
## 6     Rated Blitz game          shamirbj            0

Then, I fix x1 and x2 by adding a column that indicates the color, so that after merging it is still possible to identify whether the player played Black or White in the game. I also rename some columns to make the two parts consistent for the later merge.

#Add new column to note that the player is White in the game
x1$Color <- "white"
#Rename to reflect the Nickname of the player
x1 <- rename(x1, Nickname = White, Result = Result_White)
#Add new column to note that the player is Black in the game
x2$Color <- "black"
#Rename to reflect the Nickname of the player
x2 <- rename(x2, Nickname = Black, Result = Result_Black)

After that, I merge them back and convert the columns to appropriate data types. Event, Color and Nickname should be factors while Result should be numeric.

#Merge x1 and x2
x <- rbind(x1,x2)
x$Event <- as.factor(x$Event)
x$Result <- as.numeric(x$Result)
x$Color <- as.factor(x$Color)
x$Nickname <- as.factor(x$Nickname)
head(x)

##                  Event         Nickname Result Color
## 1 Rated Classical game            BFG9k      1 white
## 2 Rated Classical game   Desmond_Wilson      1 white
## 3 Rated Classical game    Kozakmamay007      1 white
## 4    Rated Bullet game Naitero_Nagasaki      0 white
## 5    Rated Bullet game     nichiren1967      0 white
## 6     Rated Blitz game            sport      1 white

After that, I create a summary table showing the details of each unique player in the dataset. The three details included are the number of games, the total score achieved in those games, and the average score.

#Create summary table of unique people with total score, number of game, and mean score
x_player <- group_by(x,Nickname)
summary_table <- summarize(x_player,games = n(),total_scores = sum(Result,na.rm=TRUE),
                           avg_result = round(mean(Result,na.rm=TRUE),2)) %>% arrange(desc(games))
head(summary_table)

## # A tibble: 6 × 4
##   Nickname          games total_scores avg_result
##   <fct>             <int>        <dbl>      <dbl>
## 1 F1_ALONSO_FERRARI  1729         940.       0.56
## 2 nichiren1967       1667         834        0.51
## 3 german11           1611         844.       0.53
## 4 cheesedout         1452         952        0.66
## 5 ChikiPuki          1262         750.       0.61
## 6 Redneck            1224         508.       0.42
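
The steps of Tool 1 can also be condensed into a single pipeline. The following is a minimal sketch on a made-up three-game data frame. The nicknames alice, bob and carol and the helper name score_lookup are assumptions for illustration and are not part of the lichess data. The sketch also assumes every Result is one of "1-0", "0-1" or "1/2-1/2".

```r
library(dplyr)
library(tidyr)

# Toy game-level data: one row per game (made-up players)
games <- data.frame(
  Event  = c("Rated Blitz game", "Rated Blitz game", "Rated Bullet game"),
  White  = c("alice", "bob", "alice"),
  Black  = c("bob", "carol", "carol"),
  Result = c("1-0", "1/2-1/2", "0-1"),
  stringsAsFactors = FALSE
)

# Lookup from the text score to a numeric score
score_lookup <- c("1" = 1, "0" = 0, "1/2" = 0.5)

# Split Result, then convert both halves to numeric via the lookup
games <- games %>%
  separate(Result, into = c("Result_White", "Result_Black"), sep = "-") %>%
  mutate(Result_White = unname(score_lookup[Result_White]),
         Result_Black = unname(score_lookup[Result_Black]))

# Stack the white and black views: n games become 2n player rows
players <- bind_rows(
  transmute(games, Event, Nickname = White, Result = Result_White, Color = "white"),
  transmute(games, Event, Nickname = Black, Result = Result_Black, Color = "black")
)

# One row per unique player
summary_table <- players %>%
  group_by(Nickname) %>%
  summarize(games = n(),
            total_scores = sum(Result),
            avg_result = round(mean(Result), 2)) %>%
  arrange(desc(games))
```

Using a named lookup vector avoids the coercion warnings that as.numeric("1/2") would raise, at the cost of returning NA for any result string that is not in the lookup.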

Tool 2 - Create valid moves list in chess

At first, I planned to find a list of valid moves on the internet to validate the input move for the next part. However, I could not find a good source, so I decided to create one myself. The notation of chess moves is briefly demonstrated by the following examples.

First, there are the basic move and the capture move:

Basic move: Rook to b5 - Rb5
Capture move: Knight captures Rook - Nxc3

Then, there are the check move and the promotion move:

Check move: Queen checks - Qe3+
Promotion move: pawn moves to d8 and is promoted to a Queen - d8=Q

And here are the checkmate move and a move with a special note because of duplication:

Checkmate move: Rook moves to d8 and checkmates - Rd8#
Special note: because two rooks are on the same file, it is necessary to specify which rook moves to square d3 - R8d3

A basic move consists of the name of the piece and the destination square. However, pieces can also perform actions such as capture, check, promotion, etc., so the notation can become more complicated. In some cases, a move combines several actions at once, e.g. Rxa8+ means that the Rook captures a piece on square a8 and gives check. Therefore, in the section below, I aim to create a list of all possible moves that can happen in chess.

First, I create several variables as containers. Because the notation for pawn moves differs from that of the other pieces, there are two separate variables.


#Write the list of valid move
#Pawn moves
pawn = c()
#Other pieces moves
op = c()
#Valid moves
vm = c()

After that, I create the ranks, files, and pieces variables.

#Create 8 rows of the board which are called 'rank' in chess.
r = c(1,2,3,4,5,6,7,8)
#Create 8 columns of the board which are called 'file' in chess.
f = c("a","b","c","d","e","f","g","h")
#Create 5 other pieces different from pawn which are Rook, Knight, Bishop, King, Queen.
p = c("R","N","B","K","Q")

The move notation in chess is a combination of piece name, file, rank and action (if there is one). For example, “Nxb6” means that a knight (piece name) takes an opponent's piece (action) on square b6 (location). So the text of two vectors needs to be combined, and this is needed repeatedly. Therefore, I create a function for it.

#Create function that is used for creating valid moves
joinText <- function (x, y) {
  res <- c()
  for (i in x) {
    for (j in y) {
      res <- c(res,paste0(i,j))
    }
  }
  return (res)
}
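
Because the nested loop grows res one element at a time, the same pairing can also be written with vectorized calls. The following is a minimal sketch; joinTextVec is a hypothetical name, not part of the project. It repeats the loop version only to compare the two results.

```r
# Loop version, as defined in the text
joinText <- function(x, y) {
  res <- c()
  for (i in x) {
    for (j in y) {
      res <- c(res, paste0(i, j))
    }
  }
  res
}

# Vectorized sketch: repeat each x once per y, and cycle y once per x
joinTextVec <- function(x, y) {
  paste0(rep(x, each = length(y)), rep(y, times = length(x)))
}

files <- c("a", "b", "c")
ranks <- 1:2
joinTextVec(files, ranks)
# "a1" "a2" "b1" "b2" "c1" "c2"
```

Both versions return the combinations in the same order, so the vectorized one could be swapped in without changing the later steps.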

In the next step, I create the locations of the 64 squares on the board from the two vectors f and r. The goal is to have: a1, a2,…, a8, b1, b2,…, b8,…, h8. These are not only the square addresses but also the way basic pawn moves are written, e.g. h4 means that a pawn moves to square h4.

#Create target squares
ts <- joinText(f,r)
head(ts)

## [1] "a1" "a2" "a3" "a4" "a5" "a6"

After that, I also create the target squares with the capture symbol ‘x’. The goal is to have: xa1, xa2,…, xh8.

#Create target squares with capture symbols
tss = joinText("x",ts)
head(tss)

## [1] "xa1" "xa2" "xa3" "xa4" "xa5" "xa6"

Then, I create the basic moves for pieces other than pawns. Basic moves are moves without actions such as capture, check or checkmate. The goal is to have: Ra1, Ra2,…, Na1, Na2,…, Qh7, Qh8.

#Create basic other pieces moves
op <- joinText(p,ts)
head(op)

## [1] "Ra1" "Ra2" "Ra3" "Ra4" "Ra5" "Ra6"

In the next step, the basic pawn moves are created. They are similar to ts, with one difference: there are no basic pawn moves to rank 1 and rank 8, because a pawn that reaches rank 1 (for Black) or rank 8 (for White) is promoted, so those are not basic moves anymore.

#Create basic pawn moves
pawn <- joinText(f,r[r!= 1 & r!= 8])
head(pawn)

## [1] "a2" "a3" "a4" "a5" "a6" "a7"

Then I create the capturing moves for pawns. By rule, a pawn can only capture a piece standing on a file next to its own. I exclude the captures that land on rank 1 and rank 8 for the same reason as above; moves to rank 1 and rank 8, whether basic or capturing, are dealt with later. The goal for this step is to have: axb2, axb3,…, gxh7, hxg2,…, bxa7.

#Create pawn moves that do capture
pawnc = c()
for (i in seq(1,length(f)-1)) {
  for (j in r) {
    if (j != 1 & j != 8) {
      pawnc = c(pawnc,paste0(f[i],"x",f[i+1],j))
    }
  }
}
for (i in seq(length(f),2)) {
  for (j in r) {
    if (j != 1 & j != 8) {
      pawnc = c(pawnc,paste0(f[i],"x",f[i-1],j))
    }
  }
}
#Merge to the pawn moves
pawn = c(pawn, pawnc)
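
As a cross-check of the two loops, the capture list can be built from "from file" and "to file" vectors. This is a sketch under the same assumptions as the text (captures only onto ranks 2 to 7); the names from, to and pawn_captures are illustrative.

```r
files <- letters[1:8]
ranks <- 2:7   # ranks 1 and 8 are promotion ranks, handled separately

# Capturing towards the h-file (a takes on b, ..., g takes on h),
# then towards the a-file (h takes on g, ..., b takes on a)
from <- c(files[1:7], files[8:2])
to   <- c(files[2:8], files[7:1])

# Each from/to pair is repeated once per rank; ranks is recycled
pawn_captures <- paste0(rep(from, each = length(ranks)), "x",
                        rep(to,   each = length(ranks)), ranks)
length(pawn_captures)  # 14 from/to pairs * 6 ranks = 84
```

The count of 84 gives a quick sanity check for the length of pawnc.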

After that, as noted above, I deal with the promotion moves. Promotion moves end with “=[any piece except the King]”. So the strategy here is to create the basic moves to rank 1 and rank 8 and the capture moves onto rank 1 and rank 8, and then add all possible endings to those moves.

#create pawn move to promote
  #basic promote move
pm1 <- joinText(f,1)
pm8 <- joinText(f,8)
pm <- c(pm1,pm8)
  #capture promote move
cpm = c()
for (i in seq(1,length(f)-1)) {
  for (j in r) {
    if (j == 1 | j == 8) {
      cpm = c(cpm,paste0(f[i],"x",f[i+1],j))
    }
  }
}
for (i in seq(length(f),2)) {
  for (j in r) {
    if (j == 1 | j == 8) {
       cpm = c(cpm,paste0(f[i],"x",f[i-1],j))
    }
  }
}
pm = c(pm,cpm)
  #promote pieces
pp <- joinText("=",p[p != "K"])

for (i in pm) {
  for (j in pp) {
    pawn = c(pawn,paste0(pm,pp))
  }
}
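
One detail of the loop above may be worth double-checking: paste0(pm, pp) pastes the two whole vectors element by element (recycling the shorter pp) rather than the loop variables i and j, so every pass appends the same strings. The following sketch assumes the intent is to pair every promotion move with every promotion piece and contrasts the two behaviours. The names promo_squares, promo_endings, elementwise and promotions are illustrative.

```r
files <- letters[1:8]

# Straight pawn moves and captures that land on rank 1 or rank 8
straight <- paste0(rep(files, each = 2), c(1, 8))
from <- c(files[1:7], files[8:2])
to   <- c(files[2:8], files[7:1])
captures <- paste0(rep(from, each = 2), "x", rep(to, each = 2), c(1, 8))
promo_squares <- c(straight, captures)          # 16 + 28 = 44 moves

promo_endings <- paste0("=", c("R", "N", "B", "Q"))

# Element-wise pasting recycles the endings: still only 44 strings,
# each move paired with a single promotion piece
elementwise <- paste0(promo_squares, promo_endings)

# All combinations: every move with every ending, 44 * 4 = 176 strings
promotions <- paste0(rep(promo_squares, each = length(promo_endings)),
                     promo_endings)
```

If all combinations are wanted, the total number of valid moves reported later would change accordingly, so the count is best recomputed after any such change.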

In the next steps, I turn to the moves of the other pieces. Unlike the King, each of the other pieces can appear more than once, e.g. more than one Rook, more than one Knight, etc. Thus, in some cases, the notation must specify which piece is meant. There can be Rb4, but there can also be Rab4 (when more than one Rook on rank 4 can reach b4) or R5b4 (when more than one Rook on file b can reach b4). Because of that, I create the prefixes for those special cases. The goal is to have: Ra, Rb, Rc,…, Qh, R1, R2,…, Q7, Q8.

#Create other pieces move in case of duplication
#For files (other pieces duplication - file)
opdf <- joinText(p[p != "K"],f)
#For ranks (other pieces duplication - rank)
opdr <- joinText(p[p != "K"],r)
#Merge to have a complete other pieces duplication
opd <- c(opdf,opdr)
head(opd)

## [1] "Ra" "Rb" "Rc" "Rd" "Re" "Rf"

After that, I merge opd with the basic piece names (already stored in variable p) to get op_starting. Then I only need to append the target squares and, for a second version, the capture symbol plus the target squares. After these steps, I have all moves of the other pieces.

#Merge to have complete, both basic pieces and pieces duplication
op_starting <- c(opd,p)
#Create basic other pieces moves
basic_op <- joinText(op_starting,ts)
#Creating capture pieces moves
capture_op <- joinText(op_starting,tss)
#Merge to have a complete list of other pieces moves
op <- c(basic_op,capture_op)

Then, I pack the pawn moves and the other pieces' moves into the valid moves variable. In addition to these, there are also the short castle and the long castle.

#Add castle moves to the valid moves
vm = c(pawn, op, "O-O","O-O-O")

One last step remains: indicating that a move may give check or checkmate. When a move gives check, it ends with “+”. When a move gives checkmate, it ends with “#”. So the list of valid moves triples in size after adding these 2 actions.

#Add check and checkmate to the valid moves
for (i in vm) {
  vm = c(vm,paste0(i,"+"))
  vm = c(vm,paste0(i,"#"))
}
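
The same tripling can be written without a loop. The sketch below uses a three-move stand-in vector, vm_demo (a hypothetical name), instead of the full list, and keeps the interleaved "+", "#" order that the loop produces.

```r
vm_demo <- c("e4", "Nf3", "O-O")

# rbind puts the "+" variants in row 1 and the "#" variants in row 2;
# as.vector reads the matrix column by column, which gives the
# interleaved order "e4+", "e4#", "Nf3+", "Nf3#", ...
suffixed <- as.vector(rbind(paste0(vm_demo, "+"), paste0(vm_demo, "#")))
vm_all <- c(vm_demo, suffixed)
length(vm_all)  # 3 times the original length: 9
```

Avoiding the repeated c() calls matters more on the full list, where each append copies a vector of several thousand elements.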

Eventually, the final list of valid moves is completed with 50,130 moves.

#Demo of moves from the valid moves list
print(vm[c(1:10,100:110, 1000:1010, 10000:10010, 50000:50010)])

##  [1] "a2"     "a3"     "a4"     "a5"     "a6"     "a7"     "b2"     "b3"    
##  [9] "b4"     "b5"     "gxf5"   "gxf6"   "gxf7"   "fxe2"   "fxe3"   "fxe4"  
## [17] "fxe5"   "fxe6"   "fxe7"   "exd2"   "exd3"   "hxg8=Q" "gxf1=R" "gxf8=N"
## [25] "fxe1=B" "fxe8=Q" "exd1=R" "exd8=N" "dxc1=B" "dxc8=Q" "cxb1=R" "cxb8=N"
## [33] "R2b4"   "R2b5"   "R2b6"   "R2b7"   "R2b8"   "R2c1"   "R2c2"   "R2c3"  
## [41] "R2c4"   "R2c5"   "R2c6"   "Qxa1#"  "Qxa2+"  "Qxa2#"  "Qxa3+"  "Qxa3#" 
## [49] "Qxa4+"  "Qxa4#"  "Qxa5+"  "Qxa5#"  "Qxa6+"  "Qxa6#"

Tool 3 - Create vector of moves until a specific move happens for the first time

To explain the goal of this tool, I take a short game as an example:

1. e4 e5 2. Nf3 Nf6 3. Bc4 Bc5 4. O-O O-O 5. Re1 Re8

This is the format of the Movetext column in the dataset. Suppose I want to know which sequences lead to O-O: I input O-O and receive 2 columns, wsqtm and bsqtm. wsqtm has c(“e4”,“Nf3”,“Bc4”,“O-O”) and bsqtm has c(“e5”,“Nf6”,“Bc5”,“O-O”).

Firstly, I remove all “1.”, “2.”,… from the Movetext column, then I split the moves into the elements of a vector.

#remove the move numbers (of any length, so games with 100+ moves are handled too)
df$moves = gsub('\\d+[\\,\\.] ', '',df$Movetext)
#split the moves
df$moves <- strsplit(df$moves, " ")
#demo
head(df[,"moves"])

## [[1]]
##  [1] "e4"   "e6"   "d4"   "b6"   "a3"   "Bb7"  "Nc3"  "Nh6"  "Bxh6" "gxh6"
## [11] "Be2"  "Qg5"  "Bg4"  "h5"   "Nf3"  "Qg6"  "Nh4"  "Qg5"  "Bxh5" "Qxh4"
## [21] "Qf3"  "Kd8"  "Qxf7" "Nc6"  "Qe8#"
## 
## [[2]]
##  [1] "d4"    "d5"    "Nf3"   "Nf6"   "e3"    "Bf5"   "Nh4"   "Bg6"   "Nxg6" 
## [10] "hxg6"  "Nd2"   "e6"    "Bd3"   "Bd6"   "e4"    "dxe4"  "Nxe4"  "Rxh2" 
## [19] "Ke2"   "Rxh1"  "Qxh1"  "Nc6"   "Bg5"   "Ke7"   "Qh7"   "Nxd4+" "Kd2"  
## [28] "Qe8"   "Qxg7"  "Qh8"   "Bxf6+" "Kd7"   "Qxh8"  "Rxh8"  "Bxh8" 
## 
## [[3]]
##  [1] "e4"    "e5"    "Nf3"   "Nc6"   "Bc4"   "Nf6"   "Nc3"   "Bc5"   "a3"   
## [10] "Bxf2+" "Kxf2"  "Nd4"   "d3"    "Ng4+"  "Kf1"   "Qf6"   "h3"    "d5"   
## [19] "Nxd5"  "Qe6"   "Nxc7+"
## 
## [[4]]
##  [1] "e4"    "c6"    "Nc3"   "d5"    "Qf3"   "dxe4"  "Nxe4"  "Nd7"   "Bc4"  
## [10] "Ngf6"  "Nxf6+" "Nxf6"  "Qg3"   "Bf5"   "d3"    "Bg6"   "Ne2"   "e6"   
## [19] "Bf4"   "Nh5"   "Qf3"   "Nxf4"  "Nxf4"  "Be7"   "Bxe6"  "fxe6"  "Nxe6" 
## [28] "Qa5+"  "c3"    "Qe5+"  "Qe3"   "Qxe3+" "fxe3"  "Kd7"   "Nf4"   "Bd6"  
## [37] "Nxg6"  "hxg6"  "h3"    "Bg3+"  "Kd2"   "Raf8"  "Rhf1"  "Ke7"   "d4"   
## [46] "Rxf1"  "Rxf1"  "Rf8"   "Rxf8"  "Kxf8"  "e4"    "Ke7"   "Ke3"   "g5"   
## [55] "Kf3"   "Be1"   "Kg4"   "Bd2"   "Kf5"   "Bc1"   "Kg6"   "Kf8"   "e5"   
## [64] "Bxb2"  "Kxg5"  "Bxc3"  "h4"    "Bxd4"  "h5"    "Bxe5"  "g4"    "Bb2"  
## [73] "Kf5"   "Kf7"   "g5"    "Bc1"   "g6+"   "Ke7"   "Ke5"   "b5"    "Kd4"  
## [82] "Kd6"   "Kc3"   "c5"    "a3"    "Bg5"   "a4"    "bxa4"  "Kb2"   "Kd5"  
## [91] "Ka3"   "Kd4"   "Kxa4"  "c4"   
## 
## [[5]]
##  [1] "e4"    "e6"    "f4"    "d5"    "e5"    "c5"    "Nf3"   "Qb6"   "c3"   
## [10] "Nc6"   "d3"    "Bd7"   "Be2"   "Nh6"   "O-O"   "Nf5"   "g4"    "Nh6"  
## [19] "Kg2"   "Nxg4"  "h3"    "Nh6"   "Ng5"   "Nf5"   "Bg4"   "Nce7"  "Nd2"  
## [28] "Ne3+"  "Kf3"   "Nxd1"  "Rxd1"  "h6"    "Nxf7"  "Kxf7"  "Rf1"   "h5"   
## [37] "Bxe6+" "Bxe6"  "Kg3"   "Nf5+"  "Kg2"   "Ne3+"  "Kf2"   "Nxf1"  "Kxf1" 
## [46] "Bxh3+"
## 
## [[6]]
##  [1] "e4"    "b6"    "Bc4"   "Bb7"   "d3"    "Nh6"   "Bxh6"  "gxh6"  "Qf3"  
## [10] "e6"    "Nh3"   "Bg7"   "c3"    "Nc6"   "Qg3"   "Rg8"   "Qf3"   "Ne5"  
## [19] "Qe3"   "Nxc4"  "dxc4"  "Qe7"   "O-O"   "Qc5"   "Qxc5"  "b5"    "Qxb5" 
## [28] "Bxe4"  "Nd2"   "Bc6"   "Qb3"   "Bxc3"  "g3"    "Bxd2"  "Rad1"  "Bg5"  
## [37] "Nxg5"  "hxg5"  "Qd3"   "h6"    "b4"    "Ba4"   "Rd2"   "Rb8"   "b5"   
## [46] "d6"    "Qa3"   "Bxb5"  "cxb5"  "Rxb5"  "Qxa7"  "Rc5"   "Qa8+"  "Ke7"  
## [55] "Qxg8"  "e5"    "Qh8"   "d5"    "Qxe5+" "Kd7"   "Rxd5+" "Rxd5"  "Qxd5+"
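
The two parsing steps can be tried on a single string. This is a minimal sketch using the short example game from the start of this section; it assumes the move text contains only move numbers and moves, with no comments or clock annotations.

```r
movetext <- "1. e4 e5 2. Nf3 Nf6 3. Bc4 Bc5 4. O-O O-O 5. Re1 Re8"

# Drop the move numbers ("1. ", "2. ", ...) and split on spaces
moves <- strsplit(gsub("\\d+\\. ", "", movetext), " ")[[1]]
moves
# "e4" "e5" "Nf3" "Nf6" "Bc4" "Bc5" "O-O" "O-O" "Re1" "Re8"
```

strsplit returns a list with one element per input string, which is why the data frame column holds one character vector per game.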

Then, I use lapply to calculate the total number of moves that both black and white make.

#total moves of white and black
df$length<- lapply(df$moves, length)

After that, my next task is to split the moves into one column for Black and one column for White. I know the total number of moves, and White's moves are items 1, 3, 5, … in column moves, while Black's moves are items 2, 4, 6, …. So I create wml and bml, which contain the white move locations and the black move locations.

#Create white move locations and black move locations
#White moves are move item 1, 3, 5, ...
#Black moves are move item 2, 4, 6, ...
wml <- function(x) {
  if (x==0) {
    result <- c(0)
  } else {
    result <- seq(1,x,2)
  }
  return (result)
}

bml <- function(x) {
  #Games with fewer than 2 moves have no black move
  if (x<2) {
    result <- c(0)
  } else {
    result <- seq(2,x,2)
  }
  return (result)
}

df$wml <- lapply(df$length, wml)
df$bml <- lapply(df$length,bml)
#Demo for a random row
df$wml[15]

## [[1]]
##  [1]   1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31  33  35  37
## [20]  39  41  43  45  47  49  51  53  55  57  59  61  63  65  67  69  71  73  75
## [39]  77  79  81  83  85  87  89  91  93  95  97  99 101 103 105 107 109 111 113
## [58] 115 117

df$bml[15]

## [[1]]
##  [1]   2   4   6   8  10  12  14  16  18  20  22  24  26  28  30  32  34  36  38
## [20]  40  42  44  46  48  50  52  54  56  58  60  62  64  66  68  70  72  74  76
## [39]  78  80  82  84  86  88  90  92  94  96  98 100 102 104 106 108 110 112 114
## [58] 116
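
An alternative to the 0 placeholder is to return empty index vectors for games that are too short. The following sketch swaps in that convention; white_idx and black_idx are hypothetical helper names, not the functions used in this project.

```r
# Odd positions are White's moves, even positions are Black's
white_idx <- function(n) seq_len(n)[seq_len(n) %% 2 == 1]
black_idx <- function(n) seq_len(n)[seq_len(n) %% 2 == 0]

moves <- c("e4", "e5", "Nf3", "Nf6", "Bc4")
moves[white_idx(length(moves))]  # "e4" "Nf3" "Bc4"
moves[black_idx(length(moves))]  # "e5" "Nf6"

# Empty or one-move games give empty index vectors instead of an error
white_idx(0)  # integer(0)
black_idx(1)  # integer(0)
```

Indexing with an empty vector returns an empty result, so no special case is needed when the locations are decoded.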

With the moves column and wml, bml available, I decode the locations to get the actual moves of Black and White.

#Based on the moves and the vectors of white and black move locations, the decode function returns the vector of white moves or the vector of black moves
decode <- function(x,y) {
  if (y[1] == 0) {
    result <- 0
  } else {
    result <- x[y]
  }
  return(result)
}

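After that, I use mapply to decode and save the white moves and black moves vectors in two columns, wm and bm.

#SIMPLIFY = FALSE keeps the result a list even if all games have the same length
df$wm <- mapply(decode,df$moves,df$wml,SIMPLIFY = FALSE)
df$bm <- mapply(decode,df$moves,df$bml,SIMPLIFY = FALSE)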
#Demo for a random row
df$wm[15]

## [[1]]
##  [1] "e4"    "f4"    "exd5"  "Nc3"   "Bc4"   "d3"    "g4"    "a4"    "Bd2"  
## [10] "Nf3"   "Qxf3"  "Qg3"   "hxg3"  "O-O-O" "f5"    "fxe6"  "dxc4"  "Rde1" 
## [19] "Bf4"   "gxf4"  "g5"    "Rxe6"  "Rf1"   "Re5"   "Kd2"   "Kc1"   "Nd5"  
## [28] "Ne7+"  "Rxe7"  "Rxd7"  "b3"    "Kb2"   "Kc3"   "Rh1"   "Rf1"   "Rh1"  
## [37] "Rxh7"  "Rxb7"  "Rc7"   "Rxc5"  "b4"    "Rc4+"  "Rc5"   "Rf5+"  "b5"   
## [46] "axb5"  "Kb4"   "Rd5"   "Rd1+"  "Rxg1+" "c4"    "c5"    "b6"    "b7"   
## [55] "Kb5"   "Kc6"   "Kc7"   "b8=Q"  "Kxb8"

df$bm[15]

## [[1]]
##  [1] "c5"    "d5"    "Qxd5"  "Qd8"   "Bf5"   "a6"    "Bd7"   "e6"    "Bc6"  
## [10] "Bxf3"  "Qh4+"  "Qxg3+" "Nc6"   "O-O-O" "Ne5"   "Nxc4"  "fxe6"  "Bd6"  
## [19] "Bxf4+" "Nh6"   "Nf5"   "Rd4"   "Rxc4"  "g6"    "Rd8+"  "Rd7"   "Rd6"  
## [28] "Nxe7"  "Rd7"   "Kxd7"  "Re4"   "Ke6"   "Kf5"   "Re7"   "Re4"   "Rxf4" 
## [37] "Kxg5"  "Rf6"   "Kf4"   "g5"    "g4"    "Kf3"   "Rg6"   "Kg2"   "axb5" 
## [46] "g3"    "Kh1"   "g2"    "g1=Q"  "Kxg1"  "Kf2"   "Ke3"   "Kd4"   "Rg1"  
## [55] "Rb1+"  "Rb4"   "Kxc5"  "Rxb8"

Now it is time for the user to type the move they want to target. I use the valid moves variable (vm) created in tool 2 to check whether the move is valid.

input_tm <- function(){
  tm = readline(prompt = "Enter the move you need to find: ")
  #Defensive programming: keep asking until the move is in the valid moves list
  while (!tm %in% vm) {
    message("You did not provide a valid chess move, please check again. Chess moves are case sensitive, so please check that Caps Lock is turned off.")
    tm = readline(prompt = "Enter the move you need to find again: ")
  }
  message('Everything is just fine.')
  return (tm)
}

#tm <- input_tm()
#The above line is what should be available in the actual program.
#However, for demonstration purpose, I would use O-O-O as an example for target move
tm <- "O-O-O"
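
A lighter, approximate alternative to looking the input up in vm is a regular expression for the move notation. This swaps in a different technique from the one used in this project, and san_pattern and looks_like_move are hypothetical names. The pattern is looser than the list: it checks only the shape of the text, so it also accepts strings such as "a4=Q" that are not in vm.

```r
# Rough pattern for standard algebraic notation:
#   castling
#   | piece move with optional disambiguation and capture
#   | pawn move with optional capture and promotion
# followed by an optional check (+) or checkmate (#) sign
san_pattern <- paste0(
  "^(O-O(-O)?",
  "|[KQRBN][a-h]?[1-8]?x?[a-h][1-8]",
  "|[a-h](x[a-h])?[1-8](=[QRBN])?",
  ")[+#]?$"
)

looks_like_move <- function(move) grepl(san_pattern, move)

looks_like_move(c("O-O-O", "Nxb6", "R8d3", "d8=Q", "Qe3+", "nf3", "Zz9"))
# TRUE TRUE TRUE TRUE TRUE FALSE FALSE
```

Because the check is case sensitive, "nf3" is rejected, which matches the Caps Lock hint in the error message.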

Then, I create a function to find the exact location of the target move in wm and bm.

#Find move
find_move <- function(x) {
  tml <- which(x == tm)
  if (length(tml) == 0) {
    result <- c(0)
  }
  else {
    result <- as.integer(tml[1])
  }
  return (result)
}
#White target move location
df$wtml <- sapply(df$wm,FUN = find_move)
#Black target move location
df$btml <- sapply(df$bm,FUN = find_move)

After that, I create the vector containing the sequence of locations of the moves that lead to the target move.

#Moves location
ml <- function(x) {
  if (x==0) {
    result <- c(0)
  } else {
    result <- seq(1,x)
  }
}
#White sequence (to) target move locations
df$wsqtml <- lapply(df$wtml, ml)
#Black sequence (to) target move locations
df$bsqtml <- lapply(df$btml, ml)

Finally, I decode and have the sequence to target move.

df$wsqtm <- mapply(decode,df$wm,df$wsqtml)
df$bsqtm <- mapply(decode,df$bm,df$bsqtml)
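
For a single vector of moves, the locate-and-decode steps can be condensed into one helper. This is a sketch, with sequence_to_target as a hypothetical name; it returns an empty vector rather than 0 when the target move never happens.

```r
# Return the moves up to and including the first occurrence of target;
# an empty vector means the target move never happened
sequence_to_target <- function(moves, target) {
  pos <- match(target, moves)   # position of the first match, or NA
  if (is.na(pos)) return(character(0))
  moves[seq_len(pos)]
}

white <- c("e4", "Nf3", "Bc4", "O-O", "Re1")
sequence_to_target(white, "O-O")    # "e4" "Nf3" "Bc4" "O-O"
sequence_to_target(white, "O-O-O")  # character(0)
```

Passing the target as an argument also removes the dependence on a global tm variable.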

Below, I summarize the input and output of this tool.

Using O-O-O as the target move, as mentioned above, the input is:

print(df[15,'Movetext'])

## [1] "1. e4 c5 2. f4 d5 3. exd5 Qxd5 4. Nc3 Qd8 5. Bc4 Bf5 6. d3 a6 7. g4 Bd7 8. a4 e6 9. Bd2 Bc6 10. Nf3 Bxf3 11. Qxf3 Qh4+ 12. Qg3 Qxg3+ 13. hxg3 Nc6 14. O-O-O O-O-O 15. f5 Ne5 16. fxe6 Nxc4 17. dxc4 fxe6 18. Rde1 Bd6 19. Bf4 Bxf4+ 20. gxf4 Nh6 21. g5 Nf5 22. Rxe6 Rd4 23. Rf1 Rxc4 24. Re5 g6 25. Kd2 Rd8+ 26. Kc1 Rd7 27. Nd5 Rd6 28. Ne7+ Nxe7 29. Rxe7 Rd7 30. Rxd7 Kxd7 31. b3 Re4 32. Kb2 Ke6 33. Kc3 Kf5 34. Rh1 Re7 35. Rf1 Re4 36. Rh1 Rxf4 37. Rxh7 Kxg5 38. Rxb7 Rf6 39. Rc7 Kf4 40. Rxc5 g5 41. b4 g4 42. Rc4+ Kf3 43. Rc5 Rg6 44. Rf5+ Kg2 45. b5 axb5 46. axb5 g3 47. Kb4 Kh1 48. Rd5 g2 49. Rd1+ g1=Q 50. Rxg1+ Kxg1 51. c4 Kf2 52. c5 Ke3 53. b6 Kd4 54. b7 Rg1 55. Kb5 Rb1+ 56. Kc6 Rb4 57. Kc7 Kxc5 58. b8=Q Rxb8 59. Kxb8"

Output for white

print(df[15,'wsqtm'])

## [[1]]
##  [1] "e4"    "f4"    "exd5"  "Nc3"   "Bc4"   "d3"    "g4"    "a4"    "Bd2"  
## [10] "Nf3"   "Qxf3"  "Qg3"   "hxg3"  "O-O-O"

Output for black

print(df[15,'bsqtm'])

## [[1]]
##  [1] "c5"    "d5"    "Qxd5"  "Qd8"   "Bf5"   "a6"    "Bd7"   "e6"    "Bc6"  
## [10] "Bxf3"  "Qh4+"  "Qxg3+" "Nc6"   "O-O-O"

Tool 4 - Create class player to record information about players and methods so that players can interact with others

In this section, I create a class ‘player’ which has the following attributes:
  - ID
  - Name
  - Gender
  - Nationality
  - Title
  - Elo
  - Score: this is an attribute for a specific tournament

First, I create the vector of valid titles in the chess world.

titles <- c("GM","IM","FM","CM","WIM","WGM","WFM","WCM","Not available")

After that, I create the class. There are some rules for validation: the Elo and score must be numeric and not negative, the title must belong to the list of valid titles, and men cannot hold women's titles.

setClass("player", slots = list(ID = "character",
                                name = "character",
                                gender = "character",
                                nationality = "character",
                                title = "character",
                                elo = "numeric",
                                score = "numeric"),
         validity = function(object)
         {
           if(object@elo < 0) return("Elo cannot be negative!")
           if(object@score < 0) return("Score cannot be negative!")
           if(!object@title %in% titles) return("The title is not valid!")
           if(!object@gender %in% c("F", "M")) 
             return('Incorrect gender - use "F" or "M"!')
           if(object@gender == "M" & object@title %in% c("WIM", "WGM", "WFM", "WCM"))
             return("Men cannot have female titles")
           return(TRUE)
         })
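
To see when the validity rules are actually checked, the following sketch uses a cut-down class with only three slots. playerSketch is a hypothetical class name chosen so that it does not clash with the real ‘player’ class.

```r
library(methods)

# Cut-down class with only the slots needed for the demonstration
setClass("playerSketch",
         slots = list(title = "character", gender = "character", elo = "numeric"),
         validity = function(object) {
           if (object@elo < 0) return("Elo cannot be negative!")
           if (object@gender == "M" && object@title %in% c("WGM", "WIM", "WFM", "WCM"))
             return("Men cannot have female titles")
           TRUE
         })

# A valid object is created normally
ok <- new("playerSketch", title = "GM", gender = "M", elo = 2500)

# An invalid combination makes new() fail; capture the error message
msg <- tryCatch(new("playerSketch", title = "WGM", gender = "M", elo = 2300),
                error = function(e) conditionMessage(e))

# Direct slot assignment does not re-run the validity function,
# so validObject() has to be called explicitly afterwards
ok@elo <- -5
recheck <- tryCatch(validObject(ok), error = function(e) conditionMessage(e))
```

This is why methods that modify slots directly may want to call validObject() before returning the updated object.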

Then, I test the class by creating 3 players.

player1 <- new(Class = "player", name = "Magnus Carlsen", elo = 2864, ID = "01", nationality = "Norwegian", title = "GM", gender = "M", score = 0)
player2 <- new(Class = "player", name = "Hikaru Nakamura", elo = 2760, ID = "02", nationality = "American", title = "GM", gender = "M", score = 0)
player3 <- new(Class = "player", name = "Eric Rosen", elo = 2360, ID = "03", nationality = "American", title = "IM", gender = "M", score = 0)

These are the photos of the 3 people I took as examples.

[Photos: Magnus Carlsen, Hikaru Nakamura, Eric Rosen]

For the next step, I create one function to calculate the Elo rating and one function to calculate the score after each game played between two players. I use the definition from the following website: https://www.omnicalculator.com/sports/elo

Before doing that, I standardize the way the result of a game is written, so I allow only the following 3 inputs:
  - “1-0”: White wins. The first player in the formula gets the point and his/her Elo increases.
  - “0.5-0.5”: Draw. Both players get 0.5 points, and the Elo changes if the two players' Elo ratings differ.
  - “0-1”: Black wins. The second player in the formula gets the point and his/her Elo increases.

#Function to calculate Elo
elo <- function(eloA,eloB,result,KA=20,KB=20) {
  #Defensive programming
  eligible_result <- c("1-0","0-1","0.5-0.5")
  if (!all(is.numeric(eloA),is.numeric(eloB))) {
    stop('Ratings must be numeric.')
  }
  if (eloA < 0 | eloB <0) {
    warning('Rating should be positive')
  }
  if (!result %in% eligible_result) {
    stop('Result must be one of the following: "1-0","0-1","0.5-0.5"')
  }
  if (KA < 10 | KA > 40) {
    stop('K-factor should be in the range [10, 40]')
  }
  if (!KA %in% c(10,20,40)) {
    warning('K-factor should be 10, 20 or 40')
  }
  if (KB < 10 | KB > 40) {
    stop('K-factor should be in the range [10, 40]')
  }
  if (!KB %in% c(10,20,40)) {
    warning('K-factor should be 10, 20 or 40')
  }
  
  #Main function
  dif <- eloB - eloA
  if (result == "1-0") {
    scoreA = 1
    scoreB = 0
  } else if (result == "0-1") {
    scoreA = 0
    scoreB = 1
  } else if (result == "0.5-0.5") {
    scoreA = 0.5
    scoreB = 0.5
  }
  expected_score_A <- 1/(1+10**(dif/400))
  expected_score_B <- 1/(1+10**(-dif/400))
  new_eloA = eloA + KA*(scoreA - expected_score_A)
  new_eloB = eloB + KB*(scoreB - expected_score_B)
  return(c(new_eloA,new_eloB))
}

#Function to calculate score
score <- function(scoreA,scoreB,result) {
  #Defensive programming
  eligible_result <- c("1-0","0-1","0.5-0.5")
  if (!all(is.numeric(scoreA),is.numeric(scoreB))) {
    stop('Score must be numeric.')
    }
  if (scoreA < 0 | scoreB <0) {
    warning('Score should be positive')
    }
  if (!result %in% eligible_result) {
    stop('Result must be one of the following: "1-0","0-1","0.5-0.5"')
    }
  if (result == "1-0") {
    scoreA = scoreA + 1
  } else if (result == "0-1") {
    scoreB = scoreB + 1
  } else if (result == "0.5-0.5") {
    scoreA = scoreA + 0.5
    scoreB = scoreB + 0.5
  }
  return(c(scoreA,scoreB))
}
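
The Elo update can be followed by hand with a small worked example. This is a sketch that repeats only the core formula from the elo function, using the ratings 2864 and 2760 and K = 20 for both players; expected_score is a hypothetical helper name.

```r
# Expected score of a player against an opponent (logistic curve, scale 400)
expected_score <- function(rating, opponent) {
  1 / (1 + 10^((opponent - rating) / 400))
}

rating_a <- 2864
rating_b <- 2760
k <- 20

e_a <- expected_score(rating_a, rating_b)  # about 0.645
e_b <- expected_score(rating_b, rating_a)  # about 0.355, and e_a + e_b = 1

# A wins: the actual scores are 1 for A and 0 for B
new_a <- rating_a + k * (1 - e_a)  # about 2871.093
new_b <- rating_b + k * (0 - e_b)

# With equal K-factors, the points gained by A are the points lost by B
gain_a <- new_a - rating_a
loss_b <- rating_b - new_b
```

Both new ratings are computed from the ratings before the game, which is why the gain and the loss cancel out when KA equals KB.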

Then, I create the method to update the score and Elo after each game, using the two functions created above.

#Method to calculate the score after each game
setGeneric("update_score_Elo",
           function(playerA, playerB, result, KA, KB) standardGeneric("update_score_Elo"))

## [1] "update_score_Elo"

setMethod("update_score_Elo",
          signature = c("player","player","character","numeric","numeric"),
          definition = function(playerA, playerB, result, KA=20,KB=20) {
            new_elo <- elo(playerA@elo,playerB@elo,result,KA,KB)
            playerA@elo <- new_elo[1]
            playerB@elo <- new_elo[2]
            new_score <- score(playerA@score,playerB@score,result)
            playerA@score <- new_score[1]
            playerB@score <- new_score[2]
            return(c(playerA,playerB)) }
)

Then I test with the current players to see the result.

Before the game:

print(player1)

## An object of class "player"
## Slot "ID":
## [1] "01"
## 
## Slot "name":
## [1] "Magnus Carlsen"
## 
## Slot "gender":
## [1] "M"
## 
## Slot "nationality":
## [1] "Norwegian"
## 
## Slot "title":
## [1] "GM"
## 
## Slot "elo":
## [1] 2864
## 
## Slot "score":
## [1] 0

print(player2)

## An object of class "player"
## Slot "ID":
## [1] "02"
## 
## Slot "name":
## [1] "Hikaru Nakamura"
## 
## Slot "gender":
## [1] "M"
## 
## Slot "nationality":
## [1] "American"
## 
## Slot "title":
## [1] "GM"
## 
## Slot "elo":
## [1] 2760
## 
## Slot "score":
## [1] 0

After the game:

#Call the method once so that both players are updated from their pre-game ratings
updated <- update_score_Elo(player1,player2,"1-0",KA=20,KB=20)
player1 <- updated[[1]]
player2 <- updated[[2]]
print(player1)

## An object of class "player"
## Slot "ID":
## [1] "01"
## 
## Slot "name":
## [1] "Magnus Carlsen"
## 
## Slot "gender":
## [1] "M"
## 
## Slot "nationality":
## [1] "Norwegian"
## 
## Slot "title":
## [1] "GM"
## 
## Slot "elo":
## [1] 2871.093
## 
## Slot "score":
## [1] 1

print(player2)

## An object of class "player"
## Slot "ID":
## [1] "02"
## 
## Slot "name":
## [1] "Hikaru Nakamura"
## 
## Slot "gender":
## [1] "M"
## 
## Slot "nationality":
## [1] "American"
## 
## Slot "title":
## [1] "GM"
## 
## Slot "elo":
## [1] 2752.907
## 
## Slot "score":
## [1] 0
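
The new ratings can be checked by hand. The sketch below assumes that the elo function defined earlier follows the standard Elo update, where the expected score of player A is 1 / (1 + 10^((eloB - eloA) / 400)) and each rating moves by K times the difference between the actual and the expected score. The name elo_check is hypothetical and is used only for this check.

```r
# Hypothetical helper: standard Elo update for one game (an assumption,
# not necessarily identical to the elo function used in this project)
elo_check <- function(eloA, eloB, result, KA = 20, KB = 20) {
  # Actual score of player A: 1 for a win, 0 for a loss, 0.5 otherwise
  sA <- switch(result, "1-0" = 1, "0-1" = 0, 0.5)
  # Expected score of player A given the rating difference
  eA <- 1 / (1 + 10^((eloB - eloA) / 400))
  # Player B's actual and expected scores are the complements of player A's
  c(eloA + KA * (sA - eA), eloB + KB * ((1 - sA) - (1 - eA)))
}

# Pre-game ratings of the two players in the demo
check <- elo_check(2864, 2760, "1-0")
round(check, 3)
```

When both ratings are taken from before the game and the K factors are equal, the winner gains about 7.093 points and the loser gives up the same amount, so the sum of the two ratings does not change.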

Finally, I create a method so that the organizer can promote a player to a new title. In chess, players earn higher titles through strong performance, but they do not lose a title when their Elo drops or after poor results. Therefore, this method uses defensive programming to reject an invalid title or a demotion. The rating required for each title is listed below.

Rating	Titles	Women only titles
2500	Grandmaster
2400	International Master
2300	FIDE Master	Woman Grandmaster
2200	Candidate Master	Woman International Master
2100		Woman FIDE Master
2000		Woman Candidate Master

#Method to promote the title
setGeneric("promote",
           function(playerA, title) standardGeneric("promote"))

## [1] "promote"

setMethod("promote",
          signature = c("player","character"),
          definition = function (playerA, title) {
            titles <- c("GM","IM","FM","CM","WIM","WGM","WFM","WCM","Not available")
            title_dict <- c("GM" = 6,"IM" = 5,"FM" = 4,"CM" = 3,"WIM" = 3, "WGM" = 4,"WFM" = 2,"WCM" = 1,"Not available" = 0)
            #Defensive programming
            if (!title %in% titles) {
              stop("Your input is not valid, please check again, your input should be one of the following: GM, IM, FM, CM, WIM, WGM, WFM, WCM, Not available.")
            }
            if (title_dict[playerA@title] > title_dict[title]) {
              stop("The current title of the player is higher than the input title.")
            }
            #Main function
            playerA@title <- title
            return(playerA)
          })

Below is a demo of the method.

Before the promotion:

print(player3)

## An object of class "player"
## Slot "ID":
## [1] "03"
## 
## Slot "name":
## [1] "Eric Rosen"
## 
## Slot "gender":
## [1] "M"
## 
## Slot "nationality":
## [1] "American"
## 
## Slot "title":
## [1] "IM"
## 
## Slot "elo":
## [1] 2360
## 
## Slot "score":
## [1] 0

After the promotion:

player3 <- promote(player3,"GM")
print(player3)

## An object of class "player"
## Slot "ID":
## [1] "03"
## 
## Slot "name":
## [1] "Eric Rosen"
## 
## Slot "gender":
## [1] "M"
## 
## Slot "nationality":
## [1] "American"
## 
## Slot "title":
## [1] "GM"
## 
## Slot "elo":
## [1] 2360
## 
## Slot "score":
## [1] 0
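
The rejected case is the other half of the defensive check. A minimal sketch of the rank comparison follows, written as a plain function so that it runs without the player class. The names title_rank and can_promote are hypothetical, and the ranks are copied from title_dict in the promote method.

```r
# Ranks copied from title_dict in the promote method
title_rank <- c("GM" = 6, "IM" = 5, "FM" = 4, "CM" = 3, "WIM" = 3,
                "WGM" = 4, "WFM" = 2, "WCM" = 1, "Not available" = 0)

# Hypothetical helper: TRUE when moving from `current` to `new` is not a demotion
can_promote <- function(current, new) {
  # Reject titles that are not in the lookup table
  if (!all(c(current, new) %in% names(title_rank))) {
    stop("Both titles must be one of: ",
         paste(names(title_rank), collapse = ", "))
  }
  # unname() drops the title label so that a bare logical is returned
  unname(title_rank[new] >= title_rank[current])
}

can_promote("IM", "GM")   # TRUE: IM (rank 5) to GM (rank 6)
can_promote("GM", "FM")   # FALSE: promote would stop with an error here
```

Unlike the comparison inside promote, this sketch also validates the current title, which avoids a missing value in the comparison if a player object carries a title outside the list.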

Tool 5 - Create a package for other chess organizers

Finally, I create a package that contains the S4 class player, the Elo function and the score function, which calculate Elo and score respectively. The code used to generate the package is shown below, but it is not run here because I have already run it separately. The package itself is attached as a separate zip file.

First, I load the necessary libraries, set the path to my folder and create a skeleton for the package.

# load the necessary packages
library(devtools)
library(pkgbuild)
library(roxygen2)

path_to_packages <- "C:/Users/ChotC/Downloads/"
create_package(path = paste0(path_to_packages, 
                             "ChessToolKit"))

After that, I prepare the R files (the elo and score functions and the player class) and copy them to the R folder. Then I run the function below to let roxygenise create the manual files.

roxygenise(package.dir = paste0(path_to_packages, 
                                "ChessToolKit"))

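roxygenise builds the manual pages from specially formatted comments placed above each function. As an illustration, a file such as R/score.R could look like the sketch below. The wording of the tags is an assumption, and the function body is modeled on the score function shown earlier rather than copied from the package.

```r
#' Update the scores of two players after one game
#'
#' @param scoreA Numeric, current score of the first player.
#' @param scoreB Numeric, current score of the second player.
#' @param result Character, one of "1-0", "0-1" or "0.5-0.5".
#' @return Numeric vector of length two with the new scores of both players.
#' @export
score <- function(scoreA, scoreB, result) {
  if (result == "1-0") {
    # First player wins
    scoreA <- scoreA + 1
  } else if (result == "0-1") {
    # Second player wins
    scoreB <- scoreB + 1
  } else if (result == "0.5-0.5") {
    # Draw: each player gets half a point
    scoreA <- scoreA + 0.5
    scoreB <- scoreB + 0.5
  }
  c(scoreA, scoreB)
}
```

With a file like this in the R folder, roxygenise would be expected to write man/score.Rd and to add export(score) to the NAMESPACE file.
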
After that, I adjust the details in the DESCRIPTION file. Then I check if the package works well.

devtools::check(paste0(path_to_packages, 
                       "ChessToolKit"))

After the check confirms that everything is fine, I install the package and test it. Finally, I create a compressed file that is ready to share with other people.

devtools::install(pkg = paste0(path_to_packages, 
                               "ChessToolKit"), 
                  reload = TRUE)

devtools::build(paste0(path_to_packages, 
                       "ChessToolKit"))

Advanced R Programming - Final Project - Chess Toolkit - Khon Hoang Nguyen - 444135

Khon Hoang Nguyen

06/06/2022


Figure captions (chess notation examples):
Basic move: Rook to b5 - Rb5
Capture: Knight captures Rook - Nxc3
Checkmate move: Rook moves to d8 and checkmates - Rd8#
Special note: because two rooks are on the same file, it is necessary to specify which rook moves to square d3 - R8d3