Overview

The goal of this project is to turn a chess tournament document into a CSV file.

It doesn’t sound that difficult, but the raw data is a pipe-delimited text table that spreads each player across two lines, as the first rows printed in Step One show.

The project stipulates what needs to be included in the final file:

  1. Player’s Name - the player’s full name
  2. Player’s State - the two-letter code in the column that sits under ‘Num’ in the header
  3. Player’s Total Number of Points - the value in the ‘Total Pts’ column
  4. Player’s Pre-Rating (and Player’s ID) - the player’s rating before the tournament, looked up by their player ID
  5. Average Pre Chess Rating of Opponents - sum the pre-ratings of all the opponents a player faced, then divide by the number of games played, which is at most seven in this tournament.
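
As a worked example of item 5, here is a minimal sketch for the first player in the raw data (Gary Hua, who faced pairs 39, 21, 18, 14, 7, 12 and 4). The ratings are copied from the pre-rating output in Step Two; the variable names are illustrative only.

```r
# Pre-ratings of Gary Hua's seven opponents (pairs 39, 21, 18, 14, 7, 12, 4),
# copied from the pre-rating output in Step Two.
opp_ratings <- c(1436, 1563, 1600, 1610, 1649, 1663, 1716)

# Sum the ratings and divide by the number of games played (seven here).
avg_opp <- round(sum(opp_ratings) / length(opp_ratings))
avg_opp
## [1] 1605
```

This agrees with the value reported for player 1 in the final table.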

We will accomplish this in a series of steps which you can find below.

  1. Reading and Cleaning the data - we will load the data and make it easy to work with
  2. Extracting the Data
  3. Creating a Dataframe

Load the Required Packages

library(stringr)
library(dplyr)
library("kableExtra")

Step One

Reading the Data

Since this is a txt file and we want to read it in as a character vector while maintaining the line structure, we use ‘readLines’.

chess <- readLines("https://raw.githubusercontent.com/jhumms/DATA607/main/chess/tournamentinfo.txt")
# Let's check out the top rows
paste(head(chess))
## [1] "-----------------------------------------------------------------------------------------" 
## [2] " Pair | Player Name                     |Total|Round|Round|Round|Round|Round|Round|Round| "
## [3] " Num  | USCF ID / Rtg (Pre->Post)       | Pts |  1  |  2  |  3  |  4  |  5  |  6  |  7  | "
## [4] "-----------------------------------------------------------------------------------------" 
## [5] "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|" 
## [6] "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"

There is one more thing to do: remove the four header lines from the chess document, since we will be creating our own column names later.

chess <- chess[-(1:4)]
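
A small sketch of how the negative index works, using shortened stand-ins for the six lines printed above (the strings are abbreviated for readability and the name `sample_lines` is hypothetical). The parentheses matter: `-(1:4)` drops elements 1 to 4, whereas `-1:4` would mix negative and positive subscripts and fail.

```r
# Shortened stand-ins for the six lines shown by head(chess) (assumed sample).
sample_lines <- c(
  "-----------------------------",
  " Pair | Player Name |Total|",
  " Num  | USCF ID     | Pts |",
  "-----------------------------",
  "    1 | GARY HUA    |6.0  |",
  "   ON | 15445895    |N:2  |"
)

# Drop the four header lines; only the player's two lines remain.
player_lines <- sample_lines[-(1:4)]
length(player_lines)
## [1] 2
```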

And now we are ready for Step Two.

Step Two

Extracting the Data

In this section we are going to use some Regex to get all the data we need.

A quick note: for the regex patterns, I made frequent use of the RegExr website.

Player ID

chess_id <- unlist(str_extract_all(unlist(chess), "\\d{1,2}(?=\\s\\|)"))
chess_id <- str_trim(chess_id, side = "right")
chess_id
##  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
## [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
## [31] "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45"
## [46] "46" "47" "48" "49" "50" "51" "52" "53" "54" "55" "56" "57" "58" "59" "60"
## [61] "61" "62" "63" "64"
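
To see why this pattern picks up only the pair number, here is a sketch that runs it on the first player's two lines from the `head(chess)` output. The lookahead `(?=\\s\\|)` requires exactly one whitespace character followed by a pipe, which rules out the round results (digits directly before a pipe) and the ratings (several spaces before the pipe).

```r
library(stringr)

# First player's two lines, copied from the head(chess) output above.
row1 <- "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"
row2 <- "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"

# One or two digits, kept only when followed by one whitespace and a pipe.
ids <- unlist(str_extract_all(c(row1, row2), "\\d{1,2}(?=\\s\\|)"))
ids
## [1] "1"
```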

Player Name

chess_player <- unlist(str_extract_all(unlist(chess), "([[:upper:]]+\\s){2,}"))
chess_player <- str_trim(chess_player, side = "right")
chess_player
##  [1] "GARY HUA"                 "DAKSHESH DARURI"         
##  [3] "ADITYA BAJAJ"             "PATRICK H SCHILLING"     
##  [5] "HANSHI ZUO"               "HANSEN SONG"             
##  [7] "GARY DEE SWATHELL"        "EZEKIEL HOUGHTON"        
##  [9] "STEFANO LEE"              "ANVIT RAO"               
## [11] "CAMERON WILLIAM MC LEMAN" "KENNETH J TACK"          
## [13] "TORRANCE HENRY JR"        "BRADLEY SHAW"            
## [15] "ZACHARY JAMES HOUGHTON"   "MIKE NIKITIN"            
## [17] "RONALD GRZEGORCZYK"       "DAVID SUNDEEN"           
## [19] "DIPANKAR ROY"             "JASON ZHENG"             
## [21] "DINH DANG BUI"            "EUGENE L MCCLURE"        
## [23] "ALAN BUI"                 "MICHAEL R ALDRICH"       
## [25] "LOREN SCHWIEBERT"         "MAX ZHU"                 
## [27] "GAURAV GIDWANI"           "SOFIA ADINA"             
## [29] "CHIEDOZIE OKORIE"         "GEORGE AVERY JONES"      
## [31] "RISHI SHETTY"             "JOSHUA PHILIP MATHEWS"   
## [33] "JADE GE"                  "MICHAEL JEFFERY THOMAS"  
## [35] "JOSHUA DAVID LEE"         "SIDDHARTH JHA"           
## [37] "AMIYATOSH PWNANANDAM"     "BRIAN LIU"               
## [39] "JOEL R HENDON"            "FOREST ZHANG"            
## [41] "KYLE WILLIAM MURPHY"      "JARED GE"                
## [43] "ROBERT GLEN VASEY"        "JUSTIN D SCHILLING"      
## [45] "DEREK YAN"                "JACOB ALEXANDER LAVALLEY"
## [47] "ERIC WRIGHT"              "DANIEL KHAIN"            
## [49] "MICHAEL J MARTIN"         "SHIVAM JHA"              
## [51] "TEJAS AYYAGARI"           "ETHAN GUO"               
## [53] "JOSE C YBARRA"            "LARRY HODGE"             
## [55] "ALEX KONG"                "MARISA RICCI"            
## [57] "MICHAEL LU"               "VIRAJ MOHILE"            
## [59] "SEAN M MC CORMICK"        "JULIA SHEN"              
## [61] "JEZZEL FARKAS"            "ASHWIN BALAJI"           
## [63] "THOMAS JOSEPH HOSMER"     "BEN LI"
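
A short sketch of what the name pattern matches. It looks for two or more runs of uppercase letters, each followed by a single whitespace character, so a lone `W` or `ON` never qualifies. The second sample line is shortened for illustration. One caveat that follows from the pattern itself: a name containing lowercase letters or punctuation would not be captured in full.

```r
library(stringr)

# Sample player lines: the first is copied from the output above,
# the second is shortened for illustration.
rows <- c(
  "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|",
  "    4 | PATRICK H SCHILLING             |5.5  |W  23|D  28|"
)

# Two or more "UPPERCASE word + one whitespace" groups in a row.
players <- unlist(str_extract_all(rows, "([[:upper:]]+\\s){2,}"))
# The match keeps one trailing space, so trim it.
players <- str_trim(players, side = "right")
players
## [1] "GARY HUA"            "PATRICK H SCHILLING"
```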

Player’s State

chess_state <- unlist(str_extract_all(unlist(chess), "([[:upper:]]){2}\\s(?=\\|)"))
chess_state <- str_trim(chess_state, side = "right")
chess_state
##  [1] "ON" "MI" "MI" "MI" "MI" "OH" "MI" "MI" "ON" "MI" "MI" "MI" "MI" "MI" "MI"
## [16] "MI" "MI" "MI" "MI" "MI" "ON" "MI" "ON" "MI" "MI" "ON" "MI" "MI" "MI" "ON"
## [31] "MI" "ON" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI"
## [46] "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI" "MI"
## [61] "ON" "MI" "MI" "MI"

Player’s Number of Points

chess_points <- unlist(str_extract_all(unlist(chess), "\\d\\.\\d"))
chess_points
##  [1] "6.0" "6.0" "6.0" "5.5" "5.5" "5.0" "5.0" "5.0" "5.0" "5.0" "4.5" "4.5"
## [13] "4.5" "4.5" "4.5" "4.0" "4.0" "4.0" "4.0" "4.0" "4.0" "4.0" "4.0" "4.0"
## [25] "3.5" "3.5" "3.5" "3.5" "3.5" "3.5" "3.5" "3.5" "3.5" "3.5" "3.5" "3.5"
## [37] "3.5" "3.0" "3.0" "3.0" "3.0" "3.0" "3.0" "3.0" "3.0" "3.0" "2.5" "2.5"
## [49] "2.5" "2.5" "2.5" "2.5" "2.0" "2.0" "2.0" "2.0" "2.0" "2.0" "2.0" "1.5"
## [61] "1.5" "1.0" "1.0" "1.0"

Player’s Pre-Rating

chess_rate <- unlist(str_extract_all(unlist(chess), "(?<!\\>\\s)(?<=\\s{1,2}|\\s\\:)(\\d{3,4}(?=\\s|P))"))
chess_rate
##  [1] "1794" "1553" "1384" "1716" "1655" "1686" "1649" "1641" "1411" "1365"
## [11] "1712" "1663" "1666" "1610" "1220" "1604" "1629" "1600" "1564" "1595"
## [21] "1563" "1555" "1363" "1229" "1745" "1579" "1552" "1507" "1602" "1522"
## [31] "1494" "1441" "1449" "1399" "1438" "1355" "980"  "1423" "1436" "1348"
## [41] "1403" "1332" "1283" "1199" "1242" "377"  "1362" "1382" "1291" "1056"
## [51] "1011" "935"  "1393" "1270" "1186" "1153" "1092" "917"  "853"  "967" 
## [61] "955"  "1530" "1175" "1163"
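
The pre-rating pattern is the densest of the set, so here is a sketch that applies it to two rating lines. The first is copied from the `head(chess)` output; the second is an illustrative line with provisional ratings (the `P` suffix) and is an assumption about the file's format rather than a quoted row. The lookbehinds reject the post-rating (preceded by `>`) and the eight-digit USCF ID (no three- or four-digit run in it is both preceded by whitespace and followed by whitespace or `P`).

```r
library(stringr)

rating_lines <- c(
  # Copied from the head(chess) output above.
  "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |",
  # Illustrative line with provisional ratings (assumed format).
  "   MI | 15142253 / R: 1641P17->1657P24  |N:3  |B    |W    |B    |B    |W    |W    |B    |"
)

pattern <- "(?<!\\>\\s)(?<=\\s{1,2}|\\s\\:)(\\d{3,4}(?=\\s|P))"
rates <- unlist(str_extract_all(rating_lines, pattern))
rates
## [1] "1794" "1641"

# The values are still character strings; convert before doing arithmetic.
as.numeric(rates)
```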

Opponents Score

For this one, we need to extract all the values and then insert them into a dataframe that we will later add everything to.

chess_score <- unlist(str_extract_all(unlist(chess), "(\\d{1,}|[[:blank:]]{1})(?=\\|)"))
# Let's convert all blanks to '0'
chess_score[chess_score==" "]  <- "0"
# Now create a DataFrame with one row per player
chess_temp <- data.frame(matrix(chess_score, nrow = 64, byrow = TRUE))

# Now keep only the seven round columns
chess_temp <- subset(chess_temp, select = 4:10)
# Coerce the characters to numeric
chess_temp <- as.data.frame(sapply(chess_temp, as.numeric))
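
To make the reshaping step concrete, here is a sketch that runs the same pattern on the first player's two lines. Each line yields ten tokens (one per pipe), so one player contributes twenty values, and positions 4 to 10 hold the opponent numbers. The sketch indexes the matrix row directly instead of using `sapply`, because `sapply` over a one-row data frame would collapse to a plain vector.

```r
library(stringr)

# First player's two lines, copied from the head(chess) output above.
row1 <- "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"
row2 <- "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"

# Digits, or a single blank, immediately before each pipe.
tokens <- unlist(str_extract_all(c(row1, row2), "(\\d{1,}|[[:blank:]]{1})(?=\\|)"))
length(tokens)
## [1] 20

# Blanks mean "no number here"; mark them as 0 before converting.
tokens[tokens == " "] <- "0"

# One row per player, then keep the seven round columns.
m <- matrix(tokens, nrow = 1, byrow = TRUE)
opponents <- as.numeric(m[1, 4:10])
opponents
## [1] 39 21 18 14  7 12  4
```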

Step Three

Creating the Dataframe

First we need to create an intermediate table to get the values together

chess_table <- data.frame(chess_id,chess_player, chess_state, chess_points, chess_rate)
chess_final <- 'NA'

chess_int <- cbind(chess_table, chess_temp)

# Let's rename our columns
colnames(chess_int) <-c("player.id", "player.name", "player.state","player.points", "player.rating","o1","o2","o3","o4","o5","o6","o7")

Now, let’s add in the average pre-rating value for the opponents.

  1. We need to make a key-value lookup (like a Python dictionary) from player ID to pre-rating
  2. We need to add new columns to hold the opponents’ ratings
  3. We need to match each opponent ID against the lookup to fill those columns
  4. We need to turn the NAs to 0 in order to do the math
  5. Finally, we need to make sure the values are numeric

# Key Value pair
mat <- chess_int %>% select(player.id,player.rating)

# Create new columns
chess_int$v1 <- 0
chess_int$v2 <- 0
chess_int$v3 <- 0
chess_int$v4 <- 0
chess_int$v5 <- 0
chess_int$v6 <- 0
chess_int$v7 <- 0

# Match the values!
chess_int$v1 <- mat$player.rating[match(chess_int$o1, mat$player.id)]
chess_int$v2 <- mat$player.rating[match(chess_int$o2, mat$player.id)]
chess_int$v3 <- mat$player.rating[match(chess_int$o3, mat$player.id)]
chess_int$v4 <- mat$player.rating[match(chess_int$o4, mat$player.id)]
chess_int$v5 <- mat$player.rating[match(chess_int$o5, mat$player.id)]
chess_int$v6 <- mat$player.rating[match(chess_int$o6, mat$player.id)]
chess_int$v7 <- mat$player.rating[match(chess_int$o7, mat$player.id)]


#Handle the NAs
chess_int[is.na(chess_int)] <- 0

# Turn the Values to Numeric

chess_int[, 13:19] <- sapply(chess_int[, 13:19], as.numeric)


# Get the final avg score
# Note: this divides by 7 for every player, so an unplayed round counts as a rating of 0
chess_int$oponent.avg <- round((rowSums(chess_int[,13:19]) / 7))
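
The code above divides by seven for every player, which lowers `oponent.avg` for anyone who played fewer than seven games. A possible alternative, sketched here on a made-up three-player lookup (all ids and ratings are hypothetical), keeps the unmatched rounds as `NA` and averages only over the games actually played.

```r
# Hypothetical lookup table: ids and ratings are made up for illustration.
ratings <- data.frame(player.id = c("1", "2", "3"),
                      player.rating = c(1500, 1600, 1700),
                      stringsAsFactors = FALSE)

# Opponent ids for one player; 0 marks a round with no game.
opp_ids <- c(2, 3, 0)

# match() returns NA for ids that are not in the lookup (here, the 0).
opp_rating <- ratings$player.rating[match(opp_ids, ratings$player.id)]

# Dividing by the fixed number of rounds treats the missing game as rating 0.
fixed_avg <- round(sum(opp_rating, na.rm = TRUE) / length(opp_ids))

# Averaging over games played ignores the missing round instead.
played_avg <- round(mean(opp_rating, na.rm = TRUE))

c(fixed = fixed_avg, played = played_avg)
## fixed played 
##  1100   1650
```

Applied to the full data, this would change the reported averages only for players with byes or unplayed rounds.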

Finally, let’s create our final table!

chess_final <- chess_int %>% select(player.id,player.name,player.state,player.points, player.rating, oponent.avg)
kable(chess_final, "html") %>% kable_styling("striped") %>% scroll_box(width = "100%", height = "350px")
player.id player.name player.state player.points player.rating oponent.avg
1 GARY HUA ON 6.0 1794 1605
2 DAKSHESH DARURI MI 6.0 1553 1469
3 ADITYA BAJAJ MI 6.0 1384 1564
4 PATRICK H SCHILLING MI 5.5 1716 1574
5 HANSHI ZUO MI 5.5 1655 1501
6 HANSEN SONG OH 5.0 1686 1519
7 GARY DEE SWATHELL MI 5.0 1649 1372
8 EZEKIEL HOUGHTON MI 5.0 1641 1468
9 STEFANO LEE ON 5.0 1411 1523
10 ANVIT RAO MI 5.0 1365 1554
11 CAMERON WILLIAM MC LEMAN MI 4.5 1712 1468
12 KENNETH J TACK MI 4.5 1663 1291
13 TORRANCE HENRY JR MI 4.5 1666 1498
14 BRADLEY SHAW MI 4.5 1610 1515
15 ZACHARY JAMES HOUGHTON MI 4.5 1220 1484
16 MIKE NIKITIN MI 4.0 1604 990
17 RONALD GRZEGORCZYK MI 4.0 1629 1499
18 DAVID SUNDEEN MI 4.0 1600 1480
19 DIPANKAR ROY MI 4.0 1564 1426
20 JASON ZHENG MI 4.0 1595 1411
21 DINH DANG BUI ON 4.0 1563 1470
22 EUGENE L MCCLURE MI 4.0 1555 1115
23 ALAN BUI ON 4.0 1363 1214
24 MICHAEL R ALDRICH MI 4.0 1229 1357
25 LOREN SCHWIEBERT MI 3.5 1745 1363
26 MAX ZHU ON 3.5 1579 1507
27 GAURAV GIDWANI MI 3.5 1552 1047
28 SOFIA ADINA MI 3.5 1507 1522
29 CHIEDOZIE OKORIE MI 3.5 1602 1126
30 GEORGE AVERY JONES ON 3.5 1522 1144
31 RISHI SHETTY MI 3.5 1494 1260
32 JOSHUA PHILIP MATHEWS ON 3.5 1441 1379
33 JADE GE MI 3.5 1449 1277
34 MICHAEL JEFFERY THOMAS MI 3.5 1399 1375
35 JOSHUA DAVID LEE MI 3.5 1438 1150
36 SIDDHARTH JHA MI 3.5 1355 1190
37 AMIYATOSH PWNANANDAM MI 3.5 980 989
38 BRIAN LIU MI 3.0 1423 1319
39 JOEL R HENDON MI 3.0 1436 1430
40 FOREST ZHANG MI 3.0 1348 1391
41 KYLE WILLIAM MURPHY MI 3.0 1403 713
42 JARED GE MI 3.0 1332 1150
43 ROBERT GLEN VASEY MI 3.0 1283 1107
44 JUSTIN D SCHILLING MI 3.0 1199 1137
45 DEREK YAN MI 3.0 1242 1152
46 JACOB ALEXANDER LAVALLEY MI 3.0 377 1358
47 ERIC WRIGHT MI 2.5 1362 1392
48 DANIEL KHAIN MI 2.5 1382 968
49 MICHAEL J MARTIN MI 2.5 1291 918
50 SHIVAM JHA MI 2.5 1056 1111
51 TEJAS AYYAGARI MI 2.5 1011 1356
52 ETHAN GUO MI 2.5 935 1495
53 JOSE C YBARRA MI 2.0 1393 577
54 LARRY HODGE MI 2.0 1270 1034
55 ALEX KONG MI 2.0 1186 1205
56 MARISA RICCI MI 2.0 1153 1010
57 MICHAEL LU MI 2.0 1092 1168
58 VIRAJ MOHILE MI 2.0 917 1192
59 SEAN M MC CORMICK MI 2.0 853 1131
60 JULIA SHEN MI 1.5 967 950
61 JEZZEL FARKAS ON 1.5 955 1327
62 ASHWIN BALAJI MI 1.0 1530 169
63 THOMAS JOSEPH HOSMER MI 1.0 1175 964
64 BEN LI MI 1.0 1163 1263

Now that we have the final table, let’s write it out as a CSV.

write.csv(chess_final, file = "chess_tournament.csv")
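
By default `write.csv` also writes the row names as an unnamed first column. If that extra column is not wanted, `row.names = FALSE` suppresses it. A small sketch on a two-row stand-in data frame (the data and the temporary file are for illustration only):

```r
# Two-row stand-in for the final table (illustrative values).
demo <- data.frame(player.id = c("1", "2"),
                   player.points = c("6.0", "6.0"),
                   stringsAsFactors = FALSE)

# Write to a temporary file without the row-name column.
tmp <- tempfile(fileext = ".csv")
write.csv(demo, file = tmp, row.names = FALSE)

# Read it back to confirm only the two intended columns are present.
names(read.csv(tmp))
## [1] "player.id"     "player.points"
```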

Extra Credit

Let’s get the result of each match and compare it to the expected outcome of the match.

First, let’s get the match results in a nice clean format.
The process is described here:
1. Pull all data in between pipes
2. Put that data in a dataframe
3. Remove the unneeded columns
4. Remove the numeric data (the sapply function returns a matrix)
5. Convert the matrix back to a dataframe
6. Trim the white spaces
7. Add in column names
8. Remove every other row, since it holds data we do not need (we need 64 rows)

# 1
chess_e <- unlist(str_extract_all(unlist(chess), '(?<=\\|)[^|]++(?=\\|)'))
# 2
chess_e <- as.data.frame(matrix(chess_e, ncol = 9, byrow = T))
# 3
chess_e <- subset(chess_e, select = 3:9)
# 4
chess_e <-sapply(chess_e,function(x) gsub("[0-9]","",as.character(x)))
# 5
chess_e <- as.data.frame(chess_e)
# 6
chess_e<- chess_e %>%  mutate_all(trimws)
# 7
colnames(chess_e) <-c("r1","r2","r3","r4","r5","r6","r7")
# 8
toDelete <- seq(1, nrow(chess_e), 2)
chess_e <- chess_e[toDelete,]
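
The eight steps can be traced on a single player. This sketch applies the same between-pipes pattern to the first player's two lines: each line yields nine fields, the round results sit in fields 3 to 9, and only the first of the two lines carries the W/L/D letters.

```r
library(stringr)

# First player's two lines, copied from the head(chess) output above.
row1 <- "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"
row2 <- "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"

# Everything between two pipes: nine fields per line.
fields <- unlist(str_extract_all(c(row1, row2), "(?<=\\|)[^|]++(?=\\|)"))
m <- matrix(fields, ncol = 9, byrow = TRUE)

# Keep the round columns of the first line, strip digits, trim spaces.
results <- trimws(gsub("[0-9]", "", m[1, 3:9]))
results
## [1] "W" "W" "W" "W" "W" "D" "D"
```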

Great, now that we have the data we need, let’s merge it into a larger dataframe that has all the ratings of each player and their opponents (what we found above).

Then we have a couple of things to calculate:
1. First, we need to merge the data, then do some Elo math to compute the odds for each game.
2. We need to compare those odds to the results.

Before we begin, let’s go over the formula we need for the Elo rating system.
- How to predict the outcome of a match for player A against player B: Ea = 1 / (1 + 10^((Rb - Ra) / 400))
- Basically, this compares the ratings of the two players to determine who has the better odds
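
A minimal sketch of the formula as a function (the name `elo_expected` is hypothetical). Two equally rated players each get 0.5, and a 400-point advantage gives roughly 0.91.

```r
# Expected score of a player rated ra against an opponent rated rb.
elo_expected <- function(ra, rb) {
  1 / (1 + 10^((rb - ra) / 400))
}

elo_expected(1500, 1500)
## [1] 0.5

# Gary Hua (1794) against his first-round opponent (1436), as a percentage.
round(100 * elo_expected(1794, 1436))
## [1] 89
```

The two players' expected scores always sum to one, so `elo_expected(1436, 1794)` is the complement of the value above.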

chess_ec <- chess_int %>% select(player.id, player.rating,v1,v2,v3,v4,v5,v6,v7)

chess_full <- cbind(chess_ec, chess_e)

# Turn the Values to Numeric

chess_full[, 2:9] <- sapply(chess_full[, 2:9], as.numeric)



chess_full$result1 <- 0
chess_full$result2 <- 0
chess_full$result3 <- 0
chess_full$result4 <- 0
chess_full$result5 <- 0
chess_full$result6 <- 0
chess_full$result7 <- 0


# Express each expected score as a whole-number percentage to make it easier to read;
# 101 marks a round with no rated opponent
chess_full$result1 <- ifelse(chess_full$v1 >0, round((1/ (1+(10^((chess_full$v1-chess_full$player.rating)/400)))*100)),101)

chess_full$result2 <- ifelse(chess_full$v2 >0,round((1/ (1+(10^((chess_full$v2-chess_full$player.rating)/400)))*100)),101)

chess_full$result3 <- ifelse(chess_full$v3 >0,round((1/ (1+(10^((chess_full$v3-chess_full$player.rating)/400)))*100)),101)

chess_full$result4 <- ifelse(chess_full$v4 >0,round((1/ (1+(10^((chess_full$v4-chess_full$player.rating)/400)))*100)),101)

chess_full$result5 <- ifelse(chess_full$v5 >0,round((1/ (1+(10^((chess_full$v5-chess_full$player.rating)/400)))*100)),101)

chess_full$result6 <- ifelse(chess_full$v6 >0,round((1/ (1+(10^((chess_full$v6-chess_full$player.rating)/400)))*100)),101)

chess_full$result7 <- ifelse(chess_full$v7 >0,round((1/ (1+(10^((chess_full$v7-chess_full$player.rating)/400)))*100)),101)

#Now make sure they are numerical
chess_full[, 17:23] <- sapply(chess_full[, 17:23], as.numeric)

Now that we have all the data we need, we are going to analyze and compare the results to what was the expected outcome. In order to do that we need to learn a little more about the Elo system.

The expected score that the rating system produces is the probability of a win plus half the probability of a draw, so it does not name a single outcome on its own. There are more complicated ways to handle this, but to keep it simple, I will categorize each range of probability by the outcome with the highest chance.

  1. Below 40%, there is a higher chance of a loss.
  2. 40-60%, there is a higher chance of a draw.
  3. Above 60%, there is a higher chance of a win.

Thanks to this Stack Exchange thread for the odds (although it is not the only possible distribution, its assumption is that the ratings are accurate for each player).

Below, we are going to do several things.
1. Calculate the odds, as described above
2. Convert the results into numbers (leaving out everything but W, L, D)
3. Compare the expected outcome vs. actual outcome as Expected, Better, or Worse
4. Count the results and add them back to the main data frame.
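
Before the full column-by-column code, the steps above can be sketched for a single player using vectors. The ratings and results are those of player 1 (Gary Hua) from the earlier output; the helper name `elo_expected` and the other variable names are hypothetical.

```r
# Expected score of a player rated ra against an opponent rated rb.
elo_expected <- function(ra, rb) 1 / (1 + 10^((rb - ra) / 400))

player    <- 1794
opponents <- c(1436, 1563, 1600, 1610, 1649, 1663, 1716)
results   <- c("W", "W", "W", "W", "W", "D", "D")

# 1. Odds as whole-number percentages.
pct <- round(100 * elo_expected(player, opponents))

# Bucket the odds: below 40 -> loss (0), 40 to 60 -> draw (0.5), above 60 -> win (1).
expected <- ifelse(pct < 40, 0, ifelse(pct <= 60, 0.5, 1))

# 2. Actual results as numbers.
actual <- unname(c(W = 1, D = 0.5, L = 0)[results])

# 3. Compare actual against expected.
compare <- ifelse(actual == expected, "Expected",
                  ifelse(actual > expected, "Better", "Worse"))

# 4. Count each category.
table(factor(compare, levels = c("Expected", "Better", "Worse")))
```

For this player the counts come out as 5 Expected, 0 Better and 2 Worse, because every game was a predicted win and the two draws fall short of that.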

chess_full$expect1 <- ifelse(chess_full$result1 < 40, 0, 
                             ifelse(chess_full$result1 <= 60, .5, 
                                     ifelse(chess_full$result1 <= 100 ,1, NA)))

chess_full$expect2 <- ifelse(chess_full$result2 < 40, 0, 
                             ifelse(chess_full$result2 <= 60, .5, 
                                     ifelse(chess_full$result2 <= 100 ,1, NA)))

chess_full$expect3 <- ifelse(chess_full$result3 < 40, 0, 
                             ifelse(chess_full$result3 <= 60, .5, 
                                     ifelse(chess_full$result3 <= 100 ,1, NA)))

chess_full$expect4 <- ifelse(chess_full$result4 < 40, 0,
                             ifelse(chess_full$result4 <= 60, .5, 
                                     ifelse(chess_full$result4 <= 100 ,1, NA)))

chess_full$expect5 <- ifelse(chess_full$result5 < 40, 0,
                             ifelse(chess_full$result5 <= 60, .5, 
                                     ifelse(chess_full$result5 <= 100 ,1, NA)))

chess_full$expect6 <- ifelse(chess_full$result6 < 40, 0, 
                             ifelse(chess_full$result6 <= 60, .5, 
                                     ifelse(chess_full$result6 <= 100 ,1, NA)))

chess_full$expect7 <- ifelse(chess_full$result7 < 40, 0, 
                             ifelse(chess_full$result7 <= 60, .5, 
                                     ifelse(chess_full$result7 <= 100 ,1, NA)))
##################################################################################################
chess_full$res1 <- ifelse(chess_full$r1 == "W", 1, 
                             ifelse(chess_full$r1 == "D", .5, 
                                     ifelse(chess_full$r1 == "L" ,0, NA)))

chess_full$res2 <- ifelse(chess_full$r2 == "W", 1, 
                             ifelse(chess_full$r2 == "D", .5, 
                                     ifelse(chess_full$r2 == "L" ,0, NA)))

chess_full$res3 <- ifelse(chess_full$r3 == "W", 1, 
                             ifelse(chess_full$r3 == "D", .5, 
                                     ifelse(chess_full$r3 == "L" ,0, NA)))

chess_full$res4 <- ifelse(chess_full$r4 == "W", 1, 
                             ifelse(chess_full$r4 == "D", .5, 
                                     ifelse(chess_full$r4 == "L" ,0, NA)))

chess_full$res5 <- ifelse(chess_full$r5 == "W", 1, 
                             ifelse(chess_full$r5 == "D", .5, 
                                     ifelse(chess_full$r5 == "L" ,0, NA)))

chess_full$res6 <- ifelse(chess_full$r6 == "W", 1, 
                             ifelse(chess_full$r6 == "D", .5, 
                                     ifelse(chess_full$r6 == "L" ,0, NA)))

chess_full$res7 <- ifelse(chess_full$r7 == "W", 1, 
                             ifelse(chess_full$r7 == "D", .5, 
                                     ifelse(chess_full$r7 == "L" ,0, NA)))
##################################################################################################
chess_full[, 24:37] <- sapply(chess_full[, 24:37], as.numeric)
  
  
chess_full$compare1 <- ifelse(chess_full$res1 == chess_full$expect1, "Expected",
                              ifelse(chess_full$res1 >  chess_full$expect1, "Better","Worse"))


chess_full$compare2 <- ifelse(chess_full$res2 == chess_full$expect2, "Expected",
                              ifelse(chess_full$res2 >  chess_full$expect2, "Better","Worse"))

chess_full$compare3 <- ifelse(chess_full$res3 == chess_full$expect3, "Expected",
                              ifelse(chess_full$res3 >  chess_full$expect3, "Better","Worse"))

chess_full$compare4 <- ifelse(chess_full$res4 == chess_full$expect4, "Expected",
                              ifelse(chess_full$res4 >  chess_full$expect4, "Better","Worse"))

chess_full$compare5 <- ifelse(chess_full$res5 == chess_full$expect5, "Expected",
                              ifelse(chess_full$res5 >  chess_full$expect5, "Better","Worse"))

chess_full$compare6 <- ifelse(chess_full$res6 == chess_full$expect6, "Expected",
                              ifelse(chess_full$res6 >  chess_full$expect6, "Better","Worse"))

chess_full$compare7 <- ifelse(chess_full$res7 == chess_full$expect7, "Expected",
                              ifelse(chess_full$res7 >  chess_full$expect7, "Better","Worse"))
  
chess_full[, 38:44] <- sapply(chess_full[, 38:44], as.factor)

##################################################################################################

chess_full$player.expected <- apply(chess_full[38:44], 1, function(x) length(which(x=="Expected")))

chess_full$player.better <- apply(chess_full[38:44], 1, function(x) length(which(x=="Better")))

chess_full$player.worse <- apply(chess_full[38:44], 1, function(x) length(which(x=="Worse")))

chess_full[, 45:47] <- sapply(chess_full[, 45:47], as.numeric)


chess_full$perc.expected <- round((chess_full$player.expected / (chess_full$player.better+ chess_full$player.expected+chess_full$player.worse)*100))

chess_full$perc.better <- round((chess_full$player.better / (chess_full$player.better+ chess_full$player.expected+chess_full$player.worse)*100))

chess_full$perc.worse <- round((chess_full$player.worse / (chess_full$player.better+ chess_full$player.expected+chess_full$player.worse)*100))


##################################################################################################
extra_crd <- chess_full %>% select(player.id, player.expected, player.better, player.worse,perc.expected, perc.better, perc.worse)
chess_ec_final <- merge(chess_final,extra_crd,by="player.id")
kable(chess_ec_final, "html") %>% kable_styling("striped") %>% scroll_box(width = "100%", height = "350px")
player.id player.name player.state player.points player.rating oponent.avg player.expected player.better player.worse perc.expected perc.better perc.worse
1 GARY HUA ON 6.0 1794 1605 5 0 2 71 0 29
10 ANVIT RAO MI 5.0 1365 1554 2 5 0 29 71 0
11 CAMERON WILLIAM MC LEMAN MI 4.5 1712 1468 3 1 3 43 14 43
12 KENNETH J TACK MI 4.5 1663 1291 4 1 1 67 17 17
13 TORRANCE HENRY JR MI 4.5 1666 1498 5 0 2 71 0 29
14 BRADLEY SHAW MI 4.5 1610 1515 5 1 1 71 14 14
15 ZACHARY JAMES HOUGHTON MI 4.5 1220 1484 2 5 0 29 71 0
16 MIKE NIKITIN MI 4.0 1604 990 3 0 2 60 0 40
17 RONALD GRZEGORCZYK MI 4.0 1629 1499 3 1 3 43 14 43
18 DAVID SUNDEEN MI 4.0 1600 1480 5 0 2 71 0 29
19 DIPANKAR ROY MI 4.0 1564 1426 5 1 1 71 14 14
2 DAKSHESH DARURI MI 6.0 1553 1469 3 4 0 43 57 0
20 JASON ZHENG MI 4.0 1595 1411 4 0 3 57 0 43
21 DINH DANG BUI ON 4.0 1563 1470 6 0 1 86 0 14
22 EUGENE L MCCLURE MI 4.0 1555 1115 3 0 3 50 0 50
23 ALAN BUI ON 4.0 1363 1214 7 0 0 100 0 0
24 MICHAEL R ALDRICH MI 4.0 1229 1357 4 3 0 57 43 0
25 LOREN SCHWIEBERT MI 3.5 1745 1363 3 0 4 43 0 57
26 MAX ZHU ON 3.5 1579 1507 4 1 2 57 14 29
27 GAURAV GIDWANI MI 3.5 1552 1047 6 0 0 100 0 0
28 SOFIA ADINA MI 3.5 1507 1522 4 2 1 57 29 14
29 CHIEDOZIE OKORIE MI 3.5 1602 1126 3 1 2 50 17 33
3 ADITYA BAJAJ MI 6.0 1384 1564 2 5 0 29 71 0
30 GEORGE AVERY JONES ON 3.5 1522 1144 3 0 4 43 0 57
31 RISHI SHETTY MI 3.5 1494 1260 3 1 3 43 14 43
32 JOSHUA PHILIP MATHEWS ON 3.5 1441 1379 6 1 0 86 14 0
33 JADE GE MI 3.5 1449 1277 5 0 2 71 0 29
34 MICHAEL JEFFERY THOMAS MI 3.5 1399 1375 4 2 1 57 29 14
35 JOSHUA DAVID LEE MI 3.5 1438 1150 3 1 3 43 14 43
36 SIDDHARTH JHA MI 3.5 1355 1190 4 2 0 67 33 0
37 AMIYATOSH PWNANANDAM MI 3.5 980 989 3 2 0 60 40 0
38 BRIAN LIU MI 3.0 1423 1319 2 3 1 33 50 17
39 JOEL R HENDON MI 3.0 1436 1430 6 0 1 86 0 14
4 PATRICK H SCHILLING MI 5.5 1716 1574 5 1 1 71 14 14
40 FOREST ZHANG MI 3.0 1348 1391 6 1 0 86 14 0
41 KYLE WILLIAM MURPHY MI 3.0 1403 713 4 0 0 100 0 0
42 JARED GE MI 3.0 1332 1150 3 0 4 43 0 57
43 ROBERT GLEN VASEY MI 3.0 1283 1107 5 0 2 71 0 29
44 JUSTIN D SCHILLING MI 3.0 1199 1137 4 1 1 67 17 17
45 DEREK YAN MI 3.0 1242 1152 3 1 3 43 14 43
46 JACOB ALEXANDER LAVALLEY MI 3.0 377 1358 4 3 0 57 43 0
47 ERIC WRIGHT MI 2.5 1362 1392 6 0 1 86 0 14
48 DANIEL KHAIN MI 2.5 1382 968 3 0 2 60 0 40
49 MICHAEL J MARTIN MI 2.5 1291 918 3 0 2 60 0 40
5 HANSHI ZUO MI 5.5 1655 1501 5 2 0 71 29 0
50 SHIVAM JHA MI 2.5 1056 1111 5 1 0 83 17 0
51 TEJAS AYYAGARI MI 2.5 1011 1356 4 3 0 57 43 0
52 ETHAN GUO MI 2.5 935 1495 3 4 0 43 57 0
53 JOSE C YBARRA MI 2.0 1393 577 2 0 1 67 0 33
54 LARRY HODGE MI 2.0 1270 1034 3 0 3 50 0 50
55 ALEX KONG MI 2.0 1186 1205 5 1 0 83 17 0
56 MARISA RICCI MI 2.0 1153 1010 4 1 0 80 20 0
57 MICHAEL LU MI 2.0 1092 1168 4 1 1 67 17 17
58 VIRAJ MOHILE MI 2.0 917 1192 5 1 0 83 17 0
59 SEAN M MC CORMICK MI 2.0 853 1131 5 1 0 83 17 0
6 HANSEN SONG OH 5.0 1686 1519 4 0 3 57 0 43
60 JULIA SHEN MI 1.5 967 950 3 2 0 60 40 0
61 JEZZEL FARKAS ON 1.5 955 1327 4 2 1 57 29 14
62 ASHWIN BALAJI MI 1.0 1530 169 1 0 0 100 0 0
63 THOMAS JOSEPH HOSMER MI 1.0 1175 964 3 1 1 60 20 20
64 BEN LI MI 1.0 1163 1263 4 2 1 57 29 14
7 GARY DEE SWATHELL MI 5.0 1649 1372 4 2 1 57 29 14
8 EZEKIEL HOUGHTON MI 5.0 1641 1468 5 0 2 71 0 29
9 STEFANO LEE ON 5.0 1411 1523 3 4 0 43 57 0
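
The merged table above is ordered 1, 10, 11, ... rather than 1, 2, 3, .... A likely explanation is that `player.id` is stored as character and `merge()` sorts its result on the key column, so the ids are compared as text. A small sketch with made-up data frames shows the effect and one way to restore numeric order.

```r
# Made-up data frames with character ids (illustrative only).
a <- data.frame(player.id = c("1", "2", "10"), x = c("a", "b", "c"),
                stringsAsFactors = FALSE)
b <- data.frame(player.id = c("1", "2", "10"), y = c(5, 3, 2),
                stringsAsFactors = FALSE)

# merge() sorts on the key; character keys sort as text.
merged <- merge(a, b, by = "player.id")
merged$player.id
## [1] "1"  "10" "2"

# Reorder by the numeric value of the id.
sorted <- merged[order(as.integer(merged$player.id)), ]
sorted$player.id
## [1] "1"  "2"  "10"
```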

And finally, let’s find out who played as expected, who played better, and who played worse.

Expected

kable(chess_ec_final %>% arrange(desc(perc.expected)), "html") %>% kable_styling("striped") %>% scroll_box(width = "100%", height = "350px") 
player.id player.name player.state player.points player.rating oponent.avg player.expected player.better player.worse perc.expected perc.better perc.worse
23 ALAN BUI ON 4.0 1363 1214 7 0 0 100 0 0
27 GAURAV GIDWANI MI 3.5 1552 1047 6 0 0 100 0 0
41 KYLE WILLIAM MURPHY MI 3.0 1403 713 4 0 0 100 0 0
62 ASHWIN BALAJI MI 1.0 1530 169 1 0 0 100 0 0
21 DINH DANG BUI ON 4.0 1563 1470 6 0 1 86 0 14
32 JOSHUA PHILIP MATHEWS ON 3.5 1441 1379 6 1 0 86 14 0
39 JOEL R HENDON MI 3.0 1436 1430 6 0 1 86 0 14
40 FOREST ZHANG MI 3.0 1348 1391 6 1 0 86 14 0
47 ERIC WRIGHT MI 2.5 1362 1392 6 0 1 86 0 14
50 SHIVAM JHA MI 2.5 1056 1111 5 1 0 83 17 0
55 ALEX KONG MI 2.0 1186 1205 5 1 0 83 17 0
58 VIRAJ MOHILE MI 2.0 917 1192 5 1 0 83 17 0
59 SEAN M MC CORMICK MI 2.0 853 1131 5 1 0 83 17 0
56 MARISA RICCI MI 2.0 1153 1010 4 1 0 80 20 0
1 GARY HUA ON 6.0 1794 1605 5 0 2 71 0 29
13 TORRANCE HENRY JR MI 4.5 1666 1498 5 0 2 71 0 29
14 BRADLEY SHAW MI 4.5 1610 1515 5 1 1 71 14 14
18 DAVID SUNDEEN MI 4.0 1600 1480 5 0 2 71 0 29
19 DIPANKAR ROY MI 4.0 1564 1426 5 1 1 71 14 14
33 JADE GE MI 3.5 1449 1277 5 0 2 71 0 29
4 PATRICK H SCHILLING MI 5.5 1716 1574 5 1 1 71 14 14
43 ROBERT GLEN VASEY MI 3.0 1283 1107 5 0 2 71 0 29
5 HANSHI ZUO MI 5.5 1655 1501 5 2 0 71 29 0
8 EZEKIEL HOUGHTON MI 5.0 1641 1468 5 0 2 71 0 29
12 KENNETH J TACK MI 4.5 1663 1291 4 1 1 67 17 17
36 SIDDHARTH JHA MI 3.5 1355 1190 4 2 0 67 33 0
44 JUSTIN D SCHILLING MI 3.0 1199 1137 4 1 1 67 17 17
53 JOSE C YBARRA MI 2.0 1393 577 2 0 1 67 0 33
57 MICHAEL LU MI 2.0 1092 1168 4 1 1 67 17 17
16 MIKE NIKITIN MI 4.0 1604 990 3 0 2 60 0 40
37 AMIYATOSH PWNANANDAM MI 3.5 980 989 3 2 0 60 40 0
48 DANIEL KHAIN MI 2.5 1382 968 3 0 2 60 0 40
49 MICHAEL J MARTIN MI 2.5 1291 918 3 0 2 60 0 40
60 JULIA SHEN MI 1.5 967 950 3 2 0 60 40 0
63 THOMAS JOSEPH HOSMER MI 1.0 1175 964 3 1 1 60 20 20
20 JASON ZHENG MI 4.0 1595 1411 4 0 3 57 0 43
24 MICHAEL R ALDRICH MI 4.0 1229 1357 4 3 0 57 43 0
26 MAX ZHU ON 3.5 1579 1507 4 1 2 57 14 29
28 SOFIA ADINA MI 3.5 1507 1522 4 2 1 57 29 14
34 MICHAEL JEFFERY THOMAS MI 3.5 1399 1375 4 2 1 57 29 14
46 JACOB ALEXANDER LAVALLEY MI 3.0 377 1358 4 3 0 57 43 0
51 TEJAS AYYAGARI MI 2.5 1011 1356 4 3 0 57 43 0
6 HANSEN SONG OH 5.0 1686 1519 4 0 3 57 0 43
61 JEZZEL FARKAS ON 1.5 955 1327 4 2 1 57 29 14
64 BEN LI MI 1.0 1163 1263 4 2 1 57 29 14
7 GARY DEE SWATHELL MI 5.0 1649 1372 4 2 1 57 29 14
22 EUGENE L MCCLURE MI 4.0 1555 1115 3 0 3 50 0 50
29 CHIEDOZIE OKORIE MI 3.5 1602 1126 3 1 2 50 17 33
54 LARRY HODGE MI 2.0 1270 1034 3 0 3 50 0 50
11 CAMERON WILLIAM MC LEMAN MI 4.5 1712 1468 3 1 3 43 14 43
17 RONALD GRZEGORCZYK MI 4.0 1629 1499 3 1 3 43 14 43
2 DAKSHESH DARURI MI 6.0 1553 1469 3 4 0 43 57 0
25 LOREN SCHWIEBERT MI 3.5 1745 1363 3 0 4 43 0 57
30 GEORGE AVERY JONES ON 3.5 1522 1144 3 0 4 43 0 57
31 RISHI SHETTY MI 3.5 1494 1260 3 1 3 43 14 43
35 JOSHUA DAVID LEE MI 3.5 1438 1150 3 1 3 43 14 43
42 JARED GE MI 3.0 1332 1150 3 0 4 43 0 57
45 DEREK YAN MI 3.0 1242 1152 3 1 3 43 14 43
52 ETHAN GUO MI 2.5 935 1495 3 4 0 43 57 0
9 STEFANO LEE ON 5.0 1411 1523 3 4 0 43 57 0
38 BRIAN LIU MI 3.0 1423 1319 2 3 1 33 50 17
10 ANVIT RAO MI 5.0 1365 1554 2 5 0 29 71 0
15 ZACHARY JAMES HOUGHTON MI 4.5 1220 1484 2 5 0 29 71 0
3 ADITYA BAJAJ MI 6.0 1384 1564 2 5 0 29 71 0

Better

kable(chess_ec_final %>% arrange(desc(perc.better)), "html") %>% kable_styling("striped") %>% scroll_box(width = "100%", height = "350px")
player.id player.name player.state player.points player.rating oponent.avg player.expected player.better player.worse perc.expected perc.better perc.worse
10 ANVIT RAO MI 5.0 1365 1554 2 5 0 29 71 0
15 ZACHARY JAMES HOUGHTON MI 4.5 1220 1484 2 5 0 29 71 0
3 ADITYA BAJAJ MI 6.0 1384 1564 2 5 0 29 71 0
2 DAKSHESH DARURI MI 6.0 1553 1469 3 4 0 43 57 0
52 ETHAN GUO MI 2.5 935 1495 3 4 0 43 57 0
9 STEFANO LEE ON 5.0 1411 1523 3 4 0 43 57 0
38 BRIAN LIU MI 3.0 1423 1319 2 3 1 33 50 17
24 MICHAEL R ALDRICH MI 4.0 1229 1357 4 3 0 57 43 0
46 JACOB ALEXANDER LAVALLEY MI 3.0 377 1358 4 3 0 57 43 0
51 TEJAS AYYAGARI MI 2.5 1011 1356 4 3 0 57 43 0
37 AMIYATOSH PWNANANDAM MI 3.5 980 989 3 2 0 60 40 0
60 JULIA SHEN MI 1.5 967 950 3 2 0 60 40 0
36 SIDDHARTH JHA MI 3.5 1355 1190 4 2 0 67 33 0
28 SOFIA ADINA MI 3.5 1507 1522 4 2 1 57 29 14
34 MICHAEL JEFFERY THOMAS MI 3.5 1399 1375 4 2 1 57 29 14
5 HANSHI ZUO MI 5.5 1655 1501 5 2 0 71 29 0
61 JEZZEL FARKAS ON 1.5 955 1327 4 2 1 57 29 14
64 BEN LI MI 1.0 1163 1263 4 2 1 57 29 14
7 GARY DEE SWATHELL MI 5.0 1649 1372 4 2 1 57 29 14
56 MARISA RICCI MI 2.0 1153 1010 4 1 0 80 20 0
63 THOMAS JOSEPH HOSMER MI 1.0 1175 964 3 1 1 60 20 20
12 KENNETH J TACK MI 4.5 1663 1291 4 1 1 67 17 17
29 CHIEDOZIE OKORIE MI 3.5 1602 1126 3 1 2 50 17 33
44 JUSTIN D SCHILLING MI 3.0 1199 1137 4 1 1 67 17 17
50 SHIVAM JHA MI 2.5 1056 1111 5 1 0 83 17 0
55 ALEX KONG MI 2.0 1186 1205 5 1 0 83 17 0
57 MICHAEL LU MI 2.0 1092 1168 4 1 1 67 17 17
58 VIRAJ MOHILE MI 2.0 917 1192 5 1 0 83 17 0
59 SEAN M MC CORMICK MI 2.0 853 1131 5 1 0 83 17 0
11 CAMERON WILLIAM MC LEMAN MI 4.5 1712 1468 3 1 3 43 14 43
14 BRADLEY SHAW MI 4.5 1610 1515 5 1 1 71 14 14
17 RONALD GRZEGORCZYK MI 4.0 1629 1499 3 1 3 43 14 43
19 DIPANKAR ROY MI 4.0 1564 1426 5 1 1 71 14 14
26 MAX ZHU ON 3.5 1579 1507 4 1 2 57 14 29
31 RISHI SHETTY MI 3.5 1494 1260 3 1 3 43 14 43
32 JOSHUA PHILIP MATHEWS ON 3.5 1441 1379 6 1 0 86 14 0
35 JOSHUA DAVID LEE MI 3.5 1438 1150 3 1 3 43 14 43
4 PATRICK H SCHILLING MI 5.5 1716 1574 5 1 1 71 14 14
40 FOREST ZHANG MI 3.0 1348 1391 6 1 0 86 14 0
45 DEREK YAN MI 3.0 1242 1152 3 1 3 43 14 43
1 GARY HUA ON 6.0 1794 1605 5 0 2 71 0 29
13 TORRANCE HENRY JR MI 4.5 1666 1498 5 0 2 71 0 29
16 MIKE NIKITIN MI 4.0 1604 990 3 0 2 60 0 40
18 DAVID SUNDEEN MI 4.0 1600 1480 5 0 2 71 0 29
20 JASON ZHENG MI 4.0 1595 1411 4 0 3 57 0 43
21 DINH DANG BUI ON 4.0 1563 1470 6 0 1 86 0 14
22 EUGENE L MCCLURE MI 4.0 1555 1115 3 0 3 50 0 50
23 ALAN BUI ON 4.0 1363 1214 7 0 0 100 0 0
25 LOREN SCHWIEBERT MI 3.5 1745 1363 3 0 4 43 0 57
27 GAURAV GIDWANI MI 3.5 1552 1047 6 0 0 100 0 0
30 GEORGE AVERY JONES ON 3.5 1522 1144 3 0 4 43 0 57
33 JADE GE MI 3.5 1449 1277 5 0 2 71 0 29
39 JOEL R HENDON MI 3.0 1436 1430 6 0 1 86 0 14
41 KYLE WILLIAM MURPHY MI 3.0 1403 713 4 0 0 100 0 0
42 JARED GE MI 3.0 1332 1150 3 0 4 43 0 57
43 ROBERT GLEN VASEY MI 3.0 1283 1107 5 0 2 71 0 29
47 ERIC WRIGHT MI 2.5 1362 1392 6 0 1 86 0 14
48 DANIEL KHAIN MI 2.5 1382 968 3 0 2 60 0 40
49 MICHAEL J MARTIN MI 2.5 1291 918 3 0 2 60 0 40
53 JOSE C YBARRA MI 2.0 1393 577 2 0 1 67 0 33
54 LARRY HODGE MI 2.0 1270 1034 3 0 3 50 0 50
6 HANSEN SONG OH 5.0 1686 1519 4 0 3 57 0 43
62 ASHWIN BALAJI MI 1.0 1530 169 1 0 0 100 0 0
8 EZEKIEL HOUGHTON MI 5.0 1641 1468 5 0 2 71 0 29

Worse

kable(chess_ec_final %>% arrange(desc(perc.worse)), "html") %>% kable_styling("striped") %>% scroll_box(width = "100%", height = "350px")
player.id player.name player.state player.points player.rating oponent.avg player.expected player.better player.worse perc.expected perc.better perc.worse
25 LOREN SCHWIEBERT MI 3.5 1745 1363 3 0 4 43 0 57
30 GEORGE AVERY JONES ON 3.5 1522 1144 3 0 4 43 0 57
42 JARED GE MI 3.0 1332 1150 3 0 4 43 0 57
22 EUGENE L MCCLURE MI 4.0 1555 1115 3 0 3 50 0 50
54 LARRY HODGE MI 2.0 1270 1034 3 0 3 50 0 50
11 CAMERON WILLIAM MC LEMAN MI 4.5 1712 1468 3 1 3 43 14 43
17 RONALD GRZEGORCZYK MI 4.0 1629 1499 3 1 3 43 14 43
20 JASON ZHENG MI 4.0 1595 1411 4 0 3 57 0 43
31 RISHI SHETTY MI 3.5 1494 1260 3 1 3 43 14 43
35 JOSHUA DAVID LEE MI 3.5 1438 1150 3 1 3 43 14 43
45 DEREK YAN MI 3.0 1242 1152 3 1 3 43 14 43
6 HANSEN SONG OH 5.0 1686 1519 4 0 3 57 0 43
16 MIKE NIKITIN MI 4.0 1604 990 3 0 2 60 0 40
48 DANIEL KHAIN MI 2.5 1382 968 3 0 2 60 0 40
49 MICHAEL J MARTIN MI 2.5 1291 918 3 0 2 60 0 40
29 CHIEDOZIE OKORIE MI 3.5 1602 1126 3 1 2 50 17 33
53 JOSE C YBARRA MI 2.0 1393 577 2 0 1 67 0 33
1 GARY HUA ON 6.0 1794 1605 5 0 2 71 0 29
13 TORRANCE HENRY JR MI 4.5 1666 1498 5 0 2 71 0 29
18 DAVID SUNDEEN MI 4.0 1600 1480 5 0 2 71 0 29
26 MAX ZHU ON 3.5 1579 1507 4 1 2 57 14 29
33 JADE GE MI 3.5 1449 1277 5 0 2 71 0 29
43 ROBERT GLEN VASEY MI 3.0 1283 1107 5 0 2 71 0 29
8 EZEKIEL HOUGHTON MI 5.0 1641 1468 5 0 2 71 0 29
63 THOMAS JOSEPH HOSMER MI 1.0 1175 964 3 1 1 60 20 20
12 KENNETH J TACK MI 4.5 1663 1291 4 1 1 67 17 17
38 BRIAN LIU MI 3.0 1423 1319 2 3 1 33 50 17
44 JUSTIN D SCHILLING MI 3.0 1199 1137 4 1 1 67 17 17
57 MICHAEL LU MI 2.0 1092 1168 4 1 1 67 17 17
14 BRADLEY SHAW MI 4.5 1610 1515 5 1 1 71 14 14
19 DIPANKAR ROY MI 4.0 1564 1426 5 1 1 71 14 14
21 DINH DANG BUI ON 4.0 1563 1470 6 0 1 86 0 14
28 SOFIA ADINA MI 3.5 1507 1522 4 2 1 57 29 14
34 MICHAEL JEFFERY THOMAS MI 3.5 1399 1375 4 2 1 57 29 14
39 JOEL R HENDON MI 3.0 1436 1430 6 0 1 86 0 14
4 PATRICK H SCHILLING MI 5.5 1716 1574 5 1 1 71 14 14
47 ERIC WRIGHT MI 2.5 1362 1392 6 0 1 86 0 14
61 JEZZEL FARKAS ON 1.5 955 1327 4 2 1 57 29 14
64 BEN LI MI 1.0 1163 1263 4 2 1 57 29 14
7 GARY DEE SWATHELL MI 5.0 1649 1372 4 2 1 57 29 14
10 ANVIT RAO MI 5.0 1365 1554 2 5 0 29 71 0
15 ZACHARY JAMES HOUGHTON MI 4.5 1220 1484 2 5 0 29 71 0
2 DAKSHESH DARURI MI 6.0 1553 1469 3 4 0 43 57 0
23 ALAN BUI ON 4.0 1363 1214 7 0 0 100 0 0
24 MICHAEL R ALDRICH MI 4.0 1229 1357 4 3 0 57 43 0
27 GAURAV GIDWANI MI 3.5 1552 1047 6 0 0 100 0 0
3 ADITYA BAJAJ MI 6.0 1384 1564 2 5 0 29 71 0
32 JOSHUA PHILIP MATHEWS ON 3.5 1441 1379 6 1 0 86 14 0
36 SIDDHARTH JHA MI 3.5 1355 1190 4 2 0 67 33 0
37 AMIYATOSH PWNANANDAM MI 3.5 980 989 3 2 0 60 40 0
40 FOREST ZHANG MI 3.0 1348 1391 6 1 0 86 14 0
41 KYLE WILLIAM MURPHY MI 3.0 1403 713 4 0 0 100 0 0
46 JACOB ALEXANDER LAVALLEY MI 3.0 377 1358 4 3 0 57 43 0
5 HANSHI ZUO MI 5.5 1655 1501 5 2 0 71 29 0
50 SHIVAM JHA MI 2.5 1056 1111 5 1 0 83 17 0
51 TEJAS AYYAGARI MI 2.5 1011 1356 4 3 0 57 43 0
52 ETHAN GUO MI 2.5 935 1495 3 4 0 43 57 0
55 ALEX KONG MI 2.0 1186 1205 5 1 0 83 17 0
56 MARISA RICCI MI 2.0 1153 1010 4 1 0 80 20 0
58 VIRAJ MOHILE MI 2.0 917 1192 5 1 0 83 17 0
59 SEAN M MC CORMICK MI 2.0 853 1131 5 1 0 83 17 0
60 JULIA SHEN MI 1.5 967 950 3 2 0 60 40 0
62 ASHWIN BALAJI MI 1.0 1530 169 1 0 0 100 0 0
9 STEFANO LEE ON 5.0 1411 1523 3 4 0 43 57 0