Data Acquisition and Management Project 1

Overview: This project creates an R Markdown file that reads a semi-structured text file of chess tournament results and generates a .CSV file from it. The result codes in the file are defined as follows (definitions found via Google):

W - win, worth 1 point
L - lose, worth 0 points
D - draw, worth 0.5 points
B - full point bye, worth 1 point (given to the left-over player when there is an odd number of players in a tournament round)
H - half point bye, worth 0.5 points (players can request these when they know they won’t be able to make it to certain rounds in a tournament. They are normally only available in the first few rounds of a tournament, and tournament directors often limit a player to a small number of bye requests)
X - win by forfeit, worth 1 point
U - unplayed game, worth 0 points (in a round robin, this shows up for any games that haven’t been played yet; in a Swiss tournament, this would show up for games following a forfeit loss. This could also show up when a player requests more byes than the tournament director permits: the director could allow the player to miss the games without withdrawing from the tournament, but the player would score no points for the missed games)
F - lose by forfeit, worth 0 points (usually results in automatic withdrawal from the rest of the tournament); not in the data file

I assume that B, H, X, and U do not count as games played, so they are excluded from the Average Pre Chess Rating of Opponents. These results have no opponent number anyway.
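To make this assumption concrete, the result codes can be written as a lookup vector. This is a minimal base R sketch, not part of the project code; the names `result_points` and `played_codes` and the sample `results` vector are made up for illustration.

```r
# Points awarded for each result code, as listed above
result_points <- c(W = 1, L = 0, D = 0.5, B = 1, H = 0.5, X = 1, U = 0, F = 0)

# Codes treated as games actually played
# (assumption stated above: byes, forfeit wins and unplayed games are excluded)
played_codes <- c("W", "L", "D")

# Hypothetical seven-round result sequence for one player
results <- c("W", "D", "H", "L", "W", "U", "B")

# Total score counts every code; games played counts only W, L and D
total_points <- sum(result_points[results])
games_played <- sum(results %in% played_codes)

print(total_points)
print(games_played)
```

In this sketch the byes and the unplayed round still contribute to the score, but only the four played games would contribute an opponent rating to the average.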

Load the Tidyverse and Skimr packages.

library(tidyverse)
library(skimr)

Read the data from the text file on GitHub

theUrl <- "https://raw.githubusercontent.com/ferrysany/CUNY607P1/master/tournamentinfo.txt"
#Skip the 4 header lines; fields are separated by "|"
chess <- read_delim(file=theUrl, delim="|", col_names=FALSE, skip=4)

Tame the data

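#Create 2 tibbles to extract data from chess: each player spans two data rows followed by a separator row
chessleft <- chess[seq(1, nrow(chess), by=3),]   #first row of each player: number, name, points, round results
chessright <- chess[seq(2, nrow(chess), by=3),]  #second row of each player: state and rating details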

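#Combine the 2 tibbles side by side, keep the needed columns, and gather the round results into one row per player per game
#Note: names such as X12 and X21 come from how bind_cols() de-duplicated the repeated column names when this was written; newer dplyr versions may name them differently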
chess1 <- bind_cols(chessleft, chessright) %>%
  select(number=X1,
         name=X2,
         state=X12,
         point=X3, 
         rating=X21,
         gR_=X4:X10,
         -X11, 
         -(X31:X111)) %>%
  gather(key = "game", value = "result", gR_1:gR_7) %>%
  separate(col = result, c("result", "opponent"), convert=TRUE)

#Parse the number field 
chess1$number <- parse_integer(chess1$number)

#Extract Pre-rating for players
chess1$rating <- parse_number(str_sub(chess1$rating, 15, 19))

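#Create a pre-rating tibble (one row per player) for "lookup/join" when averaging the pre-ratings of opponents
#After gather(), the first 64 rows hold game 1 for each of the 64 players, so they cover every player once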
preRate <- chess1 %>%
  select(number,rating)%>%
  slice(1:64)

head(preRate)

## # A tibble: 6 x 2
##   number rating
##    <int>  <dbl>
## 1      1   1794
## 2      2   1553
## 3      3   1384
## 4      4   1716
## 5      5   1655
## 6      6   1686
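The row splitting and rating extraction above can be illustrated on a small sample. The sketch below uses base R and hypothetical lines in the assumed three-line layout (the ID numbers, post-ratings and round results are made up). It also swaps the fixed character positions used above for a pattern match on the digits after "R:", which is a different technique from the project code.

```r
# Hypothetical sample in the assumed layout: two data lines, then a separator
lines <- c(
  "    1 | GARY HUA                        |6.0  |W  39|W  21|",
  "   ON | 12345678 / R: 1794   ->1800     |N:2  |W    |B    |",
  "-----------------------------------------------------------",
  "    2 | DAKSHESH DARURI                 |6.0  |W  63|W  58|",
  "   MI | 87654321 / R: 1553   ->1600     |N:2  |B    |W    |",
  "-----------------------------------------------------------"
)

# Every third line starting at 1 is a player's first row; starting at 2, the second row
first_rows  <- lines[seq(1, length(lines), by = 3)]
second_rows <- lines[seq(2, length(lines), by = 3)]

# Player name: second "|"-separated field of the first row, with padding removed
fields <- strsplit(first_rows, "|", fixed = TRUE)
player_name <- trimws(vapply(fields, function(f) f[2], character(1)))

# Pre-rating: the digits that follow "R:" in the second row
pre_rating <- as.integer(sub(".*R: *([0-9]+).*", "\\1", second_rows))

print(data.frame(player_name, pre_rating))
```

Matching on "R:" avoids depending on exact column positions, at the cost of assuming that marker always precedes the pre-rating.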

Join “chess1” with “preRate” on the opponent number, then average the opponents’ pre-ratings for each player to generate the final tibble

chess1 <- chess1 %>%
  left_join(preRate, by = c("opponent" = "number"))%>%
  group_by(name, state, point, rating.x) %>%
  summarise(
    oppRate = round(mean(rating.y, na.rm=TRUE), digits=0)
  )%>%
  arrange(desc(point), desc(rating.x))

head(chess1)

## # A tibble: 6 x 5
## # Groups:   name, state, point [6]
##   name                                state    point   rating.x oppRate
##   <chr>                               <chr>    <chr>      <dbl>   <dbl>
## 1 " GARY HUA                        " "   ON " "6.0  "     1794    1605
## 2 " DAKSHESH DARURI                 " "   MI " "6.0  "     1553    1469
## 3 " ADITYA BAJAJ                    " "   MI " "6.0  "     1384    1564
## 4 " PATRICK H SCHILLING             " "   MI " "5.5  "     1716    1574
## 5 " HANSHI ZUO                      " "   MI " "5.5  "     1655    1501
## 6 " HANSEN SONG                     " "   OH " "5.0  "     1686    1519
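The lookup-and-average step can be sketched on toy data. This is a base R illustration that uses `match()` and `tapply()` in place of `left_join()` and `summarise()`. The `games` table is hypothetical, and the four ratings are borrowed from the first players shown above.

```r
# Hypothetical long-format games table: one row per player per round
games <- data.frame(
  number   = c(1, 1, 1, 2, 2, 2),
  opponent = c(2, 3, 4, 1, 3, NA)   # NA marks a round with no opponent (bye, unplayed)
)

# Pre-rating lookup, one row per player
pre_rate <- data.frame(
  number = c(1, 2, 3, 4),
  rating = c(1794, 1553, 1384, 1716)
)

# Look up each opponent's pre-rating; rounds without an opponent stay NA
games$opp_rating <- pre_rate$rating[match(games$opponent, pre_rate$number)]

# Average per player, ignoring rounds without an opponent
opp_avg <- round(tapply(games$opp_rating, games$number, mean, na.rm = TRUE))

print(opp_avg)
```

Because rounds without an opponent become `NA` and are dropped by `na.rm = TRUE`, player 2's average here is taken over two games rather than three.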

Rename the columns to the required names and export “chess1” to a CSV file

chess1 <- chess1 %>%
  rename("Player’s Name"=name,
        "Player’s State"=state,
        "Total Number of Points"=point,
        "Player’s Pre-Rating"=rating.x,
        "Average Pre Chess Rating of Opponents"=oppRate)

write_csv(chess1,"C:/data/chess.csv")
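The name, state and point columns shown earlier still carry padding from the fixed-width source. One possible cleanup before export is sketched below in base R. The `padded` data frame is a hypothetical stand-in for the real tibble, and writing to `tempdir()` is an assumption made so the example does not depend on a `C:/data` folder existing.

```r
# Hypothetical rows with the same kind of padding seen in the output above
padded <- data.frame(
  name  = c(" GARY HUA                        ", " DAKSHESH DARURI                 "),
  state = c("   ON ", "   MI "),
  point = c("6.0  ", "6.0  "),
  stringsAsFactors = FALSE
)

# Strip leading/trailing whitespace and convert points to numeric
cleaned <- padded
cleaned$name  <- trimws(cleaned$name)
cleaned$state <- trimws(cleaned$state)
cleaned$point <- as.numeric(trimws(cleaned$point))

# Write to a temporary location instead of a fixed drive path
out_file <- file.path(tempdir(), "chess.csv")
write.csv(cleaned, out_file, row.names = FALSE)

print(cleaned)
```

Converting the points to numeric also means any later sort on that column is numeric rather than alphabetical.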

Chun San Yip

2019/02/20