In this project, you’re given a text file of chess tournament results in which the information has some structure. Your job is to create an R Markdown file that generates a .CSV file (which could, for example, be imported into a SQL database) with the following information for all of the players:
Player’s Name, Player’s State, Total Number of Points, Player’s Pre-Rating, and Average Pre Chess Rating of Opponents
For the first player, the information would be: Gary Hua, ON, 6.0, 1794, 1605
The value 1605 is the mean of the opponents’ pre-tournament ratings: the sum of 1436, 1563, 1600, 1610, 1649, 1663, and 1716, divided by the number of games played (7) and rounded to the nearest whole number.
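As a quick check of that arithmetic, the following R sketch averages the seven opponent ratings listed above. The variable names are illustrative only.

```r
# Pre-tournament ratings of Gary Hua's seven opponents, as listed above.
opponent_ratings <- c(1436, 1563, 1600, 1610, 1649, 1663, 1716)

# Sum of the ratings divided by the number of games played,
# rounded to the nearest whole number.
average_opponent_rating <- round(sum(opponent_ratings) / length(opponent_ratings))
print(average_opponent_rating)
```

The result agrees with the 1605 quoted for the first player.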
Step 1 - Load the raw table data from the text file.
library(stringr)
# Load the raw table data from the text file.
tournament_table_data <- readLines("/Users/stephenhaslett/Desktop/tournamentinfo.txt")
head(tournament_table_data)
## [1] "-----------------------------------------------------------------------------------------"
## [2] " Pair | Player Name |Total|Round|Round|Round|Round|Round|Round|Round| "
## [3] " Num | USCF ID / Rtg (Pre->Post) | Pts | 1 | 2 | 3 | 4 | 5 | 6 | 7 | "
## [4] "-----------------------------------------------------------------------------------------"
## [5] " 1 | GARY HUA |6.0 |W 39|W 21|W 18|W 14|W 7|D 12|D 4|"
## [6] " ON | 15445895 / R: 1794 ->1817 |N:2 |W |B |W |B |W |B |W |"
Step 2 - Extract all the data that we will need to create a revised version of the table.
# Extract all the data that we will need to create a revised version of the table.
ID <- as.integer(str_extract(tournament_table_data[seq(5, 196, 3)], "\\d+"))
Name <- str_trim(str_replace_all(str_extract(tournament_table_data[seq(5, 196, 3)], "([|]).+?\\1"), "[|]", ""))
Points <- str_extract(tournament_table_data[seq(5, 196, 3)], "\\d\\.\\d")
State <- str_extract(tournament_table_data[seq(6, 196, 3)], "[A-Z]{2}")
Rating <- as.integer(str_replace_all(str_extract(tournament_table_data[seq(6, 196, 3)], "R: \\s?\\d{3,4}"), "R:\\s", ""))
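To see what each pattern captures, here is a minimal sketch that applies this step's regular expressions to the first player's two lines as they appear in the output above. It uses base R's `regmatches()` and `regexpr()` in place of `str_extract()` so that it runs without stringr, escapes the decimal point in the points pattern, and trims the padding around the name. `first_match()` is a hypothetical helper, and the real file pads its columns with more spaces than shown here.

```r
# The two lines that make up the first player's record (whitespace abbreviated).
player_line <- " 1 | GARY HUA |6.0 |W 39|W 21|W 18|W 14|W 7|D 12|D 4|"
rating_line <- " ON | 15445895 / R: 1794 ->1817 |N:2 |W |B |W |B |W |B |W |"

# Hypothetical helper: return the first match of `pattern` in `text`.
first_match <- function(text, pattern) {
  regmatches(text, regexpr(pattern, text, perl = TRUE))
}

# Pair number: the first run of digits on the player line.
id <- as.integer(first_match(player_line, "\\d+"))
# Name: the text between the first two pipes, with pipes and padding removed.
name <- trimws(gsub("[|]", "", first_match(player_line, "([|]).+?\\1")))
# Points: a digit, a literal decimal point, and a digit.
points <- first_match(player_line, "\\d\\.\\d")
# State: the first two consecutive capital letters on the rating line.
state <- first_match(rating_line, "[A-Z]{2}")
# Pre-rating: the 3- or 4-digit number that follows "R:".
rating <- as.integer(sub("R: *", "", first_match(rating_line, "R: \\s?\\d{3,4}")))

print(list(id = id, name = name, points = points, state = state, rating = rating))
```

The extracted values match the first row of the expected output: Gary Hua, ON, 6.0, 1794.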
Step 3 - Calculate each player’s average opponent pre-rating.
# Calculate each player's average opponent pre-rating.
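# Opponent pair numbers are the digits immediately before each "|" in the round columns.
data <- str_extract_all(tournament_table_data[seq(5, 196, 3)], "\\d+\\|")
opponents <- lapply(data, function(x) as.integer(str_extract(x, "\\d+")))
# Preallocate one slot per player, then average the opponents' pre-ratings.
Average <- numeric(length(ID))
for (row in seq_along(ID)) {
  Average[row] <- round(mean(Rating[opponents[[ID[row]]]]), digits = 0)
}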
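The lookup works because each opponent's pair number doubles as an index into the rating vector. A minimal sketch with a hypothetical four-player table follows; the ratings and pairings are made up for illustration and are not taken from the tournament file, and `vapply()` stands in for the `for` loop.

```r
# Hypothetical pre-ratings, indexed by pair number 1..4.
toy_rating <- c(1500, 1600, 1700, 1800)

# Hypothetical opponents faced by each player, given as pair numbers.
# Players 3 and 4 played only two games, so their means cover two ratings.
toy_opponents <- list(c(2, 3, 4), c(1, 3, 4), c(1, 2), c(1, 2))

# Index the rating vector by each opponent list and average the result.
toy_average <- vapply(
  toy_opponents,
  function(opp) round(mean(toy_rating[opp])),
  numeric(1)
)
print(toy_average)
```

Because `mean()` divides by the number of ratings it receives, a player with fewer games is averaged over the games actually played, which is the behaviour the assignment asks for.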
Step 4 - Reconstruct our revised version of the table.
# Reconstruct our revised version of the table.
revised_tournament_table <- data.frame(ID, Name, State, Points, Rating, Average)
head(revised_tournament_table)
## ID Name State Points Rating Average
## 1 1 GARY HUA ON 6.0 1794 1605
## 2 2 DAKSHESH DARURI MI 6.0 1553 1469
## 3 3 ADITYA BAJAJ MI 6.0 1384 1564
## 4 4 PATRICK H SCHILLING MI 5.5 1716 1574
## 5 5 HANSHI ZUO MI 5.5 1655 1501
## 6 6 HANSEN SONG OH 5.0 1686 1519
Step 5 - Output the revised data to a CSV file.
# Output the revised data to a CSV file.
write.csv(revised_tournament_table, file = "revised_tournament_table_data.csv", row.names = FALSE)
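One way to confirm the export is to read it back. The sketch below writes a one-row data frame built from the first player's values to a temporary file and reloads it. `row.names = FALSE` is used here so that R's row numbers are not written as an extra unnamed column; the temporary path and variable names are illustrative.

```r
# One-row stand-in for revised_tournament_table (first player's values).
sample_table <- data.frame(
  ID = 1L, Name = "GARY HUA", State = "ON",
  Points = "6.0", Rating = 1794L, Average = 1605
)

# Write to a temporary file so nothing in the working directory is touched.
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_table, file = csv_path, row.names = FALSE)

# Read the file back and inspect its columns and contents.
reloaded <- read.csv(csv_path)
print(names(reloaded))
print(reloaded)
```

Note that `read.csv()` converts the points column back to a number on import, so a value written as "6.0" is read as 6.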