In this project, you’re given a text file of chess tournament results in which the information follows a regular structure. Your job is to create an R Markdown file that generates a .CSV file (which could, for example, be imported into a SQL database) with the following information for all of the players:

Player’s Name, Player’s State, Total Number of Points, Player’s Pre-Rating, and Average Pre Chess Rating of Opponents

For the first player, the information would be: Gary Hua, ON, 6.0, 1794, 1605

1605 was calculated by summing the opponents’ pre-tournament ratings (1436, 1563, 1600, 1610, 1649, 1663, and 1716) and dividing by the total number of games played (7), then rounding to the nearest whole number.
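
As a quick check of that arithmetic, the calculation can be reproduced in a few lines of R. This is only a sketch; the variable names are illustrative and are not used elsewhere in the project.

```r
# Pre-tournament ratings of Gary Hua's seven opponents, as listed above.
opponent_ratings <- c(1436, 1563, 1600, 1610, 1649, 1663, 1716)

# Sum of the ratings divided by the number of games played, rounded to a whole number.
average_opponent_rating <- round(sum(opponent_ratings) / length(opponent_ratings))
print(average_opponent_rating)
## [1] 1605
```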

Step 1 - Load the raw table data from the text file.

library(stringr)
# Load the raw table data from the text file.
# Read every line, including the four header lines; the later steps index player rows starting at line 5.
tournament_table_data <- readLines("/Users/stephenhaslett/Desktop/tournamentinfo.txt")
head(tournament_table_data)
## [1] "-----------------------------------------------------------------------------------------" 
## [2] " Pair | Player Name                     |Total|Round|Round|Round|Round|Round|Round|Round| "
## [3] " Num  | USCF ID / Rtg (Pre->Post)       | Pts |  1  |  2  |  3  |  4  |  5  |  6  |  7  | "
## [4] "-----------------------------------------------------------------------------------------" 
## [5] "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|" 
## [6] "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"
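
The output above suggests the layout: four header lines, then two lines per player. The step size of 3 used in the next step implies that a dashed separator follows each player's record, which is an assumption here because the separator is not visible in the first six lines. A minimal sketch of how the row indices could be derived from the length of the input rather than hard-coded, using a hypothetical `sample_lines` vector built from the lines shown:

```r
# The first lines of the file as shown above, plus the dashed separator
# that is assumed to follow each player's two-line record.
separator <- strrep("-", 89)
sample_lines <- c(
  separator,
  " Pair | Player Name                     |Total|Round|Round|Round|Round|Round|Round|Round| ",
  " Num  | USCF ID / Rtg (Pre->Post)       | Pts |  1  |  2  |  3  |  4  |  5  |  6  |  7  | ",
  separator,
  "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|",
  "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |",
  separator
)

# Hypothetical names: rows holding the name line and the state/rating line.
name_rows  <- seq(5, length(sample_lines), by = 3)
state_rows <- seq(6, length(sample_lines), by = 3)

print(sample_lines[name_rows])
print(sample_lines[state_rows])
```

Deriving the upper bound from `length()` keeps the indices correct if the number of players changes.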

Step 2 - Extract all the data that we will need to create a revised version of the table.

# Extract all the data that we will need to create a revised version of the table.
ID <- as.integer(str_extract(tournament_table_data[seq(5, 196, 3)], "\\d+"))
Name <- str_replace_all(str_extract(tournament_table_data[seq(5, 196, 3)],"([|]).+?\\1"),"[|]","")
Points <- str_extract(tournament_table_data[seq(5, 196, 3)], "\\d\\.\\d")
State <- str_extract(tournament_table_data[seq(6, 196, 3)], "[A-Z]{2}")
Rating <- as.integer(str_replace_all(str_extract(tournament_table_data[seq(6, 196, 3)], "R: \\s?\\d{3,4}"), "R:\\s", "")) 
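
To see what each pattern captures, the sketch below applies the same kind of expressions to the first player's two lines from the output in Step 1. The variable names are illustrative, and `str_trim()` is added here only to show how the column padding around the name could be removed.

```r
library(stringr)

# The first player's two lines, copied from the output in Step 1.
line1 <- "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"
line2 <- "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"

# First run of digits on the line is the pair number.
id <- as.integer(str_extract(line1, "\\d+"))

# Text between the first two pipes is the name; drop the pipes and the padding.
name <- str_trim(str_replace_all(str_extract(line1, "([|]).+?\\1"), "[|]", ""))

# A digit, a literal dot, and a digit give the total points.
points <- str_extract(line1, "\\d\\.\\d")

# First two consecutive capital letters on the second line are the state.
state <- str_extract(line2, "[A-Z]{2}")

# The digits after "R: " are the pre-tournament rating.
rating <- as.integer(str_replace_all(str_extract(line2, "R: \\s?\\d{3,4}"), "R:\\s", ""))

print(list(id = id, name = name, points = points, state = state, rating = rating))
```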

Step 3 - Calculate each player’s average opponent pre-rating.

# Calculate each player's average opponent pre-rating.
# A number immediately followed by "|" is an opponent's pair number.
opponents <- str_extract_all(tournament_table_data[seq(5, 196, 3)], "\\d+(?=\\|)")
Average <- numeric(length(ID))

# Rating is ordered by pair number, so opponent pair numbers index it directly.
for (row in seq_along(ID)) {
  Average[row] <- round(mean(Rating[as.integer(opponents[[row]])]), digits = 0)
}
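
The core idea is that a vector of ratings ordered by pair number can be indexed directly by each player's opponent numbers. The sketch below shows this on a small, entirely hypothetical set of four players (the ratings and pairings are made up for illustration), using `vapply()` in place of an explicit loop.

```r
# Hypothetical pre-ratings for four players, indexed by pair number.
rating <- c(1500, 1600, 1700, 1800)

# Hypothetical opponent pair numbers for each player; player 4 played only two games.
opponents <- list(c(2, 3, 4), c(1, 3, 4), c(1, 2, 4), c(1, 2))

# Index the rating vector by each player's opponents and average the result.
average <- vapply(opponents, function(opp) round(mean(rating[opp])), numeric(1))
print(average)
## [1] 1700 1667 1633 1550
```

Because `mean()` divides by the number of opponents actually listed, a player with fewer games is averaged over only the games played.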

Step 4 - Reconstruct our revised version of the table.

# Reconstruct our revised version of the table.
revised_tournament_table <- data.frame(ID, Name, State, Points, Rating, Average)
head(revised_tournament_table)
##   ID                              Name State Points Rating Average
## 1  1  GARY HUA                            ON    6.0   1794    1605
## 2  2  DAKSHESH DARURI                     MI    6.0   1553    1469
## 3  3  ADITYA BAJAJ                        MI    6.0   1384    1564
## 4  4  PATRICK H SCHILLING                 MI    5.5   1716    1574
## 5  5  HANSHI ZUO                          MI    5.5   1655    1501
## 6  6  HANSEN SONG                         OH    5.0   1686    1519

Step 5 - Output the revised data to a CSV file.

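# Output the revised data to a CSV file.
# Trim the padding that the column layout leaves around each name, and omit R's row names, which would duplicate ID.
revised_tournament_table$Name <- str_trim(revised_tournament_table$Name)
write.csv(revised_tournament_table, file = "revised_tournament_table_data.csv", row.names = FALSE)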
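
One way to check an export like this is to read the file back in. The sketch below does so with a hypothetical one-row table built from the first player's values and a temporary file, so it does not touch the real output; `example_table`, `csv_path`, and `round_trip` are illustrative names.

```r
# Hypothetical one-row table built from the first player's values shown earlier.
example_table <- data.frame(
  Name = "GARY HUA", State = "ON", Points = "6.0",
  Rating = 1794L, Average = 1605,
  stringsAsFactors = FALSE
)

# Write to a temporary file without row names, then read the file back.
csv_path <- tempfile(fileext = ".csv")
write.csv(example_table, file = csv_path, row.names = FALSE)

# Reading Points as character preserves the trailing ".0".
round_trip <- read.csv(csv_path, colClasses = c(Points = "character"))
print(round_trip)
```

With `row.names = FALSE` the file contains only the named columns, which tends to be easier to import into a database table.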