This project involves transforming a semi-structured chess tournament text file into a clean, tabular CSV format using R Markdown.
To do this, I designed a step-by-step workflow in R that parses the tournamentinfo.txt file.
I first read the raw text and remove the decorative dashed lines that act as separators. I then strip out the initial header rows to isolate the player data.
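A minimal sketch of this cleaning step in base R. The sample lines are made up for illustration and only imitate the layout of the file; the assumption that there are exactly two header rows is mine, and in the real workflow `raw` would come from `readLines("tournamentinfo.txt")`.

```r
# Made-up sample in the assumed layout of the file (not real tournament data).
raw <- c(
  "-------------------------------------------------",
  " Pair | Player Name     |Total|Round|Round|",
  " Num  | USCF ID / Rtg   | Pts |  1  |  2  |",
  "-------------------------------------------------",
  "    1 | PLAYER ONE      |2.0  |W   2|W   3|",
  "   ON | 11111111 / R: 1500   ->1520 |N:2  |W    |B    |",
  "-------------------------------------------------",
  "    2 | PLAYER TWO      |1.0  |L   1|W   3|",
  "   MI | 22222222 / R: 1400   ->1405 |N:2  |B    |W    |",
  "-------------------------------------------------"
)

# Drop separator lines that consist only of dashes
# (trimws guards against stray leading/trailing spaces).
no_dashes <- raw[!grepl("^-+$", trimws(raw))]

# Drop the header rows (assumed here to be the first two remaining lines).
player_lines <- no_dashes[-(1:2)]
```

After this step, `player_lines` holds only player rows, two per player.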
Each player’s data is split across two consecutive rows. I use indexing to separate these rows into two vectors: one for primary data (Name and Points) and one for secondary data (State and Rating).
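One way to do this split, sketched in base R. The input lines are made-up examples, and the names `primary` and `secondary` are illustrative; the sketch assumes the cleaned vector strictly alternates between first and second rows.

```r
# Made-up cleaned player rows: first row, second row, first row, second row.
player_lines <- c(
  "    1 | PLAYER ONE      |2.0  |W   2|W   3|",
  "   ON | 11111111 / R: 1500   ->1520 |N:2  |W    |B    |",
  "    2 | PLAYER TWO      |1.0  |L   1|W   3|",
  "   MI | 22222222 / R: 1400   ->1405 |N:2  |B    |W    |"
)

# Odd positions hold the first row of each player, even positions the second.
odd <- seq(1, length(player_lines), by = 2)
primary   <- player_lines[odd]      # Name and Points
secondary <- player_lines[odd + 1]  # State and Rating
```

Because both vectors are built from the same index, element i of `primary` and element i of `secondary` describe the same player.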
I use regular expressions to extract specific fields:
Name: Characters between the first and second pipe symbols.
State: The two uppercase letters at the start of the second row.
Total Points: The numeric value in the “Total” column.
Pre-Rating: The digits immediately following “R:”.
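The four patterns above could be written as follows. This is a sketch in base R using `sub()` with capture groups, which may differ from the functions used in the actual R Markdown file; the input rows are made up for illustration.

```r
# Made-up rows in the assumed layout (one element per player).
primary <- c(
  "    1 | PLAYER ONE      |2.0  |W   2|W   3|",
  "    2 | PLAYER TWO      |1.0  |L   1|W   3|"
)
secondary <- c(
  "   ON | 11111111 / R: 1500   ->1520 |N:2  |W    |B    |",
  "   MI | 22222222 / R: 1400   ->1405 |N:2  |B    |W    |"
)

# Name: text between the first and second pipe, with padding trimmed.
name <- trimws(sub("^[^|]*\\|([^|]*)\\|.*$", "\\1", primary))

# State: the two uppercase letters that open the second row.
state <- sub("^\\s*([A-Z]{2}).*$", "\\1", secondary)

# Total Points: the number right after the second pipe.
total_pts <- as.numeric(
  sub("^[^|]*\\|[^|]*\\|\\s*([0-9]+\\.?[0-9]*).*$", "\\1", primary)
)

# Pre-Rating: the digits that follow "R:" (any spaces in between are skipped).
pre_rating <- as.integer(sub("^.*R:\\s*([0-9]+).*$", "\\1", secondary))

players <- data.frame(name, state, total_pts, pre_rating)
```

Note that `sub()` returns the input unchanged when a pattern does not match, so a malformed row would surface as an `NA` (with a warning) in the numeric columns rather than being silently dropped.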
I extract the numeric IDs of every opponent a player faced. I then use these IDs to look up the opponents’ pre-ratings and calculate the mean opponent pre-rating for each player.
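A sketch of the lookup in base R, again on made-up rows. It assumes that a player's pair number equals that player's position in the vectors, so an opponent ID can be used directly as an index into `pre_rating`; rounds without an opponent number contribute nothing to the mean.

```r
# Made-up first rows for three players, plus their pre-ratings in the same order.
primary <- c(
  "    1 | PLAYER ONE      |2.0  |W   2|W   3|",
  "    2 | PLAYER TWO      |1.0  |L   1|W   3|",
  "    3 | PLAYER THREE    |0.0  |L   1|L   2|"
)
pre_rating <- c(1500L, 1400L, 1300L)

# Keep only the round columns: drop everything up to the third pipe.
rounds <- sub("^[^|]*\\|[^|]*\\|[^|]*\\|", "", primary)

# Every run of digits left in the round columns is an opponent ID.
opp_ids <- lapply(regmatches(rounds, gregexpr("[0-9]+", rounds)), as.integer)

# Use the IDs as indices into pre_rating and average per player.
avg_opp_rating <- vapply(
  opp_ids,
  function(ids) mean(pre_rating[ids]),
  numeric(1)
)
avg_opp_rating
# 1350 1400 1450
```

A player with no numbered opponents would get `NaN` here, since the mean of an empty vector is undefined.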