I will first split each player's record into its fields, extracting the name, region (state), total score, and pre-tournament rating. Rounds recorded as B, H, or U have no opponent number, so I will handle them separately and exclude them before averaging the opponents' pre-ratings. The output will have five columns and be exported as a CSV file.
2. What data challenges do I anticipate?
The main challenge is that the round data mixes played games (W/L/D followed by an opponent number) with B/H/U records, which list no opponent.
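One way to cope with the mixed records is to split every round cell into a result code and an opponent number, letting the opponent become NA when none is listed. The sketch below is only an illustration: `parse_round` is a hypothetical helper, not part of the project code, and the cell layout (`"W  39"`, `"H"`) is assumed from the round columns shown in the output.

```r
library(stringr)

# Hypothetical helper: split one round cell into result code and opponent number.
parse_round <- function(cell) {
  cell <- str_trim(cell)
  result <- str_sub(cell, 1, 1)                      # first character: W, L, D, B, H, U
  opponent <- as.integer(str_extract(cell, "\\d+"))  # NA when the cell has no digits
  list(result = result, opponent = opponent)
}

played   <- parse_round("W  39")  # a played game against player 39
unplayed <- parse_round("H")      # a record without an opponent
```

Because the unplayed records end up as NA, they can later be dropped with a single `is.na()` filter before any averaging.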
Warning in
readLines("https://raw.githubusercontent.com/XxY-coder/data607-Proj.Y/refs/heads/main/tournamentinfo.txt"):
incomplete final line found on
'https://raw.githubusercontent.com/XxY-coder/data607-Proj.Y/refs/heads/main/tournamentinfo.txt'
# A tibble: 6 × 10
player_num name total_points round1 round2 round3 round4 round5 round6 round7
<int> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 GARY… 6 W 39 W 21 W 18 W 14 W 7 D 12 D 4
2 2 DAKS… 6 W 63 W 58 L 4 W 17 W 16 W 20 W 7
3 3 ADIT… 6 L 8 W 61 W 25 W 21 W 11 W 13 W 12
4 4 PATR… 5.5 W 23 D 28 W 2 W 26 D 5 W 19 D 1
5 5 HANS… 5.5 W 45 W 37 D 12 D 13 D 4 W 14 W 17
6 6 HANS… 5 W 34 D 29 L 11 W 35 D 10 W 27 W 21
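A table like the one above can be built by splitting the first line of each player record on the pipe character. The following is a minimal sketch for a single line, assuming the raw line looks like the sample string below (its values are taken from row 1 of the output, but the exact spacing is an assumption).

```r
library(stringr)

# Assumed layout of a player's first line; spacing is illustrative.
line1 <- "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"

# Split on "|", trim each field, and drop the empty piece after the last "|".
fields <- str_trim(str_split(line1, "\\|")[[1]])
fields <- fields[fields != ""]

player_num   <- as.integer(fields[1])
name         <- fields[2]
total_points <- as.numeric(fields[3])
rounds       <- fields[4:10]  # seven round cells, e.g. "W  39"
```

Dropping empty fields is safe here only because every round cell contains at least a result letter. If a file could contain truly blank cells, positional indexing without the filter would be the safer choice.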
Get the state and pre-rating from the second line of each player record.
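parse_line2 <- function(x) {
  # Split the line on "|", trim whitespace, and drop empty fields
  parts <- str_split(x, "\\|", simplify = TRUE) |> as.character()
  parts <- str_trim(parts)
  parts <- parts[parts != ""]

  # The first field is the state
  state <- parts[1]

  # Pre-rating: the 3-4 digits after "R:", falling back to the digits before "->"
  pre <- str_match(x, "R:\\s*(\\d{3,4})")[, 2]
  if (is.na(pre)) pre <- str_match(x, "(\\d{3,4})\\s*->")[, 2]

  tibble(
    state = state,
    pre_rating = as.integer(pre)
  )
}

line2_df <- map_dfr(line_even, parse_line2)
line2_df |> head()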
# A tibble: 6 × 2
state pre_rating
<chr> <int>
1 ON 1794
2 MI 1553
3 MI 1384
4 MI 1716
5 MI 1655
6 OH 1686
# A tibble: 6 × 5
player_num name state total_points pre_rating
<int> <chr> <chr> <dbl> <int>
1 1 GARY HUA ON 6 1794
2 2 DAKSHESH DARURI MI 6 1553
3 3 ADITYA BAJAJ MI 6 1384
4 4 PATRICK H SCHILLING MI 5.5 1716
5 5 HANSHI ZUO MI 5.5 1655
6 6 HANSEN SONG OH 5 1686
5. Dealing with the pre-rating.
Select the round columns, match each opponent number to that opponent's pre-rating, and then compute the average opponent rating for each player.
# A tibble: 10 × 5
Name State TotalPoints PreRating AvgPreRating
<chr> <chr> <dbl> <int> <int>
1 GARY HUA ON 6 1794 1605
2 DAKSHESH DARURI MI 6 1553 1469
3 ADITYA BAJAJ MI 6 1384 1564
4 PATRICK H SCHILLING MI 5.5 1716 1574
5 HANSHI ZUO MI 5.5 1655 1501
6 HANSEN SONG OH 5 1686 1519
7 GARY DEE SWATHELL MI 5 1649 1372
8 EZEKIEL HOUGHTON MI 5 1641 1468
9 STEFANO LEE ON 5 1411 1523
10 ANVIT RAO MI 5 1365 1554
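The averaging step described in this section could be sketched as follows. This is not necessarily the code that produced the table above: the three players are toy data, the column name `avg_opp_rating` and the output file name are hypothetical, and rounding the mean to the nearest integer is an assumption.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(stringr)

# Toy data: three hypothetical players with two rounds each.
players <- tibble(
  player_num = 1:3,
  pre_rating = c(1500L, 1600L, 1700L),
  round1 = c("W   2", "L   1", "B"),
  round2 = c("D   3", "H", "D   1")
)

# Lookup table that maps an opponent number to that opponent's pre-rating.
ratings <- select(players, opp_num = player_num, opp_rating = pre_rating)

avg_opp <- players |>
  pivot_longer(starts_with("round"), names_to = "round", values_to = "cell") |>
  mutate(opp_num = as.integer(str_extract(cell, "\\d+"))) |>
  filter(!is.na(opp_num)) |>                 # drop B/H/U cells (no opponent)
  left_join(ratings, by = "opp_num") |>
  group_by(player_num) |>
  summarise(
    avg_opp_rating = as.integer(round(mean(opp_rating))),  # rounding is an assumption
    .groups = "drop"
  )

# Export to CSV; the path and file name are placeholders.
out_path <- file.path(tempdir(), "tournament_summary.csv")
write.csv(avg_opp, out_path, row.names = FALSE)
```

Filtering out the NA opponents before grouping means the mean is taken over games actually played, so a player with byes is averaged over fewer opponents rather than being penalised.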