Project #1

Approach

After carefully reading the assignment, I first determined what needs to be done and structured my approach around that understanding. We want to take a player and return the average pre-rating of that player's opponents. We will only consider games that ended in W, L, or D. Therefore, my first step is to filter through the data set and find every opponent for the selected player. After determining each opponent, I need to filter the data further and keep only matches that were a W, L, or D. With this new data set, I need to take every opponent's pre-rating and calculate an average. This is the result we are looking for. It will also be critical to store each player's information in some form of list, or in an SQL database, so that the information for each player is accessible.
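As a worked example of the target calculation, take player 1 (GARY HUA), whose opponents in the tournament file are pairs 39, 21, 18, 14, 7, 12 and 4. The sketch below is base R only, with the seven pre-ratings copied by hand from the file; the variable names are illustrative and not part of the project code.

```r
# Pre-ratings of GARY HUA's seven opponents, copied by hand from the
# tournament file (pairs 39, 21, 18, 14, 7, 12, 4).
opponent_pre_ratings <- c(1436, 1563, 1600, 1610, 1649, 1663, 1716)

# Average pre-rating of the opponents, rounded to a whole number.
# 11237 / 7 is about 1605.29, which rounds to 1605.
avg_opponent_rating <- round(mean(opponent_pre_ratings))
print(avg_opponent_rating)
```

The value 1605 is the figure the finished table should show for this player.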

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.1     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.2.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Loading Data and Reading It from GitHub

warn = FALSE was a necessary argument here, as I was getting a read warning from R. The LLM advice was to include this argument because an empty line in the txt file was being picked up.
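To illustrate what warn = FALSE changes, here is a minimal sketch. It assumes the message in question was the incomplete-final-line warning that readLines() raises when a file does not end with a newline; that is a guess about the cause, not something confirmed for tournamentinfo.txt. The temporary file and variable names are made up for the demonstration.

```r
# Write a two-line file whose last line has no trailing newline.
tmp <- tempfile(fileext = ".txt")
cat("line one\nline two", file = tmp)

# With the default warn = TRUE, readLines() signals a warning for this file.
warned <- tryCatch(
  {
    readLines(tmp)
    FALSE
  },
  warning = function(w) TRUE
)

# With warn = FALSE the same two lines come back silently.
quiet_lines <- readLines(tmp, warn = FALSE)

print(warned)
print(quiet_lines)
```

Either way the returned lines are the same; the argument only controls whether the warning is shown.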

url <- "https://raw.githubusercontent.com/AslamF/DATA607-Project-1/refs/heads/main/tournamentinfo.txt"

data <- readLines(url, warn=FALSE)

data
  [1] "-----------------------------------------------------------------------------------------" 
  [2] " Pair | Player Name                     |Total|Round|Round|Round|Round|Round|Round|Round| "
  [3] " Num  | USCF ID / Rtg (Pre->Post)       | Pts |  1  |  2  |  3  |  4  |  5  |  6  |  7  | "
  [4] "-----------------------------------------------------------------------------------------" 
  [5] "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|" 
  [6] "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |" 
  [7] "-----------------------------------------------------------------------------------------" 
  [8] "    2 | DAKSHESH DARURI                 |6.0  |W  63|W  58|L   4|W  17|W  16|W  20|W   7|" 
  [9] "   MI | 14598900 / R: 1553   ->1663     |N:2  |B    |W    |B    |W    |B    |W    |B    |" 
 [10] "-----------------------------------------------------------------------------------------" 
 [11] "    3 | ADITYA BAJAJ                    |6.0  |L   8|W  61|W  25|W  21|W  11|W  13|W  12|" 
 [12] "   MI | 14959604 / R: 1384   ->1640     |N:2  |W    |B    |W    |B    |W    |B    |W    |" 
 [13] "-----------------------------------------------------------------------------------------" 
 [14] "    4 | PATRICK H SCHILLING             |5.5  |W  23|D  28|W   2|W  26|D   5|W  19|D   1|" 
 [15] "   MI | 12616049 / R: 1716   ->1744     |N:2  |W    |B    |W    |B    |W    |B    |B    |" 
 [16] "-----------------------------------------------------------------------------------------" 
 [17] "    5 | HANSHI ZUO                      |5.5  |W  45|W  37|D  12|D  13|D   4|W  14|W  17|" 
 [18] "   MI | 14601533 / R: 1655   ->1690     |N:2  |B    |W    |B    |W    |B    |W    |B    |" 
 [19] "-----------------------------------------------------------------------------------------" 
 [20] "    6 | HANSEN SONG                     |5.0  |W  34|D  29|L  11|W  35|D  10|W  27|W  21|" 
 [21] "   OH | 15055204 / R: 1686   ->1687     |N:3  |W    |B    |W    |B    |B    |W    |B    |" 
 [22] "-----------------------------------------------------------------------------------------" 
 [23] "    7 | GARY DEE SWATHELL               |5.0  |W  57|W  46|W  13|W  11|L   1|W   9|L   2|" 
 [24] "   MI | 11146376 / R: 1649   ->1673     |N:3  |W    |B    |W    |B    |B    |W    |W    |" 
 [25] "-----------------------------------------------------------------------------------------" 
 [26] "    8 | EZEKIEL HOUGHTON                |5.0  |W   3|W  32|L  14|L   9|W  47|W  28|W  19|" 
 [27] "   MI | 15142253 / R: 1641P17->1657P24  |N:3  |B    |W    |B    |W    |B    |W    |W    |" 
 [28] "-----------------------------------------------------------------------------------------" 
 [29] "    9 | STEFANO LEE                     |5.0  |W  25|L  18|W  59|W   8|W  26|L   7|W  20|" 
 [30] "   ON | 14954524 / R: 1411   ->1564     |N:2  |W    |B    |W    |B    |W    |B    |B    |" 
 [31] "-----------------------------------------------------------------------------------------" 
 [32] "   10 | ANVIT RAO                       |5.0  |D  16|L  19|W  55|W  31|D   6|W  25|W  18|" 
 [33] "   MI | 14150362 / R: 1365   ->1544     |N:3  |W    |W    |B    |B    |W    |B    |W    |" 
 [34] "-----------------------------------------------------------------------------------------" 
 [35] "   11 | CAMERON WILLIAM MC LEMAN        |4.5  |D  38|W  56|W   6|L   7|L   3|W  34|W  26|" 
 [36] "   MI | 12581589 / R: 1712   ->1696     |N:3  |B    |W    |B    |W    |B    |W    |B    |" 
 [37] "-----------------------------------------------------------------------------------------" 
 [38] "   12 | KENNETH J TACK                  |4.5  |W  42|W  33|D   5|W  38|H    |D   1|L   3|" 
 [39] "   MI | 12681257 / R: 1663   ->1670     |N:3  |W    |B    |W    |B    |     |W    |B    |" 
 [40] "-----------------------------------------------------------------------------------------" 
 [41] "   13 | TORRANCE HENRY JR               |4.5  |W  36|W  27|L   7|D   5|W  33|L   3|W  32|" 
 [42] "   MI | 15082995 / R: 1666   ->1662     |N:3  |B    |W    |B    |B    |W    |W    |B    |" 
 [43] "-----------------------------------------------------------------------------------------" 
 [44] "   14 | BRADLEY SHAW                    |4.5  |W  54|W  44|W   8|L   1|D  27|L   5|W  31|" 
 [45] "   MI | 10131499 / R: 1610   ->1618     |N:3  |W    |B    |W    |W    |B    |B    |W    |" 
 [46] "-----------------------------------------------------------------------------------------" 
 [47] "   15 | ZACHARY JAMES HOUGHTON          |4.5  |D  19|L  16|W  30|L  22|W  54|W  33|W  38|" 
 [48] "   MI | 15619130 / R: 1220P13->1416P20  |N:3  |B    |B    |W    |W    |B    |B    |W    |" 
 [49] "-----------------------------------------------------------------------------------------" 
 [50] "   16 | MIKE NIKITIN                    |4.0  |D  10|W  15|H    |W  39|L   2|W  36|U    |" 
 [51] "   MI | 10295068 / R: 1604   ->1613     |N:3  |B    |W    |     |B    |W    |B    |     |" 
 [52] "-----------------------------------------------------------------------------------------" 
 [53] "   17 | RONALD GRZEGORCZYK              |4.0  |W  48|W  41|L  26|L   2|W  23|W  22|L   5|" 
 [54] "   MI | 10297702 / R: 1629   ->1610     |N:3  |W    |B    |W    |B    |W    |B    |W    |" 
 [55] "-----------------------------------------------------------------------------------------" 
 [56] "   18 | DAVID SUNDEEN                   |4.0  |W  47|W   9|L   1|W  32|L  19|W  38|L  10|" 
 [57] "   MI | 11342094 / R: 1600   ->1600     |N:3  |B    |W    |B    |W    |B    |W    |B    |" 
 [58] "-----------------------------------------------------------------------------------------" 
 [59] "   19 | DIPANKAR ROY                    |4.0  |D  15|W  10|W  52|D  28|W  18|L   4|L   8|" 
 [60] "   MI | 14862333 / R: 1564   ->1570     |N:3  |W    |B    |W    |B    |W    |W    |B    |" 
 [61] "-----------------------------------------------------------------------------------------" 
 [62] "   20 | JASON ZHENG                     |4.0  |L  40|W  49|W  23|W  41|W  28|L   2|L   9|" 
 [63] "   MI | 14529060 / R: 1595   ->1569     |N:4  |W    |B    |W    |B    |W    |B    |W    |" 
 [64] "-----------------------------------------------------------------------------------------" 
 [65] "   21 | DINH DANG BUI                   |4.0  |W  43|L   1|W  47|L   3|W  40|W  39|L   6|" 
 [66] "   ON | 15495066 / R: 1563P22->1562     |N:3  |B    |W    |B    |W    |W    |B    |W    |" 
 [67] "-----------------------------------------------------------------------------------------" 
 [68] "   22 | EUGENE L MCCLURE                |4.0  |W  64|D  52|L  28|W  15|H    |L  17|W  40|" 
 [69] "   MI | 12405534 / R: 1555   ->1529     |N:4  |W    |B    |W    |B    |     |W    |B    |" 
 [70] "-----------------------------------------------------------------------------------------" 
 [71] "   23 | ALAN BUI                        |4.0  |L   4|W  43|L  20|W  58|L  17|W  37|W  46|" 
 [72] "   ON | 15030142 / R: 1363   ->1371     |     |B    |W    |B    |W    |B    |W    |B    |" 
 [73] "-----------------------------------------------------------------------------------------" 
 [74] "   24 | MICHAEL R ALDRICH               |4.0  |L  28|L  47|W  43|L  25|W  60|W  44|W  39|" 
 [75] "   MI | 13469010 / R: 1229   ->1300     |N:4  |B    |W    |B    |B    |W    |W    |B    |" 
 [76] "-----------------------------------------------------------------------------------------" 
 [77] "   25 | LOREN SCHWIEBERT                |3.5  |L   9|W  53|L   3|W  24|D  34|L  10|W  47|" 
 [78] "   MI | 12486656 / R: 1745   ->1681     |N:4  |B    |W    |B    |W    |B    |W    |B    |" 
 [79] "-----------------------------------------------------------------------------------------" 
 [80] "   26 | MAX ZHU                         |3.5  |W  49|W  40|W  17|L   4|L   9|D  32|L  11|" 
 [81] "   ON | 15131520 / R: 1579   ->1564     |N:4  |B    |W    |B    |W    |B    |W    |W    |" 
 [82] "-----------------------------------------------------------------------------------------" 
 [83] "   27 | GAURAV GIDWANI                  |3.5  |W  51|L  13|W  46|W  37|D  14|L   6|U    |" 
 [84] "   MI | 14476567 / R: 1552   ->1539     |N:4  |W    |B    |W    |B    |W    |B    |     |" 
 [85] "-----------------------------------------------------------------------------------------" 
 [86] "   28 | SOFIA ADINA STANESCU-BELLU      |3.5  |W  24|D   4|W  22|D  19|L  20|L   8|D  36|" 
 [87] "   MI | 14882954 / R: 1507   ->1513     |N:3  |W    |W    |B    |W    |B    |B    |W    |" 
 [88] "-----------------------------------------------------------------------------------------" 
 [89] "   29 | CHIEDOZIE OKORIE                |3.5  |W  50|D   6|L  38|L  34|W  52|W  48|U    |" 
 [90] "   MI | 15323285 / R: 1602P6 ->1508P12  |N:4  |B    |W    |B    |W    |W    |B    |     |" 
 [91] "-----------------------------------------------------------------------------------------" 
 [92] "   30 | GEORGE AVERY JONES              |3.5  |L  52|D  64|L  15|W  55|L  31|W  61|W  50|" 
 [93] "   ON | 12577178 / R: 1522   ->1444     |     |W    |B    |B    |W    |W    |B    |B    |" 
 [94] "-----------------------------------------------------------------------------------------" 
 [95] "   31 | RISHI SHETTY                    |3.5  |L  58|D  55|W  64|L  10|W  30|W  50|L  14|" 
 [96] "   MI | 15131618 / R: 1494   ->1444     |     |B    |W    |B    |W    |B    |W    |B    |" 
 [97] "-----------------------------------------------------------------------------------------" 
 [98] "   32 | JOSHUA PHILIP MATHEWS           |3.5  |W  61|L   8|W  44|L  18|W  51|D  26|L  13|" 
 [99] "   ON | 14073750 / R: 1441   ->1433     |N:4  |W    |B    |W    |B    |W    |B    |W    |" 
[100] "-----------------------------------------------------------------------------------------" 
[101] "   33 | JADE GE                         |3.5  |W  60|L  12|W  50|D  36|L  13|L  15|W  51|" 
[102] "   MI | 14691842 / R: 1449   ->1421     |     |B    |W    |B    |W    |B    |W    |B    |" 
[103] "-----------------------------------------------------------------------------------------" 
[104] "   34 | MICHAEL JEFFERY THOMAS          |3.5  |L   6|W  60|L  37|W  29|D  25|L  11|W  52|" 
[105] "   MI | 15051807 / R: 1399   ->1400     |     |B    |W    |B    |B    |W    |B    |W    |" 
[106] "-----------------------------------------------------------------------------------------" 
[107] "   35 | JOSHUA DAVID LEE                |3.5  |L  46|L  38|W  56|L   6|W  57|D  52|W  48|" 
[108] "   MI | 14601397 / R: 1438   ->1392     |     |W    |W    |B    |W    |B    |B    |W    |" 
[109] "-----------------------------------------------------------------------------------------" 
[110] "   36 | SIDDHARTH JHA                   |3.5  |L  13|W  57|W  51|D  33|H    |L  16|D  28|" 
[111] "   MI | 14773163 / R: 1355   ->1367     |N:4  |W    |B    |W    |B    |     |W    |B    |" 
[112] "-----------------------------------------------------------------------------------------" 
[113] "   37 | AMIYATOSH PWNANANDAM            |3.5  |B    |L   5|W  34|L  27|H    |L  23|W  61|" 
[114] "   MI | 15489571 / R:  980P12->1077P17  |     |     |B    |W    |W    |     |B    |W    |" 
[115] "-----------------------------------------------------------------------------------------" 
[116] "   38 | BRIAN LIU                       |3.0  |D  11|W  35|W  29|L  12|H    |L  18|L  15|" 
[117] "   MI | 15108523 / R: 1423   ->1439     |N:4  |W    |B    |W    |W    |     |B    |B    |" 
[118] "-----------------------------------------------------------------------------------------" 
[119] "   39 | JOEL R HENDON                   |3.0  |L   1|W  54|W  40|L  16|W  44|L  21|L  24|" 
[120] "   MI | 12923035 / R: 1436P23->1413     |N:4  |B    |W    |B    |W    |B    |W    |W    |" 
[121] "-----------------------------------------------------------------------------------------" 
[122] "   40 | FOREST ZHANG                    |3.0  |W  20|L  26|L  39|W  59|L  21|W  56|L  22|" 
[123] "   MI | 14892710 / R: 1348   ->1346     |     |B    |B    |W    |W    |B    |W    |W    |" 
[124] "-----------------------------------------------------------------------------------------" 
[125] "   41 | KYLE WILLIAM MURPHY             |3.0  |W  59|L  17|W  58|L  20|X    |U    |U    |" 
[126] "   MI | 15761443 / R: 1403P5 ->1341P9   |     |B    |W    |B    |W    |     |     |     |" 
[127] "-----------------------------------------------------------------------------------------" 
[128] "   42 | JARED GE                        |3.0  |L  12|L  50|L  57|D  60|D  61|W  64|W  56|" 
[129] "   MI | 14462326 / R: 1332   ->1256     |     |B    |W    |B    |B    |W    |W    |B    |" 
[130] "-----------------------------------------------------------------------------------------" 
[131] "   43 | ROBERT GLEN VASEY               |3.0  |L  21|L  23|L  24|W  63|W  59|L  46|W  55|" 
[132] "   MI | 14101068 / R: 1283   ->1244     |     |W    |B    |W    |W    |B    |B    |W    |" 
[133] "-----------------------------------------------------------------------------------------" 
[134] "   44 | JUSTIN D SCHILLING              |3.0  |B    |L  14|L  32|W  53|L  39|L  24|W  59|" 
[135] "   MI | 15323504 / R: 1199   ->1199     |     |     |W    |B    |B    |W    |B    |W    |" 
[136] "-----------------------------------------------------------------------------------------" 
[137] "   45 | DEREK YAN                       |3.0  |L   5|L  51|D  60|L  56|W  63|D  55|W  58|" 
[138] "   MI | 15372807 / R: 1242   ->1191     |     |W    |B    |W    |B    |W    |B    |W    |" 
[139] "-----------------------------------------------------------------------------------------" 
[140] "   46 | JACOB ALEXANDER LAVALLEY        |3.0  |W  35|L   7|L  27|L  50|W  64|W  43|L  23|" 
[141] "   MI | 15490981 / R:  377P3 ->1076P10  |     |B    |W    |B    |W    |B    |W    |W    |" 
[142] "-----------------------------------------------------------------------------------------" 
[143] "   47 | ERIC WRIGHT                     |2.5  |L  18|W  24|L  21|W  61|L   8|D  51|L  25|" 
[144] "   MI | 12533115 / R: 1362   ->1341     |     |W    |B    |W    |B    |W    |B    |W    |" 
[145] "-----------------------------------------------------------------------------------------" 
[146] "   48 | DANIEL KHAIN                    |2.5  |L  17|W  63|H    |D  52|H    |L  29|L  35|" 
[147] "   MI | 14369165 / R: 1382   ->1335     |     |B    |W    |     |B    |     |W    |B    |" 
[148] "-----------------------------------------------------------------------------------------" 
[149] "   49 | MICHAEL J MARTIN                |2.5  |L  26|L  20|D  63|D  64|W  58|H    |U    |" 
[150] "   MI | 12531685 / R: 1291P12->1259P17  |     |W    |W    |B    |W    |B    |     |     |" 
[151] "-----------------------------------------------------------------------------------------" 
[152] "   50 | SHIVAM JHA                      |2.5  |L  29|W  42|L  33|W  46|H    |L  31|L  30|" 
[153] "   MI | 14773178 / R: 1056   ->1111     |     |W    |B    |W    |B    |     |B    |W    |" 
[154] "-----------------------------------------------------------------------------------------" 
[155] "   51 | TEJAS AYYAGARI                  |2.5  |L  27|W  45|L  36|W  57|L  32|D  47|L  33|" 
[156] "   MI | 15205474 / R: 1011   ->1097     |     |B    |W    |B    |W    |B    |W    |W    |" 
[157] "-----------------------------------------------------------------------------------------" 
[158] "   52 | ETHAN GUO                       |2.5  |W  30|D  22|L  19|D  48|L  29|D  35|L  34|" 
[159] "   MI | 14918803 / R:  935   ->1092     |N:4  |B    |W    |B    |W    |B    |W    |B    |" 
[160] "-----------------------------------------------------------------------------------------" 
[161] "   53 | JOSE C YBARRA                   |2.0  |H    |L  25|H    |L  44|U    |W  57|U    |" 
[162] "   MI | 12578849 / R: 1393   ->1359     |     |     |B    |     |W    |     |W    |     |" 
[163] "-----------------------------------------------------------------------------------------" 
[164] "   54 | LARRY HODGE                     |2.0  |L  14|L  39|L  61|B    |L  15|L  59|W  64|" 
[165] "   MI | 12836773 / R: 1270   ->1200     |     |B    |B    |W    |     |W    |B    |W    |" 
[166] "-----------------------------------------------------------------------------------------" 
[167] "   55 | ALEX KONG                       |2.0  |L  62|D  31|L  10|L  30|B    |D  45|L  43|" 
[168] "   MI | 15412571 / R: 1186   ->1163     |     |W    |B    |W    |B    |     |W    |B    |" 
[169] "-----------------------------------------------------------------------------------------" 
[170] "   56 | MARISA RICCI                    |2.0  |H    |L  11|L  35|W  45|H    |L  40|L  42|" 
[171] "   MI | 14679887 / R: 1153   ->1140     |     |     |B    |W    |W    |     |B    |W    |" 
[172] "-----------------------------------------------------------------------------------------" 
[173] "   57 | MICHAEL LU                      |2.0  |L   7|L  36|W  42|L  51|L  35|L  53|B    |" 
[174] "   MI | 15113330 / R: 1092   ->1079     |     |B    |W    |W    |B    |W    |B    |     |" 
[175] "-----------------------------------------------------------------------------------------" 
[176] "   58 | VIRAJ MOHILE                    |2.0  |W  31|L   2|L  41|L  23|L  49|B    |L  45|" 
[177] "   MI | 14700365 / R:  917   -> 941     |     |W    |B    |W    |B    |W    |     |B    |" 
[178] "-----------------------------------------------------------------------------------------" 
[179] "   59 | SEAN M MC CORMICK               |2.0  |L  41|B    |L   9|L  40|L  43|W  54|L  44|" 
[180] "   MI | 12841036 / R:  853   -> 878     |     |W    |     |B    |B    |W    |W    |B    |" 
[181] "-----------------------------------------------------------------------------------------" 
[182] "   60 | JULIA SHEN                      |1.5  |L  33|L  34|D  45|D  42|L  24|H    |U    |" 
[183] "   MI | 14579262 / R:  967   -> 984     |     |W    |B    |B    |W    |B    |     |     |" 
[184] "-----------------------------------------------------------------------------------------" 
[185] "   61 | JEZZEL FARKAS                   |1.5  |L  32|L   3|W  54|L  47|D  42|L  30|L  37|" 
[186] "   ON | 15771592 / R:  955P11-> 979P18  |     |B    |W    |B    |W    |B    |W    |B    |" 
[187] "-----------------------------------------------------------------------------------------" 
[188] "   62 | ASHWIN BALAJI                   |1.0  |W  55|U    |U    |U    |U    |U    |U    |" 
[189] "   MI | 15219542 / R: 1530   ->1535     |     |B    |     |     |     |     |     |     |" 
[190] "-----------------------------------------------------------------------------------------" 
[191] "   63 | THOMAS JOSEPH HOSMER            |1.0  |L   2|L  48|D  49|L  43|L  45|H    |U    |" 
[192] "   MI | 15057092 / R: 1175   ->1125     |     |W    |B    |W    |B    |B    |     |     |" 
[193] "-----------------------------------------------------------------------------------------" 
[194] "   64 | BEN LI                          |1.0  |L  22|D  30|L  31|D  49|L  46|L  42|L  54|" 
[195] "   MI | 15006561 / R: 1163   ->1112     |     |B    |W    |W    |B    |W    |B    |B    |" 
[196] "-----------------------------------------------------------------------------------------" 

Defining Pattern Variables

This step was critical, and I relied heavily on an LLM to assist with the regex patterns. With this code we define what we are looking for. Because the data came as a text file with extra lines and spacing, we need to clean it up in order to create a usable and readable data frame. For example, each player occupies two lines: the player name line and the player rating line. Each line starts with leading whitespace, and the values are separated by “ | ” characters. All of this needs to be stripped out. Each regex pattern extracts exactly one field: pattern_name gives us only the player name, pattern_pts gives us the player's total points, pattern_opps gives us the opponent number from each round, pattern_state gives us the state code, and pattern_rtg gives us the pre-rating value for each player. The average pre-rating of each player's opponents is what we then calculate and report.

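# The character class includes a hyphen so that hyphenated names
# (e.g. STANESCU-BELLU) are matched instead of returning NA.
pattern_name  <- "(?<=\\|)\\s*[A-Z][A-Z ,.'\\-]+(?=\\s*\\|)"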
pattern_pts   <- "\\d\\.\\d"
pattern_opps  <- "(?<=[WLD]\\s{1,3})\\d+"
pattern_state <- "^\\s+([A-Z]{2})\\s*\\|"
pattern_rtg   <- "(?<=R:\\s{1,4})\\d+"
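The two lookbehind patterns can be checked on a single player. The sketch below copies the two lines for player 12 from the tournament file and repeats the pattern strings inline so that it runs on its own; the variable names are illustrative.

```r
library(stringr)

# Sample lines copied from the tournament file (player 12).
player_line <- "   12 | KENNETH J TACK                  |4.5  |W  42|W  33|D   5|W  38|H    |D   1|L   3|"
rating_line <- "   MI | 12681257 / R: 1663   ->1670     |N:3  |W    |B    |W    |B    |     |W    |B    |"

# Lookbehind: digits count only when preceded by W, L or D plus 1-3 spaces,
# so the round marked "H" (no opponent) contributes nothing.
opps <- str_extract_all(player_line, "(?<=[WLD]\\s{1,3})\\d+")[[1]]

# Lookbehind on "R:" picks up the pre-rating and skips the post-rating.
pre_rating <- as.numeric(str_extract(rating_line, "(?<=R:\\s{1,4})\\d+"))

print(opps)
print(pre_rating)
```

This player has seven rounds but only six opponents, which is why the opponents column is stored as a list rather than a fixed number of columns.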

Cleaning up the data to be used as a dataframe

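str_extract is the key function here: it returns the first substring that matches the regex pattern argument, and is called as str_extract(string, pattern). str_extract_all returns every match instead, which is why it is used for the opponents.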
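A short sketch of the difference between the two functions, using the two lines for player 1 copied from the tournament file. The group argument assumes stringr 1.5.0 or later; the variable names are illustrative.

```r
library(stringr)

# Sample lines copied from the tournament file (player 1).
player_line <- "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"
rating_line <- "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"

# str_extract() returns only the first match, here the total points.
first_points <- str_extract(player_line, "\\d\\.\\d")

# str_extract_all() returns every match, as a list with one element
# per input string; [[1]] pulls out the matches for our single line.
all_numbers <- str_extract_all(player_line, "\\d+")[[1]]

# group = 1 returns the first capture group rather than the whole match,
# so the surrounding whitespace and "|" are dropped.
state <- str_extract(rating_line, "^\\s+([A-Z]{2})\\s*\\|", group = 1)

print(first_points)
print(all_numbers)
print(state)
```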

player_lines <- data[str_detect(data, "^\\s+\\d+\\s*\\|")]

rating_lines <- data[str_detect(data, "^\\s+[A-Z]{2}\\s*\\|")]

head(player_lines, 3)
[1] "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|"
[2] "    2 | DAKSHESH DARURI                 |6.0  |W  63|W  58|L   4|W  17|W  16|W  20|W   7|"
[3] "    3 | ADITYA BAJAJ                    |6.0  |L   8|W  61|W  25|W  21|W  11|W  13|W  12|"
head(rating_lines, 3)
[1] "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"
[2] "   MI | 14598900 / R: 1553   ->1663     |N:2  |B    |W    |B    |W    |B    |W    |B    |"
[3] "   MI | 14959604 / R: 1384   ->1640     |N:2  |W    |B    |W    |B    |W    |B    |W    |"
df <- tibble(player_lines, rating_lines) 

dforganized <- df %>%
  mutate(
    player_name  = trimws(str_extract(player_lines, pattern_name)),
    total_points = as.numeric(str_extract(player_lines, pattern_pts)),
    opponents    = str_extract_all(player_lines, pattern_opps),
    player_state = trimws(str_extract(rating_lines, pattern_state, group = 1)),
    pre_rating   = as.numeric(str_extract(rating_lines, pattern_rtg)))
  
print(dforganized)
# A tibble: 64 × 7
   player_lines     rating_lines player_name total_points opponents player_state
   <chr>            <chr>        <chr>              <dbl> <list>    <chr>       
 1 "    1 | GARY H… "   ON | 15… GARY HUA             6   <chr [7]> ON          
 2 "    2 | DAKSHE… "   MI | 14… DAKSHESH D…          6   <chr [7]> MI          
 3 "    3 | ADITYA… "   MI | 14… ADITYA BAJ…          6   <chr [7]> MI          
 4 "    4 | PATRIC… "   MI | 12… PATRICK H …          5.5 <chr [7]> MI          
 5 "    5 | HANSHI… "   MI | 14… HANSHI ZUO           5.5 <chr [7]> MI          
 6 "    6 | HANSEN… "   OH | 15… HANSEN SONG          5   <chr [7]> OH          
 7 "    7 | GARY D… "   MI | 11… GARY DEE S…          5   <chr [7]> MI          
 8 "    8 | EZEKIE… "   MI | 15… EZEKIEL HO…          5   <chr [7]> MI          
 9 "    9 | STEFAN… "   ON | 14… STEFANO LEE          5   <chr [7]> ON          
10 "   10 | ANVIT … "   MI | 14… ANVIT RAO            5   <chr [7]> MI          
# ℹ 54 more rows
# ℹ 1 more variable: pre_rating <dbl>

Calculating Avg Pre Rating

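finaldf <- dforganized %>%
  mutate(
    # For each player, look up the pre-rating of every opponent and average them.
    avgPreRating = sapply(opponents, function(opps) {
      # Opponent numbers are pair numbers, which equal row positions here.
      opponent_index   <- as.integer(opps)
      opponent_ratings <- pre_rating[opponent_index]
      round(mean(opponent_ratings, na.rm = TRUE))
    })
  ) %>%
  select(player_name, player_state, total_points, pre_rating, avgPreRating)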
 
 print(finaldf)
# A tibble: 64 × 5
   player_name         player_state total_points pre_rating avgPreRating
   <chr>               <chr>               <dbl>      <dbl>        <dbl>
 1 GARY HUA            ON                    6         1794         1605
 2 DAKSHESH DARURI     MI                    6         1553         1469
 3 ADITYA BAJAJ        MI                    6         1384         1564
 4 PATRICK H SCHILLING MI                    5.5       1716         1574
 5 HANSHI ZUO          MI                    5.5       1655         1501
 6 HANSEN SONG         OH                    5         1686         1519
 7 GARY DEE SWATHELL   MI                    5         1649         1372
 8 EZEKIEL HOUGHTON    MI                    5         1641         1468
 9 STEFANO LEE         ON                    5         1411         1523
10 ANVIT RAO           MI                    5         1365         1554
# ℹ 54 more rows
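The calculation above indexes pre_rating by opponent number, which works because row i of the data frame is pair number i. If the rows were ever reordered, a lookup by pair number would be safer. The sketch below shows that idea on four players copied from the tournament file and deliberately shuffled; pair_num is a hypothetical column that the project's data frame does not contain, and the opponent list is a made-up subset for illustration.

```r
# Four players from the tournament file, in shuffled order.
# pair_num is a hypothetical column, not part of the original data frame.
pair_num   <- c(4, 1, 12, 7)
pre_rating <- c(1716, 1794, 1663, 1649)

# An illustrative subset of pair 1's opponents.
opps <- c("7", "12", "4")

# match() finds each opponent's row by pair number, not by position.
rows <- match(as.integer(opps), pair_num)

# (1649 + 1663 + 1716) / 3 = 1676
avg <- round(mean(pre_rating[rows], na.rm = TRUE))

print(rows)
print(avg)
```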

Exporting to CSV

write_csv(finaldf, "results.csv")

check <- read_csv("results.csv")
Rows: 64 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): player_name, player_state
dbl (3): total_points, pre_rating, avgPreRating

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
print(check)
# A tibble: 64 × 5
   player_name         player_state total_points pre_rating avgPreRating
   <chr>               <chr>               <dbl>      <dbl>        <dbl>
 1 GARY HUA            ON                    6         1794         1605
 2 DAKSHESH DARURI     MI                    6         1553         1469
 3 ADITYA BAJAJ        MI                    6         1384         1564
 4 PATRICK H SCHILLING MI                    5.5       1716         1574
 5 HANSHI ZUO          MI                    5.5       1655         1501
 6 HANSEN SONG         OH                    5         1686         1519
 7 GARY DEE SWATHELL   MI                    5         1649         1372
 8 EZEKIEL HOUGHTON    MI                    5         1641         1468
 9 STEFANO LEE         ON                    5         1411         1523
10 ANVIT RAO           MI                    5         1365         1554
# ℹ 54 more rows
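The round trip can also be checked programmatically instead of by eye, by comparing the re-read file with the original column by column. This is a minimal sketch on a two-row stand-in whose values are copied from the table above; toy, back and same are illustrative names, and the file is written to a temporary path rather than results.csv.

```r
library(readr)

# Two-row stand-in for the results table (values copied from the output above).
toy <- data.frame(
  player_name  = c("GARY HUA", "DAKSHESH DARURI"),
  total_points = c(6, 6),
  avgPreRating = c(1605, 1469)
)

# Write to a temporary file and read it back without the column-spec message.
path <- tempfile(fileext = ".csv")
write_csv(toy, path)
back <- read_csv(path, show_col_types = FALSE)

# Compare each column of the original with the re-read copy.
same <- all(vapply(
  names(toy),
  function(col) isTRUE(all.equal(toy[[col]], back[[col]])),
  logical(1)
))
print(same)
```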

Conclusion

The hardest part of this assignment was converting the txt file into a readable and usable format in R. An LLM was extremely helpful for understanding the regex patterns needed to parse the txt file and obtain the numbers and strings we needed, getting rid of empty spaces and non-usable characters such as “|”. A lot of this was data processing, and it was a new way of handling an unfamiliar type of data.