Importing Data

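library(dplyr)   # %>%, transmute(), slice()
library(tidyr)   # separate(), drop_na()

# The file has no header row, so read.csv() turns its first line into the column name
tournament_df <- read.csv("https://raw.githubusercontent.com/himalayahall/DATA607/main/Project1/tournamentinfo.txt")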

tournament_df <- tournament_df %>%
  transmute( x = X.........................................................................................)

head(tournament_df)
##                                                                                            x
## 1  Pair | Player Name                     |Total|Round|Round|Round|Round|Round|Round|Round| 
## 2  Num  | USCF ID / Rtg (Pre->Post)       | Pts |  1  |  2  |  3  |  4  |  5  |  6  |  7  | 
## 3  -----------------------------------------------------------------------------------------
## 4      1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|
## 5     ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |
## 6  -----------------------------------------------------------------------------------------
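
The long column name above arises because read.csv() treats the file's first line as a header. As an alternative, the sketch below wraps raw text lines in a one-column data frame and discards the dashed separator lines at the same time. The helper name lines_to_df is hypothetical, and a few sample lines copied from the output above stand in for the download; with the real file, the result of readLines() on the URL could presumably be passed in instead.

```r
# Hypothetical helper: put raw text lines into a one-column data frame named x,
# dropping lines that consist only of dashes.
lines_to_df <- function(lines) {
  keep <- !grepl("^-+\\s*$", lines)
  data.frame(x = lines[keep], stringsAsFactors = FALSE)
}

# Sample lines standing in for the file contents (assumption: same layout as above)
sample_lines <- c(
  "-----------------------------------------------------------------------------------------",
  " Pair | Player Name                     |Total|Round|Round|Round|Round|Round|Round|Round| ",
  " Num  | USCF ID / Rtg (Pre->Post)       | Pts |  1  |  2  |  3  |  4  |  5  |  6  |  7  | ",
  "-----------------------------------------------------------------------------------------",
  "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|",
  "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |",
  "-----------------------------------------------------------------------------------------"
)

tournament_df <- lines_to_df(sample_lines)
```

Reading lines this way keeps the column name short from the start, so the transmute() step would no longer be needed.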

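Tidying Data

Separating the “x” column into multiple columns. The names A, B, C, and D are placeholders I used to see how the separate function works. Because no sep argument is given, separate splits on every run of non-alphanumeric characters, so spaces and decimal points act as delimiters along with the pipes. That is why "GARY HUA" lands in two columns and "6.0" becomes 6 and 0.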

new_df <- tournament_df %>%
  separate(x, into = c(NA, "Pair_Num", "Player_Name", "Total", "Points", "A", "B", "C", "D"), extra = "merge")
## Warning: Expected 9 pieces. Missing pieces filled with `NA` in 66 rows [3, 6, 9,
## 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54, 57, 60, ...].
head(new_df)
##   Pair_Num Player_Name Total Points     A     B     C
## 1     Pair      Player  Name  Total Round Round Round
## 2      Num        USCF    ID    Rtg   Pre  Post   Pts
## 3                 <NA>  <NA>   <NA>  <NA>  <NA>  <NA>
## 4        1        GARY   HUA      6     0     W    39
## 5       ON    15445895     R   1794  1817     N     2
## 6                 <NA>  <NA>   <NA>  <NA>  <NA>  <NA>
##                                            D
## 1                  Round|Round|Round|Round| 
## 2  1  |  2  |  3  |  4  |  5  |  6  |  7  | 
## 3                                       <NA>
## 4       W  21|W  18|W  14|W   7|D  12|D   4|
## 5 W    |B    |W    |B    |W    |B    |W    |
## 6                                       <NA>
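
Since the fields in each line are delimited by "|", one option is to pass that character as the sep argument so that names and decimal values stay intact. With whitespace-based splitting, a name with three words would also push every later value one column to the right. The following is a minimal sketch on two sample lines copied from the output above; the column names R1 to R7 are my own assumption for the seven round columns.

```r
library(dplyr)
library(tidyr)

# Two sample lines copied from the head() output
sample_df <- data.frame(
  x = c(
    "    1 | GARY HUA                        |6.0  |W  39|W  21|W  18|W  14|W   7|D  12|D   4|",
    "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |W    |B    |W    |B    |W    |"
  ),
  stringsAsFactors = FALSE
)

# Ten named fields; R1..R7 are assumed names for the round columns
cols <- c("Pair_Num", "Player_Name", "Total", paste0("R", 1:7))

split_df <- sample_df %>%
  # "|" is a regex metacharacter, so it has to be escaped;
  # extra = "drop" discards the empty piece after the final pipe
  separate(x, into = cols, sep = "\\|", extra = "drop") %>%
  # remove the padding spaces around each value
  mutate(across(everything(), trimws))
```

With this split, Player_Name holds the full name on the first line of each pair and the ID and rating text on the second.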

I dropped the rows containing NA, which are the dashed separator lines, and then removed the first two rows, which hold the column headings rather than player data.

new_df <- new_df %>%
  drop_na() %>%      # remove the separator rows
  slice(-(1:2))      # remove the two heading rows
head(new_df, 10)
##    Pair_Num Player_Name  Total    Points    A B  C
## 1         1        GARY    HUA         6    0 W 39
## 2        ON    15445895      R      1794 1817 N  2
## 3         2    DAKSHESH DARURI         6    0 W 63
## 4        MI    14598900      R      1553 1663 N  2
## 5         3      ADITYA  BAJAJ         6    0 L  8
## 6        MI    14959604      R      1384 1640 N  2
## 7         4     PATRICK      H SCHILLING    5 5  W
## 8        MI    12616049      R      1716 1744 N  2
## 9         5      HANSHI    ZUO         5    5 W 45
## 10       MI    14601533      R      1655 1690 N  2
##                                             D
## 1        W  21|W  18|W  14|W   7|D  12|D   4|
## 2  W    |B    |W    |B    |W    |B    |W    |
## 3        W  58|L   4|W  17|W  16|W  20|W   7|
## 4  B    |W    |B    |W    |B    |W    |B    |
## 5        W  61|W  25|W  21|W  11|W  13|W  12|
## 6  W    |B    |W    |B    |W    |B    |W    |
## 7     23|D  28|W   2|W  26|D   5|W  19|D   1|
## 8  W    |B    |W    |B    |W    |B    |B    |
## 9        W  37|D  12|D  13|D   4|W  14|W  17|
## 10 B    |W    |B    |W    |B    |W    |B    |

Unfortunately, this is where I got stuck and couldn’t move on to complete this project.
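
One possible way forward, assuming the goal is one row per player, is to treat the odd rows (name and total points) and the even rows (state and rating) as two halves of the same record and bind them side by side. The sketch below is untested against the full file; it assumes the data has already been split on "|" and trimmed, and the output column names State, Total_Pts, and Pre_Rating are hypothetical.

```r
# Sample rows in the assumed intermediate form: two rows per player
players <- data.frame(
  Pair_Num    = c("1", "ON", "2", "MI"),
  Player_Name = c("GARY HUA", "15445895 / R: 1794   ->1817",
                  "DAKSHESH DARURI", "14598900 / R: 1553   ->1663"),
  Total       = c("6.0", "N:2", "6.0", "N:2"),
  stringsAsFactors = FALSE
)

# Odd rows carry the name and points; even rows carry the state and ratings
odd  <- players[seq(1, nrow(players), by = 2), ]
even <- players[seq(2, nrow(players), by = 2), ]

one_row <- data.frame(
  Pair_Num    = as.integer(odd$Pair_Num),
  Player_Name = odd$Player_Name,
  State       = even$Pair_Num,
  Total_Pts   = as.numeric(odd$Total),
  # capture the digits that follow "R:" (the pre-tournament rating)
  Pre_Rating  = as.integer(sub(".*R:\\s*(\\d+).*", "\\1",
                               even$Player_Name, perl = TRUE)),
  stringsAsFactors = FALSE
)
```

This relies on every player occupying exactly two consecutive rows, which holds for the rows shown above but would need checking on the full data.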