In project 1, we were given a text file with chess tournament results. The goal of the project was to tidy the data in r, calculate the average pre chess rating of opponents, and generate a .CSV file. In this extra credit assignment, we identify the top 5 over performers and under performers by comparing the player’s actual score and their calculated expected score.
In project 1, I loaded the text file into R. Below is the table of the text file. The table has the following issues:
1. There are dashes and empty rows and column.
2. Each player’s information are within every two rows.
2. The ‘Player names’, ‘USCF ID’, ‘Pre-Rating’ and ‘Post-Rating’ are in every 2 rows of the 2nd column.
3. There are no header.
4. Every cell contains both numeric and non-meric values.
#library(dplyr)
library(tidyverse)
library(tidyr)
library(DT)
# I used the following links to see how to read txt file http://rfunction.com/archives/1441 and
txtfile <- readLines('https://raw.githubusercontent.com/suswong/DATA-607-Project-1/main/tournamentinfo.txt')
table1<-read.delim(textConnection(txtfile),header=FALSE,sep="|")
datatable(table1)
Below is a table of the tidied data. Below are the steps I took to tidy the data:
new <-table1 %>%
filter(row_number() %% 3 != 1) ## Delete every 3rd row starting from 1
#I searched how to fitler certain rows that I do not want to keep in my datatable. <https://stackoverflow.com/questions/24440258/selecting-multiple-odd-or-even-columns-rows-for-dataframe>
colnames(new) <- c("Pair_Num", "Player_Name", "Total_Points", "Round_1", "Round_2", "Round_3", "Round_4", "Round_5", "Round_6", "Round_7", "n/a")
new <- new[-1,] #Remove the 1st row
new <- new[-1,] #Remove the 2nd row
new <- new[,-11] #Remove the last column
#I searched how to select and extract even and odd rows using the following link. <https://datacarpentry.org/R-genomics/03-data-frames.html>
odd <- seq(1,nrow(new),2)
even <- seq(2,nrow(new),2)
players <-new[odd,]
player_state <- select(new[even,],1,2) #I used this link to see how to select certain columns https://sparkbyexamples.com/r-programming/r-select-function-from-dplyr/
colnames(player_state) <- c("State", "USCF_ID.Rating")
# I used the following link to split a column into multiple columns. https://www.statology.org/split-column-in-r/
library(stringr)
player_state[c('USCF_ID', 'Rating')] <- str_split_fixed(player_state$USCF_ID.Rating, '/ ', 2)
df1 <- player_state[c('State','USCF_ID', 'Rating')]
df1[c('Prerating', 'Postrating')] <- str_split_fixed(df1$Rating, '->', 2)
df2 <- df1[c('State','USCF_ID','Prerating','Postrating')]
df2$Prerating<-gsub("R:","",as.character(df2$Prerating)) # I used this link to help me remove "R:" in the Prerating column. https://www.tutorialspoint.com/how-to-remove-a-character-in-an-r-data-frame-column#:~:text=Side%20ProgrammingProgramming-,To%20remove%20a%20character%20in%20an%20R%20data%20frame%20column,%22%2C%22%22%2Cas.
# I search and used the following link to help me combine two tables. <https://statisticsglobe.com/cbind-r-command-example/>
df3 <- cbind(players,df2)
#df3 <- df3[,-1] #Remove the first column
#http://www.sthda.com/english/wiki/reordering-data-frame-columns-in-r
col_order <- c("Pair_Num","Player_Name", "State","USCF_ID","Prerating","Postrating","Total_Points", "Round_1", "Round_2", "Round_3", "Round_4", "Round_5", "Round_6", "Round_7")
df4 <- df3[, col_order]
# I search how to remove any letter from each round https://datascience.stackexchange.com/questions/15589/remove-part-of-string-in-r
df5 <- df4[c('Pair_Num', 'Player_Name','State','Prerating','Total_Points','Round_1','Round_2','Round_3','Round_4','Round_5','Round_6','Round_7')]
df5$Round_1<-gsub("[a-zA-Z ]","",as.character(df5$Round_1))
df5$Round_2<-gsub("[a-zA-Z ]","",as.character(df5$Round_2))
df5$Round_3<-gsub("[a-zA-Z ]","",as.character(df5$Round_3))
df5$Round_4<-gsub("[a-zA-Z ]","",as.character(df5$Round_4))
df5$Round_5<-gsub("[a-zA-Z ]","",as.character(df5$Round_5))
df5$Round_6<-gsub("[a-zA-Z ]","",as.character(df5$Round_6))
df5$Round_7<-gsub("[a-zA-Z ]","",as.character(df5$Round_7))
df5$Prerating<-as.numeric(unlist(str_extract(df5$Prerating,"\\d+\\d")))
df5$Pair_Num<-as.numeric(df5$Pair_Num)
Final <- df5
Tidied_Table <- df4
Tidied_Table$Prerating<-as.numeric(unlist(str_extract(df5$Prerating,"\\d+\\d")))
Tidied_Table$Postrating<-as.numeric(unlist(str_extract(df5$Prerating,"\\d+\\d")))
colnames(Tidied_Table) <- c("Player Number", "Player Name", "State","USCF_ID", "Pre-Rating","Post-Rating","Total Points", "Round 1", "Round 2", "Round 3", "Round 4", "Round 5", "Round 6", "Round 7")
rownames(Tidied_Table) <- NULL
library(kableExtra)
kbl(Tidied_Table) %>%
kable_classic() %>%
add_header_above(c("Player Information" = 7 , "Opponent Player Number for Each Round" = 7))
Player Information
|
Opponent Player Number for Each Round
|
||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Player Number | Player Name | State | USCF_ID | Pre-Rating | Post-Rating | Total Points | Round 1 | Round 2 | Round 3 | Round 4 | Round 5 | Round 6 | Round 7 |
1 | GARY HUA | ON | 15445895 | 1794 | 1794 | 6.0 | W 39 | W 21 | W 18 | W 14 | W 7 | D 12 | D 4 |
2 | DAKSHESH DARURI | MI | 14598900 | 1553 | 1553 | 6.0 | W 63 | W 58 | L 4 | W 17 | W 16 | W 20 | W 7 |
3 | ADITYA BAJAJ | MI | 14959604 | 1384 | 1384 | 6.0 | L 8 | W 61 | W 25 | W 21 | W 11 | W 13 | W 12 |
4 | PATRICK H SCHILLING | MI | 12616049 | 1716 | 1716 | 5.5 | W 23 | D 28 | W 2 | W 26 | D 5 | W 19 | D 1 |
5 | HANSHI ZUO | MI | 14601533 | 1655 | 1655 | 5.5 | W 45 | W 37 | D 12 | D 13 | D 4 | W 14 | W 17 |
6 | HANSEN SONG | OH | 15055204 | 1686 | 1686 | 5.0 | W 34 | D 29 | L 11 | W 35 | D 10 | W 27 | W 21 |
7 | GARY DEE SWATHELL | MI | 11146376 | 1649 | 1649 | 5.0 | W 57 | W 46 | W 13 | W 11 | L 1 | W 9 | L 2 |
8 | EZEKIEL HOUGHTON | MI | 15142253 | 1641 | 1641 | 5.0 | W 3 | W 32 | L 14 | L 9 | W 47 | W 28 | W 19 |
9 | STEFANO LEE | ON | 14954524 | 1411 | 1411 | 5.0 | W 25 | L 18 | W 59 | W 8 | W 26 | L 7 | W 20 |
10 | ANVIT RAO | MI | 14150362 | 1365 | 1365 | 5.0 | D 16 | L 19 | W 55 | W 31 | D 6 | W 25 | W 18 |
11 | CAMERON WILLIAM MC LEMAN | MI | 12581589 | 1712 | 1712 | 4.5 | D 38 | W 56 | W 6 | L 7 | L 3 | W 34 | W 26 |
12 | KENNETH J TACK | MI | 12681257 | 1663 | 1663 | 4.5 | W 42 | W 33 | D 5 | W 38 | H | D 1 | L 3 |
13 | TORRANCE HENRY JR | MI | 15082995 | 1666 | 1666 | 4.5 | W 36 | W 27 | L 7 | D 5 | W 33 | L 3 | W 32 |
14 | BRADLEY SHAW | MI | 10131499 | 1610 | 1610 | 4.5 | W 54 | W 44 | W 8 | L 1 | D 27 | L 5 | W 31 |
15 | ZACHARY JAMES HOUGHTON | MI | 15619130 | 1220 | 1220 | 4.5 | D 19 | L 16 | W 30 | L 22 | W 54 | W 33 | W 38 |
16 | MIKE NIKITIN | MI | 10295068 | 1604 | 1604 | 4.0 | D 10 | W 15 | H | W 39 | L 2 | W 36 | U |
17 | RONALD GRZEGORCZYK | MI | 10297702 | 1629 | 1629 | 4.0 | W 48 | W 41 | L 26 | L 2 | W 23 | W 22 | L 5 |
18 | DAVID SUNDEEN | MI | 11342094 | 1600 | 1600 | 4.0 | W 47 | W 9 | L 1 | W 32 | L 19 | W 38 | L 10 |
19 | DIPANKAR ROY | MI | 14862333 | 1564 | 1564 | 4.0 | D 15 | W 10 | W 52 | D 28 | W 18 | L 4 | L 8 |
20 | JASON ZHENG | MI | 14529060 | 1595 | 1595 | 4.0 | L 40 | W 49 | W 23 | W 41 | W 28 | L 2 | L 9 |
21 | DINH DANG BUI | ON | 15495066 | 1563 | 1563 | 4.0 | W 43 | L 1 | W 47 | L 3 | W 40 | W 39 | L 6 |
22 | EUGENE L MCCLURE | MI | 12405534 | 1555 | 1555 | 4.0 | W 64 | D 52 | L 28 | W 15 | H | L 17 | W 40 |
23 | ALAN BUI | ON | 15030142 | 1363 | 1363 | 4.0 | L 4 | W 43 | L 20 | W 58 | L 17 | W 37 | W 46 |
24 | MICHAEL R ALDRICH | MI | 13469010 | 1229 | 1229 | 4.0 | L 28 | L 47 | W 43 | L 25 | W 60 | W 44 | W 39 |
25 | LOREN SCHWIEBERT | MI | 12486656 | 1745 | 1745 | 3.5 | L 9 | W 53 | L 3 | W 24 | D 34 | L 10 | W 47 |
26 | MAX ZHU | ON | 15131520 | 1579 | 1579 | 3.5 | W 49 | W 40 | W 17 | L 4 | L 9 | D 32 | L 11 |
27 | GAURAV GIDWANI | MI | 14476567 | 1552 | 1552 | 3.5 | W 51 | L 13 | W 46 | W 37 | D 14 | L 6 | U |
28 | SOFIA ADINA STANESCU-BELLU | MI | 14882954 | 1507 | 1507 | 3.5 | W 24 | D 4 | W 22 | D 19 | L 20 | L 8 | D 36 |
29 | CHIEDOZIE OKORIE | MI | 15323285 | 1602 | 1602 | 3.5 | W 50 | D 6 | L 38 | L 34 | W 52 | W 48 | U |
30 | GEORGE AVERY JONES | ON | 12577178 | 1522 | 1522 | 3.5 | L 52 | D 64 | L 15 | W 55 | L 31 | W 61 | W 50 |
31 | RISHI SHETTY | MI | 15131618 | 1494 | 1494 | 3.5 | L 58 | D 55 | W 64 | L 10 | W 30 | W 50 | L 14 |
32 | JOSHUA PHILIP MATHEWS | ON | 14073750 | 1441 | 1441 | 3.5 | W 61 | L 8 | W 44 | L 18 | W 51 | D 26 | L 13 |
33 | JADE GE | MI | 14691842 | 1449 | 1449 | 3.5 | W 60 | L 12 | W 50 | D 36 | L 13 | L 15 | W 51 |
34 | MICHAEL JEFFERY THOMAS | MI | 15051807 | 1399 | 1399 | 3.5 | L 6 | W 60 | L 37 | W 29 | D 25 | L 11 | W 52 |
35 | JOSHUA DAVID LEE | MI | 14601397 | 1438 | 1438 | 3.5 | L 46 | L 38 | W 56 | L 6 | W 57 | D 52 | W 48 |
36 | SIDDHARTH JHA | MI | 14773163 | 1355 | 1355 | 3.5 | L 13 | W 57 | W 51 | D 33 | H | L 16 | D 28 |
37 | AMIYATOSH PWNANANDAM | MI | 15489571 | 980 | 980 | 3.5 | B | L 5 | W 34 | L 27 | H | L 23 | W 61 |
38 | BRIAN LIU | MI | 15108523 | 1423 | 1423 | 3.0 | D 11 | W 35 | W 29 | L 12 | H | L 18 | L 15 |
39 | JOEL R HENDON | MI | 12923035 | 1436 | 1436 | 3.0 | L 1 | W 54 | W 40 | L 16 | W 44 | L 21 | L 24 |
40 | FOREST ZHANG | MI | 14892710 | 1348 | 1348 | 3.0 | W 20 | L 26 | L 39 | W 59 | L 21 | W 56 | L 22 |
41 | KYLE WILLIAM MURPHY | MI | 15761443 | 1403 | 1403 | 3.0 | W 59 | L 17 | W 58 | L 20 | X | U | U |
42 | JARED GE | MI | 14462326 | 1332 | 1332 | 3.0 | L 12 | L 50 | L 57 | D 60 | D 61 | W 64 | W 56 |
43 | ROBERT GLEN VASEY | MI | 14101068 | 1283 | 1283 | 3.0 | L 21 | L 23 | L 24 | W 63 | W 59 | L 46 | W 55 |
44 | JUSTIN D SCHILLING | MI | 15323504 | 1199 | 1199 | 3.0 | B | L 14 | L 32 | W 53 | L 39 | L 24 | W 59 |
45 | DEREK YAN | MI | 15372807 | 1242 | 1242 | 3.0 | L 5 | L 51 | D 60 | L 56 | W 63 | D 55 | W 58 |
46 | JACOB ALEXANDER LAVALLEY | MI | 15490981 | 377 | 377 | 3.0 | W 35 | L 7 | L 27 | L 50 | W 64 | W 43 | L 23 |
47 | ERIC WRIGHT | MI | 12533115 | 1362 | 1362 | 2.5 | L 18 | W 24 | L 21 | W 61 | L 8 | D 51 | L 25 |
48 | DANIEL KHAIN | MI | 14369165 | 1382 | 1382 | 2.5 | L 17 | W 63 | H | D 52 | H | L 29 | L 35 |
49 | MICHAEL J MARTIN | MI | 12531685 | 1291 | 1291 | 2.5 | L 26 | L 20 | D 63 | D 64 | W 58 | H | U |
50 | SHIVAM JHA | MI | 14773178 | 1056 | 1056 | 2.5 | L 29 | W 42 | L 33 | W 46 | H | L 31 | L 30 |
51 | TEJAS AYYAGARI | MI | 15205474 | 1011 | 1011 | 2.5 | L 27 | W 45 | L 36 | W 57 | L 32 | D 47 | L 33 |
52 | ETHAN GUO | MI | 14918803 | 935 | 935 | 2.5 | W 30 | D 22 | L 19 | D 48 | L 29 | D 35 | L 34 |
53 | JOSE C YBARRA | MI | 12578849 | 1393 | 1393 | 2.0 | H | L 25 | H | L 44 | U | W 57 | U |
54 | LARRY HODGE | MI | 12836773 | 1270 | 1270 | 2.0 | L 14 | L 39 | L 61 | B | L 15 | L 59 | W 64 |
55 | ALEX KONG | MI | 15412571 | 1186 | 1186 | 2.0 | L 62 | D 31 | L 10 | L 30 | B | D 45 | L 43 |
56 | MARISA RICCI | MI | 14679887 | 1153 | 1153 | 2.0 | H | L 11 | L 35 | W 45 | H | L 40 | L 42 |
57 | MICHAEL LU | MI | 15113330 | 1092 | 1092 | 2.0 | L 7 | L 36 | W 42 | L 51 | L 35 | L 53 | B |
58 | VIRAJ MOHILE | MI | 14700365 | 917 | 917 | 2.0 | W 31 | L 2 | L 41 | L 23 | L 49 | B | L 45 |
59 | SEAN M MC CORMICK | MI | 12841036 | 853 | 853 | 2.0 | L 41 | B | L 9 | L 40 | L 43 | W 54 | L 44 |
60 | JULIA SHEN | MI | 14579262 | 967 | 967 | 1.5 | L 33 | L 34 | D 45 | D 42 | L 24 | H | U |
61 | JEZZEL FARKAS | ON | 15771592 | 955 | 955 | 1.5 | L 32 | L 3 | W 54 | L 47 | D 42 | L 30 | L 37 |
62 | ASHWIN BALAJI | MI | 15219542 | 1530 | 1530 | 1.0 | W 55 | U | U | U | U | U | U |
63 | THOMAS JOSEPH HOSMER | MI | 15057092 | 1175 | 1175 | 1.0 | L 2 | L 48 | D 49 | L 43 | L 45 | H | U |
64 | BEN LI | MI | 15006561 | 1163 | 1163 | 1.0 | L 22 | D 30 | L 31 | D 49 | L 46 | L 42 | L 54 |
In order to find the average pre-rating of opponents for each player, Ineed to remove the status (W, L, or D) in each round. Then, match and replace each opponent player number with their pre-rating number for each round. The average pre-rating of opponents for each player is the sum of all the opponents’ pre-rating divided by the total number of games played.
df6 <- df5[c('Pair_Num','Prerating')]
df5$Round_1 <- df6$Prerating[match(df5$Round_1,df6$Pair_Num)]
df5$Round_2 <- df6$Prerating[match(df5$Round_2,df6$Pair_Num)]
df5$Round_3 <- df6$Prerating[match(df5$Round_3,df6$Pair_Num)]
df5$Round_4 <- df6$Prerating[match(df5$Round_4,df6$Pair_Num)]
df5$Round_5 <- df6$Prerating[match(df5$Round_5,df6$Pair_Num)]
df5$Round_6 <- df6$Prerating[match(df5$Round_6,df6$Pair_Num)]
df5$Round_7 <- df6$Prerating[match(df5$Round_7,df6$Pair_Num)]
df5$AverageRtg <- round(rowMeans(df5[,6:12], na.rm=TRUE))
Average <- df5
colnames(Average) <- c("Player Number", "Player Name", "State", "Pre-Rating","Total Points", "Opponent 1", "Opponent 2", "Opponent 3", "Opponent 4", "Opponent 5", "Opponent 6", "Opponent 7", "Average Pre-Rating of Opponents")
rownames(Average) <- NULL
kbl(Average) %>%
kable_classic() %>%
add_header_above(c("Player Information" = 5 , "Opponent Pre-Rating for Each Round" = 8))
Player Information
|
Opponent Pre-Rating for Each Round
|
|||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Player Number | Player Name | State | Pre-Rating | Total Points | Opponent 1 | Opponent 2 | Opponent 3 | Opponent 4 | Opponent 5 | Opponent 6 | Opponent 7 | Average Pre-Rating of Opponents |
1 | GARY HUA | ON | 1794 | 6.0 | 1436 | 1563 | 1600 | 1610 | 1649 | 1663 | 1716 | 1605 |
2 | DAKSHESH DARURI | MI | 1553 | 6.0 | 1175 | 917 | 1716 | 1629 | 1604 | 1595 | 1649 | 1469 |
3 | ADITYA BAJAJ | MI | 1384 | 6.0 | 1641 | 955 | 1745 | 1563 | 1712 | 1666 | 1663 | 1564 |
4 | PATRICK H SCHILLING | MI | 1716 | 5.5 | 1363 | 1507 | 1553 | 1579 | 1655 | 1564 | 1794 | 1574 |
5 | HANSHI ZUO | MI | 1655 | 5.5 | 1242 | 980 | 1663 | 1666 | 1716 | 1610 | 1629 | 1501 |
6 | HANSEN SONG | OH | 1686 | 5.0 | 1399 | 1602 | 1712 | 1438 | 1365 | 1552 | 1563 | 1519 |
7 | GARY DEE SWATHELL | MI | 1649 | 5.0 | 1092 | 377 | 1666 | 1712 | 1794 | 1411 | 1553 | 1372 |
8 | EZEKIEL HOUGHTON | MI | 1641 | 5.0 | 1384 | 1441 | 1610 | 1411 | 1362 | 1507 | 1564 | 1468 |
9 | STEFANO LEE | ON | 1411 | 5.0 | 1745 | 1600 | 853 | 1641 | 1579 | 1649 | 1595 | 1523 |
10 | ANVIT RAO | MI | 1365 | 5.0 | 1604 | 1564 | 1186 | 1494 | 1686 | 1745 | 1600 | 1554 |
11 | CAMERON WILLIAM MC LEMAN | MI | 1712 | 4.5 | 1423 | 1153 | 1686 | 1649 | 1384 | 1399 | 1579 | 1468 |
12 | KENNETH J TACK | MI | 1663 | 4.5 | 1332 | 1449 | 1655 | 1423 | NA | 1794 | 1384 | 1506 |
13 | TORRANCE HENRY JR | MI | 1666 | 4.5 | 1355 | 1552 | 1649 | 1655 | 1449 | 1384 | 1441 | 1498 |
14 | BRADLEY SHAW | MI | 1610 | 4.5 | 1270 | 1199 | 1641 | 1794 | 1552 | 1655 | 1494 | 1515 |
15 | ZACHARY JAMES HOUGHTON | MI | 1220 | 4.5 | 1564 | 1604 | 1522 | 1555 | 1270 | 1449 | 1423 | 1484 |
16 | MIKE NIKITIN | MI | 1604 | 4.0 | 1365 | 1220 | NA | 1436 | 1553 | 1355 | NA | 1386 |
17 | RONALD GRZEGORCZYK | MI | 1629 | 4.0 | 1382 | 1403 | 1579 | 1553 | 1363 | 1555 | 1655 | 1499 |
18 | DAVID SUNDEEN | MI | 1600 | 4.0 | 1362 | 1411 | 1794 | 1441 | 1564 | 1423 | 1365 | 1480 |
19 | DIPANKAR ROY | MI | 1564 | 4.0 | 1220 | 1365 | 935 | 1507 | 1600 | 1716 | 1641 | 1426 |
20 | JASON ZHENG | MI | 1595 | 4.0 | 1348 | 1291 | 1363 | 1403 | 1507 | 1553 | 1411 | 1411 |
21 | DINH DANG BUI | ON | 1563 | 4.0 | 1283 | 1794 | 1362 | 1384 | 1348 | 1436 | 1686 | 1470 |
22 | EUGENE L MCCLURE | MI | 1555 | 4.0 | 1163 | 935 | 1507 | 1220 | NA | 1629 | 1348 | 1300 |
23 | ALAN BUI | ON | 1363 | 4.0 | 1716 | 1283 | 1595 | 917 | 1629 | 980 | 377 | 1214 |
24 | MICHAEL R ALDRICH | MI | 1229 | 4.0 | 1507 | 1362 | 1283 | 1745 | 967 | 1199 | 1436 | 1357 |
25 | LOREN SCHWIEBERT | MI | 1745 | 3.5 | 1411 | 1393 | 1384 | 1229 | 1399 | 1365 | 1362 | 1363 |
26 | MAX ZHU | ON | 1579 | 3.5 | 1291 | 1348 | 1629 | 1716 | 1411 | 1441 | 1712 | 1507 |
27 | GAURAV GIDWANI | MI | 1552 | 3.5 | 1011 | 1666 | 377 | 980 | 1610 | 1686 | NA | 1222 |
28 | SOFIA ADINA STANESCU-BELLU | MI | 1507 | 3.5 | 1229 | 1716 | 1555 | 1564 | 1595 | 1641 | 1355 | 1522 |
29 | CHIEDOZIE OKORIE | MI | 1602 | 3.5 | 1056 | 1686 | 1423 | 1399 | 935 | 1382 | NA | 1314 |
30 | GEORGE AVERY JONES | ON | 1522 | 3.5 | 935 | 1163 | 1220 | 1186 | 1494 | 955 | 1056 | 1144 |
31 | RISHI SHETTY | MI | 1494 | 3.5 | 917 | 1186 | 1163 | 1365 | 1522 | 1056 | 1610 | 1260 |
32 | JOSHUA PHILIP MATHEWS | ON | 1441 | 3.5 | 955 | 1641 | 1199 | 1600 | 1011 | 1579 | 1666 | 1379 |
33 | JADE GE | MI | 1449 | 3.5 | 967 | 1663 | 1056 | 1355 | 1666 | 1220 | 1011 | 1277 |
34 | MICHAEL JEFFERY THOMAS | MI | 1399 | 3.5 | 1686 | 967 | 980 | 1602 | 1745 | 1712 | 935 | 1375 |
35 | JOSHUA DAVID LEE | MI | 1438 | 3.5 | 377 | 1423 | 1153 | 1686 | 1092 | 935 | 1382 | 1150 |
36 | SIDDHARTH JHA | MI | 1355 | 3.5 | 1666 | 1092 | 1011 | 1449 | NA | 1604 | 1507 | 1388 |
37 | AMIYATOSH PWNANANDAM | MI | 980 | 3.5 | NA | 1655 | 1399 | 1552 | NA | 1363 | 955 | 1385 |
38 | BRIAN LIU | MI | 1423 | 3.0 | 1712 | 1438 | 1602 | 1663 | NA | 1600 | 1220 | 1539 |
39 | JOEL R HENDON | MI | 1436 | 3.0 | 1794 | 1270 | 1348 | 1604 | 1199 | 1563 | 1229 | 1430 |
40 | FOREST ZHANG | MI | 1348 | 3.0 | 1595 | 1579 | 1436 | 853 | 1563 | 1153 | 1555 | 1391 |
41 | KYLE WILLIAM MURPHY | MI | 1403 | 3.0 | 853 | 1629 | 917 | 1595 | NA | NA | NA | 1248 |
42 | JARED GE | MI | 1332 | 3.0 | 1663 | 1056 | 1092 | 967 | 955 | 1163 | 1153 | 1150 |
43 | ROBERT GLEN VASEY | MI | 1283 | 3.0 | 1563 | 1363 | 1229 | 1175 | 853 | 377 | 1186 | 1107 |
44 | JUSTIN D SCHILLING | MI | 1199 | 3.0 | NA | 1610 | 1441 | 1393 | 1436 | 1229 | 853 | 1327 |
45 | DEREK YAN | MI | 1242 | 3.0 | 1655 | 1011 | 967 | 1153 | 1175 | 1186 | 917 | 1152 |
46 | JACOB ALEXANDER LAVALLEY | MI | 377 | 3.0 | 1438 | 1649 | 1552 | 1056 | 1163 | 1283 | 1363 | 1358 |
47 | ERIC WRIGHT | MI | 1362 | 2.5 | 1600 | 1229 | 1563 | 955 | 1641 | 1011 | 1745 | 1392 |
48 | DANIEL KHAIN | MI | 1382 | 2.5 | 1629 | 1175 | NA | 935 | NA | 1602 | 1438 | 1356 |
49 | MICHAEL J MARTIN | MI | 1291 | 2.5 | 1579 | 1595 | 1175 | 1163 | 917 | NA | NA | 1286 |
50 | SHIVAM JHA | MI | 1056 | 2.5 | 1602 | 1332 | 1449 | 377 | NA | 1494 | 1522 | 1296 |
51 | TEJAS AYYAGARI | MI | 1011 | 2.5 | 1552 | 1242 | 1355 | 1092 | 1441 | 1362 | 1449 | 1356 |
52 | ETHAN GUO | MI | 935 | 2.5 | 1522 | 1555 | 1564 | 1382 | 1602 | 1438 | 1399 | 1495 |
53 | JOSE C YBARRA | MI | 1393 | 2.0 | NA | 1745 | NA | 1199 | NA | 1092 | NA | 1345 |
54 | LARRY HODGE | MI | 1270 | 2.0 | 1610 | 1436 | 955 | NA | 1220 | 853 | 1163 | 1206 |
55 | ALEX KONG | MI | 1186 | 2.0 | 1530 | 1494 | 1365 | 1522 | NA | 1242 | 1283 | 1406 |
56 | MARISA RICCI | MI | 1153 | 2.0 | NA | 1712 | 1438 | 1242 | NA | 1348 | 1332 | 1414 |
57 | MICHAEL LU | MI | 1092 | 2.0 | 1649 | 1355 | 1332 | 1011 | 1438 | 1393 | NA | 1363 |
58 | VIRAJ MOHILE | MI | 917 | 2.0 | 1494 | 1553 | 1403 | 1363 | 1291 | NA | 1242 | 1391 |
59 | SEAN M MC CORMICK | MI | 853 | 2.0 | 1403 | NA | 1411 | 1348 | 1283 | 1270 | 1199 | 1319 |
60 | JULIA SHEN | MI | 967 | 1.5 | 1449 | 1399 | 1242 | 1332 | 1229 | NA | NA | 1330 |
61 | JEZZEL FARKAS | ON | 955 | 1.5 | 1441 | 1384 | 1270 | 1362 | 1332 | 1522 | 980 | 1327 |
62 | ASHWIN BALAJI | MI | 1530 | 1.0 | 1186 | NA | NA | NA | NA | NA | NA | 1186 |
63 | THOMAS JOSEPH HOSMER | MI | 1175 | 1.0 | 1553 | 1382 | 1291 | 1283 | 1242 | NA | NA | 1350 |
64 | BEN LI | MI | 1163 | 1.0 | 1555 | 1522 | 1494 | 1291 | 377 | 1332 | 1270 | 1263 |
The formula to calculate the expected score of each player was found in href=“https://en.wikipedia.org/wiki/Elo_rating_system#Theory”> Wikipedia.
The expected score of a player is \[E_A=\frac{1}{1+10^{\frac{R_B-R_A}{400}}}\] where \[R_A\] stands for the player’s pre-rating and \[R_B\] stands for the opponent’s pre-rating.
The difference of the actual score and expected score is ‘Total Points’-‘Expected Score’.
Below is a plot graph of the actual score and the expected score of each player.
# Expected <- Rating
# Expected$Round_1 <- 1/(1+10^(((as.numeric(Expected$Round_1))-(as.numeric(Expected$Total_Points)))/400))
# Expected$Round_2 <- 1/(1+10^(((as.numeric(Expected$Round_2))-(as.numeric(Expected$Total_Points)))/400))
# Expected$Round_3 <- 1/(1+10^(((as.numeric(Expected$Round_3))-(as.numeric(Expected$Total_Points)))/400))
# Expected$Round_4 <- 1/(1+10^(((as.numeric(Expected$Round_4))-(as.numeric(Expected$Total_Points)))/400))
# Expected$Round_5 <- 1/(1+10^(((as.numeric(Expected$Round_5))-(as.numeric(Expected$Total_Points)))/400))
# Expected$Round_6 <- 1/(1+10^(((as.numeric(Expected$Round_6))-(as.numeric(Expected$Total_Points)))/400))
# Expected$Round_7 <- 1/(1+10^(((as.numeric(Expected$Round_7))-(as.numeric(Expected$Total_Points)))/400))
#
# Expected$Round_1 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_1)))/400))
# Expected$Round_2 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_2)))/400))
# Expected$Round_3 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_3)))/400))
# Expected$Round_4 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_3)))/400))
# Expected$Round_5 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_4)))/400))
# Expected$Round_6 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_5)))/400))
# Expected$Round_7 <- 1/(1+10^(((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Round_6)))/400))
Expected <- df5[c('Pair_Num', 'Player_Name','State','Prerating','Total_Points','Round_1','Round_2','Round_3','Round_4','Round_5','Round_6','Round_7')]
Expected$Round_1 <- 1/(1+10^(((as.numeric(Expected$Round_1))-(as.numeric(Expected$Prerating)))/400))
Expected$Round_2 <- 1/(1+10^(((as.numeric(Expected$Round_2))-(as.numeric(Expected$Prerating)))/400))
Expected$Round_3 <- 1/(1+10^(((as.numeric(Expected$Round_3))-(as.numeric(Expected$Prerating)))/400))
Expected$Round_4 <- 1/(1+10^(((as.numeric(Expected$Round_4))-(as.numeric(Expected$Prerating)))/400))
Expected$Round_5 <- 1/(1+10^(((as.numeric(Expected$Round_5))-(as.numeric(Expected$Prerating)))/400))
Expected$Round_6 <- 1/(1+10^(((as.numeric(Expected$Round_6))-(as.numeric(Expected$Prerating)))/400))
Expected$Round_7 <- 1/(1+10^(((as.numeric(Expected$Round_7))-(as.numeric(Expected$Prerating)))/400))
Expected$Expected_Score <- round(rowSums(Expected[,6:12], na.rm=TRUE), digits = 2)
Expected$Difference <- round((as.numeric(Expected$Total_Points))-(as.numeric(Expected$Expected_Score)), digits = 2)
ggplot(data = Expected, aes(x=Expected_Score, y=Total_Points, color=State)) + geom_point()
Below is a table of the expected score for each player.
Expected$Round_1 <- round(as.numeric(Expected$Round_1), digits=2)
Expected$Round_2 <- round(as.numeric(Expected$Round_2), digits=2)
Expected$Round_3 <- round(as.numeric(Expected$Round_3), digits=2)
Expected$Round_4 <- round(as.numeric(Expected$Round_4), digits=2)
Expected$Round_5 <- round(as.numeric(Expected$Round_5), digits=2)
Expected$Round_6 <- round(as.numeric(Expected$Round_6), digits=2)
Expected$Round_7 <- round(as.numeric(Expected$Round_7), digits=2)
Expected$Positive_Difference <-abs(Expected$Difference)
Expected$Relative_Percentage_Difference <- (as.numeric(Expected$Difference))/(as.numeric(Expected$Expected_Score))
Expected$Relative_Percentage_Difference <- round(as.numeric(Expected$Relative_Percentage_Difference), digits=2)
colnames(Expected) <- c("Player Number", "Player Name", "State", "Pre-Rating","Total Points", "Opponent 1", "Opponent 2", "Opponent 3", "Opponent 4", "Opponent 5", "Opponent 6", "Opponent 7", "Expected Score","Difference", "Positive Difference", "Relative Percentage Difference")
rownames(Expected) <- NULL
datatable(Expected)
Over performers are players who have a total score greater than their expected score. I can determine which player is a over performer by viewing their difference from their actual score. If the player’s difference is positive, then they are an over performer. Below is a table of the over performers. 56% of the players were over performers.
Overperformer <- Expected%>%
filter(Difference >0)
Overperformer <- Overperformer[c('Player Name','State','Pre-Rating','Total Points','Expected Score','Difference','Positive Difference','Relative Percentage Difference')]
datatable(Overperformer)
library(scales)
Percentage_of_Overperformers <- percent(nrow(Overperformer)/nrow(Expected))
Percentage_of_Overperformers
## [1] "56%"
Below are the top 5 over performers.
Top_Five <- Overperformer%>%
filter(rank(desc(Difference)) <=5)
Top_Five <- Top_Five[c('Player Name','State','Pre-Rating','Total Points','Expected Score','Difference','Positive Difference','Relative Percentage Difference')]
datatable(Top_Five)
Under performers are players who have a total score less than their expected score. If the player’s difference is negative, then they are an over performer.
Below is a table of the over performers. 44% of the players were under performers.
Underperformer <- Expected%>%
filter(Difference <0)%>%
arrange(Difference)
Underperformer <- Underperformer[c('Player Name','State','Pre-Rating','Total Points','Expected Score','Difference','Relative Percentage Difference')]
datatable(Underperformer)
Percentage_of_Underperformers <- percent(nrow(Underperformer)/nrow(Expected))
Percentage_of_Underperformers
## [1] "44%"
Below are the top 5 under performers.
Bottom_Five <- Underperformer[1:5,]
Bottom_Five <- Bottom_Five[c('Player Name','State','Pre-Rating','Total Points','Expected Score','Difference','Relative Percentage Difference')]
datatable(Bottom_Five)
56% of the players were over performers and 44% of the players were under performers.
The top 5 over performers were: 1. ADITYA BAJAJ 2. ANVIT RAO 3. ZACHARY JAMES HOUGHTON 4. AMIYATOSH PWNANANDAM 5. JACOB ALEXANDER LAVALLEY
The top 5 performers were: 1. LOREN SCHWIEBERT 2. GEORGE AVERY JONES 3. JARED GE 4. RISHI SHETTY 5. JOSHUA DAVID LEE