Project 1

In this project, you’re given a text file with chess tournament results where the information has some structure. Your job is to create an R Markdown file that generates a .CSV file (one that could, for example, be imported into a SQL database) with the following information for all of the players:

Player’s Name, Player’s State, Total Number of Points, Player’s Pre-Rating, and Average Pre Chess Rating of Opponents

For the first player, the information would be: Gary Hua, ON, 6.0, 1794, 1605. The value 1605 was calculated by summing the pre-tournament opponents’ ratings of 1436, 1563, 1600, 1610, 1649, 1663, and 1716, and dividing by the total number of games played.
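The arithmetic behind the 1605 figure can be checked with a short R snippet. This is a minimal sketch that uses only the seven opponent ratings quoted above; the variable names are illustrative, and rounding to the nearest whole number is an assumption based on the example value.

```r
# Pre-tournament ratings of Gary Hua's seven opponents, as listed above
opponent_pre_ratings <- c(1436, 1563, 1600, 1610, 1649, 1663, 1716)

# Sum the ratings and divide by the number of games played
avg_opponent_rating <- sum(opponent_pre_ratings) / length(opponent_pre_ratings)

# Round to a whole number to match the format of the example (assumption)
avg_opponent_rating <- round(avg_opponent_rating)
avg_opponent_rating
```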

If you have questions about the meaning of the data or the results, please post them on the discussion forum. Data science, like chess, is a game of back and forth… The chess rating system (invented by the physics professor Arpad Elo) has been used in many other contexts, including assessing the relative strength of employment candidates by human resource departments.

Player’s Name

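library(stringr)
# tournamentinfo is assumed to already hold the raw contents of the tournament text file
tournamentinfo <- as.vector(unlist(tournamentinfo)) # convert factor to a character vector
chess <- toString(tournamentinfo, width = NULL)     # collapse all lines into one string
# Match two upper-case words and, if present, a third one;
# names with more than three words are cut off after the third word
playername <- str_extract_all(chess, '([[:upper:]]+ ){2}([[:upper:]])*+')
playername <- unlist(playername)
playername <- playername[-1]       # drop the first match, which is not a player name
playername <- str_trim(playername) # strip leading and trailing whitespace
playername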
##  [1] "GARY HUA"                 "DAKSHESH DARURI"         
##  [3] "ADITYA BAJAJ"             "PATRICK H SCHILLING"     
##  [5] "HANSHI ZUO"               "HANSEN SONG"             
##  [7] "GARY DEE SWATHELL"        "EZEKIEL HOUGHTON"        
##  [9] "STEFANO LEE"              "ANVIT RAO"               
## [11] "CAMERON WILLIAM MC"       "KENNETH J TACK"          
## [13] "TORRANCE HENRY JR"        "BRADLEY SHAW"            
## [15] "ZACHARY JAMES HOUGHTON"   "MIKE NIKITIN"            
## [17] "RONALD GRZEGORCZYK"       "DAVID SUNDEEN"           
## [19] "DIPANKAR ROY"             "JASON ZHENG"             
## [21] "DINH DANG BUI"            "EUGENE L MCCLURE"        
## [23] "ALAN BUI"                 "MICHAEL R ALDRICH"       
## [25] "LOREN SCHWIEBERT"         "MAX ZHU"                 
## [27] "GAURAV GIDWANI"           "SOFIA ADINA STANESCU"    
## [29] "CHIEDOZIE OKORIE"         "GEORGE AVERY JONES"      
## [31] "RISHI SHETTY"             "JOSHUA PHILIP MATHEWS"   
## [33] "JADE GE"                  "MICHAEL JEFFERY THOMAS"  
## [35] "JOSHUA DAVID LEE"         "SIDDHARTH JHA"           
## [37] "AMIYATOSH PWNANANDAM"     "BRIAN LIU"               
## [39] "JOEL R HENDON"            "FOREST ZHANG"            
## [41] "KYLE WILLIAM MURPHY"      "JARED GE"                
## [43] "ROBERT GLEN VASEY"        "JUSTIN D SCHILLING"      
## [45] "DEREK YAN"                "JACOB ALEXANDER LAVALLEY"
## [47] "ERIC WRIGHT"              "DANIEL KHAIN"            
## [49] "MICHAEL J MARTIN"         "SHIVAM JHA"              
## [51] "TEJAS AYYAGARI"           "ETHAN GUO"               
## [53] "JOSE C YBARRA"            "LARRY HODGE"             
## [55] "ALEX KONG"                "MARISA RICCI"            
## [57] "MICHAEL LU"               "VIRAJ MOHILE"            
## [59] "SEAN M MC"                "JULIA SHEN"              
## [61] "JEZZEL FARKAS"            "ASHWIN BALAJI"           
## [63] "THOMAS JOSEPH HOSMER"     "BEN LI"
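Some names in the output above appear to be cut off after three words (for example "CAMERON WILLIAM MC" and "SEAN M MC"), because the pattern allows at most three words. One possible alternative, sketched below, extracts the name line by line with a pattern that accepts any number of upper-case words. The sample lines and their pipe-delimited layout are assumptions for illustration, and the second name is made up; neither is taken from the tournament file.

```r
library(stringr)

# Hypothetical sample lines; the pipe-delimited layout is an assumption
sample_lines <- c(
  "    1 | GARY HUA                        |6.0  |",
  "    2 | ANNA MARIA VAN DYKE             |5.5  |"   # made-up four-word name
)

# One upper-case word followed by one or more " WORD" repetitions
name_pattern <- "[[:upper:]]+( [[:upper:]]+)+"

# str_extract() returns the first match in each line
full_names <- str_extract(sample_lines, name_pattern)
full_names
```

Because the pattern requires single spaces between words, the padding that follows the name is not captured, so no trimming step is needed in this variant.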

Player’s State

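library(stringr)
# tournamentinfo is assumed to already hold the raw contents of the tournament text file
tournamentinfo <- as.vector(unlist(tournamentinfo)) # convert factor to a character vector
chess <- toString(tournamentinfo, width = NULL)     # collapse all lines into one string
# Only the three codes hard-coded here (ON, MI, OH) are matched.
# The matches keep their surrounding whitespace; apply str_trim() before writing the CSV.
playerstate <- str_extract_all(chess, ("\\sON | \\sMI | \\sOH"))
playerstate <- unlist(playerstate)
playerstate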
##  [1] " ON "  "  MI " "  MI " "  MI " "  MI " "  OH"  "  MI " "  MI " " ON " 
## [10] "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI "
## [19] "  MI " "  MI " " ON "  "  MI " " ON "  "  MI " "  MI " " ON "  "  MI "
## [28] "  MI " "  MI " " ON "  "  MI " " ON "  "  MI " "  MI " "  MI " "  MI "
## [37] "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI "
## [46] "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " "  MI "
## [55] "  MI " "  MI " "  MI " "  MI " "  MI " "  MI " " ON "  "  MI " "  MI "
## [64] "  MI "
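The matches above keep their surrounding whitespace, and the pattern only recognizes ON, MI, and OH. As a hedged alternative, the state can be captured as any two-letter code at the start of a line and returned without padding. The sample lines below are placeholders (only the 1794 rating comes from the assignment text), and the layout with the state code before the first pipe is an assumption.

```r
library(stringr)

# Hypothetical sample lines; "<id>" and "<rating>" are placeholders
sample_lines <- c(
  "   ON | <id> / R: 1794 |",
  "   MI | <id> / R: <rating> |"
)

# Capture a two-letter upper-case code that starts the line and precedes "|"
state_match <- str_match(sample_lines, "^\\s*([A-Z]{2})\\s*\\|")

# Column 2 holds the capture group, already free of padding
player_state <- state_match[, 2]
player_state
```

Using a capture group avoids a separate trimming step and does not need to be updated when a new state code appears in the data.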

Player’s Total Number of Points

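library(stringr)
# tournamentinfo is assumed to already hold the raw contents of the tournament text file
tournamentinfo <- as.vector(unlist(tournamentinfo)) # convert factor to a character vector
chess <- toString(tournamentinfo, width = NULL)     # collapse all lines into one string
# Point totals are the only values written as digit, decimal point, digit (e.g. 6.0)
playerpoints <- str_extract_all(chess, "\\d\\.\\d")
playerpoints <- unlist(playerpoints)
playerpoints <- as.numeric(playerpoints) # vectorized conversion; purrr is not needed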
playerpoints
##  [1] 6.0 6.0 6.0 5.5 5.5 5.0 5.0 5.0 5.0 5.0 4.5 4.5 4.5 4.5 4.5 4.0 4.0 4.0 4.0
## [20] 4.0 4.0 4.0 4.0 4.0 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.5 3.0
## [39] 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 2.5 2.5 2.5 2.5 2.5 2.5 2.0 2.0 2.0 2.0 2.0
## [58] 2.0 2.0 1.5 1.5 1.0 1.0 1.0
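Once every field has been extracted, the vectors can be combined into a data frame and written to a CSV file. The sketch below uses only the first player's values quoted in the assignment text; the column names and the temporary output path are assumptions, not part of the original project.

```r
# One row built from the example values given for the first player
results <- data.frame(
  player_name = "GARY HUA",
  player_state = "ON",
  total_points = 6.0,
  pre_rating = 1794,
  avg_opponent_pre_rating = 1605
)

# Write to a temporary file; replace with a real path when exporting
csv_path <- tempfile(fileext = ".csv")
write.csv(results, csv_path, row.names = FALSE)

# Read the file back to confirm the round trip
check <- read.csv(csv_path)
check
```

Setting `row.names = FALSE` keeps R's row index out of the file, which is usually what a database import expects.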
