I would recommend checking out “Custom AFL Elo Rating Calculation and Visualisation”, also on RPubs, prior to using this tutorial.
Elo Ratings are a measure of the relative strength of a player or team within a group. Players or teams start with an initial rating of a specified number of points. These points are exchanged with other players or teams in match-ups depending on a win, loss or draw. The number of points exchanged in a match-up is determined by the relative ratings of each player or team, a weighting (K) factor and the outcome of the match-up.
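To make the exchange of points concrete, below is a minimal sketch of the standard Elo update that packages such as PlayerRatings implement. The starting ratings and K factor are arbitrary values chosen purely for illustration.
# a quick illustration of the standard Elo update (not using PlayerRatings)
home_rating <- 1500   # both teams start on the same arbitrary initial rating
away_rating <- 1500
k <- 30               # weighting (K) factor, an illustrative value
s <- 1                # outcome relative to the home team: 1 = win, 0.5 = draw, 0 = loss
# expected score for the home team, based on the rating difference
expected_home <- 1 / (1 + 10^((away_rating - home_rating) / 400))
# points exchanged: the home team gains what the away team loses
home_rating <- home_rating + k * (s - expected_home)
away_rating <- away_rating - k * (s - expected_home)
The bigger the rating gap, the smaller the expected score of the underdog, so an upset win moves more points than a win by the favourite.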
The values of these inputs can be tested on existing data to determine which combination gives the most accurate predictions. This process is called hyperparameter tuning.
In this tutorial we will conduct hyperparameter tuning on Elo Ratings in Australian Rules Football.
First we will load the required packages. The PlayerRatings package handles the calculation of Elo Ratings, and the fitzRoy package scrapes the match data we need.
library(fitzRoy)
library(PlayerRatings)
library(dplyr)
library(tidyr)
library(caret)
library(ggplot2)
We can now load the data and reshape it for testing.
# add desired years of results
importedData <- fetch_player_stats_afltables(season = c(2017:2021))
# the imported data needs to be grouped and summarised by game
seasonStats <- importedData %>%
group_by(Season, Round, Date, Local.start.time, Venue) %>%
# a round number variable will assist in creating a week counter for the overall data
summarise(roundnumber = case_when(Round[1] %in% c("EF", "QF") ~ "FW1",
                                  Round[1] == "SF" ~ "FW2",
                                  Round[1] == "PF" ~ "FW3",
                                  Round[1] == "GF" ~ "FW4",
                                  TRUE ~ Round[1]),
year = Season[1],
hteam = Home.team[1],
ateam = Away.team[1],
hscore = Home.score[1],
ascore = Away.score[1],
# we will create a new column to represent the S value.
# 1 for a win, 0.5 for a draw and 0 for a loss.
spoint = ifelse(hscore > ascore, 1, ifelse(hscore == ascore, 0.5, 0)))
# combine roundnumber and year to create unique value for each round, and make sure the data is arranged by date
seasonStats <- seasonStats %>%
unite("roundcode", sep = " - ", roundnumber:year, remove = TRUE) %>%
arrange(Date)
We need to create a week counter to cycle through when doing the testing.
# creating a temporary data frame that is grouped by each unique round with a starting date
temp <- seasonStats %>%
group_by(roundcode) %>%
summarise(firstgame = Date[1])
# arrange the rounds in order and assign a sequential week number to each round
# row_number() keeps the counter in step with the number of rounds (checked below)
temp <- temp %>%
  arrange(firstgame) %>%
  mutate(Week = row_number())
# join the temporary data frame to our main one, so that the weeks match the correct round
seasonStats <- left_join(seasonStats, temp, by = "roundcode") %>%
select(-firstgame)
# check the main data frame
head(seasonStats)
## # A tibble: 6 x 12
## # Groups: Season, Round, Date, Local.start.time [5]
## Season Round Date Local.start.time Venue roundcode hteam ateam hscore
## <dbl> <chr> <date> <int> <chr> <chr> <chr> <chr> <int>
## 1 2017 1 2017-03-23 1920 M.C.G. 1 - 2017 Carl~ Rich~ 89
## 2 2017 1 2017-03-24 1950 M.C.G. 1 - 2017 Coll~ West~ 86
## 3 2017 1 2017-03-25 1635 Docklands 1 - 2017 St K~ Melb~ 90
## 4 2017 1 2017-03-25 1635 S.C.G. 1 - 2017 Sydn~ Port~ 82
## 5 2017 1 2017-03-25 1905 Carrara 1 - 2017 Gold~ Bris~ 96
## 6 2017 1 2017-03-25 1925 M.C.G. 1 - 2017 Esse~ Hawt~ 116
## # ... with 3 more variables: ascore <int>, spoint <dbl>, Week <int>
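Before moving on, it is worth a quick check that the week counter covers every round; for the 2017 to 2021 seasons used here there are 130 unique rounds.
# the maximum week value should equal the number of unique rounds (130 for 2017-2021)
max(seasonStats$Week)
n_distinct(seasonStats$roundcode)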
# create a function that takes in the various Elo Rating parameters to test them
eloScore <- function(init, kfac, gamma, data, status = NULL){
# a list of all the weeks in the dataset to cycle through
predictweeks <- unique(data$Week)
# the relevant variables needed to calculate the Elo Ratings
modelVars <- c('Week','hteam','ateam','spoint')
# if an existing model isn't passed through
if (is.null(status)){
# subset all weeks bar Week 1
predictweeks <- predictweeks[-1]
# create elo calculation model using specified inputs
eloModel <- elo(data[data$Week == 1, modelVars],
init = init,
kfac = kfac,
gamma = gamma,
history = T)
} else {
# assign an existing model
eloModel <- status
}
# create a probability variable, and assign it as NA to begin with
data[c('eloProb')] <- NA
# for loop that loops through the list of weeks
for (w in predictweeks){
# subset the week
predWeek <- subset(data, Week == w)
# predict the subset week's matches
# tng is the games-played threshold below which a team is treated as unrated, and trat is the rating such teams are given
predElo <- predict(eloModel,
newdata = predWeek[modelVars],
gamma = gamma,
tng = 1, trat = 1800)
# assign the predictions to the corresponding week
data$eloProb[data$Week == w] <- predElo
# create and update the ratings
eloModel <- elo(predWeek[modelVars],
status = eloModel$ratings,
init = init,
gamma = gamma,
kfac = kfac,
history = T)
}
# create a variable that indicates what the Elo Model predicts, relative to the home team
data$tipElo <- ifelse(data$eloProb > 0.5, 1, 0)
# create a confusion matrix of the results
cmElo <- confusionMatrix(data = factor(data$tipElo, levels = c(0,0.5,1)),
reference = factor(data$spoint, levels = c(0,0.5,1)))
return(
# return final list of ratings, and the confusion matrix
list(eloModel,
cmElo,
data)
)
}
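Before tuning, the function can be sanity-checked with a single, arbitrary set of inputs; the values below are purely illustrative.
# a single run with arbitrary inputs, pulling out the overall tipping accuracy
eloScore(init = 2000, kfac = 20, gamma = 0, data = seasonStats)[[2]]$overall['Accuracy']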
Now that the function has been created, the accuracy of many candidate models can be tested at once. Ranges of each input can be passed to the function.
# this is an expanded grid of every combination of our specified inputs
# NOTE: more combinations will mean a longer run time
# be wary of the length of the data set and the number of input combinations
params <- expand.grid(init = seq(1600, 2400, by = 200),
kfac = seq(10, 100, by = 10),
gamma = seq(0, 1, by = 0.2))
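Run time scales roughly linearly with the size of this grid, so it is worth checking how many combinations it contains before launching the search.
# number of input combinations to be evaluated (5 x 10 x 6 = 300 here)
nrow(params)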
We'll split the data into training and testing sets.
# we'll use a 70/30 split, but alternate splits can be used also
trainData <- subset(seasonStats, Week <= 91)
testData <- subset(seasonStats, Week >= 92)
We next want to apply our parameters to the training data set, using our custom function.
# this will apply each combination of parameters to the training data and record the overall accuracy
# NOTE: this may take several minutes
params$accuracy <- apply(params, 1, function (x)
eloScore(x[1], x[2], x[3], trainData)[[2]]$overall['Accuracy'])
Let's subset the results to see which combination(s) of input values give the best accuracy.
# this will subset the most accurate combination of inputs
subset(params, accuracy == max(params$accuracy))
## init kfac gamma accuracy
## 216 1600 40 0.8 0.6366237
## 217 1800 40 0.8 0.6366237
## 218 2000 40 0.8 0.6366237
## 219 2200 40 0.8 0.6366237
## 220 2400 40 0.8 0.6366237
## 266 1600 40 1.0 0.6366237
## 267 1800 40 1.0 0.6366237
## 268 2000 40 1.0 0.6366237
## 269 2200 40 1.0 0.6366237
## 270 2400 40 1.0 0.6366237
## 291 1600 90 1.0 0.6366237
## 292 1800 90 1.0 0.6366237
## 293 2000 90 1.0 0.6366237
## 294 2200 90 1.0 0.6366237
## 295 2400 90 1.0 0.6366237
Lastly, we can test how our newly found combination of values performs on the testing data.
# test inputs on testing data
testScore <- eloScore(init = 1600,
kfac = 40,
gamma = 0.8,
data = testData,
status = eloScore(1600, 40, 0.8, trainData)[[1]])
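The call above refits the training model inside the status argument; an equivalent approach is to store the training run once and reuse its final model.
# store the training run once, then pass its model through to the test run
trainScore <- eloScore(init = 1600, kfac = 40, gamma = 0.8, data = trainData)
testScore <- eloScore(init = 1600, kfac = 40, gamma = 0.8, data = testData,
                      status = trainScore[[1]])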
Our function returns three objects: the latest ratings, a confusion matrix with accuracy statistics, and a data frame containing game-by-game Elo probabilities and tips.
# the latest ratings
testScore[[1]]
##
## Elo Ratings For 18 Players Playing 990 Games
##
## Player Rating Games Win Draw Loss Lag
## 1 Melbourne 1800 111 62 1 48 0
## 2 Port Adelaide 1795 110 70 1 39 1
## 3 Geelong 1753 119 77 1 41 1
## 4 Western Bulldogs 1723 111 59 0 52 0
## 5 Brisbane Lions 1712 111 56 0 55 2
## 6 Greater Western Sydney 1650 116 65 4 47 2
## 7 Sydney 1635 109 57 0 52 3
## 8 Richmond 1631 117 80 2 35 4
## 9 St Kilda 1605 107 45 1 61 4
## 10 Essendon 1583 108 53 1 54 3
## 11 West Coast 1578 113 69 1 43 4
## 12 Fremantle 1550 105 42 0 63 4
## 13 Hawthorn 1527 107 48 3 56 4
## 14 Collingwood 1504 113 58 2 53 4
## 15 Carlton 1485 105 30 0 75 4
## 16 Adelaide 1453 108 49 1 58 4
## 17 Gold Coast 1424 105 25 1 79 4
## 18 North Melbourne 1393 105 35 1 69 4
# confusion matrix and accuracy statistics
testScore[[2]]
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 0.5 1
## 0 78 2 61
## 0.5 0 0 0
## 1 49 2 90
##
## Overall Statistics
##
## Accuracy : 0.5957
## 95% CI : (0.5359, 0.6535)
## No Information Rate : 0.5355
## P-Value [Acc > NIR] : 0.02408
##
## Kappa : 0.2028
##
## Mcnemar's Test P-Value : 0.15051
##
## Statistics by Class:
##
## Class: 0 Class: 0.5 Class: 1
## Sensitivity 0.6142 0.00000 0.5960
## Specificity 0.5935 1.00000 0.6107
## Pos Pred Value 0.5532 NaN 0.6383
## Neg Pred Value 0.6525 0.98582 0.5674
## Prevalence 0.4504 0.01418 0.5355
## Detection Rate 0.2766 0.00000 0.3191
## Detection Prevalence 0.5000 0.00000 0.5000
## Balanced Accuracy 0.6039 0.50000 0.6034
# data frame of all matches featuring elo probabilities and tips
testScore[[3]]
## # A tibble: 282 x 14
## # Groups: Season, Round, Date, Local.start.time [276]
## Season Round Date Local.start.time Venue roundcode hteam ateam hscore
## <dbl> <chr> <date> <int> <chr> <chr> <chr> <chr> <int>
## 1 2020 11 2020-08-08 1605 Adela~ 11 - 2020 Port~ Richm~ 93
## 2 2020 11 2020-08-08 1940 Gabba 11 - 2020 Bris~ Weste~ 96
## 3 2020 11 2020-08-09 1335 Perth~ 11 - 2020 West~ Carlt~ 72
## 4 2020 11 2020-08-09 1740 Adela~ 11 - 2020 Melb~ North~ 92
## 5 2020 11 2020-08-10 1810 Gabba 11 - 2020 St K~ Geelo~ 34
## 6 2020 11 2020-08-10 1840 Perth~ 11 - 2020 Frem~ Hawth~ 48
## 7 2020 11 2020-08-11 1805 Adela~ 11 - 2020 Adel~ Colli~ 38
## 8 2020 11 2020-08-12 1910 Carra~ 11 - 2020 Gold~ Essen~ 73
## 9 2020 12 2020-08-13 1810 Perth~ 12 - 2020 Sydn~ Great~ 66
## 10 2020 12 2020-08-14 1950 Carra~ 12 - 2020 Geel~ Port ~ 91
## # ... with 272 more rows, and 5 more variables: ascore <int>, spoint <dbl>,
## # Week <int>, eloProb <dbl>, tipElo <dbl>
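If only the headline figure is needed, the overall accuracy can be pulled straight from the confusion matrix object and compared with the tuned training accuracy of roughly 0.637.
# overall tipping accuracy on the testing data (0.5957 in the output above)
testScore[[2]]$overall['Accuracy']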
As a small extension, we can visualise a confusion matrix of our testing data.
# pull out our data with Elo tips and probabilities
eloData <- testScore[[3]]
# create a data frame with the Elo tip and the actual outcome (spoint)
confMatELO <- as.data.frame(table(eloData$tipElo, eloData$spoint))
# optionally remove actual draws (Var2 is the actual outcome, spoint); the model never tips a draw
confMatELO <- confMatELO %>%
filter(Var2 != 0.5)
# plotting the confusion matrix
ggplot(data = confMatELO, mapping = aes(x = Var1, y = Var2)) +
geom_tile(aes(fill = Freq)) + # generates a plot with rectangles
geom_text(aes(label = Freq)) + # adds the specific values to the rectangles
scale_fill_gradient(low = "grey80", high = "darkgreen") + # fill with custom colours (grey and dark green)
labs(x = "Predicted Outcome", y = "Actual Outcome", # custom titles, and labels
title = "Elo Probability Model Confusion Matrix",
fill = "Count") +
theme_bw() # change the theme (optional, a matter of preference)
These same principles apply to most data sets, so try them out on your own!