The Elo rating system was developed by Arpad Elo in 1978 to rate chess players. It has now become the most popular rating system in sport.
The Elo rating system predicts the probability of winning a match based on the skill level of the players or teams in past performances and then updates the ratings after each match.
This RPubs document will apply the rating system to the 2017 Suncorp Super Netball competition, however, the same code can be applied to other sports.
The code below shows the packages required to complete this analysis. If the following packages have not been installed, do this by using the install.packages() function.
# Load the required packages
library(dplyr)
library(tidyverse)
library(PlayerRatings)
library(caret)
library(e1071)
library(ggplot2)
library(devtools)
The data used for this analysis was scraped using the superNetballR package. The following code will help load the data set and will be used to calculate the Elo rating for each team throughout the 2017 regular season.
# Load in the data set
devtools::install_github("stevelane/superNetballR")
library(superNetballR)
data("season_2017")
View(season_2017)
head(season_2017)
## # A tibble: 6 x 9
## period squadId squadName squadNickname squadCode stat value round game
## <int> <int> <chr> <chr> <chr> <chr> <int> <int> <int>
## 1 1 806 NSW Swifts Swifts NSW reboun~ 0 1 1
## 2 2 806 NSW Swifts Swifts NSW reboun~ 1 1 1
## 3 3 806 NSW Swifts Swifts NSW reboun~ 1 1 1
## 4 4 806 NSW Swifts Swifts NSW reboun~ 0 1 1
## 5 1 8118 GIANTS Netba~ GIANTS GNB reboun~ 1 1 1
## 6 2 8118 GIANTS Netba~ GIANTS GNB reboun~ 2 1 1
In order to calculate the Elo ratings, some data preparation and manipulation is required. The code below can be followed to get the data ready for analysis.
# The first step is to subset the data to only include the stat "goals" in order to calculate the total goals shot by each team in each game. It also subset to include only the regular season games.
scores <- subset(season_2017, stat == "goals" & round <= 14)
# The next step is to remove the columns not needed in the analysis as well as re-ordering the columns to make the data set neat.
scores <- scores[-c(2, 4:6)]
scores <- scores[ , c(4, 5, 2, 1, 3)]
# The next bit of code changes the dataframe from a long format to a wide format so that the goals from each period are in their own column for each team.
scores <- scores %>% spread(period, value)
# The next step is calculate the total goals scored by each team in each game by adding up the columns containing the number of goals scored in each period of the game. Once the total goals has been calculated the period columns can be removed.
scores$final_score <- rowSums(scores[ , c(4:7)])
scores <- scores[-c(4:7)]
# Next, the data frame must be split into a home and away data frame in order to distinguish between the home team and away team as well as the home score and away score. The columns are re-named in to make this distinction.
home_team <- scores %>% slice(which(row_number() %% 2 == 1))
away_team <- scores %>% anti_join(home_team)
names(home_team)[3] <- "h_team"
names(home_team)[4] <- "h_score"
names(away_team)[3] <- "a_team"
names(away_team)[4] <- "a_score"
# Now the home and away data frames can be joined together by using the left_join function. To make the data frame neat, the columns are re-ordered.
game_scores <- left_join(home_team, away_team)
game_scores <- game_scores[, c(1, 2, 3, 5, 4, 6)]
# To calculate the Elo rating a derived score column needs to be created to indicate whether the home team won, drew or lost by assigning 1, 0.5 and 0, respectively.
game_scores <- game_scores %>%
mutate(
score = ifelse(h_score == a_score, 0.5, ifelse(h_score > a_score, 1, 0))
)
# To make calculating the Elo score easier the round column is renamed to 'week'.
names(game_scores)[1] <- 'week'
Before learning how to create one function to calculate the Elo rating, the following code will go through the basic functions of the PlayerRatings package. The PlayerRatings package contains files on the implementation of the ELO, GLICKO and STEPHENSON rating methods, however, for this analysis we will only be using the elo function. To learn more about the elo function you can use ?elo in R.
Use the ‘elo’ function to generate team ratings for the ‘game_scores’ netball data.
Assign new teams a rating of 1900 (init)
Use a k factor of 25 (kfac)
Use no home ground advantage (gamma)
Keep the ratings history of each team
eloratings <- elo(game_scores[c('week','h_team','a_team','score')],
init = 1900,
kfac = 25,
gamma = 0,
history = T)
# Print the team ratings
print(eloratings)
##
## Elo Ratings For 8 Players Playing 56 Games
##
## Player Rating Games Win Draw Loss Lag
## 1 Sunshine Coast Lightning 1991 14 11 1 2 0
## 2 Melbourne Vixens 1987 14 11 1 2 0
## 3 GIANTS Netball 1952 14 10 0 4 0
## 4 Magpies Netball 1943 14 9 0 5 0
## 5 Queensland Firebirds 1914 14 7 1 6 0
## 6 NSW Swifts 1830 14 3 1 10 0
## 7 West Coast Fever 1801 14 2 0 12 0
## 8 Adelaide Thunderbirds 1781 14 1 0 13 0
Start by getting a list of all the weeks in the data set and then remove week 1 from the predictweeks list as this will be used to build the initial model.
predictweeks <- unique(game_scores$week)
predictweeks <- predictweeks[-1]
Create a vector containing the columns passed the rating model function as this will help clean up the code.
model_vars <- c('week','h_team','a_team','score')
Build an elo model using the first week of data with the following parameters:
initial rating = 2200
k factor = 27
gamma = 0
history = T
elomodel <- elo(game_scores[game_scores$week==1, model_vars],
init = 2200,
kfac = 27,
gamma = 0,
history = T)
Next, initialise new columns in game_scores for probability estimates.
game_scores[c('p_elo')] <- NA
Loop through the weeks where we want to generate predictions.
for (w in predictweeks){
# Identify the week of games we are trying to predict
pred_week <- subset(game_scores, week == w)
# Make predictions (set tng = 1 and trat = 1600, deviation = 300)
pred_elo <- predict(elomodel,
newdata = pred_week[model_vars],
gamma = 0,
tng = 1, trat = 1600)
# Add the predicted probabilities to the test_data dataframe
game_scores$p_elo[game_scores$week == w] <- pred_elo
# Update the models using the 'status' option
elomodel <- elo(pred_week[model_vars],
status = elomodel$ratings,
init = 2200,
gamma = 0,
kfac = 27,
history = T)
}
Use the probability estimates generated in section 6.0 to test how good the elo model was at predicting match results.
In order to tip the winner of matches we need to transform our probability estimates into a binary decision; home team wins (score = 1) or home team lose (score = 0). A decision threshold of 0.5 is applied to achieve this.
Create a new column in the game_scores data frame for tips from the Elo model.
game_scores$tip_elo <- ifelse(game_scores$p_elo > 0.5, 1, 0)
Create a confusion matrix for the Elo model tips using the confusionMatrix function.
cm_elo <-
confusionMatrix(data = factor(game_scores$tip_elo, levels = c(0, 0.5, 1)),
reference = factor(game_scores$score, levels = c(0, 0.5, 1)))
cm_elo
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 0.5 1
## 0 20 1 3
## 0.5 0 0 0
## 1 7 0 21
##
## Overall Statistics
##
## Accuracy : 0.7885
## 95% CI : (0.653, 0.8894)
## No Information Rate : 0.5192
## P-Value [Acc > NIR] : 5.622e-05
##
## Kappa : 0.5867
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: 0 Class: 0.5 Class: 1
## Sensitivity 0.7407 0.00000 0.8750
## Specificity 0.8400 1.00000 0.7500
## Pos Pred Value 0.8333 NaN 0.7500
## Neg Pred Value 0.7500 0.98077 0.8750
## Prevalence 0.5192 0.01923 0.4615
## Detection Rate 0.3846 0.00000 0.4038
## Detection Prevalence 0.4615 0.00000 0.5385
## Balanced Accuracy 0.7904 0.50000 0.8125
A different way to score the Elo model is to use a probabilistic scoring method (instead of converting the predicted probabilities into discrete tips). To do this we will plot the calibration curve of the Elo model.
# remove draws (0.5) and NAs
cond <- game_scores$score != 0.5 & !is.na(game_scores$p_elo)
# build calibration model
calib <- calibration(factor(score, levels = c(1, 0))
~ p_elo,
data = game_scores[cond, ])
# plot and interpret the model
ggplot(calib)
One way to score probability estimates is the Brier score and is shown in the following code. The lower the brier score the better the prediction.
brier_elo <- mean((game_scores$score - game_scores$p_elo)^2, na.rm = T)
brier_elo
## [1] 0.194303
The following code will help build a tipping model based on the Elo function. This model is built using a combination of the basic functions and techniques learnt in the above sections. It will also show how to choose the best initial values for each parameter.
The first step is to create a function that takes the Elo parameters ‘init’, ‘kfac’ and ‘gamma’ as arguments as well as a data frame that contains match information and a ‘status’ variable that allows existing models to be passed through the function. This function will return the final elo ratings model and a confusion matrix that contains the predicted tips.
elo_score <- function(init, kfac, gamma, data, status = NULL){
predictweeks <- unique(data$week)
model_vars <- c('week','h_team','a_team','score')
if (is.null(status)){
predictweeks <- predictweeks[-1]
elomodel <- elo(data[data$week == 1, model_vars],
init = init,
kfac = kfac,
gamma = gamma,
history = T)
} else {
elomodel <- status
}
data[c('p_elo')] <- NA
for (w in predictweeks){
pred_week <- subset(data, week == w)
pred_elo <- predict(elomodel,
newdata = pred_week[model_vars],
gamma = gamma,
tng = 1, trat = 1600)
data$p_elo[data$week == w] <- pred_elo
elomodel <- elo(pred_week[model_vars],
status = elomodel$ratings,
init = init,
gamma = gamma,
kfac = kfac,
history = T)
}
data$tip_elo <- ifelse(data$p_elo > 0.5, 1, 0)
cm_elo <- confusionMatrix(data = factor(data$tip_elo, levels = c(0,0.5,1)),
reference = factor(data$score, levels = c(0,0.5,1)))
return(
list(elomodel,
cm_elo)
)
}
Next create a grid parameter of values to test using the expand.grid function. This is used to determine the most accurate combination of parameters for the Elo model.
Set the initial values to between 1400 and 3000 (increments of 200)
For the kfactor, set the values to between 20 and 80 (increments of 5)
Include the gamma parameter (home ground advantage), setting the values to between 0 and 20 (increments of 2)
parameters <- expand.grid(init = seq(1400, 3000, by = 200),
kfac = seq(20, 80, by = 5),
gamma = seq(0, 20, by = 2))
In order to estimate the optimal parameter combination and estimate the out-of-sample performance, the game_score needs to be split into train and test data. The train will be used to estimate the parameters and the test data is used to estimate out-of-sample performance. The out-of-sample performance test helps reduce the bias.
Assign week 10 and less to the train data and week 11 and above to the test data.
The code to this is shown below.
train_data <- subset(game_scores, week <= 10)
test_data <- subset(game_scores, week >= 11)
For each parameter combination in ‘parameters’, calculate the tip accuracy on the train_data by using the apply function.
parameters$accuracy <- apply(parameters, 1, function (x)
elo_score(x[1], x[2], x[3], train_data)[[2]]$overall['Accuracy'] )
To find the best parameter combination the following code can be used.
subset(parameters, accuracy == max(parameters$accuracy))
## init kfac gamma accuracy
## 136 1400 30 2 0.7777778
## 137 1600 30 2 0.7777778
## 138 1800 30 2 0.7777778
## 139 2000 30 2 0.7777778
## 140 2200 30 2 0.7777778
## 141 2400 30 2 0.7777778
## 142 2600 30 2 0.7777778
## 143 2800 30 2 0.7777778
## 144 3000 30 2 0.7777778
## 145 1400 35 2 0.7777778
## 146 1600 35 2 0.7777778
## 147 1800 35 2 0.7777778
## 148 2000 35 2 0.7777778
## 149 2200 35 2 0.7777778
## 150 2400 35 2 0.7777778
## 151 2600 35 2 0.7777778
## 152 2800 35 2 0.7777778
## 153 3000 35 2 0.7777778
## 271 1400 40 4 0.7777778
## 272 1600 40 4 0.7777778
## 273 1800 40 4 0.7777778
## 274 2000 40 4 0.7777778
## 275 2200 40 4 0.7777778
## 276 2400 40 4 0.7777778
## 277 2600 40 4 0.7777778
## 278 2800 40 4 0.7777778
## 279 3000 40 4 0.7777778
## 280 1400 45 4 0.7777778
## 281 1600 45 4 0.7777778
## 282 1800 45 4 0.7777778
## 283 2000 45 4 0.7777778
## 284 2200 45 4 0.7777778
## 285 2400 45 4 0.7777778
## 286 2600 45 4 0.7777778
## 287 2800 45 4 0.7777778
## 288 3000 45 4 0.7777778
## 406 1400 50 6 0.7777778
## 407 1600 50 6 0.7777778
## 408 1800 50 6 0.7777778
## 409 2000 50 6 0.7777778
## 410 2200 50 6 0.7777778
## 411 2400 50 6 0.7777778
## 412 2600 50 6 0.7777778
## 413 2800 50 6 0.7777778
## 414 3000 50 6 0.7777778
## 415 1400 55 6 0.7777778
## 416 1600 55 6 0.7777778
## 417 1800 55 6 0.7777778
## 418 2000 55 6 0.7777778
## 419 2200 55 6 0.7777778
## 420 2400 55 6 0.7777778
## 421 2600 55 6 0.7777778
## 422 2800 55 6 0.7777778
## 423 3000 55 6 0.7777778
## 532 1400 55 8 0.7777778
## 533 1600 55 8 0.7777778
## 534 1800 55 8 0.7777778
## 535 2000 55 8 0.7777778
## 536 2200 55 8 0.7777778
## 537 2400 55 8 0.7777778
## 538 2600 55 8 0.7777778
## 539 2800 55 8 0.7777778
## 540 3000 55 8 0.7777778
## 541 1400 60 8 0.7777778
## 542 1600 60 8 0.7777778
## 543 1800 60 8 0.7777778
## 544 2000 60 8 0.7777778
## 545 2200 60 8 0.7777778
## 546 2400 60 8 0.7777778
## 547 2600 60 8 0.7777778
## 548 2800 60 8 0.7777778
## 549 3000 60 8 0.7777778
## 550 1400 65 8 0.7777778
## 551 1600 65 8 0.7777778
## 552 1800 65 8 0.7777778
## 553 2000 65 8 0.7777778
## 554 2200 65 8 0.7777778
## 555 2400 65 8 0.7777778
## 556 2600 65 8 0.7777778
## 557 2800 65 8 0.7777778
## 558 3000 65 8 0.7777778
## 658 1400 60 10 0.7777778
## 659 1600 60 10 0.7777778
## 660 1800 60 10 0.7777778
## 661 2000 60 10 0.7777778
## 662 2200 60 10 0.7777778
## 663 2400 60 10 0.7777778
## 664 2600 60 10 0.7777778
## 665 2800 60 10 0.7777778
## 666 3000 60 10 0.7777778
## 667 1400 65 10 0.7777778
## 668 1600 65 10 0.7777778
## 669 1800 65 10 0.7777778
## 670 2000 65 10 0.7777778
## 671 2200 65 10 0.7777778
## 672 2400 65 10 0.7777778
## 673 2600 65 10 0.7777778
## 674 2800 65 10 0.7777778
## 675 3000 65 10 0.7777778
## 676 1400 70 10 0.7777778
## 677 1600 70 10 0.7777778
## 678 1800 70 10 0.7777778
## 679 2000 70 10 0.7777778
## 680 2200 70 10 0.7777778
## 681 2400 70 10 0.7777778
## 682 2600 70 10 0.7777778
## 683 2800 70 10 0.7777778
## 684 3000 70 10 0.7777778
## 685 1400 75 10 0.7777778
## 686 1600 75 10 0.7777778
## 687 1800 75 10 0.7777778
## 688 2000 75 10 0.7777778
## 689 2200 75 10 0.7777778
## 690 2400 75 10 0.7777778
## 691 2600 75 10 0.7777778
## 692 2800 75 10 0.7777778
## 693 3000 75 10 0.7777778
## 793 1400 70 12 0.7777778
## 794 1600 70 12 0.7777778
## 795 1800 70 12 0.7777778
## 796 2000 70 12 0.7777778
## 797 2200 70 12 0.7777778
## 798 2400 70 12 0.7777778
## 799 2600 70 12 0.7777778
## 800 2800 70 12 0.7777778
## 801 3000 70 12 0.7777778
## 802 1400 75 12 0.7777778
## 803 1600 75 12 0.7777778
## 804 1800 75 12 0.7777778
## 805 2000 75 12 0.7777778
## 806 2200 75 12 0.7777778
## 807 2400 75 12 0.7777778
## 808 2600 75 12 0.7777778
## 809 2800 75 12 0.7777778
## 810 3000 75 12 0.7777778
## 811 1400 80 12 0.7777778
## 812 1600 80 12 0.7777778
## 813 1800 80 12 0.7777778
## 814 2000 80 12 0.7777778
## 815 2200 80 12 0.7777778
## 816 2400 80 12 0.7777778
## 817 2600 80 12 0.7777778
## 818 2800 80 12 0.7777778
## 819 3000 80 12 0.7777778
## 919 1400 75 14 0.7777778
## 920 1600 75 14 0.7777778
## 921 1800 75 14 0.7777778
## 922 2000 75 14 0.7777778
## 923 2200 75 14 0.7777778
## 924 2400 75 14 0.7777778
## 925 2600 75 14 0.7777778
## 926 2800 75 14 0.7777778
## 927 3000 75 14 0.7777778
## 928 1400 80 14 0.7777778
## 929 1600 80 14 0.7777778
## 930 1800 80 14 0.7777778
## 931 2000 80 14 0.7777778
## 932 2200 80 14 0.7777778
## 933 2400 80 14 0.7777778
## 934 2600 80 14 0.7777778
## 935 2800 80 14 0.7777778
## 936 3000 80 14 0.7777778
## 1045 1400 80 16 0.7777778
## 1046 1600 80 16 0.7777778
## 1047 1800 80 16 0.7777778
## 1048 2000 80 16 0.7777778
## 1049 2200 80 16 0.7777778
## 1050 2400 80 16 0.7777778
## 1051 2600 80 16 0.7777778
## 1052 2800 80 16 0.7777778
## 1053 3000 80 16 0.7777778
The results from the above code show that there is no unique combination of parameters with the highest accuracy. For this analysis we will use the following values for each parameter:
init = 1400
kfac = 30
gamma = 2
The last step is to use the chosen parameters above on the ‘test_data’ set to estimate the expected accuracy of the Elo model if you were to use the model in a netball tipping competition.
test_score <- elo_score(init = 1400,
kfac = 30,
gamma = 2,
data = test_data,
status = elo_score(1800, 55, 2, train_data)[[1]])
test_score
## [[1]]
##
## Elo Ratings For 8 Players Playing 56 Games
##
## Player Rating Games Win Draw Loss Lag
## 1 Melbourne Vixens 1952 14 11 1 2 0
## 2 Sunshine Coast Lightning 1935 14 11 1 2 0
## 3 GIANTS Netball 1883 14 10 0 4 0
## 4 Magpies Netball 1875 14 9 0 5 0
## 5 Queensland Firebirds 1808 14 7 1 6 0
## 6 NSW Swifts 1690 14 3 1 10 0
## 7 West Coast Fever 1651 14 2 0 12 0
## 8 Adelaide Thunderbirds 1605 14 1 0 13 0
##
##
## [[2]]
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 0.5 1
## 0 8 0 1
## 0.5 0 0 0
## 1 1 0 6
##
## Overall Statistics
##
## Accuracy : 0.875
## 95% CI : (0.6165, 0.9845)
## No Information Rate : 0.5625
## P-Value [Acc > NIR] : 0.008643
##
## Kappa : 0.746
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: 0 Class: 0.5 Class: 1
## Sensitivity 0.8889 NA 0.8571
## Specificity 0.8571 1 0.8889
## Pos Pred Value 0.8889 NA 0.8571
## Neg Pred Value 0.8571 NA 0.8889
## Prevalence 0.5625 0 0.4375
## Detection Rate 0.5000 0 0.3750
## Detection Prevalence 0.5625 0 0.4375
## Balanced Accuracy 0.8730 NA 0.8730
The results of the above code show that the Elo model using the specified parameters had an accuracy of 87.5% and could be an accurate model to use in a netball tipping competition.