1.0 Introduction to the Elo Rating System.

The Elo rating system was developed by Arpad Elo in 1978 to rate chess players. It has now become the most popular rating system in sport.
The Elo rating system predicts the probability of winning a match based on the skill level of the players or teams in past performances and then updates the ratings after each match.
This RPubs document will apply the rating system to the 2017 Suncorp Super Netball competition, however, the same code can be applied to other sports.

2.0 Required packages.

The code below shows the packages required to complete this analysis. If the following packages have not been installed, do this by using the install.packages() function.

# Load the required packages
library(dplyr)
library(tidyverse)
library(PlayerRatings)
library(caret)
library(e1071)
library(ggplot2)
library(devtools)

3.0 Load the data set.

The data used for this analysis was scraped using the superNetballR package. The following code will help load the data set and will be used to calculate the Elo rating for each team throughout the 2017 regular season.

# Load in the data set
devtools::install_github("stevelane/superNetballR")
library(superNetballR)
data("season_2017")
View(season_2017)
head(season_2017)

## # A tibble: 6 x 9
##   period squadId squadName     squadNickname squadCode stat    value round  game
##    <int>   <int> <chr>         <chr>         <chr>     <chr>   <int> <int> <int>
## 1      1     806 NSW Swifts    Swifts        NSW       reboun~     0     1     1
## 2      2     806 NSW Swifts    Swifts        NSW       reboun~     1     1     1
## 3      3     806 NSW Swifts    Swifts        NSW       reboun~     1     1     1
## 4      4     806 NSW Swifts    Swifts        NSW       reboun~     0     1     1
## 5      1    8118 GIANTS Netba~ GIANTS        GNB       reboun~     1     1     1
## 6      2    8118 GIANTS Netba~ GIANTS        GNB       reboun~     2     1     1

4.0 Data preparation and manipulation.

In order to calculate the Elo ratings, some data preparation and manipulation is required. The code below can be followed to get the data ready for analysis.

# The first step is to subset the data to only include the stat "goals" in order to calculate the total goals shot by each team in each game. It also subset to include only the regular season games.  
scores <- subset(season_2017, stat == "goals" & round <= 14)

# The next step is to remove the columns not needed in the analysis as well as re-ordering the columns to make the data set neat. 
scores <- scores[-c(2, 4:6)] 
scores <- scores[ , c(4, 5, 2, 1, 3)] 

# The next bit of code changes the dataframe from a long format to a wide format so that the goals from each period are in their own column for each team. 
scores <- scores %>% spread(period, value)

# The next step is calculate the total goals scored by each team in each game by adding up the columns containing the number of goals scored in each period of the game. Once the total goals has been calculated the period columns can be removed.
scores$final_score <- rowSums(scores[ , c(4:7)]) 
scores <- scores[-c(4:7)] 

# Next, the data frame must be split into a home and away data frame in order to distinguish between the home team and away team as well as the home score and away score. The columns are re-named in to make this distinction. 
home_team <- scores %>% slice(which(row_number() %% 2 == 1)) 
away_team <- scores %>% anti_join(home_team) 
names(home_team)[3] <- "h_team"  
names(home_team)[4] <- "h_score"
names(away_team)[3] <- "a_team" 
names(away_team)[4] <- "a_score" 

# Now the home and away data frames can be joined together by using the left_join function. To make the data frame neat, the columns are re-ordered. 
game_scores <- left_join(home_team, away_team)
game_scores <- game_scores[, c(1, 2, 3, 5, 4, 6)] 

# To calculate the Elo rating a derived score column needs to be created to indicate whether the home team won, drew or lost by assigning 1, 0.5 and 0, respectively.
game_scores <- game_scores %>% 
  mutate(
    score = ifelse(h_score == a_score, 0.5, ifelse(h_score > a_score, 1, 0))
  )

# To make calculating the Elo score easier the round column is renamed to 'week'.
names(game_scores)[1] <- 'week'

5.0 Learn the basic functions.

Before learning how to create one function to calculate the Elo rating, the following code will go through the basic functions of the PlayerRatings package. The PlayerRatings package contains files on the implementation of the ELO, GLICKO and STEPHENSON rating methods, however, for this analysis we will only be using the elo function. To learn more about the elo function you can use ?elo in R.

Use the ‘elo’ function to generate team ratings for the ‘game_scores’ netball data.

Assign new teams a rating of 1900 (init)
Use a k factor of 25 (kfac)
Use no home ground advantage (gamma)
Keep the ratings history of each team

eloratings <- elo(game_scores[c('week','h_team','a_team','score')],
                  init = 1900, 
                  kfac = 25, 
                  gamma = 0,
                  history = T)

# Print the team ratings
print(eloratings)

## 
## Elo Ratings For 8 Players Playing 56 Games
## 
##                     Player Rating Games Win Draw Loss Lag
## 1 Sunshine Coast Lightning   1991    14  11    1    2   0
## 2         Melbourne Vixens   1987    14  11    1    2   0
## 3           GIANTS Netball   1952    14  10    0    4   0
## 4          Magpies Netball   1943    14   9    0    5   0
## 5     Queensland Firebirds   1914    14   7    1    6   0
## 6               NSW Swifts   1830    14   3    1   10   0
## 7         West Coast Fever   1801    14   2    0   12   0
## 8    Adelaide Thunderbirds   1781    14   1    0   13   0

6.0 Create probability estimates for the Elo model.

Start by getting a list of all the weeks in the data set and then remove week 1 from the predictweeks list as this will be used to build the initial model.

predictweeks <- unique(game_scores$week)
predictweeks <- predictweeks[-1]

Create a vector containing the columns passed the rating model function as this will help clean up the code.

model_vars <- c('week','h_team','a_team','score')

Build an elo model using the first week of data with the following parameters:

initial rating = 2200
k factor = 27
gamma = 0
history = T

elomodel <- elo(game_scores[game_scores$week==1, model_vars],
                init = 2200,
                kfac = 27,
                gamma = 0,
                history = T)

Next, initialise new columns in game_scores for probability estimates.

game_scores[c('p_elo')] <- NA

Loop through the weeks where we want to generate predictions.

for (w in predictweeks){
  
  # Identify the week of games we are trying to predict
  pred_week  <- subset(game_scores, week == w)
  
  # Make predictions (set tng = 1 and trat = 1600, deviation = 300)
  pred_elo <- predict(elomodel,
                      newdata = pred_week[model_vars],
                      gamma = 0,
                      tng = 1, trat = 1600)
  
  # Add the predicted probabilities to the test_data dataframe
  game_scores$p_elo[game_scores$week == w] <- pred_elo
  
  # Update the models using the 'status' option
  elomodel <- elo(pred_week[model_vars],
                  status = elomodel$ratings,
                  init = 2200,
                  gamma = 0,
                  kfac = 27,
                  history = T)
  
}

7.0 Test the accuracy of the model.

Use the probability estimates generated in section 6.0 to test how good the elo model was at predicting match results.

In order to tip the winner of matches we need to transform our probability estimates into a binary decision; home team wins (score = 1) or home team lose (score = 0). A decision threshold of 0.5 is applied to achieve this.

Create a new column in the game_scores data frame for tips from the Elo model.

game_scores$tip_elo <- ifelse(game_scores$p_elo > 0.5, 1, 0)

Create a confusion matrix for the Elo model tips using the confusionMatrix function.

cm_elo <- 
  confusionMatrix(data = factor(game_scores$tip_elo, levels = c(0, 0.5, 1)),
                  reference = factor(game_scores$score, levels = c(0, 0.5, 1)))

cm_elo

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  0 0.5  1
##        0   20   1  3
##        0.5  0   0  0
##        1    7   0 21
## 
## Overall Statistics
##                                          
##                Accuracy : 0.7885         
##                  95% CI : (0.653, 0.8894)
##     No Information Rate : 0.5192         
##     P-Value [Acc > NIR] : 5.622e-05      
##                                          
##                   Kappa : 0.5867         
##                                          
##  Mcnemar's Test P-Value : NA             
## 
## Statistics by Class:
## 
##                      Class: 0 Class: 0.5 Class: 1
## Sensitivity            0.7407    0.00000   0.8750
## Specificity            0.8400    1.00000   0.7500
## Pos Pred Value         0.8333        NaN   0.7500
## Neg Pred Value         0.7500    0.98077   0.8750
## Prevalence             0.5192    0.01923   0.4615
## Detection Rate         0.3846    0.00000   0.4038
## Detection Prevalence   0.4615    0.00000   0.5385
## Balanced Accuracy      0.7904    0.50000   0.8125

A different way to score the Elo model is to use a probabilistic scoring method (instead of converting the predicted probabilities into discrete tips). To do this we will plot the calibration curve of the Elo model.

# remove draws (0.5) and NAs
cond <- game_scores$score != 0.5 & !is.na(game_scores$p_elo)

# build calibration model
calib <- calibration(factor(score, levels = c(1, 0))
                     ~ p_elo,
                     data = game_scores[cond, ])

# plot and interpret the model
ggplot(calib)

One way to score probability estimates is the Brier score and is shown in the following code. The lower the brier score the better the prediction.

brier_elo <- mean((game_scores$score - game_scores$p_elo)^2, na.rm = T)
brier_elo

## [1] 0.194303

8.0 Build a tipping model

The following code will help build a tipping model based on the Elo function. This model is built using a combination of the basic functions and techniques learnt in the above sections. It will also show how to choose the best initial values for each parameter.

The first step is to create a function that takes the Elo parameters ‘init’, ‘kfac’ and ‘gamma’ as arguments as well as a data frame that contains match information and a ‘status’ variable that allows existing models to be passed through the function. This function will return the final elo ratings model and a confusion matrix that contains the predicted tips.

elo_score <- function(init, kfac, gamma, data, status = NULL){
  
  predictweeks <- unique(data$week)
  model_vars <- c('week','h_team','a_team','score')
  
  if (is.null(status)){
    
    predictweeks <- predictweeks[-1]
    
    elomodel <- elo(data[data$week == 1, model_vars],
                    init = init,
                    kfac = kfac,
                    gamma = gamma,
                    history = T)
    
  } else {
    
    elomodel <- status
    
  }
  
  
  data[c('p_elo')] <- NA
  
  for (w in predictweeks){
    
    pred_week  <- subset(data, week == w)
    
    pred_elo <- predict(elomodel,
                        newdata = pred_week[model_vars],
                        gamma = gamma,
                        tng = 1, trat = 1600)
    
    data$p_elo[data$week == w] <- pred_elo
    
    elomodel <- elo(pred_week[model_vars],
                    status = elomodel$ratings,
                    init = init,
                    gamma = gamma,
                    kfac = kfac,
                    history = T)
    
  }
  
  data$tip_elo <- ifelse(data$p_elo > 0.5, 1, 0)
  
  cm_elo <- confusionMatrix(data = factor(data$tip_elo, levels = c(0,0.5,1)),
                            reference = factor(data$score, levels = c(0,0.5,1)))
  
  return(
    list(elomodel,
         cm_elo)
  )
  
}

Next create a grid parameter of values to test using the expand.grid function. This is used to determine the most accurate combination of parameters for the Elo model.

Set the initial values to between 1400 and 3000 (increments of 200)
For the kfactor, set the values to between 20 and 80 (increments of 5)
Include the gamma parameter (home ground advantage), setting the values to between 0 and 20 (increments of 2)

parameters <- expand.grid(init = seq(1400, 3000, by = 200),
                          kfac = seq(20, 80, by = 5), 
                          gamma = seq(0, 20, by = 2))

In order to estimate the optimal parameter combination and estimate the out-of-sample performance, the game_score needs to be split into train and test data. The train will be used to estimate the parameters and the test data is used to estimate out-of-sample performance. The out-of-sample performance test helps reduce the bias.

Assign week 10 and less to the train data and week 11 and above to the test data.
The code to this is shown below.

train_data <- subset(game_scores, week <= 10)
test_data  <- subset(game_scores, week >= 11)

For each parameter combination in ‘parameters’, calculate the tip accuracy on the train_data by using the apply function.

NOTE: This step may take some time to compute.

parameters$accuracy <- apply(parameters, 1, function (x) 
  elo_score(x[1], x[2], x[3], train_data)[[2]]$overall['Accuracy'] )

To find the best parameter combination the following code can be used.

subset(parameters, accuracy == max(parameters$accuracy))

##      init kfac gamma  accuracy
## 136  1400   30     2 0.7777778
## 137  1600   30     2 0.7777778
## 138  1800   30     2 0.7777778
## 139  2000   30     2 0.7777778
## 140  2200   30     2 0.7777778
## 141  2400   30     2 0.7777778
## 142  2600   30     2 0.7777778
## 143  2800   30     2 0.7777778
## 144  3000   30     2 0.7777778
## 145  1400   35     2 0.7777778
## 146  1600   35     2 0.7777778
## 147  1800   35     2 0.7777778
## 148  2000   35     2 0.7777778
## 149  2200   35     2 0.7777778
## 150  2400   35     2 0.7777778
## 151  2600   35     2 0.7777778
## 152  2800   35     2 0.7777778
## 153  3000   35     2 0.7777778
## 271  1400   40     4 0.7777778
## 272  1600   40     4 0.7777778
## 273  1800   40     4 0.7777778
## 274  2000   40     4 0.7777778
## 275  2200   40     4 0.7777778
## 276  2400   40     4 0.7777778
## 277  2600   40     4 0.7777778
## 278  2800   40     4 0.7777778
## 279  3000   40     4 0.7777778
## 280  1400   45     4 0.7777778
## 281  1600   45     4 0.7777778
## 282  1800   45     4 0.7777778
## 283  2000   45     4 0.7777778
## 284  2200   45     4 0.7777778
## 285  2400   45     4 0.7777778
## 286  2600   45     4 0.7777778
## 287  2800   45     4 0.7777778
## 288  3000   45     4 0.7777778
## 406  1400   50     6 0.7777778
## 407  1600   50     6 0.7777778
## 408  1800   50     6 0.7777778
## 409  2000   50     6 0.7777778
## 410  2200   50     6 0.7777778
## 411  2400   50     6 0.7777778
## 412  2600   50     6 0.7777778
## 413  2800   50     6 0.7777778
## 414  3000   50     6 0.7777778
## 415  1400   55     6 0.7777778
## 416  1600   55     6 0.7777778
## 417  1800   55     6 0.7777778
## 418  2000   55     6 0.7777778
## 419  2200   55     6 0.7777778
## 420  2400   55     6 0.7777778
## 421  2600   55     6 0.7777778
## 422  2800   55     6 0.7777778
## 423  3000   55     6 0.7777778
## 532  1400   55     8 0.7777778
## 533  1600   55     8 0.7777778
## 534  1800   55     8 0.7777778
## 535  2000   55     8 0.7777778
## 536  2200   55     8 0.7777778
## 537  2400   55     8 0.7777778
## 538  2600   55     8 0.7777778
## 539  2800   55     8 0.7777778
## 540  3000   55     8 0.7777778
## 541  1400   60     8 0.7777778
## 542  1600   60     8 0.7777778
## 543  1800   60     8 0.7777778
## 544  2000   60     8 0.7777778
## 545  2200   60     8 0.7777778
## 546  2400   60     8 0.7777778
## 547  2600   60     8 0.7777778
## 548  2800   60     8 0.7777778
## 549  3000   60     8 0.7777778
## 550  1400   65     8 0.7777778
## 551  1600   65     8 0.7777778
## 552  1800   65     8 0.7777778
## 553  2000   65     8 0.7777778
## 554  2200   65     8 0.7777778
## 555  2400   65     8 0.7777778
## 556  2600   65     8 0.7777778
## 557  2800   65     8 0.7777778
## 558  3000   65     8 0.7777778
## 658  1400   60    10 0.7777778
## 659  1600   60    10 0.7777778
## 660  1800   60    10 0.7777778
## 661  2000   60    10 0.7777778
## 662  2200   60    10 0.7777778
## 663  2400   60    10 0.7777778
## 664  2600   60    10 0.7777778
## 665  2800   60    10 0.7777778
## 666  3000   60    10 0.7777778
## 667  1400   65    10 0.7777778
## 668  1600   65    10 0.7777778
## 669  1800   65    10 0.7777778
## 670  2000   65    10 0.7777778
## 671  2200   65    10 0.7777778
## 672  2400   65    10 0.7777778
## 673  2600   65    10 0.7777778
## 674  2800   65    10 0.7777778
## 675  3000   65    10 0.7777778
## 676  1400   70    10 0.7777778
## 677  1600   70    10 0.7777778
## 678  1800   70    10 0.7777778
## 679  2000   70    10 0.7777778
## 680  2200   70    10 0.7777778
## 681  2400   70    10 0.7777778
## 682  2600   70    10 0.7777778
## 683  2800   70    10 0.7777778
## 684  3000   70    10 0.7777778
## 685  1400   75    10 0.7777778
## 686  1600   75    10 0.7777778
## 687  1800   75    10 0.7777778
## 688  2000   75    10 0.7777778
## 689  2200   75    10 0.7777778
## 690  2400   75    10 0.7777778
## 691  2600   75    10 0.7777778
## 692  2800   75    10 0.7777778
## 693  3000   75    10 0.7777778
## 793  1400   70    12 0.7777778
## 794  1600   70    12 0.7777778
## 795  1800   70    12 0.7777778
## 796  2000   70    12 0.7777778
## 797  2200   70    12 0.7777778
## 798  2400   70    12 0.7777778
## 799  2600   70    12 0.7777778
## 800  2800   70    12 0.7777778
## 801  3000   70    12 0.7777778
## 802  1400   75    12 0.7777778
## 803  1600   75    12 0.7777778
## 804  1800   75    12 0.7777778
## 805  2000   75    12 0.7777778
## 806  2200   75    12 0.7777778
## 807  2400   75    12 0.7777778
## 808  2600   75    12 0.7777778
## 809  2800   75    12 0.7777778
## 810  3000   75    12 0.7777778
## 811  1400   80    12 0.7777778
## 812  1600   80    12 0.7777778
## 813  1800   80    12 0.7777778
## 814  2000   80    12 0.7777778
## 815  2200   80    12 0.7777778
## 816  2400   80    12 0.7777778
## 817  2600   80    12 0.7777778
## 818  2800   80    12 0.7777778
## 819  3000   80    12 0.7777778
## 919  1400   75    14 0.7777778
## 920  1600   75    14 0.7777778
## 921  1800   75    14 0.7777778
## 922  2000   75    14 0.7777778
## 923  2200   75    14 0.7777778
## 924  2400   75    14 0.7777778
## 925  2600   75    14 0.7777778
## 926  2800   75    14 0.7777778
## 927  3000   75    14 0.7777778
## 928  1400   80    14 0.7777778
## 929  1600   80    14 0.7777778
## 930  1800   80    14 0.7777778
## 931  2000   80    14 0.7777778
## 932  2200   80    14 0.7777778
## 933  2400   80    14 0.7777778
## 934  2600   80    14 0.7777778
## 935  2800   80    14 0.7777778
## 936  3000   80    14 0.7777778
## 1045 1400   80    16 0.7777778
## 1046 1600   80    16 0.7777778
## 1047 1800   80    16 0.7777778
## 1048 2000   80    16 0.7777778
## 1049 2200   80    16 0.7777778
## 1050 2400   80    16 0.7777778
## 1051 2600   80    16 0.7777778
## 1052 2800   80    16 0.7777778
## 1053 3000   80    16 0.7777778

The results from the above code show that there is no unique combination of parameters with the highest accuracy. For this analysis we will use the following values for each parameter:

init = 1400
kfac = 30
gamma = 2

The last step is to use the chosen parameters above on the ‘test_data’ set to estimate the expected accuracy of the Elo model if you were to use the model in a netball tipping competition.

test_score <- elo_score(init = 1400, 
                        kfac = 30, 
                        gamma = 2, 
                        data = test_data,
                        status = elo_score(1800, 55, 2, train_data)[[1]])

test_score

## [[1]]
## 
## Elo Ratings For 8 Players Playing 56 Games
## 
##                     Player Rating Games Win Draw Loss Lag
## 1         Melbourne Vixens   1952    14  11    1    2   0
## 2 Sunshine Coast Lightning   1935    14  11    1    2   0
## 3           GIANTS Netball   1883    14  10    0    4   0
## 4          Magpies Netball   1875    14   9    0    5   0
## 5     Queensland Firebirds   1808    14   7    1    6   0
## 6               NSW Swifts   1690    14   3    1   10   0
## 7         West Coast Fever   1651    14   2    0   12   0
## 8    Adelaide Thunderbirds   1605    14   1    0   13   0
## 
## 
## [[2]]
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction 0 0.5 1
##        0   8   0 1
##        0.5 0   0 0
##        1   1   0 6
## 
## Overall Statistics
##                                           
##                Accuracy : 0.875           
##                  95% CI : (0.6165, 0.9845)
##     No Information Rate : 0.5625          
##     P-Value [Acc > NIR] : 0.008643        
##                                           
##                   Kappa : 0.746           
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: 0 Class: 0.5 Class: 1
## Sensitivity            0.8889         NA   0.8571
## Specificity            0.8571          1   0.8889
## Pos Pred Value         0.8889         NA   0.8571
## Neg Pred Value         0.8571         NA   0.8889
## Prevalence             0.5625          0   0.4375
## Detection Rate         0.5000          0   0.3750
## Detection Prevalence   0.5625          0   0.4375
## Balanced Accuracy      0.8730         NA   0.8730

The results of the above code show that the Elo model using the specified parameters had an accuracy of 87.5% and could be an accurate model to use in a netball tipping competition.

Elo Rating System - Netball

Laine Heidenreich