1. Introduction

In this tutorial, I will talk through the process of calculating Elo Ratings for the AFL.

We will primarily be using the ‘fitzRoy’ package. This scrapes various websites for a multitude of recorded AFL metrics and statistics. We are able to manipulate these variables to calculate our Elo Ratings.

The ‘PlayerRatings’ package is able to handle these calculations and more, and I would fully recommend its use over this process, however, this tutorial will walk through the background and steps behind how the calculations are done.

2. Elo Rating Calculation and Updating

Elo Ratings represent relative skill of competitors in a sport. Players (or teams) exchange points based off the outcome of a match-up between two rated competitors.

The Elo Ratings calculation is given by:

\(Rating_{new} = Rating_{old} + K (S - P)\)

The K value represents the maximum number of points possible for transfer between competitors, and the rate at which ratings can change.
The S value represents the match outcome, 0 for a loss, 0.5 for a draw and 1 for a win.
The P value represents the expected probability of that competitor winning (see the next equation).

The probability of a result is given by:

\(P(A) = \frac{1}{10^{\frac{Rating_{B}-Rating_{A}}{400}}+1}\)

We will set up functions to use these equations in R.

# create the Elo probability function with each team's rating as parameters
eloProb <- function(ratingA, ratingB) {
    1 / (1 + (10^((ratingB - ratingA) / 400)))    
}

The rating function determines how many points should be added or deducted as a results of a match. A win will increase a rating, a loss will decrease. A draw will have minimal impact on ratings, but will reward a lower rated competitor more.

# create the Elo rating function, with p, s and k parameters
eloRating <- function(p, s, k){
        k * (s - p)
}

We will use starting ratings of 2200 and a K value of 27. These parameters can be tested upon for greater accuracy using previous data, but for the purposes of this tutorial, these default values will do.

# Assign the suggested values to variables
k <- 27
rating0 <- 2200

3. Source the Data

We are going to load in the appropriate packages and data for this tutorial.

# Load in the appropriate packages
library(dplyr)
library(tidyr)
library(fitzRoy)
library(ggplot2)

Firstly, we’ll import result data from the last five years: the desired length of time is flexible.

# add desired years of results
importedData <- fetch_player_stats_afltables(season = c(2017:2021))

# the imported data needs to be grouped and summarised by game
fitzData <- importedData %>%
  group_by(Season, Round, Date, Local.start.time, Venue) %>%
  # a round number variable will assist in creating a unique value for every round over multiple years 
  summarise(roundnumber = ifelse(Round[1] == "EF" | Round[1] == "QF", "FW1", ifelse(Round[1] == "SF", "FW2", ifelse(Round[1] == "PF", "FW3", ifelse(Round[1] == "GF", "FW4", Round[1])))),
            year = Season[1],
            hteam = Home.team[1],
            ateam = Away.team[1],
            hscore = Home.score[1],
            ascore = Away.score[1],
            # we will create a new column to represent the S value for the Elo update function. 
            # 1 for a win, 0.5 for a draw and 0 for a loss.
            spoint = ifelse(hscore > ascore, 1, ifelse(hscore == ascore, 0.5, 0)))

# combine roundnumber and year to create unique value for each round, and make sure the data is arranged by date
fitzData <- fitzData %>% 
  unite("roundcode", sep = " - ", roundnumber:year, remove = TRUE) %>%
  arrange(Date)

Let’s create a data frame that contains the teams list, the round and the current rating.

# list of all teams
aflTeamList <- unique(c(fitzData$hteam,
                        fitzData$ateam))

# data frame that contains the current round and rating
aflEloRatings <- data.frame(
        team = aflTeamList,
        roundcode = 0,
        rating = rating0)

# arranging the data frame by team name
aflEloRatings <- aflEloRatings %>%
        arrange(team)

For our calculation, we need a vector of the rounds in the data set and a vector variable containing the length of the vector.

roundList <- unique(fitzData$roundcode)
numRounds <- length(roundList)

4. Generate Elo Ratings

We’ll use a for loop to calculate all ratings for all rounds within the data set.

for (i in 1:numRounds){
        
        # Let 'r' equal the current roundcode
        r <- roundList[i]
        
        # Let 'lastRound' equal the roundcode number of the previous iteration
        # if the iteration == 1, make lastRound = 0, else, 
        # make lastRound = one less than the iteration number
        if (i == 1){
                lastRound = 0
        }else{
                lastRound = roundList[i-1]
        }
        
        # Get the subset of games played in the roundcode
        games <- subset(fitzData, roundcode == r)
        
        # merge the roundcode subset with the aflEvoRatings data frame, 
        # with column specification, by Home Team
        games <- merge(
                games,
                aflEloRatings[aflEloRatings$roundcode == lastRound, c('team','rating')],
                by.x = 'hteam', by.y = 'team'
        )
        
        # naming the column appropriately
        names(games)[names(games) == 'rating'] <- 'HomeRating'
        
        # merge the roundcode subset with the aflEvoRatings data frame, 
        # with column specification, by Away Team
        games <- merge(
                games,
                aflEloRatings[aflEloRatings$roundcode == lastRound, c('team','rating')],
                by.x = 'ateam', by.y = 'team'
        )
        # name the column appropriately
        names(games)[names(games) == 'rating'] <- 'AwayRating'
        
        
        # Calculate and update the Elo Ratings for each team
        games$HomeProb <- apply(games[c('HomeRating','AwayRating')], 1, function(x) eloProb(x[1], x[2]))
        games$AwayProb <- apply(games[c('AwayRating','HomeRating')], 1, function(x) eloProb(x[1], x[2]))
        games$HomeAdjust <- apply(games[c('HomeProb','spoint')], 1, function(x) eloRating(x[1], x[2], k))
        games$AwayAdjust <- apply(games[c('AwayProb','spoint')], 1, function(x) eloRating(x[1], 1-x[2], k))
        
        # a temporary adjustment data frame. Contains the old rating, 
        # roundcode and adjustment required, for the Home and Away teams
        adjust <- data.frame(team = c(games$ateam, games$hteam),
                             lastRating = c(games$AwayRating, games$HomeRating),
                             roundcode = r,
                             adjust = c(games$AwayAdjust, games$HomeAdjust))
        
        # combining the old rating and adjustment to get the new ranking
        adjust$rating <- adjust$lastRating + adjust$adjust
        
        # row bind the new ratings to the aflEloRatings data frame
        aflEloRatings <- rbind(aflEloRatings,
                            adjust[c('team','roundcode','rating')])
        
        
        # When a team has a bye, or doesn't qualify for finals, 
        # we need to carry forward their (unchanged rating).
        # We create a 'noGame' data frame containing the teams that didn't play
        noGame <- data.frame(team = aflTeamList[! aflTeamList %in% 
                                                     c(games$ateam, 
                                                       games$hteam)
        ]
        )
        
        
        # If there WERE teams that didn't play
        if(nrow(noGame) > 0){
                
                # Add the roundcode number (r) to the noGame data frame
                noGame$roundcode <- r
                
                # Merge the lastRound data into the noGame data
                noGame <- merge(noGame,
                                aflEloRatings[aflEloRatings$roundcode == lastRound, c('team','rating')],
                                by = 'team')
                
                # Bind the new data onto the aflEloRatings data frame
                aflEloRatings <- rbind(aflEloRatings,
                                    noGame)
                
        }
        
}

With the ratings now created, we can check any round to see the ratings at the time. The most recent round could be used.

# subset a week to check ratings
subset(aflEloRatings, roundcode == "19 - 2021")

##                        team roundcode   rating
## 2197            Collingwood 19 - 2021 2161.140
## 2198              Fremantle 19 - 2021 2130.260
## 2199             Gold Coast 19 - 2021 2024.042
## 2200 Greater Western Sydney 19 - 2021 2199.422
## 2201               Hawthorn 19 - 2021 2067.417
## 2202        North Melbourne 19 - 2021 2049.902
## 2203               Richmond 19 - 2021 2303.086
## 2204               St Kilda 19 - 2021 2186.699
## 2205       Western Bulldogs 19 - 2021 2311.856
## 2206          Port Adelaide 19 - 2021 2343.068
## 2207                 Sydney 19 - 2021 2225.498
## 2208         Brisbane Lions 19 - 2021 2331.607
## 2209               Essendon 19 - 2021 2164.460
## 2210               Adelaide 19 - 2021 2080.786
## 2211                Carlton 19 - 2021 2097.905
## 2212                Geelong 19 - 2021 2385.323
## 2213             West Coast 19 - 2021 2274.068
## 2214              Melbourne 19 - 2021 2263.462

5. Visualisation

To finish off, just a simple visualisation of the newly created AFL Elo Ratings.

Create a variable for the current or ratings of a specific round, and arrange them in alphabetical order.

# assign current ratings to a variable, and have them sorted by name
currentAFLEloRatings <- subset(aflEloRatings, roundcode == "19 - 2021")
currentAFLEloRatings <- currentAFLEloRatings %>%
        arrange(team)

Assigning abbreviations to the data frame and setting a custom order (the abbreviation ‘GCS’ comes before ‘GEEL’, same as ‘WCE’ and ‘WB’)

# convert names in data frame to shorthands
currentAFLEloRatings$team <- c("ADE", "BRIS", "CARL", "COLL", "ESS", "FRE", "GEEL", "GCS", "GWS", "HAW", "MELB", "NM", "PORT", "RICH", "STK", "SYD", "WCE", "WB")

currentAFLEloRatings$team <- factor(currentAFLEloRatings$team,levels = c("ADE", "BRIS", "CARL", "COLL", "ESS", "FRE", "GEEL", "GCS", "GWS", "HAW", "MELB", "NM", "PORT", "RICH", "STK", "SYD", "WCE", "WB"))

We’ll create custom colour palettes to represent the teams.

# create custom colour sets to represent each team
aflColoursHex <- c("#0b2240", "#78184a", "#021a31", "#000000", "#000000", "#2a0d54", "#ffffff", "#e11b0a", "#fd8204", "#4d2004", "#061a33", "#ffffff", "#000000", "#000000", "#000000", "#ed171f", "#003087", "#0d3692")

aflColoursHex2 <- c("#e21937", "#fdbe57", "#ffffff", "#ffffff", "#cc2031", "#ffffff", "#021a31", "#ffde00", "#343433", "#fbbf15", "#de0316", "#000099", "#06aac5", "#fed102", "#ffffff", "#ffffff", "#f2aa00", "#ffffff")

aflColoursHex3 <- c("#ffd200", "#000099", "#ffffff", "#ffffff", "#cc2031", "#ffffff", "#021a31", "#ffde00", "#ffffff", "#fbbf15", "#de0316", "#000099", "#ffffff", "#fed102", "#ed0f05", "#ffffff", "#f2aa00", "#c70136")

Lastly, we’ll use a histogram to show the relative rankings in direct comparison, with appropriate colours for each team.

# create plot
# x axis is the team, y axis is the rating.
# fill is the team, colour is the team.
# label is rounded off to 2 decimal places.
ggplot(data = currentAFLEloRatings, aes(x = team, y = rating, fill = team, colour = team, label = round(rating, 2))) +
        geom_col() + #bar chart
        scale_fill_manual(values = aflColoursHex) + #fill with our custom primary colours
        scale_colour_manual(values = aflColoursHex3) + # outline with our custom secondary colours
        labs(title = "AFL Elo Ratings",
             subtitle = "As of 24-06-2021",
             x = "Team",
             y = "Elo Rating") +
        geom_text(colour = aflColoursHex2, angle = 90, position = position_stack(vjust = 0.5)) + # text with our custom text colours
        theme_bw() +
        theme(legend.position = "none")

Custom AFL Elo Rating Calculation and Visualisation

Portfolio 01 - SPE5AMS

Daylon Seakins - 21094465

22/10/2021

1. Introduction

2. Elo Rating Calculation and Updating

3. Source the Data

4. Generate Elo Ratings

5. Visualisation