Elo win probability model walkthrough for the AFL 2023 finals

This walkthrough shows how to use Elo ratings to predict the win probability of games, using the AFL 2023 finals series as the example.

#Load in required packages
library(PlayerRatings)
library(fitzRoy)
library(tidyverse)
library(elo)
library(dplyr)
#Scrape AFL results for the 2023 season using the fitzRoy package
elo_afl2023 <- fitzRoy::fetch_results_afltables(season = 2023)

There are multiple rounds in the finals series. Each game affects a team's Elo rating, so we must build an updated data frame for each round we want to predict.

#The Elimination Finals (EF) and Qualifying Finals (QF) are based on rounds 1 to 24 only.
#We also create a score column for the home team's result: 1 = win, 0 = loss, 0.5 = draw

afl2023_rating_EF_QF <- elo_afl2023 %>% 
  filter(!(Round %in% c('EF', 'QF', 'SF', 'PF', 'GF'))) %>%
  mutate(score = ifelse(Home.Points > Away.Points, 1,
                        ifelse(Home.Points < Away.Points, 0,
                               0.5)))
#The Semi-Finals (SF) are based on rounds 1 to 24, the EF and the QF
#We also create a score column for the home team's result: 1 = win, 0 = loss, 0.5 = draw

afl2023_rating_SF <- elo_afl2023 %>% 
  filter(!(Round %in% c('SF', 'PF', 'GF'))) %>%
  mutate(score = ifelse(Home.Points > Away.Points, 1,
                        ifelse(Home.Points < Away.Points, 0,
                               0.5)))
#The Preliminary Finals (PF) are based on rounds 1 to 24, the EF, the QF and the SF
#We also create a score column for the home team's result: 1 = win, 0 = loss, 0.5 = draw

afl2023_rating_PF <- elo_afl2023 %>% 
  filter(!(Round %in% c('PF', 'GF'))) %>%
  mutate(score = ifelse(Home.Points > Away.Points, 1,
                        ifelse(Home.Points < Away.Points, 0,
                               0.5)))
#The Grand Final (GF) is based on every round except the GF itself
#We also create a score column for the home team's result: 1 = win, 0 = loss, 0.5 = draw

afl2023_rating_GF <- elo_afl2023 %>% 
  filter(!(Round %in% c('GF'))) %>%
  mutate(score = ifelse(Home.Points > Away.Points, 1,
                        ifelse(Home.Points < Away.Points, 0,
                               0.5)))
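
The four data frames above differ only in which finals rounds are excluded, so the same step could be wrapped in a small function. The following is a minimal base R sketch rather than part of the original workflow: `games_before()` and the toy results are hypothetical, with made-up scores, and only the column names are taken from the fitzRoy data used above.

```r
# Hypothetical helper: drop the rounds we are not allowed to "see" yet and
# score each remaining game from the home team's perspective
# (1 = home win, 0 = home loss, 0.5 = draw).
games_before <- function(results, excluded_rounds) {
  kept <- results[!(results$Round %in% excluded_rounds), ]
  kept$score <- ifelse(kept$Home.Points > kept$Away.Points, 1,
                       ifelse(kept$Home.Points < kept$Away.Points, 0, 0.5))
  kept
}

# Toy results with made-up teams and scores (not real AFL data)
toy_results <- data.frame(
  Round        = c("R1", "R2", "EF", "SF"),
  Round.Number = c(1, 2, 25, 26),
  Home.Team    = c("A", "B", "A", "B"),
  Away.Team    = c("B", "A", "B", "A"),
  Home.Points  = c(90, 70, 80, 100),
  Away.Points  = c(80, 70, 95, 60)
)

# Games available when predicting the semi-finals
toy_before_sf <- games_before(toy_results, c("SF", "PF", "GF"))
print(toy_before_sf)
```

With the real data, a call such as `games_before(elo_afl2023, c("PF", "GF"))` would be expected to give the same rows as `afl2023_rating_PF`.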

Using these data frames, we can calculate the Elo rating of every team at each stage of the finals series.

#Calculate elo ratings for EF and QF
eloratings_EF_QF <- elo(afl2023_rating_EF_QF[c("Round.Number", "Home.Team", "Away.Team", "score")])

#Calculate elo ratings for SF
eloratings_SF <- elo(afl2023_rating_SF[c("Round.Number", "Home.Team", "Away.Team", "score")])

#Calculate elo ratings for PF
eloratings_PF <- elo(afl2023_rating_PF[c("Round.Number", "Home.Team", "Away.Team", "score")])

#Calculate elo ratings for GF
eloratings_GF <- elo(afl2023_rating_GF[c("Round.Number", "Home.Team", "Away.Team", "score")])
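
For intuition about what happens to the ratings after each game, here is a minimal base R sketch of the standard Elo update rule for a single game. It is not the PlayerRatings implementation: `elo_update()` is a hypothetical function, and the K-factor of 27 and starting rating of 2200 are assumptions, so check `?PlayerRatings::elo` for the values and batching rules the package actually uses.

```r
# Standard Elo update for one game, from the home team's perspective.
# k = 27 is an assumed K-factor, not confirmed from the package source.
elo_update <- function(r_home, r_away, score, k = 27) {
  # Expected score of the home team given the current rating gap
  expected_home <- 1 / (1 + 10^((r_away - r_home) / 400))
  # Points move from one team to the other in proportion to the surprise
  change <- k * (score - expected_home)
  c(home = r_home + change, away = r_away - change)
}

# Two hypothetical teams that start level; the home team wins (score = 1)
updated <- elo_update(r_home = 2200, r_away = 2200, score = 1)
print(updated)
```

Because the teams start level, the expected score is 0.5, so the winner gains half the K-factor and the loser gives up the same amount.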

We will work through one example of how to use these ratings to predict a win probability.

Let's take the Elimination Final between Carlton and Sydney.

#Use the predict function with the ratings object for the round the game belongs to (this game was an EF, so we use eloratings_EF_QF)

#Choose the Home team and Away team
elo_pred_carlton_sydney <- predict(eloratings_EF_QF,
                                   newdata = data.frame(
                                     Week = NA,
                                     Home.Team = "Carlton",
                                     Away.Team = "Sydney"
                                   ))
print(elo_pred_carlton_sydney)
## [1] 0.5827805

This is the predicted probability that the home team (Carlton) wins: about 0.58.
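
To see where a number like this comes from, the standard Elo formula turns a rating gap into a win probability. The sketch below is a plain base R version of that formula, not the PlayerRatings source: `elo_win_prob()` is a hypothetical function, the ratings are invented, and it assumes no home-ground advantage.

```r
# Elo win probability for the home team given two ratings.
# home_advantage = 0 assumes no home-ground bonus is applied.
elo_win_prob <- function(r_home, r_away, home_advantage = 0) {
  1 / (1 + 10^((r_away - r_home - home_advantage) / 400))
}

# Hypothetical ratings, not the fitted values for Carlton and Sydney
p_equal <- elo_win_prob(2200, 2200)  # evenly matched teams
p_gap58 <- elo_win_prob(2258, 2200)  # home team rated 58 points higher
print(c(equal = p_equal, gap_58 = p_gap58))
```

Under this formula, a gap of roughly 58 rating points in the home team's favour corresponds to a win probability of about 0.58, which is in line with the Carlton prediction.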

We can simulate this game 1000 times to see what happens. We’d expect the share of Carlton wins to be close to our predicted probability.

#We will use rbinom to draw either a 1 (the home team, Carlton, wins) or a 0 (Carlton loses) for each game

#The probability of drawing a 1 is the win probability we predicted earlier

data.frame(
  Carlton = rbinom(n = 1000, size = 1, prob = elo_pred_carlton_sydney)
) %>% 
  summarise(Carlton = sum(Carlton),
            Sydney = 1000 - Carlton)
##   Carlton Sydney
## 1     583    417

This shows that out of 1000 simulated games, Carlton won 583 and Sydney won 417.
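
The simulated count will change each time the code is run, because `rbinom` draws random numbers. The sketch below fixes a seed so the run is repeatable and works out how much variation to expect around the predicted share. The seed value is arbitrary, and the probability is typed in from the printed prediction rather than read from the ratings object.

```r
set.seed(2023)  # arbitrary seed so the simulation is repeatable

p_carlton <- 0.5827805  # win probability printed by predict() earlier
n_games   <- 1000

# Simulate 1000 games: 1 = Carlton win, 0 = Carlton loss
carlton_wins <- sum(rbinom(n = n_games, size = 1, prob = p_carlton))

# Binomial mean and standard deviation of the number of Carlton wins
expected_wins <- n_games * p_carlton
sd_wins       <- sqrt(n_games * p_carlton * (1 - p_carlton))

print(c(simulated = carlton_wins, expected = expected_wins, sd = sd_wins))
```

The standard deviation is about 16 games, so Carlton win counts anywhere from roughly 550 to 615 would be unsurprising for a single run of 1000 simulated games.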