AFL Match Result Prediction using ELO Rating System

Author

Zach Ferlazzo

Published

November 3, 2025

Introduction

What is ELO?

The ELO rating system was developed in the 1960s by Hungarian-American physicist Arpad Elo to rank chess players. The beauty of ELO lies in its simplicity: every player (or team) has a rating, and after each match, points are exchanged between competitors based on the outcome and the expected result.

How it works:

  • All teams start with a baseline rating
  • When teams play, we can calculate the expected probability of each team winning based on their rating difference
  • After the match, the winner takes points from the loser
  • If a strong team beats a weak team, they gain few points (expected result)
  • If a weak team upsets a strong team, they gain many points (unexpected result)
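
The steps above can be sketched in a few lines of base R. This is a minimal illustration of the standard Elo formulas, not the tuned model built later; the function names, the ratings of 1600 and 1400, and the K value of 20 are all assumptions chosen for the example.

```r
# Expected win probability for team A under the standard Elo logistic curve
elo_expected <- function(rating_a, rating_b) {
  1 / (1 + 10^((rating_b - rating_a) / 400))
}

# Points team A gains (negative means loses) after one match
# outcome_a: 1 if A won, 0 if A lost; k = 20 is a hypothetical update size
elo_update <- function(rating_a, rating_b, outcome_a, k = 20) {
  k * (outcome_a - elo_expected(rating_a, rating_b))
}

strong <- 1600
weak <- 1400

p_strong <- elo_expected(strong, weak)        # about 0.76
gain_expected <- elo_update(strong, weak, 1)  # strong team wins: about 4.8 points
gain_upset <- elo_update(weak, strong, 1)     # weak team wins: about 15.2 points
```

The loser gives up exactly what the winner gains, so the total number of rating points in the system stays constant.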

Why ELO for AFL?

ELO is particularly well suited for AFL predictions because:

  • Self-correcting: Teams that improve will win more than expected and their rating will rise to reflect their true strength
  • Simple but powerful: Uses only match results - no complex statistics required
  • Transparent: Anyone can understand how ratings change after each game
  • Proven track record: Successfully adapted from chess to sports like NFL, NBA, and soccer
  • Accounts for uncertainty: Produces probabilities, not just binary predictions

This tutorial builds an ELO system to predict AFL home team wins, incorporating home ground advantage and tuning parameters for optimal performance.

Note on Draws: Draws are rare in AFL and for this example are treated as home team losses for simplicity.
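
As a small illustration of what that means in practice, the outcome coding used later in this tutorial turns a draw into a 0. The scores below are made up for the example.

```r
# Toy scores (hypothetical): a home win, a home loss, and a draw
home_points <- c(90, 70, 80)
away_points <- c(75, 88, 80)

# Same outcome coding as used in this tutorial: 1 = home win, 0 = anything else
home_won <- as.numeric(home_points > away_points)
home_won
```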

Setup

I used R to implement the ELO model. You can follow along by running the code blocks.

Code
# Import Library
library(fitzRoy)
library(tidyverse)
library(elo)
library(lubridate)
library(tidymodels)
library(xgboost)
library(slider) 
library(caret) 
library(yardstick) 
library(patchwork)
library(vip)
library(knitr)
library(formattable)
library(htmltools)

# Clear Memory
rm(list = ls())

Data Collection

Let's use data from every season in which all 18 current AFL sides have competed together: 2012, when GWS joined, through 2025. Because Elo is purely win/loss based, we only need basic match result data and won't need to make adjustments for the COVID-interrupted seasons.

We fetch AFL results from 2012-2025 using the fitzRoy package:

Code
# Collect Data
results_df <- fetch_results_afltables(season = 2012:2025)
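
If you rerun the tutorial often, you may prefer not to download the results every time. One possible approach is to cache the data frame on disk. The helper `cached_fetch()` and the file name below are hypothetical and not part of fitzRoy.

```r
# Hypothetical helper: read a cached copy if it exists, otherwise fetch and save it
cached_fetch <- function(path, fetch_fn) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  data <- fetch_fn()
  saveRDS(data, path)
  data
}

# Usage with fitzRoy (commented out because it needs network access):
# results_df <- cached_fetch(
#   "afl_results_2012_2025.rds",
#   function() fitzRoy::fetch_results_afltables(season = 2012:2025)
# )
```

Delete the cached file whenever you want fresh results, for example after a new round has been played.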

Data Exploration

Let’s examine the structure and contents of our data:

Code
print(summary(results_df))
      Game            Date               Round            Home.Team        
 Min.   :13960   Min.   :2012-03-24   Length:2879        Length:2879       
 1st Qu.:14680   1st Qu.:2015-06-20   Class :character   Class :character  
 Median :15399   Median :2018-09-06   Mode  :character   Mode  :character  
 Mean   :15399   Mean   :2018-12-26                                        
 3rd Qu.:16118   3rd Qu.:2022-07-05                                        
 Max.   :16838   Max.   :2025-09-27                                        
   Home.Goals     Home.Behinds    Home.Points      Away.Team        
 Min.   : 2.00   Min.   : 1.00   Min.   : 16.00   Length:2879       
 1st Qu.:10.00   1st Qu.: 9.00   1st Qu.: 68.00   Class :character  
 Median :12.00   Median :11.00   Median : 85.00   Mode  :character  
 Mean   :12.67   Mean   :11.33   Mean   : 87.33                     
 3rd Qu.:16.00   3rd Qu.:14.00   3rd Qu.:105.00                     
 Max.   :31.00   Max.   :28.00   Max.   :205.00                     
   Away.Goals     Away.Behinds    Away.Points        Venue          
 Min.   : 1.00   Min.   : 1.00   Min.   : 14.00   Length:2879       
 1st Qu.: 9.00   1st Qu.: 8.00   1st Qu.: 63.00   Class :character  
 Median :11.00   Median :10.00   Median : 79.00   Mode  :character  
 Mean   :11.77   Mean   :10.49   Mean   : 81.14                     
 3rd Qu.:14.00   3rd Qu.:13.00   3rd Qu.: 97.00                     
 Max.   :29.00   Max.   :25.00   Max.   :187.00                     
     Margin             Season      Round.Type         Round.Number  
 Min.   :-138.000   Min.   :2012   Length:2879        Min.   : 1.00  
 1st Qu.: -21.000   1st Qu.:2015   Class :character   1st Qu.: 6.00  
 Median :   5.000   Median :2018   Mode  :character   Median :12.00  
 Mean   :   6.195   Mean   :2019                      Mean   :12.65  
 3rd Qu.:  33.000   3rd Qu.:2022                      3rd Qu.:19.00  
 Max.   : 171.000   Max.   :2025                      Max.   :29.00  

First few rows:

Code
print(head(results_df))
# A tibble: 6 × 16
   Game Date       Round Home.Team Home.Goals Home.Behinds Home.Points Away.Team
  <dbl> <date>     <chr> <chr>          <int>        <int>       <int> <chr>    
1 13960 2012-03-24 R1    GWS                5            7          37 Sydney   
2 13961 2012-03-29 R1    Richmond          12            9          81 Carlton  
3 13962 2012-03-30 R1    Hawthorn          20           17         137 Collingw…
4 13963 2012-03-31 R1    Melbourne         11           12          78 Brisbane…
5 13964 2012-03-31 R1    Gold Coa…         10            8          68 Adelaide 
6 13965 2012-03-31 R1    North Me…         15           12         102 Essendon 
# ℹ 8 more variables: Away.Goals <int>, Away.Behinds <int>, Away.Points <int>,
#   Venue <chr>, Margin <int>, Season <dbl>, Round.Type <chr>,
#   Round.Number <int>

Data Preprocessing

fitzRoy data is generally clean, but let’s do some simple checks to make sure. It’s a good habit to inspect your data before modeling even when you know where it came from, because you will often find small issues that need addressing.

Check team name consistency:

Code
# Preprocessing
# Check consistency
unique(results_df$Home.Team)
 [1] "GWS"             "Richmond"        "Hawthorn"        "Melbourne"      
 [5] "Gold Coast"      "North Melbourne" "Fremantle"       "Footscray"      
 [9] "Port Adelaide"   "Brisbane Lions"  "Essendon"        "Sydney"         
[13] "West Coast"      "Adelaide"        "Collingwood"     "St Kilda"       
[17] "Geelong"         "Carlton"        
Code
unique(results_df$Away.Team)
 [1] "Sydney"          "Carlton"         "Collingwood"     "Brisbane Lions" 
 [5] "Adelaide"        "Essendon"        "Geelong"         "West Coast"     
 [9] "St Kilda"        "Port Adelaide"   "Fremantle"       "Melbourne"      
[13] "Footscray"       "Richmond"        "GWS"             "Gold Coast"     
[17] "Hawthorn"        "North Melbourne"

Check venue names:

Code
# Check Ground Names
unique(results_df$Venue)
 [1] "Stadium Australia"  "M.C.G."             "Carrara"           
 [4] "Docklands"          "Subiaco"            "Football Park"     
 [7] "Gabba"              "S.C.G."             "Bellerive Oval"    
[10] "Blacktown"          "Kardinia Park"      "Manuka Oval"       
[13] "York Park"          "Marrara Oval"       "Sydney Showground" 
[16] "Cazaly's Stadium"   "Wellington"         "Adelaide Oval"     
[19] "Traeger Park"       "Jiangwan Stadium"   "Eureka Stadium"    
[22] "Perth Stadium"      "Riverway Stadium"   "Norwood Oval"      
[25] "Summit Sports Park" "Barossa Oval"       "Hands Oval"        

Check round types:

Code
# Round Type
unique(results_df$Round.Type)
[1] "Regular" "Finals" 

Check for missing values:

Code
# Missing Values
colSums(is.na(results_df))
        Game         Date        Round    Home.Team   Home.Goals Home.Behinds 
           0            0            0            0            0            0 
 Home.Points    Away.Team   Away.Goals Away.Behinds  Away.Points        Venue 
           0            0            0            0            0            0 
      Margin       Season   Round.Type Round.Number 
           0            0            0            0 
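
One further check worth considering is whether the score columns agree with each other. In AFL a goal is worth 6 points and a behind 1 point, so the points column should be recoverable from the other two. The sketch below uses the first three home scores from the head() output above; the `check_df` name is just for illustration.

```r
# First three home scores from the head() output above
check_df <- data.frame(
  Home.Goals   = c(5, 12, 20),
  Home.Behinds = c(7, 9, 17),
  Home.Points  = c(37, 81, 137)
)

# A goal is worth 6 points and a behind 1 point
points_ok <- with(check_df, Home.Points == 6 * Home.Goals + Home.Behinds)
all(points_ok)

# The same idea should extend to the away side and the margin on the full data, e.g.
# all(results_df$Margin == results_df$Home.Points - results_df$Away.Points)
```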

We already decided to treat draws as home team losses, but let’s see how often they occur to check that our justification holds.

Code
# Draw Proportion
n_draws <- sum(results_df$Home.Points == results_df$Away.Points)
n_games <- nrow(results_df)
cat("Number of draws:", n_draws, "out of", n_games, "games (", round(100*n_draws/n_games, 2), "% )\n")
Number of draws: 21 out of 2879 games ( 0.73 % )

Hyperparameter Tuning

Elo models contain parameters that must be set before training. Finding optimal values for these parameters is called hyperparameter tuning. This ELO model has three parameters that affect prediction accuracy, and we will tune all of them.

The K-factor controls how reactive ratings are after a result. Too high and teams fluctuate wildly; too low and the model is slow to recognise form changes.

Home Ground Advantage (HGA) adds rating points to the home team to reflect their inherent advantage from crowd support, familiarity with conditions, and reduced travel.

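Carry Over controls how ratings move between seasons, which accounts for list turnover, coaching changes, and regression to the mean. In the code below the value is passed to regress(), so it is the fraction by which each rating is pulled back toward the mean of 1500 at the end of a season: a value of 0.1 keeps 90% of a team’s distance from 1500.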
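
To make the three parameters concrete, here is a minimal base R sketch of what each one does to a single rating. The helper names are hypothetical, the K values are illustrative rather than tuned, and the carry-over step assumes the value is applied as a fraction of regression toward 1500, which matches the description in the Summary.

```r
# Home win probability with a home ground advantage added to the home rating
home_win_prob <- function(home_elo, away_elo, hga = 0) {
  1 / (1 + 10^((away_elo - (home_elo + hga)) / 400))
}

# Rating change for the home team: K times (result minus expectation)
rating_change <- function(home_won, p_home, k) {
  k * (home_won - p_home)
}

# Between seasons: move a rating a fraction `by` of the way back to the mean
regress_rating <- function(rating, by, to = 1500) {
  rating + by * (to - rating)
}

# Illustrative values only: two 1500-rated teams with an HGA of 60
p_home <- home_win_prob(1500, 1500, hga = 60)  # about 0.585
rating_change(0, p_home, k = 20)               # home loss with a small K
rating_change(0, p_home, k = 60)               # same result, three times the swing
regress_rating(1700, by = 0.1)                 # 1680
```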

To find optimal values, we split our data into training (2012-2023) and validation (2024-2025) sets. We test different parameter combinations on recent unseen games to see which configuration predicts best.

Data Splitting

For this analysis we train on 2012-2023 and hold out 2024-2025 for validation, though you could experiment with a larger validation set.

Code
# Training data: 2012-2023 
M1_train_data <- results_df %>%
  filter(Season <= 2023)

# Validation data: 2024 and 2025
M1_val_data <- results_df %>%
  filter(Season >= 2024)

# Check split sizes
cat("Training games:", nrow(M1_train_data), "\n")
Training games: 2447 
Code
cat("Validation games:", nrow(M1_val_data), "\n")
Validation games: 432 

Helper Functions

Create a function to calculate log loss which we will use as our performance metric:

Code
# Helper Functions
# Calculate log loss
calculate_log_loss <- function(actual, predicted, eps = 1e-15) {
  predicted <- pmax(pmin(predicted, 1 - eps), eps)
  -mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))
}
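
To get a feel for the scale of log loss, it helps to compare a few simple cases. The outcomes and probabilities below are made up for illustration, and the function is repeated so the snippet runs on its own.

```r
# Same log loss definition as above, repeated so this snippet is self-contained
calculate_log_loss <- function(actual, predicted, eps = 1e-15) {
  predicted <- pmax(pmin(predicted, 1 - eps), eps)
  -mean(actual * log(predicted) + (1 - actual) * log(1 - predicted))
}

actual <- c(1, 0, 1, 1)  # hypothetical outcomes: 1 = home win

# Always predicting 50% gives log(2), about 0.693
coin_flip <- calculate_log_loss(actual, rep(0.5, 4))

# Confident and correct: about 0.105
confident_right <- calculate_log_loss(actual, c(0.9, 0.1, 0.9, 0.9))

# Confident and wrong: about 2.303
confident_wrong <- calculate_log_loss(actual, c(0.1, 0.9, 0.1, 0.1))
```

A model is only adding value if its log loss is below the coin-flip figure of roughly 0.693, and confident mistakes are punished much more heavily than cautious ones.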

Create the hyperparameter tuning function:

Code
# Hyper Parameter Tuning Function
tune_elo <- function(k_val, HGA, carryOver, train_df, val_df) {
  
  # Combine train and validation 
  combined_df <- bind_rows(train_df, val_df) %>%
    arrange(Game)
  
  # Run ELO model
  elo_model <- elo.run(
    as.numeric(Home.Points > Away.Points) ~
      adjust(Home.Team, HGA) +
      Away.Team +
      regress(Season, 1500, carryOver) +
      group(Game),
    k = k_val,
    data = combined_df
  )
  
  # Extract all predictions
  all_predictions <- as.data.frame(elo_model)
  
  # Get indices for validation games
  val_indices <- which(combined_df$Game %in% val_df$Game)
  
  # Filter to validation predictions only
  val_predictions <- all_predictions[val_indices, ]
  
  # Get actual outcomes for validation games
  val_actual <- combined_df %>%
    filter(Game %in% val_df$Game) %>%
    mutate(home_won = as.numeric(Home.Points > Away.Points)) %>%
    pull(home_won)
  
  # Calculate log loss on validation set
  log_loss <- calculate_log_loss(
    actual = val_actual,
    predicted = val_predictions$p.A
  )
  
  return(log_loss)
}

We will use grid search to find optimal parameter values. This systematically tests every combination of parameter values across predefined ranges. Our grid tests K-factors from 30 to 70, HGA from 40 to 80, and carry-over from 0 to 0.5, evaluating each combination on the validation set using log loss. The combination with the smallest log loss gives the best parameters.

Define the parameter grid to search:

Code
# Define parameter grid
param_grid <- expand_grid(
  k_val = seq(30, 70, by = 2),      
  HGA = seq(40, 80, by = 2),        
  carryOver = seq(0, 0.5, by = 0.05)
)
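
Before running the search it can be useful to know how many models will be fitted. The sketch below rebuilds the same ranges with base R's expand.grid() so it runs without the tidyverse; `grid_check` is a name introduced only for this example.

```r
# Same ranges as the grid above, built with base R so the snippet runs on its own
grid_check <- expand.grid(
  k_val = seq(30, 70, by = 2),
  HGA = seq(40, 80, by = 2),
  carryOver = seq(0, 0.5, by = 0.05)
)

nrow(grid_check)  # 21 * 21 * 11 = 4851 combinations
```

Each combination runs the ELO model over the full set of games, so widening a range or shrinking a step size increases the runtime proportionally.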

Run the grid search (this will take a few minutes):

Code
# Run Hyper Parameter Grid Search
Model1_tuning_results <- param_grid %>%
  rowwise() %>%
  mutate(
    log_loss = tune_elo(k_val, HGA, carryOver, M1_train_data, M1_val_data)
  ) %>%
  ungroup() %>%
  arrange(log_loss)

Tuning Results

View the top 10 parameter combinations:

Code
# Evaluate results
# Print top 10 combinations
Model1_tuning_results %>%
  slice_head(n = 10) %>%
  print()
# A tibble: 10 × 4
   k_val   HGA carryOver log_loss
   <dbl> <dbl>     <dbl>    <dbl>
 1    46    60       0.1    0.572
 2    46    62       0.1    0.572
 3    46    58       0.1    0.572
 4    44    60       0.1    0.572
 5    44    58       0.1    0.572
 6    44    62       0.1    0.572
 7    46    64       0.1    0.572
 8    46    56       0.1    0.572
 9    48    60       0.1    0.572
10    48    62       0.1    0.572

Extract and display the best parameters:

Code
# Best parameters
M1_best_params <- Model1_tuning_results %>% dplyr::slice(1)

# Print Best Parameters
cat("K-factor:", M1_best_params$k_val, "\n",
    "Home Ground Advantage:", M1_best_params$HGA, "\n",
    "Carry Over:", M1_best_params$carryOver, "\n",
    "Validation Log Loss:", round(M1_best_params$log_loss, 4), "\n")
K-factor: 46 
 Home Ground Advantage: 60 
 Carry Over: 0.1 
 Validation Log Loss: 0.5719 

Final ELO Model

Set the best hyperparameters:

Code
# Set parameters
# Home Ground Advantage
HGA <- 60
# Regression to the mean
carryOver <- 0.1
# K-Value
k_val <- 46   

Train the Model

Run the final model on all data:

Code
# Run Elo Model
Model1 <- elo.run(
  as.numeric(Home.Points > Away.Points) ~
    adjust(Home.Team, HGA) +
    Away.Team +
    regress(Season, 1500, carryOver) +
    group(Game),  
  k = k_val,
  data = results_df
)

Model Summary

Code
# Check the output
summary(Model1)

An object of class 'summary.elo.run', containing information on 18 teams and 2879 matches, with 14 regressions.

Mean Square Error: 0.207
AUC: 0.7314
Favored Teams vs. Actual Wins: 
       Actual
Favored    0    1
  TRUE   572 1249
  (tie)    0    0
  FALSE  688  370

Final Team Ratings

Now that the 2025 finals have come and gone, we can see where the model ranks each team after processing the entire data set.

Get the current ratings for all teams:

Code
# Get final ratings 
final_ratings <- final.elos(Model1)
final_ratings <- final_ratings %>%
  as.data.frame() 
final_ratings %>% arrange(desc(.))
                       .
Brisbane Lions  1767.721
Geelong         1693.723
Hawthorn        1666.938
GWS             1637.469
Collingwood     1631.229
Adelaide        1624.308
Fremantle       1619.078
Footscray       1593.243
Gold Coast      1583.020
Sydney          1576.269
Port Adelaide   1476.264
Carlton         1445.104
St Kilda        1444.623
Melbourne       1358.091
Essendon        1294.422
North Melbourne 1239.015
Richmond        1222.505
West Coast      1126.978
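
These ratings can be turned into a win probability for a hypothetical matchup. The sketch below uses three ratings copied from the table above (rounded to one decimal place) and the tuned HGA of 60. The helper `predict_home_win()` is an assumption made for illustration, and it ignores any between-season regression that would apply before a new season starts.

```r
# Final ratings copied from the table above (rounded to one decimal place)
ratings <- c("Brisbane Lions" = 1767.7, "Geelong" = 1693.7, "West Coast" = 1127.0)

# Hypothetical helper: home win probability from two ratings plus the HGA
predict_home_win <- function(home, away, ratings, hga = 60) {
  diff <- ratings[[home]] + hga - ratings[[away]]
  1 / (1 + 10^(-diff / 400))
}

p_bl_gee <- predict_home_win("Brisbane Lions", "Geelong", ratings)    # about 0.68
p_wc_bl <- predict_home_win("West Coast", "Brisbane Lions", ratings)  # about 0.03
```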

Model Predictions

Extract the predictions and rating updates for every game:

Code
# Get all predictions and ratings
Model1_predictions <- as.data.frame(Model1)
print(head(Model1_predictions))
           team.A         team.B       p.A wins.A  update.A  update.B    elo.A
1             GWS         Sydney 0.5854987      0 -26.93294  26.93294 1473.067
2        Richmond        Carlton 0.5854987      0 -26.93294  26.93294 1473.067
3        Hawthorn    Collingwood 0.5854987      1  19.06706 -19.06706 1519.067
4       Melbourne Brisbane Lions 0.5854987      0 -26.93294  26.93294 1473.067
5      Gold Coast       Adelaide 0.5854987      0 -26.93294  26.93294 1473.067
6 North Melbourne       Essendon 0.5854987      0 -26.93294  26.93294 1473.067
     elo.B
1 1526.933
2 1526.933
3 1480.933
4 1526.933
5 1526.933
6 1526.933

Combine predictions with original data:

Code
# Combine with original data
Model1_results <- results_df %>%
  bind_cols(Model1_predictions)
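
Because bind_cols() matches rows purely by position, it may be worth confirming that both tables describe the same matches in the same order. The helper below is a hypothetical sketch, shown on a toy example built from the first two matches above.

```r
# Hypothetical helper: stop if the bound rows do not describe the same matches
check_alignment <- function(results, predictions) {
  stopifnot(
    nrow(results) == nrow(predictions),
    all(results$Home.Team == as.character(predictions$team.A)),
    all(results$Away.Team == as.character(predictions$team.B))
  )
  invisible(TRUE)
}

# Toy example using the first two matches shown earlier
toy_results <- data.frame(Home.Team = c("GWS", "Richmond"),
                          Away.Team = c("Sydney", "Carlton"))
toy_predictions <- data.frame(team.A = c("GWS", "Richmond"),
                              team.B = c("Sydney", "Carlton"))
check_alignment(toy_results, toy_predictions)

# On the real objects: check_alignment(results_df, Model1_predictions)
```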

Model Evaluation

Overall Accuracy

Calculate accuracy on all games:

Code
# Check accuracy on ALL data 
Model1_results <- Model1_results %>%
  mutate(
    home_won = Home.Points > Away.Points,
    predicted_home_win = p.A > 0.5,
    correct = home_won == predicted_home_win
  )

# Overall accuracy
cat("Overall accuracy:", mean(Model1_results$correct), "\n")
Overall accuracy: 0.6728031 
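
For context, it helps to compare this against a naive rule that always tips the home team. The counts below are taken from the summary(Model1) table above, where the rows indicate whether the home team was favoured and the columns show whether it actually won. The variable names are introduced only for this example.

```r
# Counts from the summary(Model1) table above
home_fav_lost <- 572   # home favoured, home did not win
home_fav_won <- 1249   # home favoured, home won
away_fav_lost <- 688   # away favoured, home did not win
away_fav_won <- 370    # away favoured, home won

n_games <- home_fav_lost + home_fav_won + away_fav_lost + away_fav_won  # 2879

# Model is correct when the favoured side matches the result
model_accuracy <- (home_fav_won + away_fav_lost) / n_games        # about 0.673

# Always tipping the home team is correct whenever the home team won
always_home_accuracy <- (home_fav_won + away_fav_won) / n_games   # about 0.562
```

On these figures the ELO model beats the always-home rule by roughly 11 percentage points. Keep in mind that the 21 draws count as home losses in both calculations.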

Performance by Season

Break down performance by each season:

Code
# Performance by season
Model1_by_season <- Model1_results %>%
  group_by(Season) %>%
  summarise(
    n_games = n(),
    accuracy = mean(correct),
    log_loss = calculate_log_loss(as.numeric(home_won), p.A)
  ) %>%
  arrange(Season) 
print(Model1_by_season, n = Inf)
# A tibble: 14 × 4
   Season n_games accuracy log_loss
    <dbl>   <int>    <dbl>    <dbl>
 1   2012     207    0.676    0.592
 2   2013     207    0.696    0.545
 3   2014     207    0.729    0.575
 4   2015     206    0.684    0.603
 5   2016     207    0.657    0.566
 6   2017     207    0.623    0.653
 7   2018     207    0.696    0.580
 8   2019     207    0.628    0.653
 9   2020     162    0.679    0.628
10   2021     207    0.643    0.668
11   2022     207    0.681    0.579
12   2023     216    0.671    0.613
13   2024     216    0.662    0.598
14   2025     216    0.694    0.545

Summary

This ELO model achieves approximately 67% accuracy in predicting AFL home team wins. The model uses:

  • K-factor of 46: Controls how much ratings change after each game
  • Home Ground Advantage of 60: Rating points boost for playing at home
  • Carry Over of 0.1: Teams regress 10% toward the mean (1500) between seasons

Performance is reasonably stable across seasons, with accuracy ranging from about 62% to 73% and log loss from 0.545 to 0.668. The model also provides probabilistic predictions that can be used for tipping competitions.

Hope you enjoyed the brief run through of building an ELO model for AFL match predictions! Feel free to reach out with any questions or suggestions for further improvements.