In this tutorial, we will retrieve AFL data and write code to plot and visualise it in several ways. We will also enhance each plot, because more effective visualisations make data easier to present to others.
First, we need to retrieve the data using the fitzRoy package, which provides a large variety of AFL match data across past seasons. We then load the other packages below to assist with plotting the data.
# Install required packages if needed by removing the "#" from the line below
# install.packages(c("fitzRoy", "tidyverse", "ggrepel"))
library(fitzRoy)
library(tidyverse)
library(ggrepel)
We will now create a data frame that includes the player statistics for the 2025 home-and-away (regular) season.
afldata <- fetch_player_stats_afltables(2025)
regular_season <- afldata %>%
  mutate(Round = as.numeric(Round)) %>%
  filter(Round >= 1 & Round <= 25)
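The `as.numeric()` step matters when some rounds are stored as text labels rather than numbers. The sketch below uses a small made-up column of round labels (the finals codes "EF" and "GF" are assumptions for illustration, not values confirmed from the data) to show that non-numeric labels become `NA` and are then dropped by `filter()`.

```r
library(dplyr)

# Hypothetical round labels: three numbered rounds and two finals codes
toy <- tibble(Round = c("1", "2", "25", "EF", "GF"))

# as.numeric() turns non-numeric labels into NA (normally with a coercion
# warning, silenced here), and filter() drops rows where the condition is NA
toy_regular <- toy %>%
  mutate(Round = suppressWarnings(as.numeric(Round))) %>%
  filter(Round >= 1 & Round <= 25)

toy_regular$Round
```

Only the three numbered rounds remain, which is the behaviour the regular-season filter relies on.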
As we only need a few variables for this demonstration, we can remove the others from the data frame.
regular_season <- regular_season %>%
  select(Date, Round, Playing.for, Player, Kicks, Disposals, Goals, Clearances)
The current data has one row per player, so we can use the remaining variables to create team totals for each game.
team_totals <- regular_season %>%
  group_by(Date, Round, Playing.for) %>%
  summarise(
    Total_Kicks = sum(Kicks, na.rm = TRUE),
    Total_Disposals = sum(Disposals, na.rm = TRUE),
    Total_Goals = sum(Goals, na.rm = TRUE),
    Total_Clearances = sum(Clearances, na.rm = TRUE)
  ) %>%
  ungroup()
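To make the aggregation step concrete, here is a minimal sketch on made-up player rows (the team names, date, and numbers are illustrative only). It collapses player rows into one row per team per game, and uses `.groups = "drop"` as an alternative to a separate `ungroup()` call.

```r
library(dplyr)

# Made-up player rows for a single game (values are illustrative only)
toy_players <- tibble(
  Date        = as.Date("2025-03-15"),
  Round       = 1,
  Playing.for = c("Team A", "Team A", "Team B", "Team B"),
  Disposals   = c(20, 15, 30, NA)
)

# One row per team per game; na.rm = TRUE ignores the missing value
toy_team <- toy_players %>%
  group_by(Date, Round, Playing.for) %>%
  summarise(Total_Disposals = sum(Disposals, na.rm = TRUE), .groups = "drop")

toy_team
```

The result has two rows, one for each team, and is no longer grouped, so later `group_by()` calls start from a clean state.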
Now that the data we will use has been finalised, the following code creates a bar plot of the average disposals per game for each team during the 2025 season.
team_totals %>%
  group_by(Playing.for) %>%
  summarise(Avg_Disposals = mean(Total_Disposals, na.rm = TRUE)) %>%
  ggplot(aes(x = reorder(Playing.for, Avg_Disposals), y = Avg_Disposals, fill = Playing.for)) +
  geom_col(show.legend = FALSE) +
  labs(
    title = "Average Team Disposals per Game (2025 Season)",
    x = "Team",
    y = "Average Disposals per Game"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
While this creates a basic visualisation of the data, we can enhance the plot in a few ways. First, we can write code that associates each team with its club colour to make the visual more readable.
team_colours <- c(
  "Adelaide" = "#002B5C",
  "Brisbane Lions" = "#A50034",
  "Carlton" = "#001C3E",
  "Collingwood" = "#000000",
  "Essendon" = "#CC2031",
  "Fremantle" = "#2A0D54",
  "Geelong" = "#1F2041",
  "Gold Coast" = "#E41B13",
  "Greater Western Sydney" = "#F37920",
  "Hawthorn" = "#4D2004",
  "Melbourne" = "#0F1131",
  "North Melbourne" = "#0033A0",
  "Port Adelaide" = "#01BFBF",
  "Richmond" = "#FEC524",
  "St Kilda" = "#D50032",
  "Sydney" = "#E41B13",
  "West Coast" = "#003087",
  "Western Bulldogs" = "#004B93"
)
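Because a named colour vector is matched to the data by name, a team whose name in the data differs from the name in the vector will not pick up its intended colour. A quick check like the sketch below can catch this before plotting. The team names and the shortened colour vector here are hypothetical stand-ins, not values taken from the data.

```r
# Hypothetical team names as they might appear in a data set
teams_in_data <- c("Adelaide", "Carlton", "Footscray")

# A shortened colour vector for illustration
toy_colours <- c(
  "Adelaide" = "#002B5C",
  "Carlton" = "#001C3E",
  "Western Bulldogs" = "#004B93"
)

# Teams present in the data that have no matching name in the colour vector
missing_teams <- setdiff(teams_in_data, names(toy_colours))
missing_teams
```

With the real objects, the equivalent check would be `setdiff(unique(team_totals$Playing.for), names(team_colours))`, which should return an empty character vector if every team has a colour.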
# Add to new enhanced plot
team_totals %>%
  group_by(Playing.for) %>%
  summarise(Avg_Disposals = mean(Total_Disposals, na.rm = TRUE)) %>%
  ungroup() %>%
  ggplot(aes(x = reorder(Playing.for, Avg_Disposals), y = Avg_Disposals, fill = Playing.for)) +
  geom_col(show.legend = FALSE, width = 0.7) +
  geom_text(aes(label = round(Avg_Disposals, 1)),
            vjust = -0.6, size = 3.8, fontface = "bold", colour = "black") +
  scale_fill_manual(values = team_colours) +
  labs(
    title = "AFL 2025 Season – Average Team Disposals per Game",
    subtitle = "Each bar represents the mean number of disposals per game for each club across the regular season",
    x = "Team",
    y = "Average Disposals per Game") +
  coord_cartesian(ylim = c(300, 380)) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
    plot.subtitle = element_text(size = 12, hjust = 0.5, colour = "grey30"),
    plot.caption = element_text(size = 9, hjust = 1, colour = "grey50"),
    axis.title = element_text(face = "bold"),
    axis.text.x = element_text(angle = 45, hjust = 1, size = 11, colour = "grey20"),
    axis.text.y = element_text(size = 11, colour = "grey20"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor = element_blank(),
    panel.grid.major.y = element_line(colour = "grey90"),
    plot.margin = margin(10, 25, 10, 25)
  )
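As you can see, this graph shows the difference between the top and bottom teams much more clearly, partly because the y-axis is zoomed in to the 300 to 380 range rather than starting at zero. Details such as the team colours and value labels also make it much easier than before for the reader to associate each bar with its team.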
We can also use scatter plots to view relationships within the data. Below is an example of how to create a scatter plot of the relationship between disposals and goals for each team.
# Create totals for the whole season rather than each game
season_totals <- team_totals %>%
  group_by(Playing.for) %>%
  summarise(
    Total_Disposals = sum(Total_Disposals, na.rm = TRUE),
    Total_Goals = sum(Total_Goals, na.rm = TRUE),
    .groups = "drop"
  )
season_totals %>%
  ggplot(aes(x = Total_Disposals, y = Total_Goals, colour = Playing.for)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, colour = "black") +
  labs(
    title = "Relationship Between Team Disposals and Goals (2025)",
    x = "Total Disposals",
    y = "Total Goals"
  ) +
  theme(legend.position = "bottom")
## `geom_smooth()` using formula = 'y ~ x'
Whilst this creates a basic visual, we can once again enhance it using team colours and more detail.
# Primary and secondary colours for each team
team_colours_primary <- c(
  "Adelaide"="darkblue","Brisbane Lions"="#B84B9E","Carlton"="#004B87",
  "Collingwood"="#000000","Essendon"="#E13A3E","Fremantle"="#5C2E91",
  "Geelong"="#0065A4","Gold Coast"="#F44336","Greater Western Sydney"="#F57C00",
  "Hawthorn"="#8D6E63","Melbourne"="#212D6B","North Melbourne"="#1976D2",
  "Port Adelaide"="#00BCD4","Richmond"="#FFB300","St Kilda"="#C62828",
  "Sydney"="#E53935","West Coast"="#1565C0","Western Bulldogs"="#3F51B5"
)
team_colours_secondary <- c(
  "Adelaide"="#FDD835","Brisbane Lions"="#FFC107","Carlton"="#A5C8FF",
  "Collingwood"="#4D4D4D","Essendon"="#212121","Fremantle"="#CDB4FF",
  "Geelong"="darkblue","Gold Coast"="#FFD180","Greater Western Sydney"="#795548",
  "Hawthorn"="#FFD54F","Melbourne"="#FF5252","North Melbourne"="#E3F2FD",
  "Port Adelaide"="#212121","Richmond"="#000000","St Kilda"="#4D4D4D",
  "Sydney"="#BBDEFB","West Coast"="#FFD54F","Western Bulldogs"="#FF5252"
)
ggplot(season_totals, aes(x = Total_Disposals, y = Total_Goals)) +
  geom_point(aes(fill = Playing.for, colour = Playing.for),
             shape = 21, size = 6, stroke = 2.5, alpha = 0.95) +
  geom_smooth(method = "lm", se = FALSE, colour = "grey20", linewidth = 1.1, linetype = "dashed") +
  geom_text_repel(
    aes(label = Playing.for),
    size = 4,
    fontface = "bold",
    colour = "black",
    bg.color = "white",
    bg.r = 0.15,
    max.overlaps = Inf,
    box.padding = 0.6,
    point.padding = 0.4
  ) +
  scale_fill_manual(values = team_colours_primary) +
  scale_colour_manual(values = team_colours_secondary) +
  labs(
    title = "AFL 2025 Season: Total Disposals vs Total Goals",
    x = "Total Disposals (Season)",
    y = "Total Goals (Season)"
  ) +
  theme_minimal(base_size = 14, base_family = "Arial") +
  theme(
    plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
    plot.caption = element_text(size = 9, hjust = 1, colour = "grey50"),
    panel.grid.major = element_line(colour = "grey90"),
    panel.grid.minor = element_blank(),
    axis.title = element_text(face = "bold"),
    axis.text = element_text(size = 12, colour = "grey20"),
    legend.position = "none",
    plot.margin = margin(10, 20, 10, 20)
  )
## `geom_smooth()` using formula = 'y ~ x'
As we can see from the graphic, there was a positive relationship between total disposals and total goals, and the added labels and team colours make it clearer where each team was plotted.
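A visual trend can be backed up with a number. The sketch below uses made-up season totals for five hypothetical teams (not real AFL figures) to show how one might compute the correlation and fit the same kind of straight line that `geom_smooth(method = "lm")` draws with its default `y ~ x` formula.

```r
# Made-up season totals for five hypothetical teams (not real AFL figures)
toy_season <- data.frame(
  Total_Disposals = c(7600, 7900, 8100, 8300, 8600),
  Total_Goals     = c(250, 270, 265, 300, 310)
)

# Pearson correlation between the two totals
r <- cor(toy_season$Total_Disposals, toy_season$Total_Goals)

# Linear model of goals on disposals; the slope is the change in goals
# associated with one extra disposal
fit <- lm(Total_Goals ~ Total_Disposals, data = toy_season)
slope <- coef(fit)[["Total_Disposals"]]

r
slope
```

With the real data, replacing `toy_season` with `season_totals` would give the correlation and slope behind the trend line in the plot. A positive correlation across 18 teams describes an association only and does not show that more disposals cause more goals.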
We can also create box plots. The code below creates a box plot of the distribution of kicks per game for each team.
team_totals %>%
  ggplot(aes(x = Playing.for, y = Total_Kicks, fill = Playing.for)) +
  geom_boxplot(show.legend = FALSE) +
  labs(
    title = "Distribution of Team Kicks per Game (2025)",
    x = "Team",
    y = "Kicks per Game"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
A more effective version of the box plot can be created as follows.
team_totals %>%
  ggplot(aes(x = reorder(Playing.for, Total_Kicks, FUN = median),
             y = Total_Kicks, fill = Playing.for)) +
  geom_boxplot(
    show.legend = FALSE,
    outlier.color = NA,
    width = 0.7,
    alpha = 0.9,
    colour = "grey30") +
  geom_jitter(width = 0.15, alpha = 0.4, size = 1.5, colour = "black") +
  geom_hline(yintercept = mean(team_totals$Total_Kicks, na.rm = TRUE),
             linetype = "dashed", colour = "grey30", linewidth = 0.8) +
  annotate("text", x = 1, y = mean(team_totals$Total_Kicks, na.rm = TRUE) + 2,
           label = "League Avg", hjust = 0, vjust = -0.5, colour = "grey30", size = 3.5) +
  scale_fill_manual(values = team_colours) +
  labs(
    title = "AFL 2025: Distribution of Team Kicks per Game",
    x = "Team",
    y = "Kicks per Game") +
  scale_y_continuous(breaks = seq(150, 280, by = 10)) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
    plot.subtitle = element_text(size = 11, hjust = 0.5, colour = "grey30"),
    plot.caption = element_text(size = 9, hjust = 1, colour = "grey50"),
    axis.title = element_text(face = "bold"),
    axis.text.x = element_text(angle = 45, hjust = 1, size = 11, colour = "grey20"),
    axis.text.y = element_text(size = 11, colour = "grey20"),
    panel.grid.major.y = element_line(colour = "grey90"),
    panel.grid.minor = element_blank(),
    plot.background = element_rect(fill = "white", colour = NA),
    plot.margin = margin(10, 20, 10, 20),
    legend.position = "none"
  )
By ordering the teams by median, adding more detail on the y-axis, and including the separate data points and a league average line, the box plot becomes easier to interpret accurately: you can better determine each team's median kicks per game and how it compares to the league average.
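The same comparison can be read off as numbers rather than from the plot. The sketch below uses made-up kick counts for two hypothetical teams to compute each team's median and its difference from the league average. The column names `Median_Kicks` and `Diff_From_League_Avg` are illustrative choices, not part of the original analysis.

```r
library(dplyr)

# Made-up kicks per game for two hypothetical teams, three games each
toy_kicks <- tibble(
  Playing.for = rep(c("Team A", "Team B"), each = 3),
  Total_Kicks = c(200, 210, 220, 180, 190, 230)
)

# League average across all team-games, as used for the dashed line
league_avg <- mean(toy_kicks$Total_Kicks, na.rm = TRUE)

# Median per team and its distance from the league average
kick_summary <- toy_kicks %>%
  group_by(Playing.for) %>%
  summarise(Median_Kicks = median(Total_Kicks, na.rm = TRUE), .groups = "drop") %>%
  mutate(Diff_From_League_Avg = Median_Kicks - league_avg)

kick_summary
```

Swapping `toy_kicks` for `team_totals` would produce a table that can sit alongside the box plot when exact values are needed.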
In this tutorial, we have seen how to create various types of plots to display data and, further, how to upgrade basic plots into more comprehensive visuals that explain the data better while being easier to read. This has practical implications when presenting data to others: some of the people you present to may not be familiar with the data or with how certain plots should be read, so it is important to present the analysis as effectively as possible.