In this tutorial, we will create visualisations of Defensive
Pressure in the UEFA EURO 2024 Quarter-Final
match between Spain and Germany.
We’ll be using density plots to visualise the areas
on the pitch where defensive pressure from both teams was most
concentrated. Rather than analysing the defensive pressure of Spain and
Germany in depth, this guide focuses on how to present spatiotemporal
data visually for tactical analysis. Working through it step by step will
show you how to incorporate data visualisation into football tactics.
For this tutorial, we will be using an open dataset from StatsBomb.
Before we begin, please read the User
Agreement.
Also, you can check the step-by-step tutorial
guide available from StatsBomb on how to use their R
package.
First, install and load the required packages.
# Install packages
install.packages("devtools")
devtools::install_github("statsbomb/StatsBombR") # Install StatsBombR package
# Import packages
library(tidyverse)
library(StatsBombR)
library(ggsoccer)
library(viridis)
You can use the FreeCompetitions() function to retrieve a list of all the competitions available in the free StatsBomb datasets.
FreeCompetitions()
Step-1
From a list of all competitions, we will
filter the data to select the 2024 UEFA European
Championship.
Step-2
Use the FreeMatches() function to retrieve all matches
from the 2024 UEFA EURO.
Step-3
Use the free_allevents() function to download the Standard Events data
for all matches in the UEFA EURO.
Step-4
We now have the raw event data for the 2024 UEFA EURO. Because it comes
from JSON files, working with it in R can be a bit challenging. To
simplify this process, we can use the allclean() function, which cleans
the data and adds extra useful information.
Step-5
Finally, we will filter the data to
retrieve only the Quarter-Final match between Spain and Germany.
# Select the 2024 UEFA Euro
Comp <- FreeCompetitions() %>%
  filter(competition_id == 55 & season_id == 282)
# Retrieve all the matches within the 2024 UEFA Euro
UFFA <- FreeMatches(Competitions = Comp)
# Download the UEFA Euro event data for all the matches
EURO <- free_allevents(MatchesDF = UFFA, Parallel = TRUE)
# Clean the data into a format that is easier to work with in R
EURO <- allclean(EURO)
# Filter to get Spain vs. Germany in the Quarter-Final
Spain <- EURO %>%
  filter(match_id == 3942226)
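If you would rather not hard-code the match ID, you can look it up in the matches table by team name. The sketch below runs on a small hand-made stand-in for the FreeMatches() output. The column names home_team.home_team_name, away_team.away_team_name and competition_stage.name are assumptions about what StatsBombR returns, and every row except the Spain vs. Germany ID is made up for illustration.

```r
library(dplyr)

# Toy stand-in for the FreeMatches() output.
# Column names are assumed; only the first match_id comes from this tutorial.
matches_toy <- tibble(
  match_id = c(3942226, 1000001, 1000002),
  home_team.home_team_name = c("Spain", "Team A", "Team C"),
  away_team.away_team_name = c("Germany", "Team B", "Team D"),
  competition_stage.name = c("Quarter-finals", "Group Stage", "Group Stage")
)

# Find the match that involves both teams, whichever side is listed as home
teams <- c("Spain", "Germany")
target_id <- matches_toy %>%
  filter(home_team.home_team_name %in% teams,
         away_team.away_team_name %in% teams) %>%
  pull(match_id)

target_id
```

With the real data you would apply the same filter to the matches table and pass the result to filter(match_id == target_id), after checking that exactly one match is returned.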
Note - This part is optional. It walks you through how to obtain the
locations of the other players in each event.
In this section, we will download the
360-Events data for the UEFA EURO 2024 Quarter-Final
match between Spain and Germany. The 360-Events data includes the
location of all players on the pitch. This data can be
used for more in-depth analyses, such as assessing pitch control.
# Download UEFA Euro 360-event data for all the matches
EURO_360 <- free_allevents_360(MatchesDF = UFFA, Parallel = TRUE)
# Filter to get Spain vs. Germany in the Quarter-Final
Spain_360 <- EURO_360 %>%
  filter(match_id == 3942226)
# Rename event_uuid to id so it matches the Standard Events data
Spain_360 <- Spain_360 %>%
  rename(id = event_uuid)
We will combine the Standard Events data and
360-Events data. By merging these datasets, we can get access to the
events data with information about the players’ positions on the
pitch.
# Combine 360-event data & Standard event data
Player_location <- Spain %>%
  left_join(Spain_360, by = "id")
# Clean the data: keep a single match_id column
Player_location <- Player_location %>%
  rename(match_id = match_id.x) %>%
  select(-match_id.y)
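The rename and select calls are needed because both tables carry a match_id column, so joining on id alone produces match_id.x and match_id.y. A minimal sketch of this behaviour, using toy tables with made-up IDs and a hypothetical visible_players column:

```r
library(dplyr)

# Toy stand-ins for the two tables (hypothetical ids and values)
events_toy <- tibble(
  id = c("a1", "a2", "a3"),
  match_id = 3942226,
  type.name = c("Pass", "Pressure", "Shot")
)
frames_toy <- tibble(
  id = c("a1", "a2"),            # event "a3" has no 360 frame
  match_id = 3942226,
  visible_players = c(14L, 11L)  # hypothetical column
)

# Joining on id alone duplicates match_id as match_id.x / match_id.y
joined <- left_join(events_toy, frames_toy, by = "id")
names(joined)

# Joining on both keys keeps a single match_id column
joined_clean <- left_join(events_toy, frames_toy, by = c("id", "match_id"))
names(joined_clean)

# Events without a 360 frame are kept, with NA in the 360 columns
joined_clean %>% filter(is.na(visible_players))
```

Joining on both keys is an alternative to renaming afterwards. Either way, a left join keeps every Standard Event, including those that have no 360 frame.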
In this step, we will extract the individual player
positions (x and y coordinates) for each event by unnesting the
freeze-frame data.
# Keep only the variables we need from the merged data
# (the freeze_frame column comes from the 360 data, so start from Player_location, not Spain)
Player_location <- Player_location %>%
  select(id, period, minute, second, type.name, team.name, OpposingTeam, player.name,
         location.x, location.y, freeze_frame)
# Unnest the freeze-frame and extract the x & y coordinates of player positions
Player_location <- Player_location %>%
  unnest(freeze_frame) %>%
  mutate(ff_location.x = (map(location, 1)), ff_location.y = (map(location, 2))) %>%
  select(-location) %>%
  mutate(
    ff_location.x = as.numeric(ifelse(ff_location.x == "NULL", NA, ff_location.x)),
    ff_location.y = as.numeric(ifelse(ff_location.y == "NULL", NA, ff_location.y)))
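To see what the unnesting step does, it can help to run it on a tiny hand-built table. The sketch below assumes each freeze frame is a data frame with one row per player and a location list-column holding an (x, y) pair. All values are made up, and it uses map_dbl() instead of map() because every toy row has a location.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy event table with a freeze_frame list-column (hypothetical values).
# Each freeze frame is a data frame with one row per recorded player.
events_toy <- tibble(
  id = c("a1", "a2"),
  type.name = c("Pass", "Pressure"),
  freeze_frame = list(
    tibble(teammate = c(TRUE, FALSE),
           location = list(c(60.5, 40.2), c(62.0, 38.7))),
    tibble(teammate = FALSE,
           location = list(c(30.1, 55.9)))
  )
)

# One row per player per event, with numeric x and y columns
players_toy <- events_toy %>%
  unnest(freeze_frame) %>%
  mutate(ff_location.x = map_dbl(location, 1),
         ff_location.y = map_dbl(location, 2)) %>%
  select(-location)

players_toy
```

The two events become three rows, one per recorded player. Note that map_dbl() stops with an error if a location is missing, which is why the tutorial code uses map() followed by an explicit conversion of "NULL" entries to NA.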
In this step, we will visualise the Defensive Pressure events from the dataset. A Pressure event records a player from one team applying pressure to the ball carrier of the opposing team. We will filter the Standard Events dataset to keep only the Pressure events, then use them to visualise where on the pitch the pressure was most concentrated.
# Filter only Pressure events from the Standard Events dataset
Spain_press <- Spain %>%
  filter(type.name == "Pressure") %>%
  select(period, minute, second, team.name, location.x, location.y)
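Before plotting, it is worth checking how many Pressure events each team has, since a density estimate built from very few points can be misleading. The sketch below uses a toy table with the same column names as Spain_press. The values are invented, and mean_x is only a rough illustrative summary of how far up the pitch the pressure occurred.

```r
library(dplyr)

# Toy stand-in for Spain_press (made-up values)
press_toy <- tibble(
  period = c(1, 1, 1, 2, 2),
  team.name = c("Spain", "Germany", "Spain", "Germany", "Germany"),
  location.x = c(70, 55, 82, 40, 66),
  location.y = c(30, 45, 20, 60, 35)
)

# Count the pressures and average the x position for each team
press_summary <- press_toy %>%
  group_by(team.name) %>%
  summarise(n_pressures = n(),
            mean_x = mean(location.x),
            .groups = "drop")

press_summary
```

The same pipeline can be run on Spain_press to get the real counts for both teams.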
1. Visualise the Defensive Pressure Events for Spain
# Visualise Defensive Pressure areas for Spain
ggplot() +
  # Draw the football pitch using the ggsoccer package
  annotate_pitch(
    colour = "white",                 # Colour of the pitch lines
    fill = "#001049",                 # Background colour of the pitch
    dimensions = pitch_statsbomb) +   # Pre-defined pitch dimensions for StatsBomb data
  theme_pitch() +                     # Remove unnecessary ggplot details
  # Create the 2D density plot
  stat_density_2d(
    # Keep only Spain's Pressure events
    data = Spain_press %>%
      filter(team.name == "Spain"),
    # Map the location of the player who applied the defensive pressure
    mapping = aes(x = location.x,
                  y = location.y,
                  fill = after_stat(density)),  # Fill by the estimated density
    geom = "raster",                  # Use a raster (heatmap) style
    contour = FALSE,                  # Disable contour lines
    alpha = 0.7) +                    # Transparency of the density layer
  # Adjust the plot dimensions (data outside these limits is clipped)
  coord_cartesian(xlim = c(6, 114), ylim = c(5, 75), clip = "on") +
  # Use the "viridis" colour palette to fill the density plot
  scale_fill_viridis() +
  # Add titles and caption
  labs(title = "Defensive Pressure Area for Spain",
       subtitle = "Direction of Play →",
       caption = "Data Source: StatsBomb") +
  # Adjust theme settings
  theme(legend.position = "none",                      # Hide the legend
        plot.subtitle = element_text(size = 10,        # Font size of the subtitle
                                     color = "dimgray", # Font colour
                                     hjust = 0.5))      # Centre the subtitle
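The raster layer is a 2D kernel density estimate evaluated on a grid. According to the ggplot2 documentation, stat_density_2d() relies on MASS::kde2d() for this. The sketch below calls kde2d() directly on simulated points (made-up data on a 120 x 80 pitch) to show the kind of grid that gets drawn, and how you could read off the location of the highest density.

```r
set.seed(1)

# Simulated pressure locations (made-up data): one dense cluster, one sparse
x <- c(rnorm(150, mean = 80, sd = 8), rnorm(50, mean = 40, sd = 10))
y <- c(rnorm(150, mean = 25, sd = 6), rnorm(50, mean = 60, sd = 8))

# Estimate the density on a 100 x 100 grid covering the pitch.
# lims is c(x_min, x_max, y_min, y_max).
dens <- MASS::kde2d(x, y, n = 100, lims = c(0, 120, 0, 80))

# dens$z is a matrix: rows follow dens$x, columns follow dens$y.
# Locate the grid cell with the highest estimated density.
peak <- which(dens$z == max(dens$z), arr.ind = TRUE)
peak_x <- dens$x[peak[1, "row"]]
peak_y <- dens$y[peak[1, "col"]]

c(x = peak_x, y = peak_y)
```

The function is called as MASS::kde2d() rather than after library(MASS), because attaching MASS after the tidyverse masks dplyr's select().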
2. Visualise the Defensive Pressure Events for Germany
# Visualise Defensive Pressure areas for Germany
ggplot() +
  annotate_pitch(colour = "white",
                 fill = "#001049",
                 dimensions = pitch_statsbomb) +
  theme_pitch() +
  stat_density_2d(
    # Keep only Germany's Pressure events (Spain_press holds both teams' events from this match)
    data = Spain_press %>%
      filter(team.name == "Germany"),
    mapping = aes(x = location.x, y = location.y, fill = after_stat(density)),
    geom = "raster", contour = FALSE, alpha = 0.7) +
  coord_cartesian(xlim = c(6, 114), ylim = c(5, 75), clip = "on") +
  # Use the "magma" option of the viridis colour palette to fill the density plot
  scale_fill_viridis(option = "magma") +
  labs(title = "Defensive Pressure Area for Germany",
       subtitle = "Direction of Play →",
       caption = "Data Source: StatsBomb") +
  theme(legend.position = "none",
        plot.subtitle = element_text(size = 10, color = "dimgray", hjust = 0.5))
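Since the two plots differ only in the team and the palette, the shared code can be wrapped in a function. The helper below, plot_pressure(), is a hypothetical name and not part of any package. It is a minimal sketch that assumes the input has team.name, location.x and location.y columns, and it is demonstrated here on random toy data rather than the real match.

```r
library(dplyr)
library(ggplot2)
library(ggsoccer)
library(viridis)

# Hypothetical helper (not part of any package): one density map per team
plot_pressure <- function(press_data, team, palette = "viridis") {
  ggplot() +
    annotate_pitch(colour = "white", fill = "#001049",
                   dimensions = pitch_statsbomb) +
    theme_pitch() +
    stat_density_2d(
      # Keep only the chosen team's Pressure events
      data = dplyr::filter(press_data, .data$team.name == team),
      mapping = aes(x = location.x, y = location.y,
                    fill = after_stat(density)),
      geom = "raster", contour = FALSE, alpha = 0.7) +
    coord_cartesian(xlim = c(6, 114), ylim = c(5, 75), clip = "on") +
    scale_fill_viridis(option = palette) +
    labs(title = paste("Defensive Pressure Area for", team),
         subtitle = "Direction of Play →",
         caption = "Data Source: StatsBomb") +
    theme(legend.position = "none",
          plot.subtitle = element_text(size = 10, color = "dimgray", hjust = 0.5))
}

# Toy data with the same columns as Spain_press (random, for illustration only)
set.seed(42)
press_toy <- tibble(
  team.name = rep(c("Spain", "Germany"), each = 50),
  location.x = runif(100, 0, 120),
  location.y = runif(100, 0, 80)
)

p_spain <- plot_pressure(press_toy, "Spain")
p_germany <- plot_pressure(press_toy, "Germany", palette = "magma")
```

With the real data, the calls would be plot_pressure(Spain_press, "Spain") and plot_pressure(Spain_press, "Germany", palette = "magma").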