This section ensures that all required libraries for data extraction, manipulation, and visualization are installed and loaded. It combines general-purpose data science tools with football-specific analytics packages.
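# Install packages (soccermatics and StatsBombR are installed from GitHub)
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
if (!requireNamespace("tidyverse", quietly = TRUE)) install.packages("tidyverse")
if (!requireNamespace("soccermatics", quietly = TRUE)) devtools::install_github("jogall/soccermatics")
if (!requireNamespace("StatsBombR", quietly = TRUE)) devtools::install_github("statsbomb/StatsBombR")
# Load packages
library(soccermatics)
library(tidyverse)
library(ggplot2)
library(StatsBombR)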
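Before loading, it can help to list which packages are still missing. The sketch below uses base R only; `missing_packages()` is a hypothetical helper introduced for illustration and is not part of any of the packages above.

```r
# Hypothetical helper: return the names of packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages loaded in this analysis
needed <- c("soccermatics", "tidyverse", "ggplot2", "StatsBombR")
to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
}
```

An empty result means every package in `needed` can be loaded with `library()`.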
This section retrieves match data from the StatsBomb open dataset, filters for a specific competition and match, and prepares the data for analysis by cleaning and transforming spatial coordinates.
# Show all competitions available in the StatsBomb open data
comps <- FreeCompetitions()
# Choose one competition
competition <- comps %>%
  filter(competition_name == "Premier League",
         season_name == "2015/2016")
# Load matches from the selected competition
matches_available <- FreeMatches(competition)
# Choose one match
match <- matches_available %>%
  filter(home_team.home_team_name == "Manchester City",
         away_team.away_team_name == "Southampton")
# Load events from the selected match
events <- free_allevents(MatchesDF = match, Parallel = TRUE)
# Clean the data (remove unnecessary columns, handle missing values)
events <- allclean(events)
# Transform coordinates to the pitch dimensions required by soccermatics
clean_events <- events %>%
  soccerTransform(method = "statsbomb")
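To make the coordinate step concrete, here is a minimal sketch of a linear rescale from StatsBomb's 120 x 80 pitch to a 105 x 68 metre pitch. It assumes a plain proportional mapping with the y-axis flipped; `rescale_statsbomb()` is a hypothetical function, and the exact formula used inside `soccerTransform()` may differ.

```r
# Illustrative rescale only; not the soccermatics implementation.
rescale_statsbomb <- function(x, y, length_pitch = 105, width_pitch = 68) {
  data.frame(
    x = x / 120 * length_pitch,               # 120 units long -> metres
    y = width_pitch - y / 80 * width_pitch    # 80 units wide -> metres, y flipped (assumption)
  )
}

# Centre spot and a corner of the StatsBomb pitch
rescaled <- rescale_statsbomb(x = c(60, 120), y = c(40, 0))
rescaled
```

Under these assumptions the centre spot (60, 40) maps to (52.5, 34), the centre of a 105 x 68 pitch.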
This section evaluates the offensive performance of both teams using shot maps enriched with expected goals (xG), allowing both spatial and probabilistic assessment of chance quality.
clean_events %>%
  filter(team.name == "Manchester City") %>%
  soccerShotmap(
    theme = "dark",
    title = "Shots by Manchester City with xG",
    subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)"
  )
Interpretation:
- Manchester City scored 3 goals from 1.67 xG, about 1.3 goals above
expectation, which points to efficient finishing.
- The shot distribution shows multiple attempts in and around the
penalty area, with at least one high-quality chance (large xG value near
goal).
- The spatial pattern suggests central penetration and effective chance
creation in dangerous zones.
clean_events %>%
  filter(team.name == "Southampton") %>%
  soccerShotmap(
    theme = "dark",
    title = "Shots by Southampton with xG",
    subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)"
  )
Interpretation:
- Southampton scored 1 goal from 1.16 xG, which is broadly in line with
expectation.
- Their shots are more dispersed and slightly further from goal,
indicating lower average shot quality.
- Fewer high-probability chances suggest difficulty in breaking into
optimal shooting areas.
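The goals-versus-xG comparison in the interpretations above can be sketched as a per-team summary. The data frame below is toy data with hypothetical values, not the real match events; only the column names `shot.statsbomb_xg` and `shot.outcome.name` follow the StatsBomb event format.

```r
# Toy shot data (hypothetical values, for illustration only)
shots <- data.frame(
  team.name = c("Team A", "Team A", "Team A", "Team B", "Team B"),
  shot.statsbomb_xg = c(0.10, 0.45, 0.05, 0.30, 0.20),
  shot.outcome.name = c("Goal", "Saved", "Goal", "Off T", "Goal"),
  stringsAsFactors = FALSE
)

# Flag goals, then sum goals and xG per team
shots$goal <- as.integer(shots$shot.outcome.name == "Goal")
summary_tbl <- aggregate(cbind(goals = goal, xg = shot.statsbomb_xg) ~ team.name,
                         data = shots, FUN = sum)

# Positive values mean the team scored more than its chances suggested
summary_tbl$goals_minus_xg <- summary_tbl$goals - summary_tbl$xg
summary_tbl
```

Applied to the real shots, the same calculation would yield the kind of goals-minus-xG gap discussed for each team.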
The xG timeline tracks the cumulative expected goals throughout the match, providing a dynamic view of how scoring opportunities evolved over time.
clean_events %>%
  soccerxGTimeline(
    homeCol = "#66615e",
    awayCol = "#a50044",
    labels = FALSE
  )
Interpretation:
- Manchester City shows a rapid early increase in xG, suggesting strong
early attacking pressure.
- Their xG curve grows steadily, indicating consistent chance creation
across the match.
- Southampton’s xG increases more gradually, with a noticeable rise
later in the match, implying delayed offensive momentum.
- The final totals (~1.7 vs ~1.2) confirm a moderate but clear advantage
for Manchester City.
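The curves in an xG timeline are running totals of shot xG per team. A minimal base R sketch of that idea follows, using toy data with hypothetical values; it illustrates the concept and is not the internals of `soccerxGTimeline()`.

```r
# Toy shot data (hypothetical values, for illustration only)
shots <- data.frame(
  minute = c(5, 12, 30, 48, 70, 85),
  team.name = c("Team A", "Team B", "Team A", "Team A", "Team B", "Team B"),
  shot.statsbomb_xg = c(0.30, 0.05, 0.10, 0.40, 0.25, 0.35),
  stringsAsFactors = FALSE
)

# Order shots chronologically, then accumulate xG within each team
shots <- shots[order(shots$minute), ]
shots$cum_xg <- ave(shots$shot.statsbomb_xg, shots$team.name, FUN = cumsum)

# Final totals per team (the end points of the two curves)
final_xg <- tapply(shots$cum_xg, shots$team.name, max)
final_xg
```

A steep segment in `cum_xg` corresponds to a spell with several or high-quality chances, which is what the interpretation reads off the plot.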
Passing networks illustrate team structure by mapping player positions and pass connections, highlighting distribution patterns and key players in build-up play.
clean_events %>%
  filter(team.name == "Manchester City") %>%
  soccerPassmap(
    fill = "#66615e",
    arrow = "r",
    shortName = FALSE,
    title = "Manchester City Pass Network | Premier League game vs Southampton (2015/2016)"
  )
Interpretation:
- The network is dense and well-connected, particularly in
midfield.
- Players like Yaya Touré and Kevin De Bruyne appear central, acting as
key hubs.
- The structure indicates possession-based play with short passing
combinations and strong central control.
- A pass completion rate of 77.6% is consistent with this controlled,
possession-based approach.
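The two quantities behind a pass network, edge weights between players and the completion rate, can be sketched as follows. The toy data is hypothetical, and the sketch assumes the StatsBomb convention that `pass.outcome.name` is `NA` for completed passes; how `soccerPassmap()` derives its own figure is not confirmed here.

```r
# Toy pass data (hypothetical players and outcomes)
passes <- data.frame(
  player.name = c("P1", "P1", "P2", "P2", "P3", "P1"),
  pass.recipient.name = c("P2", "P2", "P3", "P1", "P1", "P3"),
  pass.outcome.name = c(NA, NA, "Incomplete", NA, NA, "Out"),
  stringsAsFactors = FALSE
)

# Completed passes have no recorded outcome (assumption stated above)
completed <- passes[is.na(passes$pass.outcome.name), ]
completion_rate <- nrow(completed) / nrow(passes)

# Count completed passes for each passer -> recipient pair
edges <- as.data.frame(
  table(from = completed$player.name, to = completed$pass.recipient.name),
  stringsAsFactors = FALSE
)
edges <- edges[edges$Freq > 0, ]
edges[order(-edges$Freq), ]
```

Players who appear in many heavy edges are the hubs that the network plot draws as large, well-connected nodes.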
clean_events %>%
  filter(team.name == "Southampton") %>%
  soccerPassmap(
    fill = "#66615e",
    arrow = "r",
    shortName = FALSE,
    title = "Southampton Pass Network | Premier League game vs Man City (2015/2016)"
  )
Interpretation:
- The plotting function appears to build the network from a restricted
time window rather than the full match, most likely only the passes made
before the team's first substitution or lineup change.
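If the window is indeed cut at the first substitution (an assumption, not verified against the package source), the equivalent filter can be sketched in base R. The events below are toy data with hypothetical minutes.

```r
# Toy event data (hypothetical minutes)
events <- data.frame(
  minute = c(3, 20, 46, 58, 60, 75),
  type.name = c("Pass", "Pass", "Pass", "Substitution", "Pass", "Pass"),
  stringsAsFactors = FALSE
)

# Minute of the first substitution; Inf keeps every pass if there is none
sub_minutes <- events$minute[events$type.name == "Substitution"]
first_sub <- if (length(sub_minutes) > 0) min(sub_minutes) else Inf

# Passes made before that point
passes_before_sub <- events[events$type.name == "Pass" & events$minute < first_sub, ]
passes_before_sub
```

Comparing the number of rows kept with the total number of passes gives a quick sense of how much of the match such a network would represent.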
This section zooms into individual player behavior, showing pass direction, length, and type to better understand player roles within the system.
clean_events %>%
  filter(team.name == "Manchester City" &
           type.name == "Pass" &
           player.name == "Kevin De Bruyne") %>%
  {
    soccerPitch(arrow = "r",
                title = "Kevin De Bruyne Pass Map vs Southampton",
                subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)") +
      geom_segment(data = ., aes(x = location.x, y = location.y,
                                 xend = pass.end_location.x, yend = pass.end_location.y,
                                 color = pass.height.name),
                   arrow = arrow(length = unit(0.25, "cm"), type = "closed")) +
      theme(legend.position = "bottom",
            plot.margin = margin(5, 5, 5, 5),
            legend.margin = margin(0, 0, 10, 0)) +
      labs(color = "Pass Height")
  }
Interpretation:
- Highly active across the attacking half, with forward-oriented and
varied passes.
- Presence of long and high passes indicates creative playmaking and
vertical progression.
- His distribution pattern confirms his role as a primary offensive
orchestrator.
clean_events %>%
  filter(team.name == "Southampton" &
           type.name == "Pass" &
           player.name == "Oriol Romeu Vidal") %>%
  {
    soccerPitch(arrow = "r",
                title = "Oriol Romeu Vidal Pass Map vs Manchester City",
                subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)") +
      geom_segment(data = ., aes(x = location.x, y = location.y,
                                 xend = pass.end_location.x, yend = pass.end_location.y,
                                 color = pass.height.name),
                   arrow = arrow(length = unit(0.25, "cm"), type = "closed")) +
      theme(legend.position = "bottom",
            plot.margin = margin(5, 5, 5, 5),
            legend.margin = margin(0, 0, 10, 0)) +
      labs(color = "Pass Height")
  }
Interpretation:
- Passes are more central and conservative, with fewer progressive
actions.
- Distribution is largely lateral or short-range, indicating a holding
midfield role.
- Limited penetration suggests support rather than creation.
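Labels such as "forward-oriented" or "lateral" can be backed by a simple calculation on pass start and end points. The sketch below uses toy coordinates (hypothetical values) and assumes the team attacks left to right, as with `arrow = "r"`; the classification rule is an illustrative choice, not one taken from soccermatics.

```r
# Toy pass coordinates in metres (hypothetical values)
passes <- data.frame(
  location.x = c(30, 50, 60, 70),
  location.y = c(34, 20, 40, 34),
  pass.end_location.x = c(60, 50, 55, 100),
  pass.end_location.y = c(34, 50, 40, 34)
)

# Displacement along and across the pitch
dx <- passes$pass.end_location.x - passes$location.x
dy <- passes$pass.end_location.y - passes$location.y

# Euclidean pass length
passes$length <- sqrt(dx^2 + dy^2)

# Simple rule: mostly sideways -> lateral, otherwise forward or backward
passes$direction <- ifelse(abs(dx) < abs(dy), "lateral",
                           ifelse(dx > 0, "forward", "backward"))
table(passes$direction)
```

Comparing the share of forward passes and the mean pass length between two players gives a numeric counterpart to the visual contrast between the two pass maps.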
This section combines pass density (heatmap) and directional flow to visualize where and how teams move the ball across the pitch.
# Standardize columns to allow layering multiple visualizations
mancity_pass_data <- clean_events %>%
  filter(team.name == "Manchester City" & type.name == "Pass") %>%
  soccerStandardiseCols(method = "statsbomb")
# Heatmap of pass density
heatmap_plot <- soccerHeatmap(mancity_pass_data,
                              xBins = 6, yBins = 6,
                              arrow = "r",
                              colLow = "#fff9cc", colHigh = "#ff6100",
                              title = "Manchester City Pass Flow",
                              subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)")
# Overlay flow of passes
soccerFlow(mancity_pass_data,
           xBins = 6, yBins = 6,
           plot = heatmap_plot,  # superimpose on the heatmap
           col = "#37577d",
           lwd = 0.8)
Interpretation:
- High pass density in central and advanced midfield zones.
- Flow vectors indicate forward progression with structured
buildup.
- Strong presence in the opponent’s half reflects territorial
dominance.
# Standardize columns to allow layering multiple visualizations
southampton_pass_data <- clean_events %>%
  filter(team.name == "Southampton" & type.name == "Pass") %>%
  soccerStandardiseCols(method = "statsbomb")
# Heatmap of pass density
heatmap_plot <- soccerHeatmap(southampton_pass_data,
                              xBins = 6, yBins = 6,
                              arrow = "r",
                              colLow = "#fff9cc", colHigh = "#ff6100",
                              title = "Southampton Pass Flow",
                              subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)")
# Overlay flow of passes
soccerFlow(southampton_pass_data,
           xBins = 6, yBins = 6,
           plot = heatmap_plot,  # superimpose on the heatmap
           col = "#37577d",
           lwd = 0.8)
Interpretation:
- Heatmap shows concentration in midfield areas, especially
centrally.
- Less activity in the final third suggests difficulty progressing into
dangerous zones.
- Flow appears more horizontal and less penetrative.
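The density and flow layers both rest on binning passes into a grid. Here is a minimal base R sketch of that idea with a 6 x 6 grid on a 105 x 68 pitch, using toy coordinates (hypothetical values); the actual computation inside `soccerHeatmap()` and `soccerFlow()` may differ.

```r
# Toy pass coordinates in metres (hypothetical values)
passes <- data.frame(
  location.x = c(10, 12, 60, 62, 100),
  location.y = c(5, 8, 30, 32, 60),
  pass.end_location.x = c(30, 20, 80, 60, 104),
  pass.end_location.y = c(5, 16, 30, 46, 40)
)

length_pitch <- 105; width_pitch <- 68
x_bins <- 6; y_bins <- 6

# Assign each pass origin to a grid cell
passes$x_bin <- cut(passes$location.x,
                    breaks = seq(0, length_pitch, length.out = x_bins + 1),
                    labels = FALSE, include.lowest = TRUE)
passes$y_bin <- cut(passes$location.y,
                    breaks = seq(0, width_pitch, length.out = y_bins + 1),
                    labels = FALSE, include.lowest = TRUE)

# Pass displacement
passes$dx <- passes$pass.end_location.x - passes$location.x
passes$dy <- passes$pass.end_location.y - passes$location.y

# Mean displacement per cell (flow) and pass count per cell (density)
flow <- aggregate(cbind(dx, dy) ~ x_bin + y_bin, data = passes, FUN = mean)
counts <- aggregate(dx ~ x_bin + y_bin, data = passes, FUN = length)
flow$n <- counts$dx
flow
```

Cells with a large `n` correspond to hot areas of the heatmap, and the sign of the mean `dx` shows whether passes from that cell tend to go forward or backward.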
This section analyzes defensive intensity through pressure events, highlighting where teams attempt to regain possession.
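As a numeric companion to the pressure heatmaps, pressure events can be counted per third of the pitch. The sketch uses toy data (hypothetical values) and assumes transformed coordinates in metres on a 105 m pitch with the team attacking left to right.

```r
# Toy event data (hypothetical values)
events <- data.frame(
  type.name = c("Pressure", "Pressure", "Pass", "Pressure", "Pressure"),
  location.x = c(20, 50, 60, 90, 100),
  stringsAsFactors = FALSE
)

length_pitch <- 105

# Keep pressure events and label the third of the pitch they occur in
pressures <- events[events$type.name == "Pressure", ]
pressures$third <- cut(pressures$location.x,
                       breaks = c(0, length_pitch / 3, 2 * length_pitch / 3, length_pitch),
                       labels = c("defensive", "middle", "attacking"),
                       include.lowest = TRUE)

pressure_counts <- table(pressures$third)
pressure_counts
```

A high count in the attacking third would indicate a team pressing high up the pitch rather than defending deep.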
# Identify events with valid coordinates
position_events <- clean_events %>%
  filter(!is.na(location.x)) %>%
  select(type.name) %>%
  unique() %>%
  as.list()
# Different types of events
#position_events$type.name
# Heatmap of all actions by Kevin De Bruyne (Manchester City)
clean_events %>%
  filter(team.name == "Manchester City" & player.name == "Kevin De Bruyne" &
           type.name %in% position_events$type.name) %>%
  soccerHeatmap(
    x = "location.x", y = "location.y",
    kde = TRUE,   # smoothed heatmap
    arrow = "r",  # direction of attack
    title = "Kevin De Bruyne Heatmap",
    subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)"
  )
# Heatmap of pressure zones for Manchester City
clean_events %>%
  filter(team.name == "Manchester City" & type.name == "Pressure") %>%
  soccerHeatmap(
    x = "location.x", y = "location.y",
    xBins = 6, yBins = 3,  # grid for counting events
    colLow = "#fff9cc", colHigh = "#ff6100",
    arrow = "r",
    title = "Manchester City Pressure Zones",
    subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)"
  )
Interpretation:
# different types of events
position_events$type.name
## [1] "Pass" "Ball Receipt*" "Carry" "Pressure"
## [5] "Dispossessed" "Duel" "Foul Committed" "Foul Won"
## [9] "Interception" "Shot" "Goal Keeper" "Clearance"
## [13] "Ball Recovery" "Block" "Miscontrol" "Dribbled Past"
## [17] "Dribble" "Offside"
# Heatmap of all actions by Oriol Romeu Vidal (Southampton)
clean_events %>%
  filter(team.name == "Southampton" & player.name == "Oriol Romeu Vidal" &
           type.name %in% position_events$type.name) %>%
  soccerHeatmap(
    x = "location.x", y = "location.y",
    kde = TRUE,   # smoothed heatmap
    arrow = "r",  # direction of attack
    title = "Oriol Romeu Vidal Heatmap",
    subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)"
  )
# Heatmap of pressure zones for Southampton
clean_events %>%
  filter(team.name == "Southampton" & type.name == "Pressure") %>%
  soccerHeatmap(
    x = "location.x", y = "location.y",
    xBins = 6, yBins = 3,  # grid for counting events
    colLow = "#fff9cc", colHigh = "#ff6100",
    arrow = "r",
    title = "Southampton Pressure Zones",
    subtitle = "Manchester City 3-1 Southampton | Premier League game (2015/2016)"
  )
Interpretation: