Dangerousity is a real-time measure that quantifies how likely a team is to score at any given moment based on player positioning, ball control, defensive pressure, and spatial dynamics. Rather than focusing on isolated events like shots or turnovers, it captures the continuous flow of the game and evaluates how threatening an offensive situation is as it unfolds. This kind of metric, though developed in football (soccer), has strong potential applications in the NBA, where player movement, spacing, and defensive reactions play a crucial role in creating scoring opportunities. By adapting the concept to basketball’s faster pace and smaller playing area, dangerousity could help coaches, analysts, and even broadcasters assess the effectiveness of offensive plays in real time and better understand how space and pressure shape scoring chances.
The aim of this code is to build on previous research into dangerousity and apply it to the NBA. The new development is a proposed formula, which is then applied to NBA test data. First, we'll load the necessary libraries.
library(dplyr)
library(ggplot2)
I was given play data containing the x and y positions of each player and the ball, how fast each was travelling, and when in the game the play occurred. I also created new fields, including how close each defender is to the player with the ball, so that I could use triangulation to estimate whether passing lanes are blocked.
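As an illustration of how a field like the defender-to-ball-handler distance could be derived, here is a minimal sketch in base R. It assumes one row per player for a single moment, with the ball handler labelled `"on_ball"` in `player_id` and defenders marked `"def"` in `team`, as in the code further down. The coordinates are made up for the example, and reading `defender_on` as this distance is an assumption about the real dataset.

```r
# Toy frame for one moment of a play (made-up coordinates, in feet)
frame <- data.frame(
  player_id = c("on_ball", "off_2", "def_1", "def_2"),
  team      = c("off", "off", "def", "def"),
  x         = c(20, 10, 23, 8),
  y         = c(25, 30, 29, 25),
  stringsAsFactors = FALSE
)

# Position of the player with the ball
ball <- frame[frame$player_id == "on_ball", ]

# Distance from each defender to the ball handler; NA for attacking players
# (the column name defender_on is assumed to carry this meaning)
frame$defender_on <- ifelse(
  frame$team == "def",
  sqrt((frame$x - ball$x)^2 + (frame$y - ball$y)^2),
  NA_real_
)

print(frame)
```

In this toy frame, `def_1` is 5 ft from the ball handler and `def_2` is 12 ft away, so only `def_1` would fall inside the 5 ft radius used by the pressure term later on.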
nba <- read.csv("AT2_nba_data.csv", stringsAsFactors = FALSE)
names(nba)
# The plays in the dataset
unique(nba$id_play)
# Plot a single play:
plot_play <- 3421490
ggplot(data = subset(nba, id_play == plot_play)) +
  # court outline and the two key rectangles
  geom_rect(aes(xmin = 0, xmax = 94, ymin = 0, ymax = 50), alpha = 0, col = 'black') +
  geom_rect(aes(xmin = 0, xmax = 19, ymin = 19, ymax = 31), alpha = 0, col = 'black') +
  geom_rect(aes(xmin = 0, xmax = 19, ymin = 17, ymax = 33), alpha = 0, col = 'black') +
  # halfway line and backboard
  geom_segment(x = 47, xend = 47, y = 0, yend = 50) +
  geom_segment(x = 4, xend = 4, y = 22, yend = 28) +
  # players and ball, coloured by team
  geom_point(aes(x = x, y = y, col = team), size = 2) +
  # looks
  theme_bw() +
  coord_equal() +
  ggtitle(plot_play)
# calculate the distance from each player to the centre of the free-throw line
# (x = 19 is the free-throw line, i.e. the edge of the key drawn in the plot)
nba <- nba %>%
  mutate(
    dist_toFreeThrow = sqrt((x - 19)^2 + (y - 25)^2)
  )
# summarise each play based on a few factors
nbaplaydata <- nba %>%
  group_by(id_play) %>%
  summarise(
    # distance from the basket at (5.25, 25) to the closest defender
    d_deepest_def = min(sqrt((x[team == 'def'] - 5.25)^2 + (y[team == 'def'] - 25)^2)),
    # has the player used their dribble (first value recorded in the play)
    dribble = dribble[1]
  )
The dangerousity formula proposed by Link et al. in “Real Time Quantification of Dangerousity in Football Using Spatiotemporal Tracking Data” uses four main components: Zone (the danger of a player scoring from their spatial position), Control (how well the player can implement their skill based on ball dynamics), Pressure (the possibility of the defending team stopping the attack), and Density (the chance of defending the ball after the action). Because basketball is played on a much smaller court than a football pitch, the component formulas are altered here, and Density is replaced with Space, which measures how closely defenders cover the passing lanes to teammates nearer the basket. Zone and Control increase dangerousity, while Pressure and Space decrease it. Each component takes a value between 0 and 1.
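Written out from the code that follows, the adapted model appears to take the form below. This is a transcription of the implementation, not a formula quoted from Link et al. The symbols are labels introduced here: d is the ball handler's distance to the rim in feet, v is the ball handler's speed, d_i is the distance from defender i to the ball handler (counted only when it is 5 ft or less), and h_min is the smallest perpendicular distance from any defender to a passing line.

```latex
\begin{align*}
ZO &= e^{-d/20} \\
CO &= \begin{cases}
        1 & \text{if } d < 14 \text{ and } v > 10 \\
        \min\bigl(1,\ \max(0,\ 1 - k_2 v^2)\bigr) & \text{otherwise}
      \end{cases} \\
PR &= \min\Bigl(1,\ \tfrac{1}{10}\exp\Bigl(\sum_{i:\, d_i \le 5}\bigl(1 - \tfrac{d_i}{5}\bigr)\Bigr)\Bigr),
      \qquad PR = 0 \text{ if no defender is within 5 ft} \\
SP &= \min\Bigl(1,\ \max\Bigl(0,\ \tfrac{6 - h_{\min}}{4}\Bigr)\Bigr) \\
DA &= ZO \cdot \Bigl(1 - \frac{1 - CO + PR + SP}{k_1}\Bigr),
      \qquad k_1 = 2.5,\quad k_2 = \tfrac{1}{350}
\end{align*}
```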
# constants
k1 <- 2.5
k2 <- 1/350
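rim_x <- 5.25 # centre of the rim, matching the basket position used in the play summary (x = 4 is the backboard)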
rim_y <- 25
dist_2d <- function(x1, y1, x2, y2) sqrt((x1 - x2)^2 + (y1 - y2)^2)
# Zone: decays exponentially with the ball handler's distance to the rim
zone <- function(x, y) {
  d <- dist_2d(x, y, rim_x, rim_y)
  exp(-d / 20)
}
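# Control: 1 when within 14 ft of the rim and moving faster than 10,
# otherwise falls with the square of speed (clamped to [0, 1]).
# dribble is accepted as an argument but not used in the current rule.
control <- function(speed, dist_rim, dribble) {
  if (is.na(speed)) return(0)
  if (dist_rim < 14 && speed > 10) {
    return(1)
  } else {
    co <- (1 - k2 * (speed^2))
    return(pmax(pmin(co, 1), 0))
  }
}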
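# Pressure: grows with the number and closeness of defenders within 5 ft
pressure <- function(defender_dists) {
  # keep only defenders with a known distance of 5 ft or less
  valid <- defender_dists[!is.na(defender_dists) & defender_dists <= 5]
  n <- length(valid)
  if (n == 0) return(0)
  # n - sum(valid) / 5 adds up (1 - distance / 5) over those defenders
  pr <- exp(n - sum(valid) / 5) / 10
  return(pmin(pr, 1))
}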
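# Space: penalty that rises as a defender gets closer to a passing line
space <- function(pwb, attackers, defenders, dist_rim_pwb) {
  # with no teammates or no defenders there is no lane to block, so no penalty
  # (0 is also what is returned below when no teammate is closer to the rim)
  if (nrow(attackers) == 0 || nrow(defenders) == 0) return(0)
  # find attackers closer to the rim
  attackers <- attackers %>%
    mutate(dist_rim_att = dist_2d(x, y, rim_x, rim_y)) %>%
    filter(dist_rim_att < dist_rim_pwb)
  if (nrow(attackers) == 0) return(0)
  heights <- c()
  for (i in seq_len(nrow(attackers))) {
    for (j in seq_len(nrow(defenders))) {
      # triangle sides: attacker-defender, attacker-passer (the passing line), defender-passer
      side_ad <- dist_2d(attackers$x[i], attackers$y[i], defenders$x[j], defenders$y[j])
      side_ap <- dist_2d(attackers$x[i], attackers$y[i], pwb$x, pwb$y)
      side_dp <- dist_2d(defenders$x[j], defenders$y[j], pwb$x, pwb$y)
      # Heron's formula; pmax guards against tiny negative values from rounding
      s <- (side_ad + side_ap + side_dp) / 2
      area <- sqrt(pmax(s * (s - side_ad) * (s - side_ap) * (s - side_dp), 0))
      # triangle height over the passing line = defender's distance to that line
      h <- ifelse(side_ap != 0, 2 * area / side_ap, 0)
      heights <- c(heights, h)
    }
  }
  # the closest defender to any passing line sets the penalty: 1 at 2 ft or less, 0 at 6 ft or more
  h_min <- min(heights)
  sp <- (6 - h_min) / 4
  return(pmax(pmin(sp, 1), 0))
}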
# main function: one dangerousity value per play, quarter and game-clock instant
calculate_dangerousity <- function(df) {
  df %>%
    group_by(id_play, quarter, game_clock) %>%
    group_modify(~{
      on_ball <- .x %>% filter(player_id == "on_ball")
      defenders <- .x %>% filter(team == "def")
      attackers <- .x %>% filter(team == "off" & player_id != "on_ball")
      # group_modify() needs a data frame back, so frames without a ball handler return an empty tibble
      if (nrow(on_ball) == 0) return(tibble())
      dist_rim_pwb <- dist_2d(on_ball$x, on_ball$y, rim_x, rim_y)
      zo <- zone(on_ball$x, on_ball$y)
      co <- control(on_ball$speed, dist_rim_pwb, on_ball$dribble)
      pr <- pressure(defenders$defender_on)
      sp <- space(on_ball, attackers, defenders, dist_rim_pwb)
      da <- zo * (1 - ((1 - co + pr + sp) / k1))
      # the grouping columns (id_play, quarter, game_clock) are not part of .x;
      # group_modify() re-attaches them to the result
      tibble(
        ZO = zo,
        CO = co,
        PR = pr,
        SP = sp,
        DA = da
      )
    }) %>%
    ungroup()
}
# run model
nba_danger <- calculate_dangerousity(nba)
head(nba_danger)
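To make the passing-lane geometry inside the space function easier to check by hand, here is a small self-contained sketch of the same Heron's-formula height calculation on a triangle whose answer is known in advance. `lane_height` is a helper name introduced for this illustration only, and the coordinates are made up.

```r
# Hypothetical helper (not part of the model code): perpendicular distance from
# a defender to the line through the passer and a receiver, via Heron's formula.
lane_height <- function(passer, receiver, defender) {
  side <- function(p, q) sqrt(sum((p - q)^2))
  a <- side(receiver, defender)
  b <- side(receiver, passer)      # the passing line
  c_side <- side(defender, passer)
  if (b == 0) return(0)
  s <- (a + b + c_side) / 2
  area <- sqrt(max(s * (s - a) * (s - b) * (s - c_side), 0))
  2 * area / b                     # height of the triangle over the passing line
}

passer   <- c(0, 0)
receiver <- c(10, 0)

# Defender standing 3 ft to the side of the lane, halfway along it
h  <- lane_height(passer, receiver, c(5, 3))
sp <- max(min((6 - h) / 4, 1), 0)

# Defender standing behind the passer, 1 ft off the extended line
h_behind <- lane_height(passer, receiver, c(-5, 1))

print(c(height = h, SP = sp, height_behind = h_behind))
```

The first defender comes out 3 ft from the lane, giving a space penalty of 0.75. The second case shows a property worth keeping in mind: the height is measured to the infinite line through passer and receiver, so a defender standing behind the passer is still reported as 1 ft from the lane. Measuring distance to the segment between the two players would be one possible refinement, though that is a suggestion and not something the current model does.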
While the proposed dangerousity framework provides a novel adaptation of spatiotemporal quantification to basketball, several limitations exist. First, the model is constructed on a two-dimensional plane, so it neglects the vertical dynamics of the game (shot arcs, defender jumping reach, and ball height during passes), which are highly influential in real scenarios. Second, the formula assumes perfectly accurate tracking data and instantaneous updates of player positions and speeds. In practice, optical tracking systems introduce small but meaningful errors and frame delays that could distort the calculated values of control, pressure, and space. Third, simplifications were made to keep the model tractable, such as threshold-based rules for control and a fixed radius for defensive influence, which may not fully capture nuanced human decision-making. Fourth, the constants k₁ and k₂ were selected theoretically and were not empirically calibrated against observed scoring outcomes. Future work should focus on calibrating these parameters using historical possession-level event data and evaluating predictive validity. Finally, psychological and contextual factors such as player fatigue, game state, and team strategy are not included but could significantly influence offensive dangerousity in real NBA play.
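Before any empirical calibration, the range of DA implied by the constants can be checked by hand. The sketch below assumes the combination rule used in the main function above and evaluates it at the extreme component values. `danger` is a helper name introduced here for illustration, and k1 is passed in explicitly so that different values can be compared.

```r
# Dangerousity combination rule, with k1 as an explicit argument
danger <- function(zo, co, pr, sp, k1) {
  zo * (1 - (1 - co + pr + sp) / k1)
}

# Best case: full control, no pressure, no space penalty
best <- danger(zo = 1, co = 1, pr = 0, sp = 0, k1 = 2.5)

# Worst case: no control, full pressure, full space penalty
worst_k25 <- danger(zo = 1, co = 0, pr = 1, sp = 1, k1 = 2.5)
worst_k3  <- danger(zo = 1, co = 0, pr = 1, sp = 1, k1 = 3)

print(c(best = best, worst_k1_2.5 = worst_k25, worst_k1_3 = worst_k3))
```

With k1 = 2.5 the worst case is slightly negative (about -0.2 when ZO = 1), so DA is not confined to the 0 to 1 range even though each component is, while k1 = 3 would put the lower bound at 0. Whether that matters depends on how the score is used, so it is a point to examine during calibration and not a recommended change.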