---
title: "Which NFL Defenses Performed the Worst Against Workhorse Rushers?"
author: "Luke Volm"
date: "2026-01-07"
output:
  html_document:           # output document format
    toc: yes               # add table of contents
    toc_float:             # floating TOC and its options
      collapsed: yes       # collapse TOC subheadings
      smooth_scroll: yes   # animate scrolling to the selected heading
    toc_depth: 4           # depth of TOC headings
    fig_width: 6           # global figure width
    fig_height: 4          # global figure height
    fig_caption: yes       # add figure caption
    number_sections: no    # numbering section headings
    code_folding: hide     # folding/showing code
    code_download: yes     # allow to download complete RMarkdown source code
    theme: lumen           # visual theme for HTML document only
    highlight: tango       # code syntax highlighting styles
  pdf_document: 
    toc: yes
    toc_depth: 4
    fig_caption: yes
    number_sections: yes
  word_document:
    toc: yes
    toc_depth: '4'
---

```{css, echo = FALSE}
div#TOC {
  list-style: upper-roman;
  background-image: none;
  background-repeat: no-repeat;
  background-position: 0;
}

h1.title {    /* level 1 header of title  */
  font-size: 24px;
  font-weight: bold;
  color: DarkRed;
  text-align: center;
}

h4.author { /* author line shown under the title */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkRed;
  text-align: center;
}

h4.date { /* date line shown under the title */
  font-size: 18px;
  font-weight: bold;
  font-family: "Times New Roman", Times, serif;
  color: DarkBlue;
  text-align: center;
}

h1 { /* Header 1 - top-level section headings */
    font-size: 20px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: center;
}

h2 { /* Header 2 - second-level section headings */
    font-size: 18px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { /* Header 3 - third-level section headings */
    font-size: 16px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h4 { /* Header 4 - fourth-level section headings */
    font-size: 14px;
    font-weight: bold;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

/* Add dots after numbered headers */
.header-section-number::after {
  content: ".";
}
```
---

```{r setup, include=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(dplyr)
library(knitr)
library(tidyr)
library(ggplot2)
library(scales)
library(tidyverse)
library(nflreadr)
library(purrr)
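# Most recent NFL season according to nflreadr
current_season <- nflreadr::most_recent_season()

# Clear nflreadr's cache so the latest play-by-play data is downloaded
nflreadr::.clear_cache()

# Play-by-play data for that season
pbp <- nflreadr::load_pbp(current_season)

# Latest week present in the data (not printed, since this chunk uses include=FALSE)
max(pbp$week, na.rm = TRUE)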
```

# Introduction

In this project, I created a rushing defense metric that gives a clearer picture of how defenses perform against workhorse rushers. I then compared it to the NFL's traditional average yards per carry allowed stat.

# Creating a Metric

Below, I filtered to player-games where a ball carrier had at least 14 rushing attempts. True 20+ carry workloads are harder to find every week in today’s NFL, so 14 felt like a reasonable cutoff for “workhorse usage.” I didn’t filter out quarterbacks, because a QB carrying the ball 14+ times still has a meaningful rushing impact. For each qualifying player-game, I computed that player’s yards per carry (YPC) in the game and compared it to their season-long YPC. The difference (game YPC − season YPC) measures how much more (or less) efficient that player was against a specific defense than they typically are. Averaging these differences over all qualifying player-games against a defense gives that defense’s score.
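
As a compact restatement of the computation described above, here is the metric as a formula. The symbols are shorthand introduced only for illustration and do not appear in the code: $G_d$ is the set of player-games with at least 14 carries against defense $d$, $y_{p,g}$ and $a_{p,g}$ are player $p$'s rushing yards and attempts in game $g$, and $Y_p$ and $A_p$ are that player's season totals.

$$
\text{score}_d = \frac{1}{|G_d|} \sum_{(p,g) \in G_d} \left( \frac{y_{p,g}}{a_{p,g}} - \frac{Y_p}{A_p} \right)
$$

A positive score means workhorse rushers averaged more yards per carry against that defense than they did over the season. Note that the season totals include the game being evaluated.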

```{r, echo=FALSE, message=FALSE, warning=FALSE, results="hide"}

# pbp must already exist (ex: pbp <- load_pbp(current_season))

# --- Season-long baseline (compute once) ---
rb_season <- pbp %>%
  filter(rush_attempt == 1, !is.na(rusher_player_id)) %>%
  group_by(rusher_player_id, rusher_player_name) %>%
  summarise(
    total_att   = n(),
    total_yards = sum(rushing_yards, na.rm = TRUE),
    season_ypc  = total_yards / total_att,
    .groups = "drop"
  )

# --- Function: returns summary for one defense ---
rb_vs_def_summary <- function(def_abbr, min_att = 14) {

  comp <- pbp %>%
    filter(rush_attempt == 1, !is.na(rusher_player_id), defteam == def_abbr) %>%
    group_by(game_id, week, rusher_player_id, rusher_player_name) %>%
    summarise(
      att_vs_def   = n(),
      yards_vs_def = sum(rushing_yards, na.rm = TRUE),
      ypc_vs_def   = yards_vs_def / att_vs_def,
      .groups = "drop"
    ) %>%
    left_join(rb_season, by = c("rusher_player_id", "rusher_player_name")) %>%
    mutate(
      diff_ypc = ypc_vs_def - season_ypc,
      percent_change = (ypc_vs_def / season_ypc - 1) * 100
    ) %>%
    filter(att_vs_def >= min_att)

  comp %>%
    summarise(
      defteam = def_abbr,
      n_player_games = n(),
      avg_diff_ypc = mean(diff_ypc, na.rm = TRUE),
      avg_percent_change = mean(percent_change, na.rm = TRUE),
      .groups = "drop"
    )
}

# --- Run for every defense in the data ---
all_teams <- sort(unique(na.omit(pbp$defteam)))

all_summaries <- map_dfr(all_teams, ~ rb_vs_def_summary(.x, min_att = 14)) %>%
  arrange(desc(avg_diff_ypc))

all_summaries

```


```{r}

plot_df <- all_summaries

ggplot(plot_df,
       aes(x = reorder(defteam, avg_diff_ypc),
           y = avg_diff_ypc,
           fill = avg_diff_ypc > 0)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(
    values = c("TRUE" = "steelblue", "FALSE" = "firebrick"),
    labels = c("TRUE" = "Rushers outperform", "FALSE" = "Rushers underperform")
  ) +
  labs(
    title = "Worst Defenses vs Workhorse Rushers",
    x = "Defense",
    y = "Avg YPC Difference (vs season average)"
  ) +
  theme(legend.title = element_blank())
```


```{r, echo=FALSE, message=FALSE, warning=FALSE, results="hide"}

def_ypc_allowed <- pbp %>%
  filter(rush_attempt == 1, !is.na(defteam)) %>%
  group_by(defteam) %>%
  summarise(
    ypc_allowed = mean(rushing_yards, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(ypc_allowed))

def_ypc_allowed
```

# Comparing Metrics

Here, I compare my metric to the NFL's average yards per carry allowed. Both metrics are built from the same play-by-play data, so they share the same data coverage and differ only in how the plays are aggregated. The traditional yards per carry allowed metric pools every rushing attempt a defense faced, weights each play equally, and aggregates directly to the team level. In contrast, the workhorse metric conditions on high-volume rushing performances and evaluates how a defense affects a runner's efficiency relative to their season baseline. So, the first metric answers "how many yards per rush does a defense allow overall," while the second answers "do runners become more efficient than usual when they hit a workhorse workload against this defense?" To keep the plot readable, I show only the teams that rank among the 10 worst by either metric.
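
To make the aggregation difference concrete, here is a minimal sketch on made-up data. The rusher labels, yardage values, and season baselines are all hypothetical, and the toy tables only mimic the relevant columns of the real play-by-play data.

```r
library(dplyr)

# Hypothetical carries against one defense in one game:
# rusher "A" has a 14-carry workhorse game, "B" and "C" do not.
toy_pbp <- tibble::tibble(
  rusher        = c(rep("A", 14), rep("B", 4), rep("C", 2)),
  rushing_yards = c(rep(3, 14),   rep(8, 4),   rep(10, 2))
)

# Hypothetical season-long baselines for the same rushers
toy_season <- tibble::tibble(
  rusher     = c("A", "B", "C"),
  season_ypc = c(4.0, 5.0, 6.0)
)

# Traditional view: pool every carry and weight each one equally
pooled_ypc <- mean(toy_pbp$rushing_yards)

# Workhorse view: keep only 14+ carry player-games and compare
# each one to that rusher's own season baseline
workhorse_diff <- toy_pbp %>%
  group_by(rusher) %>%
  summarise(att = n(), game_ypc = mean(rushing_yards), .groups = "drop") %>%
  filter(att >= 14) %>%
  left_join(toy_season, by = "rusher") %>%
  summarise(avg_diff_ypc = mean(game_ypc - season_ypc)) %>%
  pull(avg_diff_ypc)

pooled_ypc      # 4.7 yards per carry allowed overall
workhorse_diff  # -1: the workhorse rusher ran a yard below his baseline
```

In this toy case the defense looks poor by pooled yards per carry, because a few long runs by low-volume rushers pull the average up, yet it held its one workhorse rusher below his usual efficiency.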

```{r}
rank_compare <- all_summaries %>%
  dplyr::select(defteam, avg_diff_ypc) %>%
  left_join(def_ypc_allowed, by = "defteam") %>%
  mutate(
    rank_avg_diff_ypc = dense_rank(desc(avg_diff_ypc)),
    rank_ypc_allowed  = dense_rank(desc(ypc_allowed)),
    rank_gap          = rank_avg_diff_ypc - rank_ypc_allowed
  )

# keep teams that rank among the 10 worst by either metric
dumbbell_df <- rank_compare %>%
  filter(rank_avg_diff_ypc <= 10 | rank_ypc_allowed <= 10) %>%
  mutate(defteam = reorder(defteam, rank_avg_diff_ypc)) %>%
  arrange(rank_avg_diff_ypc)

ggplot(dumbbell_df, aes(y = defteam)) +
  geom_segment(
    aes(x = rank_ypc_allowed, xend = rank_avg_diff_ypc),
    linewidth = 1,
    color = "gray60"
  ) +
  geom_point(aes(x = rank_ypc_allowed, color = "Traditional YPC Allowed"), size = 3) +
  geom_point(aes(x = rank_avg_diff_ypc, color = "Workhorse Rushing Metric"), size = 3) +
  scale_x_reverse(
    breaks = seq(1, 32, 4),
    limits = c(32, 1)
  ) +
  scale_color_manual(
    name = "Metric",
    values = c(
      "Traditional YPC Allowed" = "firebrick",
      "Workhorse Rushing Metric"     = "steelblue"
    )
  ) +
  labs(
    title = "Run Defense Rank Differences by Metric",
    x = "Defensive Rank (1 = Worst, 32 = Best)",
    y = "Team"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(face = "bold")
  )
```

A few teams hold similar ranks under both metrics (Bills, Giants, Bengals, Dolphins, Cardinals, etc.), but there are also some major discrepancies. With 1 as the worst rank, the Panthers are 11th in yards per carry allowed but 3rd against rushers with 14+ carries. The Chargers move from 16th to 9th. Surprisingly, the Bears, Jets, and Falcons look stouter against workhorse rushers than their overall yards per carry allowed suggests.
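
The rank shifts described above come from ranking each metric separately and taking the difference. A minimal sketch of that step on a made-up table follows; the team abbreviations and values are hypothetical, and only the ranking logic mirrors the comparison used here.

```r
library(dplyr)

# Hypothetical team-level values; abbreviations and numbers are made up
toy_metrics <- tibble::tribble(
  ~defteam, ~ypc_allowed, ~avg_diff_ypc,
  "AAA",    5.1,           0.2,
  "BBB",    4.6,           0.9,
  "CCC",    4.2,          -0.4,
  "DDD",    3.9,           0.5
)

toy_ranks <- toy_metrics %>%
  mutate(
    # Rank 1 = worst defense under each metric (highest value)
    rank_ypc_allowed  = dense_rank(desc(ypc_allowed)),
    rank_avg_diff_ypc = dense_rank(desc(avg_diff_ypc)),
    # Negative gap: the defense ranks worse under the workhorse metric
    rank_gap          = rank_avg_diff_ypc - rank_ypc_allowed
  )

toy_ranks
```

In this toy table, DDD moves from 4th-worst overall to 2nd-worst against workhorse rushers (a gap of -2), which is the same kind of shift as the Panthers' move from 11th to 3rd.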

# Conclusion

This project shows how using the same play-by-play data but changing how it’s aggregated can lead to very different conclusions about run defense. Traditional yards per carry allowed reflects overall efficiency, but it misses how defenses perform when opponents commit to a heavy rushing workload. By focusing on high-volume rushing games and comparing performance to season baselines, the workhorse rusher metric highlights situational weaknesses that standard metrics overlook. Some defenses remain consistent across both measures, while others change dramatically, suggesting that workload context plays an important role in evaluating run defense performance.