NFL Quarterback Statistics

Author

Michael Borromeo

Introduction

Hi everyone, I’m Michael Borromeo, and I’m a senior at Xavier University majoring in Business Analytics and Information Systems. I’m originally from Freehold, New Jersey, and throughout college I’ve spent a lot of time balancing coursework, analytics projects, job applications, and time to watch and talk football with my friends. Outside of academics, I enjoy playing sports and staying active while also following major sports leagues.

For my Programming in Analytics course, I was tasked with choosing a dataset and performing a full analysis in R. As someone who follows the NFL closely, quarterback performance immediately stood out as a topic worth exploring. Every season, fans argue endlessly about which quarterbacks are truly “elite,” who carries their offense, and who benefits from their system or supporting cast. But those debates often rely on passing yards alone, a metric that doesn’t capture rushing value, scoring impact, or turnover cost. I wanted to know whether the data actually supports the way quarterbacks are evaluated, or whether a more complete metric would reveal a different hierarchy.

What makes this topic especially interesting to me is how quarterback play blends both analytics and emotion. Every fan thinks they know who the best quarterbacks are, yet every season includes breakout stars, disappointing regressions, and players who quietly produce far more value than the headlines suggest. With access to detailed passing and rushing statistics — and the tools to build a combined offensive value metric — I wanted to take a more analytical approach to a conversation that is usually driven by highlight reels, narratives, and the classic group‑chat arguments.

Data Sources

Primary Dataset

The primary dataset was scraped from the CBS Sports NFL passing leaderboard:

https://www.cbssports.com/nfl/stats/player/passing/nfl/regular/all/?page=

library(tidyverse)
passing_stats <- read_csv("cbs_passing_stats.csv")
Rows: 100 Columns: 15
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): Player, Position, Team, pct_completion_percentage, lng_longest_comp...
dbl (9): gp_games_played, att_pass_attempts, cmp_pass_completions, yds_passi...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Key Variables include:

  • Player

  • Position

  • Team

  • gp_games_played

  • att_pass_attempts

  • cmp_pass_completions

  • pct_completion_percentage

  • yds_passing_yards

  • td_touchdown_passes

  • int_interceptions_thrown

  • rate_passer_rating

Secondary Dataset

The secondary dataset was downloaded from:

https://github.com/hvpkod/NFL-Data/blob/main/NFL-data-Players/2025/QB_season.csv

qb_season <- read_csv("QB_season.csv")
Rows: 132 Columns: 28
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): PlayerName, Pos, Team
dbl (23): PlayerId, PassingYDS, PassingTD, PassingInt, RushingYDS, RushingTD...
lgl  (2): RetTD, FumTD

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Key Variables include:

  • PassingYDS

  • PassingTD

  • PassingInt

  • RushingYDS

  • RushingTD

  • Fum

  • TotalPoints

  • Rank

Data Wrangling

The raw data includes quarterbacks, punters, wide receivers, running backs, and other players who attempted only one or two passes during the regular season, or who did not play at all. To avoid misleading results, I kept only players who met all of the following criteria:

  • Listed position is quarterback (QB)

  • At least 70 pass attempts

  • At least 6 games played

  • A valid (non-missing) passer rating

  • Valid (non-missing) passing yards

passing_stats <- passing_stats %>%
  mutate(
    rate_passer_rating = na_if(rate_passer_rating, "—"),
    rate_passer_rating = as.numeric(rate_passer_rating)
  ) %>% 
  filter(Position == "QB",
         att_pass_attempts >= 70,
         gp_games_played >= 6,
         !is.na(rate_passer_rating),
         !is.na(yds_passing_yards)
         )
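
As a small illustration of the cleaning step above, the sketch below runs the same logic on a toy table. The rows and player names are made up for the example and are not taken from the CBS data. It shows why the placeholder dash is converted to NA before calling as.numeric(), and how the filters then drop non-quarterbacks and low-volume passers.

```r
library(dplyr)
library(tibble)

# Hypothetical toy rows, not taken from the CBS data
toy <- tibble(
  Player = c("A. Passer", "B. Backup", "C. Punter"),
  Position = c("QB", "QB", "P"),
  att_pass_attempts = c(450, 12, 1),
  gp_games_played = c(16, 3, 17),
  rate_passer_rating = c("98.4", "—", "—")
)

toy_clean <- toy %>%
  mutate(
    # Turn the placeholder dash into NA first, so as.numeric() does not warn
    rate_passer_rating = as.numeric(na_if(rate_passer_rating, "—"))
  ) %>%
  filter(
    Position == "QB",
    att_pass_attempts >= 70,
    gp_games_played >= 6,
    !is.na(rate_passer_rating)
  )

# Only the high-volume quarterback survives the filters
toy_clean$Player
```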

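To join the two datasets, I created a common key, LastName, in each one. Because last names are not unique across players, the join below triggers a many-to-many warning, and a quarterback who shares a last name with another player can match more than one row: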

passing_stats <- passing_stats %>%
  mutate(
    LastName = sub(".*\\. ", "", Player)
  )

qb_season <- qb_season %>%
  mutate(
    LastName = sub(".* ", "", PlayerName)
  )
merged <- passing_stats %>%
  left_join(qb_season, by = "LastName")
Warning in left_join(., qb_season, by = "LastName"): Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 11 of `x` matches multiple rows in `y`.
ℹ Row 18 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.
merged <- merged %>%
  mutate(across(where(is.numeric), ~replace_na(., 0)))

After the join, I replaced missing numeric values with 0 (for example, a quarterback with no recorded rushing yards now shows 0 instead of NA).
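
One possible way to avoid the many-to-many warning is to join on a more specific key than last name alone. The sketch below is not the method used in this report. It uses made-up toy rows with placeholder names, and it assumes both files use the same team abbreviations, which has not been verified for the real CBS and GitHub data. The Initial column is a hypothetical helper introduced only for this example.

```r
library(dplyr)
library(tibble)

# Hypothetical toy rows; the names and teams are placeholders, not real players
passing_toy <- tibble(
  Player = c("J. Smith", "A. Jones"),
  Team = c("AAA", "BBB")
)
season_toy <- tibble(
  PlayerName = c("John Smith", "Mike Smith", "Alan Jones"),
  Team = c("AAA", "CCC", "BBB"),
  RushingYDS = c(120, 15, 40)
)

passing_keyed <- passing_toy %>%
  mutate(
    LastName = sub(".*\\. ", "", Player),
    Initial = substr(Player, 1, 1)   # hypothetical helper column
  )

season_keyed <- season_toy %>%
  mutate(
    LastName = sub(".* ", "", PlayerName),
    Initial = substr(PlayerName, 1, 1)
  )

# Last names that appear more than once are the ones that cause duplicate matches
dup_names <- season_keyed %>%
  count(LastName) %>%
  filter(n > 1)

# Joining on last name alone duplicates the "Smith" row
name_only <- left_join(passing_keyed, season_keyed, by = "LastName")

# Joining on initial + last name + team keeps one row per quarterback in this toy data
merged_toy <- passing_keyed %>%
  left_join(season_keyed, by = c("Initial", "LastName", "Team"))

nrow(name_only)
nrow(merged_toy)
```

In dplyr 1.1.0 and later, passing relationship = "one-to-one" to left_join() turns an unexpected duplicate match into an error instead of a warning, which makes this kind of problem harder to miss.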

Summary Statistics

# A tibble: 1 × 5
  n_qbs avg_yards avg_td avg_int avg_rate
  <int>     <dbl>  <dbl>   <dbl>    <dbl>
1    42     2673.   18.0    7.76     90.2

These summary statistics provide a baseline for understanding quarterback performance:

  • 42 quarterbacks met the filtering criteria

  • Average passing yards: 2,673

  • Average passing touchdowns: 18.0

  • Average interceptions: 7.76 (roughly 8)

  • Average passer rating: 90.2

This gives a sense of the typical production level in the 2025 regular season for a quarterback who met the filtering criteria.
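
The code that produced the summary table is not shown in this report. A table with the same column names could be built with summarise(), as in the sketch below. The input here is a made-up stand-in for the real data, and the choice of which columns feed each average is an assumption based on the bullet list above.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the filtered data, with made-up values
merged_toy <- tibble(
  yds_passing_yards = c(4000, 3000, 2000),
  td_touchdown_passes = c(30, 20, 10),
  int_interceptions_thrown = c(10, 8, 6),
  rate_passer_rating = c(100, 90, 80)
)

summary_toy <- merged_toy %>%
  summarise(
    n_qbs = n(),                                   # number of quarterbacks
    avg_yards = mean(yds_passing_yards),           # average passing yards
    avg_td = mean(td_touchdown_passes),            # average passing touchdowns
    avg_int = mean(int_interceptions_thrown),      # average interceptions
    avg_rate = mean(rate_passer_rating)            # average passer rating
  )

summary_toy
```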

Descriptive Analysis

Passing Yards vs Games Played

ggplot(merged, aes(gp_games_played, yds_passing_yards)) +
  geom_point(color = "blue", size = 3, alpha = 0.8) +
  geom_smooth(method = "lm", color = "red") +
  labs(
    title = "Passing Yards vs Games Played",
    x = "Games Played",
    y = "Passing Yards"
  )
`geom_smooth()` using formula = 'y ~ x'

The graph above shows a positive relationship between games played and total passing yards. Quarterbacks who appear in 16 or 17 games naturally accumulate more opportunities to throw, and the trendline suggests that the league’s top passers by volume tend to be healthy, trusted starters. This matters for the main question because total offensive value is a cumulative measure that depends heavily on availability: a quarterback who misses time cannot generate the same level of production, regardless of efficiency.

Touchdowns vs Interceptions

ggplot(merged, aes(int_interceptions_thrown, td_touchdown_passes)) +
  geom_point(color = "blue", size = 3, alpha = 0.8) +
  geom_smooth(method = "lm", color = "red") +
  labs(
    title = "Touchdowns vs Interceptions",
    x = "Interceptions",
    y = "Touchdowns"
  )
`geom_smooth()` using formula = 'y ~ x'

The scatter plot above shows that quarterbacks who throw more touchdowns also tend to throw more interceptions. This pattern likely reflects volume and offensive responsibility: high‑volume passers are asked to push the ball downfield, take risks, and operate in aggressive passing schemes. While interceptions are negative plays, they often accompany high offensive output. Understanding this trade‑off helps contextualize total offensive value, because elite production often comes with elevated risk.

Passer Rating vs Rushing Yards

ggplot(merged, aes(rate_passer_rating, RushingYDS)) +
  geom_point(color = "blue", size = 3, alpha = 0.8) +
  geom_smooth(method = "lm", color = "red") +
  labs(
    title = "Passer Rating vs Rushing Yards",
    x = "Passer Rating",
    y = "Rushing Yards"
  )
`geom_smooth()` using formula = 'y ~ x'

This graph highlights the play-style diversity among NFL quarterbacks. Some quarterbacks generate value through efficient passing (high passer rating but low rushing yards), while others contribute heavily on the ground despite more modest passing efficiency. The most valuable quarterbacks tend to excel in both areas, offering dual‑threat capabilities that stress defenses and expand offensive playbooks. This dual‑dimension value is central to identifying the league’s most complete offensive quarterbacks.

Top 10 Quarterback Total Offensive Touchdowns

merged <- merged %>%
  mutate(total_td = td_touchdown_passes + RushingTD)


top10_td <- merged %>%
  arrange(desc(total_td)) %>%
  slice(1:10)

ggplot(top10_td, aes(x = reorder(Player, total_td), y = total_td, fill = Player)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Quarterbacks by Total Offensive Touchdowns",
    x = "Player",
    y = "Total TDs"
  ) +
  theme(legend.position = "none")

The chart highlights the quarterbacks who produced the most total offensive touchdowns in 2025, combining passing and rushing scores to capture their full scoring impact. Matthew Stafford leads the group with the highest touchdown total even without scoring a single rushing touchdown, reflecting his elite passing volume. Josh Allen and Trevor Lawrence follow, each pairing strong passing production with meaningful rushing value. Drake Maye and Jared Goff also rank near the top through high‑volume passing attacks, while players like Jalen Hurts and Caleb Williams stand out for their dual‑threat scoring ability. Rounding out the top ten are Dak Prescott, Bo Nix, and Justin Herbert, who earned their spots through balanced offensive contributions. Overall, the graph identifies the quarterbacks who most often finished drives with touchdowns for their teams.

Conclusion

Top 10 Quarterbacks by Total Offensive Value Metric

To directly answer the main question, I created a combined metric:

Total Offensive Value = Passing Yards + Rushing Yards + (Total TDs × 20) - (Turnovers × 45)

where Total TDs = passing touchdowns + rushing touchdowns, and Turnovers = interceptions + fumbles.

This formula rewards quarterbacks for both yardage and scoring while penalizing them for mistakes. Because the bonus and penalty are added to raw yardage, they work as yard equivalents: each touchdown is credited as 20 yards to reflect its high impact on winning, and each turnover costs 45 yards to capture the negative swing it creates by ending a drive and often giving the opponent favorable field position.
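
As a quick arithmetic check of the formula, the sketch below plugs in the inputs for two quarterbacks as they appear in the ranking table later in this report. The function name offensive_value() is a hypothetical helper introduced only for this example and is not part of the original analysis.

```r
# Hypothetical helper that applies the Total Offensive Value formula
offensive_value <- function(passing_yards, rushing_yards, total_td, turnovers) {
  passing_yards + rushing_yards + (total_td * 20) - (turnovers * 45)
}

# M. Stafford: 4707 + 1 + (46 * 20) - (15 * 45) = 4707 + 1 + 920 - 675
stafford <- offensive_value(4707, 1, 46, 15)

# D. Maye: 4394 + 450 + (35 * 20) - (16 * 45) = 4394 + 450 + 700 - 720
maye <- offensive_value(4394, 450, 35, 16)

stafford  # 4953
maye      # 4824
```

Both results match the total_value column in the ranking table, which confirms that the table follows the formula as written.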

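merged <- merged %>%
  mutate(
    # Passing plus rushing touchdowns
    total_td = td_touchdown_passes + RushingTD,
    # Interceptions plus fumbles; replace_na() guards against any remaining NAs
    turnovers = replace_na(int_interceptions_thrown, 0) + replace_na(Fum, 0),
    # Total yardage, plus 20 per touchdown, minus 45 per turnover
    total_value = yds_passing_yards +
                  RushingYDS +
                  (total_td * 20) -
                  (turnovers * 45)
  )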

top10_value <- merged %>%
  arrange(desc(total_value)) %>%
  slice(1:10)

ggplot(top10_value, aes(x = reorder(Player, total_value), y = total_value, fill = Player)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Quarterbacks by Total Offensive Value",
    x = "Player",
    y = "Offensive Value Score"
  ) +
  theme(legend.position = "none")

ranking_table <- merged %>%
  arrange(desc(total_value)) %>%
  select(Player, Team.x, yds_passing_yards, RushingYDS, total_td, turnovers, total_value) %>%
  slice(1:10)

ranking_table
# A tibble: 10 × 7
   Player     Team.x yds_passing_yards RushingYDS total_td turnovers total_value
   <chr>      <chr>              <dbl>      <dbl>    <dbl>     <dbl>       <dbl>
 1 M. Staffo… LAR                 4707          1       46        15        4953
 2 D. Maye    NE                  4394        450       35        16        4824
 3 J. Goff    DET                 4564         45       34        14        4659
 4 D. Presco… DAL                 4552        177       32        16        4649
 5 T. Lawren… JAC                 4007        359       38        17        4361
 6 J. Allen   BUF                 3668        579       39        17        4262
 7 C. Willia… CHI                 3942        383       30        15        4250
 8 B. Nix     DEN                 3931        356       30        15        4212
 9 P. Mahomes KC                  3587        422       27        14        3919
10 J. Herbert LAC                 3727        498       28        20        3885

The results of this analysis address the central question: which NFL quarterbacks provide the most total offensive value, not just passing production? In 2025, Matthew Stafford led the league under this metric by pairing elite passing volume with exceptional touchdown production, showing that scoring can outweigh limited rushing output. Drake Maye, Jared Goff, and Dak Prescott followed closely, each delivering strong blends of yardage, scoring, and ball security. Trevor Lawrence and Josh Allen added significant rushing value to their passing totals, while Caleb Williams, Bo Nix, Patrick Mahomes, and Justin Herbert rounded out the top ten through balanced offensive profiles.

Taken together, these findings suggest that the quarterbacks who provide the most total offensive value are not simply the ones who throw for the most yards. They are the ones who combine passing production, mobility, scoring impact, and turnover management to operate as complete offensive engines for their teams.