NFL Player Usage

Authors

Jackson Brasfield

Deron Khelawan

QR Code

library(qrcode)
qr = qr_code("https://rpubs.com/jbrasfield/1426562")

plot(qr)

Introduction

NFL offenses follow many different strategies. Some teams are pass heavy, while others rely on running backs to gain yardage. Teams choose different strategies for many reasons, including:

  • Opposing team’s defense

  • Players on team

  • Location and weather of game

  • Offensive line

  • Coaching staff

  • Depth of positions

We want to find out whether giving one wide receiver or running back a larger share of the workload than the rest of the position group translates to more wins. To investigate, we gather NFL data with nflfastR and compare each team's top wide receiver target share and top running back carry share against its win total.

Data

Initial Setup

# Load all libraries needed
library(nflfastR)
library(dplyr)
library(tidyverse)
library(ggrepel)
library(patchwork)
library(DT)
library(qrcode)
library(plotly)
# Load required data into variables
data = load_player_stats(2025) # Gets stats for all players for the 2025 season
schedule = load_schedules(2025) # Gets the schedule and game results for all teams for the 2025 season
team_info = teams_colors_logos # Gets team info such as conference and division

Create Data Frames

Schedule

# Get the regular-season record for each team
# (a tie counts as neither a win nor a loss)
records_2025 <- schedule |>
  filter(game_type == "REG") |>
  mutate(
    home_win = if_else(home_score > away_score, 1, 0),
    away_win = if_else(away_score > home_score, 1, 0)
  ) |>
  # reframe() (dplyr >= 1.1.0) may return several rows, so each game becomes
  # two rows: one for the home team and one for the away team
  reframe(
    team = c(home_team, away_team),
    wins = c(home_win, away_win),
    losses = c(away_win, home_win)
  ) |>
  group_by(team) |>
  summarise(
    wins = sum(wins),
    losses = sum(losses),
    record = paste(wins, "-", losses)
  ) |>
  arrange(desc(wins))
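To make the reshaping step concrete, here is a minimal base R sketch of the same idea on three made-up games. The team codes and scores are hypothetical, not real NFL results. It illustrates that a tied game adds nothing to either team's wins or losses, so wins plus losses can be lower than games played.

```r
# Hypothetical games for illustration only (not real NFL results)
games <- data.frame(
  home_team  = c("AAA", "BBB", "AAA"),
  away_team  = c("BBB", "CCC", "CCC"),
  home_score = c(24, 17, 20),
  away_score = c(17, 17, 27)
)

# 1 if that side won, 0 otherwise (a tie gives 0 for both sides)
home_win <- as.integer(games$home_score > games$away_score)
away_win <- as.integer(games$away_score > games$home_score)

# One row per team per game: home rows first, then away rows
long <- data.frame(
  team   = c(games$home_team, games$away_team),
  wins   = c(home_win, away_win),
  losses = c(away_win, home_win)
)

# Total wins and losses per team
records <- aggregate(cbind(wins, losses) ~ team, data = long, FUN = sum)
print(records)
```

In this toy data, BBB and CCC each play two games but show only one decided result, because their head-to-head game ended in a tie.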

Top WR Target Shares

# Determine the most targeted WR for each team
# (target share = a player's targets / targets to all WRs on that team)
wr_top_usage = data |>
  filter(position == "WR" & season_type == "REG") |>
  group_by(team, player_id, player_display_name) |>
  summarise(
    season_targets = sum(targets, na.rm = TRUE)
  ) |>
  group_by(team) |>
  mutate(
    team_total_targets = sum(season_targets),
    target_share = season_targets / team_total_targets
  ) |>
  arrange(team, desc(target_share)) |>
  slice(1) |>
  left_join(records_2025, by = "team") |> # Merge with records table to get team W/L
  left_join( # Merge with team info to get conference and division
    team_info |>
      select(team_abbr, team_conf, team_division),
    by = c("team" = "team_abbr")
  )
# Add categories for the target share
# case_when() is vectorised, so it works whether or not the data is grouped
wr_top_usage = wr_top_usage |>
  mutate(
    target_share_category = case_when(
      target_share >= 0.5  ~ "Excellent",
      target_share >= 0.42 ~ "Very Good",
      target_share >= 0.34 ~ "Good",
      TRUE                 ~ "Decent"
    )
  )
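The share calculation and the category thresholds can be checked by hand on a small example. The sketch below uses base R and made-up target counts (the teams, players, and numbers are hypothetical). It uses `cut()` as one possible vectorised way to apply the same 0.34 / 0.42 / 0.5 thresholds.

```r
# Hypothetical target counts for illustration only (not real NFL data)
toy <- data.frame(
  team    = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  player  = c("WR a1", "WR a2", "WR a3", "WR b1", "WR b2", "WR b3"),
  targets = c(120, 60, 20, 80, 70, 50)
)

# Share of the team's WR targets that went to each player
toy$target_share <- toy$targets / ave(toy$targets, toy$team, FUN = sum)

# Keep the player with the highest share on each team
toy <- toy[order(toy$team, -toy$target_share), ]
top <- toy[!duplicated(toy$team), ]

# Left-closed intervals, so a share of exactly 0.5 counts as "Excellent"
top$category <- as.character(cut(
  top$target_share,
  breaks = c(-Inf, 0.34, 0.42, 0.5, Inf),
  labels = c("Decent", "Good", "Very Good", "Excellent"),
  right  = FALSE
))
print(top)
```

Here the top receiver on AAA has 120 of 200 targets (a share of 0.6) and the top receiver on BBB has 80 of 200 (a share of 0.4). Note that the denominator counts only targets to wide receivers, not targets to tight ends or running backs.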

Top RB Carry Shares

# Determine the RB with the most carries for each team
# (carry share = a player's carries / carries by all RBs on that team)
rb_top_usage = data |>
  filter(position == "RB" & season_type == "REG") |>
  group_by(team, player_id, player_display_name) |>
  summarise(
    season_carries = sum(carries, na.rm = TRUE)
  ) |>
  group_by(team) |>
  mutate(
    team_total_carries = sum(season_carries),
    carry_share = season_carries / team_total_carries
  ) |>
  arrange(desc(carry_share)) |>
  slice(1) |>
  left_join(records_2025, by = "team") |> # Merge with records table to get team W/L
  left_join( # Merge with team info to get conference and division
    team_info |>
      select(team_abbr, team_conf, team_division),
    by = c("team" = "team_abbr")
  )
# Add categories for the carry shares
# case_when() is vectorised, so it works whether or not the data is grouped
rb_top_usage = rb_top_usage |>
  mutate(
    target_carry_category = case_when(
      carry_share >= 0.7  ~ "Excellent",
      carry_share >= 0.58 ~ "Very Good",
      carry_share >= 0.46 ~ "Good",
      TRUE                ~ "Decent"
    )
  )

Target vs Carry Shares by Team

# Difference between highest target share and carry share
team_difference = wr_top_usage |>
  select(team, target_share, wins, team_division) |>
  left_join(
    rb_top_usage |>
      select(team, carry_share),
    by = "team"
  )

Get WR Data

wr <- data %>%   
  filter(position == "WR") %>%   
  select(player_name, completions, team, receiving_tds,receiving_air_yards)

Filter for TDs Per Game

# Keep player-games with at least one receiving TD
good_wr <- wr %>%   
  filter(receiving_tds >= 1)

Filter for 3 TD Games

# Keep player-games with exactly 3 receiving TDs
three_wr <- good_wr %>%   
  dplyr::filter(receiving_tds == 3) 

Team Info

# Per-game team passing stats (completions and attempts), regular season only
team_comp <- nflfastR::load_team_stats(2025) %>%
  filter(season_type == "REG") %>%
  select(team, completions, attempts)

# Mean completions per game for each team
team_comp %>%
  dplyr::group_by(team) %>%
  dplyr::summarize(mean_completion = mean(completions, na.rm = TRUE))
# A tibble: 32 × 2
   team  mean_completion
   <chr>           <dbl>
 1 ARI              25.1
 2 ATL              19.5
 3 BAL              16.4
 4 BUF              20.2
 5 CAR              19.4
 6 CHI              19.6
 7 CIN              24.2
 8 CLE              19  
 9 DAL              24.6
10 DEN              22.8
# ℹ 22 more rows
team_comp %>%
  dplyr::group_by(team) %>%
  dplyr::summarize(mean_completion = mean(completions,na.rm = TRUE)) %>%
  dplyr::arrange(desc(mean_completion))
# A tibble: 32 × 2
   team  mean_completion
   <chr>           <dbl>
 1 ARI              25.1
 2 DAL              24.6
 3 CIN              24.2
 4 SF               23.4
 5 NO               23.4
 6 DET              23.2
 7 DEN              22.8
 8 LA               22.8
 9 LAC              21.6
10 PIT              21.5
# ℹ 22 more rows
# Summary table of wins and losses with one row per team
# (records_2025 already has one row per team, so the means equal the original values)
records_avg <- records_2025 |>
  group_by(team) |>
  summarize(
    wins = mean(wins, na.rm = TRUE),
    losses = mean(losses,  na.rm = TRUE)
  )

Results

# WR1 Target Share Distribution by Team
wr_top_usage |>
  ggplot(aes(x = target_share)) +
  geom_histogram(binwidth = 0.05, color = "white") +
  labs(
    title = "Distribution of WR1 Target Share",
    subtitle = "Each bar represents a team's top WR target share",
    x = "Target Share Percentage",
    y = "Number of Teams"
  ) +
  scale_x_continuous(breaks = seq(0, 1, 0.05))

This reveals that for most teams, the top receiver's target share is around 35%. The histogram is right-skewed, and every team falls in the 55% bin or below.

# RB1 Carry Share Distribution by Team
rb_top_usage |>
  ggplot(aes(x = carry_share)) +
  geom_histogram(binwidth = 0.05, color = "white") +
  labs(
    title = "Distribution of RB1 Carry Share",
    subtitle = "Each bar represents a team's top RB carry share",
    x = "Carry Share Percentage",
    y = "Number of Teams"
  ) +
  scale_x_continuous(breaks = seq(0, 1, 0.05))

This shows that RB1 carry share is much more spread out than WR1 target share, and its average is higher. There is also one outlier: a team whose top running back has a carry share of only about 30%.

These histograms show that, on average, a team's RB1 takes a larger share of its carries than its WR1 takes of its targets.

# Compare target and carry share at the team level
team_difference |>
  ggplot(aes(target_share, team)) +
  geom_point() +
  geom_segment(aes(xend = carry_share, yend = team)) +
  scale_x_continuous(breaks = seq(0, 1, 0.1)) +
  geom_text(
    aes(x = carry_share, y = team, label = wins)) +
  labs(
    title = "WR Target Share vs RB Carry Share with Wins",
    subtitle = "Dots represent WR Target Share",
    x = "Share Percentage",
    y = "Team"
  )

This plot suggests that teams that lean heavily on their RB1 tend not to favor their WR1 over their other wide receivers. It also shows that the RB1 usually receives a significantly larger share than the WR1.

Scatter Plots

# Scatterplot comparing Target Share with Carry Share by Team and Division
team_difference |>
  ggplot(aes(x = target_share, y = carry_share, color = team_division)) +
  geom_point(size = 3) +
  geom_text_repel(aes(label = team)) +
  labs(
    title = "WR Target Share vs RB Carry Share by Division",
    x = "WR1 Target Share",
    y = "RB1 Carry Share",
    color = "Division"
  ) +
  scale_x_continuous(breaks = seq(0, 1, 0.05)) +
  scale_y_continuous(breaks = seq(0, 1, 0.05))

This scatter plot compares each team's WR1 target share with its RB1 carry share, with points colored by division.

# Scatterplot of WR target share percentage vs number of wins by team
wr_usage_vs_wins = wr_top_usage |>
  ggplot(aes(x=target_share, y=wins)) +
  geom_point(aes(color = team)) +
  geom_smooth(se=FALSE) +
  geom_text_repel(aes(label = team)) +
  scale_y_continuous(breaks = seq(0, 17, 1)) +
  labs(
    title = "Target Share vs Wins",
    x = "Target Share Percentage",
    y = "Wins"
  )
# Scatterplot of RB carry share percentage vs number of wins by team
rb_usage_vs_wins = rb_top_usage |>
  ggplot(aes(x=carry_share, y=wins)) +
  geom_point(aes(color = team)) +
  geom_smooth(se=FALSE) +
  geom_text_repel(aes(label = team)) +
  scale_y_continuous(breaks = seq(0, 17, 1)) +
  labs(
    title = "Carry Share vs Wins",
    x = "Carry Share Percentage",
    y = "Wins"
  )
# Combine WR and RB scatterplots
all_usage = wr_usage_vs_wins + rb_usage_vs_wins
all_usage + plot_layout(guides = 'collect') &
  theme(legend.position = 'right')

These scatter plots show the relationship between each team's highest target or carry share and its number of wins in the 2025 season. Neither the WR share nor the RB share appears to be related to wins. To check, we ran a regression model.

Regression Model

wins_model = lm(wins ~ target_share + carry_share, data = team_difference)
summary(wins_model)

Call:
lm(formula = wins ~ target_share + carry_share, data = team_difference)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.1182 -2.7469  0.0233  2.9987  5.2910 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept)    9.7699     4.2686   2.289   0.0296 *
target_share   0.1151     7.5279   0.015   0.9879  
carry_share   -2.2359     4.5823  -0.488   0.6293  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.567 on 29 degrees of freedom
Multiple R-squared:  0.008272,  Adjusted R-squared:  -0.06012 
F-statistic: 0.1209 on 2 and 29 DF,  p-value: 0.8865
wins_model

Call:
lm(formula = wins ~ target_share + carry_share, data = team_difference)

Coefficients:
 (Intercept)  target_share   carry_share  
      9.7699        0.1151       -2.2359  

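The regression model supports this. Neither target share (p = 0.988) nor carry share (p = 0.629) is a statistically significant predictor of wins, and the overall F-test (p = 0.887) gives no evidence that the two shares together explain win totals.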

Adjusted R-squared value: -0.06012
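As a quick check on the summary output, the adjusted R-squared and F-statistic can be reproduced from the reported multiple R-squared. This is a sketch using the standard formulas, assuming n = 32 teams and p = 2 predictors, which is consistent with the 29 residual degrees of freedom. The variable names are chosen here for illustration.

```r
r2 <- 0.008272  # Multiple R-squared reported by summary(wins_model)
n  <- 32        # teams: 29 residual df + 2 predictors + 1 intercept
p  <- 2         # predictors: target_share and carry_share

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic and its p-value on (p, n - p - 1) degrees of freedom
f_stat  <- (r2 / p) / ((1 - r2) / (n - p - 1))
p_value <- pf(f_stat, df1 = p, df2 = n - p - 1, lower.tail = FALSE)

print(round(c(adj_r2 = adj_r2, f_stat = f_stat, p_value = p_value), 4))
```

The adjusted R-squared is negative here because the penalty for using two predictors outweighs the very small amount of variance they explain.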

Now what?

Divisional Differences

plot_ly(
  data=wr_top_usage,
  x= ~target_share_category,
  color = ~team_division,
  type = "histogram"
) |>
  layout(
    title = "Wide Receivers by Category of Target Shares",
    xaxis = list(title="Category"),
    yaxis = list(title="Number of Wide Receivers")
  )

This chart counts each division's teams in each WR1 target share category. The AFC North has the most teams in the highest category, meaning its teams use their WR1 more often than teams in other divisions do. The AFC West and AFC South have the most teams in the lowest categories.

plot_ly(
  data=rb_top_usage,
  x= ~target_carry_category,
  color = ~team_division,
  type = "histogram"
) |>
  layout(
    title = "Running Backs by Category of Carry Shares",
    xaxis = list(title="Category"),
    yaxis = list(title="Number of Running Backs")
  )

The AFC East and AFC South have the most teams in the highest categories, which makes sense because they were two of the divisions with teams in the lowest categories for wide receiver target share. The NFC South has the most teams in the lowest categories.

All of these results could be due to a number of things such as quarterbacks, wide receivers, defense, running backs, offensive lines, defensive lines, or other factors.

# Scatterplot: receiving air yards for players with a 3-TD receiving game, colored by team
ggplot(data = three_wr) +
  geom_point(mapping = aes(x=receiving_air_yards, y = player_name, color = team))

# Base scatterplot: completions vs attempts, colored by team
comp_att <- team_comp %>%
  ggplot(aes(completions,attempts)) +
  geom_point(aes(color = team))
comp_att

animate_record <- records_2025 %>%
ggplot(aes(x = team, y = wins)) +
  geom_point(color = "#BF5700", size = 5) +
  geom_segment(
    aes(yend = wins),
    xend = 0,
    linetype = "dashed",
    color = "#333F48"
  ) +
  labs(
    title = "Team by Win",
    x = "NFL Team",
    y = "Wins"
  ) +
  theme_classic()

animate_record

ggplot(data=records_2025) +
  geom_point(mapping = aes(x = wins, y = losses, color = team))

records_2025$team <- as.factor(records_2025$team)
records_2025$record <- as.factor(records_2025$record)

library(cluster)
# Build the dissimilarity matrix
# (daisy() switches to Gower distance because some columns are factors)
dist_matrix <- daisy(records_2025)

# Fit a hierarchical cluster model
model <- hclust(dist_matrix)

# Plot the dendrogram
plot(model)

# Cut the tree into 2 clusters
clustmember <- cutree(model, 2)

# Add the cluster membership as a column and show the result
df1 <- data.frame(records_2025, cluster = clustmember)
datatable(df1)
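Because the team and record columns are factors, they also contribute to the dissimilarity computed above. As a possible alternative, assuming the goal is to group teams by wins and losses alone, the sketch below clusters on the numeric columns only using base R. The six teams and their records are made up for illustration.

```r
# Hypothetical records for six made-up teams (not real NFL data)
toy_records <- data.frame(
  team = c("AAA", "BBB", "CCC", "DDD", "EEE", "FFF"),
  wins = c(14, 13, 12, 4, 3, 2)
)
toy_records$losses <- 17 - toy_records$wins

# Euclidean distance on the numeric columns only
toy_dist <- dist(toy_records[, c("wins", "losses")])

# Hierarchical clustering (complete linkage is the hclust() default)
toy_model <- hclust(toy_dist)

# Cut the tree into 2 clusters and attach the labels
toy_records$cluster <- cutree(toy_model, k = 2)
print(toy_records)
```

With this toy data, the three teams with double-digit wins land in one cluster and the three with four or fewer wins land in the other.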

Conclusion

Our analysis set out to determine whether relying heavily on a single wide receiver or running back translated to more wins in the 2025 NFL season. Ultimately, we found no evidence of a relationship between top-player usage share and wins. However, that led us to look into some other interesting data. Breaking down target and carry share by division revealed that some divisions rely more on wide receivers while others rely more heavily on running backs. We were also able to look at some of the best single-game performances by wide receivers, as well as completions vs pass attempts by team for the season.

Football is a complex sport, and wins depend on many factors besides target and carry share. Given more time and experience, we would like to pull in more data (including multiple seasons), dive deeper into individual player and team stats, and review analyses done by others to find other factors that correlate with wins.

Note: The findings presented in this project are exclusive to this course. They were not presented in other courses in this or previous semesters and will not be presented in any other courses during this semester.

KSU Interactive Map

library(leaflet)
  leaflet() |>   
    addTiles()|>   
    addMarkers(     
      lng = -84.5184,     
      lat=33.9384,     
      popup = "KSU - Marietta"   
    )

Team

Email

jbrasfi1@students.kennesaw.edu

dkhelawa@students.kennesaw.edu

Group Info

Group #: 4

Group Name: Team Data