library(qrcode)
qr = qr_code("https://rpubs.com/jbrasfield/1426562")
plot(qr)
NFL Player Usage
Introduction
There are many different strategies for running an offense in the National Football League (NFL). Some teams are pass heavy, while others rely on running backs to gain yardage. Teams choose their strategies for many reasons, including:
Opposing team’s defense
Players on team
Location and weather of game
Offensive line
Coaching staff
Depth of positions
We want to find out whether giving one wide receiver or running back a larger share of the workload than the team's other wide receivers or running backs translates to more wins. To investigate, we gather NFL data from nflfastR and analyze it.
Data
Initial Setup
# Load all libraries needed
library(nflfastR)
library(dplyr)
library(tidyverse)
library(ggrepel)
library(patchwork)
library(DT)
library(qrcode)
library(plotly)
# Load required data into variables
data = load_player_stats(2025) # Player stats for the 2025 season
schedule = load_schedules(2025) # Schedule and results for all teams in the 2025 season
team_info = teams_colors_logos # Team info such as division and conference
Create Data Frames
Schedule
# Get the regular-season record for each team
records_2025 <- schedule |>
  filter(game_type == "REG") |>
  mutate(
    home_win = if_else(home_score > away_score, 1, 0),
    away_win = if_else(away_score > home_score, 1, 0)
  ) |>
  # reframe() (dplyr >= 1.1.0) is used because each game yields two rows, one per team
  reframe(
    team = c(home_team, away_team),
    wins = c(home_win, away_win),
    losses = c(away_win, home_win)
  ) |>
  group_by(team) |>
  summarise(
    wins = sum(wins),
    losses = sum(losses),
    record = paste(wins, "-", losses)
  ) |>
  arrange(desc(wins))
Get WR Data
wr <- data %>%
  filter(position == "WR") %>%
  select(player_name, completions, team, receiving_tds, receiving_air_yards)
Filter for TDs Per Game
# Keep rows with at least one receiving TD
good_wr <- wr %>%
  filter(receiving_tds >= 1)
Filter for 3 TD Games
# Keep rows with exactly 3 receiving TDs
three_wr <- good_wr %>%
  dplyr::filter(receiving_tds == 3)
Team Info
# Completions and attempts by team (regular season)
team_comp <- nflfastR::load_team_stats(2025) %>%
  filter(season_type == "REG") %>%
  select(team, completions, attempts)
# Mean completions for each team
team_comp %>%
  dplyr::group_by(team) %>%
  dplyr::summarize(mean_completion = mean(completions, na.rm = TRUE))
# A tibble: 32 × 2
   team  mean_completion
   <chr>           <dbl>
 1 ARI              25.1
 2 ATL              19.5
 3 BAL              16.4
 4 BUF              20.2
 5 CAR              19.4
 6 CHI              19.6
 7 CIN              24.2
 8 CLE              19
 9 DAL              24.6
10 DEN              22.8
# ℹ 22 more rows
# Same summary, sorted from most to fewest mean completions
team_comp %>%
  dplyr::group_by(team) %>%
  dplyr::summarize(mean_completion = mean(completions, na.rm = TRUE)) %>%
  dplyr::arrange(desc(mean_completion))
# A tibble: 32 × 2
   team  mean_completion
   <chr>           <dbl>
 1 ARI              25.1
 2 DAL              24.6
 3 CIN              24.2
 4 SF               23.4
 5 NO               23.4
 6 DET              23.2
 7 DEN              22.8
 8 LA               22.8
 9 LAC              21.6
10 PIT              21.5
# ℹ 22 more rows
# Create a summary table with one row per team
# (records_2025 already has one row per team, so the means equal the season totals)
records_avg <- records_2025 |>
  group_by(team) |>
  summarize(
    wins = mean(wins, na.rm = TRUE),
    losses = mean(losses, na.rm = TRUE)
  )
Results
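The plots in this section use `wr_top_usage`, `rb_top_usage`, and `team_difference`, whose construction is not shown in this report. The following is a minimal sketch of how tables of that shape could be built, not the code actually used. It assumes that a player's share is their targets (or carries) divided by the team total across all positions, and it runs on a made-up toy data frame; `toy_stats` and the `_toy` objects are hypothetical names.

```r
library(dplyr)

# Toy season totals (made-up numbers, not real 2025 data)
toy_stats <- data.frame(
  team     = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  player   = c("WR A1", "WR A2", "RB A1", "WR B1", "WR B2", "RB B1"),
  position = c("WR", "WR", "RB", "WR", "WR", "RB"),
  targets  = c(120, 80, 40, 90, 60, 50),
  carries  = c(0, 2, 200, 1, 0, 150)
)

# Assumption: share = player total / team total across all positions
shares <- toy_stats |>
  group_by(team) |>
  mutate(
    target_share = targets / sum(targets),
    carry_share  = carries / sum(carries)
  ) |>
  ungroup()

# WR1: the wide receiver with the largest target share on each team
wr_top_usage_toy <- shares |>
  filter(position == "WR") |>
  group_by(team) |>
  slice_max(target_share, n = 1, with_ties = FALSE) |>
  ungroup() |>
  select(team, wr1 = player, target_share)

# RB1: the running back with the largest carry share on each team
rb_top_usage_toy <- shares |>
  filter(position == "RB") |>
  group_by(team) |>
  slice_max(carry_share, n = 1, with_ties = FALSE) |>
  ungroup() |>
  select(team, rb1 = player, carry_share)

# One row per team with both shares side by side
team_difference_toy <- inner_join(wr_top_usage_toy, rb_top_usage_toy, by = "team")
team_difference_toy
```

In the real analysis, win totals and division would presumably be joined on from `records_2025` and `team_info` by team abbreviation before plotting.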
# WR1 Target Share Distribution by Team
wr_top_usage |>
  ggplot(aes(x = target_share)) +
  geom_histogram(binwidth = 0.05, color = "white") +
  labs(
    title = "Distribution of WR1 Target Share",
    subtitle = "Each bar represents a team's top WR target share",
    x = "Target Share Percentage",
    y = "Number of Teams"
  ) +
  scale_x_continuous(breaks = seq(0, 1, 0.05))
For most teams, the top wide receiver's target share is around 35%. The histogram is right-skewed, and every team falls in the 55% bin or below.
# RB1 Carry Share Distribution by Team
rb_top_usage |>
  ggplot(aes(x = carry_share)) +
  geom_histogram(binwidth = 0.05, color = "white") +
  labs(
    title = "Distribution of RB1 Carry Share",
    subtitle = "Each bar represents a team's top RB carry share",
    x = "Carry Share Percentage",
    y = "Number of Teams"
  ) +
  scale_x_continuous(breaks = seq(0, 1, 0.05))
RB1 carry share is much more spread out than WR1 target share, and its average is higher. There is also one outlier, in the 30% bin.
Together, these histograms show that teams, on average, give their RB1 a larger share of carries than they give their WR1 of targets.
# Compare target and carry share at the team level
team_difference |>
  ggplot(aes(target_share, team)) +
  geom_point() +
  geom_segment(aes(xend = carry_share, yend = team)) +
  scale_x_continuous(breaks = seq(0, 1, 0.1)) +
  # Label the carry-share end of each segment with the team's win total
  geom_text(
    aes(x = carry_share, y = team, label = wins)) +
  labs(
    title = "WR Target Share vs RB Carry Share with Wins",
    subtitle = "Dots represent WR Target Share",
    x = "Share Percentage",
    y = "Team"
  )
This plot shows that teams that lean on their RB1 more heavily than on their other running backs tend not to favor their WR1 over their other wide receivers. It also shows that RB1 usually receives a significantly larger share of the workload than WR1.
Scatter Plots
# Scatterplot comparing Target Share with Carry Share by Team and Division
team_difference |>
  ggplot(aes(x = target_share, y = carry_share, color = team_division)) +
  geom_point(size = 3) +
  geom_text_repel(aes(label = team)) +
  labs(
    title = "WR Target Share vs RB Carry Share by Division",
    x = "WR1 Target Share",
    y = "RB1 Carry Share",
    color = "Division"
  ) +
  scale_x_continuous(breaks = seq(0, 1, 0.05)) +
  scale_y_continuous(breaks = seq(0, 1, 0.05))
This scatter plot compares WR1 target share with RB1 carry share for each team, colored by division.
# Scatterplot of WR target share percentage vs number of wins by team
wr_usage_vs_wins = wr_top_usage |>
  ggplot(aes(x = target_share, y = wins)) +
  geom_point(aes(color = team)) +
  geom_smooth(se = FALSE) +
  geom_text_repel(aes(label = team)) +
  scale_y_continuous(breaks = seq(0, 17, 1)) +
  labs(
    title = "Target Share vs Wins",
    x = "Target Share Percentage",
    y = "Wins"
  )
# Scatterplot of RB carry share percentage vs number of wins by team
rb_usage_vs_wins = rb_top_usage |>
  ggplot(aes(x = carry_share, y = wins)) +
  geom_point(aes(color = team)) +
  geom_smooth(se = FALSE) +
  geom_text_repel(aes(label = team)) +
  scale_y_continuous(breaks = seq(0, 17, 1)) +
  labs(
    title = "Carry Share vs Wins",
    x = "Carry Share Percentage",
    y = "Wins"
  )
# Combine WR and RB scatterplots
all_usage = wr_usage_vs_wins + rb_usage_vs_wins
all_usage + plot_layout(guides = 'collect') &
  theme(legend.position = 'right')
These scatter plots show the relationship between each team's highest target or carry share and its number of wins in the 2025 season. Neither the WR1 target share nor the RB1 carry share appears to be related to wins. To confirm, we ran a regression model.
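As a quick complement to the scatter plots, the strength of each relationship could also be summarized with a Pearson correlation. The sketch below uses base R's `cor.test()` on a made-up toy data frame (`toy` and `cor_summary` are hypothetical names, and the numbers are not the 2025 results); it only illustrates the mechanics.

```r
# Toy data (made-up numbers, not the 2025 results)
toy <- data.frame(
  wins         = c(4, 11, 7, 9, 13, 6, 10, 8),
  target_share = c(0.22, 0.31, 0.25, 0.28, 0.24, 0.30, 0.27, 0.26),
  carry_share  = c(0.55, 0.48, 0.62, 0.40, 0.58, 0.45, 0.50, 0.66)
)

# Pearson correlation of each share with wins
wr_test <- cor.test(toy$target_share, toy$wins)
rb_test <- cor.test(toy$carry_share, toy$wins)

# Collect the correlation coefficients and p-values in one small table
cor_summary <- data.frame(
  share   = c("target_share", "carry_share"),
  r       = unname(c(wr_test$estimate, rb_test$estimate)),
  p_value = c(wr_test$p.value, rb_test$p.value)
)
cor_summary
```

A coefficient near zero with a large p-value would be consistent with the flat trend lines in the plots.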
Regression Model
wins_model = lm(wins ~ target_share + carry_share, data = team_difference)
summary(wins_model)
Call:
lm(formula = wins ~ target_share + carry_share, data = team_difference)

Residuals:
    Min      1Q  Median      3Q     Max
-6.1182 -2.7469  0.0233  2.9987  5.2910

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)    9.7699     4.2686   2.289   0.0296 *
target_share   0.1151     7.5279   0.015   0.9879
carry_share   -2.2359     4.5823  -0.488   0.6293
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.567 on 29 degrees of freedom
Multiple R-squared:  0.008272, Adjusted R-squared:  -0.06012
F-statistic: 0.1209 on 2 and 29 DF,  p-value: 0.8865
wins_model
Call:
lm(formula = wins ~ target_share + carry_share, data = team_difference)

Coefficients:
 (Intercept)  target_share   carry_share
      9.7699        0.1151       -2.2359
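The regression model found no evidence of a relationship between target share, carry share, and wins. Neither coefficient is statistically significant (p = 0.9879 for target share, p = 0.6293 for carry share), and the overall F-test has a p-value of 0.8865.
Adjusted R-squared value: -0.06012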
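A negative adjusted R-squared can look surprising, so it may help to see where it comes from. The sketch below first reproduces the reported value from the printed Multiple R-squared, assuming n = 32 teams (29 residual degrees of freedom plus 3 coefficients) and 2 predictors. It then shows, on a made-up toy data frame, how the same quantities can be pulled out of a `summary.lm` object; `toy`, `fit`, and the other object names are hypothetical.

```r
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
r2 <- 0.008272   # Multiple R-squared from the printed summary
n  <- 32         # teams: 29 residual df + 3 estimated coefficients
p  <- 2          # predictors: target_share and carry_share
adj_r2_reported <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2_reported, 5)

# Toy data (made-up numbers, not the 2025 results)
toy <- data.frame(
  wins         = c(4, 11, 7, 9, 13, 6, 10, 8),
  target_share = c(0.22, 0.31, 0.25, 0.28, 0.24, 0.30, 0.27, 0.26),
  carry_share  = c(0.55, 0.48, 0.62, 0.40, 0.58, 0.45, 0.50, 0.66)
)
fit <- lm(wins ~ target_share + carry_share, data = toy)
fit_summary <- summary(fit)

# Pull individual pieces out of the summary object
adj_r2     <- fit_summary$adj.r.squared
coef_table <- coef(fit_summary)      # estimates, std. errors, t values, p-values
f          <- fit_summary$fstatistic # named vector: value, numdf, dendf
overall_p  <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
```

Because the penalty factor (n - 1) / (n - p - 1) is greater than 1, a model that explains almost none of the variance ends up with an adjusted R-squared slightly below zero.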
Now what?
Divisional Differences
plot_ly(
  data = wr_top_usage,
  x = ~target_share_category,
  color = ~team_division,
  type = "histogram"
) |>
  layout(
    title = "Wide Receivers by Category of Target Shares",
    xaxis = list(title = "Category"),
    yaxis = list(title = "Number of Wide Receivers")
  )
This chart shows WR1 target share by division, grouped into categories. The AFC North has the most teams in the highest category, meaning its teams use their WR1 more often than teams in other divisions. The AFC West and AFC South have the most teams in the lowest categories.
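The `target_share_category` column plotted above is not constructed in this report. One way such a column could be derived is with base R's `cut()`, as in the sketch below. The cut points and labels are hypothetical choices for illustration, not the ones used in the analysis, and the share values are made up.

```r
# Made-up WR1 target shares for five teams
target_share <- c(0.18, 0.24, 0.29, 0.35, 0.41)

# Hypothetical bins: [0, 0.20), [0.20, 0.30), [0.30, 0.40), [0.40, 1]
target_share_category <- cut(
  target_share,
  breaks = c(0, 0.20, 0.30, 0.40, 1),
  labels = c("Under 20%", "20-30%", "30-40%", "Over 40%"),
  right = FALSE,          # intervals are closed on the left
  include.lowest = TRUE   # with right = FALSE, this closes the last bin at 1
)

# Count how many teams fall in each category
table(target_share_category)
```

Because `cut()` returns an ordered set of factor levels, the categories keep their low-to-high order on the plot axis instead of being sorted alphabetically.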
plot_ly(
  data = rb_top_usage,
  x = ~target_carry_category,
  color = ~team_division,
  type = "histogram"
) |>
  layout(
    title = "Running Backs by Category of Carry Shares",
    xaxis = list(title = "Category"),
    yaxis = list(title = "Number of Running Backs")
  )
The AFC East and AFC South have the most teams in the highest carry share categories. For the AFC South this makes sense, since it was one of the divisions with the most teams in the lowest wide receiver target share categories. The NFC South has the most teams in the lowest carry share categories.
These differences could stem from many factors, such as quarterbacks, wide receivers, defense, running backs, offensive lines, defensive lines, or other factors.
# Scatterplot: players with 3 receiving TDs vs their receiving air yards
ggplot(data = three_wr) +
  geom_point(mapping = aes(x = receiving_air_yards, y = player_name, color = team))
# Base scatterplot: completions vs attempts, colored by team
comp_att <- team_comp %>%
  ggplot(aes(completions, attempts)) +
  geom_point(aes(color = team))
comp_att
animate_record <- records_2025 %>%
  ggplot(aes(x = team, y = wins)) +
  geom_point(color = "#BF5700", size = 5) +
  geom_segment(
    aes(yend = wins),
    xend = 0,
    linetype = "dashed",
    color = "#333F48"
  ) +
  labs(
    title = "Wins by Team",
    x = "NFL Team",
    y = "Wins"
  ) +
  theme_classic()
animate_record
# Wins vs losses, colored by team
ggplot(data = records_2025) +
  geom_point(mapping = aes(x = wins, y = losses, color = team))
records_2025$team <- as.factor(records_2025$team)
records_2025$record <- as.factor(records_2025$record)
library(cluster)
# Make the distance matrix
dist <- daisy(records_2025)
# Make a hierarchical cluster model
model <- hclust(dist)
# Plot the hierarchy
plot(model)
# Cut the tree into 2 clusters
clustmember <- cutree(model, 2)
# Add the cluster membership as a column in the data
df1 <- data.frame(records_2025, cluster = clustmember)
datatable(df1)
Conclusion
Our analysis set out to determine whether relying heavily on a single wide receiver or running back translated to more wins in the 2025 NFL season. We found no evidence of a relationship between a team's top target or carry share and its win total. That result led us to look into some other interesting data. Breaking down target and carry share by division revealed that some divisions rely more on their top wide receiver while others rely more heavily on their top running back. We also looked at some of the best single-game performances by wide receivers and at completions versus pass attempts by team for the season.
Football is a complex sport, and wins depend on many factors beyond target and carry share. Given more time and experience, we would like to pull in more data (including multiple seasons), dive deeper into individual player and team stats, and review analyses done by others to find other correlates of wins.
Note: The findings presented in this project are exclusive to this course, were not presented in other courses in this or previous semesters, and will not be presented in any other course during this semester.
KSU Interactive Map
library(leaflet)
leaflet() |>
  addTiles() |>
  addMarkers(
    lng = -84.5184,
    lat = 33.9384,
    popup = "KSU - Marietta"
  )
Team
jbrasfi1@students.kennesaw.edu
dkhelawa@students.kennesaw.edu
Group Info
Group #: 4
Group Name: Team Data