Mismatches in the NFL play a crucial role in determining the outcome of games. Coaches and offensive coordinators constantly seek to exploit these mismatches in terms of speed, size, and positional alignment to gain a strategic advantage. Whether it’s a wide receiver outmuscling a smaller defensive back or a player leveraging superior height in contested catches, mismatches are pivotal in creating scoring opportunities and optimizing team performance. By analyzing mismatches, teams can make more informed decisions about play design, player utilization, and in-game adjustments, giving them a vital edge over their opponents. The ability to identify and capitalize on mismatches is often the difference between winning and losing in a league where every yard and point counts.
To study the speed mismatches created by pre-snap motion, we measured how expected points added (EPA) changed on plays where the offensive player was, on average, faster than the defender assigned to cover him. For positional mismatches, we measured the change in EPA on plays where a cornerback (CB) was forced to cover a larger tight end (TE) or a linebacker (LB) had to cover a faster running back (RB). For size mismatches, we compared wide receivers (WRs) with defensive backs (DBs) to uncover the impact of size on offensive success. Using the NFL Big Data Bowl datasets, we calculated the height and weight differences between WRs and the DBs defending on the same play: we joined player data with play-level data, converted heights to inches for easier calculation, and filtered for matchups where the WR had a physical advantage. We then linked the size differences to additional metrics, such as expected points added (expectedPointsAdded) and pass results, to evaluate how effective these mismatches were.
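The speed comparison described above can be sketched on toy data. This is only an illustration under stated assumptions, not the code used for this report: the data frame `tracking`, its columns (`matchup`, `s`, `epa`), and the result name `speed_sketch` are all hypothetical, and the speeds are made up.

```r
library(dplyr)

# Toy data: one row per player per frame. All names and values are illustrative.
# `s` is speed, `matchup` names the assigned defender (NA on defender rows).
tracking <- data.frame(
  playId      = c(1, 1, 1, 1, 2, 2, 2, 2),
  frameId     = c(1, 2, 1, 2, 1, 2, 1, 2),
  displayName = c("WR A", "WR A", "CB X", "CB X", "RB B", "RB B", "LB Y", "LB Y"),
  matchup     = c("CB X", "CB X", NA, NA, "LB Y", "LB Y", NA, NA),
  s           = c(6.0, 8.0, 5.0, 7.0, 4.0, 5.0, 5.0, 6.0),
  epa         = c(0.5, 0.5, 0.5, 0.5, -0.2, -0.2, -0.2, -0.2)
)

# Average speed of every player on every play.
avg_speed <- tracking %>%
  group_by(playId, displayName) %>%
  summarise(avg_s = mean(s), .groups = "drop")

# One row per offensive player with an assigned defender,
# with both players' average speeds attached.
speed_sketch <- tracking %>%
  filter(!is.na(matchup)) %>%
  distinct(playId, displayName, matchup, epa) %>%
  left_join(avg_speed, by = c("playId", "displayName")) %>%
  left_join(avg_speed, by = c("playId", "matchup" = "displayName"),
            suffix = c("_off", "_def")) %>%
  mutate(speed_difference = avg_s_off - avg_s_def)

# Mean EPA on plays where the offensive player was faster on average.
mean_epa_faster <- mean(speed_sketch$epa[speed_sketch$speed_difference > 0])
```

A positive `speed_difference` marks a play where the offensive player was faster on average than his defender, which is the condition used later when filtering for speed mismatches.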
# Pair each player with his matchup on the same frame and look up the matchup's position
position_comp <- df %>%
  select(gameId, playId, frameId, displayName, displayName_matchup, position, expectedPointsAdded) %>%
  left_join(df %>% select(gameId, playId, frameId, displayName, position),
            by = c("gameId" = "gameId",
                   "playId" = "playId",
                   "frameId" = "frameId",
                   "displayName_matchup" = "displayName")) %>%
  filter(!is.na(displayName_matchup)) %>%
  rename("position" = "position.x",
         "position_matchup" = "position.y") %>%
  # Positional mismatch: a linebacker matched with an RB, or a CB matched with a TE
  mutate(mismatch = (position %in% c("ILB", "MLB", "OLB") & position_matchup == "RB") |
                    (position == "CB" & position_matchup == "TE"))
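To make the positional mismatch rule concrete, here is a small sketch on made-up rows. The data frame `matchups` and the result `flagged` are hypothetical names used only for illustration; the rule itself mirrors the `mismatch` column defined above.

```r
library(dplyr)

# Toy matchups: the player's own position and the position of his matchup.
matchups <- data.frame(
  position         = c("OLB", "CB", "CB", "ILB", "FS"),
  position_matchup = c("RB",  "TE", "WR", "TE",  "RB")
)

# Flag a mismatch when a linebacker is matched with an RB
# or a cornerback is matched with a TE.
flagged <- matchups %>%
  mutate(mismatch =
           (position %in% c("ILB", "MLB", "OLB") & position_matchup == "RB") |
           (position == "CB" & position_matchup == "TE"))

flagged$mismatch
# TRUE TRUE FALSE FALSE FALSE
```

Only the first two rows qualify: a linebacker matched with a TE and a safety matched with an RB are not counted under this definition.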
# Convert height from "feet-inches" text (e.g. "6-2") to total inches
convert_height_to_inches <- function(height) {
  sapply(height, function(h) {
    parts <- strsplit(h, "-")[[1]]
    feet <- as.numeric(parts[1])
    inches <- as.numeric(parts[2])
    feet * 12 + inches
  }, USE.NAMES = FALSE)  # return a plain numeric vector without names
}
# Convert height from total inches to a feet'inches" label (e.g. 74 becomes 6'2")
convert_inches_to_feet_inches <- function(height_in_inches) {
  feet <- floor(height_in_inches / 12)
  inches <- height_in_inches %% 12
  paste0(feet, "'", inches, "\"")
}
# Add height in inches to players data
players <- players %>%
  mutate(height_inches = convert_height_to_inches(height))
# Join player_play with players to include the position column
player_play_with_position <- player_play %>%
  left_join(players %>% select(nflId, position, displayName, weight, height_inches), by = "nflId")
# Filter Wide Receivers (WRs)
Wide_Receivers <- player_play_with_position %>%
  filter(position == "WR") %>%
  rename(player_name_wr = displayName, weight_wr = weight, height_wr = height_inches) %>%
  select(gameId, playId, player_name_wr, weight_wr, height_wr)
# Filter Defensive Backs (DBs)
Defensive_Backs <- player_play_with_position %>%
  filter(position %in% c("CB", "S", "DB")) %>%
  rename(player_name_db = displayName, weight_db = weight, height_db = height_inches) %>%
  select(gameId, playId, player_name_db, weight_db, height_db)
# Combine WR and DB data to calculate size differences.
# The join is on gameId and playId only, so every WR is paired with every DB on the same play.
Offensive_Size_Differences <- Wide_Receivers %>%
  inner_join(Defensive_Backs, by = c("gameId", "playId")) %>%
  mutate(
    height_wr_feet_inches = convert_inches_to_feet_inches(height_wr),
    height_db_feet_inches = convert_inches_to_feet_inches(height_db),
    weight_difference = weight_wr - weight_db,
    height_difference = height_wr - height_db
  ) %>%
  filter(weight_difference > 0 & height_difference > 0) %>% # Only mismatches favoring WRs
  select(
    gameId,
    playId,
    player_name_wr, height_wr_feet_inches, weight_wr,
    player_name_db, height_db_feet_inches, weight_db,
    height_difference, weight_difference
  )
## Warning in inner_join(., Defensive_Backs, by = c("gameId", "playId")): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 1 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
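The warning above arises because a play usually has several WRs and several DBs, and joining on `gameId` and `playId` alone pairs each WR with each DB. The sketch below reproduces this on toy data and shows one way the pairs could be narrowed, assuming a coverage-assignment table were available. The tables `wrs`, `dbs`, and `coverage` are hypothetical; `relationship = "many-to-many"` is the `dplyr` join argument named in the warning.

```r
library(dplyr)

# Toy play with two WRs and three DBs (names are made up).
wrs <- data.frame(gameId = 1, playId = c(10, 10), wr = c("WR A", "WR B"))
dbs <- data.frame(gameId = 1, playId = c(10, 10, 10), db = c("CB X", "CB Y", "S Z"))

# Declaring the relationship makes the cross-pairing explicit and silences the warning.
pairs <- inner_join(wrs, dbs, by = c("gameId", "playId"),
                    relationship = "many-to-many")
nrow(pairs)  # 2 WRs x 3 DBs = 6 rows for this single play

# Hypothetical coverage assignments: one DB per WR.
coverage <- data.frame(gameId = 1, playId = 10,
                       wr = c("WR A", "WR B"), db = c("CB X", "CB Y"))

# Keep only the WR-DB pairs that appear in the assignment table.
covered_pairs <- semi_join(pairs, coverage, by = c("gameId", "playId", "wr", "db"))
nrow(covered_pairs)  # 2
```

Because of the cross-pairing, row counts in the summaries that follow count WR-DB pairs per play rather than individual targets.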
# Add passResult and expectedPointsAdded from plays to size differences
Size_Differences_with_Results <- Offensive_Size_Differences %>%
  left_join(plays %>% select(gameId, playId, passResult, expectedPointsAdded), by = c("gameId", "playId"))
# Calculate success rate (completion percentage) for WRs on complete ("C") or incomplete ("I") passes
Success_Rate <- Size_Differences_with_Results %>%
  filter(passResult %in% c("C", "I")) %>%
  group_by(player_name_wr, height_wr_feet_inches, weight_wr) %>%
  summarise(
    total_targets = n(),  # number of WR-DB rows in the group
    successful_plays = sum(passResult == "C", na.rm = TRUE),
    success_rate = (successful_plays / total_targets) * 100,
    avg_height_difference = mean(height_difference, na.rm = TRUE),
    avg_weight_difference = mean(weight_difference, na.rm = TRUE),
    # Weight difference per row; equals avg_weight_difference when no weights are missing
    total_avg_weight_diff = sum(weight_difference, na.rm = TRUE) / total_targets,
    .groups = "drop"
  ) %>%
  filter(total_targets > 150) %>%
  arrange(desc(success_rate)) %>%
  print(n = 50)
## # A tibble: 71 × 9
## player_name_wr height_wr_feet_inches weight_wr total_targets successful_plays
## <chr> <chr> <int> <int> <int>
## 1 Christian Wat… "6'5\"" 208 153 115
## 2 Treylon Burks "6'3\"" 225 160 119
## 3 Kendrick Bour… "6'1\"" 203 193 143
## 4 Tee Higgins "6'4\"" 210 640 473
## 5 Jakobi Meyers "6'2\"" 200 327 239
## 6 Tyler Boyd "6'2\"" 203 624 451
## 7 DK Metcalf "6'4\"" 230 512 366
## 8 Christian Kirk "5'11\"" 200 215 153
## 9 Mike Strachan "6'5\"" 205 159 113
## 10 DeVante Parker "6'3\"" 216 456 324
## 11 Alec Pierce "6'3\"" 213 570 402
## 12 Ja'Marr Chase "6'1\"" 200 553 390
## 13 Allen Robinson "6'2\"" 220 577 405
## 14 Michael Pittm… "6'4\"" 220 848 594
## 15 A.J. Brown "6'0\"" 226 306 214
## 16 Ben Skowronek "6'3\"" 220 546 380
## 17 Josh Palmer "6'2\"" 210 511 353
## 18 David Sills "6'3\"" 210 217 149
## 19 Cooper Kupp "6'2\"" 208 617 423
## 20 Parris Campbe… "6'0\"" 205 375 257
## 21 Nico Collins "6'4\"" 222 355 241
## 22 Cam Sims "6'5\"" 214 349 236
## 23 Adam Thielen "6'2\"" 200 570 381
## 24 Chase Claypool "6'4\"" 227 771 515
## 25 Amari Cooper "6'1\"" 211 393 262
## 26 Mike Evans "6'5\"" 231 798 530
## 27 Marvin Jones "6'2\"" 198 359 238
## 28 Tre'Quan Smith "6'2\"" 210 311 206
## 29 JuJu Smith-Sc… "6'1\"" 215 491 324
## 30 Justin Jeffer… "6'3\"" 192 317 209
## 31 K.J. Osborn "6'0\"" 200 312 205
## 32 A.J. Green "6'4\"" 210 387 254
## 33 Chris Godwin "6'1\"" 209 417 273
## 34 George Pickens "6'3\"" 200 440 288
## 35 Gabe Davis "6'3\"" 213 497 325
## 36 Terry McLaurin "6'0\"" 210 314 205
## 37 Michael Thomas "6'3\"" 212 192 125
## 38 Trent Sherfie… "6'1\"" 219 334 217
## 39 Mike Williams "6'4\"" 218 661 428
## 40 Allen Lazard "6'5\"" 227 548 354
## 41 Denzel Mims "6'3\"" 208 175 113
## 42 Zay Jones "6'2\"" 200 457 295
## 43 Donovan Peopl… "6'2\"" 208 478 308
## 44 Julio Jones "6'3\"" 220 208 134
## 45 David Bell "6'2\"" 205 277 177
## 46 Marquez Valde… "6'4\"" 207 664 424
## 47 Davante Adams "6'1\"" 215 561 357
## 48 Amon-Ra St. B… "6'1\"" 195 170 108
## 49 Marquez Calla… "6'2\"" 204 345 218
## 50 Brandon Aiyuk "6'1\"" 206 288 181
## # ℹ 21 more rows
## # ℹ 4 more variables: success_rate <dbl>, avg_height_difference <dbl>,
## # avg_weight_difference <dbl>, total_avg_weight_diff <dbl>
# Calculate average expected points added (EPA) per WR, across all pass results
Expected_Points_Scored <- Size_Differences_with_Results %>%
  group_by(player_name_wr, height_wr_feet_inches, weight_wr) %>%
  summarise(
    total_targets = n(),
    avg_expected_points = mean(expectedPointsAdded, na.rm = TRUE),
    avg_height_difference = mean(height_difference, na.rm = TRUE),
    avg_weight_difference = mean(weight_difference, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  filter(total_targets > 150) %>%
  arrange(desc(avg_expected_points)) %>%
  print(n = 50)
## # A tibble: 92 × 7
## player_name_wr height_wr_feet_inches weight_wr total_targets
## <chr> <chr> <int> <int>
## 1 Jake Kumerow "6'4\"" 209 182
## 2 Justin Watson "6'3\"" 215 243
## 3 Stefon Diggs "6'0\"" 191 213
## 4 Marquez Valdes-Scantling "6'4\"" 207 985
## 5 Marcus Johnson "6'1\"" 207 212
## 6 JuJu Smith-Schuster "6'1\"" 215 687
## 7 Rashod Bateman "6'2\"" 210 313
## 8 D.J. Chark "6'4\"" 198 257
## 9 A.J. Brown "6'0\"" 226 580
## 10 Gabe Davis "6'3\"" 213 849
## 11 Treylon Burks "6'3\"" 225 287
## 12 Devin Duvernay "5'11\"" 210 172
## 13 Christian Kirk "5'11\"" 200 390
## 14 DK Metcalf "6'4\"" 230 854
## 15 Michael Gallup "6'1\"" 200 368
## 16 Dante Pettis "6'1\"" 195 233
## 17 Kevin White "6'3\"" 216 184
## 18 Denzel Mims "6'3\"" 208 311
## 19 Trent Sherfield "6'1\"" 219 496
## 20 Zay Jones "6'2\"" 200 776
## 21 Keelan Cole "6'1\"" 194 213
## 22 Josh Palmer "6'2\"" 210 779
## 23 Donovan Peoples-Jones "6'2\"" 208 958
## 24 David Bell "6'2\"" 205 527
## 25 Tyler Boyd "6'2\"" 203 991
## 26 Julio Jones "6'3\"" 220 287
## 27 Ja'Marr Chase "6'1\"" 200 900
## 28 Tee Higgins "6'4\"" 210 984
## 29 Robert Woods "6'0\"" 193 199
## 30 K.J. Osborn "6'0\"" 200 511
## 31 Marvin Jones "6'2\"" 198 586
## 32 Amari Cooper "6'1\"" 211 780
## 33 Marquez Callaway "6'2\"" 204 581
## 34 CeeDee Lamb "6'2\"" 189 178
## 35 Robbie Chosen "6'3\"" 190 185
## 36 Drake London "6'5\"" 210 908
## 37 Chris Godwin "6'1\"" 209 566
## 38 Mike Williams "6'4\"" 218 1012
## 39 Justin Jefferson "6'3\"" 192 517
## 40 Davante Adams "6'1\"" 215 895
## 41 Mack Hollins "6'4\"" 221 1213
## 42 A.J. Green "6'4\"" 210 618
## 43 Amon-Ra St. Brown "6'1\"" 195 278
## 44 David Sills "6'3\"" 210 527
## 45 DeAndre Hopkins "6'1\"" 212 181
## 46 Mike Evans "6'5\"" 231 1127
## 47 Sammy Watkins "6'1\"" 211 346
## 48 Equanimeous St. Brown "6'5\"" 214 897
## 49 Terrace Marshall "6'4\"" 200 320
## 50 Bryan Edwards "6'3\"" 215 198
## # ℹ 42 more rows
## # ℹ 3 more variables: avg_expected_points <dbl>, avg_height_difference <dbl>,
## # avg_weight_difference <dbl>
# Visualization: Average Expected Points Added by WR (top 50)
ggplot(Expected_Points_Scored %>% head(50),
       aes(x = avg_expected_points,
           y = reorder(player_name_wr, avg_expected_points),
           color = avg_height_difference,
           size = avg_weight_difference)) +
  geom_point() +
  geom_text(aes(label = paste0("Ht: ", round(avg_height_difference, 2), " Wt: ", round(avg_weight_difference, 2))),
            hjust = -0.2, size = 3, color = "black") +
  scale_color_gradient(low = "red", high = "green", name = "Avg Height Diff (inches)") +
  scale_size_continuous(name = "Avg Weight Diff (lbs)") +
  labs(
    title = "Top WRs by Average Expected Points Added",
    x = "Average Expected Points Added",
    y = "Wide Receiver (WR)"
  ) +
  theme_minimal() +
  theme(legend.position = "right")
# Keep only the plays that show each type of mismatch
filtered_pos <- position_comp %>%
  filter(mismatch == TRUE)
filtered_speed <- speed_comp %>%
  filter(speed_difference > 0)
filtered_size <- Size_Differences_with_Results %>%
  filter(height_difference > 3 & weight_difference > 30)
# Average EPA for each mismatch type
avg_epa_pos <- mean(filtered_pos$expectedPointsAdded, na.rm = TRUE)
avg_epa_speed <- mean(filtered_speed$expectedPointsAdded, na.rm = TRUE)
avg_epa_size <- mean(filtered_size$expectedPointsAdded, na.rm = TRUE)
# Average EPA per mismatch type, entered as literal values for plotting
avg_epa_data <- data.frame(
  MismatchType = c("Speed Mismatch", "Positional Mismatch", "Size Mismatch"),
  AverageEPA = c(-0.0142070711010506, -0.016072514839682, -0.0700899560471354)
)
ggplot(avg_epa_data, aes(x = MismatchType, y = AverageEPA, fill = MismatchType)) +
  geom_bar(stat = "identity", width = 0.6) +
  labs(
    title = "Comparison of Average EPA for Mismatches",
    x = "Mismatch Type",
    y = "Average EPA"
  ) +
  theme_minimal() +
  scale_fill_manual(values = c("Speed Mismatch" = "skyblue", "Positional Mismatch" = "orange", "Size Mismatch" = "red"))
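The plotting table above uses literal EPA values. As a minimal sketch of an alternative, the table could be built directly from the filtered data so the chart stays in sync with the data. The three toy data frames below are made-up stand-ins for `filtered_speed`, `filtered_pos`, and `filtered_size`, and `avg_epa_sketch` is a hypothetical name.

```r
# Toy stand-ins for the three filtered data frames; the EPA values are made up.
toy <- list(
  "Speed Mismatch"      = data.frame(expectedPointsAdded = c(0.4, -0.6, NA)),
  "Positional Mismatch" = data.frame(expectedPointsAdded = c(1.0, -0.5)),
  "Size Mismatch"       = data.frame(expectedPointsAdded = c(-0.2, -0.4, NA))
)

# One row per mismatch type: mean EPA and the number of non-missing rows behind it.
avg_epa_sketch <- data.frame(
  MismatchType = names(toy),
  AverageEPA   = vapply(toy, function(d) mean(d$expectedPointsAdded, na.rm = TRUE), numeric(1)),
  Plays        = vapply(toy, function(d) sum(!is.na(d$expectedPointsAdded)), integer(1)),
  row.names    = NULL
)

avg_epa_sketch
```

Keeping the row count next to each average also shows how much data sits behind each bar, which helps when the groups differ greatly in size.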
## Conclusion

As the bar chart shows, speed mismatches produced the highest average
expected points added of the three mismatch types created by pre-snap
motion (about -0.014), with positional mismatches close behind (about
-0.016) and size mismatches well below both (about -0.070). However,
none of the mismatch types had a positive average EPA, which could
suggest that pre-snap motion, depending on the team, players, and
coaches, is ineffective. Having spent time analyzing the different
mismatches, we see that there are ways to play to your players'
advantages during pre-snap motion. Some teams have players with size
advantages, while others might have speed advantages. And if pre-snap
motion is likely to create a positional mismatch, a QB or offensive
coordinator who can identify it could make the motion effective.
Generally speaking, however, speed and positional mismatches create the
most opportunities to add points to the scoreboard.