This is my analysis of how NFL defensive linemen perform against different offensive formations. I focus on the 2021 season Week 1 matchup between the Atlanta Falcons and the Philadelphia Eagles. The main goal was to see how the average speed of the Eagles' defensive line varied with the formation the offense lined up in.
The data set used for this analysis was provided by the NFL and can be found on Kaggle.com. It contains game, player, play, and tracking data for every game in the first 8 weeks of the 2021 NFL season. I filtered this large data set down to a smaller, more manageable one that contains the game data, player data, and tracking data for the members of the Eagles' defensive line in their Week 1 matchup against the Atlanta Falcons.
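To make the metric precise: the "average speed" reported for a formation in the code that follows is a mean of per-play means. Written out, with notation introduced here only for illustration ($P_f$ is the set of plays run from formation $f$, $R_p$ is the set of tracking rows for Eagles DE/DT players on play $p$, and $s_r$ is the speed recorded in row $r$):

$$
\bar{s}_f = \frac{1}{|P_f|} \sum_{p \in P_f} \left( \frac{1}{|R_p|} \sum_{r \in R_p} s_r \right)
$$

Because each play contributes equally regardless of how many frames it lasted, longer plays do not dominate a formation's average.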
################################################################################
#Gives you access to the file functions.R and its contents
source('functions.R')
################################################################################
################################################################################
# SET WORKING DIRECTORY
my_path <- "~/Desktop/IS470/BDB2023" #TALLONS DIRECTORY VARIABLE
setwd(my_path)
################################################################################
# LOAD NEEDED PACKAGES
load_packages(c('data.table', 'dplyr', 'ggplot2', 'lubridate', 'ggforce'))
################################################################################
# LOAD WEEK 1 DATA EAGLES vs FALCONS
df1 <- load_data_for_one_week(my_path,
                              weekNumber = 1,
                              merge = TRUE)
test <- df1 %>%
filter(defensiveTeam == "PHI")
def_line <- c("DE", "DT")
filtered <- test %>%
select(c("gameId", "playId", "nflId", "frameId", "time", "jerseyNumber", "team", "x", "y", "s", "a", "dis", "o",
"dir", "possessionTeam", "defensiveTeam", "passResult", "playResult","offenseFormation", "officialPosition", "displayName",
"pff_hit", "pff_hurry", "pff_sack"))
# Filter on team before dropping the column, rather than indexing back into `filtered`
playResult <- filtered %>%
  filter(team == "PHI") %>%
  select(c("playId", "passResult", "playResult", "pff_hit", "pff_hurry", "pff_sack", "offenseFormation"))
################## FIND AVG SPEED FOR D-LINE FOR EACH PLAY ##########################
# Complete: I-formation: Avg_speed - 2.210132
play_56 <- filtered %>%
filter(playId == 56 & team == "PHI" & officialPosition %in% def_line)
avgS56 <- mean(play_56$s)
# Complete: Singleback: Avg_speed - 2.225312
play_122 <- filtered %>%
filter(playId == 122 & team == "PHI" & officialPosition %in% def_line)
avg122 <- mean(play_122$s)
# Complete: Empty: Avg_speed - 2.059138
play_146 <- filtered %>%
filter(playId == 146 & team == "PHI" & officialPosition %in% def_line)
avg146 <- mean(play_146$s)
# Incomplete: Singleback: Avg_speed - 2.018659
play_176 <- filtered %>%
filter(playId == 176 & team == "PHI" & officialPosition %in% def_line)
avg176 <- mean(play_176$s)
# Complete: Shotgun: Avg_speed - 2.395156 - hurry - hit
play_231 <- filtered %>%
filter(playId == 231 & team == "PHI" & officialPosition %in% def_line)
avg231 <- mean(play_231$s)
# Complete: Singleback: Avg_speed - 2.503176
play_303 <- filtered %>%
filter(playId == 303 & team == "PHI" & officialPosition %in% def_line)
avg303 <- mean(play_303$s)
# Incomplete: Singleback: Avg_speed - 1.675349 - hurry
play_348 <- filtered %>%
filter(playId == 348 & team == "PHI" & officialPosition %in% def_line)
avg348 <- mean(play_348$s)
# Incomplete: Shotgun: Avg_speed - 2.386094
play_370 <- filtered %>%
filter(playId == 370 & team == "PHI" & officialPosition %in% def_line)
avg370 <- mean(play_370$s)
# Complete: I-formation: Avg_speed - 2.210882 - hit
play_705 <- filtered %>%
filter(playId == 705 & team == "PHI" & officialPosition %in% def_line)
avg705 <- mean(play_705$s)
# Complete: Singleback: Avg_speed - 1.334539
play_750 <- filtered %>%
filter(playId == 750 & team == "PHI" & officialPosition %in% def_line)
avg750 <- mean(play_750$s)
# Incomplete: Shotgun: Avg_speed - 1.851083
play_821 <- filtered %>%
filter(playId == 821 & team == "PHI" & officialPosition %in% def_line)
avg821 <- mean(play_821$s)
# Scramble: Shotgun: Avg_speed - 2.423254 - hurry
play_843 <- filtered %>%
filter(playId == 843 & team == "PHI" & officialPosition %in% def_line)
avg843 <- mean(play_843$s)
# Complete: Singleback: Avg_speed - 1.946042 - hit
play_864 <- filtered %>%
filter(playId == 864 & team == "PHI" & officialPosition %in% def_line)
avg864 <- mean(play_864$s)
# Incomplete: Jumbo: Avg_speed - 2.808836 - hurry
play_966 <- filtered %>%
filter(playId == 966 & team == "PHI" & officialPosition %in% def_line)
avg966 <- mean(play_966$s)
# Incomplete: Shotgun: Avg_speed - 2.611983
play_1023 <- filtered %>%
filter(playId == 1023 & team == "PHI" & officialPosition %in% def_line)
avg1023 <- mean(play_1023$s)
# Incomplete: Shotgun: Avg_speed - 2.444043 - hurry
play_1085 <- filtered %>%
filter(playId == 1085 & team == "PHI" & officialPosition %in% def_line)
avg1085 <- mean(play_1085$s)
# Complete: I-formation: Avg_speed - 2.472602 - hit - hurry
play_1326 <- filtered %>%
filter(playId == 1326 & team == "PHI" & officialPosition %in% def_line)
avg1326 <- mean(play_1326$s)
# Incomplete: Singleback: Avg_speed - 3.145469 - hit - hurry
play_1459 <- filtered %>%
filter(playId == 1459 & team == "PHI" & officialPosition %in% def_line)
avg1459 <- mean(play_1459$s)
# Complete: Shotgun: Avg_speed - 1.818409
play_1494 <- filtered %>%
filter(playId == 1494 & team == "PHI" & officialPosition %in% def_line)
avg1494 <- mean(play_1494$s)
# Complete: Shotgun: Avg_speed - 1.999219
play_2675 <- filtered %>%
filter(playId == 2675 & team == "PHI" & officialPosition %in% def_line)
avg2675 <- mean(play_2675$s)
# Complete: Shotgun: Avg_speed - 2.396019
play_2720 <- filtered %>%
filter(playId == 2720 & team == "PHI" & officialPosition %in% def_line)
avg2720 <- mean(play_2720$s)
# Complete: Shotgun: Avg_speed - 1.9935
play_2978 <- filtered %>%
filter(playId == 2978 & team == "PHI" & officialPosition %in% def_line)
avg2978 <- mean(play_2978$s)
# Complete: Shotgun: Avg_speed - 2.276818
play_3045 <- filtered %>%
filter(playId == 3045 & team == "PHI" & officialPosition %in% def_line)
avg3045 <- mean(play_3045$s)
# Complete: Shotgun: Avg_speed - 2.535074
play_3110 <- filtered %>%
filter(playId == 3110 & team == "PHI" & officialPosition %in% def_line)
avg3110 <- mean(play_3110$s)
# Complete: Empty: Avg_speed - 1.96359
play_3145 <- filtered %>%
filter(playId == 3145 & team == "PHI" & officialPosition %in% def_line)
avg3145 <- mean(play_3145$s)
# Incomplete: Empty: Avg_speed - 1.857027
play_3383 <- filtered %>%
filter(playId == 3383 & team == "PHI" & officialPosition %in% def_line)
avg3383 <- mean(play_3383$s)
# Complete: Empty: Avg_speed - 2.254355
play_3416 <- filtered %>%
filter(playId == 3416 & team == "PHI" & officialPosition %in% def_line)
avg3416 <- mean(play_3416$s)
# Incomplete: Shotgun: Avg_speed - 2.286014 - hurry - hit
play_3480 <- filtered %>%
filter(playId == 3480 & team == "PHI" & officialPosition %in% def_line)
avg3480 <- mean(play_3480$s)
# Incomplete: Shotgun: Avg_speed - 2.177179
play_3502 <- filtered %>%
filter(playId == 3502 & team == "PHI" & officialPosition %in% def_line)
avg3502 <- mean(play_3502$s)
# Complete: Shotgun: Avg_speed - 1.577569
play_3728 <- filtered %>%
filter(playId == 3728 & team == "PHI" & officialPosition %in% def_line)
avg3728 <- mean(play_3728$s)
# Complete: Shotgun: Avg_speed - 2.167895
play_3752 <- filtered %>%
filter(playId == 3752 & team == "PHI" & officialPosition %in% def_line)
avg3752 <- mean(play_3752$s)
# Incomplete: Singleback: Avg_speed - 2.203721
play_3776 <- filtered %>%
filter(playId == 3776 & team == "PHI" & officialPosition %in% def_line)
avg3776 <- mean(play_3776$s)
# Incomplete: Shotgun: Avg_speed - 1.819375
play_3798 <- filtered %>%
filter(playId == 3798 & team == "PHI" & officialPosition %in% def_line)
avg3798 <- mean(play_3798$s)
# Complete: Shotgun: Avg_speed - 2.120814 - hurry
play_3820 <- filtered %>%
filter(playId == 3820 & team == "PHI" & officialPosition %in% def_line)
avg3820 <- mean(play_3820$s)
# Complete: Shotgun: Avg_speed - 2.264063
play_3988 <- filtered %>%
filter(playId == 3988 & team == "PHI" & officialPosition %in% def_line)
avg3988 <- mean(play_3988$s)
# Complete: Shotgun: Avg_speed - 2.184071
play_4017 <- filtered %>%
filter(playId == 4017 & team == "PHI" & officialPosition %in% def_line)
avg4017 <- mean(play_4017$s)
# Incomplete: Shotgun: Avg_speed - 2.322616 - hit
play_4041 <- filtered %>%
filter(playId == 4041 & team == "PHI" & officialPosition %in% def_line)
avg4041 <- mean(play_4041$s)
# Sack: Empty: Avg_speed - 2.736716 - sack
play_4112 <- filtered %>%
filter(playId == 4112 & team == "PHI" & officialPosition %in% def_line)
avg4112 <- mean(play_4112$s)
# Incomplete: Singleback: Avg_speed - 1.785677
play_4240 <- filtered %>%
filter(playId == 4240 & team == "PHI" & officialPosition %in% def_line)
avg4240 <- mean(play_4240$s)
# Complete: Shotgun: Avg_speed - 2.367643
play_4274 <- filtered %>%
filter(playId == 4274 & team == "PHI" & officialPosition %in% def_line)
avg4274 <- mean(play_4274$s)
# Sack: Shotgun: Avg_speed - 2.675815 - hurry - sack
play_4298 <- filtered %>%
filter(playId == 4298 & team == "PHI" & officialPosition %in% def_line)
avg4298 <- mean(play_4298$s)
# Incomplete: Shotgun: Avg_speed - 2.527344 - hurry
play_4317 <- filtered %>%
filter(playId == 4317 & team == "PHI" & officialPosition %in% def_line)
avg4317 <- mean(play_4317$s)
# Sack: Shotgun: Avg_speed - 2.5374 - sack
play_4346 <- filtered %>%
filter(playId == 4346 & team == "PHI" & officialPosition %in% def_line)
avg4346 <- mean(play_4346$s)
# Complete: Shotgun: Avg_speed - 2.556818
play_4367 <- filtered %>%
filter(playId == 4367 & team == "PHI" & officialPosition %in% def_line)
avg4367 <- mean(play_4367$s)
########## PUT AVG SPEED TOGETHER BY FORMATION ##############################################
shotgun <- data.frame(c(avg231, avg370, avg821, avg1023, avg1085, avg1494, avg2675, avg2720,
                        avg2978, avg3045, avg3110, avg3480, avg3502, avg3728, avg3798, avg3820,
                        avg3988, avg4017, avg4041, avg4274, avg4298, avg4317, avg4346, avg4367))
colnames(shotgun) <- c("Shotgun Avg Speed")
empty_bkfld <- data.frame(c(avg146, avg3145, avg3383, avg3416, avg4112))
colnames(empty_bkfld) <- c("Empty Backfield Avg Speed")
singleback <- data.frame(c(avg176, avg303, avg348, avg750, avg864, avg1459, avg3776, avg4240))
colnames(singleback) <- c("Singleback Avg Speed")
I_formation <- data.frame(c(avg705, avg1326))
colnames(I_formation) <- c("I-Formation Avg Speed")
jumbo <- data.frame(c(avg966))
colnames(jumbo) <- c("Jumbo Formation Avg Speed")
######## FIND AVG SPEED OF D_LINE AGAINST EACH FORMATION ########
sga <- mean(shotgun$`Shotgun Avg Speed`)
eas <- mean(empty_bkfld$`Empty Backfield Avg Speed`)
singa <- mean(singleback$`Singleback Avg Speed`)
iavg <- mean(I_formation$`I-Formation Avg Speed`)
javg <- mean(jumbo$`Jumbo Formation Avg Speed`)
graph_avg <- data.frame(
  formation = c("Shotgun", "Empty Backfield", "Singleback", "I-Formation", "Jumbo"),
  avg_speed = c(sga, eas, singa, iavg, javg)
)
############ CREATE VISUALIZATION OF AVG SPEEDS ##############
# The tracking column `s` is recorded in yards per second, so the axis is labeled accordingly
ggplot(graph_avg, aes(x = formation, y = avg_speed, fill = formation)) +
  geom_bar(stat = 'identity', color = 'black') +
  geom_text(aes(label = round(avg_speed, digits = 2)), vjust = -1, colour = 'black') +
  xlab("Formation") +
  labs(title = "Avg Speed of D-Line vs Offensive Formations (WK 1 PHI vs ATL)") +
  scale_y_continuous(name = "Average Speed (yards/second)", limits = c(0, 3)) +
  guides(fill = guide_legend(title = "Formations"))
knitr::include_graphics("Fast_Shotgun.gif")
Fastest Rush Against Shotgun
knitr::include_graphics("Slow_Shotgun.gif")
Slowest Rush Against Shotgun
knitr::include_graphics("Fast_Empty.gif")
Fastest Rush Against Empty Backfield
knitr::include_graphics("Slow_Empty.gif")
Slowest Rush Against Empty Backfield
knitr::include_graphics("Fast_Single.gif")
Fastest Rush Against Singleback
knitr::include_graphics("Slow_Single.gif")
Slowest Rush Against Singleback
knitr::include_graphics("Fast_I.gif")
Fastest Rush Against I-Formation
knitr::include_graphics("Slow_I.gif")
Slowest Rush Against I-Formation
knitr::include_graphics("Jumbo.gif")
Rush Against Jumbo Formation
#SHOTGUN: 24 PLAYS: 9 PLAYS IMPACTED QB = 37.5% EFFECTIVENESS HITS: 3 HURRY: 6 SACKS: 2
#EMPTY: 5 PLAYS: 1 PLAY IMPACTED QB = 20% EFFECTIVENESS HITS: 0 HURRY: 0 SACKS: 1
#SINGLEBACK: 8 PLAYS: 4 PLAYS IMPACTED QB = 50% EFFECTIVENESS HITS: 2 HURRY: 3 SACKS: 0
#I_FORMATION: 2 PLAYS: 2 PLAYS IMPACTED QB = 100% EFFECTIVENESS HITS: 2 HURRY: 1 SACKS: 0
#JUMBO: 1 PLAY: 1 PLAY IMPACTED QB = 100% EFFECTIVENESS HITS: 0 HURRY: 1 SACKS: 0
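
The effectiveness percentages above are the share of plays per formation on which at least one lineman recorded a hit, hurry, or sack. Below is a minimal sketch of that arithmetic, assuming the `pff_hit`, `pff_hurry`, and `pff_sack` flags are coded 0/1 per lineman. The toy `pressures` table and the `pressure_rate` helper are hypothetical and are not part of the original analysis.

```r
library(dplyr)

# Toy data (made up for illustration): one row per lineman per play.
pressures <- data.frame(
  playId           = c(1, 1, 2, 2, 3, 3, 4, 4),
  offenseFormation = c("SHOTGUN", "SHOTGUN", "SHOTGUN", "SHOTGUN",
                       "EMPTY", "EMPTY", "EMPTY", "EMPTY"),
  pff_hit          = c(0, 1, 0, 0, 0, 0, 0, 0),
  pff_hurry        = c(1, 0, 0, 0, 0, 0, 0, 0),
  pff_sack         = c(0, 0, 0, 0, 0, 1, 0, 0)
)

# Hypothetical helper: share of plays per formation with any hit, hurry, or sack.
pressure_rate <- function(df) {
  df %>%
    group_by(offenseFormation, playId) %>%
    # A play counts as "impacted" if any row for that play has a flag set.
    summarise(
      impacted = any(pff_hit == 1 | pff_hurry == 1 | pff_sack == 1, na.rm = TRUE),
      .groups = "drop"
    ) %>%
    group_by(offenseFormation) %>%
    summarise(
      plays = n(),
      impacted_plays = sum(impacted),
      effectiveness_pct = 100 * mean(impacted),
      .groups = "drop"
    )
}

rates <- pressure_rate(pressures)
print(rates)
```

Because the play-level flag uses `any()`, the same helper should also work on per-frame tracking rows, where each lineman's flags repeat across frames.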
## How This Can Be Useful

This analysis can help NFL coaches in several ways. First, it lets a coach break down the d-line's performance against different offensive formations. Knowing which formations the line has the most success against can help when game-planning against other teams. A coach can also see the speed, or lack of speed, from the d-line against each formation, which can then become an area of focus for practice the following week so that the d-line is similarly effective against all formations.
## What Would You Change?

Prior to submitting this to the NFL Big Data Bowl, I would like to create a more automated way of conducting this analysis, replacing the hardcoded parts with parameters that can be adjusted depending on the game and week you are interested in.
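
One way such automation might look is a single grouped summary in place of the per-play blocks. This is only a sketch: it assumes the merged tracking table has the columns used in the analysis (`playId`, `team`, `officialPosition`, `offenseFormation`, `s`). The function name `dline_speed_by_formation`, its arguments, and the toy rows are hypothetical.

```r
library(dplyr)

# Hypothetical helper: average speed of a defense's linemen against each
# offensive formation. Speeds are averaged within each play first, then across
# plays, which mirrors the mean-of-play-means used in the analysis.
dline_speed_by_formation <- function(tracking, defense, positions = c("DE", "DT")) {
  tracking %>%
    filter(team == defense, officialPosition %in% positions) %>%
    group_by(offenseFormation, playId) %>%
    summarise(play_avg_speed = mean(s), .groups = "drop") %>%
    group_by(offenseFormation) %>%
    summarise(plays = n(), avg_speed = mean(play_avg_speed), .groups = "drop")
}

# Toy tracking rows (made up): two shotgun plays and one singleback play.
toy <- data.frame(
  playId = c(1, 1, 2, 2, 3, 3, 3),
  team = c("PHI", "PHI", "PHI", "PHI", "PHI", "PHI", "ATL"),
  officialPosition = c("DE", "DT", "DE", "DT", "DE", "DT", "QB"),
  offenseFormation = c("SHOTGUN", "SHOTGUN", "SHOTGUN", "SHOTGUN",
                       "SINGLEBACK", "SINGLEBACK", "SINGLEBACK"),
  s = c(2, 4, 1, 3, 2, 2, 9)
)

result <- dline_speed_by_formation(toy, defense = "PHI")
print(result)
```

When more than one game is loaded, `gameId` would need to be added to the first `group_by()`, since `playId` is not unique across games.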