When judging what makes a kicker successful in the eyes of a team or organization, the ability to make kicks when the pressure is on is a crucial element. In this project I define pressure moments as the latter stages of the game. Looking at kicking data from the 4th quarter onward helped determine which kickers can be considered ‘clutch’ from the viewpoint of an NFL fan. The project identifies which kicker has the highest kicking percentage at this crucial point of the game and illustrates the result with a bar chart. Volume of kicks was also taken into account and is shown on the bar chart through the color shading of the bars.
To start this project, the plays, players, games, and PFFScoutingData files were read in and merged together. The key to the merge was recognizing that kickerId in the plays file matches nflId in the players file. These files cover a variety of special teams plays from the 2018 through 2020 seasons. The first modification from the original dataframe to my second df was filtering for kick attempts (field goals and extra points) that were either made or missed in the 4th quarter or the 5th quarter (overtime), as well as selecting the appropriate variables: gameId, playId, quarter, gameClock, specialTeamsPlayType, specialTeamsResult, yardlineNumber, kickLength, Position, displayName, kickerId. Finally, I created a final df, starting with the group_by function on displayName. I then computed several new statistics: total_kicks, successful_kicks, kicking_percentage, total_yrds, and avg_length. Including extra points diluted the average kick length statistic, so I instead opted to map total kicks to the fill color of the bar chart, giving the viewer a better understanding of kick volume. I also filtered out kickers with fewer than 10 kick attempts, which made the percentage statistic more reliable.
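The grouped summary described above can be checked by hand on a tiny made-up dataset. The sketch below does not use the project data: the kicker names and yard lines are invented, and it assumes kick distance is approximated as the line of scrimmage plus 18 yards, added to each kick before averaging.

```r
library(dplyr)

# Made-up kicks for two hypothetical kickers (not real NFL data)
toy_kicks <- data.frame(
  displayName = c("Kicker A", "Kicker A", "Kicker A", "Kicker B", "Kicker B"),
  specialTeamsResult = c("Kick Attempt Good", "Kick Attempt Good",
                         "Kick Attempt No Good", "Kick Attempt Good",
                         "Kick Attempt Good"),
  yardlineNumber = c(15, 30, 35, 15, 20)
)

toy_summary <- toy_kicks %>%
  group_by(displayName) %>%
  summarise(
    total_kicks = n(),
    successful_kicks = sum(specialTeamsResult == "Kick Attempt Good"),
    kicking_percentage = successful_kicks / total_kicks * 100,
    # Assumption: distance is line of scrimmage + 18 yards, applied per kick
    avg_length = mean(yardlineNumber + 18),
    .groups = "drop"
  )

print(toy_summary)
```

In this toy data, Kicker A makes 2 of 3 kicks (about 66.7%) with an average length of about 44.7 yards, and Kicker B makes 2 of 2 (100%) with an average length of 35.5 yards.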
For the visualization I opted for a flipped (horizontal) bar chart. Flipping the chart made it easier to label each kicker along the y axis, of which there are 44. The x axis is the kicking percentage, and the highest percentage kickers sit at the top of the chart. The observations at the top of this graph align well with the kickers regarded as the best during this timeframe, with Wil Lutz leading the way, followed by the likes of Justin Tucker and Sebastian Janikowski. The volume of kicks, represented by the shading on the chart, varies widely moving up and down the chart.
setwd("C:/Users/17169/OneDrive - Loyola University Maryland/IS470SportsAnalytics")
# Load packages for file reading, data wrangling, and plotting
library(data.table)
library(dplyr)
library(ggplot2)
my_df1 <- fread("Data/NFLBDB2022/NFL2022/plays.csv")
my_df2 <- fread("Data/NFLBDB2022/NFL2022/players.csv")
my_df3 <- fread("Data/NFLBDB2022/NFL2022/games.csv")
my_df4 <- fread("Data/NFLBDB2022/NFL2022/PFFScoutingData.csv")
df <- left_join(my_df1, my_df2, by = c("kickerId" = "nflId"))
df <- left_join(df, my_df3, by = c("gameId"))
df <- left_join(df, my_df4, by = c("gameId", "playId"))
rm(my_df1, my_df2, my_df3, my_df4)
# Keep made or missed kick attempts from the 4th quarter and overtime (quarter 5)
df_new <- df %>%
  filter(quarter %in% c(4, 5),
         specialTeamsResult %in% c("Kick Attempt Good", "Kick Attempt No Good")) %>%
  select(gameId, playId, quarter, gameClock, specialTeamsPlayType, specialTeamsResult,
         yardlineNumber, kickLength, Position, displayName, kickerId) %>%
  data.frame()
# Summarise each kicker's late-game volume, accuracy, and kick distance
df_new2 <- df_new %>%
  group_by(displayName) %>%
  summarise(
    total_kicks = n(),
    successful_kicks = sum(specialTeamsResult == "Kick Attempt Good"),
    kicking_percentage = (successful_kicks / total_kicks) * 100,
    # Add the 18 yards to every kick, not once to the kicker's total
    total_yrds = sum(yardlineNumber + 18),
    avg_length = total_yrds / total_kicks,
    .groups = "drop"
  ) %>%
  filter(total_kicks >= 10) %>%
  data.frame()
# Flipped bar chart: kickers on the y axis, kicking percentage on the x axis
ggplot(data = df_new2,
       aes(x = reorder(displayName, kicking_percentage),
           y = kicking_percentage, fill = total_kicks)) +
  geom_col() +
  coord_flip() +
  geom_text(aes(label = scales::percent(kicking_percentage * 0.01, accuracy = 1)),
            hjust = -0.2) +
  labs(x = "Kickers", y = "Kicking Percentage",
       title = "Best 4th Quarter & Overtime Kickers",
       fill = "Total Kicks") +
  theme(axis.text.x = element_text(color = "black", size = 10),
        axis.text.y = element_text(color = "black", size = 10),
        axis.title = element_text(hjust = .5, size = 18, face = "bold"),
        plot.title = element_text(hjust = .5, size = 18, face = "bold")) +
  scale_fill_continuous(limits = c(0, 65), low = "red", high = "blue")
In this project I looked at how successful kickers are from the 4th quarter onward, the point of the game where pressure situations arise. For a kicker at the NFL level, converting kicks at clutch points of the game is crucial to a team’s chance at victory. Kickers like Justin Tucker, Nick Folk, and Chris Boswell show up at the top of the list, which suggests the chart does a good job of showing who the top clutch kickers are when it comes to making kicks late. One way to improve this analysis would be to filter for close game situations (possibly one-score games) to further isolate high-pressure kicks; however, this may cause difficulty with sample sizes. Overall, after filtering for volume, the chart does a good job of showing how successful kickers were over these three seasons.
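The close-game filter suggested above could look roughly like the sketch below. It runs on made-up rows, not the project data. The column names preSnapHomeScore and preSnapVisitorScore are assumed to be available on each play, and treating a margin of 8 points or fewer as a one-score game is my own assumption, not part of the original analysis.

```r
library(dplyr)

# Made-up late-game kicks; score column names and values are assumptions
toy_plays <- data.frame(
  displayName = c("Kicker A", "Kicker A", "Kicker B", "Kicker B"),
  quarter = c(4, 4, 4, 5),
  preSnapHomeScore = c(20, 31, 17, 24),
  preSnapVisitorScore = c(17, 10, 24, 24),
  specialTeamsResult = c("Kick Attempt Good", "Kick Attempt Good",
                         "Kick Attempt No Good", "Kick Attempt Good")
)

# Assumption: a one-score game is a margin of 8 points or fewer before the snap
close_game_kicks <- toy_plays %>%
  mutate(score_margin = abs(preSnapHomeScore - preSnapVisitorScore)) %>%
  filter(score_margin <= 8)

close_summary <- close_game_kicks %>%
  group_by(displayName) %>%
  summarise(
    total_kicks = n(),
    successful_kicks = sum(specialTeamsResult == "Kick Attempt Good"),
    kicking_percentage = successful_kicks / total_kicks * 100,
    .groups = "drop"
  )

print(close_summary)
```

Only three of the four toy kicks survive the filter, which shows how quickly the sample shrinks: Kicker A is left with a single attempt, so a minimum-attempt cutoff like the one used in the main analysis would likely need to be lowered or the seasons pooled.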