Assignment 7

Kick Returning in 2025

In the 2024 season, the NFL implemented a new kickoff format designed to make returns more likely. Under this rule, the kicking team is not allowed to run at the returner until the ball is caught. As a result, players meet at slower speeds, which is meant to prevent injuries on kickoffs. With this rule, kick returners are returning the ball more often and getting their team closer to the 50-yard line rather than starting at their own 25.

Research Question

With this rule change in place, I wanted to see which kick returner in the NFL puts their team in the best field position.

Data Wrangling

First, we need to load the packages needed to get the data.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(rvest)

Attaching package: 'rvest'

The following object is masked from 'package:readr':

    guess_encoding
library(dplyr)
library(xml2)
library(readr)

The data for my question was gathered with a loop that scraped the necessary pages. Below, we read in the data that was scraped from the internet.

kick_return<-read_csv("https://myxavier-my.sharepoint.com/:x:/g/personal/shannonr1_xavier_edu/IQDxOswDVLJeTbhy4Kcbp6dFAX0XrVT61QSbgOWwB_OEP5c?download=1")
Rows: 147 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): Player_Player_on_team, Role
dbl (6): GP_Games_played, RET_Kickoff_Returns, RETY_Kickoff_Return_Yards, AV...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

We also do some data cleaning to make the data easier to code with and to sort the kick returners in the NFL.
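As an illustration of what such a cleaning step could look like, here is a minimal sketch. It assumes the Role column is derived from the more-than-25-returns threshold described in the Analysis section, and that the non-starter label is "Backup". The rows are made-up toy data, not values from the real file.

```r
library(dplyr)

# Hypothetical toy rows (not real players); column names match the loaded data.
toy_returns <- data.frame(
  Player_Player_on_team = c("Player A (Team X)", "Player B (Team Y)", "Player C (Team Z)"),
  RET_Kickoff_Returns = c(31, 25, 4),
  RETY_Kickoff_Return_Yards = c(820, 640, 130)
)

toy_clean <- toy_returns %>%
  mutate(
    # Assumed rule: a "Starter" has more than 25 returns, everyone else is a "Backup".
    Role = if_else(RET_Kickoff_Returns > 25, "Starter", "Backup"),
    # Recompute the per-return average from the totals as a consistency check.
    AVG_Average_Yards_per_Kickoff_Return =
      RETY_Kickoff_Return_Yards / RET_Kickoff_Returns
  ) %>%
  # Sort so the busiest returners come first.
  arrange(desc(RET_Kickoff_Returns))

toy_clean
```

Note that a player with exactly 25 returns is labeled a backup under this assumed rule, because the threshold is "more than 25".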

Analysis

In this visualization, I wanted to show the average yards per kickoff return for starting kick returners. I filtered the data so that a “starter” is a player with more than 25 kick returns. The visualization shows that a starting NFL kick returner takes the ball 25-27 yards up the field on average, so the starting field position after a kickoff is usually around the 30-35 yard line. There are two outliers who average over 28 yards per return on more than 25 returns. These two kick returners set their team up with great field position, near the 35-40 yard line.

kick_return %>%
  filter(Role == "Starter") %>%
  ggplot(aes(x = Role, y = AVG_Average_Yards_per_Kickoff_Return)) +
  geom_boxplot(fill = "skyblue", color = "darkblue") +
  labs(
    title = "Kickoff Return Average Yards — Starters Only",
    x = "Role",
    y = "Average Yards per Kickoff Return"
  ) +
  theme_minimal()
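The points that a boxplot draws individually are the values more than 1.5 times the interquartile range (IQR) beyond the box. As a rough sketch, the same rule can be applied in code to list which players those points are. The averages below are made-up toy values, and `toy_starters` and `outliers` are hypothetical names used only for this example.

```r
library(dplyr)

# Hypothetical averages for eight starters (illustration only).
toy_starters <- data.frame(
  Player_Player_on_team = paste("Player", LETTERS[1:8]),
  AVG_Average_Yards_per_Kickoff_Return =
    c(25.1, 25.8, 26.0, 26.3, 26.5, 26.9, 27.2, 31.0)
)

outliers <- toy_starters %>%
  mutate(
    # First and third quartiles of the per-return average.
    q1 = quantile(AVG_Average_Yards_per_Kickoff_Return, 0.25, names = FALSE),
    q3 = quantile(AVG_Average_Yards_per_Kickoff_Return, 0.75, names = FALSE),
    iqr = q3 - q1
  ) %>%
  # Keep only players outside the 1.5 * IQR fences.
  filter(
    AVG_Average_Yards_per_Kickoff_Return < q1 - 1.5 * iqr |
      AVG_Average_Yards_per_Kickoff_Return > q3 + 1.5 * iqr
  )

outliers
```

On the real data, the same steps would be run on `kick_return` after filtering to `Role == "Starter"`.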

After finding two outliers among the starting kick returners, I wanted to see who they were and what team they were on. The two players were Luke McCaffrey of the Washington Commanders and Skyy Moore of the San Francisco 49ers. Unfortunately, the Commanders’ starting quarterback is injured right now, so they can’t capitalize on the field position they are given, but the 49ers have an 8-3 record as of 11/25/25. Having great starting field position makes it easier to score, and scoring on every possession is crucial in this league.

top5_starters <- kick_return %>%
  filter(Role == "Starter") %>%
  arrange(desc(AVG_Average_Yards_per_Kickoff_Return)) %>%
  slice_head(n = 5)  


ggplot(top5_starters, aes(x = reorder(Player_Player_on_team, AVG_Average_Yards_per_Kickoff_Return), 
                           y = AVG_Average_Yards_per_Kickoff_Return)) +
  geom_col(fill = "steelblue") +
  coord_flip() +  
  labs(
    title = "Top 5 Starters by Average Kickoff Return Yards",
    x = "Player (Team)",
    y = "Average Kickoff Return Yards"
  ) +
  theme_minimal()

The next visualization shows the top 5 kick returners by total return yards. The teams these returners play for don’t have the best defenses, so they probably receive more kickoffs and return the ball more often, which would give them more total yards than everyone else.

top5_returners <- kick_return %>%
  arrange(desc(RETY_Kickoff_Return_Yards)) %>%
  slice_head(n = 5)

# Bar chart
ggplot(top5_returners, aes(x = reorder(Player_Player_on_team, RETY_Kickoff_Return_Yards), 
                           y = RETY_Kickoff_Return_Yards)) +
  geom_col(fill = "darkorange") +
  coord_flip() +
  labs(
    title = "Top 5 Kick Returners by Total Kickoff Return Yards",
    x = "Player (Team)",
    y = "Total Kickoff Return Yards"
  ) +
  theme_minimal()
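One way to check whether high totals come from volume rather than from longer returns is to put returns on a per-game basis. The sketch below assumes the `GP_Games_played` column from the loaded data and uses made-up toy rows. `returns_per_game` and `yards_per_return` are hypothetical column names introduced for this example.

```r
library(dplyr)

# Hypothetical toy rows (not real players).
toy_totals <- data.frame(
  Player_Player_on_team = c("Player A (Team X)", "Player B (Team Y)", "Player C (Team Z)"),
  GP_Games_played = c(10, 11, 8),
  RET_Kickoff_Returns = c(30, 22, 28),
  RETY_Kickoff_Return_Yards = c(780, 600, 700)
)

per_game <- toy_totals %>%
  mutate(
    # Volume: how often the player returns a kick in a typical game.
    returns_per_game = RET_Kickoff_Returns / GP_Games_played,
    # Efficiency: how far each return goes on average.
    yards_per_return = RETY_Kickoff_Return_Yards / RET_Kickoff_Returns
  ) %>%
  arrange(desc(returns_per_game))

per_game
```

A player with many returns per game but an ordinary yards-per-return figure would fit the explanation that total yards are driven by opportunity.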

For the next visualization, I wanted to compare the two groups of kick returners, starters and backups, and see the difference in yardage between them. This could help a team see whether a backup deserves a starting role.

ggplot(kick_return, aes(x = RET_Kickoff_Returns, y = RETY_Kickoff_Return_Yards, color = Role)) +
  geom_point(size = 3, alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, color = "black") +  # regression line
  labs(title = "Total Kickoff Return Yards vs. Number of Kick Returns",
       x = "Number of Kick Returns",
       y = "Total Kickoff Return Yards") +
  theme_minimal()
`geom_smooth()` using formula = 'y ~ x'
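The comparison between the two groups can also be put into numbers with a grouped summary. This is a minimal sketch on made-up toy rows. It assumes the Role column holds the values "Starter" and "Backup", and `role_summary` is a hypothetical name used only here.

```r
library(dplyr)

# Hypothetical toy rows (not real players).
toy_roles <- data.frame(
  Role = c("Starter", "Starter", "Backup", "Backup"),
  RET_Kickoff_Returns = c(30, 26, 10, 5),
  RETY_Kickoff_Return_Yards = c(780, 702, 250, 150)
)

role_summary <- toy_roles %>%
  group_by(Role) %>%
  summarise(
    players = n(),
    total_returns = sum(RET_Kickoff_Returns),
    total_yards = sum(RETY_Kickoff_Return_Yards),
    # Pooled yards per return, so busy returners count for more than rarely used ones.
    yards_per_return = total_yards / total_returns
  )

role_summary
```

Pooling the totals before dividing keeps a backup with only a handful of returns from having the same influence on the group figure as a full-time starter.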

For the last visualization, I wanted to see average kickoff return yardage against the number of returns. Players with fewer returns show more explosive averages, while the starters are grouped more closely together. Backups are very similar to starters, with a little more explosiveness but fewer returns.

ggplot(kick_return, aes(x = RET_Kickoff_Returns, y = AVG_Average_Yards_per_Kickoff_Return, color = Role)) +
  geom_point(size = 3, alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, color = "black") +  # regression line
  labs(title = "Kickoff Return Efficiency vs Attempts",
       x = "Number of Kick Returns",
       y = "Average Yards per Kickoff Return")
`geom_smooth()` using formula = 'y ~ x'

Conclusion

In conclusion, the NFL is a scoring league, and having a kick returner who gets you the best field position is key to winning games. Starting closer to the 40 is better than starting closer to the 30. The kick returner is an extremely underrated position. The teams of the top 5 kick returners, except for one with an injured quarterback, have good records because these kick returners put their offense in position to win games.