Assignment 3

Introduction

Players like Devin Hester, Percy Harvin, and Darren Sproles made livings off mastering the art of the punt return. It’s a play that requires keen field vision and awareness. The sheer chaotic nature of the play makes it a unique one to analyze. All players are moving towards one ball carrier either with the goal of protecting him or tackling him. By measuring the distance of each player to the ball carrier, we can assess the returner’s ability to decisively pick the msot efficient route to the endzone.

Description of Project

Using tracking data from the 2020 season, plays data, and game data we were able to remodel the traffic meter used to analyze the Isaiah McKenzie punt return touchdown against the Miami Dolphins. We used an “ifelse” statment to ensure that only players who were of the opposing team, were in front of the returner, and were within a 10 yard distance vertically of the returner. We deemed defenders ahead of and within 10 yards of the returner as “true” traffic. We then found the euclydian distance of those defenders and visualized it below.

Data Visualization

We created a trellis chart to compare Professor Tallon’s distance and the distance we calculated. There is little to no overlap between the two charts. Our distance peaks once while Professor Tallon’s peaks twice. As you will see in our t-test, our p-value sits well below .05 so the results are statistically significant. Tallon’s chart seems to imply that McKenzie experienced a great deal of traffic for the majority of the play given that the distance frequency spikes at about 17 yards. In reality, McKenxie was given ample running room and followed his blockers efficiently on his way to the endzone which our chart implies as the majority of the play, opponents were about 50 yards from him.

setwd("U:/")

library("ggplot2")
library("ggalt")

## Registered S3 methods overwritten by 'ggalt':
##   method                  from   
##   grid.draw.absoluteGrob  ggplot2
##   grobHeight.absoluteGrob ggplot2
##   grobWidth.absoluteGrob  ggplot2
##   grobX.absoluteGrob      ggplot2
##   grobY.absoluteGrob      ggplot2

library("ggforce")
library("hms")
library("gganimate")
library("data.table")
library("dplyr")

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:data.table':
## 
##     between, first, last

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library("nflfastR")
library("gifski")
library("png")
library("ggimage")

df_tracking <- fread("Data/NFLBDB2022/tracking2020.csv")
df_plays    <- fread("plays.csv")
df_games    <- fread("games.csv")

# you are creating a metric for a player in a play (playId) in a game (gameId)
my_gameId <- 2021010300
my_playId <- 1586 
my_playerNumber <- 19

df <- df_tracking %>%
  filter(gameId == my_gameId & playId == my_playId ) %>%
  left_join(df_plays, by = c("playId" = "playId", "gameId" = "gameId")) %>%
  select(x, y, displayName, jerseyNumber, team, gameId, playId, frameId, time, 
         nflId, dis, playDescription, s) %>%
  data.frame()


team1 <- as.character(  df_games[df_games$gameId == my_gameId, "homeTeamAbbr"]  )
team2 <- as.character(  df_games[df_games$gameId == my_gameId, "visitorTeamAbbr"]  )


df_team1 <- teams_colors_logos %>%
  filter(team_abbr == team1)


df_team2 <- teams_colors_logos %>%
  filter(team_abbr == team2)


df_player1 <- df %>%
  filter(jerseyNumber == my_playerNumber) %>%
  select(gameId, playId, frameId, x, y, dis, team) %>%
  data.frame()


df <- df %>%
  left_join(df_player1, by = c( "playId" = "playId", "gameId" = "gameId", "frameId" = "frameId"))
  

ball_carrier_x <- df_player1$x
ball_carrier_y <- df_player1$y

calculate_traffic <- df %>%
  # calculate my distance
  mutate(my_dist = ifelse(team.x != team.y & team.x != "football" & x.x < ball_carrier_x & abs(y.y - y.x ) > 10 ,
                       sqrt( (x.x - x.y)^2 + (y.x - y.y)^2),
                       NA)) %>%
  
  #calculate tallons distance using the code from above
  mutate(tallon_dist = ifelse( team.x != team.y & team.x != "football", 
                        sqrt( (x.x - x.y)^2 + (y.x - y.y)^2), 
                        NA)) %>%
  
  group_by(gameId, playId, frameId) %>%
  
  #calculate the mean
  mutate(my_mean_dist_to_player = mean(my_dist, na.rm = TRUE ),
         tallon_mean_dist_to_player = mean(tallon_dist, na.rm = TRUE),
         my_alpha = 1 - (my_mean_dist_to_player/113.33),
         tallon_alpha = 1 - (tallon_mean_dist_to_player/113.33)) %>%
  ungroup() %>%
  data.frame()

t.test(calculate_traffic$my_mean_dist_to_player, calculate_traffic$tallon_mean_dist_to_player)

## 
##  Welch Two Sample t-test
## 
## data:  calculate_traffic$my_mean_dist_to_player and calculate_traffic$tallon_mean_dist_to_player
## t = 34.443, df = 6906.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  11.65614 13.06303
## sample estimates:
## mean of x mean of y 
##  39.44520  27.08562

suppressWarnings(p_df <- melt(calculate_traffic %>% select(my_dist, tallon_dist) %>% filter(!is.na(tallon_dist)), id = c()))

g1 <- ggplot(data = p_df, aes(x = value, fill = variable)) +
  geom_density(alpha = 0.2) +
  facet_wrap(~variable, scales = "free", nrow = 1)
suppressWarnings(print(g1))

Assignment 3

Evan Alander and David Brown

11/4/2024

Introduction

Description of Project

Data Visualization