Notes on the Bengals Offense ahead of the Super Bowl

KEY POINTS

1) Stop running on first down

2) The Bengals need to protect Joe Burrow

3) Can the Bengals get Jamar Chase involved?

data_2021 <- load_pbp(2021)

# Downloading Rams & Bengals Games
rams <- data_2021 %>% 
    filter(home_team == "LA" | away_team == "LA")

bengals <- data_2021 %>% 
    filter(home_team == "CIN" | away_team == "CIN")

# Bengals & Rams Offense
rams_o <- rams %>% 
    filter(posteam == "LA")

bengals_o <- bengals %>% 
    filter(posteam == "CIN")

# Bengals & Rams Defense
rams_d <- rams %>% 
    filter(posteam != "LA")

bengals_d <- bengals %>% 
    filter(posteam != "CIN")

Why do the Bengals keep running on first down?

Bengals Run Rates on on first-third down:

bengals_o %>% 
    drop_na(rush, pass) %>% 
    filter(down < 4, special_teams_play == 0, penalty == 0, play_type != c("qb_spike","qb_kneel", "no_play") ) %>% 
    group_by(Down = down) %>% 
    summarise(Mean_Epa = mean(epa), Pass_Percentage = mean(pass), 
              Run_Percentage = mean(rush), Attempts = n()) %>% 
    gt() %>% 
    tab_spanner(
        label = "BENGALS PLAY CALL BY DOWN",
        columns = 1:5
    ) %>%
    tab_options(
        table.border.top.color = "white",
        row.striping.include_table_body = FALSE
    ) %>%
    tab_source_note(
        source_note = "SOURCE: NFLFastR"
    ) %>%
    cols_label(
        Down = "Down",
        Mean_Epa = "Mean EPA",
        Pass_Percentage = "Pass %",
        Run_Percentage = "Run %"
    ) %>% fmt_percent(
        columns = c(Pass_Percentage, Run_Percentage),
        rows = 1:3,
          decimals = 1
    ) %>% tab_style(
        style = list(
        cell_fill(color = "#f5f5f5")
        ),
        locations = cells_body(
        columns = c(2,4)
        )
    )
BENGALS PLAY CALL BY DOWN
Down Mean EPA Pass % Run % Attempts
1 -0.061460387 48.0% 50.4% 542
2 -0.036442303 63.2% 35.6% 410
3 -0.006929657 85.2% 14.8% 243
SOURCE: NFLFastR

League Average Numbers

data_2021 %>% 
    drop_na(epa) %>% 
    filter(down < 4, special_teams_play == 0, penalty == 0) %>% 
    group_by(Down = down) %>% 
    summarise(Mean_Epa = mean(epa), Pass_Percentage = mean(pass), 
              Run_Percentage = mean(rush), Attempts = n()) %>% 
    gt() %>% 
    tab_spanner(
        label = "LEAGUE PLAY CALL BY DOWN",
        columns = 1:5
    ) %>%
    tab_options(
        table.border.top.color = "white",
        row.striping.include_table_body = FALSE
    ) %>%
    tab_source_note(
        source_note = "SOURCE: NFLFastR"
    ) %>%
    cols_label(
        Down = "Down",
        Mean_Epa = "Mean EPA",
        Pass_Percentage = "Pass %",
        Run_Percentage = "Run %"
    ) %>% fmt_percent(
        columns = c(Pass_Percentage, Run_Percentage),
        rows = 1:3,
          decimals = 1
    ) %>% tab_style(
        style = list(
        cell_fill(color = "#f5f5f5")
        ),
        locations = cells_body(
        columns = c(2,4)
        )
    )
LEAGUE PLAY CALL BY DOWN
Down Mean EPA Pass % Run % Attempts
1 -0.01942332 50.5% 47.6% 15693
2 -0.01218965 61.3% 37.6% 11760
3 -0.06657715 80.2% 19.1% 7176
SOURCE: NFLFastR

Disclaimer I know these percentages don’t exactly add up. After filtering out penalties, special teams plays, kneels, and spikes they still wouldn’t add up. Not sure what’s going on, but I decided to let it be.

The Bengals are finding significantly more success on second and third down, where the scales tip in favor of a higher pass rate. What’s most interesting here is how effective the Bengals have been on third down compared to the rest of the league, with an 85% pass rate.

How do each series of downs play out, based on first down play call?

# Creating unique series IDs
bengals_o$UniqueSeries <- paste(as.character(bengals_o$game_id),as.character(bengals_o$series))

bengals_o %>% 
    filter(down == 1) %>% 
    filter(play_type == "run" | play_type == "pass") %>% 
    group_by(Play_Type = play_type) %>% 
    summarise(Attempts = n(), Series_Success_Rate = mean(series_success)) %>% 
    #slice(1:8,28:32) %>% 
    gt() %>% 
    tab_spanner(
        label = "SERIES SUCCESS",
        columns = 1:3
    ) %>%
    tab_options(
        table.border.top.color = "white",
        row.striping.include_table_body = FALSE
    ) %>%
    tab_source_note(
        source_note = "SOURCE: NFLFastR"
    ) %>%
    cols_label(
        Play_Type = "Play Type",
        Series_Success_Rate = "Series Success Rate"
    ) %>% fmt_percent(
        columns = 3,
        rows = 1:2,
          decimals = 1
    ) %>% tab_style(
        style = list(
        cell_fill(color = "lightblue")
        ),
        locations = cells_body(
        columns = c(Series_Success_Rate)
        )
    )
SERIES SUCCESS
Play Type Attempts Series Success Rate
pass 259 74.5%
run 280 69.3%
SOURCE: NFLFastR

Here a series “success” is defined as a set of downs that result in either a first down or a touchdown.

It appears that the Bengals are more likely to have an overall advantage in terms of series success, when they pass the ball on first down. Hidden behind this success rate number is also the fact that pass plays tend have a more explosive nature which reduces the number of “successful” series necessary to get points on the board.

Let’s take a look at the distribution in yards gained this year on run plays and pass plays in the NFL

data_2021 %>% 
    filter(play_type == "pass" | play_type == "run") %>% 
    group_by(posteam) %>% 
    summarise(yards_gained, play_type) -> teams_rp
## `summarise()` has grouped output by 'posteam'. You can override using the `.groups` argument.
teams_rp$posteam <- replace(teams_rp$posteam, teams_rp$posteam != "CIN", "NFL")
teams_rp %>% 
    mutate(TeamFct = fct_rev(as.factor(posteam))) %>%
    ggplot(aes(y = posteam)) +
    geom_density_ridges(
        aes(x = yards_gained, fill = play_type), 
        alpha = .7, color = "white", from = -15, to = 45
    ) +
    labs(
        x = "Yards Gained",
        y = "Bengals vs League Distribution",
        title = "Distribution of Yards Gained",
        subtitle = "How much more explosive are pass plays than run plays?",
        caption = "Visualization Inspiration: Tom Mock | Source: NFLFastR"
    ) +
    scale_y_discrete(expand = c(0, 0)) +
    scale_x_continuous(expand = c(0, 0)) +
    scale_fill_cyclical(
        values = c("#FB4F14", "#000000", "#000000", "#000000"),
        name = "Play Type", guide = "legend"
    ) +
    coord_cartesian(clip = "on") +
    theme_light() +
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 12, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))
## Picking joint bandwidth of 1.09

Clearly, running the ball has a lower variance, which naturally allows for fewer negative plays, but also the upside is almost equally limited. Whereas pass plays, in the case of incompletions and sacks allow for negative plays with (marginally) “worse” outcomes but a much more frequent distribution of plays that pick up more than a few yards. The Bengals especially are outperforming the NFL average in terms of frequency of pass plays that go for more than 10 yards or so (and beyond that), while their run game closely resembles the league average. This confirms our theory that a higher success rate when passing on first down is more likely to result in chunk plays than a “success” running the ball. The effect of the negative pass plays we see in the distribution is entirely another discussion.

bengals_o$UniqueDriveId <- paste(bengals_o$game_id,bengals_o$fixed_drive)

drive_sums <- bengals_o %>% 
    filter(special_teams_play == 0) %>% 
    drop_na(yards_gained) %>% 
    group_by(Drive = UniqueDriveId) %>% 
    summarise(Yards_Gained = sum(yards_gained), 
              Rush_Rate = mean(rush_attempt),
              Pass_Rate = mean(pass_attempt),
              Result = unique(fixed_drive_result)) %>% 
    arrange(Yards_Gained)

for (i in 1:nrow(drive_sums)) {
    drive_sums$Drive[i] <- i
    if(drive_sums$Result[i] == "Field goal") {
        drive_sums$Result[i] <- "3"
    } else if(drive_sums$Result[i] == "Touchdown") {
        drive_sums$Result[i] <- "7"
    } else {
        drive_sums$Result[i] <- "0"
    }
}
ggplot(data = drive_sums, aes(Drive, Yards_Gained)) +
    geom_segment(aes(x = Pass_Rate, y = 0, xend = Pass_Rate, yend = Yards_Gained, 
                     color = Result, 
                     alpha = 0.6, size = .005))+
    guides(size = FALSE, alpha = FALSE)+
    scale_color_manual(values = c( "#d3d3d3","#000000","#FB4F14")) +
    theme_light()+
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 10, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))+
    labs( x = "Pass Rate",
          y = "Yards Gained",
          title = "Relationship Between Pass Rate and Drive Outcomes",
          subtitle = "Each Segment Represents an Individual Drive")
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

prop.test(x = c(194, 193), n = c(280, 259))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(194, 193) out of c(280, 259)
## X-squared = 1.5695, df = 1, p-value = 0.2103
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.13176932  0.02713611
## sample estimates:
##    prop 1    prop 2 
## 0.6928571 0.7451737
## Proportion of success when passing first doesn't say much definitively

So, why “Establish the Run”?

Points_Scored <- c(16,34,41,15,23,22,41,32,16,31,41,34,22,24,24,17,27,26,19,27)
Cin_Won <- c(0,1,1,1,0,0,1,1,0,0,1,1,0,1,1,0,1,1,1,1)

cin_rushing_o <- bengals_o %>% 
    group_by(week) %>% 
    filter(play_type == "run", wp < 0.95, wp > 0.05) %>% 
    summarise(Rush_EPA = sum(epa)) %>% 
    cbind(Points_Scored, Cin_Won)
summary(lm(cin_rushing_o$Points_Scored ~ cin_rushing_o$Rush_EPA))
## 
## Call:
## lm(formula = cin_rushing_o$Points_Scored ~ cin_rushing_o$Rush_EPA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.157  -6.009  -1.616   5.689  14.425 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             27.1001     2.1625  12.532  2.5e-10 ***
## cin_rushing_o$Rush_EPA   0.1405     0.2747   0.511    0.615    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.626 on 18 degrees of freedom
## Multiple R-squared:  0.01432,    Adjusted R-squared:  -0.04044 
## F-statistic: 0.2615 on 1 and 18 DF,  p-value: 0.6153
ggplot(data = cin_rushing_o, aes(x = Rush_EPA, y= Points_Scored))+
    geom_point(aes(color = factor(Cin_Won)))+
    guides(color = FALSE)+
    scale_color_manual(values=c("#000000", "#FB4F14"))+
    theme_light()+
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 12, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))+
    xlab("Cumulative Rush EPA")+ylab("Points Scored")+
    ggtitle("Relationship between Rush EPA and Points Scored", 
            subtitle = "Garbage time (5% < WP% < 95%) excluded  |  Orange Dots indicate Bengals Win")+
    stat_regline_equation(label.y = 40, label.x = -15, aes(label = ..rr.label..))
## Warning: `guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> =
## "none")` instead.

Upon examination, especially once we remove runs that occurred with the game far out of hand, it’s clear that the Bengals’ ability to establish the run has about zero predictive ability in their overall offensive performance or in their ability to win games.

How will the Bengals’ pass attack be affected by the Rams defensive line?

schedule_21 <- fast_scraper_schedules(2021)

bengals_21 <- schedule_21 %>% 
    filter(away_team == "CIN" | home_team == "CIN")

bengals_21$CIN_score <- if_else(bengals_21$home_team == "CIN", 
        bengals_21$CIN_score <- bengals_21$home_score, bengals_21$CIN_score <- bengals_21$away_score)

bengals_21$Opponent <- if_else(bengals_21$home_team == "CIN", 
        bengals_21$Opponent <- bengals_21$away_team, bengals_21$Opponent <- bengals_21$home_team)

# Removing row for super bowl week to prepare for column bind
bengals_21 <- bengals_21[-21,]

bengals_21 <- cbind(bengals_21, Cin_Won)

bengals_21 <- bengals_21 %>% 
    group_by(Opponent) %>% 
    summarise(PPG = mean(CIN_score), "Win %" = mean(Cin_Won))
pressure_stats_nfl <- data_2021 %>% 
    filter(play_type == "pass") %>% 
    drop_na(qb_hit, sack, qb_epa) %>% 
    group_by(defteam) %>% 
    summarize(QB_Hit_Rate = mean(qb_hit), Sack_Rate = mean(sack), QB_EPA = mean(qb_epa)) %>% 
    mutate(Pressure_Rankings = dense_rank(Sack_Rate*-1)) %>% # *(-1) to make rankings work
    arrange(Pressure_Rankings)

pressure_stats_nfl <- pressure_stats_nfl[-33,]

pressure_stats_nfl <- left_join(pressure_stats_nfl, teams_colors_logos, by = c("defteam" = "team_abbr"))

attach(pressure_stats_nfl)
summary(lm(QB_EPA~Sack_Rate))
## 
## Call:
## lm(formula = QB_EPA ~ Sack_Rate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.17198 -0.04450 -0.01189  0.05734  0.14432 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  0.25654    0.07189   3.569  0.00123 **
## Sack_Rate   -3.68485    1.12965  -3.262  0.00276 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08043 on 30 degrees of freedom
## Multiple R-squared:  0.2618, Adjusted R-squared:  0.2372 
## F-statistic: 10.64 on 1 and 30 DF,  p-value: 0.002759
summary(lm(QB_EPA~QB_Hit_Rate))
## 
## Call:
## lm(formula = QB_EPA ~ QB_Hit_Rate)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.150226 -0.069694  0.000401  0.040391  0.163918 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   0.3035     0.1046   2.901   0.0069 **
## QB_Hit_Rate  -1.9388     0.7253  -2.673   0.0120 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08413 on 30 degrees of freedom
## Multiple R-squared:  0.1924, Adjusted R-squared:  0.1654 
## F-statistic: 7.146 on 1 and 30 DF,  p-value: 0.01204
detach(pressure_stats_nfl)
pressure_stats_CIN <- bengals_o %>% 
    filter(play_type == "pass") %>% 
    drop_na(qb_hit, sack) %>% 
    group_by(Opponent = defteam) %>% 
    summarize(QB_Hit_Rate = mean(qb_hit), Sack_Rate = mean(sack), QB_EPA = mean(qb_epa))

pressure_stats_CIN <- cbind(pressure_stats_CIN, bengals_21)
pressure_stats_CIN <- pressure_stats_CIN[,-5] # removing 2nd opponent column
pressure_stats_CIN <- left_join(pressure_stats_CIN, pressure_stats_nfl, by =c("Opponent" = "defteam"))
pressure_stats_CIN <- left_join(pressure_stats_CIN, teams_colors_logos, by = c("Opponent" = "team_abbr")) %>% 
    select(-QB_Hit_Rate.x,-Sack_Rate.x,-QB_EPA.x) %>% 
    rename("QB_Hit_Rate" = "QB_Hit_Rate.y", "Sack_Rate" = "Sack_Rate.y","QB_EPA" = "QB_EPA.y", 
           "team_logo_espn" = "team_logo_espn.y" )
summary(lm(pressure_stats_CIN$QB_EPA~pressure_stats_CIN$Sack_Rate))
## 
## Call:
## lm(formula = pressure_stats_CIN$QB_EPA ~ pressure_stats_CIN$Sack_Rate)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.10440 -0.06869  0.01167  0.06405  0.11381 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                   0.26538    0.09609   2.762   0.0162 *
## pressure_stats_CIN$Sack_Rate -3.27766    1.44183  -2.273   0.0406 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0773 on 13 degrees of freedom
## Multiple R-squared:  0.2844, Adjusted R-squared:  0.2294 
## F-statistic: 5.168 on 1 and 13 DF,  p-value: 0.04062
summary(lm(pressure_stats_nfl$QB_EPA~pressure_stats_nfl$QB_Hit_Rate))
## 
## Call:
## lm(formula = pressure_stats_nfl$QB_EPA ~ pressure_stats_nfl$QB_Hit_Rate)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.150226 -0.069694  0.000401  0.040391  0.163918 
## 
## Coefficients:
##                                Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                      0.3035     0.1046   2.901   0.0069 **
## pressure_stats_nfl$QB_Hit_Rate  -1.9388     0.7253  -2.673   0.0120 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.08413 on 30 degrees of freedom
## Multiple R-squared:  0.1924, Adjusted R-squared:  0.1654 
## F-statistic: 7.146 on 1 and 30 DF,  p-value: 0.01204
theme_538 <- function(base_size = 12, base_family = "Chivo") {
  theme_grey(base_size = base_size, base_family = base_family) %+replace%
    theme(
      # drop minor gridlines and axis-ticks
      panel.grid.minor = element_blank(),
      axis.ticks = element_blank(),
      # change font elements/family
      text = element_text(family = base_family, size = base_size),
      axis.text = element_text(face = "bold", color = "grey", size = base_size),
      axis.title = element_text(face = "bold", size = rel(1.33)),
      axis.title.x = element_text(margin = margin(0.5, 0, 0, 0, unit = "cm")),
      axis.title.y = element_text(margin = margin(0, 0.5, 0, 0, unit = "cm"), angle =90),
      plot.title = element_text(face = "bold", size = rel(1.67), hjust = 0),
      plot.title.position = "plot",
      plot.subtitle = element_text(size = 16, margin = margin(0.2, 0, 1, 0, unit = "cm"), hjust = 0),
      plot.caption = element_text(size = 10, margin = margin(1, 0, 0, 0, unit = "cm"), hjust = 1),
      # change plot colors for the background/panel elements
      plot.background = element_rect(fill = "#f0f0f0", color = NA),
      panel.background = element_rect(fill = "#f0f0f0", color = NA),
      panel.grid.major =  element_line(color = "#d0d0d0"),
      panel.border = element_blank(),
      # shrinks margin and simplify the strip/facet elements
      plot.margin = margin(0.5, 1, 0.5, 1, unit = "cm"),
      strip.background = element_blank(),
      strip.text = element_text(size = rel(1.33), face = "bold")
    )
}

Where do the Rams rank among the NFL’s leading pass rushes?

pressure_stats_nfl %>% 
    ggplot(aes(x = Sack_Rate, y =  QB_EPA)) +
    geom_image(aes(image = team_logo_espn), size = 0.07, asp = 16 / 9) +
    gghighlight(defteam == "LA",label_key = defteam)+
    theme_light()+
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 12, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))+
    labs( x = "Sack Rate",
          y = "Mean QB EPA Allowed",
          title = "How Does Pressure Affect QB Play?",
          subtitle = "Where do the Rams stack up?")

Among NFL defenses, the Rams are amongst league leaders in the frequency that they sack the quarterback. We can also tell that the Rams do a good job of limiting opposing QBs’ production. If the Bengals are to stay in this game, Joe Burrow obviously needs to show up.

How have the Bengals fared against the NFL’s best pass rushing defenses?

# Bengals Relationship of Sack Rate and QB_EPA
pressure_stats_CIN %>% 
    ggplot(aes(x = Sack_Rate, y =  QB_EPA)) +
    geom_image(aes(image = team_logo_espn), size = 0.07, asp = 16 / 9) +
    gghighlight(Pressure_Rankings <= 10, label_key = Pressure_Rankings)+
    ##geom_point() +
    theme_light()+
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 12, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))+
    labs( x = "Defensive Team Sack Rate vs Bengals",
          y = "Bengals QB EPA",
          title = "Bengals' QB Play Under Pressure",
          subtitle = "Teams that rank in the season-long top 10 in sack rate are highlighted")

Here, QB_EPA (from my understanding) is defined as the EPA of plays the QB was involved in. We define that as QB runs/scrambles and any drop back to pass, regardless of the outcome (sack, scramble, incompletion, etc). This gives us a pretty good sense of how well a QB performed in isolation from the entire offense, which usually is a good indicator for overall offensive success, but I digress. I’m choosing to evaluate QB EPA here to build off my last point, that I think it’s crucial to the Bengals’ success that the pass game is the foundation of their offensive gameplan on downs 1-3.

It seems clear that the Bengals are vulnerable against the NFL’s elite fronts. The Bengals have struggled against the NFL’s best fronts in terms of sack rate and overall QB Play. The Rams, being among league leaders in sack rate, pose a threat to an offense that allowed 9 sacks in their matchup with the Titans.

Ramsey v Chase

How important is Jamar Chase to the Bengals’ offensive Success?

Chase_o <- bengals_o %>% 
    filter(receiver_jersey_number == 1) %>% 
    group_by(Week = week) %>% 
    summarise(Jamar_Chase_yds = sum(yards_gained), Offensive_EPA = mean(epa))# %>% 
    #cbind(Cin_Won)

ggplot(data= Chase_o, aes(x = Jamar_Chase_yds, y = Offensive_EPA)) +
    geom_point()+
    #guides(color = FALSE)+
    #scale_color_manual(values=c("#000000", "#FB4F14"))+
    geom_smooth(method = "lm" , color = "#FB4F14", alpha = 0.5, se = F)+
    theme_light()+
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 12, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))+
    xlab("Jamar Chase Recieving Yards")+ylab("Mean Offensive EPA")+
    ggtitle("Relationship Jamar Chase Recieving Production and Bengals' Offensive Performance", 
            subtitle = "Getting Jamar Chase going against Jalen Ramsey may be essential")
## `geom_smooth()` using formula 'y ~ x'

From this plot we can say that Jamar Chase’s performance is a bit of an indicator for the Bengals’ success on offense as a whole, but we can’t really draw any major conclusions here. But, for fun, let’s look at his matchup versus Rams’ star corner, Jalen Ramsey.

Shoutout to Jonas Trostle, from Open Source Football, for the code to compare teams ability to cover WR1s. Read his article for a better explanation of this code than I could ever provide.

pbp <- data_2021 %>%
  select(
    desc,
    posteam,
    defteam,
    receiver,
    receiver_id,
    epa,
    cpoe,
    pass,
    qb_epa,
    air_yards,
    yards_gained,
    penalty
  ) %>%
  filter(penalty == 0,!is.na(epa)) 

roster <- fast_scraper_roster(2021)

pbp2 <- pbp %>%
  select(
    desc,
    posteam,
    defteam,
    receiver,
    receiver_id,
    epa,
    cpoe,
    pass,
    qb_epa,
    air_yards,
    yards_gained
  ) %>%
  filter(!is.na(receiver_id))

pbp3 <- pbp2 %>%
  left_join(roster, by = c("receiver_id" = "gsis_id"))
pbp4 <- pbp3 %>%
  # we keep only the wide receivers, no tight ends or running backs
  filter(position == "WR") %>% 
  group_by(receiver_id) %>%
  # we sum up all the receivers air yards when targeted
  mutate(targeted_air_yards = sum(air_yards)) %>%
  ungroup() %>%
  # distinct allows us to keep only one row per player
  distinct(receiver_id, .keep_all = T) %>%
  # we want to rank receivers by team, not overall, so we group by offense
  group_by(posteam) %>%
  # since we want to rank the receivers, arrange allows us to order them by some
  # criterion
  arrange(-targeted_air_yards) %>%
  # this index is now the within-team rank of each wide receiver that year
  mutate(index = row_number()) %>%
  ungroup() %>%
  select(receiver_id, targeted_air_yards, index)

pbp2 %>%
  left_join(pbp4) -> pbp5
## Joining, by = "receiver_id"
pbp5 %>%
  filter(!is.na(index)) -> pbp6

pbp6 %>%
  mutate(bernoulli = if_else(index == 1, 1,0)) -> pbp7

pbp7 %>%
  group_by(defteam, bernoulli) %>%
  summarise(epa = mean(epa)) %>%
  ungroup() -> pbp8
## `summarise()` has grouped output by 'defteam'. You can override using the `.groups` argument.
pbp8 %>%
  pivot_wider(names_from = bernoulli,
              values_from =  epa,
              names_prefix = "wr_") -> pbp9

pbp9 %>%
  left_join(teams_colors_logos, by = c('defteam' = 'team_abbr')) -> pbp10
pbp10 %>% 
    ggplot(aes(x = wr_0 , y =  wr_1)) +
    geom_image(aes(image = team_logo_espn), size = 0.07, asp = 16 / 9) +
    gghighlight(defteam == "LA",label_key = defteam)+
    theme_light()+
    theme(plot.title = element_text(color="black", size=8, face="bold"))+
    theme(plot.title = element_text(size = 12, face = "bold"),
        plot.subtitle = element_text(size = 8))+
    theme(plot.background = element_rect(fill = "gray97"))+
    theme(panel.background = element_rect(fill = "gray97"))+
    labs( x = "EPA Given up to All Other Recivers",
          y = "EPA Given up to WR1s",
          title = "How well do NFL Defenses guard WR1s?",
          subtitle = "Maybe the Rams are Vulnerable to the Burrow-Chase Connection")

It looks like the Rams have been vulnerable against teams’ top wideouts, but have shutdown their supporting casts. I’m curious how often Ramsey will follow Chase. He has been used all over the place this year, so it’s not a given. I’m curious how much this has to do with a potential Ramsey regression we’ve missed or if it’s a byproduct of Ramsey’s use all over the defense. After some exploratory data analysis, the mean epa of other wide recievers (other than the WR1) is not very indicative of opposing offenses’ mean epa per game, so it’s hard to say that getting Boyd and Higgins involved is a key to this game.

Moral of the story here, I don’t know what this means for the Bengals’ offense, but I do believe the Jalen Ramsey effect on Jamar Chase is overstated and I’m taking the over on Jamar Chase’s yards total.