Abstract

A quarterback spike is a tactic used by every NFL team, usually in low-clock scenarios, for one purpose: to stop the clock. It shows up in 2-minute drills, before a final field-goal attempt, or ahead of one last offensive play in an unlikely attempt to score. Teams prepare for 2-minute drills and late-game scenarios, so why do they need a spike, which is essentially a deliberate incompletion, to stop the clock? In almost every end-of-game situation, most NFL teams choose to concede a down and spike the ball to set up the next play. It may be unfair to assume that teams don’t evaluate their late-game options, but the pattern got me interested: why do teams spike the ball at all? For the most part, a quarterback spike is a purposeful incompletion that wastes a down in exchange for a stopped clock. In some situations that trade is a necessary evil. However, many teams misjudge the clock and spike the ball because they assume they need more time than they actually do.

Step 1: Load the libraries.

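library(tidyverse)  # dplyr verbs and the %>% pipe
library(glue)
library(caret)
library(nflfastR)   # load_pbp() for play-by-play data
options(scipen = 9999)    # print small p-values in fixed notation
nflreadr::.clear_cache()  # clear cached downloads so the data is re-fetched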

Step 2: Create a data frame for fourth quarter spikes, loading the data from the past 10 years.

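# Fourth-quarter spikes from the 2012-2021 seasons where the offense trails by
# 8 or fewer (the filter also keeps ties and leads)
fourth_quarter_spikes <- load_pbp(2012:2021) %>%
  filter(qb_spike == 1, qtr == 4, score_differential >= -8)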

Step 3: Look at every fourth quarter spike in the past 10 years.

Most spikes cluster just below 0 in both expected points added (EPA) and win probability added (WPA). The red lines mark the average EPA and WPA per spike, and both sit below 0. In other words, over the past 10 years the average spike left the offense with lower expected points and a lower win probability than it had before the spike.
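The plot itself is not reproduced here, so the following is only a minimal sketch of how a chart like the one described could be drawn. It assumes the figure is a scatter of WPA against EPA with red lines at the two means. The six rows are made-up stand-ins for `fourth_quarter_spikes`, and base graphics are used so the sketch runs without downloading any data.

```r
# Toy stand-in for fourth_quarter_spikes; these six rows are made up for illustration.
spikes <- data.frame(
  epa = c(-0.45, -0.10, 0.22, -0.30, 0.05, -0.02),
  wpa = c(-0.040, -0.010, 0.015, -0.025, 0.002, -0.008)
)

# Average EPA and WPA per spike (the values the red lines would mark).
avg_epa <- mean(spikes$epa)
avg_wpa <- mean(spikes$wpa)

# Scatter of WPA against EPA, with dashed zero lines and red mean lines.
plot(spikes$epa, spikes$wpa, pch = 19,
     xlab = "EPA", ylab = "WPA",
     main = "Fourth-quarter spikes (toy data)")
abline(v = 0, h = 0, lty = 2, col = "grey50")
abline(v = avg_epa, h = avg_wpa, col = "red")
```

With the real data, `spikes` would simply be replaced by `fourth_quarter_spikes`.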

Step 4: What variables are important for fourth quarter spikes?

Field-goal score games (score differential of -3 or better), least time remaining first:
## ── nflverse play by play data ──────────────────────────────────────────────────
## ℹ Data updated: 2022-09-27 07:35:02 EDT
## # A tibble: 139 × 8
##    posteam defteam time  yrdln  score_differential     epa       wpa fg_prob
##    <chr>   <chr>   <chr> <chr>               <dbl>   <dbl>     <dbl>   <dbl>
##  1 TEN     HOU     00:01 TEN 4                  -3  0.102  -0.0422   0.00649
##  2 CLE     NE      00:02 NE 40                  -1  0.751   0.000285 0.160  
##  3 IND     JAX     00:02 IND 17                 -3 -0.0312 -0.0253   0.00767
##  4 DET     MIN     00:02 MIN 40                 -3  0.970   0.000205 0.215  
##  5 JAX     MIN     00:02 MIN 44                  0  0.596  -0.0226   0.151  
##  6 DEN     LAC     00:02 MID 50                 -3 -0.0442 -0.00270  0.0777 
##  7 ARI     BUF     00:03 BUF 20                  0  0.582  -0.0148   0.635  
##  8 KC      PIT     00:03 PIT 28                 -3  0.851   0.000264 0.426  
##  9 BAL     LA      00:03 LA 29                   0  0.721  -0.0150   0.470  
## 10 DAL     NO      00:03 DAL 48                 -2  0.0209 -0.00474  0.0677 
## # … with 129 more rows
Field-goal score games, highest field goal probability first:
## # A tibble: 139 × 8
##    posteam defteam time  yrdln  score_differential     epa       wpa fg_prob
##    <chr>   <chr>   <chr> <chr>               <dbl>   <dbl>     <dbl>   <dbl>
##  1 LV      ARI     00:04 ARI 17                 -1  0.222   0.0137     0.765
##  2 NO      CAR     00:04 CAR 15                  0  0.297  -0.00825    0.736
##  3 SEA     CAR     00:05 CAR 13                  0  0.421  -0.00593    0.731
##  4 CAR     NYG     00:06 NYG 25                  0  0.136  -0.00567    0.710
##  5 KC      WAS     00:08 WAS 25                  0  0.0846 -0.0190     0.694
##  6 BAL     CLE     00:04 CLE 13                 -1  0.526  -0.00796    0.690
##  7 KC      MIN     00:04 MIN 26                  0  0.268  -0.0145     0.679
##  8 ARI     SEA     00:03 SEA 26                 -3  0.356   0.000265   0.677
##  9 DAL     ATL     00:05 ATL 28                 -2  0.352   0.00986    0.654
## 10 BAL     PIT     00:14 PIT 24                 -3 -0.135  -0.00664    0.649
## # … with 129 more rows
Field-goal score games, lowest field goal probability first:
## # A tibble: 139 × 8
##    posteam defteam time  yrdln  score_differential      epa      wpa fg_prob
##    <chr>   <chr>   <chr> <chr>               <dbl>    <dbl>    <dbl>   <dbl>
##  1 TEN     HOU     00:01 TEN 4                  -3  0.102   -0.0422  0.00649
##  2 IND     JAX     00:02 IND 17                 -3 -0.0312  -0.0253  0.00767
##  3 BAL     LA      00:04 BAL 30                 -1 -0.00366 -0.0386  0.0143 
##  4 ATL     NE      00:19 ATL 27                  0 -0.0266   0.00700 0.0296 
##  5 MIN     DET     00:06 MIN 44                 -2 -0.0142  -0.00719 0.0394 
##  6 LV      JAX     00:18 LV 31                   0 -0.0714  -0.0527  0.0398 
##  7 DAL     NO      00:03 DAL 48                 -2  0.0209  -0.00474 0.0677 
##  8 DEN     LAC     00:02 MID 50                 -3 -0.0442  -0.00270 0.0777 
##  9 NYJ     NE      00:31 NYJ 31                 -2 -0.0744  -0.0382  0.0832 
## 10 NYJ     TB      00:15 NYJ 45                 -2 -0.00317 -0.00867 0.0850 
## # … with 129 more rows
Touchdown score games (score differential of -4 to -8), least time remaining first:
## # A tibble: 119 × 8
##    posteam defteam time  yrdln  score_differential     epa       wpa td_prob
##    <chr>   <chr>   <chr> <chr>               <dbl>   <dbl>     <dbl>   <dbl>
##  1 BUF     NYJ     00:01 BUF 10                 -7  0.0469 -0.0218   0.00328
##  2 DAL     SF      00:01 SF 24                  -6 -1.88   -0.237    0.0114 
##  3 NO      WAS     00:02 WAS 39                 -8 -0.111  -0.000683 0.0127 
##  4 PHI     TB      00:02 TB 1                   -5 -0.132  -0.0377   0.387  
##  5 CIN     BAL     00:02 CIN 49                 -7 -0.0931 -0.0130   0.0175 
##  6 LAC     NE      00:02 NE 23                  -8 -0.0612 -0.00116  0.0216 
##  7 SF      NYG     00:02 NYG 21                 -4 -0.192   0.00553  0.0198 
##  8 ATL     WAS     00:02 WAS 37                 -4 -0.133   0.00389  0.0128 
##  9 GB      SF      00:03 SF 42                  -6  0.0410 -0.00822  0.00964
## 10 SEA     NO      00:03 NO 10                  -5 -0.0560 -0.0912   0.0696 
## # … with 109 more rows
Touchdown score games, highest touchdown probability first:
## # A tibble: 119 × 8
##    posteam defteam time  yrdln score_differential    epa      wpa td_prob
##    <chr>   <chr>   <chr> <chr>              <dbl>  <dbl>    <dbl>   <dbl>
##  1 SEA     SF      00:23 SF 1                  -5 -0.538 -0.0668    0.767
##  2 TB      PIT     00:35 PIT 5                 -4 -1.03  -0.131     0.691
##  3 PIT     MIN     00:24 MIN 6                 -7 -0.517 -0.0194    0.619
##  4 BAL     ARI     00:18 ARI 4                 -8 -0.427 -0.00802   0.612
##  5 LV      DEN     00:16 DEN 6                 -7  0.309  0.00224   0.557
##  6 JAX     NYJ     00:13 NYJ 1                 -5 -1.17  -0.0986    0.539
##  7 LV      CIN     00:30 CIN 9                 -7 -0.503 -0.0309    0.529
##  8 LAC     KC      00:25 KC 14                 -7 -0.311 -0.0246    0.502
##  9 TEN     ARI     00:18 ARI 8                 -7 -0.431 -0.0249    0.488
## 10 SF      LA      00:36 LA 19                 -7  0.947  0.0341    0.471
## # … with 109 more rows
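The code behind these tables is not shown. Judging by the row order, they look like the same two subsets viewed three ways: least time remaining first, highest probability first, and lowest probability first. Here is a minimal sketch of those three orderings, using a made-up four-row data frame (team codes and values are hypothetical) in place of the real field-goal subset.

```r
# Toy stand-in for the field-goal subset; rows are made up for illustration.
fg_spikes <- data.frame(
  posteam = c("AAA", "BBB", "CCC", "DDD"),
  time    = c("00:04", "00:01", "00:12", "00:03"),
  epa     = c(0.30, 0.10, -0.07, 0.58),
  fg_prob = c(0.74, 0.01, 0.08, 0.64),
  stringsAsFactors = FALSE
)

# View 1: least time remaining first ("MM:SS" strings sort correctly as text).
by_time <- fg_spikes[order(fg_spikes$time), ]

# View 2: highest field goal probability first.
by_prob_desc <- fg_spikes[order(-fg_spikes$fg_prob), ]

# View 3: lowest field goal probability first.
by_prob_asc <- fg_spikes[order(fg_spikes$fg_prob), ]

print(by_prob_desc)
```

The touchdown tables would follow the same pattern with `td_prob` in place of `fg_prob`.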

If a team is down by a field goal with little time left and a high field goal probability, it makes perfect sense to spike the ball to set up a kick that wins or ties the game. EPA agrees: setting up a chance to win the game is beneficial. Spiking with a low field goal probability is impractical, because there is no benefit to throwing a purposeful incompletion. For this reason I chose to focus on EPA rather than WPA, since EPA better represents a “good” spike.

If a team is down by a touchdown and spikes the ball, both EPA and WPA tend to be negative. In the last group, you can see that choosing to spike the ball with such a high touchdown probability is bad in terms of EPA. There’s plenty of time to run 4 plays, especially since most of these teams need fewer than 10 yards and a throw into the end zone doesn’t take long. All of these teams spiked the ball to stop the clock when the clock was not going to be a problem. In doing so they lowered their chances of a touchdown and raised the likelihood of a turnover on downs. The biggest problem is that they gave up an extra chance at the end zone, and one extra throw in a set of 4 downs could make the difference. A spike lets the defense reset and hands it 1 successful stop. Instead of 4 chances at the end zone the offense now has only 3, even though time is not the constraint, and what could have been 3rd down becomes 4th down.
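To put a rough number on the cost of going from 4 chances to 3, here is a small illustration. It assumes every play has the same, independent chance of scoring, which is a simplification, and the 25% per-play figure is hypothetical rather than taken from the data.

```r
# Chance of at least one touchdown in `attempts` tries, assuming each try
# succeeds independently with probability `p` (a simplification).
p_score <- function(p, attempts) 1 - (1 - p)^attempts

p_per_play <- 0.25  # hypothetical per-play touchdown chance, not from the data

four_downs  <- p_score(p_per_play, 4)
three_downs <- p_score(p_per_play, 3)

# Scoring chance with 4 tries, with 3 tries, and the amount given up by the spike.
round(c(four = four_downs, three = three_downs, lost = four_downs - three_downs), 3)
```

Under these assumptions the spike costs roughly 10 percentage points of scoring chance (about 68% with 4 tries against about 58% with 3).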

How many spikes have a negative EPA based on endgame scenarios?

Field-goal score games
Negative spikes: 83
Positive spikes: 56
Share negative: 59.71223%

Touchdown score games
Negative spikes: 105
Positive spikes: 14
Share negative: 88.23529%

In field goal score games, 83 of 139 spikes (just below 60%) had a negative EPA. In touchdown score games, 105 of 119 spikes (about 88%) had a negative EPA.
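The percentages follow directly from the counts above. This short check recomputes them and adds the combined share across both groups; the variable names are mine.

```r
# Counts reported above for each group.
neg <- c(field_goal = 83, touchdown = 105)
pos <- c(field_goal = 56, touchdown = 14)

# Share of spikes with negative EPA within each group.
pct_negative <- 100 * neg / (neg + pos)
pct_negative

# Share of negative-EPA spikes across both groups combined (188 of 258).
pct_overall <- 100 * sum(neg) / sum(neg + pos)
pct_overall
```

The per-group shares come out to about 59.7% and 88.2%, and the combined share to about 72.9%.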

What does a spike look like when it sets up a field goal to win the game?
## # A tibble: 89 × 9
##    posteam defteam posteam_score defteam_score time  yrdln     epa fg_prob
##    <chr>   <chr>           <dbl>         <dbl> <chr> <chr>   <dbl>   <dbl>
##  1 LV      ARI                20            21 00:04 ARI 17 0.222    0.765
##  2 NO      CAR                31            31 00:04 CAR 15 0.297    0.736
##  3 SEA     CAR                27            27 00:05 CAR 13 0.421    0.731
##  4 CAR     NYG                35            35 00:06 NYG 25 0.136    0.710
##  5 KC      WAS                20            20 00:08 WAS 25 0.0846   0.694
##  6 BAL     CLE                20            21 00:04 CLE 13 0.526    0.690
##  7 KC      MIN                23            23 00:04 MIN 26 0.268    0.679
##  8 DAL     ATL                37            39 00:05 ATL 28 0.352    0.654
##  9 ARI     BUF                16            16 00:03 BUF 20 0.582    0.635
## 10 DEN     LAC                20            22 00:04 LAC 16 0.576    0.628
## # … with 79 more rows, and 1 more variable: score_differential <dbl>

With a high field goal probability, a tie or a deficit of fewer than 3 points, and little time remaining, EPA tends to be positive. This makes a lot of practical sense in game-time scenarios. If you can, it makes perfect sense to spike the ball (in effect, throw an incompletion) before a field goal to set up a chance to win the game.

Step 5: Is there a way to make a multiple linear regression model to predict the EPA of a spike?

I want to create 2 different models: one for field goal score games (a score differential of -3 or better) and one for touchdown score games (a score differential of -4 to -8). First, each model needs its own data frame, named fg_model and td_model.

# Field-goal score games: score differential of -3 or better
fg_model <- fourth_quarter_spikes %>%
  filter(score_differential >= -3) %>%
  select(epa, game_seconds_remaining, down, yardline_100, posteam_timeouts_remaining, 
         score_differential, fg_prob)
# Touchdown score games: trailing by 4 to 8
td_model <- fourth_quarter_spikes %>%
  filter(score_differential <= -4) %>%
  select(epa, game_seconds_remaining, down, yardline_100, posteam_timeouts_remaining, 
         score_differential, td_prob)
# Regress EPA on every other column; each name is reused, so the data frame
# is replaced by its fitted model
fg_model <- lm(epa ~ ., data = fg_model)
td_model <- lm(epa ~ ., data = td_model)

Here are the results!

Predictive EPA for a Field-Goal Situation

## 
## Call:
## lm(formula = epa ~ ., data = fg_model)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.72795 -0.10869 -0.02808  0.11767  0.83195 
## 
## Coefficients:
##                             Estimate Std. Error t value      Pr(>|t|)    
## (Intercept)                -0.188510   0.124201  -1.518      0.131461    
## game_seconds_remaining     -0.009314   0.001472  -6.326 0.00000000361 ***
## down                       -0.103985   0.037736  -2.756      0.006687 ** 
## yardline_100                0.006633   0.001685   3.935      0.000134 ***
## posteam_timeouts_remaining -0.273009   0.054400  -5.019 0.00000165020 ***
## score_differential          0.012075   0.017744   0.681      0.497379    
## fg_prob                     1.020982   0.167045   6.112 0.00000001036 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2352 on 132 degrees of freedom
## Multiple R-squared:  0.5303, Adjusted R-squared:  0.5089 
## F-statistic: 24.84 on 6 and 132 DF,  p-value: < 0.00000000000000022

To spike the ball beneficially in a field goal score game, a team needs a high field goal probability, which in practice means good field position. (Once fg_prob is in the model, the separate yardline_100 coefficient is small and slightly positive.) The model suggests not spiking the ball with a lot of time left or on a later down, because the margin for error grows larger with more time and a later down. The -0.009314 and -0.103985 coefficients for seconds remaining and down both lower the predicted EPA of a spike. So a spike on 4th down with 40 seconds left has a considerably lower predicted EPA than one on 1st down with 10 seconds left. Each remaining timeout lowers the predicted EPA by a further 0.27, which fits the football logic: if a team has at least 1 timeout, there is no need to spike the ball.
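As a worked example of that comparison, the gap between the two situations can be read straight off the printed coefficients. This assumes every other input (field position, timeouts, score, field goal probability) is held fixed, which is a simplification since fg_prob would normally move with the situation.

```r
# Coefficients copied from the field-goal model summary above.
b_seconds <- -0.009314
b_down    <- -0.103985

# 4th down with 40 seconds left versus 1st down with 10 seconds left,
# all other inputs held equal.
delta_epa <- b_seconds * (40 - 10) + b_down * (4 - 1)
delta_epa
```

The result is a drop of roughly 0.59 expected points, split almost evenly between the extra 30 seconds (about -0.28) and the three later downs (about -0.31).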

Predictive EPA for a Touchdown Situation

## 
## Call:
## lm(formula = epa ~ ., data = td_model)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.80719 -0.05991  0.01162  0.07509  1.28515 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                -0.073993   0.147102  -0.503   0.6159  
## game_seconds_remaining     -0.001639   0.001963  -0.835   0.4055  
## down                       -0.068575   0.055206  -1.242   0.2168  
## yardline_100                0.001468   0.001978   0.742   0.4596  
## posteam_timeouts_remaining  0.042754   0.119485   0.358   0.7212  
## score_differential         -0.007623   0.019941  -0.382   0.7030  
## td_prob                    -0.462032   0.251410  -1.838   0.0687 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2835 on 112 degrees of freedom
## Multiple R-squared:  0.1419, Adjusted R-squared:  0.09592 
## F-statistic: 3.087 on 6 and 112 DF,  p-value: 0.007799

In a touchdown score game, the odds are already stacked against the offense. In that situation, the model suggests that the more likely a touchdown is, the worse a spike becomes. Spiking the ball, which amounts to throwing an incompletion, increases the likelihood of a turnover on downs and lowers the likelihood of a touchdown. The model is definitely flawed, though: timeouts remaining carries a positive coefficient even though, as stated, a team with any timeouts has no real reason to spike. It is also a weak fit overall, with an R-squared of 0.14 and no coefficient significant at the 5% level.

Let’s say that a team has the ball on 1st down with 30 seconds to go on the opponent’s 8-yard line, down 1. What’s the predicted EPA? Field goal probability is estimated at 70%.

##         1 
## 0.2078993

What about if they were down by 5 points instead of 1? Touchdown probability is estimated at 50%.

##          1 
## -0.4491277
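The call that produced these two predictions is not shown, so the sketch below rebuilds them by hand from the printed coefficients. Two inputs are my assumptions: the scenarios do not state timeouts, and the printed values are closely matched only with 0 timeouts and with the deficit entered as a positive number (1 and 5). In the play-by-play data `score_differential` is negative for a trailing team, so the sketch also shows the same scenarios with -1 and -5. `predict_epa` is a hypothetical helper, not part of any package.

```r
# Coefficients copied from the two summary() printouts above (6 decimals).
fg_coef <- c(intercept = -0.188510, seconds = -0.009314, down = -0.103985,
             yardline = 0.006633, timeouts = -0.273009,
             score_diff = 0.012075, prob = 1.020982)
td_coef <- c(intercept = -0.073993, seconds = -0.001639, down = -0.068575,
             yardline = 0.001468, timeouts = 0.042754,
             score_diff = -0.007623, prob = -0.462032)

# Linear prediction: intercept plus the coefficient-weighted inputs.
predict_epa <- function(coef, seconds, down, yardline, timeouts, score_diff, prob) {
  x <- c(1, seconds, down, yardline, timeouts, score_diff, prob)
  sum(coef * x)
}

# Assumed inputs: no timeouts, deficit entered as a positive number.
fg_pos <- predict_epa(fg_coef, 30, 1, 8, 0, 1, 0.70)
td_pos <- predict_epa(td_coef, 30, 1, 8, 0, 5, 0.50)

# Same scenarios with the play-by-play sign convention (negative when trailing).
fg_neg <- predict_epa(fg_coef, 30, 1, 8, 0, -1, 0.70)
td_neg <- predict_epa(td_coef, 30, 1, 8, 0, -5, 0.50)

round(c(fg_pos = fg_pos, fg_neg = fg_neg, td_pos = td_pos, td_neg = td_neg), 4)
```

With positive deficits the sketch gives about 0.208 and -0.449, in line with the output above. With negative deficits it gives about 0.184 and -0.373. The direction of the conclusion is the same either way: the field goal spike is predicted to help and the touchdown spike to hurt. The two later scenarios in the Conclusion are reproduced under the same assumptions.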

Across these scenarios there’s a drastic difference between a good spike and a bad spike. The biggest reason is that a field goal is so much more likely to succeed than a touchdown. The 16th best NFL team last year in red zone TD% was the Rams at ~59%. Sacrificing a down lowers those odds of a touchdown while increasing the likelihood of a turnover on downs. A spike doesn’t hurt nearly as much when down by a field goal, because a field goal is much more attainable.

In almost any scenario a quarterback spike is impractical. Spikes are used frequently and inefficiently, even though only a select few situations produce positive results. If the purpose of a spike in a field goal score game is to stop the clock in order to kick a field goal, that makes perfect sense. If a team stops the clock with a spike when it still has a timeout, or when a lot of time remains, the decision becomes less reasonable. When down a touchdown, there’s almost never a reason to spike the ball. It might set up one final play, but it never really makes sense to spike the ball and then continue the drive: the team has purposefully thrown an incompletion and still needs to reach the end zone. Spiking the ball with 30 seconds left only stunts the progress of the drive. It forces the offense to reset, lets the defense reset too, and leaves the drive with the same goal it had before the spike. It makes no sense.

Conclusion

Across all of the data from the past 10 years, well over half of the spikes (188 of 258, about 73%) had a negative EPA. About 88% of spikes in touchdown score games had a negative EPA, as did about 60% of spikes in field goal score games. In most scenarios there’s no real reason to spike the ball: there’s no reason to waste a down and throw a purposeful incompletion.

What if a team was down 1 with a 1st and 10 at the opponent’s 40, no timeouts, and 45 seconds remaining? Assume a 30% field goal probability.

##          1 
## -0.1279528

This is another realistic scenario where it doesn’t make sense to increase the likelihood of a turnover on downs and sacrifice the possibility of much better field position, along with expected points and win probability. A spike like this should NEVER happen.

There’s only a small window where a spike is reasonable. A handful of scenarios may produce a positive EPA. The rest of the time, a spike is just a forced incompletion and a loss of down that overestimates the importance of time.

What happens on 1st and goal, with 35 seconds left, 7 yards from the end zone, down 5, with a 70% touchdown probability?

##          1 
## -0.5511967

It won’t take 35 seconds to run 4 plays from the 7-yard line, so it makes no sense to think that time will run out. Teams are so often afraid of the clock that they overrate how much it matters with 35 seconds and 7 yards to go. No set of 3 play calls can make up for the wasted 4th chance at a touchdown. Spikes in these scenarios are terrible: they give up a crucial chance that can win or lose the game.

Quarterback spikes should only be used to set up plays to win or tie a game. Otherwise they serve little purpose other than to hurt the flow of the drive and the offense’s down, distance, field position, touchdown probability, first down probability, expected points, and win probability, all to spare about 5 seconds by throwing an incompletion rather than running a play. Quarterback spikes widen the margin for error and sacrifice the opportunity to find more offensive success, stalling the drive instead. “Football is a game of inches.” Spiking the ball leaves nothing up to chance. That one down could have won the game; it could have been a touchdown, or even a 30-yard gain. A spike eliminates that chance. What that one down, that one chance, could have been will never be known.

Of all the fourth quarter spikes in one-possession games in the last 10 years, about 1 out of every 4 (65 of 258) occurred with more than 30 seconds remaining.

## # A tibble: 1 × 2
##   `n()`     n
##   <int> <int>
## 1    65    65
## [1] 25.1938
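The code behind this count is not shown either. A minimal sketch of the calculation follows, assuming the cutoff is applied to `game_seconds_remaining` and the denominator is every spike in the data frame. The eight rows are made up, and `late_share` is a hypothetical helper.

```r
# Toy stand-in for fourth_quarter_spikes; these rows are made up for illustration.
spikes <- data.frame(game_seconds_remaining = c(4, 12, 45, 31, 30, 8, 95, 2))

# Percentage of spikes taken with more than `cutoff` seconds left.
late_share <- function(df, cutoff = 30) {
  100 * mean(df$game_seconds_remaining > cutoff)
}

toy_share <- late_share(spikes)

# The reported figure, rebuilt from the counts in the post:
# 65 late spikes out of 139 field-goal plus 119 touchdown spikes.
reported_share <- 100 * 65 / (139 + 119)

c(toy = toy_share, reported = reported_share)
```

The reported share works out to about 25.19%, which matches the output above and confirms that the denominator is the full set of 258 spikes.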

That is ridiculous. Time will not run out on four plays with more than 30 seconds left, yet poor spikes occur in this window all the time. There is no reason to waste a down, lower the likelihood of a touchdown, and increase the chance of a turnover on downs. A spike like that should never happen. Teams constantly misjudge the clock and feel pressured to waste a down for no real benefit. These spikes must stop. All they do is take away a chance to gain yardage, lower the odds of success, and raise the chance of failure (a turnover on downs). The only benefit is a stopped clock, which is not a problem with more than 30 seconds left. Teams prepare for 2-minute drills all the time. They do not need to make rash decisions to save clock at the expense of the drive’s overall goal. Teams need to understand the clock and realize that a spike may hurt their chances to score a touchdown or kick a field goal. With that understanding they can make educated decisions, recognizing that even as time ticks down, a spike only lowers the likelihood of a score and increases the likelihood of failing to convert a first down.

Quarterback spikes need to change.