Recently, I was watching an NFL game with my totally real, not make-believe girlfriend. A team was trying to convert a first down on 3rd and 1, and they brought their jumbo personnel in. They snapped the ball, the QB pivoted and handed the ball off to his running back, who went straight into the gap between the center and guard and got stuffed for no gain. Now it was fourth down, and the punt team was trotting out.

“That was stupid,” said my girlfriend, who, while incredibly smart and having played sports all her life, did not grow up watching much American football, preferring soccer instead.

Being better versed in random NFL stuff like 3rd down statistics, I asked, “Why? It’s more successful than passing in that situation.”

“It just seems like everyone knew they were going to run it up the middle,” she replied.

This was an interesting point. Is it more successful to run on 3rd downs because many analyses classify QB scrambles as runs? Should an offense bring out their passing game personnel to fool a defense? Is it really inefficient to run up the middle? All of these questions can be answered using play-by-play data!

As of 2016, Dennis Erny at Armchair Analysis (http://armchairanalysis.com/) has been charting each team’s formation personnel and rush direction for every play. Using these two factors, I wanted to see which run plays were the most efficient and successful in different situations.

To start, I imported the play-by-play data (play), the formation data (chart), and player information (player), and merged these together.

library(tidyverse)  # provides read_csv (readr), the dplyr verbs and %>%, and ggplot2

play <- read_csv("~/Desktop/nfl_00-16/PBP.csv")
chart <- read_csv("~/Desktop/nfl_00-16/CHART.csv")
player <- read_csv("~/Desktop/Football/nfl_00-16/PLAYER.csv")

plays = merge(play[,c(1:28)], chart, by = "pid")
plays = merge(plays, player[,c(1,5)], by.x = "bc", by.y = "player")

My first task was to format the formation data correctly. In the NFL, a formation is classified by a number, with the tens digit holding the number of running backs and the ones digit accounting for the number of tight ends. I also cropped the data to include only rushing plays and removed any plays that had “scramble” in their play-by-play detail. This isn’t perfect because some details won’t include this even on scrambles, but it’s better than not addressing it at all.
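To make the personnel code concrete, here is a minimal base R sketch that splits a code back into its parts. The helper name decode_personnel is hypothetical, and the wide receiver count assumes five eligible skill players besides the QB, which is the usual convention but is not something the charting data states directly.

```r
# Hypothetical helper: split a two-digit personnel code into its components.
# Assumption: 5 skill players besides the QB, so WR = 5 - RB - TE.
decode_personnel <- function(form) {
  form <- as.integer(form)
  rb <- form %/% 10L   # tens digit: running backs
  te <- form %% 10L    # ones digit: tight ends
  wr <- 5L - rb - te   # whatever is left is assumed to be wide receivers
  data.frame(form = form, rb = rb, te = te, wr = wr)
}

# "11" personnel is 1 RB, 1 TE, 3 WR; "22" is 2 RB, 2 TE, 1 WR
decode_personnel(c(11, 12, 21, 22))
```

Sets with an extra offensive lineman are not captured by this arithmetic, so the WR column is only a rough guide for the heaviest groupings.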

The table below shows the number of rushing plays from each of the formation types.

plays$form = plays$rb * 10 + plays$te
run_plays_form = plays[plays$type == "RUSH" & is.na(plays$kne) & !grepl("scramble", plays$detail), ]
run_plays_form$form = as.factor(run_plays_form$form)
run_plays_form %>% group_by(form) %>% summarise(amt = n()) %>% arrange(-amt)
## # A tibble: 16 x 2
##      form   amt
##    <fctr> <int>
##  1     11  6037
##  2     12  3165
##  3     21  1502
##  4     13   755
##  5     22   694
##  6     10   213
##  7     20   204
##  8     23    98
##  9      1    47
## 10     31    26
## 11     14    13
## 12      0     9
## 13      2     9
## 14     30     6
## 15      3     5
## 16     32     5

So there were 16 personnel groupings on record last year. This seems like too many to choose from, especially when we start getting into specific situations and subsetting our data set even further. I’m going to bin these formations into a few different groups: Spread, Pro, Ace, Tight, and Misc.

run_plays = plays[plays$type == "RUSH" & is.na(plays$kne) & !grepl("scramble", plays$detail), ]
run_plays$style = "Misc"
run_plays$style = ifelse(as.numeric(run_plays$form) < 12, "Spread", run_plays$style)
run_plays$style = ifelse(as.numeric(run_plays$form) == 12, "Ace", run_plays$style)
run_plays$style = ifelse(as.numeric(run_plays$form) == 21, "Pro", run_plays$style)
run_plays$style = ifelse((as.numeric(run_plays$form) > 21 | as.numeric(run_plays$form) == 13) & as.numeric(run_plays$form) != 30, "Tight", run_plays$style)
run_plays$form = as.factor(run_plays$form)
run_plays$style = as.factor(run_plays$style)
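The same binning can be written as one small function, which makes it easier to check where a given code lands. This is only a sketch that mirrors the ifelse chain above; the name bin_style is hypothetical and it expects the numeric code, before any conversion to a factor.

```r
# Hypothetical helper reproducing the binning rules from the ifelse chain.
bin_style <- function(form) {
  style <- rep("Misc", length(form))          # default bucket
  style[form < 12] <- "Spread"                # 0, 1, 2, 3, 10, 11
  style[form == 12] <- "Ace"
  style[form == 21] <- "Pro"
  style[(form > 21 | form == 13) & form != 30] <- "Tight"
  style
}

# Codes seen in the formation table and the bucket each one falls into
codes <- c(11, 12, 21, 13, 22, 10, 20, 23, 1, 31, 14, 0, 2, 30, 3, 32)
data.frame(form = codes, style = bin_style(codes))
```

Running it over the 16 codes shows that only 14, 20, and 30 end up in Misc.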

I also want to bin the directions, since I don’t really care about left or right, so I took the direction variable down from RE, RT, RG, MD, LG, LT, LE to Inside, Tackle, and End. I also will be tracking if plays are in shotgun and if the play was successful or not. A success, according to Armchair Analysis, is when the offense gets 40% of the yards to go on 1st down, 60% of the yards to go on second down, and a first down on 3rd/4th down. Success percentage can tell us about consistency instead of something like yards per carry, which is biased to long gains or losses.

run_plays$direction = "Inside"
run_plays$direction = ifelse(run_plays$dir == "LT" | run_plays$dir == "RT", "Tackle", run_plays$direction)
run_plays$direction = ifelse(run_plays$dir == "LE" | run_plays$dir == "RE", "End", run_plays$direction)

run_plays$success = ifelse(is.na(run_plays$succ), 0, 1)
run_plays$shotgun = ifelse(is.na(run_plays$sg), 0, 1)
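For reference, the success definition described above can be sketched as a function of down, yards to go, and yards gained. The analysis itself relies on the succ column that ships with the data, so treat this as an illustration only: the name is_success is hypothetical, and counting a gain exactly at the threshold as a success is an assumption.

```r
# Hypothetical sketch of the success rule described in the text.
# Assumption: a gain exactly equal to the threshold counts as a success.
is_success <- function(dwn, ytg, yds) {
  share <- ifelse(dwn == 1, 0.4, ifelse(dwn == 2, 0.6, 1.0))
  as.integer(yds >= share * ytg)
}

is_success(dwn = 1, ytg = 10, yds = 4)  # 1st and 10, gain of 4
is_success(dwn = 2, ytg = 10, yds = 5)  # 2nd and 10, gain of 5
is_success(dwn = 3, ytg = 1,  yds = 1)  # 3rd and 1, gain of 1
```

Under this rule a first and 10 run needs four yards to count, which is the threshold used when reading the first down tables.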

First Down Runs

In the following tables, I made sure to remove runs from the QB just in case of scrambles. I realize this may remove a few designed runs from players like Cam Newton, but currently, I just want to see what successful plays happen from position players.

run_dir = run_plays %>% group_by(direction) %>% filter(dwn == 1, ytg == 10, pos1 != "QB") %>% summarise(amt =  n(), ypc = mean(yds), succ = mean(success)) %>% arrange(-ypc)
run_style = run_plays %>% group_by(style) %>% filter(dwn == 1, ytg == 10, pos1 != "QB") %>% summarise(amt = n(), ypc = mean(yds), succ = mean(success)) %>% arrange(-ypc)
run_types = run_plays %>% group_by(style, direction) %>% filter(dwn == 1, ytg == 10, pos1 != "QB") %>% summarise(amt = n(), ypc = mean(yds), succ = mean(success)) %>% arrange(-ypc)

Direction Table

knitr::kable(run_dir)
direction amt ypc succ
End 1403 5.044904 0.4839629
Tackle 1568 4.558036 0.4706633
Inside 3224 4.072581 0.4491315

From the table above, on first down, runs to the end were the most efficient (5.04 YPC) and successful (48.4% success rate), while running inside was the least efficient direction (4.07 YPC, 44.9% success rate). However, I think it’s fair to assume that outside runs produce more losses along with more long gains. Let’s look at a density plot to compare.

Direction Density Plot

ggplot(run_plays, aes(yds, fill = direction, colour = direction)) +
  geom_density(alpha = 0.2) + xlim(-10, 40)

Yep, just as I suspected. The density for end runs is higher on rushes of less than zero yards and more than seven yards. Since end runs have a larger standard deviation and a smaller sample size, can we confidently say they’re more efficient and successful than inside runs? Let’s conduct a t-test to find out.

YPC T-test

t.test(run_plays[run_plays$direction == "End" & run_plays$dwn == 1 & run_plays$ytg == 10 & run_plays$pos1 != "QB",]$yds,
       run_plays[run_plays$direction == "Inside" & run_plays$dwn == 1 & run_plays$ytg == 10 & run_plays$pos1 != "QB",]$yds, 
       alternative="greater")
## 
##  Welch Two Sample t-test
## 
## data:  run_plays[run_plays$direction == "End" & run_plays$dwn == 1 &  and run_plays[run_plays$direction == "Inside" & run_plays$dwn ==     run_plays$ytg == 10 & run_plays$pos1 != "QB", ]$yds and     1 & run_plays$ytg == 10 & run_plays$pos1 != "QB", ]$yds
## t = 4.1409, df = 2064.4, p-value = 1.799e-05
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.5859251       Inf
## sample estimates:
## mean of x mean of y 
##  5.044904  4.072581

Success T-test

t.test(run_plays[run_plays$direction == "End" & run_plays$dwn == 1 & run_plays$ytg == 10 & run_plays$pos1 != "QB",]$success,
       run_plays[run_plays$direction == "Inside" & run_plays$dwn == 1 & run_plays$ytg == 10 & run_plays$pos1 != "QB",]$success, 
       alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  run_plays[run_plays$direction == "End" & run_plays$dwn == 1 &  and run_plays[run_plays$direction == "Inside" & run_plays$dwn ==     run_plays$ytg == 10 & run_plays$pos1 != "QB", ]$success and     1 & run_plays$ytg == 10 & run_plays$pos1 != "QB", ]$success
## t = 2.1817, df = 2656.1, p-value = 0.01461
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.008561315         Inf
## sample estimates:
## mean of x mean of y 
## 0.4839629 0.4491315

Using alpha = 0.05, we can reject the null hypothesis in both tests. Thus, we accept the alternative hypotheses that end runs are more efficient (p = 1.8e-5) and more successful (p = 0.015) than inside runs on first down.
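Because success is a 0/1 outcome, a two-sample test of proportions is a natural cross-check on the success t-test. The sketch below is not part of the original analysis: it rebuilds the success counts from the amt and succ columns of the direction table (so the counts are derived values, not raw data) and uses prop.test from base R.

```r
# Cross-check (not in the original analysis): one-sided two-proportion test.
# Counts are reconstructed from the direction table: amt * succ, rounded.
n_end    <- 1403
n_inside <- 3224
x_end    <- round(n_end * 0.4839629)
x_inside <- round(n_inside * 0.4491315)

res <- prop.test(x = c(x_end, x_inside),
                 n = c(n_end, n_inside),
                 alternative = "greater")  # H1: End rate > Inside rate
res$p.value
```

prop.test applies a continuity correction by default, so its p-value will not match the t-test exactly.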

Now let’s look at the formation styles.

Style Table

knitr::kable(run_style)
style amt ypc succ
Spread 2845 4.518102 0.4752197
Pro 785 4.468790 0.4662420
Misc 116 4.413793 0.4913793
Ace 1788 4.322148 0.4451902
Tight 661 4.164902 0.4447806

This supports another hypothesis: using a formation to spread the field leads to more efficient runs, even on first down, although the gaps between styles are small. The “Misc” formation has the highest success rate, but its sample is small compared to the rest of the styles (only 1.9% of all first down runs). Finally, let’s group the runs by both direction and style.

Style/Direction Tables

knitr::kable(run_types)
style direction amt ypc succ
Tight End 141 6.248227 0.5319149
Misc Tackle 34 5.411765 0.5000000
Spread End 635 5.018898 0.4834646
Ace End 417 5.000000 0.4868106
Spread Tackle 711 4.950774 0.4908579
Pro Inside 393 4.544529 0.4732824
Pro End 179 4.530726 0.4469274
Ace Tackle 448 4.453125 0.4508929
Pro Tackle 213 4.276995 0.4694836
Misc Inside 51 4.196078 0.5098039
Spread Inside 1499 4.100734 0.4643095
Ace Inside 923 3.952329 0.4236186
Tight Inside 358 3.729050 0.4162011
Misc End 31 3.677419 0.4516129
Tight Tackle 162 3.314815 0.4320988

The table above is already starting to have low samples for some of the combinations, but we can see almost every formation produces a high YPC value when running to the ends. Interestingly enough, on a low sample, running to the outside from a tight formation gave us the highest YPC and the highest success rate (53.2%). Only the tiny-sample Misc Inside and Misc Tackle combinations join it at 50% or better. It’s really interesting that on every other direction/style on first down, the offense failed to pick up four yards more often than not! For the next part, let’s eliminate some of the low sample situations to generalize better.
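One way to see how much the low samples matter is to put a confidence interval around a success rate. The sketch below is an addition, not part of the original analysis: the helper rate_ci is hypothetical, it rebuilds the success count from the amt and succ columns of the table, and it uses the exact binomial interval from binom.test in base R.

```r
# Hypothetical helper: exact binomial confidence interval for a success rate,
# using counts reconstructed from a table row (amt and succ).
rate_ci <- function(amt, succ, conf.level = 0.95) {
  x <- round(amt * succ)  # number of successful runs
  ci <- binom.test(x, amt, conf.level = conf.level)$conf.int
  c(lower = ci[1], upper = ci[2])
}

rate_ci(141, 0.5319149)    # Tight, End: small sample
rate_ci(1499, 0.4643095)   # Spread, Inside: large sample
```

Comparing the widths of the two intervals gives a feel for how much less certain the small-sample rows are.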

Filtered Style/Direction Table

run_types = run_plays %>% group_by(style, direction)  %>% filter(dwn == 1, ytg == 10, pos1 != "QB") %>% summarise(amt = n(), ypc = mean(yds), succ = mean(success)) %>% filter(amt > 200) %>% arrange(-ypc)
knitr::kable(run_types)
style direction amt ypc succ
Spread End 635 5.018898 0.4834646
Ace End 417 5.000000 0.4868106
Spread Tackle 711 4.950774 0.4908579
Pro Inside 393 4.544529 0.4732824
Ace Tackle 448 4.453125 0.4508929
Pro Tackle 213 4.276995 0.4694836
Spread Inside 1499 4.100734 0.4643095
Ace Inside 923 3.952329 0.4236186
Tight Inside 358 3.729050 0.4162011

That’s better. Running outside from spread and ace formations has the highest YPC, but those runs are slightly less consistent than running off tackle from the spread. What’s also interesting is the number of runs from the spread right up the middle (1,499, by far the most common combination) and how much less efficient they are (almost a full yard per carry less than spread runs to the end).

What about running out of shotgun?

Shotgun Table

knitr::kable(run_plays %>% group_by(shotgun) %>% filter(dwn == 1, ytg == 10, pos1 != "QB") %>% summarise(amt = n(), ypc = mean(yds), succ = mean(success)) %>% filter(amt > 150) %>% arrange(-ypc))
shotgun amt ypc succ
1 1919 4.708702 0.4955706
0 4276 4.284144 0.4476146

In the table above, a 1 in the shotgun column marks if it was a shotgun formation. Here we see shotgun runs are generally more efficient than runs from under center.

The table below takes into account the formation and direction, along with shotgun. In shotgun spread formations, running off tackle averages more than 5 YPC, and both tackle and end runs have a success rate of over 50% on samples of more than 200 plays. Finally!

Shotgun + Direction/Style Table

knitr::kable(head(run_plays %>% group_by(shotgun, direction, style) %>% filter(dwn == 1, ytg == 10, pos1 != "QB") %>% summarise(amt = n(), ypc = mean(yds), succ = mean(success)) %>% filter(amt > 100, shotgun == 1) %>% arrange(-ypc)))
shotgun direction style amt ypc succ
1 Tackle Spread 334 5.344311 0.5389222
1 End Spread 236 4.783898 0.5211864
1 Inside Spread 881 4.353008 0.4892168
1 Inside Ace 171 3.918129 0.4385965

To wrap this analysis up, I wanted to make sure there’s no game script effect, where a team is winning by so much that they run up the middle with a jumbo set simply to run out the clock, so I eliminated runs from the 4th quarter.

Non-4th Quarter runs

run_types = run_plays %>% group_by(style, direction)  %>% filter(dwn == 1, ytg == 10, pos1 != "QB", qtr != 4) %>% summarise(amt = n(), ypc = mean(yds), succ = mean(success)) %>% filter(amt > 100) %>% arrange(-ypc)
knitr::kable(run_types)
style direction amt ypc succ
Spread End 514 5.256809 0.4805447
Spread Tackle 549 5.176685 0.5009107
Ace End 336 5.014881 0.4732143
Pro End 135 4.911111 0.4666667
Pro Inside 313 4.801917 0.4600639
Pro Tackle 164 4.384146 0.4634146
Ace Tackle 355 4.236620 0.4309859
Tight Inside 242 4.144628 0.3801653
Spread Inside 1155 4.123810 0.4632035
Ace Inside 719 3.990264 0.4353268

Looks like most runs were a bit more efficient outside the 4th quarter (Ace Tackle is the exception), and the overall pattern is still the same. Thus, I’d conclude that teams should try to avoid running up the middle on first down. It’s relatively inefficient from most formations (Pro is the exception). For best results, it seems like teams should either go spread or ace, and try to run to the edges.

Short Yardage Runs

Alright, here comes the topic that started this conversation. On 3rd/4th down, what is the most successful way to convert? Because we are looking at short yardage situations, success rate is much more important than yards per carry. We’ll omit YPC from here on out in our analysis.

First, let’s subset only run plays on 3rd/4th down and within 2 yards.

short_ydg_runs = run_plays %>% filter((dwn == 4 | dwn == 3) & ytg < 3)
short_ydg_runs = short_ydg_runs[short_ydg_runs$pos1 != "K" & !grepl("Punt", short_ydg_runs$detail),]

Before we get into anything crazy, let’s explore the data. Below is a grid of different variables and the success counts for each.

g1 = ggplot(short_ydg_runs, aes(as.factor(pos1), ..count..)) + geom_bar(aes(fill = as.factor(success)), position = "dodge") + labs(title = "Position", fill = "success")
g2 = ggplot(short_ydg_runs, aes(as.factor(shotgun), ..count..)) + geom_bar(aes(fill = as.factor(success)), position = "dodge") + labs(title = "Shotgun", fill = "success")
g3 = ggplot(short_ydg_runs, aes(as.factor(style), ..count..)) + geom_bar(aes(fill = as.factor(success)), position = "dodge") + labs(title = "Style", fill = "success")
g4 = ggplot(short_ydg_runs, aes(as.factor(direction), ..count..)) + geom_bar(aes(fill = as.factor(success)), position = "dodge") + labs(title = "Direction", fill = "success")

Exploratory Analysis on 3rd/4th and short

Overall, it’s hard to discern between success rates, but we can definitely see the counts of the different variables well. Most teams elect to use a RB for short yardage, go from under center, use a spread formation, and run towards the middle.

Total success rates

short_ydg_runs %>% summarise(succ = mean(success), n = n())
##        succ   n
## 1 0.6601942 824

Something else to mention: looking at the results above, last year teams converted 66% of the runs on 3rd/4th and short. That success is relatively high compared to what we saw on early downs. Teams are likely to convert first downs on 3rd/4th and short!

Now for the more in depth analysis:

By Position

knitr::kable(short_ydg_runs %>% group_by(pos1) %>% summarise(amt = n(), succ = mean(success)) %>% arrange(-succ))
pos1 amt succ
TE 2 1.0000000
QB 105 0.7238095
RB 699 0.6523605
WR 16 0.5625000
DL 2 0.5000000

We’re keeping QBs in this part of the analysis because many teams like to QB sneak in these situations. QB runs have a 72.4% success rate while RBs have a 65.2% success rate.
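With only 105 QB runs, it is worth checking whether the QB/RB gap is larger than sampling noise. The sketch below is an addition rather than part of the original analysis: it rebuilds a 2x2 table of conversions and failures from the amt and succ columns above and runs Fisher's exact test from base R.

```r
# Cross-check (not in the original analysis): QB vs RB conversion rates
# on 3rd/4th and short, with counts reconstructed from the position table.
qb_n <- 105; qb_x <- round(qb_n * 0.7238095)
rb_n <- 699; rb_x <- round(rb_n * 0.6523605)

conv <- matrix(c(qb_x, qb_n - qb_x,
                 rb_x, rb_n - rb_x),
               nrow = 2,
               dimnames = list(result = c("converted", "failed"),
                               pos = c("QB", "RB")))
conv
res <- fisher.test(conv)  # two-sided by default
res$p.value
```

A large p-value here would mean the QB edge should be read as suggestive rather than settled on one season of data.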

By Direction

knitr::kable(short_ydg_runs %>% group_by(direction) %>% summarise(amt = n(), succ = mean(success)) %>% arrange(-succ))
direction amt succ
Inside 547 0.6709324
Tackle 131 0.6564885
End 146 0.6232877

Whoa, here’s our first small surprise. Inside runs are the most attempted 3rd/4th and short runs and actually the most successful, contrary to what my girlfriend and I thought in the beginning. But this is skewed by QB runs, which we showed are more successful. What if we take out the QBs?

By Direction, non-QB

knitr::kable(short_ydg_runs %>% group_by(direction) %>% filter(pos1 != "QB") %>% summarise(amt = n(), succ = mean(success)) %>% arrange(-succ))
direction amt succ
Inside 474 0.6603376
Tackle 126 0.6587302
End 119 0.6050420

Both inside and end success rates went down (probably because the QB runs we removed, sneaks and any leftover scrambles, convert at a high rate), while tackle barely moved. Again, we see running inside is the most successful with non-QBs.

Alright, what about the formation style?

By Formation Style

knitr::kable(short_ydg_runs %>% group_by(style) %>% summarise(amt = n(), succ = mean(success)) %>% filter(amt > 25) %>% arrange(-succ))
style amt succ
Spread 367 0.6784741
Pro 102 0.6568627
Tight 225 0.6444444
Ace 121 0.6280992

Alright, now we’re starting to make some sense. Spread formations have the best success rate (67.8%) of any style, ahead of jumbo, tight formations (64.4%) and ace formations (62.8%).

What if we combined both direction and formation?

Direction/Style Table

knitr::kable(short_ydg_runs %>% group_by(style, direction) %>% summarise(amt = n(), succ = mean(success)) %>% filter(amt > 50) %>% arrange(-succ))
style direction amt succ
Pro Inside 68 0.6911765
Spread End 79 0.6835443
Spread Inside 232 0.6810345
Ace Inside 89 0.6629213
Spread Tackle 56 0.6607143
Tight Inside 152 0.6447368

Another win for our hypothesis. Running inside works, but it’s more successful when lining up with at least two WRs, which supports our hypothesis that teams should spread out the defense in these situations.

What if we remove QBs from this analysis?

Direction/Style, non-QBs

knitr::kable(short_ydg_runs %>% group_by(style, direction) %>% filter(pos1 != "QB") %>% summarise(amt = n(), succ = mean(success)) %>% filter(amt > 50) %>% arrange(-succ))
style direction amt succ
Pro Inside 62 0.6774194
Spread Tackle 52 0.6730769
Spread Inside 204 0.6715686
Ace Inside 72 0.6527778
Tight Inside 131 0.6335878
Spread End 59 0.6271186

Most of these success rates went down (Spread Tackle actually ticked up), and the biggest drop-off is the “Spread, End” combination, which fell about 5.6 percentage points after taking out the 20 QB runs. This probably attests to the fact that QB scrambles or read-options are actually very efficient at converting short yardage plays.

So while it is not the most inefficient play call on 3rd/4th and short, handing the ball to the running back is still one of the more frequent inefficient calls that coaches tend to make. If a coach wants a successful run but does not want a QB sneak, they should look to spread out the field with multiple wide receivers and run it up the gut or off tackle.

Conclusions

  1. We’ve shown that outside runs on first down not involving the QB are significantly more efficient and successful than inside runs.

  2. Shotgun non-QB runs are more efficient than non-QB runs from under center. The most successful first down run charted was shotgun, spread formation, tackle direction (53.9% success rate).

  3. Running up the middle in a tight formation on 3rd/4th and short is one of the least successful runs to try (still works 63% of the time, though). Teams should look to spread out the field if they want to run up the middle, and be aware that QBs are the most successful at picking these runs up.