NFL First Down Success Rate Regression Analysis (2025)

Introduction

For this lab, I wanted to model something that gets closer to how offensive consistency actually works in football. Instead of using a broad outcome like total yards or total points, I built a custom team-level metric called First Down Success Rate and then regressed it on several offensive indicators from nflreadr.

The idea behind this metric is simple. A strong offense is not just one that occasionally hits big plays. A strong offense is one that consistently stays on schedule, creates manageable situations, and gives itself a better chance to sustain drives. First Down Success Rate is meant to capture that idea at the play level and then summarize it across a full season.

What First Down Success Rate Means

First Down Success Rate is a play-based measure of whether an offense gained enough yardage on a given play to be considered meaningfully successful relative to the down and distance.

A play is counted as a success using the following rules:

On 1st down, the play is successful if it gains at least 40% of the yards needed for a first down
On 2nd down, the play is successful if it gains at least 60% of the yards needed
On 3rd or 4th down, the play is successful only if it gains 100% of the yards needed

This framework reflects normal football logic. On first down, an offense does not need to gain every yard immediately for the play to be useful. A moderate gain can still keep the offense on schedule and preserve flexibility. On second down, the threshold becomes more demanding because there is less room for error. By third and fourth down, the offense has to fully convert.

So the process works in two steps. First, each run or pass play is labeled as either a 1 for success or a 0 for failure based on those down-and-distance rules. Then those success indicators are averaged across the season for each team. That produces a season-long First Down Success Rate.
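The two steps above can be sketched in a few lines of Python. This is only an illustration of the labeling rule, not the code used for the lab: the tuple layout, the function names, and the placeholder team codes are assumptions made for the example.

```python
from collections import defaultdict

# Percent of the required yards that counts as a success on each down,
# taken from the rules listed above.
THRESHOLD_PCT = {1: 40, 2: 60, 3: 100, 4: 100}

def is_success(down, yards_to_go, yards_gained):
    """Return 1 if the play meets its down-and-distance threshold, else 0.

    Integer arithmetic avoids floating-point surprises such as
    0.4 * 15 evaluating to slightly more than 6.
    """
    return int(100 * yards_gained >= THRESHOLD_PCT[down] * yards_to_go)

def team_success_rates(plays):
    """Average the 0/1 success flags by team.

    `plays` is an iterable of (team, down, yards_to_go, yards_gained)
    tuples, a hypothetical layout chosen for this example.
    """
    totals = defaultdict(lambda: [0, 0])  # team -> [successes, plays]
    for team, down, to_go, gained in plays:
        totals[team][0] += is_success(down, to_go, gained)
        totals[team][1] += 1
    return {team: s / n for team, (s, n) in totals.items()}

sample_plays = [
    ("AAA", 1, 10, 4),  # 4 of 10 yards on 1st down: success
    ("AAA", 2, 6, 3),   # 3 of 6 yards on 2nd down: failure (needs 3.6)
    ("AAA", 3, 3, 3),   # full conversion on 3rd down: success
    ("BBB", 1, 10, 3),  # 3 of 10 yards on 1st down: failure
    ("BBB", 4, 1, 2),   # full conversion on 4th down: success
]
rates = team_success_rates(sample_plays)
```

In this toy sample, team AAA succeeds on two of three plays and team BBB on one of two.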

In practical terms, if a team has a First Down Success Rate of 48%, that means 48% of its offensive run and pass plays met the success threshold for their specific situation.

Data Construction

I limited the analysis to regular season offensive run and pass plays and summarized the data at the team-season level. The independent variables were chosen to capture different dimensions of offensive performance. EPA per play measures overall efficiency, turnover rate captures mistakes that can disrupt or end drives, explosive play rate measures how often an offense generated a gain of at least 20 yards, and red zone success reflects how often an offense still produced successful plays once the field became compressed inside the 20-yard line.
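As a rough sketch of how those four indicators could be summarized for one team, assuming each play has already been reduced to a small record: the dictionary keys below (`epa`, `yards_gained`, `turnover`, `yardline_100`, `success`) and the choice to treat yardline_100 <= 20 as the red zone are assumptions for illustration, not a description of the exact columns or filters used in the lab.

```python
def offensive_indicators(plays):
    """Summarize one team's plays into the four predictors.

    Each play is a dict with hypothetical keys:
      epa           expected points added on the play
      yards_gained  yards gained on the play
      turnover      True if the play ended in a turnover
      yardline_100  yards from the opponent's end zone
      success       1 or 0 from the down-and-distance rule
    """
    n = len(plays)
    # Assumption: "inside the 20" is treated as yardline_100 <= 20.
    red_zone = [p for p in plays if p["yardline_100"] <= 20]
    return {
        "epa_per_play": sum(p["epa"] for p in plays) / n,
        "turnover_rate": sum(p["turnover"] for p in plays) / n,
        "explosive_play_rate": sum(p["yards_gained"] >= 20 for p in plays) / n,
        "red_zone_success": (
            sum(p["success"] for p in red_zone) / len(red_zone)
            if red_zone else None
        ),
    }

sample = [
    {"epa": 0.5, "yards_gained": 25, "turnover": False, "yardline_100": 60, "success": 1},
    {"epa": -0.25, "yards_gained": 2, "turnover": False, "yardline_100": 35, "success": 0},
    {"epa": 1.0, "yards_gained": 8, "turnover": False, "yardline_100": 15, "success": 1},
    {"epa": -0.25, "yards_gained": 0, "turnover": True, "yardline_100": 7, "success": 0},
]
indicators = offensive_indicators(sample)
```

All four values come out as proportions between 0 and 1 (except EPA per play, which is in expected points), which is the scale the regression coefficients later in the report appear to assume.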

Team Rankings Table

The following table provides a full team-by-team ranking based on First Down Success Rate. It is meant to give a clean descriptive overview of where each offense stood before moving into the scatter plot and regression analysis. In addition to the custom success metric, the table also includes the main independent variables used later in the model so it is easier to see how efficiency, explosiveness, turnovers, and red zone execution vary across teams.

Rank	Team	Success Rate	EPA/Play	Turnover Rate	Explosive Rate	Red Zone Success
1	Los Angeles Rams	57.0%	0.145	1.4%	8.0%	55.6%
2	Buffalo Bills	53.7%	0.141	1.7%	7.3%	53.7%
3	San Francisco 49ers	53.1%	0.089	2.1%	5.2%	50.0%
4	New England Patriots	52.5%	0.157	1.6%	8.4%	47.4%
5	Green Bay Packers	52.1%	0.113	1.3%	6.5%	49.7%
6	Dallas Cowboys	51.0%	0.095	1.9%	6.3%	42.9%
7	Baltimore Ravens	49.8%	0.034	2.3%	7.1%	42.3%
8	Indianapolis Colts	49.7%	0.075	2.1%	5.4%	47.6%
9	Washington Commanders	49.4%	0.007	2.2%	5.2%	49.0%
10	Kansas City Chiefs	49.3%	0.034	1.4%	5.3%	45.2%
11	Cincinnati Bengals	49.2%	−0.008	2.3%	5.3%	51.4%
12	Chicago Bears	48.6%	0.080	1.0%	6.5%	48.9%
13	Seattle Seahawks	48.6%	0.030	2.6%	7.5%	43.4%
14	Detroit Lions	48.4%	0.081	1.4%	7.6%	43.2%
15	Jacksonville Jaguars	48.3%	0.033	1.7%	6.2%	50.7%
16	Atlanta Falcons	48.2%	−0.022	1.6%	5.4%	46.4%
17	Denver Broncos	48.0%	0.048	1.4%	5.4%	46.3%
18	Pittsburgh Steelers	47.8%	0.025	1.4%	5.3%	48.8%
19	Los Angeles Chargers	47.5%	−0.024	1.8%	5.5%	42.4%
20	Carolina Panthers	47.3%	−0.037	2.2%	5.2%	43.8%
21	Philadelphia Eagles	46.6%	0.027	1.2%	5.1%	50.3%
22	Miami Dolphins	46.5%	−0.016	2.4%	6.3%	48.6%
23	New York Giants	46.1%	0.010	1.5%	5.8%	31.5%
24	Arizona Cardinals	46.1%	−0.017	1.8%	5.3%	44.2%
25	New Orleans Saints	45.5%	−0.086	2.2%	3.6%	37.4%
26	Minnesota Vikings	45.5%	−0.116	2.9%	4.9%	39.7%
27	Tampa Bay Buccaneers	45.0%	−0.006	1.5%	6.1%	41.1%
28	Houston Texans	44.6%	−0.010	1.0%	5.2%	35.6%
29	New York Jets	44.1%	−0.132	2.1%	3.9%	40.5%
30	Las Vegas Raiders	42.4%	−0.215	2.3%	4.5%	38.6%
31	Tennessee Titans	40.8%	−0.160	1.8%	5.1%	45.5%
32	Cleveland Browns	37.9%	−0.189	2.3%	4.0%	36.8%

This table ranks all teams by First Down Success Rate, which is the share of offensive plays that met the success benchmark for their down and distance. Explosive Rate shows the share of plays that gained at least 20 yards, so it represents how often an offense generated chunk production rather than steady incremental gains. The color scale is meant to make the table easier to read at a glance, with stronger values standing out more clearly and weaker values fading into lighter shades.

The scatter plot shows how First Down Success Rate aligns with EPA per play across the league. Teams in the upper-right quadrant were above average in both efficiency and down-to-down consistency, while teams in the lower-left quadrant lagged in both areas. The fitted line reinforces that there is a positive league-wide relationship between the two, even if there is still meaningful variation from team to team.
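For readers who want to see the mechanics behind a fitted line and a quadrant split, here is a minimal sketch. It uses only four rows copied from the rankings table, so the slope and cut points are not the league-wide values from the plot. The function names and the use of the subset means as quadrant cut points are assumptions for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quadrant(x, y, x_cut, y_cut):
    """Label a point relative to the two cut points."""
    horizontal = "right" if x >= x_cut else "left"
    vertical = "upper" if y >= y_cut else "lower"
    return f"{vertical}-{horizontal}"

# (EPA per play, success rate as a proportion) from the rankings table.
teams = {
    "Los Angeles Rams": (0.145, 0.570),
    "Buffalo Bills": (0.141, 0.537),
    "Las Vegas Raiders": (-0.215, 0.424),
    "Cleveland Browns": (-0.189, 0.379),
}
epa = [v[0] for v in teams.values()]
success = [v[1] for v in teams.values()]

slope, intercept = fit_line(epa, success)

# Cut points here are the means of this four-team subset only.
epa_cut = sum(epa) / len(epa)
success_cut = sum(success) / len(success)
labels = {name: quadrant(x, y, epa_cut, success_cut) for name, (x, y) in teams.items()}
```

With these four teams the slope is positive, and the Rams and Browns land in the upper-right and lower-left quadrants respectively.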

Initial Regression Model

Variable	EPA per Play	Turnover Rate	Explosive Play Rate	Red Zone Success
epa_per_play	1.000	-0.478	0.759	0.585
turnover_rate	-0.478	1.000	-0.254	-0.194
explosive_play_rate	0.759	-0.254	1.000	0.429
red_zone_success	0.585	-0.194	0.429	1.000

The correlation matrix is used here as a diagnostic tool rather than a result of substantive interest on its own. In regression, the main concern is not simply whether predictors are related, because some relationship among offensive variables is expected. The real issue is whether that relationship becomes strong enough that the model struggles to separate the individual contribution of each predictor. That is the core problem of multicollinearity. When two independent variables move too closely together, the model can still fit well overall, but the coefficient estimates become less stable and harder to interpret cleanly.
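A correlation matrix like the one above is just the Pearson correlation computed for every pair of predictors. The sketch below shows that calculation in plain Python. The input columns are the top five rows of the rankings table, so the resulting numbers will not match the 32-team matrix, and the function names are assumptions for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise correlations for a dict of name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Top five rows of the rankings table, expressed as proportions.
columns = {
    "epa_per_play": [0.145, 0.141, 0.089, 0.157, 0.113],
    "turnover_rate": [0.014, 0.017, 0.021, 0.016, 0.013],
    "explosive_play_rate": [0.080, 0.073, 0.052, 0.084, 0.065],
}
matrix = correlation_matrix(columns)
```

The diagonal is always 1 and the matrix is symmetric, which is why only the off-diagonal entries carry diagnostic information.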

From an analytical standpoint, the most important relationship in this matrix is the one between EPA per play and explosive play rate. That pairing deserves attention because both variables reflect offensive efficiency and the ability to generate valuable outcomes. At 0.759, it is the strongest pairwise correlation among the predictors, which suggests that both variables are partly measuring overlapping versions of the same offensive quality. That does not automatically mean one of them must be removed, but it does mean the model should be evaluated carefully before treating both coefficients as fully distinct effects.

Variable	VIF
EPA per Play	3.716
Turnover Rate	1.368
Explosive Play Rate	2.455
Red Zone Success	1.548

The Variance Inflation Factor, or VIF, gives a more targeted diagnostic for multicollinearity by measuring how much the variance of a coefficient is inflated because that predictor overlaps with the others in the model. Conceptually, it asks whether a variable is bringing in distinct information or whether much of what it contributes could already be reconstructed from the other predictors. A VIF close to 1 suggests very little overlap-driven inflation. As that value rises, it becomes a sign that the predictor is less independent from the rest of the model and that its coefficient may be less reliable as a stand-alone effect. Here the values range from 1.368 for turnover rate to 3.716 for EPA per play, with explosive play rate at 2.455.

That makes VIF especially useful as a follow-up to the correlation matrix. The correlation matrix shows pairwise relationships, but VIF reflects the combined overlap structure of the full set of predictors. In other words, it is possible for no single pairwise correlation to look extreme while the broader set of variables still creates estimation instability. Using both tools together gives a stronger basis for deciding whether a simplified model would be cleaner and more interpretable.
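One way to make the VIF values more concrete is to use the standard definition VIF = 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. Inverting it gives the share of each predictor's variation that the other three can reconstruct. The sketch below applies that inversion to the VIF table above; the variable and function names are assumptions for the example.

```python
# VIF values copied from the table above.
vifs = {
    "EPA per Play": 3.716,
    "Turnover Rate": 1.368,
    "Explosive Play Rate": 2.455,
    "Red Zone Success": 1.548,
}

def implied_r_squared(vif):
    """Invert VIF = 1 / (1 - R^2) to recover the auxiliary R^2."""
    return 1 - 1 / vif

implied = {name: implied_r_squared(v) for name, v in vifs.items()}

for name, r2 in implied.items():
    print(f"{name}: about {r2:.1%} explained by the other predictors")
```

Read this way, roughly 73% of the variation in EPA per play and roughly 59% of the variation in explosive play rate can be reconstructed from the remaining predictors, compared with about 27% for turnover rate and 35% for red zone success.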

Final Regression Model

Initial Regression Results
Success rate regressed on EPA per play, turnover rate, explosive play rate, and red zone success
Term	Estimate	Std. Error	t value	p value	95% CI Low	95% CI High
Intercept	0.394	0.033	11.769	<0.001	0.325	0.462
EPA per Play	0.383	0.053	7.164	<0.001	0.273	0.493
Turnover Rate	1.923	0.644	2.986	0.006	0.601	3.244
Explosive Play Rate	-0.230	0.351	-0.655	0.518	-0.950	0.490
Red Zone Success	0.134	0.059	2.291	0.030	0.014	0.254

Final Regression Results
Explosive play rate removed after diagnosing overlap and multicollinearity
Term	Estimate	Std. Error	t value	p value	95% CI Low	95% CI High
Intercept	0.381	0.027	14.032	<0.001	0.326	0.437
EPA per Play	0.359	0.038	9.417	<0.001	0.281	0.437
Turnover Rate	1.840	0.625	2.944	0.006	0.560	3.120
Red Zone Success	0.136	0.058	2.354	0.026	0.018	0.255

Model Fit Comparison
Comparing overall model quality before and after removing overlap
Model	R-squared	Adjusted R-squared	Residual Std. Error	F Statistic	Model p-value	Observations
Initial Model	0.873	0.854	0.014	46.502	<0.001	32
Final Model	0.871	0.857	0.014	63.148	<0.001	32

The initial regression table shows how each predictor relates to First Down Success Rate when all four variables are included together. The estimate gives the direction and size of the relationship, the t value reflects the strength of the estimate relative to its uncertainty, and the p value indicates whether the coefficient appears statistically distinguishable from zero within the model.
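The columns in the regression tables are linked by simple arithmetic: the t value is the estimate divided by its standard error, and the 95% interval is the estimate plus or minus a critical t value times the standard error. The sketch below reproduces that for the Red Zone Success row of the initial model. The critical value of about 2.052 for 27 degrees of freedom (32 teams minus 5 estimated coefficients) is taken from a standard t table, and the function names are assumptions for the example.

```python
# Two-sided 95% critical value for 27 degrees of freedom
# (32 observations minus 5 coefficients), from a standard t table.
T_CRIT_27_DF = 2.052

def t_value(estimate, std_error):
    """Estimate measured in units of its own standard error."""
    return estimate / std_error

def conf_int(estimate, std_error, t_crit):
    """Symmetric confidence interval around the estimate."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# Red Zone Success row from the initial regression table.
estimate, std_error = 0.134, 0.059
t_stat = t_value(estimate, std_error)
low, high = conf_int(estimate, std_error, T_CRIT_27_DF)
```

Because the table entries are rounded to three decimals, the recomputed values (a t value near 2.27 and an interval of roughly 0.013 to 0.255) differ slightly from the reported 2.291, 0.014, and 0.254.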

The final regression table is more analytically useful because it reflects the version of the model after removing a variable that risked overlapping too heavily with broader efficiency measures. That makes the remaining coefficients easier to interpret as distinct drivers of offensive success. In this final setup, EPA per play stands in for overall offensive efficiency, turnover rate captures possession-ending or drive-damaging mistakes, and red zone success reflects whether an offense maintains execution quality in the most compressed part of the field.
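To show how the final coefficients turn into a prediction, the sketch below plugs one team's row from the rankings table into the final model. It assumes the rates enter the model as proportions (1.4% as 0.014), which is the only scale on which the coefficients produce a sensible success rate. The names are hypothetical, and the inputs are rounded table values, so the result is approximate.

```python
# Coefficients copied from the final regression table.
FINAL_MODEL = {
    "intercept": 0.381,
    "epa_per_play": 0.359,
    "turnover_rate": 1.840,
    "red_zone_success": 0.136,
}

def predict_success_rate(epa_per_play, turnover_rate, red_zone_success):
    """Linear prediction; rates are proportions, not percentages."""
    return (
        FINAL_MODEL["intercept"]
        + FINAL_MODEL["epa_per_play"] * epa_per_play
        + FINAL_MODEL["turnover_rate"] * turnover_rate
        + FINAL_MODEL["red_zone_success"] * red_zone_success
    )

# Los Angeles Rams row from the rankings table.
predicted = predict_success_rate(0.145, 0.014, 0.556)
actual = 0.570
residual = actual - predicted
```

For the Rams this gives a predicted success rate of about 53.4% against an observed 57.0%, so the league's top-ranked offense sits above what the three predictors alone would suggest.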

The model fit comparison is important because simplification alone is not automatically an improvement. What matters is whether the cleaner model preserves most of the explanatory power while improving interpretability. Here, R-squared slips only from 0.873 to 0.871 after explosive play rate is removed, adjusted R-squared edges up from 0.854 to 0.857, and the residual standard error stays at 0.014. Because the fit holds up, the comparison supports favoring the more parsimonious specification. In a regression setting, that is often the preferable result because it gives a model that is not only strong statistically, but also easier to explain in football terms.
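The fit statistics in the comparison table can be cross-checked from R-squared alone using the standard formulas for adjusted R-squared and the overall F statistic. The sketch below does that for both models, with n = 32 teams and 4 or 3 predictors. The function names are assumptions for the example.

```python
def adjusted_r_squared(r2, n, k):
    """Penalize R-squared for the number of predictors k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F test of the model against an intercept-only fit."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

N_TEAMS = 32

# (R-squared, number of predictors) from the model fit comparison table.
models = {"initial": (0.873, 4), "final": (0.871, 3)}

adj = {name: adjusted_r_squared(r2, N_TEAMS, k) for name, (r2, k) in models.items()}
f_stats = {name: f_statistic(r2, N_TEAMS, k) for name, (r2, k) in models.items()}
```

The adjusted values round to 0.854 and 0.857, matching the table. The recomputed F statistics come out near 46.4 and 63.0 rather than exactly 46.502 and 63.148, because the inputs are R-squared values already rounded to three decimals.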

Overall Summary

For this lab, I wanted to create a version of offensive performance that better reflects how football is actually played from snap to snap. That is why I used First Down Success Rate instead of something broader like total yards or total points. This metric focuses on whether an offense is consistently staying on schedule based on the demands of the situation. A 4-yard gain on first down is not spectacular, but it is still useful because it keeps the offense in a favorable rhythm and makes the next snap easier to manage. That kind of logic is built directly into this measure.

The team rankings show that offensive success is not just about isolated explosive moments. Teams that grade out well tend to combine broad efficiency, fewer mistakes, and stronger situational execution. The scatterplot supports that idea by showing a positive relationship between EPA per play and First Down Success Rate, while still leaving enough spread to show that the two measures are related without being interchangeable.

From the modeling side, the biggest analytical issue was whether the independent variables were sufficiently distinct from one another. The correlation matrix suggested that EPA per play and explosive play rate were the most likely to overlap conceptually, and the VIF analysis provided a second check on whether that overlap was strong enough to threaten interpretability. That combination of diagnostics supported the decision to move from the initial model to a cleaner final specification.

The final regression model is the strongest version of the analysis because it balances explanatory power with interpretability. Rather than stacking multiple variables that may partially duplicate one another, it isolates three core dimensions of offense. EPA per play captures overall efficiency, turnover rate captures damaging mistakes, and red zone success captures execution under tighter spatial constraints. Together, those predictors provide a cleaner explanation of why some teams were more consistently successful on a down-to-down basis than others.

Overall, I think this approach fits the assignment well because it uses nflreadr data to build a meaningful team-season regression model while also taking diagnostic testing seriously. More importantly, I think it gets closer to real football logic. Offensive quality is not just about accumulating yards or points in the aggregate. It is about repeatedly creating favorable situations, avoiding drive-killing errors, and executing with enough consistency to sustain offense over the course of a season.