Background

In an NFL game, each offensive play is either a pass (the quarterback throws the ball to a receiver) or a rush (the ball is handed off to a runner who advances it on the ground while teammates block the defenders trying to tackle him). Each rushing attempt is called a “carry”. Rushing is more physical than passing and more affected by field and weather conditions.

NFL teams play in very different climates. In winter (Dec/Jan/Feb), teams like Buffalo and Green Bay host games in sub-freezing outdoor stadiums, while teams like Miami and Houston are used to warm weather year-round. A warm-weather team traveling to a frozen stadium may struggle to run the ball effectively.

Question: When a top-8 cold team hosts a top-8 warm team in winter, is the home rushing advantage larger than the league-wide winter average?

Setup

  • Data: 10 NFL seasons (2014–2023), nflfastR play-by-play
  • Winter: December, January, February games only
  • Cold teams: BUF, GB, CHI, CLE, NE, PIT, DEN, CIN
  • Warm teams: MIA, TB, JAX, HOU, NO, AZ, DAL, ATL
  • For each game: compute Home Average Yards per Carry \(-\) Away Average Yards per Carry
  • Compare the cold-home-vs-warm-away subset against the overall winter average
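The game classification described in this setup can be sketched in base R as follows. This is a minimal sketch, not the original analysis code: the helpers `is_winter` and `is_cold_vs_warm` and the toy `games` data frame are hypothetical, and the team abbreviations are copied from the lists above.

```r
# Team lists copied from the setup above. Note: nflfastR itself labels
# Arizona "ARI", so real play-by-play data may need that spelling.
cold_teams <- c("BUF", "GB", "CHI", "CLE", "NE", "PIT", "DEN", "CIN")
warm_teams <- c("MIA", "TB", "JAX", "HOU", "NO", "AZ", "DAL", "ATL")

# Hypothetical helper: TRUE for games played in December, January, or February
is_winter <- function(game_date) {
  month <- as.integer(format(as.Date(game_date), "%m"))
  month %in% c(12L, 1L, 2L)
}

# Hypothetical helper: TRUE when a cold team hosts a warm team
is_cold_vs_warm <- function(home_team, away_team) {
  home_team %in% cold_teams & away_team %in% warm_teams
}

# Toy schedule (made-up rows, for illustration only)
games <- data.frame(
  game_date = c("2021-12-19", "2021-10-03", "2022-01-09", "2021-12-26"),
  home_team = c("BUF", "GB", "MIA", "DAL"),
  away_team = c("MIA", "PIT", "NE", "GB"),
  stringsAsFactors = FALSE
)

games$winter       <- is_winter(games$game_date)
games$cold_vs_warm <- games$winter &
  is_cold_vs_warm(games$home_team, games$away_team)
print(games)
```

Only the first toy row is flagged as a cold-vs-warm winter game: the second is played in October, and in the last two the home team is not on the cold list.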

The Math: Rushing Advantage

For each game \(j\), the home rushing advantage is:

\[\Delta_j = \text{YPC}_{\text{home},\, j} \;-\; \text{YPC}_{\text{away},\, j}\]

A positive \(\Delta_j\) means the home team rushed more efficiently than the visitors. We want to know whether the mean \(\Delta\) for cold-vs-warm games is significantly different from the overall winter mean \(\mu_0\).
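
As a worked illustration with made-up numbers (not taken from the data), suppose the home team gains 135 yards on 30 carries and the visitors gain 90 yards on 25 carries. Then:

\[\Delta_j = \frac{135}{30} - \frac{90}{25} = 4.5 - 3.6 = 0.9 \text{ YPC}\]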

Why a t-Test?

A t-test determines whether a sample mean differs significantly from a known value (or from another sample mean) when the population standard deviation is unknown and has to be estimated from the data by the sample standard deviation \(s\).

We use a one-sample t-test here because we are comparing the mean of one specific group (cold-vs-warm games) against a fixed value (the overall winter average):

\[t = \frac{\bar{\Delta}_{\text{cold vs warm}} - \mu_0}{s \;/\; \sqrt{n}}\]

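where \(\bar{\Delta}_{\text{cold vs warm}}\) is the sample mean of \(\Delta_j\) over the cold-vs-warm games, \(\mu_0\) is the overall winter mean advantage, \(s\) is the sample standard deviation of \(\Delta_j\) in those games, and \(n\) is the number of cold-vs-warm games. Under the null hypothesis, \(t\) follows a t distribution with \(n - 1\) degrees of freedom; if \(|t|\) is large enough, the difference is unlikely to be due to chance alone.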
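
The formula can be checked by hand against R's built-in `t.test()`. The sketch below uses a made-up vector `delta` of per-game differences (illustration only, not the real data) and the reported 0.16 YPC as the benchmark; both routes should give the same statistic and p-value.

```r
# Made-up home-minus-away YPC differences for a handful of games
delta <- c(1.2, -0.4, 0.9, 0.3, -1.1, 2.0, 0.6, -0.2)
mu0   <- 0.16  # benchmark mean (the reported all-winter average)

n      <- length(delta)
s      <- sd(delta)  # sample standard deviation (n - 1 denominator)
t_stat <- (mean(delta) - mu0) / (s / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_val <- 2 * pt(-abs(t_stat), df = n - 1)

# Built-in equivalent of the hand computation
fit <- t.test(delta, mu = mu0)

print(c(manual_t = t_stat, builtin_t = unname(fit$statistic),
        manual_p = p_val,  builtin_p = fit$p.value))
```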

R Code

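library(dplyr)  # mutate, group_by, summarise, %>%
library(tidyr)  # pivot_wider

# One row per game: home YPC, away YPC, and their difference
game_ypc <- winter_rushes %>%
  # Label each carry by whether the rushing team is the home side
  mutate(side = ifelse(posteam == home_team,
                       "home", "away")) %>%
  group_by(game_id, season, home_team,
           away_team, side) %>%
  # YPC = mean yards gained per carry (missing yardage ignored)
  summarise(ypc = mean(yards_gained, na.rm = TRUE),
            .groups = "drop") %>%
  pivot_wider(names_from = side,
              values_from = ypc) %>%
  # Delta_j: home YPC minus away YPC
  mutate(ypc_diff = home - away)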
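
One way the comparison might then be run on a table shaped like `game_ypc` is sketched below in base R. The data frame `game_ypc_toy` is a hypothetical stand-in with made-up numbers, and the subsetting logic is an assumption about how the cold-vs-warm games are selected, not the original code.

```r
# Toy stand-in for game_ypc (made-up numbers; only the columns used below)
game_ypc_toy <- data.frame(
  home_team = c("BUF", "GB", "CHI", "MIA", "DAL", "NE", "KC", "PIT"),
  away_team = c("MIA", "TB", "HOU", "BUF", "GB", "JAX", "DEN", "ATL"),
  ypc_diff  = c(0.8, -0.3, 1.1, 0.2, -0.5, 0.4, 0.1, 1.5),
  stringsAsFactors = FALSE
)

# Team lists as given in the setup
cold_teams <- c("BUF", "GB", "CHI", "CLE", "NE", "PIT", "DEN", "CIN")
warm_teams <- c("MIA", "TB", "JAX", "HOU", "NO", "AZ", "DAL", "ATL")

# Benchmark: mean home rushing advantage over all winter games
mu0 <- mean(game_ypc_toy$ypc_diff, na.rm = TRUE)

# Subset: cold team at home, warm team visiting
cvw <- subset(game_ypc_toy,
              home_team %in% cold_teams & away_team %in% warm_teams)

# One-sample, two-sided t-test of the subset mean against mu0
result <- t.test(cvw$ypc_diff, mu = mu0)
print(result)
```

In this sketch the benchmark `mu0` is treated as a fixed constant, as in the one-sample formula, even though it is itself estimated from the same set of games.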

[Figure: Rushing Advantage Distribution (ggplot)]

[Figure: Advantage by Cold Home Team (ggplot)]

[Figure: 3D Scatter (Plotly)]

Results

Statistic                            Value
-----------------------------------  --------
Cold vs Warm: Mean Advantage         0.45 YPC
All Winter Games: Mean Advantage     0.16 YPC
N (cold vs warm games)               34
N (all winter games)                 931
t-statistic                          1.079
p-value                              0.2882

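A value of \(p < 0.05\) would indicate that the cold-home rushing edge in these matchups differs significantly from the overall winter home-field rushing advantage. Here \(p = 0.2882 > 0.05\), so although the cold-vs-warm mean (0.45 YPC) is higher than the winter average (0.16 YPC), the difference is not statistically significant with only 34 games; the data do not establish a larger cold-home rushing edge.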
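
The reported figures can be cross-checked against the t formula. The sketch below works only from the rounded values in the results table, so its outputs are approximate: it backs out the implied standard error and sample standard deviation, and recomputes a two-sided p-value on the assumption that the test used \(n - 1 = 33\) degrees of freedom.

```r
# Values as reported in the results table (rounded)
mean_cvw <- 0.45
mu0      <- 0.16
n        <- 34
t_stat   <- 1.079

# Implied standard error and sample SD, from t = (mean - mu0) / (s / sqrt(n))
se_implied <- (mean_cvw - mu0) / t_stat
s_implied  <- se_implied * sqrt(n)

# Two-sided p-value for the reported t with n - 1 degrees of freedom
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)

print(round(c(se = se_implied, s = s_implied, p = p_two_sided), 3))
```

The recomputed p-value should land close to the reported 0.2882, which is consistent with a two-sided test. The implied per-game standard deviation of roughly 1.6 YPC is large relative to the 0.29 YPC gap between the two means.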