Introduction: It’s a Friday afternoon and I just cannot wait for this Sunday’s slate of NFL games. Out of anticipation, I pull up the matchups on my phone and take a look at all the spreads. As always, 3 to 5 games jump out at me. “This line is a total joke,” I think to myself as I slap down $10 on each of my locks this week. Fast forward to Sunday evening, when my 5 locks have turned into a disgusting 1-4 record on the day. The realization sets in. I am a square.

This experience (that definitely only happened once) inspired a series of thoughts, questions, and hypotheses about betting NFL games.

  1. Vegas comes up with mind-blowingly accurate lines, and they will always have a better sense of game outcomes than I will.
  2. Because Vegas is so good at knowing outcomes, any information they give off must be heavily analyzed.
  3. Are there any patterns in line movement that yield a winning betting strategy regardless of any other factor?

We first load in our NFL data dating back to the 2012-2013 season. I simply join the tables together to produce the following result:

df4
## # A tibble: 4,810 x 13
##     Date   Rot VH    Team  `1st` `2nd` `3rd` `4th` Final Open  Close    ML `2H` 
##    <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl> <chr>
##  1   905   451 V     Dall~     0     7    10     7    24 46.5  46      170 21.5 
##  2   905   452 H     NYGi~     0     3     7     7    17 3     4      -200 1.5  
##  3   909   453 V     Indi~     7     7     0     7    21 41    42      385 21   
##  4   909   454 H     Chic~     7    17    10     7    41 10    9.5    -485 3    
##  5   909   455 V     Phil~     0    10     0     7    17 8.5   9      -450 3    
##  6   909   456 H     Clev~     3     0     3    10    16 41.5  42.5    350 20.5 
##  7   909   457 V     Buff~     0     7     7    14    28 42.5  39.5    130 0.5  
##  8   909   458 H     NYJe~     7    20    14     7    48 4.5   2.5    -150 18   
##  9   909   459 V     Wash~    10    10    10    10    40 50.5  50      325 26.5 
## 10   909   460 H     NewO~     7     7     3    15    32 9.5   8.5    -400 6    
## # ... with 4,800 more rows

The data is an absolute mess so the next few steps are dedicated to cleaning. The variables that I am most interested in are the opening and closing lines and the final score.

##make new columns for outright winner, spread winner open, spread winner close, line movement, open margin, close margin, over/under, ...
df4=df4 %>% mutate(spread_results = home_final-away_final,
                   total_results = away_final+home_final)
df4$home=as.character(df4$home)
df4$away=as.character(df4$away)


spread_winner_open=list()
for (i in 1:length(df4$date)){
  if (df4$away_final[i]+df4$spread_open[i]>df4$home_final[i])
  {
    spread_winner_open[i]<-df4$away[i]
  }
  else if (df4$away_final[i]+df4$spread_open[i]<df4$home_final[i])
  {
    spread_winner_open[i]<-df4$home[i]
  }
  else
  {
    spread_winner_open[i]<-"push"
  }
}
df4$spread_winner_open=spread_winner_open


spread_winner_close=list()
for (i in 1:length(df4$date)){
  if (df4$away_final[i]+df4$spread_close[i]>df4$home_final[i])
  {
    spread_winner_close[i]<-df4$away[i]
  }
  else if (df4$away_final[i]+df4$spread_close[i]<df4$home_final[i])
  {
    spread_winner_close[i]<-df4$home[i]
  }
  else
  {
    spread_winner_close[i]<-"push"
  }
}
df4$spread_winner_close=spread_winner_close

over_under=list()
for (i in 1:length(df4$date)){
  if (df4$total_results[i]>df4$total_close[i])
  {
    over_under[i]<-"over"
  }
  else if (df4$total_results[i]<df4$total_close[i])
  {
     over_under[i]<-"under"
  }
  else
  {
     over_under[i]<-"push"
  }
}
df4$over_under=over_under


df4$line_movement= abs(df4$spread_open-df4$spread_close)

for (i in 1:length(df4$date)){
  if (df4$spread_open[i]>df4$spread_close[i])
  {

     df4$line_movement[i]<-df4$line_movement[i]*-1
  }
}
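
The loops above can also be written without explicit iteration. The following is a minimal sketch on a made-up three-game data frame (the team codes, scores, and lines are invented for illustration; only the column names mirror the ones used above), using nested `ifelse()` calls:

```r
# Toy data: values are hypothetical, column names follow the ones used above
games <- data.frame(
  away = c("GB", "CHI", "DAL"),
  home = c("MIN", "DET", "NYG"),
  away_final = c(24, 17, 20),
  home_final = c(20, 27, 23),
  spread_open = c(-2.5, 6.5, 3),
  spread_close = c(-3.5, 7, 3),
  total_close = c(41.5, 45.5, 43),
  stringsAsFactors = FALSE
)

# Away team's margin once its closing spread is added to its score
away_margin <- games$away_final + games$spread_close - games$home_final
games$spread_winner_close <- ifelse(away_margin > 0, games$away,
                             ifelse(away_margin < 0, games$home, "push"))

# Over/under result against the closing total
total_points <- games$away_final + games$home_final
games$over_under <- ifelse(total_points > games$total_close, "over",
                    ifelse(total_points < games$total_close, "under", "push"))

# Signed movement: same value as abs(open - close) with the sign
# flipped whenever the open is larger than the close
games$line_movement <- games$spread_close - games$spread_open

games[, c("away", "home", "spread_winner_close", "over_under", "line_movement")]
```

A side effect of this form is that the new columns are plain character and numeric vectors rather than lists, which tends to be easier to filter and count.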

The next variable that I will create is crucial. It is called “direction”, and it describes whether the line moved towards the team that won against the spread or towards the team that lost. For example, say the Packers play the Bears and the line opens at Packers -2.5. The line closes at Packers -3.5, and the Packers win the game and cover the spread. We would say the line moved “towards the loser” because the Bears were getting a friendlier spread at the close but were the losing bet. In this same example, if the Bears had covered the 3.5-point closing spread, the result would be “towards the winner”. Similarly, when I use the phrase “bet against the line movement”, I am talking about a bet on the Packers, while “with the line movement” would be a bet on the Bears.
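
The Packers and Bears example can be traced step by step. The sketch below is for illustration only: `label_direction` is a hypothetical helper, and it assumes every number is quoted from one team's side (here the Packers, so -2.5 means they are favored by 2.5), which may not match the sign convention of the columns in `df4`. Unlike the loop that follows, it also reports pushes separately.

```r
# Hypothetical helper, not the code used in the analysis.
# team_open / team_close: the team's opening and closing spread
# team_margin: the team's final score minus its opponent's
label_direction <- function(team_open, team_close, team_margin) {
  movement <- team_close - team_open   # > 0: this team's spread got friendlier
  ats <- team_margin + team_close      # > 0: this team covered the closing spread
  if (movement == 0) return("none")
  if (ats == 0) return("push")
  # The line moved "towards" whichever team received the friendlier number
  if ((movement > 0) == (ats > 0)) "towards winner" else "towards loser"
}

# Packers open -2.5, close -3.5, and win by 7: they cover, the Bears' friendlier line loses
label_direction(-2.5, -3.5, 7)   # "towards loser"

# Same lines, but the Packers win by only 3: the Bears cover +3.5
label_direction(-2.5, -3.5, 3)   # "towards winner"
```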

#towards winner towards loser
direction<-list()
for (i in 1:length(df4$date)){
  if (df4$line_movement[i]>0)
  {
    if (df4$away[i]==df4$spread_winner_close[i])
    {
      direction[i]<-"towards loser"
    }
    else
    {
       direction[i]<-"towards winner"
    }
  }
  else if (df4$line_movement[i]<0)
  {
    if (df4$away[i]==df4$spread_winner_close[i])
    {
      direction[i]<-"towards winner"
    }
    else
    {
       direction[i]<-"towards loser"
    }
  }
  else
  {
    direction[i]<-"none"
  }
}
df4$direction<-direction

df4$total_movement<-df4$total_close-df4$total_open

total_direction<-list()
for (i in 1:length(df4$date)){
  if (df4$total_movement[i]>0)
  {
    #index with [i] so each game is compared against its own over/under result
    if (df4$over_under[i]=="under")
    {
      total_direction[i]<-"towards loser"
    }
    else
    {
       total_direction[i]<-"towards winner"
    }
  }
  else if (df4$total_movement[i]<0)
  {
    if (df4$over_under[i]=="under")
    {
      total_direction[i]<-"towards winner"
    }
    else
    {
       total_direction[i]<-"towards loser"
    }
  }
  else
  {
    total_direction[i]<-"none"
  }
}
#store the result alongside the other derived columns
df4$total_direction<-total_direction

Before analyzing the movements in the spreads, I looked for more obvious patterns in Vegas's lines. I first checked whether home or away teams win against the spread (ATS) more often and whether the over or the under hits more.

a=sum(df4$spread_winner_close==df4$home)
b=sum(df4$spread_winner_close==df4$away)
a
## [1] 1121
b
## [1] 1184
a/(a+b)
## [1] 0.4863341

In 2,305 games since 2012, the home team has won ATS 1,121 times and the away team 1,184. Away teams win more often, but a proportion test gives a p-value of .095, so the difference is not statistically significant at the usual 5% level. In other words, the gap of 63 games is consistent with ordinary variance, and we cannot conclude that away teams are more likely to cover.
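
The call behind that p-value is not shown here. As a sketch, a one-sided test without continuity correction using base R's `prop.test()` lands close to the quoted figure; that this is the same test the number came from is an assumption.

```r
home_wins <- 1121
away_wins <- 1184
n <- home_wins + away_wins

# H0: home teams cover 50% of the time; H1: they cover less often
one_sided <- prop.test(home_wins, n, p = 0.5,
                       alternative = "less", correct = FALSE)
one_sided$p.value   # roughly 0.095

# The default call (two-sided, with continuity correction) is more conservative
two_sided <- prop.test(home_wins, n, p = 0.5)
two_sided$p.value
```

Under either version the result stays above the 5% threshold, so the conclusion does not depend on which variant is used.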

a=sum(df4$over_under=="under")
b=sum(df4$over_under=="over")
a
## [1] 1198
b
## [1] 1156
a/(a+b)
## [1] 0.508921

A very similar conclusion can be made for the over/under. The difference is not statistically significant, making the over/under effectively a 50/50 bet. I examined numerous other splits and intersections of splits, such as spreads over 7, home teams that are favored, and game totals over 50. Nothing was statistically significant! In fact, most of the numbers were impressively close to 50%. I already had lots of respect for Las Vegas, but seeing these numbers forced me to give them even more.

Next, I took a look at my direction variable to see whether the line moved towards the winner or the loser more often:

a=sum(df4$direction=="towards loser" )
b=sum(df4$direction=="towards winner")
a
## [1] 1025
b
## [1] 927
a/(a+b)
## [1] 0.5251025

As we can see, the line moves towards the loser 52.5% of the time. Very interesting, right? Vegas moves the line to entice more people to bet on the losing side by throwing them a few extra points. Additionally, a proportion test gave a p-value of .0135, making the result statistically significant and unlikely to be due to variance alone. So, all I have to do is wait until the line is about to close and bet on the team that has a worse line than when it opened? Sort of… but not so fast. We have neglected the 10% vig that makes Vegas an absolute gold mine rather than a non-profit. At standard -110 odds you risk $110 to win $100, so to beat the vig and make a profit you need to win 110/210, or about 52.4%, of all bets. This makes any of the rougher trends, like betting on the away team every time, very bad bets. For our line movement strategy, we can construct a confidence interval of [0.5064, 0.5436]; this width corresponds to roughly 90% confidence under a normal approximation, and a 95% interval would be slightly wider. Either way, the interval contains the 52.4% break-even rate. Yes, we will win bets more often than we lose them, but we may make little to no profit in the long run. Nevertheless, I am happy with these findings and tried to take them further.
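
The break-even rate and the interval can be checked directly from the counts above. The sketch below assumes a simple normal-approximation (Wald) interval, which may not be the exact method used here; under that assumption the quoted bounds line up with a 90% level, while the 95% bounds come out slightly wider.

```r
# Break-even win rate when risking 110 to win 100
risk <- 110
win <- 100
break_even <- risk / (risk + win)
break_even   # about 0.524

# Wald interval for the "towards loser" rate (assumed method)
x <- 1025
n <- 1952
p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)
ci90 <- p_hat + c(-1, 1) * qnorm(0.95) * se
ci95 <- p_hat + c(-1, 1) * qnorm(0.975) * se

# Does each interval contain the break-even rate?
c(ci90[1] < break_even & break_even < ci90[2],
  ci95[1] < break_even & break_even < ci95[2])
```

Because the break-even rate sits inside both intervals, the data cannot rule out a strategy that merely breaks even.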

Let’s only look at games where the line moved a lot (3 or more points). When we narrow it down to just these games, the rate jumps to 54.9%. At that rate, the strategy clears the 52.4% break-even mark! It also makes some sense: a line moving more than 2.5 points (player injuries aside) suggests that Vegas is scrambling, desperately trying to get people on the losing side of the line.

a=sum(df4$direction=="towards loser" & abs(df4$line_movement)>2.5)
b=sum(df4$direction=="towards winner"& abs(df4$line_movement)>2.5)
a
## [1] 185
b
## [1] 152
a/(a+b)
## [1] 0.5489614

After looking further into how the size of the line movement indicates the outcome, I found games with a line movement of exactly 0.5 to be very interesting. Since 2012, a whopping 56.3% of those games were won against the spread by the team that the line got worse for! After thinking about it for a while, it does make sense. When Vegas moves the line half a point, they are telling us that the sharp money is on the other side. The small line movement is Vegas turning their hand face up and hoping that you won’t look. They want you to pay attention to the distractions of starting quarterbacks, winning streaks, and home and away splits. All the while, their very small or very large movement of the line is a singing canary in a coal mine, warning you about where the bookies want your money to go!

a=sum(df4$direction=="towards loser" & abs(df4$line_movement)==.5)
b=sum(df4$direction=="towards winner"& abs(df4$line_movement)==.5)
a
## [1] 320
b
## [1] 248
a/(a+b)
## [1] 0.5633803

Qualifications: Before you take out a loan and start hammering these 0.5 and greater-than-2.5 games, we need to consider a few important things. First off, past success does not guarantee future success. With thousands of different variables to look at, there can be statistical anomalies. In other words, if we flip a coin 30 times, we wouldn’t expect to get 25 heads. However, if we flipped 10,000 coins 30 times each, it would be probable that a few of these coins landed heads 25 times. It would be unwise to bet on these coins to come up heads more often than tails in the future. That being said, I don’t think that is the case here, and I believe line movement really does tell us a big part of the story. The other, more problematic point is that Vegas is constantly changing their strategy to disrupt trends like the one we have found. I investigated by doing a year-by-year breakdown. The results of betting $100 on every single game with our different strategies are below, and as you can see, there is a change in pattern in the last couple of years. Is this Vegas catching on to our trend or simply an outlier in the data? It is very tough to say.
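
The coin example can be put into numbers with the binomial distribution. This is a small sketch that reads "25 heads" as "at least 25 heads", which is an assumption about what the example intends:

```r
# Probability that one fair coin shows at least 25 heads in 30 flips
p_single <- pbinom(24, size = 30, prob = 0.5, lower.tail = FALSE)

n_coins <- 10000
expected_lucky <- n_coins * p_single           # expected number of such coins
p_at_least_one <- 1 - (1 - p_single)^n_coins   # chance at least one coin does it

c(p_single = p_single,
  expected_lucky = expected_lucky,
  p_at_least_one = p_at_least_one)
```

A single coin almost never does it, yet across 10,000 coins the expected count is between one and two, and seeing at least one is more likely than not. The same logic applies to scanning many betting splits for one that looks profitable.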

Below are a data table and graph of the year-by-year breakdown of our line movement direction. The profit variable in the tables and graphs is created by simulating a wager of $110 to win $100 on every game against the line movement.

year=(2012:2021)
towards_winner=c(100,108,93,97,96,100,99,118,116,62)
towards_loser=c(103,111,112,113,112,120,118,116,120,69)
winning_percent=towards_loser/(towards_winner+towards_loser)
profit=(towards_loser*100 - towards_winner*110)
overall=data.frame(year,towards_winner,towards_loser,winning_percent,profit)
overall
##    year towards_winner towards_loser winning_percent profit
## 1  2012            100           103       0.5073892   -700
## 2  2013            108           111       0.5068493   -780
## 3  2014             93           112       0.5463415    970
## 4  2015             97           113       0.5380952    630
## 5  2016             96           112       0.5384615    640
## 6  2017            100           120       0.5454545   1000
## 7  2018             99           118       0.5437788    910
## 8  2019            118           116       0.4957265  -1380
## 9  2020            116           120       0.5084746   -760
## 10 2021             62            69       0.5267176     80
plot(year, profit, type = "l", lty = 1,main = "Betting $100 on all Games With Line Movement")
abline(h=0, col="red")

#net profit if we bet to win $100 every time
sum(profit)
## [1] 610

Next, we look at the year-by-year breakdown of only the half-point line movements:

year=(2012:2021)
towards_winner=c(36,21,23,22,32,28,29,28,29,11)
towards_loser=c(41,28,35,33,38,41,42,27,25,9)
winning_percent=towards_loser/(towards_winner+towards_loser)
profit=(towards_loser*100 - towards_winner*110)
half_point=data.frame(year,towards_winner,towards_loser,winning_percent,profit)
half_point
##    year towards_winner towards_loser winning_percent profit
## 1  2012             36            41       0.5324675    140
## 2  2013             21            28       0.5714286    490
## 3  2014             23            35       0.6034483    970
## 4  2015             22            33       0.6000000    880
## 5  2016             32            38       0.5428571    280
## 6  2017             28            41       0.5942029   1020
## 7  2018             29            42       0.5915493   1010
## 8  2019             28            27       0.4909091   -380
## 9  2020             29            25       0.4629630   -690
## 10 2021             11             9       0.4500000   -310
plot(year,profit, type = "l", lty = 1, main = "Betting $100 on Line Movement of 0.5")
abline(h=0, col="red")

#net profit if we bet to win $100 every time
sum(profit)
## [1] 3410

Finally, we look at the year-by-year breakdown of our 3-or-more-point line movements:

year=(2012:2021)
towards_winner=c(11,11,12,11,14,16,13,29,24,23)
towards_loser=c(8,16,16,16,18,12,16,29,41,28)
winning_percent=towards_loser/(towards_winner+towards_loser)
profit=(towards_loser*100 - towards_winner*110)
atleast_three=data.frame(year,towards_winner,towards_loser,winning_percent,profit)
atleast_three
##    year towards_winner towards_loser winning_percent profit
## 1  2012             11             8       0.4210526   -410
## 2  2013             11            16       0.5925926    390
## 3  2014             12            16       0.5714286    280
## 4  2015             11            16       0.5925926    390
## 5  2016             14            18       0.5625000    260
## 6  2017             16            12       0.4285714   -560
## 7  2018             13            16       0.5517241    170
## 8  2019             29            29       0.5000000   -290
## 9  2020             24            41       0.6307692   1460
## 10 2021             23            28       0.5490196    270
plot(year, profit, type = "l", lty = 1,main = "Betting $100 on Line Movement of at least 3")
abline(h=0, col="red")

#net profit if we bet to win $100 every time
sum(profit)
## [1] 1960

Conclusion: If you decide to bet on the team that the line moves away from, it certainly seems to improve your odds of winning. Betting against 0.5-point line changes has made a killing over the last 10 years ($3,410 by betting to win $100 on each game at -110 odds), but a recent dip may be enough for us to want to wait and see what happens next. Our other strategy of betting against line changes of 3 or more points has also done very well, swapping some profit for more consistency and perhaps the start of a dominant upward trend ($1,960 in profit, with $1,460 of it coming from the 2020 season alone!). My final statistical and gambling recommendation is that, at the very least, you heavily consider the line movement in all of your bets and listen to the canary’s profitable song!