The call on the field: in NFL sports betting, we commonly hear that sportsbooks give the home team an automatic 3-point advantage. It would be nice to know whether there is any truth to that statement. In this analysis, we will use historical NFL game data and test this myth to determine if statistics can back the claim.
We are going to compare the home and visitor scores of NFL games and perform statistical tests to see if there is a basis for the 3-point advantage. First we need to get some game data. At Hooded Rhino, we have a dataset of all NFL games since the 2000 season. We have a total of 3989 games to use for this analysis.
          Game.Id   Home.Score   Visitor.Score   Home.Spread
Min.            1         0.00            0.00       -26.500
1st Qu.       998        16.00           13.00        -6.500
Median       1995        23.00           20.00        -3.000
Mean         1995        22.99           20.36        -2.548
3rd Qu.      2992        30.00           27.00         3.000
Max.         3989        62.00           59.00        20.000
In the data, home teams average 22.99 points and visiting teams average 20.36 points, a difference of 2.63 points. That gap may be where the 3-point rule of thumb comes from. It is also interesting that the median home spread is exactly -3 points.
First we can look at the distribution of the spreads given to the home teams. In the histogram below we can see that a -3 point spread is the most common by far. While this may lead us to believe that home teams get an automatic 3 points, a histogram of spreads alone is not statistical evidence. We need to look into the data a little more.
It may help to visualize the actual home and visitor scores. In the plot below, we can see that home and visitor scores are evenly spread but on average, the home scores are slightly higher than the visitor scores.
Now, we can look at the distributions of the home and visitor scores. In the plot below, we see some interesting results.
1. The away teams score zero (0) points more often.
2. The distribution of visitor scores has two peaks, while the distribution of home scores has one distinct peak. We should investigate this in a future analysis.
3. The ranges of scores are almost the same; however, the two distributions are clearly shifted relative to each other.
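The observations above can also be checked with simple counts rather than read off a plot. The following is a minimal Python sketch (the output in this article comes from R); the score lists are made up for illustration and are not the article's dataset, and the helper names `shutout_share` and `binned_counts` are hypothetical.

```python
from collections import Counter

# Made-up final scores for illustration only; NOT the article's dataset.
home_scores = [24, 17, 31, 20, 27, 13, 24, 0, 34, 23]
visitor_scores = [17, 20, 0, 13, 24, 10, 0, 21, 17, 20]


def shutout_share(scores):
    """Fraction of games in which the team scored zero points."""
    return sum(1 for s in scores if s == 0) / len(scores)


def binned_counts(scores, width=7):
    """Count scores in bins of `width` points, keyed by each bin's lower edge."""
    counts = Counter((s // width) * width for s in scores)
    return dict(sorted(counts.items()))


print("home shutout share:   ", shutout_share(home_scores))
print("visitor shutout share:", shutout_share(visitor_scores))
print("home bins:   ", binned_counts(home_scores))
print("visitor bins:", binned_counts(visitor_scores))
```

A bin width of 7 is an arbitrary choice here (one touchdown plus the extra point); with real data, a narrower bin would be needed to see whether the second peak in the visitor scores is genuine.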
Next, we want to see if there is a statistically significant difference between the means of the two distributions. For this we will use a two-tailed t-test.
H0: There is no difference between the mean home and visitor scores in the NFL
(\(\mu_{\text{home}} = \mu_{\text{visitor}}\))
H1: There is a difference between the mean home and visitor scores in the NFL
(\(\mu_{\text{home}} \neq \mu_{\text{visitor}}\))
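To make the test concrete, here is a minimal Python sketch of Welch's two-sample t-test, the variant R reports for these hypotheses. It is an illustration, not the code behind this article: the score lists are made up, the function name `welch_t_test` is hypothetical, and the p-value uses a normal approximation that is only reasonable when the degrees of freedom are large (as they are with several thousand games).

```python
import math
from statistics import mean, variance


def welch_t_test(x, y):
    """Two-sided Welch t-test for two independent samples.

    Returns (t, df, p_approx). p_approx uses the normal distribution in
    place of Student's t, so it is only a good approximation for large df.
    """
    nx, ny = len(x), len(y)
    # Squared standard error of each sample mean.
    vx = variance(x) / nx
    vy = variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    # erfc keeps precision in the far tail, where 1 - cdf would round to 0.
    p_approx = math.erfc(abs(t) / math.sqrt(2))
    return t, df, p_approx


# Made-up scores for illustration only; NOT the article's dataset.
home = [24, 17, 31, 20, 27, 13, 24, 10, 34, 23]
visitor = [17, 20, 14, 13, 24, 10, 7, 21, 17, 20]

t, df, p = welch_t_test(home, visitor)
print(f"t = {t:.3f}, df = {df:.1f}, approximate p = {p:.4f}")
```

With only ten games per side the normal approximation understates the p-value, so the toy numbers should not be read as a result. The sketch is only meant to show how the t statistic and degrees of freedom are formed.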
Welch Two Sample t-test
data: games$Home.Score and games$Visitor.Score
t = 11.3636, df = 7969.824, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
2.171944 3.077492
sample estimates:
mean of x mean of y
22.98847 20.36375
At the 95% confidence level, there is a significant difference between the two means (p-value = \(1.0738329 \times 10^{-29}\)). We reject the null hypothesis in favor of the alternative and conclude that the mean home and visitor scores differ. Interestingly, the 95% confidence interval for the difference in average scores is 2.172 to 3.077 points. Perhaps that is why sports bettors state that Vegas sportsbooks give home teams the 3-point spread.
Next, let’s rerun the t-test with a 99% confidence interval.
Welch Two Sample t-test
data: games$Home.Score and games$Visitor.Score
t = 11.3636, df = 7969.824, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval:
2.029620 3.219816
sample estimates:
mean of x mean of y
22.98847 20.36375
The test statistic and p-value do not depend on the confidence level, so the p-value is unchanged and still very low (\(1.0738329 \times 10^{-29}\)). Only the interval widens: at 99% confidence, the difference is between 2.03 and 3.22 points.
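Both intervals can be roughly reconstructed from the numbers already printed, which shows why only the interval moves when the confidence level changes. The Python sketch below (the article's output comes from R) copies the two sample means and the t statistic from the output above and assumes a normal critical value in place of the exact t quantile; with about 7,970 degrees of freedom the two are nearly identical. The function name `normal_ci` is hypothetical.

```python
from statistics import NormalDist

# Values copied from the Welch t-test output above.
mean_home, mean_visitor = 22.98847, 20.36375
t_stat = 11.3636

diff = mean_home - mean_visitor   # estimated home edge in points
std_err = diff / t_stat           # because t = diff / standard error


def normal_ci(level):
    """Two-sided interval for the difference, using a normal critical value."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * std_err, diff + z * std_err


for level in (0.95, 0.99):
    low, high = normal_ci(level)
    print(f"{level:.0%} interval: {low:.3f} to {high:.3f}")
```

The reconstructed bounds land within about a hundredth of a point of the reported ones. The difference and its standard error are the same in both runs; only the critical value multiplying the standard error grows.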
Lastly, because each home score is matched with the visitor score from the same game, we run a paired t-test, which tests the per-game score differences directly.
Paired t-test
data: games$Home.Score and games$Visitor.Score
t = 11.1146, df = 3988, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval:
2.016143 3.233293
sample estimates:
mean of the differences
2.624718
As we see, the 99% confidence interval changes only slightly (2.02 to 3.23 points), and it still includes 3 points.
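For comparison with the two-sample version, a paired t-test reduces each game to a single number, the home score minus the visitor score, and runs a one-sample test on those differences. Here is a minimal Python sketch under the same caveats as before: made-up scores, a hypothetical function name `paired_t`, and not the R code that produced the output above.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic for matched observations.

    Returns (t, df, mean_diff), where each difference is x[i] - y[i].
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    std_err = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / std_err, n - 1, mean(diffs)


# Made-up scores for illustration only; each position is one game.
home = [24, 17, 31, 20, 27, 13, 24, 10, 34, 23]
visitor = [17, 20, 14, 13, 24, 10, 7, 21, 17, 20]

t, df, mean_diff = paired_t(home, visitor)
print(f"t = {t:.3f}, df = {df}, mean difference = {mean_diff:.2f}")
```

In the article's output, the standard error implied by the paired test (2.624718 / 11.1146, about 0.236) is close to the one implied by the Welch test (about 0.231). That suggests home and visitor scores within a game are only weakly correlated, which would explain why pairing barely changes the interval.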
From this analysis, it appears that the 3-point rule of thumb in NFL sports betting has statistical roots. This statistical fact is begging to be taken advantage of in terms of sports wagers. In future articles, we will look at more sportsbook “myths” and see how they stack up.
© Hooded Rhino, 2015