How the NFL Rankings Are Produced

In the last article we covered how to produce the NFL Power Rankings. Now we will put those to use in predicting the spreads of games. If you recall, the coefficient produced for each team in the power rankings is a number that shows how a team is expected to do against the average NFL team on a neutral field. To put it as simply as possible, we’ll take the difference between the two teams’ numbers and factor in home-field advantage to predict the spread.
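As a quick illustration of that arithmetic, here is a minimal sketch with made-up numbers. The team names, the ratings, and the `home_edge` value are all hypothetical, and the sign convention for the home-field term is an assumption; the function later in this article takes that term from the fitted model instead.

```r
# Hypothetical ratings: points better (+) or worse (-) than an average
# NFL team on a neutral field. These values are invented for illustration.
ratings <- c("Team A" = 4.5, "Team B" = -2.0)

# Hypothetical home-field edge in points (assumed, not fitted)
home_edge <- 2.5

# Expected margin on a neutral field: difference of the two ratings
neutral_margin <- ratings[["Team A"]] - ratings[["Team B"]]

# Expected margin when Team A hosts: add the home-field edge
home_margin <- neutral_margin + home_edge

neutral_margin  # 6.5
home_margin     # 9
```

A positive margin here means the home team is favored by that many points.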

For our example we’ll be working with Week 16 games of the 2018 NFL season.

First we’ll load the necessary R packages and define some objects from Pro Football Reference’s schedule. You’ll have to enter the date manually, so bear in mind that you’ll need to run the code again for the Thursday, Saturday, and Monday games.

library(XML)
library(RCurl)
## Loading required package: bitops
u <- "https://www.pro-football-reference.com/years/2018/games.htm"
newu <- getURL(u)
raw <- readHTMLTable(newu, as.is = TRUE)

# Drop the repeated header rows embedded in the schedule table
today <- raw$games
today <- today[!(today$Week == "Week"), ]
index <- today[today$Week == "16", ]  # all Week 16 games
cur_date <- '2018-12-23'

Next we’re going to create an object called ‘slate’ which will house today’s games. You’ll note there’s extra code that changes the names and character types of the columns. Pro Football Reference lays out its data table so that once a game has been played, the winner shifts to the away column and the loser to the home column. As these games have not happened yet, everything is in the right column for our use, but the columns will need to be renamed.

slate <- which(today$Date == "December 23" )
predictions <- data.frame(today$`Loser/tie`[slate],today$`Winner/tie`[slate])
colnames(predictions) <- c("Home", "Away")
predictions$Home <- as.character(predictions$Home)
predictions$Away <- as.character(predictions$Away)
locs <- rep("H", length(slate))

#Let's look at today's games
predictions
##                    Home                 Away
## 1         Buffalo Bills New England Patriots
## 2  Tampa Bay Buccaneers       Dallas Cowboys
## 3    Cincinnati Bengals     Cleveland Browns
## 4         New York Jets    Green Bay Packers
## 5       New York Giants   Indianapolis Colts
## 6         Detroit Lions    Minnesota Vikings
## 7     Carolina Panthers      Atlanta Falcons
## 8        Houston Texans  Philadelphia Eagles
## 9        Miami Dolphins Jacksonville Jaguars
## 10  San Francisco 49ers        Chicago Bears
## 11    Arizona Cardinals     Los Angeles Rams
## 12  Pittsburgh Steelers   New Orleans Saints
## 13   Kansas City Chiefs     Seattle Seahawks

Next we define a function that takes in a team, an opponent, and a location, then uses the two fitted models (lm.NFLfootball and glm.pointspread) to output the predicted home score differential and win probability.

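ptdif_call <- function(home, away, HN) {
  # Power-ranking coefficients for the two teams
  r1 <- rankings$bs_coeff[which(rankings$team == home)]
  r2 <- rankings$bs_coeff[which(rankings$team == away)]

  if (HN == "H") {
    # Home game: adjust by the intercept of the point-differential model
    pt_dif <- r1 - r2 - coefficients(lm.NFLfootball)[[1]]
  } else if (HN == "N") {
    # Neutral site: no home-field adjustment
    pt_dif <- r1 - r2
  } else {
    stop("HN must be \"H\" (home) or \"N\" (neutral)")
  }

  # Logistic conversion of the point differential to a home win probability
  prob <- 1 / (1 + exp(-coefficients(glm.pointspread)[[2]] * pt_dif))

  # Return both values: c(point differential, win probability)
  c(pt_dif, prob)
}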
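The probability step in the function is a plain logistic curve with no intercept, so it can be tried out on its own. The sketch below uses a hypothetical slope of 0.15 as a stand-in for `coefficients(glm.pointspread)[[2]]`, which is not reproduced here; `win_prob` is likewise a name made up for this example.

```r
# Hypothetical logistic slope, standing in for the fitted glm coefficient
slope <- 0.15

# Convert a predicted home point differential into a home win probability
win_prob <- function(pt_dif, b = slope) {
  1 / (1 + exp(-b * pt_dif))
}

win_prob(0)         # 0.5: an even matchup is a coin flip
win_prob(c(-7, 7))  # the two probabilities sum to 1

# Base R's plogis() computes the same curve
all.equal(win_prob(3), plogis(slope * 3))  # TRUE
```

Because there is no intercept, a differential of zero always maps to exactly 50%, and flipping the sign of the differential flips the probability around 50%.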

Now we can call this function for each game that is happening today.

predictions$pt_dif <- rep(0, length(slate))
predictions$home_prob <- rep(0, length(slate))

for (i in seq_along(slate)) {
  # Call the function once per game and unpack both values
  res <- ptdif_call(predictions$Home[i], predictions$Away[i], locs[i])
  predictions$pt_dif[i] <- res[1]
  predictions$home_prob[i] <- res[2]
}

#clean up the final data by rounding and sorting

predictions$pt_dif <- round(predictions$pt_dif, digits = 2)
predictions$home_prob <- round(predictions$home_prob, digits = 2)
predictions <- predictions[order(predictions$home_prob, decreasing = TRUE), ]
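The same per-game step can also be written without an explicit loop. This is only a sketch of one alternative: `toy_call` is a hypothetical stand-in for `ptdif_call`, and the ratings, the 0.15 slope, and the team labels are invented so that the example runs on its own.

```r
# Hypothetical ratings and a stand-in for ptdif_call() that returns
# c(point differential, home win probability)
toy_ratings <- c(A = 3, B = -1, C = 0.5)
toy_call <- function(home, away) {
  d <- toy_ratings[[home]] - toy_ratings[[away]]
  c(d, 1 / (1 + exp(-0.15 * d)))
}

games <- data.frame(Home = c("A", "C"), Away = c("B", "A"),
                    stringsAsFactors = FALSE)

# mapply() calls the function once per game and returns a 2 x n matrix:
# row 1 holds the point differentials, row 2 the win probabilities
res <- mapply(toy_call, games$Home, games$Away)
games$pt_dif <- round(res[1, ], 2)
games$home_prob <- round(res[2, ], 2)

# Sort so the heaviest home favorite comes first
games[order(games$home_prob, decreasing = TRUE), ]
```

Either form gives the same numbers; the loop is easier to step through, while `mapply()` keeps the per-game call in one place.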

Finally, let’s have a look at our predicted spreads for today. (I have to type these out: since I’m doing this after the Week 16 games were played, the table on Pro Football Reference no longer has the teams in the right columns.)


The heaviest home favorite is the New England Patriots, favored by 14.99 points. For gambling purposes you can round up to 15. The probability that they simply win the game, the money line, is 91%, or about -1000 in gambling terms.
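The step from a win probability to a money line can be checked with the standard American-odds conversion. This is a minimal sketch of a fair line with no bookmaker margin built in; `prob_to_moneyline` is a name made up for this example.

```r
# Convert a win probability into a fair American money line (no vig)
prob_to_moneyline <- function(p) {
  stopifnot(all(p > 0 & p < 1))
  ifelse(p >= 0.5,
         -100 * p / (1 - p),  # favorite: negative line
         100 * (1 - p) / p)   # underdog: positive line
}

round(prob_to_moneyline(0.91))  # -1011
round(prob_to_moneyline(0.25))  # 300
```

A 91% win probability works out to roughly -1011, which is where the "about -1000" figure comes from.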

Conversely, if the home_pt_dif column is negative, the road team is favored. This week, according to J.A.R.E.D.G.O.F.F., the Rams are favored over the Cardinals by 16.25 points (a home point differential of -16.25).

How did J.A.R.E.D. do this week? Well, it is January of 2019 as I type this, so I can tell you. If you had followed J.A.R.E.D.’s picks straight up, you would have finished the week 10-4. Two of the losing picks, the Miami and Seattle games, were also essentially pick ’ems.

There is more to be done with this model, and this is just a start, but if you choose to use any of this to create your own model, I hope it helps in some way.