1 - Introduction


In 1977 the idea of applying statistical analysis to American professional sports was popularized by Bill James and the introduction of sabermetrics. However, professional sports tend to be conservative and slow to change, and you still see articles and comments from players and coaches on how their favored sport cannot be boiled down to just numbers. Baseball began to explore the use of metrics in the 1990s and early 2000s, and over the last 10 years we have slowly seen the NFL, NBA, NHL, and professional soccer leagues begin to explore the application of data analytics in their own sports. In this project we will apply the tools that we have learned this semester to build a recommender system for NFL play calling.

My interest in this topic is also of a personal nature. I have spent 13 years as a high school football coach and spent 3 years as an offensive coordinator and play caller. Currently, calling plays in football is based on a mixture of hours of video work, trying to find tendencies and weaknesses in the opponent's defense, and an intuition of how your opponent will adjust throughout the game. This leads to the production of massive play calling cards (although not nearly as large in high school) listing everything that you think will work in any given situation. The idea for this recommender is to provide a type of play and a direction in which to run it. For example, we would like to indicate that on 1st and 10 from the 20 yard line, a given team should run the ball over the right tackle or throw the ball to the deep middle of the field.

2 - Data and Methodology


For this recommender system we will be using the play-by-play data for the 2013 through 2016 NFL seasons. The data is available at NFL Savant and contains every play for each of the seasons. The data set contains 41 different variables dealing with various aspects of the game. We will predominantly be working with the variables that give us information on the game situation, the play type, and the yards gained. The work to prepare the data is not included in this paper, but it involved creating numeric variables to describe the various categories and play types and merging the various years together. There were also a number of steps needed to clean up issues in the csv files. The fully cleaned data set is loaded below. It contains 123,814 different plays, which we will end up segmenting into run and pass plays. We also load two small csv files that contain the unique identifiers for each of the teams and for each of the plays.

The heart of our recommender will be a decomposition of the user-item matrix into a user factor matrix and an item factor matrix using Alternating Least Squares (ALS) matrix factorization. This will be calculated twice, once for rushing plays and again for passing plays. To facilitate this, we will be using Spark and the sparklyr package in R. We begin this process by connecting to Spark.
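To make the idea of alternating least squares concrete, the following is a minimal base R sketch on a toy matrix. It is not the Spark implementation used in this project: it handles explicit ratings only, runs on a small dense matrix with `NA` for unobserved cells, and the function name `als_sketch`, its arguments, and the toy yardage values are all assumptions made for illustration.

```r
# Simplified ALS sketch (illustrative only, not Spark's algorithm).
# ratings: situations in rows, plays in columns, NA where unobserved.
als_sketch <- function(ratings, rank = 2, lambda = 0.01, iters = 20, seed = 1) {
  set.seed(seed)
  n_u <- nrow(ratings)
  n_i <- ncol(ratings)
  U <- matrix(rnorm(n_u * rank, sd = 0.1), n_u, rank)  # situation factors
  V <- matrix(rnorm(n_i * rank, sd = 0.1), n_i, rank)  # play factors

  # Ridge regression: solve (X'X + lambda * I) b = X'y
  ridge <- function(X, y) {
    as.vector(solve(crossprod(X) + lambda * diag(ncol(X)), crossprod(X, y)))
  }

  for (it in seq_len(iters)) {
    # Hold V fixed and refit each row of U on its observed entries
    for (u in seq_len(n_u)) {
      obs <- which(!is.na(ratings[u, ]))
      if (length(obs) > 0) U[u, ] <- ridge(V[obs, , drop = FALSE], ratings[u, obs])
    }
    # Hold U fixed and refit each row of V on its observed entries
    for (i in seq_len(n_i)) {
      obs <- which(!is.na(ratings[, i]))
      if (length(obs) > 0) V[i, ] <- ridge(U[obs, , drop = FALSE], ratings[obs, i])
    }
  }

  pred <- U %*% t(V)
  dimnames(pred) <- dimnames(ratings)
  list(U = U, V = V, pred = pred)
}

# Toy yardage matrix: made-up values, 4 situations by 3 plays
ratings <- matrix(c( 4,  5, NA,
                     3, NA,  6,
                    NA,  4,  7,
                     5,  6,  8),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(c("110199", "110298", "110397", "110496"),
                                  c("100", "102", "103")))
fit <- als_sketch(ratings)
round(fit$pred, 2)
```

Each pass solves a small ridge regression per situation and then per play, which is why the method is called alternating. The filled-in cells of `fit$pred` play the role of predicted yards for situation and play pairs that were never observed together.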

3 - Preparing the Data for our Recommenders


Given that our system is designed to recommend plays based on the maximum number of yards that can be gained, we quickly realized that the system would naturally settle on pass plays. To avoid this issue we decided to build two separate user-item matrices, one for passes and one for runs. This will allow us to recommend the best pass play and the best run play for a given game situation. To facilitate this we will split the data into two data frames and explore our target variable of yards gained in each of the sets.

3.1 - Exploring the Rushing Plays

We begin by exploring the rushing plays that have occurred over the last 4 years in the NFL. We see that there have been 51,972 rushing plays with an average rush of 4.38 yards. We note that the standard deviation for the plays is larger than the mean. This is related to the explosive rushing plays, which we will end up removing from the data set. Our first attempt at a recommendation system left these in, but we found that they led the system to overestimate the yards per rush. We can see the explosive plays as the long tail of the distribution.

paste("Number of Plays:", length(pbp_rush$Yards), 
      ", Average Yards per Attempt:", round(mean(pbp_rush$Yards), 2),
      ", Standard Deviation:", round(sd(pbp_rush$Yards), 2))
[1] "Number of Plays: 51972 , Average Yards per Attempt: 4.38 , Standard Deviation: 6.38"

3.2 - Exploring the Pass Plays

We next explore the passing plays from the last 4 years of the NFL and see that there have been 71,842 pass plays. However, there is an interesting feature of the data set: when we graph the pass yardage we notice a massive spike at 0. While some completed passes do gain zero yards, the majority of these plays are incompletions. We decided to remove the incompletions to give the data a more normal shape. We do want to include the completion percentage for each pass in the recommender, and we will discuss possible methods for this later in the paper.

In the graph below we see that removing the incompletions creates a more normal looking data set. We will also notice that the data has a long tail that indicates that there are a number of explosive plays in the data set that we will end up removing to improve the recommendations. Removing the incompletions reduces the total number of pass plays to 46,740 plays with an average yardage of 11.64 yards.

paste("Number of Plays:", length(pbp_pass$Yards), 
      ", Average Yards per Attempt:", round(mean(pbp_pass$Yards), 2),
      ", Standard Deviation:", round(sd(pbp_pass$Yards), 2))
[1] "Number of Plays: 46740 , Average Yards per Attempt: 11.64 , Standard Deviation: 10.78"
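
The split into rushing and passing plays and the removal of incompletions described in this section could be sketched as follows. The data preparation code is not part of this paper, so this is only an illustration on a toy data frame: the column names `IsRush`, `IsPass`, and `IsIncomplete` are assumptions about how the cleaned play-by-play file flags each play, and the yardage values are made up.

```r
# Toy play-by-play rows (hypothetical values and assumed flag columns)
pbp <- data.frame(
  IsRush       = c(1, 1, 0, 0, 0, 0),
  IsPass       = c(0, 0, 1, 1, 1, 1),
  IsIncomplete = c(0, 0, 1, 0, 1, 0),
  Yards        = c(3, 7, 0, 12, 0, 0)
)

# Separate data frames for the two play types
pbp_rush     <- pbp[pbp$IsRush == 1, ]
pbp_pass_all <- pbp[pbp$IsPass == 1, ]

# Drop incompletions but keep completed passes that gained zero yards
pbp_pass <- pbp_pass_all[pbp_pass_all$IsIncomplete == 0, ]

mean(pbp_pass_all$Yards)  # average with incompletions counted as zeros
mean(pbp_pass$Yards)      # average over completed passes only
```

Filtering on an incompletion flag rather than on `Yards == 0` matters here, because a completed pass for no gain is real information about the play and should stay in the data.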

4 - Creating User-Item Data Frames and Loading into Spark


Now that we have some idea of what is going on in rushing and passing plays for the last 4 years, we can build our user-item data frames for the two play types. We start by reducing each of the data sets to the situation identifiers (SitId), play identifiers (PlayId), and the yards gained (Yards). We next load each of the data frames into our Spark instance, and once the data is loaded we use some of the built-in dplyr features in sparklyr to remove the explosive plays from each of the data sets. While there are various definitions of an explosive play, we have elected to remove all running plays of more than 15 yards and all passing plays of more than 35 yards.
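
As a rough local illustration of this step, the sketch below reduces a toy data frame to the three columns and drops explosive plays using the thresholds given above. In the project itself the filtering was done with dplyr on Spark tables; this base R version, the helper name `drop_explosive`, and the toy rows are assumptions made for illustration.

```r
# Keep only the user, item, and rating columns and drop explosive plays.
# max_yards is 15 for rushing plays and 35 for passing plays.
drop_explosive <- function(df, max_yards) {
  df[df$Yards <= max_yards, c("SitId", "PlayId", "Yards")]
}

# Toy rushing data (made-up rows)
rush_toy <- data.frame(
  SitId  = c(110120, 110120, 120545),
  PlayId = c(100, 102, 104),
  Yards  = c(4, 22, 15),
  Other  = c("a", "b", "c")  # stands in for the unused variables
)

rush_small_toy <- drop_explosive(rush_toy, max_yards = 15)
rush_small_toy
```

A run of exactly 15 yards is kept, since only plays of more than 15 yards are treated as explosive.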

5 - Building the Recommender


Now that we have our data loaded, we use the following commands to compute the Alternating Least Squares model for the rushing and passing plays. These will result in two model objects that contain the factor matrices for the users (SitId) and items (PlayId). Once we have the factor matrices, we will use them to predict the yards gained for each of the SitId and PlayId pairs.

5.1 - Constructing the ALS Models

The code below generates the ALS factorization in Spark.

# Creating the models for each down and run pass
rush_model <- ml_als_factorization(rush_small, rating.column = "Yards", 
                                   user.column = "SitId",
                                   item.column = "Play",
                                   iter.max = 5, regularization.parameter = 0.01,
                                   implicit.preferences = TRUE, alpha = 1.0)

pass_model <- ml_als_factorization(pass_small, rating.column = "Yards", 
                                   user.column = "SitId",
                                   item.column = "Play",
                                   iter.max = 5, regularization.parameter = 0.01,
                                   implicit.preferences = TRUE, alpha = 1.0)

5.2 - Calculating the Root Mean Squared Error

We have elected to use the root mean squared error (RMSE) as a measure of the accuracy of our recommender. This measure tells us roughly how many yards a typical prediction is off by, with large misses weighted more heavily than small ones.

Calculating the RMSE for the Rushing Plays

We find that the rushing recommender has an RMSE of 2.02. This is the typical size of the difference between the yardage from the recommender and the yardage from the actual play. Given that the average rush play is only about 4 yards, a two yard difference could lead to some poor recommendations. We recognize that this is an area that would need continued work before the system could be used in a live situation.

sqrt(mean(with(rush_small_pred, prediction-Yards)^2))
[1] 2.015488

Calculating the RMSE for the Passing Plays

We find that the passing recommender has an RMSE of 2.99. Given that the typical pass play gains 11.64 yards, this difference is much more reasonable for our system.

sqrt(mean(with(pass_small_pred, prediction-Yards)^2))
[1] 2.994098

6 - Building the NFL Play Recommender


Now that we have our factor matrices for the rushing and passing plays, we can build our recommendation system. For this portion of the project we will use the following methodology. First, we will generate a user-item matrix of predictions for both the passing plays and the rushing plays. Second, we will build our system to check whether a given game situation is in the recommender. If it is, we will return the top two pass plays and the top two run plays for the given situation. If we have not seen the game situation, i.e., a cold start situation, then we will return the best two pass plays as judged by their average predicted yardage and the best two run plays using the same criterion. Finally, we output these plays along with the number of yards each play is predicted to gain in the given game situation. We include sample output from the rushing and passing prediction matrices.

Rushing Prediction Matrix

head(rush_pred)
              100         102       103        104        105        106        107
110199 -0.5316073 -0.02593189 0.7152297 -0.1499245 -0.4852820 -1.6861091 -0.1685180
110298  1.4588747  0.27142259 0.5282111  0.3367665  0.2883610  0.3112495  0.3089785
110397  1.3459811  0.28720546 0.7872098  0.1133299  0.2612576  0.6498827  2.8592724
110496  1.7435216  0.74015563 1.3295096  1.2254416  0.7724459  3.3875516  3.7033791
110521  0.7611403  0.70469797 1.0808897  4.9070508  0.6993462  1.0585020  1.5448542
110534  0.0000000  0.00000000 0.0000000  0.0000000  0.0000000  0.0000000  0.0000000

Passing Prediction Matrix

head(pass_pred)
             201        202       211       212        221         222
110199 0.6969699  0.9937938 0.9844354  2.393720 0.11492312  1.43355413
110298 0.3796494  2.0993939 0.2195770  3.320521 1.97026154  0.04652605
110397 2.0909097  2.9813811 2.9533060  7.181160 0.34476933  4.30066227
110496 3.9814873  7.2553211 3.7701047 14.084616 3.75308341  5.93951272
110566 1.1920599 -0.6750632 0.7493586  3.331336 0.01272894 19.99186356
110568 2.7878797  3.9751750 3.9377415  9.574880 0.45969246  5.73421652
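
A prediction matrix of this shape can be thought of as the product of the situation factor matrix and the transposed play factor matrix. The sketch below shows that product on made-up factors. The matrices `U` and `V` and their values are hypothetical; only the layout of the result (SitId as row names, PlayId as column names) follows the matrices printed above.

```r
# Hypothetical factor matrices with two latent factors each
U <- matrix(c(1.0, 0.5,
              0.2, 1.5),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("110199", "110298"), NULL))  # rows are SitId
V <- matrix(c(2.0, 1.0,
              0.5, 3.0,
              1.0, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("100", "102", "103"), NULL))  # rows are PlayId

# Predicted yards for every situation and play pair
pred <- U %*% t(V)
pred

# Top two plays for one situation, by predicted yards
top_two <- names(sort(pred["110298", ], decreasing = TRUE))[1:2]
top_two
```

Keeping the identifiers as row and column names is what lets a lookup by situation return play identifiers directly.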

6.1 - Code for the Recommender

The following code takes a given game situation in terms of team, down, distance, and yard line and returns the top two passing plays and the top two rushing plays.

getPlay <- function(team, down, distance, yardline){
  teamId = teams$TeamId[teams$Teams == team]
  SitId = yardline + 100*distance + 10000*down + 100000*teamId
  SitId = as.character(SitId)
  
  if(SitId %in% rownames(pass_pred)){
    best_pass <- names(sort(pass_pred[SitId,], decreasing = TRUE)[1:2])
    print(paste("Top Pass Choice:", 
                plays$Plays[plays$PlayId == as.numeric(best_pass[1])]))
    print(paste("Predicted Gain: ",
                round(sort(pass_pred[SitId,], decreasing = TRUE)[1]), 
                "Yards"))
    
    print(paste("Second Pass Choice:",
                plays$Plays[plays$PlayId == as.numeric(best_pass[2])]))
    print(paste("Predicted Gain: ",
                round(sort(pass_pred[SitId,], decreasing = TRUE)[2]), 
                "Yards"))
  }
  else{
    best <- sort(apply(pass_pred, 2, function(x) mean(x, na.rm = TRUE)), decreasing = TRUE)
    best_pass <- names(best)
    print(paste("Top Pass Choice:", 
                plays$Plays[plays$PlayId == as.numeric(best_pass[1])]))
    print(paste("Predicted Gain: ",
                round(best[1]), 
                "Yards"))
    
    print(paste("Second Pass Choice:",
                plays$Plays[plays$PlayId == as.numeric(best_pass[2])]))
    print(paste("Predicted Gain: ",
                round(best[2]), 
                "Yards"))
  }
  
  if(SitId %in% rownames(rush_pred)){
    best_rush <- names(sort(rush_pred[SitId,], decreasing = TRUE)[1:2])
    print(paste("Top Run Choice:", 
                plays$Plays[plays$PlayId == as.numeric(best_rush[1])]))
    print(paste("Predicted Gain: ",
                round(sort(rush_pred[SitId,], decreasing = TRUE)[1]), 
                "Yards"))
    
    print(paste("Second Run Choice:",
                plays$Plays[plays$PlayId == as.numeric(best_rush[2])]))
    print(paste("Predicted Gain: ",
                round(sort(rush_pred[SitId,], decreasing = TRUE)[2]), 
                "Yards"))
  }
  else{
    best <- sort(apply(rush_pred, 2, function(x) mean(x, na.rm = TRUE)), decreasing = TRUE)
    best_rush <- names(best)
    print(paste("Top Run Choice:", 
                plays$Plays[plays$PlayId == as.numeric(best_rush[1])]))
    print(paste("Predicted Gain: ",
                round(best[1]), 
                "Yards"))
    
    print(paste("Second Run Choice:",
                plays$Plays[plays$PlayId == as.numeric(best_rush[2])]))
    print(paste("Predicted Gain: ",
                round(best[2]), 
                "Yards"))
  }
}
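
The situation identifier built at the top of `getPlay` packs the team, down, distance, and yard line into a single number. The sketch below restates that encoding and adds a decoder for inspecting identifiers. The helper names `encode_sit` and `decode_sit` are hypothetical, and the decoder assumes that the yard line and distance are each below 100 and that the down is a single digit.

```r
# Same arithmetic as in getPlay, wrapped in a helper (name is hypothetical)
encode_sit <- function(team_id, down, distance, yardline) {
  yardline + 100 * distance + 10000 * down + 100000 * team_id
}

# Inverse mapping, valid while yardline < 100, distance < 100, down < 10
decode_sit <- function(sit_id) {
  list(
    team_id  = sit_id %/% 100000,
    down     = (sit_id %/% 10000) %% 10,
    distance = (sit_id %/% 100) %% 100,
    yardline = sit_id %% 100
  )
}

sit <- encode_sit(team_id = 1, down = 1, distance = 5, yardline = 21)
sit                   # 110521, one of the row names shown earlier
str(decode_sit(sit))  # recovers team 1, down 1, distance 5, yard line 21
```

Reading an identifier back this way is a quick check that a row of the prediction matrix corresponds to the situation you intended to query.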

6.2 - Sample Output

We now include some sample output from running the recommender. We start with two different teams placed in the same situation. Here we look at the New England Patriots and the Arizona Cardinals on 1st and 10 from their own 20 yard line. As expected, we get different recommendations for the two teams, and we see that Arizona, who has a weaker rushing game, has smaller predicted gains on their runs.

getPlay('NE', 1, 10, 20)
[1] "Top Pass Choice: Pass Deep Left"
[1] "Predicted Gain:  31 Yards"
[1] "Second Pass Choice: Pass Deep Middle"
[1] "Predicted Gain:  22 Yards"
[1] "Top Run Choice: Run Right Tackle"
[1] "Predicted Gain:  7 Yards"
[1] "Second Run Choice: Run Left End"
[1] "Predicted Gain:  5 Yards"
getPlay('ARI', 1, 10, 20)
[1] "Top Pass Choice: Pass Deep Middle"
[1] "Predicted Gain:  22 Yards"
[1] "Second Pass Choice: Pass Deep Left"
[1] "Predicted Gain:  18 Yards"
[1] "Top Run Choice: Run Left End"
[1] "Predicted Gain:  4 Yards"
[1] "Second Run Choice: Run Left Tackle"
[1] "Predicted Gain:  4 Yards"

Finally, let's look at the same team in a couple of different situations. We will look at the Seattle Seahawks, first on 2nd and 5 from their own 45 yard line and then on 1st and 5 from their opponent's 5 yard line. Once again we note that the recommended plays are different and that in the second situation, near the opponent's goal line, we get run plays and pass plays with short expected gains.

getPlay('SEA', 2, 5, 45)
[1] "Top Pass Choice: Pass Deep Left"
[1] "Predicted Gain:  18 Yards"
[1] "Second Pass Choice: Pass Deep Middle"
[1] "Predicted Gain:  10 Yards"
[1] "Top Run Choice: Run Center"
[1] "Predicted Gain:  3 Yards"
[1] "Second Run Choice: Run Left Guard"
[1] "Predicted Gain:  1 Yards"
getPlay('SEA', 1, 5, 95)
[1] "Top Pass Choice: Pass Deep Right"
[1] "Predicted Gain:  9 Yards"
[1] "Second Pass Choice: Pass Deep Left"
[1] "Predicted Gain:  7 Yards"
[1] "Top Run Choice: Run Left Guard"
[1] "Predicted Gain:  5 Yards"
[1] "Second Run Choice: Run Right Guard"
[1] "Predicted Gain:  3 Yards"

7 - Conclusions and Future Work

The NFL Play Recommender has proven to be a very interesting proof of concept that modern techniques in recommender systems can work with the relatively limited data from the last 4 years of the NFL to build a working recommendation system. Not only did we get a working system, we also constructed a system that has solid performance in the passing game and decent performance in the rushing game with a fairly straightforward code base. Our final data sets ended up containing fewer than 100,000 observations of 7 rush plays and 6 passing plays. We would love to see what is possible if we had a full playbook worth of plays instead of basic directions for the play. We feel that this would allow us to become even more specific in what we recommend.

We also noticed an interesting trait of the passing portion of the recommendation system. The system tends to select the long pass plays. Given that we want the system to recommend the best yardage play in a given situation, this makes sense. We see a few different ways of handling this problem that we would like to implement in future work. One solution would be to simply construct a third recommender that handles the short passing plays. This has the drawback of limiting the data set even further and may be unnecessary. The more interesting approaches use the incompletions as another data source. We could use the probability of incompletion for a given pass and team combination to drive a binary variable that weights the passing plays. This could decrease the power of deeper passes in the system, but it may lead the system to recommend only passes with a high probability of completion, which takes away some of the gamble of play calling. It could be interesting to apply a weight of 0 or 1 based on a random probability process, but we would need to fully implement such a system to see if it works. The final technique could be to simply apply a post-recommendation weight based on the probability of completion and report this information to the user of the system. This would place the decision making process in the hands of the coach and may be the preferred method.
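
As a rough idea of what the post-recommendation weighting might look like, the sketch below multiplies each predicted gain by a completion probability and ranks the plays again. Nothing here has been implemented in the project: the predicted yards, the completion probabilities, and the "Pass Short Middle" label are all made-up values for illustration.

```r
# Hypothetical predicted gains for one situation
pred_yards <- c("Pass Deep Left"    = 18,
                "Pass Short Middle" = 8,
                "Pass Deep Middle"  = 10)

# Hypothetical completion probabilities for the same team and plays
comp_prob <- c("Pass Deep Left"    = 0.30,
               "Pass Short Middle" = 0.75,
               "Pass Deep Middle"  = 0.40)

# Expected yards: predicted gain weighted by the chance of a completion
expected_yards <- pred_yards * comp_prob[names(pred_yards)]

# Report both numbers so the coach can make the final call
report <- data.frame(
  Play      = names(pred_yards),
  Predicted = unname(pred_yards),
  CompProb  = unname(comp_prob[names(pred_yards)]),
  Expected  = unname(expected_yards)
)
report[order(-report$Expected), ]
```

With these toy numbers the short pass moves ahead of the deep passes once completion probability is taken into account, which is the behavior this weighting is meant to encourage.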

We really enjoyed getting to work with this data and were excited to see how well the system was able to perform. From a football coach's perspective, this system would be a really nice tool to have on the sideline. Anything that gives a play caller more information is a great help in the fast paced and stressful environment of a football sideline. This project has proven, as has much of the class, to be a fascinating look at what can be done with data analytics and modern statistical techniques.

