As a student equipment manager for the last three years with the UVA Football Team under Coach Bronco Mendenhall, I’ve seen firsthand the time, effort, and preparation that must be put in to have quality special teams play at a high level of football. With the NFL moving towards safer kickoffs and rules being proposed to eliminate them entirely, I chose to focus all of my attention on punts and punt returns. In this paper I will discuss how I went about evaluating punt returner performance, gunner performance, the optimal number of rushers for a punt, and whether you can “outkick your punt coverage”.
The data for the NFL Big Data Bowl on Kaggle contains game and play data, player information data, and player-specific tracking data from AWS Next Gen Stats throughout all 17 weeks of the 2018, 2019, and 2020 NFL seasons [1].
For my evaluation of punt returners and gunners I used the 2018 and 2019 seasons as training data in my regressions and machine learning algorithms. The 2020 season was used as testing data. I filtered the plays data set to only include punts and then split it by season to be joined with that season’s tracking data. Further preparation will be explained in the specific analysis sections.
On any given punt return, is it possible to tell whether the returner did a poor or good job? To try to answer this question I decided to run a regression to predict how many yards a returner was expected to get versus how many he actually got. Using an animation inspired by and adapted from former fellow UVA student Ella Summer’s 2021 Big Data Bowl Submission [2], I show an example punt return play. On this play from Week 1 of the 2020 NFL season, Hunter Renfrow (jersey number 13) receives the punt and returns it for 27 yards. My model predicted that the return on this play would be 8 yards. Renfrow was able to exceed that by 19 yards, setting the Raiders up in excellent field position against the Panthers.
To calculate expected yards for a given return, I made two critical decisions in setting up my variables for regression. First, I chose to examine player positioning at the moment the punt returner catches the ball. Second, I chose to only consider punt returns in which the return yardage was less than 33 yards. I based this cutoff on the average return of 9.5 yards and the standard deviation of 11.0 yards. Because punt return touchdowns and longer returns in general are infrequent, I felt it would be better to evaluate the more typical return; leaving the longer returns in the training set would have skewed it severely. I also took the observations of any players from the 2020 season who did not have at least 5 return attempts and placed them in the 2018-2019 training data instead.
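The filtering and train/test split described above might look roughly like the following Python sketch. The record keys (`season`, `returner`) are hypothetical names chosen for illustration, and counting a returner's 2020 attempts after applying the 33 yard cutoff is an assumption, since the text does not say which came first.

```python
from collections import Counter

MAX_RETURN_YARDS = 33    # cutoff stated in the text
MIN_ATTEMPTS_2020 = 5    # minimum 2020 attempts to stay in the test set


def split_returns(returns):
    """Split punt return records into (train, test) lists.

    Each record is a dict with the hypothetical keys 'season' and
    'returner', plus 'kickReturnYardage'.
    """
    # Keep only the more typical returns (under 33 yards).
    kept = [r for r in returns if r["kickReturnYardage"] < MAX_RETURN_YARDS]

    # Assumption: attempts are counted after the yardage filter.
    attempts_2020 = Counter(r["returner"] for r in kept if r["season"] == 2020)

    train, test = [], []
    for r in kept:
        in_test = (
            r["season"] == 2020
            and attempts_2020[r["returner"]] >= MIN_ATTEMPTS_2020
        )
        # 2020 returners with too few attempts fall back into training data.
        (test if in_test else train).append(r)
    return train, test
```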
I first ran a linear regression with the provided variable kickReturnYardage as my dependent variable.
I considered the returning team to be on offense since I was evaluating the return aspect of a given punt.
My independent variables were the provided variables
and then my created variables
Def_1_blkd, def_2_blkd, and def_3_blkd are dummy variables that are coded as 1 if that particular defender is being actively blocked and 0 otherwise. For a defender to be defined as actively blocked, he had to have an offensive player within 1 yard of him, that offensive player had to be between the defender and the returner, and the offensive player had to be looking at the defender.
This was done using the provided player tracking variables:
* x (Player position along the long axis of the field, 0 - 120 yards)
* y (Player position along the short axis of the field, 0 - 53.3 yards)
* o (Player orientation (deg), 0 - 360 degrees)
Figure 1
Figure 1 provides an example of how the blocked dummy variables were calculated. First I subtracted the defensive player’s x and y values from the offensive player’s x and y values to figure out which “quadrant” the two players were in. Using that knowledge, I was able to use trigonometry, specifically properties of right triangles, to calculate the angle of the defensive player relative to the offensive player, taking the offensive player as the origin, with either sine or cosine. Once I had that angle, I added and subtracted 60 degrees from the orientation variable of the offensive player and then checked whether the angle of the defensive player was within that range. If it was, then the offensive player was looking at the defensive player.
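As a rough illustration of the three conditions, here is a minimal Python sketch. It swaps the quadrant-by-quadrant right-triangle calculation for `math.atan2`, which handles all four quadrants in one call. Two things are assumptions rather than facts from the analysis: that orientation 0 points along +y and increases clockwise, and that "between" means the blocker is closer to the returner than the defender is. The function names are hypothetical.

```python
import math


def bearing_deg(from_xy, to_xy):
    """Bearing from one point to another, in degrees in [0, 360).

    Assumption: 0 degrees points along +y and angles increase clockwise,
    so 90 degrees points along +x.
    """
    dx = to_xy[0] - from_xy[0]
    dy = to_xy[1] - from_xy[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def angle_diff_deg(a, b):
    """Smallest absolute difference between two angles, in [0, 180]."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def is_blocked(defender_xy, blocker_xy, blocker_o, returner_xy,
               max_dist=1.0, half_cone=60.0):
    """Return 1 if the defender counts as actively blocked, else 0."""
    # Condition 1: an offensive player within 1 yard of the defender.
    close = math.dist(defender_xy, blocker_xy) <= max_dist
    # Condition 2 (assumed definition): the blocker is nearer to the
    # returner than the defender is.
    between = (math.dist(blocker_xy, returner_xy)
               < math.dist(defender_xy, returner_xy))
    # Condition 3: the defender lies within +/- 60 degrees of the
    # blocker's orientation, with wrap-around at 0/360 handled.
    facing = angle_diff_deg(bearing_deg(blocker_xy, defender_xy),
                            blocker_o) <= half_cone
    return int(close and between and facing)
```

Working with the wrapped angular difference avoids the edge case where adding or subtracting 60 degrees pushes the range past 0 or 360.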
I began by running a linear regression and found the variables kickLength, hangTime, def_1, and def_3 to be significant. I then calculated the test mean squared error (MSE) for my linear regression by taking the mean of the squared differences between the actual and predicted values. I found the test MSE to be 38.6.
After starting with linear regression, I decided to move to tree-based regression methods to see if I might be able to obtain better results. I reran the regression using Recursive Binary Splitting, Bagging, Random Forest, and Boosting and settled on Random Forest, as it gave me the lowest test mean squared error of 38.4. All variables were left in for my tree-based methods. Once I chose Random Forest, I added my predictions to the original data in order to compare them with the actual return values. For each player I averaged their actual return yards minus their predicted return yards.
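The two calculations used here, test MSE and the per-player average of actual minus predicted yards, can be sketched in plain Python as follows. The function names and the `(player, actual, predicted)` row layout are illustrative assumptions, not the code used for the analysis.

```python
from collections import defaultdict


def mean_squared_error(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_residual_by_player(rows):
    """Average (actual - predicted) return yards for each player.

    rows: iterable of (player, actual_yards, predicted_yards).
    Returns {player: (attempts, average_residual)}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # player -> [residual sum, count]
    for player, actual, predicted in rows:
        totals[player][0] += actual - predicted
        totals[player][1] += 1
    return {p: (n, s / n) for p, (s, n) in totals.items()}
```

For the example play described earlier, a 27 yard return against a prediction of 8 yards contributes a residual of 19 yards to that returner's average.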
| Name | Attempts | Average of Actual-Predicted (Yards) |
|---|---|---|
| Andre Roberts | 17 | 0.46 |
| Dwayne Harris | 6 | -0.16 |
| Kenjon Barner | 7 | 1.28 |
| Diontae Spencer | 9 | 0.86 |
| DeAndre Carter | 8 | 0.84 |
| Pharoh Cooper | 11 | -0.92 |
| Jakeem Grant | 17 | 3.13 |
| Kalif Raymond | 15 | -0.44 |
| Alex Erickson | 16 | 2.05 |
| Tommylee Lewis | 6 | 0.00 |
| Jaydon Mickens | 13 | -2.07 |
| Jabrill Peppers | 8 | 5.22 |
| Cooper Kupp | 5 | -2.92 |
| Desmond King | 5 | -2.70 |
| Jamal Agnew | 8 | 2.86 |
| Trent Taylor | 9 | 3.08 |
| David Moore | 8 | 2.85 |
| Greg Ward | 16 | -1.33 |
| River Cracraft | 5 | -0.52 |
| Christian Kirk | 15 | -1.82 |
| Keke Coutee | 7 | -2.52 |
| Nyheim Hines | 17 | 2.40 |
| D.J. Reed | 5 | -0.22 |
| Ray-Ray McCloud | 22 | -1.35 |
| Braxton Berrios | 7 | 2.30 |
| Richie James | 5 | -3.22 |
| Brandon Powell | 8 | 0.90 |
| Mecole Hardman | 16 | -2.79 |
| Hunter Renfrow | 13 | 5.63 |
| Steven Sims | 14 | -1.03 |
| Deonte Harris | 12 | 1.84 |
| Nsimba Webster | 14 | -1.74 |
| Gunner Olszewski | 14 | 2.16 |
| CeeDee Lamb | 11 | -0.69 |
| Donovan Peoples-Jones | 7 | -2.16 |
| James Proche | 15 | -0.38 |
| K.J. Hill | 9 | -2.82 |
| Marquez Callaway | 7 | 1.39 |
Hunter Renfrow and Jabrill Peppers were clearly the most overperforming returners according to my model, each returning the average punt more than 5 yards farther than expected. Mecole Hardman was a notable poor performer, as he averaged almost 3 yards less than expected on his 16 attempts. If I were a team looking to improve in the return game by signing or trading for a new returner, I would take a look at Trent Taylor. He averaged over 3 yards more than expected last year and is currently on the Bengals’ practice squad, so he could be signed to any team’s active roster. Players such as Renfrow and Peppers currently have bigger roles on the offense and defense of their teams and would not be as easily obtained as Taylor would be.
In the most common punt formation, the punting team lines up a player to the left and right of the formation on the outside. These players are called “gunners”. The players the return team lines up across from them to cover them are called “vises”. It is the gunner’s job to run down the field at the snap of the ball and make one of three common plays: recover the punted football and “down it”, tackle the returner, or force the returner to make a fair catch. It is this last aspect of their job that I chose to evaluate, as it is a likely outcome of any punt and is a win for the punting team when paired with a good kick from the punter.
I chose to make this evaluation a classification problem, with the response variable being a dummy variable coded as 1 if the gunner was within 8 yards of the returner at the time of the returner catching the ball and 0 otherwise. I chose 8 yards because the average distance of the closest defender to a returner who called a fair catch was 5 yards and one standard deviation was 3 yards, so 8 yards is one standard deviation above that average. For the training data I used all punts from 2018 and 2019 that resulted in either a fair catch or a return, plus any observations of gunners who did not have at least 5 attempts in 2020. The rest of the 2020 data was used as testing and evaluation data.
The response variable, as mentioned above, was a variable called under_8, which was 1 if the gunner was under 8 yards from the returner at the time of the returner catching the ball and 0 otherwise.
The independent variables consisted of
and my created variables of
Euc_dist_snap was, in other words, the shortest distance from where the gunner started at the snap to where he would need to be to make a play on the returner at the catch point.
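A minimal Python sketch of these two quantities is shown below. The function and argument names are hypothetical, and positions are assumed to be (x, y) pairs in yards taken from the tracking data at the snap and catch frames.

```python
import math


def euc_dist_snap(gunner_xy_at_snap, returner_xy_at_catch):
    """Straight-line distance from the gunner's snap position to the catch point."""
    return math.dist(gunner_xy_at_snap, returner_xy_at_catch)


def under_8(gunner_xy_at_catch, returner_xy_at_catch, threshold=8.0):
    """1 if the gunner is under `threshold` yards from the returner at the catch."""
    distance = math.dist(gunner_xy_at_catch, returner_xy_at_catch)
    return int(distance < threshold)
```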
Figure 2 (Week 2, 2020)
Figure 2 shows an example of a play on which the variable doubled would be coded as 1.
On this play from Week 2 of the 2020 season, Buccaneers gunner Jamel Dean (number 35) is being double teamed and starts on the far side of the field from where the punt returner catches the ball. He beats the double team by running to the inside and covers a Euclidean distance of 53 yards to get within 3 yards of the returner when he catches the ball. While the Panthers’ returner breaks Dean’s tackle, Dean’s initial contact prevents a bigger return, even if he should have made the tackle. My model gave Dean a 9% chance of being within 8 yards of the returner at the time of the catch, so I would say this play was a win for Dean even with the missed tackle.
I first used the variables mentioned above to fit a logistic regression that modeled the probability that a gunner would be within 8 yards of the returner at the time of the catch. Some notable statistics from this model include an accuracy of 73.3%, a False Positive Rate of 16.1%, and a False Negative Rate of 42.8%.
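For reference, the three statistics quoted above follow from the confusion matrix in the standard way. The sketch below assumes 0/1 labels and a hypothetical function name; it is not tied to any particular modeling library.

```python
def classification_rates(actual, predicted):
    """Accuracy, false positive rate, and false negative rate for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    def ratio(num, den):
        # Return None rather than dividing by zero when a class is absent.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "false_positive_rate": ratio(fp, fp + tn),  # share of actual 0s predicted 1
        "false_negative_rate": ratio(fn, fn + tp),  # share of actual 1s predicted 0
    }
```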
I then moved on to tree-based classification methods including Bagging, Random Forest, and Boosting. Boosting had the highest accuracy at 77.2%, with a False Positive Rate of 20.0% and a False Negative Rate of 26.8%. I chose Boosting as my classification method and proceeded to add my predictions to my testing data set to compare them with the actual values. I created three new variables to compare the results. Actual Rate is the total number of times a gunner made it within 8 yards of the returner at the time of the catch divided by the total number of attempts he had. Predicted Rate is the total number of times a gunner was predicted to be within 8 yards at the time of the catch divided by the total number of attempts. I then subtracted Predicted Rate from Actual Rate to create Actual - Predicted.
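The three comparison variables could be computed along these lines. This is a sketch only: the `(gunner, actual, predicted)` row layout and the output key names are assumptions made for illustration.

```python
from collections import defaultdict


def gunner_rates(rows):
    """Actual Rate, Predicted Rate, and their difference for each gunner.

    rows: iterable of (gunner, actual_under_8, predicted_under_8),
    where the last two values are 0 or 1.
    """
    counts = defaultdict(lambda: {"attempts": 0, "actual": 0, "predicted": 0})
    for gunner, actual, predicted in rows:
        c = counts[gunner]
        c["attempts"] += 1
        c["actual"] += actual
        c["predicted"] += predicted

    rates = {}
    for gunner, c in counts.items():
        actual_rate = c["actual"] / c["attempts"]
        predicted_rate = c["predicted"] / c["attempts"]
        rates[gunner] = {
            "attempts": c["attempts"],
            "actual_rate": actual_rate,
            "predicted_rate": predicted_rate,
            "actual_minus_predicted": actual_rate - predicted_rate,
        }
    return rates
```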
Some interesting results include Jamel Dean, who had the greatest positive differential, making it within 8 yards 60% of the time when he was only predicted to do so 10% of the time. Christian Blake and Grayland Arnold were the worst performers, with Blake only making it within 8 yards 9% of the time and Arnold 0% of the time. Another player to consider is Mack Hollins, who managed to make it within 8 yards 72% of the time on his 25 attempts. Even though he was predicted to make it 64% of the time, this is still an impressive feat. If I were a team looking for a boost at the gunner position, I would again try to pick out overachieving players who are currently on practice squads, as they can be easily signed away to my team. A notable player who is currently on a roster but plays sparingly, barring injuries, is Nick Westbrook-Ikhine. Westbrook-Ikhine has an actual rate of 60% even though his predicted rate is only 27%.
On a given play, how many rushers should the receiving team bring against the punter? Teams are a little restricted in this decision, as the answer lies somewhere between 1 and 8. If you don’t send at least 1 rusher, a punter could just hold the ball and let all his players run down the field to cover. If you bring more than 8, either you have no returner or you could be leaving gunners uncovered, in which case a punter could quickly receive the snap and throw the ball to a gunner for a first down.
In order to figure out the optimal number of rushers, I performed some basic filters. I first took out plays listed as “Non-special teams results”, punts that ended in a touchback, and punts that were blocked. Punts that had penalties were also removed. I decided to remove blocked punts because I did not want these numbers to negatively influence my averages. Touchbacks were removed because playResult was always listed as kickLength - 20, which makes sense. The problem is that this is exactly what a play with a 20 yard return would look like, so it, in a sense, invalidates the net yardage variable. I considered artificially changing the kickLength distance to account for this, but decided that it would be better just to remove touchbacks entirely.
I next created a variable I called Number_of_Rushers, which counted the number of punt rushers on a given punt. In the PFF Scouting Data data set [3], Pro Football Focus has a variable that lists all of the players on a given punt “actively trying to block the punt” and does not include players who “cross the line of scrimmage to engage in punt coverage players in a ‘Hold Up’ role”. My variable totaled the number of rushers on the list provided by PFF for a particular punt. I then counted how many observations there were for each number of rushers and calculated the average punt length and the average net yardage gained for each number of rushers. These were calculated using the provided variables kickLength and playResult. Lastly, I totaled the number of blocked punts for each number of rushers, starting with a fresh data set and using the same method as before to calculate the number of rushers.
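The filtering and per-rusher-count averages could be sketched in Python as below. Several details are assumptions for illustration: the field names `puntRushers`, `specialTeamsResult`, and `hasPenalty`, the exact result labels in `EXCLUDED_RESULTS`, and the idea that the rusher list is a semicolon-separated string. Only kickLength and playResult are named in the text. The separate blocked-punt tally is not shown.

```python
from collections import defaultdict
from statistics import mean

# Assumed labels for the plays that were filtered out.
EXCLUDED_RESULTS = {"Non-Special Teams Result", "Touchback", "Blocked Punt"}


def count_rushers(punt_rushers):
    """Count entries in a semicolon-separated list such as 'PHI 18; PHI 29'."""
    if not punt_rushers:
        return 0  # assumption: an empty or missing list means no rushers
    return len([p for p in punt_rushers.split(";") if p.strip()])


def summarize_by_rushers(plays):
    """Observations, average punt length, and average net yardage per rusher count."""
    groups = defaultdict(list)
    for play in plays:
        if play["specialTeamsResult"] in EXCLUDED_RESULTS or play["hasPenalty"]:
            continue
        groups[count_rushers(play["puntRushers"])].append(play)

    return {
        n: {
            "punts": len(g),
            "avg_kick_length": mean(p["kickLength"] for p in g),
            "avg_net_yardage": mean(p["playResult"] for p in g),
        }
        for n, g in sorted(groups.items())
    }
```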
Looking at the created table, I believe the optimal number of rushers to bring on each punt is 5. Rushing any more than 5 does not result in any gain in either net yardage or the number of blocked punts. Rushing 9 provides an interesting case, as the average punt length and net yardage are only 36 yards, without including any blocked punt yardage, of which there was none anyway. However, we do not have enough attempts to say whether the shorter punt length and net yardage are significant, and rushing 9 players is impractical, as discussed earlier. Unless a player or coach notices a vulnerability in the punt blocking scheme that would require more than 5 rushers to exploit, 5 rushers is the optimal number to bring.
Should a punter always punt the ball as hard as he can, or are there scenarios where doing that would result in less net yardage than if he kicked it with less power? To attempt to answer this question I split up punts from all 3 seasons into 2 yard intervals, or “bins”.
Like with the optimal number of rushers analysis, I first took out punts that had penalties, punts that end in a “Non-special teams result”, touchbacks, and blocked punts. I again did this in order to not have these influence my averages.
I filtered the punt data to be punts that were greater than 45 yards but less than 70 yards. Only 2 punts over 70 yards were returned. The other few were all punts that had been muffed or had bounced down the field to make it 70 yards.
I then split the data into 12 bins of ~2 yards each. Once I had done that, I created the mean and median variables by calculating the mean and median of playResult within each bin. I created the Average Return variable by calculating the mean of kickReturnYardage. The number of punts in each bin was totaled, and this is the Punts variable.
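One way to reproduce this kind of binning is sketched below in Python. It assumes 12 equal-width, right-closed bins spanning 46 to 69 yards, and it assumes that punts without a return carry a missing (`None`) kickReturnYardage that is left out of the return average. Both are guesses about the setup rather than details stated in the text, and the function names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, median


def bin_index(kick_length, lo=46.0, hi=69.0, n_bins=12):
    """Index (0-based) of the equal-width, right-closed bin holding kick_length."""
    width = (hi - lo) / n_bins
    idx = math.ceil((kick_length - lo) / width) - 1
    # Clamp so the lowest value lands in the first bin and the highest in the last.
    return min(max(idx, 0), n_bins - 1)


def summarize_bins(punts, lo=46.0, hi=69.0, n_bins=12):
    """punts: iterable of (kickLength, playResult, kickReturnYardage or None)."""
    groups = defaultdict(list)
    for kick_length, play_result, return_yards in punts:
        if lo <= kick_length <= hi:
            idx = bin_index(kick_length, lo, hi, n_bins)
            groups[idx].append((play_result, return_yards))

    summary = {}
    for idx, rows in sorted(groups.items()):
        nets = [net for net, _ in rows]
        returns = [ret for _, ret in rows if ret is not None]
        summary[idx] = {
            "median_net": median(nets),
            "avg_net": mean(nets),
            # None when no punt in the bin was returned.
            "avg_return": mean(returns) if returns else None,
            "punts": len(rows),
        }
    return summary
```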
| Punt Length | Median Net Yardage | Average Net Yardage | Average Return Yardage | Punts |
|---|---|---|---|---|
| (46,47.9] | 41 | 39.25 | 7.24 | 190 |
| (47.9,49.8] | 42 | 40.30 | 8.20 | 198 |
| (49.8,51.8] | 43 | 41.19 | 9.25 | 214 |
| (51.8,53.7] | 45 | 42.31 | 10.15 | 180 |
| (53.7,55.6] | 45 | 43.83 | 10.69 | 183 |
| (55.6,57.5] | 46 | 43.77 | 12.74 | 136 |
| (57.5,59.4] | 48 | 45.02 | 13.53 | 121 |
| (59.4,61.3] | 48 | 45.19 | 15.32 | 57 |
| (61.3,63.2] | 51 | 47.47 | 15.07 | 30 |
| (63.2,65.2] | 52 | 46.24 | 18.29 | 17 |
| (65.2,67.1] | 57 | 57.60 | 8.60 | 5 |
| (67.1,69] | 54 | 52.80 | 15.60 | 5 |
The most interesting result to come out of this table is that the average return on the 17 punts that went between 63 and 65 yards was over 18 yards. This was enough to drive the average net yardage down to about 46 yards, which is less than the roughly 47.5 yards for punts that went between 61 and 63 yards.
Advanced statistics and metrics like those discussed in this paper are great tools for things like player evaluation and strategy. However, they are most effective when combined with the traditional way of doing things. A smart General Manager or Coach will not draft a player just because he ran a fast 40 yard dash at the combine. When a player puts up an unexpected number, that’s a signal to say “Hey, I need to go back and take another look at his tape to see if that speed shows up on the actual field”. The same idea holds true for statistics. Players who perform well in the advanced metrics are players who should be reevaluated, but we shouldn’t rush to make decisions without first confirming that what the metric says is actually true on the field.
My future work will consist of watching film to answer questions such as: was Hunter Renfrow actually the best returner in the league last year, or was his high performance in my regression the result of outside factors? Why wasn’t Matthew Slater, who has made the Pro Bowl the last 10 years as a special teamer, higher on my list of the best gunners?
A link to all code used in this analysis is available here.