Prospect projections in basketball have always interested me, but given how often length and wingspan come up in scouting reports, I was surprised to find how little public work uses wingspan data or examines how it should factor into projections. It makes good basketball sense that a longer wingspan should improve a player’s defensive ability and potential, since long arms are useful for clogging passing lanes and protecting the rim. But can wingspan data improve upon box-score metrics (steals, blocks, fouls) in estimating a player’s defensive contributions?

Box Plus/Minus (BPM) is a box-score metric built on a 14-year sample of “RAPM,” a plus/minus-oriented metric (i.e., how does the team perform when a player is on the court?). The basic idea is that a player’s box score can predict that player’s RAPM. This investigation focuses specifically on Defensive Box Plus/Minus (DBPM), which is calculated simply by subtracting Offensive Box Plus/Minus (OBPM, also estimated from the box score) from overall Box Plus/Minus. A more detailed description can be found on Basketball Reference. As they caution:

“Box Plus/Minus is good at measuring offense and solid overall, but the defensive numbers in particular should not be considered definitive. Look at the defensive values as a guide, but don’t hesitate to discount them when a player is well known as a good or bad defender.”

To date, there is no definitive metric for evaluating defensive contribution, so despite this warning, Defensive Box Plus/Minus is a worthy attempt to capture a player’s value. Its strengths as a metric include its availability throughout NBA history and its generally lower year-to-year variance for individual players compared with other “advanced” metrics.

Wingspan data is much less complete, but it is fairly well catalogued for the 2000s and 2010s in DraftExpress’ pre-draft measurement database. With a few exceptions, I used the highest measured wingspan, because many of these measurements are taken while a player is still growing. The highest measurement isn’t guaranteed to be the most recent or the most accurate one, but it should be enough to explore the theory that a longer wingspan leads to better defense, which is the central focus of this investigation.
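
The “take the highest measurement” rule can be sketched in a few lines of Python. The records, player names, and the `highest_wingspan` helper are made up for illustration. This is not the DraftExpress data or the code behind this analysis, and it ignores the handful of exceptions mentioned above.

```python
# Hypothetical measurement records: (player, measurement event, wingspan in inches).
measurements = [
    ("Player A", "camp 2009", 82.0),
    ("Player A", "combine 2011", 83.5),
    ("Player B", "combine 2010", 86.25),
    ("Player B", "workout 2010", 85.75),
]


def highest_wingspan(records):
    """Return {player: largest recorded wingspan} across all of a player's measurements."""
    best = {}
    for player, _event, wingspan in records:
        if player not in best or wingspan > best[player]:
            best[player] = wingspan
    return best


wingspans = highest_wingspan(measurements)
for player, inches in sorted(wingspans.items()):
    print(f"{player}: {inches} in")
```

Keeping the event label in each record makes it easy to swap in a different rule later, such as the most recent measurement instead of the largest.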

NBA wingspans follow a roughly normal distribution, with most wingspans falling between 76 and 88 inches. The median wingspan across all player-seasons is 82.75 inches. The median ratio of wingspan to official listed height is 1.047:1.
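
As a rough sketch of how these two medians can be computed, assume each player-season carries a wingspan and a listed height in inches. The five rows below are invented for illustration and are not drawn from the actual data set.

```python
from statistics import median

# Hypothetical player-seasons: (wingspan in inches, listed height in inches).
player_seasons = [
    (80.0, 77.0),
    (82.75, 79.0),
    (85.5, 81.0),
    (88.0, 84.0),
    (78.5, 75.0),
]

# Median of the raw wingspans.
median_wingspan = median(w for w, _h in player_seasons)

# Median of the per-row wingspan-to-height ratios (not the ratio of the two medians).
median_ratio = median(w / h for w, h in player_seasons)

print(f"median wingspan: {median_wingspan} in")
print(f"median wingspan-to-height ratio: {median_ratio:.3f}:1")
```

Note that the median of the ratios is generally not the same number as the median wingspan divided by the median height, so it matters which one is reported.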

Let’s compare wingspan to DBPM and organize by position. DBPM tends to favor larger players, but I suspected that it might be particularly beneficial to have a long wingspan for one’s position.

Sadly for my theory, the bulk of the data shows that DBPM varies roughly linearly with wingspan, both overall and within each position. Close-ups for each position (not pictured here) do not reveal exponential increases in DBPM. Perhaps we can look at this another way: if a player has an “outsized” wingspan, that is, a long wingspan for their height, their DBPM should get a boost, since they can move like a smaller, speedier player while retaining the ability to disrupt offenses with their oversized arms.

I found that the correlation between wingspan and DBPM is slightly higher than that between height and DBPM. The correlation between wingspan relative to height and DBPM is only about half as strong, which is not surprising since DBPM favors larger players, but it is still significant. With that in mind, let’s look at some of the players with the most outsized wingspans (a wingspan more than 7 inches greater than height).
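
For readers who want to reproduce this kind of comparison, here is a minimal sketch of the three correlations and the “outsized” filter. It assumes “wingspan relative to height” means wingspan minus height in inches; the rows, the player labels, and the `pearson` helper are hypothetical and stand in for the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical rows: (player, height in inches, wingspan in inches, DBPM).
rows = [
    ("P1", 75.0, 78.0, -1.2),
    ("P2", 79.0, 87.5, 1.5),
    ("P3", 82.0, 86.0, 0.4),
    ("P4", 84.0, 92.0, 2.8),
    ("P5", 77.0, 80.0, -0.6),
]

heights = [h for _p, h, _w, _d in rows]
spans = [w for _p, _h, w, _d in rows]
diffs = [w - h for _p, h, w, _d in rows]  # assumed definition of "outsized"
dbpm = [d for _p, _h, _w, d in rows]

r_wingspan = pearson(spans, dbpm)
r_height = pearson(heights, dbpm)
r_diff = pearson(diffs, dbpm)

# Players whose wingspan exceeds their height by more than 7 inches.
outsized = [p for p, h, w, _d in rows if w - h > 7.0]

print(f"wingspan vs DBPM:          {r_wingspan:.3f}")
print(f"height vs DBPM:            {r_height:.3f}")
print(f"wingspan - height vs DBPM: {r_diff:.3f}")
print("outsized:", outsized)
```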

At first glance, this could be a random sample of NBA players, but some of the NBA’s best defenders are in this group: Kawhi Leonard, the two-time reigning Defensive Player of the Year, as well as the very adept rim-protecting bigs Rudy Gobert, Anthony Davis, Bismack Biyombo, Larry Sanders, and David West. Is there a pattern here?

(Jitter was added to this plot in order to reduce overplotting and make it easier to navigate through names.) Here we see that there are more greens and yellows above the trendline and more reds and oranges below. However, there is a less noticeable shift in colors while moving solely from left to right. This indicates that the relationship between wingspan and height plays a role in DBPM and might be useful to include in a linear model. Let’s test a few, beginning with (transformed) ordinary rate stats and then seeing how adding wingspan and height factors affects the proportion of variance in the data that the model explains, as measured by R-squared.

## 
## Calls:
## Model 1: lm(formula = DBPM ~ STL + sqrtBLK + logDRB + logPF, data = NBA)
## Model 2: lm(formula = DBPM ~ Ht + STL + sqrtBLK + logDRB + logPF, data = NBA)
## Model 3: lm(formula = DBPM ~ Wingspan + STL + sqrtBLK + logDRB + logPF, 
##     data = NBA)
## Model 4: lm(formula = DBPM ~ WingspanminHt + STL + sqrtBLK + logDRB + 
##     logPF, data = NBA)
## Model 5: lm(formula = DBPM ~ WingspandivHt + STL + sqrtBLK + logDRB + 
##     logPF, data = NBA)
## Model 6: lm(formula = DBPM ~ WingspandivHt + Wingspan + STL + sqrtBLK + 
##     logDRB + logPF, data = NBA)
## 
## ====================================================================================
##                    Model 1    Model 2    Model 3    Model 4    Model 5    Model 6   
## ------------------------------------------------------------------------------------
##   (Intercept)     -6.230***  -5.155***  -5.384***  -6.291***  -5.903***  -5.678***  
##                   (0.138)    (0.672)    (0.625)    (0.156)    (0.817)    (0.830)    
##   STL              1.135***   1.116***   1.129***   1.142***   1.142***   1.123***  
##                   (0.031)    (0.033)    (0.036)    (0.035)    (0.035)    (0.037)    
##   sqrtBLK          2.105***   2.143***   2.083***   2.041***   2.039***   2.087***  
##                   (0.059)    (0.064)    (0.074)    (0.068)    (0.068)    (0.075)    
##   logDRB           1.624***   1.689***   1.745***   1.691***   1.690***   1.761***  
##                   (0.067)    (0.078)    (0.082)    (0.074)    (0.074)    (0.087)    
##   logPF           -0.441***  -0.429***  -0.443***  -0.448***  -0.448***  -0.438***  
##                   (0.064)    (0.065)    (0.072)    (0.072)    (0.072)    (0.073)    
##   Ht                         -0.015                                                 
##                              (0.009)                                                
##   Wingspan                              -0.013                           -0.016     
##                                         (0.008)                          (0.011)    
##   WingspanminHt                                    -0.006                           
##                                                    (0.010)                          
##   WingspandivHt                                               -0.389      0.528     
##                                                               (0.777)    (0.980)    
## ------------------------------------------------------------------------------------
##   R-squared         0.675      0.675      0.697      0.697      0.697      0.697    
##   adj. R-squared    0.674      0.675      0.697      0.696      0.696      0.697    
## ====================================================================================
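
To make the nested-model comparison concrete, here is a small sketch of an ordinary least squares fit with R-squared and adjusted R-squared. The table above comes from R’s `lm`; this sketch uses plain Python instead, with an assumed natural-log and square-root transformation, ten invented player-seasons, and hypothetical helper names (`solve`, `fit_ols`). It shows the mechanics only and does not reproduce the numbers above.

```python
from math import log, sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2, adjusted R^2)."""
    X = [[1.0] + list(row) for row in features]
    n, k = len(X), len(X[0])  # k counts the intercept as a parameter
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return beta, r2, adj_r2


# Hypothetical player-seasons: (STL, BLK, DRB, PF, wingspan in inches, DBPM).
data = [
    (1.2, 0.3, 3.1, 2.4, 78.0, -0.8), (0.8, 1.5, 7.9, 3.1, 88.5, 1.9),
    (1.9, 0.5, 4.2, 2.0, 82.0, 1.1), (0.6, 0.2, 2.5, 1.8, 76.5, -2.1),
    (1.1, 2.2, 9.4, 3.5, 90.0, 3.0), (1.5, 0.9, 5.6, 2.7, 84.25, 0.7),
    (0.9, 0.4, 3.8, 2.9, 80.0, -1.4), (1.3, 1.1, 6.7, 2.2, 86.0, 1.6),
    (0.7, 0.7, 5.0, 3.3, 83.0, -0.5), (1.7, 1.8, 8.1, 2.6, 87.5, 2.7),
]
dbpm = [row[5] for row in data]

# Box-score-only model, then the same model with wingspan added.
base = [(s, sqrt(b), log(d), log(p)) for s, b, d, p, _w, _y in data]
with_wingspan = [f + (row[4],) for f, row in zip(base, data)]

_, r2_base, adj_base = fit_ols(base, dbpm)
_, r2_wing, adj_wing = fit_ols(with_wingspan, dbpm)
print(f"box score only: R^2 = {r2_base:.3f}, adjusted = {adj_base:.3f}")
print(f"plus wingspan:  R^2 = {r2_wing:.3f}, adjusted = {adj_wing:.3f}")
```

When both models are fit to the same rows, adding a predictor can never lower plain R-squared, which is why the adjusted figure is the fairer comparison. If rows with missing wingspans are dropped from some models but not others, the R-squared values are no longer directly comparable.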

While defensive box-score metrics can get one a long way toward approximating a player’s defensive ability, adding wingspan, or wingspan relative to height, produces a small but noticeable jump in the share of variance in DBPM explained: R-squared rises from 0.675 to 0.697, whereas adding height alone leaves it at 0.675. (Layering these stats doesn’t produce any significant change in R-squared; Model 6 stays at 0.697.) This indicates that wingspan is an important (but not crucial) factor in estimating a player’s defensive ability and potential. A player with a mediocre wingspan for their height can still be an elite defender (as is the case with Joakim Noah and Andrew Bogut), but players with short wingspans for their height are rarely impactful. Attributes like lateral quickness, strength, tenacity, and focus are still central to any defensive projection. (None of the size coefficients is individually significant in the table, and their ‘wrong’ signs are likely due to multicollinearity, which muddles determining the effect of each predictor.)
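
One common way to gauge multicollinearity is the variance inflation factor (VIF). The sketch below covers only the simplest case of two predictors, height and wingspan, where the VIF reduces to 1 / (1 - r²). The measurements and helper names are invented for illustration; a full check on the models above would regress each predictor on all of the others.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(xs, ys):
    """VIF shared by both predictors when a model has exactly two of them."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical heights and wingspans in inches for six players.
heights = [75.0, 77.0, 79.0, 81.0, 83.0, 84.0]
wingspans = [78.0, 81.5, 82.0, 86.0, 86.5, 90.0]

r_hw = pearson(heights, wingspans)
vif = vif_two_predictors(heights, wingspans)
print(f"height vs wingspan correlation: {r_hw:.3f}")
print(f"VIF: {vif:.1f}")
```

A VIF near 1 means the two predictors carry mostly separate information. The more strongly height and wingspan track each other, the larger the VIF and the less stable the individual coefficient estimates and their signs become.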

One limitation of this investigation is the reliability of DBPM as a measure of defensive contribution. Defensive impact is notoriously hard to measure through box scores, and it would have been helpful to compare the central variables to alternative metrics like Real Plus-Minus if long-term data were publicly available. A second limitation is the reliability of the wingspan data: uniformity is an impossible goal given that players are often still growing when measurements are taken, and in many cases different organizations conduct the measurements. After some deliberation, I took the highest recorded measurement (with a few exceptions). Finally, the data is truncated because (A) player data begins with the 2003 draft and is therefore skewed toward younger players, and (B) poor defenders may be excluded from the data because they did not play enough or are out of the league.

All in all, wingspan is an important part of defense, and given that it helps in measuring defensive contribution in a given season, it should be used in prospect projections as well. Because collegiate stats often come from small samples and can vary greatly from year to year, I suspect that wingspan will have even more predictive power when estimating future stats. More on that later.