DATA 621 Blog 1: Predicting NBA Player Efficiency by Total Points

David Quarshie

Intro

For my blogs I plan on looking at NBA player stats to see whether statistical methods can turn up some interesting findings. The data I’ll be using is on Kaggle (https://www.kaggle.com/drgilermo/nba-players-stats#Seasons_Stats.csv). It contains stats such as points, rebounds, assists, and fouls for players starting from the 1950 season.

In this first blog I’m going to focus on a player’s efficiency. The data we’re dealing with already has a stat called Player Efficiency Rating, or PER. This stat was developed by ESPN’s John Hollinger, who describes it this way: “The PER sums up all a player’s positive accomplishments, subtracts the negative accomplishments, and returns a per-minute rating of a player’s performance.” For the full breakdown of how PER is calculated, take a look here: https://www.basketball-reference.com/about/per.html

After reading about what goes into calculating PER, it looks like scoring points, the main objective in basketball, is not a major part of the formula. So let’s use regression to see how total points relate to a player’s efficiency.

Dimensions

We’ve drilled down our data to only include stats from the 2017 NBA season. The dimensions of the dataset are below. We’re working with 595 rows of data with 53 columns.

## [1] 595  53
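
The filtering step isn’t shown above, so here is a minimal sketch of how it could be done in base R. The `Year` column comes from the summary output further down; the helper name `filter_season`, the local file path, and the tiny made-up data frame are assumptions for illustration only, not the code actually used for this post.

```r
# Sketch: keep only the rows for one season and report the dimensions.
# `filter_season` is a hypothetical helper name, not part of any package.
filter_season <- function(stats, season) {
  # Guard against rows where Year is missing so they are dropped, not kept as NA rows.
  stats[!is.na(stats$Year) & stats$Year == season, , drop = FALSE]
}

# With the real file (assumed to be saved in the working directory):
# stats      <- read.csv("Seasons_Stats.csv")
# stats_2017 <- filter_season(stats, 2017)
# dim(stats_2017)

# Tiny made-up data frame so the sketch runs on its own.
toy <- data.frame(
  Year   = c(2016, 2017, 2017, NA),
  Player = c("A", "B", "C", "D"),
  PTS    = c(100, 250, 40, 10)
)
toy_2017 <- filter_season(toy, 2017)
dim(toy_2017)
```

On the toy data this keeps the two 2017 rows and all three columns.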

Data Summary

Let’s also take a look at the data’s summary. This will allow us to see some basic information like the mean, minimum, and maximum of each column. For example, we see that the average points total is around 475.

##        X              Year                   Player         Pos     
##  Min.   :24096   Min.   :2017   Ersan Ilyasova  :  4   SG     :125  
##  1st Qu.:24244   1st Qu.:2017   Lance Stephenson:  4   SF     :121  
##  Median :24393   Median :2017   Omri Casspi     :  4   PF     :119  
##  Mean   :24393   Mean   :2017   Andrew Bogut    :  3   PG     :116  
##  3rd Qu.:24542   3rd Qu.:2017   Andrew Nicholson:  3   C      :113  
##  Max.   :24690   Max.   :2017   Anthony Brown   :  3   PF-C   :  1  
##                                 (Other)         :574   (Other):  0  
##       Age              Tm            G               GS       
##  Min.   :19.00   TOT    : 53   Min.   : 1.00   Min.   : 0.00  
##  1st Qu.:23.00   NOP    : 26   1st Qu.:24.00   1st Qu.: 0.00  
##  Median :26.00   DAL    : 24   Median :55.00   Median : 8.00  
##  Mean   :26.41   BRK    : 21   Mean   :48.43   Mean   :22.14  
##  3rd Qu.:29.00   CLE    : 21   3rd Qu.:73.00   3rd Qu.:39.00  
##  Max.   :40.00   PHI    : 21   Max.   :82.00   Max.   :82.00  
##                  (Other):429                                  
##        MP            PER              TS.             X3PAr       
##  Min.   :   1   Min.   :-35.30   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.: 321   1st Qu.:  9.60   1st Qu.:0.5000   1st Qu.:0.1670  
##  Median :1013   Median : 12.70   Median :0.5370   Median :0.3330  
##  Mean   :1091   Mean   : 12.73   Mean   :0.5268   Mean   :0.3216  
##  3rd Qu.:1758   3rd Qu.: 15.55   3rd Qu.:0.5750   3rd Qu.:0.4600  
##  Max.   :3048   Max.   : 31.50   Max.   :0.8200   Max.   :1.0000  
##                                  NA's   :2        NA's   :2       
##       FTr              ORB.             DRB.             TRB.      
##  Min.   :0.0000   Min.   : 0.000   Min.   :  0.00   Min.   : 0.00  
##  1st Qu.:0.1560   1st Qu.: 1.800   1st Qu.: 10.30   1st Qu.: 6.20  
##  Median :0.2310   Median : 3.300   Median : 13.80   Median : 8.90  
##  Mean   :0.2705   Mean   : 4.912   Mean   : 15.14   Mean   :10.03  
##  3rd Qu.:0.3400   3rd Qu.: 7.550   3rd Qu.: 19.00   3rd Qu.:13.00  
##  Max.   :2.0000   Max.   :26.300   Max.   :100.00   Max.   :56.40  
##  NA's   :2                                                         
##       AST.            STL.             BLK.             TOV.      
##  Min.   : 0.00   Min.   : 0.000   Min.   : 0.000   Min.   : 0.00  
##  1st Qu.: 6.20   1st Qu.: 1.000   1st Qu.: 0.500   1st Qu.: 9.70  
##  Median :10.10   Median : 1.400   Median : 1.200   Median :12.50  
##  Mean   :12.77   Mean   : 1.535   Mean   : 1.685   Mean   :12.89  
##  3rd Qu.:17.50   3rd Qu.: 1.900   3rd Qu.: 2.300   3rd Qu.:15.60  
##  Max.   :57.30   Max.   :11.100   Max.   :20.200   Max.   :43.60  
##                                                    NA's   :2      
##       USG.        blanl              OWS              DWS       
##  Min.   : 0.00   Mode:logical   Min.   :-1.700   Min.   :0.000  
##  1st Qu.:14.60   NA's:595       1st Qu.: 0.000   1st Qu.:0.200  
##  Median :18.10                  Median : 0.500   Median :0.800  
##  Mean   :18.50                  Mean   : 1.155   Mean   :1.103  
##  3rd Qu.:21.25                  3rd Qu.: 1.600   3rd Qu.:1.700  
##  Max.   :41.70                  Max.   :11.500   Max.   :6.000  
##                                                                 
##        WS            WS.48           blank2             OBPM        
##  Min.   :-0.80   Min.   :-0.47300   Mode:logical   Min.   :-26.700  
##  1st Qu.: 0.30   1st Qu.: 0.03700   NA's:595       1st Qu.: -3.000  
##  Median : 1.30   Median : 0.08100                  Median : -1.400  
##  Mean   : 2.26   Mean   : 0.07366                  Mean   : -1.573  
##  3rd Qu.: 3.30   3rd Qu.: 0.11400                  3rd Qu.:  0.000  
##  Max.   :15.00   Max.   : 0.48000                  Max.   : 11.800  
##                                                                     
##       DBPM              BPM               VORP               FG       
##  Min.   :-7.1000   Min.   :-26.900   Min.   :-1.4000   Min.   :  0.0  
##  1st Qu.:-1.8000   1st Qu.: -3.850   1st Qu.:-0.1000   1st Qu.: 38.0  
##  Median :-0.4000   Median : -1.800   Median : 0.0000   Median :134.0  
##  Mean   :-0.3773   Mean   : -1.951   Mean   : 0.5227   Mean   :175.5  
##  3rd Qu.: 0.9000   3rd Qu.:  0.200   3rd Qu.: 0.7500   3rd Qu.:261.5  
##  Max.   :12.0000   Max.   : 15.600   Max.   :12.4000   Max.   :824.0  
##                                                                       
##       FGA              FG.              X3P              X3PA      
##  Min.   :   0.0   Min.   :0.0000   Min.   :  0.00   Min.   :  0.0  
##  1st Qu.:  91.5   1st Qu.:0.4000   1st Qu.:  2.00   1st Qu.:  9.0  
##  Median : 300.0   Median :0.4420   Median : 23.00   Median : 73.0  
##  Mean   : 384.7   Mean   :0.4412   Mean   : 43.93   Mean   :122.9  
##  3rd Qu.: 562.0   3rd Qu.:0.4850   3rd Qu.: 69.00   3rd Qu.:195.0  
##  Max.   :1941.0   Max.   :1.0000   Max.   :324.00   Max.   :789.0  
##                   NA's   :2                                        
##       X3P.             X2P             X2PA             X2P.       
##  Min.   :0.0000   Min.   :  0.0   Min.   :   0.0   Min.   :0.0000  
##  1st Qu.:0.2660   1st Qu.: 26.5   1st Qu.:  57.5   1st Qu.:0.4460  
##  Median :0.3340   Median : 87.0   Median : 183.0   Median :0.4910  
##  Mean   :0.3011   Mean   :131.5   Mean   : 261.7   Mean   :0.4865  
##  3rd Qu.:0.3760   3rd Qu.:195.5   3rd Qu.: 386.0   3rd Qu.:0.5370  
##  Max.   :1.0000   Max.   :730.0   Max.   :1421.0   Max.   :1.0000  
##  NA's   :46                                        NA's   :5       
##       eFG.              FT              FTA             FT.        
##  Min.   :0.0000   Min.   :  0.00   Min.   :  0.0   Min.   :0.0000  
##  1st Qu.:0.4650   1st Qu.: 12.00   1st Qu.: 18.0   1st Qu.:0.6670  
##  Median :0.5000   Median : 45.00   Median : 63.0   Median :0.7650  
##  Mean   :0.4945   Mean   : 79.87   Mean   :103.7   Mean   :0.7376  
##  3rd Qu.:0.5360   3rd Qu.:103.00   3rd Qu.:134.5   3rd Qu.:0.8330  
##  Max.   :1.0000   Max.   :746.00   Max.   :881.0   Max.   :1.0000  
##  NA's   :2                                         NA's   :24      
##       ORB              DRB             TRB              AST       
##  Min.   :  0.00   Min.   :  0.0   Min.   :   0.0   Min.   :  0.0  
##  1st Qu.:  8.00   1st Qu.: 36.0   1st Qu.:  45.0   1st Qu.: 18.5  
##  Median : 25.00   Median :119.0   Median : 151.0   Median : 58.0  
##  Mean   : 45.51   Mean   :150.9   Mean   : 196.4   Mean   :101.1  
##  3rd Qu.: 60.50   3rd Qu.:214.0   3rd Qu.: 280.5   3rd Qu.:132.5  
##  Max.   :345.00   Max.   :817.0   Max.   :1116.0   Max.   :906.0  
##                                                                   
##       STL              BLK              TOV               PF        
##  Min.   :  0.00   Min.   :  0.00   Min.   :  0.00   Min.   :  0.00  
##  1st Qu.:  8.00   1st Qu.:  3.00   1st Qu.: 14.00   1st Qu.: 29.00  
##  Median : 27.00   Median : 11.00   Median : 43.00   Median : 84.00  
##  Mean   : 34.66   Mean   : 21.51   Mean   : 60.33   Mean   : 90.23  
##  3rd Qu.: 52.00   3rd Qu.: 29.00   3rd Qu.: 88.50   3rd Qu.:139.00  
##  Max.   :157.00   Max.   :214.00   Max.   :464.00   Max.   :278.00  
##                                                                     
##       PTS        
##  Min.   :   0.0  
##  1st Qu.: 103.0  
##  Median : 357.0  
##  Mean   : 474.7  
##  3rd Qu.: 685.0  
##  Max.   :2558.0  
## 
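
As a small sketch of what `summary()` reports for a single column, the example below runs it on two made-up vectors (the values are invented for illustration and are not taken from the dataset). Note the extra `NA's` entry that appears when a column has missing values, which is what shows up under columns like `X3P.` and `FT.` above.

```r
# Made-up point totals, just to show the shape of summary() output.
toy_pts <- c(0, 100, 400, 1500)
pts_summary <- summary(toy_pts)
pts_summary

# A made-up percentage column with one missing value.
toy_pct <- c(0.50, NA, 0.25)
pct_summary <- summary(toy_pct)
pct_summary  # includes an "NA's" entry counting the missing values
```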

Fields

Now that we have a summary of our data, let’s trim it down some more to keep only PER and some stats that have to do with scoring. We’ll pull true shooting %, field goal %, 2-point and 3-point shooting %, effective field goal %, free throw %, and total points. This should give our regression model enough info to see how scoring affects PER.

##       PER              TS.              FG.              X3P.       
##  Min.   :-35.30   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000  
##  1st Qu.:  9.60   1st Qu.:0.5000   1st Qu.:0.4000   1st Qu.:0.2660  
##  Median : 12.70   Median :0.5370   Median :0.4420   Median :0.3340  
##  Mean   : 12.73   Mean   :0.5268   Mean   :0.4412   Mean   :0.3011  
##  3rd Qu.: 15.55   3rd Qu.:0.5750   3rd Qu.:0.4850   3rd Qu.:0.3760  
##  Max.   : 31.50   Max.   :0.8200   Max.   :1.0000   Max.   :1.0000  
##                   NA's   :2        NA's   :2        NA's   :46      
##       X2P.             eFG.             FT.              PTS        
##  Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :   0.0  
##  1st Qu.:0.4460   1st Qu.:0.4650   1st Qu.:0.6670   1st Qu.: 103.0  
##  Median :0.4910   Median :0.5000   Median :0.7650   Median : 357.0  
##  Mean   :0.4865   Mean   :0.4945   Mean   :0.7376   Mean   : 474.7  
##  3rd Qu.:0.5370   3rd Qu.:0.5360   3rd Qu.:0.8330   3rd Qu.: 685.0  
##  Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :2558.0  
##  NA's   :5        NA's   :2        NA's   :24
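
Here is a rough sketch of that column selection in base R. The column names are the ones printed in the summary above (by default `read.csv` rewrites headers such as `TS%` and `3P%` into valid R names like `TS.` and `X3P.`), and `stats_final` is the name that appears in the model call below. The helper `select_scoring` and the one-row toy data frame are hypothetical and only there so the snippet runs on its own.

```r
# Columns kept for the model: PER plus the scoring-related stats.
scoring_cols <- c("PER", "TS.", "FG.", "X3P.", "X2P.", "eFG.", "FT.", "PTS")

# Hypothetical helper: subset a stats data frame to the scoring columns.
select_scoring <- function(stats) {
  stats[, scoring_cols, drop = FALSE]
}

# Made-up single row with two extra columns (Age, AST) that should be dropped.
toy <- data.frame(
  Age = 25, PER = 15.0, TS. = 0.55, FG. = 0.45, X3P. = 0.35,
  X2P. = 0.50, eFG. = 0.52, FT. = 0.80, AST = 120, PTS = 600
)
stats_final <- select_scoring(toy)
names(stats_final)
```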

Model

With just PER and the scoring stats left in our dataset, we can use R’s lm function to fit a linear model of PER on all of the remaining columns and see how those stats play into PER.

## 
## Call:
## lm(formula = PER ~ ., data = stats_final)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.1462  -1.7146  -0.2006   1.4715  11.7290 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -8.444752   1.016989  -8.304 8.68e-16 ***
## TS.          45.844088   5.275127   8.691  < 2e-16 ***
## FG.          51.700570   4.268011  12.114  < 2e-16 ***
## X3P.          2.605305   1.213459   2.147   0.0323 *  
## X2P.        -11.067119   2.433360  -4.548 6.73e-06 ***
## eFG.        -44.764416   5.812190  -7.702 6.76e-14 ***
## FT.          -1.735443   1.090412  -1.592   0.1121    
## PTS           0.004940   0.000263  18.785  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.59 on 523 degrees of freedom
##   (64 observations deleted due to missingness)
## Multiple R-squared:  0.7588, Adjusted R-squared:  0.7556 
## F-statistic: 235.1 on 7 and 523 DF,  p-value: < 2.2e-16
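
The formula `PER ~ .` in the call above regresses PER on every other column, and `lm` by default drops any row with a missing value, which is where the “observations deleted due to missingness” line comes from. The sketch below shows the same pattern on simulated data; the numbers are randomly generated for illustration and have nothing to do with the real NBA stats.

```r
set.seed(1)
n <- 30

# Simulated predictors on roughly the same scales as the real columns.
toy <- data.frame(
  TS. = runif(n, 0.40, 0.65),
  PTS = round(runif(n, 0, 2000))
)
# Simulated response; the coefficients here are arbitrary.
toy$PER <- 5 + 10 * toy$TS. + 0.005 * toy$PTS + rnorm(n)

# Blank out two values to mimic rows where a percentage is undefined.
toy$TS.[c(3, 7)] <- NA

# Same formula style as the post: PER against all other columns.
fit <- lm(PER ~ ., data = toy)

n_used    <- nobs(fit)            # rows actually used in the fit
n_dropped <- nrow(toy) - n_used   # rows removed because of NAs
c(used = n_used, dropped = n_dropped)
```

Calling `summary(fit)` on the result prints a coefficient table in the same layout as the one above.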

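Our results show some interesting things. For one, looking at the p-values, free throw % (p = 0.11) is not statistically significant and 3-point % (p = 0.03) only clears the 0.05 level, which suggests they’re not that critical for PER in this model. True shooting %, field goal %, and total points, on the other hand, all have very small p-values and are important scoring metrics for PER. Their coefficients tell us how much PER changes for a one-unit change in each stat with the others held fixed. The coefficients on true shooting % and field goal % are large and positive, so a player who raises either one should see their PER rise quickly. The coefficient on total points is small, so each extra point adds only a little PER, but it is measured per single point while the percentages run from 0 to 1, so the coefficients are not directly comparable in size. Points also have the largest t value in the model.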
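
To put the coefficients on a more comparable footing, here is a quick back-of-the-envelope calculation using the estimates from the table above. The step sizes (one percentage point of true shooting %, 100 extra points) are my own choice for illustration, and each effect assumes every other predictor stays fixed, which is a simplification since the shooting stats move together.

```r
# Estimates copied from the coefficient table above.
coef_ts  <- 45.844088   # TS.
coef_pts <- 0.004940    # PTS

# Predicted PER change for a 0.01 rise in true shooting % (e.g. 0.53 -> 0.54).
delta_ts <- coef_ts * 0.01

# Predicted PER change for 100 additional points over the season.
delta_pts <- coef_pts * 100

round(c(ts_plus_one_pct = delta_ts, pts_plus_100 = delta_pts), 3)
```

Under these assumptions the two changes are worth about the same amount of PER, roughly 0.46 and 0.49.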