Predicting Playoff Qualifications for NBA Teams

Executive Summary

Summary

The National Basketball Association (NBA) is a men’s professional basketball league in North America. It is one of the four major professional sports leagues in the United States and Canada and is widely considered the premier men’s professional basketball league in the world. The NBA playoffs are a best-of-seven elimination tournament held annually after the regular season to determine the league’s champion.

The purpose of this project is to build regression models that predict which teams will qualify for the NBA playoffs from historical performance statistics. We obtained NBA data for the period 1980-2011 from Kaggle.com. The models take several factors into account to estimate a team’s probability of reaching the playoffs, and the dataset provides the observations and covariates needed to make these predictions.

This report is divided into several chapters. It starts with an explanation of the covariates, such as the number of field goals of each type attempted and made, free throws attempted and made, rebounds of each type, assists, steals, blocks, and turnovers.

The report then walks through the stages of the regression model-building process: model specification, parameter estimation, model adequacy checking, and model validation.

The report concludes with the final model, which predicts whether a team is likely to qualify for the playoffs.

Exploratory Data Analysis

We loaded the NBA training dataset into R and checked it for missing values and inconsistencies. Using domain knowledge, we identified the covariates for the analysis in advance. We then performed exploratory data analysis to summarize the main characteristics of the data set.

Dataset dimensions

## [1] 835  20

Column Names

##  [1] "SeasonEnd" "Team"      "Playoffs"  "W"         "PTS"       "oppPTS"   
##  [7] "FG"        "FGA"       "X2P"       "X2PA"      "X3P"       "X3PA"     
## [13] "FT"        "FTA"       "ORB"       "DRB"       "AST"       "STL"      
## [19] "BLK"       "TOV"

The statistics record whether a team qualified for the playoffs in a given season, how many games it won, how many points it scored, and the other attributes named in the columns above. From the available attributes, we initially consider the following covariates:

  • X2PA: 2 Point Field Goals Attempted
  • X3PA: 3 Point Field Goals Attempted
  • FTA: Free Throws Attempted
  • ORB: Offensive Rebounds
  • DRB: Defensive Rebounds
  • AST: Assists
  • STL: Steals
  • BLK: Blocks
  • TOV: Turnovers

The dataset is clean: we found no null or missing values, so no imputation was needed.
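The checks described above can be sketched in base R. The data frame `nba_demo` and its values are made up for illustration and are not the Kaggle data; only the column names are borrowed from the listing above.

```r
# Hypothetical stand-in for the NBA data; the values are invented.
nba_demo <- data.frame(
  Team     = c("A", "B", "C", "D"),
  Playoffs = c(1, 0, 1, 0),
  W        = c(55, 30, 48, 25),
  PTS      = c(8800, 8100, 8600, 7900)
)

dim(nba_demo)                                 # number of rows and columns

missing_per_col <- colSums(is.na(nba_demo))   # count of NA values per column
missing_per_col

any_missing <- anyNA(nba_demo)                # TRUE if any value is missing
any_missing
```

If `any_missing` were `TRUE`, the per-column counts would show where imputation or row removal would have to be considered.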

Correlation between the variables

We compute the correlation between all the variables in the dataset. In the correlation plot, a value displayed in dark blue indicates that the covariates are positively correlated, and a value displayed in dark red indicates that they are negatively correlated.
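As a rough illustration of the underlying computation, the sketch below builds a correlation matrix with base R's `cor()`. The data are synthetic and the coefficients used to generate them are arbitrary assumptions; the colour-coded plot itself is not reproduced here.

```r
set.seed(1)  # synthetic data, for illustration only
n    <- 50
X2PA <- rnorm(n, mean = 5500, sd = 300)
TOV  <- rnorm(n, mean = 1300, sd = 100)
PTS  <- 1.0 * X2PA - 0.5 * TOV + rnorm(n, sd = 100)  # arbitrary relationship
demo <- data.frame(X2PA, TOV, PTS)

corr_matrix <- cor(demo)   # pairwise Pearson correlations
round(corr_matrix, 2)
```

Entries near 1 correspond to the dark blue cells of such a plot and entries near -1 to the dark red cells.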

Histograms

We use histograms to check the overall spread of the data and to determine whether each variable is spread evenly or has a positive or negative skew.

  • Inference

All the histograms have symmetric or roughly symmetric shapes, so there is little skewness in the significant variables that we selected to build our model.
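A visual reading of a histogram can be complemented by a numeric skewness value. The sketch below computes the sample skewness by hand on synthetic, normally distributed data; the variable `x` and its parameters are assumptions for illustration, not values from the NBA data.

```r
set.seed(5)  # synthetic data, for illustration only
x <- rnorm(500, mean = 8400, sd = 500)

# Histogram bin counts without drawing the plot
bin_counts <- hist(x, plot = FALSE)$counts
bin_counts

# Sample skewness: third central moment scaled by the cubed standard deviation
skew <- mean((x - mean(x))^3) / sd(x)^3
skew
```

A value close to 0 indicates a roughly symmetric distribution, a clearly positive value a right skew, and a clearly negative value a left skew.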

Box Plot

We used box plots to identify outliers in the data set, with two colours to distinguish playoff outcomes. The green box plot shows success (the team qualified for the playoffs) and the yellow box plot shows failure (the team did not qualify).

  • Inference

Similarly, the box plots for all the significant covariates were symmetric or roughly symmetric with few outliers. This is consistent with the significant covariates being approximately normally distributed, although symmetry alone does not establish normality.

Plotting the data

We plot the number of wins against the most significant covariates.

  • Inference

After plotting the number of wins against the points scored by a team, the points scored by its opponents, and the difference between these two covariates, we found that the number of wins correlates more strongly with the points differential than with either the points scored by the team or the points scored by its opponents.
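The comparison can be sketched as follows. The data are synthetic and wins are generated from the differential by construction, so the block only illustrates how the differential and the three correlations are computed; it is not evidence about the real data.

```r
set.seed(42)  # synthetic seasons, for illustration only
n       <- 100
PTS     <- rnorm(n, mean = 8400, sd = 500)
oppPTS  <- rnorm(n, mean = 8400, sd = 500)
PTSdiff <- PTS - oppPTS                          # points differential

# Assumption: wins depend on the differential plus noise
W <- 41 + 0.03 * PTSdiff + rnorm(n, sd = 3)

cors <- c(PTS     = cor(W, PTS),
          oppPTS  = cor(W, oppPTS),
          PTSdiff = cor(W, PTSdiff))
round(cors, 2)
```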

Model Building

Our idea is to create three different models: one for the number of wins, one for the number of points scored by the team under consideration, and one for the number of points scored by its opponents.

We also confirmed the variable selection using the AIC and BIC criteria. Both criteria retained the covariates that we had chosen, so we conclude that the variables we identified are suitable for the rest of the modeling process.

The R output of the selection procedures is shown for each model below.

We considered only the variables that matter for building the model. The backward selection starts from the full set of covariates listed in the Exploratory Data Analysis section.

## 
## Call:
## lm(formula = W ~ PTSdiff, data = Nba)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.7393 -2.1018 -0.0672  2.0265 10.6026 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4.100e+01  1.059e-01   387.0   <2e-16 ***
## PTSdiff     3.259e-02  2.793e-04   116.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.061 on 833 degrees of freedom
## Multiple R-squared:  0.9423, Adjusted R-squared:  0.9423 
## F-statistic: 1.361e+04 on 1 and 833 DF,  p-value: < 2.2e-16

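We get the following equation for the wins model, where PTSdiff is the points differential (points scored minus points scored by opponents) and the coefficients are rounded from the output above.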

W = 41 + 0.0326 * PTSdiff
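The equation can be applied directly. The sketch below uses the rounded coefficients reported above; the function name `predict_wins` and the target of 45 wins are hypothetical choices made for illustration.

```r
# Rounded coefficients from the fitted wins model reported above.
intercept <- 41
slope     <- 0.0326

# Hypothetical helper: predicted wins for a given points differential.
predict_wins <- function(pts_diff) intercept + slope * pts_diff

predict_wins(c(-300, 0, 300))   # 31.22 41.00 50.78

# Points differential needed to reach a target number of wins.
# target_wins = 45 is an illustrative value, not a figure from the report.
target_wins   <- 45
required_diff <- (target_wins - intercept) / slope
required_diff                   # about 122.7
```

Under this model, each additional 100 points of season differential corresponds to roughly 3.3 more wins.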

Linear Model 1 - Points Model

Backward Selection

## 
## Call:
## lm(formula = PTS ~ X2PA + X3PA + FTA + AST + ORB + DRB + TOV + 
##     STL + BLK, data = Nba.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -512.75 -127.46    4.74  124.93  573.76 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.964e+03  2.548e+02  -7.708 5.65e-14 ***
## X2PA         1.037e+00  3.568e-02  29.060  < 2e-16 ***
## X3PA         1.262e+00  4.713e-02  26.786  < 2e-16 ***
## FTA          1.123e+00  4.113e-02  27.308  < 2e-16 ***
## AST          8.913e-01  5.170e-02  17.240  < 2e-16 ***
## ORB         -9.212e-01  9.638e-02  -9.558  < 2e-16 ***
## DRB          3.015e-03  7.598e-02   0.040   0.9684    
## TOV         -2.289e-02  7.523e-02  -0.304   0.7611    
## STL         -2.425e-01  1.101e-01  -2.201   0.0281 *  
## BLK         -7.056e-03  1.106e-01  -0.064   0.9492    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 189.4 on 574 degrees of freedom
## Multiple R-squared:  0.8949, Adjusted R-squared:  0.8932 
## F-statistic:   543 on 9 and 574 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = PTS ~ X2PA + X3PA + FTA + AST + ORB + DRB + STL + 
##     BLK, data = Nba.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -510.28 -129.72    5.65  124.45  575.18 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.993e+03  2.364e+02  -8.430 2.79e-16 ***
## X2PA         1.038e+00  3.556e-02  29.184  < 2e-16 ***
## X3PA         1.267e+00  4.491e-02  28.204  < 2e-16 ***
## FTA          1.121e+00  4.072e-02  27.537  < 2e-16 ***
## AST          8.915e-01  5.166e-02  17.258  < 2e-16 ***
## ORB         -9.244e-01  9.571e-02  -9.658  < 2e-16 ***
## DRB          4.184e-03  7.582e-02   0.055   0.9560    
## STL         -2.484e-01  1.083e-01  -2.294   0.0222 *  
## BLK         -1.134e-02  1.097e-01  -0.103   0.9177    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 189.3 on 575 degrees of freedom
## Multiple R-squared:  0.8949, Adjusted R-squared:  0.8934 
## F-statistic: 611.9 on 8 and 575 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = PTS ~ X2PA + X3PA + FTA + AST + ORB + STL + BLK, 
##     data = Nba.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -509.61 -129.50    5.55  124.72  575.97 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.986e+03  2.032e+02  -9.774   <2e-16 ***
## X2PA         1.038e+00  3.418e-02  30.373   <2e-16 ***
## X3PA         1.267e+00  4.207e-02  30.130   <2e-16 ***
## FTA          1.122e+00  4.040e-02  27.766   <2e-16 ***
## AST          8.919e-01  5.100e-02  17.487   <2e-16 ***
## ORB         -9.257e-01  9.269e-02  -9.988   <2e-16 ***
## STL         -2.506e-01  1.005e-01  -2.493   0.0129 *  
## BLK         -9.037e-03  1.013e-01  -0.089   0.9290    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 189.1 on 576 degrees of freedom
## Multiple R-squared:  0.8949, Adjusted R-squared:  0.8936 
## F-statistic: 700.5 on 7 and 576 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = PTS ~ X2PA + X3PA + FTA + AST + ORB + STL, data = Nba.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -510.07 -129.69    5.45  123.88  574.72 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.990e+03  1.978e+02 -10.063   <2e-16 ***
## X2PA         1.039e+00  3.386e-02  30.677   <2e-16 ***
## X3PA         1.268e+00  4.153e-02  30.537   <2e-16 ***
## FTA          1.121e+00  4.021e-02  27.891   <2e-16 ***
## AST          8.914e-01  5.061e-02  17.613   <2e-16 ***
## ORB         -9.267e-01  9.193e-02 -10.080   <2e-16 ***
## STL         -2.504e-01  1.004e-01  -2.494   0.0129 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 189 on 577 degrees of freedom
## Multiple R-squared:  0.8949, Adjusted R-squared:  0.8938 
## F-statistic: 818.6 on 6 and 577 DF,  p-value: < 2.2e-16

The final variables which we have selected for our Points model are X2PA, X3PA, FTA, AST, ORB and STL.
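One common form of backward elimination repeatedly drops the least significant term until every remaining term is significant. The sketch below shows that variant on synthetic data; the 0.05 threshold, the variable names, and the rule of dropping the largest p-value first are assumptions, and the removal order in the output above need not follow this rule.

```r
set.seed(7)  # synthetic data, for illustration only
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
noise_var <- rnorm(n)                 # unrelated to the response
y  <- 2 * x1 - 3 * x2 + rnorm(n)
d  <- data.frame(y, x1, x2, noise_var)

fit   <- lm(y ~ x1 + x2 + noise_var, data = d)
alpha <- 0.05                         # assumed significance threshold

repeat {
  # p-values of all terms except the intercept
  p <- summary(fit)$coefficients[-1, "Pr(>|t|)", drop = FALSE]
  worst <- which.max(p)
  if (p[worst] <= alpha) break        # every remaining term is significant
  # refit without the least significant term
  fit <- update(fit, as.formula(paste(". ~ . -", rownames(p)[worst])))
}

formula(fit)
```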

Forward Selection

## Single term additions
## 
## Model:
## PTS ~ 1
##        Df Sum of Sq       RSS    AIC  F value    Pr(>F)    
## <none>              195977838 7432.6                       
## X2PA    1  99352922  96624916 7021.6 598.4316 < 2.2e-16 ***
## X3PA    1  52396167 143581671 7252.9 212.3848 < 2.2e-16 ***
## FTA     1  82767605 113210233 7114.1 425.4982 < 2.2e-16 ***
## AST     1 109859692  86118146 6954.4 742.4491 < 2.2e-16 ***
## ORB     1  49087464 146890373 7266.2 194.4913 < 2.2e-16 ***
## DRB     1   1561939 194415898 7429.9   4.6758     0.031 *  
## TOV     1  31699839 164277998 7331.5 112.3054 < 2.2e-16 ***
## STL     1  33787416 162190422 7324.1 121.2419 < 2.2e-16 ***
## BLK     1   5546264 190431574 7417.8  16.9506 4.392e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Single term additions
## 
## Model:
## PTS ~ X2PA
##        Df Sum of Sq      RSS    AIC  F value    Pr(>F)    
## <none>              96624916 7021.6                       
## X3PA    1  24979952 71644963 6848.9 202.5732 < 2.2e-16 ***
## FTA     1  20855031 75769884 6881.6 159.9154 < 2.2e-16 ***
## AST     1  26016833 70608083 6840.4 214.0800 < 2.2e-16 ***
## ORB     1    881394 95743521 7018.3   5.3486  0.021088 *  
## DRB     1   6911241 89713675 6980.3  44.7583 5.252e-11 ***
## TOV     1   1330384 95294532 7015.5   8.1112  0.004555 ** 
## STL     1   1419988 95204927 7015.0   8.6657  0.003372 ** 
## BLK     1      4080 96620836 7023.6   0.0245  0.875594    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Single term additions
## 
## Model:
## PTS ~ X2PA + X3PA
##        Df Sum of Sq      RSS    AIC  F value    Pr(>F)    
## <none>              71644963 6848.9                       
## FTA     1  29263874 42381089 6544.3 400.4863 < 2.2e-16 ***
## AST     1  22603750 49041213 6629.6 267.3297 < 2.2e-16 ***
## ORB     1   4585823 67059140 6812.3  39.6632 5.965e-10 ***
## DRB     1   3210553 68434410 6824.1  27.2103 2.544e-07 ***
## TOV     1    222221 71422742 6849.1   1.8046   0.17969    
## STL     1    313060 71331903 6848.4   2.5455   0.11115    
## BLK     1    597793 71047170 6846.0   4.8801   0.02756 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Single term additions
## 
## Model:
## PTS ~ X2PA + X3PA + FTA
##        Df Sum of Sq      RSS    AIC  F value    Pr(>F)    
## <none>              42381089 6544.3                       
## AST     1  17354957 25026133 6238.7 401.5211 < 2.2e-16 ***
## ORB     1  10683352 31697737 6376.7 195.1452 < 2.2e-16 ***
## DRB     1   2198003 40183086 6515.2  31.6711 2.847e-08 ***
## TOV     1    401203 41979886 6540.8   5.5335   0.01899 *  
## STL     1    146978 42234111 6544.3   2.0150   0.15629    
## BLK     1      8314 42372775 6546.2   0.1136   0.73620    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Single term additions
## 
## Model:
## PTS ~ X2PA + X3PA + FTA + AST
##        Df Sum of Sq      RSS    AIC  F value    Pr(>F)    
## <none>              25026133 6238.7                       
## ORB     1   4202347 20823786 6133.3 116.6434 < 2.2e-16 ***
## DRB     1    458766 24567367 6229.9  10.7935   0.00108 ** 
## TOV     1    278193 24747939 6234.1   6.4973   0.01106 *  
## STL     1    796259 24229874 6221.8  18.9946 1.552e-05 ***
## BLK     1     60493 24965640 6239.3   1.4005   0.23712    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Single term additions
## 
## Model:
## PTS ~ X2PA + X3PA + FTA + AST + ORB
##        Df Sum of Sq      RSS    AIC F value  Pr(>F)  
## <none>              20823786 6133.3                  
## DRB     1     28813 20794973 6134.5  0.7995 0.37163  
## TOV     1     24081 20799705 6134.6  0.6680 0.41408  
## STL     1    222037 20601749 6129.1  6.2187 0.01292 *
## BLK     1        21 20823765 6135.3  0.0006 0.98058  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Single term additions
## 
## Model:
## PTS ~ X2PA + X3PA + FTA + AST + ORB + STL
##        Df Sum of Sq      RSS    AIC F value Pr(>F)
## <none>              20601749 6129.1               
## DRB     1      10.5 20601739 6131.1  0.0003 0.9863
## TOV     1    3558.4 20598191 6131.0  0.0995 0.7525
## BLK     1     284.5 20601465 6131.0  0.0080 0.9290

The final variables after using the forward selection method are the same as those we got using the backward selection method.
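The "Single term additions" tables above have the shape produced by `add1()` from base R's stats package. A minimal sketch of one forward step on synthetic data follows; the variables and the choice to pick the lowest-AIC row are illustrative assumptions.

```r
set.seed(11)  # synthetic data, for illustration only
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)                        # unrelated to the response
y  <- 2 * x1 - 3 * x2 + rnorm(n)
d  <- data.frame(y, x1, x2, x3)

null_fit <- lm(y ~ 1, data = d)       # intercept-only starting model

# F-test and AIC for adding each candidate term to the current model
step1 <- add1(null_fit, scope = ~ x1 + x2 + x3, test = "F")
step1

# Candidate with the lowest AIC (the "<none>" row means: add nothing)
best <- rownames(step1)[which.min(step1$AIC)]
best

# Add the chosen term and repeat add1() on the updated model
fit1 <- update(null_fit, as.formula(paste(". ~ . +", best)))
formula(fit1)
```

Repeating the step until no candidate improves the criterion yields the forward-selected model.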

Linear Model 2 - Opposition Points Model

Model 2 AIC Selection

## Start:  AIC=8511.67
## oppPTS ~ X2PA + X3PA + FTA + AST + ORB + DRB + TOV + STL + BLK
## 
##        Df Sum of Sq       RSS    AIC
## - AST   1     26114  21819163 8510.7
## - BLK   1     43569  21836618 8511.3
## <none>               21793049 8511.7
## - STL   1  13185641  34978690 8904.8
## - ORB   1  16184366  37977415 8973.4
## - TOV   1  16586642  38379691 8982.2
## - DRB   1  19228733  41021782 9037.8
## - FTA   1  19834232  41627280 9050.1
## - X3PA  1  78995378 100788426 9788.4
## - X2PA  1 104139976 125933025 9974.4
## 
## Step:  AIC=8510.67
## oppPTS ~ X2PA + X3PA + FTA + ORB + DRB + TOV + STL + BLK
## 
##        Df Sum of Sq       RSS     AIC
## - BLK   1     47213  21866376  8510.5
## <none>               21819163  8510.7
## + AST   1     26114  21793049  8511.7
## - STL   1  14347741  36166903  8930.6
## - TOV   1  16622373  38441536  8981.6
## - ORB   1  17211444  39030606  8994.3
## - FTA   1  19996810  41815973  9051.8
## - DRB   1  20066509  41885671  9053.2
## - X3PA  1  79951060 101770223  9794.5
## - X2PA  1 124299917 146119080 10096.5
## 
## Step:  AIC=8510.48
## oppPTS ~ X2PA + X3PA + FTA + ORB + DRB + TOV + STL
## 
##        Df Sum of Sq       RSS     AIC
## <none>               21866376  8510.5
## + BLK   1     47213  21819163  8510.7
## + AST   1     29757  21836618  8511.3
## - STL   1  14717721  36584097  8938.2
## - TOV   1  16586932  38453307  8979.8
## - ORB   1  18095365  39961741  9012.0
## - FTA   1  19985322  41851697  9050.5
## - DRB   1  24139548  46005923  9129.6
## - X3PA  1  84414035 106280410  9828.7
## - X2PA  1 130175808 152042184 10127.7
## 
## Call:
## lm(formula = oppPTS ~ X2PA + X3PA + FTA + ORB + DRB + TOV + STL, 
##     data = NBA)
## 
## Coefficients:
## (Intercept)         X2PA         X3PA          FTA          ORB          DRB  
##    140.6485       1.6217       1.8463       0.8056      -1.6858      -1.4910  
##         TOV          STL  
##      1.3384      -1.8285
## Start:  AIC=7453.49
## oppPTS ~ 1
## 
##        Df Sum of Sq       RSS    AIC
## + X2PA  1 115220295  87897909 6966.3
## + TOV   1  63949743 139168461 7234.7
## + X3PA  1  62546369 140571835 7240.5
## + ORB   1  61752270 141365934 7243.8
## + FTA   1  57186574 145931630 7262.4
## + AST   1  55881080 147237124 7267.6
## + STL   1  20136861 182981343 7394.5
## + DRB   1   7808437 195309767 7432.6
## <none>              203118204 7453.5
## + BLK   1    371537 202746667 7454.4
## 
## Step:  AIC=6966.32
## oppPTS ~ X2PA
## 
##        Df Sum of Sq       RSS    AIC
## + X3PA  1  25913903  61984007 6764.3
## + FTA   1   5279126  82618783 6932.2
## + BLK   1   3651602  84246307 6943.5
## + DRB   1   1816935  86080974 6956.1
## + TOV   1   1715315  86182594 6956.8
## + STL   1    539682  87358227 6964.7
## <none>               87897909 6966.3
## + ORB   1    276437  87621472 6966.5
## + AST   1    114508  87783401 6967.6
## - X2PA  1 115220295 203118204 7453.5
## 
## Step:  AIC=6764.33
## oppPTS ~ X2PA + X3PA
## 
##        Df Sum of Sq       RSS    AIC
## + FTA   1   9779066  52204941 6666.1
## + TOV   1   9626349  52357657 6667.8
## + DRB   1   5117153  56866854 6716.0
## + ORB   1   3023781  58960226 6737.1
## + STL   1   1944510  60039497 6747.7
## + BLK   1   1459852  60524154 6752.4
## <none>               61984007 6764.3
## + AST   1       724  61983283 6766.3
## - X3PA  1  25913903  87897909 6966.3
## - X2PA  1  78587828 140571835 7240.5
## 
## Step:  AIC=6666.06
## oppPTS ~ X2PA + X3PA + FTA
## 
##        Df Sum of Sq       RSS    AIC
## + TOV   1   6367732  45837208 6592.1
## + DRB   1   5984683  46220258 6597.0
## + ORB   1   5761960  46442980 6599.8
## + STL   1   3864721  48340220 6623.1
## + BLK   1   2610890  49594050 6638.1
## <none>               52204941 6666.1
## + AST   1    148368  52056572 6666.4
## - FTA   1   9779066  61984007 6764.3
## - X3PA  1  30413843  82618783 6932.2
## - X2PA  1  72690013 124894954 7173.5
## 
## Step:  AIC=6592.09
## oppPTS ~ X2PA + X3PA + FTA + TOV
## 
##        Df Sum of Sq       RSS    AIC
## + ORB   1   8447416  37389792 6475.1
## + STL   1   6863832  38973377 6499.4
## + DRB   1   4668287  41168921 6531.4
## + BLK   1   3794236  42042972 6543.6
## <none>               45837208 6592.1
## + AST   1    103041  45734168 6592.8
## - TOV   1   6367732  52204941 6666.1
## - FTA   1   6520449  52357657 6667.8
## - X3PA  1  36166879  82004088 6929.8
## - X2PA  1  71652642 117489850 7139.8
## 
## Step:  AIC=6475.13
## oppPTS ~ X2PA + X3PA + FTA + TOV + ORB
## 
##        Df Sum of Sq       RSS    AIC
## + DRB   1   9601575  27788217 6303.8
## + STL   1   5194250  32195542 6389.8
## + BLK   1   3113154  34276638 6426.4
## + AST   1   1732849  35656943 6449.4
## <none>               37389792 6475.1
## - ORB   1   8447416  45837208 6592.1
## - FTA   1   8918978  46308770 6598.1
## - TOV   1   9053188  46442980 6599.8
## - X3PA  1  43685104  81074896 6925.1
## - X2PA  1  74321780 111711572 7112.3
## 
## Step:  AIC=6303.81
## oppPTS ~ X2PA + X3PA + FTA + TOV + ORB + DRB
## 
##        Df Sum of Sq       RSS    AIC
## + STL   1  11854495  15933722 5981.0
## + AST   1    927997  26860220 6286.0
## + BLK   1    385905  27402312 6297.6
## <none>               27788217 6303.8
## - TOV   1   7752957  35541174 6445.5
## - DRB   1   9601575  37389792 6475.1
## - FTA   1  11301026  39089243 6501.1
## - ORB   1  13380704  41168921 6531.4
## - X3PA  1  50859832  78648049 6909.4
## - X2PA  1  83172186 110960403 7110.4
## 
## Step:  AIC=5981
## oppPTS ~ X2PA + X3PA + FTA + TOV + ORB + DRB + STL
## 
##        Df Sum of Sq       RSS    AIC
## + BLK   1     64719  15869002 5980.6
## <none>               15933722 5981.0
## + AST   1     30573  15903148 5981.9
## - TOV   1  11605948  27539670 6298.6
## - STL   1  11854495  27788217 6303.8
## - ORB   1  12578729  28512450 6318.8
## - FTA   1  14857565  30791287 6363.7
## - DRB   1  16261820  32195542 6389.8
## - X3PA  1  60276315  76210037 6893.0
## - X2PA  1  94355627 110289349 7108.9
## 
## Step:  AIC=5980.63
## oppPTS ~ X2PA + X3PA + FTA + TOV + ORB + DRB + STL + BLK
## 
##        Df Sum of Sq       RSS    AIC
## <none>               15869002 5980.6
## - BLK   1     64719  15933722 5981.0
## + AST   1     26387  15842615 5981.7
## - STL   1  11533309  27402312 6297.6
## - TOV   1  11638243  27507245 6299.9
## - ORB   1  11864146  27733148 6304.7
## - DRB   1  12976522  28845524 6327.6
## - FTA   1  14896711  30765713 6365.3
## - X3PA  1  56808807  72677809 6867.3
## - X2PA  1  89337007 105206009 7083.3
## 
## Call:
## lm(formula = oppPTS ~ X2PA + X3PA + FTA + TOV + ORB + DRB + STL + 
##     BLK, data = Nba.train)
## 
## Coefficients:
## (Intercept)         X2PA         X3PA          FTA          TOV          ORB  
##    -53.3581       1.6314       1.8659       0.8314       1.3548      -1.6728  
##         DRB          STL          BLK  
##     -1.4280      -1.9241      -0.1484
## 
## Call:
## lm(formula = oppPTS ~ X2PA + X3PA + FTA + ORB + DRB + TOV + STL, 
##     data = Nba.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -553.86 -117.29   15.89  119.24  459.78 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -31.56190  222.71769  -0.142    0.887    
## X2PA          1.64047    0.02809  58.403   <2e-16 ***
## X3PA          1.87910    0.04026  46.679   <2e-16 ***
## FTA           0.83009    0.03582  23.175   <2e-16 ***
## ORB          -1.69486    0.07948 -21.324   <2e-16 ***
## DRB          -1.46788    0.06054 -24.246   <2e-16 ***
## TOV           1.34200    0.06552  20.483   <2e-16 ***
## STL          -1.93953    0.09369 -20.701   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 166.3 on 576 degrees of freedom
## Multiple R-squared:  0.9216, Adjusted R-squared:  0.9206 
## F-statistic: 966.7 on 7 and 576 DF,  p-value: < 2.2e-16

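Backward AIC selection retains X2PA, X3PA, FTA, ORB, DRB, TOV and STL. The stepwise search that starts from the intercept-only model additionally retains BLK, although adding BLK lowers the AIC only marginally (from 5981.0 to 5980.6). The final opposition points model summarized above uses the seven variables without BLK.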
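AIC-based selection of this kind can be run with `step()` from base R's stats package. The sketch below compares a backward search with a stepwise search started from the intercept-only model; the data and variable names are synthetic assumptions, so the selected terms say nothing about the NBA data.

```r
set.seed(3)  # synthetic data, for illustration only
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)                        # unrelated to the response
y  <- 1 + 2 * x1 - 3 * x2 + rnorm(n)
d  <- data.frame(y, x1, x2, x3)

full_fit <- lm(y ~ x1 + x2 + x3, data = d)
null_fit <- lm(y ~ 1, data = d)

# Backward elimination starting from the full model
backward <- step(full_fit, direction = "backward", trace = 0)

# Stepwise search from the intercept-only model, allowing additions and removals
both <- step(null_fit,
             scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
             direction = "both", trace = 0)

formula(backward)
formula(both)
```

The two searches can end at different models when a term changes the AIC only slightly, as with BLK above. Passing `k = log(nrow(d))` to `step()` replaces the AIC penalty with the BIC penalty.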

Model Checking

QQ plot for Points model

Model Validation - Points Model

## Analysis of Variance Table
## 
## Response: PTS
##            Df   Sum Sq  Mean Sq F value Pr(>F)    
## X2PA        1 1.41e+08 1.41e+08 4115.97 <2e-16 ***
## X3PA        1 3.67e+07 3.67e+07 1070.41 <2e-16 ***
## FTA         1 4.48e+07 4.48e+07 1304.62 <2e-16 ***
## AST         1 2.34e+07 2.34e+07  681.59 <2e-16 ***
## ORB         1 6.69e+06 6.69e+06  194.83 <2e-16 ***
## STL         1 2.53e+05 2.53e+05    7.38 0.0067 ** 
## Residuals 828 2.84e+07 3.43e+04                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Warning in cv.lm(data = NBA1, PointsReg4, m = 3, seed = 123): 
## 
##  As there is >1 explanatory variable, cross-validation
##  predicted values for a fold are not a linear function
##  of corresponding overall predicted values.  Lines that
##  are shown for the different folds are approximate

## 
## fold 1 
## Observations in test set: 278 
##                5   15   16   19   23   24   25   30   35   38      39   45   46
## Predicted   9130 8891 9185 8488 8414 8821 8494 8789 9116 8996 8992.65 8890 8170
## cvpred      9134 8903 9214 8485 8397 8826 8481 8775 9145 9019 9014.38 8903 8145
## PTS         8878 8949 9114 8820 8604 9008 8937 9006 9276 9156 9019.00 8662 8281
## CV residual -256   46 -100  335  207  182  456  231  131  137    4.62 -241  136
##               48   50    51   53     56      59   61     62   63     64     66
## Predicted   8535 8832 10145 8654 8677.0 8732.28 8919 8644.5 8788 9297.5 8837.8
## cvpred      8513 8823 10136 8645 8688.6 8754.53 8937 8663.1 8787 9303.7 8838.9
## PTS         8743 8575 10371 9092 8785.0 8746.00 9119 8705.0 9006 9272.0 8795.0
## CV residual  230 -248   235  447   96.4   -8.53  182   41.9  219  -31.7  -43.9
##               67   72      74   80      81   89   90   91     92   94     96
## Predicted   8787 7835 10146.9 9009 8749.01 9159 9164 8334 8259.2 8735 8983.3
## cvpred      8784 7823 10165.1 9055 8749.78 9163 9175 8346 8235.2 8727 8978.2
## PTS         9094 7964 10105.0 9433 8740.00 9019 8938 8134 8321.0 8501 9052.0
## CV residual  310  141   -60.1  378   -9.78 -144 -237 -212   85.8 -226   73.8
##                97   98    104    105    110    115    124  126  133  134  137
## Predicted   10378 9631 8616.8 8978.6 9812.1 8670.2 8947.2 8437 9381 9260 8758
## cvpred      10398 9640 8612.7 8990.6 9823.6 8664.8 8931.4 8421 9392 9271 8780
## PTS         10147 9602 8666.0 9019.0 9862.0 8743.0 8879.0 8784 9469 9412 8655
## CV residual  -251  -38   53.3   28.4   38.4   78.2  -52.4  363   77  141 -125
##              142    144  147  148  150  152    155     157    160  162  164
## Predicted   9367 9387.6 8627 9028 9349 8380 9447.4 9099.06 8512.9 9038 8721
## cvpred      9357 9410.9 8633 9014 9357 8378 9456.3 9111.93 8499.5 9055 8708
## PTS         9453 9363.0 8519 8907 9390 8094 9436.0 9120.00 8442.0 9237 8558
## CV residual   96  -47.9 -114 -107   33 -284  -20.3    8.07  -57.5  182 -150
##              165    168  173    174    175    178  184  186  189  190    195
## Predicted   9366 9204.5 9148 8927.3 8572.8 9727.1 8730 8499 9753 8828 9188.6
## cvpred      9362 9207.7 9164 8914.7 8570.2 9731.4 8722 8507 9795 8806 9199.3
## PTS         9567 9188.0 9052 8893.0 8508.0 9667.0 8844 8609 9573 8957 9250.0
## CV residual  205  -19.7 -112  -21.7  -62.2  -64.4  122  102 -222  151   50.7
##                197  200    201    202    204    206  210    212  213  214
## Predicted   8252.7 9028 9595.8 8938.7 9129.0 8621.3 8586 8439.3 9942 8519
## cvpred      8247.3 9048 9604.4 8937.8 9147.7 8608.3 8608 8429.5 9984 8501
## PTS         8235.0 8901 9518.0 8855.0 9135.0 8653.0 8726 8484.0 9675 8740
## CV residual  -12.3 -147  -86.4  -82.8  -12.7   44.7  118   54.5 -309  239
##                216  218    221    222    224  228  232  235  238  242  250  263
## Predicted   8920.5 8972 8854.0 8473.4 9099.7 8810 8511 8671 9713 8675 9388 8586
## cvpred      8927.7 8998 8872.3 8479.5 9105.1 8838 8498 8702 9757 8668 9373 8597
## PTS         8897.0 8712 8932.0 8506.0 9174.0 8652 8901 8977 9397 8962 9090 8343
## CV residual  -30.7 -286   59.7   26.5   68.9 -186  403  275 -360  294 -283 -254
##                264  270  271  272  273  274    276  281    282  284  285  288
## Predicted   8101.6 8811 8600 8487 8546 8310 8434.0 8253 8790.1 8697 8521 9148
## cvpred      8097.8 8825 8607 8492 8577 8317 8443.3 8256 8787.7 8695 8525 9184
## PTS         8195.0 8491 8717 8349 8727 8169 8455.0 7928 8782.0 8527 8313 8980
## CV residual   97.2 -334  110 -143  150 -148   11.7 -328   -5.7 -168 -212 -204
##                 291  292    311    313  318    321  322  324    326  332  335
## Predicted   8020.81 8029 8798.5 8758.3 8381 8949.4 8408 8873 8417.2 8806 9090
## cvpred      8008.89 8029 8802.9 8790.4 8379 8955.5 8430 8909 8398.8 8819 9107
## PTS         8007.00 8176 8877.0 8814.0 8141 9014.0 8531 8783 8495.0 8556 8847
## CV residual   -1.89  147   74.1   23.6 -238   58.5  101 -126   96.2 -263 -260
##              342  343    344     345    346    349  351  352  354  355    357
## Predicted   8843 7952 8355.5 7787.20 8287.5 8342.5 8725 8409 8251 8196 8126.4
## cvpred      8873 7978 8365.5 7797.46 8273.8 8373.9 8760 8436 8272 8185 8147.9
## PTS         8732 8033 8296.0 7801.00 8221.0 8292.0 8447 8233 7949 7930 8076.0
## CV residual -141   55  -69.5    3.54  -52.8  -81.9 -313 -203 -323 -255  -71.9
##              359    360  365  368    374  375  377    379  381    385  386
## Predicted   8205 8878.6 8573 8252 8001.1 8813 7926 8657.1 8085 9059.5 7646
## cvpred      8210 8907.5 8584 8245 8023.8 8828 7921 8694.8 8086 9081.8 7634
## PTS         8033 8876.0 8354 8428 8053.0 8667 8136 8616.0 8146 9091.0 7820
## CV residual -177  -31.5 -230  183   29.2 -161  215  -78.8   60    9.2  186
##                388  391    395  400    402    403  408    412    414    416
## Predicted   8425.4 8857 8576.0 8159 8282.7 8423.2 7654 8578.6 8633.3 8159.3
## cvpred      8435.3 8889 8583.4 8151 8289.1 8446.4 7653 8601.2 8629.3 8152.7
## PTS         8451.0 9055 8495.0 8013 8334.0 8404.0 7837 8571.0 8552.0 8163.0
## CV residual   15.7  166  -88.4 -138   44.9  -42.4  184  -30.2  -77.3   10.3
##              419     433  435  437  442  443    446  449  450    461  464
## Predicted   7857 7838.79 8310 7519 8341 8515 7371.9 8338 7490 8049.2 8805
## cvpred      7880 7822.17 8323 7506 8337 8538 7372.7 8349 7510 8064.4 8808
## PTS         7994 7819.00 8200 7818 8215 8431 7418.0 8454 7313 8099.0 8652
## CV residual  114   -3.17 -123  312 -122 -107   45.3  105 -197   34.6 -156
##                469  474    478    481    482  485    488  489    490  492
## Predicted   7488.5 7958 8270.6 7681.3 8211.0 8307 8504.0 8335 8199.8 7655
## cvpred      7497.7 7953 8254.1 7659.9 8228.4 8317 8494.5 8351 8207.1 7652
## PTS         7509.0 7637 8279.0 7735.0 8146.0 7950 8483.0 7834 8156.0 7546
## CV residual   11.3 -316   24.9   75.1  -82.4 -367  -11.5 -517  -51.1 -106
##                494  502  505  506    508  516  522  523  527    529  530    531
## Predicted   7766.2 7883 8334 8295 7646.0 8332 8393 7472 7144 7823.8 8093 7737.7
## cvpred      7778.3 7891 8358 8311 7636.7 8347 8388 7476 7135 7813.2 8118 7753.9
## PTS         7739.0 7991 8125 7968 7702.0 7918 8251 7289 7275 7763.0 7710 7824.0
## CV residual  -39.3  100 -233 -343   65.3 -429 -137 -187  140  -50.2 -408   70.1
##              532    533  534  537    539  540  543  545    546  552    554  556
## Predicted   8429 7996.1 7918 7860 7752.7 8160 7946 7834 7753.4 7677 7904.4 8133
## cvpred      8458 7979.3 7927 7866 7758.6 8196 7949 7851 7764.6 7685 7924.9 8161
## PTS         8343 7886.0 7978 7522 7711.0 7901 7812 7559 7735.0 7372 7996.0 7889
## CV residual -115  -93.3   51 -344  -47.6 -295 -137 -292  -29.6 -313   71.1 -272
##                559     560     561     567    568  573  580    583    585  587
## Predicted   7487.7 7766.76 7903.45 7609.67 7684.5 7406 8130 8133.6 7777.6 8164
## cvpred      7479.5 7792.34 7927.78 7604.28 7680.9 7414 8148 8145.9 7788.1 8178
## PTS         7461.0 7802.00 7925.00 7609.00 7714.0 6901 7995 8046.0 7699.0 8078
## CV residual  -18.5    9.66   -2.78    4.72   33.1 -513 -153  -99.9  -89.1 -100
##                589  592    597  600  601  602  606  610  611    619    622  623
## Predicted   7866.0 7732 7680.3 7906 8425 8092 7633 7507 8278 7369.7 7876.9 7234
## cvpred      7875.5 7722 7680.7 7901 8457 8101 7645 7504 8274 7378.5 7906.3 7239
## PTS         7834.0 7856 7611.0 7619 8626 7972 7493 7402 8039 7440.0 7964.0 7006
## CV residual  -41.5  134  -69.7 -282  169 -129 -152 -102 -235   61.5   57.7 -233
##              629    635    638  640    641    642  643  644    645  647     649
## Predicted   7960 7845.6 8109.7 8321 8043.6 7911.7 7868 7622 7891.3 8299 7613.26
## cvpred      7961 7847.7 8105.8 8301 8024.6 7912.9 7872 7624 7890.8 8314 7611.09
## PTS         7745 7796.0 8095.0 8327 7973.0 7934.0 7496 7252 7977.0 8128 7621.00
## CV residual -216  -51.7  -10.8   26  -51.6   21.1 -376 -372   86.2 -186    9.91
##              652    653  654  658    660  662    667  668  670    672    674
## Predicted   7781 8219.9 7794 8272 7962.6 8488 7924.6 8159 8114 7578.1 7683.5
## cvpred      7772 8226.8 7772 8291 7945.1 8495 7906.2 8164 8098 7571.1 7663.1
## PTS         8113 8178.0 7625 7943 8002.0 8227 7970.0 8154 8191 7522.0 7611.0
## CV residual  341  -48.8 -147 -348   56.9 -268   63.8  -10   93  -49.1  -52.1
##              675     677  681  682    686    689  691  696    697     700
## Predicted   7768 8167.37 7635 8128 7681.2 8111.3 7776 7954 7898.3 7757.94
## cvpred      7723 8155.76 7634 8135 7663.7 8118.8 7770 7951 7883.8 7756.03
## PTS         7842 8147.00 7837 8411 7680.0 8100.0 8201 7840 7843.0 7759.00
## CV residual  119   -8.76  203  276   16.3  -18.8  431 -111  -40.8    2.97
##                702  703    707  708  711  720  722  732  740  743    753    763
## Predicted   7894.3 8295 7759.0 8705 7929 7929 9288 7684 8156 8060 7643.3 7947.5
## cvpred      7895.1 8298 7743.2 8742 7937 7928 9312 7695 8149 8086 7634.7 7950.6
## PTS         7877.0 8001 7785.0 9037 8079 7903 9074 7839 8408 8215 7727.0 8044.0
## CV residual  -18.1 -297   41.8  295  142  -25 -238  144  259  129   92.3   93.4
##                765    767  770    771  772  776    779    784  786     792
## Predicted   8582.2 8210.7 7697 8208.1 7765 8101 8063.4 8870.1 8470 8042.79
## cvpred      8613.3 8210.1 7696 8206.6 7774 8114 8058.5 8904.1 8485 8044.94
## PTS         8627.0 8286.0 8153 8248.0 7958 8338 7993.0 8922.0 8263 8051.00
## CV residual   13.7   75.9  457   41.4  184  224  -65.5   17.9 -222    6.06
##                795  796  802  805    809    815  819  823  824  825  827  830
## Predicted   8275.2 8066 8168 7742 8009.5 8718.0 7930 7897 7675 8588 8024 7722
## cvpred      8305.5 8062 8173 7730 8017.2 8740.1 7944 7891 7679 8610 8027 7740
## PTS         8373.0 8322 8312 7892 8087.0 8685.0 8195 7722 7784 8734 8135 7896
## CV residual   67.5  260  139  162   69.8  -55.1  251 -169  105  124  108  156
## 
## Sum of squares = 9025347    Mean square = 32465    n = 278 
## 
## fold 2 
## Observations in test set: 279 
##                  1    2      4    7      9     10   17     18   20   21   26
## Predicted   8532.3 9152 9363.6 8335 9082.7 8926.3 8153 9756.0 8793 8047 8578
## cvpred      8548.8 9127 9306.7 8299 9040.6 8884.7 8146 9715.1 8769 8038 8545
## PTS         8573.0 9303 9360.0 8493 9119.0 8860.0 8402 9788.0 8897 8394 8670
## CV residual   24.2  176   53.3  194   78.4  -24.7  256   72.9  128  356  125
##               27   31     33   34   36   41     42     52   57   58   65   71
## Predicted   8637 8721 8844.1 8902 8954 9027 8788.6 9052.8 9064 8669 8560 9256
## cvpred      8658 8704 8823.3 8865 8930 9036 8764.1 9045.4 9019 8663 8560 9258
## PTS         8322 8878 8769.0 9117 8768 9209 8737.0 9112.0 9400 8890 8896 9102
## CV residual -336  174  -54.3  252 -162  173  -27.1   66.6  381  227  336 -156
##                 73   76   82   83      85   88   93     95  100  103  106  109
## Predicted   9301.9 8666 8555 8297 8793.51 8826 8932 8442.5 8970 9141 8583 8963
## cvpred      9299.6 8628 8534 8304 8772.37 8790 8925 8436.7 8946 9119 8585 8959
## PTS         9243.0 8908 8672 8198 8776.00 8903 9194 8386.0 9071 9478 8763 9277
## CV residual  -56.6  280  138 -106    3.63  113  269  -50.7  125  359  178  318
##              112  114  117  118  123    125  131  135    141    146    151
## Predicted   9012 8584 8687 9131 8913 9378.0 8932 8663 8909.7 9442.9 8950.0
## cvpred      9013 8591 8703 9126 8899 9375.8 8949 8665 8898.3 9414.3 8933.2
## PTS         8865 8423 8916 8903 9118 9413.0 9261 8376 8836.0 9379.0 8949.0
## CV residual -148 -168  213 -223  219   37.2  312 -289  -62.3  -35.3   15.8
##                156    158    161  167  172  179  180    181    187  188     191
## Predicted   9018.2 8545.6 8930.9 8974 9443 9190 9012 9259.0 8567.0 8774 8797.98
## cvpred      9015.3 8537.7 8944.9 8982 9430 9197 8984 9252.5 8571.8 8785 8766.95
## PTS         8924.0 8564.0 9024.0 9118 9656 9095 8882 9325.0 8566.0 8960 8771.00
## CV residual  -91.3   26.3   79.1  136  226 -102 -102   72.5   -5.8  175    4.05
##                192  193    198  207  208    211  215  217  226  230  236    239
## Predicted   8904.1 8387 8642.3 8947 8835 8892.4 9478 8583 9448 8754 8590 8458.1
## cvpred      8894.2 8369 8626.3 8948 8835 8882.8 9421 8590 9406 8780 8586 8474.7
## PTS         8935.0 8581 8655.0 9102 8958 8923.0 9558 8767 9395 8588 8411 8556.0
## CV residual   40.8  212   28.7  154  123   40.2  137  177  -11 -192 -175   81.3
##                243    244  246    249    252    253    257  259  262    266
## Predicted   8536.0 9005.5 8733 8799.8 9342.2 9461.8 8713.7 8893 8583 8112.4
## cvpred      8535.1 9003.4 8705 8792.4 9341.6 9450.3 8727.4 8886 8555 8138.5
## PTS         8509.0 9079.0 8691 8879.0 9423.0 9365.0 8760.0 9003 9024 8205.0
## CV residual  -26.1   75.6  -14   86.6   81.4  -85.3   32.6  117  469   66.5
##              268    269    277     279     280  283     286  287  289  290
## Predicted   8590 9069.1 8781.9 9347.39 9434.49 8379 8755.96 8394 8728 8902
## cvpred      8562 9065.5 8782.7 9346.12 9415.88 8354 8717.53 8371 8714 8895
## PTS         8753 9159.0 8684.0 9348.00 9407.00 8744 8711.00 8745 9011 8926
## CV residual  191   93.5  -98.7    1.88   -8.88  390   -6.53  374  297   31
##                293    295    297  300  302    304    306    307    308  309
## Predicted   8055.5 8330.1 8549.0 8473 8421 8421.9 9125.8 9076.9 8650.2 8457
## cvpred      8072.3 8319.4 8526.3 8441 8395 8418.8 9111.4 9059.9 8622.6 8450
## PTS         8113.0 8366.0 8440.0 8609 8641 8330.0 9194.0 9135.0 8549.0 8524
## CV residual   40.7   46.6  -86.3  168  246  -88.8   82.6   75.1  -73.6   74
##              310  312     315    319  320    325    327     331  333  334
## Predicted   8390 8541 9041.70 8535.2 8280 8584.1 8424.7 8619.46 9093 9121
## cvpred      8379 8504 9031.45 8531.9 8272 8569.8 8393.1 8644.08 9082 9103
## PTS         8737 8395 9030.00 8626.0 8252 8546.0 8392.0 8652.00 9298 8898
## CV residual  358 -109   -1.45   94.1  -20  -23.8   -1.1    7.92  216 -205
##                336  337    338  340     341    358  361  362    366    367
## Predicted   8626.4 8435 8787.6 8278 8291.00 8629.9 9040 8650 8154.7 7847.5
## cvpred      8633.4 8404 8789.5 8248 8271.31 8632.6 9020 8644 8150.3 7835.6
## PTS         8652.0 8884 8709.0 8318 8267.00 8666.0 8795 8291 8229.0 7921.0
## CV residual   18.6  480  -80.5   70   -4.31   33.4 -225 -353   78.7   85.4
##                371  378  380  383  384     387  390  393  394  399  401    404
## Predicted   7324.0 8166 8077 8298 8173 9083.57 8604 8152 7692 8633 7638 8010.7
## cvpred      7348.2 8144 8078 8322 8186 9067.92 8616 8138 7685 8605 7676 8054.3
## PTS         7417.0 7927 8293 8042 8054 9073.00 8742 8242 8059 8409 7822 8144.0
## CV residual   68.8 -217  215 -280 -132    5.08  126  104  374 -196  146   89.7
##              405  406     407  410  411    415  421  423  426  427    428  429
## Predicted   7831 8336 7876.20 7850 7822 8207.6 7697 7432 8170 7011 7513.3 8211
## cvpred      7822 8322 7900.58 7877 7821 8226.7 7693 7438 8152 7031 7519.6 8215
## PTS         8153 8438 7909.00 7684 7971 8145.0 7362 7774 8458 7173 7431.0 8020
## CV residual  331  116    8.42 -193  150  -81.7 -331  336  306  142  -88.6 -195
##              431    436    440  444    448  451     454    455  458     459
## Predicted   8046 7670.9 7699.4 7960 7809.8 7994 7902.29 7952.8 7530 7695.15
## cvpred      8061 7682.8 7722.6 7978 7805.7 7985 7919.01 7953.7 7549 7719.17
## PTS         8171 7776.0 7819.0 8114 7829.0 8147 7923.00 7931.0 7300 7721.00
## CV residual  110   93.2   96.4  136   23.3  162    3.99  -22.7 -249    1.83
##              460    463    468  470  471  473  475  479  484  486  487    491
## Predicted   7627 7833.6 8112.1 7567 7803 7805 7654 8030 7351 8203 8447 8237.1
## cvpred      7633 7832.9 8116.6 7593 7828 7836 7696 8046 7380 8170 8445 8230.8
## PTS         7237 7865.0 8170.0 7387 7651 7734 7587 7923 6952 8316 8115 8306.0
## CV residual -396   32.1   53.4 -206 -177 -102 -109 -123 -428  146 -330   75.2
##              500  501  507     510  511  512  513  514  515  517  520  526  528
## Predicted   8052 8234 8039 7457.86 8079 7963 7615 7655 8120 8185 7784 7897 8223
## cvpred      8049 8213 8056 7453.62 8069 7974 7617 7666 8103 8188 7797 7900 8210
## PTS         7771 8111 7914 7459.00 7759 7539 7181 7561 8239 7837 7591 7552 7992
## CV residual -278 -102 -142    5.38 -310 -435 -436 -105  136 -351 -206 -348 -218
##              536  542  547  550  553  555  564  566  569  572  574  578    586
## Predicted   8137 7754 8238 7860 7270 8029 7765 8113 7939 8284 7621 7812 7855.0
## cvpred      8160 7758 8250 7876 7278 8022 7741 8138 7919 8245 7639 7832 7828.9
## PTS         7959 7335 8009 7846 7150 8145 8014 7871 7599 8444 7492 7693 7860.0
## CV residual -201 -423 -241  -30 -128  123  273 -267 -320  199 -147 -139   31.1
##                588  595  596  599  603  605  607    612  613  616  617    627
## Predicted   8021.2 7970 7601 7828 7525 7236 7913 7731.0 7785 7912 7418 8272.8
## cvpred      8014.2 8003 7616 7815 7545 7248 7940 7726.8 7778 7913 7435 8279.8
## PTS         7941.0 7762 7502 7355 7388 7362 7771 7753.0 7401 7711 7215 8304.0
## CV residual  -73.2 -241 -114 -460 -157  114 -169   26.2 -377 -202 -220   24.2
##              628  630  631  632    636  637     639  648  651  655  656    657
## Predicted   8109 8049 8190 8357 7620.4 7964 7666.62 8773 7913 8349 7802 7978.9
## cvpred      8101 8058 8189 8356 7642.9 7993 7670.83 8750 7926 8353 7825 7994.9
## PTS         7729 7914 8405 8161 7626.0 7849 7661.00 9054 7888 8241 7972 8033.0
## CV residual -372 -144  216 -195  -16.9 -144   -9.83  304  -38 -112  147   38.1
##              659  664    666    669    671  676  678  683  684    685    694
## Predicted   8193 8382 7768.7 7552.2 8074.6 7525 8676 8108 7883 8276.1 8834.8
## cvpred      8189 8375 7787.2 7571.9 8079.1 7566 8615 8108 7938 8288.9 8792.9
## PTS         8020 8076 7699.0 7558.0 8020.0 7784 8886 8287 7573 8336.0 8737.0
## CV residual -169 -299  -88.2  -13.9  -59.1  218  271  179 -365   47.1  -55.9
##              698    699  704    705  712  714    715  718  719  721  724  733
## Predicted   8497 8279.0 7692 7880.0 7863 8436 8580.3 8100 8191 8017 8977 8283
## cvpred      8493 8298.9 7699 7917.1 7854 8467 8579.5 8100 8181 8030 8918 8303
## PTS         8474 8331.0 7833 7994.0 8130 8322 8556.0 7959 7981 8234 9105 7857
## CV residual  -19   32.1  134   76.9  276 -145  -23.5 -141 -200  204  187 -446
##              734  736    737  742  744    748  757  758  759  760  762    773
## Predicted   7917 8427 7854.2 8150 8567 7646.9 7942 8531 7481 7949 8042 8089.5
## cvpred      7890 8435 7856.1 8134 8566 7660.2 7929 8504 7495 7927 8040 8079.4
## PTS         8270 8567 7923.0 7999 8710 7677.0 7799 8768 7698 8061 8019 8121.0
## CV residual  380  132   66.9 -135  144   16.8 -130  264  203  134  -21   41.6
##                777    778  780  781    787  788  790    791  793  794  797  799
## Predicted   8109.6 7836.6 8220 8256 7912.1 8222 7691 8081.8 7749 8027 8163 8579
## cvpred      8103.8 7850.1 8228 8232 7911.6 8205 7688 8048.2 7753 7992 8173 8578
## PTS         8136.0 7813.0 8373 8364 7849.0 8339 7914 8009.0 7575 8220 8426 9039
## CV residual   32.2  -37.1  145  132  -62.6  134  226  -39.2 -178  228  253  461
##              800  803    806  807  810  814    816  817  818  820  826     828
## Predicted   7823 8340 7729.7 7814 8188 8211 8164.6 8243 8125 8055 8384 8131.30
## cvpred      7837 8345 7727.6 7806 8187 8160 8152.4 8249 8116 8076 8388 8109.08
## PTS         8045 8534 7790.0 7913 7827 8477 8183.0 8089 8321 8369 8596 8119.00
## CV residual  208  189   62.4  107 -360  317   30.6 -160  205  293  208    9.92
##                829    831    833  834      835
## Predicted   8557.9 8119.8 8055.5 8179 7993.237
## cvpred      8534.8 8105.9 8050.9 8176 7976.881
## PTS         8611.0 8151.0 8124.0 8153 7977.000
## CV residual   76.2   45.1   73.1  -23    0.119
## 
## Sum of squares = 10172554    Mean square = 36461    n = 279 
## 
## fold 3 
## Observations in test set: 278 
##                3    6    8   11     12   13     14   22    28     29     32
## Predicted   8906 8775 8918 8995 8935.4 9091 9237.0 9050 10146 8193.1 8853.2
## cvpred      8905 8801 8921 9001 8944.7 9109 9254.4 9066 10187 8202.6 8855.5
## PTS         8813 8933 9084 9438 9025.0 8879 9344.0 8773  9986 8174.0 8827.0
## CV residual  -92  132  163  437   80.3 -230   89.6 -293  -201  -28.6  -28.5
##                 37     40   43   44   47     49   54     55   60   68     69
## Predicted   8779.7 8989.9 8622 8163 8999 8513.8 8491 8415.7 8542 8515 8380.4
## cvpred      8797.6 9000.8 8635 8172 9007 8534.5 8495 8429.6 8547 8531 8391.4
## PTS         8849.0 9080.0 8531 8301 9180 8463.0 8680 8379.0 8707 8485 8335.0
## CV residual   51.4   79.2 -104  129  173  -71.5  185  -50.6  160  -46  -56.4
##                 70   75   77   78     79   84      86   87     99    101  102
## Predicted   9129.3 9368 8420 8755 9243.7 8859 8796.48 9180 8922.8 8650.1 9117
## cvpred      9141.2 9385 8439 8772 9257.7 8863 8804.88 9183 8944.5 8660.6 9123
## PTS         9191.0 9239 8145 8911 9328.0 9191 8808.00 9375 9008.0 8566.0 9023
## CV residual   49.8 -146 -294  139   70.3  328    3.12  192   63.5  -94.6 -100
##              107  108  111    113  116  119    120     121    122  127  128
## Predicted   8688 8869 8777 9450.4 9126 9044 9802.8 9481.85 8929.3 9260 8982
## cvpred      8685 8877 8787 9460.9 9133 9055 9829.9 9500.65 8957.3 9255 8984
## PTS         8838 9101 9077 9428.0 9412 9116 9841.0 9508.00 9052.0 9696 9090
## CV residual  153  224  290  -32.9  279   61   11.1    7.35   94.7  441  106
##              129    130  132  136    138  139  140  143    145  149  153  154
## Predicted   8901 8634.1 9017 9136 8848.0 9215 8840 9542 9196.0 9326 8900 9211
## cvpred      8912 8645.1 9020 9151 8849.4 9220 8846 9575 9217.8 9327 8899 9214
## PTS         8975 8627.0 8858 8937 8906.0 9359 8962 9410 9299.0 9618 9051 9023
## CV residual   63  -18.1 -162 -214   56.6  139  116 -165   81.2  291  152 -191
##              159    163  166  169  170  171  176  177  182  183  185  194  196
## Predicted   9229 8621.9 9924 8990 8841 8960 8529 9237 9287 8748 9096 8541 8865
## cvpred      9237 8618.6 9949 8992 8851 8975 8526 9238 9300 8772 9085 8554 8865
## PTS         8871 8596.0 9569 8765 8698 8566 8729 9111 8844 8690 9315 8103 8697
## CV residual -366  -22.6 -380 -227 -153 -409  203 -127 -456  -82  230 -451 -168
##                199  203  205  209  219  220  223    225  227  229  231    233
## Predicted   8620.3 9476 9139 8982 9241 8172 9240 9722.4 8738 8967 9045 8981.5
## cvpred      8615.4 9487 9134 8987 9240 8169 9249 9741.9 8752 8969 9066 8975.2
## PTS         8667.0 9314 8899 8566 9406 8016 9567 9727.0 8651 9196 8879 9023.0
## CV residual   51.6 -173 -235 -421  166 -153  318  -14.9 -101  227 -187   47.8
##              234  237  240    241  245  247  248  251  254     255  256  258
## Predicted   8597 8209 9619 8824.7 8394 7956 8322 8774 8506 8717.46 8377 9003
## cvpred      8611 8218 9648 8824.3 8406 7955 8351 8783 8515 8720.65 8380 9018
## PTS         8232 8384 9534 8752.0 8247 7803 8208 9039 8341 8718.00 8769 8832
## CV residual -379  166 -114  -72.3 -159 -152 -143  256 -174   -2.65  389 -186
##              260  261   265  267    275    278  294    296    298  299  301
## Predicted   8577 8562 10172 9441 8510.0 8551.4 9406 9190.1 8264.3 8420 8379
## cvpred      8583 8571 10228 9463 8523.6 8566.4 9425 9178.2 8272.6 8429 8390
## PTS         9145 8428  9828 9564 8441.0 8641.0 9732 9197.0 8229.0 8608 8237
## CV residual  562 -143  -400  101  -82.6   74.6  307   18.8  -43.6  179 -153
##                303  305  314  316    317    323  328  329  330  339     347
## Predicted   8303.2 8145 8236 8343 8795.0 8797.3 8197 8308 8569 8607 7933.20
## cvpred      8290.2 8149 8244 8334 8785.8 8786.2 8189 8308 8559 8611 7942.05
## PTS         8328.0 8358 8502 8625 8832.0 8836.0 8046 8431 8328 8353 7949.00
## CV residual   37.8  209  258  291   46.2   49.8 -143  123 -231 -258    6.95
##                348  350  353  356  363  364    369  370    372  373  376  382
## Predicted   8876.1 8176 8236 8666 7972 8569 8232.9 8097 8553.3 8077 8375 7961
## cvpred      8869.9 8155 8227 8667 7957 8561 8214.8 8081 8540.8 8061 8358 7949
## PTS         8844.0 8280 8475 8461 8202 8687 8249.0 8325 8462.0 8309 8491 7726
## CV residual  -25.9  125  248 -206  245  126   34.2  244  -78.8  248  133 -223
##                389  392  396  397  398    409  413     417  418  420  422  424
## Predicted   8024.2 8582 8244 8186 7236 8123.5 7875 8499.25 8423 8308 8099 8400
## cvpred      8008.3 8559 8231 8164 7204 8113.9 7864 8485.82 8401 8281 8095 8403
## PTS         8056.0 8726 8431 8625 7473 8024.0 7746 8477.00 8572 8404 8408 8248
## CV residual   47.7  167  200  461  269  -89.9 -118   -8.82  171  123  313 -155
##              425  430  432    434  438    439    441    445  447  452    453
## Predicted   7945 7352 8161 7881.0 8019 8014.2 7685.8 7868.0 8170 7640 7964.7
## cvpred      7919 7337 8137 7876.7 8005 8003.5 7668.8 7855.5 8147 7628 7958.7
## PTS         8108 7723 8248 7969.0 7882 7974.0 7719.0 7908.0 8274 7860 7864.0
## CV residual  189  386  111   92.3 -123  -29.5   50.2   52.5  127  232  -94.7
##                456    457  462    465  466  467  472  476  477  480  483  493
## Predicted   7696.2 7517.1 7819 7711.4 7576 8545 8229 8150 7906 8199 8591 8527
## cvpred      7675.1 7524.3 7801 7688.5 7569 8537 8212 8130 7899 8192 8584 8521
## PTS         7585.0 7494.0 7874 7787.0 7748 8290 8166 8246 7781 7969 8072 8267
## CV residual  -90.1  -30.3   73   98.5  179 -247  -46  116 -118 -223 -512 -254
##              495  496    497  498    499  503  504    509  518  519    521  524
## Predicted   8112 8205 8122.6 7416 8174.8 8789 7946 7958.2 8107 7857 7677.7 8084
## cvpred      8111 8194 8128.1 7420 8184.3 8798 7937 7954.2 8106 7856 7676.3 8081
## PTS         8300 8079 8036.0 7555 8206.0 8607 7886 7921.0 7584 7972 7581.0 8260
## CV residual  189 -115  -92.1  135   21.7 -191  -51  -33.2 -522  116  -95.3  179
##                525  535  538  541  544    548  549  551  557    558  562    563
## Predicted   8060.4 8339 7793 7854 8441 7587.9 8190 8415 7674 8270.2 8592 7888.0
## cvpred      8052.6 8331 7789 7845 8448 7588.9 8184 8411 7670 8273.7 8596 7883.7
## PTS         7982.0 8007 7645 7700 8629 7572.0 7938 8304 7514 8240.0 8578 7932.0
## CV residual  -70.6 -324 -144 -145  181  -16.9 -246 -107 -156  -33.7  -18   48.3
##              565  570  571    575  576  577  579  581    582  584    590    591
## Predicted   7553 7996 7806 8351.2 7544 8276 8372 7235 8090.3 8136 7755.6 8422.4
## cvpred      7542 7996 7803 8346.5 7540 8266 8365 7237 8088.3 8128 7740.6 8426.1
## PTS         7494 7786 7495 8400.0 7688 7940 8230 7016 8158.0 7820 7803.0 8342.0
## CV residual  -48 -210 -308   53.5  148 -326 -135 -221   69.7 -308   62.4  -84.1
##              593  594    598  604  608  609  614    615    618    620  621  624
## Predicted   7675 7620 7912.7 7763 8478 8167 7759 7449.4 7684.5 8537.7 7646 7376
## cvpred      7668 7621 7908.3 7754 8470 8154 7744 7443.1 7692.6 8524.4 7635 7353
## PTS         7555 7453 7811.0 7649 8052 7930 7529 7542.0 7723.0 8433.0 7501 7271
## CV residual -113 -168  -97.3 -105 -418 -224 -215   98.9   30.4  -91.4 -134  -82
##              625    626  633  634    646  650  661    663  665  673    679
## Predicted   7739 7657.1 7798 8379 8068.1 8496 7697 8044.5 7522 8135 7311.9
## cvpred      7735 7655.6 7781 8380 8075.7 8493 7694 8025.6 7515 8120 7309.8
## PTS         7528 7605.0 7653 8094 8160.0 8505 8130 7941.0 7387 7691 7285.0
## CV residual -207  -50.6 -128 -286   84.3   12  436  -84.6 -128 -429  -24.8
##                680    687  688  690  692    693    695  701  706  709  710
## Predicted   8209.2 7905.4 8226 8039 8771 7822.4 7916.7 8032 7604 7534 8502
## cvpred      8204.3 7903.7 8223 8034 8769 7812.3 7909.5 8030 7595 7533 8510
## PTS         8106.0 7857.0 7945 7934 8639 7872.0 7953.0 8172 7771 7717 8303
## CV residual  -98.3  -46.7 -278 -100 -130   59.7   43.5  142  176  184 -207
##                 713  716  717  723    725  726  727    728    729  730    731
## Predicted   8153.71 8082 8031 7903 7913.2 8697 8063 8831.9 8259.9 7675 7986.7
## cvpred      8154.96 8074 8017 7891 7904.7 8702 8066 8828.5 8272.1 7675 7981.4
## PTS         8157.00 8054 8245 7992 7931.0 8527 7692 8904.0 8257.0 7491 7956.0
## CV residual    2.04  -20  228  101   26.3 -175 -374   75.5  -15.1 -184  -25.4
##              735  738    739  741  745  746  747  749  750  751    752    754
## Predicted   7987 8832 7792.5 7728 7905 7943 7925 8211 7864 8088 8509.9 8847.3
## cvpred      7993 8828 7786.7 7724 7904 7939 7910 8218 7858 8089 8504.7 8867.1
## PTS         7948 9026 7820.0 7820 8104 8046 8275 8378 8223 8343 8555.0 8905.0
## CV residual  -45  198   33.3   96  200  107  365  160  365  254   50.3   37.9
##              755  756    761  764    766  768  769    774    775    782    783
## Predicted   7847 8463 8218.9 7572 7929.2 7865 8539 8542.2 7804.2 8650.8 7669.4
## cvpred      7844 8476 8217.7 7569 7934.8 7860 8536 8533.3 7807.4 8658.3 7666.6
## PTS         8070 8617 8142.0 7857 7952.0 7987 8974 8492.0 7877.0 8729.0 7709.0
## CV residual  226  141  -75.7  288   17.2  127  438  -41.3   69.6   70.7   42.4
##                785  789  798    801  804  808  811  812  813  821    822  832
## Predicted   8437.7 8078 7817 8146.9 8598 7762 8083 8732 7812 7667 8184.1 8235
## cvpred      8440.2 8092 7818 8156.2 8582 7756 8069 8736 7811 7673 8191.5 8232
## PTS         8395.0 8404 8014 8200.0 8547 7650 8220 8811 7951 7534 8288.0 8502
## CV residual  -45.2  312  196   43.8  -35 -106  151   75  140 -139   96.5  270
## 
## Sum of squares = 10324557    Mean square = 37139    n = 278 
## 
## Overall (Sum over all 278 folds) 
##    ms 
## 35356
## [1] 35356

Mean squared prediction error (MSPE) for our Points model, from the 3-fold cross-validation above.
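The overall mean square can be recomputed from the per-fold sums of squares printed in the cross-validation output above. The short check below copies those three sums and fold sizes by hand; the variable names are illustrative only and are not taken from the original analysis code.

```r
# Per-fold sums of squared CV residuals and fold sizes, copied from the output above
fold_ss <- c(9025347, 10172554, 10324557)
fold_n  <- c(278, 279, 278)

fold_ms    <- fold_ss / fold_n          # mean square within each fold
total_ss   <- sum(fold_ss)              # pooled sum of squared CV residuals
overall_ms <- total_ss / sum(fold_n)    # pooled mean square over all 835 rows

print(round(fold_ms))     # 32465 36461 37139
print(total_ss)           # 29522458
print(round(overall_ms))  # 35356
```

The pooled sum of squares equals the figure reported just below as the PRESS value, so that figure appears to be the 3-fold sum of squared CV residuals rather than a leave-one-out PRESS.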

## [1] 29522458

PRESS (prediction sum of squares) value for our model.

## [1] 0.895

Predicted R^2 value for our model.

## [1] 0.895

Adjusted R^2 value for our model.
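The report does not show how the PRESS and R^2 figures were computed. As a minimal sketch of one common definition (leave-one-out PRESS obtained from the hat values, and predicted R^2 = 1 - PRESS/SST), the code below uses simulated data; `x1`, `x2` and `y` are hypothetical and are not columns of the NBA dataset, and the computation used in this report may differ.

```r
set.seed(1)

# Simulated stand-in data (assumption: not the NBA dataset)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 5 + 2 * x1 - x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)

# Leave-one-out PRESS: each residual is inflated by 1 / (1 - leverage)
press <- sum((residuals(fit) / (1 - hatvalues(fit)))^2)

# Total sum of squares of the response around its mean
sst <- sum((y - mean(y))^2)

pred_r2 <- 1 - press / sst               # predicted R^2
adj_r2  <- summary(fit)$adj.r.squared    # adjusted R^2 as reported by lm

print(c(PRESS = press, pred_R2 = pred_r2, adj_R2 = adj_r2))
```

Because PRESS is never smaller than the residual sum of squares, the predicted R^2 is never larger than the ordinary R^2. A predicted R^2 close to the adjusted R^2, as in the two values above, suggests the model is not badly overfitted.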

We held out a test set containing 30% of the total data to test our model.

##     fit  lwr  upr
## 1  8532 8484 8581
## 13 9080 9041 9120
## 14 9225 9180 9270
## 17 8153 8111 8196
## 18 9741 9695 9787
## 19 8487 8438 8536
## [1] 31261

MSPE value on the test set.

## [1] 7846412

PRESS value on the test set.

## [1] 0.908

Predicted R^2 value on the test set.
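The hold-out evaluation above can be sketched as follows. This is a hedged illustration on simulated data: the data frame `dat`, its coefficients, and the two-predictor formula are assumptions made for the example, and `interval = "confidence"` is a guess at how the `fit`/`lwr`/`upr` table was produced.

```r
set.seed(123)

# Simulated stand-in for the team-season data (assumption: not the real NBA file)
n   <- 500
dat <- data.frame(X2PA = rnorm(n, 6000, 400), FTA = rnorm(n, 2000, 200))
dat$PTS <- 1000 + 0.9 * dat$X2PA + 0.8 * dat$FTA + rnorm(n, sd = 150)

# Hold out 30% of the rows as a test set
test_idx <- sample(seq_len(n), size = round(0.3 * n))
train    <- dat[-test_idx, ]
test     <- dat[test_idx, ]

# Fit on the training rows only, then predict the held-out rows
fit  <- lm(PTS ~ X2PA + FTA, data = train)
pred <- predict(fit, newdata = test, interval = "confidence")
print(head(pred))   # columns: fit, lwr, upr

# Test-set error summaries
err      <- test$PTS - pred[, "fit"]
sse_test <- sum(err^2)                    # sum of squared prediction errors
mspe     <- mean(err^2)                   # mean squared prediction error
r2_test  <- 1 - sse_test / sum((test$PTS - mean(test$PTS))^2)

print(c(MSPE = mspe, SSE = sse_test, R2_test = r2_test))
```

Fitting on the training rows only keeps the test rows unseen, so the MSPE and R^2 measure out-of-sample accuracy rather than goodness of fit.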

QQ plot for the Opposition Points model

Model Validation for the Opposition Points Model
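The fold-by-fold listing in this section comes from the `cv.lm(..., m = 3, seed = 123)` call named in the warning below, which is presumably `cv.lm` from the DAAG package. As a rough sketch of what such a 3-fold cross-validation does, the base R loop below reproduces the idea on simulated data; the variables `x1`, `x2`, `y` and the fold assignment are assumptions for illustration and will not reproduce the exact folds shown here.

```r
set.seed(123)

# Simulated stand-in data (assumption: not the NBA dataset)
n   <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 3 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)

k    <- 3
fold <- sample(rep(seq_len(k), length.out = n))  # random folds of equal size

cvpred <- numeric(n)
for (j in seq_len(k)) {
  held_out <- fold == j
  # Refit the model without fold j, then predict the rows in fold j
  fit_j <- lm(y ~ x1 + x2, data = dat[!held_out, ])
  cvpred[held_out] <- predict(fit_j, newdata = dat[held_out, ])
}

cv_resid   <- dat$y - cvpred                 # observed minus cross-validated prediction
fold_ss    <- tapply(cv_resid^2, fold, sum)  # "Sum of squares" per fold
overall_ms <- sum(cv_resid^2) / n            # overall mean square

print(fold_ss)
print(overall_ms)
```

In the listing, `Predicted` is the fit from the model trained on all rows and `cvpred` is the fit from the model trained without that row's fold, so `CV residual` is the observed value minus `cvpred`.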

## 
## Call:
## lm(formula = oppPTS ~ X2PA + X3PA + FTA + AST + ORB + DRB + TOV + 
##     STL + BLK, data = NBA)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##   -574   -112      9    116    467 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 114.0949   178.2704    0.64     0.52    
## X2PA          1.6263     0.0259   62.79   <2e-16 ***
## X3PA          1.8413     0.0337   54.69   <2e-16 ***
## FTA           0.8098     0.0296   27.40   <2e-16 ***
## AST          -0.0383     0.0385   -0.99     0.32    
## ORB          -1.6897     0.0683  -24.75   <2e-16 ***
## DRB          -1.4553     0.0539  -26.98   <2e-16 ***
## TOV           1.3430     0.0536   25.06   <2e-16 ***
## STL          -1.7971     0.0804  -22.34   <2e-16 ***
## BLK          -0.0988     0.0769   -1.28     0.20    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 163 on 825 degrees of freedom
## Multiple R-squared:  0.924,  Adjusted R-squared:  0.923 
## F-statistic: 1.12e+03 on 9 and 825 DF,  p-value: <2e-16
## Analysis of Variance Table
## 
## Response: oppPTS
##            Df   Sum Sq  Mean Sq F value  Pr(>F)    
## X2PA        1 1.65e+08 1.65e+08 6238.03 < 2e-16 ***
## X3PA        1 3.56e+07 3.56e+07 1348.41 < 2e-16 ***
## FTA         1 1.48e+07 1.48e+07  561.10 < 2e-16 ***
## AST         1 2.98e+05 2.98e+05   11.27 0.00082 ***
## ORB         1 1.05e+07 1.05e+07  397.11 < 2e-16 ***
## DRB         1 1.53e+07 1.53e+07  578.66 < 2e-16 ***
## TOV         1 1.13e+07 1.13e+07  428.32 < 2e-16 ***
## STL         1 1.35e+07 1.35e+07  509.36 < 2e-16 ***
## BLK         1 4.36e+04 4.36e+04    1.65 0.19941    
## Residuals 825 2.18e+07 2.64e+04                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Warning in cv.lm(data = NBA, PointsRegopp, m = 3, seed = 123): 
## 
##  As there is >1 explanatory variable, cross-validation
##  predicted values for a fold are not a linear function
##  of corresponding overall predicted values.  Lines that
##  are shown for the different folds are approximate

## 
## fold 1 
## Observations in test set: 278 
##                5   15   16     19     23   24   25     30   35     38   39   45
## Predicted   9084 8628 8985 9279.4 8821.7 8898 8590 9098.5 8987 8543.4 8950 9067
## cvpred      9072 8639 8983 9257.6 8807.7 8898 8580 9059.3 8989 8558.1 8940 9072
## oppPTS      9240 8603 8819 9160.0 8858.0 8526 8775 9103.0 8680 8512.0 8567 8661
## CV residual  168  -36 -164  -97.6   50.3 -372  195   43.7 -309  -46.1 -373 -411
##                 46   48   50    51   53   56   59     61   62     63     64
## Predicted   8323.9 8616 9167 10169 8903 9158 8850 8561.0 8521 8891.5 9147.8
## cvpred      8327.5 8602 9146 10159 8895 9160 8849 8583.4 8525 8889.6 9157.1
## oppPTS      8237.0 8909 8938 10328 9007 9039 8690 8649.0 8422 8957.0 9083.0
## CV residual  -90.5  307 -208   169  112 -121 -159   65.6 -103   67.4  -74.1
##               66   67     72      74   80   81   89     90   91   92     94
## Predicted   8614 9278 8488.0 10075.6 8835 8778 8996 9267.6 8617 8573 8993.1
## cvpred      8629 9281 8476.8 10074.6 8838 8789 8995 9279.4 8615 8580 8980.9
## oppPTS      8456 9558 8574.0 10054.0 8978 8379 8756 9282.0 8145 8427 8926.0
## CV residual -173  277   97.2   -20.6  140 -410 -239    2.6 -470 -153  -54.9
##               96      97   98  104  105  110    115  124    126  133  134  137
## Predicted   9276 10256.2 9497 8634 9118 9652 8943.2 9199 9061.3 9312 9031 8951
## cvpred      9270 10256.1 9487 8633 9110 9651 8943.5 9188 9053.4 9310 9028 8968
## oppPTS      9017 10237.0 9308 8325 8929 9884 8862.0 9388 9152.0 9190 9337 8677
## CV residual -253   -19.1 -179 -308 -181  233  -81.5  200   98.6 -120  309 -291
##                142    144  147    148  150  152     155  157  160    162  164
## Predicted   9313.4 9236.9 8545 9394.4 9005 8978 9353.23 9053 8882 8700.6 9030
## cvpred      9310.9 9234.2 8546 9386.1 9005 8968 9340.66 9046 8911 8716.8 9018
## oppPTS      9363.0 9267.0 8792 9475.0 8649 8554 9349.00 9272 8590 8692.0 8871
## CV residual   52.1   32.8  246   88.9 -356 -414    8.34  226 -321  -24.8 -147
##                165  168  173    174     175    178     184  186  189  190
## Predicted   9018.8 9277 8930 9260.3 9038.79 9410.9 8536.21 8105 9486 8716
## cvpred      9016.2 9266 8934 9236.4 9027.55 9390.3 8540.38 8116 9503 8708
## oppPTS      9050.0 9380 8731 9307.0 9022.00 9410.0 8549.00 8330 9239 8533
## CV residual   33.8  114 -203   70.6   -5.55   19.7    8.62  214 -264 -175
##                195  197  200     201  202    204  206  210    212  213  214
## Predicted   8730.4 8729 9065 9163.26 9143 9076.0 8863 8338 8579.8 9734 8397
## cvpred      8728.7 8720 9057 9150.36 9132 9060.4 8863 8344 8584.4 9749 8388
## oppPTS      8771.0 8900 9268 9147.00 9327 8966.0 8716 8608 8583.0 9536 8264
## CV residual   42.3  180  211   -3.36  195  -94.4 -147  264   -1.4 -213 -124
##                216  218  221  222  224    228  232  235    238    242  250
## Predicted   8775.1 9255 8766 8709 8908 9233.3 8656 8479 9359.8 8875.3 9528
## cvpred      8771.5 9258 8759 8707 8900 9215.5 8638 8482 9375.2 8865.2 9503
## oppPTS      8819.0 9525 8636 9027 9051 9249.0 8817 8710 9281.0 8949.0 9821
## CV residual   47.5  267 -123  320  151   33.5  179  228  -94.2   83.8  318
##                263  264    270  271    272    273    274  276    281    282
## Predicted   8467.6 8399 8697.8 8316 8873.6 8476.1 8516.5 8324 8515.2 8354.2
## cvpred      8470.6 8409 8698.1 8313 8855.1 8467.8 8519.7 8320 8519.2 8366.2
## oppPTS      8545.0 8570 8774.0 8164 8840.0 8524.0 8491.0 8474 8484.0 8412.0
## CV residual   74.4  161   75.9 -149  -15.1   56.2  -28.7  154  -35.2   45.8
##                284  285  288  291  292  311  313  318    321  322  324     326
## Predicted   8231.5 8820 8960 8445 8625 8125 8622 9189 9081.5 8035 8609 8610.55
## cvpred      8242.9 8821 8963 8437 8623 8132 8593 9164 9057.3 8044 8617 8592.24
## oppPTS      8254.0 8721 9300 8634 8821 8353 8885 9387 9095.0 8184 8754 8589.00
## CV residual   11.1 -100  337  197  198  221  292  223   37.7  140  137   -3.24
##              332    335  342    343  344    345  346    349  351    352  354
## Predicted   8859 9175.6 8538 7742.0 7933 8474.8 8198 7946.4 8670 8577.5 8383
## cvpred      8865 9160.3 8542 7736.9 7933 8451.3 8215 7956.1 8663 8573.5 8366
## oppPTS      9029 9107.0 8750 7780.0 7966 8514.0 8099 7938.0 8916 8585.0 8480
## CV residual  164  -53.3  208   43.1   33   62.7 -116  -18.1  253   11.5  114
##                355  357  359  360    365    368  374  375  377    379    381
## Predicted   8431.9 7634 8447 8366 7974.3 8516.7 8337 8993 7742 8553.2 8522.9
## cvpred      8421.4 7623 8451 8352 7975.2 8494.7 8325 8953 7720 8547.6 8491.3
## oppPTS      8498.0 7503 8658 8479 8008.0 8582.0 8651 9111 7833 8634.0 8504.0
## CV residual   76.6 -120  207  127   32.8   87.3  326  158  113   86.4   12.7
##                385    386  388  391    395    400  402    403    408  412  414
## Predicted   8515.1 8167.2 8265 8200 8725.9 8174.2 8272 8257.8 8248.1 8194 8227
## cvpred      8492.6 8162.8 8245 8187 8711.7 8173.7 8249 8250.6 8227.6 8181 8218
## oppPTS      8512.0 8236.0 8138 8384 8774.0 8235.0 8453 8261.0 8272.0 8115 8525
## CV residual   19.4   73.2 -107  197   62.3   61.3  204   10.4   44.4  -66  307
##              416  419    433    435  437  442    443  446  449  450    461
## Predicted   8254 8259 7842.3 7950.8 7853 8560 8321.1 7767 7587 7871 8106.7
## cvpred      8225 8239 7820.9 7947.6 7844 8522 8310.9 7748 7587 7857 8080.5
## oppPTS      8385 8610 7739.0 7850.0 7973 8751 8377.0 8064 7733 8152 8161.0
## CV residual  160  371  -81.9  -97.6  129  229   66.1  316  146  295   80.5
##                464  469    474  478  481  482  485  488  489    490    492
## Predicted   8085.6 7500 8177.4 7623 7989 8108 8227 8513 8278 8272.4 8416.4
## cvpred      8075.1 7491 8167.4 7615 7990 8097 8220 8501 8259 8256.4 8420.6
## oppPTS      8017.0 7307 8095.0 7743 8176 8208 8237 8365 8512 8227.0 8491.0
## CV residual  -58.1 -184  -72.4  128  186  111   17 -136  253  -29.4   70.4
##                494  502  505    506  508    516    522  523  527    529  530
## Predicted   7475.0 7518 8230 8015.1 8033 8080.9 8012.7 7466 7378 7470.2 7778
## cvpred      7488.8 7518 8221 8022.5 8018 8085.8 8007.5 7471 7392 7473.6 7793
## oppPTS      7484.0 7466 8047 7981.0 8163 8120.0 7974.0 7101 7059 7412.0 7529
## CV residual   -4.8  -52 -174  -41.5  145   34.2  -33.5 -370 -333  -61.6 -264
##                531    532  533    534    537  539    540  543    545  546  552
## Predicted   7446.3 7915.7 7349 7965.8 8008.2 7937 7796.8 7853 8137.3 7769 7856
## cvpred      7449.4 7933.9 7371 7964.4 8008.6 7937 7800.2 7859 8129.5 7782 7857
## oppPTS      7480.0 7866.0 7250 7976.0 7992.0 8058 7720.0 8085 8036.0 7560 7982
## CV residual   30.6  -67.9 -121   11.6  -16.6  121  -80.2  226  -93.5 -222  125
##              554  556    559    560    561  567  568  573  580    583  585  587
## Predicted   7813 7770 7226.8 7906.5 7598.1 7676 7772 7760 8144 7808.1 7665 8274
## cvpred      7820 7780 7230.2 7911.2 7590.9 7681 7776 7748 8152 7826.7 7663 8266
## oppPTS      8014 7548 7330.0 7858.0 7680.0 7724 8006 7580 8260 7876.0 7526 8067
## CV residual  194 -232   99.8  -53.2   89.1   43  230 -168  108   49.3 -137 -199
##              589  592  597  600  601  602  606  610  611    619    622  623
## Predicted   7940 7592 7852 7668 8055 8026 7277 7601 8072 7468.4 8085.8 7507
## cvpred      7942 7598 7847 7681 8059 8032 7280 7599 8073 7467.3 8074.5 7520
## oppPTS      7741 7412 7992 7834 8262 7884 7021 7359 7952 7544.0 8016.0 7253
## CV residual -201 -186  145  153  203 -148 -259 -240 -121   76.7  -58.5 -267
##              629  635  638    640    641    642  643  644  645     647  649
## Predicted   8042 7635 8192 7848.1 8129.8 7764.8 7841 8049 8065 8186.46 7834
## cvpred      8027 7639 8169 7855.5 8121.8 7774.1 7842 8029 8054 8187.35 7836
## oppPTS      7658 7465 8338 7792.0 8218.0 7815.0 7619 7832 8177 8189.00 7949
## CV residual -369 -174  169  -63.5   96.2   40.9 -223 -197  123    1.65  113
##              652    653     654    658    660    662  667    668    670    672
## Predicted   8088 8267.2 7987.71 8297.1 7868.8 8184.0 7691 8003.9 7833.1 7746.8
## cvpred      8063 8266.9 7966.08 8298.6 7863.4 8195.8 7719 7995.8 7837.9 7764.7
## oppPTS      7925 8311.0 7975.00 8271.0 7819.0 8208.0 7841 7949.0 7874.0 7676.0
## CV residual -138   44.1    8.92  -27.6  -44.4   12.2  122  -46.8   36.1  -88.7
##              674  675  677  681  682    686  689  691  696  697    700    702
## Predicted   7791 8176 8147 7531 8531 8025.4 7958 7756 8186 7764 7737.2 8129.3
## cvpred      7797 8143 8159 7543 8505 8019.1 7959 7758 8176 7774 7741.9 8129.2
## oppPTS      7842 8367 8307 7278 8659 8070.0 7689 7609 8040 7881 7834.0 8178.0
## CV residual   45  224  148 -265  154   50.9 -270 -149 -136  107   92.1   48.8
##                703     707  708  711  720  722  732  740    743    753  763
## Predicted   8157.3 8024.60 8643 7670 7823 8600 8076 8444 7897.2 7663.0 8107
## cvpred      8137.7 8027.19 8645 7681 7817 8617 8077 8442 7914.5 7672.9 8107
## oppPTS      8064.0 8033.00 8438 7388 7932 8770 8395 8593 7977.0 7767.0 8244
## CV residual  -73.7    5.81 -207 -293  115  153  318  151   62.5   94.1  137
##              765  767     770  771    772    776    779  784  786  792  795
## Predicted   8826 7885 7718.73 8753 7690.8 7975.9 8008.4 9087 8610 8466 8555
## cvpred      8806 7887 7705.82 8744 7709.9 7984.5 8030.3 9097 8618 8464 8547
## oppPTS      8841 7737 7715.00 8966 7650.0 7956.0 8127.0 9217 8510 8838 8686
## CV residual   35 -150    9.18  222  -59.9  -28.5   96.7  120 -108  374  139
##              796  802    805  809    815    819    823  824     825  827  830
## Predicted   7895 8021 8248.7 7640 8540.6 7903.0 8253.4 7579 8664.31 7894 7876
## cvpred      7907 8024 8248.9 7648 8532.4 7924.7 8250.2 7595 8667.04 7883 7868
## oppPTS      8036 7895 8284.0 7487 8506.0 8003.0 8234.0 7711 8670.00 7687 7771
## CV residual  129 -129   35.1 -161  -26.4   78.3  -16.2  116    2.96 -196  -97
## 
## Sum of squares = 7910667    Mean square = 28456    n = 278 
## 
## fold 2 
## Observations in test set: 279 
##                1    2    4    7      9   10     17      18   20   21   26   27
## Predicted   8361 9051 9696 8586 9113.5 8865 8498.8 9807.96 8780 8528 9328 9216
## cvpred      8359 9068 9682 8577 9103.1 8846 8502.5 9814.76 8782 8532 9336 9249
## oppPTS      8334 8664 9332 8853 9176.0 8603 8469.0 9819.00 8515 8887 9068 9011
## CV residual  -25 -404 -350  276   72.9 -243  -33.5    4.24 -267  355 -268 -238
##               31      33     34   36   41   42   52   57   58     65     71
## Predicted   8961 8776.05 8751.1 9421 8851 9177 9387 8755 8590 9388.0 9534.2
## cvpred      8962 8770.23 8734.4 9420 8843 9187 9392 8733 8598 9416.1 9569.2
## oppPTS      8851 8768.00 8802.0 9262 8973 8867 9187 9001 8441 9502.0 9503.0
## CV residual -111   -2.23   67.6 -158  130 -320 -205  268 -157   85.9  -66.2
##               73     76   82   83   85   88   93     95    100  103  106    109
## Predicted   9470 9149.7 8688 8571 8641 9495 8801 8695.3 9335.5 8979 8792 9071.8
## cvpred      9490 9144.1 8695 8579 8633 9510 8799 8692.3 9338.3 8985 8802 9075.5
## oppPTS      9277 9205.0 8445 7997 8361 9299 8656 8735.0 9324.0 9170 8448 8986.0
## CV residual -213   60.9 -250 -582 -272 -211 -143   42.7  -14.3  185 -354  -89.5
##              112  114    117    118  123    125  131  135    141    146    151
## Predicted   9027 8811 8896.8 9206.4 9144 9571.8 8747 8929 9091.9 9224.4 9015.1
## cvpred      9031 8834 8898.2 9208.4 9153 9608.7 8754 8943 9091.4 9213.5 9027.3
## oppPTS      8879 8660 8985.0 9129.0 8977 9632.0 8925 8822 9071.0 9165.0 9112.0
## CV residual -152 -174   86.8  -79.4 -176   23.3  171 -121  -20.4  -48.5   84.7
##                156  158    161  167    172    179  180     181    187  188  191
## Predicted   9251.4 8812 8416.0 8695 8868.2 9364.1 8954 9262.45 8584.0 8467 9241
## cvpred      9272.4 8821 8416.9 8697 8865.3 9391.8 8965 9291.14 8573.9 8474 9240
## oppPTS      9176.0 8572 8431.0 8836 8893.0 9359.0 9300 9287.00 8504.0 8602 9453
## CV residual  -96.4 -249   14.1  139   27.7  -32.8  335   -4.14  -69.9  128  213
##                192    193  198    207  208    211    215  217  226    230
## Predicted   8804.3 8608.6 8999 8630.6 8674 8281.7 9554.8 8897 9399 8198.7
## cvpred      8788.4 8607.8 9014 8625.3 8678 8255.6 9529.1 8925 9410 8198.6
## oppPTS      8821.0 8646.0 8695 8699.0 8863 8300.0 9583.0 9109 9275 8176.0
## CV residual   32.6   38.2 -319   73.7  185   44.4   53.9  184 -135  -22.6
##                  236  239  243    244     246  249    252  253  257  259  262
## Predicted   8435.424 8444 8624 8558.8 8750.56 8644 8784.6 9079 8277 8727 8139
## cvpred      8436.446 8457 8614 8564.3 8746.95 8654 8768.5 9091 8282 8729 8112
## oppPTS      8436.000 8057 8787 8523.0 8755.00 8763 8841.0 8847 8367 8940 8278
## CV residual   -0.446 -400  173  -41.3    8.05  109   72.5 -244   85  211  166
##              266    268  269  277    279  280  283  286  287     289  290  293
## Predicted   8178 8493.8 8997 9127 8767.8 8920 8511 8432 8132 8182.76 8382 8073
## cvpred      8189 8503.9 9013 9155 8753.8 8931 8508 8430 8100 8161.61 8353 8086
## oppPTS      7937 8466.0 9191 9010 8811.0 8695 8643 8834 8448 8155.00 8479 7946
## CV residual -252  -37.9  178 -145   57.2 -236  135  404  348   -6.61  126 -140
##              295  297  300  302    304    306  307  308  309    310    312  315
## Predicted   8350 8230 8570 8588 8868.3 8679.2 8683 8879 8069 8495.0 8752.2 8879
## cvpred      8353 8203 8589 8588 8880.5 8656.4 8696 8852 8045 8494.1 8730.1 8875
## oppPTS      8507 8352 8749 8780 8897.0 8707.0 8539 9046 8252 8583.0 8761.0 9050
## CV residual  154  149  160  192   16.5   50.6 -157  194  207   88.9   30.9  175
##              319  320  325  327    331    333    334    336    337  338  340
## Predicted   8700 8473 8470 8573 8567.4 8708.9 8713.5 8381.5 8232.9 8229 7692
## cvpred      8690 8501 8458 8573 8593.6 8720.7 8704.5 8366.6 8208.9 8223 7677
## oppPTS      8769 8366 8650 8698 8544.0 8752.0 8643.0 8433.0 8304.0 8531 7886
## CV residual   79 -135  192  125  -49.6   31.3  -61.5   66.4   95.1  308  209
##              341  358  361    362  366  367    371     378    380  383  384
## Predicted   8439 8465 8687 8715.7 8654 7929 7441.1 8671.62 8348.4 8071 7940
## cvpred      8416 8490 8685 8729.9 8670 7944 7455.1 8671.44 8375.5 8101 7975
## oppPTS      8618 8347 8579 8764.0 8834 7816 7366.0 8678.00 8427.0 8299 7799
## CV residual  202 -143 -106   34.1  164 -128  -89.1    6.56   51.5  198 -176
##                387  390    393    394  399  401  404    405     406  407  410
## Predicted   8674.9 8017 8718.2 7972.9 8840 7728 8019 8389.4 8072.41 7933 7742
## cvpred      8702.3 8023 8732.2 7999.3 8901 7755 8061 8404.1 8068.98 7971 7757
## oppPTS      8755.0 8253 8701.0 7959.0 8811 7617 7878 8448.0 8073.00 7792 8031
## CV residual   52.7  230  -31.2  -40.3  -90 -138 -183   43.9    4.02 -179  274
##                411  415    421  423  426  427    428    429    431  436  440
## Predicted   7829.0 8125 8145.5 7491 7591 7182 7898.3 8522.9 8420.2 7656 7656
## cvpred      7836.1 8163 8159.1 7512 7613 7201 7925.2 8563.1 8467.3 7689 7695
## oppPTS      7781.0 7952 8180.0 7328 7572 7022 7952.0 8535.0 8557.0 7326 7563
## CV residual  -55.1 -211   20.9 -184  -41 -179   26.8  -28.1   89.7 -363 -132
##              444     448  451  454  455  458    459  460     463  468  470
## Predicted   7907 8071.62 7902 7652 7582 8045 7661.7 7843 8447.54 8198 7756
## cvpred      7935 8094.09 7904 7667 7596 8055 7666.7 7857 8464.51 8213 7772
## oppPTS      7772 8085.00 8014 7759 7348 8266 7592.0 7985 8469.00 8041 7475
## CV residual -163   -9.09  110   92 -248  211  -74.7  128    4.49 -172 -297
##                471  473  475  479  484  486    487  491    500  501  507    510
## Predicted   7770.1 7732 7328 8205 7783 8483 8375.5 8074 7740.1 7887 7688 7864.5
## cvpred      7776.4 7759 7337 8241 7824 8472 8383.4 8076 7731.3 7899 7690 7868.8
## oppPTS      7847.0 7619 7260 8522 7723 8363 8289.0 7929 7661.0 7683 7548 7886.0
## CV residual   70.6 -140  -77  281 -101 -109  -94.4 -147  -70.3 -216 -142   17.2
##              511      512    513    514  515  517    520  526  528  536  542
## Predicted   8070 7.37e+03 7852.1 7815.5 8071 8200 7659.4 8155 8228 7736 7887
## cvpred      8084 7.37e+03 7856.5 7813.5 8050 8206 7660.8 8147 8231 7745 7888
## oppPTS      7934 7.37e+03 7927.0 7909.0 7888 7976 7607.0 7966 7911 7574 8035
## CV residual -150 5.94e-02   70.5   95.5 -162 -230  -53.8 -181 -320 -171  147
##              547  550  553    555    564  566  569  572  574    578  586  588
## Predicted   8276 8010 7387 7855.0 7725.7 7679 7903 8001 7604 7994.0 8231 7646
## cvpred      8294 8017 7378 7856.1 7723.3 7689 7918 7975 7605 8005.8 8246 7634
## oppPTS      8452 7884 7276 7868.0 7766.0 7798 7631 7806 7190 8031.0 7971 7752
## CV residual  158 -133 -102   11.9   42.7  109 -287 -169 -415   25.2 -275  118
##              595    596    599  603  605  607    612  613  616  617    627
## Predicted   7475 7588.3 7896.9 7291 7378 8077 7351.6 7407 8328 7522 8201.3
## cvpred      7488 7569.5 7910.3 7279 7393 8107 7324.5 7409 8338 7531 8209.6
## oppPTS      7566 7585.0 7876.0 6909 7220 8147 7303.0 7196 8287 7419 8233.0
## CV residual   78   15.5  -34.3 -370 -173   40  -21.5 -213  -51 -112   23.4
##                628    630    631  632  636  637  639    648  651    655  656
## Predicted   8167.2 7874.3 7888.9 7887 7664 7768 7654 8518.3 7619 8314.0 8228
## cvpred      8156.3 7870.3 7869.1 7871 7677 7775 7653 8534.6 7616 8328.3 8252
## oppPTS      8220.0 7849.0 7934.0 7995 7564 7912 7473 8470.0 7248 8268.0 8362
## CV residual   63.7  -21.3   64.9  124 -113  137 -180  -64.6 -368  -60.3  110
##                657  659  664  666  669    671    676  678    683    684  685
## Predicted   8223.6 8291 8410 7660 7565 8100.0 7771.9 8594 8556.8 7805.3 8360
## cvpred      8240.1 8312 8432 7678 7570 8120.6 7777.2 8600 8572.9 7830.9 8370
## oppPTS      8159.0 7968 8187 7544 7255 8105.0 7872.0 8431 8532.0 7789.0 8184
## CV residual  -81.1 -344 -245 -134 -315  -15.6   94.8 -169  -40.9  -41.9 -186
##                694    698  699    704    705    712    714    715  718    719
## Predicted   8837.1 8511.8 8611 7985.1 8107.0 8279.7 8111.0 8647.5 8145 8170.0
## cvpred      8849.4 8533.9 8634 7998.8 8159.5 8300.4 8145.5 8658.8 8152 8174.6
## oppPTS      8765.0 8480.0 8753 7962.0 8228.0 8367.0 8087.0 8598.0 8318 8234.0
## CV residual  -84.4  -53.9  119  -36.8   68.5   66.6  -58.5  -60.8  166   59.4
##              721    724  733    734  736    737  742  744  748  757    758  759
## Predicted   7759 8893.9 8115 7747.7 8245 7799.8 8409 8037 7866 8353 8079.7 7949
## cvpred      7760 8898.6 8143 7741.8 8274 7790.8 8408 8045 7886 8356 8071.9 7945
## oppPTS      7862 8924.0 8272 7837.0 8119 7889.0 8717 8146 7781 8518 8140.0 8146
## CV residual  102   25.4  129   95.2 -155   98.2  309  101 -105  162   68.1  201
##                760    762  773    777  778  780  781  787    788    790  791
## Predicted   7979.9 8380.5 8164 7840.4 7837 7760 7955 8121 7986.3 7648.1 8195
## cvpred      7959.5 8403.8 8158 7837.1 7842 7764 7938 8139 7983.4 7630.9 8200
## oppPTS      8040.0 8422.0 8352 7836.0 7693 7838 8141 8370 7952.0 7727.0 7870
## CV residual   80.5   18.2  194   -1.1 -149   74  203  231  -31.4   96.1 -330
##              793  794  797    799    800  803    806    807  810  814    816
## Predicted   8172 8178 7930 8553.2 7863.1 8431 7913.7 7506.5 8403 8434 8282.5
## cvpred      8170 8178 7953 8581.9 7868.7 8436 7921.2 7491.4 8415 8433 8288.4
## oppPTS      8323 8422 7812 8637.0 7774.0 8680 7857.0 7473.0 8566 8668 8271.0
## CV residual  153  244 -141   55.1  -94.7  244  -64.2  -18.4  151  235  -17.4
##                 817    818    820  826    828    829  831  833  834  835
## Predicted   8325.28 7795.6 7768.3 8058 7918.3 8701.2 8440 8370 8177 8406
## cvpred      8355.49 7793.8 7768.5 8046 7902.5 8721.7 8454 8378 8173 8394
## oppPTS      8346.00 7820.0 7757.0 8285 7996.0 8684.0 8589 8639 8303 8584
## CV residual   -9.49   26.2  -11.5  239   93.5  -37.7  135  261  130  190
## 
## Sum of squares = 7714830    Mean square = 27652    n = 279 
## 
## fold 3 
## Observations in test set: 278 
##                  3    6    8   11   12   13     14     22    28   29   32   37
## Predicted   8967.5 9471 9265 8782 9016 9151 9484.6 9058.2 10219 8934 8829 9074
## cvpred      8959.3 9478 9274 8774 9024 9155 9495.5 9053.1 10232 8949 8844 9088
## oppPTS      9035.0 9609 9070 8954 8702 8975 9438.0 8982.0 10025 8692 8712 8716
## CV residual   75.7  131 -204  180 -322 -180  -57.5  -71.1  -207 -257 -132 -372
##               40   43   44   47     49   54     55     60   68   69   70   75
## Predicted   9122 8969 8577 8972 9191.7 8730 8611.1 8900.9 8595 8763 9009 9543
## cvpred      9129 8972 8580 8978 9208.5 8744 8622.3 8905.8 8592 8771 9013 9551
## oppPTS      9007 8666 8784 8657 9161.0 8683 8532.0 8926.0 8413 8413 8752 9272
## CV residual -122 -306  204 -321  -47.5  -61  -90.3   20.2 -179 -358 -261 -279
##               77     78   79     84   86   87   99  101  102    107  108    111
## Predicted   9354 9319.8 9530 8571.9 8855 8926 9436 8714 9304 8572.6 9138 9243.9
## cvpred      9361 9328.9 9532 8575.2 8860 8921 9456 8716 9310 8581.2 9142 9246.7
## oppPTS      9096 9391.0 9209 8562.0 8633 9075 9287 8961 9144 8658.0 9028 9344.0
## CV residual -265   62.1 -323  -13.2 -227  154 -169  245 -166   76.8 -114   97.3
##                113  116  119  120    121    122  127  128    129  130    132
## Predicted   9264.4 8749 9109 9736 9370.7 9602.8 8944 8921 8923.1 9163 9077.5
## cvpred      9267.3 8745 9115 9757 9377.8 9622.6 8935 8930 8935.2 9175 9070.5
## oppPTS      9335.0 8867 8938 9641 9304.0 9654.0 9093 8528 8956.0 9007 9031.0
## CV residual   67.7  122 -177 -116  -73.8   31.4  158 -402   20.8 -168  -39.5
##              136     138    139    140  143  145    149  153    154  159  163
## Predicted   9168 8708.79 8655.7 9198.9 9680 9378 9041.5 8682 9241.7 9039 8384
## cvpred      9171 8714.42 8646.5 9206.4 9699 9390 9034.6 8691 9232.9 9045 8392
## oppPTS      8946 8712.00 8587.0 9274.0 9303 9582 8983.0 8858 9268.0 8901 8523
## CV residual -225   -2.42  -59.5   67.6 -396  192  -51.6  167   35.1 -144  131
##              166  169    170    171  176  177  182  183    185    194    196
## Predicted   9723 8823 8698.9 9496.8 8507 9180 9009 8922 8737.7 8956.9 8728.5
## cvpred      9741 8827 8701.2 9513.5 8515 9180 9022 8950 8730.2 8965.9 8733.8
## oppPTS      9640 8683 8751.0 9503.0 8745 9311 8811 8802 8828.0 8949.0 8653.0
## CV residual -101 -144   49.8  -10.5  230  131 -211 -148   97.8  -16.9  -80.8
##              199  203    205    209  219    220  223  225    227    229  231
## Predicted   8642 9491 8614.0 9221.6 8630 8892.7 9494 9271 9053.3 8953.9 9220
## cvpred      8649 9499 8612.9 9228.1 8625 8895.8 9503 9272 9053.5 8970.4 9232
## oppPTS      8785 9714 8597.0 9265.0 8818 8937.0 9258 9096 9106.0 8958.0 9056
## CV residual  136  215  -15.9   36.9  193   41.2 -245 -176   52.5  -12.4 -176
##              233    234  237  240  241  245    247      248    251    254
## Predicted   8409 8819.6 8226 9713 8390 8956 8166.6 8831.172 8592.5 8837.1
## cvpred      8397 8829.2 8233 9722 8388 8962 8188.5 8853.474 8600.2 8837.9
## oppPTS      8696 8873.0 8378 9791 8632 9044 8150.0 8853.000 8630.0 8756.0
## CV residual  299   43.8  145   69  244   82  -38.5   -0.474   29.8  -81.9
##                255  256  258  260    261   265    267    275  278  294  296
## Predicted   8339.3 8700 9069 8201 8758.5 10480 9502.2 8845.3 8446 9254 8667
## cvpred      8348.2 8710 9076 8199 8769.3 10499 9514.7 8862.7 8452 9264 8655
## oppPTS      8432.0 8684 9009 8668 8858.0 10723 9430.0 8811.0 8656 9412 9042
## CV residual   83.8  -26  -67  469   88.7   224  -84.7  -51.7  204  148  387
##                298  299    301    303  305  314    316    317  323  328  329
## Predicted   8247.3 8737 8827.8 7988.9 8248 8209 8005.9 8243.7 8472 8500 8190
## cvpred      8269.3 8742 8848.1 7985.5 8259 8216 8019.9 8242.9 8470 8503 8196
## oppPTS      8319.0 8953 8815.0 8009.0 8462 8429 8109.0 8303.0 8697 8684 8328
## CV residual   49.7  211  -33.1   23.5  203  213   89.1   60.1  227  181  132
##                330  339    347    348  350     353  356  363    364    369
## Predicted   7923.7 8771 8506.7 8643.6 7878 8256.31 8162 7581 7910.9 8060.3
## cvpred      7914.7 8774 8507.7 8646.5 7874 8255.23 8171 7576 7926.1 8050.2
## oppPTS      7823.0 8930 8587.0 8701.0 7997 8256.00 8281 7771 7942.0 7980.0
## CV residual  -91.7  156   79.3   54.5  123    0.77  110  195   15.9  -70.2
##                370  372  373  376    382    389  392     396  397  398    409
## Predicted   7946.0 8660 8025 8155 8497.8 8190.3 7863 8489.10 7747 7317 8370.8
## cvpred      7946.6 8657 8024 8145 8499.4 8186.8 7856 8482.77 7745 7319 8371.3
## oppPTS      7929.0 8700 8240 8317 8464.0 8138.0 8071 8478.00 7621 7261 8463.0
## CV residual  -17.6   43  216  172  -35.4  -48.8  215   -4.77 -124  -58   91.7
##              413    417    418    420  422  424    425    430    432    434
## Predicted   8446 7933.5 7888.9 7810.5 8459 8724 7880.8 7353.7 7859.1 8199.4
## cvpred      8447 7926.1 7878.3 7806.1 8459 8731 7872.2 7352.3 7837.3 8205.7
## oppPTS      8566 7960.0 7933.0 7864.0 8321 8849 7955.0 7293.0 7881.0 8162.0
## CV residual  119   33.9   54.7   57.9 -138  118   82.8  -59.3   43.7  -43.7
##              438  439  441     445     447    452    453  456    457  462  465
## Predicted   8046 8148 7926 8178.29 7650.35 7498.6 8161.0 7473 8060.2 7592 7506
## cvpred      8049 8153 7926 8175.01 7652.79 7502.3 8178.1 7471 8069.4 7589 7500
## oppPTS      8003 8348 7748 8185.00 7644.00 7572.0 8079.0 7361 7995.0 7375 7383
## CV residual  -46  195 -178    9.99   -8.79   69.7  -99.1 -110  -74.4 -214 -117
##                466  467    472  476  477  480    483  493    495    496  497
## Predicted   7865.5 8184 7799.1 7830 8289 8043 7795.2 7809 8218.9 7909.0 8232
## cvpred      7869.7 8186 7794.1 7833 8306 8048 7789.2 7804 8219.9 7907.1 8241
## oppPTS      7905.0 8232 7741.0 7658 8541 7921 7853.0 7566 8282.0 7872.0 8121
## CV residual   35.3   46  -53.1 -175  235 -127   63.8 -238   62.1  -35.1 -120
##              498  499  503    504    509  518  519    521    524  525  535  538
## Predicted   7711 8302 8589 7475.7 8132.5 8190 7907 7923.8 7911.7 7776 7986 8078
## cvpred      7702 8302 8580 7468.3 8128.4 8197 7899 7916.4 7907.7 7776 7987 8076
## oppPTS      7435 8150 8368 7399.0 8190.0 8326 7784 7818.0 7942.0 7871 7822 8192
## CV residual -267 -152 -212  -69.3   61.6  129 -115  -98.4   34.3   95 -165  116
##                541    544    548    549  551    557  558    562  563  565  570
## Predicted   7557.0 8244.4 7992.0 7888.2 7912 7899.2 8291 7924.0 7622 7628 8179
## cvpred      7557.3 8240.9 7992.1 7879.9 7910 7888.5 8290 7918.6 7620 7633 8171
## oppPTS      7621.0 8280.0 7973.0 7916.0 7720 7841.0 8111 7954.0 7423 7530 8207
## CV residual   63.7   39.1  -19.1   36.1 -190  -47.5 -179   35.4 -197 -103   36
##              571  575    576    577    579  581    582  584    590    591
## Predicted   8074 8278 7627.3 7685.3 8076.5 7543 8092.7 7701 7628.1 7873.2
## cvpred      8068 8277 7621.8 7678.5 8072.1 7539 8091.7 7696 7623.3 7862.8
## oppPTS      8284 8493 7567.0 7654.0 8039.0 7430 8139.0 7392 7589.0 7809.0
## CV residual  216  216  -54.8  -24.5  -33.1 -109   47.3 -304  -34.3  -53.8
##                593    594    598    604  608  609  614    615    618    620
## Predicted   7606.7 7966.3 7906.1 7730.0 7918 7813 7697 7577.3 7928.8 8014.9
## cvpred      7605.1 7968.9 7897.5 7721.7 7914 7818 7691 7566.9 7930.4 8005.1
## oppPTS      7565.0 7934.0 7930.0 7709.0 7732 7730 7537 7663.0 8030.0 8022.0
## CV residual  -40.1  -34.9   32.5  -12.7 -182  -88 -154   96.1   99.6   16.9
##              621  624     625  626  633  634  646  650    661  663    665  673
## Predicted   7186 7601 7990.57 8077 7464 8404 8141 8206 7646.2 7691 7475.0 7786
## cvpred      7180 7602 7988.64 8077 7461 8404 8140 8204 7651.7 7695 7461.9 7771
## oppPTS      6909 7371 7990.00 8401 7336 8271 8344 8328 7632.0 7394 7517.0 7579
## CV residual -271 -231    1.36  324 -125 -133  204  124  -19.7 -301   55.1 -192
##                679  680  687    688  690    692  693  695    701  706     709
## Predicted   7968.8 8127 8273 8344.3 7892 8540.5 7675 7718 8527.5 7844 8060.41
## cvpred      7973.7 8117 8267 8342.6 7886 8531.6 7679 7700 8521.7 7837 8060.05
## oppPTS      8060.0 7980 8137 8252.0 7620 8506.0 7531 7555 8531.0 7707 8069.00
## CV residual   86.3 -137 -130  -90.6 -266  -25.6 -148 -145    9.3 -130    8.95
##                710    713  716  717  723    725    726  727  728  729  730  731
## Predicted   8539.1 8133.4 7981 7542 7593 7635.2 8643.8 8092 8195 8660 8049 8292
## cvpred      8535.4 8124.7 7978 7527 7595 7623.3 8630.2 8087 8182 8648 8044 8289
## oppPTS      8451.0 8076.0 8203 7404 7386 7547.0 8642.0 8288 8309 8766 8200 8521
## CV residual  -84.4  -48.7  225 -123 -209  -76.3   11.8  201  127  118  156  232
##              735  738    739  741  745    746    747    749    750  751  752
## Predicted   8376 8502 7985.4 7595 7979 7887.8 7646.8 8322.3 7586.6 7890 8149
## cvpred      8368 8483 7978.4 7581 7980 7885.8 7630.8 8321.9 7579.9 7884 8145
## oppPTS      8490 8612 7900.0 7427 8131 7917.0 7659.0 8401.0 7491.0 8181 8275
## CV residual  122  129  -78.4 -154  151   31.2   28.2   79.1  -88.9  297  130
##              754     755  756  761  764  766  768  769  774  775    782  783
## Predicted   8954 7765.46 8552 8343 7576 8187 7874 8504 8135 8094 8375.1 7997
## cvpred      8958 7749.55 8542 8341 7571 8182 7880 8494 8132 8097 8374.9 8003
## oppPTS      9212 7742.00 8708 8231 7730 8452 7981 8816 8275 8489 8394.0 8128
## CV residual  254   -7.55  166 -110  159  270  101  322  143  392   19.1  125
##              785  789  798  801  804    808    811  812  813  821  822    832
## Predicted   8617 8275 7961 8436 7972 7903.4 7802.2 8368 8133 7792 8614 7971.9
## cvpred      8608 8278 7960 8430 7958 7896.8 7782.2 8355 8138 7790 8605 7959.8
## oppPTS      8425 8528 8334 8558 8109 7978.0 7873.0 8421 8246 7603 8832 8034.0
## CV residual -183  250  374  128  151   81.2   90.8   66  108 -187  227   74.2
## 
## Sum of squares = 6828900    Mean square = 24564    n = 278 
## 
## Overall (Sum over all 278 folds) 
##    ms 
## 26891
MSPE value for the opposition points model:

## [1] 26891

PRESS value:

## [1] 22454397

Predicted R^2 value:

## [1] 0.922

Adjusted R^2 value:

## [1] 0.924

We then tested the model on a randomly generated test sample.

MSPE:

## [1] 112359

PRESS value:

## [1] 28202013

Predicted R^2 value:

## [1] 0.907
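The cross-validation and test-sample figures reported in this section are linked by simple arithmetic, which the following sketch reproduces. The fold sums of squares, the fold sizes, and the test-sample size of 251 are copied from the output in this report; the variable names are illustrative and are not taken from the original analysis code.

```r
# Sum of squared CV residuals and number of observations per fold (from the CV output)
fold_ss <- c(7910667, 7714830, 6828900)
fold_n  <- c(278, 279, 278)

fold_ms <- fold_ss / fold_n      # per-fold mean squares
press   <- sum(fold_ss)          # total squared prediction error over all folds
mspe    <- press / sum(fold_n)   # overall mean squared prediction error

# Test sample: the reported MSPE equals the reported sum of squares divided by 251 rows
test_ss   <- 28202013
test_mspe <- test_ss / 251

print(round(fold_ms))    # 28456 27652 24564
print(press)             # 22454397
print(round(mspe))       # 26891
print(round(test_mspe))  # 112359
```

The overall mean square is therefore a size-weighted average of the three fold mean squares, not a simple mean of them.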

Logistic Model

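We first fit a linear model of wins (W) on point difference (PTSdiff), and then a logistic regression of playoff qualification (Playoffs) on wins.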

## 
## Call:
## lm(formula = W ~ PTSdiff, data = Nba)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.739 -2.102 -0.067  2.026 10.603 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 4.10e+01   1.06e-01     387   <2e-16 ***
## PTSdiff     3.26e-02   2.79e-04     117   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.06 on 833 degrees of freedom
## Multiple R-squared:  0.942,  Adjusted R-squared:  0.942 
## F-statistic: 1.36e+04 on 1 and 833 DF,  p-value: <2e-16
## 
## Call:
## glm(formula = Playoffs ~ W, family = "binomial", data = Nbalogitrain2)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8941  -0.0908   0.0131   0.1749   2.9447  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -18.4806     1.6214   -11.4   <2e-16 ***
## W             0.4719     0.0406    11.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1138.8  on 834  degrees of freedom
## Residual deviance:  308.9  on 833  degrees of freedom
## AIC: 312.9
## 
## Number of Fisher Scoring iterations: 8
## [1] 0.48

The optimal p-cut value is 0.48.


The confusion matrix below shows that our model correctly classified 768 of the 835 observations in the test dataset (308 + 460), an accuracy of about 92%.

The optimal p-cut value of 0.48 corresponds to a minimum of 39 wins required to qualify for the playoffs.

##              pred_value2
## actual_value2   0   1
##             0 308  47
##             1  20 460
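The link between the p-cut value and the 39-win threshold can be checked by inverting the fitted logistic model. The sketch below uses the rounded coefficients from the glm summary above and the confusion matrix counts as printed; the object names (`b0`, `b1`, `w_cut`, `cm`) are illustrative assumptions.

```r
# Coefficients copied from the glm(Playoffs ~ W) summary above
b0   <- -18.4806
b1   <-   0.4719
pcut <-   0.48

# P(playoffs) = plogis(b0 + b1 * W); solve plogis(b0 + b1 * W) = pcut for W
w_cut <- (qlogis(pcut) - b0) / b1
print(w_cut)   # just under 39

# Fitted probabilities on either side of the cut-off
print(plogis(b0 + b1 * c(38, 39)))

# Accuracy from the confusion matrix (rows = actual, columns = predicted)
cm <- matrix(c(308, 20, 47, 460), nrow = 2,
             dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

With these coefficients, a team with 38 wins has a fitted probability below 0.48 and a team with 39 wins has one just above it, which is why 39 is the smallest whole number of wins classified as a playoff team.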

Final Equation

To calculate the minimum number of wins required to qualify for the playoffs, we created a fourth model, a logistic regression model.

The results of the steps taken to create these models are as follows:

Model for calculating the number of wins: W = 41 + 0.0326 * PTSdiff

Setting W = 39, the minimum number of wins from the logistic model:

39 = 41 + 0.0326 * PTSdiff

PTSdiff = (39 - 41) / 0.0326 = -61.35

Conclusion:

Thus, we can conclude that to qualify for the playoffs, a team needs to:

  1. Win at least 39 games

  2. Have a point difference of at least -61.35
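The point-difference threshold follows from inverting the linear wins model at 39 wins. A minimal sketch, using the rounded coefficients quoted in this section (the variable names are illustrative):

```r
# Coefficients copied from the lm(W ~ PTSdiff) summary, rounded as in the report
intercept <- 41
slope     <- 0.0326
min_wins  <- 39

# Invert W = intercept + slope * PTSdiff at W = min_wins
ptsdiff_cut <- (min_wins - intercept) / slope
print(round(ptsdiff_cut, 2))   # -61.35
```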

Combining all the models together.

Diff2<-(predictionfinal[,1]-predictionfinal2[,1])
Playoffsdif<-data.frame(NBA_test$Playoffs,Diff2)

# Predict playoffs (1) when the predicted point difference exceeds -61, otherwise 0
Playoffsdif$Predictedplayoffs <- ifelse(Playoffsdif$Diff2 > -61, 1, 0)

actual_value3=NBA_test$Playoffs
pred_value3=Playoffsdif$Predictedplayoffs
confusion_matrix3=table(actual_value3,pred_value3)
confusion_matrix3
##              pred_value3
## actual_value3   0   1
##             0  79  23
##             1  28 121

Out of the 251 test observations, our model classified 200 correctly (79 + 121), which is an accuracy of 79.7%.
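Accuracy alone hides how the errors are distributed between the two classes. As a rough illustration, the sketch below derives accuracy, sensitivity, and specificity from the confusion matrix printed above; the object name `cm3` and the metric names are assumptions introduced here.

```r
# Confusion matrix of the combined model (rows = actual, columns = predicted)
cm3 <- matrix(c(79, 28, 23, 121), nrow = 2,
              dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))

accuracy    <- sum(diag(cm3)) / sum(cm3)        # (79 + 121) / 251
sensitivity <- cm3["1", "1"] / sum(cm3["1", ])  # 121 / 149 playoff teams found
specificity <- cm3["0", "0"] / sum(cm3["0", ])  # 79 / 102 non-playoff teams found

print(round(c(accuracy = accuracy,
              sensitivity = sensitivity,
              specificity = specificity), 3))
```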

CONCLUSION

From the tests conducted, we conclude that an NBA team needs to win at least 39 games, with a point difference (points scored - opposition points) of at least -61.35, to make it to the playoffs.

Decision Tree

Decision Tree

A decision tree is an effective machine learning technique for regression and classification problems. It makes a sequential, hierarchical decision about the response variable based on the predictor data. The model is defined by a series of questions that, when applied to an observation, lead to a predicted value. Once fitted, the model acts like a protocol: a series of “if-then” conditions that produce a particular result from the provided data.

CART stands for classification and regression tree.

  • Types
    • Regression tree: response variable Y is numerical
    • Classification tree: response variable Y is categorical

Decision tree models are called classification trees when the target variable takes a discrete set of values, and regression trees when the response variable is continuous. In our case the response variable Playoffs has only two outcomes (whether or not a team qualifies for the playoffs), so we build a classification tree. The goal of the decision tree is to select the optimal split at each node. The decision to split at each node is made according to a metric known as purity. A node is 100% impure if it is split evenly 50/50 between the classes and 100% pure when all of its data belong to a single class. To achieve the best model, impurity should be minimized.
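To make the notion of purity concrete, the sketch below computes two common impurity measures for a node from its class proportions. The helper names `gini` and `entropy` are illustrative. rpart uses the Gini index by default for classification, while entropy in bits is the measure that reaches exactly 1 (that is, 100%) for a 50/50 node. The root-node counts (253 non-playoff and 331 playoff teams) are taken from the row totals of the training confusion matrix reported later in this section.

```r
# Gini impurity: 0 for a pure node, 0.5 for an even two-class split
gini <- function(p) 1 - sum(p^2)

# Entropy in bits: 0 for a pure node, 1 for an even two-class split
entropy <- function(p) {
  p <- p[p > 0]          # drop empty classes so that 0 * log2(0) is treated as 0
  -sum(p * log2(p))
}

print(gini(c(0.5, 0.5)))     # 0.5
print(entropy(c(0.5, 0.5)))  # 1
print(gini(c(1, 0)))         # 0
print(entropy(c(1, 0)))      # 0

# Root node of the training data: 253 non-playoff vs 331 playoff teams
root_p <- c(253, 331) / 584
print(gini(root_p))          # about 0.491
```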

Splitting the data in Training and Test data

We split the data 70/30 into in-sample (training) and out-of-sample (test) sets for the classification tree:

set.seed(13439960)
sample_index <- sample(nrow(Nba),nrow(Nba)*0.70)
Nba.train = Nba[sample_index,]
Nba.test = Nba[-sample_index,]
  • In-sample (training) data: Nba.train, the 70% of rows selected by sample_index

  • Out-of-sample (test) data: Nba.test, the remaining 30% of rows

Dimensions of the Training data

dim(Nba.train)
## [1] 584  21

We observe that the training sample consists of floor(0.7 * 835) = 584 rows and 21 variables/columns.

Dimensions of the Test data

dim(Nba.test)
## [1] 251  21

We observe that the test sample consists of the remaining 835 - 584 = 251 rows and 21 variables/columns.
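The row counts can be reproduced without the NBA data. The following sketch applies the same splitting pattern to a stand-in data frame with 835 rows; the objects `demo`, `demo.train`, and `demo.test` are hypothetical and only illustrate how the 584/251 split arises.

```r
set.seed(13439960)

n    <- 835
demo <- data.frame(id = seq_len(n))   # stand-in for the Nba data frame

train_size <- floor(n * 0.70)         # 584; the fractional part of 584.5 is dropped
idx        <- sample(n, train_size)   # row indices of the training sample

demo.train <- demo[idx, , drop = FALSE]
demo.test  <- demo[-idx, , drop = FALSE]   # every row not drawn for training

print(c(train = nrow(demo.train), test = nrow(demo.test)))
print(length(intersect(demo.train$id, demo.test$id)))   # no row is in both sets
```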

library(rpart)
library(rpart.plot)

Classification tree with 12 predictors

Classification trees are slightly more involved because the cost function can be asymmetric: we must choose weights for the false negatives (predicting 0 when the truth is 1) and the false positives (predicting 1 when the truth is 0).

In our case, we assume that a false negative costs the same as a false positive, and therefore use a symmetric cost function.
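A weighted cost function of this kind can be sketched as follows. The function `cost` and its arguments `w_fn` and `w_fp` are illustrative names, not part of the original analysis; with equal weights the cost reduces to the ordinary misclassification rate used in the rest of this section.

```r
# Weighted misclassification cost; w_fn and w_fp are illustrative argument names
cost <- function(actual, predicted, w_fn = 1, w_fp = 1) {
  fn <- actual == 1 & predicted == 0   # false negatives
  fp <- actual == 0 & predicted == 1   # false positives
  mean(w_fn * fn + w_fp * fp)
}

# Toy labels: one false negative (2nd element) and one false positive (3rd element)
actual    <- c(1, 1, 0, 0, 1, 0)
predicted <- c(1, 0, 1, 0, 1, 0)

print(cost(actual, predicted))            # symmetric: plain misclassification rate
print(cost(actual, predicted, w_fn = 5))  # asymmetric: false negatives cost 5 times more
```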

In the rpart() function, the cp (complexity parameter) argument controls the complexity of the tree. A split is attempted only if it decreases the overall lack of fit by a factor of cp, so the smaller the cp value, the larger (more complex) the tree that rpart will attempt to fit. We start with cp = 0.001, which is smaller than rpart's default of 0.01, and revise this value when pruning.

Nba.rpart <- rpart(formula = Playoffs ~ W + PTS + oppPTS + X2PA + X3PA + FTA + AST + ORB + DRB + TOV + STL + BLK, data = Nba.train, method = "class",cp=0.001) # Method=class is req if the response var is not declared as factors

prp(Nba.rpart, extra = 1,fallen.leaves = TRUE)

In-sample performance

pred.train <- predict(Nba.rpart, type = "class")
table(Nba.train$Playoffs, pred.train, dnn = c("True", "Predicted"))
##     Predicted
## True   0   1
##    0 235  18
##    1  16 315
Misclassification rate :
sum(ifelse(pred.train != Nba.train$Playoffs,1,0))/dim(Nba.train)[1]
## [1] 0.0582

Out of Sample performance

pred.test <- predict(Nba.rpart, Nba.test, type = "class")
table(Nba.test$Playoffs, pred.test, dnn = c("True", "Predicted"))
##     Predicted
## True   0   1
##    0  94   8
##    1  19 130
Misclassification rate :
sum(ifelse(pred.test != Nba.test$Playoffs,1,0))/dim(Nba.test)[1]
## [1] 0.108
ROC and AUC
  • ROC: To obtain an overall measure of the goodness of classification, we use the Receiver Operating Characteristic (ROC) curve. Rather than a single overall misclassification rate, it uses two measures: the true positive fraction (TPF, or sensitivity) and the false positive fraction (FPF, or 1 - specificity).
Nba.test.prob.rpart = predict(Nba.rpart,Nba.test, type="prob")
library(ROCR)

pred = prediction(Nba.test.prob.rpart[,2], Nba.test$Playoffs)
perf = performance(pred, "tpr", "fpr")
plot(perf, colorize=TRUE)

  • AUC: The area under the ROC curve. As a rule of thumb, an AUC above 0.70 is considered good.
slot(performance(pred, "auc"), "y.values")[[1]]
## [1] 0.963
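The AUC can also be read as the probability that a randomly chosen playoff team receives a higher predicted score than a randomly chosen non-playoff team. The sketch below computes it from ranks in base R on a small made-up example; the function `auc_rank` and the toy scores are assumptions for illustration, and on real predictions the result should agree with the ROCR value.

```r
# AUC via the rank-sum (Mann-Whitney) formulation; tied scores count as half
auc_rank <- function(scores, labels) {
  r     <- rank(scores)                 # average ranks handle tied scores
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example: 3 positives and 2 negatives, so 6 positive-negative pairs
scores <- c(0.90, 0.80, 0.35, 0.60, 0.20)
labels <- c(1,    1,    1,    0,    0)

# 5 of the 6 pairs are ordered correctly (only 0.35 vs 0.60 is not)
print(auc_rank(scores, labels))   # 0.8333333
```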
Prune the tree

We use pruning to avoid overfitting the training data. It is a technique that reduces the size of a decision tree by removing sections that contribute little to classifying instances. Pruning reduces the complexity of the final classifier and can therefore improve predictive accuracy by reducing overfitting.

  • Computing the cp value
plotcp(Nba.rpart)

Nba.rpart.pruned <- prune(Nba.rpart, cp = 0.098)

Nba.rpart.pruned <- rpart(formula = Playoffs ~ W + PTS + oppPTS + X2PA + X3PA + FTA + AST + ORB + DRB + TOV + STL + BLK, data = Nba.train, method = "class",cp=0.098)

To select the best cp (complexity parameter), we plot the cross-validated error against cp and choose the first (leftmost) cp value whose error falls below the horizontal line, which plotcp() draws one standard error above the minimum of the curve. In our case, that value is 0.098.

Note: method = "class" is required if the response variable is not declared as a factor.
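The visual rule read off the cp plot can also be applied programmatically to the cp table stored in the fitted object. The sketch below does this on the `kyphosis` data set that ships with rpart, not on the NBA data, so the selected cp value will differ from 0.098; the names `cpt`, `threshold`, and `cp_1se` are illustrative.

```r
library(rpart)

set.seed(1)   # cross-validation inside rpart is random
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", cp = 0.001)

cpt <- fit$cptable   # columns: CP, nsplit, rel error, xerror, xstd

# Horizontal line of the cp plot: minimum cross-validated error plus one standard error
min_row   <- which.min(cpt[, "xerror"])
threshold <- cpt[min_row, "xerror"] + cpt[min_row, "xstd"]

# Leftmost (smallest) tree whose cross-validated error is at or below the line
chosen <- min(which(cpt[, "xerror"] <= threshold))
cp_1se <- cpt[chosen, "CP"]

pruned <- prune(fit, cp = cp_1se)
print(cp_1se)
print(c(full = nrow(fit$frame), pruned = nrow(pruned$frame)))   # node counts
```

Applied to Nba.rpart, the same steps should select a cp close to the value read off the plot, although the exact figure depends on the cross-validation folds.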

  • Forming the tree with new cp value

We substitute cp = 0.098 into the model call and plot the resulting tree.

We observe that the new tree with cp = 0.098 has a length of only 2 nodes.

After Pruning the tree

We now compute the misclassification rate on the in-sample and out-of-sample data and determine the ROC and AUC values for the pruned tree.

Our aim is to compare the misclassification rates before and after pruning.

In-sample performance

pred.train <- predict(Nba.rpart.pruned, type = "class")
table(Nba.train$Playoffs, pred.train, dnn = c("True", "Predicted"))
##     Predicted
## True   0   1
##    0 219  34
##    1  14 317
Misclassification rate :
sum(ifelse(pred.train != Nba.train$Playoffs,1,0))/dim(Nba.train)[1]
## [1] 0.0822

Out of Sample performance

pred.test <- predict(Nba.rpart.pruned, Nba.test, type = "class")
table(Nba.test$Playoffs, pred.test, dnn = c("True", "Predicted"))
##     Predicted
## True   0   1
##    0  89  13
##    1   6 143
Misclassification rate :
sum(ifelse(pred.test != Nba.test$Playoffs,1,0))/dim(Nba.test)[1]
## [1] 0.0757
ROC and AUC

We now compute the Receiver Operating Characteristic (ROC) curve and the Area under the curve (AUC) for the pruned tree.

ROC:

Nba.test.prob.rpart = predict(Nba.rpart.pruned,Nba.test, type="prob")
library(ROCR)

pred = prediction(Nba.test.prob.rpart[,2], Nba.test$Playoffs)
perf = performance(pred, "tpr", "fpr")
plot(perf, colorize=TRUE)

AUC:

slot(performance(pred, "auc"), "y.values")[[1]]
## [1] 0.916

Conclusion:

  • Before Pruning
    • The Misclassification rate for In-sample data: 0.05821918
    • The Misclassification rate for Out-of-sample data: 0.1075697
    • Area under curve value: 0.9632188
  • After Pruning
    • The Misclassification rate for In-sample data: 0.08219178
    • The Misclassification rate for Out-of-sample data: 0.07569721
    • Area under curve value: 0.9161403

Although the AUC is higher before pruning (0.963 versus 0.916), the pruned tree has a lower out-of-sample misclassification rate (0.076 versus 0.108), and its in-sample and out-of-sample rates are much closer to each other. The unpruned tree's gap between the two rates points to overfitting, so we conclude that pruning helps.
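The comparison can be made explicit by looking at the gap between out-of-sample and in-sample error for each tree. The sketch below simply re-enters the rates reported above; the names `rates` and `gap` are illustrative.

```r
# Misclassification rates reported above (in-sample, out-of-sample).
rates <- rbind(
  before_pruning = c(in_sample = 0.05821918, out_of_sample = 0.1075697),
  after_pruning  = c(in_sample = 0.08219178, out_of_sample = 0.07569721)
)

# Gap between test and training error: a large positive gap suggests overfitting.
gap <- rates[, "out_of_sample"] - rates[, "in_sample"]
round(gap, 4)
```

The unpruned tree's test error is roughly 5 percentage points above its training error, while the pruned tree's two rates differ by less than 1 percentage point.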

Random Forest

Random forest with 12 predictor variables

Random Forest is a tree ensemble method that extends the Bagging model. It often predicts more accurately than Bagging, and it corrects for a single decision tree's habit of overfitting to the training data set. A random forest grows many bushy trees and averages them, which reduces the variance of the individual trees. At each split it randomly selects n of the total p predictors as candidate variables. This decorrelates the trees, which in turn reduces the variance further when we aggregate them. The reduction in variance comes without a meaningful increase in bias.
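The effect of decorrelation can be illustrated with a standard textbook identity that is not stated in this report: the average of B identically distributed predictions, each with variance sigma2 and pairwise correlation rho, has variance rho * sigma2 + (1 - rho) * sigma2 / B. The sketch below evaluates it for two assumed correlations; the function name `avg_variance` and the rho values are hypothetical and not estimated from the NBA data.

```r
# Variance of the average of B identically distributed predictions,
# each with variance sigma2 and pairwise correlation rho.
avg_variance <- function(rho, sigma2 = 1, B = 500) {
  rho * sigma2 + (1 - rho) * sigma2 / B
}

# Illustrative correlations (assumed values, not estimated from the NBA data).
rho_values <- c(bagging_like = 0.6, decorrelated = 0.2)
variances <- sapply(rho_values, avg_variance)
variances
```

With many trees the second term almost vanishes, so the variance of the ensemble is driven by the correlation between trees. That is why sampling candidate predictors at each split helps beyond plain averaging.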

library(randomForest)
num_of_predictors <- sqrt(12)
Nba.rf <- randomForest(Playoffs~W + PTS + oppPTS + X2PA + X3PA + FTA + AST + ORB + DRB + TOV + STL + BLK, data = Nba.train, family = gaussian, mtry = floor(num_of_predictors),ntree = 500,importance=TRUE)
Nba.rf
## 
## Call:
##  randomForest(formula = Playoffs ~ W + PTS + oppPTS + X2PA + X3PA +      FTA + AST + ORB + DRB + TOV + STL + BLK, data = Nba.train,      family = gaussian, mtry = floor(num_of_predictors), ntree = 500,      importance = TRUE) 
##                Type of random forest: regression
##                      Number of trees: 500
## No. of variables tried at each split: 3
## 
##           Mean of squared residuals: 0.0659
##                     % Var explained: 73.2

The most important tuning parameter in randomForest is mtry, the number of predictors sampled as candidates at each split of each tree. By default it is p/3 for a regression problem and the square root of p for a classification problem, where p is the total number of predictors. Here mtry is set explicitly to floor(sqrt(12)) = 3, so the summary shows 3 of the 12 predictors being tried at every split, which decorrelates the trees. The argument importance = TRUE lets us view the variable importance. ntree specifies the number of trees to grow; the default is 500. The mean of squared residuals in the summary (0.0659) is the out-of-bag (OOB) error: each observation is predicted by averaging only the trees whose bootstrap sample did not include it, so it gives a nearly unbiased estimate of prediction error without a separate validation set.
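As a quick check of the two default rules for mtry with the 12 predictors used here, the following base R sketch evaluates both (the variable names are illustrative):

```r
p <- 12  # number of predictors in the model formula

# Default rules documented for randomForest():
mtry_classification <- floor(sqrt(p))        # classification default
mtry_regression     <- max(floor(p / 3), 1)  # regression default

c(classification = mtry_classification, regression = mtry_regression)
```

Because Playoffs is numeric, the summary reports a regression forest, whose default would have been 4. The 3 shown in the summary therefore comes from the explicit `mtry = floor(num_of_predictors)` argument, not from the default.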

varImpPlot(Nba.rf)

We observe that the most important variable in our model is W (wins), followed by oppPTS and so on.

The fitted random forest stores the OOB error for each number of trees from 1 to 500, so we can see how the OOB error changes as trees are added.

We observe that the OOB error decreases as ntree increases.
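For context on why an OOB error is available at all: each tree is grown on a bootstrap sample drawn with replacement, so some training rows are left out of every tree. A rough sketch of the expected left-out share, assuming the 584 training rows implied by the in-sample tables in this section (253 + 331); the names `n` and `oob_fraction` are illustrative:

```r
n <- 584  # training rows: 253 + 331 from the in-sample tables in this section

# Probability that a given row is never drawn in n draws with replacement,
# i.e. the expected share of rows that are out-of-bag for a single tree.
oob_fraction <- (1 - 1 / n)^n
round(oob_fraction, 2)
```

About 37% of the rows are out-of-bag for any one tree, so with 500 trees every observation is predicted by a sizeable subset of trees that never saw it.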

Prediction on training sample i.e. In sample performance

NBA.rf.pred.train <- predict(Nba.rf, Nba.train)
NBA.rf.pred.train_classified <- as.numeric(NBA.rf.pred.train > 0.48)
table(Nba.train$Playoffs, NBA.rf.pred.train_classified, dnn=c("Truth","Predicted"))
##      Predicted
## Truth   0   1
##     0 253   0
##     1   0 331

The in-sample mean squared error is 0.01398344, which is much lower than the in-sample errors obtained from the linear and classification models.

sum(ifelse(NBA.rf.pred.train_classified != Nba.train$Playoffs,1,0))/dim(Nba.train)[1] # Symmetric misclassification rate (divide by the training set size)
## [1] 0

In sample ROC and AUC

library(ROCR)

 

pred = prediction(NBA.rf.pred.train_classified, Nba.train$Playoffs)
perf = performance(pred, "tpr", "fpr")
plot(perf, colorize=TRUE)

slot(performance(pred, "auc"), "y.values")[[1]] 
## [1] 1

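The in-sample AUC shown above is 1, meaning the forest separates the two classes perfectly on the training data. AUC measures how well the classes are separated rather than an accuracy rate, and an in-sample value this high is optimistic because the model is evaluated on the data it was trained on.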

Prediction on testing sample i.e. Out of sample performance

NBA.rf.pred.test <- predict(Nba.rf, Nba.test)
NBA.rf.pred.test_classified <- as.numeric(NBA.rf.pred.test> 0.48)
table(Nba.test$Playoffs, NBA.rf.pred.test_classified, dnn=c("Truth","Predicted"))
##      Predicted
## Truth   0   1
##     0  93   9
##     1   9 140

The out-of-sample mean squared error is 0.05468076, which is again lower than the out-of-sample values obtained from the linear and classification tree models.

sum(ifelse(NBA.rf.pred.test_classified != Nba.test$Playoffs,1,0))/dim(Nba.test)[1] # Symmetric misclassification rate
## [1] 0.0717

Out-of-sample ROC and AUC

pred = prediction(NBA.rf.pred.test_classified, Nba.test$Playoffs)
perf = performance(pred, "tpr", "fpr")
plot(perf, colorize=TRUE)

slot(performance(pred, "auc"), "y.values")[[1]]
## [1] 0.926

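The out-of-sample AUC is 0.926. Because the ROC here is computed from the thresholded 0/1 predictions, this value is the average of sensitivity and specificity rather than an accuracy rate. The corresponding accuracy is 1 - 0.0717 = 0.928, or about 93%.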
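The AUC printed above can be reproduced from the out-of-sample confusion matrix alone, since `prediction()` was given the 0/1 classes rather than the continuous forest predictions. A minimal base R sketch using the counts from the table above (the variable names are illustrative):

```r
# Out-of-sample random forest confusion matrix copied from the table above.
tn <- 93; fp <- 9
fn <- 9;  tp <- 140

sensitivity <- tp / (tp + fn)   # true positive rate
specificity <- tn / (tn + fp)   # true negative rate

# With 0/1 predictions the ROC curve has a single interior point, so the
# area under it reduces to the average of sensitivity and specificity.
auc_from_classes <- (sensitivity + specificity) / 2
accuracy <- (tp + tn) / (tp + tn + fp + fn)

round(c(auc = auc_from_classes, accuracy = accuracy), 3)
```

This reproduces the 0.926 above and gives an accuracy of 0.928. Passing the continuous predictions (`NBA.rf.pred.test`) to `prediction()` instead would trace a full ROC curve and would generally give a different AUC.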

Comparison

We compare the four techniques used in our analysis: logistic regression, classification tree, pruned classification tree, and random forest.

[Figure: comparison of the four models]

Conclusion

We observe that the random forest has the lowest misclassification rate, making it the most suitable model for our prediction.

Using these techniques, NBA managers can scout players during the off-season who suit the club's style of play. They can also concentrate on the key areas that need work to gain an advantage over other NBA clubs.