Ge Chen

Mar.19th, 2015

RPI

Final

1.Data

(1).Data Selection

Statistical analysis in sports gained recognition when the Oakland Athletics became a competitive baseball team using a statistics-driven strategy [Reference: Lewis, M. (2003) Moneyball: The Art of Winning an Unfair Game]. In the National Basketball Association (NBA), statistical analysis is now used by almost every team manager, for example, to determine the intrinsic value of a player.

Kobe Bryant is one of the most famous current NBA players. His career point total ranks third among all players in NBA history. However, Kobe is sometimes criticized for bad shot selection or for taking too many field goal attempts, which, according to the critics, do not contribute to the team’s victories. To check whether these criticisms are justified, we fit regression models to historical data.

The dataset was collected from www.basketball-reference.com and contains standardized statistics (assuming an athlete plays 36 minutes per game). We select data from the period 2005 ~ 2007 because:

  1. Kobe was the only superstar on the team; the other players were either rookies or role players.
  2. Kobe had unlimited field goal attempts.
  3. Kobe was at his peak (healthy and highly skilled).
  4. Kobe played in nearly every game of both seasons.

In the simple regression model, we found that the coefficient relating points scored to team victory contribution is positive and significantly different from zero, which means the Los Angeles Lakers (the team Kobe played for) were more likely to win when Kobe scored more points.

However, the simple regression model could not adequately test our hypothesis, because:

  1. R-squared is low.
  2. Points per game cannot clearly show whether Kobe has bad shot selection.

Next, we are going to use multiple regression to continue the research.

(2).Reading in the data

# read the data first; attach() needs an object that already exists
Kobe<-read.csv("~/Desktop/Applied_Regression/Kobe.csv");
attach(Kobe)
head(Kobe,n=14L);
##     G     Date Opp FG FGA FGRatio ORB DRB TRB AST STL BLK TOV PF PTS
## 1   1  11/2/05 DEN 13  28   0.464   0   5   5   4   1   2   6  4  33
## 2   2  11/3/05 PHO 13  26   0.500   2   5   7   5   0   0   3  3  39
## 3   3  11/6/05 DEN 16  31   0.516   3   5   8   5   0   1   4  2  37
## 4   4  11/8/05 ATL 15  26   0.577   1   2   3   5   1   1   1  5  37
## 5   5  11/9/05 MIN 12  26   0.462   1   3   4   4   1   0   3  0  28
## 6   6 11/11/05 PHI  7  27   0.259   3   6   9   7   1   0   3  4  17
## 7   7 11/14/05 MEM  7  18   0.389   0   3   3   2   0   1   4  3  18
## 8   8 11/16/05 NYK 15  36   0.417   3   2   5   3   2   1   0  3  42
## 9   9 11/18/05 LAC 12  35   0.343   1   3   4   5   0   0   2  2  36
## 10 10 11/20/05 CHI 17  34   0.500   1   5   6   3   2   0   3  3  43
## 11 11 11/24/05 SEA 12  26   0.462   1   1   2   5   0   0   2  4  34
## 12 12 11/27/05 NJN 14  36   0.389   0   3   3   3   2   0   5  5  46
## 13 13 11/29/05 SAS  9  33   0.273   1   3   4   0   4   1   3  0  25
## 14 14  12/1/05 UTA 11  31   0.355   2   5   7   3   2   0   2  6  30
##    Plus_Minus EFGRatio ASTRatio STLRatio BLKRatio TOVRatio USGRatio
## 1           6    0.464    0.239    0.011    0.030    0.162    0.380
## 2           4    0.500    0.225    0.000    0.000    0.085    0.363
## 3          24    0.516    0.238    0.000    0.018    0.108    0.368
## 4          12    0.577    0.340    0.013    0.021    0.031    0.401
## 5         -11    0.462    0.262    0.014    0.000    0.095    0.382
## 6          -2    0.259    0.348    0.014    0.000    0.094    0.356
## 7         -27    0.389    0.187    0.000    0.021    0.168    0.322
## 8           6    0.417    0.151    0.025    0.014    0.000    0.414
## 9          -3    0.357    0.280    0.000    0.000    0.047    0.436
## 10         -5    0.515    0.204    0.022    0.000    0.071    0.427
## 11         15    0.538    0.280    0.000    0.000    0.063    0.365
## 12        -10    0.444    0.189    0.021    0.000    0.104    0.406
## 13         -7    0.273    0.000    0.051    0.018    0.076    0.409
## 14          1    0.387    0.172    0.026    0.000    0.054    0.398

(3).Data Summary

G:the game number
Opp:opponent
FG:field goals made
FGA:field goal attempts
FGRatio:field goal percentage
ORB:offensive rebounds
DRB:defensive rebounds
TRB:total rebounds
AST:assists
STL:steals
BLK:blocks
TOV:turnovers
PF:personal fouls
PTS:points scored

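Plus_Minus:Victory contribution (the team’s point differential while Kobe is on the court: positive when the team outscores its opponent, negative when it is outscored)

EFGRatio:Effective field goal ratio (field goal percentage adjusted for the fact that a three-point field goal is worth more than a two-point field goal)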

ASTRatio:An estimate of the percentage of teammate field goals a player assisted while he was on the floor.

STLRatio:An estimate of the percentage of opponent possessions that end with a steal by the player while he was on the floor.

BLKRatio:An estimate of the percentage of opponent two-point field goal attempts blocked by the player while he was on the floor.

TOVRatio:An estimate of turnovers committed per 100 plays.

USGRatio:An estimate of the percentage of team plays used by a player while he was on the floor.
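
As a concrete check of the EFGRatio definition, the sketch below uses the standard effective field goal formula, (FG + 0.5 * 3P) / FGA. The number of made three-pointers is not a column in the data shown above, so the argument `three_made` is a hypothetical input, and the value of 1 used for game 9 is inferred from the reported ratio rather than read from the file.

```r
# Effective field goal ratio: a made three-pointer counts as 1.5 field goals.
# `three_made` is a hypothetical argument; the CSV shown above has no such column.
efg_ratio <- function(fg, three_made, fga) {
  (fg + 0.5 * three_made) / fga
}

# Game 9 (LAC): FG = 12, FGA = 35, reported EFGRatio = 0.357.
# Assuming one made three-pointer reproduces the reported value.
game9 <- efg_ratio(fg = 12, three_made = 1, fga = 35)
print(round(game9, 3))

# Game 1 (DEN): FG = 13, FGA = 28, reported EFGRatio = FGRatio = 0.464,
# which is what the formula gives when no three-pointer is made.
game1 <- efg_ratio(fg = 13, three_made = 0, fga = 28)
print(round(game1, 3))
```

When EFGRatio equals FGRatio for a game, the formula implies that no three-pointers were made in that game.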

(4).Summarizing the data

The dataset contains 157 rows (one per game Kobe played in the two seasons, as the str() output below confirms) and 22 columns, which are described above. In the following analysis, we use USGRatio, ASTRatio and EFGRatio as independent variables and Plus_Minus as the dependent variable. These choices are explained in detail in the Models section.

KobeStatistic<-Kobe[c("USGRatio","ASTRatio","EFGRatio","Plus_Minus")];
attach(KobeStatistic);
## The following objects are masked from Kobe:
## 
##     ASTRatio, EFGRatio, Plus_Minus, USGRatio
summary(KobeStatistic);
##     USGRatio         ASTRatio        EFGRatio        Plus_Minus     
##  Min.   :0.1880   Min.   :0.000   Min.   :0.2350   Min.   :-27.000  
##  1st Qu.:0.3090   1st Qu.:0.174   1st Qu.:0.4110   1st Qu.: -6.000  
##  Median :0.3620   Median :0.234   Median :0.5000   Median :  2.000  
##  Mean   :0.3627   Mean   :0.248   Mean   :0.4961   Mean   :  2.688  
##  3rd Qu.:0.4140   3rd Qu.:0.315   3rd Qu.:0.5800   3rd Qu.: 12.000  
##  Max.   :0.5780   Max.   :0.596   Max.   :0.7800   Max.   : 35.000
str(KobeStatistic)
## 'data.frame':    157 obs. of  4 variables:
##  $ USGRatio  : num  0.38 0.363 0.368 0.401 0.382 0.356 0.322 0.414 0.436 0.427 ...
##  $ ASTRatio  : num  0.239 0.225 0.238 0.34 0.262 0.348 0.187 0.151 0.28 0.204 ...
##  $ EFGRatio  : num  0.464 0.5 0.516 0.577 0.462 0.259 0.389 0.417 0.357 0.515 ...
##  $ Plus_Minus: int  6 4 24 12 -11 -2 -27 6 -3 -5 ...

2.Plot Data

(1).Scatter Plot

1. Effective Field Goals Ratio

# pch=21 is a fillable symbol, so the bg colour takes effect (it is ignored for pch=18)
plot(KobeStatistic$EFGRatio,KobeStatistic$Plus_Minus, pch=21,bg="darkviolet",main="Team Contribution vs Effective Field Goals Ratio", ylab=  "Team Victory Contribution", xlab = "Effective Field Goals Ratio")

2. Assist Ratio

plot(KobeStatistic$ASTRatio,KobeStatistic$Plus_Minus, pch=21,bg="darkviolet",main="Team Contribution vs Assist Ratio", ylab = "Team Victory Contribution", xlab = "Assist Ratio")

3. Usage Ratio

plot(KobeStatistic$USGRatio, KobeStatistic$Plus_Minus, pch=21, bg="darkviolet",main="Team Contribution vs Player Usage Ratio", ylab = "Team Victory Contribution", xlab = "Player usage ratio")

(2). Box plot

1. Effective Field Goals

boxplot(KobeStatistic$EFGRatio,main = "Kobe's Effective Field Goals Ratio",ylab="Percentage")

2. Assist Ratio

boxplot(KobeStatistic$ASTRatio,main = "Kobe's Assist Ratio",ylab="Percentage")

3.Usage Ratio

boxplot(KobeStatistic$USGRatio,main = "Kobe's usage ratio",ylab="Percentage")

(3). Histogram plot

1. Effective Field Goals

hist(KobeStatistic$EFGRatio,10, main = "Histogram of Kobe's Effective Field Goals Ratio")

2. Assist Ratio

hist(KobeStatistic$ASTRatio,10,main = "Histogram of Kobe's Assist Ratio")

3. Usage Ratio

hist(KobeStatistic$USGRatio,10,main = "Histogram of Kobe's Usage Ratio")

The box plots give an interesting picture of the role Kobe plays on the floor. His average effective field goal ratio is around 50%, which is not bad. But Kobe is clearly not a point guard, since his average assist ratio is only about 25%. It does not make sense to label a player with “bad shot selection” when he has a 50% effective field goal ratio while using more than one third of the team’s plays per game.

The histograms of the three independent variables are approximately bell-shaped, suggesting that Kobe’s performance over the 2005~2007 seasons was stable.
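
The bell-shape reading of a histogram can be complemented with a formal check. The sketch below runs a Shapiro-Wilk test and, in an interactive session, draws a normal Q-Q plot. It uses simulated numbers as a stand-in, since the CSV is not available here; the mean and spread chosen for the simulation are assumptions, not estimates from the Kobe data.

```r
# Simulated stand-in for one of the ratio columns (NOT the real Kobe data).
set.seed(1)
x <- rnorm(157, mean = 0.5, sd = 0.12)

# Shapiro-Wilk test: the null hypothesis is that x comes from a normal distribution.
# A small p-value is evidence against normality.
sw <- shapiro.test(x)
print(sw$p.value)

# A normal Q-Q plot is a visual companion to the histogram.
if (interactive()) {
  qqnorm(x)
  qqline(x)
}
```

With the real data, `x` would be replaced by, for example, `KobeStatistic$EFGRatio`.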

3.Models

(1) Description of Independent Variable and Dependent Variable

The research will use the following as independent variables:

  1. Kobe’s effective field goals ratio (EFGRatio)
  2. Kobe’s assist ratio per game(ASTRatio)
  3. Kobe’s usage ratio per game(USGRatio)

First, I use the effective field goal ratio (EFGRatio) instead of points (PTS). A high score does not imply high efficiency. A player may take many field goal attempts to reach a high score while shooting a low percentage, which does not help the team win.

Second, USGRatio tells how many plays are used by the player. On the floor, a player may act as an unstoppable scorer who keeps scoring. In contrast, if the player cannot score and also does not share the ball with his teammates, the team will hardly win. Last, ASTRatio tells whether Kobe helps his teammates make field goals when he is on the floor.

The Dependent Variable will be:

Kobe’s contribution to team victories(Plus_Minus)

(2) Hypothesis

In the research, we are going to check:

  1. whether Kobe’s effective field goal ratio has an effect on his contribution to team victory (alternative $H_1: \beta_1 \neq 0$), or no relationship exists between the two variables (null $H_0: \beta_1 = 0$), where $\beta_1$ is the coefficient of EFGRatio;
  2. whether Kobe’s assist ratio has an effect on his contribution to team victory (alternative $H_1: \beta_2 \neq 0$), or whether Kobe assists or not, his contribution to team victory is not affected (null $H_0: \beta_2 = 0$), where $\beta_2$ is the coefficient of ASTRatio;
  3. whether Kobe’s usage of plays per game has an effect on his contribution to team victory (alternative $H_1: \beta_3 \neq 0$), or even though Kobe has a high usage of plays per game, his contribution to team victory is not affected (null $H_0: \beta_3 = 0$), where $\beta_3$ is the coefficient of USGRatio.
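
The three hypotheses refer to the coefficients of one linear model. Written out, with $\beta_1$, $\beta_2$, $\beta_3$ standing for the coefficients of EFGRatio, ASTRatio and USGRatio (b1, b2, b3 in the text), and $\varepsilon_i$ an error term that is assumed here to have mean zero, the model and the test applied to each coefficient are:

```latex
\begin{align*}
\text{Plus\_Minus}_i &= \beta_0 + \beta_1\,\text{EFGRatio}_i + \beta_2\,\text{ASTRatio}_i
                        + \beta_3\,\text{USGRatio}_i + \varepsilon_i \\
H_0 &: \beta_j = 0 \quad \text{vs.} \quad H_1 : \beta_j \neq 0, \qquad j = 1, 2, 3 \\
t_j &= \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}
\end{align*}
```

Each $t_j$ is compared with a t distribution whose degrees of freedom equal the number of observations minus the number of estimated coefficients.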

(3) Multiple Linear Regression model

1. Entry-Wise

We enter all three variables at the same time. The model summary shows that the coefficients of EFGRatio and USGRatio are significantly different from zero at the 5% significance level. The coefficient of ASTRatio is not significant even at the 10% level, although its p-value (0.101) is very close to 0.1. We next build the model hierarchically, as follows.

modelentry <- lm(KobeStatistic$Plus_Minus~KobeStatistic$USGRatio+KobeStatistic$EFGRatio+KobeStatistic$ASTRatio)
summary(modelentry)
## 
## Call:
## lm(formula = KobeStatistic$Plus_Minus ~ KobeStatistic$USGRatio + 
##     KobeStatistic$EFGRatio + KobeStatistic$ASTRatio)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.0651  -8.4690   0.9699   8.5487  26.4409 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -28.284      6.690  -4.228 4.05e-05 ***
## KobeStatistic$USGRatio   30.123     13.505   2.230   0.0272 *  
## KobeStatistic$EFGRatio   33.658      7.939   4.240 3.86e-05 ***
## KobeStatistic$ASTRatio   13.504      8.185   1.650   0.1010    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.58 on 153 degrees of freedom
## Multiple R-squared:  0.1482, Adjusted R-squared:  0.1315 
## F-statistic: 8.872 on 3 and 153 DF,  p-value: 1.856e-05
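
The t values and p-values in the coefficient table can be reproduced by hand from the estimates and standard errors. The sketch below copies the rounded numbers from the summary above, so it should agree with the table only up to rounding.

```r
# Estimates and standard errors copied from the summary(modelentry) table above.
est <- c(USGRatio = 30.123, EFGRatio = 33.658, ASTRatio = 13.504)
se  <- c(USGRatio = 13.505, EFGRatio = 7.939,  ASTRatio = 8.185)
df_resid <- 153  # residual degrees of freedom reported by summary()

# t value = estimate / standard error
t_val <- est / se

# Two-sided p-value from the t distribution with df_resid degrees of freedom
p_val <- 2 * pt(-abs(t_val), df = df_resid)

print(round(t_val, 3))
print(signif(p_val, 3))
```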

2.Hierarchical

Correlation
cor(KobeStatistic)
##              USGRatio    ASTRatio   EFGRatio Plus_Minus
## USGRatio    1.0000000 -0.22593456 0.07204520  0.1657483
## ASTRatio   -0.2259346  1.00000000 0.04595279  0.1024914
## EFGRatio    0.0720452  0.04595279 1.00000000  0.3359980
## Plus_Minus  0.1657483  0.10249136 0.33599800  1.0000000

According to the correlation table, EFGRatio has the highest correlation with Plus_Minus, USGRatio the second highest, and ASTRatio the lowest. Therefore, we add the variables to the model in the order: EFGRatio->EFGRatio+USGRatio->EFGRatio+USGRatio+ASTRatio.
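
The entry order can also be derived programmatically. A minimal sketch, using the correlations with Plus_Minus copied from the table above rather than recomputed from the data:

```r
# Correlations with Plus_Minus, copied from the last column of cor(KobeStatistic).
r <- c(USGRatio = 0.1657483, ASTRatio = 0.1024914, EFGRatio = 0.3359980)

# Rank predictors by absolute correlation with the response, strongest first.
entry_order <- names(sort(abs(r), decreasing = TRUE))
print(entry_order)
```

With the data frame in memory, the vector `r` could be taken from `cor(KobeStatistic)[, "Plus_Minus"]` after dropping the Plus_Minus entry itself.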

I single variable model
modelsingle <- lm(KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio)
summary(modelsingle);
## 
## Call:
## lm(formula = KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -26.4729  -7.8250   0.6695   8.6449  27.0155 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -14.966      4.084  -3.665 0.000339 ***
## KobeStatistic$EFGRatio   35.583      8.012   4.441 1.69e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.74 on 155 degrees of freedom
## Multiple R-squared:  0.1129, Adjusted R-squared:  0.1072 
## F-statistic: 19.73 on 1 and 155 DF,  p-value: 1.691e-05
II Two variables model
modeldouble <- lm(KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio+KobeStatistic$USGRatio)
summary(modeldouble);
## 
## Call:
## lm(formula = KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio + 
##     KobeStatistic$USGRatio)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -24.975  -7.894   1.192   8.341  27.854 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -23.493      6.060  -3.876 0.000157 ***
## KobeStatistic$EFGRatio   34.497      7.967   4.330 2.67e-05 ***
## KobeStatistic$USGRatio   24.996     13.216   1.891 0.060455 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.64 on 154 degrees of freedom
## Multiple R-squared:  0.133,  Adjusted R-squared:  0.1218 
## F-statistic: 11.82 on 2 and 154 DF,  p-value: 1.683e-05
III Three variables model
modeltriple <- lm(KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio+KobeStatistic$ASTRatio+KobeStatistic$USGRatio)
summary(modeltriple)
## 
## Call:
## lm(formula = KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio + 
##     KobeStatistic$ASTRatio + KobeStatistic$USGRatio)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.0651  -8.4690   0.9699   8.5487  26.4409 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -28.284      6.690  -4.228 4.05e-05 ***
## KobeStatistic$EFGRatio   33.658      7.939   4.240 3.86e-05 ***
## KobeStatistic$ASTRatio   13.504      8.185   1.650   0.1010    
## KobeStatistic$USGRatio   30.123     13.505   2.230   0.0272 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.58 on 153 degrees of freedom
## Multiple R-squared:  0.1482, Adjusted R-squared:  0.1315 
## F-statistic: 8.872 on 3 and 153 DF,  p-value: 1.856e-05
ANOVA Table
anova(modelsingle,modeldouble,modeltriple)
## Analysis of Variance Table
## 
## Model 1: KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio
## Model 2: KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio + KobeStatistic$USGRatio
## Model 3: KobeStatistic$Plus_Minus ~ KobeStatistic$EFGRatio + KobeStatistic$ASTRatio + 
##     KobeStatistic$USGRatio
##   Res.Df   RSS Df Sum of Sq      F  Pr(>F)  
## 1    155 21352                              
## 2    154 20868  1    484.73 3.6172 0.05906 .
## 3    153 20503  1    364.73 2.7217 0.10104  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the two-variable regression, the USGRatio coefficient is not significant at the 5% level (p = 0.060). After adding the third variable, the USGRatio coefficient becomes significant at the 5% level (p = 0.027). Hence, using USGRatio and ASTRatio together appears to help explain the response, even though the ANOVA F-test for adding ASTRatio on its own is not significant at the 10% level (p = 0.101).
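
The F statistics in the ANOVA table can be recomputed from its sums of squares. When several nested models are passed to anova(), each added sum of squares is divided by the residual mean square of the largest model. The sketch below uses the rounded values printed above, so small rounding differences are expected.

```r
# Values copied from the ANOVA table above.
rss_full <- 20503   # residual sum of squares of the three-variable model
df_full  <- 153     # its residual degrees of freedom
ss_added <- c(USGRatio = 484.73, ASTRatio = 364.73)  # sum of squares gained per step

# Residual mean square of the largest model is the common denominator.
ms_resid <- rss_full / df_full

# Each step adds one variable, so the numerator has 1 degree of freedom.
f_stat <- ss_added / ms_resid
p_val  <- pf(f_stat, df1 = 1, df2 = df_full, lower.tail = FALSE)

print(round(f_stat, 4))
print(round(p_val, 5))
```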

3.Sequential

Stepwise selection leads to the same result: it recommends keeping all three variables in the model, since dropping any one of them raises the AIC.

step.lm<- lm(Plus_Minus ~ EFGRatio+USGRatio+ASTRatio)
step(step.lm,direction='both')
## Start:  AIC=772.92
## Plus_Minus ~ EFGRatio + USGRatio + ASTRatio
## 
##            Df Sum of Sq   RSS    AIC
## <none>                  20503 772.92
## - ASTRatio  1    364.73 20868 773.68
## - USGRatio  1    666.67 21170 775.94
## - EFGRatio  1   2408.83 22912 788.36
## 
## Call:
## lm(formula = Plus_Minus ~ EFGRatio + USGRatio + ASTRatio)
## 
## Coefficients:
## (Intercept)     EFGRatio     USGRatio     ASTRatio  
##      -28.28        33.66        30.12        13.50
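
The AIC column printed by step() can be checked by hand. For a linear model, step() uses n * log(RSS / n) + 2 * k, where k is the number of estimated coefficients including the intercept. The sketch below plugs in the rounded RSS values from the table above; `aic_step` is a helper name introduced only for this illustration.

```r
n <- 157  # number of games in KobeStatistic

# AIC as used by step() for lm fits (up to an additive constant).
# `aic_step` is a hypothetical helper, not a function from any package.
aic_step <- function(rss, n_coef) {
  n * log(rss / n) + 2 * n_coef
}

aic_full   <- aic_step(20503, 4)  # intercept + three predictors
aic_no_ast <- aic_step(20868, 3)  # ASTRatio dropped
aic_no_usg <- aic_step(21170, 3)  # USGRatio dropped
aic_no_efg <- aic_step(22912, 3)  # EFGRatio dropped

print(round(c(full = aic_full, no_AST = aic_no_ast,
              no_USG = aic_no_usg, no_EFG = aic_no_efg), 2))
```

The full model has the smallest value, which is why step() stops without removing a variable.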

4.Model Plot

(1).Scattergram of Final Model

plot(KobeStatistic, pch=21, cex=1, bg='ivory4', main="Team Contribution Vs. EFGRatio, ASTRatio and USGRatio")

(2).Scatter Plot with Regression Line and 95% confidence limits of coefficients beta0 and beta1

1. Team Contribution Vs Effective Field Goals Ratio

plot(EFGRatio,Plus_Minus, pch=19,main="Team Contribution vs Effective Field Goals Ratio",ylab=  "Team Victory Contribution", xlab = "Effective Field Goals Ratio")
EFGRatio.lm<-lm(Plus_Minus~EFGRatio)
abline(EFGRatio.lm$coef, lwd=2)
confiEFGRatio = confint(EFGRatio.lm,level=0.95)
abline(confiEFGRatio[,1],lty=2,col='red')
abline(confiEFGRatio[,2],lty=2,col='red')

2. Team Contribution Vs Assist Ratio

plot(ASTRatio,Plus_Minus, pch=19,main="Team Contribution vs Assist Ratio",ylab=  "Team Victory Contribution", xlab = "Assist Ratio")
ASTRatio.lm<-lm(Plus_Minus~ASTRatio)
abline(ASTRatio.lm$coef, lwd=2)
confiASTRatio = confint(ASTRatio.lm,level=0.95)
abline(confiASTRatio[,1],lty=2,col='red')
abline(confiASTRatio[,2],lty=2,col='red')

3. Team Contribution Vs Usage Ratio

plot(USGRatio,Plus_Minus, pch=19,main="Team Contribution vs Player Usage Ratio",ylab=  "Team Victory Contribution", xlab = "Player Usage Ratio")
USGRatio.lm<-lm(Plus_Minus~USGRatio)
abline(USGRatio.lm$coef, lwd=2)
confiUSGRatio = confint(USGRatio.lm,level=0.95)
abline(confiUSGRatio[,1],lty=2,col='red')
abline(confiUSGRatio[,2],lty=2,col='red')
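
The dashed lines above are drawn from the lower and upper confidence limits of the intercept and slope. A different, commonly used display is a pointwise confidence band for the fitted mean, obtained from predict(). The sketch below illustrates that alternative on simulated data; the coefficients and noise level used to simulate are assumptions, not values from the Kobe data.

```r
# Simulated stand-in data (NOT the real Kobe data).
set.seed(2)
x <- runif(157, min = 0.2, max = 0.8)
y <- -15 + 35 * x + rnorm(157, sd = 12)

fit <- lm(y ~ x)

# Evaluate the fitted line and its 95% confidence band on a grid of x values.
grid <- data.frame(x = seq(min(x), max(x), length.out = 50))
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
print(head(band, 3))  # columns: fit, lwr, upr

# Draw the scatter plot, the fitted line and the band when run interactively.
if (interactive()) {
  plot(x, y, pch = 19)
  matlines(grid$x, band, lty = c(1, 2, 2), col = c("black", "red", "red"), lwd = 2)
}
```

The band is narrowest near the mean of x and widens toward the ends of the range.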

(3).3D plot

Team Contribution Vs USGRatio and EFGRatio (because the coefficients of USGRatio and EFGRatio are the ones significantly different from zero at the 10% level)

library(scatterplot3d)
model3D <- scatterplot3d(KobeStatistic$USGRatio, KobeStatistic$EFGRatio, KobeStatistic$Plus_Minus, pch=21, main = "Team Contributions Vs. USGRatio and EFGRatio", xlab = "USGRatio", ylab = "EFGRatio", zlab = "Team Contribution", axis=TRUE)
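# plane3d() reads the lm coefficients by position (intercept, x, y),
# so the predictors are listed in the same order as the plot axes: USGRatio, then EFGRatio
modelplane<- lm(Plus_Minus~USGRatio+EFGRatio, data=KobeStatistic)
model3D$plane3d(modelplane,col = 'red')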

(4). Residual Histogram

The residual variance does not look constant across the fitted values: the spread appears larger for fitted values roughly between -5 and 5. In the histogram, the residual distribution looks approximately Gaussian. Because the model includes an intercept, the residuals average to zero by construction; the bulk of the histogram nevertheless sits slightly above zero (the median residual is 0.97), which goes together with a slight left skew.

layout(matrix(c(2,1), 2, 1, byrow = TRUE),widths=c(3,1), heights=c(3,3))
model.res <- resid(modeltriple)
plot(fitted(modeltriple), model.res, pch=21, cex=1, bg='blue',main="Plot of Fitted Values vs. Residuals ", xlab = "Fitted Values of Model", ylab = "Residuals")
abline(0,0,lwd=2,col="red")
hist(model.res, main="Model Residual Histogram",xlab = "Residuals")
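
One property is worth keeping in mind when reading a residual histogram: for a least-squares fit that includes an intercept, the residuals average to zero by construction, while the median can still differ from zero when the distribution is skewed. The sketch below shows this on simulated data; the simulated coefficients are assumptions chosen only for illustration.

```r
# Simulated stand-in data (NOT the real Kobe data).
set.seed(3)
d <- data.frame(x1 = runif(157), x2 = runif(157))
d$y <- 2 + 30 * d$x1 + 10 * d$x2 + rnorm(157, sd = 11)

fit <- lm(y ~ x1 + x2, data = d)
res <- resid(fit)

# The mean is zero up to floating-point error; the median need not be.
print(mean(res))
print(median(res))
```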

5.Interpretation

After adding two more variables to the model, the R-squared reaches 14.8%, compared with 11.3% for the single-variable EFGRatio model shown above. Every one-percentage-point increase in Kobe’s effective field goal ratio is associated with an increase of about 0.34 in team victory contribution. Also, high play usage by Kobe does not have a negative effect. On the contrary, when Kobe is on the floor and acts as the finisher of a play, the team tends to win the game.
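
The size of the EFGRatio effect follows directly from the fitted coefficients. The sketch below uses the rounded estimates of the three-variable model and the variable means from the data summary. It also checks a general property of least squares: the prediction at the predictor means equals the mean of the response.

```r
# Coefficients of the three-variable model, copied from the summary output.
b <- c(intercept = -28.284, EFGRatio = 33.658, ASTRatio = 13.504, USGRatio = 30.123)

# EFGRatio is on a 0-1 scale, so one percentage point is a change of 0.01.
effect_1pt <- unname(b["EFGRatio"] * 0.01)
print(effect_1pt)

# Means of the predictors, copied from summary(KobeStatistic).
means <- c(EFGRatio = 0.4961, ASTRatio = 0.248, USGRatio = 0.3627)

# Predicted Plus_Minus at the predictor means; should be close to the
# reported mean of Plus_Minus (2.688), up to rounding of the inputs.
pred_at_means <- unname(b["intercept"] + sum(b[names(means)] * means))
print(round(pred_at_means, 3))
```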

Suppose Kobe did have the problem the critics describe, bad shot selection. Then the coefficient of USGRatio should be negative (every play used by Kobe would hurt the team’s chance of winning). Also, the coefficient of ASTRatio would be significantly different from zero even under a stricter test (Kobe choosing to assist rather than shoot the ball himself would help the team more).

All in all, the criticism of Kobe’s shot selection is groundless.

finalmodel<- lm(Plus_Minus~KobeStatistic$EFGRatio+KobeStatistic$ASTRatio+KobeStatistic$USGRatio)
summary(finalmodel)
## 
## Call:
## lm(formula = Plus_Minus ~ KobeStatistic$EFGRatio + KobeStatistic$ASTRatio + 
##     KobeStatistic$USGRatio)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.0651  -8.4690   0.9699   8.5487  26.4409 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             -28.284      6.690  -4.228 4.05e-05 ***
## KobeStatistic$EFGRatio   33.658      7.939   4.240 3.86e-05 ***
## KobeStatistic$ASTRatio   13.504      8.185   1.650   0.1010    
## KobeStatistic$USGRatio   30.123     13.505   2.230   0.0272 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.58 on 153 degrees of freedom
## Multiple R-squared:  0.1482, Adjusted R-squared:  0.1315 
## F-statistic: 8.872 on 3 and 153 DF,  p-value: 1.856e-05