2018 NBA SRS Anova Test

by Carlos Jones

For this analysis, my goal is to understand whether there is a statistically significant difference in the mean SRS between four groups of NBA teams. First, let’s discuss what SRS is in an NBA context. SRS, short for Simple Rating System, is a rating introduced by Sports Reference and used on Basketball Reference, Pro Football Reference, and its other sites. It combines Average Point Differential and Strength of Schedule. For instance, the 2006-07 Spurs won games by an average of 8.43 points per game and played a schedule with opponents that were 0.08 points worse than average, giving them an SRS of 8.35. This means they were 8.35 points better than an average team. An average team would have an SRS of 0.0.
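As a quick check of the Spurs arithmetic, the SRS quoted above is the average point differential plus the strength of schedule. The snippet below only adds the two published numbers. The variable names are my own labels, and it does not show how Sports Reference derives either component from the league schedule.

```r
# Worked example using the 2006-07 Spurs figures quoted above.
avg_point_diff <- 8.43         # average margin of victory, points per game
strength_of_schedule <- -0.08  # opponents were 0.08 points worse than average

# SRS = average point differential + strength of schedule
srs <- avg_point_diff + strength_of_schedule
print(srs)
```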

Let’s do a quick reminder of how the Anova is defined. Analysis of Variance (ANOVA) is a statistical technique commonly used to study differences between the means of two or more groups. An ANOVA in R tests whether the group means are all equal; a small p-value is evidence that at least one mean differs. The method is an extension of the t-test and is used when the factor variable has more than two groups.
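To illustrate the link to the t-test, here is a minimal sketch on simulated data (the `toy` data frame and its two groups are made up for illustration). With exactly two groups and equal variances assumed, the ANOVA F statistic equals the square of the t statistic, and both tests return the same p-value.

```r
set.seed(1)

# Simulated data: two groups of 10 observations each (illustrative only).
toy <- data.frame(
  value = c(rnorm(10, mean = 0), rnorm(10, mean = 1)),
  group = factor(rep(c("A", "B"), each = 10))
)

# Pooled-variance t-test and one-way ANOVA on the same data.
t_res <- t.test(value ~ group, data = toy, var.equal = TRUE)
f_res <- summary(aov(value ~ group, data = toy))[[1]]

t_squared <- unname(t_res$statistic)^2
f_value   <- f_res[["F value"]][1]

print(c(t_squared = t_squared, f_value = f_value))
print(c(t_test_p = t_res$p.value, anova_p = f_res[["Pr(>F)"]][1]))
```

With three or more groups there is no single t statistic to compare against, which is where the ANOVA takes over.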

I have broken this analysis down into the following 4 Groups:

  • Eastern Conference Playoff Teams = East_P
  • Eastern Conference Non-Playoff Teams = East_NP
  • Western Conference Playoff Teams = West_P
  • Western Conference Non-Playoff Teams = West_NP

Let’s begin our analysis to see if there is a statistically significant difference in the SRS Mean between the 4 Groups by defining our Hypotheses.

  • Null Hypothesis: All of the group Means are equal. No difference
  • Alternative Hypothesis: At least 1 Group’s Mean is different
nbaSRS <- read.csv("~/R Projects/nbaSRS/nbaSRS.csv")

Before we move further in our analysis, let’s take a quick look at our data to look for any missing rows, incorrect columns, duplicate rows, etc.

str(nbaSRS)
## 'data.frame':    31 obs. of  8 variables:
##  $ Eastern.Conference: Factor w/ 31 levels "Atlanta Hawks ",..: 17 28 23 2 12 3 22 9 4 16 ...
##  $ W                 : Factor w/ 21 levels "17","19","22",..: 20 19 15 13 12 11 11 10 9 9 ...
##  $ L                 : Factor w/ 21 levels "22","24","25",..: 1 2 6 8 9 10 10 11 12 12 ...
##  $ W.L.              : Factor w/ 21 levels "0.207","0.232",..: 20 19 15 13 12 11 11 10 9 9 ...
##  $ GB                : Factor w/ 20 levels "—","11","12",..: 1 6 19 2 3 4 4 5 8 8 ...
##  $ PS.G              : Factor w/ 29 levels "103.5","104.5",..: 28 21 25 15 9 14 7 6 11 5 ...
##  $ PA.G              : Factor w/ 30 levels "104.7","105.9",..: 11 9 19 8 1 18 5 7 17 2 ...
##  $ SRS               : Factor w/ 31 levels "-0.4","-0.45",..: 30 28 20 23 21 1 17 3 8 2 ...
head(nbaSRS,31)
##         Eastern.Conference  W  L  W.L. GB  PS.G  PA.G   SRS
## 1          Milwaukee Bucks 60 22 0.732  — 118.1 109.3  8.04
## 2          Toronto Raptors 58 24 0.707  2 114.4 108.4  5.49
## 3       Philadelphia 76ers 51 31 0.622  9 115.2 112.5  2.25
## 4           Boston Celtics 49 33 0.598 11 112.4   108   3.9
## 5           Indiana Pacers 48 34 0.585 12   108 104.7  2.76
## 6            Brooklyn Nets 42 40 0.512 18 112.2 112.3  -0.4
## 7            Orlando Magic 42 40 0.512 18 107.3 106.6  0.28
## 8          Detroit Pistons 41 41   0.5 19   107 107.3 -0.56
## 9        Charlotte Hornets 39 43 0.476 21 110.7 111.8 -1.32
## 10             Miami Heat  39 43 0.476 21 105.7 105.9 -0.45
## 11    Washington Wizards   32 50  0.39 28   114 116.9  -3.3
## 12          Atlanta Hawks  29 53 0.354 31 113.3 119.4 -6.06
## 13          Chicago Bulls  22 60 0.268 38 104.9 113.4 -8.32
## 14   Cleveland Cavaliers   19 63 0.232 41 104.5 114.1 -9.39
## 15        New York Knicks  17 65 0.207 43 104.6 113.8 -8.93
## 16      Western Conference  W  L  W/L% GB  PS/G  PA/G   SRS
## 17   Golden State Warriors 57 25 0.695  — 117.7 111.2  6.42
## 18          Denver Nuggets 54 28 0.659  3 110.7 106.7  4.19
## 19  Portland Trail Blazers 53 29 0.646  4 114.7 110.5  4.43
## 20         Houston Rockets 53 29 0.646  4 113.9 109.1  4.96
## 21               Utah Jazz 50 32  0.61  7 111.7 106.5  5.28
## 22   Oklahoma City Thunder 49 33 0.598  8 114.5 111.1  3.56
## 23       San Antonio Spurs 48 34 0.585  9 111.7   110   1.8
## 24    Los Angeles Clippers 48 34 0.585  9 115.1 114.3  1.09
## 25       Sacramento Kings  39 43 0.476 18 114.2 115.3 -0.81
## 26     Los Angeles Lakers  37 45 0.451 20 111.8 113.5 -1.33
## 27 Minnesota Timberwolves  36 46 0.439 21 112.5   114 -1.02
## 28      Memphis Grizzlies  33 49 0.402 24 103.5 106.1 -2.08
## 29   New Orleans Pelicans  33 49 0.402 24 115.4 116.8  -1.1
## 30       Dallas Mavericks  33 49 0.402 24 108.9 110.1 -0.87
## 31           Phoenix Suns  19 63 0.232 38 107.5 116.8 -8.61
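The checks mentioned above (missing values, duplicate rows, stray header rows) can also be done programmatically instead of by eye. The following is a minimal sketch on a small made-up data frame; `toy` and its contents are hypothetical and only mimic the shape of the problem.

```r
# Hypothetical miniature of the table: one embedded header row, one duplicate row.
toy <- data.frame(
  Team = c("Team A", "Team B", "Western Conference", "Team C", "Team C"),
  SRS  = c("1.5", "-0.4", "SRS", "2.0", "2.0"),
  stringsAsFactors = FALSE
)

# Count of missing values in each column.
missing_per_column <- colSums(is.na(toy))

# Row numbers that exactly repeat an earlier row.
duplicate_rows <- which(duplicated(toy))

# Rows whose SRS entry cannot be parsed as a number (e.g. a repeated header).
non_numeric_rows <- which(is.na(suppressWarnings(as.numeric(toy$SRS))))

print(missing_per_column)
print(duplicate_rows)
print(non_numeric_rows)
```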

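I noticed that the SRS column was read in as a “Factor”, but it needs to be “Numeric” in order to run my Anova Test. Let’s convert it now. Keep in mind that as.numeric() applied directly to a factor returns the integer level codes, not the printed values, so the SRS column in the output below shows codes such as 30, 28, 20 rather than 8.04, 5.49, 2.25.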

nbaSRS$SRS<-as.numeric(nbaSRS$SRS)
str(nbaSRS)
## 'data.frame':    31 obs. of  8 variables:
##  $ Eastern.Conference: Factor w/ 31 levels "Atlanta Hawks ",..: 17 28 23 2 12 3 22 9 4 16 ...
##  $ W                 : Factor w/ 21 levels "17","19","22",..: 20 19 15 13 12 11 11 10 9 9 ...
##  $ L                 : Factor w/ 21 levels "22","24","25",..: 1 2 6 8 9 10 10 11 12 12 ...
##  $ W.L.              : Factor w/ 21 levels "0.207","0.232",..: 20 19 15 13 12 11 11 10 9 9 ...
##  $ GB                : Factor w/ 20 levels "—","11","12",..: 1 6 19 2 3 4 4 5 8 8 ...
##  $ PS.G              : Factor w/ 29 levels "103.5","104.5",..: 28 21 25 15 9 14 7 6 11 5 ...
##  $ PA.G              : Factor w/ 30 levels "104.7","105.9",..: 11 9 19 8 1 18 5 7 17 2 ...
##  $ SRS               : num  30 28 20 23 21 1 17 3 8 2 ...
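The `num  30 28 20 ...` values in the output above are factor level codes, which is what `as.numeric()` returns for a factor. A common way to recover the printed numbers is to convert to character first. Here is a minimal sketch on a made-up factor (`srs_factor` is illustrative, not the real column); the non-numeric "SRS" entry stands in for the embedded header row and becomes `NA`.

```r
# Illustrative factor holding numbers as text, plus one stray header value.
srs_factor <- factor(c("8.04", "5.49", "-0.4", "SRS"))

# Direct conversion returns the internal level codes, not the values.
codes <- as.numeric(srs_factor)

# Converting through character recovers the values; "SRS" becomes NA.
values <- suppressWarnings(as.numeric(as.character(srs_factor)))

print(codes)
print(values)
```

Applied to the real data, the equivalent call would be `as.numeric(as.character(nbaSRS$SRS))`, after which the downstream outputs would need to be regenerated.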

Since my goal for the Anova Test is to look for significant differences in the means between four groups, I need to create a column for the groups. My groups will be the following:

  • Eastern Conference Playoff Teams = East_P
  • Eastern Conference Non-Playoff Teams = East_NP
  • Western Conference Playoff Teams = West_P
  • Western Conference Non-Playoff Teams = West_NP
nbaSRS$Groups<-c("East_P","East_P","East_P","East_P","East_P","East_P","East_P","East_P","East_NP","East_NP","East_NP","East_NP","East_NP","East_NP","East_NP","West_P","West_P","West_P","West_P","West_P","West_P","West_P","West_P","West_NP","West_NP","West_NP","West_NP","West_NP","West_NP","West_NP","West_NP")
View(nbaSRS)

One last data cleanup before I perform the test: the Western Conference header row sits on the 16th row of the dataset. I do not need it for my analysis, so I will remove it completely.

WestHeader<-16
nbaSRS<-nbaSRS[-WestHeader,]
head(nbaSRS,16)
##       Eastern.Conference  W  L  W.L. GB  PS.G  PA.G SRS  Groups
## 1        Milwaukee Bucks 60 22 0.732  — 118.1 109.3  30  East_P
## 2        Toronto Raptors 58 24 0.707  2 114.4 108.4  28  East_P
## 3     Philadelphia 76ers 51 31 0.622  9 115.2 112.5  20  East_P
## 4         Boston Celtics 49 33 0.598 11 112.4   108  23  East_P
## 5         Indiana Pacers 48 34 0.585 12   108 104.7  21  East_P
## 6          Brooklyn Nets 42 40 0.512 18 112.2 112.3   1  East_P
## 7          Orlando Magic 42 40 0.512 18 107.3 106.6  17  East_P
## 8        Detroit Pistons 41 41   0.5 19   107 107.3   3  East_P
## 9      Charlotte Hornets 39 43 0.476 21 110.7 111.8   8 East_NP
## 10           Miami Heat  39 43 0.476 21 105.7 105.9   2 East_NP
## 11  Washington Wizards   32 50  0.39 28   114 116.9  11 East_NP
## 12        Atlanta Hawks  29 53 0.354 31 113.3 119.4  12 East_NP
## 13        Chicago Bulls  22 60 0.268 38 104.9 113.4  13 East_NP
## 14 Cleveland Cavaliers   19 63 0.232 41 104.5 114.1  16 East_NP
## 15      New York Knicks  17 65 0.207 43 104.6 113.8  15 East_NP
## 17 Golden State Warriors 57 25 0.695  — 117.7 111.2  29  West_P
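One thing worth double-checking in the steps above: the 31 group labels were assigned while the Western Conference header was still row 16, so the sixteenth label ("West_P") falls on the header row and, as far as I can tell from the listing, the Los Angeles Clippers (row 24) receive "West_NP". A safer order is to drop the header row first and then assign one label per remaining team. Below is a minimal sketch on a made-up table; `toy` and its team names are hypothetical.

```r
# Hypothetical miniature of the table with an embedded header row.
toy <- data.frame(
  Team = c("East 1", "East 2", "Western Conference", "West 1", "West 2"),
  stringsAsFactors = FALSE
)

# Drop the header row first, matching on its text rather than its position.
toy <- toy[trimws(toy$Team) != "Western Conference", , drop = FALSE]

# Now the label vector has exactly one entry per remaining team.
toy$Groups <- c("East_P", "East_NP", "West_P", "West_NP")

print(toy)
```

For the full 30-team table, the same order of operations would call for 8 East_P, 7 East_NP, 8 West_P, and 7 West_NP labels.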

I can now perform the Anova test to see if there is a statistically significant difference in the SRS Means between the four groups.

anova<-aov(SRS~ Groups, data=nbaSRS)
summary(anova)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Groups       3   1088   362.7   8.133 0.000553 ***
## Residuals   26   1160    44.6                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

After performing the Anova test, I can see that there is a statistically significant difference in the SRS Means between the groups, based on the p-value that was returned. Since the p-value (0.000553) is below 0.05, we have enough evidence to reject the Null Hypothesis.
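To see where the numbers in the summary table come from, the F value is the ratio of the two Mean Sq entries, and the p-value is the upper tail of the F distribution with 3 and 26 degrees of freedom. The sketch below just replays that arithmetic using the rounded figures from the table, so the ratio matches the reported 8.133 only approximately.

```r
# Rounded values copied from the summary(anova) table above.
ms_groups <- 362.7   # Mean Sq for Groups
ms_resid  <- 44.6    # Mean Sq for Residuals

# F statistic = between-group mean square / within-group mean square.
f_from_table <- ms_groups / ms_resid

# p-value = P(F >= 8.133) for an F distribution with df1 = 3 and df2 = 26.
p_value <- pf(8.133, df1 = 3, df2 = 26, lower.tail = FALSE)

print(f_from_table)
print(p_value)
```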

I am not completely finished with the analysis at this point. The Anova test tells me that there is a statistically significant difference in the SRS Means between the groups; however, it doesn’t tell me which groups differ from each other. To find out, I will run a Post Hoc test called Tukey’s Honest Significant Difference (TukeyHSD), which compares every pair of groups.

TukeyHSD(anova)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = SRS ~ Groups, data = nbaSRS)
## 
## $Groups
##                      diff        lwr        upr     p adj
## East_P-East_NP   6.875000  -2.606348 16.3563478 0.2178261
## West_NP-East_NP -1.875000 -11.356348  7.6063478 0.9477304
## West_P-East_NP  13.571429   3.779135 23.3637225 0.0040891
## West_NP-East_P  -8.750000 -17.909852  0.4098522 0.0650040
## West_P-East_P    6.696429  -2.784919 16.1777764 0.2375479
## West_P-West_NP  15.446429   5.965081 24.9277764 0.0007429

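After running the TukeyHSD test, I noticed that I can’t reject the Null Hypothesis for every pair of groups; however, at the 0.05 level I can for 2 of the 6 comparisons:

  • West_P ~ East_NP (p adj = 0.0041)
  • West_P ~ West_NP (p adj = 0.0007)

The West_NP ~ East_P comparison comes close (p adj = 0.065), but its confidence interval (-17.91 to 0.41) includes zero, so it does not reach significance. For the two significant pairs, the estimated differences in the Means are 13.57 and 15.45, respectively, on the scale of the converted SRS column.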
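Rather than reading the significant pairs off the printout, they can be filtered from the TukeyHSD result, which stores one matrix per factor with the columns `diff`, `lwr`, `upr`, and `p adj`. This is a minimal sketch using R's built-in `PlantGrowth` dataset as a stand-in, since it does not assume anything about the NBA file.

```r
# One-way ANOVA on a built-in dataset (three groups: ctrl, trt1, trt2).
fit <- aov(weight ~ group, data = PlantGrowth)

# The comparison matrix is stored under the factor's name.
tukey <- TukeyHSD(fit)$group

# Keep only the pairs whose adjusted p-value is below 0.05.
significant <- tukey[tukey[, "p adj"] < 0.05, , drop = FALSE]

print(significant)
```

For the analysis in this post, the equivalent matrix would be `TukeyHSD(anova)$Groups`.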

Thank you for reviewing this analysis! I invite you to check out my other R, SQL, Excel, and Tableau projects on the following websites:

Thanks!!!