
1 Introduction



They say “offense sells tickets but defense wins championships”. Both aspects of the game require strong team play and the ability to read the game and anticipate the actions of the opposing team, but is it actually true that one matters more than the other? In a competitive team sport, is it better to have a good offense or a good defense?


Trivial as that question may sound, it’s actually an important one in determining the strategies NBA teams will use. Teams have a limited amount of resources (player contracts, draft picks, time and schedule allotted for training, etc.), so it is vital that these resources are used as efficiently as possible in order to affect the outcome of the season. Today we’ll discuss offense vs defense in detail using NBA data.

2 Data


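Specifically, we will look at the average Team FG% (field goal percentage, our measure of offense), the average Opponent FG% (the field goal percentage a team allows, our measure of defense), and the Win% of all 30 NBA teams over the 2021-2022 regular season. Here is a link to the NBA website’s entire dataset.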


Below is the dataset used for this analysis.

NBA statistics for the 2021-2022 regular season (all values in percent)
NBA Team                Win%   Team FG%   Opponent FG%
Atlanta Hawks           52.4   46.9       47.1
Boston Celtics          62.2   46.6       43.4
Brooklyn Nets           53.7   47.6       45.2
Charlotte Hornets       52.4   46.9       46.5
Chicago Bulls           56.1   48.0       47.4
Cleveland Cavaliers     53.7   46.9       45.2
Dallas Mavericks        63.4   46.2       45.7
Denver Nuggets          58.5   48.3       47.0
Detroit Pistons         28.0   43.0       47.3
Golden State Warriors   64.6   46.9       43.9
Houston Rockets         24.4   45.5       48.3
Indiana Pacers          30.5   46.3       48.3
LA Clippers             51.2   45.8       45.0
Los Angeles Lakers      40.2   47.0       47.0
Memphis Grizzlies       68.3   46.1       45.6
Miami Heat              64.6   46.6       44.7
Milwaukee Bucks         62.2   46.8       45.6
Minnesota Timberwolves  56.1   45.8       46.0
New Orleans Pelicans    43.9   45.7       47.2
New York Knicks         45.1   43.7       44.7
Oklahoma City Thunder   29.3   43.1       45.8
Orlando Magic           26.8   43.4       45.9
Philadelphia 76ers      62.2   46.5       45.9
Phoenix Suns            78.0   48.5       44.5
Portland Trail Blazers  32.9   44.3       47.9
Sacramento Kings        36.6   46.0       48.0
San Antonio Spurs       41.5   46.6       46.6
Toronto Raptors         58.5   44.5       46.2
Utah Jazz               59.8   47.1       45.3
Washington Wizards      42.7   47.1       46.4
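
For readers who want to reproduce the analysis, the table can be entered in R roughly as follows. This is only a sketch: the data frame name `nba` and its column names are assumptions of this example, while the vectors x, y and z follow the naming used in the regression further down.

```r
# Values transcribed from the table above (all in percent).
# The object name `nba` and the column names are illustrative choices,
# not taken from the original analysis.
nba <- read.csv(text = "team,win_pct,team_fg,opp_fg
Atlanta Hawks,52.4,46.9,47.1
Boston Celtics,62.2,46.6,43.4
Brooklyn Nets,53.7,47.6,45.2
Charlotte Hornets,52.4,46.9,46.5
Chicago Bulls,56.1,48.0,47.4
Cleveland Cavaliers,53.7,46.9,45.2
Dallas Mavericks,63.4,46.2,45.7
Denver Nuggets,58.5,48.3,47.0
Detroit Pistons,28.0,43.0,47.3
Golden State Warriors,64.6,46.9,43.9
Houston Rockets,24.4,45.5,48.3
Indiana Pacers,30.5,46.3,48.3
LA Clippers,51.2,45.8,45.0
Los Angeles Lakers,40.2,47.0,47.0
Memphis Grizzlies,68.3,46.1,45.6
Miami Heat,64.6,46.6,44.7
Milwaukee Bucks,62.2,46.8,45.6
Minnesota Timberwolves,56.1,45.8,46.0
New Orleans Pelicans,43.9,45.7,47.2
New York Knicks,45.1,43.7,44.7
Oklahoma City Thunder,29.3,43.1,45.8
Orlando Magic,26.8,43.4,45.9
Philadelphia 76ers,62.2,46.5,45.9
Phoenix Suns,78.0,48.5,44.5
Portland Trail Blazers,32.9,44.3,47.9
Sacramento Kings,36.6,46.0,48.0
San Antonio Spurs,41.5,46.6,46.6
Toronto Raptors,58.5,44.5,46.2
Utah Jazz,59.8,47.1,45.3
Washington Wizards,42.7,47.1,46.4")

# Short names matching the regression: x = Team FG%, y = Opponent FG%, z = Win%
x <- nba$team_fg
y <- nba$opp_fg
z <- nba$win_pct

# Quick sanity check: number of teams below a 50% win rate
sum(z < 50)
```

Reading the rows from an inline text block keeps each team's three values on one line, which makes it easier to compare against the table than three separate vectors would be.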

3 Analysis


Our dataset above is a bit tedious to read, so let’s visualize it and see how these teams ranked in the 2021-2022 regular season. The dropdown menu below lets us choose an NBA metric and see how each team fares.




Twelve of the 30 teams won fewer than 50% of their games that season. Let’s see if a regression model can tell us what might help improve those numbers. First, we will carry out a multiple regression analysis to assess how strongly offense and defense each relate to a team’s overall record, and then draw some insights. Below are the results of a first-order multiple regression fitted with R’s built-in linear model function lm(). Here, Team FG% is represented as x, Opponent FG% as y, and Win% as z.

# Statistical analysis
lmwin_perc <- lm(z~x+y)
lmsummary <- summary(lmwin_perc)
lmsummary
## 
## Call:
## lm(formula = z ~ x + y)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.716  -6.668   0.060   4.401  17.637 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   95.957     75.588   1.269    0.215    
## x              5.314      1.010   5.260 1.51e-05 ***
## y             -6.311      1.156  -5.459 8.90e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.839 on 27 degrees of freedom
## Multiple R-squared:  0.7123, Adjusted R-squared:  0.691 
## F-statistic: 33.43 on 2 and 27 DF,  p-value: 4.952e-08
# Calculating VIF values (vif() is provided by the car package)
library(car)
vif(lmwin_perc)
##        x        y 
## 1.020147 1.020147
# Normal probability plot of residuals
res <- resid(lmwin_perc)
qqnorm(res)
qqline(res)


As we can see, our linear regression model is defensible! The P values for Team FG% and Opponent FG% were \(1.51 \times 10^{-5}\) and \(8.90 \times 10^{-6}\), respectively. The adjusted R-squared is 0.69, both VIFs are 1.02 (so collinearity between the two predictors is negligible), and the residuals in the QQ plot follow the normal line reasonably well, with only one large residual (the Toronto Raptors).
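
As a quick worked check on the VIF values: with only two predictors, both VIFs are equal and depend only on the correlation between Team FG% and Opponent FG%. Writing that correlation as \(r\) (a symbol introduced here for illustration), the printed VIF implies \[\mathrm{VIF} = \frac{1}{1-r^2} \quad\Rightarrow\quad r^2 = 1 - \frac{1}{1.020147} \approx 0.0197, \qquad |r| \approx 0.14\] so the two predictors are only weakly correlated in this dataset, which is what allows their coefficients to be read separately.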


Our equation for the regression model reads as follows: \[z = 95.96 + 5.31x-6.31y\] where \(z=\) Win%, \(x=\) Team FG%, and \(y=\) Opponent FG%.
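
To see how the equation is used, we can plug in two rows from the table. The coefficients are rounded as above, so the results are approximate, and \(\hat z\) is used here to denote the predicted Win%. For the Phoenix Suns (\(x = 48.5\), \(y = 44.5\)): \[\hat z = 95.96 + 5.31(48.5) - 6.31(44.5) \approx 72.7\] against an actual Win% of 78.0. For the Toronto Raptors (\(x = 44.5\), \(y = 46.2\)): \[\hat z = 95.96 + 5.31(44.5) - 6.31(46.2) \approx 40.7\] against an actual Win% of 58.5. That gap of roughly 17.8 points is in line with the largest residual in the summary output (17.637); the small difference comes from rounding the coefficients.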


That’s all good, but what does the relationship between offense and defense look like? Let’s create a 3D plot of the regression plane along with the data points. Red lines show each point’s residual, that is, its vertical distance from the regression plane. The plot is interactive, allowing us to orient it in various ways.

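# Creating an interactive 3D plot (plot3d(), surface3d() and segments3d() come from the rgl package)
library(rgl)
plot3d(x, y, z, xlab = "Team FG%", ylab = "Opp FG%", zlab = "Win%")
# Creating the regression plane (x.pred, y.pred and z.pred are defined elsewhere)
surface3d(x.pred, y.pred, z.pred, alpha = 0.4, front = "lines", back = "lines")
# Creating residual lines from each observed point to its fitted value
segments3d(interleave(x, x),
           interleave(y, y),
           interleave(z, fitpoints), alpha = 0.4, col = "red")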
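
The objects x.pred, y.pred, z.pred and fitpoints, as well as the helper interleave(), are not defined in the code shown here. One plausible way to build them is sketched below. This is an assumption for illustration rather than the author's actual code, and it uses only six rows from the table so that it runs on its own.

```r
# Six rows from the table (Suns, Raptors, Celtics, Pistons, Rockets, Lakers),
# used only to keep this sketch self-contained.
x <- c(48.5, 44.5, 46.6, 43.0, 45.5, 47.0)   # Team FG%
y <- c(44.5, 46.2, 43.4, 47.3, 48.3, 47.0)   # Opponent FG%
z <- c(78.0, 58.5, 62.2, 28.0, 24.4, 40.2)   # Win%

# Hypothetical model name; the document calls its model lmwin_perc
sketch_fit <- lm(z ~ x + y)

# Fitted Win% for each observed team (the lower end of each red segment)
fitpoints <- predict(sketch_fit)

# Grid of predictor values spanning the observed range
x.pred <- seq(min(x), max(x), length.out = 20)
y.pred <- seq(min(y), max(y), length.out = 20)
grid <- expand.grid(x = x.pred, y = y.pred)

# Fitted values on the grid, arranged so that z.pred[i, j]
# corresponds to (x.pred[i], y.pred[j])
z.pred <- matrix(predict(sketch_fit, newdata = grid),
                 nrow = length(x.pred), ncol = length(y.pred))

# Assumed helper: alternates the elements of two vectors (a1, b1, a2, b2, ...),
# giving the start/end pairs that a segment-drawing function needs
interleave <- function(v1, v2) as.vector(rbind(v1, v2))
```

Because expand.grid() varies its first argument fastest, filling the matrix column by column puts x along the rows and y along the columns.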


4 Conclusion


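As we’ve seen, our model was defensible in determining Win%. The coefficients of Team FG% and Opponent FG% were 5.31 and -6.31, respectively: each additional percentage point of Team FG% is associated with roughly 5.3 more points of Win%, while each additional point of Opponent FG% is associated with roughly 6.3 fewer. The larger magnitude on Opponent FG% signals that it was the stronger determinant of Win% in this model.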


Furthermore, by using the linearHypothesis function (from the car package), we can test whether the difference between our coefficients is significant. This allows us to test the null hypothesis that there is no difference between the coefficients of Team FG% and Opponent FG%.

# Calculating significance between coefficients (linearHypothesis() is in the car package)
linearHypothesis(lmwin_perc, "x-y = 0")
## Linear hypothesis test
## 
## Hypothesis:
## x - y = 0
## 
## Model 1: restricted model
## Model 2: z ~ x + y
## 
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     28 5752.3                                  
## 2     27 1659.1  1    4093.2 66.611 9.135e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


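The P value was 9.135e-09, indicating a significant difference between the coefficients! Note that the hypothesis x - y = 0 compares the signed coefficients (5.31 and -6.31), so this result largely reflects their opposite signs rather than a difference in magnitude.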
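
Comparing the magnitudes of the two slopes, rather than their signed values, corresponds to the hypothesis x + y = 0 (equal size, opposite sign). A minimal base-R sketch of that test follows. The helper name test_opposite_slopes is hypothetical, and the data are simulated purely for illustration, so the numbers it prints say nothing about the NBA result.

```r
# Hypothetical helper (not part of the original analysis):
# t-test of H0: b_a + b_b = 0 for two slopes of a fitted lm object
test_opposite_slopes <- function(fit, a = "x", b = "y") {
  est <- coef(fit)[[a]] + coef(fit)[[b]]
  V <- vcov(fit)
  # Variance of a sum: Var(a) + Var(b) + 2 * Cov(a, b)
  se <- sqrt(V[a, a] + V[b, b] + 2 * V[a, b])
  t_stat <- est / se
  p <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)
  c(estimate = est, std.error = se, t = t_stat, p.value = p)
}

# Simulated data for illustration only (NOT the NBA data)
set.seed(1)
x <- rnorm(30, mean = 46, sd = 1.5)
y <- rnorm(30, mean = 46, sd = 1.5)
z <- 50 + 5 * (x - 46) - 5 * (y - 46) + rnorm(30, sd = 8)
sim_fit <- lm(z ~ x + y)

out <- test_opposite_slopes(sim_fit)
out
```

Applied to the model in this report, the call would be test_opposite_slopes(lmwin_perc); with the car package loaded, linearHypothesis(lmwin_perc, "x + y = 0") expresses the same hypothesis as an F test.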

Based on this model, defensive prowess appears to matter at least as much for winning games as offensive talent. The analysis is not complete, though: it doesn’t stratify the data or allow for additional factors such as home vs. away games, type of shot (2-point vs. 3-point), active players, etc. These are all fascinating areas to look into.


Also, the NBA differs from other sports in that its offense and defense are very intertwined. Maybe using the NFL (a very distinct game on both sides of the ball) as a dataset would give us different insights. Either way, front offices can use information like the regression model above to make tough decisions with respect to hiring coaches, signing free agents, and even drafting the next class of NBA players.

“Good basketball always starts with good defense.”
– Coach Bob Knight