They say “offense sells tickets but defense wins
championships”. Both aspects of the game require strong team play
and the ability to read the game and anticipate the actions of the
opposing team, but is it actually true that one matters more than the
other? In a competitive team sport, is it better to have a good offense
or a good defense?
As trivial as that question may sound, it’s actually an important
one in determining the strategies NBA teams will use.
Teams have limited resources (player contracts, draft picks,
time allotted for training, etc.), so it is vital that
these resources are used as efficiently as possible in order to
affect the outcome of the season. Today we’ll discuss offense vs. defense
in detail using NBA data.
We will be looking specifically at the average
Team FG%, the average Opponent FG%, and the
Win% of all 30 NBA teams over the 2021-2022 regular season.
Team FG% serves as our measure of offense and Opponent FG% as our
measure of defense.
Here is a link to
the NBA website’s entire dataset.
Below is the dataset used for this analysis.
| NBA Team | Win% | Team FG% | Opponent FG% |
|---|---|---|---|
| Atlanta Hawks | 52.4 | 46.9 | 47.1 |
| Boston Celtics | 62.2 | 46.6 | 43.4 |
| Brooklyn Nets | 53.7 | 47.6 | 45.2 |
| Charlotte Hornets | 52.4 | 46.9 | 46.5 |
| Chicago Bulls | 56.1 | 48.0 | 47.4 |
| Cleveland Cavaliers | 53.7 | 46.9 | 45.2 |
| Dallas Mavericks | 63.4 | 46.2 | 45.7 |
| Denver Nuggets | 58.5 | 48.3 | 47.0 |
| Detroit Pistons | 28.0 | 43.0 | 47.3 |
| Golden State Warriors | 64.6 | 46.9 | 43.9 |
| Houston Rockets | 24.4 | 45.5 | 48.3 |
| Indiana Pacers | 30.5 | 46.3 | 48.3 |
| LA Clippers | 51.2 | 45.8 | 45.0 |
| Los Angeles Lakers | 40.2 | 47.0 | 47.0 |
| Memphis Grizzlies | 68.3 | 46.1 | 45.6 |
| Miami Heat | 64.6 | 46.6 | 44.7 |
| Milwaukee Bucks | 62.2 | 46.8 | 45.6 |
| Minnesota Timberwolves | 56.1 | 45.8 | 46.0 |
| New Orleans Pelicans | 43.9 | 45.7 | 47.2 |
| New York Knicks | 45.1 | 43.7 | 44.7 |
| Oklahoma City Thunder | 29.3 | 43.1 | 45.8 |
| Orlando Magic | 26.8 | 43.4 | 45.9 |
| Philadelphia 76ers | 62.2 | 46.5 | 45.9 |
| Phoenix Suns | 78.0 | 48.5 | 44.5 |
| Portland Trail Blazers | 32.9 | 44.3 | 47.9 |
| Sacramento Kings | 36.6 | 46.0 | 48.0 |
| San Antonio Spurs | 41.5 | 46.6 | 46.6 |
| Toronto Raptors | 58.5 | 44.5 | 46.2 |
| Utah Jazz | 59.8 | 47.1 | 45.3 |
| Washington Wizards | 42.7 | 47.1 | 46.4 |
Our dataset above is a bit tedious to read, so let’s visualize it
and see how these teams ranked in the 2021-2022 regular season. The
dropdown menu below lets us choose an NBA metric and see how each team
fares.
Twelve of the 30 teams (40%) won fewer than 50% of their games
that season. Let’s see if a regression model can help explain what
separates them from the winning teams. First, we will carry out
a multiple regression analysis to assess how strongly offense and
defense are each associated with a team’s overall record,
and then draw some insights from the results. Below are the results of the first-order
multiple regression analysis using R’s built-in linear model function
lm(). Here, Team FG% is represented as
x, Opponent FG% is represented as
y, and Win% is represented as z.
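The model call that follows assumes that `x`, `y`, and `z` already exist in the workspace, but the post does not show how they were created. A minimal sketch of one way to build them from the table above; the data frame name `nba` and its column names are assumptions made for illustration, and the values are transcribed in the table's team order (Atlanta Hawks first, Washington Wizards last).

```r
# Hypothetical data frame holding the three columns of the table above,
# in the same team order (Atlanta Hawks ... Washington Wizards).
nba <- data.frame(
  win_pct = c(52.4, 62.2, 53.7, 52.4, 56.1, 53.7, 63.4, 58.5, 28.0, 64.6,
              24.4, 30.5, 51.2, 40.2, 68.3, 64.6, 62.2, 56.1, 43.9, 45.1,
              29.3, 26.8, 62.2, 78.0, 32.9, 36.6, 41.5, 58.5, 59.8, 42.7),
  team_fg = c(46.9, 46.6, 47.6, 46.9, 48.0, 46.9, 46.2, 48.3, 43.0, 46.9,
              45.5, 46.3, 45.8, 47.0, 46.1, 46.6, 46.8, 45.8, 45.7, 43.7,
              43.1, 43.4, 46.5, 48.5, 44.3, 46.0, 46.6, 44.5, 47.1, 47.1),
  opp_fg  = c(47.1, 43.4, 45.2, 46.5, 47.4, 45.2, 45.7, 47.0, 47.3, 43.9,
              48.3, 48.3, 45.0, 47.0, 45.6, 44.7, 45.6, 46.0, 47.2, 44.7,
              45.8, 45.9, 45.9, 44.5, 47.9, 48.0, 46.6, 46.2, 45.3, 46.4)
)

# Short variable names used by the model formula z ~ x + y
x <- nba$team_fg  # Team FG% (offense)
y <- nba$opp_fg   # Opponent FG% (defense)
z <- nba$win_pct  # Win%
```

Keeping the columns in a data frame also makes it possible to fit the model as `lm(win_pct ~ team_fg + opp_fg, data = nba)`, which avoids relying on loose variables in the workspace.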
# Statistical analysis
lmwin_perc <- lm(z~x+y)
lmsummary <- summary(lmwin_perc)
lmsummary
##
## Call:
## lm(formula = z ~ x + y)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.716 -6.668 0.060 4.401 17.637
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 95.957 75.588 1.269 0.215
## x 5.314 1.010 5.260 1.51e-05 ***
## y -6.311 1.156 -5.459 8.90e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.839 on 27 degrees of freedom
## Multiple R-squared: 0.7123, Adjusted R-squared: 0.691
## F-statistic: 33.43 on 2 and 27 DF, p-value: 4.952e-08
# Calculating VIF values (vif() comes from the car package)
library(car)
vif(lmwin_perc)
## x y
## 1.020147 1.020147
# Normal probability plot of residuals
res <- resid(lmwin_perc)
qqnorm(res)
qqline(res)
As we can see, our linear regression model is
defensible! The p-values for
Team FG% and Opponent FG% were
\(1.51 \times 10^{-5}\) and \(8.90 \times 10^{-6}\), respectively.
Our adjusted R-squared = 0.69, both of our VIFs
= 1.02 (indicating little multicollinearity), and the residuals in the QQ plot
look approximately normal, with only 1 large residual (the Toronto Raptors).
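With only two predictors, the VIF has a simple closed form, which makes the reported value easy to interpret. A short worked calculation using the standard definition of the VIF and the value printed above; the notation \(R^2_{x \mid y}\), meaning the R-squared from regressing Team FG% on Opponent FG%, is introduced here for illustration:

\[\mathrm{VIF} = \frac{1}{1 - R^2_{x \mid y}} \quad\Rightarrow\quad R^2_{x \mid y} = 1 - \frac{1}{1.020147} \approx 0.0197\]

In other words, under 2% of the variation in one predictor is explained by the other, which corresponds to a correlation of about 0.14 in absolute value between Team FG% and Opponent FG%.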
Our equation for the regression model reads as follows: \[z = 95.96 + 5.31x-6.31y\] where \(z=\) Win%, \(x=\) Team FG%, and \(y=\) Opponent FG%.
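To make the equation concrete, here is a small sketch that plugs values into it. The function name `predict_win_pct` is hypothetical, and it uses the rounded coefficients from the equation rather than the fitted model object, so its results will differ slightly from `predict()` on the model itself.

```r
# Hypothetical helper: evaluates z = 95.96 + 5.31x - 6.31y
# using the rounded coefficients from the regression summary.
predict_win_pct <- function(team_fg, opp_fg) {
  95.96 + 5.31 * team_fg - 6.31 * opp_fg
}

# Phoenix Suns: Team FG% 48.5, Opponent FG% 44.5 (actual Win% was 78.0)
suns_pred <- predict_win_pct(48.5, 44.5)

# Change in predicted Win% from a one-point improvement on each side of the ball
gain_offense <- predict_win_pct(47, 46) - predict_win_pct(46, 46)  # +1 Team FG%
gain_defense <- predict_win_pct(46, 45) - predict_win_pct(46, 46)  # -1 Opponent FG%
```

For Phoenix the equation gives roughly 72.7%, against an actual Win% of 78.0. The two gains are simply the coefficient magnitudes, 5.31 and 6.31 percentage points of Win% per point of FG%.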
That’s all good, but what does the relationship between offense and
defense look like? Let’s create a 3D plot of the regression plane along with
the data points. Red lines show the residual of each point, that is,
its vertical distance from the regression plane. The plot is interactive, allowing
us to orient it in various ways.
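# Creating an interactive 3D plot (plot3d, surface3d and segments3d come from rgl)
library(rgl)
plot3d(x, y, z, xlab = "Team FG%", ylab = "Opp FG%", zlab = "Win%")
# Creating the regression plane
# (assumed definitions: the grid and its predictions are not shown in the original post)
x.pred <- seq(min(x), max(x), length.out = 20)
y.pred <- seq(min(y), max(y), length.out = 20)
z.pred <- outer(x.pred, y.pred, function(a, b) predict(lmwin_perc, data.frame(x = a, y = b)))
surface3d(x.pred, y.pred, z.pred, alpha = 0.4, front = "lines", back = "lines")
# Creating residual lines
# (assumed helpers: fitted values and a function that alternates two vectors)
fitpoints <- fitted(lmwin_perc)
interleave <- function(a, b) as.vector(rbind(a, b))
segments3d(interleave(x, x),
           interleave(y, y),
           interleave(z, fitpoints), alpha = 0.4, col = "red")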
As we’ve seen, our model was defensible in determining
Win%. We found the magnitudes of the coefficients of
Team FG% and Opponent FG% to be 5.31 and 6.31,
respectively, suggesting that Opponent FG% was a somewhat greater
determinant of Win% than Team FG%, although the gap of about 1.0 is
comparable to the coefficients’ standard errors (1.01 and 1.16).
Furthermore, by using the linearHypothesis
function from the car package, we can test whether the difference
between our coefficients is significant. The null
hypothesis is that the coefficients of Team FG%
and Opponent FG% are equal.
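# Calculating significance between coefficients
# (the original call used an undefined object `fit`; the output shows it is the same z ~ x + y model)
linearHypothesis(lmwin_perc, "x-y = 0")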
## Linear hypothesis test
##
## Hypothesis:
## x - y = 0
##
## Model 1: restricted model
## Model 2: z ~ x + y
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 28 5752.3
## 2 27 1659.1 1 4093.2 66.611 9.135e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
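The p-value was 9.135e-09, so we reject the hypothesis that the two
coefficients are equal. Keep in mind that the coefficients have opposite
signs (5.31 and -6.31), so this result mostly reflects that difference in sign.
Whether their magnitudes differ is a separate question, corresponding to the
hypothesis x + y = 0.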
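To compare the magnitudes of the two coefficients directly, one option is a nested-model F test in base R. This is a sketch of a swapped-in technique, not the test run in this post: the restricted model forces the Team FG% and Opponent FG% coefficients to be equal and opposite by using the FG% differential as a single predictor, which should be equivalent to testing `x + y = 0` with linearHypothesis. The object names `full`, `restricted`, and `equal_magnitude_test` are hypothetical, and the data are repeated from the table so the block runs on its own.

```r
# Table columns in team order (Atlanta Hawks ... Washington Wizards)
x <- c(46.9, 46.6, 47.6, 46.9, 48.0, 46.9, 46.2, 48.3, 43.0, 46.9,
       45.5, 46.3, 45.8, 47.0, 46.1, 46.6, 46.8, 45.8, 45.7, 43.7,
       43.1, 43.4, 46.5, 48.5, 44.3, 46.0, 46.6, 44.5, 47.1, 47.1)  # Team FG%
y <- c(47.1, 43.4, 45.2, 46.5, 47.4, 45.2, 45.7, 47.0, 47.3, 43.9,
       48.3, 48.3, 45.0, 47.0, 45.6, 44.7, 45.6, 46.0, 47.2, 44.7,
       45.8, 45.9, 45.9, 44.5, 47.9, 48.0, 46.6, 46.2, 45.3, 46.4)  # Opponent FG%
z <- c(52.4, 62.2, 53.7, 52.4, 56.1, 53.7, 63.4, 58.5, 28.0, 64.6,
       24.4, 30.5, 51.2, 40.2, 68.3, 64.6, 62.2, 56.1, 43.9, 45.1,
       29.3, 26.8, 62.2, 78.0, 32.9, 36.6, 41.5, 58.5, 59.8, 42.7)  # Win%

# Full model: separate coefficients for offense and defense
full <- lm(z ~ x + y)

# Restricted model: one coefficient on the FG% differential,
# i.e. coef(x) = -coef(y), so both sides of the ball count equally
restricted <- lm(z ~ I(x - y))

# F test of the restriction; a small p-value would indicate that
# the two coefficient magnitudes differ
equal_magnitude_test <- anova(restricted, full)
equal_magnitude_test
```

The second row of the printed table holds the F statistic and p-value for the restriction. If that p-value is large, the data do not distinguish the offensive and defensive effects in size, even though both are clearly different from zero.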
In this data, defensive prowess is at least as strongly tied to winning games as offensive talent, with a slightly larger coefficient for Opponent FG%. One season of 30 teams cannot prove this, however. The analysis is also incomplete in that it doesn’t stratify by, or allow for, other factors such as home vs. away games, type of shot (2-point vs. 3-point), active players, etc. These are all fascinating areas to look into.
Also, the NBA differs from other sports in that its
offense and defense are very intertwined. Maybe using the NFL (a very
distinct game on each side of the ball) as a dataset would give us
different insights. Either way, front offices can use information like
the regression model above to make tough decisions with respect to
hiring coaches, signing free agents, and even drafting the next class of
NBA players.
“Good basketball always starts with good defense.”
– Coach Bob Knight