For this week’s data dive I will be selecting a binary column of data and building a logistic regression model for this variable using 4 explanatory variables.
My binary column of data is the Playoffs column.
The explanatory variables are the TOV_per_100, ORB_per_100, X2p_Percent, and FTA_per_100 columns.
After building the logistic regression model I will interpret the coefficients and explain what they mean. Then, using the standard error for the TOV_per_100 variable, I will build a 95% confidence interval for that coefficient and translate its meaning.
And as always I will be providing the insights gathered, their significance, and some additional questions to be explored.
Here is my logistic regression model built from the binary playoffs variable and the 4 explanatory variables: TOV_per_100, ORB_per_100, X2p_Percent, and FTA_per_100.
log_model <- glm(Playoffs ~ ORB_per_100 + TOV_per_100 + FTA_per_100 + X2p_Percent,
data = NBA_log,
family = "binomial")
summary(log_model)
##
## Call:
## glm(formula = Playoffs ~ ORB_per_100 + TOV_per_100 + FTA_per_100 +
## X2p_Percent, family = "binomial", data = NBA_log)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -14.24347 1.68673 -8.444 < 2e-16 ***
## ORB_per_100 0.14457 0.03992 3.621 0.000293 ***
## TOV_per_100 -0.17048 0.03865 -4.411 1.03e-05 ***
## FTA_per_100 0.17613 0.02214 7.957 1.76e-15 ***
## X2p_Percent 21.94868 2.61424 8.396 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1931.9 on 1401 degrees of freedom
## Residual deviance: 1757.5 on 1397 degrees of freedom
## AIC: 1767.5
##
## Number of Fisher Scoring iterations: 4
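Written out, the summary above corresponds to the following fitted equation. This is only a restatement of the printed estimates, rounded to three decimals; the symbol p is introduced here for illustration and stands for the probability of making the playoffs, assuming Playoffs is coded so that 1 means the team made the playoffs.

```latex
\log\!\left(\frac{p}{1-p}\right)
  = -14.243
  + 0.145\,\mathrm{ORB\_per\_100}
  - 0.170\,\mathrm{TOV\_per\_100}
  + 0.176\,\mathrm{FTA\_per\_100}
  + 21.949\,\mathrm{X2p\_Percent}
```

Each coefficient is the change in log-odds for a one-unit increase in that variable with the others held fixed, so exponentiating a coefficient gives an odds ratio.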
Interpretations:
- ORB_per_100:
The offensive rebounds estimate is 0.14457. e^0.14457 is about equal to 1.16. This means that each additional offensive rebound per 100 possessions increases the odds of making the playoffs by about 16%, holding the other three variables constant.
This suggests that extra possessions really matter and that opportunities for additional scoring attempts contribute towards team success.
- TOV_per_100:
The turnovers estimate is -0.17048. e^(-0.17048) is about equal to 0.84. (1-0.84=0.16) This means that each additional turnover per 100 possessions decreases the odds of making the playoffs by about 16%, holding the other variables constant.
This means that turnovers are costly: each additional one is associated with noticeably lower odds of making the playoffs.
- FTA_per_100:
The free throw attempt estimate is 0.17613. e^0.17613 is about equal to 1.19. This means that each additional free throw attempt per 100 possessions increases playoff odds by about 19%.
This means that getting to the free throw line is a major advantage. Having more free throw attempts can be very helpful for winning games and making the playoffs.
- X2p_Percent:
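The 2pt% estimate is 21.94868. Since X2p_Percent appears to be stored as a proportion between 0 and 1, a one-unit change would mean going from 0% to 100%, so I rescale the estimate to one percentage point: 21.94868/100=0.22. e^0.22 is about equal to 1.25. This means that a 1 percentage point increase in 2 point shooting percentage increases playoff odds by about 25%.
This means that efficiency near the rim and in the mid-range is very important. With the largest z value in the model (8.396), this is one of the strongest indicators of making the playoffs.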
exp(coef(log_model))
## (Intercept) ORB_per_100 TOV_per_100 FTA_per_100 X2p_Percent
## 6.518376e-07 1.155540e+00 8.432635e-01 1.192588e+00 3.405571e+09
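The X2p_Percent odds ratio of about 3.4e+09 looks implausible next to the others because it describes a full one-unit change in that column. The sketch below shows one way to put all four variables on a comparable footing. It hard-codes the estimates from the summary output so it runs without the NBA_log data, and it assumes X2p_Percent is stored as a proportion, so that 0.01 is one percentage point. The names `b`, `step`, `odds_ratio`, and `pct_change` are illustrative and do not appear in the original analysis.

```r
# Coefficient estimates copied from the summary(log_model) output above
b <- c(ORB_per_100 = 0.14457,
       TOV_per_100 = -0.17048,
       FTA_per_100 = 0.17613,
       X2p_Percent = 21.94868)

# Size of the change to interpret for each variable:
# 1 unit for the per-100-possession stats, 0.01 (one percentage point)
# for X2p_Percent, ASSUMING it is stored on a 0-1 scale
step <- c(1, 1, 1, 0.01)

# Odds ratio for a change of `step` in each variable
odds_ratio <- exp(b * step)

# Same information as a percent change in the odds
pct_change <- (odds_ratio - 1) * 100

round(odds_ratio, 3)
round(pct_change, 1)
```

With the fitted model object available, the same rescaling could be applied directly to `coef(log_model)` instead of the hard-coded vector.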
Here is the Confidence Interval for TOV_per_100:
coef_val <- coef(summary(log_model))["TOV_per_100", "Estimate"]
se_val <- coef(summary(log_model))["TOV_per_100", "Std. Error"]
lower <- coef_val - 1.96 * se_val
upper <- coef_val + 1.96 * se_val
lower
## [1] -0.2462219
upper
## [1] -0.09472972
exp(lower)
## [1] 0.7817487
exp(upper)
## [1] 0.9096188
The confidence interval indicates that we are 95% confident that the true effect of turnovers lies between -0.246 and -0.095 in log-odds. In odds-ratio form, the interval runs from about 0.78 to 0.91, so we are 95% confident that each additional turnover per 100 possessions reduces playoff odds by between 9% and 22%.
Overall, the entire log-odds interval is below 0, which is the same as saying the entire odds-ratio interval is below 1. This indicates that the turnovers variable is statistically significant at the 5% level. Turnovers are consistently harmful for teams when it comes to making the playoffs.
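The arithmetic behind this interval can be written out as a worked equation. It uses the usual normal-approximation (Wald) form with the rounded estimate and standard error from the summary table, so the last digits differ slightly from the R output. The symbols \hat{\beta}_{TOV} and SE are notation introduced here, not taken from the original code.

```latex
\hat{\beta}_{TOV} \pm 1.96 \cdot SE
  = -0.17048 \pm 1.96 \times 0.03865
  \approx (-0.246,\; -0.095)

\left(e^{-0.246},\; e^{-0.095}\right) \approx (0.782,\; 0.910)
```

Subtracting each endpoint from 1 gives the 22% and 9% reductions in odds.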
The main insight I gather from the model is that all 4 explanatory variables are statistically significant (every p-value is below 0.001) and have sizable impacts on the odds of a team making the playoffs. Increasing your offensive rebounds, your 2pt%, and your free throw attempts is each associated with better odds of making the playoffs. On the other hand, increasing your number of turnovers is associated with worse odds.
This insight is significant because it tells us how valuable each possession is in addition to simply making shots. Since increasing offensive rebounds and reducing turnovers both lead to additional possessions, winning the possession battle in a game matters alongside how efficiently you shoot.
Some additional questions I might explore could be: Does 3pt% have a large impact on playoff prediction? Can teams offset their turnovers with additional offensive rebounding?