library(readr)
library(Stat2Data)
nbaData <- read_csv("nba_logreg.csv")
## Parsed with column specification:
## cols(
## .default = col_double(),
## Name = col_character()
## )
## See spec(...) for full column specifications.
mod1 = glm(TARGET_5Yrs ~ GP, family = binomial, data = nbaData)
summary(mod1)
##
## Call:
## glm(formula = TARGET_5Yrs ~ GP, family = binomial, data = nbaData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9163 -1.0413 0.6176 0.8635 1.9361
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.524465 0.226275 -11.16 <2e-16 ***
## GP 0.051059 0.003749 13.62 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1779.5 on 1339 degrees of freedom
## Residual deviance: 1561.3 on 1338 degrees of freedom
## AIC: 1565.3
##
## Number of Fisher Scoring iterations: 4
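# Fitted intercept and slope of mod1
B0 = coef(mod1)[1]
B1 = coef(mod1)[2]
# Jitter the 0/1 response so overlapping points are visible, then overlay the fitted curve
plot(jitter(TARGET_5Yrs, amount = 0.1) ~ GP, data = nbaData)
curve(exp(B0+B1*x)/(1+exp(B0+B1*x)), add = TRUE, col = "red")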
#### The plot above shows the data for mod1 (TARGET_5Yrs ~ GP) together with its fitted logistic curve. Because the response is binary, the jittered points cluster near 0 and near 1, so the curve cannot pass through them; it traces the estimated probability instead. The curve rises with GP: as the number of games played increases, the estimated probability that a player lasts at least 5 years in the league increases.
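#### The red curve can also be evaluated at individual values of GP. The following is a minimal sketch, not part of the original analysis: the coefficients are copied by hand from the summary(mod1) output, the GP values 20, 50 and 82 are arbitrary illustrative choices, and plogis() is base R's inverse logit.

```r
# Coefficients copied from the summary(mod1) output above.
B0 <- -2.524465
B1 <- 0.051059

# Illustrative (hypothetical) values of games played.
gp <- c(20, 50, 82)

# plogis(eta) is the inverse logit, equal to exp(eta) / (1 + exp(eta)).
p_hat <- plogis(B0 + B1 * gp)

# The same quantity written out as in the curve() call.
p_manual <- exp(B0 + B1 * gp) / (1 + exp(B0 + B1 * gp))

print(round(p_hat, 3))
```

#### Using plogis() avoids writing the exponential twice and gives the same values as the explicit formula.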
emplogitplot1(TARGET_5Yrs ~ GP, data = nbaData)
#### The plot above is the empirical logit plot for TARGET_5Yrs ~ GP. The relationship appears roughly linear: the points do not fall exactly on the line, but they are close to it. With only three plotted points, it is difficult to tell whether linearity holds across the full range of GP. Each point summarizes a slice of GP values, because GP is a quantitative predictor and only the response is binary.
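#### To make the idea of slicing concrete, here is a rough sketch of how empirical logits can be computed by hand. It uses simulated data (the variables x and y are hypothetical stand-ins, not the NBA data), eight groups chosen arbitrarily, and the adjusted logit log((yes + 0.5) / (no + 0.5)); whether emplogitplot1 uses exactly this grouping and adjustment is an assumption.

```r
# Hypothetical simulated data standing in for a quantitative predictor and a 0/1 response.
set.seed(1)
x <- runif(500, 10, 82)
y <- rbinom(500, 1, plogis(-2.5 + 0.05 * x))

# Slice x into groups of roughly equal size using quantile breaks.
ngroups <- 8
breaks <- quantile(x, probs = seq(0, 1, length.out = ngroups + 1))
grp <- cut(x, breaks = breaks, include.lowest = TRUE)

# Count successes and group sizes within each slice.
yes <- tapply(y, grp, sum)
n <- tapply(y, grp, length)

# Adjusted empirical logit per group; the 0.5 guards against log(0).
emp_logit <- log((yes + 0.5) / (n - yes + 0.5))

# Mean of x in each slice, the horizontal position of each point.
x_mean <- tapply(x, grp, mean)

print(cbind(x_mean, emp_logit))
```

#### Plotting emp_logit against x_mean with more than three groups gives a finer check of linearity on the log-odds scale.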
summary(mod1)
##
## Call:
## glm(formula = TARGET_5Yrs ~ GP, family = binomial, data = nbaData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9163 -1.0413 0.6176 0.8635 1.9361
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.524465 0.226275 -11.16 <2e-16 ***
## GP 0.051059 0.003749 13.62 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1779.5 on 1339 degrees of freedom
## Residual deviance: 1561.3 on 1338 degrees of freedom
## AIC: 1565.3
##
## Number of Fisher Scoring iterations: 4
exp(confint.default(mod1))
## 2.5 % 97.5 %
## (Intercept) 0.05140824 0.1248086
## GP 1.04467976 1.0601467
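#### The GP interval above can be reproduced by hand from the coefficient table. This is a sketch under the assumption that confint.default() returns a Wald interval (estimate plus or minus a normal quantile times the standard error); the estimate and standard error are copied from the summary(mod1) output.

```r
# Estimate and standard error for GP, copied from the summary(mod1) output.
b1 <- 0.051059
se1 <- 0.003749

# 95% Wald interval on the log-odds scale.
z <- qnorm(0.975)
ci_log_odds <- b1 + c(-1, 1) * z * se1

# Exponentiate to move to the odds-ratio scale.
ci_odds_ratio <- exp(ci_log_odds)

print(round(ci_odds_ratio, 4))
```

#### Because the whole interval lies above 1, each additional game played is associated with higher odds of lasting at least 5 years.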
G1 <- 1779.5 - 1561.3
G1
## [1] 218.2
1-pchisq(G1, 1)
## [1] 0
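#### The printed 0 is a rounding effect: the p-value is far too small to survive subtraction from 1 in floating point, not literally zero. A possible alternative, shown here as a sketch, asks pchisq() for the upper tail directly through its lower.tail and log.p arguments.

```r
# Drop-in-deviance statistic, using the deviances from the summary(mod1) output.
G1 <- 1779.5 - 1561.3

# Upper-tail probability computed directly, avoiding the subtraction from 1.
p_upper <- pchisq(G1, df = 1, lower.tail = FALSE)

# The same tail probability on the natural-log scale.
log_p <- pchisq(G1, df = 1, lower.tail = FALSE, log.p = TRUE)

print(p_upper)
print(log_p)
```

#### Either form supports the same conclusion as before: the model with GP fits significantly better than the intercept-only model.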
mod2 = glm(TARGET_5Yrs ~ PTS, family = binomial, data = nbaData)
summary(mod2)
##
## Call:
## glm(formula = TARGET_5Yrs ~ PTS, family = binomial, data = nbaData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7285 -1.1412 0.6146 1.0190 1.4300
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.78108 0.12283 -6.359 2.03e-10 ***
## PTS 0.20452 0.01897 10.778 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1779.5 on 1339 degrees of freedom
## Residual deviance: 1620.3 on 1338 degrees of freedom
## AIC: 1624.3
##
## Number of Fisher Scoring iterations: 4
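# Fitted intercept and slope of mod2
B0 = coef(mod2)[1]
B1 = coef(mod2)[2]
# Jitter the 0/1 response so overlapping points are visible, then overlay the fitted curve
plot(jitter(TARGET_5Yrs, amount = 0.1) ~ PTS, data = nbaData)
curve(exp(B0+B1*x)/(1+exp(B0+B1*x)), add = TRUE, col = "red")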
#### The plot above shows the data for mod2 (TARGET_5Yrs ~ PTS) together with its fitted logistic curve. Because the response is binary, the jittered points cluster near 0 and near 1, so the curve cannot pass through them; it traces the estimated probability instead. The curve rises with PTS: as the average number of points scored per game increases, the estimated probability that a player lasts at least 5 years in the league increases.
emplogitplot1(TARGET_5Yrs ~ PTS, data = nbaData)
#### The plot above is the empirical logit plot for TARGET_5Yrs ~ PTS. The relationship appears roughly linear: the points do not fall exactly on the line, but they are close to it. With only three plotted points, it is difficult to tell whether linearity holds across the full range of PTS. Each point summarizes a slice of PTS values, because PTS is a quantitative predictor and only the response is binary.
summary(mod2)
##
## Call:
## glm(formula = TARGET_5Yrs ~ PTS, family = binomial, data = nbaData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7285 -1.1412 0.6146 1.0190 1.4300
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.78108 0.12283 -6.359 2.03e-10 ***
## PTS 0.20452 0.01897 10.778 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1779.5 on 1339 degrees of freedom
## Residual deviance: 1620.3 on 1338 degrees of freedom
## AIC: 1624.3
##
## Number of Fisher Scoring iterations: 4
exp(confint.default(mod2))
## 2.5 % 97.5 %
## (Intercept) 0.3599377 0.5825534
## PTS 1.1821452 1.2734262
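#### The interval above is for a 1-point increase in PTS. As a sketch of how the slope translates into odds ratios for other increments, the example below uses the PTS estimate copied from the summary(mod2) output; the 5-point increment is an arbitrary illustrative choice.

```r
# Slope for PTS, copied from the summary(mod2) output.
b_pts <- 0.20452

# Odds ratio for a 1-point increase in points per game.
or_1 <- exp(b_pts)

# Odds ratio for a 5-point increase (hypothetical increment for illustration).
or_5 <- exp(5 * b_pts)

print(round(c(or_1, or_5), 3))
```

#### The 1-point odds ratio falls inside the confidence interval reported above, as it should, and the 5-point odds ratio is the 1-point ratio raised to the fifth power.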
G2 <- 1779.5 - 1620.3
G2
## [1] 159.2
1-pchisq(G2, 1)
## [1] 0
summary(mod1)
##
## Call:
## glm(formula = TARGET_5Yrs ~ GP, family = binomial, data = nbaData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9163 -1.0413 0.6176 0.8635 1.9361
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.524465 0.226275 -11.16 <2e-16 ***
## GP 0.051059 0.003749 13.62 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1779.5 on 1339 degrees of freedom
## Residual deviance: 1561.3 on 1338 degrees of freedom
## AIC: 1565.3
##
## Number of Fisher Scoring iterations: 4
summary(mod2)
##
## Call:
## glm(formula = TARGET_5Yrs ~ PTS, family = binomial, data = nbaData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.7285 -1.1412 0.6146 1.0190 1.4300
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.78108 0.12283 -6.359 2.03e-10 ***
## PTS 0.20452 0.01897 10.778 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1779.5 on 1339 degrees of freedom
## Residual deviance: 1620.3 on 1338 degrees of freedom
## AIC: 1624.3
##
## Number of Fisher Scoring iterations: 4
G1 <- 1779.5 - 1561.3
G1
## [1] 218.2
1-pchisq(G1, 1)
## [1] 0
G2 <- 1779.5 - 1620.3
G2
## [1] 159.2
1-pchisq(G2, 1)
## [1] 0
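#### Both drop-in-deviance tests are highly significant, so they do not by themselves rank the two models. One way to compare them, sketched here, is through AIC. The deviances are copied from the two summary() outputs, and the sketch relies on the fact that for a 0/1 response the reported AIC equals the residual deviance plus twice the number of parameters.

```r
# Residual deviances copied from the summary(mod1) and summary(mod2) outputs.
dev_gp <- 1561.3
dev_pts <- 1620.3

# Each model has two parameters: an intercept and a slope.
k <- 2

# AIC reconstructed as residual deviance + 2k.
aic_gp <- dev_gp + 2 * k
aic_pts <- dev_pts + 2 * k

print(c(GP = aic_gp, PTS = aic_pts))
print(aic_pts - aic_gp)
```

#### The reconstructed values match the AIC lines in the output (1565.3 and 1624.3). The GP model has the lower AIC and the larger drop in deviance (218.2 versus 159.2), so by these measures GP is the stronger single predictor.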