Problem 2.
set.seed(1)
library("ISLR2")
glm.fit <- glm(default ~ income + balance, data = Default, family = binomial(link = "logit"))
summary(glm.fit)
##
## Call:
## glm(formula = default ~ income + balance, family = binomial(link = "logit"),
## data = Default)
##
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.154e+01  4.348e-01 -26.545  < 2e-16 ***
## income       2.081e-05  4.985e-06   4.174 2.99e-05 ***
## balance      5.647e-03  2.274e-04  24.836  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 2920.6 on 9999 degrees of freedom
## Residual deviance: 1579.0 on 9997 degrees of freedom
## AIC: 1585
##
## Number of Fisher Scoring iterations: 8
boot.fn <- function(data, index) {
  # Refit the logistic regression on the rows selected by `index` and return its coefficients
  coef(glm(default ~ income + balance, data = data, subset = index, family = binomial(link = "logit")))
}
library(boot)
## Warning: package 'boot' was built under R version 4.3.3
boot(Default, boot.fn, 1000)
##
## ORDINARY NONPARAMETRIC BOOTSTRAP
##
##
## Call:
## boot(data = Default, statistic = boot.fn, R = 1000)
##
##
## Bootstrap Statistics :
##          original        bias     std. error
## t1* -1.154047e+01 -3.945460e-02 4.344722e-01
## t2*  2.080898e-05  1.680317e-07 4.866284e-06
## t3*  5.647103e-03  1.855765e-05 2.298949e-04
The glm() summary and the bootstrap give very similar standard errors for the coefficients in the logistic regression model. The differences are small and do not all point the same way: the bootstrap standard errors are slightly smaller for the intercept (0.4345 vs. 0.4348) and income (4.87e-06 vs. 4.99e-06), and slightly larger for balance (2.30e-04 vs. 2.27e-04).
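To make the comparison concrete, the two sets of standard errors can be placed side by side. The sketch below is only an illustration: the values are copied by hand from the summary() and boot() output above, and the names se_glm, se_boot, rel_diff and comparison are introduced here and are not part of the original solution.

```r
# Standard errors copied by hand from the summary(glm.fit) output above
se_glm <- c(intercept = 4.348e-01, income = 4.985e-06, balance = 2.274e-04)

# Standard errors copied by hand from the boot() output above (t1*, t2*, t3*)
se_boot <- c(intercept = 4.344722e-01, income = 4.866284e-06, balance = 2.298949e-04)

# Relative difference of the bootstrap SE from the glm() SE
rel_diff <- (se_boot - se_glm) / se_glm

# Side-by-side table with the relative difference expressed in percent
comparison <- data.frame(
  se_glm       = se_glm,
  se_boot      = se_boot,
  rel_diff_pct = round(100 * rel_diff, 2)
)
print(comparison)
```

A negative rel_diff_pct means the bootstrap standard error is smaller than the glm() one. With a different seed or number of bootstrap replicates, the bootstrap column would change slightly.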