\(b_0\) = -4.73931 \(b_1\) = 0.06773 \(b_2\) = 0.5986
The response function is as follows:
\(\hat{\pi}\) = \(\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}\)
ch14 <- read.table("CH14PR13.txt")
head(ch14)
## V1 V2 V3
## 1 0 32 3
## 2 0 45 2
## 3 1 60 2
## 4 0 53 1
## 5 0 25 4
## 6 1 68 1
names(ch14) <- c("Y", "X1", "X2")
full <- glm(Y~X1+X2,family=binomial(link=logit),data=ch14)
summary(full)
##
## Call:
## glm(formula = Y ~ X1 + X2, family = binomial(link = logit), data = ch14)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6189 -0.8949 -0.5880 0.9653 2.0846
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.73931 2.10195 -2.255 0.0242 *
## X1 0.06773 0.02806 2.414 0.0158 *
## X2 0.59863 0.39007 1.535 0.1249
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 44.987 on 32 degrees of freedom
## Residual deviance: 36.690 on 30 degrees of freedom
## AIC: 42.69
##
## Number of Fisher Scoring iterations: 4
\(b_1\) : Holding the other predictors constant, it is expected that the log odds increases by 0.06773 for addition one unit of \(X_i\)
Also, the odds ratio increases by 7%.
\(b_2\) : Holding the other predictors constant, it is expected that the log odds increases by 0.59863 for addition one unit of \(X_i\)
Also, the odds ratio increases by 81.9%.
exp(0.06773) #1.07
## [1] 1.070076
exp(0.59863) #1.819
## [1] 1.819624
The estimated probability that a family with annual income of $50,000 and an oldest car of 3 years will purchase a new car next year is 0.609.
new <- data.frame(X1 = 50, X2 = 3)
predict(glm(Y~X1+X2,family=binomial(link=logit),data=ch14), new, type="response")
## 1
## 0.6090245
The joint confidence interval for the family income odds ratio exp(20\(\beta_1\)) for families whose income differ by 20 thousand dollars and for the age of the oldest family automobile odds ratio exp(2\(\beta_2\)) is (1.290263, 11.6401) and (1.290263, 11.6401) respectively.
This simply means we are 95% confident that both \(\beta_1\) and \(\beta_2\) will fall in the above intervals.
z <- 1.96 #z for 95% CI
sb1 <- 0.02806
sb2 <- 0.39007
exp(20*(0.067733+1.96*(0.02806))) #11.64
## [1] 11.64192
exp(20*(0.067733-1.96*(0.02806))) #1.29
## [1] 1.290085
### Wald CI for exp(20beta1)
exp(20*confint.default(full,2,level=0.95))
## 2.5 % 97.5 %
## X1 1.290263 11.6401
exp(2*(0.5986+1.96*(0.39007))) #15.27587
## [1] 15.27587
exp(2*(0.5986-1.96*(0.39007))) #0.7175
## [1] 0.7175774
### Wald CI for exp(20beta1)
exp(2*confint.default(full,3,level=0.95))
## 2.5 % 97.5 %
## X2 0.7176559 15.27613
Hypothesis Test: \(H_0\) : \(\beta_2\) = 0 \(H_a\) : $_2 \(\neq 0\)
Decision Rule: If |z*| \(\leq\) 1.96, we fail to reject the \(H_0\).
Conclusion: As |z*| = 1.53 < 1.96, we fail to reject \(H_0\).
From our summary of the full model for X2, we know the p-value is 0.125.
b2 <- 0.59863
sb2 <- 0.39007
z_star <- b2/sb2
z_star
## [1] 1.534673
wald <- z_star**2
wald #2.355
## [1] 2.355222
z <- 1.96
Hypothesis Test: \(H_0\) : \(\beta_2\) = 0 \(H_a\) : \(\beta_2\) \(\neq 0\)
Decision Rule: If \(G^2\) \(\leq\) \(\chi^2\)(0.95,1) = 3.841459, fail to reject \(H_0\).
Conclusion: \(G^2\) = 2.514 \(\leq\) \(\chi^2\)(0.95,1) = 3.841459.
Thus, we fail to reject the \(H_0\). The p-value is 0.1058638.
The full model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}\)
The reduced model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(1.98079 + 0.04342X1)}\)
The results here are similar to what was obtained in part B. We still conclude \(H_0\) but the pvalue is smaller now that we have dropped a variable.
red <- glm(Y~X1,family=binomial(link=logit),data=ch14)
summary(red)
##
## Call:
## glm(formula = Y ~ X1, family = binomial(link = logit), data = ch14)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.5883 -0.8430 -0.7121 0.9262 1.7688
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.98079 0.85720 -2.311 0.0208 *
## X1 0.04342 0.02011 2.159 0.0308 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 44.987 on 32 degrees of freedom
## Residual deviance: 39.305 on 31 degrees of freedom
## AIC: 43.305
##
## Number of Fisher Scoring iterations: 4
Gsq2 <- -2*(logLik(red)-logLik(full))
Gsq2 #2.614
## 'log Lik.' 2.614905 (df=2)
qchisq(0.95,1) #3.8414
## [1] 3.841459
1-pchisq(Gsq2,1)
## 'log Lik.' 0.1058638 (df=2)
Hypothesis Test: \(H_0\) : \(\beta_3\) = \(\beta_4\) = \(\beta_5\) = 0 \(H_a\) : at least 1 \(\beta_k\) \(\neq 0\) where k = 3,4,5.
Decision Rule: If \(|G^2|\) \(\leq\) \(\chi^2\)(0.95,3) = 7.814728, fail to reject \(H_0\).
Conclusion: \(|G^2|\) = 2.4369 \(\leq\) 7.814728. So, we fail to reject the \(H_0\).
The p-value is 0.486 for this test.
The full model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}\)
The reduced model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(-0.019395 + 0.168179 X1 + 0.0603X2 - 0.0017X1^2 + 0.085X2^2 + 0.0408X1X2)}\)
red2 <- glm(Y~X1+X2+I(X1^2)+I(X2^2)+X1:X2,family=binomial(link=logit),data=ch14)
summary(red2)
##
## Call:
## glm(formula = Y ~ X1 + X2 + I(X1^2) + I(X2^2) + X1:X2, family = binomial(link = logit),
## data = ch14)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.8342 -0.8221 -0.5545 0.9072 1.8512
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.019395 6.819616 0.003 0.998
## X1 -0.168179 0.241812 -0.695 0.487
## X2 -0.060251 2.492918 -0.024 0.981
## I(X1^2) 0.001720 0.002161 0.796 0.426
## I(X2^2) -0.085282 0.257060 -0.332 0.740
## X1:X2 0.040801 0.038588 1.057 0.290
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 44.987 on 32 degrees of freedom
## Residual deviance: 34.253 on 27 degrees of freedom
## AIC: 46.253
##
## Number of Fisher Scoring iterations: 6
Gsq2 <- -2*(logLik(red2)-logLik(full))
Gsq2 #2.4369
## 'log Lik.' -2.436953 (df=6)
qchisq(0.95,3) #7.814
## [1] 7.814728
1-pchisq(2.4369,3) #pvalue 0.486
## [1] 0.4868027