Problem 14.13

Part A.

\(b_0\) = -4.73931 \(b_1\) = 0.06773 \(b_2\) = 0.5986

The response function is as follows:

\(\hat{\pi}\) = \(\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}\)

ch14 <- read.table("CH14PR13.txt")
head(ch14)
##   V1 V2 V3
## 1  0 32  3
## 2  0 45  2
## 3  1 60  2
## 4  0 53  1
## 5  0 25  4
## 6  1 68  1
names(ch14) <- c("Y", "X1", "X2")

full <- glm(Y~X1+X2,family=binomial(link=logit),data=ch14)
summary(full)
## 
## Call:
## glm(formula = Y ~ X1 + X2, family = binomial(link = logit), data = ch14)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6189  -0.8949  -0.5880   0.9653   2.0846  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -4.73931    2.10195  -2.255   0.0242 *
## X1           0.06773    0.02806   2.414   0.0158 *
## X2           0.59863    0.39007   1.535   0.1249  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 44.987  on 32  degrees of freedom
## Residual deviance: 36.690  on 30  degrees of freedom
## AIC: 42.69
## 
## Number of Fisher Scoring iterations: 4

Part B.

\(b_1\) : Holding the other predictors constant, it is expected that the log odds increases by 0.06773 for addition one unit of \(X_i\)

Also, the odds ratio increases by 7%.

\(b_2\) : Holding the other predictors constant, it is expected that the log odds increases by 0.59863 for addition one unit of \(X_i\)

Also, the odds ratio increases by 81.9%.

exp(0.06773) #1.07
## [1] 1.070076
exp(0.59863) #1.819
## [1] 1.819624

Part C.

The estimated probability that a family with annual income of $50,000 and an oldest car of 3 years will purchase a new car next year is 0.609.

new <- data.frame(X1 = 50, X2 = 3)
predict(glm(Y~X1+X2,family=binomial(link=logit),data=ch14), new, type="response")
##         1 
## 0.6090245

Problem 14.19

Part A.

The joint confidence interval for the family income odds ratio exp(20\(\beta_1\)) for families whose income differ by 20 thousand dollars and for the age of the oldest family automobile odds ratio exp(2\(\beta_2\)) is (1.290263, 11.6401) and (1.290263, 11.6401) respectively.

This simply means we are 95% confident that both \(\beta_1\) and \(\beta_2\) will fall in the above intervals.

z <- 1.96 #z for 95% CI
sb1 <- 0.02806
sb2 <- 0.39007

exp(20*(0.067733+1.96*(0.02806))) #11.64
## [1] 11.64192
exp(20*(0.067733-1.96*(0.02806))) #1.29
## [1] 1.290085
### Wald CI for exp(20beta1)
exp(20*confint.default(full,2,level=0.95))
##       2.5 %  97.5 %
## X1 1.290263 11.6401
exp(2*(0.5986+1.96*(0.39007))) #15.27587
## [1] 15.27587
exp(2*(0.5986-1.96*(0.39007))) #0.7175
## [1] 0.7175774
### Wald CI for exp(20beta1)
exp(2*confint.default(full,3,level=0.95))
##        2.5 %   97.5 %
## X2 0.7176559 15.27613

Part B.

Hypothesis Test: \(H_0\) : \(\beta_2\) = 0 \(H_a\) : $_2 \(\neq 0\)

Decision Rule: If |z*| \(\leq\) 1.96, we fail to reject the \(H_0\).

Conclusion: As |z*| = 1.53 < 1.96, we fail to reject \(H_0\).

From our summary of the full model for X2, we know the p-value is 0.125.

b2 <- 0.59863
sb2 <- 0.39007
z_star <- b2/sb2
z_star 
## [1] 1.534673
wald <- z_star**2
wald #2.355
## [1] 2.355222
z <- 1.96

Part C.

Hypothesis Test: \(H_0\) : \(\beta_2\) = 0 \(H_a\) : \(\beta_2\) \(\neq 0\)

Decision Rule: If \(G^2\) \(\leq\) \(\chi^2\)(0.95,1) = 3.841459, fail to reject \(H_0\).

Conclusion: \(G^2\) = 2.514 \(\leq\) \(\chi^2\)(0.95,1) = 3.841459.

Thus, we fail to reject the \(H_0\). The p-value is 0.1058638.

The full model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}\)

The reduced model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(1.98079 + 0.04342X1)}\)

The results here are similar to what was obtained in part B. We still conclude \(H_0\) but the pvalue is smaller now that we have dropped a variable.

red <- glm(Y~X1,family=binomial(link=logit),data=ch14)
summary(red)
## 
## Call:
## glm(formula = Y ~ X1, family = binomial(link = logit), data = ch14)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.5883  -0.8430  -0.7121   0.9262   1.7688  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -1.98079    0.85720  -2.311   0.0208 *
## X1           0.04342    0.02011   2.159   0.0308 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 44.987  on 32  degrees of freedom
## Residual deviance: 39.305  on 31  degrees of freedom
## AIC: 43.305
## 
## Number of Fisher Scoring iterations: 4
Gsq2 <- -2*(logLik(red)-logLik(full))
Gsq2 #2.614
## 'log Lik.' 2.614905 (df=2)
qchisq(0.95,1)  #3.8414
## [1] 3.841459
1-pchisq(Gsq2,1)
## 'log Lik.' 0.1058638 (df=2)

Part D.

Hypothesis Test: \(H_0\) : \(\beta_3\) = \(\beta_4\) = \(\beta_5\) = 0 \(H_a\) : at least 1 \(\beta_k\) \(\neq 0\) where k = 3,4,5.

Decision Rule: If \(|G^2|\) \(\leq\) \(\chi^2\)(0.95,3) = 7.814728, fail to reject \(H_0\).

Conclusion: \(|G^2|\) = 2.4369 \(\leq\) 7.814728. So, we fail to reject the \(H_0\).

The p-value is 0.486 for this test.

The full model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}\)

The reduced model is as follows: \(\hat{\pi}\) = \(\frac{1}{1 + exp(-0.019395 + 0.168179 X1 + 0.0603X2 - 0.0017X1^2 + 0.085X2^2 + 0.0408X1X2)}\)

red2 <- glm(Y~X1+X2+I(X1^2)+I(X2^2)+X1:X2,family=binomial(link=logit),data=ch14)
summary(red2)
## 
## Call:
## glm(formula = Y ~ X1 + X2 + I(X1^2) + I(X2^2) + X1:X2, family = binomial(link = logit), 
##     data = ch14)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8342  -0.8221  -0.5545   0.9072   1.8512  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.019395   6.819616   0.003    0.998
## X1          -0.168179   0.241812  -0.695    0.487
## X2          -0.060251   2.492918  -0.024    0.981
## I(X1^2)      0.001720   0.002161   0.796    0.426
## I(X2^2)     -0.085282   0.257060  -0.332    0.740
## X1:X2        0.040801   0.038588   1.057    0.290
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 44.987  on 32  degrees of freedom
## Residual deviance: 34.253  on 27  degrees of freedom
## AIC: 46.253
## 
## Number of Fisher Scoring iterations: 6
Gsq2 <- -2*(logLik(red2)-logLik(full))
Gsq2 #2.4369
## 'log Lik.' -2.436953 (df=6)
qchisq(0.95,3)  #7.814
## [1] 7.814728
1-pchisq(2.4369,3) #pvalue 0.486
## [1] 0.4868027