Problem 14.13

Part A.

$b_0$ = -4.73931 $b_1$ = 0.06773 $b_2$ = 0.5986

The response function is as follows:

$\hat{\pi}$ = $\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}$

ch14 <- read.table("CH14PR13.txt")
head(ch14)

##   V1 V2 V3
## 1  0 32  3
## 2  0 45  2
## 3  1 60  2
## 4  0 53  1
## 5  0 25  4
## 6  1 68  1

names(ch14) <- c("Y", "X1", "X2")

full <- glm(Y~X1+X2,family=binomial(link=logit),data=ch14)
summary(full)

## 
## Call:
## glm(formula = Y ~ X1 + X2, family = binomial(link = logit), data = ch14)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.6189  -0.8949  -0.5880   0.9653   2.0846  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -4.73931    2.10195  -2.255   0.0242 *
## X1           0.06773    0.02806   2.414   0.0158 *
## X2           0.59863    0.39007   1.535   0.1249  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 44.987  on 32  degrees of freedom
## Residual deviance: 36.690  on 30  degrees of freedom
## AIC: 42.69
## 
## Number of Fisher Scoring iterations: 4

Part B.

$b_1$ : Holding the other predictors constant, it is expected that the log odds increases by 0.06773 for addition one unit of $X_i$

Also, the odds ratio increases by 7%.

$b_2$ : Holding the other predictors constant, it is expected that the log odds increases by 0.59863 for addition one unit of $X_i$

Also, the odds ratio increases by 81.9%.

exp(0.06773) #1.07

## [1] 1.070076

exp(0.59863) #1.819

## [1] 1.819624

Part C.

The estimated probability that a family with annual income of $50,000 and an oldest car of 3 years will purchase a new car next year is 0.609.

new <- data.frame(X1 = 50, X2 = 3)
predict(glm(Y~X1+X2,family=binomial(link=logit),data=ch14), new, type="response")

##         1 
## 0.6090245

Problem 14.19

Part A.

The joint confidence interval for the family income odds ratio exp(20$\beta_1$) for families whose income differ by 20 thousand dollars and for the age of the oldest family automobile odds ratio exp(2$\beta_2$) is (1.290263, 11.6401) and (1.290263, 11.6401) respectively.

This simply means we are 95% confident that both $\beta_1$ and $\beta_2$ will fall in the above intervals.

z <- 1.96 #z for 95% CI
sb1 <- 0.02806
sb2 <- 0.39007

exp(20*(0.067733+1.96*(0.02806))) #11.64

## [1] 11.64192

exp(20*(0.067733-1.96*(0.02806))) #1.29

## [1] 1.290085

### Wald CI for exp(20beta1)
exp(20*confint.default(full,2,level=0.95))

##       2.5 %  97.5 %
## X1 1.290263 11.6401

exp(2*(0.5986+1.96*(0.39007))) #15.27587

## [1] 15.27587

exp(2*(0.5986-1.96*(0.39007))) #0.7175

## [1] 0.7175774

### Wald CI for exp(20beta1)
exp(2*confint.default(full,3,level=0.95))

##        2.5 %   97.5 %
## X2 0.7176559 15.27613

Part B.

Hypothesis Test: $H_0$ : $\beta_2$ = 0 $H_a$ : $_2 $\neq 0$

Decision Rule: If |z*| $\leq$ 1.96, we fail to reject the $H_0$.

Conclusion: As |z*| = 1.53 < 1.96, we fail to reject $H_0$.

From our summary of the full model for X2, we know the p-value is 0.125.

b2 <- 0.59863
sb2 <- 0.39007
z_star <- b2/sb2
z_star

## [1] 1.534673

wald <- z_star**2
wald #2.355

## [1] 2.355222

z <- 1.96

Part C.

Hypothesis Test: $H_0$ : $\beta_2$ = 0 $H_a$ : $\beta_2$ $\neq 0$

Decision Rule: If $G^2$ $\leq$ $\chi^2$(0.95,1) = 3.841459, fail to reject $H_0$.

Conclusion: $G^2$ = 2.514 $\leq$ $\chi^2$(0.95,1) = 3.841459.

Thus, we fail to reject the $H_0$. The p-value is 0.1058638.

The full model is as follows: $\hat{\pi}$ = $\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}$

The reduced model is as follows: $\hat{\pi}$ = $\frac{1}{1 + exp(1.98079 + 0.04342X1)}$

The results here are similar to what was obtained in part B. We still conclude $H_0$ but the pvalue is smaller now that we have dropped a variable.

red <- glm(Y~X1,family=binomial(link=logit),data=ch14)
summary(red)

## 
## Call:
## glm(formula = Y ~ X1, family = binomial(link = logit), data = ch14)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.5883  -0.8430  -0.7121   0.9262   1.7688  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -1.98079    0.85720  -2.311   0.0208 *
## X1           0.04342    0.02011   2.159   0.0308 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 44.987  on 32  degrees of freedom
## Residual deviance: 39.305  on 31  degrees of freedom
## AIC: 43.305
## 
## Number of Fisher Scoring iterations: 4

Gsq2 <- -2*(logLik(red)-logLik(full))
Gsq2 #2.614

## 'log Lik.' 2.614905 (df=2)

qchisq(0.95,1)  #3.8414

## [1] 3.841459

1-pchisq(Gsq2,1)

## 'log Lik.' 0.1058638 (df=2)

Part D.

Hypothesis Test: $H_0$ : $\beta_3$ = $\beta_4$ = $\beta_5$ = 0 $H_a$ : at least 1 $\beta_k$ $\neq 0$ where k = 3,4,5.

Decision Rule: If $|G^2|$ $\leq$ $\chi^2$(0.95,3) = 7.814728, fail to reject $H_0$.

Conclusion: $|G^2|$ = 2.4369 $\leq$ 7.814728. So, we fail to reject the $H_0$.

The p-value is 0.486 for this test.

The full model is as follows: $\hat{\pi}$ = $\frac{1}{1 + exp(4.73931 - 0.67733X1 - 0.5986X2)}$

The reduced model is as follows: $\hat{\pi}$ = $\frac{1}{1 + exp(-0.019395 + 0.168179 X1 + 0.0603X2 - 0.0017X1^2 + 0.085X2^2 + 0.0408X1X2)}$

red2 <- glm(Y~X1+X2+I(X1^2)+I(X2^2)+X1:X2,family=binomial(link=logit),data=ch14)
summary(red2)

## 
## Call:
## glm(formula = Y ~ X1 + X2 + I(X1^2) + I(X2^2) + X1:X2, family = binomial(link = logit), 
##     data = ch14)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8342  -0.8221  -0.5545   0.9072   1.8512  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.019395   6.819616   0.003    0.998
## X1          -0.168179   0.241812  -0.695    0.487
## X2          -0.060251   2.492918  -0.024    0.981
## I(X1^2)      0.001720   0.002161   0.796    0.426
## I(X2^2)     -0.085282   0.257060  -0.332    0.740
## X1:X2        0.040801   0.038588   1.057    0.290
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 44.987  on 32  degrees of freedom
## Residual deviance: 34.253  on 27  degrees of freedom
## AIC: 46.253
## 
## Number of Fisher Scoring iterations: 6

Gsq2 <- -2*(logLik(red2)-logLik(full))
Gsq2 #2.4369

## 'log Lik.' -2.436953 (df=6)

qchisq(0.95,3)  #7.814

## [1] 7.814728

1-pchisq(2.4369,3) #pvalue 0.486

## [1] 0.4868027

STAT 408 11

Kajal Chokshi

12/4/2018

Problem 14.13

Part A.

Part B.

Part C.

Problem 14.19

Part A.

Part B.

Part C.

Part D.