Data=read.table("http://users.stat.ufl.edu/~rrandles/sta4210/Rclassnotes/data/textdatasets/KutnerData/Chapter%2014%20Data%20Sets/CH14PR13.txt")
names(Data) = c("Y", "X1", "X2") # X1: annual family income (thousands of dollars), X2: current age of oldest family automobile (years), Y: 1 = purchased a new car
n=nrow(Data)
car_logit <- glm(Y~X1+X2,family=binomial(link=logit),data=Data)
coef(car_logit)
## (Intercept) X1 X2
## -4.73930950 0.06773256 0.59863166
\[\text{Logit} = -4.739 + 0.0677\,X_1 + 0.5986\,X_2,\] where \(X_1\) is annual family income (thousands of dollars) and \(X_2\) is the current age of the oldest family automobile (years).
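Equivalently, the fitted response function implied by these estimates (the inverse logit of the linear predictor, used for the prediction below) is \[\hat{\pi} = \frac{\exp(-4.739 + 0.0677\,X_1 + 0.5986\,X_2)}{1 + \exp(-4.739 + 0.0677\,X_1 + 0.5986\,X_2)}.\]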
expb1 = exp(coef(car_logit)[2])
expb2 = exp(coef(car_logit)[3])
c(expb1,expb2)
## X1 X2
## 1.070079 1.819627
The odds of purchasing a car are estimated to increase by about 7% for each one-thousand-dollar increase in annual family income, holding the age of the oldest automobile constant.
The odds of purchasing a car are estimated to increase by about 82% for each additional year of age of the oldest family automobile, holding income constant.
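These percentages come straight from the estimated odds ratios; a one-line check using the objects defined above:
round(100*(c(expb1, expb2) - 1), 1) # percent change in odds per one-unit increase in X1 and X2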
exp(predict(car_logit,newdata=data.frame(X1=50,X2=3)))/(1+exp(predict(car_logit,newdata=data.frame(X1=50,X2=3))))
## 1
## 0.6090245
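The same estimate can be obtained more directly by letting `predict()` apply the inverse logit itself (a quick check, reproducing the 0.609 above):
predict(car_logit, newdata = data.frame(X1 = 50, X2 = 3), type = "response") # estimated probability of purchase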
AlphaF=.1 #90% Family Confidence Interval
exp(20*confint.default(car_logit,2,level=1-AlphaF/2))
## 2.5 % 97.5 %
## X1 1.290263 11.6401
exp(2*confint.default(car_logit,3,level=1-AlphaF/2))
## 2.5 % 97.5 %
## X2 0.7176559 15.27613
With at least 90% family confidence (Bonferroni, each interval at 95%), both intervals jointly capture the true odds ratios: roughly 1.29 to 11.64 for families whose incomes differ by 20 thousand dollars, and roughly 0.72 to 15.28 for families whose oldest automobiles differ in age by 2 years.
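Both Bonferroni intervals can also be produced in a single call; a sketch using the same model object, with row-wise multipliers of 20 (thousands of dollars) for X1 and 2 (years) for X2:
ci <- confint.default(car_logit, c("X1", "X2"), level = 1 - AlphaF/2) # each interval at 95% for 90% family confidence
exp(ci * c(20, 2)) # scale row X1 by 20 and row X2 by 2, then exponentiate to get odds ratios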
summary(car_logit)
##
## Call:
## glm(formula = Y ~ X1 + X2, family = binomial(link = logit), data = Data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6189 -0.8949 -0.5880 0.9653 2.0846
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.73931 2.10195 -2.255 0.0242 *
## X1 0.06773 0.02806 2.414 0.0158 *
## X2 0.59863 0.39007 1.535 0.1249
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 44.987 on 32 degrees of freedom
## Residual deviance: 36.690 on 30 degrees of freedom
## AIC: 42.69
##
## Number of Fisher Scoring iterations: 4
With z = 1.535 and a p-value of .12 > \(\alpha\) = .05, we fail to reject the null hypothesis \(H_0: \beta_2 = 0\). We cannot conclude that X2 is significant given X1, so it can be dropped from the model.
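The reported z and p-value can be reproduced directly from the coefficient table (a small check, not part of the original code):
b2 <- coef(summary(car_logit))["X2", "Estimate"]
sb2 <- coef(summary(car_logit))["X2", "Std. Error"]
c(z = b2/sb2, p.value = 2*pnorm(-abs(b2/sb2))) # two-sided Wald test of H0: beta_2 = 0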
red <- glm(Y~X1,family=binomial(link=logit),data=Data)
Gsq2 <- -2*(logLik(red)-logLik(car_logit));Gsq2
## 'log Lik.' 2.614905 (df=2)
qchisq(0.95,1)
## [1] 3.841459
1-pchisq(Gsq2,1)
## 'log Lik.' 0.1058638 (df=2)
With \(\chi^2 = 2.61\) on 1 df and a p-value of .11 > \(\alpha\) = .05, we fail to reject the null hypothesis. The likelihood-ratio test agrees with the Wald test: we cannot conclude that X2 contributes significantly given X1, so it can be dropped from the model.
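The same drop-in-deviance test can be run with `anova()`; a sketch of the equivalent call (output omitted):
anova(red, car_logit, test = "Chisq") # deviance difference 2.615 on 1 df, matching the G^2 above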
full_new=glm(Y~X1+X2+I(X1^2)+I(X2^2)+X1:X2,family=binomial(link=logit),data=Data)
Gsq2 <- -2*(logLik(car_logit)-logLik(full_new));Gsq2
## 'log Lik.' 2.436953 (df=3)
qchisq(0.95,3) # df = 6 parameters in the second-order model - 3 in the first-order model
## [1] 7.814728
1-pchisq(Gsq2,3)
## 'log Lik.' 0.4867928 (df=3)
With \(\chi^2 = 2.44 < 7.81\) and a p-value of .487 > \(\alpha\) = .05, we fail to reject the null hypothesis. We cannot conclude that \(X_1^2\), \(X_2^2\), or the interaction \(X_1 X_2\) contribute significantly, so they can be left out of the model.
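As with the previous comparison, this test can also be obtained in one call (sketch, output omitted):
anova(car_logit, full_new, test = "Chisq") # deviance difference 2.437 on 3 df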