Consider the space shuttle data ?shuttle in the MASS library. Consider modeling the use of the autolander as the outcome (variable name use). Fit a logistic regression model with autolander (variable auto) use (labeled as “auto” 1) versus not (0) as predicted by wind sign (variable wind). Give the estimated odds ratio for autolander use comparing head winds, labeled as “head” in the variable headwind (numerator) to tail winds (denominator).
library(MASS)
head(shuttle)
## stability error sign wind magn vis use
## 1 xstab LX pp head Light no auto
## 2 xstab LX pp head Medium no auto
## 3 xstab LX pp head Strong no auto
## 4 xstab LX pp tail Light no auto
## 5 xstab LX pp tail Medium no auto
## 6 xstab LX pp tail Strong no auto
shuttle2<-shuttle
shuttle2$use2<-as.numeric(shuttle2$use=='auto')
fit<-glm(use2 ~ factor(wind) - 1, family = binomial, data = shuttle2)
fit
##
## Call: glm(formula = use2 ~ factor(wind) - 1, family = binomial, data = shuttle2)
##
## Coefficients:
## factor(wind)head factor(wind)tail
## 0.2513 0.2831
##
## Degrees of Freedom: 256 Total (i.e. Null); 254 Residual
## Null Deviance: 354.9
## Residual Deviance: 350.3 AIC: 354.3
fit<-glm(use2 ~ factor(wind) - 1, family = binomial, data = shuttle2)
shuttle2<-shuttle
shuttle2$use2<-as.numeric(shuttle2$use=='auto')
fit<-glm(use2 ~ factor(wind) - 1, family = binomial, data = shuttle2)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## factor(wind)head 0.2513144 0.1781742 1.410499 0.1583925
## factor(wind)tail 0.2831263 0.1785510 1.585689 0.1128099
exp(coef(fit))
## factor(wind)head factor(wind)tail
## 1.285714 1.327273
1.286 / 1.327
## [1] 0.9691032
# ou
exp(coef(fit))[1] / exp(coef(fit))[2]
## factor(wind)head
## 0.9686888
exp(cbind(OddsRatio = coef(fit), confint(fit)))
## Waiting for profiling to be done...
## OddsRatio 2.5 % 97.5 %
## factor(wind)head 1.285714 0.9082416 1.829382
## factor(wind)tail 1.327273 0.9372106 1.890548
Consider the previous problem. Give the estimated odds ratio for autoloader use comparing head winds (numerator) to tail winds (denominator) adjusting for wind strength from the variable mag
fit<-glm(use2 ~ factor(wind) + factor(magn) - 1, family = binomial, data = shuttle2)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## factor(wind)head 3.635093e-01 0.2840608 1.279688e+00 0.2006547
## factor(wind)tail 3.955180e-01 0.2843987 1.390717e+00 0.1643114
## factor(magn)Medium -1.009525e-15 0.3599481 -2.804642e-15 1.0000000
## factor(magn)Out -3.795136e-01 0.3567709 -1.063746e+00 0.2874438
## factor(magn)Strong -6.441258e-02 0.3589560 -1.794442e-01 0.8575889
exp(coef(fit))
## factor(wind)head factor(wind)tail factor(magn)Medium
## 1.4383682 1.4851533 1.0000000
## factor(magn)Out factor(magn)Strong
## 0.6841941 0.9376181
exp(cbind(OddsRatio = coef(fit), confint(fit)))
## Waiting for profiling to be done...
## OddsRatio 2.5 % 97.5 %
## factor(wind)head 1.4383682 0.8281020 2.534603
## factor(wind)tail 1.4851533 0.8548293 2.619816
## factor(magn)Medium 1.0000000 0.4928642 2.028956
## factor(magn)Out 0.6841941 0.3380374 1.373844
## factor(magn)Strong 0.9376181 0.4627029 1.897218
exp(cbind(OddsRatio = coef(fit), confint(fit)))[1] / exp(cbind(OddsRatio = coef(fit), confint(fit)))[2]
## Waiting for profiling to be done...
## Waiting for profiling to be done...
## [1] 0.9684981
fit<-glm(use2 ~ factor(wind), family = binomial, data = shuttle2)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.25131443 0.1781742 1.4104987 0.1583925
## factor(wind)tail 0.03181183 0.2522429 0.1261159 0.8996402
fit<-glm(1 - use2 ~ factor(wind), family = binomial, data = shuttle2)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.25131443 0.1781742 -1.4104987 0.1583925
## factor(wind)tail -0.03181183 0.2522429 -0.1261159 0.8996402
The coefficients reverse their signs.
Consider the insect spray data InsectSprays . Fit a Poisson model using spray as a factor level. Report the estimated relative rate comapring spray A (numerator) to spray B (denominator).
fit <- glm(count ~ relevel(spray, "B"), data = InsectSprays, family = poisson)
exp(coef(fit))[2]
## relevel(spray, "B")A
## 0.9456522
Consider a Poisson glm with an offset, t. So, for example, a model of the form glm(count ~ x + offset(t), family = poisson) where x is a factor variable comparing a treatment (1) to a control (0) and t is the natural log of a monitoring time. What is impact of the coefficient for x if we fit the model glm(count ~ x + offset(t2), family = poisson) where 2 <- log(10) + t? In other words, what happens to the coefficients if we change the units of the offset variable. (Note, adding log(10) on the log scale is multiplying by 10 on the original scale.)
fit<-glm(count ~ factor(spray), family = poisson,data=InsectSprays,offset = log(count + 1))
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.066691374 0.0758098 -0.87971965 0.3790112
## factor(spray)B 0.003512473 0.1057445 0.03321659 0.9735019
## factor(spray)C -0.325350713 0.2138857 -1.52114274 0.1282240
## factor(spray)D -0.118451059 0.1506528 -0.78625173 0.4317200
## factor(spray)E -0.184623054 0.1719205 -1.07388635 0.2828736
## factor(spray)F 0.008422466 0.1036683 0.08124434 0.9352476
fit2<-glm(count ~ factor(spray), family = poisson,data=InsectSprays,offset = log(10)+log(count+1))
summary(fit2)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.369276467 0.0758098 -31.25290307 2.038695e-214
## factor(spray)B 0.003512473 0.1057445 0.03321659 9.735019e-01
## factor(spray)C -0.325350713 0.2138857 -1.52114274 1.282240e-01
## factor(spray)D -0.118451059 0.1506528 -0.78625173 4.317200e-01
## factor(spray)E -0.184623054 0.1719205 -1.07388635 2.828736e-01
## factor(spray)F 0.008422466 0.1036683 0.08124434 9.352476e-01
fit<-glm(count ~ factor(spray) + offset(log(count+1)), family = poisson,data=InsectSprays)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.066691374 0.0758098 -0.87971965 0.3790112
## factor(spray)B 0.003512473 0.1057445 0.03321659 0.9735019
## factor(spray)C -0.325350713 0.2138857 -1.52114274 0.1282240
## factor(spray)D -0.118451059 0.1506528 -0.78625173 0.4317200
## factor(spray)E -0.184623054 0.1719205 -1.07388635 0.2828736
## factor(spray)F 0.008422466 0.1036683 0.08124434 0.9352476
fit2<-glm(count ~ factor(spray) + offset(log(10)+log(count+1)), family = poisson,data=InsectSprays)
summary(fit2)$coef
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.369276467 0.0758098 -31.25290307 2.038695e-214
## factor(spray)B 0.003512473 0.1057445 0.03321659 9.735019e-01
## factor(spray)C -0.325350713 0.2138857 -1.52114274 1.282240e-01
## factor(spray)D -0.118451059 0.1506528 -0.78625173 4.317200e-01
## factor(spray)E -0.184623054 0.1719205 -1.07388635 2.828736e-01
## factor(spray)F 0.008422466 0.1036683 0.08124434 9.352476e-01
Consider the data
x <- -5:5
y <- c(5.12, 3.93, 2.67, 1.87, 0.52, 0.08, 0.93, 2.05, 2.54, 3.87, 4.97)
Using a knot point at 0, fit a linear model that looks like a hockey stick with two lines meeting at x=0. Include an intercept term, x and the knot point term. What is the estimated slope of the line after 0?
x <- -5:5
y <- c(5.12, 3.93, 2.67, 1.87, 0.52, 0.08, 0.93, 2.05, 2.54, 3.87, 4.97)
knots<-c(0)
splineTerms<-sapply(knots,function(knot) (x>knot)*(x-knot))
xmat<-cbind(1,x,splineTerms)
fit<-lm(y~xmat-1)
yhat<-predict(fit)
summary(fit)$coef
## Estimate Std. Error t value Pr(>|t|)
## xmat -0.1825806 0.13557812 -1.346682 2.149877e-01
## xmatx -1.0241584 0.04805280 -21.313188 2.470198e-08
## xmat 2.0372258 0.08574713 23.758531 1.048711e-08
(yhat[10]-yhat[6])/4
## 10
## 1.013067
plot(x,y)
lines(x,yhat,col="red")