Consider the space shuttle data ?shuttle in the MASS library. Consider modeling the use of the autolander as the outcome (variable name use). Fit a logistic regression model with autolander (variable auto) use (labeled as “auto” 1) versus not (0) as predicted by wind sign (variable wind). Give the estimated odds ratio for autolander use comparing head winds, labeled as “head” in the variable headwind (numerator) to tail winds (denominator).
library(MASS)
?shuttle
shuttle$use.binary <- as.integer(shuttle$use == "auto")
fit <- glm(use.binary ~ wind - 1, data = shuttle, family = binomial)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## windhead 0.2513144 0.1781742 1.410499 0.1583925
## windtail 0.2831263 0.1785510 1.585689 0.1128099
unname(exp(coef(fit))[1]/exp(coef(fit))[2]) #Estimated Odds ratio
## [1] 0.9686888
#Which mean windhead is 0.9687 times than the windtail to be auto , so windtail is more likely to be auto.
Consider the previous problem. Give the estimated odds ratio for autolander use comparing head winds (numerator) to tail winds (denominator) adjusting for wind strength from the variable magn.
library(MASS)
shuttle$use.binary <- as.integer(shuttle$use == "auto")
fit <- glm(use.binary ~ wind+magn - 1, data = shuttle, family = binomial)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## windhead 3.635093e-01 0.2840608 1.279688e+00 0.2006547
## windtail 3.955180e-01 0.2843987 1.390717e+00 0.1643114
## magnMedium -1.009525e-15 0.3599481 -2.804642e-15 1.0000000
## magnOut -3.795136e-01 0.3567709 -1.063746e+00 0.2874438
## magnStrong -6.441258e-02 0.3589560 -1.794442e-01 0.8575889
exp(coef(fit))
## windhead windtail magnMedium magnOut magnStrong
## 1.4383682 1.4851533 1.0000000 0.6841941 0.9376181
exp(cbind(OddsRatio=coef(fit),confint(fit)))
## Waiting for profiling to be done...
## OddsRatio 2.5 % 97.5 %
## windhead 1.4383682 0.8281020 2.534603
## windtail 1.4851533 0.8548293 2.619816
## magnMedium 1.0000000 0.4928642 2.028956
## magnOut 0.6841941 0.3380374 1.373844
## magnStrong 0.9376181 0.4627029 1.897218
1.4384/1.4852
## [1] 0.9684891
If you fit a logistic regression model to a binary variable, for example use of the autolander, then fit a logistic regression model for one minus the outcome (not using the autolander) what happens to the coefficients?
library(MASS)
shuttle$use.binary<- as.integer(shuttle$use=="auto")
fit1<- glm(1-use.binary~wind-1,data=shuttle,family=binomial)
summary(fit1)$coef
## Estimate Std. Error z value Pr(>|z|)
## windhead -0.2513144 0.1781742 -1.410499 0.1583925
## windtail -0.2831263 0.1785510 -1.585689 0.1128099
fit<- glm(use.binary~wind-1,data=shuttle,family=binomial)
summary(fit)$coef
## Estimate Std. Error z value Pr(>|z|)
## windhead 0.2513144 0.1781742 1.410499 0.1583925
## windtail 0.2831263 0.1785510 1.585689 0.1128099
Consider the insect spray data InsectSprays. Fit a Poisson model using spray as a factor level. Report the estimated relative rate comapring spray A (numerator) to spray B (denominator).
fit<- glm(count~factor(spray)-1,family="poisson",data=InsectSprays)
summary(fit)
##
## Call:
## glm(formula = count ~ factor(spray) - 1, family = "poisson",
## data = InsectSprays)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3852 -0.8876 -0.1482 0.6063 2.6922
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## factor(spray)A 2.67415 0.07581 35.274 < 2e-16 ***
## factor(spray)B 2.73003 0.07372 37.032 < 2e-16 ***
## factor(spray)C 0.73397 0.20000 3.670 0.000243 ***
## factor(spray)D 1.59263 0.13019 12.233 < 2e-16 ***
## factor(spray)E 1.25276 0.15430 8.119 4.71e-16 ***
## factor(spray)F 2.81341 0.07071 39.788 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 2264.808 on 72 degrees of freedom
## Residual deviance: 98.329 on 66 degrees of freedom
## AIC: 376.59
##
## Number of Fisher Scoring iterations: 5
exp(coef(fit))
## factor(spray)A factor(spray)B factor(spray)C factor(spray)D factor(spray)E
## 14.500000 15.333333 2.083333 4.916667 3.500000
## factor(spray)F
## 16.666667
14.5/15.33333
## [1] 0.9456524
Consider a Poisson glm with an offset, t. So, for example, a model of the form glm(count ~ x + offset(t), family = poisson) where x is a factor variable comparing a treatment (1) to a control (0) and t is the natural log of a monitoring time. What is impact of the coefficient for x if we fit the model glm(count ~ x + offset(t2), family = poisson) where t2 <- log(10) + t? In other words, what happens to the coefficients if we change the units of the offset variable. (Note, adding log(10) on the log scale is multiplying by 10 on the original scale.)
set.seed(1234)
t<- rnorm(72)
t1<- log(10)+t
fit<- glm(count~factor(spray)+offset(t),family="poisson",data=InsectSprays)
fit1<- glm(count~factor(spray)+offset(t1),family="poisson",data=InsectSprays)
summary(fit)$coef[,1]
## (Intercept) factor(spray)B factor(spray)C factor(spray)D factor(spray)E
## 2.7632461 -0.6232906 -1.7941202 -0.8715497 -1.8091014
## factor(spray)F
## -0.9963552
summary(fit1)$coef[,1]
## (Intercept) factor(spray)B factor(spray)C factor(spray)D factor(spray)E
## 0.4606610 -0.6232906 -1.7941202 -0.8715497 -1.8091014
## factor(spray)F
## -0.9963552
Consider the data x <- -5:5 y <- c(5.12, 3.93, 2.67, 1.87, 0.52, 0.08, 0.93, 2.05, 2.54, 3.87, 4.97) Using a knot point at 0, fit a linear model that looks like a hockey stick with two lines meeting at x=0. Include an intercept term, x and the knot point term. What is the estimated slope of the line after 0?
x <- -5:5
y <- c(5.12, 3.93, 2.67, 1.87, 0.52, 0.08, 0.93, 2.05, 2.54, 3.87, 4.97)
knots<-c(0)
splineTerms<-sapply(knots,function(knot) (x>knot)*(x-knot))
xmat<-cbind(1,x,splineTerms)
fit<-lm(y~xmat-1)
yhat<-predict(fit)
summary(fit)$coef
## Estimate Std. Error t value Pr(>|t|)
## xmat -0.1825806 0.13557812 -1.346682 2.149877e-01
## xmatx -1.0241584 0.04805280 -21.313188 2.470198e-08
## xmat 2.0372258 0.08574713 23.758531 1.048711e-08
(yhat[10]-yhat[6])/4
## 10
## 1.013067
plot(x,y)
lines(x,yhat,col="red")