Consider the mtcars data set. Fit a model with mpg as the outcome that includes number of cylinders as a factor variable and weight as a confounder. Give the adjusted estimate for the expected change in mpg comparing 8 cylinders to 4.
summary(lm(mpg ~ factor(cyl)+wt, data = mtcars))$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.990794 1.8877934 18.005569 6.257246e-17
## factor(cyl)6 -4.255582 1.3860728 -3.070244 4.717834e-03
## factor(cyl)8 -6.070860 1.6522878 -3.674214 9.991893e-04
## wt -3.205613 0.7538957 -4.252065 2.130435e-04
- The adjusted estimate for the expected change in mpg comparing 8 cylinders to 4 is **-6.07086**
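The same estimate can be extracted from the fitted model by name instead of being read off the printed table. This is a minimal sketch; the object names `fit_adj` and `est_8_vs_4` are illustrative and not part of the original notes.

```r
# Adjusted model: cylinders as a factor, weight as a covariate
fit_adj <- lm(mpg ~ factor(cyl) + wt, data = mtcars)

# Coefficient for 8 cylinders relative to the 4-cylinder reference level
est_8_vs_4 <- coef(fit_adj)["factor(cyl)8"]
est_8_vs_4

# 95% confidence interval for the same contrast
confint(fit_adj)["factor(cyl)8", ]
```

Because 4 cylinders is the reference level of `factor(cyl)`, the `factor(cyl)8` coefficient is already the 8-versus-4 contrast at a fixed weight.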
Consider the mtcars data set. Fit a model with mpg as the outcome that includes number of cylinders as a factor variable and weight as a possible confounding variable. Compare the effect of 8 versus 4 cylinders on mpg in the weight-adjusted and unadjusted models. Here, adjusted means including the weight variable as a term in the regression model, and unadjusted means the model without weight included. What can be said about the effect comparing 8 and 4 cylinders after looking at models with and without weight included?
fit<-lm(mpg ~ factor(cyl)+wt, data = mtcars)
fit1<-lm(mpg ~ factor(cyl), data = mtcars)
summary(fit)$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.990794 1.8877934 18.005569 6.257246e-17
## factor(cyl)6 -4.255582 1.3860728 -3.070244 4.717834e-03
## factor(cyl)8 -6.070860 1.6522878 -3.674214 9.991893e-04
## wt -3.205613 0.7538957 -4.252065 2.130435e-04
summary(fit1)$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 26.663636 0.9718008 27.437347 2.688358e-22
## factor(cyl)6 -6.920779 1.5583482 -4.441099 1.194696e-04
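## factor(cyl)8 -11.563636 1.2986235 -8.904534 8.568209e-10
- Holding weight constant, the number of cylinders appears to have less of an impact on mpg than when weight is disregarded: the 8-versus-4 estimate shrinks from -11.56 (unadjusted) to -6.07 (adjusted).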
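To make the comparison explicit, the two 8-versus-4 coefficients can be placed side by side. This is a sketch only; `fit_adj`, `fit_unadj`, and `effects` are illustrative names.

```r
# Fit the weight-adjusted and unadjusted models
fit_adj   <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
fit_unadj <- lm(mpg ~ factor(cyl), data = mtcars)

# 8-versus-4 cylinder effect from each model
effects <- c(
  unadjusted = unname(coef(fit_unadj)["factor(cyl)8"]),
  adjusted   = unname(coef(fit_adj)["factor(cyl)8"])
)
effects

# Share of the unadjusted effect that remains after adjusting for weight
effects["adjusted"] / effects["unadjusted"]
```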
Consider the mtcars data set. Fit a model with mpg as the outcome that considers number of cylinders as a factor variable and weight as a confounder. Now fit a second model with mpg as the outcome that considers the interaction between number of cylinders (as a factor variable) and weight. Give the P-value for the likelihood ratio test comparing the two models and suggest a model using 0.05 as a type I error rate significance benchmark.
fit<-lm(mpg ~ factor(cyl)+ wt, data = mtcars)
fit1<-lm(mpg ~ factor(cyl)*wt, data = mtcars)
summary(fit)$coefficients; summary(fit1)$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 33.990794 1.8877934 18.005569 6.257246e-17
## factor(cyl)6 -4.255582 1.3860728 -3.070244 4.717834e-03
## factor(cyl)8 -6.070860 1.6522878 -3.674214 9.991893e-04
## wt -3.205613 0.7538957 -4.252065 2.130435e-04
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 39.571196 3.193940 12.3894599 2.058359e-12
## factor(cyl)6 -11.162351 9.355346 -1.1931522 2.435843e-01
## factor(cyl)8 -15.703167 4.839464 -3.2448150 3.223216e-03
## wt -5.647025 1.359498 -4.1537586 3.127578e-04
## factor(cyl)6:wt 2.866919 3.117330 0.9196716 3.661987e-01
## factor(cyl)8:wt 3.454587 1.627261 2.1229458 4.344037e-02
anova(fit, fit1)
## Analysis of Variance Table
##
## Model 1: mpg ~ factor(cyl) + wt
## Model 2: mpg ~ factor(cyl) * wt
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 28 183.06
## 2 26 155.89 2 27.17 2.2658 0.1239
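- The P-value is larger than 0.05, so at this significance level we fail to reject the null hypothesis that the interaction coefficients are zero. This suggests that the interaction terms may not be necessary, and the simpler additive model is preferred.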
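The nested-model comparison can also be worked out by hand from the two residual sums of squares. For linear models, `anova()` with two fits reports an F test for this comparison. The sketch below uses illustrative names (`fit_add`, `fit_int`, `f_stat`, `p_val`) and checks the hand calculation against `anova()`.

```r
# Additive model and model with a cylinder-by-weight interaction
fit_add <- lm(mpg ~ factor(cyl) + wt, data = mtcars)
fit_int <- lm(mpg ~ factor(cyl) * wt, data = mtcars)

# Residual sums of squares (deviance() returns the RSS for lm fits)
rss_add <- deviance(fit_add)
rss_int <- deviance(fit_int)

# Degrees of freedom: number of extra terms, and residual df of the larger model
df_num <- df.residual(fit_add) - df.residual(fit_int)
df_den <- df.residual(fit_int)

# F statistic for the two extra interaction coefficients, and its P-value
f_stat <- ((rss_add - rss_int) / df_num) / (rss_int / df_den)
p_val  <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

# The same P-value as reported by anova() on the two nested fits
p_anova <- anova(fit_add, fit_int)[["Pr(>F)"]][2]
c(by_hand = p_val, anova = p_anova)
```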
Consider the mtcars data set. Fit a model with mpg as the outcome that includes number of cylinders as a factor variable and weight included in the model as shown below. How is the wt coefficient interpreted?
lm(mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)
lm(mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)
##
## Call:
## lm(formula = mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)
##
## Coefficients:
## (Intercept) I(wt * 0.5) factor(cyl)6 factor(cyl)8
## 33.991 -6.411 -4.256 -6.071
- Because wt is recorded in units of 1000 lbs (half a ton), I(wt * 0.5) is weight in tons. Its coefficient is therefore interpreted as the estimated expected change in mpg per one-ton increase in weight, holding the number of cylinders (4, 6, or 8) fixed.
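A quick way to see the effect of the rescaling is to compare the fit against the same model with wt left in its original units. In this sketch, `fit_tons` and `fit_klbs` are illustrative names.

```r
# Weight expressed in tons (wt * 0.5) versus the original 1000-lb units
fit_tons <- lm(mpg ~ I(wt * 0.5) + factor(cyl), data = mtcars)
fit_klbs <- lm(mpg ~ wt + factor(cyl), data = mtcars)

b_tons <- unname(coef(fit_tons)["I(wt * 0.5)"])
b_klbs <- unname(coef(fit_klbs)["wt"])

# Halving the regressor doubles its coefficient; the other terms are unchanged
c(per_ton = b_tons, per_1000_lbs = b_klbs, ratio = b_tons / b_klbs)
```

The fitted values are identical in both parameterizations; only the unit attached to the weight slope changes.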
Consider the following data set
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)
Give the hat diagonal for the most influential point
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)
fit2<-lm(y~x)
round(hatvalues(fit2), 3)
## 1 2 3 4 5
## 0.229 0.244 0.253 0.280 0.995
influence(lm(y ~ x))$hat
## 1 2 3 4 5
## 0.2286650 0.2438146 0.2525027 0.2804443 0.9945734
# The hat values are the diagonal of the hat matrix H = X (X'X)^{-1} X'
xm <- cbind(1, x)
diag(xm %*% solve(t(xm) %*% xm) %*% t(xm))
## [1] 0.2286650 0.2438146 0.2525027 0.2804443 0.9945734
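For a simple linear regression with an intercept, the hat diagonal also has a closed form, h_i = 1/n + (x_i - mean(x))^2 / sum((x - mean(x))^2), which shows directly why the point far from the mean of x has leverage close to 1. This is a small sketch; `h_closed` and `h_lm` are illustrative names.

```r
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)

# Closed-form leverage for simple linear regression with an intercept
h_closed <- 1 / length(x) + (x - mean(x))^2 / sum((x - mean(x))^2)

# Leverage as reported by R
h_lm <- unname(hatvalues(lm(y ~ x)))

# Index of the highest-leverage point
which.max(h_closed)

# Hat values sum to the number of estimated coefficients (intercept and slope)
sum(h_closed)
```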
Consider the following data set
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)
Give the slope dfbeta for the point with the highest hat value
x <- c(0.586, 0.166, -0.042, -0.614, 11.72)
y <- c(0.549, -0.026, -0.127, -0.751, 1.344)
fit3<-lm(y~x)
round(dfbetas(fit3), 3)
## (Intercept) x
## 1 1.062 -0.378
## 2 0.067 -0.029
## 3 -0.017 0.008
## 4 -1.250 0.673
## 5 0.204 -133.823
influence.measures(lm(y ~ x))
## Influence measures of
## lm(formula = y ~ x) :
##
## dfb.1_ dfb.x dffit cov.r cook.d hat inf
## 1 1.0621 -3.78e-01 1.0679 0.341 2.93e-01 0.229 *
## 2 0.0675 -2.86e-02 0.0675 2.934 3.39e-03 0.244
## 3 -0.0174 7.92e-03 -0.0174 3.007 2.26e-04 0.253 *
## 4 -1.2496 6.73e-01 -1.2557 0.342 3.91e-01 0.280 *
## 5 0.2043 -1.34e+02 -149.7204 0.107 2.70e+02 0.995 *
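The value reported by `dfbetas()` can be reproduced by refitting the model without the high-leverage point. The sketch below, with illustrative names (`d`, `fit_all`, `fit_loo`, `raw_change`, `std_change`), computes the raw change in slope and then standardizes it by the leave-one-out residual standard deviation times the square root of the slope entry of (X'X)^{-1}, which is the scaling `dfbetas()` uses.

```r
d <- data.frame(
  x = c(0.586, 0.166, -0.042, -0.614, 11.72),
  y = c(0.549, -0.026, -0.127, -0.751, 1.344)
)

fit_all <- lm(y ~ x, data = d)
i <- unname(which.max(hatvalues(fit_all)))  # point with the highest hat value

# Refit with that point left out
fit_loo <- lm(y ~ x, data = d[-i, ])

# Raw change in slope: full-data estimate minus leave-one-out estimate
raw_change <- unname(coef(fit_all)["x"] - coef(fit_loo)["x"])

# Standardize by the leave-one-out sigma and sqrt of the (X'X)^{-1} slope entry
scale_x <- summary(fit_loo)$sigma * sqrt(summary(fit_all)$cov.unscaled["x", "x"])
std_change <- raw_change / scale_x

c(dfbeta = raw_change, dfbetas = std_change)
```

The standardized value is very large here because the four remaining points lie almost exactly on a line, so the leave-one-out residual standard deviation is tiny.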
Consider a regression relationship between Y and X with and without adjustment for a third variable Z. Which of the following is true about comparing the regression coefficient between Y and X with and without adjustment for Z?
It is possible for the coefficient to reverse sign after adjustment. For example, it can be strongly significant and positive before adjustment and strongly significant and negative after adjustment.
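A small simulation illustrates the sign reversal. The data-generating process below is an assumption chosen purely for illustration (it is not from the quiz): Z drives both X and Y, while the direct effect of X on Y is negative.

```r
set.seed(1)
n <- 200

# Hypothetical data: z confounds the x-y relationship
z <- rnorm(n)
x <- z + rnorm(n, sd = 0.3)
y <- -x + 3 * z + rnorm(n, sd = 0.3)

fit_unadj <- lm(y ~ x)       # ignores z
fit_adj   <- lm(y ~ x + z)   # adjusts for z

# Slope on x and its P-value in each model
unadj <- summary(fit_unadj)$coefficients["x", c("Estimate", "Pr(>|t|)")]
adj   <- summary(fit_adj)$coefficients["x", c("Estimate", "Pr(>|t|)")]
rbind(unadjusted = unadj, adjusted = adj)
```

In this setup the unadjusted slope on x is positive and the adjusted slope is negative, with both estimates far from zero relative to their standard errors.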