x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)
Give a P-value for the two-sided hypothesis test of whether β1 from a linear regression model is 0 or not.
fit<-lm(y~x)
summary(fit)$coefficients[2,4] #Solution
## [1] 0.05296439
plot(x,y)
abline(fit,col="red")
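As a cross-check, the same P-value can be computed by hand from the slope, its standard error, and the t distribution (a sketch using the fit object from above):
beta1 <- coef(summary(fit))[2, 1]  # slope estimate
se1 <- coef(summary(fit))[2, 2]    # its standard error
tstat <- beta1 / se1               # t statistic under H0: beta1 = 0
2 * pt(abs(tstat), df = fit$df, lower.tail = FALSE)
## [1] 0.05296439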
Consider the previous problem. Give the estimate of the residual standard deviation.
summary(fit)$sigma
## [1] 0.2229981
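A manual sketch of the same quantity: the residual standard deviation is the square root of the residual sum of squares divided by its n − 2 degrees of freedom.
sqrt(sum(resid(fit)^2) / (length(y) - 2))
## [1] 0.2229981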
In the mtcars data set, fit a linear regression model with mpg as the outcome and weight as the predictor. Get a 95% confidence interval for the expected mpg at the average weight. What is the lower endpoint?
data(mtcars)
x<-mtcars$wt
y<-mtcars$mpg
fit <- lm(y ~ x)
# First option: predict() with a confidence interval at the mean weight
predict(fit, data.frame(x = mean(x)), interval = "confidence")
## fit lwr upr
## 1 20.09062 18.99098 21.19027
# Second option: manual interval; at x = mean(x) the standard error of the
# fitted value reduces to sigma / sqrt(n)
yhat <- fit$coef[1] + fit$coef[2] * mean(x)
yhat + c(-1, 1) * qt(.975, df = fit$df) * summary(fit)$sigma / sqrt(length(y))
## [1] 18.99098 21.19027
# Plot the fit with pointwise confidence bands (sort x so the lines draw cleanly)
pre <- predict(fit, data.frame(x), interval = "confidence")
ord <- order(x)
plot(x, y)
abline(fit, col = "red")
lines(x[ord], pre[ord, 2], col = "green")
lines(x[ord], pre[ord, 3], col = "green")
Refer to the previous question. Read the help file for mtcars. What is the weight coefficient interpreted as?
Solution
From help(mtcars): mpg = Miles/(US) gallon; wt = Weight (1,000 lbs).
The estimated expected change in mpg per 1,000 lb increase in weight.
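The fitted slope from above shows this change (in mpg per 1,000 lbs):
coef(fit)[2]
##         x
## -5.344472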
Consider again the mtcars data set and a linear regression model with mpg as predicted by weight (1,000 lbs). A new car weighing 3,000 lbs is coming. Construct a 95% prediction interval for its mpg. What is the upper endpoint?
x<-mtcars$wt
y<-mtcars$mpg
fit <- lm(y ~ x)
# First option: predict() with a prediction interval at x = 3 (3,000 lbs)
predict(fit, data.frame(x = 3), interval = "prediction")
## fit lwr upr
## 1 21.25171 14.92987 27.57355
# Second option: manual prediction interval; the leading "1 +" under the
# square root accounts for the variability of a single new observation
yhat <- fit$coef[1] + fit$coef[2] * 3
yhat + c(-1, 1) * qt(.975, df = fit$df) * summary(fit)$sigma * sqrt(1 + (1/length(y)) + ((3 - mean(x))^2 / sum((x - mean(x))^2)))
## [1] 14.92987 27.57355
Consider again the mtcars data set and a linear regression model with mpg as predicted by weight (in 1,000 lbs). A “short” ton is defined as 2,000 lbs. Construct a 95% confidence interval for the expected change in mpg per 1 short ton increase in weight. Give the lower endpoint.
data("mtcars")
x<-mtcars$wt
y<-mtcars$mpg
fit1 <- lm(y ~ x)        # slope is mpg per 1,000 lbs
fit2 <- lm(y ~ I(x/2))   # x/2 is weight in short tons (2,000 lbs)
coef2 <- coef(summary(fit2))
coef2[2, 1] + c(-1, 1) * qt(.975, df = fit2$df) * coef2[2, 2]
## [1] -12.97262 -8.40527
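Equivalently (a second option using fit1), doubling the per-1,000-lb slope and its standard error gives the same interval:
coef1 <- coef(summary(fit1))
2 * coef1[2, 1] + c(-1, 1) * qt(.975, df = fit1$df) * 2 * coef1[2, 2]
## [1] -12.97262 -8.40527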
par(mfrow=c(1,2))
plot(x,y)
abline(fit1,col="red")
plot(x/2,y)
abline(fit2, col="red")
If my X from a linear regression is measured in centimeters and I convert it to meters, what would happen to the slope coefficient?
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)
fit<-lm(y~x)
coef(summary(fit))
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.1884572 0.2061290 0.9142681 0.39098029
## x 0.7224211 0.3106531 2.3254912 0.05296439
fit2<-lm(y~I(x/100))
coef(summary(fit2))
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.1884572 0.206129 0.9142681 0.39098029
## I(x/100) 72.2421080 31.065311 2.3254912 0.05296439
Solution
The slope will be 100 times bigger: X in meters is X/100, so the coefficient is multiplied by 100 (as the fit of I(x/100) above shows).
I have an outcome, Y, and a predictor, X, and fit a linear regression model Y = β0 + β1X + ϵ to obtain β^0 and β^1. What would be the consequence for the subsequent slope and intercept if I were to refit the model with a new regressor, X + c, for some constant c?
Solution
Y = β0 + β1X + ϵ = (β0 − β1c) + β1(X + c) + ϵ
The slope stays the same; the intercept decreases by β^1c.
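A minimal numerical sketch, reusing x, y, and fit from above, with an arbitrary constant cc = 10 standing in for c (chosen purely for illustration):
cc <- 10                        # arbitrary shift constant
fit_shift <- lm(y ~ I(x + cc))
coef(fit)        # intercept 0.1885, slope 0.7224
coef(fit_shift)  # intercept drops by 0.7224 * 10 to about -7.036; slope unchanged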
Refer back to the mtcars data set with mpg as an outcome and weight (wt) as the predictor. About what is the ratio of the sum of the squared errors, ∑ (Yi − Y^i)², when comparing a model with just an intercept (denominator) to the model with the intercept and slope (numerator)?
fit_num <- lm(mpg ~ wt, data = mtcars)     # intercept and slope (numerator)
fit_denom <- lm(mpg ~ 1, data = mtcars)    # intercept only (denominator)
fit_2 <- lm(mpg ~ wt - 1, data = mtcars)   # no intercept: forced through the origin
sum(resid(fit_num)^2) / sum(resid(fit_denom)^2)
## [1] 0.2471672
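Equivalently, this ratio is 1 − R² of the full model, since R² = 1 − SSE/SST:
1 - summary(fit_num)$r.squared
## [1] 0.2471672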
plot(mtcars$wt, mtcars$mpg)
abline(fit_num, col="red")
abline(fit_denom, col="blue")
abline(fit_2, col="green")
Do the residuals always have to sum to 0 in linear regression?
sum(resid(fit_num))
## [1] -1.637579e-15
sum(resid(fit_denom))
## [1] -5.995204e-15
sum(resid(fit_2))
## [1] 98.11672
Solution
No. If an intercept is included, the residuals will sum to 0, because least squares forces the residuals to be orthogonal to the intercept's column of ones; without an intercept (as fit_2 shows) they need not.
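A small sketch of the underlying fact: least squares makes the residuals orthogonal to every column of the design matrix, and with an intercept one of those columns is all ones.
# t(X) %*% residuals is numerically zero for the model with an intercept
crossprod(model.matrix(fit_num), resid(fit_num))
# fit_2 has no column of ones, so nothing forces its residual sum to zero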