Consider the data set given below
x <- c(0.18, -1.54, 0.42, 0.95)
And weights given by
w <- c(2, 1, 3, 1)
Give the value of $\mu$ that minimizes the least squares equation $\sum_{i=1}^{n} w_i (x_i - \mu)^2$.
0.1471
0.300
0.0025
1.077
Answer
x <- c(0.18, -1.54, 0.42, 0.95)
w <- c(2, 1, 3, 1)
sum(w * x) / sum(w) # The weighted mean minimizes the weighted least squares criterion
## [1] 0.1471429
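Setting the derivative of the criterion to zero gives $\sum_i w_i (x_i - \mu) = 0$, so $\mu = \sum_i w_i x_i / \sum_i w_i$. As a rough sanity check, the sketch below compares that closed form with a numerical minimization; the helper name `wls` is introduced here only for illustration.

```r
x <- c(0.18, -1.54, 0.42, 0.95)
w <- c(2, 1, 3, 1)

# Weighted least squares criterion as a function of mu (illustrative helper)
wls <- function(mu) sum(w * (x - mu)^2)

# Closed form: the weighted mean of x
mu_closed <- sum(w * x) / sum(w)

# Numerical minimization over an interval that brackets the data
mu_numeric <- optimize(wls, interval = range(x))$minimum

c(closed = mu_closed, numeric = mu_numeric)
```

The two values should agree up to the tolerance of optimize().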
Consider the following data set
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)
Fit the regression through the origin and get the slope, treating y as the outcome and x as the regressor. (Hint: do not center the data, since we want regression through the origin, not through the means of the data.)
0.8263
0.59915
-0.04462
-1.713
Answer
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)
fit <- lm(y ~ x - 1) # The "- 1" removes the intercept, forcing the regression line through the origin
summary(fit)
##
## Call:
## lm(formula = y ~ x - 1)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.0692 -0.2536  0.5303  0.8592  1.1286
##
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)
## x   0.8263     0.5817   1.421    0.189
##
## Residual standard error: 1.094 on 9 degrees of freedom
## Multiple R-squared: 0.1831, Adjusted R-squared: 0.09238
## F-statistic: 2.018 on 1 and 9 DF, p-value: 0.1892
fit$coefficients[1]
## x
## 0.8262517
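For regression through the origin, minimizing $\sum_i (y_i - \beta x_i)^2$ gives the closed form $\hat\beta = \sum_i x_i y_i / \sum_i x_i^2$. A minimal sketch checking this against lm(); the variable name `slope_origin` is chosen here for illustration.

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

# Closed-form slope for a line forced through the origin
slope_origin <- sum(x * y) / sum(x^2)

# Same quantity from lm() without an intercept
slope_lm <- unname(coef(lm(y ~ x - 1)))

c(closed = slope_origin, lm = slope_lm)
```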
Do data(mtcars) from the datasets package and fit the regression model with mpg as the outcome and weight as the predictor. Give the slope coefficient.
-9.559
30.2851
-5.344
0.5591
Answer
data(mtcars)
fit2 <- lm(mpg ~ wt, data = mtcars)
summary(fit2)
##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.5432 -2.3647 -0.1252  1.4096  6.8727
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared: 0.7528, Adjusted R-squared: 0.7446
## F-statistic: 91.38 on 1 and 30 DF, p-value: 1.294e-10
fit2$coefficients[2]
## wt
## -5.344472
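With an intercept in the model, the slope can also be written as Cor(Y, X) * SD(Y) / SD(X). The sketch below checks that identity on mtcars; `slope_formula` is a name introduced here for illustration.

```r
data(mtcars)

# Slope from the correlation and the two standard deviations
slope_formula <- cor(mtcars$mpg, mtcars$wt) * sd(mtcars$mpg) / sd(mtcars$wt)

# Slope from the fitted model
slope_lm <- unname(coef(lm(mpg ~ wt, data = mtcars))[2])

c(formula = slope_formula, lm = slope_lm)
```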
Consider data with an outcome (Y) and a predictor (X). The standard deviation of the predictor is one half that of the outcome. The correlation between the two variables is 0.5. What value would the slope coefficient be for the regression model with Y as the outcome and X as the predictor?
4
3
0.25
1
Answer
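The slope is $\hat\beta_1 = \mathrm{Cor}(Y, X)\,\mathrm{SD}(Y)/\mathrm{SD}(X)$.
Here SD(X) = SD(Y)/2, so SD(Y)/SD(X) = 2, and Cor(Y, X) = 0.5.
Therefore the slope is 0.5 * 2 = 1.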
Students were given two hard tests and scores were normalized to have empirical mean 0 and variance 1. The correlation between the scores on the two tests was 0.4. What would be the expected score on Quiz 2 for a student who had a normalized score of 1.5 on Quiz 1?
0.6
1.0
0.4
0.16
Answer
# Both scores have mean 0 and SD 1, so the intercept is 0 and the slope of the regression line is Cor(Y, X) = 0.4
1.5 * 0.4 # Regression to the mean
## [1] 0.6
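To illustrate why the slope equals the correlation for normalized data, the sketch below normalizes the x and y vectors from the earlier question (not the test scores, which are not given) and compares the fitted slope with cor(x, y). The names `zx`, `zy`, and `fit_z` are introduced here for illustration.

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

# Normalize both variables to mean 0 and standard deviation 1
zx <- (x - mean(x)) / sd(x)
zy <- (y - mean(y)) / sd(y)

fit_z <- lm(zy ~ zx)

# The fitted slope matches the correlation of the original variables
slope_z <- unname(coef(fit_z)[2])
c(slope = slope_z, correlation = cor(x, y))

# Prediction at a normalized predictor value of 1.5 is 1.5 times the correlation
pred_z <- unname(predict(fit_z, newdata = data.frame(zx = 1.5)))
pred_z
```

In the quiz the correlation is 0.4, which is why the prediction there is 1.5 * 0.4 = 0.6.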
Consider the data given by the following
x <- c(8.58, 10.46, 9.01, 9.64, 8.86)
What is the value of the first measurement if x were normalized (to have mean 0 and variance 1)?
8.86
-0.9719
9.31
8.58
Answer
x <- c(8.58, 10.46, 9.01, 9.64, 8.86)
value1 <- ((x - mean(x))/sd(x))[1]
value1
## [1] -0.9718658
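Base R's scale() performs the same centering and scaling (it divides by the sample standard deviation by default), so it offers a quick cross-check. A minimal sketch, with `z` as an illustrative name:

```r
x <- c(8.58, 10.46, 9.01, 9.64, 8.86)

# scale() returns a one-column matrix; convert it to a plain vector
z <- as.numeric(scale(x))

z[1]      # first normalized measurement
mean(z)   # numerically 0
sd(z)     # 1
```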
Consider the following data set (used above as well). What is the intercept for fitting the model with x as the predictor and y as the outcome?
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)
1.567
2.105
1.252
-1.713
Answer
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)
coef(lm(y ~ x))[1]
## (Intercept)
## 1.567461
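The intercept can also be computed by hand as mean(y) minus the slope times mean(x), which is a useful check on the lm() output. A minimal sketch; `b1` and `b0` are names chosen here for illustration.

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

# Slope from the correlation and standard deviations
b1 <- cor(y, x) * sd(y) / sd(x)

# Intercept: the fitted line passes through (mean(x), mean(y))
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)
```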
You know that both the predictor and response have mean 0. What can be said about the intercept when you fit a linear regression?
It must be exactly one.
It must be identically 0.
It is undefined as you have to divide by zero.
Nothing about the intercept can be said from the information given.
Answer
The intercept estimate is $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$.
Since $\bar Y$ and $\bar X$ both equal zero, the intercept must also be zero.
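A quick numerical illustration, assuming we center the x and y vectors used in the earlier questions so that both have mean 0 (the names `xc`, `yc`, and `b0_centered` are introduced here):

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

# Center both variables so that their means are 0
xc <- x - mean(x)
yc <- y - mean(y)

# The fitted intercept is 0 up to floating point error
b0_centered <- unname(coef(lm(yc ~ xc))[1])
b0_centered
```

The slope is unchanged by centering; only the intercept moves to zero.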
Consider the data given by
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
What value minimizes the sum of the squared distances between these points and itself?
0.573
0.8
0.44
0.36
Answer
This is the least squares estimate, which works out to be the mean in this case.
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
mean(x) # The least squares estimate is the sample mean of x
## [1] 0.573
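One way to see this without calculus is to evaluate the sum of squared distances at each of the four answer options and pick the smallest. A small sketch; `candidates` and `sse` are illustrative names.

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)

# The four answer options from the question
candidates <- c(0.573, 0.8, 0.44, 0.36)

# Sum of squared distances from the data to each candidate value
sse <- sapply(candidates, function(mu) sum((x - mu)^2))

# Candidate with the smallest sum of squares
best <- candidates[which.min(sse)]
best
```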
Let the slope having fit Y as the outcome and X as the predictor be denoted as $\beta_1$. Let the slope from fitting X as the outcome and Y as the predictor be denoted as $\gamma_1$. Suppose that you divide $\beta_1$ by $\gamma_1$; in other words consider $\beta_1 / \gamma_1$. What is this ratio always equal to?
2SD(Y)/SD(X)
1
Var(Y)/Var(X)
Cor(Y,X)
Answer
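Since $\beta_1 = \mathrm{Cor}(Y, X)\,\mathrm{SD}(Y)/\mathrm{SD}(X)$ and $\gamma_1 = \mathrm{Cor}(Y, X)\,\mathrm{SD}(X)/\mathrm{SD}(Y)$, the correlation cancels in the ratio: $\beta_1 / \gamma_1 = \mathrm{SD}(Y)^2 / \mathrm{SD}(X)^2$.
Thus the ratio is Var(Y)/Var(X).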
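The identity can be checked numerically on the x and y vectors from the earlier questions. This is a sketch for illustration only; `b1` and `g1` are names chosen here to stand for the two slopes.

```r
x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

# Slope with y as the outcome and x as the predictor
b1 <- unname(coef(lm(y ~ x))[2])

# Slope with x as the outcome and y as the predictor
g1 <- unname(coef(lm(x ~ y))[2])

# The ratio of slopes matches the ratio of variances
c(ratio = b1 / g1, var_ratio = var(y) / var(x))
```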