Question 1

Consider the data set given below

x <- c(0.18, -1.54, 0.42, 0.95)

And weights given by

w <- c(2, 1, 3, 1)

Give the value of \(\mu\) that minimizes the least squares equation

\(\sum_{i=1}^n w_{i}(x_{i} - \mu)^2\)

  1. 0.0025
  2. 0.300
  3. 0.1471
  4. 1.077

Solution

The minimizer of the weighted least squares criterion is the weighted mean: \(\hat{\mu} = \sum_{i=1}^n w_{i}x_{i} / \sum_{i=1}^n w_{i}\)

sum(w * x) / sum(w)
## [1] 0.1471429
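
As an optional cross-check (a sketch, not part of the original solution), the same minimizer can be found numerically with optimize:

# numerically minimize the weighted criterion; about 0.1471, matching the weighted mean
optimize(function(mu) sum(w * (x - mu)^2), interval = range(x))$minimum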

Question 2

Consider the following data set

x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

Fit the regression through the origin and get the slope treating y as the outcome and x as the regressor. (Hint: do not center the data since we want regression through the origin, not through the means of the data.)

  1. 0.59915
  2. -0.04462
  3. 0.8263
  4. -1.713

Solution

lm is used to fit linear models. The coefficients component of the fitted model is the named vector of estimated coefficients.

y ~ x - 1 specifies a line through the origin

y ~ x + 0 or y ~ 0 + x specifies a model with no intercept

fit <- lm(y ~ x - 1)
fit$coefficients
##         x 
## 0.8262517
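
For a fit through the origin the slope also has the closed form \(\sum x_{i}y_{i} / \sum x_{i}^2\); a quick cross-check (not part of the original solution):

# closed-form slope for regression through the origin; about 0.8263, matching the lm fit
sum(x * y) / sum(x^2)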

Question 3

Do data(mtcars) from the datasets package and fit the regression model with mpg as the outcome and weight as the predictor. Give the slope coefficient.

  1. 30.2851
  2. -9.559
  3. -5.344
  4. 0.5591

Solution

data(mtcars)

x <- mtcars$wt
y <- mtcars$mpg
fit <- lm(y ~ x)
fit$coefficients
## (Intercept)           x 
##   37.285126   -5.344472

Second solution to verify answer:

x <- mtcars$wt
y <- mtcars$mpg
cor(x,y) * sd(y)/sd(x)
## [1] -5.344472

Question 4

Consider data with an outcome (Y) and a predictor (X). The standard deviation of the predictor is one half that of the outcome. The correlation between the two variables is 0.5. What value would the slope coefficient be for the regression model with Y as the outcome and X as the predictor?

  1. 0.25
  2. 1
  3. 3
  4. 4

Solution

slope = Corr(predictor, outcome) * Sd(outcome) / Sd(predictor)

\(\hat{\beta}_{1}\) = Corr(Y, X) Sd(Y)/Sd(X)

correlation <- 0.5                            # Cor(X, Y)
stdPredictor <- 0.5                           # Sd(X) as a fraction of Sd(Y), i.e. Sd(Y) = 1
slope <- correlation * (1 / stdPredictor)     # Cor(X, Y) * Sd(Y) / Sd(X)
slope
## [1] 1
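
As an optional illustration with simulated data (simX and simY are made up for this check, not from the quiz), a predictor with half the standard deviation of the outcome and correlation about 0.5 gives a fitted slope close to 1:

# hypothetical simulation: sd(simY) is about twice sd(simX), cor(simX, simY) about 0.5
set.seed(42)
simX <- rnorm(10000)
simY <- simX + rnorm(10000, sd = sqrt(3))
coef(lm(simY ~ simX))[2]  # slope comes out close to 1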

Question 5

Students were given two hard tests and scores were normalized to have empirical mean 0 and variance 1. The correlation between the scores on the two tests was 0.4. What would be the expected score on Quiz 2 for a student who had a normalized score of 1.5 on Quiz 1?

  1. 1.0
  2. 0.4
  3. 0.16
  4. 0.6

Solution

\(\hat{q}_{2}\) = \(\hat{\beta}_{1} q_{1}\)

\(\hat{\beta}_{1}\) = Corr(\(q_{1}, q_{2}\)) Sd(\(q_{2}\))/Sd(\(q_{1}\)) = 0.4, since both scores are normalized to unit variance

correlation <- 0.4
q1 <- 1.5
q2 <- correlation * q1
q2
## [1] 0.6
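
A similar optional simulation (simQ1 and simQ2 are made-up standardized scores, not from the quiz) shows the predicted Quiz 2 score landing near 0.6:

# hypothetical simulation: both scores roughly standardized, correlation about 0.4
set.seed(1)
simQ1 <- rnorm(10000)
simQ2 <- 0.4 * simQ1 + rnorm(10000, sd = sqrt(1 - 0.4^2))
predict(lm(simQ2 ~ simQ1), data.frame(simQ1 = 1.5))  # close to 0.6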

Question 6

Consider the data given by the following

x <- c(8.58, 10.46, 9.01, 9.64, 8.86)

What is the value of the first measurement if x were normalized (to have mean 0 and variance 1)?

  1. 8.86
  2. 9.31
  3. -0.9719
  4. 8.58

Solution

(x - mean(x)) / sd(x)
## [1] -0.9718658  1.5310215 -0.3993969  0.4393366 -0.5990954
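
R's scale() does the same centering and scaling, so it provides a quick optional cross-check:

scale(x)[1]  # -0.9718658, the normalized first measurement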

Question 7

Consider the data set (used above as well). What is the intercept for fitting the model with x as the predictor and y as the outcome?

x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)

  1. 1.252
  2. 2.105
  3. 1.567
  4. -1.713

Solution

lm(y ~ x)
## 
## Call:
## lm(formula = y ~ x)
## 
## Coefficients:
## (Intercept)            x  
##       1.567       -1.713
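
As an optional verification, the intercept can also be recovered from the closed-form relations \(\hat{\beta}_{1} = Corr(Y, X) Sd(Y)/Sd(X)\) and \(\hat{\beta}_{0} = \bar{Y} - \hat{\beta}_{1}\bar{X}\):

beta1 <- cor(x, y) * sd(y) / sd(x)  # slope, about -1.713
mean(y) - beta1 * mean(x)           # intercept, about 1.567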

Question 8

You know that both the predictor and response have mean 0. What can be said about the intercept when you fit a linear regression?

  1. It is undefined as you have to divide by zero.
  2. Nothing about the intercept can be said from the information given.
  3. It must be identically 0.
  4. It must be exactly one.

Solution

Since we know \(\hat{\beta}_{0} = \bar{Y} - \hat{\beta}_{1}\bar{X}\),

the intercept must be identically 0 when \(\bar{Y}\) and \(\bar{X}\) are both zero.
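
A small illustration with made-up data (simX and simY are assumptions for demonstration, not from the quiz): once both variables are centered to mean 0, the fitted intercept is 0 up to numerical error.

set.seed(10)
simX <- rnorm(50)
simY <- 2 * simX + rnorm(50)
xc <- simX - mean(simX)  # centered predictor, mean 0
yc <- simY - mean(simY)  # centered outcome, mean 0
coef(lm(yc ~ xc))[1]     # intercept is essentially 0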

Question 9

Consider the data given by

x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)

What value minimizes the sum of the squared distances between these points and itself?

  1. 0.8
  2. 0.44
  3. 0.36
  4. 0.573

Solution

The value of \(\mu\) that minimizes \(\sum_{i=1}^n (X_{i} - \mu)^2\) is the sample mean \(\bar{X}\).

mean(x)
## [1] 0.573
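
As a quick optional check, evaluating the sum of squared distances at each of the candidate answers confirms that the sample mean gives the smallest value:

# criterion evaluated at each answer choice; the smallest value occurs at 0.573, the mean
sapply(c(0.8, 0.44, 0.36, 0.573), function(mu) sum((x - mu)^2))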

Question 10

Let the slope from fitting Y as the outcome and X as the predictor be denoted as \(\beta_{1}\). Let the slope from fitting X as the outcome and Y as the predictor be denoted as \(\lambda_{1}\). Suppose that you divide \(\beta_{1}\) by \(\lambda_{1}\).
What is this ratio always equal to?

  1. 1
  2. Cor(Y,X)
  3. Var(Y)/Var(X)
  4. 2Sd(Y)/Sd(X)

Solution

\(\hat{\beta_{1}}\) = \(Corr(Y,X) \frac{sd(Y)}{sd(X)}\)

\(\hat{\lambda_{1}}\) = \(Corr(X,Y) \frac{sd(X)}{sd(Y)}\)

\(\hat{\beta_{1}} / \hat{\lambda_{1}}\) = \(\frac{sd(Y)^2}{sd(X)^2}\) = \(\frac{Var(Y)}{Var(X)}\)
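
An empirical check using the x and y vectors from Question 7 (a quick sketch, not part of the original solution):

x <- c(0.8, 0.47, 0.51, 0.73, 0.36, 0.58, 0.57, 0.85, 0.44, 0.42)
y <- c(1.39, 0.72, 1.55, 0.48, 1.19, -1.59, 1.23, -0.65, 1.49, 0.05)
coef(lm(y ~ x))[2] / coef(lm(x ~ y))[2]  # ratio of the two fitted slopes
var(y) / var(x)                          # same value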