set.seed(1)
x <- rnorm(100)
y <- 2*x + rnorm(100)
betaHat <- lm(y~x+0)
SLR <- lm(x~y + 0)
summary(SLR)
##
## Call:
## lm(formula = x ~ y + 0)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.8699 -0.2368  0.1030  0.2858  0.8938
##
## Coefficients:
##   Estimate Std. Error t value Pr(>|t|)
## y  0.39111    0.02089   18.73   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4246 on 99 degrees of freedom
## Multiple R-squared: 0.7798, Adjusted R-squared: 0.7776
## F-statistic: 350.7 on 1 and 99 DF, p-value: < 2.2e-16
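As a cross-check on the no-intercept output above, the sketch below recomputes the t-statistic by hand from the standard closed form for regression through the origin, \(t = \sqrt{n-1}\,\sum x_i y_i \big/ \sqrt{\sum x_i^2 \sum y_i^2 - (\sum x_i y_i)^2}\). The names `t_xy`, `t_yx`, and `t_formula` are illustrative and do not appear in the original code. Because the formula is symmetric in \(x\) and \(y\), both no-intercept fits should share the same t-statistic.

```r
# Regenerate the data exactly as in the original chunk.
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
n <- length(x)

# t-statistics reported by lm() for the two no-intercept fits.
t_xy <- coef(summary(lm(x ~ y + 0)))["y", "t value"]
t_yx <- coef(summary(lm(y ~ x + 0)))["x", "t value"]

# Closed-form t-statistic for regression through the origin.
t_formula <- sqrt(n - 1) * sum(x * y) /
  sqrt(sum(x^2) * sum(y^2) - sum(x * y)^2)

print(c(t_xy = t_xy, t_yx = t_yx, t_formula = t_formula))
```

All three printed values should agree up to floating-point error.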
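Swapping the response and the predictor does change the fitted line: the slope is 1.99894 for \(y\) onto \(x\) but 0.38942 for \(x\) onto \(y\). The strength of the linear association is the same in both directions, however, so the t-statistic for the slope and its p-value do not change.
The t-statistic is 18.56 for both the regression of \(y\) onto \(x\) and the regression of \(x\) onto \(y\).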
reg1 <- lm(y~x)
reg2 <- lm(x~y)
summary(reg1)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.8768 -0.6138 -0.1395  0.5394  2.3462
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.03769    0.09699  -0.389    0.698
## x            1.99894    0.10773  18.556   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9628 on 98 degrees of freedom
## Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
## F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
summary(reg2)
##
## Call:
## lm(formula = x ~ y)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.90848 -0.28101  0.06274  0.24570  0.85736
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.03880    0.04266    0.91    0.365
## y            0.38942    0.02099   18.56   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4249 on 98 degrees of freedom
## Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
## F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
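The matching slope t-statistics in the two summaries above can be checked numerically. In simple linear regression with an intercept, the slope t-statistic depends on the data only through the sample correlation \(r\) and the sample size \(n\), via \(t = r\sqrt{n-2}/\sqrt{1-r^2}\), and \(r\) is the same whichever variable is treated as the response. The sketch below is one way to verify this; `t_yx`, `t_xy`, and `t_from_r` are illustrative names, not part of the original code.

```r
# Regenerate the data exactly as in the original chunk.
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
n <- length(x)

# Slope t-statistics from the two fits with an intercept.
t_yx <- coef(summary(lm(y ~ x)))["x", "t value"]
t_xy <- coef(summary(lm(x ~ y)))["y", "t value"]

# The same statistic computed from the sample correlation alone.
r <- cor(x, y)
t_from_r <- r * sqrt(n - 2) / sqrt(1 - r^2)

print(c(t_yx = t_yx, t_xy = t_xy, t_from_r = t_from_r))
```

The slopes themselves differ between the two fits, but the three t-values should coincide up to floating-point error.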
set.seed(1)
x <- rnorm(100, mean = 0, sd = 1)
eps <- rnorm(100, mean = 0, sd = 0.25)
The vector y has length 100. In this model, \(\beta_0 = -1\) and \(\beta_1 = 0.5\).
y <- -1 + (0.5*x) + eps
The scatterplot shows a positive linear relationship between x and y, with moderate scatter about the line.
plot(y~x)
\(\hat{\beta}_0 = -1.0094\), \(\hat{\beta}_1 = 0.4997\)
The estimates are very close to the true values: \(\hat{\beta}_0\) differs from \(\beta_0 = -1\) by about 0.009, and \(\hat{\beta}_1\) differs from \(\beta_1 = 0.5\) by about 0.0003.
lsq <- lm(y~x)
lsq
##
## Call:
## lm(formula = y ~ x)
##
## Coefficients:
## (Intercept)            x
##     -1.0094       0.4997
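To put a number on how close the estimates printed above are to the true coefficients, the following sketch subtracts the known population values from the fitted ones. The vector `truth` and the name `abs_err` are introduced here for illustration only.

```r
# Regenerate the simulated data with noise standard deviation 0.25.
set.seed(1)
x <- rnorm(100, mean = 0, sd = 1)
eps <- rnorm(100, mean = 0, sd = 0.25)
y <- -1 + 0.5 * x + eps

fit <- lm(y ~ x)

# True coefficients used to generate y, named to match coef(fit).
truth <- c("(Intercept)" = -1, x = 0.5)

# Absolute estimation error for the intercept and the slope.
abs_err <- abs(coef(fit) - truth)
print(round(abs_err, 4))
```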
The blue line is the least squares fit, and the red line is the population regression line \(y = -1 + 0.5x\).
plot(y~x)
abline(lsq, col = "blue")
abline(a = -1, b = 0.5, col = "red")
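Since the two lines nearly overlap, a legend makes the plot easier to read. The sketch below repeats the plot above and adds one with `legend()` from base graphics; the label text and the `"topleft"` position are arbitrary choices, not taken from the original.

```r
# Regenerate the data and the fit used in the plot.
set.seed(1)
x <- rnorm(100, mean = 0, sd = 1)
eps <- rnorm(100, mean = 0, sd = 0.25)
y <- -1 + 0.5 * x + eps
lsq <- lm(y ~ x)

plot(y ~ x)
abline(lsq, col = "blue")             # least squares fit
abline(a = -1, b = 0.5, col = "red")  # population regression line

# Legend entries are listed in the same order as the colors.
legend("topleft",
       legend = c("Least squares fit", "Population line"),
       col = c("blue", "red"),
       lty = 1)
```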
The standard deviation of the error term has been reduced from 0.25 to 0.1. The blue line is the least squares fit, and the red line is the population regression line.
\(\beta_0 = -1\), \(\beta_1 = 0.5\)
\(\hat{\beta}_0 = -1.0038\), \(\hat{\beta}_1 = 0.4999\)
With less noise, the estimates are even closer to the true coefficients than in the original fit.
set.seed(1)
x2 <- rnorm(100, mean = 0, sd = 1)
eps2 <- rnorm(100, mean = 0, sd = 0.1)
y2 <- -1 + (0.5*x2) + eps2
lsq2 <- lm(y2~x2)
lsq2
##
## Call:
## lm(formula = y2 ~ x2)
##
## Coefficients:
## (Intercept)           x2
##     -1.0038       0.4999
plot(y2~x2)
abline(lsq2, col = "blue")
abline(a = -1, b = 0.5, col = "red")
The standard deviation of the error term has been increased from 0.25 to 1. The blue line is the least squares fit, and the red line is the population regression line.
\(\beta_0 = -1\), \(\beta_1 = 0.5\)
\(\hat{\beta}_0 = -1.0377\), \(\hat{\beta}_1 = 0.4989\)
With more noise, the estimates are farther from the true coefficients than in the two previous fits.
set.seed(1)
x3 <- rnorm(100, mean = 0, sd = 1)
eps3 <- rnorm(100, mean = 0, sd = 1)
y3 <- -1 + (0.5*x3) + eps3
lsq3 <- lm(y3~x3)
lsq3
##
## Call:
## lm(formula = y3 ~ x3)
##
## Coefficients:
## (Intercept)           x3
##     -1.0377       0.4989
plot(y3~x3)
abline(lsq3, col = "blue")
abline(a = -1, b = 0.5, col = "red")
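The effect of the noise level on the three fits can also be summarized by the residual standard error, which estimates the standard deviation of the error term. The sketch below refits the three models in one helper; `fit_sigma`, `noise_sds`, and `est` are illustrative names and not part of the original code. It uses `sigma()` from the stats package to extract the residual standard error.

```r
# Refit the simulated model for a given noise standard deviation and
# return the residual standard error of the least squares fit.
fit_sigma <- function(noise_sd) {
  set.seed(1)  # same seed as the original chunks
  x <- rnorm(100, mean = 0, sd = 1)
  eps <- rnorm(100, mean = 0, sd = noise_sd)
  y <- -1 + 0.5 * x + eps
  sigma(lm(y ~ x))
}

noise_sds <- c(0.1, 0.25, 1)
est <- sapply(noise_sds, fit_sigma)
print(data.frame(noise_sd = noise_sds, residual_se = est))
```

Because every fit reuses the same seed, the noise vectors are scaled copies of one another, so the residual standard errors should scale in proportion to the noise standard deviation.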
The fit with the smallest error variance (lsq2, noise standard deviation 0.1) has the narrowest confidence intervals, and the fit with the largest error variance (lsq3, noise standard deviation 1) has the widest. This makes sense: with smaller error variance the points lie closer to the regression line, so the residual standard error, and with it the standard errors of the coefficients, is smaller. All three sets of intervals contain the true values \(\beta_0 = -1\) and \(\beta_1 = 0.5\).
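For reference, the dependence of the interval width on the error variance follows from the standard formulas for simple linear regression, written here for the slope with \(n = 100\) observations and \(\hat{\sigma}\) the residual standard error:

\[
\hat{\beta}_1 \pm t_{0.975,\,n-2}\,\mathrm{SE}(\hat{\beta}_1),
\qquad
\mathrm{SE}(\hat{\beta}_1) = \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}},
\qquad
\hat{\sigma} = \sqrt{\frac{\mathrm{RSS}}{n-2}}.
\]

The predictor values are the same in all three fits, so the denominator is fixed and the width of the interval is proportional to \(\hat{\sigma}\).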
conf1 <- confint(lsq)
conf2 <- confint(lsq2)
conf3 <- confint(lsq3)
conf1
##                  2.5 %     97.5 %
## (Intercept) -1.0575402 -0.9613061
## x            0.4462897  0.5531801
conf2
##                  2.5 %     97.5 %
## (Intercept) -1.0230161 -0.9845224
## x2           0.4785159  0.5212720
conf3
##                  2.5 %     97.5 %
## (Intercept) -1.2301607 -0.8452245
## x3           0.2851588  0.7127204
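The comparison of the three sets of intervals can be made explicit by computing the width of each slope interval. The sketch below is one way to do so; the helper `slope_ci_width` and the vectors `noise_sds` and `widths` are hypothetical names introduced for illustration.

```r
# Width of the 95% confidence interval for the slope, for data
# simulated with a given noise standard deviation.
slope_ci_width <- function(noise_sd) {
  set.seed(1)  # same seed as the original chunks
  x <- rnorm(100, mean = 0, sd = 1)
  eps <- rnorm(100, mean = 0, sd = noise_sd)
  y <- -1 + 0.5 * x + eps
  ci <- confint(lm(y ~ x))
  unname(ci["x", 2] - ci["x", 1])  # upper bound minus lower bound
}

noise_sds <- c(0.1, 0.25, 1)
widths <- sapply(noise_sds, slope_ci_width)
print(data.frame(noise_sd = noise_sds, slope_ci_width = widths))
```

The widths should increase with the noise standard deviation. Since all three simulations share one seed, they should also grow in direct proportion to it, which agrees with the printed intervals: the slope interval for lsq3 is ten times as wide as the one for lsq2.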