For the regression of y onto x without an intercept, the coefficient estimate is 1.9939, its standard error is 0.1065, and the t-statistic is 18.73. The p-value is <2e-16, so we can reject the null hypothesis that the coefficient is zero.
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
lm_model <- lm(y ~ x + 0)  # Linear regression of y onto x without an intercept
summary(lm_model)
##
## Call:
## lm(formula = y ~ x + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9154 -0.6472 -0.1771 0.5056 2.3109
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## x 1.9939 0.1065 18.73 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9586 on 99 degrees of freedom
## Multiple R-squared: 0.7798, Adjusted R-squared: 0.7776
## F-statistic: 350.7 on 1 and 99 DF, p-value: < 2.2e-16
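For the regression of x onto y without an intercept, the coefficient estimate is 0.39111, its standard error is 0.02089, and the t-statistic is 18.73. The p-value is <2e-16, so we can again reject the null hypothesis that the coefficient is zero.
The t-statistic and the p-value are identical in the two regressions. Both fits describe the same underlying relationship, since y=2x+e can be rewritten as x=0.5(y-e). The fitted slopes are not reciprocals of each other, however (1/1.9939 is about 0.5015, not 0.39111), so the two fitted lines differ even though the strength of evidence for an association is the same.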
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
lm_model <- lm(x ~ y + 0)  # Linear regression of x onto y without an intercept
summary(lm_model)
##
## Call:
## lm(formula = x ~ y + 0)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8699 -0.2368 0.1030 0.2858 0.8938
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## y 0.39111 0.02089 18.73 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4246 on 99 degrees of freedom
## Multiple R-squared: 0.7798, Adjusted R-squared: 0.7776
## F-statistic: 350.7 on 1 and 99 DF, p-value: < 2.2e-16
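The comparison of the two no-intercept fits can also be checked programmatically. The following is a minimal sketch, assuming the same simulated data as above; the names `fit_yx`, `fit_xy`, `t_yx`, `t_xy` and `slope_product` are introduced here for illustration only. It extracts the t value from each coefficient table and also computes the product of the two slopes, which for regression through the origin equals the reported Multiple R-squared.

```r
# Recreate the simulated data used above
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Fit both regressions through the origin
fit_yx <- lm(y ~ x + 0)
fit_xy <- lm(x ~ y + 0)

# Pull the t value out of each coefficient table
t_yx <- coef(summary(fit_yx))["x", "t value"]
t_xy <- coef(summary(fit_xy))["y", "t value"]

# The two t-statistics agree up to floating-point error
all.equal(t_yx, t_xy)

# The slopes are not reciprocals: their product is the (uncentered) R-squared,
# which is below 1 whenever the fit is not perfect
slope_product <- unname(coef(fit_yx)) * unname(coef(fit_xy))
c(slope_product = slope_product, r_squared = summary(fit_yx)$r.squared)
```

Because the product of the slopes is less than 1, inverting one fitted line does not give the other.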
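The formulae for the coefficient estimate and the standard error of the coefficient estimate are as below -
$$ \hat{\beta} = \frac{\sum_i x_i y_i}{\sum_j x_j^2}; \qquad SE(\hat{\beta}) = \sqrt{\frac{\sum_i (y_i - x_i \hat{\beta})^2}{(n-1) \sum_j x_j^2}} $$
Considering that the t statistic is the ratio of the coefficient estimate to its standard error, the t statistic value can be represented as
$$ t = \frac{\hat{\beta}}{SE(\hat{\beta})} = \frac{\hat{\beta} \sqrt{n-1} \sqrt{\sum_j x_j^2}}{\sqrt{\sum_i (y_i - x_i \hat{\beta})^2}} $$
Substituting the value of the coefficient estimate, we get
$$ t = \frac{\sqrt{n-1} \sum_i x_i y_i}{\sqrt{\sum_j x_j^2} \sqrt{\sum_i (y_i - x_i \hat{\beta})^2}} = \frac{\sqrt{n-1} \sum_i x_i y_i}{\sqrt{\sum_i x_i^2 \sum_j y_j^2 - \left(\sum_i x_i y_i\right)^2}}. $$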
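The last substitution step relies on expanding the residual sum of squares. As a short worked sketch, using only the no-intercept least squares estimate $\hat{\beta} = \sum_i x_i y_i / \sum_j x_j^2$:

$$ \sum_i (y_i - x_i \hat{\beta})^2 = \sum_i y_i^2 - 2 \hat{\beta} \sum_i x_i y_i + \hat{\beta}^2 \sum_i x_i^2 = \sum_i y_i^2 - \frac{\left(\sum_i x_i y_i\right)^2}{\sum_j x_j^2} $$

Multiplying this by $\sum_j x_j^2$ gives $\sum_i x_i^2 \sum_j y_j^2 - \left(\sum_i x_i y_i\right)^2$, which is the quantity under the square root in the denominator of the t statistic.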
When the formula derived for the t statistic is evaluated in R, it gives the same value as the one reported by the models in parts a) and b).
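n <- length(x)
# t-statistic from the closed-form expression; named t_stat to avoid masking base::t()
t_stat <- (x %*% y) * sqrt(n - 1) / sqrt(sum(x^2) * sum(y^2) - (x %*% y)^2)
t_stat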
## [,1]
## [1,] 18.72593
If we swap x and y in the formula for the t-statistic, the expression is unchanged because it is symmetric in x and y, so we obtain the same value, as shown below.
n <- length(y)
# Same closed-form expression with the roles of x and y exchanged
t_stat <- (y %*% x) * sqrt(n - 1) / sqrt(sum(y^2) * sum(x^2) - (y %*% x)^2)
t_stat
## [,1]
## [1,] 18.72593
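The symmetry argument can be packaged as a small function. This is a minimal sketch: `t_through_origin` is a hypothetical helper introduced here, not part of base R, and it uses `sum(u * v)` in place of `%*%` so that the result is a plain number instead of a 1x1 matrix.

```r
# Hypothetical helper (not from the original analysis): t-statistic for a
# regression through the origin, written from the closed-form expression
t_through_origin <- function(u, v) {
  stopifnot(length(u) == length(v))
  n <- length(u)
  sxy <- sum(u * v)  # cross-product of the two variables
  sqrt(n - 1) * sxy / sqrt(sum(u^2) * sum(v^2) - sxy^2)
}

# Same simulated data as above
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Swapping the arguments leaves the value unchanged
t_through_origin(x, y)
t_through_origin(y, x)

# Cross-check against the t value reported by lm()
t_lm <- coef(summary(lm(y ~ x + 0)))["x", "t value"]
all.equal(t_through_origin(x, y), t_lm)
```

Every term in the function body treats `u` and `v` interchangeably, which is why the argument order does not matter.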
When the intercept is included, the t-statistic is the same for the regression of y onto x and x onto y (both are equal to 18.56 as shown below).
lm_model_with_intercept <- lm(y ~ x)  # Regression of y onto x with an intercept
summary(lm_model_with_intercept)
##
## Call:
## lm(formula = y ~ x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8768 -0.6138 -0.1395 0.5394 2.3462
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.03769 0.09699 -0.389 0.698
## x 1.99894 0.10773 18.556 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9628 on 98 degrees of freedom
## Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
## F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
lm_model_with_intercept <- lm(x ~ y)  # Regression of x onto y with an intercept
summary(lm_model_with_intercept)
##
## Call:
## lm(formula = x ~ y)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.90848 -0.28101 0.06274 0.24570 0.85736
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.03880 0.04266 0.91 0.365
## y 0.38942 0.02099 18.56 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4249 on 98 degrees of freedom
## Multiple R-squared: 0.7784, Adjusted R-squared: 0.7762
## F-statistic: 344.3 on 1 and 98 DF, p-value: < 2.2e-16
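One way to see why the two t-statistics coincide when an intercept is included is that, in simple linear regression, the slope's t-statistic can be written in terms of the sample correlation r as t = r * sqrt(n - 2) / sqrt(1 - r^2), and the correlation is symmetric in x and y. The sketch below checks this numerically on the same simulated data; the names `r`, `t_from_cor`, `t_yx` and `t_xy` are introduced here for illustration.

```r
# Same simulated data as above
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

n <- length(x)
r <- cor(x, y)  # sample correlation; cor(x, y) equals cor(y, x)

# t-statistic for the slope expressed through the correlation
t_from_cor <- r * sqrt(n - 2) / sqrt(1 - r^2)

# t values reported by lm() for each direction of the regression
t_yx <- coef(summary(lm(y ~ x)))["x", "t value"]
t_xy <- coef(summary(lm(x ~ y)))["y", "t value"]

c(t_from_cor = t_from_cor, t_yx = t_yx, t_xy = t_xy)
```

All three values should agree up to floating-point error, matching the 18.56 reported in both summaries.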