Problem 2: Summary Statistics Regression Puzzle

A study uses a simple linear regression model to describe the relationship between the response variable Y and the explanatory variable x. The key summary statistics are given below.

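n <- 20         # sample size
r <- -0.85      # sample correlation between x and Y
sy <- 5         # sample standard deviation of Y
sx <- 6.78      # sample standard deviation of x
y_bar <- 11.02  # sample mean of Y
x_bar <- 9.28   # sample mean of x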

a) Use the above statistics to report the least squares regression line.

beta_1 <- r * (sy/sx)
beta_1
## [1] -0.6268437
beta_0 <- y_bar - (beta_1*x_bar)
beta_0
## [1] 16.83711

The least squares regression line is: y_hat = 16.83711 - 0.6268437x, where y_hat is the predicted value of Y at a given x.
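As a quick sanity check on these estimates, a least squares line always passes through the point (x_bar, y_bar). The sketch below wraps the fitted line in a small helper and confirms that property. The helper name `predict_y` and the value `x_new` are hypothetical and chosen only for illustration; they are not part of the problem.

```r
# Summary statistics from the problem statement
r <- -0.85; sy <- 5; sx <- 6.78
y_bar <- 11.02; x_bar <- 9.28

# Least squares estimates from the summary statistics
beta_1 <- r * (sy / sx)
beta_0 <- y_bar - beta_1 * x_bar

# Hypothetical helper: predicted Y at a given x
predict_y <- function(x) beta_0 + beta_1 * x

# The fitted line passes through (x_bar, y_bar)
passes_through_means <- isTRUE(all.equal(predict_y(x_bar), y_bar))
passes_through_means

# Prediction at an illustrative x value (hypothetical, not from the problem)
x_new <- 10
predict_y(x_new)
```

Because the slope is negative, predictions for x values above x_bar fall below y_bar.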

b) Use the above statistics to report the R^2 value. Please give an interpretation of this value in your own words.

R2 <- r^2
R2
## [1] 0.7225

The R-squared value (0.7225) is the proportion of the variability in the response Y that is explained by its linear relationship with x. In this case, about 72.25% of the variation in Y is explained by the simple linear regression on x, and the remaining 27.75% is left unexplained by the model.
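The "proportion of variability explained" reading can be made concrete by rebuilding the sums of squares from the same summary statistics. The sketch below is one way to do that; the names `ss_total`, `ss_reg`, `ss_unexplained`, and `R2_from_ss` are illustrative choices, not part of the original solution.

```r
# Summary statistics from the problem statement
n <- 20; r <- -0.85; sy <- 5; sx <- 6.78
beta_1 <- r * (sy / sx)

# Total variation in Y: SST = (n - 1) * sy^2
ss_total <- (n - 1) * sy^2

# Variation explained by the fitted line: SSR = beta_1^2 * Sxx, with Sxx = (n - 1) * sx^2
ss_reg <- beta_1^2 * (n - 1) * sx^2

# Variation left unexplained by the line: SSE = SST - SSR
ss_unexplained <- ss_total - ss_reg

# R^2 as explained variation over total variation
R2_from_ss <- ss_reg / ss_total
R2_from_ss
```

In simple linear regression this ratio equals r^2, which is why squaring the correlation is enough here.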

c) Use the above statistics to construct a 95% confidence interval for the slope parameter.

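# residual sum of squares: SSE = (1 - r^2) * SST, where SST = (n - 1) * sy^2
ss_res <- (1 - r^2) * (sy^2) * (n - 1)
# standard error of the slope: sqrt(MSE / Sxx), where Sxx = (n - 1) * sx^2
se_beta_1 <- sqrt((ss_res / (n - 2)) / ((n - 1) * sx^2))
# critical value from the t distribution with n - 2 degrees of freedom
t_crit <- qt(0.975, df = n - 2)

# lower bound
LB <- beta_1 - t_crit * se_beta_1
round(LB, 3)
## [1] -0.819
# upper bound
UB <- beta_1 + t_crit * se_beta_1
round(UB, 3)
## [1] -0.434

The 95% confidence interval for the slope is approximately (-0.819, -0.434). Because the interval lies entirely below zero, the data indicate a negative linear relationship between x and Y at the 5% significance level.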
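One way to cross-check a slope interval computed from summary statistics is to build an artificial data set that reproduces those statistics exactly and let `lm()` and `confint()` compute the standard t-based interval with n - 2 degrees of freedom. The sketch below does this. The data are synthetic, not the study's data, and the seed and the names `zx`, `e`, `zy`, `fit`, and `ci_lm` are arbitrary choices for illustration.

```r
set.seed(1)  # arbitrary seed; the summary statistics are matched exactly for any seed

# Summary statistics from the problem statement
n <- 20; r <- -0.85; sy <- 5; sx <- 6.78
y_bar <- 11.02; x_bar <- 9.28

# Standardized x values (mean 0, standard deviation 1)
zx <- as.numeric(scale(rnorm(n)))

# Standardized noise made exactly uncorrelated with zx by taking regression residuals
e <- as.numeric(scale(resid(lm(rnorm(n) ~ zx))))

# Combine so that cor(zx, zy) is exactly r and sd(zy) is exactly 1
zy <- r * zx + sqrt(1 - r^2) * e

# Rescale to the target means and standard deviations
x <- x_bar + sx * zx
y <- y_bar + sy * zy

# Fit the model and extract the 95% t-based confidence interval for the slope
fit <- lm(y ~ x)
ci_lm <- confint(fit, "x", level = 0.95)
ci_lm
```

The slope estimate and its standard error depend on the data only through n, r, sx, and sy, so any data set with these summary statistics gives the same interval.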