My Statistics Topic: Simple linear regression

Simple linear regression models the relationship between:

  • a predictor X (what we know)
  • a response Y (what we want to predict)

Example: Hours spent jump-roping (X) -> Vertical jump height in cm (Y)

SLR mathematical model / equation:

\[Y_i = \beta_0 + \beta_1 X_i + \epsilon_i\]
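
Here \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\epsilon_i\) is the random error for observation \(i\); in the usual formulation the errors are assumed independent and normally distributed,

\[\epsilon_i \sim N(0, \sigma^2)\]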

Example data (Randomly Generated)

 x (hours)    y (cm)
  2.655087  59.63320
  3.721239  63.36479
  5.728534  77.09591
  9.082078  89.66829
  2.016819  53.93474
  8.983897  81.69062
  9.446753  89.97450
  6.607978  81.04311
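
The code that generated these values isn't shown; for the R snippets later in the deck, a minimal sketch that simply enters them into the data frame df that the later lm() call expects:

# the example data, stored as the data frame `df` used by the later R code
df <- data.frame(
  x = c(2.655087, 3.721239, 5.728534, 9.082078,
        2.016819, 8.983897, 9.446753, 6.607978),
  y = c(59.63320, 63.36479, 77.09591, 89.66829,
        53.93474, 81.69062, 89.97450, 81.04311)
)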

Example scatterplot with regression line
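
The figure itself isn't reproduced in this text version. A minimal base-R sketch that would draw it, assuming the df data frame above (the original slides may well have used another plotting package):

# scatterplot of the data with the least squares line overlaid
plot(df$x, df$y,
     xlab = "Hours spent jump-roping", ylab = "Vertical jump (cm)")
abline(lm(y ~ x, data = df), col = "blue")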

Interpreting the slope from the previous two slides

\[\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\]

  • Intercept: 50.65
  • Slope: 4.01
  • Each extra hour of jump-roping increases the predicted vertical jump by about 4.01 centimeters.
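
For example, plugging in x = 5 hours of jump-roping (using the rounded estimates above):

\[\hat{y} = 50.65 + 4.01 \times 5 = 70.70 \text{ cm}\]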

Residual plot

\[e_i = y_i - \hat{y}_i\]
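
The plot itself isn't shown here; a minimal base-R sketch, again assuming df, that plots each residual against its fitted value:

# residuals e_i = y_i - y_hat_i, plotted against the fitted values
fit <- lm(y ~ x, data = df)
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)   # a reasonable model scatters randomly around this line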

The classic least squares idea

\[SSE = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2\]
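
Minimizing SSE over the intercept and slope gives the usual closed-form least squares estimates:

\[\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\]

A short R sketch of the same calculation, assuming the df data frame from earlier; it should reproduce the coefficients reported by lm():

# slope and intercept computed directly from the least squares formulas
b1 <- sum((df$x - mean(df$x)) * (df$y - mean(df$y))) / sum((df$x - mean(df$x))^2)
b0 <- mean(df$y) - b1 * mean(df$x)
c(intercept = b0, slope = b1)   # should match coef(lm(y ~ x, data = df))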

Plotly 3D example
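
The interactive figure isn't included in this text version. One plausible sketch of a 3D plotly figure for this topic (an assumption, since the original plot isn't shown) is the SSE surface over a grid of candidate intercept and slope values, which visualizes the minimum that least squares finds. The grid ranges below are illustrative, and df is the data frame from earlier:

library(plotly)

# grid of candidate intercepts and slopes (illustrative ranges around the fit)
b0_grid <- seq(40, 60, length.out = 60)
b1_grid <- seq(0, 8, length.out = 60)

# SSE for every (intercept, slope) pair; rows follow the slope grid
sse <- outer(b1_grid, b0_grid,
             Vectorize(function(slope, intercept) {
               sum((df$y - (intercept + slope * df$x))^2)
             }))

# surface plot: x = intercept, y = slope, z = SSE
plot_ly(x = b0_grid, y = b1_grid, z = sse, type = "surface") %>%
  layout(scene = list(xaxis = list(title = "intercept"),
                      yaxis = list(title = "slope"),
                      zaxis = list(title = "SSE")))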

R code example

# fit the simple linear regression of y on x
fit <- lm(y ~ x, data = df)
# coefficient estimates, standard errors, t-tests, and R-squared
summary(fit)
# predicted vertical jump at 2, 5, and 9 hours of jump-roping
predict(fit, newdata = data.frame(x = c(2, 5, 9)))
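
With the rounded coefficients from earlier (intercept 50.65, slope 4.01), those three predictions work out to roughly 58.7, 70.7, and 86.7 cm.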

Conclusion!!

  • Linear regression models a straight-line relationship.
  • The slope is the change in the predicted response for a one-unit increase in the predictor (here, about 4.01 cm per extra hour of jump-roping).
  • Residuals help check whether a straight-line model is reasonable and give a visual sense of how well the model fits.