- What is simple linear regression
- Model formulation
- Example with data
- Visualization and interpretation
2026-01-23
Simple linear regression models the relationship between a response variable \(y\)
and a predictor variable \(x\) as:
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
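where \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\varepsilon\) is a random error term with mean zero.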
To illustrate simple linear regression, we generate simulated data using the model:
\[ y = 2 + 1.5x + \varepsilon,\quad \varepsilon \sim N(0, 2^2) \]
This approach allows us to control the true relationship between \(x\) and \(y\), while introducing realistic random variation.
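A minimal sketch of how such data could be simulated in R. The sample size of 50 is inferred from the 48 residual degrees of freedom reported in the model summary (n - 2 = 48). The seed and the uniform range for \(x\) are assumptions made here for illustration, so the numbers will not reproduce the fitted values shown elsewhere in these slides.

```r
# Simulate from y = 2 + 1.5 x + eps, with eps ~ N(0, 2^2).
# ASSUMPTIONS (not given in the slides): the seed and x ~ Uniform(0, 10).
set.seed(123)

n <- 50                            # inferred: 48 residual df = n - 2
x <- runif(n, min = 0, max = 10)   # assumed predictor distribution
eps <- rnorm(n, mean = 0, sd = 2)  # error standard deviation 2, i.e. variance 2^2
y <- 2 + 1.5 * x + eps

data <- data.frame(x = x, y = y)
head(data)
```

Because the true intercept (2) and slope (1.5) are known, the estimates from any fitted model can be compared directly against them.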
We fit a simple linear regression model and overlay the fitted line on the data.
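One way such a plot could be produced with ggplot2 is sketched below. The data are a stand-in simulated under assumed settings (seed, uniform \(x\) on 0 to 10), and the choice to hide the confidence band (`se = FALSE`) is also an assumption, not necessarily what the original figure used.

```r
library(ggplot2)

# Stand-in data; the seed and the x range are assumptions for illustration.
set.seed(123)
data <- data.frame(x = runif(50, min = 0, max = 10))
data$y <- 2 + 1.5 * data$x + rnorm(50, mean = 0, sd = 2)

# Scatter plot of the observations with the least-squares line overlaid.
p <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +  # method = "lm" fits y ~ x by OLS
  labs(x = "x", y = "y")

print(p)
```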
To assess model assumptions, we plot the residuals against the fitted values. A random scatter around zero, with no curvature and roughly constant spread, suggests that the linear model is adequate.
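A residuals-versus-fitted plot can be sketched with base R graphics as follows. The data are simulated under assumed settings (seed, uniform \(x\)), and base graphics is used here only to keep the example short; the original figure may have been drawn differently.

```r
# Stand-in data; the seed and the x range are assumptions for illustration.
set.seed(123)
data <- data.frame(x = runif(50, min = 0, max = 10))
data$y <- 2 + 1.5 * data$x + rnorm(50, mean = 0, sd = 2)

model <- lm(y ~ x, data = data)

fitted_vals <- fitted(model)  # y-hat for each observation
residuals_ <- resid(model)    # y - y-hat for each observation

# Residuals against fitted values, with a reference line at zero.
plot(fitted_vals, residuals_,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted")
abline(h = 0, lty = 2)
```

A funnel shape would point to non-constant variance, and a curved band would point to a missing nonlinear term.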
## Estimation of Coefficients
The estimates of \(\beta_0\) and \(\beta_1\) are obtained by minimizing the sum of squared residuals:
\[ \text{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]
The estimated regression line is:
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]
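For simple linear regression, the minimizer of the RSS has the well-known closed form \(\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). The sketch below computes these by hand and checks them against `lm()`; the simulated data (seed, uniform \(x\)) are assumptions for illustration.

```r
# Stand-in data; the seed and the x range are assumptions for illustration.
set.seed(123)
x <- runif(50, min = 0, max = 10)
y <- 2 + 1.5 * x + rnorm(50, mean = 0, sd = 2)

# Closed-form least-squares estimates.
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept

# Residual sum of squares at the estimates.
rss <- sum((y - (b0 + b1 * x))^2)

# lm() should return the same coefficients up to floating-point error.
fit <- lm(y ~ x)
all.equal(unname(coef(fit)), c(b0, b1))
```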
The following R code fits a simple linear regression model to the data:
model <- lm(y ~ x, data = data)
summary(model)
## 
## Call:
## lm(formula = y ~ x, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0224 -1.2445 -0.1639  1.3318  4.3193 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.13532    0.52122   4.097  0.00016 ***
## x            1.48670    0.08982  16.552  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.871 on 48 degrees of freedom
## Multiple R-squared:  0.8509, Adjusted R-squared:  0.8478 
## F-statistic:   274 on 1 and 48 DF,  p-value: < 2.2e-16
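The individual pieces of this output can also be extracted programmatically. Each t value is the estimate divided by its standard error (for the slope above, 1.48670 / 0.08982 ≈ 16.55). The sketch below shows this on stand-in data simulated under assumed settings (seed, uniform \(x\)), so its numbers will differ from the output above.

```r
# Stand-in data; the seed and the x range are assumptions for illustration.
set.seed(123)
data <- data.frame(x = runif(50, min = 0, max = 10))
data$y <- 2 + 1.5 * data$x + rnorm(50, mean = 0, sd = 2)

model <- lm(y ~ x, data = data)
s <- summary(model)

coefs <- coef(s)  # matrix: Estimate, Std. Error, t value, Pr(>|t|)
t_manual <- coefs[, "Estimate"] / coefs[, "Std. Error"]  # reproduces "t value"

ci <- confint(model, level = 0.95)  # 95% confidence intervals for both coefficients
r2 <- s$r.squared                   # proportion of variance in y explained by x
resid_df <- s$df[2]                 # residual degrees of freedom, n - 2

print(coefs)
print(ci)
```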
A three-dimensional view of the fitted values and residuals.
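A view of this kind could be built with plotly's `plot_ly()` along the following lines. The axis assignment (predictor, fitted value, residual) is a guess at what the original figure showed, and the data are simulated under assumed settings (seed, uniform \(x\)).

```r
library(plotly)

# Stand-in data; the seed and the x range are assumptions for illustration.
set.seed(123)
data <- data.frame(x = runif(50, min = 0, max = 10))
data$y <- 2 + 1.5 * data$x + rnorm(50, mean = 0, sd = 2)
model <- lm(y ~ x, data = data)

# One row per observation: predictor, fitted value, residual.
plot_df <- data.frame(
  x = data$x,
  fitted = unname(fitted(model)),
  residual = unname(resid(model))
)

# Interactive 3D scatter; the choice of axes is an assumption.
fig <- plot_ly(plot_df, x = ~x, y = ~fitted, z = ~residual,
               type = "scatter3d", mode = "markers",
               marker = list(size = 3))
fig <- layout(fig, scene = list(
  xaxis = list(title = "x"),
  yaxis = list(title = "Fitted value"),
  zaxis = list(title = "Residual")
))
fig
```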