Simple linear regression is a statistical method that models the relationship between two variables by fitting a linear equation to observed data.
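\[ Y = \beta_0 + \beta_1 X + \epsilon \]
Here \(Y\) is the response, \(X\) is the predictor, \(\beta_0\) is the intercept, \(\beta_1\) is the slope, and \(\epsilon\) is a random error term.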
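To make the model equation concrete, here is a minimal sketch that simulates data from it and then recovers the parameters with `lm()`. The parameter values, sample size, and variable names (`beta_0`, `beta_1`, `x_sim`, `y_sim`, `sim_fit`) are assumptions chosen for illustration only; they do not come from the cars data.

```r
# Simulate data from Y = beta_0 + beta_1 * X + epsilon (all values are hypothetical)
set.seed(42)                              # arbitrary seed, for reproducibility only
beta_0 <- 1.5                             # assumed intercept
beta_1 <- 2.0                             # assumed slope
n <- 200                                  # assumed sample size

x_sim <- runif(n, min = 0, max = 10)      # predictor values
epsilon <- rnorm(n, mean = 0, sd = 1)     # random error term
y_sim <- beta_0 + beta_1 * x_sim + epsilon

# Least-squares fit; the estimates should land close to the assumed values
sim_fit <- lm(y_sim ~ x_sim)
coef(sim_fit)
```

With this much data and modest noise, the fitted intercept and slope should be close to, but not exactly equal to, the values used in the simulation.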
We’ll use R’s built-in cars dataset, which records the speed (mph) and stopping distance (ft) of 50 cars.
library(ggplot2)
library(plotly)
data(cars)
head(cars)
##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10
# Scatter plot of the data with the fitted least-squares line
ggplot(cars, aes(x = speed, y = dist)) +
  geom_point(color = "darkblue") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Speed vs. Stopping Distance",
       x = "Speed (mph)", y = "Stopping Distance (ft)")
## `geom_smooth()` using formula = 'y ~ x'
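# Fit the simple linear regression of stopping distance on speed
model <- lm(dist ~ speed, data = cars)
# Store the residuals (observed minus fitted values) alongside the data
cars$residuals <- resid(model)
# Plot residuals against speed; the dashed line marks zero
ggplot(cars, aes(x = speed, y = residuals)) +
  geom_point(color = "purple") +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "Residuals of Linear Model",
       x = "Speed (mph)", y = "Residual (ft)")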
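Beyond the residual plot, a few numeric checks can help in reading the fit. The sketch below refits the same model so that it runs on its own; the names `check_fit` and `res` are introduced here for illustration. It relies on two standard properties of least squares with an intercept: the residuals sum to zero and are uncorrelated with the predictor, up to floating-point error.

```r
# Refit the same model so this block is self-contained
check_fit <- lm(dist ~ speed, data = datasets::cars)

coef(check_fit)                    # estimated intercept and slope

res <- resid(check_fit)
sum(res)                           # effectively zero (floating-point noise only)
cor(res, datasets::cars$speed)     # effectively zero as well

summary(check_fit)$r.squared       # share of variance in dist explained by speed
```

These checks confirm that the fit is a valid least-squares solution; they do not show that a straight line is the right model, which is what the residual plot is for.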
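# Simulate a response z that depends linearly on two predictors, x and y
set.seed(123)
x <- rnorm(100)
y <- rnorm(100)
z <- 2 + 3 * x + 4 * y + rnorm(100)
# Interactive 3D scatter plot, with points colored by z
plot_ly(x = ~x, y = ~y, z = ~z, type = "scatter3d", mode = "markers",
        marker = list(size = 3, color = z, colorscale = "Viridis"))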
The estimated coefficients are computed as:
\[ \hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]
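These formulas can be checked directly against `lm()`. The following sketch computes both estimates by hand for the cars data and compares them with the fitted coefficients; the names `x_obs`, `y_obs`, `beta_1_hat`, `beta_0_hat`, and `fit` are introduced here for illustration.

```r
# Predictor and response from the built-in cars dataset
x_obs <- datasets::cars$speed
y_obs <- datasets::cars$dist

# Slope: sum of cross-deviations divided by sum of squared deviations of x
beta_1_hat <- sum((x_obs - mean(x_obs)) * (y_obs - mean(y_obs))) /
  sum((x_obs - mean(x_obs))^2)

# Intercept: forces the fitted line through the point (mean(x), mean(y))
beta_0_hat <- mean(y_obs) - beta_1_hat * mean(x_obs)

# Compare with the coefficients reported by lm()
fit <- lm(dist ~ speed, data = datasets::cars)
all.equal(c(beta_0_hat, beta_1_hat), unname(coef(fit)))
```

The comparison uses `all.equal()` rather than `==` because the two computations can differ by floating-point rounding.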
Thank you!