Simple Linear Regression is a basic yet powerful statistical technique used to model the relationship between a single explanatory variable and a dependent variable.
2025-03-26
The simple linear regression model is represented by the equation:
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
Where:
- \(y\): Response variable
- \(x\): Predictor variable
- \(\beta_0\): Intercept
- \(\beta_1\): Slope
- \(\varepsilon\): Random error
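To make the roles of the intercept, slope, and error term concrete, here is a minimal sketch that simulates data from the model and recovers the coefficients with the closed-form least-squares formulas. The "true" values (intercept 2, slope 0.5, error standard deviation 1), the sample size, and all variable names are assumptions chosen purely for illustration.

```r
set.seed(1)                          # make the simulated noise reproducible

n   <- 100
x   <- runif(n, min = 0, max = 10)   # predictor values
eps <- rnorm(n, mean = 0, sd = 1)    # random error term
y   <- 2 + 0.5 * x + eps             # assumed beta_0 = 2, beta_1 = 0.5

# Closed-form least-squares estimates
b1 <- cov(x, y) / var(x)             # slope: covariance over predictor variance
b0 <- mean(y) - b1 * mean(x)         # intercept: line passes through the means

# lm() should give the same estimates
sim_fit <- lm(y ~ x)
all.equal(unname(coef(sim_fit)), c(b0, b1))
```

The estimates will be close to, but not exactly, the assumed values, because the error term adds noise to every observation.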
We will explore the relationship between car weight (wt) and miles per gallon (mpg) using the built-in mtcars dataset in R.
library(ggplot2)

# Scatter plot of mpg against weight, with the least-squares line and its confidence band
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "darkblue") +
  geom_smooth(method = "lm", se = TRUE, color = "red") +
  labs(title = "Miles per Gallon vs Weight of Car",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon")
# Fit the simple linear regression and extract its residuals
model <- lm(mpg ~ wt, data = mtcars)
res <- residuals(model)

# Residuals against the predictor; points should scatter evenly around zero
ggplot(data.frame(wt = mtcars$wt, res = res), aes(x = wt, y = res)) +
  geom_point(color = "purple") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  labs(title = "Residuals vs Weight",
       x = "Weight (1000 lbs)",
       y = "Residuals")
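A residuals-versus-predictor plot mainly speaks to linearity and constant variance. As a complementary check, the following sketch draws a normal Q-Q plot of the same residuals, which is one common way to assess whether the errors look roughly normal. The object names `res_df` and `qq_plot` are illustrative, and the model is refitted so the block runs on its own.

```r
library(ggplot2)

# Refit the model so this block is self-contained
model  <- lm(mpg ~ wt, data = mtcars)
res_df <- data.frame(res = residuals(model))

# Normal Q-Q plot: points near the dashed line suggest roughly normal residuals
qq_plot <- ggplot(res_df, aes(sample = res)) +
  stat_qq(color = "purple") +
  stat_qq_line(linetype = "dashed", color = "black") +
  labs(title = "Normal Q-Q Plot of Residuals",
       x = "Theoretical Quantiles",
       y = "Sample Quantiles")

qq_plot
```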
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
##     last_plot
## The following object is masked from 'package:stats':
##
##     filter
## The following object is masked from 'package:graphics':
##
##     layout
plot_ly(data = mtcars, x = ~wt, y = ~mpg,
        type = 'scatter', mode = 'markers',
        marker = list(color = 'green', size = 10)) %>%
  layout(title = "Interactive MPG vs Weight",
         xaxis = list(title = "Weight"),
         yaxis = list(title = "MPG"))
summary(model)
##
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -4.5432 -2.3647 -0.1252  1.4096  6.8727
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
\[ \text{mpg} = 37.2851 - 5.3445 \cdot \text{wt} \]
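As a worked example of using the fitted equation, the sketch below predicts fuel economy for a car weighing 3000 lbs (wt = 3), a value picked only for illustration. It compares `predict()` with the same calculation done by hand from the rounded coefficients; the names `new_car`, `pred`, and `by_hand` are illustrative.

```r
# Refit the model so this block is self-contained
model <- lm(mpg ~ wt, data = mtcars)

# Hypothetical car weighing 3000 lbs (wt is measured in 1000 lbs)
new_car <- data.frame(wt = 3)
pred    <- predict(model, newdata = new_car)

# Same calculation from the rounded coefficients:
# 37.2851 - 5.3445 * 3 = 37.2851 - 16.0335 = 21.2516
by_hand <- 37.2851 - 5.3445 * 3

pred
by_hand
```

The two values differ only in the later decimal places, because `predict()` uses the unrounded coefficients.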
- The p-value for wt is very small (\(< 0.001\)), so weight is a statistically significant predictor of mpg.
- Plots made with ggplot2 and plotly help us understand patterns and validate model assumptions.
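The significance of the slope can also be read programmatically instead of from the printed summary. This sketch pulls the p-value for wt out of the coefficient table and computes 95% confidence intervals with `confint()`; the names `coef_table`, `p_wt`, and `ci` are illustrative.

```r
# Refit the model so this block is self-contained
model <- lm(mpg ~ wt, data = mtcars)

# Coefficient table: estimate, standard error, t value, p-value
coef_table <- summary(model)$coefficients
p_wt <- coef_table["wt", "Pr(>|t|)"]
p_wt < 0.001                 # TRUE when the slope is significant at the 0.1% level

# 95% confidence intervals for the intercept and slope
ci <- confint(model, level = 0.95)
ci["wt", ]                   # an interval that excludes 0 agrees with the small p-value
```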