2026-04-13

Simple Linear Regression

Modeling the relationship between a predictor (X) and a response (Y) using a straight line.

What is Simple Linear Regression?

  • A statistical method for describing the relationship between two variables

  • One predictor variable (X) is used to explain or predict one response variable (Y)

  • The relationship is modeled with a straight line

  • Example:

    • Predicting fuel efficiency (MPG) from car weight

Why is Simple Linear Regression Important?

  • Helps predict outcomes using observed data
  • Measures the strength and direction of a relationship
  • Supports decision making in many fields:
    • Business
    • Science
    • Engineering
    • Data analytics
  • Example:
    • Estimating fuel efficiency from vehicle weight
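As a rough illustration of "strength and direction", the sketch below computes the Pearson correlation and the R-squared of the fit for the built-in mtcars data. The variable names `r` and `r_squared` are chosen here for illustration only.

```r
# Direction and strength of the linear relationship between weight and mpg
r <- cor(mtcars$wt, mtcars$mpg)   # sign gives direction, magnitude gives strength

# R-squared: share of the variation in mpg accounted for by the fitted line
model <- lm(mpg ~ wt, data = mtcars)
r_squared <- summary(model)$r.squared

# In simple linear regression, R-squared equals the squared correlation
all.equal(r^2, r_squared)
```

A negative `r` indicates that mpg falls as weight rises; the closer `r_squared` is to 1, the more of the variation the line accounts for.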

Regression Equation

\[ y = \beta_0 + \beta_1 x + \epsilon \]

  • \(\beta_0\) = intercept (expected value of y when x = 0)
  • \(\beta_1\) = slope (expected change in y for a 1-unit increase in x)
  • \(\epsilon\) = error term (unexplained variation)
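One way to see where the intercept and slope come from is to compute the ordinary least-squares estimates by hand. This is a minimal sketch assuming the mtcars example used elsewhere in these slides. The names `b0` and `b1` are illustrative stand-ins for the estimates of \(\beta_0\) and \(\beta_1\).

```r
# Closed-form least-squares estimates for mpg ~ wt (built-in mtcars data)
x <- mtcars$wt
y <- mtcars$mpg

# Slope: sample covariance of x and y divided by the sample variance of x
b1 <- cov(x, y) / var(x)

# Intercept: chosen so the line passes through (mean(x), mean(y))
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)

# Should agree with lm() up to floating-point error
all.equal(unname(coef(lm(mpg ~ wt, data = mtcars))), c(b0, b1))
```

These values should match the coefficients reported by `coef(model)` in the "Linear Regression in R" slide.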

Interpreting the Slope

\[ \beta_1 = \frac{\Delta y}{\Delta x} \]

  • Represents how much Y changes, on average, for a one-unit increase in X
  • Positive slope = Y increases as X increases
  • Negative slope = Y decreases as X increases
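The interpretation above can be checked numerically: predictions at two X values exactly one unit apart should differ by the slope. The weights 3 and 4 (in 1000 lbs) are arbitrary values picked for this sketch.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Predict mpg at two weights exactly one unit (1000 lbs) apart
preds <- predict(model, newdata = data.frame(wt = c(3, 4)))

# The difference between the two predictions equals the fitted slope
slope_from_preds <- unname(preds[2] - preds[1])
slope_from_model <- unname(coef(model)["wt"])

all.equal(slope_from_preds, slope_from_model)
```

Because the slope is negative here, the heavier car gets the lower predicted mpg.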

Scatter Plot with Regression Line

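library(ggplot2)

# Scatter plot of mpg against weight with the least-squares line overlaid
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 2) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles Per Gallon"
  ) +
  theme_minimal()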

Residual Plot

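# Fit the model, then plot its residuals against the predictor
model <- lm(mpg ~ wt, data = mtcars)

ggplot(mtcars, aes(x = wt, y = resid(model))) +
  geom_point(size = 2) +
  # Dashed reference line at zero: residuals should scatter evenly around it
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    x = "Weight (1000 lbs)",
    y = "Residuals"
  ) +
  theme_minimal()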
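For readers wondering what `resid(model)` returns, the sketch below rebuilds the residuals as observed minus fitted values and checks two properties that hold for a least-squares fit with an intercept. The name `res_by_hand` is illustrative.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Residual = observed response minus fitted value
res_by_hand <- mtcars$mpg - fitted(model)
all.equal(res_by_hand, resid(model))

# With an intercept in the model, residuals sum to (numerically) zero
sum(resid(model))

# Residuals are also uncorrelated with the predictor
cor(mtcars$wt, resid(model))
```

Any curvature or funnel shape in the residual plot would therefore point to structure the straight line has missed, not to a nonzero average residual.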

Linear Regression in R

# Create Linear Regression Model
model <- lm(mpg ~ wt, data = mtcars)

# View Model Coefficients
coef(model)
## (Intercept)          wt 
##   37.285126   -5.344472

# Generate Predictions
predicted_mpg <- predict(model)

# Show First Few Predicted Values
head(predicted_mpg)
##         Mazda RX4     Mazda RX4 Wag        Datsun 710    Hornet 4 Drive 
##          23.28261          21.91977          24.88595          20.10265 
## Hornet Sportabout           Valiant 
##          18.90014          18.79325
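Calling `predict(model)` with no further arguments returns fitted values for the cars already in the data. To predict for cars that were not used in the fit, pass a `newdata` data frame, as in the sketch below. The weights in `new_cars` are hypothetical values chosen for illustration.

```r
model <- lm(mpg ~ wt, data = mtcars)

# Hypothetical cars weighing 2500 lbs and 3500 lbs (wt is in 1000 lbs)
new_cars <- data.frame(wt = c(2.5, 3.5))

# Point predictions from the fitted line
point_pred <- predict(model, newdata = new_cars)
point_pred

# Point predictions with 95% prediction intervals (columns fit, lwr, upr)
interval_pred <- predict(model, newdata = new_cars,
                         interval = "prediction", level = 0.95)
interval_pred
```

The column name in `newdata` must match the predictor name used in the model formula (`wt` here).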

Interactive Plot

library(plotly)

# Interactive scatter plot: hover over a point to see its weight and mpg
p <- plot_ly(
  data = mtcars,
  x = ~wt,
  y = ~mpg,
  type = "scatter",
  mode = "markers"
)

p
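The interactive scatter plot can also carry the regression line. The following is one possible sketch, assuming the plotly package is installed. The data frame `cars` and its column `fitted_mpg` are helper names introduced here and are not part of the original slides.

```r
library(plotly)

model <- lm(mpg ~ wt, data = mtcars)

# Copy the data and attach the fitted values so plotly can map them
cars <- mtcars
cars$fitted_mpg <- fitted(model)

p <- plot_ly(data = cars, x = ~wt, y = ~mpg)
p <- add_markers(p, name = "Observed")                     # raw data points
p <- add_lines(p, y = ~fitted_mpg, name = "Fitted line")   # least-squares line
p <- layout(
  p,
  xaxis = list(title = "Weight (1000 lbs)"),
  yaxis = list(title = "Miles Per Gallon")
)

p
```

Storing the fitted values as a column keeps both traces tied to the same data frame, which makes the formula syntax (`~fitted_mpg`) work without extra arguments.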

Conclusion

  • Simple linear regression models the relationship between one predictor (X) and one response (Y)
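  • In this example, car weight helps explain fuel efficiency
  • Heavier cars tend to have lower miles per gallon: the fitted slope of about -5.34 means each additional 1000 lbs is associated with roughly 5.3 fewer miles per gallon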
  • This method is useful for prediction, analysis, and data-driven decision making