2025-06-06

What is Simple Linear Regression?

Simple linear regression models the relationship between two variables by fitting a straight line to the observed data.

  • One independent variable (x)
  • One dependent variable (y)
  • Goal: predict y using x

The equation of the line is:

\(\hat{y} = \beta_0 + \beta_1 x\)

Where:
- \(\beta_0\) is the intercept
- \(\beta_1\) is the slope
- \(\hat{y}\) is the predicted value of y given x
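
The intercept and slope are usually estimated by ordinary least squares, which picks the line that minimizes the sum of squared vertical distances between the observed and predicted values. As a minimal sketch, the closed-form estimates \(\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\) can be computed by hand and compared with lm(). The five data points below are invented purely for illustration.

```r
# Toy data, invented for illustration only
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Closed-form least-squares estimates
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                      # intercept

# The same estimates from R's built-in lm()
fit <- lm(y ~ x)

# Both rows should agree (intercept first, then slope)
rbind(by_hand = c(intercept = b0, slope = b1),
      lm      = unname(coef(fit)))
```

Computing the estimates by hand is only meant to show what lm() does internally; in practice lm() is the more convenient choice.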

Assumptions of Simple Linear Regression

For the model's estimates and the inference built on them (standard errors, confidence intervals, p-values) to be reliable, several key assumptions must hold:

  • Linearity: The relationship between x and y is linear
    \(y = \beta_0 + \beta_1 x + \varepsilon\)

  • Independence: Observations are independent of each other

  • Homoscedasticity: The variance of the errors is constant across all levels of x (checked in practice through the residuals)

  • Normality of Residuals: The errors \(\varepsilon\) are normally distributed
    \(\varepsilon \sim N(0, \sigma^2)\)
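
One way to get a feel for these assumptions is to simulate data that satisfies them and look at the usual diagnostics. The sketch below is only illustrative: the sample size, the true coefficients (\(\beta_0 = 2\), \(\beta_1 = 3\)), \(\sigma = 1\), and the seed are all arbitrary choices, not values from this presentation.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Simulate data that satisfies the assumptions (all values are illustrative)
n   <- 200
x   <- runif(n, min = 0, max = 10)
eps <- rnorm(n, mean = 0, sd = 1)  # independent N(0, sigma^2) errors, sigma = 1
y   <- 2 + 3 * x + eps             # beta_0 = 2, beta_1 = 3

fit <- lm(y ~ x)
res <- residuals(fit)

# Linearity and homoscedasticity: residuals should scatter evenly around 0
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normality: points should fall close to the reference line
qqnorm(res)
qqline(res)

# A formal normality check (Shapiro-Wilk test)
shapiro.test(res)
```

A curved or funnel-shaped pattern in the first plot would suggest a violation of linearity or homoscedasticity, while strong departures from the line in the Q-Q plot would suggest non-normal errors.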

Dataset: mtcars

We’ll use the built-in mtcars dataset in R, which records fuel consumption and 10 other design and performance attributes for 32 cars.

For this example, we’ll model:

  • mpg (Miles per Gallon) — our dependent variable
  • hp (Horsepower) — our independent variable

This will help us explore how a car’s horsepower is related to its fuel efficiency.
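
Before fitting anything, it can help to look at the two variables directly. The following is a small sketch using only base R; the data frame name cars_sub is just an illustrative choice.

```r
data(mtcars)

# Keep only the two variables used in the model
cars_sub <- mtcars[, c("mpg", "hp")]

head(cars_sub)     # first few rows
summary(cars_sub)  # range and quartiles of each variable

# Pearson correlation between horsepower and fuel efficiency
r <- cor(cars_sub$hp, cars_sub$mpg)
r
```

A negative correlation here indicates that cars with more horsepower tend to get fewer miles per gallon, which is the pattern the regression line will summarize.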

Visualizing the Relationship
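
The original figure is not reproduced here. A scatter plot with a fitted regression line could be drawn along the following lines; this is a sketch assuming the ggplot2 package is installed, and the title and axis labels are illustrative choices.

```r
library(ggplot2)

# Scatter plot of mpg against hp with a least-squares line and its confidence band
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    title = "Fuel Efficiency vs Horsepower",
    x = "Horsepower",
    y = "Miles per Gallon"
  )

p
```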

Residual Plot

Interactive Plot with Plotly
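
The interactive figure is not reproduced here. One possible way to build a comparable plot is sketched below, assuming the plotly package is installed; the hover text, trace names, and labels are illustrative choices rather than the original settings.

```r
library(plotly)

# Copy the data and add the car names and fitted values as columns
d <- mtcars
d$car    <- rownames(d)
d$fitted <- fitted(lm(mpg ~ hp, data = d))

# Observed points plus the fitted regression line; hovering shows the car name
fig <- plot_ly(d, x = ~hp)
fig <- add_markers(fig, y = ~mpg, text = ~car, name = "Observed")
fig <- add_lines(fig, y = ~fitted, name = "Fitted line")
fig <- layout(
  fig,
  title = "Fuel Efficiency vs Horsepower",
  xaxis = list(title = "Horsepower"),
  yaxis = list(title = "Miles per Gallon")
)

fig
```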

R Code for Linear Regression


# Load ggplot2, needed for the residual plot below
library(ggplot2)

# Load dataset
data(mtcars)

# Fit the linear regression model: mpg as a function of hp
model <- lm(mpg ~ hp, data = mtcars)

# View model summary
summary(model)

# Add residuals to dataset
mtcars$residuals <- residuals(model)

# Create residual plot
ggplot(mtcars, aes(x = hp, y = residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(
    title = "Residuals vs Horsepower",
    x = "Horsepower",
    y = "Residuals"
  )
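
Once the model is fitted, the coefficients can be read off and used for prediction. The sketch below refits the same model so that it runs on its own; the horsepower values in new_cars are hypothetical and chosen only to illustrate predict().

```r
# Refit the model so this snippet is self-contained
model <- lm(mpg ~ hp, data = mtcars)

coef(model)               # estimated intercept and slope
confint(model)            # 95% confidence intervals for both coefficients
summary(model)$r.squared  # share of the variance in mpg explained by hp

# Predict mpg for hypothetical horsepower values, with 95% prediction intervals
new_cars <- data.frame(hp = c(100, 150, 200))
pred <- predict(model, newdata = new_cars, interval = "prediction")
cbind(new_cars, pred)
```

The slope is the estimated change in mpg for each additional unit of horsepower, so a negative slope means predicted fuel efficiency falls as horsepower rises.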

Conclusion

  • Simple linear regression helps us understand and predict relationships between two variables.

  • In this case, we explored how horsepower relates to fuel efficiency using the mtcars dataset.

  • Regression models are used across many fields, from economics to engineering, for forecasting and decision-making.

  • This foundational concept is the building block for more advanced techniques like multiple regression and machine learning.

Thanks for viewing!