What is Simple Linear Regression?

Simple linear regression is a statistical method used to study the relationship between one explanatory variable and one response variable.

In this presentation, we use the mtcars dataset and study:

  • wt = weight of the car in 1000 lbs (the explanatory variable)
  • mpg = miles per gallon (the response variable)

We want to understand how the weight of a car affects its fuel efficiency.

Why is it Useful?

Simple linear regression helps us:

  • understand the relationship between two variables
  • predict the value of one variable using another
  • determine whether the relationship is positive or negative

Example question:

How does car weight affect fuel efficiency?

Mathematical Model

The simple linear regression model is:

\[ y = \beta_0 + \beta_1 x + \varepsilon \]

Where:

  • \(y\) is the response variable
  • \(x\) is the explanatory variable
  • \(\beta_0\) is the intercept
  • \(\beta_1\) is the slope
  • \(\varepsilon\) is the random error term

For our example:

\[ \text{mpg} = \beta_0 + \beta_1 \, \text{wt} + \varepsilon \]
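
As a minimal sketch, this model can be fitted in R with `lm()`. The `mtcars` data frame ships with base R; the object name `fit` is only an illustrative choice.

```r
# Fit mpg as a linear function of wt by ordinary least squares.
# `fit` is an illustrative name, not one taken from the presentation.
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept (b0) and slope (b1)
coef(fit)
```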

Interpretation of Coefficients

The estimated regression line is:

\[ \hat{y} = b_0 + b_1x \]

The slope tells us how much the predicted value of \(y\) changes when \(x\) increases by 1 unit.

\[ b_1 = \frac{\text{change in predicted } y}{\text{change in } x} \]

If \(b_1 < 0\), then as \(x\) increases, \(y\) tends to decrease.

In this example, heavier cars tend to have lower miles per gallon.
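
To see where the estimates come from, recall that under least squares the slope is \(b_1 = \operatorname{cov}(x, y) / \operatorname{var}(x)\) and the intercept is \(b_0 = \bar{y} - b_1 \bar{x}\). The sketch below computes both by hand and compares them with `lm()`; the variable names are illustrative.

```r
x <- mtcars$wt   # explanatory variable
y <- mtcars$mpg  # response variable

# Least-squares estimates computed directly from the data
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)
c(intercept = b0, slope = b1)

# The same estimates from lm(); the two should agree up to floating-point error
fit <- lm(mpg ~ wt, data = mtcars)
all.equal(unname(coef(fit)), c(b0, b1))
```

A negative `b1` here is the numerical counterpart of the statement that heavier cars tend to have lower miles per gallon.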

Example Data and Model Summary

From the fitted model:

  • Intercept = 37.285
  • Slope = -5.344
  • \(R^2 = 0.753\)

Interpretation:

  • When car weight increases by 1 unit (1000 lbs), predicted miles per gallon decreases by about 5.344.
  • About 75.3% of the variation in mpg is explained by wt.
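
These values can be read off a fitted model object. The following is a sketch assuming the model is fitted with `lm()`; the names `fit`, `fit_summary`, and `r_squared` are illustrative.

```r
fit <- lm(mpg ~ wt, data = mtcars)
fit_summary <- summary(fit)

# Intercept and slope, rounded to three decimals
round(coef(fit), 3)

# Proportion of the variation in mpg explained by wt
r_squared <- fit_summary$r.squared
round(r_squared, 3)

# In simple linear regression, R^2 equals the squared correlation of x and y
all.equal(r_squared, cor(mtcars$wt, mtcars$mpg)^2)
```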

GGPlot 1: Scatterplot with Regression Line

GGPlot 2: Residual Plot
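
A residual plot of this kind could be produced roughly as follows. This is a sketch, not necessarily the code behind the slide: it assumes the model is fitted with `lm()`, and `resid_df` and `resid_plot` are illustrative names. Points scattered around zero with no clear pattern suggest that a straight line is a reasonable description of the data.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# One row per car: fitted value and residual (observed mpg minus fitted mpg)
resid_df <- data.frame(
  fitted   = fitted(fit),
  residual = resid(fit)
)

# Draw the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  resid_plot <- ggplot(resid_df, aes(x = fitted, y = residual)) +
    geom_point(size = 2.5) +
    geom_hline(yintercept = 0, linetype = "dashed") +  # reference line at zero
    labs(
      title = "Residuals vs Fitted Values",
      x = "Fitted Miles per Gallon",
      y = "Residual"
    )
  print(resid_plot)
}
```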

Plotly Interactive Plot

Example Prediction

Suppose a car weighs 3.0 thousand pounds.

Substituting wt = 3.0 into the fitted line gives \(37.285 - 5.344 \times 3.0 \approx 21.25\), so the model predicts a fuel efficiency of about 21.25 miles per gallon.
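
The same prediction can be obtained with `predict()`. This is a minimal sketch; `new_car`, `pred`, and `manual` are illustrative names.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# A hypothetical car weighing 3.0 thousand pounds
new_car <- data.frame(wt = 3.0)

# Predicted mpg from the fitted model
pred <- predict(fit, newdata = new_car)
round(unname(pred), 2)

# The same value from the fitted equation b0 + b1 * wt
b <- coef(fit)
manual <- b[["(Intercept)"]] + b[["wt"]] * 3.0
all.equal(unname(pred), manual)
```

Predictions are most trustworthy for weights inside the range observed in the data, which `range(mtcars$wt)` will show.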

R Code Used for a Plot

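``` r
library(ggplot2)  # provides ggplot(), geom_point(), geom_smooth(), labs()

# Scatterplot of mpg against wt with the fitted least-squares line
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 2.5) +
  # method = "lm" draws the regression line; se = TRUE adds a 95% confidence band
  geom_smooth(method = "lm", se = TRUE) +
  labs(
    title = "Car Weight vs Miles per Gallon",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon"
  )
```
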
Conclusion

Simple linear regression is a useful statistical tool for studying the relationship between two variables.

From this example, we observe that:

  • heavier cars tend to have lower fuel efficiency
  • the fitted line summarizes the relationship in two numbers: an intercept and a slope
  • graphs help visualize the trend
  • prediction is possible after fitting the model

Simple linear regression is widely used in science, engineering, business, and data analysis.