2026-03-07

What Is Linear Regression?

  • Linear regression is used to predict a numeric value.
  • Simple linear regression models the relationship between two variables.
  • The relationship is represented by a straight line.
  • It helps us understand how a change in one variable is associated with a change in the other.

Variables in Linear Regression

  • Independent variable (x) → input variable
  • Dependent variable (y) → output variable
  • The model uses x to predict y

We will use the trees dataset in R and study how tree girth affects tree volume.

x = tree girth (column Girth, in inches)
y = tree volume (column Volume, in cubic feet)
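
As a quick orientation, the dataset can be inspected before any modelling. This is a minimal sketch; the names x and y are only illustrative and mirror the notation on this slide.

```r
# `trees` ships with base R (datasets package), so nothing needs installing.
data(trees)

str(trees)                               # 31 trees: Girth, Height, Volume
summary(trees[, c("Girth", "Volume")])   # ranges of the two variables used here

x <- trees$Girth    # independent variable (inches)
y <- trees$Volume   # dependent variable (cubic feet)
```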

Linear Regression Equation

\[ y = mx + b \]

  • \(y\) = predicted value
  • \(x\) = input variable
  • \(m\) = slope of the line
  • \(b\) = intercept

Best Fit Line

  • Linear regression finds the best fit line.
  • The line shows the relationship between x and y.
  • It helps us predict new values.
  • The goal is to find the line that lies closest to the data points, usually by minimizing the sum of squared vertical distances (least squares).
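
The slides do not spell out how the best fit line is computed. A minimal sketch, assuming the usual least-squares criterion (the one R's lm() uses), computes the slope and intercept by hand and compares them with lm(). The variable names m, b, and fit are illustrative.

```r
x <- trees$Girth
y <- trees$Volume

# Least-squares estimates for the line y = m*x + b
m <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b <- mean(y) - m * mean(x)

# The same line fitted with R's built-in linear model function
fit <- lm(Volume ~ Girth, data = trees)

c(b = b, m = m)   # hand-computed intercept and slope
coef(fit)         # lm() reports the intercept first, then the slope
```

Both approaches should agree up to floating-point error, which makes a handy check that the formula matches what lm() does.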

Data plot

This slide shows the relationship between Girth and Volume using a scatter plot.

Regression Line

This slide shows the linear regression line fitted to the data.

Slope Interpretation

The linear regression equation is:

\[ y = mx + b \]

The slope \(m\) tells us how much \(y\) changes when \(x\) increases by 1.

For the trees dataset the model gives \[ y = 5.07x - 36.94 \]

This means: \[ m = 5.07 \] So for every 1-inch increase in Girth, the predicted Volume increases by about 5.07 cubic feet.
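
The slope interpretation can be checked directly. In this sketch, the girths 12 and 13 are arbitrary illustrative values one inch apart; the difference between their predicted volumes should equal the slope.

```r
fit <- lm(Volume ~ Girth, data = trees)
round(coef(fit), 2)   # should match the -36.94 and 5.07 quoted above

# Two hypothetical trees whose girths differ by exactly 1 inch
new_trees <- data.frame(Girth = c(12, 13))
pred <- predict(fit, newdata = new_trees)

pred         # predicted volumes in cubic feet
diff(pred)   # the increase per 1 inch of girth, i.e. the slope
```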

R Code Example

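library(ggplot2)  # provides ggplot(), geom_point(), geom_smooth()
ggplot(data = trees, aes(x = Girth, y = Volume)) +
  geom_point() +                            # scatter plot of the raw data
  geom_smooth(method = "lm", se = FALSE)    # fitted regression line, no confidence band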

Interactive Plot

Thank You!