2025-10-20

Equation for best-fit line (one variable)

Formula for simple linear regression: y = \(\beta_0\) + \(\beta_1\) x

  • y = predicted value
  • x = input
  • \(\beta_1\) = slope
  • \(\beta_0\) = intercept
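To make the formula concrete, here is a minimal R sketch that turns it into a function. The name `predict_line` and the example values are made up for illustration and are not part of any package.

```r
# Hypothetical helper: compute y = beta0 + beta1 * x
predict_line <- function(x, beta0, beta1) {
  beta0 + beta1 * x
}

# Illustrative values only: intercept 1, slope 3
predict_line(c(0, 1, 2), beta0 = 1, beta1 = 3)
```

Because R arithmetic is vectorized, the same function works for a single input or a whole vector of inputs.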

Example: finding the best-fit line

Let’s do an example using the linear regression formula.
Assume we have a data set with 2 points: (0,5), (6,17). Find the best-fit line equation.

Formula for best-fit line: y = \(\beta_0\) + \(\beta_1\) x
Step 1: Calculate \(\beta_1\) (slope):
Formula for slope: \(\beta_1 = \frac{y_2 - y_1}{x_2 - x_1}\)
Apply: \(\beta_1 = \frac{17 - 5}{6 - 0} = 2\)

Step 2: Calculate \(\beta_0\) (intercept): Rearrange the best-fit line formula to solve for \(\beta_0\), which gives \(\beta_0 = y - \beta_1 x\). Then substitute the calculated slope and either data point.
Apply using (6,17): \(\beta_0 = 17 - 2 \times 6 = 5\)

Plugging in our findings, we have the final equation!
y = 5 + 2 x
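The two steps above can be checked in R. This is a small sketch, not the only way to do it: it repeats the hand calculation and then compares the result with base R's `lm()`. The variable names (`x`, `y`, `beta0`, `beta1`, `fit`) are chosen here for illustration.

```r
# The two data points from the example: (0, 5) and (6, 17)
x <- c(0, 6)
y <- c(5, 17)

# Step 1: slope from the two-point formula
beta1 <- (y[2] - y[1]) / (x[2] - x[1])

# Step 2: intercept from beta0 = y - beta1 * x, using the point (6, 17)
beta0 <- y[2] - beta1 * x[2]

# Cross-check against R's least-squares fit
fit <- lm(y ~ x)
coef(fit)
```

With only two points the line passes through both of them exactly, so the hand calculation and `lm()` should agree.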

How can we visualize this in R?

R code to make a plot of the previous example:

# load plotting library
library(ggplot2)

# Define the x and y coordinates of the points (0,5) and (6,17)
x_values <- c(0, 6)
y_values <- c(5, 17)

# Make dataframe with one row per point
data <- data.frame(x = x_values, y = y_values)

# Plot points/line
ggplot(data, aes(x=x,y=y)) + geom_point() +   
  geom_smooth(method="lm", se=FALSE)

Let’s plot this in R!

## `geom_smooth()` using formula = 'y ~ x'

Best-Fit line - R’s built-in USArrests dataset

Let’s see what the best-fit line looks like in a real-world dataset in R, which has many more data points.
This plot shows the murder arrest rate against urban population for the US states in the dataset:
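A plot like this could be produced with the sketch below. It assumes the x-axis is `UrbanPop` (percent of the population living in urban areas) and the y-axis is `Murder` (murder arrests per 100,000 residents); the original slide may have used different settings.

```r
library(ggplot2)

# Assumption: UrbanPop on the x-axis, Murder on the y-axis
p <- ggplot(USArrests, aes(x = UrbanPop, y = Murder)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p

# Intercept and slope of the same best-fit line
murder_fit <- lm(Murder ~ UrbanPop, data = USArrests)
coef(murder_fit)
```

Unlike the two-point example, the line here does not pass through every point. It is the line that minimizes the squared vertical distances to all of them.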

Let’s make the same plot interactive with plotly!
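One way to do this, sketched under the same assumption about the plotted variables (`UrbanPop` and `Murder`), is to build the ggplot object and pass it to `ggplotly()` from the plotly package. The object names `p` and `fig` are illustrative.

```r
library(ggplot2)
library(plotly)

# Assumption: same variables as the static plot
p <- ggplot(USArrests, aes(x = UrbanPop, y = Murder)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Convert the static ggplot into an interactive plotly widget
fig <- ggplotly(p)
fig
```

Hovering over a point in the resulting widget shows its x and y values.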

What can linear regression be used for in the real world?

  • Future production of agricultural crops
  • Stock forecasting
  • Analyzing advertising impact