What is Simple Linear Regression?

Simple linear regression is a method for modeling the relationship between a single independent variable and a single dependent variable.

Simple linear regression often uses the Least Squares Method, paired with slope-intercept form, to find the “line of best fit”: the one straight line that best summarizes all the points.

Simple Linear Regression Formula

\[\Huge\hat{y} = \, \beta_0 \, + \, \beta_1 \, x\]

ŷ = Predicted value of the dependent variable

β₀ = y-intercept (constant)

β₁ = Slope

x = Independent Variable
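
As a minimal sketch of how the formula is used, the snippet below fits the model on R's built-in cars dataset and computes ŷ by hand from β₀ and β₁. The value new_speed is a hypothetical input chosen only for illustration; it is not part of the original slides.

```r
# Fit dist = b0 + b1 * speed on the built-in cars dataset
fit <- lm(dist ~ speed, data = cars)

b0 <- coef(fit)[["(Intercept)"]]  # y-intercept (beta_0)
b1 <- coef(fit)[["speed"]]        # slope (beta_1)

# Hypothetical speed (mph) to predict a braking distance for
new_speed <- 15

# y-hat computed directly from the formula
y_hat_manual <- b0 + b1 * new_speed

# The same prediction via predict(), as a cross-check
y_hat_predict <- unname(predict(fit, newdata = data.frame(speed = new_speed)))

isTRUE(all.equal(y_hat_manual, y_hat_predict))
```

The two values should agree, since predict() evaluates the same β₀ + β₁x expression for a simple linear model.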

Least Squares Method Formula

\[\normalsize\ m = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}\]

m = Slope of the line (β₁ in the regression formula)

n = Total number of data points

∑xy = Sum of the products x·y over all data points

∑x = Sum of all x values

∑y = Sum of all y values

∑x² = Sum of all x² values
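
The slope formula above can be checked numerically. The sketch below applies it to the cars dataset and compares the result with lm(). The intercept line, b = ȳ − m·x̄, is not on the slide; it is the standard least squares companion to the slope formula and is added here as an assumption so that both coefficients can be compared.

```r
x <- cars$speed
y <- cars$dist
n <- length(x)  # total number of data points

# Slope from the least squares formula
m <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)

# Intercept (not on the slide): the fitted line passes through (mean(x), mean(y))
b <- mean(y) - m * mean(x)

# Compare against R's built-in fit
fit <- lm(dist ~ speed, data = cars)
isTRUE(all.equal(unname(coef(fit)), c(b, m)))
```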

Dataset \(\normalsize cars\) using Plotly

Speed vs. Braking Distance

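Code for previous slide

library(plotly)

# Fit the line of best fit; data = cars lets lm() find dist and speed
lobf <- lm(dist ~ speed, data = cars)

x_axis <- list(title = "Speed (Mph)")

y_axis <- list(title = "Braking Distance (ft)")

# The ~ tells plotly to look the columns up in cars
plot_ly(cars, x = ~speed, y = ~dist, type = "scatter", mode = "markers") %>%
  add_lines(x = ~speed, y = fitted(lobf)) %>%
  layout(xaxis = x_axis, yaxis = y_axis)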

Dataset \(\normalsize cars\) using ggplot2

Speed vs. Braking Distance

Code for previous slide

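library(ggplot2)

# Scatter plot of braking distance against speed
sp <- ggplot(data = cars, aes(x = speed, y = dist)) +
  geom_point() +
  xlab("Speed (Mph)") +
  ylab("Braking Distance (ft)")

# Overlay the least squares line, without the confidence band
sp + geom_smooth(formula = y ~ x, method = "lm", se = FALSE)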

Dataset \(\normalsize cars\) using ggplot2

Braking Distance vs. Speed

Code for previous slide

# Same data with the axes swapped: speed is now the y variable being fitted
sp <- ggplot(data = cars, aes(x = dist, y = speed)) +
  geom_point() +
  xlab("Braking Distance (ft)") +
  ylab("Speed (Mph)")

sp + geom_smooth(formula = y ~ x, method = "lm", se = FALSE)
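
Swapping the axes changes which variable is being predicted, so the fitted line is not simply the first line flipped over. As a rough sketch of this (the variable names below are illustrative, not from the slides), the two fits can be compared directly:

```r
# Braking distance predicted from speed (first ggplot2 slide)
fit_dist <- lm(dist ~ speed, data = cars)

# Speed predicted from braking distance (axes swapped)
fit_speed <- lm(speed ~ dist, data = cars)

slope_dist  <- coef(fit_dist)[["speed"]]
slope_speed <- coef(fit_speed)[["dist"]]

# If the swapped line were just the inverse, these would be equal
c(inverse_of_first = 1 / slope_dist, swapped_fit = slope_speed)

# The product of the two slopes equals the squared correlation
isTRUE(all.equal(slope_dist * slope_speed, cor(cars$speed, cars$dist)^2))
```

The two slopes would be exact inverses only if every point lay on a single straight line.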