2025-02-09

Topic Introduction

What is Linear Regression?

Linear regression is a statistical method for modeling the relationship between two variables. It identifies the straight line that minimizes the sum of squared differences between the predicted values and the actual data points.

Equation: \(y = \beta_0 + \beta_1\cdot x + \varepsilon\)

y = dependent variable, what you are trying to predict

x = independent variable, the predictor

β0 = y-intercept, the value of y when x is 0

β1 = slope of the line (the change in y for a one-unit change in x)

ε = error term, the difference between the observed y and the value on the line
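As a minimal sketch of how the two coefficients can be estimated, the least-squares slope and intercept have closed forms that can be checked against `lm()`. The vectors `x_toy` and `y_toy` are made-up illustration data and are not part of the original demonstration.

```r
# Hypothetical toy data, invented for illustration only
x_toy <- c(1, 2, 3, 4, 5)
y_toy <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Closed-form least-squares estimates for simple linear regression
b1 <- sum((x_toy - mean(x_toy)) * (y_toy - mean(y_toy))) /
  sum((x_toy - mean(x_toy))^2)        # slope
b0 <- mean(y_toy) - b1 * mean(x_toy)  # intercept

# lm() should return the same two numbers
fit_toy <- lm(y_toy ~ x_toy)
print(c(intercept = b0, slope = b1))
print(coef(fit_toy))

# Residuals: observed values minus the values on the fitted line
res <- y_toy - (b0 + b1 * x_toy)
```

With an intercept in the model, the residuals in `res` sum to zero up to floating-point error, which is a quick sanity check on a least-squares fit.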

Dataset used in demonstration

iris

summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300  
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
##        Species  
##  setosa    :50  
##  versicolor:50  
##  virginica :50  
##                 
##                 
## 

R code

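library(plotly)  # provides plot_ly(), add_lines(), layout(), and %>%

# Fit a simple linear regression of sepal width on sepal length
mod <- lm(Sepal.Width ~ Sepal.Length, data = iris)
x <- iris$Sepal.Length
y <- iris$Sepal.Width

# Axis titles
xax <- list(title = "Sepal Length")
yax <- list(title = "Sepal Width")

# Scatter plot of the data with the fitted regression line overlaid
fig <- plot_ly(data = iris, x = ~Sepal.Length, y = ~Sepal.Width,
               type = 'scatter', mode = 'markers', name = "data",
               width = 800, height = 430) %>%
  add_lines(x = x, y = fitted(mod), name = "fitted") %>%
  layout(xaxis = xax, yaxis = yax)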
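The fitted model can also be inspected directly, without a plot. The following is a short sketch using base R only; the names `b` and `r2` are illustrative and do not appear in the original code.

```r
# Same model as above: sepal width regressed on sepal length
mod <- lm(Sepal.Width ~ Sepal.Length, data = iris)

# Estimated coefficients: "(Intercept)" is beta_0, "Sepal.Length" is beta_1
b <- coef(mod)
print(b)

# Proportion of variance in Sepal.Width explained by the line
r2 <- summary(mod)$r.squared
print(r2)

# fitted() returns one predicted value per row of the data
n_fitted <- length(fitted(mod))
print(n_fitted == nrow(iris))
```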

Plots

Math text for Sepal Length and Sepal Width

\(y = \beta_0 + \beta_1\cdot x + \varepsilon\)

\(\beta_0\) = 3.4189

\(\beta_1\) = -0.0619

Sepal.Width = 3.4189 - 0.0619⋅Sepal.Length
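To show how a fitted equation of this form is used, the sketch below computes a prediction by hand from the coefficients and compares it with `predict()`. The value `new_length` is a hypothetical input chosen for illustration.

```r
mod <- lm(Sepal.Width ~ Sepal.Length, data = iris)
b <- coef(mod)

# Hypothetical sepal length, chosen only for illustration
new_length <- 6.0

# Prediction by hand: beta_0 + beta_1 * x
by_hand <- unname(b["(Intercept)"] + b["Sepal.Length"] * new_length)

# The same prediction via predict()
by_predict <- unname(
  predict(mod, newdata = data.frame(Sepal.Length = new_length))
)

print(c(by_hand = by_hand, by_predict = by_predict))
```

The two numbers should agree, since `predict()` evaluates the same linear equation.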

Plots

## `geom_smooth()` using formula = 'y ~ x'

Math text for Petal Length and Petal Width

\(y = \beta_0 + \beta_1\cdot x + \varepsilon\)

\(\beta_0\) = 1.0836

\(\beta_1\) = 2.2299

Petal.Length = 1.0836 + 2.2299⋅Petal.Width
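The petal coefficients can be obtained the same way as the sepal ones. This is a sketch under the assumption that the model regresses `Petal.Length` on `Petal.Width`, as the equation suggests; the name `mod_petal` is illustrative.

```r
# Assumed model: petal length regressed on petal width
mod_petal <- lm(Petal.Length ~ Petal.Width, data = iris)

# Point estimates of beta_0 and beta_1
print(coef(mod_petal))

# 95% confidence intervals for both coefficients
ci <- confint(mod_petal, level = 0.95)
print(ci)

# With an intercept in the model, the residuals average to zero
mean_res <- mean(residuals(mod_petal))
print(mean_res)
```

The confidence intervals give a sense of how precisely each coefficient is estimated, which the point estimates alone do not show.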

Plots

## `geom_smooth()` using formula = 'y ~ x'
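The code for this plot is not shown. The `geom_smooth()` message suggests it was drawn with ggplot2, so one plausible sketch is the following. The choice of variables, labels, and the object name `p` are assumptions.

```r
library(ggplot2)

# Scatter plot with a least-squares line; variables and labels are assumed
p <- ggplot(iris, aes(x = Petal.Width, y = Petal.Length)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Petal Width", y = "Petal Length")

print(p)
```

Passing `formula = y ~ x` explicitly avoids the informational message that `geom_smooth()` prints when it has to pick the formula itself.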


References