2024-11-17

What is Simple Linear Regression?

Definition: A statistical method that models the relationship between two quantitative variables, one dependent and one independent, as a straight line.

Usage: Use it to measure the strength of the relationship between two variables and to estimate the value of the dependent variable at a specific value of the independent variable.

Why: Simple linear regression applies across many industries and fields, and it is easy to interpret because it produces a linear equation that can be plotted. The fitted equation can then be used to predict future outcomes, such as holiday sales.

Simple Linear Regression Formula

\[ {Y}_i = {\beta}_0 + {\beta}_1 X_i + {\epsilon}_i \]

\[ {Y}_i = \text{dependent variable} \]

\[ {\beta}_0 = \text{constant / intercept} \]

\[ {\beta}_1 = \text{slope / coefficient} \]

\[ X_i = \text{independent variable} \]

\[ {\epsilon}_i = \text{error (random variable)} \]
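To make the role of each term concrete, the model can be simulated. This is a minimal sketch: the values of `beta0` and `beta1`, the sample size, and the choice of normally distributed errors are all assumptions made for illustration, since the formula itself does not fix them.

```r
# Simulate data from Y_i = beta0 + beta1 * X_i + epsilon_i
set.seed(1)                      # make the random draws reproducible

n     <- 100                     # sample size (assumed)
beta0 <- 2                       # intercept (assumed value)
beta1 <- 0.5                     # slope (assumed value)

x   <- runif(n, min = 0, max = 10)   # independent variable
eps <- rnorm(n, mean = 0, sd = 1)    # random error (normality is an assumption)
y   <- beta0 + beta1 * x + eps       # dependent variable

# The part of y not explained by the line is exactly the error term
head(data.frame(x = x, y = y, error = y - (beta0 + beta1 * x)))
```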

Least Squares Line (Fitted Regression Line)

\[ \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i \]

The simple linear regression formula is a model of the true, unknown relationship. The least squares line is the estimate of that model from the observed data: the method chooses \( \hat{\beta}_0 \) and \( \hat{\beta}_1 \) so that the sum of squared errors (SSE) is as small as possible.
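As an illustration, the least squares estimates can be computed directly and compared with R's `lm()`. This sketch assumes `mpg` as the dependent variable and `wt` as the independent variable from the built-in `mtcars` data. It uses the standard closed-form solution for the slope and intercept.

```r
data(mtcars)
x <- mtcars$wt    # independent variable: weight (1000 lbs)
y <- mtcars$mpg   # dependent variable: miles per gallon

# Closed-form least squares estimates
beta1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
beta0_hat <- mean(y) - beta1_hat * mean(x)

# The same line fitted with lm()
fit <- lm(mpg ~ wt, data = mtcars)

c(intercept = beta0_hat, slope = beta1_hat)
coef(fit)
```

Both approaches give the same intercept and slope; `lm()` is the usual choice in practice because it also returns residuals, standard errors, and other diagnostics.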

\[ SSE = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2 \]

SSE measures how well the regression line fits the data: it adds up the squared differences between the observed values and the values predicted by the fitted line. A smaller SSE means a closer fit.
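The SSE formula translates almost directly into R. The sketch below again assumes the `mpg ~ wt` fit on `mtcars`. The helper `sse_at()` is a hypothetical function introduced here only to show that moving away from the least squares coefficients increases SSE.

```r
data(mtcars)
fit <- lm(mpg ~ wt, data = mtcars)

# SSE from the definition: observed minus predicted, squared, summed
y_hat <- fitted(fit)
sse   <- sum((mtcars$mpg - y_hat)^2)

# For an lm fit, deviance() returns the same residual sum of squares
all.equal(sse, deviance(fit))

# Hypothetical helper: SSE for an arbitrary intercept b0 and slope b1
sse_at <- function(b0, b1) sum((mtcars$mpg - (b0 + b1 * mtcars$wt))^2)

b <- coef(fit)
sse_at(b[1], b[2])         # SSE at the least squares estimates
sse_at(b[1], b[2] + 0.5)   # a perturbed slope gives a larger SSE
```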

MTCARS Data Frame

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
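The six rows shown are the first rows of R's built-in `mtcars` dataset. A preview like this can be produced as follows (a minimal sketch; the exact call used for the slide is not shown in the source):

```r
data(mtcars)          # load the built-in dataset
preview <- head(mtcars)   # first six rows
preview
```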

Horse Power vs Displacement

Weight vs MPG

Quarter Mile Time vs (Weight, Horse Power)

Code for Quarter Mile Time vs (Weight, Horse Power)

library(plotly)
data(mtcars)

# Axis settings: one list per axis, each with a title and a title font
xax <- list(
  title = "Weight",
  titlefont = list(family = "Modern Computer Roman"))
yax <- list(
  title = "Horse Power",
  titlefont = list(family = "Modern Computer Roman"))
zax <- list(
  title = "Quarter Mile (s)",
  titlefont = list(family = "Modern Computer Roman"))

# 3D scatter plot of quarter mile time against weight and horse power,
# with points colored by number of cylinders
plot_ly(data = mtcars, x = ~wt, y = ~hp, z = ~qsec,
        type = "scatter3d", mode = "markers",
        color = ~as.factor(cyl)) %>%
  layout(title = "Quarter Mile Time vs. (Weight, Horse Power)",
    scene = list(xaxis = xax, yaxis = yax, zaxis = zax))