2024-10-16

What is a Simple Linear Regression?

  • According to Newcastle University, simple linear regression aims to find a linear relationship that describes the association between an independent variable and a (possibly) dependent variable.
  • It can also be represented by the following formula: \(y = \beta_0 + \beta_1 x\)

Linear Regression Example: Plotly

  • Let’s look back at the dataset mtcars which we used previously in this course
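The chunk behind this slide is not shown, so here is a minimal sketch of how a plotly version of the regression plot could be built. It assumes the same variables as the ggplot2 example (wt and mpg); the names `fit`, `cars`, and `p` are illustrative and not taken from the original.

```r
library(plotly)

# Fit the simple linear regression of mpg on weight (assumed variables)
fit <- lm(mpg ~ wt, data = mtcars)

# Store the fitted values next to the observations so both traces share one data frame
cars <- mtcars
cars$fitted_mpg <- fitted(fit)

# Scatter trace for the observed points
p <- plot_ly(cars, x = ~wt, y = ~mpg,
             type = "scatter", mode = "markers", name = "Observed")

# Line trace for the fitted regression line
p <- add_lines(p, y = ~fitted_mpg, name = "Fitted line")

# Titles and axis labels
p <- plotly::layout(p,
                    title = "Linear Regression: MPG vs. Weight",
                    xaxis = list(title = "Weight (1000 lbs)"),
                    yaxis = list(title = "Miles Per Gallon"))

p
```

Unlike `geom_smooth()`, plotly does not fit the line for you here, so the model is fitted with `lm()` first and its fitted values are drawn as a separate trace.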

Linear Regression Example: ggplot2

  • Continuing with the mtcars dataset, observe how the ggplot2 plot differs from the plotly one.
library(ggplot2)
data(mtcars)
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  geom_smooth(method="lm", col="blue") +
  labs(title="Linear Regression: MPG vs. Weight", x="Weight (1000 lbs)", y="Miles Per Gallon") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

Linear Regression Example: ggplot2

  • Now, let’s observe the relationship between horsepower and miles per gallon.
## `geom_smooth()` using formula = 'y ~ x'
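The code for this slide is hidden in the rendered output. A plausible sketch, assuming it mirrors the weight example with `hp` swapped in for `wt` (the object name `p_hp` and the labels are illustrative):

```r
library(ggplot2)

# Scatter plot of mpg against horsepower with a fitted least-squares line
p_hp <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", col = "blue") +
  labs(title = "Linear Regression: MPG vs. Horsepower",
       x = "Gross Horsepower", y = "Miles Per Gallon") +
  theme_minimal()

p_hp
```

The shaded band drawn by `geom_smooth()` is its default confidence interval around the fitted line; pass `se = FALSE` to hide it.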

LaTeX Representation of Simple Linear Regression

  • Equation:

\[ Y = \beta_0 + \beta_1 X + \epsilon \]

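  • \(Y\): Dependent variable (response)
  • \(X\): Independent variable (predictor)
  • \(\beta_0\): Intercept, the expected value of \(Y\) when \(X = 0\)
  • \(\beta_1\): Slope, the expected change in \(Y\) for a one-unit increase in \(X\)
  • \(\epsilon\): Random error term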
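To connect these symbols to R, here is a short sketch using `lm()` on mtcars, with mpg as \(Y\) and wt as \(X\) (the same pairing as the earlier plot). The names `fit`, `b0`, `b1`, and `eps_hat` are illustrative.

```r
# Fit Y = beta_0 + beta_1 * X + epsilon with Y = mpg and X = wt
fit <- lm(mpg ~ wt, data = mtcars)

# Estimated intercept (beta_0) and slope (beta_1)
b0 <- coef(fit)[["(Intercept)"]]
b1 <- coef(fit)[["wt"]]

# Residuals are the sample counterpart of the error term epsilon
eps_hat <- residuals(fit)

print(c(intercept = b0, slope = b1))
```

A negative slope here means that heavier cars are predicted to get fewer miles per gallon.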

LaTeX Representation of Least Squares Estimation

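  • The coefficients \(\beta_0\) and \(\beta_1\) are estimated by the values \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that minimize the residual sum of squares (RSS):

\[ \text{RSS} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]

Where:

\[ \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i \]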
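For simple linear regression, the RSS is minimized by the standard closed-form estimates: the slope is the sum of cross-deviations of \(X\) and \(Y\) divided by the sum of squared deviations of \(X\), and the intercept is \(\bar{Y}\) minus the slope times \(\bar{X}\). The sketch below computes them by hand on mtcars and checks them against `lm()`. The variable names are illustrative, and the choice of wt and mpg is an assumption carried over from the earlier example.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form least squares estimates
b1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0_hat <- mean(y) - b1_hat * mean(x)

# Fitted values and the residual sum of squares they produce
y_hat <- b0_hat + b1_hat * x
rss <- sum((y - y_hat)^2)

# Compare with R's built-in fit; deviance() of an lm object is its RSS
fit <- lm(mpg ~ wt, data = mtcars)
same_coefs <- isTRUE(all.equal(c(b0_hat, b1_hat), unname(coef(fit))))
same_rss <- isTRUE(all.equal(rss, deviance(fit)))
print(c(same_coefs = same_coefs, same_rss = same_rss))
```

Any other choice of intercept and slope gives a larger RSS on the same data, which is what "least squares" refers to.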

Applications of Linear Regressions

  • Research: many kinds of quantitative research use linear models to describe and visualize observed relationships
  • Data Analytics: businesses may use linear regression to track and forecast profit or company growth
  • And many others!

Conclusion

  • Simple linear regression is a statistical method used to model the relationship between two variables.
  • A linear regression model is not the best choice for every relationship; other methods (ex: ANOVA for comparing group means) may be more appropriate.
  • It can be applied, and is useful, in many fields!

Sources Used