2024-11-01

What is Simple Linear Regression?

Simple linear regression predicts the value of a dependent (response) variable from a single independent (predictor) variable. In simpler terms, it uses one variable to predict another. This is done by establishing whether a linear relationship exists between the two variables and then using that relationship to predict unknown values of the response.

Simple Linear Regression Mathematical Model

The model for simple linear regression is: \[ y = \beta_0 + \beta_1 x + \epsilon \]

  • \(y\): Response variable
  • \(x\): Predictor variable
  • \(\beta_0\): Intercept (expected \(y\) when \(x = 0\))
  • \(\beta_1\): Slope (rate of change in \(y\) per unit of \(x\))
  • \(\epsilon\): Error term (random deviation of the observed \(y\) from the line \(\beta_0 + \beta_1 x\))
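
In practice the coefficients are usually estimated by ordinary least squares. As a minimal sketch (the names `b0_hat`, `b1_hat`, and `residuals_hat` are illustrative and not from the original slides), the closed-form estimates can be computed by hand on R's built-in `cars` data and compared with what `lm()` reports:

```r
# Closed-form least-squares estimates for y = b0 + b1 * x + error,
# using the built-in cars data (speed in mph, stopping distance in ft).
data(cars)
x <- cars$speed
y <- cars$dist

# Slope: sample covariance of x and y divided by the sample variance of x
b1_hat <- cov(x, y) / var(x)

# Intercept: chosen so the fitted line passes through (mean(x), mean(y))
b0_hat <- mean(y) - b1_hat * mean(x)

# Residuals: observed minus fitted values, the sample analogue of the error term
residuals_hat <- y - (b0_hat + b1_hat * x)

# Compare the hand-computed estimates with those from lm()
fit <- lm(dist ~ speed, data = cars)
print(c(intercept = b0_hat, slope = b1_hat))
print(coef(fit))
```

The two sets of estimates should agree up to floating-point rounding, since `lm()` solves the same least-squares problem.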

Linear Equation: Slope-Intercept Form

The fitted regression line can also be written in slope-intercept form, with \(m\) playing the role of \(\beta_1\) and \(b\) the role of \(\beta_0\): \[ y = mx + b \]

  • \(m\) is the slope: the rate of change in \(y\) per unit of \(x\).
  • \(b\) is the intercept: the predicted \(y\) value when \(x = 0\).

Simple Linear Regression Example

R Code for the Graph

data(cars)
plot(cars$speed, cars$dist,
     main = "Simple Linear Regression: Speed vs Distance",
     xlab = "Speed (mph)", ylab = "Distance (ft)",
     pch = 19, col = "blue")
abline(lm(dist ~ speed, data = cars), col = "red")

Model Summary

## 
## Call:
## lm(formula = dist ~ speed, data = cars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.069  -9.525  -2.272   9.215  43.201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *  
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared:  0.6511, Adjusted R-squared:  0.6438 
## F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.49e-12
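
Output of this form is what `summary()` prints for a fitted `lm` object. The following is a sketch of how it might be produced and how individual numbers can be extracted programmatically; the object names (`fit`, `fit_summary`, and so on) are assumptions, not taken from the original slides:

```r
# Fit the model and print the full summary shown above
fit <- lm(dist ~ speed, data = cars)
fit_summary <- summary(fit)
print(fit_summary)

# Coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(fit_summary)
slope <- coef_table["speed", "Estimate"]

# Proportion of variance in dist explained by speed
r_squared <- fit_summary$r.squared

# Residual standard error (in feet)
resid_se <- fit_summary$sigma

# 95% confidence intervals for the intercept and slope
conf_int <- confint(fit, level = 0.95)
print(conf_int)
```

Reading the summary: each additional 1 mph of speed is associated with about 3.93 ft of additional distance, and speed explains roughly 65% of the variation in distance. The negative intercept has no physical meaning here, because a speed of 0 lies outside the observed range of 4 to 25 mph.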

Making predictions based on graph

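The graph shows a clear positive linear relationship (\(R^2 \approx 0.65\)). We can predict distance from speed using the model’s estimated intercept and slope: \[ \widehat{\text{Distance}} = \hat{\beta}_0 + \hat{\beta}_1 \cdot \text{Speed} \approx -17.58 + 3.93 \cdot \text{Speed} \]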

##   Speed Predicted_Distance
## 1    15           41.40704
## 2    20           61.06908
## 3    25           80.73112
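
A table like this one can be reproduced with `predict()`. This is a sketch only: the names `new_speeds`, `predictions`, and `manual` are illustrative, and the column names are chosen to match the output above.

```r
# Fit the model on the built-in cars data
fit <- lm(dist ~ speed, data = cars)

# Speeds (mph) at which to predict distance; the column name must match
# the predictor name used in the model formula
new_speeds <- data.frame(speed = c(15, 20, 25))

# Predicted distances (ft) from the fitted line
predictions <- data.frame(
  Speed = new_speeds$speed,
  Predicted_Distance = unname(predict(fit, newdata = new_speeds))
)
print(predictions)

# The same values computed directly as intercept + slope * speed
manual <- coef(fit)[["(Intercept)"]] + coef(fit)[["speed"]] * new_speeds$speed
```

Passing `interval = "prediction"` to `predict()` would additionally return lower and upper bounds for each predicted distance, which is useful given the residual standard error of about 15 ft.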

ggplot Visualization

## `geom_smooth()` using formula = 'y ~ x'

ggplot 3D Plot with Plotly

Conclusion

In conclusion, simple linear regression provides a straightforward method for predicting one variable from another. By quantifying the relationship between two variables, as with speed and distance in the cars data, we can make informed predictions and decisions.