2024-10-31

What is Simple Linear Regression?

Simple Linear Regression is a method used to model the relationship between two variables by fitting a linear equation to observed data.

  • Independent Variable (X): The predictor or input variable

  • Dependent Variable (Y): The outcome or response variable

  • Goal: To find the best-fit line, the one that minimizes the error (typically the sum of squared errors) in predicting Y from X

Simple Linear Regression Equation

The equation of a simple linear regression model is:

\[ Y = \beta_0 + \beta_1 X + \epsilon \]

  • \(\beta_0\): Intercept

  • \(\beta_1\): Slope of the line

  • \(\epsilon\): Error term, representing unexplained variability
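
To make the roles of \(\beta_0\), \(\beta_1\), and \(\epsilon\) concrete, the sketch below simulates data from the equation and recovers the coefficients by least squares. The parameter values (`beta0`, `beta1`, `sigma`), the sample size, and the seed are assumptions chosen only for illustration; they are not taken from the example dataset.

```r
# Hypothetical parameters, chosen only for illustration
set.seed(42)
n     <- 100
beta0 <- 3      # assumed intercept
beta1 <- 2      # assumed slope
sigma <- 1.5    # assumed standard deviation of the error term

# Simulate data from Y = beta0 + beta1 * X + epsilon
x       <- runif(n, min = 0, max = 10)
epsilon <- rnorm(n, mean = 0, sd = sigma)
y       <- beta0 + beta1 * x + epsilon

# Closed-form least-squares estimates of slope and intercept
b1 <- cov(x, y) / var(x)
b0 <- mean(y) - b1 * mean(x)

# lm() fits the same line, so its coefficients should agree
fit <- lm(y ~ x)
print(c(intercept = b0, slope = b1))
print(coef(fit))
```

The estimates land near, but not exactly on, the assumed values of 3 and 2, because the error term adds variability that the line cannot explain.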

Exploring an Example Dataset with Plotly

Visualizing the Data with ggplot2

A scatter plot can visually illustrate the relationship between \(X\) and \(Y\).
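
A minimal sketch of such a scatter plot with ggplot2 might look like the following. The data frame `data` is simulated here as a stand-in, since the original dataset is not shown; its columns `x` and `y` match the names used in the model code later on.

```r
library(ggplot2)

# Hypothetical simulated data; a stand-in for the dataset, which is not shown
set.seed(123)
data   <- data.frame(x = runif(50, min = 0, max = 10))
data$y <- 3 + 2 * data$x + rnorm(50, sd = 2)

# Scatter plot of y against x
p <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  labs(title = "Scatter plot of Y against X", x = "X", y = "Y")
print(p)
```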

Fitting the Linear Model

## `geom_smooth()` using formula = 'y ~ x'
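
The message above is what ggplot2 prints when `geom_smooth()` is called without an explicit formula. One way the fitted line could be overlaid on the scatter plot is sketched below; the data are simulated stand-ins (an assumption), and passing `formula = y ~ x` explicitly suppresses the message.

```r
library(ggplot2)

# Hypothetical simulated data; a stand-in for the dataset, which is not shown
set.seed(123)
data   <- data.frame(x = runif(50, min = 0, max = 10))
data$y <- 3 + 2 * data$x + rnorm(50, sd = 2)

# Scatter plot with the least-squares line and its confidence band
p_fit <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)
print(p_fit)
```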

Interpretation of Slope and Intercept

Understanding Slope and Intercept

  • Intercept (\(\beta_0\)): The expected value of Y when X is zero.

  • Slope (\(\beta_1\)): The change in the expected value of Y for a one-unit increase in X.

For example, if \(\beta_1 = 2\), then for every increase of 1 unit in X, Y increases by 2 units on average.
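
This interpretation can be checked directly from a fitted model: the difference between predictions one unit apart equals the slope, and the prediction at X = 0 equals the intercept. The sketch below uses simulated data with an assumed intercept of 1 and slope of 2; the values at which predictions are made (4, 5, and 0) are arbitrary.

```r
# Hypothetical simulated data (assumed true intercept 1, slope 2)
set.seed(1)
x   <- runif(50, min = 0, max = 10)
y   <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

# Predicted Y at X = 4 and X = 5, i.e. a one-unit increase in X
pred <- predict(fit, newdata = data.frame(x = c(4, 5)))
slope_from_predictions <- unname(pred[2] - pred[1])

# Predicted Y at X = 0
intercept_from_prediction <- unname(predict(fit, newdata = data.frame(x = 0)))

print(slope_from_predictions)      # same value as coef(fit)["x"]
print(intercept_from_prediction)   # same value as coef(fit)["(Intercept)"]
```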

R Code for Simple Linear Regression

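# Packages for tidy model output and table rendering
library(broom)   # provides tidy()
library(knitr)   # provides kable()

# Fitting the model (data is a data frame with columns x and y)
model <- lm(y ~ x, data = data)

# Displaying a clean summary table
model_summary <- tidy(model)
kable(model_summary, caption = "Summary of Linear Model")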

Table: Summary of Linear Model

|term        | estimate| std.error| statistic|  p.value|
|:-----------|--------:|---------:|---------:|--------:|
|(Intercept) | 2.919164| 0.7550668|    3.8661| 0.000332|
|x           | 2.050315| 0.1625299|   12.6150| 0.000000|
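
The columns of this table are related: each `statistic` is the estimate divided by its standard error. The sketch below recomputes the statistics from the reported values and uses the fitted line to predict Y at a new point. The value `x_new = 5` is a hypothetical input chosen for illustration.

```r
# Estimates and standard errors copied from the summary table
summary_tbl <- data.frame(
  term      = c("(Intercept)", "x"),
  estimate  = c(2.919164, 2.050315),
  std.error = c(0.7550668, 0.1625299)
)

# t statistic = estimate / standard error
summary_tbl$t_stat <- summary_tbl$estimate / summary_tbl$std.error
print(round(summary_tbl$t_stat, 4))

# Prediction from the fitted line at a hypothetical X value
x_new <- 5
y_hat <- summary_tbl$estimate[1] + summary_tbl$estimate[2] * x_new
print(y_hat)
```

The recomputed statistics agree with the `statistic` column, and the prediction is simply the intercept plus the slope times `x_new`.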

Conclusion

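Simple linear regression is a fundamental statistical technique for understanding and predicting the relationship between two continuous variables. The sign and size of the estimated slope describe the direction and magnitude of the relationship, while its standard error and p-value indicate how precisely it is estimated. In the model fitted here, the slope estimate of about 2.05 means that Y increases by roughly 2.05 units for each one-unit increase in X.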