2024-11-17

What is Simple Linear Regression?

Simple linear regression is a statistical method for modeling the relationship between two continuous variables:

  1. An independent variable (predictor).
  2. A dependent variable (response).

Objective: Predict the value of the dependent variable based on the independent variable.

Linear Regression Equation

The equation for simple linear regression is:

\[ y = \beta_0 + \beta_1 x + \epsilon \]

  • \(y\): Dependent variable (response).
  • \(x\): Independent variable (predictor).
  • \(\beta_0\): Intercept.
  • \(\beta_1\): Slope of the line.
  • \(\epsilon\): Random error.
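
In practice \(\beta_0\) and \(\beta_1\) are unknown and must be estimated from \(n\) observed pairs \((x_i, y_i)\). A common choice is ordinary least squares, which picks the line that minimizes the sum of squared vertical distances to the points. The hat and bar notation (\(\hat{\beta}_0\), \(\hat{\beta}_1\), \(\bar{x}\), \(\bar{y}\)) is introduced here for illustration and does not appear elsewhere in these slides:

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]

Here \(\bar{x}\) and \(\bar{y}\) are the sample means of the predictor and the response.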

Variance Equation

Variance of the error term, assumed constant across all observations:

\[ \mathrm{Var}(\epsilon) = \sigma^2 \]
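
Since \(\sigma^2\) is not observed directly, it is usually estimated from the residuals of the fitted line. A standard sketch, assuming a least-squares fit with fitted values \(\hat{y}_i\) and residuals \(e_i = y_i - \hat{y}_i\) (notation introduced here for illustration):

\[ \hat{\sigma}^2 = \frac{1}{n - 2} \sum_{i=1}^{n} e_i^2 \]

The divisor is \(n - 2\) rather than \(n\) because two parameters, the intercept and the slope, are estimated from the same data.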

Dataset: Hours Studied vs. Test Scores

The dataset includes information about the hours students studied and their corresponding test scores.

Hours  Scores
    1      50
    2      55
    3      60
    4      65
    5      70
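
As a minimal sketch of how the model might be fitted to this dataset, the snippet below uses base R's `lm()`. The object names `study`, `fit`, `estimates`, and `predicted` are illustrative choices, and the 6-hour student is a hypothetical case that is not part of the dataset.

```r
# Data copied from the table above
study <- data.frame(
  Hours  = c(1, 2, 3, 4, 5),
  Scores = c(50, 55, 60, 65, 70)
)

# Fit Scores = beta_0 + beta_1 * Hours by ordinary least squares
fit <- lm(Scores ~ Hours, data = study)

# Estimated intercept and slope
estimates <- coef(fit)
print(estimates)

# Predicted score for a hypothetical student who studies 6 hours
predicted <- predict(fit, newdata = data.frame(Hours = 6))
print(predicted)
```

Because the five scores lie exactly on a straight line, the fit should return an intercept of 45 and a slope of 5 (up to floating-point rounding), so the predicted score for 6 hours is 75. Real data would not fit this cleanly.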

Scatter Plot

Regression Line Plot

The plot below shows the relationship between Hours Studied and Test Scores, with a linear regression line fitted to the data.

Residuals Plot
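
A residuals plot shows, for each observation, the difference between the observed score and the score predicted by the fitted line. The sketch below is one way such a plot could be drawn with ggplot2. It is not necessarily how the original figure was produced, and the names `study`, `fit`, and `residual_plot` are illustrative.

```r
library(ggplot2)

# Data from the Hours Studied vs. Test Scores table
study <- data.frame(Hours = 1:5, Scores = c(50, 55, 60, 65, 70))
fit <- lm(Scores ~ Hours, data = study)

# Residual = observed score minus fitted score.
# Rounding suppresses floating-point noise around zero.
study$Fitted <- fitted(fit)
study$Residual <- round(resid(fit), 8)

# Residuals against fitted values, with a reference line at zero
residual_plot <- ggplot(study, aes(x = Fitted, y = Residual)) +
  geom_point(color = "blue") +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  labs(title = "Residuals vs. Fitted", x = "Fitted Test Scores",
       y = "Residuals")

print(residual_plot)
```

For this dataset every point sits on the dashed zero line, because the scores are perfectly linear in hours. With noisier data, a random scatter around zero would support the linear model, while a curve or funnel shape would suggest it is a poor fit.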

R Code for Regression Line Plot

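library(ggplot2)

# Hours studied and test scores from the dataset above
data <- data.frame(Hours = 1:5, Scores = c(50, 55, 60, 65, 70))

# Scatter plot of the observations with a least-squares line overlaid
ggplot(data, aes(x = Hours, y = Scores)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "red") +
  labs(title = "Regression Line", x = "Hours Studied", y = "Test Scores")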

Summary & Insights

  • Simple linear regression helps analyze the relationship between two continuous variables.
  • Example: Hours studied and test scores.
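  • Key takeaway: A fitted line quantifies the relationship and lets us predict the response for new predictor values. In this dataset, each additional hour studied corresponds to a 5-point increase in test score.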

Thank you!