2026-03-07

What is Simple Linear Regression?

Simple linear regression models the relationship between two variables: a single predictor and a single response.

Example: predicting exam score from hours studied.

Regression Equation

The regression model is:

\[ y = \beta_0 + \beta_1 x + \epsilon \]

Where

  • y = response variable
  • x = predictor variable
  • β0 = intercept
  • β1 = slope
  • ε = random error term
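
To make the role of each term concrete, the model can be simulated directly. This is only a sketch: the parameter values, the error standard deviation, and the seed below are hypothetical choices for illustration, not values taken from the example data.

```r
# Hypothetical values, chosen only to illustrate the model equation
set.seed(1)
beta0 <- 45                       # intercept: expected y when x = 0
beta1 <- 4                        # slope: expected change in y per unit of x
x <- 1:10                         # predictor values
epsilon <- rnorm(length(x), mean = 0, sd = 1)  # random error term

# Response generated exactly as in the regression equation
y <- beta0 + beta1 * x + epsilon

data.frame(x, y)
```

Removing the error term (`y - epsilon`) leaves points that fall exactly on the line with intercept `beta0` and slope `beta1`.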

Example Data

hours <- c(1,2,3,4,5,6,7,8,9,10)
score <- c(50,55,58,62,65,70,75,80,85,90)

data <- data.frame(hours, score)
data
##    hours score
## 1      1    50
## 2      2    55
## 3      3    58
## 4      4    62
## 5      5    65
## 6      6    70
## 7      7    75
## 8      8    80
## 9      9    85
## 10    10    90

Scatterplot with Regression Line

library(ggplot2)

ggplot(data, aes(hours, score)) +
  geom_point() +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula = 'y ~ x'

Distribution of Scores

ggplot(data, aes(score)) +
  geom_histogram(bins = 5)

Regression Model Code

model <- lm(score ~ hours, data=data)
summary(model)
## 
## Call:
## lm(formula = score ~ hours, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8061 -0.5409  0.0000  0.7197  1.3576 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  44.8667     0.7479   59.99 6.62e-12 ***
## hours         4.3879     0.1205   36.41 3.55e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.095 on 8 degrees of freedom
## Multiple R-squared:  0.994,  Adjusted R-squared:  0.9933 
## F-statistic:  1325 on 1 and 8 DF,  p-value: 3.552e-10
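
As a check on where the Estimate column comes from, the coefficients can be recomputed from the standard least-squares formulas. This is a minimal sketch using base R only; the variable names `b0`, `b1`, `fitted_score`, and `r_squared` are introduced here for illustration.

```r
hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
score <- c(50, 55, 58, 62, 65, 70, 75, 80, 85, 90)

# Slope: sum of cross-products of the deviations divided by
# the sum of squared deviations of the predictor
b1 <- sum((hours - mean(hours)) * (score - mean(score))) /
  sum((hours - mean(hours))^2)

# Intercept: makes the line pass through (mean(hours), mean(score))
b0 <- mean(score) - b1 * mean(hours)

# R-squared: 1 minus residual sum of squares over total sum of squares
fitted_score <- b0 + b1 * hours
r_squared <- 1 - sum((score - fitted_score)^2) /
  sum((score - mean(score))^2)

round(c(intercept = b0, slope = b1, r_squared = r_squared), 4)
```

The rounded values should agree with the intercept, slope, and Multiple R-squared reported by `summary(model)`.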

Predicted Equation

\[ \hat{y} = b_0 + b_1 x \]

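This equation predicts exam scores from study hours. With the estimates from the fitted model above, b0 ≈ 44.87 and b1 ≈ 4.39, so each additional hour of study is associated with a predicted increase of about 4.39 points.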
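
The predicted equation can be applied to new study times with `predict()`. The sketch below refits the same model so that it runs on its own; the values in `new_data` are hypothetical and were chosen to stay inside the observed range of 1 to 10 hours.

```r
hours <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
score <- c(50, 55, 58, 62, 65, 70, 75, 80, 85, 90)
data <- data.frame(hours, score)

model <- lm(score ~ hours, data = data)

# Hypothetical new study times, not part of the original data
new_data <- data.frame(hours = c(5.5, 7.5))

# Predicted scores: b0 + b1 * hours for each new value
predicted <- predict(model, newdata = new_data)
predicted
```

Because a least-squares line passes through the point of means, the prediction at 5.5 hours (the mean of `hours`) equals the mean score of 69.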

Interactive Plot

library(plotly)

plot_ly(data, x = ~hours, y = ~score, type = "scatter", mode = "markers")

Conclusion

Simple linear regression describes how a response variable changes with a single predictor and can be used to predict the response for new predictor values.

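In this example, more study hours are associated with higher exam scores: the fitted slope is about 4.39 points per additional hour, and the model explains about 99% of the variation in scores (R-squared = 0.994).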