2025-11-10

Slide 1 — Introduction

Simple Linear Regression

Simple linear regression models the relationship between a single predictor \(x\) and a response \(y\) as a straight line plus random error.

\[ y = \beta_0 + \beta_1 x + \epsilon \]

Slide 2 — Meaning of the model

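  • \(\beta_0\): intercept, the expected value of \(y\) when \(x = 0\)
  • \(\beta_1\): slope, the expected change in \(y\) for a one-unit increase in \(x\)
  • \(\epsilon\): random error, assumed to have mean zero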

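We fit the line by least squares: the estimates \(\hat{\beta}_0\) and \(\hat{\beta}_1\) are chosen to minimize the sum of squared errors between the observed responses \(y_i\) and the fitted values \(\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i\):

\[ \text{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \]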
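
Setting the partial derivatives of SSE with respect to the two coefficients to zero gives the standard closed-form estimates. The sample means \(\bar{x}\) and \(\bar{y}\) are notation introduced here for illustration and do not appear elsewhere in the slides.

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]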

Slide 3 — Sample Data

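set.seed(1)                             # make the simulated data reproducible
x <- rnorm(50, mean = 5, sd = 2)        # 50 predictor values
y <- 3 + 1.8 * x + rnorm(50, sd = 2)    # true intercept 3, true slope 1.8, noise sd 2
data <- data.frame(x, y)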
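
As a rough sketch, the least-squares estimates for these simulated data can be computed directly from the sample covariance and variance and then compared with `lm()`. The names `b0_hat`, `b1_hat`, and `fit` are illustrative and not part of the original slides. The data are regenerated so the block runs on its own.

```r
# Regenerate the sample data so this block is self-contained
set.seed(1)
x <- rnorm(50, mean = 5, sd = 2)
y <- 3 + 1.8 * x + rnorm(50, sd = 2)

# Closed-form least-squares estimates (illustrative names)
b1_hat <- cov(x, y) / var(x)            # slope: sample covariance over sample variance
b0_hat <- mean(y) - b1_hat * mean(x)    # intercept: the line passes through the point of means

# Compare with the estimates returned by lm()
fit <- lm(y ~ x)
print(c(intercept = b0_hat, slope = b1_hat))
print(coef(fit))
```

Both printouts should agree up to floating-point error. Because of the added noise, the estimates should be close to the true values 3 and 1.8 used in the simulation, but not equal to them.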

Slide 4 — ggplot2 Scatter Plot + Regression Line (Plot 1)

library(ggplot2)

ggplot(data, aes(x = x, y = y)) +
  geom_point(color = "blue") +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  ggtitle("Scatter Plot with Regression Line")
## `geom_smooth()` using formula = 'y ~ x'

Slide 5 — ggplot2 Residual Plot (Plot 2)

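model <- lm(y ~ x, data = data)
# Store the residuals in the data frame so aes() finds them as a column
# and the name does not shadow stats::residuals()
data$residual <- resid(model)

ggplot(data, aes(x = x, y = residual)) +
  geom_point(color = "purple") +
  geom_hline(yintercept = 0, color = "black") +
  ggtitle("Residual Plot")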
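
A residual plot for a well-specified linear model should show points scattered around zero with no visible pattern. As a small numerical companion, the sketch below checks two algebraic properties of least-squares residuals when the model includes an intercept: they sum to zero and are orthogonal to the predictor. The names `fit`, `res`, `sum_res`, and `cross_res` are illustrative. The data are regenerated so the block is self-contained.

```r
# Regenerate the sample data and refit the model
set.seed(1)
x <- rnorm(50, mean = 5, sd = 2)
y <- 3 + 1.8 * x + rnorm(50, sd = 2)
fit <- lm(y ~ x)
res <- resid(fit)

# With an intercept in the model, least-squares residuals sum to zero
sum_res <- sum(res)
# They are also orthogonal to the predictor
cross_res <- sum(res * x)

# Both quantities are zero up to floating-point error
print(c(sum = sum_res, cross = cross_res))
```

These properties hold for any least-squares fit with an intercept, so they confirm only that the fit was computed correctly. Whether the linear model is appropriate still has to be judged from the plot.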

Slide 6 — Plotly 3D

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
data$pred <- predict(model)

plot_ly(
  data, x = ~x, y = ~y, z = ~pred,
  type = "scatter3d", mode = "markers",
  marker = list(size = 4)
) %>%
  layout(title = "3D View of Linear Relationship")

Slide 7 — Model Summary

summary(model)
## 
## Call:
## lm(formula = y ~ x, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8552 -1.3380 -0.0045  0.9754  4.6972 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.4715     0.9168   3.786 0.000425 ***
## x             1.7545     0.1681  10.439 6.09e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.956 on 48 degrees of freedom
## Multiple R-squared:  0.6942, Adjusted R-squared:  0.6878 
## F-statistic:   109 on 1 and 48 DF,  p-value: 6.092e-14
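
Two of the summary figures can be reproduced by hand, which may help in reading the output. The sketch below recomputes the multiple R-squared and the residual standard error from the sums of squares. The names `sse`, `sst`, `r_squared`, `rse`, and `s` are illustrative, and the data are regenerated so the block runs on its own.

```r
# Regenerate the sample data and refit the model
set.seed(1)
x <- rnorm(50, mean = 5, sd = 2)
y <- 3 + 1.8 * x + rnorm(50, sd = 2)
fit <- lm(y ~ x)

sse <- sum(resid(fit)^2)          # residual (error) sum of squares
sst <- sum((y - mean(y))^2)       # total sum of squares around the mean of y
n   <- length(y)

r_squared <- 1 - sse / sst        # share of variation in y explained by the line
rse       <- sqrt(sse / (n - 2))  # residual standard error on n - 2 degrees of freedom

# Compare with the values stored in the summary object
s <- summary(fit)
print(c(r_squared = r_squared, rse = rse))
print(c(r_squared = s$r.squared, rse = s$sigma))
```

The two printouts should match each other and the values reported in the summary (0.6942 and 1.956 on 48 degrees of freedom).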

Slide 8 — Conclusion

  • The linear trend is captured by \(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\)
  • The scatter plot with the fitted line shows the positive association
  • The 3D view compares \(y\) to the fitted values
  • Use the model for prediction only within the observed range of \(x\)
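
Prediction within the observed range of \(x\) could look like the following sketch. The object `new_x` and its five grid points are hypothetical values chosen for illustration. The data and model are rebuilt so the block is self-contained.

```r
# Rebuild the data and model from the earlier slides
set.seed(1)
x <- rnorm(50, mean = 5, sd = 2)
y <- 3 + 1.8 * x + rnorm(50, sd = 2)
data <- data.frame(x, y)
model <- lm(y ~ x, data = data)

# Hypothetical new predictor values, kept inside the observed range of x
new_x <- data.frame(x = seq(min(data$x), max(data$x), length.out = 5))

# Point predictions with 95% prediction intervals
pred <- predict(model, newdata = new_x, interval = "prediction", level = 0.95)
print(cbind(new_x, pred))
```

The `fit` column holds the predicted response, and `lwr` and `upr` bound the interval for a single new observation. Values of \(x\) outside the observed range would be extrapolation, which the fitted line does not support.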