2025-04-13

Slide 1: Overview

Simple linear regression models the relationship between a dependent variable and a single independent variable.
The model is written as:
\[ y = \beta_0 + \beta_1 x + \epsilon \]
where:
- \(y\) is the dependent variable,
- \(x\) is the independent variable,
- \(\beta_0\) is the intercept,
- \(\beta_1\) is the slope, and
- \(\epsilon\) is the error term.

Slide 2: Model Assumptions

The model relies on several key assumptions:
- Linearity: The relationship between \(x\) and \(y\) is linear.
- Independence: Observations are independent.
- Homoscedasticity: The variance of the errors is constant.
- Normality: The errors are normally distributed.
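
As an illustration, the normality assumption can be checked on the residuals of a fitted model. The following is a minimal base R sketch, not a complete diagnostic workflow. It re-creates simulated data with the same settings as the simulation slide so that it runs on its own; the names `fit`, `res`, and `sw` are illustrative, and the Shapiro-Wilk test is only one of several possible checks.

```r
# Simulated data (settings copied from the simulation slide so the sketch is self-contained)
set.seed(123)
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(n, mean = 0, sd = 1)
fit <- lm(y ~ x)

# Residuals serve as the observable stand-in for the unobserved errors
res <- resid(fit)

# Q-Q plot: points close to the reference line are consistent with normal errors
qqnorm(res)
qqline(res)

# Shapiro-Wilk test: a small p-value suggests the errors are not normal
sw <- shapiro.test(res)
print(sw$p.value)
```

Because the data here are simulated with normal errors, the check is expected to be unremarkable; with real data, a curved Q-Q plot or a very small p-value would be a reason to look more closely.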

Slide 3: Data Simulation and Model Fitting (R Code)

Below is the R code used to simulate data and fit a linear regression model:

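# Fix the random seed so the simulation is reproducible
set.seed(123)

# Simulate n = 100 observations with true intercept 3, true slope 1.5, and N(0, 1) errors
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(n, mean = 0, sd = 1)

# Collect the simulated variables in a data frame
data <- data.frame(x = x, y = y)

# Fit the simple linear regression of y on x by ordinary least squares
model <- lm(y ~ x, data = data)

# Print coefficient estimates, standard errors, and fit statistics
summary(model)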
## 
## Call:
## lm(formula = y ~ x, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.9073 -0.6835 -0.0875  0.5806  3.2904 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.02838    0.29338   10.32   <2e-16 ***
## x            1.47376    0.05344   27.58   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9707 on 98 degrees of freedom
## Multiple R-squared:  0.8859, Adjusted R-squared:  0.8847 
## F-statistic: 760.6 on 1 and 98 DF,  p-value: < 2.2e-16

Slide 4: ggplot2 Plot: Scatter Plot with Regression Line

This slide shows a scatter plot with the fitted regression line using ggplot2:
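
The plotting code for this slide is not included in the text. A minimal sketch of how such a plot could be produced is given below, assuming the ggplot2 package is installed. The object name `p_scatter`, the title, and the styling choices are assumptions rather than the original code.

```r
library(ggplot2)

# Re-create the simulated data so the sketch runs on its own
set.seed(123)
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(n, mean = 0, sd = 1)
data <- data.frame(x = x, y = y)

# Scatter plot of the observations with the least squares line overlaid;
# geom_smooth(method = "lm") refits y ~ x internally and draws a confidence band
p_scatter <- ggplot(data, aes(x = x, y = y)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, color = "blue") +
  labs(title = "Scatter Plot with Regression Line", x = "x", y = "y") +
  theme_minimal()

print(p_scatter)
```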

Slide 5: ggplot2 Plot: Residual Plot

This slide plots the residuals versus the predictor to assess the model fit:
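
The plotting code for this slide is not included in the text. One possible way to draw the residual plot is sketched below, assuming ggplot2 is installed. The column name `residuals`, the object name `p_resid`, and the styling are illustrative assumptions.

```r
library(ggplot2)

# Re-create the simulated data and the fitted model so the sketch runs on its own
set.seed(123)
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(n, mean = 0, sd = 1)
data <- data.frame(x = x, y = y)
model <- lm(y ~ x, data = data)

# Store the residuals (observed minus fitted values) alongside the predictor
data$residuals <- resid(model)

# Residuals against the predictor, with a dashed reference line at zero
p_resid <- ggplot(data, aes(x = x, y = residuals)) +
  geom_point(alpha = 0.7) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") +
  labs(title = "Residuals vs. Predictor", x = "x", y = "Residuals") +
  theme_minimal()

print(p_resid)
```

A patternless band of points around zero is consistent with the linearity and homoscedasticity assumptions; curvature or a funnel shape would suggest a violation.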

Slide 6: Plotly 3D Plot: Fitted Values and Residuals

This slide presents an interactive Plotly 3D scatter plot visualizing the predictor, predicted values, and residuals:
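
The plotting code for this slide is not included in the text. The sketch below shows one way such a figure might be built, assuming the plotly package is installed. The data frame `plot_data`, the object name `p_3d`, and the axis titles are assumptions, not the original code.

```r
library(plotly)

# Re-create the simulated data and the fitted model so the sketch runs on its own
set.seed(123)
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(n, mean = 0, sd = 1)
data <- data.frame(x = x, y = y)
model <- lm(y ~ x, data = data)

# One row per observation: predictor, fitted value, and residual
plot_data <- data.frame(
  x = data$x,
  fitted = fitted(model),
  residual = resid(model)
)

# 3D scatter plot with the three quantities on the x, y, and z axes
p_3d <- plot_ly(
  plot_data,
  x = ~x, y = ~fitted, z = ~residual,
  type = "scatter3d", mode = "markers",
  marker = list(size = 3)
)

# Label the axes of the 3D scene
p_3d <- plotly::layout(
  p_3d,
  scene = list(
    xaxis = list(title = "x"),
    yaxis = list(title = "Fitted values"),
    zaxis = list(title = "Residuals")
  )
)

p_3d  # renders the interactive widget in RStudio or an R Markdown document
```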

Slide 7: Least Squares Estimation

The coefficient estimates in simple linear regression are obtained by minimizing the sum of squared residuals. The resulting formulas are:
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]
These formulas define the least squares estimator used in the regression analysis.
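
The closed-form formulas above can be checked numerically against `lm()`. The following is a minimal base R sketch that reuses the simulation settings from the earlier slide; the variable names `beta0_hat`, `beta1_hat`, `manual`, and `fitted_lm` are illustrative.

```r
# Re-create the simulated data so the sketch runs on its own
set.seed(123)
n <- 100
x <- rnorm(n, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(n, mean = 0, sd = 1)

# Sample means of the predictor and the response
x_bar <- mean(x)
y_bar <- mean(y)

# Slope: sum of cross-deviations divided by the sum of squared deviations of x
beta1_hat <- sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)

# Intercept: forces the fitted line through the point (x_bar, y_bar)
beta0_hat <- y_bar - beta1_hat * x_bar

# Compare the hand-computed estimates with those returned by lm()
manual <- c(beta0_hat, beta1_hat)
fitted_lm <- unname(coef(lm(y ~ x)))

print(manual)
print(fitted_lm)
print(all.equal(manual, fitted_lm))  # TRUE: both routes give the same estimates
```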

Slide 8: Conclusion

•   Simple linear regression is a basic method for describing the
    relationship between a dependent variable and a single independent variable.
•   Valid inference depends on the model assumptions holding.
•   Visualizations with ggplot2 and Plotly help in interpreting and
    diagnosing the model fit.
•   Further analyses and diagnostic measures are recommended for
    in-depth evaluation.