What is Simple Linear Regression?

Simple linear regression models the expected value of a response \(Y\) as a straight-line function of a single predictor \(X\).

Examples:

  • Cars: MPG vs weight (mtcars dataset in R)
  • Finance: stock return vs market return (simulated CAPM-style data)

Model (Math)

\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]

Assumptions (common):

\[ E(\varepsilon_i)=0, \qquad \operatorname{Var}(\varepsilon_i)=\sigma^2 \]

R Code to Fit Models

# Car example: mpg vs weight
fit_cars <- lm(mpg ~ wt, data = mtcars)
summary(fit_cars)
## 
## Call:
## lm(formula = mpg ~ wt, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5432 -2.3647 -0.1252  1.4096  6.8727 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.2851     1.8776  19.858  < 2e-16 ***
## wt           -5.3445     0.5591  -9.559 1.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.046 on 30 degrees of freedom
## Multiple R-squared:  0.7528, Adjusted R-squared:  0.7446 
## F-statistic: 91.38 on 1 and 30 DF,  p-value: 1.294e-10
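
# Finance example: simulated returns
set.seed(42)
# 120 daily market returns: mean 0.05%, standard deviation 1%
market <- rnorm(120, 0.0005, 0.01)
# Stock returns with true alpha = 0.0002, true beta = 1.2, noise sd = 1.2%
stock <- 0.0002 + 1.2*market + rnorm(120, 0, 0.012)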

fit_fin <- lm(stock ~ market)
summary(fit_fin)
## 
## Call:
## lm(formula = stock ~ market)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0231907 -0.0076228 -0.0005725  0.0067849  0.0257825 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.0009691  0.0009734  -0.996    0.322    
## market       1.0574839  0.0940264  11.247   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01063 on 118 degrees of freedom
## Multiple R-squared:  0.5174, Adjusted R-squared:  0.5133 
## F-statistic: 126.5 on 1 and 118 DF,  p-value: < 2.2e-16

ggplot #1: Car MPG vs Weight
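
A minimal sketch of one way to draw this figure, assuming the ggplot2 package is installed. The object name `p_cars`, the axis labels, and the confidence band are illustrative choices, not necessarily the settings of the original plot.

```r
library(ggplot2)

# Scatterplot of mpg against weight with the fitted least-squares line.
# geom_smooth(method = "lm") refits mpg ~ wt internally, so the line
# matches lm(mpg ~ wt, data = mtcars).
p_cars <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    title = "Car MPG vs Weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  )

print(p_cars)
```

The downward-sloping line reflects the negative slope estimate for `wt` in the model summary: heavier cars tend to have lower fuel economy.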

Finance Interpretation

Let:

  • \(X\) = daily market return
  • \(Y\) = daily stock return

Interpretation:

  • \(\beta_1\) = “beta” (sensitivity to the market)
  • \(\beta_0\) = “alpha” (return not explained by the market)
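
As a rough illustration of this reading, the estimated alpha and beta can be pulled out of the fitted model. The sketch below repeats the simulation so that it runs on its own; `alpha_hat`, `beta_hat`, and `ci` are names introduced here for illustration.

```r
# Recreate the simulated returns and the fit (same seed as the fitting code).
set.seed(42)
market <- rnorm(120, 0.0005, 0.01)
stock  <- 0.0002 + 1.2 * market + rnorm(120, 0, 0.012)
fit_fin <- lm(stock ~ market)

# Intercept = estimated alpha, slope = estimated beta.
est       <- coef(fit_fin)
alpha_hat <- unname(est["(Intercept)"])
beta_hat  <- unname(est["market"])

# 95% confidence intervals for both coefficients.
ci <- confint(fit_fin, level = 0.95)

cat("alpha:", alpha_hat, "\n")
cat("beta: ", beta_hat, "\n")
print(ci)
```

In the summary output, the slope estimate of about 1.06 sits roughly 1.5 standard errors below the simulated value of 1.2, which is ordinary sampling variation for 120 observations.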

ggplot #2: Stock vs Market
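
A minimal sketch of how this figure could be produced, again assuming ggplot2 is installed. The data frame `returns`, the object name `p_fin`, and the styling are illustrative choices; the simulation is repeated so the block runs on its own.

```r
library(ggplot2)

# Recreate the simulated returns from the model-fitting code (same seed).
set.seed(42)
market <- rnorm(120, 0.0005, 0.01)
stock  <- 0.0002 + 1.2 * market + rnorm(120, 0, 0.012)
returns <- data.frame(market = market, stock = stock)

# Scatterplot of stock return against market return; the slope of the
# fitted line is the estimated beta and its intercept the estimated alpha.
p_fin <- ggplot(returns, aes(x = market, y = stock)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    title = "Stock vs Market",
    x = "Market daily return",
    y = "Stock daily return"
  )

print(p_fin)
```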

Least Squares + 3D Plotly

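Ordinary least squares (OLS) chooses the estimates \(\hat{\beta}_0\) and \(\hat{\beta}_1\) that minimize the sum of squared residuals: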

\[ S(\beta_0,\beta_1)=\sum_{i=1}^{n}\left(Y_i-(\beta_0+\beta_1X_i)\right)^2 \]
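
To make the minimization concrete, the sketch below computes the standard closed-form minimizers of \(S\) for the cars data, checks them against `lm()`, and evaluates \(S\) on a grid of candidate coefficients. The grid ranges, the helper `sse()`, and the optional plotly surface are assumptions made for illustration, not a reconstruction of the original 3D figure.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form minimizers of S(b0, b1).
b1_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0_hat <- mean(y) - b1_hat * mean(x)

# The least-squares objective: sum of squared residuals for given (b0, b1).
sse <- function(b0, b1) sum((y - (b0 + b1 * x))^2)

# Evaluate S on a grid centred on the estimates (ranges chosen arbitrarily).
b0_grid <- seq(b0_hat - 10, b0_hat + 10, length.out = 41)
b1_grid <- seq(b1_hat - 3,  b1_hat + 3,  length.out = 41)
S <- outer(b0_grid, b1_grid, Vectorize(sse))  # S[i, j] = sse(b0_grid[i], b1_grid[j])

# The closed-form estimates agree with lm(), and no grid point beats them.
fit <- lm(mpg ~ wt, data = mtcars)
print(rbind(closed_form = c(b0_hat, b1_hat), lm = unname(coef(fit))))
print(c(sse_at_estimates = sse(b0_hat, b1_hat), min_on_grid = min(S)))

# Optional 3D surface, only if plotly is installed. plotly expects rows of z
# to follow y and columns to follow x, hence the transpose.
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(x = b0_grid, y = b1_grid, z = t(S), type = "surface")
  if (interactive()) print(fig)
}
```

The surface is a bowl in \((\beta_0, \beta_1)\) whose lowest point is the OLS estimate, which is why a single pair of coefficients solves the problem.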

Inference on the Slope

Test:

\[ H_0:\beta_1 = 0 \quad \text{vs} \quad H_a:\beta_1 \neq 0 \]

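Statistic:

\[ t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)} \]

Under \(H_0\), and assuming normally distributed errors, \(t\) follows a \(t\) distribution with \(n-2\) degrees of freedom.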

  • A small p-value provides evidence that \(X\) is linearly related to \(Y\).
  • A large p-value means the data do not provide enough evidence of a linear relationship; it does not show that no relationship exists.
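
The test can be reproduced by hand from the coefficient table. The sketch below does this for the cars model, assuming the usual two-sided test with \(n-2\) residual degrees of freedom; the variable names are illustrative.

```r
fit_cars <- lm(mpg ~ wt, data = mtcars)
coefs <- summary(fit_cars)$coefficients

# Slope estimate and its standard error from the coefficient table.
b1_hat <- coefs["wt", "Estimate"]
se_b1  <- coefs["wt", "Std. Error"]
df     <- df.residual(fit_cars)  # n - 2 = 32 - 2 = 30

# t statistic and two-sided p-value for H0: beta_1 = 0.
t_stat  <- b1_hat / se_b1
p_value <- 2 * pt(-abs(t_stat), df = df)

# Compare with the values reported by summary().
print(c(t = t_stat, p = p_value))
print(coefs["wt", c("t value", "Pr(>|t|)")])
```

Both lines should agree with the `wt` row of the summary output, where the very small p-value is strong evidence of a linear relationship between weight and fuel economy.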