- SLR idea & notation
- Example: trees (Girth → Volume)
- Inference (CIs / tests)
- Diagnostics (residual checks)
- Plotly 3D demo
- Reproducible workflow
trees (Girth → Volume)

We model a response \(Y\) using a single predictor \(X\):

\[ Y_i=\beta_0+\beta_1 X_i+\varepsilon_i,\quad \varepsilon_i\stackrel{iid}{\sim}N(0,\sigma^2) \]

Assumptions: linearity, constant variance, independence, (approx.) normal errors.
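To make the model statement concrete, here is a minimal R sketch that simulates data satisfying the assumptions and then fits the line. The parameter values (`beta0`, `beta1`, `sigma`), the sample size, and the seed are illustrative assumptions, not values from these slides.

```r
# Simulate from Y_i = beta0 + beta1 * X_i + eps_i with iid N(0, sigma^2) errors.
# All numeric values below are hypothetical, chosen only for illustration.
set.seed(1)                               # arbitrary seed, for reproducibility
n     <- 200
beta0 <- 2
beta1 <- 0.5
sigma <- 1

x   <- runif(n, min = 0, max = 10)        # predictor values
eps <- rnorm(n, mean = 0, sd = sigma)     # independent, constant-variance, normal errors
y   <- beta0 + beta1 * x + eps            # linear mean function plus noise

sim_fit <- lm(y ~ x)                      # least-squares fit
coef(sim_fit)                             # estimates should land near beta0 and beta1
```

Because the simulated data satisfy every assumption by construction, the fitted coefficients should sit close to the chosen true values, which gives a baseline to compare real data against.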
##   Girth Height Volume
## 1   8.3     70   10.3
## 2   8.6     65   10.3
## 3   8.8     63   10.2
## 4  10.5     72   16.4
## 5  10.7     81   18.8
## 6  10.8     83   19.7
Least-squares estimates:

\[ \hat\beta_1 = \frac{\sum (x_i-\bar x)(y_i-\bar y)}{\sum (x_i-\bar x)^2}, \quad \hat\beta_0 = \bar y - \hat\beta_1 \bar x \]

Standard error and test statistic (the \(t_{n-2}\) distribution holds under \(H_0:\beta_1=0\)):

\[ \widehat{SE}(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{\sum (x_i-\bar x)^2}}, \quad \hat\sigma = \sqrt{\frac{SSE}{n-2}}, \quad t = \frac{\hat\beta_1}{\widehat{SE}(\hat\beta_1)} \sim t_{n-2} \]

\((1-\alpha)\) confidence interval for the slope:

\[ \hat\beta_1 \pm t_{n-2,\,1-\alpha/2}\cdot \widehat{SE}(\hat\beta_1) \]
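The formulas above can be checked by computing each quantity directly. The following sketch uses the built-in `trees` data with Girth as \(x\) and Volume as \(y\); the variable names (`Sxx`, `b1`, `se_b1`, and so on) and the choice \(\alpha = 0.05\) are assumptions made for illustration.

```r
# Hand computation of the SLR estimates, standard error, t statistic, and CI.
x <- trees$Girth
y <- trees$Volume
n <- length(x)

Sxx <- sum((x - mean(x))^2)                       # sum of squared deviations of x
b1  <- sum((x - mean(x)) * (y - mean(y))) / Sxx   # slope estimate
b0  <- mean(y) - b1 * mean(x)                     # intercept estimate

sse       <- sum((y - (b0 + b1 * x))^2)           # residual sum of squares
sigma_hat <- sqrt(sse / (n - 2))                  # residual standard error
se_b1     <- sigma_hat / sqrt(Sxx)                # standard error of the slope

t_stat  <- b1 / se_b1                             # test statistic for H0: beta1 = 0
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

alpha <- 0.05                                     # assumed level (95% interval)
ci_b1 <- b1 + c(-1, 1) * qt(1 - alpha / 2, df = n - 2) * se_b1

c(b0 = b0, b1 = b1, se_b1 = se_b1, sigma_hat = sigma_hat, t = t_stat)
ci_b1
```

These values should agree, up to rounding, with what `lm(Volume ~ Girth, data = trees)` reports.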
## 
## Call:
## lm(formula = Volume ~ Girth, data = trees)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.065 -3.107  0.152  3.495  9.587 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -36.9435     3.3651  -10.98 7.62e-12 ***
## Girth         5.0659     0.2474   20.48  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.252 on 29 degrees of freedom
## Multiple R-squared:  0.9353, Adjusted R-squared:  0.9331 
## F-statistic: 419.4 on 1 and 29 DF,  p-value: < 2.2e-16
## # A tibble: 1 × 12
##   r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC
##       <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>  <dbl> <dbl> <dbl>
## 1     0.935         0.933  4.25      419. 8.64e-19     1  -87.8  182.  186.
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
## # A tibble: 2 × 7
##   term        estimate std.error statistic  p.value conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
## 1 (Intercept)   -36.9      3.37      -11.0 7.62e-12   -43.8     -30.1 
## 2 Girth           5.07     0.247      20.5 8.64e-19     4.56      5.57
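The two tibbles above resemble the output of `glance()` and `tidy(..., conf.int = TRUE)` from the broom package, but the calls themselves are not shown, so that attribution is an assumption. A base-R sketch that retrieves the same quantities without extra packages might look like this:

```r
# Model-level and coefficient-level summaries using base R only.
mod <- lm(Volume ~ Girth, data = trees)
s   <- summary(mod)

# Model-level statistics (compare with the one-row tibble)
fit_stats <- c(
  r.squared     = s$r.squared,
  adj.r.squared = s$adj.r.squared,
  sigma         = s$sigma,
  logLik        = as.numeric(logLik(mod)),
  AIC           = AIC(mod),
  BIC           = BIC(mod)
)
fit_stats

# Coefficient table: estimate, standard error, t value, p-value
coef(s)

# 95% confidence intervals (compare with conf.low / conf.high)
ci_mod <- confint(mod, level = 0.95)
ci_mod
```

The slope interval excludes zero, which is consistent with the very small p-value reported for Girth.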
- mod <- lm(Y ~ X, data=...)
- summary(mod) (\(\hat\beta\), SEs, p-values, \(R^2\))
- geom_point() + geom_smooth(method="lm")
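Put together for the `trees` example, the three steps above might be sketched as follows. This assumes the ggplot2 package is installed; the axis labels use the units given in the `trees` help page, and the object names `mod` and `p` are illustrative.

```r
library(ggplot2)  # assumed to be installed

# 1. Fit the simple linear regression
mod <- lm(Volume ~ Girth, data = trees)

# 2. Inspect coefficients, standard errors, p-values, and R^2
summary(mod)

# 3. Scatterplot with the fitted least-squares line and its confidence band
p <- ggplot(trees, aes(x = Girth, y = Volume)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(x = "Girth (in)", y = "Volume (cu ft)")
print(p)
```

Passing `formula = y ~ x` explicitly avoids the message ggplot2 prints when it has to guess the smoothing formula.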