Jared Cross
3/18/2021
Note: There are MANY different types of best-fit lines. The most common type, which we’ll discuss here, is an “ordinary least squares” best-fit line.
The best-fit line passes through the point (\(\bar{x}\), \(\bar{y}\)).
## [1] 0.406289
The best-fit line for a z-score vs. z-score graph passes through (0, 0) and has a slope equal to the correlation between x and y.
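The z-score claim can be checked numerically. The following is a minimal sketch on made-up data: the vectors `x` and `y` are hypothetical and are not the `mes` data used in these notes.

```r
# Hypothetical data, for illustration only (not the `mes` data set)
set.seed(1)
x <- rnorm(50, mean = 26, sd = 1.5)
y <- 20 + 0.9 * x + rnorm(50, sd = 4)

# Convert both variables to z-scores
zx <- (x - mean(x)) / sd(x)
zy <- (y - mean(y)) / sd(y)

# Fit the least squares line on the z-scores
fit_z <- lm(zy ~ zx)
coef(fit_z)   # intercept is 0 up to rounding error; slope equals the correlation
cor(x, y)     # compare with the slope above
```

The slope printed by `coef(fit_z)` should agree with `cor(x, y)`, since standardizing both variables makes \(\sigma_x = \sigma_y = 1\).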
\[y = mx + b\]
\[m = r \cdot \frac{\sigma_y}{\sigma_x}\]
Since the line passes through (\(\bar{x}\), \(\bar{y}\)), we know that \(\bar{y} = m \cdot \bar{x} + b\). Solving for the intercept lets us write the line as \(y = m \cdot x + \bar{y} - m\bar{x}\), where
\[b = \bar{y} - m\bar{x}\]
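The two formulas for \(m\) and \(b\) can be tried by hand and compared with `lm()`. This is a sketch on invented measurements: the `foot` and `cubit` vectors below are hypothetical values, not the `mes` data.

```r
# Hypothetical data, for illustration only (not the `mes` data set)
foot  <- c(24.1, 25.0, 25.4, 26.2, 26.5, 27.0, 27.3, 28.1)
cubit <- c(42.0, 44.5, 41.8, 45.1, 43.9, 46.2, 44.0, 47.3)

r <- cor(foot, cubit)
m <- r * sd(cubit) / sd(foot)        # slope: m = r * sigma_y / sigma_x
b <- mean(cubit) - m * mean(foot)    # intercept: b = y-bar - m * x-bar

c(intercept = b, slope = m)
coef(lm(cubit ~ foot))               # should match the hand calculation
```

Both lines of output should show the same intercept and slope, which is one way to confirm that `lm()` is fitting the ordinary least squares line described here.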
##
## Call:
## lm(formula = cubit ~ foot, data = mes)
##
## Coefficients:
## (Intercept) foot
## 19.2385 0.9472
which means:
\[cubit = 0.947 \cdot foot + 19.2\] (in centimeters)
If we know that someone’s foot is 28 cm long, we would guess that their cubit is \(0.947 \cdot 28 + 19.2 \approx 45.7\) cm long.
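The same prediction can be written as a small function. This is only a sketch: `predict_cubit` is a hypothetical helper name, and it uses the rounded coefficients from the equation above rather than the fitted model object.

```r
# Coefficients as rounded in the cubit-vs-foot equation
slope     <- 0.947
intercept <- 19.2

# Hypothetical helper: predicted cubit length (cm) for a given foot length (cm)
predict_cubit <- function(foot_cm) slope * foot_cm + intercept

predict_cubit(28)   # about 45.7 cm
```

With a fitted model object available, `predict()` with `newdata = data.frame(foot = 28)` would do the same calculation using the unrounded coefficients.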
To get the uncertainty in the slope…
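library(dplyr)  # provides sample_frac()
# Refit the line on 1,000 bootstrap resamples (rows of mes drawn with replacement)
bootstrapped_equations <- replicate(1e3, {
  m <- lm(cubit ~ foot, data = sample_frac(mes, 1, replace = TRUE))
  coef(m)
})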
## [1] 0.965405
## [1] 0.6338283
## [1] 18.87761
## [1] 16.71232
##
## Call:
## lm(formula = cubit ~ foot, data = mes)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.4976 -1.7864 -0.1113 0.5611 11.3688
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.2385 17.1206 1.124 0.287
## foot 0.9472 0.6736 1.406 0.190
##
## Residual standard error: 4.566 on 10 degrees of freedom
## Multiple R-squared: 0.1651, Adjusted R-squared: 0.08158
## F-statistic: 1.977 on 1 and 10 DF, p-value: 0.19
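One way to summarize bootstrap replicates like these is to take the standard deviation of each coefficient across resamples and compare it with the Std. Error column that `summary()` reports. The sketch below does this in base R on simulated data: the `toy` data frame is hypothetical (it is not `mes`), and its numbers will not match the output above.

```r
# Hypothetical data, for illustration only (not the `mes` data set)
set.seed(42)
toy <- data.frame(foot = rnorm(12, mean = 25.5, sd = 2))
toy$cubit <- 19 + 0.95 * toy$foot + rnorm(12, sd = 4.5)

# Refit the line on 1,000 resamples of the rows (drawn with replacement)
boot <- replicate(1e3, {
  rows <- sample(nrow(toy), replace = TRUE)
  coef(lm(cubit ~ foot, data = toy[rows, ]))
})

# `boot` is a 2 x 1000 matrix with one row per coefficient
apply(boot, 1, sd)                          # bootstrap standard errors
quantile(boot["foot", ], c(0.025, 0.975))   # rough 95% interval for the slope

# Formula-based standard errors from lm(), for comparison
summary(lm(cubit ~ foot, data = toy))$coefficients[, "Std. Error"]
```

With only 12 rows, the bootstrap and formula-based standard errors will usually be of similar size but need not agree closely.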