Phillip M. Alday
12 August 2016 (Melbourne)
Determines the line of best fit
“Fits” a line
The line such that minimizes the summed distances from the line to the data points.
We can do this for arbitrary powers – these are the so-called \( L_p \) norms
(Fine print: \( 0^0 \) is in general undefined because it is equal to \( \frac{0}{0} \) and division by zero is undefined, but this definition is internally consistent here.)
Inspired by some comments from John Myles White
\[ Y = \beta_{1}X_1 + \beta_{0} + \varepsilon \]
Related via Bayes' Theorem: \[ p(H|D) = \frac{p(D|H)p(H)}{p(D)} \]
You need to remember that “likelihood” is a technical term. The likelihood of \( H \), \( Pr(O\|H) \), and the posterior probability of \( H \), \( Pr(H\|O) \), are different quantities and they can have different values. The likelihood of \( H \) is the probability that \( H \) confers on \( O \), not the probability that \( O \) confers on \( H \). Suppose you hear a noise coming from the attic of your house. You consider the hypothesis that there are gremlins up there bowling. The likelihood of this hypothesis is very high, since if there are gremlins bowling in the attic, there probably will be noise. But surely you don’t think that the noise makes it very probable that there are gremlins up there bowling. In this example, \( Pr(O\|H) \) is high and \( Pr(H\|O) \) is low. The gremlin hypothesis has a high likelihood (in the technical sense) but a low probability.
Sober, E. (2008). Evidence and Evolution: the Logic Behind the Science. Cambridge University Press.
glm()
)lm()
)summary(glm(y~x))
Call:
glm(formula = y ~ x)
Deviance Residuals:
Min 1Q Median 3Q Max
-181.168 -39.710 4.854 39.162 133.582
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 46.1361 12.6521 3.647 0.000428 ***
x 2.6567 0.2175 12.214 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for gaussian family taken to be 3942.172)
Null deviance: 974458 on 99 degrees of freedom
Residual deviance: 386333 on 98 degrees of freedom
AIC: 1115.7
Number of Fisher Scoring iterations: 2
summary(lm(y~x))
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-181.168 -39.710 4.854 39.162 133.582
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 46.1361 12.6521 3.647 0.000428 ***
x 2.6567 0.2175 12.214 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 62.79 on 98 degrees of freedom
Multiple R-squared: 0.6035, Adjusted R-squared: 0.5995
F-statistic: 149.2 on 1 and 98 DF, p-value: < 2.2e-16
xyplot(weight ~ feed,data=chickwts,main="Effect of diet on chick growth",ylab="weight",xlab="feed")
pair <- subset(chickwts,feed %in% c("casein","horsebean"),drop=TRUE)
xyplot(weight ~ feed,data=pair,main="Effect of diet on chick growth",ylab="weight",xlab="feed")
t.test(weight ~ feed,data=pair,var.equal=TRUE,paired=FALSE)
Two Sample t-test
data: weight by feed
t = 7.0197, df = 20, p-value = 8.255e-07
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
114.8329 211.9338
sample estimates:
mean in group casein mean in group horsebean
323.5833 160.2000
summary(aov(weight ~ feed,data=pair))
Df Sum Sq Mean Sq F value Pr(>F)
feed 1 145604 145604 49.28 8.25e-07 ***
Residuals 20 59097 2955
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Does the slope of this line differ from zero?
xyplot(weight ~ feed,data=pair,main="Effect of diet on chick growth",ylab="weight",xlab="feed",type=c("p","r"))
summary(lm(weight ~ feed,data=pair))
Call:
lm(formula = weight ~ feed, data = pair)
Residuals:
Min 1Q Median 3Q Max
-107.58 -33.20 3.80 42.17 80.42
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 323.58 15.69 20.62 6.02e-15 ***
feedhorsebean -163.38 23.27 -7.02 8.25e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 54.36 on 20 degrees of freedom
Multiple R-squared: 0.7113, Adjusted R-squared: 0.6969
F-statistic: 49.28 on 1 and 20 DF, p-value: 8.255e-07
anova(lm(weight ~ feed,data=pair))
Analysis of Variance Table
Response: weight
Df Sum Sq Mean Sq F value Pr(>F)
feed 1 145604 145604 49.277 8.255e-07 ***
Residuals 20 59097 2955
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(lm(weight ~ feed,data=chickwts))
Call:
lm(formula = weight ~ feed, data = chickwts)
Residuals:
Min 1Q Median 3Q Max
-123.909 -34.413 1.571 38.170 103.091
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 323.583 15.834 20.436 < 2e-16 ***
feedhorsebean -163.383 23.485 -6.957 2.07e-09 ***
feedlinseed -104.833 22.393 -4.682 1.49e-05 ***
feedmeatmeal -46.674 22.896 -2.039 0.045567 *
feedsoybean -77.155 21.578 -3.576 0.000665 ***
feedsunflower 5.333 22.393 0.238 0.812495
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 54.85 on 65 degrees of freedom
Multiple R-squared: 0.5417, Adjusted R-squared: 0.5064
F-statistic: 15.36 on 5 and 65 DF, p-value: 5.936e-10
anova(lm(weight ~ feed,data=chickwts))
Analysis of Variance Table
Response: weight
Df Sum Sq Mean Sq F value Pr(>F)
feed 5 231129 46226 15.365 5.936e-10 ***
Residuals 65 195556 3009
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Contrasts matter!
library(car)
options(contrasts=c("contr.Sum","contr.poly"))
summary(lm(weight ~ feed,data=chickwts))
Call:
lm(formula = weight ~ feed, data = chickwts)
Residuals:
Min 1Q Median 3Q Max
-123.909 -34.413 1.571 38.170 103.091
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 259.131 6.543 39.602 < 2e-16 ***
feed[S.casein] 64.452 14.490 4.448 3.47e-05 ***
feed[S.horsebean] -98.931 15.601 -6.341 2.48e-08 ***
feed[S.linseed] -40.381 14.490 -2.787 0.00697 **
feed[S.meatmeal] 17.778 15.005 1.185 0.24042
feed[S.soybean] -12.703 13.641 -0.931 0.35519
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 54.85 on 65 degrees of freedom
Multiple R-squared: 0.5417, Adjusted R-squared: 0.5064
F-statistic: 15.36 on 5 and 65 DF, p-value: 5.936e-10