Rasmus Bååth
15/04/2014
We want to compare
M0: income ~ age + sex
with
M1: income ~ age + sex + height
Is M1 better than M0? R² is always higher for M1, so what does "better" mean?
Null hypothesis: height does not increase R² more than chance does
income ~ age + sex + rand()
head(d)
    age sex height income
1 25.12   f  157.4  21257
2 48.00   f  165.9  23997
3 47.42   f  162.2  24617
4 48.05   f  162.9  23440
5 58.74   f  168.9  25157
6 48.81   f  173.9  24911
library(mosaic)    # rsquared(), do(), rand()
library(lattice)   # xyplot(), histogram()
# cut() splits a quantitative variable into groups
# handy if you want to plot!
xyplot(income ~ age | cut(height, 3) + sex, data=d)
mod1 <- lm(income ~ age + sex + height, data=d)
rsquared(mod1)
[1] 0.8499
# rand() adds a column of pure random noise as an extra predictor (mosaic)
mod_rand <- lm(income ~ age + sex + rand(), data=d)
rsquared(mod_rand)
[1] 0.8395
mod_rand <- lm(income ~ age + sex + rand(), data=d)
rsquared(mod_rand)
[1] 0.846
# refit the model 500 times, each time with a fresh random predictor
null <- do(500) * lm(income ~ age + sex + rand(), data=d)
head(null)
  Intercept   age  sexm  rand. sigma r.squared
1     19293 100.2 797.5  99.77 560.3    0.8445
2     19283 100.6 802.8 -13.46 569.2    0.8395
3     19284 100.6 801.6 -12.11 569.2    0.8395
4     19303 100.5 782.8 -70.76 564.7    0.8420
5     19281 100.7 805.0  25.21 568.7    0.8398
6     19273 100.7 811.0 -55.65 567.2    0.8406
histogram(~ r.squared, data=null)
mod1 <- lm(income ~ age + sex + height, data=d)
rsquared(mod1)
[1] 0.8499
# the p-value: how often the random-predictor models matched or beat mod1's R²
mean( null$r.squared >= 0.8499)
[1] 0.014
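The same test can also be sketched with mosaic's shuffle(), which permutes the real height values instead of adding Gaussian noise (a variant, not what was used above):
# permutation variant: shuffling height breaks its link to income
# while keeping height's actual distribution
null_perm <- do(500) * lm(income ~ age + sex + shuffle(height), data=d)
mean(null_perm$r.squared >= rsquared(mod1))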
mod1 <- lm(income ~ age + sex + height, data=d)
summary(mod1)
Coefficients:
            Estimate Std. Error Pr(>|t|)
(Intercept) 17393.15    1143.21  < 2e-16 ***
age            93.40       3.61  < 2e-16 ***
sexm          672.19     105.17  5.9e-09 ***
height         13.20       6.47    0.044 *
A peculiar measure of how good the independent variables in a model are. Think of it as a convoluted version of R².
\[ \frac{\text{how much a variable increases } R^2 \text{ on average}}{\text{how much } R^2 \text{ increases with random variables}} \]
\[ F = \frac{R^2/(m - 1)}{(1 - R^2)/(n - m)} \]
where m is the number of estimated parameters (including the intercept) and n is the number of data points.
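A quick sanity check of the formula in R (a sketch; n = 100 and m = 4 follow from the 96 residual degrees of freedom in the ANOVA tables below):
# the overall F statistic computed directly from R²
n <- nrow(d)   # 100 data points
m <- 4         # estimated parameters: intercept, age, sexm, height
r2 <- rsquared(mod1)
(r2 / (m - 1)) / ((1 - r2) / (n - m))   # should match the F in summary(mod1)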
The null hypothesis for the F measure
When you compare models this way it is often called ANOVA, especially when the models being compared contain categorical variables.
Psychologists love ANOVA.
mod0 <- lm(income ~ age + sex, data=d)
mod1 <- lm(income ~ age + sex + height, data=d)
anova(mod0, mod1)
Analysis of Variance Table
Model 1: income ~ age + sex
Model 2: income ~ age + sex + height
  Res.Df      RSS Df Sum of Sq    F Pr(>F)
1     97 31119751
2     96 29084074  1   2035677 6.72  0.011 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mod0 <- lm(income ~ age, data=d)
mod1 <- lm(income ~ age + sex + height, data=d)
anova(mod0, mod1)
Analysis of Variance Table
Model 1: income ~ age
Model 2: income ~ age + sex + height
  Res.Df      RSS Df Sum of Sq    F  Pr(>F)
1     98 47119408
2     96 29084074  2   1.8e+07 29.8 8.7e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(mod1)
Analysis of Variance Table
Response: income
          Df   Sum Sq  Mean Sq F value  Pr(>F)
age        1 1.47e+08 1.47e+08  484.11  <2e-16 ***
sex        1 1.60e+07 1.60e+07   52.81 9.8e-11 ***
height     1 2.04e+06 2.04e+06    6.72   0.011 *
Residuals 96 2.91e+07 3.03e+05
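Note that anova() on a single model tests the terms sequentially (Type I sums of squares): each variable is only adjusted for the ones listed before it, so reordering the formula changes the table. A sketch:
# same variables, different order: the sequential F tests change
anova(lm(income ~ height + sex + age, data=d))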
Testing each variable as if it were entered last is the default in SPSS. In R you have to work a little for it.
mod0 <- lm(income ~ age + sex, data=d)
mod1 <- lm(income ~ age + sex + height, data=d)
anova(mod0, mod1)
#
mod0 <- lm(income ~ age + height, data=d)
mod1 <- lm(income ~ age + height + sex, data=d)
anova(mod0, mod1)
#
mod0 <- lm(income ~ sex + height, data=d)
mod1 <- lm(income ~ sex + height + age, data=d)
anova(mod0, mod1)
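The three comparisons above test each variable as if it were entered last. The same marginal tests can be had in one call (a sketch; drop1() is in base R, Anova() requires the car package):
drop1(mod1, test = "F")   # F test for dropping each term from the full model
library(car)
Anova(mod1, type = 3)     # SPSS-style Type III sums of squares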
It can be tricky to figure out which type was used. Type III sums of squares is the default in SPSS, so that's a good guess…
A two-way analysis of variance yielded a main effect for the diner’s gender, F(1, 108) = 3.93, p < .05, such that the average tip was significantly higher for men (M = 15.3%, SD = 4.44) than for women (M = 12.6%, SD = 6.18). The main effect of touch was non-significant, F(1, 108) = 2.24, p > .05. However, the interaction effect was significant, F(1, 108) = 5.55, p < .05.
tip ~ gender + touch + gender:touch
A two-way analysis of variance yielded a main effect for the diner’s gender, F(1, 108) = 3.93, p < .05, such that the average tip was significantly higher for men (M = 15.3%, SD = 4.44) than for women (M = 12.6%, SD = 6.18).
tip ~ gender + touch + gender:touch
vs
tip ~ touch + gender:touch + rand()
The main effect of touch was non-significant, F(1, 108) = 2.24, p > .05.
tip ~ gender + touch + gender:touch
vs
tip ~ gender + gender:touch + rand()
However, the interaction effect was significant, F(1, 108) = 5.55, p < .05, indicating that the gender effect was greater in the touch condition than in the non-touch condition.
tip ~ gender + touch + gender:touch
vs
tip ~ gender + touch + rand()
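In R each of these comparisons is a single anova() call. A minimal sketch, assuming a hypothetical data frame tips with columns tip, gender, and touch:
mod_full <- lm(tip ~ gender + touch + gender:touch, data=tips)
mod_no_int <- lm(tip ~ gender + touch, data=tips)   # drop the interaction
anova(mod_no_int, mod_full)   # the interaction test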
Why? Because tables.
Evidence
Decisions
The Fisherian and Neyman-Pearson approaches are not the same. The central contention of the Neyman-Pearson framework is that at the end of your study, you have to make a decision and walk away. Allegedly, a researcher once approached Fisher with 'non-significant' results, asking him what he should do, and Fisher said, 'go get more data'.
Evidence = strange
Decisions = super strange
“… surely, God loves the .06 nearly as much as the .05. Can there be any doubt that God views the strength of evidence for or against the null as a fairly continuous function of the magnitude of p?”
Rosnow, R. L., & Rosenthal, R. (1989)
The only winning move is not to play the game.
If you play the reject-accept game you end up in strange situations. p = 0.06 …
practically significant (p = 0.0831)
on the very fringes of significance (p = 0.0831)
fell barely short of significance (p = 0.0831)
height ~ nkids + sex + mother + father
I can say a priori that all of these factors are related to a child's height. I don't need to collect data for that…