Interactive effects of the explanatory variables on the response, so-called interactions, are specified using the colon ':'. To save ourselves some coding, we can simply use the shorthand asterisk '*' notation to include all main effects and interactions.
lm(y ~ x1 + x2 + x1:x2, data = ...)
## Can be written as:
lm(y ~ x1 * x2, data = ...)
## ...even with more than two explanatory variables
lm(y ~ x1 * x2 * x3, data = ...)
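To check how the shorthand expands, the formula can be inspected without fitting a model. The following is a minimal sketch using terms() from base R; the object names long, short and three are chosen here for illustration only.

```r
## terms() expands a formula symbolically, so no data are needed
long  <- attr(terms(y ~ x1 + x2 + x1:x2), "term.labels")
short <- attr(terms(y ~ x1 * x2), "term.labels")

## Both notations yield the same set of model terms
identical(long, short)

## With three variables, '*' adds all main effects, all two-way
## interactions and the three-way interaction
three <- attr(terms(y ~ x1 * x2 * x3), "term.labels")
three
```

Printing three lists every term that the three-variable shorthand adds to the model, which is worth doing before fitting because the number of terms grows quickly.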
Week 1
The principle of test statistics
A test statistic (t-value, F-value, …) can be regarded as a signal-to-noise ratio. The higher the ratio, the easier it becomes to detect a true signal. For example, the t-statistic of a two-sample t-test boils down to the difference in group means (signal) divided by the pooled standard error of that difference (noise).
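The signal-to-noise idea can be sketched by computing a t-statistic by hand and comparing it with t.test(). The two vectors a and b are made-up values for illustration only, and the comparison assumes the classical equal-variance t-test (var.equal = TRUE), not R's default Welch test.

```r
## Hypothetical measurements for two groups (made up for illustration)
a <- c(5.1, 4.8, 5.6, 5.0, 5.3)
b <- c(4.2, 4.6, 4.1, 4.9, 4.4)

## Signal: difference in group means
signal <- mean(a) - mean(b)

## Noise: standard error of that difference, based on the pooled variance
n_a <- length(a)
n_b <- length(b)
pooled_var <- ((n_a - 1) * var(a) + (n_b - 1) * var(b)) / (n_a + n_b - 2)
noise <- sqrt(pooled_var * (1 / n_a + 1 / n_b))

## Test statistic as a signal-to-noise ratio
t_manual <- signal / noise

## Should agree with the equal-variance two-sample t-test
t_builtin <- unname(t.test(a, b, var.equal = TRUE)$statistic)
all.equal(t_manual, t_builtin)
```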
Analysis of variance (ANOVA)
## Using the built-in data set PlantGrowth
data(PlantGrowth)
str(PlantGrowth)
## 'data.frame': 30 obs. of 2 variables:
## $ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
## $ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...
summary(PlantGrowth)
##      weight       group
##  Min.   :3.590   ctrl:10
##  1st Qu.:4.550   trt1:10
##  Median :5.155   trt2:10
##  Mean   :5.073
##  3rd Qu.:5.530
##  Max.   :6.310
Always carry out sanity checks (str, summary) and plot your data before you get carried away with statistical modelling. What do you conclude looking at the boxplot below?
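A boxplot of the PlantGrowth data can be drawn along the following lines. This is a minimal sketch with base graphics; the axis labels and the object name bp are chosen here and are not taken from the original slide.

```r
## Load the built-in data set
data(PlantGrowth)

## One box per factor level of 'group'; the return value holds the
## statistics used to draw the boxes
bp <- boxplot(weight ~ group, data = PlantGrowth,
              xlab = "Group", ylab = "Weight")

## Group names in plotting order
bp$names
```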
If we carry out an ANOVA with the aov command, we obtain a summary stating an overall P-value for the predictor variable. This only indicates whether the predictor variable had a statistically significant effect; it does not tell us where the differences lie, i.e. which levels of the predictor variable differ significantly from each other. They could all differ significantly from each other, but a single significant difference between any two of the factor levels is enough to make the overall effect of the predictor variable significant.
##             Df Sum Sq Mean Sq F value Pr(>F)
## group        2  3.766  1.8832   4.846 0.0159 *
## Residuals   27 10.492  0.3886
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
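A table like the one above can be obtained along the following lines. This is a minimal sketch; the object names fit, tab, f_value and p_value are chosen here for illustration and do not appear in the original slides.

```r
## Load the built-in data set
data(PlantGrowth)

## One-way ANOVA: does mean weight differ among the levels of 'group'?
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

## The ANOVA table is the first element of the summary object;
## the first row belongs to 'group', the second to the residuals
tab <- summary(fit)[[1]]
f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]
```

Extracting the overall F-value and P-value this way is convenient for reporting, but as noted above it says nothing about which group levels differ.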