To interpret the output table correctly, we should know a couple of things beforehand.
What lm() does is find the straight line that best fits your data (by least squares). The estimates in the lm() output are the regression coefficients, which tell you how much the response variable (Y) changes with a one-unit change in the explanatory variable (X). If the explanatory variable is categorical, a one-unit change means moving from one group to another. For instance, the estimates in the first example are the changes in decrease (the response variable) with a one-unit change in treatment group (the explanatory variable); that is, how much decrease changes when moving from the reference treatment to one of the other treatments. In a linear model with a categorical explanatory variable (ANOVA is one type of linear model), the first level of the factor is, by default, the one that all the others are compared to. This is why we said that the estimates are the changes in decrease between treatment A and each of the other treatments.
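To see the "compared to the first group" idea in numbers, here is a minimal sketch. It assumes the first example was fitted on R's built-in OrchardSprays data set (which has a decrease column and a treatment factor with levels A to H); the data set name is an assumption, so swap in your own data if it differs.

```r
# Assumption: the first example used the built-in OrchardSprays data
fit <- lm(decrease ~ treatment, data = OrchardSprays)

# Mean of the response within each treatment group (named numeric vector)
group_means <- with(OrchardSprays, sapply(split(decrease, treatment), mean))

# The reference group is the first level of the factor
levels(OrchardSprays$treatment)[1]

# The intercept is the mean of the reference group (A) ...
coef(fit)[["(Intercept)"]]
group_means[["A"]]

# ... and every other estimate is "that group's mean minus the mean of A"
coef(fit)[-1]
group_means[-1] - group_means[["A"]]
```

The last two printed vectors should agree, which is all the estimates in a one-way model are: differences from the reference group.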
When there are two treatments (a two-way factorial design, like the second example), there can be an “interaction term”. An interaction term describes how much the change in the response variable due to one treatment depends on the level of the other treatment. For instance, in the second example, if there is no interaction (you do not specify one to be fitted), lm() will only estimate the change in uptake due to a change in Type (Quebec or Mississippi) and the change due to a change in Treatment (nonchilled or chilled). The two changes are not allowed to depend on each other. You can think of each one as a general change in uptake due to plant type (or treatment), regardless of treatment (or plant type).
CO2.lm = lm(uptake~Treatment + Type, data = CO2)
summary(CO2.lm)
##
## Call:
## lm(formula = uptake ~ Treatment + Type, data = CO2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.373 -4.658 1.967 5.747 12.287
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.973 1.536 24.065 < 2e-16 ***
## Treatmentchilled -6.860 1.774 -3.867 0.000222 ***
## TypeMississippi -12.660 1.774 -7.136 3.68e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.13 on 81 degrees of freedom
## Multiple R-squared: 0.4485, Adjusted R-squared: 0.4349
## F-statistic: 32.94 on 2 and 81 DF, p-value: 3.407e-11
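The -6.8595238 is the difference in mean uptake between chilled and nonchilled plants, chilled minus nonchilled (calculate the means of the two groups to prove it to yourself). The -12.6595238 is the difference in mean uptake between Mississippi and Quebec plants, Mississippi minus Quebec (again, calculate the means of the two groups to prove it to yourself). The match is exact because the design is balanced: every combination of Treatment and Type has the same number of observations. R does not do pairwise comparisons among all the groups, as it did in the one-way ANOVA, because there are two factors. A one-unit change in an explanatory variable only moves from nonchilled to chilled or from Quebec to Mississippi. Without an interaction term, lm() can only estimate the general effects of plant type and treatment.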
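If you want to prove it to yourself, the check can be sketched like this, using only base R and the built-in CO2 data. The object names trt_means and type_means are just illustrative.

```r
# Mean uptake in each treatment group and in each plant type
trt_means  <- with(CO2, sapply(split(uptake, Treatment), mean))
type_means <- with(CO2, sapply(split(uptake, Type), mean))

# Differences between the group means
trt_diff  <- trt_means[["chilled"]] - trt_means[["nonchilled"]]
type_diff <- type_means[["Mississippi"]] - type_means[["Quebec"]]
trt_diff
type_diff

# Compare with the estimates from the model without interaction
coef(lm(uptake ~ Treatment + Type, data = CO2))
```

The two differences should equal the Treatmentchilled and TypeMississippi estimates.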
The reason 36.972619 is not exactly the mean uptake of nonchilled Quebec plants is that lm() has to find an intercept (the mean of nonchilled Quebec plants) given one fixed effect of treatment and one fixed effect of plant type. So 36.972619 is what lm() “thinks” the mean of nonchilled Quebec plants should be. Note that the two effects are forced not to interact with each other.
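The gap between what lm() “thinks” and what the data say can be seen directly. This is a small sketch that refits the model without interaction and compares its predicted group means with the observed ones; cell_means and new_groups are illustrative names.

```r
CO2.lm <- lm(uptake ~ Treatment + Type, data = CO2)

# Observed mean uptake in each of the four Treatment x Type groups
cell_means <- with(CO2, tapply(uptake, list(Treatment, Type), mean))
cell_means

# Observed mean of nonchilled Quebec plants vs. the model's intercept
cell_means["nonchilled", "Quebec"]
coef(CO2.lm)[["(Intercept)"]]

# Group means predicted by the model without interaction
new_groups <- expand.grid(Treatment = levels(CO2$Treatment),
                          Type      = levels(CO2$Type))
cbind(new_groups, predicted = predict(CO2.lm, newdata = new_groups))
```

The predicted means do not match the observed ones exactly, because three coefficients cannot reproduce four group means unless the two effects happen to be perfectly additive.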
However, if we include an interaction term to allow the two changes to depend on each other, we get a third estimate (a third regression coefficient, “Treatmentchilled:TypeMississippi”). It shows how the change in uptake from nonchilled to chilled differs between Mississippi plants and Quebec plants.
CO2.lm2 = lm(uptake ~ Treatment*Type, data = CO2)
summary(CO2.lm2)
##
## Call:
## lm(formula = uptake ~ Treatment * Type, data = CO2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.452 -3.624 2.167 5.773 10.648
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35.333 1.747 20.225 < 2e-16 ***
## Treatmentchilled -3.581 2.471 -1.449 0.151141
## TypeMississippi -9.381 2.471 -3.797 0.000284 ***
## Treatmentchilled:TypeMississippi -6.557 3.494 -1.877 0.064213 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.006 on 80 degrees of freedom
## Multiple R-squared: 0.4718, Adjusted R-squared: 0.452
## F-statistic: 23.82 on 3 and 80 DF, p-value: 4.106e-11
From the output, we see that uptake in the chilled treatment is 3.5809524 lower than in the nonchilled treatment FOR Quebec plants! The uptake of Mississippi plants is 9.3809524 lower than that of Quebec plants in the nonchilled treatment! The third estimate (regression coefficient) describes how much more uptake decreases from nonchilled to chilled in Mississippi plants than in Quebec plants, OR how much more uptake decreases from Quebec to Mississippi in the chilled treatment than in the nonchilled treatment. The two are equal.
The reason why the intercept (35.333) is now exactly the mean of the nonchilled Quebec plants is that the interaction term gives the model as many coefficients as there are groups, so it can match the mean of each of the four Treatment and Type combinations instead of settling for a compromise.
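As a check, all four estimates can be rebuilt from the four group means. The sketch below uses only base R and the built-in CO2 data; the names m, chill_in_quebec and chill_in_mississippi are illustrative.

```r
CO2.lm2 <- lm(uptake ~ Treatment * Type, data = CO2)

# Observed mean uptake in each Treatment x Type group
m <- with(CO2, tapply(uptake, list(Treatment, Type), mean))

# Intercept: mean of the reference group (nonchilled Quebec)
m["nonchilled", "Quebec"]

# Treatmentchilled: effect of chilling within Quebec plants
chill_in_quebec <- m["chilled", "Quebec"] - m["nonchilled", "Quebec"]

# TypeMississippi: Mississippi minus Quebec within the nonchilled treatment
type_in_nonchilled <- m["nonchilled", "Mississippi"] - m["nonchilled", "Quebec"]

# Interaction: effect of chilling in Mississippi minus effect of chilling in Quebec
chill_in_mississippi <- m["chilled", "Mississippi"] - m["nonchilled", "Mississippi"]
interaction_est <- chill_in_mississippi - chill_in_quebec

c(chill_in_quebec, type_in_nonchilled, interaction_est)
coef(CO2.lm2)
```

Computing the interaction the other way round (the Quebec-to-Mississippi change in chilled minus the same change in nonchilled) gives the same number, which is why the two readings of the third estimate are equivalent.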