Similar to ANOVA, linear regression models can handle multiple explanatory variables. When we have two or more predictors, we speak of multiple linear regression. With two predictors, we can still graphically imagine the relationship with the response variable in a 3-dimensional space. With more predictors, however, we are looking at a multi-dimensional space that is hard for us to imagine.
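With \(p\) predictors \(x_1, \dots, x_p\), the model is simply the familiar linear regression extended by one slope coefficient per additional predictor: \[
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)
\]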
Let’s start with an example with two predictors using the longley macroeconomic data in the car package. We will use the plotly package to produce a 3D figure of this data (for more information see https://plot.ly/r/reference/). A few tips on handling the 3D image in the plotting window: drag to rotate the view, scroll to zoom, and hover over a point to read off its values.
NOTE that currently 3D figures cannot be saved into a Word document from RMarkdown, so you can only try out the 3D image rendering by running the code live (remove the ‘#’ symbol where I muted the code starting with plot_ly(...) below)!
## Load the data
library(car)
data(longley)
## Data exploration
?longley
head(longley)
str(longley)
## 'data.frame': 16 obs. of 7 variables:
## $ GNP.deflator: num 83 88.5 88.2 89.5 96.2 ...
## $ GNP : num 234 259 258 285 329 ...
## $ Unemployed : num 236 232 368 335 210 ...
## $ Armed.Forces: num 159 146 162 165 310 ...
## $ Population : num 108 109 110 111 112 ...
## $ Year : int 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 ...
## $ Employed : num 60.3 61.1 60.2 61.2 63.2 ...
summary(longley)
## GNP.deflator GNP Unemployed Armed.Forces
## Min. : 83.00 Min. :234.3 Min. :187.0 Min. :145.6
## 1st Qu.: 94.53 1st Qu.:317.9 1st Qu.:234.8 1st Qu.:229.8
## Median :100.60 Median :381.4 Median :314.4 Median :271.8
## Mean :101.68 Mean :387.7 Mean :319.3 Mean :260.7
## 3rd Qu.:111.25 3rd Qu.:454.1 3rd Qu.:384.2 3rd Qu.:306.1
## Max. :116.90 Max. :554.9 Max. :480.6 Max. :359.4
## Population Year Employed
## Min. :107.6 Min. :1947 Min. :60.17
## 1st Qu.:111.8 1st Qu.:1951 1st Qu.:62.71
## Median :116.8 Median :1954 Median :65.50
## Mean :117.4 Mean :1954 Mean :65.32
## 3rd Qu.:122.3 3rd Qu.:1958 3rd Qu.:68.29
## Max. :130.1 Max. :1962 Max. :70.55
## Create a 3-D plot of the data (not important for the exam but fun!)
library(plotly)
plot(GNP ~ Population, data = longley)
plot(GNP ~ Unemployed, data = longley)
## Funnily enough, in this type of 3D plot the response is specified as the z-variable.
## Remove the '#' symbol below to run the code in live mode. RMarkdown is not capable of including 3D images.
plot_ly(data = longley, x = ~ Unemployed, y = ~ Population, z = ~ GNP, type = "scatter3d", mode = "markers")
And here is the corresponding multiple regression model, first without and then with an interaction term:
## Without interaction
m1 <- lm(GNP ~ Unemployed + Population, data = longley)
summary(m1)
##
## Call:
## lm(formula = GNP ~ Unemployed + Population, data = longley)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.468 -5.059 1.479 5.515 11.956
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.392e+03 4.603e+01 -30.244 1.96e-13 ***
## Unemployed -1.533e-01 3.337e-02 -4.593 0.000504 ***
## Population 1.558e+01 4.483e-01 34.743 3.29e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.781 on 13 degrees of freedom
## Multiple R-squared: 0.9932, Adjusted R-squared: 0.9922
## F-statistic: 954.4 on 2 and 13 DF, p-value: 7.88e-15
## With interaction
m2 <- lm(GNP ~ Unemployed * Population, data = longley)
summary(m2)
##
## Call:
## lm(formula = GNP ~ Unemployed * Population, data = longley)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.8533 -4.1051 0.2683 3.5188 10.3752
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.716e+03 1.843e+02 -9.310 7.71e-07 ***
## Unemployed 7.737e-01 5.144e-01 1.504 0.1584
## Population 1.837e+01 1.600e+00 11.479 7.92e-08 ***
## Unemployed:Population -7.908e-03 4.380e-03 -1.805 0.0962 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.105 on 12 degrees of freedom
## Multiple R-squared: 0.9947, Adjusted R-squared: 0.9934
## F-statistic: 747.9 on 3 and 12 DF, p-value: 6.631e-14
## Model diagnostic plots
layout(matrix(1:4, nrow = 2, byrow = T))
par(mar = c(4, 4, 1.5, 1.5), mgp = c(2.3, 0.8, 0))
plot(m2)
No gross violation of the variance homogeneity or normality assumptions. Only the year 1962 sticks out, something to look at a bit more closely if this were your own data.
We can add the regression plane to our 3D plot, but first we have to get predictions on a fine grid of the two predictors and then transform those into a matrix (not required in the exam).
## Create new Unemployed and Population values to predict from
axis_x <- seq(min(longley$Unemployed), max(longley$Unemployed), length.out = 100)
axis_y <- seq(min(longley$Population), max(longley$Population), length.out = 100)
## Model predictions for the regression plane
lm_surface <- expand.grid(Unemployed = axis_x, Population = axis_y)
lm_surface$fit <- predict(m2, newdata = lm_surface)
library(reshape2)
lm_surface <- acast(lm_surface, Unemployed ~ Population, value.var = "fit") # create the surface matrix
## Remove the '#' symbols below to run the code in live mode. RMarkdown is not capable of including 3D images.
plot_ly(data = longley, x = ~ Unemployed, y = ~ Population, z = ~ GNP, type = "scatter3d", mode = "markers") %>%
add_surface(z = lm_surface, x = axis_x, y = axis_y, opacity = 0.7)
One frequent problem with multiple regression models is a phenomenon called multicollinearity. Multicollinearity occurs when two or more of the continuous predictor variables are highly correlated, i.e. one variable can be linearly predicted from the others with high accuracy. Multicollinearity often produces unstable parameter estimates and inflates the variance, and thus the standard errors, of the coefficients, resulting in flawed P-values and poor overall model performance. In practice, this typically happens when two or more predictors in a multiple regression model carry largely redundant information. For example, ‘relative humidity’ and ‘temperature’ are highly correlated and will hence give rise to multicollinearity issues in a regression model. One cannot always tell from the model summary whether multicollinearity is present, but there are two main ways of detecting it:
- graphically, using scatterplot matrices (pairs plot), where all explanatory variables are plotted against each other;
- numerically, using variance inflation factors (the vif function in the R package car). VIF values larger than 10 (or 5, by a stricter rule of thumb) suggest multicollinearity caused by the respective variable.
VIFs are calculated for all predictors in a regression model and measure how much the variance of an estimated regression coefficient is inflated compared to the situation in which the predictor variables are not linearly related. Formally, the VIF of predictor \(j\) is \(1/(1 - R_j^2)\), where \(R_j^2\) is the R-squared from regressing predictor \(j\) on all remaining predictors. So, a VIF value of 10 means that the variance around that model coefficient is 10 times larger than one would expect if there were no correlation with the other predictors, i.e. in a model without multicollinearity.
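To make this concrete, here is a minimal sketch of the calculation by hand for a single predictor, using the predictors that enter model m3 below; the result should match the vif() output further down:
## VIF of Population by hand: regress it on the remaining predictors of m3
aux <- lm(Population ~ Unemployed + Employed + Armed.Forces, data = longley)
1 / (1 - summary(aux)$r.squared) # compare with vif(m3) below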
These tools can help you decide which explanatory variables to retain in the model (i.e. one would drop the variable(s) with large VIF values). Here are both tools in action:
m3 <- lm(GNP ~ Unemployed + Employed + Armed.Forces + Population, data = longley)
summary(m3)
##
## Call:
## lm(formula = GNP ~ Unemployed + Employed + Armed.Forces + Population,
## data = longley)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.7850 -4.3097 -0.6786 3.3328 8.8336
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.339e+03 3.512e+01 -38.137 4.87e-13 ***
## Unemployed 3.430e-04 4.055e-02 0.008 0.99340
## Employed 9.321e+00 2.397e+00 3.888 0.00253 **
## Armed.Forces 8.245e-02 2.955e-02 2.790 0.01759 *
## Population 9.338e+00 1.504e+00 6.211 6.62e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.683 on 11 degrees of freedom
## Multiple R-squared: 0.9976, Adjusted R-squared: 0.9967
## F-statistic: 1145 on 4 and 11 DF, p-value: 2.513e-14
## Scatterplot matrix
pairs(longley)
## Add correlation information to the pairs plot.
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...)
{
  usr <- par("usr"); on.exit(par(usr))  # save and restore the coordinate system
  par(usr = c(0, 1, 0, 1))              # unit square, so text(0.5, 0.5, ...) is centred
  r <- abs(cor(x, y))                   # absolute Pearson correlation
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste0(prefix, txt)
  if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
  text(0.5, 0.5, txt, cex = cex.cor * r)  # font size scales with the correlation
}
head(longley) # check the column order: we drop GNP.deflator and GNP (columns 1 and 2) below
pairs(longley[, -c(1, 2)], lower.panel = panel.smooth, upper.panel = panel.cor)
## Variance inflation factors
library(car)
vif(m3)
## Unemployed Employed Armed.Forces Population
## 6.668482 32.923691 1.964550 50.814189
## Remove Population from the model
m4 <- lm(GNP ~ Unemployed + Employed + Armed.Forces, data = longley)
summary(m4)
##
## Call:
## lm(formula = GNP ~ Unemployed + Employed + Armed.Forces, data = longley)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.2811 -4.7494 0.1353 5.2067 22.6691
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.265e+03 6.708e+01 -18.854 2.78e-10 ***
## Unemployed 2.142e-01 4.352e-02 4.922 0.000353 ***
## Employed 2.369e+01 1.281e+00 18.485 3.49e-10 ***
## Armed.Forces 1.420e-01 5.681e-02 2.500 0.027892 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.55 on 12 degrees of freedom
## Multiple R-squared: 0.9892, Adjusted R-squared: 0.9865
## F-statistic: 366.3 on 3 and 12 DF, p-value: 4.638e-12
vif(m4)
## Unemployed Employed Armed.Forces
## 1.859355 2.277019 1.757381
Both the graphical assessment and the VIFs suggest that Population is the main culprit causing the multicollinearity issue. And indeed, removing Population from the model gets rid of the multicollinearity. However, if we are particularly interested in Population, we may instead try to remove one of the remaining predictors with a large VIF value to see if that alleviates the problem.
## What if we are specifically interested in 'Population' though...
m5 <- lm(GNP ~ Unemployed + Population + Armed.Forces, data = longley)
summary(m5)
##
## Call:
## lm(formula = GNP ~ Unemployed + Population + Armed.Forces, data = longley)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.525 -6.989 1.574 5.657 13.434
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.352e+03 5.160e+01 -26.195 5.86e-12 ***
## Unemployed -1.142e-01 4.109e-02 -2.780 0.0167 *
## Population 1.498e+01 5.834e-01 25.677 7.42e-12 ***
## Armed.Forces 6.480e-02 4.308e-02 1.504 0.1584
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.384 on 12 degrees of freedom
## Multiple R-squared: 0.9943, Adjusted R-squared: 0.9929
## F-statistic: 698.8 on 3 and 12 DF, p-value: 9.943e-14
vif(m5)
## Unemployed Population Armed.Forces
## 3.146686 3.514335 1.918225
As we can gather from the VIF values < 5, this approach is also valid, allowing us to retain Population in the model.
If we have a continuous response variable and a mix of continuous and categorical predictors (at least one of each kind), we use a so-called analysis of covariance (ANCOVA) model. In R, this type of model can be formulated with the familiar lm function. I simulated a dataset consisting of the growth rate (growth, in mm day-1) of three different fungal species as a function of fungicide concentration (conc, in mol L-1).
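The simulation code is not shown here; if you want to follow along, here is a hypothetical reconstruction that produces a data set with the same structure (the intercepts, slopes and residual scatter are rough values read off the model output further below, so your numbers will differ slightly):
## Hypothetical reconstruction of the simulated data set 'dat'
set.seed(42) # arbitrary seed
dat <- expand.grid(conc = seq(10, 0, by = -2), rep = 1:3, species = c("A", "B", "C"))
ints <- c(A = 230, B = 123, C = 156)        # approximate species intercepts
slps <- c(A = -18.9, B = -10.5, C = -13.1)  # approximate species slopes
dat$growth <- ints[as.character(dat$species)] + slps[as.character(dat$species)] * dat$conc + rnorm(nrow(dat), sd = 14)
dat$species <- factor(dat$species)
dat <- dat[, c("conc", "growth", "species")] # drop the helper column 'rep'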
First, we plot the data and fit a model without an interaction term, get the model predictions, and add those as regression lines to the plot.
str(dat)
## 'data.frame': 54 obs. of 3 variables:
## $ conc : num 10 8 6 4 2 0 10 8 6 4 ...
## $ growth : num 37.3 76.2 118.9 166.8 158.6 ...
## $ species: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
plot(growth ~ conc, data = dat, pch = 21, bg = c("black", "white", "grey")[dat$species]) # use 3 different colours to identify the fungal species
legend(x = 8, y = 270, legend = c("A", "B", "C"), pch = 21, pt.bg = c("black", "white", "grey"), lty = c(1, 2, 3), title = "Species", bty = "n")
## ANCOVA without interaction
m1 <- lm(growth ~ conc + species, data = dat)
summary(m1)
##
## Call:
## lm(formula = growth ~ conc + species, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -43.194 -10.800 -0.018 12.022 58.072
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 206.9652 5.7620 35.919 < 2e-16 ***
## conc -14.1804 0.7439 -19.063 < 2e-16 ***
## speciesB -65.8031 6.2236 -10.573 2.38e-14 ***
## speciesC -45.9073 6.2236 -7.376 1.55e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 18.67 on 50 degrees of freedom
## Multiple R-squared: 0.9058, Adjusted R-squared: 0.9002
## F-statistic: 160.3 on 3 and 50 DF, p-value: < 2.2e-16
summary.aov(m1)
## Df Sum Sq Mean Sq F value Pr(>F)
## conc 1 126683 126683 363.40 < 2e-16 ***
## species 2 41000 20500 58.81 7.36e-14 ***
## Residuals 50 17430 349
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Create a new data frame containing a fine grid of predictor values (here conc values) to predict from
xv <- with(dat, seq(min(conc), max(conc), length.out = 100)) # fine grid of conc values spanning the original range
newdat <- expand.grid(conc = xv, species = levels(dat$species))
newdat$fits <- predict(m1, newdata = newdat)
## Add the regression lines (model predictions) to the plot
for (i in 1:3){
lines(fits ~ conc, data = newdat[newdat$species == levels(newdat$species)[i], ], lty = c(1, 2, 3)[i])
}
An ANCOVA model without interaction term allows the intercept to vary with species but imposes a common slope.
OK, here is the interpretation of the summary(m1) output, which can be verified by comparing the coefficient values with the figure: in general, the intercept is the estimated response value (y-value) when the predictor value (x-value) is zero. In the model summary output, the (Intercept) term represents the intercept of species A, i.e. the growth of species A in the absence of the fungicide, which is about 207 mm day-1. The conc coefficient gives the common slope for all species, which is roughly -14, suggesting that fungal growth decreases by about 14 mm day-1 with a one-unit increase in fungicide concentration. The coefficient speciesB tells us that the intercept of species B is around 66 units lower than the intercept of species A. Likewise, the coefficient speciesC indicates that for this species the intercept is 46 units lower than for species A. This model contains only the main effects of fungicide concentration and species identity but no interaction term. As a result, the model assumes a common slope among the three fungal species.
Is this a valid assumption? Not necessarily. Theoretically, it could well be that all species share a similar slope, but more often than not this is not the case in reality. Therefore, we allow for separate slopes in our model by incorporating a fungicide \(\times\) species interaction. An interaction between a continuous and a categorical predictor means that the slope is allowed to vary with each level of the categorical predictor variable. In our example, a statistically significant interaction would translate into: the effect of the fungicide on fungal growth rate varies across species.
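With treatment contrasts (species A as the reference level), the interaction model can be written as \[
\text{growth}_i = \beta_0 + \beta_1\,\text{conc}_i + \beta_2 B_i + \beta_3 C_i + \beta_4\,(\text{conc}_i \times B_i) + \beta_5\,(\text{conc}_i \times C_i) + \varepsilon_i
\] where \(B_i\) and \(C_i\) are dummy variables coding species membership, so \(\beta_4\) and \(\beta_5\) are the slope deviations of species B and C from the slope of species A (\(\beta_1\)).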
## ANCOVA with interaction term
m2 <- lm(growth ~ conc * species, data = dat)
summary.aov(m2) # Overall significances
## Df Sum Sq Mean Sq F value Pr(>F)
## conc 1 126683 126683 629.90 < 2e-16 ***
## species 2 41000 20500 101.93 < 2e-16 ***
## conc:species 2 7777 3888 19.33 6.94e-07 ***
## Residuals 48 9654 201
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(m2)
##
## Call:
## lm(formula = growth ~ conc * species, data = dat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.131 -5.939 -0.960 6.238 34.447
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 230.5909 5.9258 38.913 < 2e-16 ***
## conc -18.9056 0.9786 -19.319 < 2e-16 ***
## speciesB -107.8949 8.3804 -12.875 < 2e-16 ***
## speciesC -74.6925 8.3804 -8.913 9.53e-12 ***
## conc:speciesB 8.4184 1.3840 6.083 1.87e-07 ***
## conc:speciesC 5.7570 1.3840 4.160 0.000131 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.18 on 48 degrees of freedom
## Multiple R-squared: 0.9479, Adjusted R-squared: 0.9424
## F-statistic: 174.5 on 5 and 48 DF, p-value: < 2.2e-16
## Model predictions
newdat$fits2 <- predict(m2, newdata = newdat)
## Add predictions to the plot (regression lines)
cols <- c("black", "white", "grey")
plot(growth ~ conc, data = dat, pch = 21, bg = cols[dat$species]) # use 3 different colours to identify the fungal species
legend(x = 8, y = 270, legend = c("A", "B", "C"), pch = 21, pt.bg = cols, lty = c(1, 2, 3), title = "Species", bty = "n")
## Add the regression lines (model predictions) to the plot
for (i in 1:3) {
lines(fits2 ~ conc, data = newdat[newdat$species == levels(newdat$species)[i], ], lty = c(1, 2, 3)[i])
}
An ANCOVA model with interaction term allows the intercept AND the slope to vary with species.
Adding the model predictions (regression lines) to the plot nicely visualises the interaction: at least one of the three species (here species A) shows a steeper slope than the remaining two.
As usual, the summary.aov command gives us the overall significance of the model terms. If the interaction term is significant, our job is done. In a paper, thesis or report, one would state the overall significance of the interaction term along the lines of: ‘The efficacy of the tested fungicide varied with fungal species, as evidenced by a significant fungicide \(\times\) species interaction (\(F_{2,48}\) = 19.33, P < 0.001).’
If the interaction were not significant, we could safely remove it and simplify the model.
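If you want to make this check explicit, the two nested ANCOVA models can be compared with an F-test; this reproduces the conc:species line of the ANOVA table above (not part of the original output):
## Compare the ANCOVA models without (m1) and with (m2) the interaction term
anova(m1, m2)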
Let’s move on to the summary output, which contains more detailed information but is somewhat harder to interpret.
Here, the (Intercept) is again the intercept of species A, but now the conc coefficient indicates the slope of species A only. The next two coefficients (speciesB and speciesC) give the deviation of the intercepts of the remaining two species from that of species A. Now comes the tricky bit: the coefficient conc:speciesB gives the difference in slopes between species A and species B. Because the linear relationship we see in our plot is decreasing for each of the fungal species, we expect negative slopes, so it is clear that the conc:speciesB and conc:speciesC estimates do not represent the actual slopes. The slope of species B is calculated by adding the estimate of the conc:speciesB interaction term to the slope of species A (conc). The same approach is used to obtain the slope of species C:
\[
\text{slope}_B = -18.9056 + 8.4184 = -10.4872
\]
\[
\text{slope}_C = -18.9056 + 5.7570 = -13.1486
\]
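The same arithmetic can be done directly with the estimated coefficients, which avoids typing the numbers by hand:
## Reconstruct the species-specific slopes from the coefficients of m2
coef(m2)["conc"]                               # slope of species A
coef(m2)["conc"] + coef(m2)["conc:speciesB"]   # slope of species B
coef(m2)["conc"] + coef(m2)["conc:speciesC"]   # slope of species C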
As a sanity check, we can simply extract the regression equation for each of the fungal species and use it to redraw the regression lines. This is basically what the predict function does. Hence, if our calculations are correct, the resulting lines should fall right onto the lines we have already plotted from the model predictions. To do so, we also need the intercepts, taken directly from the output for species A and reconstructed first for species B and C, but that is a piece of cake.
## Predictions for species A: y = intercept + slope x predictor
spec.a <- 230.5909 - 18.9056 * unique(dat$conc) # we use unique() because in our data we have three sets of conc values (one for each species)
lines(x = unique(dat$conc), y = spec.a, col = "blue")
## Predictions for species B (look up coefficients in the summary(m2) output)
spec.b <- (230.5909 - 107.8949) + (-18.9056 + 8.4184) * unique(dat$conc)
lines(unique(dat$conc), spec.b, col = "blue")
## Predictions for species C (compare with summary(m2) output)
spec.c <- (230.5909 - 74.6925) + (-18.9056 + 5.7570) * unique(dat$conc)
lines(unique(dat$conc), spec.c, col = "blue")
Our predictions, calculated by hand, perfectly overlay (blue lines) the model predictions derived with the predict function.
The overall significance obtained with the summary.aov(m2) command just tells us that there is a statistically significant difference in slopes, but not whether all slopes differ significantly or just some. The summary(m2) output gives us a bit more information. From there we can conclude that the slopes of species B and C are significantly different from the slope of species A, but we do not know whether the slopes of B and C differ significantly from each other. To test this, we have to follow up with a post-hoc analysis performing pairwise comparisons of the slopes (similar to what we did in the ANOVA case, only here we are comparing slopes instead of group means). For this purpose, we use the emtrends function in the emmeans package like this:
## Post-hoc test comparing the slopes
# Obtain slopes for each fungus ...
library(emmeans)
slopes <- emtrends(m2, specs = "species", var = "conc")
slopes # compare to the slopes we calculated by hand above ;-)
## species conc.trend SE df lower.CL upper.CL
## A -18.90555 0.9786178 48 -20.87319 -16.937908
## B -10.48718 0.9786178 48 -12.45483 -8.519539
## C -13.14852 0.9786178 48 -15.11616 -11.180875
##
## Confidence level used: 0.95
pairs(slopes)
## contrast estimate SE df t.ratio p.value
## A - B -8.418368 1.383975 48 -6.083 <.0001
## A - C -5.757033 1.383975 48 -4.160 0.0004
## B - C 2.661336 1.383975 48 1.923 0.1432
##
## P value adjustment: tukey method for comparing a family of 3 estimates
The pairwise comparisons obtained from the post-hoc test confirm our visual impression: the slope of species A differs significantly from the slopes of species B and C, but the latter two are not significantly different from each other. In a paper or thesis, you could report the small table showing the pairwise comparisons or add a compact annotation to the figure like this:
## Replot with statistical annotation
plot(growth ~ conc, data = dat, pch = 21, bg = cols[dat$species], las = 1, ann = F) # use 3 different colours to identify the fungal species
legend(x = 5, y = 315, horiz = T, legend = c("A", "B", "C"), pch = 21, pt.bg = cols, lty = c(1, 2, 3), bty = "n", xpd = NA, xjust = 0.5, text.width = 0.8, seg.len = 2.3)
## Axis labels
mtext(expression(paste("Growth (mm d"^-1, ")")), side = 2, line = 2.7)
mtext(expression(paste("Fungicide concentration (mol L"^-1, ")")), side = 1, line = 2.4)
for (i in 1:3) {
lines(fits2 ~ conc, data = newdat[newdat$species == levels(newdat$species)[i], ], lty = c(1, 2, 3)[i])
}
text(x = 7, y = c(255, 235, 215), c("A \u2013 B ***", "A \u2013 C ***", expression(paste("B \u2013 C"^" ns"))), adj = c(0, 1)) # \u2013 is the Unicode escape for an en dash; 'ns' indicates 'not significant'.
Figure 1 Growth of three fungal species as a function of fungicide concentration. Lines indicate predictions from an analysis of covariance model. The small inset table indicates the results of a multiple comparison procedure on the slopes (Tukey contrasts, \(\alpha\) = 0.05). *** P < 0.001, ns = not significant.