PSYC302 ANCOVA Lecture

Straying from One Way ANOVA and MANOVA

In cases discussed thus far, the only factor predicting the outcome(s) is a single grouping variable.
You can imagine that many applied research questions would be concerned with how other factors relate to the outcome(s), in addition to just group membership.
If you add a continuous variable to the predictor side of the model, then the analysis is called an Analysis of Covariance (ANCOVA).
If you are interested in adding another grouping variable to the predictor side of the model, then you extend the model from a one-way ANOVA to a two-way ANOVA.

ANCOVA

Recall ODD Example:

Outcome: behavior score collected from a survey of the child's mother after several months of treatment.
Grouping variable: 3 treatment groups: 1=standard best practice, 2=new therapeutic treatment, 3=control.

In a true experimental design, you would randomly assign participants to one of these 3 treatment groups and assume that, on average, the children in the three groups have the same or very similar characteristics and qualities.

In other words, because random assignment balances other factors across the groups, you can observe whether grouping has an effect on ODD behaviors.

ANCOVA, cont.

For whatever reason, sometimes you might be interested in other factors as they relate to the outcome, in addition to the grouping.

For example, if participants could not be randomly assigned to treatment conditions in our ODD study, there may be other variables/factors that differ between the groups.
We would want to add those variables to the model if they are available to us, so we can control for their effect.
When you add covariates to this model, it becomes an ANCOVA.

ANCOVA, cont.

Example: we are interested in the outcome of Number of Derogatory Remarks in the classroom, predicted by treatment group. However, because random assignment wasn't entirely possible and we know that younger students may naturally have more behavioral problems in school, we also want to include Age as a predictor of the Number of Derogatory Remarks.

Informally, the model is:

Number of Derogatory Remarks = Intercept + Group + Age

Our model can be more formally expressed as a GLM, shown on the next slide.

ANCOVA Model (in GLM lingo)

\[ Y_{ij} = \beta_{0} + \beta_{1}G_{1} + \beta_{2}G_{2} + \dots + \beta_{k}G_{k} + \beta_{k+1}C + e_{ij}\]

for person i in group j. In this expression, each G is a dummy-coded variable for a particular level of the grouping variable (a grouping variable with k + 1 levels needs k dummy variables, with the remaining level serving as the reference group), C is the covariate, and e is the error term.

"The expected number of derogatory remarks (Y) for a child is equal to the intercept, plus the unique effect of their treatment group, plus \(\beta_{k+1}\) times Age for that child."

ANCOVA Assumptions

Combo of the assumptions of both ANOVA and linear modeling, such as:

errors are independent and normally distributed
homogeneity of variance across the groups
continuous predictor is linearly related to the dependent variable.

Additionally, for the ANCOVA, we also assume that the slope for the covariate (e.g. Age) is the same across all levels of the grouping variable. In other words, the effect of Age on Behavior is the same across all groups. This would result visually in three parallel lines (one representing each group), if you plotted the effect of Age on Derogatory Remarks by Treatment Group.
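
One informal way to eyeball the equal-slopes assumption is to fit a separate straight line of the outcome on Age within each group and compare the slopes. The sketch below is only an illustration in Python with NumPy (an assumption; the lecture does not prescribe software). The data are simulated with a common Age slope of -1.5, and all variable names and numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(302)

# Hypothetical data: the same ages in every group, common Age slope of -1.5.
ages = np.linspace(6, 12, 10)
group_effect = {1: 0.0, 2: -2.0, 3: 1.0}  # shifts in the group intercepts

slopes = {}
for g, effect in group_effect.items():
    remarks = 25.0 + effect - 1.5 * ages + rng.normal(0, 1, size=ages.size)
    # np.polyfit with deg=1 returns (slope, intercept) of the fitted line.
    slope, intercept = np.polyfit(ages, remarks, deg=1)
    slopes[g] = slope
    print(f"group {g}: slope = {slope:.2f}, intercept = {intercept:.2f}")
```

If the assumption is reasonable, the three slopes should be close to one another while the intercepts differ, which is the "parallel lines" picture. This is a descriptive check only; the formal test is the Group-by-Age interaction.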

Benefits of added Covariate

Already discussed: adding a covariate helps control for confounding effects, so you can get a clearer picture of the true effect of Group on Y.

New Info: By adding a covariate that is presumably related to the outcome Y, you are helping explain some of the variance in the outcome that is not related to the groups and that was previously unexplained, increasing our power to detect a grouping effect.

Reasoning: The variance now explained by the covariate was previously just hanging out in the denominator (the error term) of the F statistic in the ANOVA model. When we account for that previously unexplained variance, it is extracted from the denominator of the F statistic. The result is that our F value, which is the test statistic for the grouping effect, is typically larger, provided the covariate really is related to the outcome (each added covariate also uses up one error degree of freedom).
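
A small simulation can make this reasoning tangible. The sketch below (Python with NumPy, an assumption on my part) computes the F statistic for Group twice: once ignoring Age and once with Age in the model. The data, effect sizes, and helper name `sse` are hypothetical, and the ages are deliberately identical in every group so that Age is unrelated to group membership.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rng = np.random.default_rng(302)

# Hypothetical data: 3 groups of 10 children, same ages in each group.
group = np.repeat([1, 2, 3], 10)
age = np.tile(np.linspace(6, 12, 10), 3)
remarks = (25.0 - 2.0 * (group == 2) + 1.0 * (group == 3)
           - 1.5 * age + rng.normal(0, 1, size=age.size))

n = remarks.size
ones = np.ones(n)
dummies = np.column_stack([group == 2, group == 3]).astype(float)

# ANOVA: F for Group compares intercept-only with intercept + Group.
sse_null = sse(ones[:, None], remarks)
sse_group = sse(np.column_stack([ones, dummies]), remarks)
f_anova = ((sse_null - sse_group) / 2) / (sse_group / (n - 3))

# ANCOVA: F for Group compares intercept + Age with intercept + Group + Age.
sse_age = sse(np.column_stack([ones, age]), remarks)
sse_full = sse(np.column_stack([ones, dummies, age]), remarks)
f_ancova = ((sse_age - sse_full) / 2) / (sse_full / (n - 4))

print(f"F for Group without Age: {f_anova:.2f}")
print(f"F for Group with Age:    {f_ancova:.2f}")
```

Because the ages are balanced across groups here, the numerator of F is the same in both models; only the error term in the denominator shrinks once Age absorbs its share of the variance.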

Conducting the ANCOVA

Fitting data to an ANCOVA model is actually a multi-step process, rather than a single test:

Step 1: Check model assumptions
Step 2: Run a model for outcome Y with both predictors (grouping variable and continuous covariate) as well as the interaction between the two predictors. If the interaction is significant, the equal-slopes assumption is violated and you would not move forward with the ANCOVA.
Step 3: Run a model that only includes the grouping factor for outcome Y (ignoring the covariate)
Step 4: Run a model that includes both the grouping variable and the covariate
Step 5: Conduct a likelihood ratio test (LRT) comparing the models from Steps 3 and 4 to determine if the covariate has a significant main effect.
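
Steps 2 through 5 can be sketched as follows, assuming Python with NumPy and simulated data (all names and numbers are hypothetical; Step 1 is not shown). For Gaussian linear models the LRT statistic can be written as n times the log of the ratio of residual sums of squares, and it is compared to a chi-square distribution whose degrees of freedom equal the number of added parameters. The p-values here are large-sample approximations; statistical software often reports the closely related F test for nested models instead.

```python
import math
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def lrt_stat(sse_reduced, sse_full, n):
    """LRT statistic for nested Gaussian linear models."""
    return n * math.log(sse_reduced / sse_full)

rng = np.random.default_rng(302)

# Hypothetical data: 3 groups of 10 children, common Age slope (no interaction).
group = np.repeat([1, 2, 3], 10)
age = np.tile(np.linspace(6, 12, 10), 3)
remarks = (25.0 - 2.0 * (group == 2) + 1.0 * (group == 3)
           - 1.5 * age + rng.normal(0, 1, size=age.size))
n = remarks.size
ones = np.ones(n)
dummies = np.column_stack([group == 2, group == 3]).astype(float)

X_group = np.column_stack([ones, dummies])                 # Step 3 model
X_main = np.column_stack([ones, dummies, age])             # Step 4 model
X_int = np.column_stack([X_main, dummies * age[:, None]])  # Step 2 model

# Step 2: does adding Group x Age (2 extra parameters) improve the fit?
lrt_int = lrt_stat(sse(X_main, remarks), sse(X_int, remarks), n)
p_int = math.exp(-lrt_int / 2)  # chi-square upper tail with df = 2

# Step 5: does adding Age (1 extra parameter) improve on Group alone?
lrt_age = lrt_stat(sse(X_group, remarks), sse(X_main, remarks), n)
p_age = math.erfc(math.sqrt(lrt_age / 2))  # chi-square upper tail with df = 1

print(f"Group x Age interaction: LRT = {lrt_int:.2f}, p = {p_int:.3f}")
print(f"Age main effect:         LRT = {lrt_age:.2f}, p = {p_age:.3g}")
```

The two closed-form tail probabilities are specific to 1 and 2 degrees of freedom, which is what three groups and one covariate give; other designs would need a general chi-square routine.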

ANCOVA is a GLM (surprise)

The ANCOVA model is yet another special case of a general linear model:

regressing Y on two IVs
one IV is a categorical (grouping) predictor
one IV is a continuous predictor

You can call this ANCOVA to help summarise exactly what you are asking of and doing with your data, but you could also call this a linear model – they are actually not different models. The decision to use one name or the other is typically a matter of preference or convention within a sub-field, literature, or discipline.
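
To make the equivalence concrete, here is a minimal sketch in Python with NumPy (an assumption; the lecture does not prescribe software). The toy data, the variable names, and the choice of group 1 as the reference level are all hypothetical. The outcome is built from known coefficients with no noise, so an ordinary least-squares regression on an intercept, two dummy variables, and Age should recover those coefficients.

```python
import numpy as np

# Hypothetical toy data: 3 treatment groups (coded 1, 2, 3), 4 children each.
group = np.repeat([1, 2, 3], 4)
age = np.array([6, 8, 10, 12] * 3, dtype=float)

# Outcome built from known coefficients:
# intercept 20, group 2 effect -3, group 3 effect +2, Age slope -1.
remarks = 20.0 - 3.0 * (group == 2) + 2.0 * (group == 3) - 1.0 * age

# Design matrix: intercept, one dummy per non-reference group, covariate.
X = np.column_stack([
    np.ones(age.size),            # intercept
    (group == 2).astype(float),   # dummy: 1 if in group 2, else 0
    (group == 3).astype(float),   # dummy: 1 if in group 3, else 0
    age,                          # continuous covariate
])

# Plain multiple regression; nothing ANCOVA-specific is needed.
beta, *_ = np.linalg.lstsq(X, remarks, rcond=None)
print(np.round(beta, 3))
```

The "ANCOVA" here is nothing more than a regression whose design matrix happens to contain dummy variables and a continuous column.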

MANCOVA

You can add another outcome variable to the ANCOVA model, which thereby makes the model a multivariate analysis of covariance, or a MANCOVA. Remember that the added assumption here is that the outcome variables are jointly normally distributed.

Two-way ANOVA

If you are not interested in an added continuous variable but rather interested in an additional grouping variable, the model you want is called a two-way ANOVA.

Called a two-way ANOVA because there are 2 grouping variables.
If there were 3 grouping variables, the model would be a three-way ANOVA, and so on.
All ANOVAs with more than one grouping variable are called factorial ANOVAs.

Two-way ANOVA, cont.

In the ODD example, we could use a two-way ANOVA to study whether the number of derogatory remarks in the classroom differs across treatment groups and across gender groups.

If you have three levels of Gender ID (0=girl, 1=boy, 2=non-binary/trans), this analysis is a 3x3 factorial ANOVA, because both grouping variables have 3 levels.

The two-way ANOVA model described in the previous paragraph, informally, is: Derog ~ Global Mean (intercept) + treatment grouping effect + gender grouping effect.
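
As a rough illustration of how this main-effects model looks as a linear model, the sketch below dummy codes both grouping variables and fits them by least squares. It assumes Python with NumPy; the balanced toy data, the coefficients, and the choice of reference levels (treatment 1 and gender 0) are hypothetical.

```python
import numpy as np

# Hypothetical balanced design: every treatment x gender cell has 2 children.
treatment = np.repeat([1, 2, 3], 6)            # coded 1, 2, 3
gender = np.tile(np.repeat([0, 1, 2], 2), 3)   # coded 0, 1, 2

# Noise-free outcome built from known main effects only (no interaction).
derog = (10.0 - 3.0 * (treatment == 2) + 1.0 * (treatment == 3)
         + 2.0 * (gender == 1) + 0.5 * (gender == 2))

# Design matrix: intercept, 2 treatment dummies, 2 gender dummies.
X = np.column_stack([
    np.ones(derog.size),
    treatment == 2, treatment == 3,
    gender == 1, gender == 2,
]).astype(float)

beta, *_ = np.linalg.lstsq(X, derog, rcond=None)
print(np.round(beta, 3))
```

With this particular dummy coding, the fitted intercept is the expected value for the reference cell (treatment 1, gender 0), and the other four coefficients are shifts away from it.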

Two-way ANOVA, cont.

Usually, the primary goal of this analysis is to study the interaction between the two grouping variables.

Example: "How does the effect of treatment differ (or not) for children with different gender identities?"

Two-way ANOVA, cont.

In the GLM framework, you can conduct an LRT to test whether the interaction between grouping variables is an important explanatory piece of information.

The LRT compares the model with the interaction term removed to the model with the interaction term retained.
A p-value of <.05 for this test suggests that the interaction term is a valuable and informative addition to the model, and that the effect of one grouping variable varies at different levels of the other grouping variable.
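
The comparison can be sketched as follows for a 3x3 design, assuming Python with NumPy and simulated data in which one treatment works differently for one gender group (all names and effect sizes are hypothetical). The interaction adds (3 - 1) x (3 - 1) = 4 parameters, so the LRT statistic is compared to a chi-square distribution with 4 degrees of freedom; the p-value is a large-sample approximation.

```python
import math
import numpy as np

def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rng = np.random.default_rng(302)

# Hypothetical balanced design: 10 children in each of the 9 cells.
treatment = np.repeat([1, 2, 3], 30)
gender = np.tile(np.repeat([0, 1, 2], 10), 3)
n = treatment.size

t = np.column_stack([treatment == 2, treatment == 3]).astype(float)
g = np.column_stack([gender == 1, gender == 2]).astype(float)

# Simulated outcome: main effects plus an extra drop of 4 remarks
# for children who are in treatment 2 AND gender group 1.
derog = (10.0 - 1.0 * t[:, 0] + 1.0 * t[:, 1] + 2.0 * g[:, 0]
         - 4.0 * t[:, 0] * g[:, 0]
         + rng.normal(0, 1, size=n))

# Reduced model: main effects only. Full model: adds the 4 product terms.
X_reduced = np.column_stack([np.ones(n), t, g])
inter = np.column_stack([t[:, i] * g[:, j] for i in range(2) for j in range(2)])
X_full = np.column_stack([X_reduced, inter])

lrt = n * math.log(sse(X_reduced, derog) / sse(X_full, derog))
# Chi-square upper tail for df = 4 has the closed form exp(-x/2) * (1 + x/2).
p_value = math.exp(-lrt / 2) * (1 + lrt / 2)

print(f"LRT = {lrt:.2f} on 4 df, p = {p_value:.3g}")
```

Because the simulated data contain a real interaction, the test should come out well below .05 here; with real data, the same comparison tells you whether the treatment effect appears to depend on gender group.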