ANOVA Theoretical & Practical

Typical ANOVA standardised feature space with principle components shown

Key Concepts (KC)

Terminology/Alternative Classification

ANOVA: Analysis of the variance in the outcome \(y\) by partitioning variance attributable to categorical factors, generally without interaction. Serves as an umbrella term for the broader family of ANOVA methods.

ANCOVA: Analysis of the variance in the outcome variable after adjusting for the linear covariance between the outcome and continuous covariates, while also partitioning variance due to categorical factors. By default, ANCOVA models treat factors and covariates additively (no interaction), though interaction terms can be included in extended models.

Factorial ANOVA: Analysis of the variance in the outcome \(y\) with respect to two or more categorical factors considered simultaneously, including all their main effects and (optionally) their interactions. This design allows testing whether factors act independently or whether their effects on \(y\) depend on each other.

KC 1. ANOVA as Adjustment

click here to expand/collapse…

While ANOVA models a single continuous response variable \(y\) (strictly one dimensional), its core idea is partitioning the variation in \(y\) according to different explanatory factors (fundamentally multi-dimensional).
In this sense, ANOVA is fundamentally about adjustment i.e. we compare group means while accounting for other sources of variation in the data.

For example: - In a one-way ANOVA, we adjust \(y\) for differences among levels of a single factor.
- In a two-way ANOVA, we adjust for two factors simultaneously, and can also include their interaction. - In ANCOVA, we further adjust for continuous covariates, which highlights the regression perspective - removing variance explained by the covariates before testing factor effects.

Conceptually, each ANOVA is performing a series of comparisons of means while controlling for other influences which can be viewed as sequential adjustments.
From this perspective, ANOVA is not just a test of group differences but it is a framework for understanding how multiple factors jointly influence the outcome.

KC 2. Snedecor’s F-Test Statistic

click here to expand/collapse…

Consider the linear (alternative-hypothesis) model \(\mathbf{Y}=X_A\beta+\varepsilon\) with \(\varepsilon\sim N(\mathbf{0},\sigma^2 I_n)\), where \(X_A\) is the full design matrix of rank \(p_A\) encoding all predictors, including intercepts, main effects, interactions, or covariates. A reduced (null-hypothesis) model \(X_0\) of rank \(p_0<p_A\) contains only the predictors assumed true under the null hypothesis (e.g., intercept only for testing group differences).

The hat (projection) matrices are symmetric and idempotent, projecting onto the column spaces of the full and reduced models. \[ P_A := X_A(X_A^\top X_A)^{-1}X_A^\top \qquad P_0 := X_0(X_0^\top X_0)^{-1}X_0^\top \]

the sum of squares for the effect being tested is given by the quadratic form: \[\mathrm{SS}_{\text{effect}} = \mathbf{Y}^\top (P_A-P_0)\mathbf{Y}\qquad\text{with dof}\quad\nu_1 = \operatorname{rank}(P_A-P_0) = p_A-p_0\]
- \(P_A-P_0\) is the projection that is missed by the null model
- \(\mathrm{SS}_{\text{effect}}\) is the residual explainable sum of squares
- In ANOVA, an “effect” refers to the contribution of a factor (or set of predictors) to explaining variance in the outcome.

the residual sum of squares for the full (alternative) model is also convex (McCabe, 2024): \[\mathrm{SS}_{\text{res}} = \mathbf{Y}^\top (I-P_A)\mathbf{Y}\qquad\text{with dof}\quad\nu_2 = \operatorname{rank}(I-P_A) = n-p_A\]
- \(I-P_A\) is the projection that no linear model could ever capture due to noise/nonlinearity etc…
- \(\mathrm{SS}_{\text{res}}\) is the noise

TODO: Since \(P-P_0\) and \(I-P\) are symmetric, idempotent, and project onto orthogonal subspaces, Cochran’s theorem implies two independent chi-squared variables. \[\frac{\mathrm{SS}_{\text{effect}}}{\sigma^2} \sim \chi^2_{\nu_1}, \qquad \frac{\mathrm{SS}_{\text{res}}}{\sigma^2} \sim \chi^2_{\nu_2}\]

Form the mean squares \[ \mathrm{MS}_{\text{effect}} = \frac{\mathrm{SS}_{\text{effect}}}{\nu_1}, \qquad \mathrm{MS}_{\text{res}} = \frac{\mathrm{SS}_{\text{res}}}{\nu_2}. \] Under the null hypothesis, the ratio \[ \frac{\text{Variance with Adjustment}}{\text{Variance without adjustment}} =\frac{\mathrm{MS}_{\text{effect}}}{\mathrm{MS}_{\text{res}}} = \frac{\big(\mathrm{SS}_{\text{effect}}/\nu_1\big)}{\big(\mathrm{SS}_{\text{res}}/\nu_2\big)} \sim F_{\nu_1,\nu_2}. \]

Remarks / assumptions: exact \(F\)-distribution requires \(\varepsilon \sim N(0, \sigma^2 I)\). Homoscedasticity and independence are implied by this assumption. In practice, for large samples the \(F\)-test is approximately valid under weaker conditions via asymptotic arguments.

KC 3. Dependent Variable Types

click here to expand/collapse…

ANOVA is essentially a regression model where all predictors are exclusively categorical factors ANCOVA which sits under the ANCOVA umbrella uses categorical factors as well as including continuous covariates.

KC 3.1. Continuous Covariates

In ANOVA, the response variable \(y\) is continuous, while the explanatory variables are categorical factors (e.g. treatment groups). The method partitions total variation in \(y\) into components attributable to each factor and to residual error.

In ANCOVA, we extend this framework by including continuous explanatory variables (covariates) alongside categorical factors. These covariates are adjusted for in the model, removing part of the variability in \(y\) that is explained by linear relationships with them. This makes the regression aspect explicit: ANCOVA is essentially ANOVA embedded in a linear regression framework, where predictors can be both categorical (coded as dummies/indicators) and continuous.

KC 3.2. Discrete/Categorical/Qualitative/Factor Variables

Indicator Variables

We can include factor variables in our regression, and they behave like normal covariates.

For example, consider political affiliation: Republican / Democrat / Neither. You might think to assign each factor a numeric value, but this imposes an arbitrary order, which usually does not make sense.¹ Instead we use a binary flag for each group (despite the groups being mutually exclusive).

Political Affiliation	I_Neither	I_Democrat	I_Republican
Neither	1	0	0
Democrat	0	1	0
Republican	0	0	1

Dummy Coding

Tip

See my one way example for dropping the intercept and for custom reference variable selection.

Instead, we create dummy variables for regression. To visualise in 3D space; x-axis - Republican = 1, else 0, y-axis - Democrat = 1, else 0, z-axis - y (outcome, e.g., policy score or voting tendency).

Pick one level as the reference category (e.g., Independent).
Create k-1 dummy variables that indicate binary membership in the other categories.

Political Affiliation	(D_1) (Democrat)	(D_2) (Republican)
Neither	0	0
Democrat	1	0
Republican	0	1

We ignore the reference variable because it is redundant. The X and Y axes completely encode the Independent voters. If we added a separate dummy for Independents, we would have perfect multicollinearity, since it would be a linear combination of the other covariates. In R, lm would then return NA due to a non-invertible matrix.

KC 4. Interaction Terms

click here to expand/collapse…

Interaction terms feature in some flavours of ANOVA and are absent from others. When we include an interaction term we are simply performing multivariable linear regression i.e. fitting a hyperplane to the data.

Multivariable Linear Regression (No interaction) `lm(z~x+y, ...)`

Think about the model the general model: \[\mathbb{E}[y;x_1x_2]=\beta_0+\beta_1x_1+\beta_2x_2\]

for protestant states (\(x_2=0\)): \(\mathbb{E}[y;x_1x_2]=\beta_0+\beta_1x_1\)
for catholic states (\(x_2=1\)): \(\mathbb{E}[y;x_1x_2]=\beta_0+\beta_1x_1+\beta_2\):

both models have the same slope \(\beta_1\)- two cross-sections through a hyper-plane pitched at a \(\beta_1\) angle with an offset of \(\beta_2\)

Linear Interaction Models `lm(z~x*y, ...)`

Think about the model the general model (interaction included): \[\mathbb{E}[y;x_1x_2]=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_1x_2\] (we can project the space to simplify- removing the \(x_2\) dimension) here the binary catholic coefficient becomes a vertical (y-direction) offset and both models have the same slope \(\beta_1\)- two cross-sections through a single plane:

for protestant states (\(x_2=0\)): \(\mathbb{E}[y;x_1x_2]=\beta_0+\beta_1x_1\)
for catholic states (\(x_2=1\)): \(\begin{align}\mathbb{E}[y;x_1x_2]&=\beta_0+\beta_1x_1+\beta_2+\beta_3x_1\\&=(\beta_0+\beta_2)+(\beta_1+\beta_3)x_1\end{align}\)

here the models have different slope \(\beta_1\) versus \(\beta_1+\beta_3\) - here the hyper-plane is twisted by the interaction term giving different linear gradiants at each of the cross-sections in the religion dimension.

KC 4.b) Interaction Plots

click here to expand/collapse…

An interaction plot is a graphical tool used in factorial experiments to illustrate how the effect of one factor depends on the level of another. It displays the mean response of the outcome variable across combinations of two categorical factors, with one factor on the x-axis and the other represented by separate lines. Non-parallel lines suggest the presence of an interaction: the effect of one factor varies according to the level of the other. The plot itself is descriptive rather than inferential; it does not fit a model or provide p-values or constitute a physical model, but it can reveal patterns that simple ANOVA techniques might overlook when only main effects are considered. Formal testing of these patterns requires a model that includes interaction terms, such as factorial ANOVA or a linear model.

See later notes on causal analysis.

dt <- as.data.table(ToothGrowth)
dt[,c("supp","dose"):=lapply(.SD,as.factor), .SDcols=c("supp","dose")]

model_twoway <- aov(len ~ dose * supp, data = dt)
summary(model_twoway)

            Df Sum Sq Mean Sq F value   Pr(>F)    
dose         2 2426.4  1213.2  92.000  < 2e-16 ***
supp         1  205.4   205.4  15.572 0.000231 ***
dose:supp    2  108.3    54.2   4.107 0.021860 *  
Residuals   54  712.1    13.2                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

interaction.plot(dt$dose, dt$supp, dt$len,
                 col = 1:3, lty = 1, lwd = 2,
                 ylab = "odentoblast length", xlab = "Dose")

ANOVA Tooling in R

After fitting the model, R provides a few quick checks to assess the main aspects of model fit, including overall metrics such as the ANOVA table and VIF (aside: additional diagnostics can examine individual data points if needed residuals, influence/leverage see (McCabe, 2025)).

dt<-as.data.table(mtcars)
model<-lm(mpg~wt+disp+carb,dt)

ANOVA Table

An ANOVA (Analysis of Variance) table summarises how much of the variation in the response variable can be attributed to each term in a regression model. The F-value is calculated by comparing the fit of the full model to a model with that specific term removed, effectively testing whether adding the term significantly improves the model.

anova(model)

Analysis of Variance Table

Response: mpg
          Df Sum Sq Mean Sq  F value    Pr(>F)    
wt         1 847.73  847.73 115.9334 1.836e-11 ***
disp       1  31.64   31.64   4.3270   0.04678 *  
carb       1  41.94   41.94   5.7359   0.02355 *  
Residuals 28 204.74    7.31                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Small p-values (high significance terms) indicate that the predictor contributes meaningfully to explaining the response, while large p-values (low significance terms) suggest little additional contribution once other predictors are included.

The residual row shows the variation left unexplained by the model.

Variance inflation Factors

The Variance Inflation Factor (VIF) quantifies how much the variance of a regression coefficient is inflated due to multicollinearity among the terms in a model. For each term, the VIF is calculated by fitting a regression of that term on all the other terms and measuring how well it can be predicted. (swap dependent and independent variables).

vif(model)

      wt     disp     carb 
4.890224 4.734708 1.225414

A VIF of 1 indicates no correlation with other terms, values between 1 and 5 suggest moderate correlation, and values above 5 (or 10) indicate high multicollinearity, which can make coefficient estimates unstable (imagine a flate plane teetering on a tightrope).

High VIF values highlight terms whose explanatory power overlaps substantially with other terms, making it difficult to isolate their individual effects.

ANOVA Workflows with special names

General Model

Suppose we have a response \(Y\) and several categorical factors \(A, B, \dots\). The general ANOVA model is:

\[ \boxed{Y_{ijk\dots} = \mu + \underbrace{\overbrace{\underbrace{\alpha_i + \beta_j}_{\text{main effects}} + \underbrace{(\alpha\beta)_{ij}}_{\text{interaction}} + \dots}^{adjustment} + \underbrace{\epsilon_{ijk}}_{\text{error}}}_{variance}}\quad\text{where} \quad \epsilon_{ijk\dots} \sim N(0, \sigma^2) \]

where:

\(\mu\) is the overall mean
\(\alpha_i\) is the main effect of level \(i\) of factor \(A\) - group offset from the total mean
\(\beta_j\) is the mzin effect of level \(j\) of factor \(B\) - group offset from the total mean
\((\alpha\beta)_{ij}\) is the interaction effect between factors \(A\) and \(B\)
Additional terms can be added for more factors (ANCOVA) or interactions (factorial ANOVA).

One Way ANOVA

Variance of a numeric dependent variable adjusted over a single categorical factor

ANOVA uses multivariable linear regression to test if there are significant statistical differences between the means of three or more groups (student’s t-test/simple linear regression is used in the case of two groups) classified by a single independent factor.

\[ \boxed{Y_{ij} = \mu + \alpha_i + \epsilon_{ij}}\qquad\text{where} \quad i = 1, \dots, k \quad j = 1, \dots, n_i \] \[ \begin{align*} Y_{ij} &\text{ is the response of observation } j \text{ in group } i, \\ \mu &\text{ is the overall mean,} \\ \alpha_i &\text{ is the effect of group } i, \\ \epsilon_{ij} &\sim N(0, \sigma^2) \text{ is the random error.} \end{align*} \]

hypotheses:

Null \(H_0\): all groups have the same mean
Alternative \(H_A\): not all groups have the same mean

Requires/Expects:

single continous/quantitative dependent variable \(Y\) (aka metric variable)
single factored/qualitative independent variable \(\alpha\) (aka nominal variable)

Assumptions:

independence: the independent variables should be mutually independent (strictly no confounding e.g. intellegence ~ age, shoe_size)
normality: data should be normally distributed within groups (testable using the Shapiro-Wilk test) or the residuals from the normal model should be normally distributed (testable via qqnorm + qqline plot). The Mean Squared Error (MSE) of the residuals estimates the within-group variance and assumes normality for valid inference.
homogeneity / homoscedasticity: each group should have the same variance (testable through the Levene test; if violated, a Welch ANOVA can be used, where degrees of freedom are adjusted as in the Welch test)

Groups must be selected data normallly distributed within groups

decomposition of variance:

\[ \underbrace{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2}_{\text{Total Sum of Squares}} = \underbrace{\sum_{j=1}^{k} n_j (\bar{y}_j - \bar{y})^2}_{\text{SSB (Between-group)}} + \underbrace{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2}_{\text{SSW (Within-group)}} \]

Example:

dt <- as.data.table(InsectSprays)

    count  spray
    <num> <fctr>
 1:    10      A
 2:     7      A
 3:    20      A
 4:    14      A
 5:    14      A
 6:    12      A
 7:    10      A
 8:    23      A
 9:    17      A
10:    20      A

Automatic reference variable selection

summary(lm(count~spray,data=dt))$coef

               Estimate Std. Error    t value     Pr(>|t|)
(Intercept)  14.5000000   1.132156 12.8074279 1.470512e-19
sprayB        0.8333333   1.601110  0.5204724 6.044761e-01
sprayC      -12.4166667   1.601110 -7.7550382 7.266893e-11
sprayD       -9.5833333   1.601110 -5.9854322 9.816910e-08
sprayE      -11.0000000   1.601110 -6.8702352 2.753922e-09
sprayF        2.1666667   1.601110  1.3532281 1.805998e-01

summary.lm(aov(count ~ spray, data = dt))$coef

               Estimate Std. Error    t value     Pr(>|t|)
(Intercept)  14.5000000   1.132156 12.8074279 1.470512e-19
sprayB        0.8333333   1.601110  0.5204724 6.044761e-01
sprayC      -12.4166667   1.601110 -7.7550382 7.266893e-11
sprayD       -9.5833333   1.601110 -5.9854322 9.816910e-08
sprayE      -11.0000000   1.601110 -6.8702352 2.753922e-09
sprayF        2.1666667   1.601110  1.3532281 1.805998e-01

observation…notice the lm command recognised the ANOVA operation and performed dummy coding automatically taking spray A as the reference variable and ommiting it from the feature space. here; - 14.5000000 is the mean for spray A - 0.8333333 is the change in the mean between spray A and spray B - -12.4166667 is the mean of spray A minus the mean of spray C as seen in the violin plot

Dropping the intercept

dt[,.(mean(count)),spray]

    spray        V1
   <fctr>     <num>
1:      A 14.500000
2:      B 15.333333
3:      C  2.083333
4:      D  4.916667
5:      E  3.500000
6:      F 16.666667

summary(lm(count~spray-1,data=dt))$coef

        Estimate Std. Error   t value     Pr(>|t|)
sprayA 14.500000   1.132156 12.807428 1.470512e-19
sprayB 15.333333   1.132156 13.543487 1.001994e-20
sprayC  2.083333   1.132156  1.840148 7.024334e-02
sprayD  4.916667   1.132156  4.342749 4.953047e-05
sprayE  3.500000   1.132156  3.091448 2.916794e-03
sprayF 16.666667   1.132156 14.721181 1.573471e-22

observation…now we see the precise means for each spray in the first column of the table relative to 0

Custom reference variable selection

summary(lm(count~
             I(as.numeric(spray=='A'))+  # now B is the reference
             I(as.numeric(spray=='C'))+
             I(as.numeric(spray=='D'|spray=='E'|spray=='F')) # combine the remainder
           ,data=dt))$coef

                                                             Estimate
(Intercept)                                                15.3333333
I(as.numeric(spray == "A"))                                -0.8333333
I(as.numeric(spray == "C"))                               -13.2500000
I(as.numeric(spray == "D" | spray == "E" | spray == "F"))  -6.9722222
                                                          Std. Error    t value
(Intercept)                                                 1.667483  9.1954935
I(as.numeric(spray == "A"))                                 2.358178 -0.3533802
I(as.numeric(spray == "C"))                                 2.358178 -5.6187453
I(as.numeric(spray == "D" | spray == "E" | spray == "F"))   1.925444 -3.6210983
                                                              Pr(>|t|)
(Intercept)                                               1.499591e-13
I(as.numeric(spray == "A"))                               7.248966e-01
I(as.numeric(spray == "C"))                               3.889695e-07
I(as.numeric(spray == "D" | spray == "E" | spray == "F")) 5.602160e-04

Alternatively:

dt[, spray := relevel(spray, ref = "B")]
summary(lm(count~spray,data=dt))$coef

               Estimate Std. Error    t value     Pr(>|t|)
(Intercept)  15.3333333   1.132156 13.5434869 1.001994e-20
sprayA       -0.8333333   1.601110 -0.5204724 6.044761e-01
sprayC      -13.2500000   1.601110 -8.2755106 8.509776e-12
sprayD      -10.4166667   1.601110 -6.5059045 1.212803e-08
sprayE      -11.8333333   1.601110 -7.3907075 3.257986e-10
sprayF        1.3333333   1.601110  0.8327558 4.079858e-01

ANCOVA

Variance of a numeric dependent variable adjusted over independent continuous covariates as well as categorical factor

The general ANCOVA model is:

\[ \boxed{Y_{ijk\dots} \;=\; \mu \;+\; \underbrace{ \overbrace{ \underbrace{\alpha_i + \beta_j}_{\text{main effects}} + \underbrace{(\alpha\beta)_{ij}}_{\text{interaction}} + \overbrace{\gamma_1 x_{ijk\dots}^{(1)} + \gamma_2 x_{ijk\dots}^{(2)}}^{\text{ANCOVA regression terms}} + \dots }^{\text{adjustment}} + \underbrace{\epsilon_{ijk\dots}}_{\text{error}} }_{\text{variance}}} \quad \text{where } \epsilon_{ijk\dots} \sim N(0,\sigma^2) \]

coursera fertility example with binary state religion (DATAtab, 2025)

dt <- as.data.table(swiss)
hist(dt$Catholic, main="Bimodal catholic distribution - most states are either catholic or protestant")

# Fit parallel-lines model: same slope, different intercepts
m_par <- lm(Fertility ~ Agriculture + factor(CatholicBin), data = dt)

# Extract coefficients
coef(m_par)[1]          # Intercept for Non-Catholic

(Intercept) 
   60.83224

coef(m_par)[2]          # Shared slope

Agriculture 
  0.1241776

coef(m_par)[3]          # Intercept difference for Catholic

factor(CatholicBin)1 
            7.884329

Two Way ANOVA

Variance of a numeric dependent variable adjusted over two dependent categorical factors with interaction

ANOVA tests if there are significant statistical differences between the means of three or more groups (student’s t-testis used to test two groups) classified by a multiple independent factors.

\[ \begin{equation} \boxed{Y_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ij}}, \quad i = 1, \dots, a, \quad j = 1, \dots, b \end{equation} \] where.. \[ \begin{align*} Y_{ij} &\text{ is the response at level } i \text{ of factor A and level } j \text{ of factor B,} \\ \mu &\text{ is the overall mean,} \\ \alpha_i &\text{ is the effect of level } i \text{ of factor A,} \\ \beta_j &\text{ is the effect of level } j \text{ of factor B,} \\ (\alpha\beta)_{ij} &\text{ is the interaction effect between level } i \text{ of A and level } j \text{ of B,} \\ \epsilon_{ij} &\sim N(0, \sigma^2) \text{ is the random error.} \end{align*} \]

twisted-hyperplane (not quite - two triangles)

Compound hypotheses:

There’s a symmetric matrix of things to check here: - Main effect of first factor: - Null \(H_{0_{11}}\): all groups of the first factor have the same mean
- Alternative \(H_{A_{11}}\): not all groups of the first factor have the same mean
- Main effect of second factor: - Null \(H_{0_{22}}\): all groups of the second factor have the same mean
- Alternative \(H_{A_{22}}\): not all groups of the second factor have the same mean
- Interaction of first and second factors: - Null \(H_{0_{12}}\): no interaction between the two factors
- Alternative \(H_{A_{12}}\): there is an interaction between the two factors

effects can be seen in an Interaction - similar to the linear ANCOVA interaction plot

Requires/Expects:

single continous/quantitative dependent variable \(y\) (aka metric variable)
multiple factored/qualitative independent variable \(a,b,c\) (aka nominal variable)

\[y\sim f(a,b,c)\]

Assumptions:

independence: the independent variables should be mutually independent (strictly no confounding e.g. intellegence ~ age, shoe_size)
normality: data should be normally distributed within groups (testable using the Shapiro-Wilk test) or the residuals from the normal model should be normally distributed (testable via qqnorm + qqline plot). The Mean Squared Error (MSE) of the residuals estimates the within-group variance and assumes normality for valid inference.
homogeneity / homoscedasticity: each group should have the same variance (testable through the Levene test)

example:

dt <- as.data.table(ToothGrowth)
dt[,c("supp","dose"):=lapply(.SD,as.factor), .SDcols=c("supp","dose")]

model_twoway <- aov(len ~ dose * supp, data = dt)
summary(model_twoway)

            Df Sum Sq Mean Sq F value   Pr(>F)    
dose         2 2426.4  1213.2  92.000  < 2e-16 ***
supp         1  205.4   205.4  15.572 0.000231 ***
dose:supp    2  108.3    54.2   4.107 0.021860 *  
Residuals   54  712.1    13.2                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

RM-ANOVA

ANOVA with repeated measures - ANOVA where the grouping of the datapoints is the measurement ordinal

Repeated measures ANOVA can be seen as a special case of ANOVA. In repeated measures designs, the same subjects are observed under multiple conditions or timepoints, and the analysis accounts for the natural pairing of these observations. Mathematically, this is equivalent to treating one measurement (for example, a baseline) as a covariate when modelling the next, so that between-subject variability is removed and only within-subject differences are tested.

I really don’t underderstand why this is given such prominance, It’s only ANOVA applied to a particular experiment type, see one-way

MMRM ANOVA (Different groups accross repeated measures)

Mixed Linear Model ANOVA with repeated measures

Why all the fancy titles?!?!? It’s only ANOVA with repeat measures (within subject groupings) and “proper” groupings (between subject groupings ), see two-way

References

DATAtab. (2025). ANCOVA (analysis of covariance): A mix of ANOVA and···. https://www.youtube.com/watch?v=PngndHgZOgY.

McCabe, D. (2024). Positive definite matrices. Zenodo. https://doi.org/10.5281/zenodo.14392792

McCabe, D. (2025). Regression i: Practical linear multivariable regression. https://rpubs.com/mccabe08/1335788

Footnotes

We can do this where appropriate, as ANOVA is not limited to binary groups.↩︎

Key Concepts (KC)

KC 1. ANOVA as Adjustment

KC 2. Snedecor’s F-Test Statistic

KC 3. Dependent Variable Types

KC 3.1. Continuous Covariates

KC 3.2. Discrete/Categorical/Qualitative/Factor Variables

Indicator Variables

Dummy Coding

KC 4. Interaction Terms

Multivariable Linear Regression (No interaction) lm(z~x+y, ...)

Linear Interaction Models lm(z~x*y, ...)

KC 4.b) Interaction Plots

ANOVA Tooling in R

ANOVA Table

Variance inflation Factors

ANOVA Workflows with special names

One Way ANOVA

Example:

Automatic reference variable selection

Dropping the intercept

Custom reference variable selection

ANCOVA

Two Way ANOVA

RM-ANOVA

MMRM ANOVA (Different groups accross repeated measures)

References

Footnotes

Multivariable Linear Regression (No interaction) `lm(z~x+y, ...)`

Linear Interaction Models `lm(z~x*y, ...)`