This document presents the full execution of a One-Way Multivariate Analysis of Variance (MANOVA) on the Students Performance in Exams dataset. The analysis directly follows the assumption testing phase, in which all five prerequisites were verified and confirmed to be satisfied prior to conducting the inferential procedure. The central question driving this analysis is whether gender as an independent grouping variable produces a statistically discernible difference in the centroid that is, the multivariate mean vector of three academic outcome variables measured simultaneously mathematics score, reading score, and writing score.
MANOVA is selected over a series of independent ANOVAs for two compounding reasons. First, the Bartlett Sphericity Test confirmed that the three dependent variables are significantly intercorrelated (Pearson r ranging from 0.818 to 0.954), which means that treating them as isolated outcomes would ignore meaningful shared variance and inflate the familywise Type I error rate. Second, the Box’s M Test confirmed homogeneity of the variance-covariance matrices across gender groups, validating the pooled within-group covariance structure that MANOVA relies upon. Together, these conditions position MANOVA as both methodologically appropriate and statistically efficient.
The analysis follows a one-way, between-subjects design with the following configuration:
gender, a
nominal factor with two levels female (n = 518) and
male (n = 482).math score,
reading score, and writing score each a
continuous integer-scale variable representing standardised examination
performance.df <- read.csv("StudentsPerformance.csv", check.names = FALSE)
dv_cols <- c("math score", "reading score", "writing score")
iv_col <- "gender"
Y <- df[, dv_cols]
groups <- df[[iv_col]]
cat("Dataset dimensions:", nrow(df), "rows x", ncol(df), "columns\n")
cat("Factor levels:", paste(levels(factor(groups)), collapse = ", "), "\n")
cat("Group sizes:\n")
print(table(groups))## Dataset dimensions: 1000 rows x 8 columns
## Factor levels: female, male
## Group sizes:
## groups
## female male
## 518 482
desc_group <- df %>%
group_by(gender) %>%
summarise(
n_math = n(),
mean_math = round(mean(`math score`), 3),
sd_math = round(sd(`math score`), 3),
mean_reading = round(mean(`reading score`), 3),
sd_reading = round(sd(`reading score`), 3),
mean_writing = round(mean(`writing score`), 3),
sd_writing = round(sd(`writing score`), 3)
)
desc_group %>%
kable(caption = "Group Centroids and Standard Deviations by Gender") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE) %>%
column_spec(1, bold = TRUE) %>%
row_spec(0, bold = TRUE, color = "white", background = "#34495E")| gender | n_math | mean_math | sd_math | mean_reading | sd_reading | mean_writing | sd_writing |
|---|---|---|---|---|---|---|---|
| female | 518 | 63.633 | 15.491 | 72.608 | 14.378 | 72.467 | 14.845 |
| male | 482 | 68.728 | 14.356 | 65.473 | 13.932 | 63.311 | 14.114 |
desc_full <- data.frame(
Variable = dv_cols,
N_Female = sapply(dv_cols, function(v) sum(!is.na(df[[v]][df$gender == "female"]))),
Mean_Female = round(sapply(dv_cols, function(v) mean(df[[v]][df$gender == "female"])), 3),
SD_Female = round(sapply(dv_cols, function(v) sd(df[[v]][df$gender == "female"])), 3),
N_Male = sapply(dv_cols, function(v) sum(!is.na(df[[v]][df$gender == "male"]))),
Mean_Male = round(sapply(dv_cols, function(v) mean(df[[v]][df$gender == "male"])), 3),
SD_Male = round(sapply(dv_cols, function(v) sd(df[[v]][df$gender == "male"])), 3),
Mean_Diff = round(sapply(dv_cols, function(v)
mean(df[[v]][df$gender == "female"]) - mean(df[[v]][df$gender == "male"])), 3)
)
desc_full %>%
kable(caption = "Detailed Descriptive Statistics: Female vs. Male across All DVs") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = TRUE) %>%
column_spec(1, bold = TRUE) %>%
column_spec(8, bold = TRUE) %>%
row_spec(0, bold = TRUE, color = "white", background = "#34495E")| Variable | N_Female | Mean_Female | SD_Female | N_Male | Mean_Male | SD_Male | Mean_Diff | |
|---|---|---|---|---|---|---|---|---|
| math score | math score | 518 | 63.633 | 15.491 | 482 | 68.728 | 14.356 | -5.095 |
| reading score | reading score | 518 | 72.608 | 14.378 | 482 | 65.473 | 13.932 | 7.135 |
| writing score | writing score | 518 | 72.467 | 14.845 | 482 | 63.311 | 14.114 | 9.156 |
The descriptive comparison reveals a discernible pattern at the group centroid level. Female students demonstrate higher mean scores in reading (mean = 72.61) and writing (mean = 72.47) relative to male students (reading mean = 65.47, writing mean = 63.31). In contrast, male students record a marginally higher mean in mathematics (male mean = 68.73 vs. female mean = 63.63). Whether these observed differences in the multivariate centroid are statistically significant beyond sampling variability is precisely the question that MANOVA is designed to address.
One-Way MANOVA extends the univariate ANOVA to a setting with \(p\) dependent variables measured simultaneously. Each observation vector is decomposed as:
\[\mathbf{X}_{\ell j} = \boldsymbol{\mu} + \boldsymbol{\tau}_\ell + \mathbf{e}_{\ell j}, \quad j = 1, 2, \ldots, n_\ell, \quad \ell = 1, 2, \ldots, g\]
where \(\boldsymbol{\mu}\) is the \(p \times 1\) grand mean vector, \(\boldsymbol{\tau}_\ell\) is the \(p \times 1\) treatment effect vector for group \(\ell\), and \(\mathbf{e}_{\ell j} \sim N_p(\mathbf{0}, \boldsymbol{\Sigma})\) is the error vector.
The total sum-of-squares-and-cross-products (SSCP) matrix decomposes as:
\[\mathbf{T} = \mathbf{B} + \mathbf{W}\]
where:
Between-group SSCP matrix (B): \(\mathbf{B} = \sum_{\ell=1}^{g} n_\ell (\bar{\mathbf{x}}_\ell - \bar{\mathbf{x}})(\bar{\mathbf{x}}_\ell - \bar{\mathbf{x}})'\), with \(df_B = g - 1\)
Within-group SSCP matrix (W): \(\mathbf{W} = \sum_{\ell=1}^{g}\sum_{j=1}^{n_\ell}(\mathbf{x}_{\ell j} - \bar{\mathbf{x}}_\ell)(\mathbf{x}_{\ell j} - \bar{\mathbf{x}}_\ell)'\), with \(df_W = \sum n_\ell - g\)
Null hypothesis: \[H_0: \boldsymbol{\tau}_1 = \boldsymbol{\tau}_2 = \cdots = \boldsymbol{\tau}_g = \mathbf{0}\]
Equivalently, the mean vectors are equal across all groups: \[H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \cdots = \boldsymbol{\mu}_g\]
Alternative hypothesis: \[H_1: \text{At least one } \boldsymbol{\tau}_\ell \neq \mathbf{0}\]
In the specific context of this analysis with \(g = 2\) groups: \[H_0: \boldsymbol{\mu}_{\text{female}} = \boldsymbol{\mu}_{\text{male}}\] \[H_1: \boldsymbol{\mu}_{\text{female}} \neq \boldsymbol{\mu}_{\text{male}}\]
Four multivariate test statistics are routinely computed from the eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_s\) of the matrix \(\mathbf{W}^{-1}\mathbf{B}\), where \(s = \min(p, g-1)\):
| Statistic | Formula | Interpretation |
|---|---|---|
| Wilks’ Lambda | \(\Lambda = \prod_{i=1}^{s} \frac{1}{1+\lambda_i} = \frac{|\mathbf{W}|}{|\mathbf{B}+\mathbf{W}|}\) | Ratio of within to total variance; smaller values indicate stronger group separation |
| Pillai’s Trace | \(V = \sum_{i=1}^{s} \frac{\lambda_i}{1+\lambda_i}\) | Sum of explained variance proportions; most robust statistic |
| Hotelling-Lawley Trace | \(T = \sum_{i=1}^{s} \lambda_i\) | Sum of eigenvalues; powerful when assumptions are met |
| Roy’s Largest Root | \(\theta = \frac{\lambda_1}{1+\lambda_1}\) | Based on the single dominant dimension; liberal under violations |
Since \(g = 2\) (two groups) and therefore \(s = 1\), all four statistics are exact and yield identical F-ratios. Pillai’s Trace is emphasised as the primary criterion given its documented robustness under moderate departures from multivariate normality (Olson, 1974).
grand_mean <- colMeans(Y)
group_means <- aggregate(Y, by = list(Gender = groups), FUN = mean)
cat("Grand Mean Vector:\n")
print(round(grand_mean, 4))
cat("\nGroup Centroid Vectors:\n")
group_means_print <- group_means
group_means_print[, -1] <- round(group_means[, -1], 4)
print(group_means_print)
cat("\nDifference in Centroids (Female - Male):\n")
fem_mean <- colMeans(Y[groups == "female", ])
male_mean <- colMeans(Y[groups == "male", ])
print(round(fem_mean - male_mean, 4))## Grand Mean Vector:
## math score reading score writing score
## 66.089 69.169 68.054
##
## Group Centroid Vectors:
## Gender math score reading score writing score
## 1 female 63.6332 72.6081 72.4672
## 2 male 68.7282 65.4730 63.3112
##
## Difference in Centroids (Female - Male):
## math score reading score writing score
## -5.0950 7.1351 9.1560
g <- length(unique(groups))
n_total <- nrow(Y)
p <- ncol(Y)
B <- Reduce("+", lapply(unique(groups), function(grp) {
idx <- groups == grp
n_g <- sum(idx)
diff <- colMeans(Y[idx, ]) - grand_mean
n_g * outer(diff, diff)
}))
rownames(B) <- colnames(B) <- dv_cols
cat("Between-Group SSCP Matrix (B), df =", g - 1, "\n\n")
print(round(B, 4))
cat("\nDeterminant |B| =", round(det(B), 4), "\n")## Between-Group SSCP Matrix (B), df = 1
##
## math score reading score writing score
## math score 6481.374 -9076.548 -11647.34
## reading score -9076.548 12710.844 16310.99
## writing score -11647.338 16310.990 20930.82
##
## Determinant |B| = 0
W <- Reduce("+", lapply(unique(groups), function(grp) {
Y_g <- Y[groups == grp, ]
mu_g <- colMeans(Y_g)
swept <- sweep(Y_g, 2, mu_g)
t(as.matrix(swept)) %*% as.matrix(swept)
}))
rownames(W) <- colnames(W) <- dv_cols
df_W <- n_total - g
cat("Within-Group SSCP Matrix (W), df =", df_W, "\n\n")
print(round(W, 4))
cat("\nDeterminant |W| =", round(det(W), 4), "\n")## Within-Group SSCP Matrix (W), df = 998
##
## math score reading score writing score
## math score 223207.7 189894.5 196401.5
## reading score 189894.5 200241.6 195263.9
## writing score 196401.5 195263.9 209746.3
##
## Determinant |W| = 1.41738e+14
T_mat <- B + W
df_T <- n_total - 1
cat("Total SSCP Matrix (T = B + W), df =", df_T, "\n\n")
print(round(T_mat, 4))
cat("\nDeterminant |T| =", round(det(T_mat), 4), "\n")## Total SSCP Matrix (T = B + W), df = 999
##
## math score reading score writing score
## math score 229689.1 180818.0 184754.2
## reading score 180818.0 212952.4 211574.9
## writing score 184754.2 211574.9 230677.1
##
## Determinant |T| = 3.264389e+14
plot_sscp <- function(mat, title, palette_high) {
mat_df <- as.data.frame(as.table(mat))
colnames(mat_df) <- c("Row", "Col", "Value")
ggplot(mat_df, aes(x = Col, y = Row, fill = Value)) +
geom_tile(color = "white", linewidth = 0.8) +
geom_text(aes(label = round(Value, 1)), size = 3.8, fontface = "bold") +
scale_fill_gradient2(low = "#EBF5FB", mid = "#AED6F1",
high = palette_high,
midpoint = median(mat_df$Value)) +
theme_minimal(base_size = 11) +
theme(axis.text.x = element_text(angle = 25, hjust = 1),
plot.title = element_text(hjust = 0.5, face = "bold", size = 12),
legend.position = "right") +
labs(title = title, x = "", y = "", fill = "Value")
}
grid.arrange(
plot_sscp(B, "Between-Group SSCP (B)", "#1A5276"),
plot_sscp(W, "Within-Group SSCP (W)", "#1E8449"),
plot_sscp(T_mat, "Total SSCP (T = B + W)", "#6E2F6E"),
ncol = 3
)The SSCP matrices reveal the core mechanics of the MANOVA decomposition. The between-group matrix B captures variance attributable to group membership the diagonal entries reflect sum-of-squared deviations of group centroids from the grand mean, while off-diagonal entries encode covariation between DVs across groups. The within-group matrix W absorbs residual variance unexplained by group membership; its diagonal entries are substantially larger than those in B, which is typical for individual-level educational data characterised by broad score distributions. The key inferential question is whether the signal in B is large relative to the noise in W formally assessed through the eigenstructure of W\(^{-1}\)B.
WinvB <- solve(W) %*% B
eigenvals <- Re(eigen(WinvB)$values)
eigenvals <- eigenvals[eigenvals > 1e-10]
wilks_lambda <- prod(1 / (1 + eigenvals))
det_ratio <- det(W) / det(T_mat)
pillai_trace <- sum(eigenvals / (1 + eigenvals))
hotelling_T <- sum(eigenvals)
roy_root <- max(eigenvals) / (1 + max(eigenvals))
cat("Eigenvalues of W^{-1}B:\n")
print(round(eigenvals, 6))
cat("\n--- Manual Multivariate Statistics ---\n")
cat(sprintf("Wilks' Lambda (det ratio) : %.6f\n", det_ratio))
cat(sprintf("Wilks' Lambda (eigenvalues) : %.6f\n", wilks_lambda))
cat(sprintf("Pillai's Trace : %.6f\n", pillai_trace))
cat(sprintf("Hotelling-Lawley Trace : %.6f\n", hotelling_T))
cat(sprintf("Roy's Largest Root : %.6f\n", roy_root))## Eigenvalues of W^{-1}B:
## [1] 1.303115
##
## --- Manual Multivariate Statistics ---
## Wilks' Lambda (det ratio) : 0.434195
## Wilks' Lambda (eigenvalues) : 0.434195
## Pillai's Trace : 0.565805
## Hotelling-Lawley Trace : 1.303115
## Roy's Largest Root : 0.565805
# Exact F transformation for g=2 (s=1), all four statistics identical
# F = ((1 - Lambda) / Lambda) * ((df_W - p + 1) / p)
# df1 = p, df2 = df_W - p + 1
df1 <- p
df2 <- df_W - p + 1
F_wilks <- ((1 - wilks_lambda) / wilks_lambda) * (df2 / df1)
p_wilks <- pf(F_wilks, df1, df2, lower.tail = FALSE)
cat("--- F-Ratio Transformation (Wilks' Lambda, g=2 exact) ---\n")
cat(sprintf("df numerator (p) : %d\n", df1))
cat(sprintf("df denominator (df_W-p+1): %d\n", df2))
cat(sprintf("F statistic : %.4f\n", F_wilks))
cat(sprintf("p-value : %.2e\n", p_wilks))
cat(sprintf("Decision : %s\n",
ifelse(p_wilks < 0.05,
"Reject H0 — significant multivariate group difference",
"Fail to reject H0")))## --- F-Ratio Transformation (Wilks' Lambda, g=2 exact) ---
## df numerator (p) : 3
## df denominator (df_W-p+1): 996
## F statistic : 432.6342
## p-value : 7.00e-180
## Decision : Reject H0 — significant multivariate group difference
manova_model <- manova(
cbind(`math score`, `reading score`, `writing score`) ~ gender,
data = df
)
cat("--- MANOVA Summary: All Four Multivariate Tests ---\n\n")
for (test_name in c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")) {
cat(sprintf("[ %s ]\n", test_name))
print(summary(manova_model, test = test_name))
cat("\n")
}## --- MANOVA Summary: All Four Multivariate Tests ---
##
## [ Pillai ]
## Df Pillai approx F num Df den Df Pr(>F)
## gender 1 0.56581 432.63 3 996 < 2.2e-16 ***
## Residuals 998
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## [ Wilks ]
## Df Wilks approx F num Df den Df Pr(>F)
## gender 1 0.43419 432.63 3 996 < 2.2e-16 ***
## Residuals 998
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## [ Hotelling-Lawley ]
## Df Hotelling-Lawley approx F num Df den Df Pr(>F)
## gender 1 1.3031 432.63 3 996 < 2.2e-16 ***
## Residuals 998
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## [ Roy ]
## Df Roy approx F num Df den Df Pr(>F)
## gender 1 1.3031 432.63 3 996 < 2.2e-16 ***
## Residuals 998
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
manova_results <- data.frame(
Statistic = c("Pillai's Trace", "Wilks' Lambda",
"Hotelling-Lawley Trace", "Roy's Largest Root"),
Value = round(c(pillai_trace, wilks_lambda, hotelling_T, roy_root), 6),
Approx_F = round(c(
summary(manova_model, test = "Pillai")$stats[1, "approx F"],
summary(manova_model, test = "Wilks")$stats[1, "approx F"],
summary(manova_model, test = "Hotelling-Lawley")$stats[1, "approx F"],
summary(manova_model, test = "Roy")$stats[1, "approx F"]
), 4),
df1 = c(
summary(manova_model, test = "Pillai")$stats[1, "num Df"],
summary(manova_model, test = "Wilks")$stats[1, "num Df"],
summary(manova_model, test = "Hotelling-Lawley")$stats[1, "num Df"],
summary(manova_model, test = "Roy")$stats[1, "num Df"]
),
df2 = c(
summary(manova_model, test = "Pillai")$stats[1, "den Df"],
summary(manova_model, test = "Wilks")$stats[1, "den Df"],
summary(manova_model, test = "Hotelling-Lawley")$stats[1, "den Df"],
summary(manova_model, test = "Roy")$stats[1, "den Df"]
),
p_value = formatC(c(
summary(manova_model, test = "Pillai")$stats[1, "Pr(>F)"],
summary(manova_model, test = "Wilks")$stats[1, "Pr(>F)"],
summary(manova_model, test = "Hotelling-Lawley")$stats[1, "Pr(>F)"],
summary(manova_model, test = "Roy")$stats[1, "Pr(>F)"]
), format = "e", digits = 3),
Decision = rep("Reject H0", 4)
)
manova_results %>%
kable(caption = "One-Way MANOVA: Multivariate Test Statistics — Effect of Gender on Academic Performance",
col.names = c("Test Statistic", "Value", "Approx. F",
"df (num)", "df (den)", "p-value", "Decision")) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE) %>%
column_spec(1, bold = TRUE) %>%
column_spec(7, bold = TRUE, color = "#C0392B") %>%
row_spec(0, bold = TRUE, color = "white", background = "#2C3E50") %>%
row_spec(1, background = "#D5F5E3") # highlight Pillai as primary| Test Statistic | Value | Approx. F | df (num) | df (den) | p-value | Decision |
|---|---|---|---|---|---|---|
| Pillai’s Trace | 0.565805 | 432.6342 | 3 | 996 | 6.997e-180 | Reject H0 |
| Wilks’ Lambda | 0.434195 | 432.6342 | 3 | 996 | 6.997e-180 | Reject H0 |
| Hotelling-Lawley Trace | 1.303115 | 432.6342 | 3 | 996 | 6.997e-180 | Reject H0 |
| Roy’s Largest Root | 0.565805 | 432.6342 | 3 | 996 | 6.997e-180 | Reject H0 |
# Multivariate eta-squared from Wilks' Lambda
# eta2_multivariate = 1 - Lambda^(1/s), s = min(p, g-1)
s <- min(p, g - 1)
eta2_multi <- 1 - wilks_lambda^(1 / s)
# Univariate partial eta-squared per DV (from follow-up ANOVA)
uni_anova <- summary.aov(manova_model)
eta2_uni <- sapply(seq_along(dv_cols), function(i) {
ss_b <- uni_anova[[i]][["Sum Sq"]][1]
ss_e <- uni_anova[[i]][["Sum Sq"]][2]
ss_b / (ss_b + ss_e)
})
names(eta2_uni) <- dv_cols
cat("Multivariate Eta-Squared (1 - Lambda^(1/s)):\n")
cat(sprintf(" eta2 = %.4f (%.1f%% of multivariate variance explained by gender)\n\n",
eta2_multi, eta2_multi * 100))
cat("Univariate Partial Eta-Squared per DV:\n")
for (i in seq_along(dv_cols)) {
magnitude <- ifelse(eta2_uni[i] >= 0.14, "large",
ifelse(eta2_uni[i] >= 0.06, "medium", "small"))
cat(sprintf(" %-20s: eta2 = %.4f (%s effect, Cohen 1988)\n",
dv_cols[i], eta2_uni[i], magnitude))
}## Multivariate Eta-Squared (1 - Lambda^(1/s)):
## eta2 = 0.5658 (56.6% of multivariate variance explained by gender)
##
## Univariate Partial Eta-Squared per DV:
## math score : eta2 = 0.0282 (small effect, Cohen 1988)
## reading score : eta2 = 0.0597 (small effect, Cohen 1988)
## writing score : eta2 = 0.0907 (medium effect, Cohen 1988)
effect_tbl <- data.frame(
Variable = c("Multivariate (overall)", dv_cols),
Partial_eta2 = round(c(eta2_multi, eta2_uni), 4),
Pct_Explained = paste0(round(c(eta2_multi, eta2_uni) * 100, 2), "%"),
Magnitude = c(
ifelse(eta2_multi >= 0.14, "Large", ifelse(eta2_multi >= 0.06, "Medium", "Small")),
sapply(eta2_uni, function(x) ifelse(x >= 0.14, "Large",
ifelse(x >= 0.06, "Medium", "Small")))
)
)
effect_tbl %>%
kable(caption = "Effect Size: Partial Eta-Squared (Cohen, 1988 benchmarks: small = 0.01, medium = 0.06, large = 0.14)",
col.names = c("Variable", "Partial eta2", "Variance Explained", "Magnitude")) %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
column_spec(1, bold = TRUE) %>%
column_spec(4, bold = TRUE) %>%
row_spec(0, bold = TRUE, color = "white", background = "#34495E") %>%
row_spec(1, background = "#EBF5FB", bold = TRUE)| Variable | Partial eta2 | Variance Explained | Magnitude | |
|---|---|---|---|---|
| Multivariate (overall) | 0.5658 | 56.58% | Large | |
| math score | math score | 0.0282 | 2.82% | Small |
| reading score | reading score | 0.0597 | 5.97% | Small |
| writing score | writing score | 0.0907 | 9.07% | Medium |
eff_df <- data.frame(
Variable = effect_tbl$Variable[-1],
eta2 = effect_tbl$Partial_eta2[-1]
)
ggplot(eff_df, aes(x = reorder(Variable, eta2), y = eta2, fill = eta2)) +
geom_col(width = 0.55, color = "white") +
geom_text(aes(label = sprintf("%.4f", eta2)), hjust = -0.15,
size = 4, fontface = "bold") +
geom_hline(yintercept = 0.01, linetype = "dotted", color = "#E74C3C", linewidth = 0.8) +
geom_hline(yintercept = 0.06, linetype = "dashed", color = "#E67E22", linewidth = 0.8) +
geom_hline(yintercept = 0.14, linetype = "dashed", color = "#27AE60", linewidth = 0.8) +
annotate("text", x = 0.6, y = 0.012, label = "small (0.01)", size = 3, color = "#E74C3C") +
annotate("text", x = 0.6, y = 0.062, label = "medium (0.06)", size = 3, color = "#E67E22") +
annotate("text", x = 0.6, y = 0.142, label = "large (0.14)", size = 3, color = "#27AE60") +
coord_flip() +
scale_fill_gradient(low = "#AED6F1", high = "#1A5276") +
scale_y_continuous(limits = c(0, 0.22), expand = c(0, 0)) +
theme_minimal(base_size = 12) +
theme(legend.position = "none",
plot.title = element_text(hjust = 0.5, face = "bold")) +
labs(title = "Partial Eta-Squared by Dependent Variable",
x = "", y = "Partial eta2")When MANOVA yields a significant multivariate result, follow-up univariate ANOVAs are conducted to identify which specific dependent variables contributed to the overall multivariate group separation. These tests are conducted with a Bonferroni-adjusted significance threshold of \(\alpha_{\text{adj}} = 0.05 / 3 = 0.0167\) to control the familywise error rate across the three simultaneous comparisons.
alpha_bonf <- 0.05 / length(dv_cols)
cat(sprintf("Bonferroni-adjusted significance threshold: alpha = %.4f\n\n", alpha_bonf))
anova_results_list <- lapply(dv_cols, function(v) {
model <- aov(reformulate(iv_col, response = paste0("`", v, "`")), data = df)
anova_t <- summary(model)[[1]]
F_val <- anova_t["gender", "F value"]
p_val <- anova_t["gender", "Pr(>F)"]
SS_B <- anova_t["gender", "Sum Sq"]
SS_E <- anova_t["Residuals", "Sum Sq"]
df_B <- anova_t["gender", "Df"]
df_E <- anova_t["Residuals", "Df"]
eta2 <- SS_B / (SS_B + SS_E)
data.frame(
DV = v,
SS_Between = round(SS_B, 3),
SS_Within = round(SS_E, 3),
df_B = df_B,
df_W = df_E,
MS_Between = round(SS_B / df_B, 3),
MS_Within = round(SS_E / df_E, 3),
F_value = round(F_val, 4),
p_value = round(p_val, 6),
eta2 = round(eta2, 4),
Significant = ifelse(p_val < alpha_bonf, "Yes", "No")
)
})
anova_tbl <- do.call(rbind, anova_results_list)
anova_tbl %>%
kable(caption = sprintf(
"Follow-Up Univariate ANOVAs — Effect of Gender on Each DV (Bonferroni alpha = %.4f)",
alpha_bonf),
col.names = c("DV", "SS Between", "SS Within", "df B", "df W",
"MS Between", "MS Within", "F", "p-value", "eta2", "Significant")) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = TRUE) %>%
column_spec(1, bold = TRUE) %>%
column_spec(9, bold = TRUE) %>%
column_spec(11, bold = TRUE,
color = ifelse(anova_tbl$Significant == "Yes", "#1E8449", "#C0392B")) %>%
row_spec(0, bold = TRUE, color = "white", background = "#34495E")## Bonferroni-adjusted significance threshold: alpha = 0.0167
| DV | SS Between | SS Within | df B | df W | MS Between | MS Within | F | p-value | eta2 | Significant |
|---|---|---|---|---|---|---|---|---|---|---|
| math score | 6481.374 | 223207.7 | 1 | 998 | 6481.374 | 223.655 | 28.9793 | 0 | 0.0282 | Yes |
| reading score | 12710.844 | 200241.6 | 1 | 998 | 12710.844 | 200.643 | 63.3506 | 0 | 0.0597 | Yes |
| writing score | 20930.822 | 209746.3 | 1 | 998 | 20930.822 | 210.167 | 99.5916 | 0 | 0.0907 | Yes |
ci_df <- do.call(rbind, lapply(dv_cols, function(v) {
fem <- df[[v]][df[[iv_col]] == "female"]
mal <- df[[v]][df[[iv_col]] == "male"]
tt <- t.test(fem, mal, var.equal = TRUE)
data.frame(
DV = v,
Group = c("Female", "Male"),
Mean = c(mean(fem), mean(mal)),
Lower_CI = c(mean(fem) - qt(0.975, df_W) * sd(fem)/sqrt(length(fem)),
mean(mal) - qt(0.975, df_W) * sd(mal)/sqrt(length(mal))),
Upper_CI = c(mean(fem) + qt(0.975, df_W) * sd(fem)/sqrt(length(fem)),
mean(mal) + qt(0.975, df_W) * sd(mal)/sqrt(length(mal)))
)
}))
ggplot(ci_df, aes(x = Group, y = Mean, color = Group, group = DV)) +
geom_point(size = 4, shape = 18) +
geom_errorbar(aes(ymin = Lower_CI, ymax = Upper_CI), width = 0.18, linewidth = 1) +
geom_line(aes(group = DV), color = "grey60", linetype = "dashed", linewidth = 0.7) +
scale_color_manual(values = c("#E74C3C", "#3498DB")) +
facet_wrap(~DV, ncol = 3) +
theme_minimal(base_size = 12) +
theme(
strip.background = element_rect(fill = "#34495E"),
strip.text = element_text(color = "white", face = "bold"),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5, face = "bold", size = 14)
) +
labs(title = "Group Means with 95% Confidence Intervals per Dependent Variable",
x = "Gender", y = "Mean Score", color = "Gender")df_long <- df %>%
pivot_longer(cols = all_of(dv_cols), names_to = "Variable", values_to = "Score") %>%
mutate(Variable = factor(Variable, levels = dv_cols))
ggplot(df_long, aes(x = gender, y = Score, fill = gender)) +
geom_boxplot(alpha = 0.72, outlier.shape = 21, outlier.size = 1.4,
outlier.alpha = 0.5) +
geom_jitter(width = 0.16, alpha = 0.18, size = 0.7) +
stat_summary(fun = mean, geom = "point", shape = 23, size = 4,
fill = "white", color = "#2C3E50") +
scale_fill_manual(values = c("#E74C3C", "#3498DB")) +
facet_wrap(~Variable, ncol = 3) +
theme_minimal(base_size = 12) +
theme(
strip.background = element_rect(fill = "#2C3E50"),
strip.text = element_text(color = "white", face = "bold"),
legend.position = "bottom",
plot.title = element_text(hjust = 0.5, face = "bold", size = 14)
) +
labs(title = "Score Distribution by Gender (Diamond = Group Mean)",
x = "Gender", y = "Score", fill = "Gender")centroid_df <- df %>%
group_by(gender) %>%
summarise(across(all_of(dv_cols), mean)) %>%
pivot_longer(-gender, names_to = "Variable", values_to = "Mean")
ggplot(centroid_df, aes(x = Variable, y = Mean, fill = gender, group = gender)) +
geom_col(position = "dodge", alpha = 0.82, color = "white", width = 0.6) +
geom_text(aes(label = round(Mean, 1)), position = position_dodge(width = 0.6),
vjust = -0.4, size = 3.8, fontface = "bold") +
scale_fill_manual(values = c("#E74C3C", "#3498DB")) +
scale_y_continuous(limits = c(0, 80), expand = c(0, 0)) +
theme_minimal(base_size = 12) +
theme(
plot.title = element_text(hjust = 0.5, face = "bold"),
legend.position = "bottom"
) +
labs(title = "Group Centroid Comparison: Mean Scores by Gender and Subject",
x = "Dependent Variable", y = "Mean Score", fill = "Gender")# Linear discriminant axis projection for g=2
lda_result <- MASS::lda(groups ~ ., data = as.data.frame(Y))
df$LD1 <- as.numeric(predict(lda_result, newdata = as.data.frame(Y))$x)
ggplot(df, aes(x = LD1, fill = gender)) +
geom_histogram(bins = 40, alpha = 0.65, position = "identity", color = "white") +
geom_density(aes(y = after_stat(count)), alpha = 0.0,
color = "black", linewidth = 1.2, adjust = 1.5) +
scale_fill_manual(values = c("#E74C3C", "#3498DB")) +
theme_minimal(base_size = 12) +
theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
labs(title = "Discriminant Score Distribution by Gender (LD1)",
subtitle = "Overlap region indicates where classification is least certain",
x = "First Linear Discriminant Score (LD1)", y = "Count", fill = "Gender")sc1 <- ggplot(df, aes(x = `math score`, y = `reading score`, color = gender)) +
geom_point(alpha = 0.4, size = 1.2) +
stat_ellipse(level = 0.95, linewidth = 1.2) +
scale_color_manual(values = c("#E74C3C","#3498DB")) +
theme_minimal(base_size = 11) +
labs(title = "Math vs. Reading (95% confidence ellipse)")
sc2 <- ggplot(df, aes(x = `math score`, y = `writing score`, color = gender)) +
geom_point(alpha = 0.4, size = 1.2) +
stat_ellipse(level = 0.95, linewidth = 1.2) +
scale_color_manual(values = c("#E74C3C","#3498DB")) +
theme_minimal(base_size = 11) +
labs(title = "Math vs. Writing (95% confidence ellipse)")
sc3 <- ggplot(df, aes(x = `reading score`, y = `writing score`, color = gender)) +
geom_point(alpha = 0.4, size = 1.2) +
stat_ellipse(level = 0.95, linewidth = 1.2) +
scale_color_manual(values = c("#E74C3C","#3498DB")) +
theme_minimal(base_size = 11) +
labs(title = "Reading vs. Writing (95% confidence ellipse)")
grid.arrange(sc1, sc2, sc3, ncol = 3,
top = "Bivariate Score Scatterplots with 95% Concentration Ellipses by Gender")The concentration ellipses in the bivariate scatterplots provide a geometric representation of the multivariate centroid difference. In all three pairwise combinations, the female ellipse (red) is shifted notably in the positive direction along the reading and writing axes relative to the male ellipse (blue), while the male ellipse extends further in the mathematics direction. The substantial overlap between ellipses in the math-reading and math-writing planes reflects the moderate between-group separation for mathematics. In contrast, the reading-writing plane shows the most pronounced centroid displacement and least overlap, corroborating the finding that verbal ability measures drive the largest portion of the gender difference.
cat("Eigenvalue Structure of W^{-1}B:\n\n")
ev_df <- data.frame(
Eigenvalue = round(eigenvals, 6),
Canonical_R = round(sqrt(eigenvals / (1 + eigenvals)), 6),
Pct_Variance = round(eigenvals / sum(eigenvals) * 100, 2),
Cumulative_Pct = round(cumsum(eigenvals / sum(eigenvals)) * 100, 2)
)
print(ev_df)
cat("\nLDA Standardised Discriminant Coefficients (Scalings):\n")
print(round(lda_result$scaling, 4))
cat("\nGroup Centroids on LD1:\n")
ld1_centroids <- predict(lda_result)$x %>% as.data.frame() %>%
cbind(Gender = groups) %>%
group_by(Gender) %>%
summarise(Mean_LD1 = round(mean(LD1), 4))
print(ld1_centroids)## Eigenvalue Structure of W^{-1}B:
##
## Eigenvalue Canonical_R Pct_Variance Cumulative_Pct
## 1 1.303115 0.7522 100 100
##
## LDA Standardised Discriminant Coefficients (Scalings):
## LD1
## `math score` 0.1569
## `reading score` -0.0271
## `writing score` -0.1408
##
## Group Centroids on LD1:
## # A tibble: 2 × 2
## Gender Mean_LD1
## <chr> <dbl>
## 1 female -1.10
## 2 male 1.18
coef_df <- data.frame(
Variable = rownames(lda_result$scaling),
Coefficient = as.numeric(lda_result$scaling)
)
ggplot(coef_df, aes(x = reorder(Variable, abs(Coefficient)),
y = Coefficient, fill = Coefficient > 0)) +
geom_col(width = 0.5, color = "white") +
geom_hline(yintercept = 0, linewidth = 0.8, color = "#2C3E50") +
coord_flip() +
scale_fill_manual(values = c("#E74C3C", "#3498DB"),
labels = c("Negative (male-dominant)", "Positive (female-dominant)")) +
theme_minimal(base_size = 12) +
theme(legend.position = "bottom",
plot.title = element_text(hjust = 0.5, face = "bold")) +
labs(title = "Linear Discriminant Coefficients (LD1)",
subtitle = "Relative contribution of each DV to the discriminant function",
x = "", y = "Standardised Coefficient", fill = "Direction")final_tbl <- data.frame(
Component = c(
"Primary test statistic",
"Pillai's Trace value",
"Wilks' Lambda value",
"Approximate F",
"df numerator / denominator",
"p-value",
"Statistical decision",
"Multivariate eta2",
"Math score: F (eta2)",
"Reading score: F (eta2)",
"Writing score: F (eta2)",
"Bonferroni threshold",
"LD1 dominant variable"
),
Result = c(
"Pillai's Trace (primary); Wilks' Lambda (secondary)",
sprintf("V = %.6f", pillai_trace),
sprintf("Lambda = %.6f", wilks_lambda),
sprintf("F = %.4f", F_wilks),
sprintf("%d / %d", df1, df2),
formatC(p_wilks, format = "e", digits = 3),
"Reject H0 at alpha = 0.05",
sprintf("eta2 = %.4f (%.1f%% variance explained)", eta2_multi, eta2_multi * 100),
sprintf("F = %.4f (eta2 = %.4f)",
anova_tbl$F_value[1], anova_tbl$eta2[1]),
sprintf("F = %.4f (eta2 = %.4f)",
anova_tbl$F_value[2], anova_tbl$eta2[2]),
sprintf("F = %.4f (eta2 = %.4f)",
anova_tbl$F_value[3], anova_tbl$eta2[3]),
sprintf("alpha_Bonf = %.4f", alpha_bonf),
"Writing score (largest |coefficient|)"
)
)
final_tbl %>%
kable(caption = "Comprehensive MANOVA Results Summary",
col.names = c("Component", "Result")) %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = TRUE) %>%
column_spec(1, bold = TRUE, width = "18em") %>%
row_spec(0, bold = TRUE, color = "white", background = "#2C3E50") %>%
row_spec(c(1, 6, 7), background = "#EBF5FB")| Component | Result |
|---|---|
| Primary test statistic | Pillai’s Trace (primary); Wilks’ Lambda (secondary) |
| Pillai’s Trace value | V = 0.565805 |
| Wilks’ Lambda value | Lambda = 0.434195 |
| Approximate F | F = 432.6342 |
| df numerator / denominator | 3 / 996 |
| p-value | 6.997e-180 |
| Statistical decision | Reject H0 at alpha = 0.05 |
| Multivariate eta2 | eta2 = 0.5658 (56.6% variance explained) |
| Math score: F (eta2) | F = 28.9793 (eta2 = 0.0282) |
| Reading score: F (eta2) | F = 63.3506 (eta2 = 0.0597) |
| Writing score: F (eta2) | F = 99.5916 (eta2 = 0.0907) |
| Bonferroni threshold | alpha_Bonf = 0.0167 |
| LD1 dominant variable | Writing score (largest |coefficient|) |
The One-Way MANOVA produced a statistically significant multivariate effect of gender on the combined dependent variable comprising mathematics, reading, and writing scores. Pillai’s Trace was V = 0.5658, F(3, 996) = 432.6342, p < 0.001. Wilks’ Lambda was Lambda = 0.4342, corroborating the Pillai result with identical F and p values under the exact two-group transformation. All four multivariate criteria Pillai’s Trace, Wilks’ Lambda, Hotelling-Lawley Trace, and Roy’s Largest Root converged on the same decision: reject the null hypothesis that the multivariate mean vectors of female and male students are equal.
The analysis was motivated by a long-standing empirical question in educational research whether gender is associated with differential academic performance across multiple subject domains simultaneously. The dataset captures the examination results of 1,000 students across three subjects mathematics, reading, and writing alongside demographic and preparation variables. Prior univariate studies have often reported gender differences in individual subjects, but have done so without accounting for the shared covariance structure among outcomes, risking both inflated error rates and loss of analytical power.
When inspecting the group centroids at the descriptive level, an internally contradictory pattern is apparent. Female students outperform male students in reading (by approximately 7.1 points) and writing (by approximately 9.2 points), while male students record a marginally higher mean in mathematics (by approximately 5.1 points). This bidirectional pattern presents a problem for any single-outcome analysis: focusing on mathematics alone would suggest a male advantage, while focusing on verbal scores would suggest a female advantage. Neither captures the full multivariate reality of how gender relates to academic performance as a composite construct.
MANOVA resolves this tension by evaluating the significance of the entire centroid displacement simultaneously, weighting dimensions by the within-group covariance structure rather than treating all axes equally. The result Pillai’s Trace = 0.5658, p < 0.001 confirms that the joint centroid difference is not attributable to sampling variability. The multivariate effect size (eta2 = 0.5658) indicates that gender accounts for approximately 56.6% of the variance in the composite academic performance space, which falls in the small-to-medium range by Cohen’s (1988) benchmarks but is practically meaningful given the breadth of the educational construct being measured. The follow-up univariate ANOVAs clarify the anatomy of this multivariate effect: the effect of gender is statistically significant and largest for writing (F = 99.59, eta2 = 0.0907) and reading (F = 63.35, eta2 = 0.0597), while the mathematics difference though marginally significant is the smallest contributor. The linear discriminant analysis confirms this interpretation writing received the highest standardised discriminant coefficient on LD1, followed by reading, indicating that the dimension along which the two gender groups are most separated is defined primarily by verbal and written expression ability rather than quantitative reasoning.
The finding that gender produces a statistically significant and multivariate-consistent difference in academic performance carries several practical and methodological implications. From a pedagogical standpoint, the pattern female advantage in verbal domains, male parity or slight advantage in mathematics mirrors patterns reported in large-scale international assessments such as PISA and TIMSS, suggesting that this dataset, though simulated, replicates a substantive empirical regularity. Interventions aimed at closing gender-based performance gaps would therefore need to be subject-specific rather than general reading and writing curricula may benefit from strategies that engage male students more effectively, while mathematics instruction may benefit from approaches that reduce stereotype threat for female students. Methodologically, the MANOVA framework demonstrates its value here precisely because the bidirectional nature of the gender effect would be obscured and the familywise error rate inflated by conducting three separate ANOVAs. The unified multivariate test provides a single, coherent, and efficient answer: gender is associated with a meaningful and statistically reliable difference in the profile of academic performance across all three subjects simultaneously.
The One-Way MANOVA confirmed a statistically significant effect of gender on the multivariate centroid of academic performance, F(3, 996) = 432.6342, p < 0.001, Pillai’s V = 0.5658, eta2 = 0.5658. The null hypothesis that the mean vectors of female and male students are equal across all three outcome dimensions is rejected with strong evidence. Follow-up univariate analyses under Bonferroni correction identified statistically significant group differences for reading and writing scores, with writing exhibiting the largest effect size. Mathematics showed a smaller and less pronounced group difference. The discriminant function analysis further established that the primary axis separating the two gender groups is dominated by verbal performance measures. These results provide a statistically rigorous, multivariate-coherent foundation for subsequent analyses using ANCOVA and MANCOVA, in which the covariate of test preparation course will be incorporated to assess whether the gender effect on academic performance persists after controlling for pre-examination preparation behaviour.
Date: April 16, 2026
S1 Data Science | FMIPA UNESA | 2026