1 Introduction

1.1 Research Context

This document presents the full execution of a One-Way Multivariate Analysis of Variance (MANOVA) on the Students Performance in Exams dataset. The analysis directly follows the assumption testing phase, in which all five prerequisites were verified and confirmed to be satisfied prior to conducting the inferential procedure. The central question driving this analysis is whether gender as an independent grouping variable produces a statistically discernible difference in the centroid that is, the multivariate mean vector of three academic outcome variables measured simultaneously mathematics score, reading score, and writing score.

MANOVA is selected over a series of independent ANOVAs for two compounding reasons. First, the Bartlett Sphericity Test confirmed that the three dependent variables are significantly intercorrelated (Pearson r ranging from 0.818 to 0.954), which means that treating them as isolated outcomes would ignore meaningful shared variance and inflate the familywise Type I error rate. Second, the Box’s M Test confirmed homogeneity of the variance-covariance matrices across gender groups, validating the pooled within-group covariance structure that MANOVA relies upon. Together, these conditions position MANOVA as both methodologically appropriate and statistically efficient.

1.2 Variable Structure

The analysis follows a one-way, between-subjects design with the following configuration:

Independent Variable (IV): gender, a nominal factor with two levels female (n = 518) and male (n = 482).
Dependent Variables (DVs): math score, reading score, and writing score each a continuous integer-scale variable representing standardised examination performance.
Unit of Analysis: Individual students, each observed once and independently.

2 Data Preparation

2.1 Loading and Encoding

df <- read.csv("StudentsPerformance.csv", check.names = FALSE)

dv_cols <- c("math score", "reading score", "writing score")
iv_col  <- "gender"

Y <- df[, dv_cols]
groups <- df[[iv_col]]

cat("Dataset dimensions:", nrow(df), "rows x", ncol(df), "columns\n")
cat("Factor levels:", paste(levels(factor(groups)), collapse = ", "), "\n")
cat("Group sizes:\n")
print(table(groups))

## Dataset dimensions: 1000 rows x 8 columns
## Factor levels: female, male 
## Group sizes:
## groups
## female   male 
##    518    482

2.2 Descriptive Statistics by Group

desc_group <- df %>%
  group_by(gender) %>%
  summarise(
    n_math       = n(),
    mean_math    = round(mean(`math score`), 3),
    sd_math      = round(sd(`math score`), 3),
    mean_reading = round(mean(`reading score`), 3),
    sd_reading   = round(sd(`reading score`), 3),
    mean_writing = round(mean(`writing score`), 3),
    sd_writing   = round(sd(`writing score`), 3)
  )

desc_group %>%
  kable(caption = "Group Centroids and Standard Deviations by Gender") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#34495E")

Group Centroids and Standard Deviations by Gender
gender	n_math	mean_math	sd_math	mean_reading	sd_reading	mean_writing	sd_writing
female	518	63.633	15.491	72.608	14.378	72.467	14.845
male	482	68.728	14.356	65.473	13.932	63.311	14.114

desc_full <- data.frame(
  Variable = dv_cols,
  N_Female = sapply(dv_cols, function(v) sum(!is.na(df[[v]][df$gender == "female"]))),
  Mean_Female = round(sapply(dv_cols, function(v) mean(df[[v]][df$gender == "female"])), 3),
  SD_Female   = round(sapply(dv_cols, function(v) sd(df[[v]][df$gender == "female"])), 3),
  N_Male   = sapply(dv_cols, function(v) sum(!is.na(df[[v]][df$gender == "male"]))),
  Mean_Male   = round(sapply(dv_cols, function(v) mean(df[[v]][df$gender == "male"])), 3),
  SD_Male     = round(sapply(dv_cols, function(v) sd(df[[v]][df$gender == "male"])), 3),
  Mean_Diff   = round(sapply(dv_cols, function(v)
    mean(df[[v]][df$gender == "female"]) - mean(df[[v]][df$gender == "male"])), 3)
)

desc_full %>%
  kable(caption = "Detailed Descriptive Statistics: Female vs. Male across All DVs") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = TRUE) %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(8, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#34495E")

Detailed Descriptive Statistics: Female vs. Male across All DVs
	Variable	N_Female	Mean_Female	SD_Female	N_Male	Mean_Male	SD_Male	Mean_Diff
math score	math score	518	63.633	15.491	482	68.728	14.356	-5.095
reading score	reading score	518	72.608	14.378	482	65.473	13.932	7.135
writing score	writing score	518	72.467	14.845	482	63.311	14.114	9.156

The descriptive comparison reveals a discernible pattern at the group centroid level. Female students demonstrate higher mean scores in reading (mean = 72.61) and writing (mean = 72.47) relative to male students (reading mean = 65.47, writing mean = 63.31). In contrast, male students record a marginally higher mean in mathematics (male mean = 68.73 vs. female mean = 63.63). Whether these observed differences in the multivariate centroid are statistically significant beyond sampling variability is precisely the question that MANOVA is designed to address.

3 Theoretical Framework of MANOVA

3.1 The MANOVA Model

One-Way MANOVA extends the univariate ANOVA to a setting with \(p\) dependent variables measured simultaneously. Each observation vector is decomposed as:

\[\mathbf{X}_{\ell j} = \boldsymbol{\mu} + \boldsymbol{\tau}_\ell + \mathbf{e}_{\ell j}, \quad j = 1, 2, \ldots, n_\ell, \quad \ell = 1, 2, \ldots, g\]

where \(\boldsymbol{\mu}\) is the \(p \times 1\) grand mean vector, \(\boldsymbol{\tau}_\ell\) is the \(p \times 1\) treatment effect vector for group \(\ell\), and \(\mathbf{e}_{\ell j} \sim N_p(\mathbf{0}, \boldsymbol{\Sigma})\) is the error vector.

The total sum-of-squares-and-cross-products (SSCP) matrix decomposes as:

\[\mathbf{T} = \mathbf{B} + \mathbf{W}\]

where:

Between-group SSCP matrix (B): \(\mathbf{B} = \sum_{\ell=1}^{g} n_\ell (\bar{\mathbf{x}}_\ell - \bar{\mathbf{x}})(\bar{\mathbf{x}}_\ell - \bar{\mathbf{x}})'\), with \(df_B = g - 1\)
Within-group SSCP matrix (W): \(\mathbf{W} = \sum_{\ell=1}^{g}\sum_{j=1}^{n_\ell}(\mathbf{x}_{\ell j} - \bar{\mathbf{x}}_\ell)(\mathbf{x}_{\ell j} - \bar{\mathbf{x}}_\ell)'\), with \(df_W = \sum n_\ell - g\)

3.2 Hypotheses

Null hypothesis: \[H_0: \boldsymbol{\tau}_1 = \boldsymbol{\tau}_2 = \cdots = \boldsymbol{\tau}_g = \mathbf{0}\]

Equivalently, the mean vectors are equal across all groups: \[H_0: \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \cdots = \boldsymbol{\mu}_g\]

Alternative hypothesis: \[H_1: \text{At least one } \boldsymbol{\tau}_\ell \neq \mathbf{0}\]

In the specific context of this analysis with \(g = 2\) groups: \[H_0: \boldsymbol{\mu}_{\text{female}} = \boldsymbol{\mu}_{\text{male}}\] \[H_1: \boldsymbol{\mu}_{\text{female}} \neq \boldsymbol{\mu}_{\text{male}}\]

3.3 Multivariate Test Statistics

Four multivariate test statistics are routinely computed from the eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_s\) of the matrix \(\mathbf{W}^{-1}\mathbf{B}\), where \(s = \min(p, g-1)\):

Statistic	Formula	Interpretation
Wilks’ Lambda	\(\Lambda = \prod_{i=1}^{s} \frac{1}{1+\lambda_i} = \frac{\|\mathbf{W}\|}{\|\mathbf{B}+\mathbf{W}\|}\)	Ratio of within to total variance; smaller values indicate stronger group separation
Pillai’s Trace	\(V = \sum_{i=1}^{s} \frac{\lambda_i}{1+\lambda_i}\)	Sum of explained variance proportions; most robust statistic
Hotelling-Lawley Trace	\(T = \sum_{i=1}^{s} \lambda_i\)	Sum of eigenvalues; powerful when assumptions are met
Roy’s Largest Root	\(\theta = \frac{\lambda_1}{1+\lambda_1}\)	Based on the single dominant dimension; liberal under violations

Since \(g = 2\) (two groups) and therefore \(s = 1\), all four statistics are exact and yield identical F-ratios. Pillai’s Trace is emphasised as the primary criterion given its documented robustness under moderate departures from multivariate normality (Olson, 1974).

4 SSCP Matrix Computation

4.1 Grand Mean and Group Centroids

grand_mean  <- colMeans(Y)
group_means <- aggregate(Y, by = list(Gender = groups), FUN = mean)

cat("Grand Mean Vector:\n")
print(round(grand_mean, 4))

cat("\nGroup Centroid Vectors:\n")
group_means_print <- group_means
group_means_print[, -1] <- round(group_means[, -1], 4)
print(group_means_print)

cat("\nDifference in Centroids (Female - Male):\n")
fem_mean  <- colMeans(Y[groups == "female", ])
male_mean <- colMeans(Y[groups == "male", ])
print(round(fem_mean - male_mean, 4))

## Grand Mean Vector:
##    math score reading score writing score 
##        66.089        69.169        68.054 
## 
## Group Centroid Vectors:
##   Gender math score reading score writing score
## 1 female    63.6332       72.6081       72.4672
## 2   male    68.7282       65.4730       63.3112
## 
## Difference in Centroids (Female - Male):
##    math score reading score writing score 
##       -5.0950        7.1351        9.1560

4.2 Between-Group SSCP Matrix (B)

g       <- length(unique(groups))
n_total <- nrow(Y)
p       <- ncol(Y)

B <- Reduce("+", lapply(unique(groups), function(grp) {
  idx  <- groups == grp
  n_g  <- sum(idx)
  diff <- colMeans(Y[idx, ]) - grand_mean
  n_g  * outer(diff, diff)
}))

rownames(B) <- colnames(B) <- dv_cols

cat("Between-Group SSCP Matrix (B), df =", g - 1, "\n\n")
print(round(B, 4))
cat("\nDeterminant |B| =", round(det(B), 4), "\n")

## Between-Group SSCP Matrix (B), df = 1 
## 
##               math score reading score writing score
## math score      6481.374     -9076.548     -11647.34
## reading score  -9076.548     12710.844      16310.99
## writing score -11647.338     16310.990      20930.82
## 
## Determinant |B| = 0

4.3 Within-Group SSCP Matrix (W)

W <- Reduce("+", lapply(unique(groups), function(grp) {
  Y_g   <- Y[groups == grp, ]
  mu_g  <- colMeans(Y_g)
  swept <- sweep(Y_g, 2, mu_g)
  t(as.matrix(swept)) %*% as.matrix(swept)
}))

rownames(W) <- colnames(W) <- dv_cols
df_W <- n_total - g

cat("Within-Group SSCP Matrix (W), df =", df_W, "\n\n")
print(round(W, 4))
cat("\nDeterminant |W| =", round(det(W), 4), "\n")

## Within-Group SSCP Matrix (W), df = 998 
## 
##               math score reading score writing score
## math score      223207.7      189894.5      196401.5
## reading score   189894.5      200241.6      195263.9
## writing score   196401.5      195263.9      209746.3
## 
## Determinant |W| = 1.41738e+14

4.4 Total SSCP Matrix (T = B + W)

T_mat <- B + W
df_T  <- n_total - 1

cat("Total SSCP Matrix (T = B + W), df =", df_T, "\n\n")
print(round(T_mat, 4))
cat("\nDeterminant |T| =", round(det(T_mat), 4), "\n")

## Total SSCP Matrix (T = B + W), df = 999 
## 
##               math score reading score writing score
## math score      229689.1      180818.0      184754.2
## reading score   180818.0      212952.4      211574.9
## writing score   184754.2      211574.9      230677.1
## 
## Determinant |T| = 3.264389e+14

4.5 SSCP Heatmap Visualisation

plot_sscp <- function(mat, title, palette_high) {
  mat_df <- as.data.frame(as.table(mat))
  colnames(mat_df) <- c("Row", "Col", "Value")
  ggplot(mat_df, aes(x = Col, y = Row, fill = Value)) +
    geom_tile(color = "white", linewidth = 0.8) +
    geom_text(aes(label = round(Value, 1)), size = 3.8, fontface = "bold") +
    scale_fill_gradient2(low = "#EBF5FB", mid = "#AED6F1",
                         high = palette_high,
                         midpoint = median(mat_df$Value)) +
    theme_minimal(base_size = 11) +
    theme(axis.text.x  = element_text(angle = 25, hjust = 1),
          plot.title    = element_text(hjust = 0.5, face = "bold", size = 12),
          legend.position = "right") +
    labs(title = title, x = "", y = "", fill = "Value")
}

grid.arrange(
  plot_sscp(B,     "Between-Group SSCP (B)", "#1A5276"),
  plot_sscp(W,     "Within-Group SSCP (W)",  "#1E8449"),
  plot_sscp(T_mat, "Total SSCP (T = B + W)", "#6E2F6E"),
  ncol = 3
)

The SSCP matrices reveal the core mechanics of the MANOVA decomposition. The between-group matrix B captures variance attributable to group membership the diagonal entries reflect sum-of-squared deviations of group centroids from the grand mean, while off-diagonal entries encode covariation between DVs across groups. The within-group matrix W absorbs residual variance unexplained by group membership; its diagonal entries are substantially larger than those in B, which is typical for individual-level educational data characterised by broad score distributions. The key inferential question is whether the signal in B is large relative to the noise in W formally assessed through the eigenstructure of W\(^{-1}\)B.

5 MANOVA Execution

5.1 Manual Wilks’ Lambda Computation

WinvB     <- solve(W) %*% B
eigenvals <- Re(eigen(WinvB)$values)
eigenvals <- eigenvals[eigenvals > 1e-10]

wilks_lambda <- prod(1 / (1 + eigenvals))
det_ratio    <- det(W) / det(T_mat)

pillai_trace <- sum(eigenvals / (1 + eigenvals))
hotelling_T  <- sum(eigenvals)
roy_root     <- max(eigenvals) / (1 + max(eigenvals))

cat("Eigenvalues of W^{-1}B:\n")
print(round(eigenvals, 6))

cat("\n--- Manual Multivariate Statistics ---\n")
cat(sprintf("Wilks' Lambda (det ratio)   : %.6f\n", det_ratio))
cat(sprintf("Wilks' Lambda (eigenvalues) : %.6f\n", wilks_lambda))
cat(sprintf("Pillai's Trace              : %.6f\n", pillai_trace))
cat(sprintf("Hotelling-Lawley Trace      : %.6f\n", hotelling_T))
cat(sprintf("Roy's Largest Root          : %.6f\n", roy_root))

## Eigenvalues of W^{-1}B:
## [1] 1.303115
## 
## --- Manual Multivariate Statistics ---
## Wilks' Lambda (det ratio)   : 0.434195
## Wilks' Lambda (eigenvalues) : 0.434195
## Pillai's Trace              : 0.565805
## Hotelling-Lawley Trace      : 1.303115
## Roy's Largest Root          : 0.565805

5.2 F-Ratio Transformation

# Exact F transformation for g=2 (s=1), all four statistics identical
# F = ((1 - Lambda) / Lambda) * ((df_W - p + 1) / p)
# df1 = p, df2 = df_W - p + 1

df1      <- p
df2      <- df_W - p + 1
F_wilks  <- ((1 - wilks_lambda) / wilks_lambda) * (df2 / df1)
p_wilks  <- pf(F_wilks, df1, df2, lower.tail = FALSE)

cat("--- F-Ratio Transformation (Wilks' Lambda, g=2 exact) ---\n")
cat(sprintf("df numerator (p)         : %d\n",    df1))
cat(sprintf("df denominator (df_W-p+1): %d\n",    df2))
cat(sprintf("F statistic              : %.4f\n",   F_wilks))
cat(sprintf("p-value                  : %.2e\n",   p_wilks))
cat(sprintf("Decision                 : %s\n",
            ifelse(p_wilks < 0.05,
                   "Reject H0 — significant multivariate group difference",
                   "Fail to reject H0")))

## --- F-Ratio Transformation (Wilks' Lambda, g=2 exact) ---
## df numerator (p)         : 3
## df denominator (df_W-p+1): 996
## F statistic              : 432.6342
## p-value                  : 7.00e-180
## Decision                 : Reject H0 — significant multivariate group difference

5.3 R Built-in MANOVA

manova_model <- manova(
  cbind(`math score`, `reading score`, `writing score`) ~ gender,
  data = df
)

cat("--- MANOVA Summary: All Four Multivariate Tests ---\n\n")
for (test_name in c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")) {
  cat(sprintf("[ %s ]\n", test_name))
  print(summary(manova_model, test = test_name))
  cat("\n")
}

## --- MANOVA Summary: All Four Multivariate Tests ---
## 
## [ Pillai ]
##            Df  Pillai approx F num Df den Df    Pr(>F)    
## gender      1 0.56581   432.63      3    996 < 2.2e-16 ***
## Residuals 998                                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## [ Wilks ]
##            Df   Wilks approx F num Df den Df    Pr(>F)    
## gender      1 0.43419   432.63      3    996 < 2.2e-16 ***
## Residuals 998                                             
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## [ Hotelling-Lawley ]
##            Df Hotelling-Lawley approx F num Df den Df    Pr(>F)    
## gender      1           1.3031   432.63      3    996 < 2.2e-16 ***
## Residuals 998                                                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## [ Roy ]
##            Df    Roy approx F num Df den Df    Pr(>F)    
## gender      1 1.3031   432.63      3    996 < 2.2e-16 ***
## Residuals 998                                            
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.4 Consolidated Results Table

manova_results <- data.frame(
  Statistic     = c("Pillai's Trace", "Wilks' Lambda",
                    "Hotelling-Lawley Trace", "Roy's Largest Root"),
  Value         = round(c(pillai_trace, wilks_lambda, hotelling_T, roy_root), 6),
  Approx_F      = round(c(
    summary(manova_model, test = "Pillai")$stats[1, "approx F"],
    summary(manova_model, test = "Wilks")$stats[1, "approx F"],
    summary(manova_model, test = "Hotelling-Lawley")$stats[1, "approx F"],
    summary(manova_model, test = "Roy")$stats[1, "approx F"]
  ), 4),
  df1 = c(
    summary(manova_model, test = "Pillai")$stats[1, "num Df"],
    summary(manova_model, test = "Wilks")$stats[1, "num Df"],
    summary(manova_model, test = "Hotelling-Lawley")$stats[1, "num Df"],
    summary(manova_model, test = "Roy")$stats[1, "num Df"]
  ),
  df2 = c(
    summary(manova_model, test = "Pillai")$stats[1, "den Df"],
    summary(manova_model, test = "Wilks")$stats[1, "den Df"],
    summary(manova_model, test = "Hotelling-Lawley")$stats[1, "den Df"],
    summary(manova_model, test = "Roy")$stats[1, "den Df"]
  ),
  p_value = formatC(c(
    summary(manova_model, test = "Pillai")$stats[1, "Pr(>F)"],
    summary(manova_model, test = "Wilks")$stats[1, "Pr(>F)"],
    summary(manova_model, test = "Hotelling-Lawley")$stats[1, "Pr(>F)"],
    summary(manova_model, test = "Roy")$stats[1, "Pr(>F)"]
  ), format = "e", digits = 3),
  Decision = rep("Reject H0", 4)
)

manova_results %>%
  kable(caption = "One-Way MANOVA: Multivariate Test Statistics — Effect of Gender on Academic Performance",
        col.names = c("Test Statistic", "Value", "Approx. F",
                      "df (num)", "df (den)", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE) %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(7, bold = TRUE, color = "#C0392B") %>%
  row_spec(0, bold = TRUE, color = "white", background = "#2C3E50") %>%
  row_spec(1, background = "#D5F5E3")   # highlight Pillai as primary

One-Way MANOVA: Multivariate Test Statistics — Effect of Gender on Academic Performance
Test Statistic	Value	Approx. F	df (num)	df (den)	p-value	Decision
Pillai’s Trace	0.565805	432.6342	3	996	6.997e-180	Reject H0
Wilks’ Lambda	0.434195	432.6342	3	996	6.997e-180	Reject H0
Hotelling-Lawley Trace	1.303115	432.6342	3	996	6.997e-180	Reject H0
Roy’s Largest Root	0.565805	432.6342	3	996	6.997e-180	Reject H0

6 Effect Size

6.1 Partial Eta-Squared

# Multivariate eta-squared from Wilks' Lambda
# eta2_multivariate = 1 - Lambda^(1/s),  s = min(p, g-1)
s <- min(p, g - 1)
eta2_multi <- 1 - wilks_lambda^(1 / s)

# Univariate partial eta-squared per DV (from follow-up ANOVA)
uni_anova  <- summary.aov(manova_model)
eta2_uni <- sapply(seq_along(dv_cols), function(i) {
  ss_b <- uni_anova[[i]][["Sum Sq"]][1]
  ss_e <- uni_anova[[i]][["Sum Sq"]][2]
  ss_b / (ss_b + ss_e)
})
names(eta2_uni) <- dv_cols

cat("Multivariate Eta-Squared (1 - Lambda^(1/s)):\n")
cat(sprintf("  eta2 = %.4f  (%.1f%% of multivariate variance explained by gender)\n\n",
            eta2_multi, eta2_multi * 100))

cat("Univariate Partial Eta-Squared per DV:\n")
for (i in seq_along(dv_cols)) {
  magnitude <- ifelse(eta2_uni[i] >= 0.14, "large",
               ifelse(eta2_uni[i] >= 0.06, "medium", "small"))
  cat(sprintf("  %-20s: eta2 = %.4f  (%s effect, Cohen 1988)\n",
              dv_cols[i], eta2_uni[i], magnitude))
}

## Multivariate Eta-Squared (1 - Lambda^(1/s)):
##   eta2 = 0.5658  (56.6% of multivariate variance explained by gender)
## 
## Univariate Partial Eta-Squared per DV:
##   math score          : eta2 = 0.0282  (small effect, Cohen 1988)
##   reading score       : eta2 = 0.0597  (small effect, Cohen 1988)
##   writing score       : eta2 = 0.0907  (medium effect, Cohen 1988)

effect_tbl <- data.frame(
  Variable      = c("Multivariate (overall)", dv_cols),
  Partial_eta2  = round(c(eta2_multi, eta2_uni), 4),
  Pct_Explained = paste0(round(c(eta2_multi, eta2_uni) * 100, 2), "%"),
  Magnitude     = c(
    ifelse(eta2_multi >= 0.14, "Large", ifelse(eta2_multi >= 0.06, "Medium", "Small")),
    sapply(eta2_uni, function(x) ifelse(x >= 0.14, "Large",
                                        ifelse(x >= 0.06, "Medium", "Small")))
  )
)

effect_tbl %>%
  kable(caption = "Effect Size: Partial Eta-Squared (Cohen, 1988 benchmarks: small = 0.01, medium = 0.06, large = 0.14)",
        col.names = c("Variable", "Partial eta2", "Variance Explained", "Magnitude")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(4, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#34495E") %>%
  row_spec(1, background = "#EBF5FB", bold = TRUE)

Effect Size: Partial Eta-Squared (Cohen, 1988 benchmarks: small = 0.01, medium = 0.06, large = 0.14)
	Variable	Partial eta2	Variance Explained	Magnitude
	Multivariate (overall)	0.5658	56.58%	Large
math score	math score	0.0282	2.82%	Small
reading score	reading score	0.0597	5.97%	Small
writing score	writing score	0.0907	9.07%	Medium

6.2 Effect Size Visualisation

eff_df <- data.frame(
  Variable  = effect_tbl$Variable[-1],
  eta2      = effect_tbl$Partial_eta2[-1]
)

ggplot(eff_df, aes(x = reorder(Variable, eta2), y = eta2, fill = eta2)) +
  geom_col(width = 0.55, color = "white") +
  geom_text(aes(label = sprintf("%.4f", eta2)), hjust = -0.15,
            size = 4, fontface = "bold") +
  geom_hline(yintercept = 0.01, linetype = "dotted", color = "#E74C3C", linewidth = 0.8) +
  geom_hline(yintercept = 0.06, linetype = "dashed", color = "#E67E22", linewidth = 0.8) +
  geom_hline(yintercept = 0.14, linetype = "dashed", color = "#27AE60", linewidth = 0.8) +
  annotate("text", x = 0.6, y = 0.012, label = "small (0.01)",  size = 3, color = "#E74C3C") +
  annotate("text", x = 0.6, y = 0.062, label = "medium (0.06)", size = 3, color = "#E67E22") +
  annotate("text", x = 0.6, y = 0.142, label = "large (0.14)",  size = 3, color = "#27AE60") +
  coord_flip() +
  scale_fill_gradient(low = "#AED6F1", high = "#1A5276") +
  scale_y_continuous(limits = c(0, 0.22), expand = c(0, 0)) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = 0.5, face = "bold")) +
  labs(title = "Partial Eta-Squared by Dependent Variable",
       x = "", y = "Partial eta2")

7 Follow-Up Univariate ANOVAs

7.1 Rationale for Follow-Up Tests

When MANOVA yields a significant multivariate result, follow-up univariate ANOVAs are conducted to identify which specific dependent variables contributed to the overall multivariate group separation. These tests are conducted with a Bonferroni-adjusted significance threshold of \(\alpha_{\text{adj}} = 0.05 / 3 = 0.0167\) to control the familywise error rate across the three simultaneous comparisons.

7.2 Univariate ANOVA Results

alpha_bonf <- 0.05 / length(dv_cols)

cat(sprintf("Bonferroni-adjusted significance threshold: alpha = %.4f\n\n", alpha_bonf))

anova_results_list <- lapply(dv_cols, function(v) {
  model   <- aov(reformulate(iv_col, response = paste0("`", v, "`")), data = df)
  anova_t <- summary(model)[[1]]
  F_val   <- anova_t["gender", "F value"]
  p_val   <- anova_t["gender", "Pr(>F)"]
  SS_B    <- anova_t["gender", "Sum Sq"]
  SS_E    <- anova_t["Residuals", "Sum Sq"]
  df_B    <- anova_t["gender", "Df"]
  df_E    <- anova_t["Residuals", "Df"]
  eta2    <- SS_B / (SS_B + SS_E)

  data.frame(
    DV          = v,
    SS_Between  = round(SS_B, 3),
    SS_Within   = round(SS_E, 3),
    df_B        = df_B,
    df_W        = df_E,
    MS_Between  = round(SS_B / df_B, 3),
    MS_Within   = round(SS_E / df_E, 3),
    F_value     = round(F_val, 4),
    p_value     = round(p_val, 6),
    eta2        = round(eta2, 4),
    Significant = ifelse(p_val < alpha_bonf, "Yes", "No")
  )
})

anova_tbl <- do.call(rbind, anova_results_list)

anova_tbl %>%
  kable(caption = sprintf(
    "Follow-Up Univariate ANOVAs — Effect of Gender on Each DV (Bonferroni alpha = %.4f)",
    alpha_bonf),
    col.names = c("DV", "SS Between", "SS Within", "df B", "df W",
                  "MS Between", "MS Within", "F", "p-value", "eta2", "Significant")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE) %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(9, bold = TRUE) %>%
  column_spec(11, bold = TRUE,
              color = ifelse(anova_tbl$Significant == "Yes", "#1E8449", "#C0392B")) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#34495E")

## Bonferroni-adjusted significance threshold: alpha = 0.0167

Follow-Up Univariate ANOVAs — Effect of Gender on Each DV (Bonferroni alpha = 0.0167)
DV	SS Between	SS Within	df B	df W	MS Between	MS Within	F	eta2	Significant
math score	6481.374	223207.7	1	998	6481.374	223.655	28.9793	0.0282	Yes
reading score	12710.844	200241.6	1	998	12710.844	200.643	63.3506	0.0597	Yes
writing score	20930.822	209746.3	1	998	20930.822	210.167	99.5916	0.0907	Yes

7.3 Mean Difference and Confidence Intervals per DV

ci_df <- do.call(rbind, lapply(dv_cols, function(v) {
  fem  <- df[[v]][df[[iv_col]] == "female"]
  mal  <- df[[v]][df[[iv_col]] == "male"]
  tt   <- t.test(fem, mal, var.equal = TRUE)
  data.frame(
    DV        = v,
    Group     = c("Female", "Male"),
    Mean      = c(mean(fem), mean(mal)),
    Lower_CI  = c(mean(fem) - qt(0.975, df_W) * sd(fem)/sqrt(length(fem)),
                  mean(mal) - qt(0.975, df_W) * sd(mal)/sqrt(length(mal))),
    Upper_CI  = c(mean(fem) + qt(0.975, df_W) * sd(fem)/sqrt(length(fem)),
                  mean(mal) + qt(0.975, df_W) * sd(mal)/sqrt(length(mal)))
  )
}))

ggplot(ci_df, aes(x = Group, y = Mean, color = Group, group = DV)) +
  geom_point(size = 4, shape = 18) +
  geom_errorbar(aes(ymin = Lower_CI, ymax = Upper_CI), width = 0.18, linewidth = 1) +
  geom_line(aes(group = DV), color = "grey60", linetype = "dashed", linewidth = 0.7) +
  scale_color_manual(values = c("#E74C3C", "#3498DB")) +
  facet_wrap(~DV, ncol = 3) +
  theme_minimal(base_size = 12) +
  theme(
    strip.background = element_rect(fill = "#34495E"),
    strip.text       = element_text(color = "white", face = "bold"),
    legend.position  = "bottom",
    plot.title       = element_text(hjust = 0.5, face = "bold", size = 14)
  ) +
  labs(title = "Group Means with 95% Confidence Intervals per Dependent Variable",
       x = "Gender", y = "Mean Score", color = "Gender")

8 Visualisation of Group Separation

8.1 Boxplot Distribution by Gender

df_long <- df %>%
  pivot_longer(cols = all_of(dv_cols), names_to = "Variable", values_to = "Score") %>%
  mutate(Variable = factor(Variable, levels = dv_cols))

ggplot(df_long, aes(x = gender, y = Score, fill = gender)) +
  geom_boxplot(alpha = 0.72, outlier.shape = 21, outlier.size = 1.4,
               outlier.alpha = 0.5) +
  geom_jitter(width = 0.16, alpha = 0.18, size = 0.7) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 4,
               fill = "white", color = "#2C3E50") +
  scale_fill_manual(values = c("#E74C3C", "#3498DB")) +
  facet_wrap(~Variable, ncol = 3) +
  theme_minimal(base_size = 12) +
  theme(
    strip.background = element_rect(fill = "#2C3E50"),
    strip.text       = element_text(color = "white", face = "bold"),
    legend.position  = "bottom",
    plot.title       = element_text(hjust = 0.5, face = "bold", size = 14)
  ) +
  labs(title = "Score Distribution by Gender (Diamond = Group Mean)",
       x = "Gender", y = "Score", fill = "Gender")

8.2 Centroid Comparison (Radar / Spider Plot)

centroid_df <- df %>%
  group_by(gender) %>%
  summarise(across(all_of(dv_cols), mean)) %>%
  pivot_longer(-gender, names_to = "Variable", values_to = "Mean")

ggplot(centroid_df, aes(x = Variable, y = Mean, fill = gender, group = gender)) +
  geom_col(position = "dodge", alpha = 0.82, color = "white", width = 0.6) +
  geom_text(aes(label = round(Mean, 1)), position = position_dodge(width = 0.6),
            vjust = -0.4, size = 3.8, fontface = "bold") +
  scale_fill_manual(values = c("#E74C3C", "#3498DB")) +
  scale_y_continuous(limits = c(0, 80), expand = c(0, 0)) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title      = element_text(hjust = 0.5, face = "bold"),
    legend.position = "bottom"
  ) +
  labs(title = "Group Centroid Comparison: Mean Scores by Gender and Subject",
       x = "Dependent Variable", y = "Mean Score", fill = "Gender")

8.3 Discriminant Function Visualisation

# Linear discriminant axis projection for g=2
lda_result <- MASS::lda(groups ~ ., data = as.data.frame(Y))
df$LD1     <- as.numeric(predict(lda_result, newdata = as.data.frame(Y))$x)

ggplot(df, aes(x = LD1, fill = gender)) +
  geom_histogram(bins = 40, alpha = 0.65, position = "identity", color = "white") +
  geom_density(aes(y = after_stat(count)), alpha = 0.0,
               color = "black", linewidth = 1.2, adjust = 1.5) +
  scale_fill_manual(values = c("#E74C3C", "#3498DB")) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  labs(title = "Discriminant Score Distribution by Gender (LD1)",
       subtitle = "Overlap region indicates where classification is least certain",
       x = "First Linear Discriminant Score (LD1)", y = "Count", fill = "Gender")

8.4 Bivariate Score Scatterplots

sc1 <- ggplot(df, aes(x = `math score`, y = `reading score`, color = gender)) +
  geom_point(alpha = 0.4, size = 1.2) +
  stat_ellipse(level = 0.95, linewidth = 1.2) +
  scale_color_manual(values = c("#E74C3C","#3498DB")) +
  theme_minimal(base_size = 11) +
  labs(title = "Math vs. Reading (95% confidence ellipse)")

sc2 <- ggplot(df, aes(x = `math score`, y = `writing score`, color = gender)) +
  geom_point(alpha = 0.4, size = 1.2) +
  stat_ellipse(level = 0.95, linewidth = 1.2) +
  scale_color_manual(values = c("#E74C3C","#3498DB")) +
  theme_minimal(base_size = 11) +
  labs(title = "Math vs. Writing (95% confidence ellipse)")

sc3 <- ggplot(df, aes(x = `reading score`, y = `writing score`, color = gender)) +
  geom_point(alpha = 0.4, size = 1.2) +
  stat_ellipse(level = 0.95, linewidth = 1.2) +
  scale_color_manual(values = c("#E74C3C","#3498DB")) +
  theme_minimal(base_size = 11) +
  labs(title = "Reading vs. Writing (95% confidence ellipse)")

grid.arrange(sc1, sc2, sc3, ncol = 3,
             top = "Bivariate Score Scatterplots with 95% Concentration Ellipses by Gender")

The concentration ellipses in the bivariate scatterplots provide a geometric representation of the multivariate centroid difference. In all three pairwise combinations, the female ellipse (red) is shifted notably in the positive direction along the reading and writing axes relative to the male ellipse (blue), while the male ellipse extends further in the mathematics direction. The substantial overlap between ellipses in the math-reading and math-writing planes reflects the moderate between-group separation for mathematics. In contrast, the reading-writing plane shows the most pronounced centroid displacement and least overlap, corroborating the finding that verbal ability measures drive the largest portion of the gender difference.

9 Discriminant Analysis of the MANOVA Structure

9.1 Eigenvalues and Discriminant Coefficients

cat("Eigenvalue Structure of W^{-1}B:\n\n")
ev_df <- data.frame(
  Eigenvalue          = round(eigenvals, 6),
  Canonical_R         = round(sqrt(eigenvals / (1 + eigenvals)), 6),
  Pct_Variance        = round(eigenvals / sum(eigenvals) * 100, 2),
  Cumulative_Pct      = round(cumsum(eigenvals / sum(eigenvals)) * 100, 2)
)
print(ev_df)

cat("\nLDA Standardised Discriminant Coefficients (Scalings):\n")
print(round(lda_result$scaling, 4))

cat("\nGroup Centroids on LD1:\n")
ld1_centroids <- predict(lda_result)$x %>% as.data.frame() %>%
  cbind(Gender = groups) %>%
  group_by(Gender) %>%
  summarise(Mean_LD1 = round(mean(LD1), 4))
print(ld1_centroids)

## Eigenvalue Structure of W^{-1}B:
## 
##   Eigenvalue Canonical_R Pct_Variance Cumulative_Pct
## 1   1.303115      0.7522          100            100
## 
## LDA Standardised Discriminant Coefficients (Scalings):
##                     LD1
## `math score`     0.1569
## `reading score` -0.0271
## `writing score` -0.1408
## 
## Group Centroids on LD1:
## # A tibble: 2 × 2
##   Gender Mean_LD1
##   <chr>     <dbl>
## 1 female    -1.10
## 2 male       1.18

coef_df <- data.frame(
  Variable    = rownames(lda_result$scaling),
  Coefficient = as.numeric(lda_result$scaling)
)

ggplot(coef_df, aes(x = reorder(Variable, abs(Coefficient)),
                    y = Coefficient, fill = Coefficient > 0)) +
  geom_col(width = 0.5, color = "white") +
  geom_hline(yintercept = 0, linewidth = 0.8, color = "#2C3E50") +
  coord_flip() +
  scale_fill_manual(values = c("#E74C3C", "#3498DB"),
                    labels = c("Negative (male-dominant)", "Positive (female-dominant)")) +
  theme_minimal(base_size = 12) +
  theme(legend.position = "bottom",
        plot.title = element_text(hjust = 0.5, face = "bold")) +
  labs(title = "Linear Discriminant Coefficients (LD1)",
       subtitle = "Relative contribution of each DV to the discriminant function",
       x = "", y = "Standardised Coefficient", fill = "Direction")

10 Results Summary Table

final_tbl <- data.frame(
  Component = c(
    "Primary test statistic",
    "Pillai's Trace value",
    "Wilks' Lambda value",
    "Approximate F",
    "df numerator / denominator",
    "p-value",
    "Statistical decision",
    "Multivariate eta2",
    "Math score: F (eta2)",
    "Reading score: F (eta2)",
    "Writing score: F (eta2)",
    "Bonferroni threshold",
    "LD1 dominant variable"
  ),
  Result = c(
    "Pillai's Trace (primary); Wilks' Lambda (secondary)",
    sprintf("V = %.6f", pillai_trace),
    sprintf("Lambda = %.6f", wilks_lambda),
    sprintf("F = %.4f", F_wilks),
    sprintf("%d / %d", df1, df2),
    formatC(p_wilks, format = "e", digits = 3),
    "Reject H0 at alpha = 0.05",
    sprintf("eta2 = %.4f (%.1f%% variance explained)", eta2_multi, eta2_multi * 100),
    sprintf("F = %.4f (eta2 = %.4f)",
            anova_tbl$F_value[1], anova_tbl$eta2[1]),
    sprintf("F = %.4f (eta2 = %.4f)",
            anova_tbl$F_value[2], anova_tbl$eta2[2]),
    sprintf("F = %.4f (eta2 = %.4f)",
            anova_tbl$F_value[3], anova_tbl$eta2[3]),
    sprintf("alpha_Bonf = %.4f", alpha_bonf),
    "Writing score (largest |coefficient|)"
  )
)

final_tbl %>%
  kable(caption = "Comprehensive MANOVA Results Summary",
        col.names = c("Component", "Result")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = TRUE) %>%
  column_spec(1, bold = TRUE, width = "18em") %>%
  row_spec(0, bold = TRUE, color = "white", background = "#2C3E50") %>%
  row_spec(c(1, 6, 7), background = "#EBF5FB")

Comprehensive MANOVA Results Summary
Component	Result
Primary test statistic	Pillai’s Trace (primary); Wilks’ Lambda (secondary)
Pillai’s Trace value	V = 0.565805
Wilks’ Lambda value	Lambda = 0.434195
Approximate F	F = 432.6342
df numerator / denominator	3 / 996
p-value	6.997e-180
Statistical decision	Reject H0 at alpha = 0.05
Multivariate eta2	eta2 = 0.5658 (56.6% variance explained)
Math score: F (eta2)	F = 28.9793 (eta2 = 0.0282)
Reading score: F (eta2)	F = 63.3506 (eta2 = 0.0597)
Writing score: F (eta2)	F = 99.5916 (eta2 = 0.0907)
Bonferroni threshold	alpha_Bonf = 0.0167
LD1 dominant variable	Writing score (largest \|coefficient\|)

11 Discussion: Results and Findings

11.1 Multivariate Test Decision

The One-Way MANOVA produced a statistically significant multivariate effect of gender on the combined dependent variable comprising mathematics, reading, and writing scores. Pillai’s Trace was V = 0.5658, F(3, 996) = 432.6342, p < 0.001. Wilks’ Lambda was Lambda = 0.4342, corroborating the Pillai result with identical F and p values under the exact two-group transformation. All four multivariate criteria Pillai’s Trace, Wilks’ Lambda, Hotelling-Lawley Trace, and Roy’s Largest Root converged on the same decision: reject the null hypothesis that the multivariate mean vectors of female and male students are equal.

11.2 Result Manova Interpretation

The analysis was motivated by a long-standing empirical question in educational research whether gender is associated with differential academic performance across multiple subject domains simultaneously. The dataset captures the examination results of 1,000 students across three subjects mathematics, reading, and writing alongside demographic and preparation variables. Prior univariate studies have often reported gender differences in individual subjects, but have done so without accounting for the shared covariance structure among outcomes, risking both inflated error rates and loss of analytical power.

When inspecting the group centroids at the descriptive level, an internally contradictory pattern is apparent. Female students outperform male students in reading (by approximately 7.1 points) and writing (by approximately 9.2 points), while male students record a marginally higher mean in mathematics (by approximately 5.1 points). This bidirectional pattern presents a problem for any single-outcome analysis: focusing on mathematics alone would suggest a male advantage, while focusing on verbal scores would suggest a female advantage. Neither captures the full multivariate reality of how gender relates to academic performance as a composite construct.

MANOVA resolves this tension by evaluating the significance of the entire centroid displacement simultaneously, weighting dimensions by the within-group covariance structure rather than treating all axes equally. The result Pillai’s Trace = 0.5658, p < 0.001 confirms that the joint centroid difference is not attributable to sampling variability. The multivariate effect size (eta2 = 0.5658) indicates that gender accounts for approximately 56.6% of the variance in the composite academic performance space, which falls in the small-to-medium range by Cohen’s (1988) benchmarks but is practically meaningful given the breadth of the educational construct being measured. The follow-up univariate ANOVAs clarify the anatomy of this multivariate effect: the effect of gender is statistically significant and largest for writing (F = 99.59, eta2 = 0.0907) and reading (F = 63.35, eta2 = 0.0597), while the mathematics difference though marginally significant is the smallest contributor. The linear discriminant analysis confirms this interpretation writing received the highest standardised discriminant coefficient on LD1, followed by reading, indicating that the dimension along which the two gender groups are most separated is defined primarily by verbal and written expression ability rather than quantitative reasoning.

The finding that gender produces a statistically significant and multivariate-consistent difference in academic performance carries several practical and methodological implications. From a pedagogical standpoint, the pattern female advantage in verbal domains, male parity or slight advantage in mathematics mirrors patterns reported in large-scale international assessments such as PISA and TIMSS, suggesting that this dataset, though simulated, replicates a substantive empirical regularity. Interventions aimed at closing gender-based performance gaps would therefore need to be subject-specific rather than general reading and writing curricula may benefit from strategies that engage male students more effectively, while mathematics instruction may benefit from approaches that reduce stereotype threat for female students. Methodologically, the MANOVA framework demonstrates its value here precisely because the bidirectional nature of the gender effect would be obscured and the familywise error rate inflated by conducting three separate ANOVAs. The unified multivariate test provides a single, coherent, and efficient answer: gender is associated with a meaningful and statistically reliable difference in the profile of academic performance across all three subjects simultaneously.

12 Conclusion

The One-Way MANOVA confirmed a statistically significant effect of gender on the multivariate centroid of academic performance, F(3, 996) = 432.6342, p < 0.001, Pillai’s V = 0.5658, eta2 = 0.5658. The null hypothesis that the mean vectors of female and male students are equal across all three outcome dimensions is rejected with strong evidence. Follow-up univariate analyses under Bonferroni correction identified statistically significant group differences for reading and writing scores, with writing exhibiting the largest effect size. Mathematics showed a smaller and less pronounced group difference. The discriminant function analysis further established that the primary axis separating the two gender groups is dominated by verbal performance measures. These results provide a statistically rigorous, multivariate-coherent foundation for subsequent analyses using ANCOVA and MANCOVA, in which the covariate of test preparation course will be incorporated to assess whether the gender effect on academic performance persists after controlling for pre-examination preparation behaviour.

13 References

Anderson, T.W. (2003). An Introduction to Multivariate Statistical Analysis (3rd ed.). Wiley.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum.
Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2019). Multivariate Data Analysis (8th ed.). Cengage Learning.
Olson, C.L. (1974). Comparative robustness of six tests in multivariate analysis of variance. Journal of the American Statistical Association, 69(348), 894–908.
Rencher, A.C., & Christensen, W.F. (2012). Methods of Multivariate Analysis (3rd ed.). Wiley.
Tabachnick, B.G., & Fidell, L.S. (2019). Using Multivariate Statistics (7th ed.). Pearson.
Dataset: Students Performance in Exams. Kaggle. https://www.kaggle.com/datasets/spscientist/students-performance-in-exams

MANOVA Analysis Report — Multivariate Analysis

Date: April 16, 2026

S1 Data Science | FMIPA UNESA | 2026

One-Way MANOVA: The Effect of Gender on Academic Performance

Multivariate Analysis - S1 Data Science UNESA

2026-04-16