1 Executive Summary

This document presents a comprehensive and technical assumption testing process before conducting MANOVA, ANCOVA, and MANCOVA analyses on the Students Performance in Exams dataset. Five fundamental assumptions were tested sequentially:

No. Assumption Test Method
1 Dependence between DVs Bartlett’s Sphericity Test
2 Homogeneity of the covariance matrix Box’s M Test
3 Multivariate normality Mardia + Henze-Zirkler + MVN plot
4 Linearity of covariates–DVs Scatter plot + Pearson correlation + ANOVA F-test
5 Independence of observations Review of the research design

Variables used:

  • DV (Dependent Variables): math score, reading score, writing score
  • IV (Independent Variable / Factor): gender (female, male)
  • Covariate: test preparation course (none, completed → coded 0/1)

2 Data Loading & Preparation

2.1 Import Dataset

df <- read.csv("StudentsPerformance.csv", check.names = FALSE)

cat("STUDENTS PERFORMANCE DATASET - Overview\n")
cat(sprintf(" Number of Observations : %d rows\n", nrow(df)))
cat(sprintf(" Number of Variables : %d columns\n", ncol(df)))
cat("\n Variable Name:\n")
for (v in names(df)) cat(sprintf(" ▸ %s\n", v))
## STUDENTS PERFORMANCE DATASET - Overview
##  Number of Observations : 1000 rows
##  Number of Variables : 8 columns
## 
##  Variable Name:
##  ▸ gender
##  ▸ race/ethnicity
##  ▸ parental level of education
##  ▸ lunch
##  ▸ test preparation course
##  ▸ math score
##  ▸ reading score
##  ▸ writing score

2.2 Dataset Preview

head(df, 10) %>% 
  kable(caption = "First 10 Rows — Students Performance Dataset") %>% 
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), 
                full_width = TRUE, font_size = 12) %>% 
scroll_box(width = "100%")
First 10 Rows — Students Performance Dataset
gender race/ethnicity parental level of education lunch test preparation course math score reading score writing score
female group B bachelor’s degree standard none 72 72 74
female group C some college standard completed 69 90 88
female group B master’s degree standard none 90 95 93
male group A associate’s degree free/reduced none 47 57 44
male group C some college standard none 76 78 75
female group B associate’s degree standard none 71 83 78
female group B some college standard completed 88 95 92
male group B some college free/reduced none 40 43 39
male group D high school free/reduced completed 64 64 67
female group B high school free/reduced none 38 60 50

2.3 Persiapan Variabel

# Main variable
dv_cols <- c("math score", "reading score", "writing score")
iv_col <- "gender"
cov_col <- "test preparation course"

# Encode covariate → numeric
df$test_prep_num <- ifelse(df[[cov_col]] == "completed", 1, 0)

# DV matrix subset
Y <- df[, dv_cols]

cat("Group Distribution (gender):\n")
print(table(df[[iv_col]]))

cat("\nCovariate Distribution (test preparation course):\n")
print(table(df[[cov_col]]))
## Group Distribution (gender):
## 
## female   male 
##    518    482 
## 
## Covariate Distribution (test preparation course):
## 
## completed      none 
##       358       642

2.4 Statistik Deskriptif

# Per variabel DV
desc_tbl <- data.frame(
  Variable = dv_cols,
  N = sapply(Y, length),
  Mean = round(sapply(Y, mean), 3),
  Median = round(sapply(Y, median), 3),
  SD = round(sapply(Y, sd), 3),
  Min = sapply(Y, min),
  Max = sapply(Y, max),
  Skewness = round(sapply(Y, function(x) mean((x - mean(x))^3) / sd(x)^3), 3),
  Kurtosis = round(sapply(Y, function(x) mean((x - mean(x))^4) / sd(x)^4 - 3), 3)
)

desc_tbl %>%
  kable(caption = "Descriptive Statistics - Dependent Variable") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#34495E")
Descriptive Statistics - Dependent Variable
Variable N Mean Median SD Min Max Skewness Kurtosis
math score math score 1000 66.089 66 15.163 0 100 -0.278 0.261
reading score reading score 1000 69.169 70 14.600 17 100 -0.258 -0.080
writing score writing score 1000 68.054 69 15.196 10 100 -0.289 -0.045
# Each variable each group
for (col in dv_cols) {
  tbl <- df %>%
    group_by(gender) %>%
    summarise(
      N = n(),
      Mean = round(mean(.data[[col]]), 3),
      Median = round(median(.data[[col]]), 3),
      SD = round(sd(.data[[col]]), 3),
      Min = min(.data[[col]]),
      Max = max(.data[[col]])
    )
  
  print(
    tbl %>%
      kable(caption = sprintf("Descriptive '%s' each Gender", col)) %>%
      kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
      column_spec(1, bold = TRUE)
  )
}
## <table class="table table-striped table-hover" style="color: black; width: auto !important; margin-left: auto; margin-right: auto;">
## <caption>Descriptive 'math score' each Gender</caption>
##  <thead>
##   <tr>
##    <th style="text-align:left;"> gender </th>
##    <th style="text-align:right;"> N </th>
##    <th style="text-align:right;"> Mean </th>
##    <th style="text-align:right;"> Median </th>
##    <th style="text-align:right;"> SD </th>
##    <th style="text-align:right;"> Min </th>
##    <th style="text-align:right;"> Max </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;font-weight: bold;"> female </td>
##    <td style="text-align:right;"> 518 </td>
##    <td style="text-align:right;"> 63.633 </td>
##    <td style="text-align:right;"> 65 </td>
##    <td style="text-align:right;"> 15.491 </td>
##    <td style="text-align:right;"> 0 </td>
##    <td style="text-align:right;"> 100 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;font-weight: bold;"> male </td>
##    <td style="text-align:right;"> 482 </td>
##    <td style="text-align:right;"> 68.728 </td>
##    <td style="text-align:right;"> 69 </td>
##    <td style="text-align:right;"> 14.356 </td>
##    <td style="text-align:right;"> 27 </td>
##    <td style="text-align:right;"> 100 </td>
##   </tr>
## </tbody>
## </table><table class="table table-striped table-hover" style="color: black; width: auto !important; margin-left: auto; margin-right: auto;">
## <caption>Descriptive 'reading score' each Gender</caption>
##  <thead>
##   <tr>
##    <th style="text-align:left;"> gender </th>
##    <th style="text-align:right;"> N </th>
##    <th style="text-align:right;"> Mean </th>
##    <th style="text-align:right;"> Median </th>
##    <th style="text-align:right;"> SD </th>
##    <th style="text-align:right;"> Min </th>
##    <th style="text-align:right;"> Max </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;font-weight: bold;"> female </td>
##    <td style="text-align:right;"> 518 </td>
##    <td style="text-align:right;"> 72.608 </td>
##    <td style="text-align:right;"> 73 </td>
##    <td style="text-align:right;"> 14.378 </td>
##    <td style="text-align:right;"> 17 </td>
##    <td style="text-align:right;"> 100 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;font-weight: bold;"> male </td>
##    <td style="text-align:right;"> 482 </td>
##    <td style="text-align:right;"> 65.473 </td>
##    <td style="text-align:right;"> 66 </td>
##    <td style="text-align:right;"> 13.932 </td>
##    <td style="text-align:right;"> 23 </td>
##    <td style="text-align:right;"> 100 </td>
##   </tr>
## </tbody>
## </table><table class="table table-striped table-hover" style="color: black; width: auto !important; margin-left: auto; margin-right: auto;">
## <caption>Descriptive 'writing score' each Gender</caption>
##  <thead>
##   <tr>
##    <th style="text-align:left;"> gender </th>
##    <th style="text-align:right;"> N </th>
##    <th style="text-align:right;"> Mean </th>
##    <th style="text-align:right;"> Median </th>
##    <th style="text-align:right;"> SD </th>
##    <th style="text-align:right;"> Min </th>
##    <th style="text-align:right;"> Max </th>
##   </tr>
##  </thead>
## <tbody>
##   <tr>
##    <td style="text-align:left;font-weight: bold;"> female </td>
##    <td style="text-align:right;"> 518 </td>
##    <td style="text-align:right;"> 72.467 </td>
##    <td style="text-align:right;"> 74 </td>
##    <td style="text-align:right;"> 14.845 </td>
##    <td style="text-align:right;"> 10 </td>
##    <td style="text-align:right;"> 100 </td>
##   </tr>
##   <tr>
##    <td style="text-align:left;font-weight: bold;"> male </td>
##    <td style="text-align:right;"> 482 </td>
##    <td style="text-align:right;"> 63.311 </td>
##    <td style="text-align:right;"> 64 </td>
##    <td style="text-align:right;"> 14.114 </td>
##    <td style="text-align:right;"> 15 </td>
##    <td style="text-align:right;"> 100 </td>
##   </tr>
## </tbody>
## </table>
Y_long <- pivot_longer(Y, cols = everything(), names_to = "Variable", values_to = "Score")

ggplot(Y_long, aes(x = Score, fill = Variable)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 alpha = 0.7, color = "white") +
  geom_density(color = "#2C3E50", linewidth = 0.9) +
  facet_wrap(~Variable, scales = "free", ncol = 1) +
  scale_fill_manual(values = c("#3498DB", "#E74C3C", "#2ECC71")) +
  theme_minimal(base_size = 13) +
  theme(
    strip.background = element_rect(fill = "#34495E"),
    strip.text       = element_text(color = "white", face = "bold"),
    legend.position  = "none",
    plot.title       = element_text(hjust = 0.5, face = "bold", size = 15)
  ) +
  labs(title = "Distribution of Dependent Variable", x = "Score", y = "Density")


3 Assumption 1 - Dependence between Dependent Variables (Bartlett’s Sphericity Test)

3.1 Theoretical Basis

Objective: To verify that the dependent variables (\(Y_1\), \(Y_2\), \(Y_3\)) are not independent, meaning there is a meaningful correlation between them. MANOVA only provides analytical advantages over a series of univariate ANOVAs if the DVs are correlated.

Hypothesis:

\[H_0 : \boldsymbol{\rho} = \mathbf{I} \quad \text{(correlation matrix = identity matrix; independent DV)}\] \[H_1 : \boldsymbol{\rho} \neq \mathbf{I} \quad \text{(there is at least one correlation }\neq 0\text{)}\]

Statistik Uji — Bartlett (1950): \[\chi^2 = -\left[(n-1) - \frac{2p+5}{6}\right] \ln |\mathbf{R}|\]

where \(n\) = number of observations, \(p\) = number of DV variables, \(|\mathbf{R}|\) = determinant of the correlation matrix. Degrees of freedom: \(df = \frac{p(p-1)}{2}\).

Criteria: Reject \(H_0\) if \(\chi^2_{\text{count}} > \chi^2_{\alpha, df}\), or \(p\text{-value} < 0.05\).

3.2 Correlation Matrix between DVs

R <- cor(Y)

# Correlation Table
R %>%
  round(4) %>%
  kable(caption = "Pearson Correlation Matrix between Dependent Variables") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#2C3E50")
Pearson Correlation Matrix between Dependent Variables
math score reading score writing score
math score 1.0000 0.8176 0.8026
reading score 0.8176 1.0000 0.9546
writing score 0.8026 0.9546 1.0000
corrplot(R,
         method    = "color",
         type      = "upper",
         addCoef.col = "black",
         tl.col    = "#2C3E50",
         tl.srt    = 0,
         cl.cex    = 0.9,
         number.cex = 1.1,
         col       = colorRampPalette(c("#EBF5FB","#3498DB","#154360"))(200),
         title     = "Correlation Matrix - Math, Reading, Writing Score",
         mar       = c(0,0,2,0))

3.3 Implementation of Bartlett’s Test

n <- nrow(Y)
p <- ncol(Y)
det_R <- det(R)

# Bartlett Correction
chi2_stat <- -((n - 1) - (2*p + 5)/6) * log(det_R)
df_bart <- p * (p - 1) / 2
p_value <- pchisq(chi2_stat, df = df_bart, lower.tail = FALSE)

cat("BARTLETT SPHERICITY TEST — Manual Calculation Results\n")
cat(sprintf(" n (number of observations) : %d\n", n))
cat(sprintf(" p (number of DVs) : %d\n", p))
cat(sprintf(" |R| (correlation matrix det) : %.8f\n", det_R))
cat(sprintf(" ln|R| : %.6f\n", log(det_R)))
cat(sprintf(" Correction factor : %.4f\n", (n - 1) - (2*p + 5)/6))
cat()
cat(sprintf("χ² calculated statistic : %.4f\n", chi2_stat))
cat(sprintf(" Degrees of freedom (df) : %d\n", df_bart))
cat(sprintf(" χ² table (α=0.05) : %.4f\n", 
qchisq(0.95, df = df_bart)))
cat(sprintf(" p-value : %.6e\n", p_value))
cat()
cat(sprintf(" Decision: %s\n", 
            ifelse(p_value < 0.05, "REJECT H₀ → MUTUAL DEPENDENT DV ✓", "FAILED TO REJECT H₀ → Independent DV")))
cat()

# Validation via R's built-in bartlett.test (per pair)
cat("Confirmation via R's built-in bartlett.test() (per pair of DVs):\n")
pairs_list <- list(
  c("math score","reading score"),
  c("math score","writing score"),
  c("reading score","writing score")
)
for (pair in pairs_list) {
  bt <- cor.test(df[[pair[1]]], df[[pair[2]]])
  cat(sprintf("  %s vs %s → r = %.4f, p = %.2e\n",
              pair[1], pair[2], bt$estimate, bt$p.value))
}
## BARTLETT SPHERICITY TEST — Manual Calculation Results
##  n (number of observations) : 1000
##  p (number of DVs) : 3
##  |R| (correlation matrix det) : 0.02893173
##  ln|R| : -3.542816
##  Correction factor : 997.1667
## χ² calculated statistic : 3532.7783
##  Degrees of freedom (df) : 3
##  χ² table (α=0.05) : 7.8147
##  p-value : 0.000000e+00
##  Decision: REJECT H₀ → MUTUAL DEPENDENT DV ✓
## Confirmation via R's built-in bartlett.test() (per pair of DVs):
##   math score vs reading score → r = 0.8176, p = 1.79e-241
##   math score vs writing score → r = 0.8026, p = 3.38e-226
##   reading score vs writing score → r = 0.9546, p = 0.00e+00
bartlett_result <- data.frame(
  Parameters = c("χ² Calculated", "Degrees of Freedom", "χ² Table (α=0.05)", "p-value", "Decision"),
  Values = c(round(chi2_stat, 4), df_bart, round(qchisq(0.95, df=df_bart), 4), formatC(p_value, format="e", digits=4), 
               ifelse(p_value < 0.05, "Reject H₀", "Fail to Reject H₀"))
)

bartlett_result %>%
  kable(caption = "Summary of the Bartlett Sphericity Test") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>% 
  row_spec(5, bold = TRUE, color = "white", background = ifelse(p_value < 0.05, "#1E8449", "#E67E22"))
Summary of the Bartlett Sphericity Test
Parameters Values
χ² Calculated 3532.7783
Degrees of Freedom 3
χ² Table (α=0.05) 7.8147
p-value 0.0000e+00
Decision Reject H₀

3.4 Scatter Plot Relationship between DV

pairs_df <- data.frame(Y, group = df[[iv_col]])

names(pairs_df)[1:3] <- c("math_score", "reading_score", "writing_score")

p1 <- ggplot(pairs_df, aes(x = math_score, y = reading_score, color = group)) +
  geom_point(alpha = 0.5, size = 1.5) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1) +
  scale_color_manual(values = c("#E74C3C","#3498DB")) +
  theme_minimal(base_size = 11) +
  labs(title = "Math vs Reading", color = "Gender")

p2 <- ggplot(pairs_df, aes(x = math_score, y = writing_score, color = group)) +
  geom_point(alpha = 0.5, size = 1.5) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1) +
  scale_color_manual(values = c("#E74C3C","#3498DB")) +
  theme_minimal(base_size = 11) +
  labs(title = "Math vs Writing", color = "Gender")

p3 <- ggplot(pairs_df, aes(x = reading_score, y = writing_score, color = group)) +
  geom_point(alpha = 0.5, size = 1.5) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1) +
  scale_color_manual(values = c("#E74C3C","#3498DB")) +
  theme_minimal(base_size = 11) +
  labs(title = "Reading vs Writing", color = "Gender")

grid.arrange(p1, p2, p3, ncol = 3, top = "Scatter Plot of the Relationship between Dependent Variables each Gender")

3.5 Interpretation

Assumption 1 FULFILLED - DV Variables Are Mutually Dependent

The Bartlett Sphericity Test yields \(\chi^2 = 3532.78\) with \(df = 3\) and \(p\text{-value} < 0.05\). Since \(\chi^2_{\text{count}} > \chi^2_{\text{table}}\), we reject \(H_0\) — the DV correlation matrix is not an identity matrix.

Correlation between DVs: - Math–Reading: \(r = 0.818\)strong correlation - Math–Writing: \(r = 0.803\)strong correlation - Reading–Writing: \(r = 0.955\)very strong correlation

The three score variables (math, reading, writing) reflect interrelated academic dimensions. MANOVA is appropriate because analyzing all three DVs simultaneously provides richer information than separate ANOVAs.


4 Assumption 2 - Homogeneity of Covariance Matrices (Box’s M Test)

4.1 Theoretical Basis

Purpose: Test the equality of variance-covariance matrices (\(\boldsymbol{\Sigma}\)) between groups — a multivariate analogue of Levene’s test (univariate). MANOVA assumes \(\boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2 = \cdots = \boldsymbol{\Sigma}_g\).

Hypothesis: \[H_0 : \boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2 = \cdots = \boldsymbol{\Sigma}_g\] \[H_1 : \text{Minimum one } \boldsymbol{\Sigma}_\ell \neq \boldsymbol{\Sigma}_m\]

Box’s M Test Statistics (1949): \[M = \left[\sum_\ell (n_\ell - 1)\right] \ln |\mathbf{S}_{\text{pooled}}| - \sum_\ell (n_\ell - 1)\ln|\mathbf{S}_\ell|\]

\[C = (1 - u)M \approx \chi^2_{\nu}\]

where \(\nu = \frac{1}{2}p(p+1)(g-1)\), and \(u\) is Box’s correction factor.

Important Note: Box’s M is very sensitive to non-normality. If the normal distribution is not perfectly met, non-significant results (p > 0.05) are interpreted more conservatively.

4.2 Covariance Matrix per Group

groups <- unique(df[[iv_col]])

cat("Covariance Matrix per Group:\n\n")
for (g in groups) { 
  cat(sprintf("Group: %s\n", toupper(g)))
  idx <- df[[iv_col]] == g 
  S_g <- cov(Y[idx, ]) 
  det_g <- det(S_g) 
  cat(sprintf(" n = %d\n", sum(idx))) 
  cat(sprintf(" |S| = %.4f\n", det_g)) 
  print(round(S_g, 4)) 
  cat("\n")
}

# Pooled covariance
n_total <- nrow(df)
g_count <- length(groups)

S_pooled <- Reduce("+", lapply(groups, function(g) { 
  idx <- df[[iv_col]] == g 
  (sum(idx) - 1) * cov(Y[idx, ])
})) / (n_total - g_count)

cat("Pooled Covariance Matrix\n")
print(round(S_pooled, 4))
cat(sprintf(" |S_pooled| = %.4f\n", det(S_pooled)))
## Covariance Matrix per Group:
## 
## Group: FEMALE
##  n = 518
##  |S| = 137344.0080
##               math score reading score writing score
## math score      239.9851      202.5272      211.7384
## reading score   202.5272      206.7339      203.7792
## writing score   211.7384      203.7792      220.3693
## 
## Group: MALE
##  n = 482
##  |S| = 145617.9018
##               math score reading score writing score
## math score      206.1027      177.1060      180.7334
## reading score   177.1060      194.0959      186.9232
## writing score   180.7334      186.9232      199.2002
## 
## Pooled Covariance Matrix
##               math score reading score writing score
## math score      223.6550      190.2751      196.7951
## reading score   190.2751      200.6429      195.6552
## writing score   196.7951      195.6552      210.1666
##  |S_pooled| = 142591.8251
make_cov_heatmap <- function(mat, title, color_mid = "#3498DB") {
  mat_long <- melt(mat)
  ggplot(mat_long, aes(Var1, Var2, fill = value)) +
    geom_tile(color = "white", linewidth = 0.8) +
    geom_text(aes(label = round(value, 1)), size = 3.5, fontface = "bold") +
    scale_fill_gradient2(low = "#EBF5FB", mid = color_mid, high = "#154360", midpoint = median(mat_long$value)) +
    theme_minimal(base_size = 11) +
    theme(axis.text.x = element_text(angle = 30, hjust = 1), plot.title   = element_text(hjust = 0.5, face = "bold")) + labs(title = title, x = "", y = "", fill = "Cov")
}

cov_plots <- lapply(seq_along(groups), function(i) {
  idx <- df[[iv_col]] == groups[i]
  make_cov_heatmap(cov(Y[idx, ]),
                   sprintf("Covariance %s", toupper(groups[i])),
                   color_mid = ifelse(i == 1, "#E74C3C", "#3498DB"))
})
cov_plots[["Pooled"]] <- make_cov_heatmap(S_pooled, "Covariance Pooled", "#2ECC71")

do.call(grid.arrange, c(cov_plots, ncol = 3, list(top = "Comparison of Covariance Matrices between Groups")))

4.3 Box’s M Test

# Calculate Box's M manually
ns <- sapply(groups, function(g) sum(df[[iv_col]] == g))
p  <- ncol(Y)
g  <- length(groups)

M_stat <- sum(ns - 1) * log(det(S_pooled)) -
          sum(sapply(seq_along(groups), function(i) {
            idx <- df[[iv_col]] == groups[i]
            (ns[i] - 1) * log(det(cov(Y[idx, ])))
          }))

# Correction Factor u
u_factor <- (sum(1/(ns - 1)) - 1/sum(ns - 1)) *
             (2*p^2 + 3*p - 1) / (6*(p + 1)*(g - 1))

C_stat <- (1 - u_factor) * M_stat
df_box  <- p*(p + 1)*(g - 1) / 2
p_boxm  <- pchisq(C_stat, df = df_box, lower.tail = FALSE)

cat("BOX'S M TEST - Calculation Results\n")
cat(sprintf(" M statistics : %.4f\n", M_stat))
cat(sprintf(" Correction factor u : %.6f\n", u_factor))
cat(sprintf(" C = (1-u)M : %.4f\n", C_stat))
cat(sprintf(" Degrees of freedom : %.0f\n", df_box))
cat(sprintf(" χ² table α=0.05 : %.4f\n", qchisq(0.95, df = df_box)))
cat(sprintf(" p-value : %.6f\n", p_boxm))
cat()
cat(sprintf(" Decision: %s\n", ifelse(p_boxm > 0.05, "FAILED TO REJECT H₀ → HOMOGENEOUS Covariance Matrix ✓", "REJECT H₀ → NOT Homogeneous Covariance Matrix")))
cat()

# Validation via biotools::boxM
cat("\nValidation via biotools::boxM():\n")
bm <- biotools::boxM(Y, df[[iv_col]])
print(bm)
## BOX'S M TEST - Calculation Results
##  M statistics : 9.2852
##  Correction factor u : 0.003262
##  C = (1-u)M : 9.2549
##  Degrees of freedom : 6
##  χ² table α=0.05 : 12.5916
##  p-value : 0.159741
##  Decision: FAILED TO REJECT H₀ → HOMOGENEOUS Covariance Matrix ✓
## 
## Validation via biotools::boxM():
## 
##  Box's M-test for Homogeneity of Covariance Matrices
## 
## data:  Y
## Chi-Sq (approx.) = 9.2549, df = 6, p-value = 0.1597
boxm_result <- data.frame(
  Parameters = c("M Statistic", "C = (1-u)M", "Degrees of Freedom", "χ² Table (α=0.05)", "p-value", "Decision"),
  Values = c(round(M_stat, 4), round(C_stat, 4), df_box, round(qchisq(0.95, df=df_box), 4), round(p_boxm, 6), ifelse(p_boxm > 0.05, "Failed to Reject H₀ (Homogeneous)", "Reject H₀ (Inhomogeneous)"))
)

boxm_result %>%
  kable(caption = "Box's M Test Summary") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(6, bold = TRUE, color = "white",
           background = ifelse(p_boxm > 0.05, "#1E8449", "#E67E22"))
Box’s M Test Summary
Parameters Values
M Statistic 9.2852
C = (1-u)M 9.2549
Degrees of Freedom 6
χ² Table (α=0.05) 12.5916
p-value 0.159741
Decision Failed to Reject H₀ (Homogeneous)

4.4 Boxplot Comparison of Variance each Group

df_long <- df %>%
  pivot_longer(cols = all_of(dv_cols), names_to = "Variable", values_to = "Score")

ggplot(df_long, aes(x = gender, y = Score, fill = gender)) +
  geom_boxplot(alpha = 0.75, outlier.shape = 21, outlier.size = 1.5) +
  geom_jitter(width = 0.15, alpha = 0.2, size = 0.8) +
  scale_fill_manual(values = c("#E74C3C","#3498DB")) +
  facet_wrap(~Variable, ncol = 3) +
  theme_minimal(base_size = 12) +
  theme(
    strip.background = element_rect(fill = "#34495E"),
    strip.text       = element_text(color = "white", face = "bold"),
    legend.position  = "bottom",
    plot.title       = element_text(hjust = 0.5, face = "bold", size = 14)
  ) +
  labs(title = "Distribution of Scores each Gender - Comparison of Variances", x = "Gender", y = "Score", fill = "Gender")

4.5 Interpretation

Asumption 2 FULFILLED - Homogeneous Covariance Matrix

Box’s M test yields \(C = `rround(C_stat, 4)`\) with \(df = `rdf_box`\) and \(p\text{-value} = `rround(p_boxm, 4)`\). Since \(p > 0.05\), we fail to reject \(H_0\) there is insufficient evidence that the covariance matrix differs across gender groups.

This indicates that: - The variability of math, reading, and writing scores is equal between the female and male groups. - The assumption \(\boldsymbol{\Sigma}_{\text{female}} = \boldsymbol{\Sigma}_{\text{male}}\) can be maintained. - MANOVA, ANCOVA, and MANCOVA can be continued using the pooled covariance matrix.

Methodological Note: Box’s M is sensitive to non-normality. This non-significant result supports the reliability of the follow-up analysis.


5 Assumption 3 - Multivariate Normality

5.1 Theoretical Basis

Objective: To verify that the DV vector follows the multivariate normal distribution \(\mathbf{Y} \sim N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})\). MANOVA is based on this assumption for the validity of the Wilks’ F/Lambda test.

Two formal tests were used:

1. Mardia’s Test (Multivariate Skewness & Kurtosis): \[b_{1,p} = \frac{1}{n^2}\sum_{i,j} g_{ij}^3 \quad \text{dan} \quad b_{2,p} = \frac{1}{n}\sum_i g_{ii}^2\] where \(g_{ij} = (\mathbf{x}_i - \bar{\mathbf{x}})'\mathbf{S}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}})\) (Mahalanobis-based).

2. Henze-Zirkler test: \[HZ = \frac{1}{n}\sum_{i,j} K_\beta(X_i - X_j) - 2\sum_i K_{\beta/\sqrt{2}}(X_i - \bar{X}) + n \cdot c\]

Both are available via the MVN package.

Criteria: \(H_0\): data is multivariately normally distributed. Reject \(H_0\) if \(p < 0.05\).

5.2 Univariate Normality Test (Prerequisite)

cat("Shapiro-Wilk Test each DV Variable:\n\n")

sw_results <- lapply(dv_cols, function(col) {
  sw  <- shapiro.test(df[[col]])
  list(Variable = col, W = round(sw$statistic, 5),
       p_value = round(sw$p.value, 6),
       Normal  = ifelse(sw$p.value > 0.05, "Yes", "No"))
})

sw_tbl <- do.call(rbind, lapply(sw_results, as.data.frame))

sw_tbl %>%
  kable(caption = "Shapiro-Wilk Normality Test - DV Univariate") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(4, bold = TRUE,
              color = ifelse(sw_tbl$Normal == "Ya", "#1E8449", "#C0392B"))
## Shapiro-Wilk Test each DV Variable:
Shapiro-Wilk Normality Test - DV Univariate
Variable W p_value Normal
W math score 0.99315 0.000145 No
W1 reading score 0.99292 0.000106 No
W2 writing score 0.99196 0.000029 No
qq_plots <- lapply(dv_cols, function(col) {
  ggplot(df, aes(sample = .data[[col]])) +
    stat_qq(color = "#3498DB", alpha = 0.6, size = 1.2) +
    stat_qq_line(color = "#E74C3C", linewidth = 1) +
    theme_minimal(base_size = 11) +
    labs(title = sprintf("Q-Q Plot: %s", col),
         x = "Theoretical Quantiles", y = "Sample Quantiles")
})

grid.arrange(grobs = qq_plots, ncol = 3, top = "Q-Q Plot Univariate Normality - Dependent Variable")

5.3 Multivariate Outlier Detection (Mahalanobis Distance)

mean_vec <- colMeans(Y)
cov_mat <- cov(Y)
inv_cov <- solve(cov_mat)

maha_dist <- mahalanobis(Y, center = mean_vec, cov = cov_mat)
df$mahalanobis <- maha_dist

threshold <- qchisq(0.999, df = ncol(Y))   # α = 0.001, df = p
n_outliers <- sum(maha_dist > threshold)

cat("MAHALANOBIS DISTANCE - Multivariate Outlier\n")
cat()
cat(sprintf(" Threshold χ²(0.001, df=%d) : %.4f\n", ncol(Y), threshold))
cat(sprintf(" Number of outliers : %d (%.1f%%)\n", n_outliers, n_outliers/nrow(df)*100))
cat(sprintf(" Maximum distance : %.4f\n", max(maha_dist)))
cat(sprintf(" Minimum distance : %.4f\n", min(maha_dist)))
cat(sprintf(" Median distance : %.4f\n", median(maha_dist)))
cat()
## MAHALANOBIS DISTANCE - Multivariate Outlier
##  Threshold χ²(0.001, df=3) : 16.2662
##  Number of outliers : 1 (0.1%)
##  Maximum distance : 20.1224
##  Minimum distance : 0.0008
##  Median distance : 2.5039
df_maha <- data.frame(Index = 1:nrow(df), Distance = maha_dist,
                      Outlier = maha_dist > threshold)

p_maha1 <- ggplot(df_maha, aes(x = Index, y = Distance, color = Outlier)) +
  geom_point(size = 1.5, alpha = 0.7) +
  geom_hline(yintercept = threshold, linetype = "dashed",
             color = "#E74C3C", linewidth = 1) +
  scale_color_manual(values = c("#3498DB","#E74C3C")) +
  theme_minimal(base_size = 11) +
  labs(title = "Mahalanobis Distance per Observation", subtitle = sprintf("Threshold χ²(0.001, df=3) = %.2f", threshold), x = "Observation Index", y = "Mahalanobis Distance")

p_maha2 <- ggplot(df_maha, aes(x = Distance)) +
  geom_histogram(aes(y = after_stat(density)), bins = 40,
                 fill = "#3498DB", alpha = 0.7, color = "white") +
  geom_vline(xintercept = threshold, linetype = "dashed",
             color = "#E74C3C", linewidth = 1) +
  stat_function(fun = function(x) dchisq(x, df = ncol(Y)),
                color = "#E74C3C", linewidth = 1) +
  theme_minimal(base_size = 11) +
  labs(title = "Mahalanobis Distribution vs. χ²", x = "Distance", y = "Density")

grid.arrange(p_maha1, p_maha2, ncol = 2)

5.4 Formal Multivariate Normality Test

cat("MARDIA TEST - Multivariate Normality\n")
mvn_mardia <- MVN::mvn(data = Y, mvn_test = "mardia", descriptives = FALSE)
print(mvn_mardia$multivariate_normality)

cat("\nHENZE-ZIRKLER TEST - Multivariate Normality\n")
mvn_hz <- MVN::mvn(data = Y, mvn_test = "hz", descriptives = FALSE)
print(mvn_hz$multivariate_normality)
## MARDIA TEST - Multivariate Normality
##              Test Statistic p.value     Method          MVN
## 1 Mardia Skewness    30.255  <0.001 asymptotic ✗ Not normal
## 2 Mardia Kurtosis    -2.266   0.023 asymptotic ✗ Not normal
## 
## HENZE-ZIRKLER TEST - Multivariate Normality
##            Test Statistic p.value     Method          MVN
## 1 Henze-Zirkler     1.282   0.003 asymptotic ✗ Not normal

5.5 Multivariate Normality Plot

# Chi-square Q-Q plot manual
p_values_sorted <- sort(maha_dist)
n    <- length(p_values_sorted)
expected <- qchisq((1:n - 0.5)/n, df = ncol(Y))

qq_df <- data.frame(Expected = expected, Observed = p_values_sorted,
                    Outlier = p_values_sorted > threshold)

p_chisq <- ggplot(qq_df, aes(x = Expected, y = Observed, color = Outlier)) +
  geom_point(size = 1.5, alpha = 0.7) +
  geom_abline(intercept = 0, slope = 1, color = "#E74C3C",
              linewidth = 1, linetype = "dashed") +
  scale_color_manual(values = c("#3498DB","#E74C3C")) +
  theme_minimal(base_size = 12) +
  labs(title = "Chi-Square Q-Q Plot (Multivariate Normality)",
       subtitle = "Points follow the diagonal line → multivariate normal",
       x = "Theoretical χ² Quantiles", y = "Mahalanobis Distance²",
       color = "Outlier")

# Perspective plot via MVN
mvn_out <- MVN::mvn(data = Y, 
                    mvn_test = "mardia",
                    descriptives = FALSE,
                    multivariate_outlier_method = "quan")
print(p_chisq)

cat("Multivariate Normality Test each Gender Group:\n\n")
for (g in groups) {
  cat(sprintf("Group: %s\n", toupper(g)))
  Y_g <- Y[df[[iv_col]] == g, ]
  mvn_g <- MVN::mvn(data = Y_g, mvn_test = "hz", descriptives = FALSE)
  print(mvn_g$multivariate_normality)
  cat("\n")
}
## Multivariate Normality Test each Gender Group:
## 
## Group: FEMALE
##            Test Statistic p.value     Method      MVN
## 1 Henze-Zirkler     1.018   0.091 asymptotic ✓ Normal
## 
## Group: MALE
##            Test Statistic p.value     Method      MVN
## 1 Henze-Zirkler      0.82     0.5 asymptotic ✓ Normal
mvn_summary <- data.frame(
  Test       = c("Mardia Skewness", "Mardia Kurtosis", "Henze-Zirkler"),
  Statistics = c(
    round(mvn_mardia$multivariate_normality[1, "Statistic"], 4),
    round(mvn_mardia$multivariate_normality[2, "Statistic"], 4),
    round(mvn_hz$multivariate_normality[1, "Statistic"], 4)
  ),
  p_value   = c(
    mvn_mardia$multivariate_normality[1, "p.value"],
    mvn_mardia$multivariate_normality[2, "p.value"],
    mvn_hz$multivariate_normality[1, "p.value"]
  ),
  Hasil     = c(
    mvn_mardia$multivariate_normality[1, "MVN"],
    mvn_mardia$multivariate_normality[2, "MVN"],
    mvn_hz$multivariate_normality[1, "MVN"]
  )
)

mvn_summary %>%
  kable(caption = "Multivariate Normality Test Summary") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#34495E")
Multivariate Normality Test Summary
Test Statistics p_value Hasil
Mardia Skewness 30.255 <0.001 ✗ Not normal
Mardia Kurtosis -2.266 0.023 ✗ Not normal
Henze-Zirkler 1.282 0.003 ✗ Not normal

5.6 Interpretation

Assumption 3 - Multivariate Normality Not Formally Met

The results of the formal tests (Mardia & Henze-Zirkler) indicate that the data do not meet statistically strict multivariate normality. However:

Justification for continuing the analysis:

  1. Robustness of MANOVA: MANOVA is known to be robust to violations of normality when n is large. With n = 1000 $ per group, the sampling distribution of the test statistic approximates chi-square/F based on the Central Limit Theorem.

  2. Multivariate Outliers: Only 1 observations (0.1%) exceed the Mahalanobis threshold there are no extreme outliers that threaten the analysis.

  3. The Chi-square Q-Q plot shows that most of the points follow the diagonal line — deviations occur only in the tails of the distribution.

  4. Rule of thumb (Hair et al.): With n > 200, MANOVA is considered quite robust against violations of multivariate normality.

Recommendation: Proceed with cautious interpretation and consider using Pillai’s Trace (more robust) as an alternative test statistic to Wilks’ Lambda.


6 Assumption 4 - Linearity of Covariates with DV (Linearity Test)

6.1 Theoretical Basis

Objective: To verify that the relationship between the covariate (test preparation course) and each DV variable is linear. ANCOVA and MANCOVA assume a linear regression model between the covariate and DV. If the relationship is not linear, the covariate correction will be biased.

Hypothesis per pair (covariate–DV): \[H_0 : \beta_{\text{linear}} = 0 \quad \text{(no linear relationship)}\] \[H_1 : \beta_{\text{linear}} \neq 0 \quad \text{(there is a linear relationship)}\]

Methods used:

  1. Point-Biserial Correlation (binary covariate vs. continuous DV): \(r_{pb}\)
  2. Independent Samples t-test / ANOVA F-test: comparing mean DV across covariate levels
  3. Scatter plot + regression line per group IV (gender)
  4. Homogeneity of regression slopes: covariate–DV slopes must be equal across all groups (critical assumption of ANCOVA)

6.2 Transformation & Covariate Statistics

df$test_prep_num <- ifelse(df[[cov_col]] == "completed", 1, 0)

cat("Covariate Distribution:\n")
cov_dist <- table(df[[cov_col]])
print(cov_dist)
cat(sprintf("\nProposition 'completed': %.1f%%\n", cov_dist["completed"]/sum(cov_dist)*100))

cat("\nAverage DV per Covariate Level:\n")
df %>% 
  group_by(`test preparation course`) %>% 
  summarize(across(all_of(dv_cols), ~ round(mean(.), 2)), 
            n = n()) %>% 
  kable(caption = "Mean DV per Covariate Level") %>% 
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)
## Covariate Distribution:
## 
## completed      none 
##       358       642 
## 
## Proposition 'completed': 35.8%
## 
## Average DV per Covariate Level:
Mean DV per Covariate Level
test preparation course math score reading score writing score n
completed 69.70 73.89 74.42 358
none 64.08 66.53 64.50 642

6.3 Covariate–DV Correlation

cat("Point-Biserial Correlation (covariate vs DV):\n\n")

corr_results <- lapply(dv_cols, function(col) { 
  ct <- cor.test(df[[col]], df$test_prep_num, method = "pearson") 
  list(DV = col, 
       r = round(ct$estimate, 4), 
       t_stat = round(ct$statistic, 4), 
       p_value = round(ct$p.value, 6), 
       CI_lower = round(ct$conf.int[1], 4), 
       CI_upper = round(ct$conf.int[2], 4), 
       Signif = ifelse(ct$p.value < 0.05, "Significant ✓", "Not Significant"))
})

corr_tbl <- do.call(rbind, lapply(corr_results, as.data.frame))

corr_tbl %>% 
  kable(caption = "Pearson Correlation: Covariate (Test Prep) vs DV") %>% 
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>% 
  column_spec(1, bold = TRUE) %>% 
  column_spec(7, bold = TRUE, color = ifelse(corr_tbl$Signif == "Significant ✓", "#1E8449", "#C0392B"))
## Point-Biserial Correlation (covariate vs DV):
Pearson Correlation: Covariate (Test Prep) vs DV
DV r t_stat p_value CI_lower CI_upper Signif
cor math score 0.1777 5.7046 0 0.1170 0.2371 Significant ✓
cor1 reading score 0.2418 7.8717 0 0.1825 0.2993 Significant ✓
cor2 writing score 0.3129 10.4092 0 0.2559 0.3678 Significant ✓

6.4 Boxplot & Scatter DV per Covariate Level

df$test_prep_label <- factor(df[[cov_col]], levels = c("none","completed"), 
labels = c("None", "Completed"))

bp_plots <- lapply(dv_cols, function(col) { 
  df %>% 
    ggplot(aes(x = test_prep_label, y = .data[[col]], fill = test_prep_label)) + 
    geom_boxplot(alpha = 0.75, outlier.shape = 21) + 
    geom_jitter(width = 0.15, alpha = 0.2, size = 0.7) + 
    scale_fill_manual(values = c("#BDC3C7","#27AE60")) + 
    stat_summary(fun = mean, geom = "point", shape = 18, 
                 size = 4, color = "#E74C3C") + 
    theme_minimal(base_size = 11) + 
    theme(legend.position = "none") + 
    labs(title = col, x = "Test Preparation", y = "Score")
})

grid.arrange(grobs = bp_plots, ncol = 3, 
top = "DV Score Distribution based on Test Preparation Status")

scatter_plots <- lapply(dv_cols, function(col) {
  ggplot(df, aes(x = test_prep_num, y = .data[[col]], color = gender)) +
    geom_jitter(width = 0.05, alpha = 0.4, size = 1.5) +
    geom_smooth(method = "lm", se = TRUE, linewidth = 1.2) +
    geom_smooth(aes(group = 1), method = "lm", color = "#2C3E50",
                se = FALSE, linewidth = 1.5, linetype = "dashed") +
    scale_color_manual(values = c("#E74C3C","#3498DB")) +
    scale_x_continuous(breaks = c(0,1),
                       labels = c("None (0)","Completed (1)")) +
    theme_minimal(base_size = 11) +
    theme(legend.position = "bottom") +
    labs(title = sprintf("Linearity: Test Prep → %s", col), 
         subtitle = sprintf("r = %.4f (black line = whole)", 
                            cor(df[[col]], df$test_prep_num)), 
         x = "Test Preparation (0=None, 1=Completed)", 
         y = "Score", color = "Gender")
})

grid.arrange(grobs = scatter_plots, ncol = 3, top = "Scatter Plot of Covariate Linearity–DV each Gender")

6.5 Independent t-Test: Difference in Mean DV across Covariate Levels

cat("INDEPENDENT SAMPLES t-TEST: Kovariat vs DV\n")
cat()

ttest_results <- lapply(dv_cols, function(col) {
  none_grp      <- df[[col]][df[[cov_col]] == "none"]
  completed_grp <- df[[col]][df[[cov_col]] == "completed"]
  tt <- t.test(completed_grp, none_grp, var.equal = FALSE)
  
  cat(sprintf(" Variable: %s\n", col))
  cat(sprintf(" Mean (None) = %.2f\n", mean(none_grp)))
  cat(sprintf(" Mean (Completed) = %.2f\n", mean(completed_grp)))
  cat(sprintf(" Difference in means = %.2f\n", mean(completed_grp) - mean(none_grp)))
  cat(sprintf(" t = %.4f, df = %.1f, p = %.2e\n",
              tt$statistic, tt$parameter, tt$p.value))
  cat(sprintf(" Conclusion: %s\n\n",
              ifelse(tt$p.value < 0.05,
                     "There is a significant difference → Covariate has an effect ✓",
                     "There is no significant difference")))
  
  data.frame(DV = col,
             Mean_None = round(mean(none_grp), 2),
             Mean_Completed = round(mean(completed_grp), 2),
             Selisih = round(mean(completed_grp) - mean(none_grp), 2),
             t_stat  = round(tt$statistic, 4),
             p_value = round(tt$p.value, 6),
             Signif  = ifelse(tt$p.value < 0.05, "Significant ✓", "No"))
})

ttest_tbl <- do.call(rbind, ttest_results)

ttest_tbl %>%
  kable(caption = "t-Test: Difference in Mean DV based on Test Preparation") %>%
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE) %>%
  column_spec(1, bold = TRUE) %>%
  column_spec(7, bold = TRUE,
              color = ifelse(ttest_tbl$Signif == "Significant ✓", "#1E8449", "#C0392B"))
## INDEPENDENT SAMPLES t-TEST: Kovariat vs DV
##  Variable: math score
##  Mean (None) = 64.08
##  Mean (Completed) = 69.70
##  Difference in means = 5.62
##  t = 5.7870, df = 770.1, p = 1.04e-08
##  Conclusion: There is a significant difference → Covariate has an effect ✓
## 
##  Variable: reading score
##  Mean (None) = 66.53
##  Mean (Completed) = 73.89
##  Difference in means = 7.36
##  t = 8.0041, df = 775.4, p = 4.39e-15
##  Conclusion: There is a significant difference → Covariate has an effect ✓
## 
##  Variable: writing score
##  Mean (None) = 64.50
##  Mean (Completed) = 74.42
##  Difference in means = 9.91
##  t = 10.7525, df = 811.1, p = 2.66e-25
##  Conclusion: There is a significant difference → Covariate has an effect ✓
t-Test: Difference in Mean DV based on Test Preparation
DV Mean_None Mean_Completed Selisih t_stat p_value Signif
t math score 64.08 69.70 5.62 5.7870 0 Significant ✓
t1 reading score 66.53 73.89 7.36 8.0041 0 Significant ✓
t2 writing score 64.50 74.42 9.91 10.7525 0 Significant ✓

6.6 Homogeneity Test of Regression Slope (Interaction Test)

cat("HOMOGENITY TEST OF REGRESSION SLOPE\n")
cat("(Covariate × IV Interaction - Critical Assumptions of ANCOVA)\n")
cat()

for (col in dv_cols) {
  fml <- as.formula(paste0('`', col, '` ~ gender * test_prep_num'))
  m_int <- lm(fml, data = df)
  
  cat(sprintf("DV: %s\n", col))
  
  # Extract interaction term
  s <- summary(m_int)$coefficients
  int_row <- grep("gender.*test_prep|test_prep.*gender", rownames(s), value = TRUE)
  
  if (length(int_row) > 0) {
    cat(sprintf(" Interaction coefficient (gender:test_prep): %.4f\n", s[int_row, "Estimate"])) 
    cat(sprintf(" t = %.4f, p = %.4f\n", s[int_row,"t value"], s[int_row,"Pr(>|t|)"])) 
    cat(sprintf(" Homogeneous slope: %s\n\n", ifelse(s[int_row,"Pr(>|t|)"] > 0.05, "YES ✓", "NO ✗")))
  }
}
## HOMOGENITY TEST OF REGRESSION SLOPE
## (Covariate × IV Interaction - Critical Assumptions of ANCOVA)
## DV: math score
##  Interaction coefficient (gender:test_prep): 0.1258
##  t = 0.0647, p = 0.9484
##  Homogeneous slope: YES ✓
## 
## DV: reading score
##  Interaction coefficient (gender:test_prep): 0.0242
##  t = 0.0134, p = 0.9893
##  Homogeneous slope: YES ✓
## 
## DV: writing score
##  Interaction coefficient (gender:test_prep): 0.3323
##  t = 0.1838, p = 0.8542
##  Homogeneous slope: YES ✓
slope_plots <- lapply(dv_cols, function(col) {
  coefs <- df %>%
    group_by(gender) %>%
    summarise(slope = coef(lm(reformulate("test_prep_num", col), data = cur_data()))[2],
              .groups = "drop")
  
  ggplot(df, aes(x = test_prep_num, y = .data[[col]], color = gender)) +
    geom_jitter(width = 0.04, alpha = 0.35, size = 1.2) +
    geom_smooth(method = "lm", se = TRUE, linewidth = 1.3) +
    scale_color_manual(values = c("#E74C3C","#3498DB")) +
    scale_x_continuous(breaks = c(0,1), labels = c("None","Completed")) +
    annotate("text", x = 0.5, y = min(df[[col]]) + 2,
             label = sprintf("Slope F: %.2f | M: %.2f",
                             coefs$slope[coefs$gender=="female"],
                             coefs$slope[coefs$gender=="male"]),
             size = 3.5, color = "#2C3E50", fontface = "italic") +
    theme_minimal(base_size = 11) +
    theme(legend.position = "bottom") +
    labs(title = col, x = "Test Prep", y = "Skor")
})

grid.arrange(grobs = slope_plots, ncol = 3, top = "Covariate Regression Slope→DV per Gender Group\n(Must be Parallel for ANCOVA)")

6.7 Interpretation

Assumption 4 MEET - Linearity of Covariates with DVs

Results show:

  1. Significant correlation: The covariate test preparation course is significantly correlated with all three DVs (\(p < 0.05\) for all). Students who participated in test preparation scored consistently higher:
  • Math: \(+5.6\) points
  • Reading: \(+7.4\) points
  • Writing: \(+9.9\) points
  1. Linear relationship: The scatter plot shows a clear linear pattern — scores increase as test preparation status increases.

  2. Homogeneity of slope: There is no significant interaction between gender and the covariate (\(p > 0.05\)), meaning the covariate–DV regression slope is equal across both gender groups. This is a critical assumption of ANCOVA that is met.

The covariate is proven relevant and valid for inclusion in the ANCOVA/MANCOVA model.


7 Assumption 5 — Independence of Observations

7.1 Theoretical Basis

Goal: To ensure that each observation (respondent/student) is independent of each other there is no relationship or influence between individuals in the dataset.

Formal Definition: \[\text{Cov}(\varepsilon_{ij}, \varepsilon_{i'j'}) = 0 \quad \text{for } (i,j) \neq (i',j')\]

This assumption cannot be tested formally statistically from cross-sectional data, but must be justified from the research design.

7.2 Research Design Evaluation

Review of Independence of Observations — Student Performance Dataset

Data Source: Kaggle — “Students Performance in Exams” (Royce Kimmons, BYU; simulation of a standard exam scenario)

7.2.1 Arguments in Support of Independence

1. Cross-Sectional Design: Each row represents a unique student taking the exam at the same time. There are no repeated measures on the same individual.

2. No Known Clustering: The data does not include school, grade, or teacher information there is no identifiable hierarchical structure that could introduce intra-class autocorrelation.

3. Independent Sampling: The dataset is designed as an independent survey/observation of a population of students with diverse demographic backgrounds (race/ethnicity, parental level of education, lunch, gender).

4. No Repeated Measures: Each student appears only once in the dataset — there is no longitudinal or before-after data to create dependencies.

7.2.2 Considerations & Limitations

  • Students in the same grade could potentially share the same teacher/instructor, creating a clustering effect. However, the grade variable is not available in this dataset.

  • Socio-economic effects: The variables lunch (a proxy for poverty) and parental education may reflect geographic/community clustering, but are considered independent covariates.

  • Synthetic data: This dataset is a simulation, so independence between observations was explicitly guaranteed in the data generation process.

cat("Verify No Duplicate Observations:\n")
cat(sprintf(" Total rows: %d\n", nrow(df)))
cat(sprintf(" Unique rows: %d\n",
            nrow(unique(df[, c("gender","race/ethnicity","parental level of education",
                               "lunch","test preparation course",
                               "math score","reading score","writing score")]))))
cat(sprintf(" Duplicates present: %s\n",
            ifelse(nrow(df) == nrow(unique(df[, dv_cols])), "No ✓", "Yes - investigation required")))

cat(" Distribution per categorical variable:\n")
for (cat_var in c("gender", "race/ethnicity", "parental level of education", "lunch", "test preparation course")) { 
  cat(sprintf("\n %s:\n", cat_var)) 
  tb <- table(df[[cat_var]]) 
  for (nm in names(tb)) cat(sprintf(" %-30s: %d (%.1f%%)\n", nm, tb[nm], tb[nm]/sum(tb)*100))
}
## Verify No Duplicate Observations:
##  Total rows: 1000
##  Unique rows: 1000
##  Duplicates present: Yes - investigation required
##  Distribution per categorical variable:
## 
##  gender:
##  female                        : 518 (51.8%)
##  male                          : 482 (48.2%)
## 
##  race/ethnicity:
##  group A                       : 89 (8.9%)
##  group B                       : 190 (19.0%)
##  group C                       : 319 (31.9%)
##  group D                       : 262 (26.2%)
##  group E                       : 140 (14.0%)
## 
##  parental level of education:
##  associate's degree            : 222 (22.2%)
##  bachelor's degree             : 118 (11.8%)
##  high school                   : 196 (19.6%)
##  master's degree               : 59 (5.9%)
##  some college                  : 226 (22.6%)
##  some high school              : 179 (17.9%)
## 
##  lunch:
##  free/reduced                  : 355 (35.5%)
##  standard                      : 645 (64.5%)
## 
##  test preparation course:
##  completed                     : 358 (35.8%)
##  none                          : 642 (64.2%)

7.3 Interpretation

Assumption 5 MEET - Independent Observations

Based on the research design review:

  • The dataset is cross-sectional with 1000 unique students
  • Each student is measured once without repetition
  • No clustering structure is identified in the data
  • The distribution of categorical variables shows variability that reflects general population sampling

The independence assumption is acceptable based on the available data structure. For studies with actual data from specific schools, it is recommended to consider a mixed-effects model to accommodate potential clustering between classes/schools.


8 Ringkasan Akhir Semua Asumsi

summary_df <- data.frame(
  No        = 1:5,
  Asumsi    = c(
    "Dependensi DV (Bartlett)",
    "Homogenitas Kovarians (Box's M)",
    "Normalitas Multivariat (Mardia/HZ)",
    "Linearitas Kovariat–DV",
    "Independensi Observasi"
  ),
  Metode    = c(
    "Bartlett Sphericity Test",
    "Box's M Test (biotools)",
    "Mardia + Henze-Zirkler (MVN)",
    "Korelasi Pearson + t-test + Slope Test",
    "Tinjauan Desain Penelitian"
  ),
  Statistik_Kunci = c(
    sprintf("χ²=%.2f, df=%d", chi2_stat, df_bart),
    sprintf("C=%.2f, df=%.0f", C_stat, df_box),
    "Mardia Skew p<0.001; HZ p<0.001",
    sprintf("r=%.3f~%.3f, semua p<0.05",
            min(as.numeric(corr_tbl$r)), max(as.numeric(corr_tbl$r))),
    "Cross-sectional, n=1000 siswa unik"
  ),
  p_value   = c(
    formatC(p_value, format="e", digits=3),
    round(p_boxm, 4),
    "< 0.05 (formal)",
    "< 0.001",
    "N/A"
  ),
  Status    = c(
    "✅ TERPENUHI",
    "✅ TERPENUHI",
    "⚠️ ROBUST (n besar)",
    "✅ TERPENUHI",
    "✅ TERPENUHI"
  ),
  Implikasi = c(
    "MANOVA layak digunakan",
    "Pooled covariance valid",
    "Lanjutkan; gunakan Pillai's Trace",
    "Kovariat valid untuk ANCOVA/MANCOVA",
    "Inferensi statistik valid"
  )
)

summary_df %>%
  kable(caption = "Tabel Ringkasan Pengujian Asumsi — MANOVA / ANCOVA / MANCOVA") %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = TRUE, font_size = 12) %>%
  column_spec(1, bold = TRUE, width = "2em") %>%
  column_spec(2, bold = TRUE, width = "14em") %>%
  column_spec(6, bold = TRUE,
              color = c("#1E8449","#1E8449","#E67E22","#1E8449","#1E8449")) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#2C3E50")
Tabel Ringkasan Pengujian Asumsi — MANOVA / ANCOVA / MANCOVA
No Asumsi Metode Statistik_Kunci p_value Status Implikasi
1 Dependensi DV (Bartlett) Bartlett Sphericity Test χ²=3532.78, df=3 0.000e+00 ✅ TERPENUHI | ANOVA layak digunakan |
2 Homogenitas Kovarians (Box’s M) Box’s M Test (biotools) C=9.25, df=6 0.1597 ✅ TERPENUHI | ooled covariance valid |
3 Normalitas Multivariat (Mardia/HZ) Mardia + Henze-Zirkler (MVN) Mardia Skew p<0.001; HZ p<0.001 < 0.05 (formal) ⚠️ ROBUST (n besar) |Lanjutkan; gunakan Pillai’s Trace
4 Linearitas Kovariat–DV Korelasi Pearson + t-test + Slope Test r=0.178~0.313, semua p<0.05 < 0.001 ✅ TERPENUHI | ovariat valid untuk ANCOVA/MANCOVA |
5 Independensi Observasi Tinjauan Desain Penelitian Cross-sectional, n=1000 siswa unik N/A ✅ TERPENUHI | nferensi statistik valid |

8.1 Final Conclusions

Conclusions of Assumption Testing

Of the five assumptions tested on the Students Performance in Exams dataset, four assumptions were fully met and one assumption (multivariate normality) was practically met given the large sample size ($n = 1000).

Feasibility of Further Analysis:

Analysis Feasibility Recommendations
One-Way MANOVA Feasible Gender as IV, Math/Reading/Writing as DV
ANCOVA Feasible Gender as IV, Test Prep as covariate, one DV
MANCOVA Feasible Gender as IV, Test Prep as covariate, all three DVs simultaneously

Recommended test statistic: Pillai’s Trace (more robust to violations of multivariate normality than Wilks’ Lambda).

Substantive Interpretation: - Academic score variables (math, reading, writing) are highly correlated → strong justification for using MANOVA - Test preparation covariates have been shown to have a linear and significant effect on all three DVs → MANCOVA will provide a more accurate estimate of the gender effect - Covariate regression slopes are homogeneous across groups → critical assumptions of ANCOVA are met


9 References

9.1 Data Source

9.2 Pustaka R yang Digunakan

  • MVN: Korkmaz, S., Goksuluk, D. & Zararsiz, G. (2014). MVN: An R package for Multivariate Normality Tests
  • biotools: Da Silva, A.R. (2017). A discriminant function using biotools::boxM
  • car: Fox, J. & Weisberg, S. (2019). An R Companion to Applied Regression
  • ggplot2, dplyr, tidyr: Tidyverse packages

9.3 Referensi Statistik

  • Box, G.E.P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317–346.
  • Bartlett, M.S. (1950). Tests of significance in factor analysis. British Journal of Psychology, 3, 77–85.
  • Mardia, K.V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57, 519–530.
  • Henze, N. & Zirkler, B. (1990). A class of invariant consistent tests for multivariate normality. Communications in Statistics, 19, 3595–3618.
  • Hair, J.F. et al. (2019). Multivariate Data Analysis (8th ed.). Cengage Learning.

Laporan Pengujian Asumsi — Analisis Multivariat

Tanggal: 15 April 2026

Prodi S1 Sains Data | FMIPA UNESA | 2026

Dokumen ini dibuat menggunakan R Markdown dengan paket MVN, biotools, car, dan tidyverse