1 Introduction

This report presents an Analysis of Covariance (ANCOVA) applied to student academic performance data sourced from the Students Performance in Exams dataset which is publicly available on Kaggle. The dataset contains records of 1,000 students and captures their scores across three subjects namely mathematics, reading and writing along with several demographic and contextual background variables.

The dataset includes eight variables in total. Gender is recorded as female or male while race/ethnicity is grouped into five categories from Group A to Group E. Parental level of education ranges from some high school up to master’s degree and lunch type serves as a proxy for socioeconomic status distinguishing between standard and free or reduced lunch. Test preparation course completion is also recorded as either completed or none. The three outcome variables are math score, reading score and writing score each measured on a continuous scale from 0 to 100.

In this analysis gender serves as the independent variable and parental level of education serves as the covariate which is encoded ordinally from 1 for some high school up to 6 for master’s degree. The remaining variables such as race/ethnicity, lunch type and test preparation are not included in the current models but are acknowledged as potential confounders that may influence academic outcomes.

ANCOVA extends ANOVA by incorporating a covariate allowing for a more accurate estimation of group differences after controlling for external influences. The main objective of this analysis is to examine whether gender differences in academic performance exist after adjusting for parental education level. Three separate one-way ANCOVA models are fitted with one model per dependent variable:

model_info <- data.frame(
  Model = c("Model 1", "Model 2", "Model 3"),
  DV = c("Math Score", "Reading Score", "Writing Score"),
  IV_Factor = c("Gender", "Gender", "Gender"),
  Covariate = c("Parental Education (Ordinal)",
                "Parental Education (Ordinal)",
                "Parental Education (Ordinal)")
)

knitr::kable(model_info,
             caption = "ANCOVA Model Specification",
             align = "lccc") %>%
  kableExtra::kable_styling(full_width = FALSE,
                            bootstrap_options = c("striped", "hover"))

ANCOVA Model Specification
Model	DV	IV_Factor	Covariate
Model 1	Math Score	Gender	Parental Education (Ordinal)
Model 2	Reading Score	Gender	Parental Education (Ordinal)
Model 3	Writing Score	Gender	Parental Education (Ordinal)

The general model applied for each dependent variable can be expressed as & DV ~ parental_edu_ord + gender. This specification indicates that academic performance is modeled as a function of gender while adjusting for parental education level.

2 Data Loading & Preparation

2.1 Import Dataset

df_raw <- read.csv("StudentsPerformance.csv", stringsAsFactors = FALSE)

colnames(df_raw) <- c(
  "gender", "race_ethnicity", "parental_education",
  "lunch", "test_prep", "math_score", "reading_score", "writing_score"
)

cat(sprintf("Rows: %d | Columns: %d\n", nrow(df_raw), ncol(df_raw)))

## Rows: 1000 | Columns: 8

2.2 Dataset Preview

kable(head(df_raw, 10), caption = "Preview: First 10 Rows of Dataset") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  scroll_box(width = "100%")

Preview: First 10 Rows of Dataset
gender	race_ethnicity	parental_education	lunch	test_prep	math_score	reading_score	writing_score
female	group B	bachelor’s degree	standard	none	72	72	74
female	group C	some college	standard	completed	69	90	88
female	group B	master’s degree	standard	none	90	95	93
male	group A	associate’s degree	free/reduced	none	47	57	44
male	group C	some college	standard	none	76	78	75
female	group B	associate’s degree	standard	none	71	83	78
female	group B	some college	standard	completed	88	95	92
male	group B	some college	free/reduced	none	40	43	39
male	group D	high school	free/reduced	completed	64	64	67
female	group B	high school	free/reduced	none	38	60	50

2.3 Preprocessing

2.3.1 Factor Conversion & Ordinal Encoding

df_raw$gender <- factor(df_raw$gender)

edu_order <- c(
  "some high school"   = 1,
  "high school"        = 2,
  "some college"       = 3,
  "associate's degree" = 4,
  "bachelor's degree"  = 5,
  "master's degree"    = 6
)

df_raw$parental_edu_ord <- as.integer(edu_order[df_raw$parental_education])

edu_mapping <- data.frame(
  Category = names(edu_order),
  Ordinal_Code = unname(edu_order)
)

knitr::kable(
  edu_mapping,
  caption = "Ordinal Encoding of Parental Education Level",
  align = c("l", "c")
) %>%
  kableExtra::kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover")
  )

Ordinal Encoding of Parental Education Level
Category	Ordinal_Code
some high school	1
high school	2
some college	3
associate’s degree	4
bachelor’s degree	5
master’s degree	6

2.3.2 Group Distribution

group_df <- data.frame(
  Factor = c(rep("Gender", 2), rep("Parental Education", 6)),
  Level  = c(levels(df_raw$gender), names(edu_order)),
  n      = c(as.integer(table(df_raw$gender)),
             as.integer(table(factor(df_raw$parental_education,
                                     levels = names(edu_order)))))
)

rownames(group_df) <- NULL

knitr::kable(
  group_df,
  caption = "Sample Distribution Across Factors",
  col.names = c("Factor", "Level", "n"),
  align = c("l", "l", "c")
) %>%
  kableExtra::kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover")
  )

Sample Distribution Across Factors
Factor	Level	n
Gender	female	518
Gender	male	482
Parental Education	some high school	179
Parental Education	high school	196
Parental Education	some college	226
Parental Education	associate’s degree	222
Parental Education	bachelor’s degree	118
Parental Education	master’s degree	59

2.4 Descriptive Statistics

dvs <- c("math_score", "reading_score", "writing_score")

desc_all <- data.frame(
  Variable = dvs,
  N        = sapply(df_raw[dvs], length),
  Mean     = round(sapply(df_raw[dvs], mean), 3),
  SD       = round(sapply(df_raw[dvs], sd), 3),
  Min      = sapply(df_raw[dvs], min),
  Q1       = sapply(df_raw[dvs], quantile, 0.25),
  Median   = sapply(df_raw[dvs], median),
  Q3       = sapply(df_raw[dvs], quantile, 0.75),
  Max      = sapply(df_raw[dvs], max),
  Skewness = round(sapply(df_raw[dvs], skewness), 3),
  Kurtosis = round(sapply(df_raw[dvs], kurtosis), 3)
)

rownames(desc_all) <- NULL

knitr::kable(
  desc_all,
  caption = "Descriptive Statistics of Dependent Variables",
  align = c("l", rep("c", 10))
) %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "hover"),
    full_width = FALSE
  ) %>%
  kableExtra::scroll_box(width = "100%")

Descriptive Statistics of Dependent Variables
Variable	N	Mean	SD	Min	Q1	Median	Q3	Max	Skewness	Kurtosis
math_score	1000	66.089	15.163	0	57.00	66	77	100	-0.279	3.268
reading_score	1000	69.169	14.600	17	59.00	70	79	100	-0.259	2.926
writing_score	1000	68.054	15.196	10	57.75	69	79	100	-0.289	2.961

desc_gender <- lapply(dvs, function(dv) {
  df_raw %>%
    group_by(gender) %>%
    summarise(
      DV     = dv,
      N      = n(),
      Mean   = round(mean(.data[[dv]]), 3),
      Median = round(median(.data[[dv]]), 3),
      SD     = round(sd(.data[[dv]]), 3),
      Min    = min(.data[[dv]]),
      Max    = max(.data[[dv]]),
      .groups = "drop"
    )
}) %>% bind_rows()

knitr::kable(
  desc_gender,
  caption = "Descriptive Statistics per Gender x DV",
  align = "lccccccc"
) %>%
  kableExtra::kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed"),
    font_size = 11
  )

Descriptive Statistics per Gender x DV
gender	DV	N	Mean	Median	SD	Min	Max
female	math_score	518	63.633	65	15.491	0	100
male	math_score	482	68.728	69	14.356	27	100
female	reading_score	518	72.608	73	14.378	17	100
male	reading_score	482	65.473	66	13.932	23	100
female	writing_score	518	72.467	74	14.845	10	100
male	writing_score	482	63.311	64	14.114	15	100

3 Exploratory Data Analysis

3.1 Score Distributions

df_long <- df_raw %>%
  select(math_score, reading_score, writing_score) %>%
  pivot_longer(everything(), names_to = "Subject", values_to = "Score") %>%
  mutate(Subject = case_when(
    Subject == "math_score"    ~ "Math",
    Subject == "reading_score" ~ "Reading",
    TRUE                       ~ "Writing"
  ))

ggplot(df_long, aes(x = Score, fill = Subject)) +
  geom_histogram(aes(y = after_stat(density)), bins = 25,
                 color = "white", alpha = 0.75) +
  geom_density(aes(color = Subject), linewidth = 0.9, fill = NA) +
  facet_wrap(~Subject, ncol = 3) +
  scale_fill_brewer(palette = "Set1", guide = "none") +
  scale_color_brewer(palette = "Set1", guide = "none") +
  labs(title = "Score Distributions by Subject",
       subtitle = "Histogram with kernel density estimate",
       x = "Score", y = "Density") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

3.2 Scores by Gender

df_gender_long <- df_raw %>%
  select(gender, math_score, reading_score, writing_score) %>%
  pivot_longer(-gender, names_to = "Subject", values_to = "Score") %>%
  mutate(Subject = case_when(
    Subject == "math_score"    ~ "Math",
    Subject == "reading_score" ~ "Reading",
    TRUE                       ~ "Writing"
  ))

ggplot(df_gender_long, aes(x = gender, y = Score, fill = gender)) +
  geom_boxplot(outlier.shape = 21, outlier.size = 1.5, alpha = 0.75) +
  geom_jitter(aes(color = gender), width = 0.15, alpha = 0.15, size = 0.8) +
  facet_wrap(~Subject, ncol = 3) +
  scale_fill_brewer(palette  = "Set1") +
  scale_color_brewer(palette = "Set1", guide = "none") +
  labs(title = "Score Distribution by Gender",
       x = "Gender", y = "Score", fill = "Gender") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

3.3 Covariate vs. Dependent Variables

df_cov_long <- df_raw %>%
  select(parental_edu_ord, math_score, reading_score, writing_score) %>%
  pivot_longer(-parental_edu_ord, names_to = "Subject", values_to = "Score") %>%
  mutate(Subject = case_when(
    Subject == "math_score"    ~ "Math",
    Subject == "reading_score" ~ "Reading",
    TRUE                       ~ "Writing"
  ))

ggplot(df_cov_long, aes(x = parental_edu_ord, y = Score, color = Subject)) +
  geom_jitter(alpha = 0.25, size = 0.9, width = 0.15) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1.2) +
  facet_wrap(~Subject, ncol = 3) +
  scale_color_brewer(palette = "Set1", guide = "none") +
  scale_x_continuous(breaks = 1:6,
    labels = c("Some HS", "HS", "Some Col", "Assoc", "Bach", "Master")) +
  labs(title = "Parental Education (Covariate) vs. Academic Scores",
       subtitle = "Linear trend with 95% confidence band",
       x = "Parental Education Level (Ordinal)", y = "Score") +
  theme_minimal(base_size = 11) +
  theme(plot.title  = element_text(face = "bold"),
        strip.text  = element_text(face = "bold"),
        axis.text.x = element_text(angle = 20, hjust = 1))

4 Model Fitting

The three ANCOVA models are fitted here early so all objects are available throughout the document, including in assumption tests and interpretation sections.

model_math    <- lm(math_score    ~ parental_edu_ord + gender, data = df_raw)
model_reading <- lm(reading_score ~ parental_edu_ord + gender, data = df_raw)
model_writing <- lm(writing_score ~ parental_edu_ord + gender, data = df_raw)

models_list <- list(
  "math_score"    = model_math,
  "reading_score" = model_reading,
  "writing_score" = model_writing
)

cat("Models fitted successfully.\n")

## Models fitted successfully.

cat(sprintf("  Model 1: math_score    ~ parental_edu_ord + gender | df = %d\n",
            df.residual(model_math)))

##   Model 1: math_score    ~ parental_edu_ord + gender | df = 997

cat(sprintf("  Model 2: reading_score ~ parental_edu_ord + gender | df = %d\n",
            df.residual(model_reading)))

##   Model 2: reading_score ~ parental_edu_ord + gender | df = 997

cat(sprintf("  Model 3: writing_score ~ parental_edu_ord + gender | df = %d\n",
            df.residual(model_writing)))

##   Model 3: writing_score ~ parental_edu_ord + gender | df = 997

5 ANCOVA Assumption Testing

5.1 Assumption 1 - Linearity (Covariate vs. DV)

Theoretical Basis:

ANCOVA assumes a linear relationship between the covariate (Parental Education) and the dependent variable (Student Scores). This assumption is critical because ANCOVA uses linear regression to “adjust” the group means based on the covariate. If the relationship is non-linear, the resulting adjusted means (EMMs) will be biased, and the statistical power of the test will be compromised.

Statistical Hypotheses:

\(H_0: \rho = 0\): There is no linear relationship between the covariate and the dependent variable.
\(H_1: \rho \neq 0\): There is a significant linear relationship between the covariate and the dependent variable.

lin_results <- lapply(dvs, function(dv) {

  label <- case_when(
    dv == "math_score"    ~ "Math Score",
    dv == "reading_score" ~ "Reading Score",
    dv == "writing_score" ~ "Writing Score"
  )

  ct <- cor.test(df_raw$parental_edu_ord, df_raw[[dv]])

  data.frame(
    DV      = label,
    r       = round(unname(ct$estimate), 4),
    t_stat  = round(unname(ct$statistic), 4),
    p_value = round(ct$p.value, 6),
    Sig     = ifelse(ct$p.value < 0.001, "***",
              ifelse(ct$p.value < 0.01,  "**",
              ifelse(ct$p.value < 0.05,  "*", "ns")))
  )
}) %>% bind_rows()

rownames(lin_results) <- NULL

r_math    <- lin_results$r[lin_results$DV == "Math Score"]
r_reading <- lin_results$r[lin_results$DV == "Reading Score"]
r_writing <- lin_results$r[lin_results$DV == "Writing Score"]

kable(lin_results,
      caption = "Pearson Correlation: Parental Education (Covariate) vs. DV",
      col.names = c("Dependent Variable", "r", "t statistic", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),, full_width = FALSE)

Pearson Correlation: Parental Education (Covariate) vs. DV
Dependent Variable	r	t statistic	Sig.
Math Score	0.1594	5.1019	***
Reading Score	0.1909	6.1440	***
Writing Score	0.2367	7.6969	***

p_lin <- lapply(dvs, function(dv) {
  label <- case_when(dv == "math_score" ~ "Math",
                     dv == "reading_score" ~ "Reading", TRUE ~ "Writing")
  ggplot(df_raw, aes(x = parental_edu_ord, y = .data[[dv]])) +
    geom_jitter(alpha = 0.3, color = "#2E86AB", width = 0.15, size = 1.2) +
    geom_smooth(method = "lm", se = TRUE, color = "#E84855", linewidth = 1.1) +
    scale_x_continuous(breaks = 1:6) +
    labs(title = paste("Linearity:", label),
         x = "Parental Education (Ordinal)", y = paste(label, "Score")) +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))
})
grid.arrange(grobs = p_lin, ncol = 3)

Interpretation: Pearson correlations are all significant (p < 0.001): math r = 0.1594, reading r = 0.1909, writing r = 0.2367. The linearity assumption is met for all models.

5.2 Assumption 2 - Homogeneity of Regression Slopes (Critical)

Theoretical Basis:

The Homogeneity of Regression Slopes assumption requires that the relationship between the covariate (Parental Education) and the dependent variable (Student Scores) is consistent across all levels of the independent variable (Gender). Visually, this means the regression lines for each group should be approximately parallel. If an interaction exists, it indicates that the effect of parental education on academic performance varies depending on the student’s gender, which would render the ANCOVA model’s main effects uninterpretable.

Statistical Hypotheses:

H0: The regression slopes are equal across groups (No interaction between the IV and Covariate).
H1: The regression slopes are significantly different (A significant interaction exists).

slope_results <- lapply(dvs, function(dv) {
  model_int <- lm(as.formula(paste(dv, "~ parental_edu_ord * gender")),
                  data = df_raw)
  aov_int   <- anova(model_int)
  int_row   <- aov_int["parental_edu_ord:gender", ]
  data.frame(
    DV       = dv,
    F_value  = round(int_row$`F value`, 4),
    p_value  = round(int_row$`Pr(>F)`, 4),
    Decision = ifelse(int_row$`Pr(>F)` > 0.05,
                      "Homogeneous", "Heterogeneous (ANCOVA invalid)")
  )
}) %>% bind_rows()
rownames(slope_results) <- NULL

kable(slope_results,
      caption = "Homogeneity of Regression Slopes: Interaction Test",
      col.names = c("DV", "F value", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  row_spec(which(grepl("OK", slope_results$Decision)), color = "darkgreen")

Homogeneity of Regression Slopes: Interaction Test
DV	F value	p-value	Decision
math_score	0.8128	0.3675	Homogeneous
reading_score	0.4402	0.5072	Homogeneous
writing_score	0.1621	0.6873	Homogeneous

p_slope <- lapply(dvs, function(dv) {
  label <- case_when(dv == "math_score" ~ "Math",
                     dv == "reading_score" ~ "Reading", TRUE ~ "Writing")
  ggplot(df_raw, aes(x = parental_edu_ord, y = .data[[dv]], color = gender)) +
    geom_jitter(alpha = 0.25, width = 0.15, size = 1.1) +
    geom_smooth(method = "lm", se = TRUE, linewidth = 1.2) +
    scale_color_brewer(palette = "Set1") +
    scale_x_continuous(breaks = 1:6) +
    labs(title = paste("Regression Slopes by Gender:", label),
         subtitle = "Parallel lines = homogeneous slopes",
         x = "Parental Education (Ordinal)",
         y = paste(label, "Score"), color = "Gender") +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))
})
grid.arrange(grobs = p_slope, ncol = 3)

Interpretation: The interaction term parental_edu_ord x gender is not significant (p > 0.05) for all DVs. Slopes are parallel across gender groups. ANCOVA is valid to proceed.

5.3 Assumption 3 - Normality of Residuals

Theoretical Basis:

ANCOVA assumes that the residuals (the differences between the observed values and the values predicted by the model) are normally distributed. This is essential for the validity of the \(F\)-test and the accuracy of the \(p\)-values. However, under the Central Limit Theorem, ANCOVA is generally robust to violations of normality when the sample size is sufficiently large (e.g., \(N > 30\) per group), as is the case in this study (\(N = 1000\)).

Statistical Hypotheses:

\(H_0\): The residuals follow a normal distribution.
\(H_1\): The residuals do not follow a normal distribution.

We aim to fail to reject \(H_0\) (\(p > 0.05\)). If \(H_0\) is rejected, we rely on the large sample size and visual inspection of Q-Q plots to justify the robustness of the model.

norm_results <- lapply(names(models_list), function(nm) {
  resids <- residuals(models_list[[nm]])
  sw     <- shapiro.test(resids)
  label  <- case_when(nm == "math_score" ~ "Math",
                      nm == "reading_score" ~ "Reading", TRUE ~ "Writing")
  data.frame(
    DV       = label,
    W        = round(sw$statistic, 5),
    p_value  = round(sw$p.value, 6),
    Decision = ifelse(sw$p.value > 0.05, "Normal",
                      "Non-normal (robust for n=1000)")
  )
}) %>% bind_rows()
rownames(norm_results) <- NULL

kable(norm_results,
      caption = "Shapiro-Wilk Test: Normality of Residuals",
      col.names = c("DV", "W statistic", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Shapiro-Wilk Test: Normality of Residuals
DV	W statistic	p-value	Decision
Math	0.99551	0.004961	Non-normal (robust for n=1000)
Reading	0.99307	0.000129	Non-normal (robust for n=1000)
Writing	0.99040	0.000004	Non-normal (robust for n=1000)

plot_list_norm <- lapply(names(models_list), function(nm) {
  label  <- case_when(nm == "math_score" ~ "Math",
                      nm == "reading_score" ~ "Reading", TRUE ~ "Writing")
  resids <- data.frame(resid = residuals(models_list[[nm]]))

  p_qq <- ggplot(resids, aes(sample = resid)) +
    stat_qq(alpha = 0.5, size = 1.1, color = "#2E86AB") +
    stat_qq_line(color = "#E84855", linewidth = 0.9) +
    labs(title = paste("Q-Q Plot:", label),
         x = "Theoretical Quantiles", y = "Sample Quantiles") +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))

  p_hist <- ggplot(resids, aes(x = resid)) +
    geom_histogram(aes(y = after_stat(density)), bins = 30,
                   fill = "#2E86AB", color = "white", alpha = 0.75) +
    geom_density(color = "#E84855", linewidth = 1) +
    labs(title = paste("Residual Distribution:", label),
         x = "Residuals", y = "Density") +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))

  list(p_qq, p_hist)
})

grid.arrange(
  plot_list_norm[[1]][[1]], plot_list_norm[[1]][[2]],
  plot_list_norm[[2]][[1]], plot_list_norm[[2]][[2]],
  plot_list_norm[[3]][[1]], plot_list_norm[[3]][[2]],
  ncol = 2
)

Interpretation: Shapiro-Wilk yields p < 0.05, but with n = 1,000 the Central Limit Theorem ensures approximate normality. ANCOVA is robust at this sample size.

5.4 Assumption 4 - Homogeneity of Variance (Levene’s Test)

Theoretical Basis:

This assumption, also known as homoscedasticity, requires that the variance of the dependent variable is equal across all groups (Gender). If the variances are significantly different, the \(F\)-test becomes unreliable because the error term will be biased, potentially leading to Type I or Type II errors.

Statistical Hypotheses:

\(H_0: \sigma^2_{female} = \sigma^2_{male}\): The group variances are equal.
\(H_1: \sigma^2_{female} \neq \sigma^2_{male}\): The group variances are significantly different.

We aim to fail to reject \(H_0\) (\(p > 0.05\)). A non-significant Levene’s test indicates that the assumption of equal variances is met, ensuring the stability of the ANCOVA results.

lev_results <- lapply(dvs, function(dv) {
  lt    <- leveneTest(df_raw[[dv]] ~ df_raw$gender)
  label <- case_when(dv == "math_score" ~ "Math",
                     dv == "reading_score" ~ "Reading", TRUE ~ "Writing")
  data.frame(
    DV       = label,
    F_value  = round(lt$`F value`[1], 4),
    p_value  = round(lt$`Pr(>F)`[1],  4),
    Decision = ifelse(lt$`Pr(>F)`[1] > 0.05, "Met", "Violated")
  )
}) %>% bind_rows()
rownames(lev_results) <- NULL

kable(lev_results,
      caption = "Levene's Test: Homogeneity of Variance",
      col.names = c("DV", "F value", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  row_spec(which(lev_results$Decision == "Met"), color = "darkgreen")

Levene’s Test: Homogeneity of Variance
DV	F value	p-value	Decision
Math	0.3464	0.5563	Met
Reading	0.0188	0.8911	Met
Writing	0.0069	0.9336	Met

ggplot(df_gender_long, aes(x = gender, y = Score, fill = gender)) +
  geom_boxplot(outlier.shape = 21, outlier.size = 1.5, alpha = 0.75) +
  stat_summary(fun = mean, geom = "point", shape = 23,
               size = 3, fill = "white", color = "black") +
  facet_wrap(~Subject, ncol = 3) +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "Variance Comparison by Gender",
       subtitle = "Diamond = group mean",
       x = "Gender", y = "Score", fill = "Gender") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

Interpretation: All three DVs pass Levene’s test (p > 0.05). Homogeneity of variance is met.

5.5 Assumption Summary

assump_summary <- data.frame(
  No         = 1:4,
  Assumption = c("Linearity (Covariate-DV)",
                 "Homogeneity of Regression Slopes",
                 "Normality of Residuals",
                 "Homogeneity of Variance (Levene)"),
  Test       = c("Pearson Correlation",
                 "Covariate x IV Interaction (F-test)",
                 "Shapiro-Wilk + Q-Q Plot",
                 "Levene's Test"),
  Math    = c(paste0("r = ", r_math, ", p < .001"),
              paste0("F = ", slope_results$F_value[1], ", p = ", slope_results$p_value[1]),
              paste0("W = ", norm_results$W[norm_results$DV == "Math"],
                     ", robust n=1000"),
              paste0("F = ", lev_results$F_value[lev_results$DV == "Math"],
                     ", p = ", lev_results$p_value[lev_results$DV == "Math"])),
  Reading = c(paste0("r = ", r_reading, ", p < .001"),
              paste0("F = ", slope_results$F_value[2], ", p = ", slope_results$p_value[2]),
              paste0("W = ", norm_results$W[norm_results$DV == "Reading"],
                     ", robust n=1000"),
              paste0("F = ", lev_results$F_value[lev_results$DV == "Reading"],
                     ", p = ", lev_results$p_value[lev_results$DV == "Reading"])),
  Writing = c(paste0("r = ", r_writing, ", p < .001"),
              paste0("F = ", slope_results$F_value[3], ", p = ", slope_results$p_value[3]),
              paste0("W = ", norm_results$W[norm_results$DV == "Writing"],
                     ", robust n=1000"),
              paste0("F = ", lev_results$F_value[lev_results$DV == "Writing"],
                     ", p = ", lev_results$p_value[lev_results$DV == "Writing"])),
  Status  = c("Met", "Met", "Robust (n=1000)", "Met")
)

kable(assump_summary,
      caption = "ANCOVA Assumption Testing Summary",
      col.names = c("No.", "Assumption", "Test", "Math", "Reading",
                    "Writing", "Status")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = TRUE) %>%
  scroll_box(width = "100%")

ANCOVA Assumption Testing Summary
No.	Assumption	Test	Math	Reading	Writing	Status
1	Linearity (Covariate-DV)	Pearson Correlation	r = 0.1594, p < .001	r = 0.1909, p < .001	r = 0.2367, p < .001	Met
2	Homogeneity of Regression Slopes	Covariate x IV Interaction (F-test)	F = 0.8128, p = 0.3675	F = 0.4402, p = 0.5072	F = 0.1621, p = 0.6873	Met
3	Normality of Residuals	Shapiro-Wilk + Q-Q Plot	W = 0.99551, robust n=1000	W = 0.99307, robust n=1000	W = 0.9904, robust n=1000	Robust (n=1000)
4	Homogeneity of Variance (Levene)	Levene’s Test	F = 0.3464, p = 0.5563	F = 0.0188, p = 0.8911	F = 0.0069, p = 0.9336	Met

6 ANCOVA Results

6.1 Model 1 - Math Score

6.1.1 ANOVA Table

anova_math <- anova(model_math)

kable(round(anova_math, 4),
      caption = "ANCOVA Table: Math Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

ANCOVA Table: Math Score
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
parental_edu_ord	1	5838.353	5838.3529	26.8484	0
gender	1	7046.756	7046.7562	32.4054	0
Residuals	997	216803.970	217.4563	NA	NA

6.1.2 Model Fit Statistics

sm_math <- summary(model_math)
fit_math <- data.frame(
  Metric = c("R-squared", "Adjusted R-squared", "F statistic", "p-value (overall)"),
  Value  = c(round(sm_math$r.squared, 4),
             round(sm_math$adj.r.squared, 4),
             round(sm_math$fstatistic[1], 3),
             round(pf(sm_math$fstatistic[1], sm_math$fstatistic[2],
                      sm_math$fstatistic[3], lower.tail = FALSE), 6))
)
kable(fit_math, caption = "Model Fit: Math Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Model Fit: Math Score
Metric	Value
R-squared	0.0561
Adjusted R-squared	0.0542
F statistic	29.6270
p-value (overall)	0.0000

6.1.3 Regression Coefficients

coef_math <- as.data.frame(summary(model_math)$coefficients)
coef_math$Term <- rownames(coef_math)
coef_math <- coef_math %>%
  select(Term, Estimate, `Std. Error`, `t value`, `Pr(>|t|)`) %>%
  mutate(across(where(is.numeric), ~round(.x, 4)),
         Sig = ifelse(`Pr(>|t|)` < 0.001, "***",
               ifelse(`Pr(>|t|)` < 0.01,  "**",
               ifelse(`Pr(>|t|)` < 0.05,  "*", "ns"))))
rownames(coef_math) <- NULL

kable(coef_math,
      caption = "Regression Coefficients: Model 1 (Math Score)",
      col.names = c("Term", "Estimate", "SE", "t value", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Regression Coefficients: Model 1 (Math Score)
Term	Estimate	SE	t value	Sig.
(Intercept)	58.1791	1.1958	48.6524	***
parental_edu_ord	1.7354	0.3198	5.4266	***
gendermale	5.3177	0.9342	5.6926	***

6.1.4 Effect Size (Partial eta-squared)

eta_math <- eta_squared(model_math, partial = TRUE)

kable(as.data.frame(eta_math) %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: Model 1 (Math Score)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Partial Eta-Squared: Model 1 (Math Score)
Parameter	Eta2_partial	CI	CI_low	CI_high
parental_edu_ord	0.0262	0.95	0.0123	1
gender	0.0315	0.95	0.0161	1

6.1.5 Adjusted Means (EMM)

emm_math_obj <- emmeans(model_math, ~ gender)
emm_math_df  <- as.data.frame(emm_math_obj)

kable(emm_math_df %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Estimated Marginal Means: Math Score (adjusted for parental education)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Estimated Marginal Means: Math Score (adjusted for parental education)
gender	emmean	SE	df	lower.CL	upper.CL
female	63.5259	0.6482	997	62.2538	64.7979
male	68.8436	0.6720	997	67.5249	70.1623

6.1.6 Pairwise Contrast (Female vs. Male)

contrast_math <- as.data.frame(pairs(emm_math_obj))

kable(contrast_math %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Pairwise Contrast: Female vs. Male, Math Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Pairwise Contrast: Female vs. Male, Math Score
contrast	estimate	SE	df	t.ratio	p.value
female - male	-5.3177	0.9342	997	-5.6926	0

6.2 Model 2 - Reading Score

6.2.1 ANOVA Table

anova_reading <- anova(model_reading)

kable(round(anova_reading, 4),
      caption = "ANCOVA Table: Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

ANCOVA Table: Reading Score
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
parental_edu_ord	1	7761.257	7761.2573	40.0278	0
gender	1	11876.017	11876.0168	61.2491	0
Residuals	997	193315.165	193.8969	NA	NA

6.2.2 Model Fit Statistics

sm_reading <- summary(model_reading)
fit_reading <- data.frame(
  Metric = c("R-squared", "Adjusted R-squared", "F statistic", "p-value (overall)"),
  Value  = c(round(sm_reading$r.squared, 4),
             round(sm_reading$adj.r.squared, 4),
             round(sm_reading$fstatistic[1], 3),
             round(pf(sm_reading$fstatistic[1], sm_reading$fstatistic[2],
                      sm_reading$fstatistic[3], lower.tail = FALSE), 6))
)
kable(fit_reading, caption = "Model Fit: Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Model Fit: Reading Score
Metric	Value
R-squared	0.0922
Adjusted R-squared	0.0904
F statistic	50.6380
p-value (overall)	0.0000

6.2.3 Regression Coefficients

coef_reading <- as.data.frame(summary(model_reading)$coefficients)
coef_reading$Term <- rownames(coef_reading)
coef_reading <- coef_reading %>%
  select(Term, Estimate, `Std. Error`, `t value`, `Pr(>|t|)`) %>%
  mutate(across(where(is.numeric), ~round(.x, 4)),
         Sig = ifelse(`Pr(>|t|)` < 0.001, "***",
               ifelse(`Pr(>|t|)` < 0.01,  "**",
               ifelse(`Pr(>|t|)` < 0.05,  "*", "ns"))))
rownames(coef_reading) <- NULL

kable(coef_reading,
      caption = "Regression Coefficients: Model 2 (Reading Score)",
      col.names = c("Term", "Estimate", "SE", "t value", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Regression Coefficients: Model 2 (Reading Score)
Term	Estimate	SE	t value	Sig.
(Intercept)	66.9357	1.1292	59.2784	***
parental_edu_ord	1.8048	0.3020	5.9768	***
gendermale	-6.9035	0.8821	-7.8262	***

6.2.4 Effect Size (Partial eta-squared)

eta_reading <- eta_squared(model_reading, partial = TRUE)

kable(as.data.frame(eta_reading) %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: Model 2 (Reading Score)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Partial Eta-Squared: Model 2 (Reading Score)
Parameter	Eta2_partial	CI	CI_low	CI_high
parental_edu_ord	0.0386	0.95	0.0214	1
gender	0.0579	0.95	0.0366	1

6.2.5 Adjusted Means (EMM)

emm_reading_obj <- emmeans(model_reading, ~ gender)
emm_reading_df  <- as.data.frame(emm_reading_obj)

kable(emm_reading_df %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Estimated Marginal Means: Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Estimated Marginal Means: Reading Score
gender	emmean	SE	df	lower.CL	upper.CL
female	72.4965	0.6121	997	71.2953	73.6976
male	65.5930	0.6346	997	64.3478	66.8383

6.2.6 Pairwise Contrast (Female vs. Male)

contrast_reading <- as.data.frame(pairs(emm_reading_obj))

kable(contrast_reading %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Pairwise Contrast: Female vs. Male, Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Pairwise Contrast: Female vs. Male, Reading Score
contrast	estimate	SE	df	t.ratio	p.value
female - male	6.9035	0.8821	997	7.8262	0

6.3 Model 3 - Writing Score

6.3.1 ANOVA Table

anova_writing <- anova(model_writing)

kable(round(anova_writing, 4),
      caption = "ANCOVA Table: Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

ANCOVA Table: Writing Score
	Df	Sum Sq	Mean Sq	F value	Pr(>F)
parental_edu_ord	1	12925.78	12925.7766	65.0192	0
gender	1	19548.24	19548.2353	98.3314	0
Residuals	997	198203.07	198.7995	NA	NA

6.3.2 Model Fit Statistics

sm_writing <- summary(model_writing)
fit_writing <- data.frame(
  Metric = c("R-squared", "Adjusted R-squared", "F statistic", "p-value (overall)"),
  Value  = c(round(sm_writing$r.squared, 4),
             round(sm_writing$adj.r.squared, 4),
             round(sm_writing$fstatistic[1], 3),
             round(pf(sm_writing$fstatistic[1], sm_writing$fstatistic[2],
                      sm_writing$fstatistic[3], lower.tail = FALSE), 6))
)
kable(fit_writing, caption = "Model Fit: Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Model Fit: Writing Score
Metric	Value
R-squared	0.1408
Adjusted R-squared	0.1391
F statistic	81.6750
p-value (overall)	0.0000

6.3.3 Regression Coefficients

coef_writing <- as.data.frame(summary(model_writing)$coefficients)
coef_writing$Term <- rownames(coef_writing)
coef_writing <- coef_writing %>%
  select(Term, Estimate, `Std. Error`, `t value`, `Pr(>|t|)`) %>%
  mutate(across(where(is.numeric), ~round(.x, 4)),
         Sig = ifelse(`Pr(>|t|)` < 0.001, "***",
               ifelse(`Pr(>|t|)` < 0.01,  "**",
               ifelse(`Pr(>|t|)` < 0.05,  "*", "ns"))))
rownames(coef_writing) <- NULL

kable(coef_writing,
      caption = "Regression Coefficients: Model 3 (Writing Score)",
      col.names = c("Term", "Estimate", "SE", "t value", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Regression Coefficients: Model 3 (Writing Score)
Term	Estimate	SE	t value	Sig.
(Intercept)	65.1445	1.1434	56.9762	***
parental_edu_ord	2.3300	0.3058	7.6200	***
gendermale	-8.8570	0.8932	-9.9162	***

6.3.4 Effect Size (Partial eta-squared)

eta_writing <- eta_squared(model_writing, partial = TRUE)

kable(as.data.frame(eta_writing) %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: Model 3 (Writing Score)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Partial Eta-Squared: Model 3 (Writing Score)
Parameter	Eta2_partial	CI	CI_low	CI_high
parental_edu_ord	0.0612	0.95	0.0393	1
gender	0.0898	0.95	0.0636	1

6.3.5 Adjusted Means (EMM)

emm_writing_obj <- emmeans(model_writing, ~ gender)
emm_writing_df  <- as.data.frame(emm_writing_obj)

kable(emm_writing_df %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Estimated Marginal Means: Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Estimated Marginal Means: Writing Score
gender	emmean	SE	df	lower.CL	upper.CL
female	72.3231	0.6198	997	71.1068	73.5393
male	63.4661	0.6425	997	62.2052	64.7270

6.3.6 Pairwise Contrast (Female vs. Male)

contrast_writing <- as.data.frame(pairs(emm_writing_obj))

kable(contrast_writing %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Pairwise Contrast: Female vs. Male, Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Pairwise Contrast: Female vs. Male, Writing Score
contrast	estimate	SE	df	t.ratio	p.value
female - male	8.857	0.8932	997	9.9162	0

7 Cross-Model Comparison

7.1 ANCOVA Results Across All Three DVs

extract_anova_row <- function(aov_table, term, dv_name) {
  data.frame(
    DV      = dv_name,
    Term    = term,
    Df      = aov_table[term, "Df"],
    SS      = round(aov_table[term, "Sum Sq"], 3),
    MS      = round(aov_table[term, "Mean Sq"], 3),
    F_value = round(aov_table[term, "F value"], 3),
    p_value = round(aov_table[term, "Pr(>F)"], 4),
    Sig     = ifelse(aov_table[term, "Pr(>F)"] < 0.001, "***",
              ifelse(aov_table[term, "Pr(>F)"] < 0.01,  "**",
              ifelse(aov_table[term, "Pr(>F)"] < 0.05,  "*", "ns")))
  )
}

full_comparison <- bind_rows(
  extract_anova_row(anova_math,    "parental_edu_ord", "Math Score"),
  extract_anova_row(anova_math,    "gender",           "Math Score"),
  extract_anova_row(anova_reading, "parental_edu_ord", "Reading Score"),
  extract_anova_row(anova_reading, "gender",           "Reading Score"),
  extract_anova_row(anova_writing, "parental_edu_ord", "Writing Score"),
  extract_anova_row(anova_writing, "gender",           "Writing Score")
)

kable(full_comparison,
      caption = "ANCOVA Results: All Three Models",
      col.names = c("DV", "Term", "df", "SS", "MS", "F", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  row_spec(which(full_comparison$Term == "gender"), background = "#f0f8ff")

ANCOVA Results: All Three Models
DV	Term	df	SS	MS	F	Sig.
Math Score	parental_edu_ord	1	5838.353	5838.353	26.848	***
Math Score	gender	1	7046.756	7046.756	32.405	***
Reading Score	parental_edu_ord	1	7761.257	7761.257	40.028	***
Reading Score	gender	1	11876.017	11876.017	61.249	***
Writing Score	parental_edu_ord	1	12925.777	12925.777	65.019	***
Writing Score	gender	1	19548.235	19548.235	98.331	***

7.2 Model Fit Comparison

model_fit_all <- data.frame(
  DV        = c("Math Score", "Reading Score", "Writing Score"),
  R2        = round(c(sm_math$r.squared,    sm_reading$r.squared,    sm_writing$r.squared),    4),
  Adj_R2    = round(c(sm_math$adj.r.squared, sm_reading$adj.r.squared, sm_writing$adj.r.squared), 4),
  F_overall = round(c(sm_math$fstatistic[1], sm_reading$fstatistic[1], sm_writing$fstatistic[1]), 3),
  p_overall = round(c(
    pf(sm_math$fstatistic[1],    sm_math$fstatistic[2],    sm_math$fstatistic[3],    lower.tail = FALSE),
    pf(sm_reading$fstatistic[1], sm_reading$fstatistic[2], sm_reading$fstatistic[3], lower.tail = FALSE),
    pf(sm_writing$fstatistic[1], sm_writing$fstatistic[2], sm_writing$fstatistic[3], lower.tail = FALSE)
  ), 6)
)

kable(model_fit_all,
      caption = "Model Fit Statistics: All Three ANCOVA Models",
      col.names = c("DV", "R-squared", "Adj. R-squared",
                    "F (overall)", "p (overall)")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Model Fit Statistics: All Three ANCOVA Models
DV	R-squared	Adj. R-squared	F (overall)
Math Score	0.0561	0.0542	29.627
Reading Score	0.0922	0.0904	50.638
Writing Score	0.1408	0.1391	81.675

7.3 Effect Size Comparison

# eta_all built here, before it is referenced in the Interpretation section
eta_all <- bind_rows(
  as.data.frame(eta_math)    %>% mutate(DV = "Math Score"),
  as.data.frame(eta_reading) %>% mutate(DV = "Reading Score"),
  as.data.frame(eta_writing) %>% mutate(DV = "Writing Score")
) %>%
  mutate(
    Magnitude = case_when(
      Eta2_partial >= 0.14 ~ "Large",
      Eta2_partial >= 0.06 ~ "Medium",
      Eta2_partial >= 0.01 ~ "Small",
      TRUE                 ~ "Negligible"
    )
  )

kable(eta_all %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: All Three ANCOVA Models") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Partial Eta-Squared: All Three ANCOVA Models
Parameter	Eta2_partial	CI	CI_low	CI_high	DV	Magnitude
parental_edu_ord	0.0262	0.95	0.0123	1	Math Score	Small
gender	0.0315	0.95	0.0161	1	Math Score	Small
parental_edu_ord	0.0386	0.95	0.0214	1	Reading Score	Small
gender	0.0579	0.95	0.0366	1	Reading Score	Small
parental_edu_ord	0.0612	0.95	0.0393	1	Writing Score	Medium
gender	0.0898	0.95	0.0636	1	Writing Score	Medium

eta_gender <- eta_all %>% filter(Parameter == "gender")

ggplot(eta_gender, aes(x = DV, y = Eta2_partial, fill = DV)) +
  geom_col(alpha = 0.85, color = "white", width = 0.55) +
  geom_text(aes(label = paste0("eta2 = ", round(Eta2_partial, 4))),
            vjust = -0.5, size = 3.8, fontface = "bold") +
  geom_hline(yintercept = 0.01, linetype = "dashed", color = "gray60") +
  geom_hline(yintercept = 0.06, linetype = "dashed", color = "#E84855") +
  geom_hline(yintercept = 0.14, linetype = "dashed", color = "#2E86AB") +
  annotate("text", x = 0.6, y = 0.014, label = "Small (0.01)",  size = 2.8, color = "gray50") +
  annotate("text", x = 0.6, y = 0.074, label = "Medium (0.06)", size = 2.8, color = "#E84855") +
  annotate("text", x = 0.6, y = 0.154, label = "Large (0.14)",  size = 2.8, color = "#2E86AB") +
  scale_fill_brewer(palette = "Set1", guide = "none") +
  ylim(0, 0.20) +
  labs(title = "Gender Effect Size (Partial eta-squared) per ANCOVA Model",
       subtitle = "After controlling for parental education level",
       x = "Dependent Variable", y = "Partial eta-squared") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"))

7.4 Cohen’s d - Unadjusted Gender Differences

cohens_d_df <- lapply(dvs, function(dv) {
  female    <- df_raw[[dv]][df_raw$gender == "female"]
  male      <- df_raw[[dv]][df_raw$gender == "male"]
  pooled_sd <- sqrt(((length(female) - 1) * var(female) +
                     (length(male)   - 1) * var(male)) /
                    (length(female) + length(male) - 2))
  d <- (mean(female) - mean(male)) / pooled_sd
  data.frame(
    DV          = case_when(dv == "math_score" ~ "Math",
                            dv == "reading_score" ~ "Reading", TRUE ~ "Writing"),
    Mean_Female = round(mean(female), 2),
    Mean_Male   = round(mean(male),   2),
    Difference  = round(mean(female) - mean(male), 2),
    Cohen_d     = round(d, 4),
    Magnitude   = ifelse(abs(d) >= 0.8, "Large",
                  ifelse(abs(d) >= 0.5, "Medium",
                  ifelse(abs(d) >= 0.2, "Small", "Negligible")))
  )
}) %>% bind_rows()
rownames(cohens_d_df) <- NULL

kable(cohens_d_df,
      caption = "Cohen's d: Unadjusted Gender Effect per DV",
      col.names = c("DV", "Mean Female", "Mean Male",
                    "Difference", "Cohen's d", "Magnitude")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)

Cohen’s d: Unadjusted Gender Effect per DV
DV	Mean Female	Mean Male	Difference	Cohen’s d	Magnitude
Math	63.63	68.73	-5.10	-0.3407	Small
Reading	72.61	65.47	7.14	0.5037	Medium
Writing	72.47	63.31	9.16	0.6316	Medium

7.5 Adjusted vs. Unadjusted Means

emm_all_plot <- bind_rows(
  as.data.frame(emm_math_obj)    %>% mutate(DV = "Math",    Type = "Adjusted"),
  as.data.frame(emm_reading_obj) %>% mutate(DV = "Reading", Type = "Adjusted"),
  as.data.frame(emm_writing_obj) %>% mutate(DV = "Writing", Type = "Adjusted")
)

raw_means <- df_raw %>%
  group_by(gender) %>%
  summarise(
    Math    = mean(math_score),
    Reading = mean(reading_score),
    Writing = mean(writing_score),
    .groups = "drop"
  ) %>%
  pivot_longer(-gender, names_to = "DV", values_to = "emmean") %>%
  mutate(
    Type     = "Unadjusted",
    lower.CL = emmean - 1.96 * 2,
    upper.CL = emmean + 1.96 * 2
  )

combined <- bind_rows(
  emm_all_plot %>% select(gender, DV, emmean, lower.CL, upper.CL, Type),
  raw_means    %>% select(gender, DV, emmean, lower.CL, upper.CL, Type)
)

ggplot(combined, aes(x = gender, y = emmean, fill = Type,
                     ymin = lower.CL, ymax = upper.CL)) +
  geom_col(position = position_dodge(0.7), width = 0.6,
           alpha = 0.85, color = "white") +
  geom_errorbar(position = position_dodge(0.7), width = 0.2, linewidth = 0.7) +
  facet_wrap(~DV, ncol = 3) +
  scale_fill_manual(values = c("Adjusted" = "#2E86AB", "Unadjusted" = "#E84855")) +
  labs(title = "Adjusted vs. Unadjusted Means by Gender",
       subtitle = "Adjusted = parental education held at mean",
       x = "Gender", y = "Mean Score", fill = "Type") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

7.6 Residual Diagnostics

diag_plots <- lapply(names(models_list), function(nm) {
  label    <- case_when(nm == "math_score" ~ "Math",
                        nm == "reading_score" ~ "Reading", TRUE ~ "Writing")
  resid_df <- data.frame(
    Fitted    = fitted(models_list[[nm]]),
    Residuals = residuals(models_list[[nm]])
  )

  p_rf <- ggplot(resid_df, aes(x = Fitted, y = Residuals)) +
    geom_point(alpha = 0.3, size = 1, color = "#2E86AB") +
    geom_hline(yintercept = 0, color = "#E84855",
               linetype = "dashed", linewidth = 0.8) +
    geom_smooth(method = "loess", se = FALSE,
                color = "gray40", linewidth = 0.7) +
    labs(title = paste(label, "Residuals vs Fitted"),
         x = "Fitted", y = "Residuals") +
    theme_minimal(base_size = 9) +
    theme(plot.title = element_text(face = "bold", size = 9))

  p_qq <- ggplot(resid_df, aes(sample = Residuals)) +
    stat_qq(alpha = 0.45, size = 1, color = "#2E86AB") +
    stat_qq_line(color = "#E84855", linewidth = 0.8) +
    labs(title = paste(label, "Normal Q-Q"),
         x = "Theoretical", y = "Sample") +
    theme_minimal(base_size = 9) +
    theme(plot.title = element_text(face = "bold", size = 9))

  list(p_rf, p_qq)
})

grid.arrange(
  diag_plots[[1]][[1]], diag_plots[[1]][[2]],
  diag_plots[[2]][[1]], diag_plots[[2]][[2]],
  diag_plots[[3]][[1]], diag_plots[[3]][[2]],
  ncol = 2
)

8 Interpretation and Conclusion

8.1 Final Summary Table

final_summary <- data.frame(
  Model       = c("Model 1: Math Score",
                  "Model 2: Reading Score",
                  "Model 3: Writing Score"),
  F_Covariate = c(f_cov_math, f_cov_reading, f_cov_writing),
  p_Covariate = c("< 0.001", "< 0.001", "< 0.001"),
  F_Gender    = c(f_gender_math, f_gender_reading, f_gender_writing),
  p_Gender    = c("< 0.001", "< 0.001", "< 0.001"),
  Eta2        = c(eta_g_math, eta_g_reading, eta_g_writing),
  Direction   = c("Male > Female", "Female > Male", "Female > Male"),
  Effect      = c("Small", "Small-Medium", "Medium")
)

kable(final_summary,
      caption = "Final ANCOVA Summary: All Three Models",
      col.names = c("Model", "F (Covariate)", "p (Covariate)",
                    "F (Gender)", "p (Gender)",
                    "Partial eta2", "Direction", "Effect Size")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = TRUE) %>%
  scroll_box(width = "100%")

Final ANCOVA Summary: All Three Models
Model	F (Covariate)	p (Covariate)	F (Gender)	p (Gender)	Partial eta2	Direction	Effect Size
Model 1: Math Score	26.848	< 0.001	32.405	< 0.001	0.0315	Male > Female	Small
Model 2: Reading Score	40.028	< 0.001	61.249	< 0.001	0.0579	Female > Male	Small-Medium
Model 3: Writing Score	65.019	< 0.001	98.331	< 0.001	0.0898	Female > Male	Medium

The ANCOVA summary indicates that both parental education and gender have a statistically significant effect on all academic outcomes (Math, Reading and Writing) with p-values < 0.001. In Model 1 (Math Score), parental education shows a significant effect (F = 26.848), and gender is also significant (F = 32.405), where male students score higher in mathematics after controlling for the covariate, although the effect size is small (η² = 0.0315). In Model 2 (Reading Score), parental education has a stronger influence (F = 40.028) and gender remains significant (F = 61.249), with female students outperforming male students in reading. The effect size falls in the small to small-to-medium range (η² = 0.0579).

In Model 3 (Writing Score), parental education has the strongest effect among all models (F = 65.019), while gender also shows its largest influence here (F = 98.331), with female students again outperforming male students. The effect size for gender is in the medium range (η² = 0.0898) that indicate a more practically meaningful difference in writing compared to the other domains. Overall, although gender differences remain statistically significant across all academic subjects after controlling for parental education, the magnitude of these effects is relatively small to moderate, suggesting that other unmeasured factors may also contribute to students’ academic performance.

8.2 Methodological Notes

note_table <- data.frame(
  No = 1:5,
  Note = c(
    "Ordinal encoding of parental education assumes equal intervals",
    "Normality of residuals technically violated",
    "All interaction terms non-significant",
    "Dataset is cross-sectional and simulated",
    "Lunch type and test prep not controlled"
  ),
  Implication = c(
    "May not capture non-linear educational returns",
    "Robust at n = 1,000 due to Central Limit Theorem",
    "ANCOVA valid; slopes parallel across gender groups",
    "Causal claims not warranted",
    "See Full Factorial MANCOVA for more comprehensive model"
  )
)

knitr::kable(note_table,
             caption = "Methodological Notes",
             align = "c") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  ) %>%
  kableExtra::column_spec(2, width = "40%") %>%
  kableExtra::column_spec(3, width = "40%")

Methodological Notes
No	Note	Implication
1	Ordinal encoding of parental education assumes equal intervals	May not capture non-linear educational returns
2	Normality of residuals technically violated	Robust at n = 1,000 due to Central Limit Theorem
3	All interaction terms non-significant	ANCOVA valid; slopes parallel across gender groups
4	Dataset is cross-sectional and simulated	Causal claims not warranted
5	Lunch type and test prep not controlled	See Full Factorial MANCOVA for more comprehensive model

9 References

Huitema, B.E. (2011). The Analysis of Covariance and Alternatives (2nd ed.). Wiley.
Tabachnick, B.G., & Fidell, L.S. (2019). Using Multivariate Statistics (7th ed.). Pearson.

10 Data Source

Dataset: Students Performance in Exams
Link of Dataset: https://www.kaggle.com/datasets/spscientist/students-performance-in-exams
Source: Kaggle.

11 R Packages Used

package_table <- data.frame(
  Package = c("tidyverse",
              "knitr + kableExtra",
              "car",
              "emmeans",
              "effectsize",
              "gridExtra",
              "moments"),
  Purpose = c("Data wrangling and visualization",
              "Table formatting",
              "Levene's test",
              "Estimated marginal means and pairwise contrasts",
              "Partial eta-squared",
              "Multi-panel plot layout",
              "Skewness and kurtosis")
)

kable(package_table,
      caption = "R Packages Used in the Analysis",
      align = c("l", "l")) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )

R Packages Used in the Analysis
Package	Purpose
tidyverse	Data wrangling and visualization
knitr + kableExtra	Table formatting
car	Levene’s test
emmeans	Estimated marginal means and pairwise contrasts
effectsize	Partial eta-squared
gridExtra	Multi-panel plot layout
moments	Skewness and kurtosis

ANCOVA Analysis Report — Multivariate Analysis

Date: 20 April 2026

S1 Data Science | FMIPA UNESA | 2026

ANCOVA Analysis of Student Academic Performance

Multivariate Analysis - S1 Data Science UNESA

2026-04-20

1 Introduction

2 Data Loading & Preparation

2.1 Import Dataset

2.2 Dataset Preview

2.3 Preprocessing

2.3.1 Factor Conversion & Ordinal Encoding

2.3.2 Group Distribution

2.4 Descriptive Statistics

3 Exploratory Data Analysis

3.1 Score Distributions

3.2 Scores by Gender

3.3 Covariate vs. Dependent Variables

4 Model Fitting

5 ANCOVA Assumption Testing

5.1 Assumption 1 - Linearity (Covariate vs. DV)

5.2 Assumption 2 - Homogeneity of Regression Slopes (Critical)

5.3 Assumption 3 - Normality of Residuals

5.4 Assumption 4 - Homogeneity of Variance (Levene’s Test)

5.5 Assumption Summary

6 ANCOVA Results

6.1 Model 1 - Math Score

6.1.1 ANOVA Table

6.1.2 Model Fit Statistics

6.1.3 Regression Coefficients

6.1.4 Effect Size (Partial eta-squared)

6.1.5 Adjusted Means (EMM)

6.1.6 Pairwise Contrast (Female vs. Male)

6.2 Model 2 - Reading Score

6.2.1 ANOVA Table

6.2.2 Model Fit Statistics

6.2.3 Regression Coefficients

6.2.4 Effect Size (Partial eta-squared)

6.2.5 Adjusted Means (EMM)

6.2.6 Pairwise Contrast (Female vs. Male)

6.3 Model 3 - Writing Score

6.3.1 ANOVA Table

6.3.2 Model Fit Statistics

6.3.3 Regression Coefficients

6.3.4 Effect Size (Partial eta-squared)

6.3.5 Adjusted Means (EMM)

6.3.6 Pairwise Contrast (Female vs. Male)

7 Cross-Model Comparison

7.1 ANCOVA Results Across All Three DVs

7.2 Model Fit Comparison

7.3 Effect Size Comparison

7.4 Cohen’s d - Unadjusted Gender Differences

7.5 Adjusted vs. Unadjusted Means

7.6 Residual Diagnostics

8 Interpretation and Conclusion

8.1 Final Summary Table

8.2 Methodological Notes

9 References

10 Data Source

11 R Packages Used

ANCOVA Analysis Report — Multivariate Analysis