1 Introduction


This report presents an Analysis of Covariance (ANCOVA) applied to student academic performance data sourced from the Students Performance in Exams dataset which is publicly available on Kaggle. The dataset contains records of 1,000 students and captures their scores across three subjects namely mathematics, reading and writing along with several demographic and contextual background variables.

The dataset includes eight variables in total. Gender is recorded as female or male while race/ethnicity is grouped into five categories from Group A to Group E. Parental level of education ranges from some high school up to master’s degree and lunch type serves as a proxy for socioeconomic status distinguishing between standard and free or reduced lunch. Test preparation course completion is also recorded as either completed or none. The three outcome variables are math score, reading score and writing score each measured on a continuous scale from 0 to 100.

In this analysis gender serves as the independent variable and parental level of education serves as the covariate which is encoded ordinally from 1 for some high school up to 6 for master’s degree. The remaining variables such as race/ethnicity, lunch type and test preparation are not included in the current models but are acknowledged as potential confounders that may influence academic outcomes.

ANCOVA extends ANOVA by incorporating a covariate allowing for a more accurate estimation of group differences after controlling for external influences. The main objective of this analysis is to examine whether gender differences in academic performance exist after adjusting for parental education level. Three separate one-way ANCOVA models are fitted with one model per dependent variable:

model_info <- data.frame(
  Model = c("Model 1", "Model 2", "Model 3"),
  DV = c("Math Score", "Reading Score", "Writing Score"),
  IV_Factor = c("Gender", "Gender", "Gender"),
  Covariate = c("Parental Education (Ordinal)",
                "Parental Education (Ordinal)",
                "Parental Education (Ordinal)")
)

knitr::kable(model_info,
             caption = "ANCOVA Model Specification",
             align = "lccc") %>%
  kableExtra::kable_styling(full_width = FALSE,
                            bootstrap_options = c("striped", "hover"))
ANCOVA Model Specification
Model DV IV_Factor Covariate
Model 1 Math Score Gender Parental Education (Ordinal)
Model 2 Reading Score Gender Parental Education (Ordinal)
Model 3 Writing Score Gender Parental Education (Ordinal)

The general model applied for each dependent variable can be expressed as & DV ~ parental_edu_ord + gender. This specification indicates that academic performance is modeled as a function of gender while adjusting for parental education level.

2 Data Loading & Preparation


2.1 Import Dataset


df_raw <- read.csv("StudentsPerformance.csv", stringsAsFactors = FALSE)

colnames(df_raw) <- c(
  "gender", "race_ethnicity", "parental_education",
  "lunch", "test_prep", "math_score", "reading_score", "writing_score"
)

cat(sprintf("Rows: %d | Columns: %d\n", nrow(df_raw), ncol(df_raw)))
## Rows: 1000 | Columns: 8

2.2 Dataset Preview


kable(head(df_raw, 10), caption = "Preview: First 10 Rows of Dataset") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  scroll_box(width = "100%")
Preview: First 10 Rows of Dataset
gender race_ethnicity parental_education lunch test_prep math_score reading_score writing_score
female group B bachelor’s degree standard none 72 72 74
female group C some college standard completed 69 90 88
female group B master’s degree standard none 90 95 93
male group A associate’s degree free/reduced none 47 57 44
male group C some college standard none 76 78 75
female group B associate’s degree standard none 71 83 78
female group B some college standard completed 88 95 92
male group B some college free/reduced none 40 43 39
male group D high school free/reduced completed 64 64 67
female group B high school free/reduced none 38 60 50

2.3 Preprocessing


2.3.1 Factor Conversion & Ordinal Encoding

df_raw$gender <- factor(df_raw$gender)

edu_order <- c(
  "some high school"   = 1,
  "high school"        = 2,
  "some college"       = 3,
  "associate's degree" = 4,
  "bachelor's degree"  = 5,
  "master's degree"    = 6
)

df_raw$parental_edu_ord <- as.integer(edu_order[df_raw$parental_education])

edu_mapping <- data.frame(
  Category = names(edu_order),
  Ordinal_Code = unname(edu_order)
)

knitr::kable(
  edu_mapping,
  caption = "Ordinal Encoding of Parental Education Level",
  align = c("l", "c")
) %>%
  kableExtra::kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover")
  )
Ordinal Encoding of Parental Education Level
Category Ordinal_Code
some high school 1
high school 2
some college 3
associate’s degree 4
bachelor’s degree 5
master’s degree 6

2.3.2 Group Distribution

group_df <- data.frame(
  Factor = c(rep("Gender", 2), rep("Parental Education", 6)),
  Level  = c(levels(df_raw$gender), names(edu_order)),
  n      = c(as.integer(table(df_raw$gender)),
             as.integer(table(factor(df_raw$parental_education,
                                     levels = names(edu_order)))))
)

rownames(group_df) <- NULL

knitr::kable(
  group_df,
  caption = "Sample Distribution Across Factors",
  col.names = c("Factor", "Level", "n"),
  align = c("l", "l", "c")
) %>%
  kableExtra::kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover")
  )
Sample Distribution Across Factors
Factor Level n
Gender female 518
Gender male 482
Parental Education some high school 179
Parental Education high school 196
Parental Education some college 226
Parental Education associate’s degree 222
Parental Education bachelor’s degree 118
Parental Education master’s degree 59

2.4 Descriptive Statistics


dvs <- c("math_score", "reading_score", "writing_score")

desc_all <- data.frame(
  Variable = dvs,
  N        = sapply(df_raw[dvs], length),
  Mean     = round(sapply(df_raw[dvs], mean), 3),
  SD       = round(sapply(df_raw[dvs], sd), 3),
  Min      = sapply(df_raw[dvs], min),
  Q1       = sapply(df_raw[dvs], quantile, 0.25),
  Median   = sapply(df_raw[dvs], median),
  Q3       = sapply(df_raw[dvs], quantile, 0.75),
  Max      = sapply(df_raw[dvs], max),
  Skewness = round(sapply(df_raw[dvs], skewness), 3),
  Kurtosis = round(sapply(df_raw[dvs], kurtosis), 3)
)

rownames(desc_all) <- NULL

knitr::kable(
  desc_all,
  caption = "Descriptive Statistics of Dependent Variables",
  align = c("l", rep("c", 10))
) %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "hover"),
    full_width = FALSE
  ) %>%
  kableExtra::scroll_box(width = "100%")
Descriptive Statistics of Dependent Variables
Variable N Mean SD Min Q1 Median Q3 Max Skewness Kurtosis
math_score 1000 66.089 15.163 0 57.00 66 77 100 -0.279 3.268
reading_score 1000 69.169 14.600 17 59.00 70 79 100 -0.259 2.926
writing_score 1000 68.054 15.196 10 57.75 69 79 100 -0.289 2.961
desc_gender <- lapply(dvs, function(dv) {
  df_raw %>%
    group_by(gender) %>%
    summarise(
      DV     = dv,
      N      = n(),
      Mean   = round(mean(.data[[dv]]), 3),
      Median = round(median(.data[[dv]]), 3),
      SD     = round(sd(.data[[dv]]), 3),
      Min    = min(.data[[dv]]),
      Max    = max(.data[[dv]]),
      .groups = "drop"
    )
}) %>% bind_rows()

knitr::kable(
  desc_gender,
  caption = "Descriptive Statistics per Gender x DV",
  align = "lccccccc"
) %>%
  kableExtra::kable_styling(
    full_width = FALSE,
    bootstrap_options = c("striped", "hover", "condensed"),
    font_size = 11
  )
Descriptive Statistics per Gender x DV
gender DV N Mean Median SD Min Max
female math_score 518 63.633 65 15.491 0 100
male math_score 482 68.728 69 14.356 27 100
female reading_score 518 72.608 73 14.378 17 100
male reading_score 482 65.473 66 13.932 23 100
female writing_score 518 72.467 74 14.845 10 100
male writing_score 482 63.311 64 14.114 15 100

3 Exploratory Data Analysis


3.1 Score Distributions


df_long <- df_raw %>%
  select(math_score, reading_score, writing_score) %>%
  pivot_longer(everything(), names_to = "Subject", values_to = "Score") %>%
  mutate(Subject = case_when(
    Subject == "math_score"    ~ "Math",
    Subject == "reading_score" ~ "Reading",
    TRUE                       ~ "Writing"
  ))

ggplot(df_long, aes(x = Score, fill = Subject)) +
  geom_histogram(aes(y = after_stat(density)), bins = 25,
                 color = "white", alpha = 0.75) +
  geom_density(aes(color = Subject), linewidth = 0.9, fill = NA) +
  facet_wrap(~Subject, ncol = 3) +
  scale_fill_brewer(palette = "Set1", guide = "none") +
  scale_color_brewer(palette = "Set1", guide = "none") +
  labs(title = "Score Distributions by Subject",
       subtitle = "Histogram with kernel density estimate",
       x = "Score", y = "Density") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

3.2 Scores by Gender


df_gender_long <- df_raw %>%
  select(gender, math_score, reading_score, writing_score) %>%
  pivot_longer(-gender, names_to = "Subject", values_to = "Score") %>%
  mutate(Subject = case_when(
    Subject == "math_score"    ~ "Math",
    Subject == "reading_score" ~ "Reading",
    TRUE                       ~ "Writing"
  ))

ggplot(df_gender_long, aes(x = gender, y = Score, fill = gender)) +
  geom_boxplot(outlier.shape = 21, outlier.size = 1.5, alpha = 0.75) +
  geom_jitter(aes(color = gender), width = 0.15, alpha = 0.15, size = 0.8) +
  facet_wrap(~Subject, ncol = 3) +
  scale_fill_brewer(palette  = "Set1") +
  scale_color_brewer(palette = "Set1", guide = "none") +
  labs(title = "Score Distribution by Gender",
       x = "Gender", y = "Score", fill = "Gender") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

3.3 Covariate vs. Dependent Variables


df_cov_long <- df_raw %>%
  select(parental_edu_ord, math_score, reading_score, writing_score) %>%
  pivot_longer(-parental_edu_ord, names_to = "Subject", values_to = "Score") %>%
  mutate(Subject = case_when(
    Subject == "math_score"    ~ "Math",
    Subject == "reading_score" ~ "Reading",
    TRUE                       ~ "Writing"
  ))

ggplot(df_cov_long, aes(x = parental_edu_ord, y = Score, color = Subject)) +
  geom_jitter(alpha = 0.25, size = 0.9, width = 0.15) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1.2) +
  facet_wrap(~Subject, ncol = 3) +
  scale_color_brewer(palette = "Set1", guide = "none") +
  scale_x_continuous(breaks = 1:6,
    labels = c("Some HS", "HS", "Some Col", "Assoc", "Bach", "Master")) +
  labs(title = "Parental Education (Covariate) vs. Academic Scores",
       subtitle = "Linear trend with 95% confidence band",
       x = "Parental Education Level (Ordinal)", y = "Score") +
  theme_minimal(base_size = 11) +
  theme(plot.title  = element_text(face = "bold"),
        strip.text  = element_text(face = "bold"),
        axis.text.x = element_text(angle = 20, hjust = 1))


4 Model Fitting


The three ANCOVA models are fitted here early so all objects are available throughout the document, including in assumption tests and interpretation sections.

model_math    <- lm(math_score    ~ parental_edu_ord + gender, data = df_raw)
model_reading <- lm(reading_score ~ parental_edu_ord + gender, data = df_raw)
model_writing <- lm(writing_score ~ parental_edu_ord + gender, data = df_raw)

models_list <- list(
  "math_score"    = model_math,
  "reading_score" = model_reading,
  "writing_score" = model_writing
)

cat("Models fitted successfully.\n")
## Models fitted successfully.
cat(sprintf("  Model 1: math_score    ~ parental_edu_ord + gender | df = %d\n",
            df.residual(model_math)))
##   Model 1: math_score    ~ parental_edu_ord + gender | df = 997
cat(sprintf("  Model 2: reading_score ~ parental_edu_ord + gender | df = %d\n",
            df.residual(model_reading)))
##   Model 2: reading_score ~ parental_edu_ord + gender | df = 997
cat(sprintf("  Model 3: writing_score ~ parental_edu_ord + gender | df = %d\n",
            df.residual(model_writing)))
##   Model 3: writing_score ~ parental_edu_ord + gender | df = 997

5 ANCOVA Assumption Testing


5.1 Assumption 1 - Linearity (Covariate vs. DV)


Theoretical Basis:

ANCOVA assumes a linear relationship between the covariate (Parental Education) and the dependent variable (Student Scores). This assumption is critical because ANCOVA uses linear regression to “adjust” the group means based on the covariate. If the relationship is non-linear, the resulting adjusted means (EMMs) will be biased, and the statistical power of the test will be compromised.

Statistical Hypotheses:

  • \(H_0: \rho = 0\): There is no linear relationship between the covariate and the dependent variable.

  • \(H_1: \rho \neq 0\): There is a significant linear relationship between the covariate and the dependent variable.

lin_results <- lapply(dvs, function(dv) {

  label <- case_when(
    dv == "math_score"    ~ "Math Score",
    dv == "reading_score" ~ "Reading Score",
    dv == "writing_score" ~ "Writing Score"
  )

  ct <- cor.test(df_raw$parental_edu_ord, df_raw[[dv]])

  data.frame(
    DV      = label,
    r       = round(unname(ct$estimate), 4),
    t_stat  = round(unname(ct$statistic), 4),
    p_value = round(ct$p.value, 6),
    Sig     = ifelse(ct$p.value < 0.001, "***",
              ifelse(ct$p.value < 0.01,  "**",
              ifelse(ct$p.value < 0.05,  "*", "ns")))
  )
}) %>% bind_rows()

rownames(lin_results) <- NULL

r_math    <- lin_results$r[lin_results$DV == "Math Score"]
r_reading <- lin_results$r[lin_results$DV == "Reading Score"]
r_writing <- lin_results$r[lin_results$DV == "Writing Score"]

kable(lin_results,
      caption = "Pearson Correlation: Parental Education (Covariate) vs. DV",
      col.names = c("Dependent Variable", "r", "t statistic", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),, full_width = FALSE)
Pearson Correlation: Parental Education (Covariate) vs. DV
Dependent Variable r t statistic p-value Sig.
Math Score 0.1594 5.1019 0 ***
Reading Score 0.1909 6.1440 0 ***
Writing Score 0.2367 7.6969 0 ***
p_lin <- lapply(dvs, function(dv) {
  label <- case_when(dv == "math_score" ~ "Math",
                     dv == "reading_score" ~ "Reading", TRUE ~ "Writing")
  ggplot(df_raw, aes(x = parental_edu_ord, y = .data[[dv]])) +
    geom_jitter(alpha = 0.3, color = "#2E86AB", width = 0.15, size = 1.2) +
    geom_smooth(method = "lm", se = TRUE, color = "#E84855", linewidth = 1.1) +
    scale_x_continuous(breaks = 1:6) +
    labs(title = paste("Linearity:", label),
         x = "Parental Education (Ordinal)", y = paste(label, "Score")) +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))
})
grid.arrange(grobs = p_lin, ncol = 3)

Interpretation: Pearson correlations are all significant (p < 0.001): math r = 0.1594, reading r = 0.1909, writing r = 0.2367. The linearity assumption is met for all models.


5.2 Assumption 2 - Homogeneity of Regression Slopes (Critical)


Theoretical Basis:

The Homogeneity of Regression Slopes assumption requires that the relationship between the covariate (Parental Education) and the dependent variable (Student Scores) is consistent across all levels of the independent variable (Gender). Visually, this means the regression lines for each group should be approximately parallel. If an interaction exists, it indicates that the effect of parental education on academic performance varies depending on the student’s gender, which would render the ANCOVA model’s main effects uninterpretable.

Statistical Hypotheses:

  • H0: The regression slopes are equal across groups (No interaction between the IV and Covariate).

  • H1: The regression slopes are significantly different (A significant interaction exists).

slope_results <- lapply(dvs, function(dv) {
  model_int <- lm(as.formula(paste(dv, "~ parental_edu_ord * gender")),
                  data = df_raw)
  aov_int   <- anova(model_int)
  int_row   <- aov_int["parental_edu_ord:gender", ]
  data.frame(
    DV       = dv,
    F_value  = round(int_row$`F value`, 4),
    p_value  = round(int_row$`Pr(>F)`, 4),
    Decision = ifelse(int_row$`Pr(>F)` > 0.05,
                      "Homogeneous", "Heterogeneous (ANCOVA invalid)")
  )
}) %>% bind_rows()
rownames(slope_results) <- NULL

kable(slope_results,
      caption = "Homogeneity of Regression Slopes: Interaction Test",
      col.names = c("DV", "F value", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  row_spec(which(grepl("OK", slope_results$Decision)), color = "darkgreen")
Homogeneity of Regression Slopes: Interaction Test
DV F value p-value Decision
math_score 0.8128 0.3675 Homogeneous
reading_score 0.4402 0.5072 Homogeneous
writing_score 0.1621 0.6873 Homogeneous
p_slope <- lapply(dvs, function(dv) {
  label <- case_when(dv == "math_score" ~ "Math",
                     dv == "reading_score" ~ "Reading", TRUE ~ "Writing")
  ggplot(df_raw, aes(x = parental_edu_ord, y = .data[[dv]], color = gender)) +
    geom_jitter(alpha = 0.25, width = 0.15, size = 1.1) +
    geom_smooth(method = "lm", se = TRUE, linewidth = 1.2) +
    scale_color_brewer(palette = "Set1") +
    scale_x_continuous(breaks = 1:6) +
    labs(title = paste("Regression Slopes by Gender:", label),
         subtitle = "Parallel lines = homogeneous slopes",
         x = "Parental Education (Ordinal)",
         y = paste(label, "Score"), color = "Gender") +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))
})
grid.arrange(grobs = p_slope, ncol = 3)

Interpretation: The interaction term parental_edu_ord x gender is not significant (p > 0.05) for all DVs. Slopes are parallel across gender groups. ANCOVA is valid to proceed.


5.3 Assumption 3 - Normality of Residuals


Theoretical Basis:

ANCOVA assumes that the residuals (the differences between the observed values and the values predicted by the model) are normally distributed. This is essential for the validity of the \(F\)-test and the accuracy of the \(p\)-values. However, under the Central Limit Theorem, ANCOVA is generally robust to violations of normality when the sample size is sufficiently large (e.g., \(N > 30\) per group), as is the case in this study (\(N = 1000\)).

Statistical Hypotheses:

  • \(H_0\): The residuals follow a normal distribution.
  • \(H_1\): The residuals do not follow a normal distribution.

We aim to fail to reject \(H_0\) (\(p > 0.05\)). If \(H_0\) is rejected, we rely on the large sample size and visual inspection of Q-Q plots to justify the robustness of the model.

norm_results <- lapply(names(models_list), function(nm) {
  resids <- residuals(models_list[[nm]])
  sw     <- shapiro.test(resids)
  label  <- case_when(nm == "math_score" ~ "Math",
                      nm == "reading_score" ~ "Reading", TRUE ~ "Writing")
  data.frame(
    DV       = label,
    W        = round(sw$statistic, 5),
    p_value  = round(sw$p.value, 6),
    Decision = ifelse(sw$p.value > 0.05, "Normal",
                      "Non-normal (robust for n=1000)")
  )
}) %>% bind_rows()
rownames(norm_results) <- NULL

kable(norm_results,
      caption = "Shapiro-Wilk Test: Normality of Residuals",
      col.names = c("DV", "W statistic", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Shapiro-Wilk Test: Normality of Residuals
DV W statistic p-value Decision
Math 0.99551 0.004961 Non-normal (robust for n=1000)
Reading 0.99307 0.000129 Non-normal (robust for n=1000)
Writing 0.99040 0.000004 Non-normal (robust for n=1000)
plot_list_norm <- lapply(names(models_list), function(nm) {
  label  <- case_when(nm == "math_score" ~ "Math",
                      nm == "reading_score" ~ "Reading", TRUE ~ "Writing")
  resids <- data.frame(resid = residuals(models_list[[nm]]))

  p_qq <- ggplot(resids, aes(sample = resid)) +
    stat_qq(alpha = 0.5, size = 1.1, color = "#2E86AB") +
    stat_qq_line(color = "#E84855", linewidth = 0.9) +
    labs(title = paste("Q-Q Plot:", label),
         x = "Theoretical Quantiles", y = "Sample Quantiles") +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))

  p_hist <- ggplot(resids, aes(x = resid)) +
    geom_histogram(aes(y = after_stat(density)), bins = 30,
                   fill = "#2E86AB", color = "white", alpha = 0.75) +
    geom_density(color = "#E84855", linewidth = 1) +
    labs(title = paste("Residual Distribution:", label),
         x = "Residuals", y = "Density") +
    theme_minimal(base_size = 10) +
    theme(plot.title = element_text(face = "bold"))

  list(p_qq, p_hist)
})

grid.arrange(
  plot_list_norm[[1]][[1]], plot_list_norm[[1]][[2]],
  plot_list_norm[[2]][[1]], plot_list_norm[[2]][[2]],
  plot_list_norm[[3]][[1]], plot_list_norm[[3]][[2]],
  ncol = 2
)

Interpretation: Shapiro-Wilk yields p < 0.05, but with n = 1,000 the Central Limit Theorem ensures approximate normality. ANCOVA is robust at this sample size.


5.4 Assumption 4 - Homogeneity of Variance (Levene’s Test)


Theoretical Basis:

This assumption, also known as homoscedasticity, requires that the variance of the dependent variable is equal across all groups (Gender). If the variances are significantly different, the \(F\)-test becomes unreliable because the error term will be biased, potentially leading to Type I or Type II errors.

Statistical Hypotheses:

  • \(H_0: \sigma^2_{female} = \sigma^2_{male}\): The group variances are equal.
  • \(H_1: \sigma^2_{female} \neq \sigma^2_{male}\): The group variances are significantly different.

We aim to fail to reject \(H_0\) (\(p > 0.05\)). A non-significant Levene’s test indicates that the assumption of equal variances is met, ensuring the stability of the ANCOVA results.

lev_results <- lapply(dvs, function(dv) {
  lt    <- leveneTest(df_raw[[dv]] ~ df_raw$gender)
  label <- case_when(dv == "math_score" ~ "Math",
                     dv == "reading_score" ~ "Reading", TRUE ~ "Writing")
  data.frame(
    DV       = label,
    F_value  = round(lt$`F value`[1], 4),
    p_value  = round(lt$`Pr(>F)`[1],  4),
    Decision = ifelse(lt$`Pr(>F)`[1] > 0.05, "Met", "Violated")
  )
}) %>% bind_rows()
rownames(lev_results) <- NULL

kable(lev_results,
      caption = "Levene's Test: Homogeneity of Variance",
      col.names = c("DV", "F value", "p-value", "Decision")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  row_spec(which(lev_results$Decision == "Met"), color = "darkgreen")
Levene’s Test: Homogeneity of Variance
DV F value p-value Decision
Math 0.3464 0.5563 Met
Reading 0.0188 0.8911 Met
Writing 0.0069 0.9336 Met
ggplot(df_gender_long, aes(x = gender, y = Score, fill = gender)) +
  geom_boxplot(outlier.shape = 21, outlier.size = 1.5, alpha = 0.75) +
  stat_summary(fun = mean, geom = "point", shape = 23,
               size = 3, fill = "white", color = "black") +
  facet_wrap(~Subject, ncol = 3) +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "Variance Comparison by Gender",
       subtitle = "Diamond = group mean",
       x = "Gender", y = "Score", fill = "Gender") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

Interpretation: All three DVs pass Levene’s test (p > 0.05). Homogeneity of variance is met.


5.5 Assumption Summary


assump_summary <- data.frame(
  No         = 1:4,
  Assumption = c("Linearity (Covariate-DV)",
                 "Homogeneity of Regression Slopes",
                 "Normality of Residuals",
                 "Homogeneity of Variance (Levene)"),
  Test       = c("Pearson Correlation",
                 "Covariate x IV Interaction (F-test)",
                 "Shapiro-Wilk + Q-Q Plot",
                 "Levene's Test"),
  Math    = c(paste0("r = ", r_math, ", p < .001"),
              paste0("F = ", slope_results$F_value[1], ", p = ", slope_results$p_value[1]),
              paste0("W = ", norm_results$W[norm_results$DV == "Math"],
                     ", robust n=1000"),
              paste0("F = ", lev_results$F_value[lev_results$DV == "Math"],
                     ", p = ", lev_results$p_value[lev_results$DV == "Math"])),
  Reading = c(paste0("r = ", r_reading, ", p < .001"),
              paste0("F = ", slope_results$F_value[2], ", p = ", slope_results$p_value[2]),
              paste0("W = ", norm_results$W[norm_results$DV == "Reading"],
                     ", robust n=1000"),
              paste0("F = ", lev_results$F_value[lev_results$DV == "Reading"],
                     ", p = ", lev_results$p_value[lev_results$DV == "Reading"])),
  Writing = c(paste0("r = ", r_writing, ", p < .001"),
              paste0("F = ", slope_results$F_value[3], ", p = ", slope_results$p_value[3]),
              paste0("W = ", norm_results$W[norm_results$DV == "Writing"],
                     ", robust n=1000"),
              paste0("F = ", lev_results$F_value[lev_results$DV == "Writing"],
                     ", p = ", lev_results$p_value[lev_results$DV == "Writing"])),
  Status  = c("Met", "Met", "Robust (n=1000)", "Met")
)

kable(assump_summary,
      caption = "ANCOVA Assumption Testing Summary",
      col.names = c("No.", "Assumption", "Test", "Math", "Reading",
                    "Writing", "Status")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = TRUE) %>%
  scroll_box(width = "100%")
ANCOVA Assumption Testing Summary
No.  Assumption Test Math Reading Writing Status
1 Linearity (Covariate-DV) Pearson Correlation r = 0.1594, p < .001 r = 0.1909, p < .001 r = 0.2367, p < .001 Met
2 Homogeneity of Regression Slopes Covariate x IV Interaction (F-test) F = 0.8128, p = 0.3675 F = 0.4402, p = 0.5072 F = 0.1621, p = 0.6873 Met
3 Normality of Residuals Shapiro-Wilk + Q-Q Plot W = 0.99551, robust n=1000 W = 0.99307, robust n=1000 W = 0.9904, robust n=1000 Robust (n=1000)
4 Homogeneity of Variance (Levene) Levene’s Test F = 0.3464, p = 0.5563 F = 0.0188, p = 0.8911 F = 0.0069, p = 0.9336 Met

6 ANCOVA Results


6.1 Model 1 - Math Score


6.1.1 ANOVA Table

anova_math <- anova(model_math)

kable(round(anova_math, 4),
      caption = "ANCOVA Table: Math Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
ANCOVA Table: Math Score
Df Sum Sq Mean Sq F value Pr(>F)
parental_edu_ord 1 5838.353 5838.3529 26.8484 0
gender 1 7046.756 7046.7562 32.4054 0
Residuals 997 216803.970 217.4563 NA NA

6.1.2 Model Fit Statistics

sm_math <- summary(model_math)
fit_math <- data.frame(
  Metric = c("R-squared", "Adjusted R-squared", "F statistic", "p-value (overall)"),
  Value  = c(round(sm_math$r.squared, 4),
             round(sm_math$adj.r.squared, 4),
             round(sm_math$fstatistic[1], 3),
             round(pf(sm_math$fstatistic[1], sm_math$fstatistic[2],
                      sm_math$fstatistic[3], lower.tail = FALSE), 6))
)
kable(fit_math, caption = "Model Fit: Math Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Model Fit: Math Score
Metric Value
R-squared 0.0561
Adjusted R-squared 0.0542
F statistic 29.6270
p-value (overall) 0.0000

6.1.3 Regression Coefficients

coef_math <- as.data.frame(summary(model_math)$coefficients)
coef_math$Term <- rownames(coef_math)
coef_math <- coef_math %>%
  select(Term, Estimate, `Std. Error`, `t value`, `Pr(>|t|)`) %>%
  mutate(across(where(is.numeric), ~round(.x, 4)),
         Sig = ifelse(`Pr(>|t|)` < 0.001, "***",
               ifelse(`Pr(>|t|)` < 0.01,  "**",
               ifelse(`Pr(>|t|)` < 0.05,  "*", "ns"))))
rownames(coef_math) <- NULL

kable(coef_math,
      caption = "Regression Coefficients: Model 1 (Math Score)",
      col.names = c("Term", "Estimate", "SE", "t value", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Regression Coefficients: Model 1 (Math Score)
Term Estimate SE t value p-value Sig.
(Intercept) 58.1791 1.1958 48.6524 0 ***
parental_edu_ord 1.7354 0.3198 5.4266 0 ***
gendermale 5.3177 0.9342 5.6926 0 ***

6.1.4 Effect Size (Partial eta-squared)

eta_math <- eta_squared(model_math, partial = TRUE)

kable(as.data.frame(eta_math) %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: Model 1 (Math Score)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Partial Eta-Squared: Model 1 (Math Score)
Parameter Eta2_partial CI CI_low CI_high
parental_edu_ord 0.0262 0.95 0.0123 1
gender 0.0315 0.95 0.0161 1

6.1.5 Adjusted Means (EMM)

emm_math_obj <- emmeans(model_math, ~ gender)
emm_math_df  <- as.data.frame(emm_math_obj)

kable(emm_math_df %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Estimated Marginal Means: Math Score (adjusted for parental education)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Estimated Marginal Means: Math Score (adjusted for parental education)
gender emmean SE df lower.CL upper.CL
female 63.5259 0.6482 997 62.2538 64.7979
male 68.8436 0.6720 997 67.5249 70.1623

6.1.6 Pairwise Contrast (Female vs. Male)

contrast_math <- as.data.frame(pairs(emm_math_obj))

kable(contrast_math %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Pairwise Contrast: Female vs. Male, Math Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Pairwise Contrast: Female vs. Male, Math Score
contrast estimate SE df t.ratio p.value
female - male -5.3177 0.9342 997 -5.6926 0

6.2 Model 2 - Reading Score


6.2.1 ANOVA Table

anova_reading <- anova(model_reading)

kable(round(anova_reading, 4),
      caption = "ANCOVA Table: Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
ANCOVA Table: Reading Score
Df Sum Sq Mean Sq F value Pr(>F)
parental_edu_ord 1 7761.257 7761.2573 40.0278 0
gender 1 11876.017 11876.0168 61.2491 0
Residuals 997 193315.165 193.8969 NA NA

6.2.2 Model Fit Statistics

sm_reading <- summary(model_reading)
fit_reading <- data.frame(
  Metric = c("R-squared", "Adjusted R-squared", "F statistic", "p-value (overall)"),
  Value  = c(round(sm_reading$r.squared, 4),
             round(sm_reading$adj.r.squared, 4),
             round(sm_reading$fstatistic[1], 3),
             round(pf(sm_reading$fstatistic[1], sm_reading$fstatistic[2],
                      sm_reading$fstatistic[3], lower.tail = FALSE), 6))
)
kable(fit_reading, caption = "Model Fit: Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Model Fit: Reading Score
Metric Value
R-squared 0.0922
Adjusted R-squared 0.0904
F statistic 50.6380
p-value (overall) 0.0000

6.2.3 Regression Coefficients

coef_reading <- as.data.frame(summary(model_reading)$coefficients)
coef_reading$Term <- rownames(coef_reading)
coef_reading <- coef_reading %>%
  select(Term, Estimate, `Std. Error`, `t value`, `Pr(>|t|)`) %>%
  mutate(across(where(is.numeric), ~round(.x, 4)),
         Sig = ifelse(`Pr(>|t|)` < 0.001, "***",
               ifelse(`Pr(>|t|)` < 0.01,  "**",
               ifelse(`Pr(>|t|)` < 0.05,  "*", "ns"))))
rownames(coef_reading) <- NULL

kable(coef_reading,
      caption = "Regression Coefficients: Model 2 (Reading Score)",
      col.names = c("Term", "Estimate", "SE", "t value", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Regression Coefficients: Model 2 (Reading Score)
Term Estimate SE t value p-value Sig.
(Intercept) 66.9357 1.1292 59.2784 0 ***
parental_edu_ord 1.8048 0.3020 5.9768 0 ***
gendermale -6.9035 0.8821 -7.8262 0 ***

6.2.4 Effect Size (Partial eta-squared)

eta_reading <- eta_squared(model_reading, partial = TRUE)

kable(as.data.frame(eta_reading) %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: Model 2 (Reading Score)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Partial Eta-Squared: Model 2 (Reading Score)
Parameter Eta2_partial CI CI_low CI_high
parental_edu_ord 0.0386 0.95 0.0214 1
gender 0.0579 0.95 0.0366 1

6.2.5 Adjusted Means (EMM)

emm_reading_obj <- emmeans(model_reading, ~ gender)
emm_reading_df  <- as.data.frame(emm_reading_obj)

kable(emm_reading_df %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Estimated Marginal Means: Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Estimated Marginal Means: Reading Score
gender emmean SE df lower.CL upper.CL
female 72.4965 0.6121 997 71.2953 73.6976
male 65.5930 0.6346 997 64.3478 66.8383

6.2.6 Pairwise Contrast (Female vs. Male)

contrast_reading <- as.data.frame(pairs(emm_reading_obj))

kable(contrast_reading %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Pairwise Contrast: Female vs. Male, Reading Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Pairwise Contrast: Female vs. Male, Reading Score
contrast estimate SE df t.ratio p.value
female - male 6.9035 0.8821 997 7.8262 0

6.3 Model 3 - Writing Score


6.3.1 ANOVA Table

anova_writing <- anova(model_writing)

kable(round(anova_writing, 4),
      caption = "ANCOVA Table: Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
ANCOVA Table: Writing Score
Df Sum Sq Mean Sq F value Pr(>F)
parental_edu_ord 1 12925.78 12925.7766 65.0192 0
gender 1 19548.24 19548.2353 98.3314 0
Residuals 997 198203.07 198.7995 NA NA

6.3.2 Model Fit Statistics

sm_writing <- summary(model_writing)
fit_writing <- data.frame(
  Metric = c("R-squared", "Adjusted R-squared", "F statistic", "p-value (overall)"),
  Value  = c(round(sm_writing$r.squared, 4),
             round(sm_writing$adj.r.squared, 4),
             round(sm_writing$fstatistic[1], 3),
             round(pf(sm_writing$fstatistic[1], sm_writing$fstatistic[2],
                      sm_writing$fstatistic[3], lower.tail = FALSE), 6))
)
kable(fit_writing, caption = "Model Fit: Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Model Fit: Writing Score
Metric Value
R-squared 0.1408
Adjusted R-squared 0.1391
F statistic 81.6750
p-value (overall) 0.0000

6.3.3 Regression Coefficients

coef_writing <- as.data.frame(summary(model_writing)$coefficients)
coef_writing$Term <- rownames(coef_writing)
coef_writing <- coef_writing %>%
  select(Term, Estimate, `Std. Error`, `t value`, `Pr(>|t|)`) %>%
  mutate(across(where(is.numeric), ~round(.x, 4)),
         Sig = ifelse(`Pr(>|t|)` < 0.001, "***",
               ifelse(`Pr(>|t|)` < 0.01,  "**",
               ifelse(`Pr(>|t|)` < 0.05,  "*", "ns"))))
rownames(coef_writing) <- NULL

kable(coef_writing,
      caption = "Regression Coefficients: Model 3 (Writing Score)",
      col.names = c("Term", "Estimate", "SE", "t value", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Regression Coefficients: Model 3 (Writing Score)
Term Estimate SE t value p-value Sig.
(Intercept) 65.1445 1.1434 56.9762 0 ***
parental_edu_ord 2.3300 0.3058 7.6200 0 ***
gendermale -8.8570 0.8932 -9.9162 0 ***

6.3.4 Effect Size (Partial eta-squared)

eta_writing <- eta_squared(model_writing, partial = TRUE)

kable(as.data.frame(eta_writing) %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: Model 3 (Writing Score)") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Partial Eta-Squared: Model 3 (Writing Score)
Parameter Eta2_partial CI CI_low CI_high
parental_edu_ord 0.0612 0.95 0.0393 1
gender 0.0898 0.95 0.0636 1

6.3.5 Adjusted Means (EMM)

emm_writing_obj <- emmeans(model_writing, ~ gender)
emm_writing_df  <- as.data.frame(emm_writing_obj)

kable(emm_writing_df %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Estimated Marginal Means: Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Estimated Marginal Means: Writing Score
gender emmean SE df lower.CL upper.CL
female 72.3231 0.6198 997 71.1068 73.5393
male 63.4661 0.6425 997 62.2052 64.7270

6.3.6 Pairwise Contrast (Female vs. Male)

contrast_writing <- as.data.frame(pairs(emm_writing_obj))

kable(contrast_writing %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Pairwise Contrast: Female vs. Male, Writing Score") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Pairwise Contrast: Female vs. Male, Writing Score
contrast estimate SE df t.ratio p.value
female - male 8.857 0.8932 997 9.9162 0

7 Cross-Model Comparison


7.1 ANCOVA Results Across All Three DVs


extract_anova_row <- function(aov_table, term, dv_name) {
  data.frame(
    DV      = dv_name,
    Term    = term,
    Df      = aov_table[term, "Df"],
    SS      = round(aov_table[term, "Sum Sq"], 3),
    MS      = round(aov_table[term, "Mean Sq"], 3),
    F_value = round(aov_table[term, "F value"], 3),
    p_value = round(aov_table[term, "Pr(>F)"], 4),
    Sig     = ifelse(aov_table[term, "Pr(>F)"] < 0.001, "***",
              ifelse(aov_table[term, "Pr(>F)"] < 0.01,  "**",
              ifelse(aov_table[term, "Pr(>F)"] < 0.05,  "*", "ns")))
  )
}

full_comparison <- bind_rows(
  extract_anova_row(anova_math,    "parental_edu_ord", "Math Score"),
  extract_anova_row(anova_math,    "gender",           "Math Score"),
  extract_anova_row(anova_reading, "parental_edu_ord", "Reading Score"),
  extract_anova_row(anova_reading, "gender",           "Reading Score"),
  extract_anova_row(anova_writing, "parental_edu_ord", "Writing Score"),
  extract_anova_row(anova_writing, "gender",           "Writing Score")
)

kable(full_comparison,
      caption = "ANCOVA Results: All Three Models",
      col.names = c("DV", "Term", "df", "SS", "MS", "F", "p-value", "Sig.")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 11) %>%
  row_spec(which(full_comparison$Term == "gender"), background = "#f0f8ff")
ANCOVA Results: All Three Models
DV Term df SS MS F p-value Sig.
Math Score parental_edu_ord 1 5838.353 5838.353 26.848 0 ***
Math Score gender 1 7046.756 7046.756 32.405 0 ***
Reading Score parental_edu_ord 1 7761.257 7761.257 40.028 0 ***
Reading Score gender 1 11876.017 11876.017 61.249 0 ***
Writing Score parental_edu_ord 1 12925.777 12925.777 65.019 0 ***
Writing Score gender 1 19548.235 19548.235 98.331 0 ***

7.2 Model Fit Comparison


model_fit_all <- data.frame(
  DV        = c("Math Score", "Reading Score", "Writing Score"),
  R2        = round(c(sm_math$r.squared,    sm_reading$r.squared,    sm_writing$r.squared),    4),
  Adj_R2    = round(c(sm_math$adj.r.squared, sm_reading$adj.r.squared, sm_writing$adj.r.squared), 4),
  F_overall = round(c(sm_math$fstatistic[1], sm_reading$fstatistic[1], sm_writing$fstatistic[1]), 3),
  p_overall = round(c(
    pf(sm_math$fstatistic[1],    sm_math$fstatistic[2],    sm_math$fstatistic[3],    lower.tail = FALSE),
    pf(sm_reading$fstatistic[1], sm_reading$fstatistic[2], sm_reading$fstatistic[3], lower.tail = FALSE),
    pf(sm_writing$fstatistic[1], sm_writing$fstatistic[2], sm_writing$fstatistic[3], lower.tail = FALSE)
  ), 6)
)

kable(model_fit_all,
      caption = "Model Fit Statistics: All Three ANCOVA Models",
      col.names = c("DV", "R-squared", "Adj. R-squared",
                    "F (overall)", "p (overall)")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Model Fit Statistics: All Three ANCOVA Models
DV R-squared Adj. R-squared F (overall) p (overall)
Math Score 0.0561 0.0542 29.627 0
Reading Score 0.0922 0.0904 50.638 0
Writing Score 0.1408 0.1391 81.675 0

7.3 Effect Size Comparison


# eta_all built here, before it is referenced in the Interpretation section
eta_all <- bind_rows(
  as.data.frame(eta_math)    %>% mutate(DV = "Math Score"),
  as.data.frame(eta_reading) %>% mutate(DV = "Reading Score"),
  as.data.frame(eta_writing) %>% mutate(DV = "Writing Score")
) %>%
  mutate(
    Magnitude = case_when(
      Eta2_partial >= 0.14 ~ "Large",
      Eta2_partial >= 0.06 ~ "Medium",
      Eta2_partial >= 0.01 ~ "Small",
      TRUE                 ~ "Negligible"
    )
  )

kable(eta_all %>% mutate(across(where(is.numeric), ~round(.x, 4))),
      caption = "Partial Eta-Squared: All Three ANCOVA Models") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Partial Eta-Squared: All Three ANCOVA Models
Parameter Eta2_partial CI CI_low CI_high DV Magnitude
parental_edu_ord 0.0262 0.95 0.0123 1 Math Score Small
gender 0.0315 0.95 0.0161 1 Math Score Small
parental_edu_ord 0.0386 0.95 0.0214 1 Reading Score Small
gender 0.0579 0.95 0.0366 1 Reading Score Small
parental_edu_ord 0.0612 0.95 0.0393 1 Writing Score Medium
gender 0.0898 0.95 0.0636 1 Writing Score Medium
eta_gender <- eta_all %>% filter(Parameter == "gender")

ggplot(eta_gender, aes(x = DV, y = Eta2_partial, fill = DV)) +
  geom_col(alpha = 0.85, color = "white", width = 0.55) +
  geom_text(aes(label = paste0("eta2 = ", round(Eta2_partial, 4))),
            vjust = -0.5, size = 3.8, fontface = "bold") +
  geom_hline(yintercept = 0.01, linetype = "dashed", color = "gray60") +
  geom_hline(yintercept = 0.06, linetype = "dashed", color = "#E84855") +
  geom_hline(yintercept = 0.14, linetype = "dashed", color = "#2E86AB") +
  annotate("text", x = 0.6, y = 0.014, label = "Small (0.01)",  size = 2.8, color = "gray50") +
  annotate("text", x = 0.6, y = 0.074, label = "Medium (0.06)", size = 2.8, color = "#E84855") +
  annotate("text", x = 0.6, y = 0.154, label = "Large (0.14)",  size = 2.8, color = "#2E86AB") +
  scale_fill_brewer(palette = "Set1", guide = "none") +
  ylim(0, 0.20) +
  labs(title = "Gender Effect Size (Partial eta-squared) per ANCOVA Model",
       subtitle = "After controlling for parental education level",
       x = "Dependent Variable", y = "Partial eta-squared") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"))

7.4 Cohen’s d - Unadjusted Gender Differences


cohens_d_df <- lapply(dvs, function(dv) {
  female    <- df_raw[[dv]][df_raw$gender == "female"]
  male      <- df_raw[[dv]][df_raw$gender == "male"]
  pooled_sd <- sqrt(((length(female) - 1) * var(female) +
                     (length(male)   - 1) * var(male)) /
                    (length(female) + length(male) - 2))
  d <- (mean(female) - mean(male)) / pooled_sd
  data.frame(
    DV          = case_when(dv == "math_score" ~ "Math",
                            dv == "reading_score" ~ "Reading", TRUE ~ "Writing"),
    Mean_Female = round(mean(female), 2),
    Mean_Male   = round(mean(male),   2),
    Difference  = round(mean(female) - mean(male), 2),
    Cohen_d     = round(d, 4),
    Magnitude   = ifelse(abs(d) >= 0.8, "Large",
                  ifelse(abs(d) >= 0.5, "Medium",
                  ifelse(abs(d) >= 0.2, "Small", "Negligible")))
  )
}) %>% bind_rows()
rownames(cohens_d_df) <- NULL

kable(cohens_d_df,
      caption = "Cohen's d: Unadjusted Gender Effect per DV",
      col.names = c("DV", "Mean Female", "Mean Male",
                    "Difference", "Cohen's d", "Magnitude")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE)
Cohen’s d: Unadjusted Gender Effect per DV
DV Mean Female Mean Male Difference Cohen’s d Magnitude
Math 63.63 68.73 -5.10 -0.3407 Small
Reading 72.61 65.47 7.14 0.5037 Medium
Writing 72.47 63.31 9.16 0.6316 Medium

7.5 Adjusted vs. Unadjusted Means


emm_all_plot <- bind_rows(
  as.data.frame(emm_math_obj)    %>% mutate(DV = "Math",    Type = "Adjusted"),
  as.data.frame(emm_reading_obj) %>% mutate(DV = "Reading", Type = "Adjusted"),
  as.data.frame(emm_writing_obj) %>% mutate(DV = "Writing", Type = "Adjusted")
)

raw_means <- df_raw %>%
  group_by(gender) %>%
  summarise(
    Math    = mean(math_score),
    Reading = mean(reading_score),
    Writing = mean(writing_score),
    .groups = "drop"
  ) %>%
  pivot_longer(-gender, names_to = "DV", values_to = "emmean") %>%
  mutate(
    Type     = "Unadjusted",
    lower.CL = emmean - 1.96 * 2,
    upper.CL = emmean + 1.96 * 2
  )

combined <- bind_rows(
  emm_all_plot %>% select(gender, DV, emmean, lower.CL, upper.CL, Type),
  raw_means    %>% select(gender, DV, emmean, lower.CL, upper.CL, Type)
)

ggplot(combined, aes(x = gender, y = emmean, fill = Type,
                     ymin = lower.CL, ymax = upper.CL)) +
  geom_col(position = position_dodge(0.7), width = 0.6,
           alpha = 0.85, color = "white") +
  geom_errorbar(position = position_dodge(0.7), width = 0.2, linewidth = 0.7) +
  facet_wrap(~DV, ncol = 3) +
  scale_fill_manual(values = c("Adjusted" = "#2E86AB", "Unadjusted" = "#E84855")) +
  labs(title = "Adjusted vs. Unadjusted Means by Gender",
       subtitle = "Adjusted = parental education held at mean",
       x = "Gender", y = "Mean Score", fill = "Type") +
  theme_minimal(base_size = 11) +
  theme(plot.title = element_text(face = "bold"),
        strip.text = element_text(face = "bold"))

7.6 Residual Diagnostics


diag_plots <- lapply(names(models_list), function(nm) {
  label    <- case_when(nm == "math_score" ~ "Math",
                        nm == "reading_score" ~ "Reading", TRUE ~ "Writing")
  resid_df <- data.frame(
    Fitted    = fitted(models_list[[nm]]),
    Residuals = residuals(models_list[[nm]])
  )

  p_rf <- ggplot(resid_df, aes(x = Fitted, y = Residuals)) +
    geom_point(alpha = 0.3, size = 1, color = "#2E86AB") +
    geom_hline(yintercept = 0, color = "#E84855",
               linetype = "dashed", linewidth = 0.8) +
    geom_smooth(method = "loess", se = FALSE,
                color = "gray40", linewidth = 0.7) +
    labs(title = paste(label, "Residuals vs Fitted"),
         x = "Fitted", y = "Residuals") +
    theme_minimal(base_size = 9) +
    theme(plot.title = element_text(face = "bold", size = 9))

  p_qq <- ggplot(resid_df, aes(sample = Residuals)) +
    stat_qq(alpha = 0.45, size = 1, color = "#2E86AB") +
    stat_qq_line(color = "#E84855", linewidth = 0.8) +
    labs(title = paste(label, "Normal Q-Q"),
         x = "Theoretical", y = "Sample") +
    theme_minimal(base_size = 9) +
    theme(plot.title = element_text(face = "bold", size = 9))

  list(p_rf, p_qq)
})

grid.arrange(
  diag_plots[[1]][[1]], diag_plots[[1]][[2]],
  diag_plots[[2]][[1]], diag_plots[[2]][[2]],
  diag_plots[[3]][[1]], diag_plots[[3]][[2]],
  ncol = 2
)

8 Interpretation and Conclusion


8.1 Final Summary Table


final_summary <- data.frame(
  Model       = c("Model 1: Math Score",
                  "Model 2: Reading Score",
                  "Model 3: Writing Score"),
  F_Covariate = c(f_cov_math, f_cov_reading, f_cov_writing),
  p_Covariate = c("< 0.001", "< 0.001", "< 0.001"),
  F_Gender    = c(f_gender_math, f_gender_reading, f_gender_writing),
  p_Gender    = c("< 0.001", "< 0.001", "< 0.001"),
  Eta2        = c(eta_g_math, eta_g_reading, eta_g_writing),
  Direction   = c("Male > Female", "Female > Male", "Female > Male"),
  Effect      = c("Small", "Small-Medium", "Medium")
)

kable(final_summary,
      caption = "Final ANCOVA Summary: All Three Models",
      col.names = c("Model", "F (Covariate)", "p (Covariate)",
                    "F (Gender)", "p (Gender)",
                    "Partial eta2", "Direction", "Effect Size")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = TRUE) %>%
  scroll_box(width = "100%")
Final ANCOVA Summary: All Three Models
Model F (Covariate) p (Covariate) F (Gender) p (Gender) Partial eta2 Direction Effect Size
Model 1: Math Score 26.848 < 0.001 32.405 < 0.001 0.0315 Male > Female Small
Model 2: Reading Score 40.028 < 0.001 61.249 < 0.001 0.0579 Female > Male Small-Medium
Model 3: Writing Score 65.019 < 0.001 98.331 < 0.001 0.0898 Female > Male Medium

The ANCOVA summary indicates that both parental education and gender have a statistically significant effect on all academic outcomes (Math, Reading and Writing) with p-values < 0.001. In Model 1 (Math Score), parental education shows a significant effect (F = 26.848), and gender is also significant (F = 32.405), where male students score higher in mathematics after controlling for the covariate, although the effect size is small (η² = 0.0315). In Model 2 (Reading Score), parental education has a stronger influence (F = 40.028) and gender remains significant (F = 61.249), with female students outperforming male students in reading. The effect size falls in the small to small-to-medium range (η² = 0.0579).

In Model 3 (Writing Score), parental education has the strongest effect among all models (F = 65.019), while gender also shows its largest influence here (F = 98.331), with female students again outperforming male students. The effect size for gender is in the medium range (η² = 0.0898) that indicate a more practically meaningful difference in writing compared to the other domains. Overall, although gender differences remain statistically significant across all academic subjects after controlling for parental education, the magnitude of these effects is relatively small to moderate, suggesting that other unmeasured factors may also contribute to students’ academic performance.

8.2 Methodological Notes


note_table <- data.frame(
  No = 1:5,
  Note = c(
    "Ordinal encoding of parental education assumes equal intervals",
    "Normality of residuals technically violated",
    "All interaction terms non-significant",
    "Dataset is cross-sectional and simulated",
    "Lunch type and test prep not controlled"
  ),
  Implication = c(
    "May not capture non-linear educational returns",
    "Robust at n = 1,000 due to Central Limit Theorem",
    "ANCOVA valid; slopes parallel across gender groups",
    "Causal claims not warranted",
    "See Full Factorial MANCOVA for more comprehensive model"
  )
)

knitr::kable(note_table,
             caption = "Methodological Notes",
             align = "c") %>%
  kableExtra::kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  ) %>%
  kableExtra::column_spec(2, width = "40%") %>%
  kableExtra::column_spec(3, width = "40%")
Methodological Notes
No Note Implication
1 Ordinal encoding of parental education assumes equal intervals May not capture non-linear educational returns
2 Normality of residuals technically violated Robust at n = 1,000 due to Central Limit Theorem
3 All interaction terms non-significant ANCOVA valid; slopes parallel across gender groups
4 Dataset is cross-sectional and simulated Causal claims not warranted
5 Lunch type and test prep not controlled See Full Factorial MANCOVA for more comprehensive model

9 References


  • Huitema, B.E. (2011). The Analysis of Covariance and Alternatives (2nd ed.). Wiley.
  • Tabachnick, B.G., & Fidell, L.S. (2019). Using Multivariate Statistics (7th ed.). Pearson.

10 Data Source


11 R Packages Used


package_table <- data.frame(
  Package = c("tidyverse",
              "knitr + kableExtra",
              "car",
              "emmeans",
              "effectsize",
              "gridExtra",
              "moments"),
  Purpose = c("Data wrangling and visualization",
              "Table formatting",
              "Levene's test",
              "Estimated marginal means and pairwise contrasts",
              "Partial eta-squared",
              "Multi-panel plot layout",
              "Skewness and kurtosis")
)

kable(package_table,
      caption = "R Packages Used in the Analysis",
      align = c("l", "l")) %>%
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = FALSE
  )
R Packages Used in the Analysis
Package Purpose
tidyverse Data wrangling and visualization
knitr + kableExtra Table formatting
car Levene’s test
emmeans Estimated marginal means and pairwise contrasts
effectsize Partial eta-squared
gridExtra Multi-panel plot layout
moments Skewness and kurtosis

ANCOVA Analysis Report — Multivariate Analysis

Date: 20 April 2026

S1 Data Science | FMIPA UNESA | 2026