Introduction

This study investigated whether ambiguous priming and target syllable complexity influence the semantic processing speed of children and adolescents aged 6 to 16 years, and whether intelligence quotient (IQ) scores predict response times above and beyond age. To determine participants’ overall IQ scores, the Wechsler Intelligence Scale for Children (WISC-V) was administered prior to the study. In a semantic relatedness judgement task, participants decided whether presented word pairs ‘go together’ or have a relationship. Age-appropriate language was used in the task instructions and stimuli to ensure comprehension. Following a sequential priming paradigm, each trial began with a fixation point for 500 ms, followed by a prime word for 150 ms, and then a blank screen for 100 ms. A target word was then presented and remained on screen until the participant pressed either a green button for related word pairs (e.g., wave and ocean) or a red button for unrelated word pairs (e.g., wave and hammer). Reaction times were measured from the onset of the target stimulus to the participant’s button press and were used as a measure of semantic processing speed.

Method

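# Packages Used Throughout the Analysis
library(dplyr)    # %>%, group_by(), summarise(), mutate(), select()
library(ggplot2)  # ggplot() figures
library(knitr)    # kable() tables
library(effsize)  # cohen.d()
library(broom)    # tidy(), glance()

# Seed Used for Simulating Data
set.seed(7311)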

# Number of Participants
n <- 180

# Continuous Predictor - Age
age <- rnorm(n, mean = 11, sd = 3)
age <- pmax(age, 6)          
age <- pmin(age, 16)           
age <- round(age)           

# Binary Predictor - Lexical Ambiguity 
la_primes <- sample(c("Yes", "No"), n, replace = TRUE)
la_primes <- factor(la_primes)


# Three-Level Predictor - Syllable Number Complexity 
syl_num <- sample(c("Monosyllable", "Disyllable", "Trisyllable"), n, replace = TRUE)
syl_num <- factor(syl_num, levels = c("Monosyllable", "Disyllable", "Trisyllable"))


# Additional Predictor - IQ Score
iq_score <- rnorm(n, mean = 100, sd = 15)
iq_score <- pmax(iq_score, 40)
iq_score <- pmin(iq_score, 160)
iq_score_centered <- iq_score - 100

# Continuous Outcome - Mean Reaction Time
mean_reaction_time <- 1000 +
  -35 * age +
  70 * (la_primes == "Yes") +
  40 * (syl_num == "Disyllable") +
  80 * (syl_num == "Trisyllable") +
  -0.5 * iq_score_centered +
  rnorm(n, mean = 0, sd = 100)
mean_reaction_time <- pmax(mean_reaction_time, 550)   

# Combining Variables into a Data-set 
la_rt_data <- data.frame(
  age,
  la_primes,
  syl_num,
  iq_score,
  iq_score_centered,
  mean_reaction_time
)

A sample of 180 participants was simulated in R (version 4.5.3) using set.seed(7311), corresponding to the last four digits of the student ID, to ensure full reproducibility. Age was used as a continuous predictor and simulated from a normal distribution (M = 11, SD = 3), constrained to a plausible range of 6 to 16 years to match the age administration requirements for the WISC-V, and rounded to the nearest year. Participants’ WISC-V IQ scores were used as an additional continuous predictor, simulated from a normal distribution (M = 100, SD = 15) consistent with the standardisation of all Wechsler Intelligence Scales, and constrained to between 40 and 160. IQ scores were centred by subtracting 100 (the standardised mean) prior to regression analyses, so that zero represented the average IQ and model intercepts reflected predicted reaction time at average intelligence. Participants’ individual mean reaction times (RTs) were used as the continuous outcome, measured in milliseconds. The RTs were constrained to a minimum of 550 ms to avoid physiologically implausible values.
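The clamping step described above can be sketched as a small helper. This is an illustration only: the clamp() function, the seed and the sample size of 1,000 are assumptions made for this sketch and are not part of the original simulation. It shows what share of draws from the age distribution is moved to a bound, which is worth checking because clamping piles observations up at the limits.

```r
# Hypothetical helper (not in the original script): restrict x to [lower, upper]
clamp <- function(x, lower, upper) pmin(pmax(x, lower), upper)

set.seed(4)  # arbitrary seed, used for this illustration only
age_raw <- rnorm(1000, mean = 11, sd = 3)   # same distribution as the age predictor
age_sim <- round(clamp(age_raw, 6, 16))     # clamp first, then round to whole years

# Range after clamping, and the share of raw draws that fell outside 6 to 16
print(range(age_sim))
share_clamped <- mean(age_raw < 6 | age_raw > 16)
print(share_clamped)
```

The same check could be applied to the reaction times, where the 550 ms floor acts as a lower bound only.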

Participants were randomly assigned to one of two lexical ambiguity prime conditions (the binary categorical predictor): an ambiguous prime condition, in which all prime words carried multiple meanings, or an unambiguous control condition, in which all prime words had one clear meaning. Additionally, participants were randomly assigned to one of three target-word syllable complexity conditions (the three-level categorical predictor): monosyllabic (one-syllable target words), disyllabic (two-syllable target words), or trisyllabic (three-syllable target words).

Results

Descriptive Statistics and Visualisations

Participants ranged in age from 6 to 16 years (M = 10.82, SD = 2.90), with a mean WISC-V IQ score of 98.68 (SD = 14.62). Mean RT across all participants was 711.78 ms (SD = 125.54). Priming conditions were unevenly distributed, with 105 participants in the ambiguous prime condition and 75 in the unambiguous condition. For target word complexity, 58 participants received monosyllabic targets, 62 received disyllabic targets, and 60 received trisyllabic targets.

Variable Descriptions

| Variable | Role | Type | Description | Values |
|---|---|---|---|---|
| ‘mean_reaction_time’ | Outcome | Continuous | Individual mean reaction time on semantic relatedness tasks in milliseconds | ≥550 ms |
| ‘age’ | Predictor | Continuous | Age of participants in years | 6 - 16 |
| ‘la_primes’ | Predictor | Binary Categorical | Lexical prime ambiguity presented | Yes, No |
| ‘syl_num’ | Predictor | Three-level Categorical | Target words syllable complexity | Monosyllable, Disyllable, Trisyllable |
| ‘iq_score’ | Predictor | Additional Continuous | WISC intelligence quotient score | 40 - 160 |

Descriptive Statistics Tables

# Continuous Variables Descriptive Statistics Table
descript_stat <- data.frame( 
  Variable = c("Age", "IQ Score", "Individual Mean Reaction Time"), 
  M = round(c(mean(la_rt_data$age),
              mean(la_rt_data$iq_score), 
              mean(la_rt_data$mean_reaction_time)), 2),
  SD = round(c(sd(la_rt_data$age), 
               sd(la_rt_data$iq_score),
               sd(la_rt_data$mean_reaction_time)), 2), 
  Min = round(c(min(la_rt_data$age),
                min(la_rt_data$iq_score), 
                min(la_rt_data$mean_reaction_time)), 2),
  Max = round(c(max(la_rt_data$age), 
                max(la_rt_data$iq_score),
                max(la_rt_data$mean_reaction_time)), 2) 
  )
kable(descript_stat, 
      caption = "Table 1 Descriptive Statistics for Continuous Variables", 
      col.names = c("Variable", "M", "SD", "Min", "Max"))
Table 1 Descriptive Statistics for Continuous Variables
Variable M SD Min Max
Age 10.82 2.90 6.00 16.00
IQ Score 98.68 14.62 58.12 137.01
Individual Mean Reaction Time 711.78 125.54 550.00 1045.10
# Ambiguity Prime Condition Reaction Times and Frequency
la_rt_data %>% 
  group_by(la_primes) %>%
  summarise(
    n = n(),
    mean = mean(mean_reaction_time),
    sd = sd(mean_reaction_time), 
    min = min(mean_reaction_time),
    max = max(mean_reaction_time)
  ) %>% 
    kable(caption = "Table 2 Mean Reaction Times by Lexical Ambiguity Prime Condition",
        col.names = c("Ambiguous Prime", "n", "M", "SD", "Min", "Max"))
Table 2 Mean Reaction Times by Lexical Ambiguity Prime Condition
Ambiguous Prime n M SD Min Max
No 75 687.0027 126.8442 550 1045.0951
Yes 105 729.4790 122.1476 550 998.8324
# Syllable Amount Condition Reaction Times and Frequency
la_rt_data %>%
  group_by(syl_num) %>%
  summarise(
    n = n(),
    mean = mean(mean_reaction_time),
    sd = sd(mean_reaction_time),
    min = min(mean_reaction_time),
    max = max(mean_reaction_time)
  ) %>%
    kable(caption = "Table 3 Mean Reaction Times by Syllable Amount Condition",
        col.names = c("Syllable Count", "n", "M", "SD", "Min", "Max"))
Table 3 Mean Reaction Times by Syllable Amount Condition
Syllable Count n M SD Min Max
Monosyllable 58 683.4132 125.1170 550 998.8324
Disyllable 62 710.5322 123.4163 550 1045.0951
Trisyllable 60 740.4922 123.7539 550 983.4734

Visualisations

# Figure 1. Mean Reaction Times by Age
ggplot(la_rt_data, aes(x = age, y = mean_reaction_time)) + 
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) + 
  labs(
    title = "Figure 1. Age and Mean Reaction Time",
    x = "Age", 
    y = "Mean Reaction Time (ms)"
  ) +
  theme_minimal()

# Figure 2. Mean Reaction Times by IQ Scores 
ggplot(la_rt_data, aes(x = iq_score, y = mean_reaction_time)) + 
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) + 
  labs(
    title = "Figure 2. Intelligence Quotient Scores and Mean Reaction Time",
    x = "Intelligence Quotient Scores", 
    y = "Mean Reaction Time (ms)"
  ) +
  theme_minimal()

# Figure 3: Mean reaction time differences by lexical prime ambiguity
ggplot(la_rt_data, aes(x = la_primes, y = mean_reaction_time))+
  geom_boxplot() +
  labs(
    title = "Figure 3. Lexical Prime Ambiguity and Mean Reaction Time",
    x = "Ambiguous Lexical Prime",
    y = "Mean Reaction Time (ms)"
  ) + 
  theme_minimal()

# Figure 4: Mean reaction time differences by target syllable complexity
ggplot(la_rt_data, aes(x = syl_num, y = mean_reaction_time)) +
  geom_boxplot() +
  labs(
    title = "Figure 4. Target Syllable Complexity and Mean Reaction Time",
    x = "Target Syllable Complexity",
    y = "Mean Reaction Time (ms)"
  ) + 
  theme_minimal()

Hypothesis Testing

The focal analysis of this study examined whether lexical ambiguity priming significantly predicted mean RT in children and adolescents.

H0: There is no difference in mean RT between participants presented with ambiguous primes and those presented with unambiguous primes.

H1: Participants presented with ambiguous primes show significantly slower RT than those with unambiguous primes.

Independent Samples T-Test


# T-Test
t_test <- t.test(mean_reaction_time ~ la_primes, data = la_rt_data, var.equal = TRUE)
t_test
## 
##  Two Sample t-test
## 
## data:  mean_reaction_time by la_primes
## t = -2.2635, df = 178, p-value = 0.02481
## alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
## 95 percent confidence interval:
##  -79.507636  -5.444927
## sample estimates:
##  mean in group No mean in group Yes 
##          687.0027          729.4790

# Cohen's D
cohen.d(mean_reaction_time ~ la_primes, data = la_rt_data)
## 
## Cohen's d
## 
## d estimate: -0.3422148 (small)
## 95 percent confidence interval:
##       lower       upper 
## -0.64267752 -0.04175208

An independent samples t-test (two-tailed, equal variances assumed) revealed that participants presented with ambiguous prime words (M = 729.48, SD = 122.15) exhibited significantly slower mean RTs than participants given unambiguous prime words (M = 687.00, SD = 126.84), t(178) = -2.26, p = .025, 95% CI [-79.51, -5.44], d = -0.34. The negative signs reflect the order of subtraction (unambiguous minus ambiguous). While the effect size is small, these results suggest that lexical ambiguity in prime words may increase processing demands for young people.
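As a check on the reported statistics, the pooled standard deviation, Cohen’s d and the t-statistic can be recomputed by hand from the group summaries in Table 2. The sketch below is an illustration rather than part of the original analysis; the variable names are made up for this example, and the input values are copied from the table, so results should agree with the reported output only up to rounding.

```r
# Group summaries copied from Table 2 (No = unambiguous, Yes = ambiguous)
n_no  <- 75;  m_no  <- 687.0027; sd_no  <- 126.8442
n_yes <- 105; m_yes <- 729.4790; sd_yes <- 122.1476

# Pooled SD: each group's variance weighted by its degrees of freedom
sd_pooled <- sqrt(((n_no - 1) * sd_no^2 + (n_yes - 1) * sd_yes^2) /
                    (n_no + n_yes - 2))

# Cohen's d: standardised mean difference (No minus Yes, as in the t-test)
d_manual <- (m_no - m_yes) / sd_pooled

# Student's t: mean difference divided by its standard error
t_manual <- (m_no - m_yes) / (sd_pooled * sqrt(1 / n_no + 1 / n_yes))

print(round(c(sd_pooled = sd_pooled, d = d_manual, t = t_manual), 3))
```

The pooled SD obtained this way is the same quantity that the regression output labels the residual standard error.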

# T-test results table
tidy(t_test) %>%
  select(estimate, estimate1, estimate2, statistic, p.value, conf.low, conf.high) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Independent Samples t-Test Results",
        col.names = c("Mean Difference", "M (Unambiguous)", "M (Ambiguous)", 
                      "t", "p", "95% CI Lower", "95% CI Upper"))
Independent Samples t-Test Results
Mean Difference M (Unambiguous) M (Ambiguous) t p 95% CI Lower 95% CI Upper
-42.476 687.003 729.479 -2.264 0.025 -79.508 -5.445
# Cohen's d table
d_result <- cohen.d(mean_reaction_time ~ la_primes, data = la_rt_data)

data.frame(
  d = round(d_result$estimate, 3),
  magnitude = d_result$magnitude,
  CI_lower = round(d_result$conf.int[1], 3),
  CI_upper = round(d_result$conf.int[2], 3),
  row.names = NULL  # drop the "lower" row name inherited from conf.int
) %>%
  kable(caption = "Cohen's d Effect Size",
        col.names = c("d", "Magnitude", "95% CI Lower", "95% CI Upper"))
Cohen’s d Effect Size
d Magnitude 95% CI Lower 95% CI Upper
-0.342 small -0.643 -0.042

Regression Equivalent of the T-Test


# Regression Equivalent of T-Test
reg_t_eq <- lm(mean_reaction_time ~ la_primes, data = la_rt_data)

# Summary of Regression Results
summary(reg_t_eq)
## 
## Call:
## lm(formula = mean_reaction_time ~ la_primes, data = la_rt_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -179.48  -98.94  -18.40   83.08  358.09 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    687.00      14.33  47.934   <2e-16 ***
## la_primesYes    42.48      18.77   2.264   0.0248 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 124.1 on 178 degrees of freedom
## Multiple R-squared:  0.02798,    Adjusted R-squared:  0.02252 
## F-statistic: 5.124 on 1 and 178 DF,  p-value: 0.02481

# Regression Confidence Intervals
confint(reg_t_eq)
##                   2.5 %    97.5 %
## (Intercept)  658.719519 715.28585
## la_primesYes   5.444927  79.50764

A simple linear regression model using lexical ambiguity prime condition as a dummy-coded binary predictor (0 = unambiguous, 1 = ambiguous) reproduced the t-test results. The intercept (β₀ = 687.00) represents the mean RT for the unambiguous reference group, consistent with the group mean reported above. The positive coefficient (β₁ = 42.48, SE = 18.77, t = 2.26, p = .025, 95% CI [5.44, 79.51]) represents the estimated difference in mean RT between conditions, with the model accounting for approximately 2.8% of RT variance (R² = .028, adjusted R² = .023). The two analyses are linked because they express the same underlying model in different ways: the regression coefficient is mathematically equivalent to the difference between group means reported in the t-test, with the sign reversed only because the t-test subtracts the ambiguous mean from the unambiguous mean.
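The equivalence described above can be verified numerically. The following is a minimal sketch on freshly simulated toy data (the data frame toy, its column names, the seed and the group sizes are assumptions for this illustration, not the study data). It checks that the regression slope equals the difference in group means, that the t-statistics match apart from sign, and that the regression F equals t squared.

```r
set.seed(1)  # arbitrary seed, used for this illustration only

# Toy two-group data loosely resembling the priming conditions
toy <- data.frame(
  group = factor(rep(c("No", "Yes"), times = c(40, 60))),
  rt    = c(rnorm(40, mean = 690, sd = 120), rnorm(60, mean = 730, sd = 120))
)

tt  <- t.test(rt ~ group, data = toy, var.equal = TRUE)  # Student's t-test
fit <- lm(rt ~ group, data = toy)                        # dummy-coded regression
co  <- summary(fit)$coefficients

# Slope = mean(Yes) - mean(No); t.test() reports No - Yes, so t has the opposite sign
mean_diff  <- unname(tt$estimate[2] - tt$estimate[1])
same_slope <- isTRUE(all.equal(mean_diff, unname(coef(fit)["groupYes"])))
same_t     <- isTRUE(all.equal(unname(tt$statistic), -co["groupYes", "t value"]))
same_p     <- isTRUE(all.equal(tt$p.value, co["groupYes", "Pr(>|t|)"]))
f_is_t_sq  <- isTRUE(all.equal(unname(summary(fit)$fstatistic["value"]),
                               unname(tt$statistic)^2))

print(c(same_slope = same_slope, same_t = same_t,
        same_p = same_p, f_is_t_sq = f_is_t_sq))
```

The equivalence holds only for the equal-variance (Student) t-test; with var.equal = FALSE, Welch's correction changes the degrees of freedom and the p-value.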

# Coefficients Table
tidy(reg_t_eq, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Equivalent Simple Linear Regression: Ambiguity Prime Predicting Reaction Time")
Equivalent Simple Linear Regression: Ambiguity Prime Predicting Reaction Time
term estimate std.error statistic p.value conf.low conf.high
(Intercept) 687.003 14.332 47.934 < .001 658.720 715.286
la_primesYes 42.476 18.765 2.264 0.025 5.445 79.508
# Model Fit Table
glance(reg_t_eq) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics")
Model Fit Statistics
r.squared adj.r.squared sigma statistic p.value
0.028 0.023 124.122 5.124 0.025

Simple Regression


# Simple Regression
reg_simp <- lm(mean_reaction_time ~ age, data = la_rt_data)

# Summary of Regression Results
summary(reg_simp)
## 
## Call:
## lm(formula = mean_reaction_time ~ age, data = la_rt_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -230.359  -64.553   -9.474   60.700  257.381 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1029.616     26.750   38.49   <2e-16 ***
## age          -29.384      2.389  -12.30   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 92.57 on 178 degrees of freedom
## Multiple R-squared:  0.4594, Adjusted R-squared:  0.4563 
## F-statistic: 151.2 on 1 and 178 DF,  p-value: < 2.2e-16

# Regression Confidence Intervals
confint(reg_simp)
##                2.5 %     97.5 %
## (Intercept) 976.8274 1082.40421
## age         -34.0990  -24.66869

A simple linear regression model was used to examine how age predicted RT, and indicated that adolescents had faster RTs than younger children (β = -29.38, SE = 2.39, t = -12.30, p < .001, 95% CI [-34.10, -24.67]). The negative slope reflects a strong negative relationship between age and RT, with predicted RT decreasing by approximately 29 ms for each additional year of age. This result suggests that older children processed semantic relationships significantly faster than younger children. The narrow 95% CI indicates that the slope is estimated precisely. Furthermore, while age accounted for approximately 46% of the variance (R² = .459, adjusted R² = .456), the residual standard error was approximately 93 ms. This error suggests substantial individual variability in mean RTs unexplained by age alone, motivating a multiple regression analysis.

# Coefficients Table
tidy(reg_simp, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Simple Linear Regression: Age Predicting Reaction Time")
Simple Linear Regression: Age Predicting Reaction Time
term estimate std.error statistic p.value conf.low conf.high
(Intercept) 1029.616 26.750 38.490 < .001 976.827 1082.404
age -29.384 2.389 -12.298 < .001 -34.099 -24.669
# Model Fit Table
glance(reg_simp) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics") 
Model Fit Statistics
r.squared adj.r.squared sigma statistic p.value
0.459 0.456 92.57 151.233 < .001

Multiple Regression


# Multiple Regression
reg_multi <- lm(mean_reaction_time ~ age + la_primes + iq_score_centered, data = la_rt_data)

# Summary of Regression Results
summary(reg_multi)
## 
## Call:
## lm(formula = mean_reaction_time ~ age + la_primes + iq_score_centered, 
##     data = la_rt_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -201.818  -59.512   -7.536   61.313  233.439 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1004.6714    26.3632  38.109  < 2e-16 ***
## age                -29.9493     2.2951 -13.049  < 2e-16 ***
## la_primesYes        51.9571    13.5718   3.828 0.000179 ***
## iq_score_centered   -0.5708     0.4578  -1.247 0.214118    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 88.7 on 176 degrees of freedom
## Multiple R-squared:  0.5092, Adjusted R-squared:  0.5008 
## F-statistic: 60.87 on 3 and 176 DF,  p-value: < 2.2e-16

# Regression Confidence Intervals
confint(reg_multi)
##                        2.5 %       97.5 %
## (Intercept)       952.642616 1056.7000971
## age               -34.478790  -25.4198275
## la_primesYes       25.172636   78.7416040
## iq_score_centered  -1.474368    0.3327059

A multiple regression model containing age, ambiguous priming and centred IQ scores significantly predicted RTs (F(3, 176) = 60.87, p < .001) and accounted for approximately 51% of RT variance (R² = .509, adjusted R² = .501). This analysis allows each coefficient to be interpreted as that predictor’s unique contribution to RT, controlling for the other predictors. Holding the priming condition and IQ scores constant, age (β = -29.95, SE = 2.30, t = -13.05, p < .001, 95% CI [-34.48, -25.42]) was a significant predictor of RT. Similarly, holding age and IQ scores constant, ambiguous priming (β = 51.96, SE = 13.57, t = 3.83, p < .001, 95% CI [25.17, 78.74]) significantly predicted RT. In contrast, centred IQ scores did not significantly predict RTs when holding age and the priming condition constant (β = -0.57, SE = 0.46, t = -1.25, p = .214, 95% CI [-1.47, 0.33]). These results suggest that predicted RT decreased by approximately 30 ms per year of age and increased by approximately 52 ms under ambiguous priming, whereas IQ did not predict RT above and beyond age and priming condition.
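One way to frame the "above and beyond" question is as a nested model comparison: fit the model without IQ, fit it again with IQ, and test the change. The sketch below uses toy data generated with coefficients similar to those in the simulation code; the data frame toy, its column names, the seed and the sample size are assumptions for this illustration and it does not reproduce the study results. When a single predictor is added, the F for the change equals the square of that predictor's t-statistic, so the comparison and the coefficient test lead to the same conclusion.

```r
set.seed(2)  # arbitrary seed, used for this illustration only
n <- 200

# Toy data with a structure similar to the simulated study
toy <- data.frame(
  age   = round(runif(n, min = 6, max = 16)),
  prime = factor(sample(c("No", "Yes"), n, replace = TRUE)),
  iq_c  = rnorm(n, mean = 0, sd = 15)   # IQ centred at zero
)
toy$rt <- 1000 - 35 * toy$age + 70 * (toy$prime == "Yes") -
  0.5 * toy$iq_c + rnorm(n, mean = 0, sd = 100)

m_base <- lm(rt ~ age + prime, data = toy)         # without IQ
m_full <- lm(rt ~ age + prime + iq_c, data = toy)  # with IQ added

cmp      <- anova(m_base, m_full)                  # F-test for the added term
delta_r2 <- summary(m_full)$r.squared - summary(m_base)$r.squared
f_change <- cmp$F[2]
t_iq     <- summary(m_full)$coefficients["iq_c", "t value"]

print(cmp)
print(c(delta_r2 = delta_r2, f_change = f_change, t_iq_squared = t_iq^2))
```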

# Coefficients Table
tidy(reg_multi, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Multiple Linear Regression: Age, Lexical Ambiguity and IQ Predicting Reaction Time")
Multiple Linear Regression: Age, Lexical Ambiguity and IQ Predicting Reaction Time
term estimate std.error statistic p.value conf.low conf.high
(Intercept) 1004.671 26.363 38.109 < .001 952.643 1056.700
age -29.949 2.295 -13.049 < .001 -34.479 -25.420
la_primesYes 51.957 13.572 3.828 < .001 25.173 78.742
iq_score_centered -0.571 0.458 -1.247 0.214 -1.474 0.333
# Model Fit Table
glance(reg_multi) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics")
Model Fit Statistics
r.squared adj.r.squared sigma statistic p.value
0.509 0.501 88.697 60.87 < .001

One-Way ANOVA


# One-Way ANOVA
anova_oneway <- aov(mean_reaction_time ~ syl_num, data = la_rt_data)

# Summary of ANOVA Results
summary(anova_oneway)
##              Df  Sum Sq Mean Sq F value Pr(>F)  
## syl_num       2   96231   48116   3.125 0.0464 *
## Residuals   177 2725007   15396                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Tukey HSD
TukeyHSD(anova_oneway)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = mean_reaction_time ~ syl_num, data = la_rt_data)
## 
## $syl_num
##                              diff        lwr       upr     p adj
## Disyllable-Monosyllable  27.11897 -26.454715  80.69266 0.4568507
## Trisyllable-Monosyllable 57.07905   3.075522 111.08259 0.0355628
## Trisyllable-Disyllable   29.96008 -23.150279  83.07044 0.3786340

A one-way ANOVA revealed a statistically significant effect of syllable complexity on RT (F(2, 177) = 3.13, p = .046), indicating that at least one syllable group differed significantly from the others. The F-statistic indicates that between-group RT variance was approximately 3 times larger than within-group variance. This result suggests that syllable complexity is associated with differences in semantic processing speed. Tukey HSD post hoc comparisons indicated that trisyllabic RTs (M = 740.49, SD = 123.75) were significantly slower than monosyllabic RTs by 57 ms (M = 683.41, SD = 125.12, p = .036), but not significantly slower than disyllabic RTs (30 ms difference; M = 710.53, SD = 123.42, p = .379). Additionally, disyllabic RTs were not significantly slower (27 ms) than monosyllabic RTs (p = .457). These comparisons suggest that moving from a one-syllable word to a three-syllable word placed increased processing demands on participants.
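The F-statistic and an effect size can be recovered directly from the sums of squares in the ANOVA table. The sketch below is an illustration, not part of the original analysis: eta squared was not reported in the original, the variable names are made up for this example, and the inputs are copied from the ANOVA output, so results should match the reported values only up to rounding.

```r
# Sums of squares and degrees of freedom copied from the one-way ANOVA output
ss_between <- 96231.32;   df_between <- 2
ss_within  <- 2725006.82; df_within  <- 177

# F is the ratio of the between-group to the within-group mean square
f_stat <- (ss_between / df_between) / (ss_within / df_within)

# Upper-tail probability of F under the null hypothesis
p_val <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

# Eta squared: share of total RT variance attributable to syllable group
eta_sq <- ss_between / (ss_between + ss_within)

print(round(c(F = f_stat, p = p_val, eta_sq = eta_sq), 4))
```

Eta squared computed this way is the same quantity as the multiple R-squared of a regression that uses syllable group as its only predictor.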

# ANOVA omnibus result
tidy(anova_oneway) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "One-Way ANOVA: Syllabic Complexity Predicting Reaction Time")
One-Way ANOVA: Syllabic Complexity Predicting Reaction Time
term df sumsq meansq statistic p.value
syl_num 2 96231.32 48115.66 3.125 0.046
Residuals 177 2725006.82 15395.52 NA NA
# Tukey post hoc comparisons
tidy(TukeyHSD(anova_oneway)) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(adj.p.value = ifelse(adj.p.value < .001, "< .001", as.character(adj.p.value))) %>%
  kable(caption = "Tukey Post Hoc Comparisons")
Tukey Post Hoc Comparisons
term contrast null.value estimate conf.low conf.high adj.p.value
syl_num Disyllable-Monosyllable 0 27.119 -26.455 80.693 0.457
syl_num Trisyllable-Monosyllable 0 57.079 3.076 111.083 0.036
syl_num Trisyllable-Disyllable 0 29.960 -23.150 83.070 0.379

Regression Equivalent of One-Way ANOVA


# Regression Equivalent of the One-Way ANOVA
reg_oneway_eq <- lm(mean_reaction_time ~ syl_num, data = la_rt_data)

# Summary of Regression Results
summary(reg_oneway_eq)
## 
## Call:
## lm(formula = mean_reaction_time ~ syl_num, data = la_rt_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -190.49  -99.19  -17.24   77.32  334.56 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          683.41      16.29  41.947   <2e-16 ***
## syl_numDisyllable     27.12      22.67   1.196   0.2331    
## syl_numTrisyllable    57.08      22.85   2.498   0.0134 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 124.1 on 177 degrees of freedom
## Multiple R-squared:  0.03411,    Adjusted R-squared:  0.0232 
## F-statistic: 3.125 on 2 and 177 DF,  p-value: 0.04636

A linear regression model using syllable complexity as a dummy-coded predictor produced the same omnibus result as the one-way ANOVA (F(2, 177) = 3.13, p = .046). The model’s intercept (β₀ = 683.41) represents the mean RT for the monosyllabic group, which served as the reference condition. The disyllabic coefficient (β₁ = 27.12, SE = 22.67, t = 1.20, p = .233) indicated a small non-significant increase in mean RT relative to the monosyllabic group. In contrast, the trisyllabic coefficient (β₂ = 57.08, SE = 22.85, t = 2.50, p = .013) showed a statistically significant increase in mean RT compared to the monosyllabic condition. The coefficients equal the corresponding mean differences in the Tukey post hoc comparisons, although their p-values are smaller because they are not adjusted for multiple comparisons. The model accounted for 3.4% of RT variance (R² = .034, adjusted R² = .023).

# Coefficients Table
tidy(reg_oneway_eq, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = " Equivalent Simple Linear Regression: Syllabic Complexity Predicting Reaction Time")
Equivalent Simple Linear Regression: Syllabic Complexity Predicting Reaction Time
term estimate std.error statistic p.value conf.low conf.high
(Intercept) 683.413 16.292 41.947 < .001 651.261 715.565
syl_numDisyllable 27.119 22.666 1.196 0.233 -17.612 71.850
syl_numTrisyllable 57.079 22.848 2.498 0.013 11.989 102.169
# Model Fit Table
glance(reg_oneway_eq) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics") 
Model Fit Statistics
r.squared adj.r.squared sigma statistic p.value
0.034 0.023 124.079 3.125 0.046

Recoded Reference Group


# Recoded Copy of Syl_Num
syl_num_recoded <- relevel(la_rt_data$syl_num, ref = "Trisyllable")

# Refit Model with New Reference Group
refit_model <- lm(mean_reaction_time ~ syl_num_recoded, data = la_rt_data)

# Summary of Refit Model Results
summary(refit_model)
## 
## Call:
## lm(formula = mean_reaction_time ~ syl_num_recoded, data = la_rt_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -190.49  -99.19  -17.24   77.32  334.56 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   740.49      16.02  46.227   <2e-16 ***
## syl_num_recodedMonosyllable   -57.08      22.85  -2.498   0.0134 *  
## syl_num_recodedDisyllable     -29.96      22.47  -1.333   0.1841    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 124.1 on 177 degrees of freedom
## Multiple R-squared:  0.03411,    Adjusted R-squared:  0.0232 
## F-statistic: 3.125 on 2 and 177 DF,  p-value: 0.04636
# Coefficient Table
tidy(refit_model) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Regression with Trisyllable as Reference Group",
        col.names = c("Term", "Estimate", "SE", "t", "p"))
Regression with Trisyllable as Reference Group
Term Estimate SE t p
(Intercept) 740.492 16.018 46.227 < .001
syl_num_recodedMonosyllable -57.079 22.848 -2.498 0.013
syl_num_recodedDisyllable -29.960 22.470 -1.333 0.184

Discussion

The ambiguity effect observed was statistically significant, allowing rejection of the null hypothesis of no difference in RT between priming conditions. During this data simulation, a conceptual link between analyses became evident: t-tests, simple regressions and one-way ANOVAs are all expressions of the same underlying linear model, which estimates differences between group means. This equivalence is enabled by dummy coding, which converts categorical predictors into binary variables so that the intercept represents the reference group mean and each coefficient represents the deviation of another group from the reference. As a result, the regression equivalents produced the same test statistics and p-values as the t-test and the ANOVA (with F = t² in the two-group case), indicating that the same underlying model is being represented differently. This relationship was also seen when the reference group was recoded from monosyllabic to trisyllabic: the overall model fit remained the same, while the intercept and coefficients changed to reflect the new comparison point.
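The effect of changing the reference group can be sketched on toy data. Everything in the block below (the data frame toy, the shortened level names, the seed and the group sizes) is assumed for illustration and is separate from the study data. It checks that releveling leaves the fitted values and R² untouched, that the intercept becomes the mean of the new reference group, and that the trisyllable versus monosyllable contrast simply changes sign.

```r
set.seed(3)  # arbitrary seed, used for this illustration only

# Toy three-group data; "Mono" is the default reference level
toy <- data.frame(
  syl = factor(rep(c("Mono", "Di", "Tri"), each = 30),
               levels = c("Mono", "Di", "Tri")),
  rt  = c(rnorm(30, 680, 120), rnorm(30, 710, 120), rnorm(30, 740, 120))
)

fit_mono <- lm(rt ~ syl, data = toy)

# Same factor with "Tri" moved to the reference position
toy$syl_tri <- relevel(toy$syl, ref = "Tri")
fit_tri <- lm(rt ~ syl_tri, data = toy)

same_r2     <- isTRUE(all.equal(summary(fit_mono)$r.squared,
                                summary(fit_tri)$r.squared))
same_fitted <- isTRUE(all.equal(unname(fitted(fit_mono)),
                                unname(fitted(fit_tri))))
# Intercept equals the mean of whichever group is the reference
int_is_mean <- isTRUE(all.equal(unname(coef(fit_tri)["(Intercept)"]),
                                mean(toy$rt[toy$syl == "Tri"])))
# The Tri-vs-Mono contrast keeps its size but flips sign
sign_flip   <- isTRUE(all.equal(unname(coef(fit_mono)["sylTri"]),
                                -unname(coef(fit_tri)["syl_triMono"])))

print(c(same_r2 = same_r2, same_fitted = same_fitted,
        int_is_mean = int_is_mean, sign_flip = sign_flip))
```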

Despite these methodological insights, there were several limitations to this simulation study. The absence of a significant IQ effect may reflect the use of an overall intelligence measure, rather than a measure that isolates verbal intelligence or literacy. It may also reflect the small IQ effect specified in the simulation (-0.5 ms per IQ point) relative to the residual noise (SD = 100 ms). Additionally, no data were simulated for specific demographic factors, learning disabilities, or accuracy rates, and no distinction was made between word pair relationship types. These factors may have provided valuable information regarding the developmental differences in processing semantic relationships. Future research should examine the impacts of inhibition and phonological-related learning disabilities on ambiguity resolution in children and adolescents.

Reproducibility Appendix

set.seed(7311)
head(la_rt_data, 10)
##    age la_primes      syl_num  iq_score iq_score_centered mean_reaction_time
## 1   10        No   Disyllable 109.76741          9.767411           664.3658
## 2    8       Yes Monosyllable  90.82436         -9.175636           732.2178
## 3   11       Yes Monosyllable 130.80958         30.809579           706.1575
## 4    8        No  Trisyllable  94.18601         -5.813987           823.4557
## 5   16       Yes Monosyllable 119.62693         19.626932           550.0000
## 6    6       Yes Monosyllable 118.20543         18.205431           790.6346
## 7    8       Yes Monosyllable 103.75687          3.756874           727.6174
## 8    6       Yes Monosyllable  75.85787        -24.142130           882.9211
## 9    6        No   Disyllable 107.99513          7.995127           712.8208
## 10  16       Yes  Trisyllable 107.46913          7.469125           662.7193
str(la_rt_data)
## 'data.frame':    180 obs. of  6 variables:
##  $ age               : num  10 8 11 8 16 6 8 6 6 16 ...
##  $ la_primes         : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 1 2 ...
##  $ syl_num           : Factor w/ 3 levels "Monosyllable",..: 2 1 1 3 1 1 1 1 2 3 ...
##  $ iq_score          : num  109.8 90.8 130.8 94.2 119.6 ...
##  $ iq_score_centered : num  9.77 -9.18 30.81 -5.81 19.63 ...
##  $ mean_reaction_time: num  664 732 706 823 550 ...
sessionInfo()
## R version 4.5.3 (2026-03-11)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.6.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Pacific/Auckland
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] broom_1.0.12  effsize_0.8.1 dplyr_1.2.1   knitr_1.51    ggplot2_4.0.2
## 
## loaded via a namespace (and not attached):
##  [1] Matrix_1.7-4       gtable_0.3.6       jsonlite_2.0.0     compiler_4.5.3    
##  [5] tidyselect_1.2.1   stringr_1.6.0      tidyr_1.3.2        jquerylib_0.1.4   
##  [9] splines_4.5.3      scales_1.4.0       yaml_2.3.12        fastmap_1.2.0     
## [13] lattice_0.22-9     R6_2.6.1           labeling_0.4.3     generics_0.1.4    
## [17] backports_1.5.1    tibble_3.3.1       bslib_0.10.0       pillar_1.11.1     
## [21] RColorBrewer_1.1-3 rlang_1.1.7        stringi_1.8.7      cachem_1.1.0      
## [25] xfun_0.57          sass_0.4.10        S7_0.2.1           cli_3.6.5         
## [29] withr_3.0.2        magrittr_2.0.5     mgcv_1.9-4         digest_0.6.39     
## [33] grid_4.5.3         rstudioapi_0.18.0  nlme_3.1-168       lifecycle_1.0.5   
## [37] vctrs_0.7.1        evaluate_1.0.5     glue_1.8.0         farver_2.1.2      
## [41] rmarkdown_2.31     purrr_1.2.2        tools_4.5.3        pkgconfig_2.0.3   
## [45] htmltools_0.5.9

One surprising finding in my simulated study was the uneven distribution of priming conditions. While some imbalance is expected under simple random assignment, the difference between groups was considerable (105 vs. 75, a difference of 30 participants) and may have influenced my overall findings. Additionally, the variance in RT explained by age (adjusted R² = .456) was large, but this likely reflected the low residual SD used in the simulation. In the population, individual variability in RT would probably be much higher, which would reduce the variance explained by age and the power to detect age effects.

Large Language Models were used to support the code production of this assignment.