Introduction

This study investigated whether ambiguous priming and target syllable complexity influence the semantic processing speed of children and adolescents aged 6 to 16 years, and whether intelligence quotient (IQ) scores predict response times above and beyond age. To determine participants’ overall IQ scores, the Wechsler Intelligence Scale for Children (WISC-V) was administered prior to the study. In a semantic relatedness judgement task, participants decided whether presented word pairs ‘go together’ or have a relationship. Age-appropriate language was used in the task instructions and stimuli to ensure comprehension. Following a sequential priming paradigm, each trial began with a fixation point for 500 ms, followed by a prime word for 150 ms and then a blank screen for 100 ms. A target word was then presented and remained on screen until the participant pressed either a green button for related word pairs (e.g., wave and ocean) or a red button for unrelated word pairs (e.g., wave and hammer). Reaction times were measured from the onset of the target stimulus to the participant’s button press and were used as a measure of semantic processing speed.

Method

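# Packages Used in the Analysis (versions listed in the Reproducibility Appendix)
library(ggplot2)
library(knitr)
library(dplyr)
library(effsize)
library(broom)

# Seed Used for Simulating Data
set.seed(7311)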

# Number of Participants
n <- 180

# Continuous Predictor - Age
age <- rnorm(n, mean = 11, sd = 3)
age <- pmax(age, 6)          
age <- pmin(age, 16)           
age <- round(age)           

# Binary Predictor - Lexical Ambiguity 
la_primes <- sample(c("Yes", "No"), n, replace = TRUE)
la_primes <- factor(la_primes)


# Three-Level Predictor - Syllable Number Complexity 
syl_num <- sample(c("Monosyllable", "Disyllable", "Trisyllable"), n, replace = TRUE)
syl_num <- factor(syl_num, levels = c("Monosyllable", "Disyllable", "Trisyllable"))


# Additional Predictor - IQ Score
iq_score <- rnorm(n, mean = 100, sd = 15)
iq_score <- pmax(iq_score, 40)
iq_score <- pmin(iq_score, 160)
iq_score_centered <- iq_score - 100

# Continuous Outcome - Mean Reaction Time
mean_reaction_time <- 1000 +
  -35 * age +
  70 * (la_primes == "Yes") +
  40 * (syl_num == "Disyllable") +
  80 * (syl_num == "Trisyllable") +
  -0.5 * iq_score_centered +
  rnorm(n, mean = 0, sd = 100)
mean_reaction_time <- pmax(mean_reaction_time, 550)   

# Combining Variables into a Data-set 
la_rt_data <- data.frame(
  age,
  la_primes,
  syl_num,
  iq_score,
  iq_score_centered,
  mean_reaction_time
)

A sample of 180 participants was simulated in R (version 4.5.3) using set.seed(7311), which corresponds to the last four digits of the student ID, to ensure full reproducibility. Age was used as a continuous predictor and simulated from a normal distribution (M = 11, SD = 3), constrained to a plausible range of 6 to 16 years in line with the age administration requirements for the WISC-V, and rounded to the nearest year. Participants’ WISC-V IQ scores were used as an additional continuous predictor, simulated from a normal distribution consistent with the standardisation of all Wechsler Intelligence Scales (M = 100, SD = 15) and constrained to between 40 and 160. IQ scores were centred on the normative mean (by subtracting 100) prior to regression analyses, so that zero represented the normative average IQ and model intercepts reflect predicted reaction time at an IQ of 100. Participants’ individual mean reaction times (RT), measured in milliseconds, were used as the continuous outcome. RTs were constrained to a minimum of 550 ms to avoid physiologically implausible values.

Participants were randomly assigned to one of two lexical ambiguity prime conditions (the binary categorical predictor): an ambiguous prime condition, in which all prime words carried multiple meanings, or an unambiguous control condition, in which all prime words had one clear meaning. Additionally, the number of syllables in the target words was used as a three-level categorical predictor and randomly assigned to participants. The syllabic complexity conditions were monosyllabic (one-syllable target words), disyllabic (two-syllable target words), and trisyllabic (three-syllable target words).

Results

Descriptive Statistics and Visualisations

Participants ranged in age from 6 to 16 years (M = 10.82, SD = 2.90), with a mean WISC-V IQ score of 98.68 (SD = 14.62). Mean RT across all participants was 711.78 ms (SD = 125.54). Priming conditions were unevenly distributed, with 105 participants in the ambiguous prime condition and 75 in the unambiguous condition. For target words, 58 participants were assigned monosyllabic targets, 62 disyllabic targets, and 60 trisyllabic targets.

Variable Descriptions

Variable	Role	Type	Description	Values
‘mean_reaction_time’	Outcome	Continuous	Individual mean reaction time on semantic relatedness tasks in milliseconds	≥550 ms
‘age’	Predictor	Continuous	Age of participants in years	6 - 16
‘la_primes’	Predictor	Binary Categorical	Lexical prime ambiguity presented	Yes, No
‘syl_num’	Predictor	Three-level Categorical	Target words syllable complexity	Monosyllable, Disyllable, Trisyllable
‘iq_score’	Predictor	Additional Continuous	WISC-V intelligence quotient score	40 - 160

Descriptive Statistics Tables

# Continuous Variables Descriptive Statistics Table
descript_stat <- data.frame( 
  Variable = c("Age", "IQ Score", "Individual Mean Reaction Time"), 
  M = round(c(mean(la_rt_data$age),
              mean(la_rt_data$iq_score), 
              mean(la_rt_data$mean_reaction_time)), 2),
  SD = round(c(sd(la_rt_data$age), 
               sd(la_rt_data$iq_score),
               sd(la_rt_data$mean_reaction_time)), 2), 
  Min = round(c(min(la_rt_data$age),
                min(la_rt_data$iq_score), 
                min(la_rt_data$mean_reaction_time)), 2),
  Max = round(c(max(la_rt_data$age), 
                max(la_rt_data$iq_score),
                max(la_rt_data$mean_reaction_time)), 2) 
  )
kable(descript_stat, 
      caption = "Table 1 Descriptive Statistics for Continuous Variables", 
      col.names = c("Variable", "M", "SD", "Min", "Max"))

Table 1 Descriptive Statistics for Continuous Variables
Variable	M	SD	Min	Max
Age	10.82	2.90	6.00	16.00
IQ Score	98.68	14.62	58.12	137.01
Individual Mean Reaction Time	711.78	125.54	550.00	1045.10

# Ambiguity Prime Condition Reaction Times and Frequency
la_rt_data %>% 
  group_by(la_primes) %>%
  summarise(
    n = n(),
    mean = mean(mean_reaction_time),
    sd = sd(mean_reaction_time), 
    min = min(mean_reaction_time),
    max = max(mean_reaction_time)
  ) %>% 
    kable(caption = "Table 2 Mean Reaction Times by Lexical Ambiguity Prime Condition",
        col.names = c("Ambiguous Prime", "n", "M", "SD", "Min", "Max"))

Table 2 Mean Reaction Times by Lexical Ambiguity Prime Condition
Ambiguous Prime	n	M	SD	Min	Max
No	75	687.0027	126.8442	550	1045.0951
Yes	105	729.4790	122.1476	550	998.8324

# Syllable Amount Condition Reaction Times and Frequency
la_rt_data %>%
  group_by(syl_num) %>%
  summarise(
    n = n(),
    mean = mean(mean_reaction_time),
    sd = sd(mean_reaction_time),
    min = min(mean_reaction_time),
    max = max(mean_reaction_time)
  ) %>%
    kable(caption = "Table 3 Mean Reaction Times by Syllable Amount Condition",
        col.names = c("Syllable Count", "n", "M", "SD", "Min", "Max"))

Table 3 Mean Reaction Times by Syllable Amount Condition
Syllable Count	n	M	SD	Min	Max
Monosyllable	58	683.4132	125.1170	550	998.8324
Disyllable	62	710.5322	123.4163	550	1045.0951
Trisyllable	60	740.4922	123.7539	550	983.4734

Visualisations

# Figure 1. Mean Reaction Times by Age
ggplot(la_rt_data, aes(x = age, y = mean_reaction_time)) + 
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) + 
  labs(
    title = "Figure 1. Age and Mean Reaction Time",
    x = "Age", 
    y = "Mean Reaction Time (ms)"
  ) +
  theme_minimal()

# Figure 2. Mean Reaction Times by IQ Scores 
ggplot(la_rt_data, aes(x = iq_score, y = mean_reaction_time)) + 
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", se = FALSE) + 
  labs(
    title = "Figure 2. Intelligence Quotient Scores and Mean Reaction Time",
    x = "Intelligence Quotient Scores", 
    y = "Mean Reaction Time (ms)"
  ) +
  theme_minimal()

# Figure 3: Mean reaction time differences by lexical prime ambiguity
ggplot(la_rt_data, aes(x = la_primes, y = mean_reaction_time))+
  geom_boxplot() +
  labs(
    title = "Figure 3. Lexical Prime Ambiguity and Mean Reaction Time",
    x = "Ambiguous Lexical Prime",
    y = "Mean Reaction Time (ms)"
  ) + 
  theme_minimal()

# Figure 4: Mean reaction time differences by target syllable complexity
ggplot(la_rt_data, aes(x = syl_num, y = mean_reaction_time)) +
  geom_boxplot() +
  labs(
    title = "Figure 4. Target Syllable Complexity and Mean Reaction Time",
    x = "Target Syllable Complexity",
    y = "Mean Reaction Time (ms)"
  ) + 
  theme_minimal()

Hypothesis Testing

The focal analysis of this study examined whether lexical ambiguity priming significantly predicted mean RT in children and adolescents.

H₀: There is no difference in mean RT between participants presented with ambiguous primes and those presented with unambiguous primes.

H₁: Participants presented with ambiguous primes show significantly slower RT than those with unambiguous primes.

Independent Samples T-Test


# T-Test
t_test <- t.test(mean_reaction_time ~ la_primes, data = la_rt_data, var.equal = TRUE)
t_test
## 
##  Two Sample t-test
## 
## data:  mean_reaction_time by la_primes
## t = -2.2635, df = 178, p-value = 0.02481
## alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
## 95 percent confidence interval:
##  -79.507636  -5.444927
## sample estimates:
##  mean in group No mean in group Yes 
##          687.0027          729.4790

# Cohen's D
cohen.d(mean_reaction_time ~ la_primes, data = la_rt_data)
## 
## Cohen's d
## 
## d estimate: -0.3422148 (small)
## 95 percent confidence interval:
##       lower       upper 
## -0.64267752 -0.04175208

A two-tailed independent samples t-test revealed that participants presented with ambiguous primes (M = 729.48, SD = 122.15) exhibited significantly slower mean RTs than participants given unambiguous primes (M = 687.00, SD = 126.84), t(178) = -2.26, p = .025, 95% CI [-79.51, -5.44], d = -0.34, 95% CI [-0.64, -0.04]. The negative signs reflect the order of subtraction (unambiguous minus ambiguous). While the effect size is small, these results suggest that lexical ambiguity in prime words may increase processing demands for young people.
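As a cross-check, the effect size and t statistic can be approximately recovered by hand from the group summaries in Table 2. The sketch below assumes the pooled-standard-deviation form of Cohen's d, which is consistent with the value that cohen.d() reported here; the variable names are illustrative and not part of the original analysis.

```r
# Group summaries copied from Table 2 (unambiguous = "No", ambiguous = "Yes")
n_no  <- 75;  m_no  <- 687.0027; sd_no  <- 126.8442
n_yes <- 105; m_yes <- 729.4790; sd_yes <- 122.1476

# Pooled standard deviation across the two groups
sd_pooled <- sqrt(((n_no - 1) * sd_no^2 + (n_yes - 1) * sd_yes^2) /
                    (n_no + n_yes - 2))

# Cohen's d: mean difference (No minus Yes) in pooled-SD units
d_manual <- (m_no - m_yes) / sd_pooled

# Equal-variance t statistic built from the same ingredients
t_manual <- (m_no - m_yes) / (sd_pooled * sqrt(1 / n_no + 1 / n_yes))

round(c(sd_pooled = sd_pooled, d = d_manual, t = t_manual), 3)
```

Because the inputs are rounded table values, the results should agree with the reported statistics only to within rounding error.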

# T-test results table
tidy(t_test) %>%
  select(estimate, estimate1, estimate2, statistic, p.value, conf.low, conf.high) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Independent Samples t-Test Results",
        col.names = c("Mean Difference", "M (Unambiguous)", "M (Ambiguous)", 
                      "t", "p", "95% CI Lower", "95% CI Upper"))

Independent Samples t-Test Results
Mean Difference	M (Unambiguous)	M (Ambiguous)	t	p	95% CI Lower	95% CI Upper
-42.476	687.003	729.479	-2.264	0.025	-79.508	-5.445

# Cohen's d table
d_result <- cohen.d(mean_reaction_time ~ la_primes, data = la_rt_data)

# unname() stops the "lower" label on conf.int from becoming a row name
data.frame(
  d = round(d_result$estimate, 3),
  magnitude = d_result$magnitude,
  CI_lower = round(unname(d_result$conf.int[1]), 3),
  CI_upper = round(unname(d_result$conf.int[2]), 3)
) %>%
  kable(caption = "Cohen's d Effect Size",
        col.names = c("d", "Magnitude", "95% CI Lower", "95% CI Upper"))

Cohen’s d Effect Size
d	Magnitude	95% CI Lower	95% CI Upper
-0.342	small	-0.643	-0.042

Regression Equivalent of the T-Test


# Regression Equivalent of T-Test
reg_t_eq <- lm(mean_reaction_time ~ la_primes, data = la_rt_data)

# Summary of Regression Results
summary(reg_t_eq)
## 
## Call:
## lm(formula = mean_reaction_time ~ la_primes, data = la_rt_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -179.48  -98.94  -18.40   83.08  358.09 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    687.00      14.33  47.934   <2e-16 ***
## la_primesYes    42.48      18.77   2.264   0.0248 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 124.1 on 178 degrees of freedom
## Multiple R-squared:  0.02798,    Adjusted R-squared:  0.02252 
## F-statistic: 5.124 on 1 and 178 DF,  p-value: 0.02481

# Regression Confidence Intervals
confint(reg_t_eq)
##                   2.5 %    97.5 %
## (Intercept)  658.719519 715.28585
## la_primesYes   5.444927  79.50764

A simple linear regression model using lexical ambiguity prime condition as a dummy-coded binary predictor (0 = unambiguous, 1 = ambiguous) confirmed the t-test results. The intercept (β₀ = 687.00) represents the mean RT for the unambiguous reference group, consistent with the group mean reported above. The positive coefficient (β₁ = 42.48, SE = 18.77, t = 2.26, p = .025, 95% CI [5.44, 79.51]) represents the estimated difference in mean RT between conditions, with the model accounting for approximately 2.8% of RT variance (R² = .028, adjusted R² = .023). The sign is reversed relative to the t-test only because the regression subtracts the unambiguous mean from the ambiguous mean. The two analyses are conceptually linked because they express the same underlying model in different ways: the regression coefficient equals the difference between the group means reported in the t-test, and the model F-statistic (5.12) is the square of the t-statistic.
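The equivalence can be checked on any two-group data set. The sketch below uses small made-up data (the names toy, group and rt are hypothetical and unrelated to la_rt_data) to illustrate that an equal-variance t-test and a dummy-coded regression return the same test statistic up to sign, and that the regression F is the squared t.

```r
# Hypothetical toy data: two groups with different means
set.seed(1)
toy <- data.frame(
  group = factor(rep(c("No", "Yes"), times = c(20, 30))),
  rt    = c(rnorm(20, mean = 690, sd = 120), rnorm(30, mean = 730, sd = 120))
)

# Equal-variance t-test (first factor level minus second)
tt <- t.test(rt ~ group, data = toy, var.equal = TRUE)

# Same comparison as a regression with "No" as the reference level
fit <- lm(rt ~ group, data = toy)
fit_sum <- summary(fit)

t_from_ttest <- unname(tt$statistic)
t_from_lm    <- fit_sum$coefficients["groupYes", "t value"]
f_from_lm    <- unname(fit_sum$fstatistic["value"])

# The t values differ only in sign; F equals t squared
all.equal(abs(t_from_ttest), abs(t_from_lm))
all.equal(f_from_lm, t_from_lm^2)
```

The equality holds only when var.equal = TRUE is set; the default Welch test uses a different standard error and degrees of freedom.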

# Coefficients Table
tidy(reg_t_eq, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Equivalent Simple Linear Regression: Ambiguity Prime Predicting Reaction Time")

Equivalent Simple Linear Regression: Ambiguity Prime Predicting Reaction Time
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	687.003	14.332	47.934	< .001	658.720	715.286
la_primesYes	42.476	18.765	2.264	0.025	5.445	79.508

# Model Fit Table
glance(reg_t_eq) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics")

Model Fit Statistics
r.squared	adj.r.squared	sigma	statistic	p.value
0.028	0.023	124.122	5.124	0.025

Simple Regression


# Simple Regression
reg_simp <- lm(mean_reaction_time ~ age, data = la_rt_data)

# Summary of Regression Results
summary(reg_simp)
## 
## Call:
## lm(formula = mean_reaction_time ~ age, data = la_rt_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -230.359  -64.553   -9.474   60.700  257.381 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1029.616     26.750   38.49   <2e-16 ***
## age          -29.384      2.389  -12.30   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 92.57 on 178 degrees of freedom
## Multiple R-squared:  0.4594, Adjusted R-squared:  0.4563 
## F-statistic: 151.2 on 1 and 178 DF,  p-value: < 2.2e-16

# Regression Confidence Intervals
confint(reg_simp)
##                2.5 %     97.5 %
## (Intercept) 976.8274 1082.40421
## age         -34.0990  -24.66869

A simple linear regression model was used to examine how age predicted RT, indicating that adolescents had faster RTs than younger children (β = -29.38, SE = 2.39, t = -12.30, p < .001, 95% CI [-34.10, -24.67]). The negative slope indicates that predicted RT decreased by approximately 29 ms for each additional year of age, suggesting that older children processed semantic relationships significantly faster than younger children. The narrow 95% CI indicates that the slope was estimated precisely. The intercept (β₀ = 1029.62) is the predicted RT at age zero, which lies outside the sampled age range and has no substantive interpretation. Furthermore, while age accounted for approximately 46% of the variance in RT (R² = .459, adjusted R² = .456), the residual standard error was approximately 93 ms. This error indicates substantial individual variability in mean RTs that is unexplained by age alone, motivating a multiple regression analysis.
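To make the slope concrete, the fitted equation can be evaluated at the youngest, middle and oldest ages in the sample. This is a minimal sketch that hard-codes the rounded coefficients from the output above rather than calling predict() on reg_simp; the helper name predict_rt is hypothetical.

```r
# Rounded coefficients from the simple regression output
b0 <- 1029.616   # intercept: predicted RT (ms) at age 0
b1 <- -29.384    # slope: change in RT (ms) per additional year of age

# Hypothetical helper: predicted mean RT for a given age in years
predict_rt <- function(age) b0 + b1 * age

pred <- predict_rt(c(6, 11, 16))
names(pred) <- c("age_6", "age_11", "age_16")
round(pred, 1)

# Predicted difference across the full 10-year age range
rt_range <- pred[["age_6"]] - pred[["age_16"]]
rt_range
```

With these coefficients the predicted RT falls from roughly 853 ms at age 6 to roughly 559 ms at age 16, a difference of about 294 ms.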

# Coefficients Table
tidy(reg_simp, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Simple Linear Regression: Age Predicting Reaction Time")

Simple Linear Regression: Age Predicting Reaction Time
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	1029.616	26.750	38.490	< .001	976.827	1082.404
age	-29.384	2.389	-12.298	< .001	-34.099	-24.669

# Model Fit Table
glance(reg_simp) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics")

Model Fit Statistics
r.squared	adj.r.squared	sigma	statistic	p.value
0.459	0.456	92.57	151.233	< .001

Multiple Regression


# Multiple Regression
reg_multi <- lm(mean_reaction_time ~ age + la_primes + iq_score_centered, data = la_rt_data)

# Summary of Regression Results
summary(reg_multi)
## 
## Call:
## lm(formula = mean_reaction_time ~ age + la_primes + iq_score_centered, 
##     data = la_rt_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -201.818  -59.512   -7.536   61.313  233.439 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       1004.6714    26.3632  38.109  < 2e-16 ***
## age                -29.9493     2.2951 -13.049  < 2e-16 ***
## la_primesYes        51.9571    13.5718   3.828 0.000179 ***
## iq_score_centered   -0.5708     0.4578  -1.247 0.214118    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 88.7 on 176 degrees of freedom
## Multiple R-squared:  0.5092, Adjusted R-squared:  0.5008 
## F-statistic: 60.87 on 3 and 176 DF,  p-value: < 2.2e-16

# Regression Confidence Intervals
confint(reg_multi)
##                        2.5 %       97.5 %
## (Intercept)       952.642616 1056.7000971
## age               -34.478790  -25.4198275
## la_primesYes       25.172636   78.7416040
## iq_score_centered  -1.474368    0.3327059

A multiple regression model containing age, ambiguous priming and centred IQ scores significantly predicted RT overall (F(3, 176) = 60.87, p < .001) and accounted for approximately 51% of RT variance (R² = .509, adjusted R² = .501). This analysis allows each coefficient to be interpreted as a unique contribution to RT, controlling for the other predictors. Holding priming condition and IQ score constant, age (β = -29.95, SE = 2.30, t = -13.05, p < .001, 95% CI [-34.48, -25.42]) was a significant predictor of RT. Similarly, holding age and IQ score constant, ambiguous priming (β = 51.96, SE = 13.57, t = 3.83, p < .001, 95% CI [25.17, 78.74]) significantly predicted RT. In contrast, centred IQ scores did not significantly predict RT when holding age and priming condition constant (β = -0.57, SE = 0.46, t = -1.25, p = .214, 95% CI [-1.47, 0.33]), so IQ did not predict RT above and beyond age in this sample. These results suggest that predicted RT decreased by approximately 30 ms per additional year of age and increased by approximately 52 ms under ambiguous priming.
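One common way to frame the "above and beyond age" question is a nested model comparison: fit the model without IQ, fit it again with IQ, and test the change. The sketch below is an illustration on made-up data, not the analysis reported here; toy, fit_base and fit_full are hypothetical names. With a single added predictor, the F-test from anova() gives the same p-value as that predictor's t-test in the full model.

```r
# Hypothetical toy data loosely shaped like the study variables
set.seed(42)
n_toy <- 100
toy <- data.frame(
  age = sample(6:16, n_toy, replace = TRUE),
  iq_centered = rnorm(n_toy, mean = 0, sd = 15)
)
toy$rt <- 1000 - 35 * toy$age - 0.5 * toy$iq_centered +
  rnorm(n_toy, mean = 0, sd = 100)

# Nested models: age only, then age plus centred IQ
fit_base <- lm(rt ~ age, data = toy)
fit_full <- lm(rt ~ age + iq_centered, data = toy)

# F-test for the change in fit when IQ is added
comparison <- anova(fit_base, fit_full)
comparison

# Change in R-squared attributable to IQ
delta_r2 <- summary(fit_full)$r.squared - summary(fit_base)$r.squared
delta_r2

# p-value of the IQ coefficient in the full model, for comparison
p_iq <- summary(fit_full)$coefficients["iq_centered", "Pr(>|t|)"]
p_iq
```

The change in R² gives a direct measure of how much variance IQ adds once age is already in the model.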

# Coefficients Table
tidy(reg_multi, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Multiple Linear Regression: Age, Lexical Ambiguity and IQ Predicting Reaction Time")

Multiple Linear Regression: Age, Lexical Ambiguity and IQ Predicting Reaction Time
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	1004.671	26.363	38.109	< .001	952.643	1056.700
age	-29.949	2.295	-13.049	< .001	-34.479	-25.420
la_primesYes	51.957	13.572	3.828	< .001	25.173	78.742
iq_score_centered	-0.571	0.458	-1.247	0.214	-1.474	0.333

# Model Fit Table
glance(reg_multi) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics")

Model Fit Statistics
r.squared	adj.r.squared	sigma	statistic	p.value
0.509	0.501	88.697	60.87	< .001

One-Way ANOVA


# One-Way ANOVA
anova_oneway <- aov(mean_reaction_time ~ syl_num, data = la_rt_data)

# Summary of ANOVA Results
summary(anova_oneway)
##              Df  Sum Sq Mean Sq F value Pr(>F)  
## syl_num       2   96231   48116   3.125 0.0464 *
## Residuals   177 2725007   15396                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Tukey HSD
TukeyHSD(anova_oneway)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = mean_reaction_time ~ syl_num, data = la_rt_data)
## 
## $syl_num
##                              diff        lwr       upr     p adj
## Disyllable-Monosyllable  27.11897 -26.454715  80.69266 0.4568507
## Trisyllable-Monosyllable 57.07905   3.075522 111.08259 0.0355628
## Trisyllable-Disyllable   29.96008 -23.150279  83.07044 0.3786340

A one-way ANOVA revealed a statistically significant effect of syllable complexity on RT (F(2, 177) = 3.13, p = .046), indicating that at least one syllable group mean differed from the others. The F-statistic indicates that the between-group mean square was approximately three times larger than the within-group mean square. Syllable complexity nevertheless accounted for only about 3.4% of the variance in RT. Tukey HSD post hoc comparisons showed that trisyllabic RTs (M = 740.49, SD = 123.75) were significantly slower than monosyllabic RTs by 57 ms (M = 683.41, SD = 125.12, p = .036), but not significantly slower than disyllabic RTs (30 ms; M = 710.53, SD = 123.42, p = .379). Additionally, disyllabic RTs were not significantly slower than monosyllabic RTs (27 ms, p = .457). These comparisons suggest that moving from a one-syllable target to a three-syllable target placed increased processing demands on participants.
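The size of the syllable effect can be read off the ANOVA table as eta-squared, the between-group share of the total sum of squares. The sketch below recomputes it, along with the F-ratio and its p-value, from the sums of squares printed above; the variable names are illustrative.

```r
# Sums of squares and degrees of freedom from the ANOVA table
ss_between <- 96231.32
ss_within  <- 2725006.82
df_between <- 2
df_within  <- 177

# Eta-squared: proportion of total variability between groups
eta_sq <- ss_between / (ss_between + ss_within)

# F-ratio: between-group mean square over within-group mean square
f_ratio <- (ss_between / df_between) / (ss_within / df_within)

# p-value from the upper tail of the F distribution
p_value <- pf(f_ratio, df_between, df_within, lower.tail = FALSE)

round(c(eta_sq = eta_sq, F = f_ratio, p = p_value), 4)
```

For a one-way design, eta-squared is the same quantity as the multiple R-squared of the dummy-coded regression on the same factor.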

# ANOVA omnibus result
tidy(anova_oneway) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "One-Way ANOVA: Syllabic Complexity Predicting Reaction Time")

One-Way ANOVA: Syllabic Complexity Predicting Reaction Time
term	df	sumsq	meansq	statistic	p.value
syl_num	2	96231.32	48115.66	3.125	0.046
Residuals	177	2725006.82	15395.52	NA	NA

# Tukey post hoc comparisons
tidy(TukeyHSD(anova_oneway)) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(adj.p.value = ifelse(adj.p.value < .001, "< .001", as.character(adj.p.value))) %>%
  kable(caption = "Tukey Post Hoc Comparisons")

Tukey Post Hoc Comparisons
term	contrast	estimate	conf.low	conf.high	adj.p.value
syl_num	Disyllable-Monosyllable	27.119	-26.455	80.693	0.457
syl_num	Trisyllable-Monosyllable	57.079	3.076	111.083	0.036
syl_num	Trisyllable-Disyllable	29.960	-23.150	83.070	0.379

Regression Equivalent of One-Way ANOVA


# Regression Equivalent of the One-Way ANOVA
reg_oneway_eq <- lm(mean_reaction_time ~ syl_num, data = la_rt_data)

# Summary of Regression Results
summary(reg_oneway_eq)
## 
## Call:
## lm(formula = mean_reaction_time ~ syl_num, data = la_rt_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -190.49  -99.19  -17.24   77.32  334.56 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          683.41      16.29  41.947   <2e-16 ***
## syl_numDisyllable     27.12      22.67   1.196   0.2331    
## syl_numTrisyllable    57.08      22.85   2.498   0.0134 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 124.1 on 177 degrees of freedom
## Multiple R-squared:  0.03411,    Adjusted R-squared:  0.0232 
## F-statistic: 3.125 on 2 and 177 DF,  p-value: 0.04636

A linear regression model using syllable complexity as a dummy-coded predictor reproduced the omnibus result of the one-way ANOVA (F(2, 177) = 3.13, p = .046). The model’s intercept (β₀ = 683.41) represents the mean RT for the monosyllabic group, which served as the reference condition. The disyllabic coefficient (β₁ = 27.12, SE = 22.67, t = 1.20, p = .233) indicated a small non-significant increase in mean RT relative to the monosyllabic group. In contrast, the trisyllabic coefficient (β₂ = 57.08, SE = 22.85, t = 2.50, p = .013) showed a statistically significant increase in mean RT compared to the monosyllabic condition. The coefficients equal the corresponding mean differences in the Tukey comparisons, but the regression p-values are not adjusted for multiple comparisons and are therefore smaller than the Tukey-adjusted values. The model accounted for 3.4% of RT variance (R² = .034, adjusted R² = .023).

# Coefficients Table
tidy(reg_oneway_eq, conf.int = TRUE) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = " Equivalent Simple Linear Regression: Syllabic Complexity Predicting Reaction Time")

Equivalent Simple Linear Regression: Syllabic Complexity Predicting Reaction Time
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	683.413	16.292	41.947	< .001	651.261	715.565
syl_numDisyllable	27.119	22.666	1.196	0.233	-17.612	71.850
syl_numTrisyllable	57.079	22.848	2.498	0.013	11.989	102.169

# Model Fit Table
glance(reg_oneway_eq) %>%
  select(r.squared, adj.r.squared, sigma, statistic, p.value) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Model Fit Statistics")

Model Fit Statistics
r.squared	adj.r.squared	sigma	statistic	p.value
0.034	0.023	124.079	3.125	0.046

Recoded Reference Group


# Recoded Copy of Syl_Num
syl_num_recoded <- relevel(la_rt_data$syl_num, ref = "Trisyllable")

# Refit Model with New Reference Group
refit_model <- lm(mean_reaction_time ~ syl_num_recoded, data = la_rt_data)

# Summary of Refit Model Results
summary(refit_model)
## 
## Call:
## lm(formula = mean_reaction_time ~ syl_num_recoded, data = la_rt_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -190.49  -99.19  -17.24   77.32  334.56 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   740.49      16.02  46.227   <2e-16 ***
## syl_num_recodedMonosyllable   -57.08      22.85  -2.498   0.0134 *  
## syl_num_recodedDisyllable     -29.96      22.47  -1.333   0.1841    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 124.1 on 177 degrees of freedom
## Multiple R-squared:  0.03411,    Adjusted R-squared:  0.0232 
## F-statistic: 3.125 on 2 and 177 DF,  p-value: 0.04636

# Coefficient Table
tidy(refit_model) %>%
  mutate(across(where(is.numeric), ~ round(., 3))) %>%
  mutate(p.value = ifelse(p.value < .001, "< .001", as.character(p.value))) %>%
  kable(caption = "Regression with Trisyllable as Reference Group",
        col.names = c("Term", "Estimate", "SE", "t", "p"))

Regression with Trisyllable as Reference Group
Term	Estimate	SE	t	p
(Intercept)	740.492	16.018	46.227	< .001
syl_num_recodedMonosyllable	-57.079	22.848	-2.498	0.013
syl_num_recodedDisyllable	-29.960	22.470	-1.333	0.184

Discussion

The ambiguity effect was statistically significant, allowing rejection of the null hypothesis of no difference in RT between priming conditions. During this data simulation, a conceptual link between analyses became evident: t-tests, simple regressions and one-way ANOVAs are all expressions of the same underlying linear model that estimates group mean differences. This equivalence is enabled by dummy coding, which converts a categorical predictor into binary indicator variables so that the intercept represents the reference group mean and each coefficient represents the deviation of another group from the reference. As a result, the regression equivalents of the t-test and ANOVA produced identical test statistics and p-values (with F = t² in the two-group case), indicating that the same underlying model is being represented differently. The same relationship is seen when the reference group is recoded from monosyllabic to trisyllabic: the model fit (F, R² and residual standard error) remained the same, while the intercept and coefficients changed to reflect the new comparison point.
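The dummy coding described above can be inspected directly with model.matrix(). The sketch below uses a three-value toy factor (syl is a hypothetical stand-in for syl_num) to show the indicator columns R builds, and how relevel() changes which level is absorbed into the intercept.

```r
# Toy factor with the same three levels as syl_num
syl <- factor(c("Monosyllable", "Disyllable", "Trisyllable"),
              levels = c("Monosyllable", "Disyllable", "Trisyllable"))

# Default coding: first level (Monosyllable) is the reference group
mm_default <- model.matrix(~ syl)
mm_default

# After releveling, Trisyllable becomes the reference group
syl_recoded <- relevel(syl, ref = "Trisyllable")
mm_recoded <- model.matrix(~ syl_recoded)
mm_recoded
```

In each matrix the reference level is the row whose indicator columns are all zero, which is why its mean is carried by the intercept.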

Despite these methodological insights, there were several limitations to this simulation study. The simulated IQ effect was small (-0.5 ms per IQ point), so the absence of a significant IQ effect may simply reflect limited power to detect an effect of that size against the simulated noise. In a real sample, it may also reflect the use of an overall intelligence measure rather than a measure of verbal intelligence or literacy. Additionally, no data were simulated for demographic factors, learning disabilities or accuracy rates, and no distinction was made between word pair relationship types. These factors may have provided valuable information regarding developmental differences in processing semantic relationships. Future research should examine the impacts of inhibition and phonological-related learning disabilities on ambiguity resolution in children and adolescents.

Reproducibility Appendix

set.seed(7311)
head(la_rt_data, 10)
##    age la_primes      syl_num  iq_score iq_score_centered mean_reaction_time
## 1   10        No   Disyllable 109.76741          9.767411           664.3658
## 2    8       Yes Monosyllable  90.82436         -9.175636           732.2178
## 3   11       Yes Monosyllable 130.80958         30.809579           706.1575
## 4    8        No  Trisyllable  94.18601         -5.813987           823.4557
## 5   16       Yes Monosyllable 119.62693         19.626932           550.0000
## 6    6       Yes Monosyllable 118.20543         18.205431           790.6346
## 7    8       Yes Monosyllable 103.75687          3.756874           727.6174
## 8    6       Yes Monosyllable  75.85787        -24.142130           882.9211
## 9    6        No   Disyllable 107.99513          7.995127           712.8208
## 10  16       Yes  Trisyllable 107.46913          7.469125           662.7193
str(la_rt_data)
## 'data.frame':    180 obs. of  6 variables:
##  $ age               : num  10 8 11 8 16 6 8 6 6 16 ...
##  $ la_primes         : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 1 2 ...
##  $ syl_num           : Factor w/ 3 levels "Monosyllable",..: 2 1 1 3 1 1 1 1 2 3 ...
##  $ iq_score          : num  109.8 90.8 130.8 94.2 119.6 ...
##  $ iq_score_centered : num  9.77 -9.18 30.81 -5.81 19.63 ...
##  $ mean_reaction_time: num  664 732 706 823 550 ...
sessionInfo()
## R version 4.5.3 (2026-03-11)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.6.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Pacific/Auckland
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] broom_1.0.12  effsize_0.8.1 dplyr_1.2.1   knitr_1.51    ggplot2_4.0.2
## 
## loaded via a namespace (and not attached):
##  [1] Matrix_1.7-4       gtable_0.3.6       jsonlite_2.0.0     compiler_4.5.3    
##  [5] tidyselect_1.2.1   stringr_1.6.0      tidyr_1.3.2        jquerylib_0.1.4   
##  [9] splines_4.5.3      scales_1.4.0       yaml_2.3.12        fastmap_1.2.0     
## [13] lattice_0.22-9     R6_2.6.1           labeling_0.4.3     generics_0.1.4    
## [17] backports_1.5.1    tibble_3.3.1       bslib_0.10.0       pillar_1.11.1     
## [21] RColorBrewer_1.1-3 rlang_1.1.7        stringi_1.8.7      cachem_1.1.0      
## [25] xfun_0.57          sass_0.4.10        S7_0.2.1           cli_3.6.5         
## [29] withr_3.0.2        magrittr_2.0.5     mgcv_1.9-4         digest_0.6.39     
## [33] grid_4.5.3         rstudioapi_0.18.0  nlme_3.1-168       lifecycle_1.0.5   
## [37] vctrs_0.7.1        evaluate_1.0.5     glue_1.8.0         farver_2.1.2      
## [41] rmarkdown_2.31     purrr_1.2.2        tools_4.5.3        pkgconfig_2.0.3   
## [45] htmltools_0.5.9

One surprising finding in my simulated study was the uneven distribution of priming conditions. While some imbalance is expected under simple random assignment, the difference between groups was considerable (105 vs. 75, a difference of 30 participants) and may have influenced my overall findings. Additionally, the proportion of RT variance explained by age (adjusted R² = .456) was large, but this likely reflected the relatively low residual SD (100 ms) used in the simulation. In the population, individual variability in RT would probably be much higher, which would reduce both the variance explained by age and the power to detect the age effect.

Age and Intelligence as Predictors of Semantic Relatedness Judgement Speeds in Children and Adolescents: The Role of Semantic Priming, Syllable Complexity and Lexical Ambiguity

Daneka Moroney - Student ID: 589297311

May 01, 2026