The study “Rapid Word Learning Under Uncertainty via Cross-Situational Statistics” by Yu and Smith (2007) explored how adults can learn word-referent pairs in highly ambiguous settings. Past studies of word learning have focused on constraints such as social, attentional, or linguistic cues as solutions to the word-referent mapping problem. While these strategies perform well in controlled, minimally ambiguous contexts, real-world learning environments present learners with far greater complexity.
This raises an important question: can learners successfully acquire word-referent pairs in highly ambiguous settings through alternative means, even when they cannot determine the correct pairings within a single trial? To address this question, Yu and Smith proposed an alternative mechanism, cross-situational learning. They demonstrated that learners can track word-referent pairings across multiple trials by calculating statistical associations over time rather than relying on immediate clarity within each learning instance.
Original effect size, power analysis for samples to achieve 80%, 90%, 95% power to detect that effect size. Considerations of feasibility for selecting planned sample size.
Thirty-eight participants were recruited for the original study, and they received either course credit or $7. To stay consistent with the original design, our replication will aim for a similar or slightly larger sample, recruited from Prolific.
“The stimuli were slides containing pictures of uncommon objects (e.g., canister, facial sauna, and rasp) paired with auditorily presented pseudowords. These artificial words were generated by a computer program to sample English forms that were broadly phonotactically probable; they were produced by a synthetic female voice in monotone. There were 54 unique objects and 54 unique pseudowords partitioned into three sets of 18 words and referents for use in the three conditions. The training trials were generated by randomly pairing each word with one picture; these were the word-referent pairs to be discovered by the learner. The three learning conditions differed in the number of words and referents presented on each training trial: 2-2 Condition: 2 words and 2 pictures; 3-3 Condition: 3 words and 3 pictures; 4-4 Condition: 4 words and 4 pictures” (Yu and Smith 2007)
“The pictures were presented on a 17-in. computer screen, and the sound was played by the speakers connected to the same computer. Subjects were instructed that their task was to learn the words and referents, but they were not told that there was one referent per word. They were told that multiple words and pictures would co-occur on each trial and that their task was to figure out across trials which word went with which picture. After training in each condition, subjects received a four-alternative forced-choice test of learning. On the test, they were presented with 1 word and 4 pictures and asked to indicate the picture named by that word. The target picture and the 3 foils were all drawn from the set of 18 training pictures.” (Yu and Smith 2007)
The primary analysis will involve a one-way ANOVA to compare learning accuracy across the three conditions (2×2, 3×3, and 4×4). In this setup, the independent variable is the condition (level of ambiguity), and the dependent variable is the accuracy of word-object pair identification.
We will also examine response times across conditions to investigate whether higher ambiguity slows responding, which may help in understanding cognitive processing under the different conditions.
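The planned comparison could be sketched as follows. The data are simulated purely so the example runs; the column names `participant`, `condition`, and `accuracy`, the group size, and the accuracy levels are all assumptions for illustration, not results.

```r
# Sketch only: simulated per-participant accuracy, NOT real data.
set.seed(1)
n_per_condition <- 20  # hypothetical group size
sim_data <- data.frame(
  participant = factor(seq_len(3 * n_per_condition)),
  condition = factor(rep(c("2x2", "3x3", "4x4"), each = n_per_condition)),
  # Proportion correct out of 18 test items; success rates are made up
  accuracy = c(rbinom(n_per_condition, size = 18, prob = 0.85),
               rbinom(n_per_condition, size = 18, prob = 0.75),
               rbinom(n_per_condition, size = 18, prob = 0.55)) / 18
)

# One-way ANOVA: accuracy as a function of condition
anova_fit <- aov(accuracy ~ condition, data = sim_data)
anova_table <- summary(anova_fit)[[1]]
print(anova_table)
```

This sketch treats condition as a between-subjects factor. If each participant completes all three conditions, a repeated-measures specification such as `aov(accuracy ~ condition + Error(participant/condition))` may be more appropriate.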
Data cleaning will include the exclusion of trials where response times are excessively high or low to account for inattentiveness or random guessing. Also, participants performing below chance level overall will be excluded from the analysis, as this suggests they may not have engaged meaningfully with the task.
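A minimal sketch of these two exclusion steps is given below. The plan does not state specific response-time cut-offs, so `rt_min` and `rt_max` are placeholder values, and the toy data frame and its column names are assumptions. The chance level of 1/4 follows from the four-alternative forced-choice test; applying the trial-level filter before the participant-level filter is also an assumption.

```r
# Sketch only: toy data and placeholder cut-offs, not the study's actual rules.
trials <- data.frame(
  participant   = rep(c("p1", "p2", "p3"), each = 4),
  response_time = c(3000, 150, 4200, 2500,    # p1: one very fast trial
                    3100, 2900, 60000, 3300,  # p2: one very slow trial
                    2800, 3500, 3000, 2600),  # p3: plausible times, poor accuracy
  correct       = c(TRUE, TRUE, TRUE, FALSE,
                    TRUE, TRUE, FALSE, TRUE,
                    FALSE, FALSE, FALSE, FALSE)
)

rt_min <- 200      # hypothetical lower bound in ms
rt_max <- 30000    # hypothetical upper bound in ms
chance <- 1 / 4    # four-alternative forced choice

# Step 1: drop trials with implausible response times
kept <- trials[trials$response_time >= rt_min & trials$response_time <= rt_max, ]

# Step 2: drop participants whose overall accuracy is below chance
acc_by_participant <- tapply(kept$correct, kept$participant, mean)
good_ids <- names(acc_by_participant)[acc_by_participant >= chance]
cleaned <- kept[kept$participant %in% good_ids, ]

print(acc_by_participant)
```

In this toy data set, one trial each is dropped for p1 and p2, and p3 is excluded entirely for scoring below chance.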
Sample: The original study included 38 undergraduate participants from Indiana University. Our sample may differ slightly due to recruitment constraints; participants will probably be drawn from a broader demographic pool, which could introduce variability in learning abilities or prior exposure to similar experimental tasks. However, as cross-situational learning mechanisms are believed to be consistent across adult populations, the sample difference is not expected to significantly affect the findings.
Setting: In the original study, participants completed the trials in a controlled lab environment. Our replication may be run entirely online. Conducting the experiment outside of a laboratory could introduce additional distractions or variations. As the original research suggests that cross-situational learning effects are resilient to minor environmental changes, we do not expect this difference to significantly influence the outcome.
You can comment this section out prior to final report with data collection.
Sample size, demographics, data exclusions based on rules spelled out in analysis plan
Any differences from what was described as the original plan, or “none”.
Because we had not finished coding the 3×3 and 4×4 conditions, we asked only 10 participants to complete the 2×2 condition in the pilot test. The following confirmatory analysis is therefore based solely on their performance in the 2×2 condition. Without data from the other two conditions, the planned ANOVA cannot be conducted at this time.
accuracy <- mean(cleaned_data$correct, na.rm = TRUE)
print(paste("Overall Accuracy: ", round(accuracy * 100, 2), "%"))
[1] "Overall Accuracy: 82.78 %"
In the pilot test, the overall accuracy in the 2×2 condition across the 10 participants was 0.828, which means that, on average, each participant correctly matched about 15 of the 18 pseudowords with their corresponding images (pictures of uncommon objects, as noted above).
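The conversion from proportion correct to items correct can be checked with a few lines of arithmetic. The accuracy and item count come from the text above; the chance level of 1/4 is taken from the four-alternative forced-choice test format and is included only as a reference point.

```r
# Values taken from the text: 82.78% accuracy, 18 test items, 4 response options
accuracy <- 0.8278
n_items  <- 18
chance   <- 1 / 4

items_correct <- accuracy * n_items   # expected correct items per participant
items_chance  <- chance * n_items     # expected correct items when guessing

round(items_correct, 1)  # 14.9
items_chance             # 4.5
```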
# Mean accuracy for each target image
image_accuracy <- cleaned_data %>%
  group_by(correct_image) %>%
  summarize(accuracy = mean(correct, na.rm = TRUE))

# linewidth replaces the size aesthetic for lines (size is deprecated for lines since ggplot2 3.4.0)
ggplot(image_accuracy, aes(x = correct_image, y = accuracy)) +
  geom_line(color = "forestgreen", linewidth = 1) +
  geom_point(color = "skyblue3", size = 2) +
  scale_x_continuous(breaks = seq(1, 18, 1)) +
  labs(x = "Image",
       y = "Accuracy")
The line plot presents the accuracy for each image, ranging from 0.6 to 1.0. Differences in accuracy could be due to certain images being more ambiguous, unfamiliar, or harder to associate with the correct pseudowords.
response_time <- cleaned_data %>%
  group_by(correct) %>%
  summarize(
    mean_response_time = mean(response_time, na.rm = TRUE),
    median_response_time = median(response_time, na.rm = TRUE),
    sd_response_time = sd(response_time, na.rm = TRUE)
  )
print(response_time)
# A tibble: 2 × 4
  correct mean_response_time median_response_time sd_response_time
  <lgl>                <dbl>                <dbl>            <dbl>
1 FALSE                4356.                 4071            1820.
2 TRUE                 3453.                 3112            1683.
As shown in the table, participants in the pilot test took significantly longer to respond when their answers were incorrect (mean response time = 4356.23 ms, SD = 1819.59 ms) compared to when their answers were correct (mean response time = 3453.13 ms, SD = 1682.96 ms).
t_test <- t.test(
  response_time ~ correct,
  data = cleaned_data
)
print(t_test)
Welch Two Sample t-test
data: response_time by correct
t = 3.6007, df = 84.099, p-value = 0.0005357
alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
95 percent confidence interval:
404.3413 1401.8553
sample estimates:
mean in group FALSE mean in group TRUE
4356.226 3453.128
Since p < 0.05 (p = 0.0005), we reject the null hypothesis and conclude that response times differ significantly between the two groups: correct responses (correct = TRUE) are faster than incorrect responses (correct = FALSE).
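One caveat is that the Welch test above treats every trial as an independent observation, although each participant contributes many trials. A common alternative, sketched here on simulated data, is to average response times within each participant and outcome first and then run a paired comparison across participants. The `participant` column, the trial counts, and all response-time values below are assumptions for illustration, not the pilot data.

```r
# Sketch only: simulated trial-level data, NOT the pilot data.
set.seed(42)
n_participants <- 10
n_trials <- 18
sim <- data.frame(
  participant = rep(seq_len(n_participants), each = n_trials),
  # Force 3 incorrect trials per participant so every cell is non-empty
  correct = rep(c(rep(TRUE, 15), rep(FALSE, 3)), times = n_participants)
)
# Made-up response times in ms; incorrect trials are slower on average
sim$response_time <- rnorm(nrow(sim),
                           mean = ifelse(sim$correct, 3400, 4300),
                           sd = 800)

# Mean response time per participant and outcome
cell_means <- aggregate(response_time ~ participant + correct,
                        data = sim, FUN = mean)
cell_means <- cell_means[order(cell_means$participant), ]
rt_incorrect <- cell_means$response_time[!cell_means$correct]
rt_correct   <- cell_means$response_time[cell_means$correct]

# Paired t-test across participants (one pair of means per participant)
paired_test <- t.test(rt_incorrect, rt_correct, paired = TRUE)
print(paired_test)
```

With real data, a participant who makes no errors would have no mean for incorrect trials and would need to be handled explicitly before the paired test.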
# Store the result under a new name so the cohens_d() function is not masked
d_result <- cohens_d(response_time ~ correct, data = cleaned_data)
d_result
Cohen's d |       95% CI
------------------------
0.53      | [0.25, 0.81]

- Estimated using pooled SD.
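For reference, the standard pooled-SD definition of Cohen's d can be written out in base R as below. The helper `cohens_d_pooled` and the toy vectors are hypothetical and serve only to show the formula; that the package output above follows exactly this computation is an assumption based on its "pooled SD" note.

```r
# Cohen's d with a pooled standard deviation, written out in base R.
cohens_d_pooled <- function(x, y) {
  n_x <- length(x)
  n_y <- length(y)
  # Pooled SD: variances weighted by their degrees of freedom
  pooled_sd <- sqrt(((n_x - 1) * var(x) + (n_y - 1) * var(y)) /
                      (n_x + n_y - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Toy vectors for illustration only (not the pilot data)
slow <- c(4, 5, 6)
fast <- c(1, 2, 3)
cohens_d_pooled(slow, fast)  # 3
```

Both toy groups have a variance of 1, so the pooled SD is 1 and d equals the raw mean difference of 3.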
pwr_result <- pwr.t.test(
  d = 0.53,
  sig.level = 0.05,
  power = 0.80,
  type = "two.sample"
)
print(pwr_result)
Two-sample t test power calculation
n = 56.86016
d = 0.53
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
The pilot test suggests a medium effect size of d = 0.53 in the 2×2 condition. Using the pwr package in R (the pwr.t.test call above), we conducted a power analysis for a two-sample t-test with a significance level of alpha = 0.05 and a desired power of 0.80. The results indicate that we would need at least 57 participants per group to achieve sufficient statistical power to detect this effect size. However, this estimate may be biased because it is based only on the 2×2 condition.
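The same calculation can be extended to the 90% and 95% power levels. The sketch below uses base R's `power.t.test` rather than the pwr package, with the pilot estimate d = 0.53 entered as `delta` and `sd = 1` so that the difference is expressed in standard-deviation units. It assumes the pilot effect size is a reasonable planning value, which is uncertain given the small pilot.

```r
# Base-R counterpart to the power calculation, using d = 0.53 from the pilot.
# With sd = 1, delta is expressed in standard-deviation units (i.e. Cohen's d).
target_power <- c(0.80, 0.90, 0.95)
n_per_group <- sapply(target_power, function(p) {
  power.t.test(delta = 0.53, sd = 1, sig.level = 0.05, power = p,
               type = "two.sample", alternative = "two.sided")$n
})
names(n_per_group) <- target_power

# Round up, since a fraction of a participant cannot be recruited
ceiling(n_per_group)
```

The first entry corresponds to the 80% power case reported above; the required sample size grows as the target power increases.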
Any follow-up analyses desired (not required).
Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.
Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.