Replication of “Repeat after us: Syntactic alignment is not partner-specific” by Rachel Ostrand and V.S. Ferreira (2019, Journal of Memory and Language)

Author

A.J. Schwartz (avschwartz@ucsd.edu)

Published

December 9, 2024

Introduction

I chose this experiment because of my current research interest in conversational alignment in psycholinguistics. In short, the experiment tests whether speakers align syntactically to their specific conversation partner, as opposed to aligning more generally with whatever structures they have recently been exposed to. My past research, while psycholinguistic in nature, focused on learning rather than conversation, so I hope this project will encourage my reading on and exploration of syntactic alignment. The paper was produced in part by Dr. Ferreira’s lab, where I currently work, so I will have access to people who are familiar with the data and the background of the study. I also have very little experience analyzing psycholinguistic data; the experiment I conducted as an undergraduate mainly required analysis of cognitive test results. I hope to do the computational reproducibility project rather than the replication project for three reasons. First, I believe that hands-on experience with data analysis is more important to my studies than learning a tool like jsPsych, especially because much of my experimentation happens in person. Second, I have access to data that closely resembles what I would want to collect for this experiment. Third, using that data spares me the many hours it takes to transcribe recorded speech.

While I have access to the data, I do not have access to the code used to run the analyses or create the figures. For this project, I hope to reproduce the findings of Experiments 4 and 5, which I consider the most important experiments in the paper. Experiment 4 eliminates some of the issues in the previous three experiments and demonstrates a lack of partner-specific alignment. Experiment 5 takes that further and shows a lack of partner-specific alignment even when the partner may not understand other syntactic structures. For Experiment 4, I hope to fit a 4 (Syntax Exposure Condition: all-Prepositional Dative, speaker-specific, mixed, all-Double Object) × 2 (Listener: Experimenter A, Experimenter B) GLMM, with one sub-model comparing the two single-structure conditions (all-PD vs. all-DO) and one sub-model comparing the speaker-specific and mixed conditions. I will run the same analysis for Experiment 5. Each model will be accompanied by a visualization, including the sub-models, which have no visualizations in the original paper. I hope to gain some experience with ggplot2: I have a strong background in design and a general interest in data visualization, but no hands-on experience with it. The main challenge will be working with linguistic rather than numerical data and figuring out how to translate transcribed sentences into data. I would also like to take some time to make the visualizations not only legible but visually appealing. In the past, I have relied on programs like JASP and Jamovi to analyze my data for me, but I think it will be important to learn how to do that in R, considering the limitations of such software.

Methods

Power Analysis

A power analysis could not be conducted: the original analysis used a GLMM, and the paper reports no effect size on which to base one.

Planned Sample

Experiment Four

Ninety-six participants were drawn from the University of California, San Diego subject pool. Participants were excluded if they had taken part in previous experiments in the same study. All participants were required to be native, monolingual speakers of English.

Experiment Five

Materials

“The stimuli were 96 colored pictures … consisting of 72 unique dative pictures and 24 unique intransitive pictures. As before, the dative events could be described using either a prepositional dative (PD) or double object (DO) structure. For item counterbalancing purposes, the dative pictures were divided into three item sets of 24. For a particular participant, Experimenter A described one set of dative pictures, Experimenter B described a different set, and the participant described the third set. Across participants, the items sets were counterbalanced among the three speakers in the experiment, so that a particular picture was described by Experimenter A for 1/3 of participants, described by Experimenter B for 1/3 of participants, and described by the participant for 1/3 of participants.”

“In certain conditions, the intransitive pictures were described by the experimenters as filler items, to hold constant the number of exposure sentences of a given dative structure across the conditions … Additionally, all participants described the intransitive pictures interleaved with the critical dative pictures, to reduce the influence of self-priming from one sentence to the next. The intransitive pictures had a simple event structure (e.g., “The woman is sleeping”) which made it unlikely that participants would produce a sentence containing even a single object, so as not to prime one of the dative structures more than the other.”
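The item-set counterbalancing described in the Materials can be pictured as a simple rotation of three sets across three speaker roles. The sketch below is only an illustration under that assumption; the set labels, the `assign_sets()` helper, and the list index are hypothetical and do not come from the original materials.

```r
# Hypothetical labels for the three sets of 24 dative pictures
item_sets <- c("Set1", "Set2", "Set3")
roles <- c("ExperimenterA", "ExperimenterB", "Participant")

# list_id is an assumed counterbalancing-list index (1, 2, or 3)
assign_sets <- function(list_id) {
  shift <- (list_id - 1) %% 3
  rotated <- item_sets[((seq_along(item_sets) - 1 + shift) %% 3) + 1]
  setNames(rotated, roles)
}

# One row per counterbalancing list, one column per speaker role
design <- t(sapply(1:3, assign_sets))
rownames(design) <- paste0("List", 1:3)
print(design)
```

Under this rotation, each set appears once in every column, so a given picture would be described by each speaker for one third of participants, as the quoted passage requires.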

Procedure

Overall Procedure

“Participants were told they were playing a conversational picture-matching game with the experimenters. One participant and one experimenter sat across a table from each other, separated by an opaque barrier that was high enough to block the other’s table space but low enough to easily see each other’s face and upper body. Each partner had a series of pictures; the task throughout the experiment was for one partner to describe the pictures to the other partner, who put his/her own pictures in the same order.

Each participant interacted with two experimenters, one at a time. Both experimenters were female. A round began when the first experimenter entered the participant’s room and described six distinct pictures to the participant (two each of transitive, locative, and dative events), while the participant arranged his own cards in the order described (Exposure Phase A). The first experimenter gave the participant two 2-digit multiplication problems to complete and left the room. The purpose of the math problems was to provide a cover task to allow the experimenter to leave; performance on the math problems was not measured. After 30 seconds, the second experimenter entered the participant’s room, collected the math problems, and described a new set of six distinct pictures to the participant (Exposure Phase B). The second experimenter then gave the participant a new pair of math problems and left the room. Finally, one experimenter returned, this time as the listener, and laid out the participant’s pictures in a predetermined and pseudo-random order such that two pictures of the same event type would not be described consecutively. The participant described all 12 pictures that he had just heard (both experimenters’ full set of six) to the listening experimenter (Test Phase). Thus, for each picture the participant described, the listening experimenter was either the same or different person as the experimenter who had originally described that picture to the participant. This process comprised a complete round (Experimenter A described six unique pictures, Experimenter B described six unique pictures, then the participant described the same 12 pictures), and occurred for four rounds, each containing different pictures. Each experimenter described a total of 24 distinct pictures, and the participant described all 48 pictures over the course of the experiment.

All factors (including nuisance factors) were fully counterbalanced either (or both) within or between participants. The order of the two experimenters during each exposure phase, and identity of the experimenter during each test phase, was counterbalanced across rounds for a given participant and also across participants. That is, each participant listened to Experimenter A and then Experimenter B during two exposure rounds, and Experimenter B and then Experimenter A during the other two exposure rounds. Similarly, each participant described his pictures to Experimenter A for two test rounds and to Experimenter B for the other two test rounds. Additionally, picture-structure mapping was counterbalanced across participants, such that half of participants heard (e.g.) Fig. 1 described using a DO, and the other half heard it described using a PD. The order in which each experimenter described her pictures, and the order in which participants described their pictures, was also counterbalanced across participants.

One experimenter described each transitive picture using an active sentence, each locative picture using a with-locative sentence, and each dative picture using a double object sentence. The other experimenter produced only passive, on-locative, and prepositional dative sentences. The identity of the experimenter who had each syntactic preference was also counterbalanced between participants, such that half of the participants heard actives, with-locatives, and DOs from “Hannah”, and passives, on-locatives, and PDs from “Victoria”, and the other half of participants heard the reversed mapping. At the start of the experiment, participants were told there were multiple experimenters running the experiment at the same time, and thus they might encounter a new experimenter later in the experiment, but were never explicitly informed about each experimenter’s syntactic preferences, or that the pictures could be described using multiple syntactic structures.”

Experiment 4 Specific Procedure

“The general procedure was similar to that of the previous experiments, with a few important differences. As before, each subject interacted with two experimenters across the experiment, and only one experimenter was ever in the room with the subject at a time. Across participants, there were seven people who acted as experimenters, six female and one male; different participants were tested by different pairs of experimenters, assigned non-systematically based on availability.

One goal for Experiment 4 was to ensure that the experimental manipulation was sufficiently powerful to detect partner-specific syntactic alignment should such alignment exist. This was addressed with four methodological changes from the previous experiments. First, to increase the amount of exposure to each experimenter’s syntactic preferences, transitive and locative pictures were removed so that all critical trials involved dative pictures, thus increasing the number of sentences of one alternation (PD vs. DO) that participants heard from the experimenters. Second, for the same reason, all of the rounds in which the experimenter described her pictures preceded all of the rounds in which the participant described his pictures. This maximized the amount of syntactic exposure that participants received from each experimenter before describing their own pictures. As a result of these two design changes, participants heard 24 sentences of a given structure (PD or DO) before they described any pictures themselves.

Third, to verify that participants were aware that they were interacting with two distinct experimenters and could remember specific statements that were said by each person, each round began with the experimenter telling the participant a fictional but plausible fact about herself – for example, that the experimenter grew up in New York City (an uncommon occurrence among students attending a public university in California). Participants were told at the beginning of the experiment that the facts would “come up later”, and were tested on which experimenter had said each fact after the main experiment.

The fourth departure from previous experiments was that each participant was randomly assigned to one of four between-participant syntax exposure groups. As before, each experimenter described a total of 24 pictures across four rounds to the participant. In the all-PD exposure condition, both experimenters described all of their dative events using PDs. Thus participants in this condition heard a total of 24 PDs, 0 DOs, and 24 intransitives (12 PDs and 12 intransitives from each experimenter). The all-DO exposure condition was the reverse: both experimenters described all of their dative events using DOs. Thus, in this condition, participants heard a total of 0 PDs, 24 DOs, and 24 intransitives (12 DOs and 12 intransitives from each experimenter). Comparing the relative rate of participants’ PD and DO production in these two conditions will permit testing for global syntactic alignment. If hearing 24 sentences of a given alternation is sufficient to affect participants’ syntactic production, then participants in the all-PD condition should produce more PDs than do participants in the all-DO condition (following Kaschak, 2007). In the mixed condition, both experimenters produced both structures at the same rate, each describing half of her dative events using PDs and half using DOs. Thus, participants heard a total of 24 PDs, 24 DOs, and 0 intransitives (12 PDs and 12 DOs from each experimenter). In the critical, speaker-specific exposure condition, participants heard the same overall number of sentences of each structure as in the mixed condition (24 PDs, 24 DOs, and 0 intransitives). However, each experimenter described all of her pictures using only her preferred structure. Thus, participants in this condition heard all 24 DOs from Experimenter A and all 24 PDs from Experimenter B. Across the four syntax exposure conditions, all participants received the same amount of exposure to a particular structure – 24 sentences. 
If hearing 24 sentences of one structure is sufficient to affect syntactic production in one condition, it should be for the others as well. See Table 3 for a summary of the exposure conditions.”
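The exposure counts quoted above can be laid out as a small table to check that every condition contains the same total number of exposure sentences. This is a sketch built only from the numbers in the quoted procedure; the data frame and column names are my own and are not taken from the original Table 3.

```r
# Number of exposure sentences heard by a participant in each condition,
# summed over both experimenters (values taken from the quoted procedure)
exposure <- data.frame(
  condition    = c("all-PD", "all-DO", "mixed", "speaker-specific"),
  PD           = c(24, 0, 24, 24),
  DO           = c(0, 24, 24, 24),
  intransitive = c(24, 24, 0, 0)
)

# Each experimenter describes 24 pictures, so every row should total 48
exposure$total <- exposure$PD + exposure$DO + exposure$intransitive
print(exposure)
```

The intransitive fillers are what keep the total at 48 in the two single-structure conditions, so the conditions differ in which dative structures are heard rather than in the amount of exposure to a given structure.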

Experiment 5 Specific Procedure

“The experimental design and procedure was identical to that of Experiment 4 except that one experimenter was a non-native English speaker with a heavy Mandarin accent. She began learning English at age 9 and did not live in an English-speaking country (USA) until age 18. The second experimenter was a native (unaccented) English speaker, as in all previous experiments. There were three people who acted as the native experimenter across participants, two female and one male; different participants were tested by a different native experimenter (assigned non-systematically based on availability) paired with the same non-native experimenter. In the speaker-specific condition, syntactic preference of the native and non-native experimenter was counterbalanced across participants.”

Analysis Plan

“The included sentence productions [will be] submitted to a 4 (Syntax Exposure Condition: all-PD, speaker-specific, mixed, all-DO)× 2 (Listener: Experimenter A, Experimenter B) GLMM.”

Within each of Experiments 4 and 5, there will also be two planned comparisons: all-PD vs. all-DO (the single-structure conditions) and speaker-specific vs. mixed. Each comparison will have a visualization.
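To make the 4 × 2 fixed-effects structure concrete, the sketch below builds the design matrix that R would generate for the two crossed factors under its default treatment coding. It assumes the factor level labels used later in the data preparation and covers only the fixed effects, not the random effects of the GLMM.

```r
# All eight cells of the 4 (Syntax Exposure) x 2 (Listener) design
cells <- expand.grid(
  Listener = factor(c("E1", "E2"), levels = c("E1", "E2")),
  SyntaxExposure = factor(
    c("All-PD", "All-DO", "Mixed", "Speaker-specific"),
    levels = c("All-PD", "All-DO", "Mixed", "Speaker-specific")
  )
)

# Fixed-effects design matrix under R's default treatment contrasts
X <- model.matrix(~ Listener * SyntaxExposure, data = cells)
print(colnames(X))
```

With treatment coding, the intercept corresponds to the reference cell (All-PD exposure, listener E1), and every other coefficient is a difference from that cell on the log-odds scale. Sum-coded contrasts would change this interpretation.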

These analyses will attempt to confirm the general finding of the study: syntactic alignment does happen in conversation, but it is not specific to the person the participant is speaking to. In a conversation with two speakers, one who uses more DO structures and one who does not, participants should use more DO structures regardless of which experimenter they are speaking to.

Differences from Original Study

Unlike the original study, this reproduction will include a visualization for each analysis of Experiments 4 and 5. Everything else should be largely the same as the original, since this is a reproduction.

Reproducibility Pipeline

The data come from the project’s OSF collection and are already transcribed and cleaned. Each participant response is marked as either fitting or not fitting the expected sentence structure. The data will be entered into a binomial generalized linear mixed model (GLMM), fit with the glmer() function from the lme4 package. The original author of the paper noted that the preparation she did before the experiment ended up being largely unnecessary, so that step will not be repeated. I will then use ggplot2 to create visualizations of the data. By the end of this reproduction, I hope to have visualizations similar to the original paper’s and to be able to confirm its results.
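Before fitting any model, it can help to look at the raw proportion of PD responses in each design cell, since those proportions are what the plots summarize. The sketch below uses a small invented data set (the values are made up for illustration and are not the OSF data); only the column names follow the ones used in the analysis code.

```r
# Toy data: 1 = participant produced a PD, 0 = participant produced a DO
toy <- data.frame(
  Subject        = rep(c("s1", "s2", "s3", "s4"), each = 4),
  SyntaxExposure = rep(c("All-PD", "All-PD", "All-DO", "All-DO"), each = 4),
  Listener       = rep(c("E1", "E2"), times = 8),
  Target_PD      = c(1, 1, 1, 0,  1, 1, 0, 1,  0, 1, 0, 0,  1, 0, 0, 0)
)

# The binomial model assumes a 0/1 outcome, so check that first
stopifnot(all(toy$Target_PD %in% c(0, 1)))

# Proportion of PD responses in each exposure-by-listener cell
cell_means <- aggregate(Target_PD ~ SyntaxExposure + Listener,
                        data = toy, FUN = mean)
print(cell_means)
```

These cell means ignore the subject and picture groupings that the GLMM accounts for, so they are a descriptive check rather than a substitute for the model.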

Actual Sample

The sample size for each experiment is 96 UCSD undergraduates.

Differences from pre-data collection methods plan

None

Data preparation

Data preparation following the analysis plan.

# Load packages
library(lme4)      # For mixed-effects models

Loading required package: Matrix

library(car)       # For sum contrasts, if needed

Loading required package: carData

library(dplyr)     # For data manipulation


Attaching package: 'dplyr'

The following object is masked from 'package:car':

    recode

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

library(ggplot2)   # For plotting
library(ggthemes)  # ggplot2 themes
library(emmeans)   # Single structure comparison

Welcome to emmeans.
Caution: You lose important information if you filter this package's results.
See '? untidy'

library(papaja) #tables

Loading required package: tinylabels

library(tinylabels) #tables

# Load in data
expFour <- read.csv("/Users/averyschwartz/Desktop/Stats/Final Project/Exp4_ClearTrain.csv")
expFive <- read.csv("/Users/averyschwartz/Desktop/Stats/Final Project/Exp5_TrainInSpain.csv")

# Convert the Experiment 4 predictors to factors with explicit level order
expFour$Listener <- factor(expFour$Listener, levels = c("E1", "E2"))
expFour$SyntaxExposure <- factor(expFour$SyntaxExposure, levels = c("All-PD", "All-DO", "Mixed", "Speaker-specific"))

# Coerce the DV to numeric (expected coding: 1 = PD produced, 0 = DO produced)
expFour$Target_PD <- as.numeric(expFour$Target_PD)

# Grouping factors for the random effects
expFour$Subject <- factor(expFour$Subject)
expFour$Picture <- factor(expFour$Picture)

# Convert the Experiment 5 predictors to factors with explicit level order
expFive$Listener_nativeness <- factor(expFive$Listener_nativeness, levels = c("native", "non-native"))
expFive$SyntaxExposure <- factor(expFive$SyntaxExposure, levels = c("All-PD", "All-DO", "Mixed", "Speaker-specific"))

# Coerce the DV to numeric (expected coding: 1 = PD produced, 0 = DO produced)
expFive$Target_PD <- as.numeric(expFive$Target_PD)

# Grouping factors for the random effects
expFive$Subject <- factor(expFive$Subject)
expFive$Picture <- factor(expFive$Picture)
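One thing worth checking after this kind of preparation is that no rows were silently lost: `factor()` with an explicit `levels` argument turns any label that does not match exactly into `NA`, and `as.numeric()` does not by itself guarantee a 0/1 outcome. The sketch below shows both checks on invented values; the misspelled label and the `check_binary()` helper are hypothetical.

```r
# A label that differs only in capitalization does not match any level
raw_condition <- c("All-PD", "All-DO", "Mixed", "Speaker-specific",
                   "speaker-specific")
condition <- factor(raw_condition,
                    levels = c("All-PD", "All-DO", "Mixed", "Speaker-specific"))

# Number of values that became NA during the conversion
n_unmatched <- sum(is.na(condition))
print(n_unmatched)

# Hypothetical helper: TRUE only if every non-missing value is 0 or 1
check_binary <- function(x) all(x[!is.na(x)] %in% c(0, 1))

print(check_binary(as.numeric(c("1", "0", "1"))))
print(check_binary(c(0, 1, 2)))
```

Running the same two checks on `expFour` and `expFive` would confirm that the level labels in the code match the labels in the CSV files.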

Confirmatory analysis

The analyses as specified in the analysis plan.

# Fit the full model for Experiment 4
modelFour <- glmer(Target_PD ~ Listener * SyntaxExposure + (1 + Listener * SyntaxExposure | Subject) + (1 | Picture), 
              data = expFour,
              family = binomial(link = "logit"),
              control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 50000)))

boundary (singular) fit: see help('isSingular')

summary(modelFour)

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: 
Target_PD ~ Listener * SyntaxExposure + (1 + Listener * SyntaxExposure |  
    Subject) + (1 | Picture)
   Data: expFour
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 50000))

     AIC      BIC   logLik deviance df.resid 
  1684.5   1925.0   -797.2   1594.5     1503 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.7286 -0.5164  0.1700  0.4971  4.7310 

Random effects:
 Groups  Name                                      Variance Std.Dev. Corr 
 Subject (Intercept)                               2.9109   1.7061        
         ListenerE2                                0.9182   0.9582    0.94
         SyntaxExposureAll-DO                      0.4858   0.6970   -0.33
         SyntaxExposureMixed                       0.7110   0.8432   -0.80
         SyntaxExposureSpeaker-specific            2.1088   1.4522    0.01
         ListenerE2:SyntaxExposureAll-DO           0.6363   0.7977    0.60
         ListenerE2:SyntaxExposureMixed            0.6250   0.7906   -0.88
         ListenerE2:SyntaxExposureSpeaker-specific 3.0900   1.7578   -0.75
 Picture (Intercept)                               1.4277   1.1948        
                                    
                                    
                                    
 -0.25                              
 -0.80  0.39                        
 -0.07 -0.04 -0.06                  
  0.47  0.45 -0.31  0.07            
 -0.99  0.24  0.85  0.07 -0.38      
 -0.70  0.20  0.57 -0.56 -0.49  0.65
                                    
Number of obs: 1548, groups:  Subject, 96; Picture, 72

Fixed effects:
                                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 1.3942     0.4474   3.116  0.00183
ListenerE2                                 -0.1828     0.4127  -0.443  0.65773
SyntaxExposureAll-DO                       -1.6059     0.5702  -2.817  0.00485
SyntaxExposureMixed                        -1.0847     0.5123  -2.117  0.03422
SyntaxExposureSpeaker-specific             -0.2178     0.6675  -0.326  0.74422
ListenerE2:SyntaxExposureAll-DO            -0.2683     0.6238  -0.430  0.66712
ListenerE2:SyntaxExposureMixed              0.1202     0.4805   0.250  0.80251
ListenerE2:SyntaxExposureSpeaker-specific  -0.2448     0.5723  -0.428  0.66883
                                            
(Intercept)                               **
ListenerE2                                  
SyntaxExposureAll-DO                      **
SyntaxExposureMixed                       * 
SyntaxExposureSpeaker-specific              
ListenerE2:SyntaxExposureAll-DO             
ListenerE2:SyntaxExposureMixed              
ListenerE2:SyntaxExposureSpeaker-specific   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) LstnE2 SEA-DO SyntEM SynES- LE2:SEA LE2:SEM
ListenerE2   0.070                                            
SyntxExA-DO -0.702 -0.054                                     
SyntxExpsrM -0.780 -0.062  0.612                              
SyntxExpsS- -0.596 -0.046  0.468  0.520                       
LsE2:SEA-DO -0.047 -0.663  0.226  0.040  0.031                
LstnrE2:SEM -0.061 -0.857  0.046 -0.014  0.039  0.569         
LstnE2:SES- -0.052 -0.721  0.040  0.046 -0.348  0.477   0.619 
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

# Single structure comparison: interaction plot of the estimated marginal means
emmip(modelFour, SyntaxExposure~Listener, cov.reduce = range, plotit = TRUE)

# All pairwise contrasts between syntax exposure conditions, within each listener
contrast_results <- contrast(
  emmeans(modelFour, ~ SyntaxExposure | Listener),  # SyntaxExposure within Listener
  method = "pairwise",                              # Pairwise contrasts
  adjust = "none"                                   # No multiplicity adjustment
)

# View results
summary(contrast_results)

Listener = E1:
 contrast                      estimate    SE  df z.ratio p.value
 (All-PD) - (All-DO)              1.606 0.570 Inf   2.817  0.0049
 (All-PD) - Mixed                 1.085 0.512 Inf   2.117  0.0342
 (All-PD) - (Speaker-specific)    0.218 0.667 Inf   0.326  0.7442
 (All-DO) - Mixed                -0.521 0.480 Inf  -1.086  0.2773
 (All-DO) - (Speaker-specific)   -1.388 0.644 Inf  -2.157  0.0310
 Mixed - (Speaker-specific)      -0.867 0.593 Inf  -1.461  0.1441

Listener = E2:
 contrast                      estimate    SE  df z.ratio p.value
 (All-PD) - (All-DO)              1.874 0.935 Inf   2.004  0.0451
 (All-PD) - Mixed                 0.965 0.698 Inf   1.383  0.1667
 (All-PD) - (Speaker-specific)    0.463 0.712 Inf   0.649  0.5161
 (All-DO) - Mixed                -0.910 0.782 Inf  -1.163  0.2449
 (All-DO) - (Speaker-specific)   -1.412 0.796 Inf  -1.773  0.0763
 Mixed - (Speaker-specific)      -0.502 0.496 Inf  -1.012  0.3114

Results are given on the log odds ratio (not the response) scale.
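Because the estimates above are on the log-odds scale, it can help to convert a few of them to odds ratios and probabilities. The sketch below does this by hand for the Experiment 4 fixed effects reported in the model summary; the resulting probabilities apply to a typical subject and picture (random effects at zero), not to the raw sample proportions.

```r
# Fixed-effect estimates copied from the Experiment 4 model summary
b_intercept <- 1.3942    # All-PD exposure, listener E1 (reference cell)
b_all_do    <- -1.6059   # Shift for All-DO exposure, listener E1

# Odds of producing a PD in the reference cell
or_intercept <- exp(b_intercept)

# Model-implied probability of a PD in each cell (inverse logit)
p_all_pd <- plogis(b_intercept)
p_all_do <- plogis(b_intercept + b_all_do)

# Odds ratio for the (All-PD) - (All-DO) contrast with listener E1
or_contrast <- exp(1.606)

print(round(c(or_intercept = or_intercept, p_all_pd = p_all_pd,
              p_all_do = p_all_do, or_contrast = or_contrast), 3))
```

On this reading, the odds of a PD for listener E1 are roughly five times higher after all-PD exposure than after all-DO exposure, which is the global alignment effect the comparison is meant to test.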

# Fit the full model for Experiment 5
modelFive <- glmer(Target_PD ~ Listener_nativeness * SyntaxExposure + (1 + Listener_nativeness * SyntaxExposure | Subject) + (1 | Picture), 
              data = expFive,
              family = binomial(link = "logit"),
              control = glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 50000)))

boundary (singular) fit: see help('isSingular')

#View results
summary(modelFive)

Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: 
Target_PD ~ Listener_nativeness * SyntaxExposure + (1 + Listener_nativeness *  
    SyntaxExposure | Subject) + (1 | Picture)
   Data: expFive
Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 50000))

     AIC      BIC   logLik deviance df.resid 
  1707.8   1946.4   -808.9   1617.8     1438 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-4.9335 -0.5665  0.1998  0.5433  2.9127 

Random effects:
 Groups  Name                                                         Variance
 Subject (Intercept)                                                  2.2274  
         Listener_nativenessnon-native                                0.2192  
         SyntaxExposureAll-DO                                         0.7023  
         SyntaxExposureMixed                                          0.9194  
         SyntaxExposureSpeaker-specific                               0.7197  
         Listener_nativenessnon-native:SyntaxExposureAll-DO           1.7277  
         Listener_nativenessnon-native:SyntaxExposureMixed            0.7125  
         Listener_nativenessnon-native:SyntaxExposureSpeaker-specific 0.7207  
 Picture (Intercept)                                                  1.3113  
 Std.Dev. Corr                                     
 1.4925                                            
 0.4682   -0.76                                    
 0.8380   -0.59  0.43                              
 0.9588   -0.04 -0.03  0.12                        
 0.8484   -0.54  0.53  0.35 -0.05                  
 1.3144    0.15 -0.08  0.02  0.06 -0.12            
 0.8441    0.20 -0.37 -0.02  0.09 -0.34 -0.35      
 0.8489    0.90 -0.87 -0.51 -0.03 -0.28  0.11  0.21
 1.1451                                            
Number of obs: 1483, groups:  Subject, 96; Picture, 71

Fixed effects:
                                                             Estimate
(Intercept)                                                    1.7513
Listener_nativenessnon-native                                 -0.3288
SyntaxExposureAll-DO                                          -2.0392
SyntaxExposureMixed                                           -1.3039
SyntaxExposureSpeaker-specific                                -1.6042
Listener_nativenessnon-native:SyntaxExposureAll-DO             0.2862
Listener_nativenessnon-native:SyntaxExposureMixed              0.2460
Listener_nativenessnon-native:SyntaxExposureSpeaker-specific   0.4418
                                                             Std. Error z value
(Intercept)                                                      0.4133   4.237
Listener_nativenessnon-native                                    0.3243  -1.014
SyntaxExposureAll-DO                                             0.5001  -4.077
SyntaxExposureMixed                                              0.5641  -2.312
SyntaxExposureSpeaker-specific                                   0.4972  -3.227
Listener_nativenessnon-native:SyntaxExposureAll-DO               0.5168   0.554
Listener_nativenessnon-native:SyntaxExposureMixed                0.4646   0.529
Listener_nativenessnon-native:SyntaxExposureSpeaker-specific     0.4293   1.029
                                                             Pr(>|z|)    
(Intercept)                                                  2.26e-05 ***
Listener_nativenessnon-native                                 0.31070    
SyntaxExposureAll-DO                                         4.56e-05 ***
SyntaxExposureMixed                                           0.02080 *  
SyntaxExposureSpeaker-specific                                0.00125 ** 
Listener_nativenessnon-native:SyntaxExposureAll-DO            0.57971    
Listener_nativenessnon-native:SyntaxExposureMixed             0.59648    
Listener_nativenessnon-native:SyntaxExposureSpeaker-specific  0.30340    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Lstn_- SEA-DO SyntEM SynES- L_-:SEA L_-:SEM
Lstnr_ntvn- -0.577                                            
SyntxExA-DO -0.733  0.479                                     
SyntxExpsrM -0.646  0.422  0.534                              
SyntxExpsS- -0.735  0.480  0.607  0.536                       
Ls_-:SEA-DO  0.362 -0.629 -0.450 -0.264 -0.301                
Lstnr_-:SEM  0.403 -0.697 -0.335 -0.477 -0.336  0.438         
Lstn_-:SES-  0.435 -0.757 -0.361 -0.319 -0.379  0.477   0.528 
optimizer (bobyqa) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

# Single structure comparison: interaction plot of the estimated marginal means
emmip(modelFive, SyntaxExposure~Listener_nativeness, cov.reduce = range, plotit = TRUE)

# All pairwise contrasts between syntax exposure conditions, within each listener type
contrast_results <- contrast(
  emmeans(modelFive, ~ SyntaxExposure | Listener_nativeness),  # SyntaxExposure within Listener_nativeness
  method = "pairwise",                                         # Pairwise contrasts
  adjust = "none"                                              # No multiplicity adjustment
)

# View results
summary(contrast_results)

Listener_nativeness = native:
 contrast                      estimate    SE  df z.ratio p.value
 (All-PD) - (All-DO)              2.039 0.500 Inf   4.077  <.0001
 (All-PD) - Mixed                 1.304 0.564 Inf   2.312  0.0208
 (All-PD) - (Speaker-specific)    1.604 0.497 Inf   3.227  0.0013
 (All-DO) - Mixed                -0.735 0.517 Inf  -1.423  0.1546
 (All-DO) - (Speaker-specific)   -0.435 0.442 Inf  -0.984  0.3250
 Mixed - (Speaker-specific)       0.300 0.514 Inf   0.584  0.5593

Listener_nativeness = non-native:
 contrast                      estimate    SE  df z.ratio p.value
 (All-PD) - (All-DO)              1.753 0.533 Inf   3.286  0.0010
 (All-PD) - Mixed                 1.058 0.533 Inf   1.986  0.0470
 (All-PD) - (Speaker-specific)    1.162 0.519 Inf   2.239  0.0252
 (All-DO) - Mixed                -0.695 0.605 Inf  -1.149  0.2506
 (All-DO) - (Speaker-specific)   -0.591 0.592 Inf  -0.998  0.3184
 Mixed - (Speaker-specific)       0.104 0.593 Inf   0.176  0.8601

Results are given on the log odds ratio (not the response) scale.

Results

#create table for modelFour
library(sjPlot)

#refugeeswelcome

tab_model(
  modelFour,
  show.ci = FALSE,
  show.re.var = TRUE,
  show.icc = TRUE,
  dv.labels = "Target PD",
  title = "Effect of Listener and Syntax Exposure on Target PD",
  wrap.labels = 60,  
  linebreak = TRUE,  
  digits = 3         
)

boundary (singular) fit: see help('isSingular')

Effect of Listener and Syntax Exposure on Target PD
	Target PD
Predictors	Odds Ratios	p
(Intercept)	4.032	0.002
Listener [E2]	0.833	0.658
SyntaxExposure [All-DO]	0.201	0.005
SyntaxExposure [Mixed]	0.338	0.034
SyntaxExposure [Speaker-specific]	0.804	0.744
Listener [E2] × SyntaxExposure [All-DO]	0.765	0.667
Listener [E2] × SyntaxExposure [Mixed]	1.128	0.803
Listener [E2] × SyntaxExposure [Speaker-specific]	0.783	0.669
Random Effects
σ²	3.29
τ₀₀ _Subject	2.91
τ₀₀ _Picture	1.43
τ₁₁ _{Subject.ListenerE2}	0.92
τ₁₁ _{Subject.SyntaxExposureAll-DO}	0.49
τ₁₁ _{Subject.SyntaxExposureMixed}	0.71
τ₁₁ _{Subject.SyntaxExposureSpeaker-specific}	2.11
τ₁₁ _{Subject.ListenerE2:SyntaxExposureAll-DO}	0.64
τ₁₁ _{Subject.ListenerE2:SyntaxExposureMixed}	0.62
τ₁₁ _{Subject.ListenerE2:SyntaxExposureSpeaker-specific}	3.09
ρ₀₁ _{Subject.ListenerE2}	0.94
ρ₀₁ _{Subject.SyntaxExposureAll-DO}	-0.33
ρ₀₁ _{Subject.SyntaxExposureMixed}	-0.80
ρ₀₁ _{Subject.SyntaxExposureSpeaker-specific}	0.01
ρ₀₁ _{Subject.ListenerE2:SyntaxExposureAll-DO}	0.60
ρ₀₁ _{Subject.ListenerE2:SyntaxExposureMixed}	-0.88
ρ₀₁ _{Subject.ListenerE2:SyntaxExposureSpeaker-specific}	-0.75
ICC	0.62
N _Subject	96
N _Picture	72
Observations	1548
Marginal R² / Conditional R²	0.050 / 0.639
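
The odds ratios in the table can be turned into predicted probabilities for individual design cells, which is sometimes easier to compare against the bar plots. The following is a rough sketch rather than part of the original analysis. It assumes treatment coding with All-PD and the first listener as reference levels (inferred from which rows are missing in the table), and all variable names are hypothetical. The resulting probabilities are conditional on a typical subject and picture (random effects at zero), so they need not equal the raw proportions in the plots.

```r
# Odds ratios copied by hand from the modelFour table above.
or_intercept    <- 4.032  # odds of a PD in the reference cell (assumed All-PD, first listener)
or_all_do       <- 0.201  # SyntaxExposure [All-DO]
or_listener_e2  <- 0.833  # Listener [E2]
or_e2_by_all_do <- 0.765  # Listener [E2] x SyntaxExposure [All-DO]

# Convert odds to a probability: p = odds / (1 + odds)
odds_to_prob <- function(odds) odds / (1 + odds)

# Odds multiply across the terms that apply to each cell.
cell_odds <- c(
  "All-PD, first listener" = or_intercept,
  "All-PD, E2"             = or_intercept * or_listener_e2,
  "All-DO, first listener" = or_intercept * or_all_do,
  "All-DO, E2"             = or_intercept * or_all_do *
                             or_listener_e2 * or_e2_by_all_do
)
cell_probs <- odds_to_prob(cell_odds)
print(round(cell_probs, 3))

# The residual variance reported as sigma^2 is the fixed variance of the
# standard logistic distribution, pi^2 / 3.
sigma2 <- pi^2 / 3
print(round(sigma2, 2))
```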

#Graph for modelFour
ggplot(expFour, aes(x = reorder(SyntaxExposure,desc(Target_PD)), 
                    y = Target_PD, 
                    fill = Listener)) +
  stat_summary(fun = mean, 
               geom = "bar", 
               position = position_dodge(), 
               color = "black") + #bars w mean 
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0)) +
  stat_summary(fun.data = mean_se, 
               geom = "errorbar",
               position = position_dodge(width = 0.9), 
               width = 0.2) + #error bars 
  scale_fill_manual(
    values = c("gray20", "lightgrey"),
    labels = c("Listener 1", "Listener 2") #labels
  ) +
  labs(title = "Proportion of PD Produced by Syntax Exposure and Listener",
       x = "Syntax Exposure",
       y = "Proportion of PD Produced",
       fill = "Listener") +
  theme_linedraw()

# Graph for single structure exp4
# PD and DO 
ggplot(expFour, aes(x = SyntaxExposure,
                    y = Target_PD, 
                    fill = Listener)) +
  stat_summary(fun = "mean", 
               geom = "bar", 
               position = position_dodge(width = .9), 
               color = "black", 
               size = 0.7,
               na.rm = TRUE) +  # Bar with mean values
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               position = position_dodge(width = .9), 
               width = 0.2,
               na.rm = TRUE) +  # Error bars with standard error
 scale_fill_manual(
    values = c("gray20", "lightgrey"),
    labels = c("Listener 1", "Listener 2")) + # Custom labels
  labs(
    x = "Syntax Exposure",
    y = "Proportion of PDs Produced",
    title = "Comparison of PDs Produced between All-PD and All-DO Conditions"
  ) +
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0)) +
  scale_x_discrete(limits = c("All-PD", "All-DO")) +  # Focus on All-PD vs All-DO
  theme_linedraw()

Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.

#Mixed and Speaker-specific 
ggplot(expFour, aes(x = SyntaxExposure,
                    y = Target_PD, 
                    fill = Listener)) +
  stat_summary(fun = "mean", 
               geom = "bar", 
               position = position_dodge(width = .9), 
               color = "black", 
               size = 0.7,
               na.rm = TRUE) +  # Bar with mean values
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               position = position_dodge(width = .9), 
               width = 0.2,
               na.rm = TRUE) +  # Error bars with standard error
 scale_fill_manual(
    values = c("gray20", "lightgrey"),
    labels = c("Listener 1", "Listener 2")) + # Custom labels
  labs(
    x = "Syntax Exposure",
    y = "Proportion of PDs Produced",
    title = "Comparison of PDs Produced between Mixed and Speaker-specific"
  ) +
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0)) +
  scale_x_discrete(limits = c("Mixed", "Speaker-specific")) +  # Focus on mixed vs speaker-specific
  theme_linedraw()

#table for modelFive
tab_model(
  modelFive,
  show.ci = FALSE,
  show.re.var = TRUE,
  show.icc = TRUE,
  dv.labels = "Target PD",
  title = "Effect of Listener and Syntax Exposure on Target PD",
  wrap.labels = 60,  
  linebreak = TRUE,  
  digits = 3         
)

boundary (singular) fit: see help('isSingular')

Effect of Listener and Syntax Exposure on Target PD
	Target PD
Predictors	Odds Ratios	p
(Intercept)	5.762	<0.001
Listener nativeness [non-native]	0.720	0.311
SyntaxExposure [All-DO]	0.130	<0.001
SyntaxExposure [Mixed]	0.271	0.021
SyntaxExposure [Speaker-specific]	0.201	0.001
Listener nativeness [non-native] × SyntaxExposure [All-DO]	1.331	0.580
Listener nativeness [non-native] × SyntaxExposure [Mixed]	1.279	0.596
Listener nativeness [non-native] × SyntaxExposure [Speaker-specific]	1.556	0.303
Random Effects
σ²	3.29
τ₀₀ _Subject	2.23
τ₀₀ _Picture	1.31
τ₁₁ _{Subject.Listener_nativenessnon-native}	0.22
τ₁₁ _{Subject.SyntaxExposureAll-DO}	0.70
τ₁₁ _{Subject.SyntaxExposureMixed}	0.92
τ₁₁ _{Subject.SyntaxExposureSpeaker-specific}	0.72
τ₁₁ _{Subject.Listener_nativenessnon-native:SyntaxExposureAll-DO}	1.73
τ₁₁ _{Subject.Listener_nativenessnon-native:SyntaxExposureMixed}	0.71
τ₁₁ _{Subject.Listener_nativenessnon-native:SyntaxExposureSpeaker-specific}	0.72
ρ₀₁ _{Subject.Listener_nativenessnon-native}	-0.76
ρ₀₁ _{Subject.SyntaxExposureAll-DO}	-0.59
ρ₀₁ _{Subject.SyntaxExposureMixed}	-0.04
ρ₀₁ _{Subject.SyntaxExposureSpeaker-specific}	-0.54
ρ₀₁ _{Subject.Listener_nativenessnon-native:SyntaxExposureAll-DO}	0.15
ρ₀₁ _{Subject.Listener_nativenessnon-native:SyntaxExposureMixed}	0.20
ρ₀₁ _{Subject.Listener_nativenessnon-native:SyntaxExposureSpeaker-specific}	0.90
ICC	0.53
N _Subject	96
N _Picture	71
Observations	1483
Marginal R² / Conditional R²	0.065 / 0.560

ggplot(expFive, aes(x = reorder(SyntaxExposure,desc(Target_PD)), 
                    y = Target_PD, 
                    fill = Listener_nativeness)) +
  stat_summary(fun = mean, 
               geom = "bar", 
               position = position_dodge(), 
               color = "black") + #bar w mean
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0)) +
  stat_summary(fun.data = mean_se, 
               geom = "errorbar",
               position = position_dodge(width = 0.9), 
               width = 0.2) + # error bars
  scale_fill_manual(
    values = c("gray20", "lightgrey"),
    labels = c("Native", "Non-Native") # Custom labels
  ) +
  labs(title = "Proportion of PD Produced by Syntax Exposure and Listener Nativeness",
       x = "Syntax Exposure",
       y = "Proportion of PD Produced",
       fill = "Listener") +
  theme_linedraw()

#graph for single structure exp 5
ggplot(expFive, aes(x = SyntaxExposure,
                    y = Target_PD, 
                    fill = Listener_nativeness)) +
  stat_summary(fun = "mean", 
               geom = "bar", 
               position = position_dodge(width = .9), 
               color = "black", 
               size = 0.7,
               na.rm = TRUE) +  # Bar w mean
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               position = position_dodge(width = .9), 
               width = 0.2,
               na.rm = TRUE) +  # Error bars 
  scale_fill_manual(
    values = c("gray20", "lightgrey"),
    labels = c("Native", "Non-Native") # labels
  ) +
  labs(
    x = "Syntax Exposure",
    y = "Proportion of PDs Produced",
    title = "Comparison of PDs Produced between All-PD and All-DO Conditions"
  ) +
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0))+
  scale_x_discrete(limits = c("All-PD", "All-DO")) +  # Focus on All-PD vs All-DO
  theme_linedraw()

#Mixed and Speaker-specific 
ggplot(expFive, aes(x = SyntaxExposure,
                    y = Target_PD, 
                    fill = Listener_nativeness)) +
  stat_summary(fun = "mean", 
               geom = "bar", 
               position = position_dodge(width = .9), 
               color = "black", 
               size = 0.7,
               na.rm = TRUE) +  # Bar w mean
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               position = position_dodge(width = .9), 
               width = 0.2,
               na.rm = TRUE) +  # Error bars 
 scale_fill_manual(
    values = c("gray20", "lightgrey"),
    labels = c("Native", "Non-Native")) + # labels
  labs(
    x = "Syntax Exposure",
    y = "Proportion of PDs Produced",
    title = "Comparison of PDs Produced between Mixed and Speaker-specific"
  ) +
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0)) +
  scale_x_discrete(limits = c("Mixed", "Speaker-specific")) +  # Focus on mixed vs speaker-specific
  theme_linedraw()

Discussion

Summary of Replication Attempt

First, alignment occurred as expected: participants who were exposed to more prepositional datives (PDs) produced more PDs themselves. Participants in the mixed and speaker-specific conditions produced fewer PDs than those in the All-PD condition, and participants in the All-DO (double-object) condition produced fewer still. This pattern matched my predictions.

The critical result for the research question is the comparison between Listener 1 and Listener 2. I found no significant difference in the structures participants produced based on which listener they were addressing. Pairwise comparisons between the speaker-specific and mixed conditions likewise showed no significant difference. Both results held in Experiments 4 and 5.

These results indicate that participants did not respond differently to individual experimenters, which supports the conclusion that syntactic alignment is partner-general rather than partner-specific. This matches the findings of the original paper, making this a successful reproduction.

Commentary

The most notable difference between Ostrand and Ferreira (2019) and my reproduction lies in how the models were evaluated. The original paper compared the full GLMM to a reduced model and reported those comparisons. However, the paper provided very limited information about the reduced model, which made this aspect difficult to reproduce. Additionally, the original authors mentioned that nested effects had to be removed iteratively to achieve model convergence. In my case, I started with fewer random effects from the outset, which avoided the need for iterative adjustments.
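
For readers unfamiliar with full-versus-reduced model comparison, one common form is a likelihood-ratio test between nested GLMMs. The sketch below is not the original authors' procedure, which the paper does not fully specify. It fits two `lme4::glmer` models to simulated data and compares them with `anova()`. The data frame `d`, its design (40 subjects, exposure assigned between subjects), the effect sizes, and the intercept-only random-effects structure are all assumptions made for illustration.

```r
library(lme4)

set.seed(1)

# Simulated stand-in data (hypothetical design and effect sizes).
n_subj <- 40
exposure_levels <- c("All-PD", "All-DO", "Mixed", "Speaker-specific")
d <- expand.grid(Subject = factor(seq_len(n_subj)),
                 Trial = 1:8,
                 Listener = c("E1", "E2"))
subj_id <- as.integer(d$Subject)
d$SyntaxExposure <- factor(exposure_levels[(subj_id - 1) %% 4 + 1],
                           levels = exposure_levels)

# Log odds of producing a PD: exposure effect plus a subject intercept,
# with no listener effect and no interaction built in.
subj_intercept <- rnorm(n_subj, sd = 1)
exposure_shift <- c(0, -1.5, -1, -1)[as.integer(d$SyntaxExposure)]
d$Target_PD <- rbinom(nrow(d), 1,
                      plogis(1.2 + subj_intercept[subj_id] + exposure_shift))

# Full model includes the Listener x SyntaxExposure interaction.
full <- glmer(Target_PD ~ Listener * SyntaxExposure + (1 | Subject),
              data = d, family = binomial)

# Reduced model drops only the interaction term.
reduced <- glmer(Target_PD ~ Listener + SyntaxExposure + (1 | Subject),
                 data = d, family = binomial)

# Likelihood-ratio test of the interaction (3 degrees of freedom here).
lrt <- anova(reduced, full)
print(lrt)
```

A non-significant test for the interaction would correspond to the partner-general pattern reported above, but the specific reduced model used in the original paper may have differed from this one.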

Despite these slight differences, the findings of the original paper were successfully reproduced.