Load packages and helper functions

Packages

library(psych)
library(plyr)
library(doBy)
library(cowplot)
library(reshape2)
library(lme4)
library(brms)
library(tidyr)
library(tidyverse)
library(data.table)
library(janitor)
library(yarrr)
library(knitr)


source("lmedrop.R")
source("myCenter.R")
source("lizCenter.R")
source("summarySEwithin.R")
source("summarySE.R")
source("normDataWithin.R")
source("BF.R")
source("Bf_range.R")
source("Bf_powercalc.R")


theme_set(theme_bw())

Helper functions

summarySE

This function can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper functions

It summarizes data, giving count, mean, standard deviation, standard error of the mean, and confidence intervals (default 95%).

data: a data frame.

measurevar: the name of a column that contains the variable to be summarized

groupvars: a vector containing names of columns that contain grouping variables

na.rm: a boolean that indicates whether to ignore NA’s

conf.interval: the percent range of the confidence interval (default is 95%)

summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    require(plyr)

    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- ddply(data, groupvars, .drop=.drop,
      .fun = function(xx, col) {
        c(N    = length2(xx[[col]], na.rm=na.rm),
          mean = mean   (xx[[col]], na.rm=na.rm),
          sd   = sd     (xx[[col]], na.rm=na.rm)
        )
      },
      measurevar
    )

    # Rename the "mean" column    
    datac <- rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval: 
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult

    return(datac)
}
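
A minimal usage sketch (the data frame and column names here are hypothetical):

toy <- data.frame(group = rep(c("a", "b"), each = 10),
                  score = c(rnorm(10, mean = 5), rnorm(10, mean = 6)))
summarySE(toy, measurevar = "score", groupvars = "group")
# one row per group, with columns N, score (the group mean), sd, se, and ci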

summarySEwithin

This function can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper functions

It summarizes data, handling within-subjects variables by removing inter-subject variability. It will still work if there are no within-subjects variables. It gives count, un-normed mean, normed mean (with the same between-group mean), standard deviation, standard error of the mean, and confidence intervals. If there are within-subjects variables, adjusted values are calculated using the method from Morey (2008).

data: a data frame.

measurevar: the name of a column that contains the variable to be summarized

betweenvars: a vector containing names of columns that are between-subjects variables

withinvars: a vector containing names of columns that are within-subjects variables

idvar: the name of a column that identifies each subject (or matched subjects)

na.rm: a boolean that indicates whether to ignore NA’s

conf.interval: the percent range of the confidence interval (default is 95%)

summarySEwithin <- function(data=NULL, measurevar, betweenvars=NULL, withinvars=NULL,
                            idvar=NULL, na.rm=FALSE, conf.interval=.95, .drop=TRUE) {

  # Ensure that the betweenvars and withinvars are factors
  factorvars <- vapply(data[, c(betweenvars, withinvars), drop=FALSE],
    FUN=is.factor, FUN.VALUE=logical(1))

  if (!all(factorvars)) {
    nonfactorvars <- names(factorvars)[!factorvars]
    message("Automatically converting the following non-factors to factors: ",
            paste(nonfactorvars, collapse = ", "))
    data[nonfactorvars] <- lapply(data[nonfactorvars], factor)
  }

  # Get the means from the un-normed data
  datac <- summarySE(data, measurevar, groupvars=c(betweenvars, withinvars),
                     na.rm=na.rm, conf.interval=conf.interval, .drop=.drop)

  # Drop all the unused columns (these will be calculated with normed data)
  datac$sd <- NULL
  datac$se <- NULL
  datac$ci <- NULL

  # Norm each subject's data
  ndata <- normDataWithin(data, idvar, measurevar, betweenvars, na.rm, .drop=.drop)

  # This is the name of the new column
  measurevar_n <- paste(measurevar, "_norm", sep="")

  # Collapse the normed data - now we can treat between and within vars the same
  ndatac <- summarySE(ndata, measurevar_n, groupvars=c(betweenvars, withinvars),
                      na.rm=na.rm, conf.interval=conf.interval, .drop=.drop)

  # Apply correction from Morey (2008) to the standard error and confidence interval
  #  Get the product of the number of conditions of within-S variables
  nWithinGroups    <- prod(vapply(ndatac[,withinvars, drop=FALSE], FUN=nlevels,
                           FUN.VALUE=numeric(1)))
  correctionFactor <- sqrt( nWithinGroups / (nWithinGroups-1) )

  # Apply the correction factor
  ndatac$sd <- ndatac$sd * correctionFactor
  ndatac$se <- ndatac$se * correctionFactor
  ndatac$ci <- ndatac$ci * correctionFactor

  # Combine the un-normed means with the normed results
  merge(datac, ndatac)
}
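
A minimal usage sketch (hypothetical data; cond is a within-subjects factor and subject identifies participants):

toy <- data.frame(subject = factor(rep(1:8, each = 2)),
                  cond    = rep(c("easy", "hard"), times = 8),
                  rt      = rnorm(16, mean = 500, sd = 50))
summarySEwithin(toy, measurevar = "rt", withinvars = "cond", idvar = "subject")
# se and ci in the result carry the Morey (2008) within-subjects correction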

normDataWithin

This function is used by the summarySEwithin function above. It can be found on the website “Cookbook for R”.

http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/#Helper functions

From that website: Norms the data within specified groups in a data frame; it normalizes each subject (identified by idvar) so that they have the same mean, within each group specified by betweenvars.

data: a data frame

idvar: the name of a column that identifies each subject (or matched subjects)

measurevar: the name of a column that contains the variable to be summarized

betweenvars: a vector containing names of columns that are between-subjects variables

na.rm: a boolean that indicates whether to ignore NA’s

normDataWithin <- function(data=NULL, idvar, measurevar, betweenvars=NULL,
                           na.rm=FALSE, .drop=TRUE) {
    require(plyr)

    # Measure var on left, idvar + between vars on right of formula.
    data.subjMean <- ddply(data, c(idvar, betweenvars), .drop=.drop,
     .fun = function(xx, col, na.rm) {
        c(subjMean = mean(xx[,col], na.rm=na.rm))
      },
      measurevar,
      na.rm
    )

    # Put the subject means with original data
    data <- merge(data, data.subjMean)

    # Get the normalized data in a new column
    measureNormedVar <- paste(measurevar, "_norm", sep="")
    data[,measureNormedVar] <- data[,measurevar] - data[,"subjMean"] +
                               mean(data[,measurevar], na.rm=na.rm)

    # Remove this subject mean column
    data$subjMean <- NULL

    return(data)
}
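
For intuition, a quick check on hypothetical data: after norming, every subject has the same mean on the new _norm column.

toy <- data.frame(subject = factor(rep(1:4, each = 2)),
                  cond    = rep(c("a", "b"), times = 4),
                  rt      = rnorm(8, mean = 500, sd = 50))
normed <- normDataWithin(toy, idvar = "subject", measurevar = "rt")
tapply(normed$rt_norm, normed$subject, mean)  # identical across subjects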

myCenter

This function outputs the centered values of a variable, which can be a numeric variable, a factor, or a data frame. It was taken from Florian Jaeger's blog: https://hlplab.wordpress.com/2009/04/27/centering-several-variables/.

From his blog:

-If the input is a numeric variable, the output is the centered variable.

-If the input is a factor, the output is a numeric variable with centered factor level values. That is, the factor’s levels are converted into numerical values in their inherent order (if not specified otherwise, R defaults to alphanumerical order). More specifically, this centers any binary factor so that the value below 0 will be the 1st level of the original factor, and the value above 0 will be the 2nd level.

-If the input is a data frame or matrix, the output is a new matrix of the same dimension and with the centered values and column names that correspond to the colnames() of the input preceded by “c” (e.g. “Variable1” will be “cVariable1”).

myCenter <- function(x) {
    if (is.numeric(x)) {
        return(x - mean(x, na.rm = TRUE))
    }
    if (is.factor(x)) {
        x <- as.numeric(x)
        return(x - mean(x, na.rm = TRUE))
    }
    if (is.data.frame(x) || is.matrix(x)) {
        m <- matrix(nrow = nrow(x), ncol = ncol(x))
        colnames(m) <- paste("c", colnames(x), sep = "")
        for (i in 1:ncol(x)) {
            m[, i] <- myCenter(x[, i])
        }
        return(as.data.frame(m))
    }
}
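
A quick illustration with toy values:

myCenter(c(1, 2, 3))                # -1  0  1
myCenter(factor(c("a", "b", "a")))  # levels coded as 1/2, then centered: approx. -0.33  0.67  -0.33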

lizCenter

This function provides a wrapper around myCenter, allowing you to center a specific list of variables from a data frame. The input is a data frame (x) and a list of the names of the variables to center (listfname). The output is a copy of the data frame with a numeric column added for each centered variable, each labelled with its previous name plus ".ct" appended. For example, if x is a data frame with columns "a" and "b", lizCenter(x, list("a", "b")) will return a data frame with two additional columns, a.ct and b.ct, which are numeric, centered codings of the corresponding variables.

lizCenter <- function(x, listfname) {
    for (i in 1:length(listfname)) {
        fname <- as.character(listfname[i])
        x[paste(fname, ".ct", sep = "")] <- myCenter(x[fname])
    }
    return(x)
}
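
For example, on a hypothetical data frame:

toy <- data.frame(a = c(1, 2, 3), b = factor(c("x", "y", "y")))
lizCenter(toy, list("a", "b"))  # returns toy with numeric columns a.ct and b.ct added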

get_coeffs

This function allows us to inspect particular coefficients from the output of an lme4 model by putting them in a table.

x: the output returned when running lmer or glmer (i.e. an object of type lmerMod or glmerMod)

list: a list of the names of the coefficients to be extracted (e.g. c(“variable1”, “variable1:variable2”))

get_coeffs <- function(x, list) {
    as.data.frame(summary(x)$coefficients)[list, ]
}
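
For example (hypothetical model and coefficient names):

# m <- glmer(correct ~ x1 * x2 + (1 | subject), data = d, family = binomial)
# get_coeffs(m, c("x1", "x1:x2"))  # just the rows for x1 and the interaction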

Bf

This function is equivalent to the Dienes (2008) calculator which can be found here: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/inference/Bayes.htm.

The code was provided by Baguley and Kaye (2010) and can be found here: http://www.academia.edu/427288/Review_of_Understanding_psychology_as_a_science_An_introduction_to_scientific_and_statistical_inference

Bf<-function(sd, obtained, uniform, lower=0, upper=1, meanoftheory=0,sdtheory=1, tail=1){
 area <- 0
 if(identical(uniform, 1)){
  theta <- lower
  range <- upper - lower
  incr <- range / 2000
  for (A in -1000:1000){
     theta <- theta + incr
     dist_theta <- 1 / range
     height <- dist_theta * dnorm(obtained, theta, sd)
     area <- area + height * incr
  }
 }else
   {theta <- meanoftheory - 5 * sdtheory
    incr <- sdtheory / 200
    for (A in -1000:1000){
      theta <- theta + incr
      dist_theta <- dnorm(theta, meanoftheory, sdtheory)
      if(identical(tail, 1)){
        if (theta <= 0){
          dist_theta <- 0
        } else {
          dist_theta <- dist_theta * 2
        }
      }
      height <- dist_theta * dnorm(obtained, theta, sd)
      area <- area + height * incr
    }
 }
 LikelihoodTheory <- area
 Likelihoodnull <- dnorm(obtained, 0, sd)
 BayesFactor <- LikelihoodTheory / Likelihoodnull
 ret <- list("LikelihoodTheory" = LikelihoodTheory,"Likelihoodnull" = Likelihoodnull, "BayesFactor" = BayesFactor)
 ret
} 
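
For example, with a hypothetical obtained effect of 0.5 (SE = 0.2) and H1 modelled as a half-normal with SD 0.5:

Bf(sd = 0.2, obtained = 0.5, uniform = 0, meanoftheory = 0, sdtheory = 0.5, tail = 1)$BayesFactor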

Bf power calculation

This works with the Bf function above. It requires the same values as that function (i.e. the obtained mean and SE for the current sample, a value for the predicted mean, which is set to be sdtheory (with meanoftheory = 0), and the current number of participants N). However, rather than returning a BF for the current sample, it works out what the BF would be for a range of different subject numbers (assuming that the SE scales with 1/sqrt(N)).

Bf_powercalc<-function(sd, obtained, uniform, lower=0, upper=1, meanoftheory=0, sdtheory=1, tail=2, N, min, max)
{
  
  x = c(0)
  y = c(0)

# note: working out what the difference between N and df is (for the contrast between two groups, this is 2; for constraints where there is 4 groups this will be 3, etc.)

  for(newN in min : max)
  {
    B = as.numeric(Bf(sd = sd*sqrt(N/newN), obtained, uniform, lower, upper, meanoftheory, sdtheory, tail)[3])
    x= append(x,newN) 
    y= append(y,B)
    output = cbind(x,y)
    
  } 
  output = output[-1,] 
  return(output) 
}
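
For example (hypothetical values), to see how the BF for an obtained effect of 0.5 (SE = 0.2 with N = 30) would change with samples of 20 to 100 participants:

Bf_powercalc(sd = 0.2, obtained = 0.5, uniform = 0, meanoftheory = 0,
             sdtheory = 0.5, tail = 1, N = 30, min = 20, max = 100)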

Bf range

This works with the Bf function above. It requires the obtained mean and SE for the current sample and works out what the BF would be for a range of predicted means (which are set to be sdtheoryrange (with meanoftheory=0)).

Bf_range<-function(sd, obtained, meanoftheory=0, sdtheoryrange, tail=1)
{
  
  x = c(0)
  y = c(0)
  
  for(sdi in sdtheoryrange)
  {
    B = as.numeric(Bf(sd, obtained, meanoftheory=0, uniform = 0, sdtheory=sdi, tail)[3])
    
    x= append(x,sdi)  
    y= append(y,B)
    output = cbind(x,y)
    
  } 
  output = output[-1,] 
  colnames(output) = c("sdtheory", "BF")
  return(output) 
}
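
For example (hypothetical values), to check how sensitive a BF is to the assumed size of the predicted effect:

Bf_range(sd = 0.2, obtained = 0.5, sdtheoryrange = seq(0.1, 1, by = 0.1), tail = 1)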

Load data

########################################################
# Across all experiments
########################################################

#Create the dataframes that we will be working on
combined_production_data.df <- read.csv("all_production_data.csv")

combined_judgment_data.df <- read.csv("all_judgment_data.csv")
combined_judgment_data.df$restricted_verb_noun <- factor(combined_judgment_data.df$restricted_verb_noun)
combined_judgment_data.df$condition <- factor(combined_judgment_data.df$condition)
combined_judgment_data.df$experiment  <- factor(combined_judgment_data.df$experiment)
combined_judgment_data.df$scene_test2  <- factor(combined_judgment_data.df$scene_test2)

#separately for entrenchment and preemption

#entrenchment

entrenchment_production.df <- subset(combined_production_data.df, condition == "entrenchment")

# Create columns that we will need to run production analyses in entrenchment
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

entrenchment_production.df$det1 <- ifelse(entrenchment_production.df$det_lenient_adapted == "det_construction1", 1, 0)
entrenchment_production.df$det2 <- ifelse(entrenchment_production.df$det_lenient_adapted == "det_construction2", 1, 0)
entrenchment_production.df$other <- ifelse(entrenchment_production.df$det_lenient_adapted == "other", 1, 0)
entrenchment_production.df$none <- ifelse(entrenchment_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
entrenchment_production.df$semantically_correct <- as.numeric(entrenchment_production.df$semantically_correct)
entrenchment_production.df$scene_test2 <- factor(entrenchment_production.df$scene_test2)
entrenchment_production.df$experiment  <- factor(entrenchment_production.df$experiment)
entrenchment_production.df$verb_noun_type_training2 <- factor(entrenchment_production.df$verb_noun_type_training2)
entrenchment_production.df$restricted_verb_noun <- factor(entrenchment_production.df$restricted_verb_noun)

entrenchment_judgment.df <- subset(combined_judgment_data.df, condition == "entrenchment")
entrenchment_judgment.df$semantically_correct <- factor(entrenchment_judgment.df$semantically_correct)
entrenchment_judgment.df$restricted_verb_noun <- factor(entrenchment_judgment.df$restricted_verb_noun)
entrenchment_judgment.df$scene_test2 <- factor(entrenchment_judgment.df$scene_test2)


#preemption
preemption_production.df <- subset(combined_production_data.df, condition == "preemption")


# Create columns that we will need to run production analyses in pre-emption
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

preemption_production.df$det1 <- ifelse(preemption_production.df$det_lenient_adapted == "det_construction1", 1, 0)
preemption_production.df$det2 <- ifelse(preemption_production.df$det_lenient_adapted == "det_construction2", 1, 0)
preemption_production.df$other <- ifelse(preemption_production.df$det_lenient_adapted == "other", 1, 0)
preemption_production.df$none <- ifelse(preemption_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
preemption_production.df$semantically_correct <- as.numeric(preemption_production.df$semantically_correct)
preemption_production.df$scene_test2 <- factor(preemption_production.df$scene_test2)
preemption_production.df$experiment  <- factor(preemption_production.df$experiment)
preemption_production.df$verb_noun_type_training2 <- factor(preemption_production.df$verb_noun_type_training2)
preemption_production.df$restricted_verb_noun <- factor(preemption_production.df$restricted_verb_noun)

preemption_judgment.df <- subset(combined_judgment_data.df, condition == "preemption")
preemption_judgment.df$semantically_correct <- factor(preemption_judgment.df$semantically_correct)
preemption_judgment.df$restricted_verb_noun <- factor(preemption_judgment.df$restricted_verb_noun)
preemption_judgment.df$scene_test2 <- factor(preemption_judgment.df$scene_test2)


########################################################
# EXPERIMENT 1 - VERB STUDY WITH ADULTS
########################################################

exp1_production_data.df <- subset(combined_production_data.df, experiment == "exp1")

exp1_judgment_data.df <- subset(combined_judgment_data.df, experiment == "exp1")
exp1_judgment_data.df$restricted_verb_noun <- factor(exp1_judgment_data.df$restricted_verb_noun)
exp1_judgment_data.df$condition <- factor(exp1_judgment_data.df$condition)
exp1_judgment_data.df$scene_test2 <- factor(exp1_judgment_data.df$scene_test2)


#separately for entrenchment and preemption

#entrenchment

exp1_entrenchment_production.df <- subset(exp1_production_data.df, condition == "entrenchment")

# Create columns that we will need to run production analyses in entrenchment
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp1_entrenchment_production.df$det1 <- ifelse(exp1_entrenchment_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp1_entrenchment_production.df$det2 <- ifelse(exp1_entrenchment_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp1_entrenchment_production.df$other <- ifelse(exp1_entrenchment_production.df$det_lenient_adapted == "other", 1, 0)
exp1_entrenchment_production.df$none <- ifelse(exp1_entrenchment_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp1_entrenchment_production.df$semantically_correct <- as.numeric(exp1_entrenchment_production.df$semantically_correct)
exp1_entrenchment_production.df$scene_test2 <- factor(exp1_entrenchment_production.df$scene_test2)
exp1_entrenchment_production.df$verb_noun_type_training2 <- factor(exp1_entrenchment_production.df$verb_noun_type_training2)
exp1_entrenchment_production.df$restricted_verb_noun <- factor(exp1_entrenchment_production.df$restricted_verb_noun)

exp1_entrenchment_judgment.df <- subset(exp1_judgment_data.df, condition == "entrenchment")

# Tidy up numeric variables/factors
exp1_entrenchment_judgment.df$semantically_correct <- factor(exp1_entrenchment_judgment.df$semantically_correct)
exp1_entrenchment_judgment.df$restricted_verb_noun <- factor(exp1_entrenchment_judgment.df$restricted_verb_noun)
exp1_entrenchment_judgment.df$scene_test2 <- factor(exp1_entrenchment_judgment.df$scene_test2)


#preemption
exp1_preemption_production.df <- subset(exp1_production_data.df, condition == "preemption")

# Create columns that we will need to run production analyses in pre-emption
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp1_preemption_production.df$det1 <- ifelse(exp1_preemption_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp1_preemption_production.df$det2 <- ifelse(exp1_preemption_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp1_preemption_production.df$other <- ifelse(exp1_preemption_production.df$det_lenient_adapted == "other", 1, 0)
exp1_preemption_production.df$none <- ifelse(exp1_preemption_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp1_preemption_production.df$semantically_correct <- as.numeric(exp1_preemption_production.df$semantically_correct)
exp1_preemption_production.df$scene_test2 <- factor(exp1_preemption_production.df$scene_test2)
exp1_preemption_production.df$verb_noun_type_training2 <- factor(exp1_preemption_production.df$verb_noun_type_training2)
exp1_preemption_production.df$restricted_verb_noun <- factor(exp1_preemption_production.df$restricted_verb_noun)

exp1_preemption_judgment.df <- subset(exp1_judgment_data.df, condition == "preemption")

# Tidy up numeric variables/factors
exp1_preemption_judgment.df$semantically_correct <- factor(exp1_preemption_judgment.df$semantically_correct)
exp1_preemption_judgment.df$restricted_verb_noun <- factor(exp1_preemption_judgment.df$restricted_verb_noun)
exp1_preemption_judgment.df$scene_test2 <- factor(exp1_preemption_judgment.df$scene_test2)


########################################################
# EXPERIMENT 2 - VERB STUDY WITH ADULTS REPLICATION
########################################################

exp2_production_data.df <- subset(combined_production_data.df, experiment == "exp2")

exp2_judgment_data.df <- subset(combined_judgment_data.df, experiment == "exp2")
exp2_judgment_data.df$restricted_verb_noun <- factor(exp2_judgment_data.df$restricted_verb_noun)
exp2_judgment_data.df$condition <- factor(exp2_judgment_data.df$condition)
exp2_judgment_data.df$scene_test2 <- factor(exp2_judgment_data.df$scene_test2)

#separately for entrenchment and preemption

#entrenchment

exp2_entrenchment_production.df <- subset(exp2_production_data.df, condition == "entrenchment")

# Create columns that we will need to run production analyses in entrenchment
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp2_entrenchment_production.df$det1 <- ifelse(exp2_entrenchment_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp2_entrenchment_production.df$det2 <- ifelse(exp2_entrenchment_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp2_entrenchment_production.df$other <- ifelse(exp2_entrenchment_production.df$det_lenient_adapted == "other", 1, 0)
exp2_entrenchment_production.df$none <- ifelse(exp2_entrenchment_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp2_entrenchment_production.df$semantically_correct <- as.numeric(exp2_entrenchment_production.df$semantically_correct)
exp2_entrenchment_production.df$scene_test2 <- factor(exp2_entrenchment_production.df$scene_test2)
exp2_entrenchment_production.df$verb_noun_type_training2 <- factor(exp2_entrenchment_production.df$verb_noun_type_training2)
exp2_entrenchment_production.df$restricted_verb_noun <- factor(exp2_entrenchment_production.df$restricted_verb_noun)

exp2_entrenchment_judgment.df <- subset(exp2_judgment_data.df, condition == "entrenchment")

# Tidy up numeric variables/factors
exp2_entrenchment_judgment.df$semantically_correct <- factor(exp2_entrenchment_judgment.df$semantically_correct)
exp2_entrenchment_judgment.df$restricted_verb_noun <- factor(exp2_entrenchment_judgment.df$restricted_verb_noun)
exp2_entrenchment_judgment.df$scene_test2 <- factor(exp2_entrenchment_judgment.df$scene_test2)


#preemption
exp2_preemption_production.df <- subset(exp2_production_data.df, condition == "preemption")

# Create columns that we will need to run production analyses in pre-emption
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp2_preemption_production.df$det1 <- ifelse(exp2_preemption_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp2_preemption_production.df$det2 <- ifelse(exp2_preemption_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp2_preemption_production.df$other <- ifelse(exp2_preemption_production.df$det_lenient_adapted == "other", 1, 0)
exp2_preemption_production.df$none <- ifelse(exp2_preemption_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp2_preemption_production.df$semantically_correct <- as.numeric(exp2_preemption_production.df$semantically_correct)
exp2_preemption_production.df$scene_test2 <- factor(exp2_preemption_production.df$scene_test2)
exp2_preemption_production.df$verb_noun_type_training2 <- factor(exp2_preemption_production.df$verb_noun_type_training2)
exp2_preemption_production.df$restricted_verb_noun <- factor(exp2_preemption_production.df$restricted_verb_noun)

exp2_preemption_judgment.df <- subset(exp2_judgment_data.df, condition == "preemption")

# Tidy up numeric variables/factors
exp2_preemption_judgment.df$semantically_correct <- factor(exp2_preemption_judgment.df$semantically_correct)
exp2_preemption_judgment.df$restricted_verb_noun <- factor(exp2_preemption_judgment.df$restricted_verb_noun)
exp2_preemption_judgment.df$scene_test2 <- factor(exp2_preemption_judgment.df$scene_test2)


########################################################
# EXPERIMENT 3 - VERB STUDY WITH DOUBLE TRAINING
########################################################

exp3_production_data.df <- subset(combined_production_data.df, experiment == "exp3")

exp3_judgment_data.df <- subset(combined_judgment_data.df, experiment == "exp3")
exp3_judgment_data.df$restricted_verb_noun <- factor(exp3_judgment_data.df$restricted_verb_noun)
exp3_judgment_data.df$condition <- factor(exp3_judgment_data.df$condition)
exp3_judgment_data.df$scene_test2 <- factor(exp3_judgment_data.df$scene_test2)

#separately for entrenchment and preemption

#entrenchment

exp3_entrenchment_production.df <- subset(exp3_production_data.df, condition == "entrenchment")

# Create columns that we will need to run production analyses in entrenchment
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp3_entrenchment_production.df$det1 <- ifelse(exp3_entrenchment_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp3_entrenchment_production.df$det2 <- ifelse(exp3_entrenchment_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp3_entrenchment_production.df$other <- ifelse(exp3_entrenchment_production.df$det_lenient_adapted == "other", 1, 0)
exp3_entrenchment_production.df$none <- ifelse(exp3_entrenchment_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp3_entrenchment_production.df$semantically_correct <- as.numeric(exp3_entrenchment_production.df$semantically_correct)
exp3_entrenchment_production.df$scene_test2 <- factor(exp3_entrenchment_production.df$scene_test2)
exp3_entrenchment_production.df$verb_noun_type_training2 <- factor(exp3_entrenchment_production.df$verb_noun_type_training2)
exp3_entrenchment_production.df$restricted_verb_noun <- factor(exp3_entrenchment_production.df$restricted_verb_noun)

exp3_entrenchment_judgment.df <- subset(exp3_judgment_data.df, condition == "entrenchment")

# Tidy up numeric variables/factors
exp3_entrenchment_judgment.df$semantically_correct <- factor(exp3_entrenchment_judgment.df$semantically_correct)
exp3_entrenchment_judgment.df$restricted_verb_noun <- factor(exp3_entrenchment_judgment.df$restricted_verb_noun)
exp3_entrenchment_judgment.df$scene_test2 <- factor(exp3_entrenchment_judgment.df$scene_test2)


#preemption
exp3_preemption_production.df <- subset(exp3_production_data.df, condition == "preemption")

# Create columns that we will need to run production analyses in pre-emption
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp3_preemption_production.df$det1 <- ifelse(exp3_preemption_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp3_preemption_production.df$det2 <- ifelse(exp3_preemption_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp3_preemption_production.df$other <- ifelse(exp3_preemption_production.df$det_lenient_adapted == "other", 1, 0)
exp3_preemption_production.df$none <- ifelse(exp3_preemption_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp3_preemption_production.df$semantically_correct <- as.numeric(exp3_preemption_production.df$semantically_correct)
exp3_preemption_production.df$scene_test2 <- factor(exp3_preemption_production.df$scene_test2)
exp3_preemption_production.df$verb_noun_type_training2 <- factor(exp3_preemption_production.df$verb_noun_type_training2)
exp3_preemption_production.df$restricted_verb_noun <- factor(exp3_preemption_production.df$restricted_verb_noun)

exp3_preemption_judgment.df <- subset(exp3_judgment_data.df, condition == "preemption")

# Tidy up numeric variables/factors
exp3_preemption_judgment.df$semantically_correct <- factor(exp3_preemption_judgment.df$semantically_correct)
exp3_preemption_judgment.df$restricted_verb_noun <- factor(exp3_preemption_judgment.df$restricted_verb_noun)
exp3_preemption_judgment.df$scene_test2 <- factor(exp3_preemption_judgment.df$scene_test2)


########################################################
# EXPERIMENT 4 - NOUN STUDY WITH ADULTS
########################################################

exp4_production_data.df <- subset(combined_production_data.df, experiment == "exp4")

exp4_judgment_data.df <- subset(combined_judgment_data.df, experiment == "exp4")
exp4_judgment_data.df$restricted_verb_noun <- factor(exp4_judgment_data.df$restricted_verb_noun)
exp4_judgment_data.df$condition <- factor(exp4_judgment_data.df$condition)
exp4_judgment_data.df$scene_test2 <- factor(exp4_judgment_data.df$scene_test2)

#separately for entrenchment and preemption

#entrenchment

exp4_entrenchment_production.df <- subset(exp4_production_data.df, condition == "entrenchment")

# Create columns that we will need to run production analyses in entrenchment
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp4_entrenchment_production.df$det1 <- ifelse(exp4_entrenchment_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp4_entrenchment_production.df$det2 <- ifelse(exp4_entrenchment_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp4_entrenchment_production.df$other <- ifelse(exp4_entrenchment_production.df$det_lenient_adapted == "other", 1, 0)
exp4_entrenchment_production.df$none <- ifelse(exp4_entrenchment_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp4_entrenchment_production.df$semantically_correct <- as.numeric(exp4_entrenchment_production.df$semantically_correct)
exp4_entrenchment_production.df$scene_test2 <- factor(exp4_entrenchment_production.df$scene_test2)
exp4_entrenchment_production.df$verb_noun_type_training2 <- factor(exp4_entrenchment_production.df$verb_noun_type_training2)
exp4_entrenchment_production.df$restricted_verb_noun <- factor(exp4_entrenchment_production.df$restricted_verb_noun)

exp4_entrenchment_judgment.df <- subset(exp4_judgment_data.df, condition == "entrenchment")

# Tidy up numeric variables/factors
exp4_entrenchment_judgment.df$semantically_correct <- factor(exp4_entrenchment_judgment.df$semantically_correct)
exp4_entrenchment_judgment.df$restricted_verb_noun <- factor(exp4_entrenchment_judgment.df$restricted_verb_noun)
exp4_entrenchment_judgment.df$scene_test2 <- factor(exp4_entrenchment_judgment.df$scene_test2)


#preemption
exp4_preemption_production.df <- subset(exp4_production_data.df, condition == "preemption")

# Create columns that we will need to run production analyses in pre-emption
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp4_preemption_production.df$det1 <- ifelse(exp4_preemption_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp4_preemption_production.df$det2 <- ifelse(exp4_preemption_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp4_preemption_production.df$other <- ifelse(exp4_preemption_production.df$det_lenient_adapted == "other", 1, 0)
exp4_preemption_production.df$none <- ifelse(exp4_preemption_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp4_preemption_production.df$semantically_correct <- as.numeric(exp4_preemption_production.df$semantically_correct)
exp4_preemption_production.df$scene_test2 <- factor(exp4_preemption_production.df$scene_test2)
exp4_preemption_production.df$verb_noun_type_training2 <- factor(exp4_preemption_production.df$verb_noun_type_training2)
exp4_preemption_production.df$restricted_verb_noun <- factor(exp4_preemption_production.df$restricted_verb_noun)

exp4_preemption_judgment.df <- subset(exp4_judgment_data.df, condition == "preemption")

# Tidy up numeric variables/factors
exp4_preemption_judgment.df$semantically_correct <- factor(exp4_preemption_judgment.df$semantically_correct)
exp4_preemption_judgment.df$restricted_verb_noun <- factor(exp4_preemption_judgment.df$restricted_verb_noun)
exp4_preemption_judgment.df$scene_test2 <- factor(exp4_preemption_judgment.df$scene_test2)


########################################################
# EXPERIMENT 5 - NOUN STUDY WITH CHILDREN
########################################################

exp5_production_data.df <- subset(combined_production_data.df, experiment == "exp5")

exp5_judgment_data.df <- subset(combined_judgment_data.df, experiment == "exp5")
exp5_judgment_data.df$restricted_verb_noun <- factor(exp5_judgment_data.df$restricted_verb_noun)
exp5_judgment_data.df$condition <- factor(exp5_judgment_data.df$condition)
exp5_judgment_data.df$scene_test2 <- factor(exp5_judgment_data.df$scene_test2)

#separately for entrenchment and preemption

#entrenchment

exp5_entrenchment_production.df <- subset(exp5_production_data.df, condition == "entrenchment")

# Create columns that we will need to run production analyses in entrenchment
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp5_entrenchment_production.df$det1 <- ifelse(exp5_entrenchment_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp5_entrenchment_production.df$det2 <- ifelse(exp5_entrenchment_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp5_entrenchment_production.df$other <- ifelse(exp5_entrenchment_production.df$det_lenient_adapted == "other", 1, 0)
exp5_entrenchment_production.df$none <- ifelse(exp5_entrenchment_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp5_entrenchment_production.df$semantically_correct <- as.numeric(exp5_entrenchment_production.df$semantically_correct)
exp5_entrenchment_production.df$scene_test2 <- factor(exp5_entrenchment_production.df$scene_test2)
exp5_entrenchment_production.df$verb_noun_type_training2 <- factor(exp5_entrenchment_production.df$verb_noun_type_training2)
exp5_entrenchment_production.df$restricted_verb_noun <- factor(exp5_entrenchment_production.df$restricted_verb_noun)

exp5_entrenchment_judgment.df <- subset(exp5_judgment_data.df, condition == "entrenchment")

# Tidy up numeric variables/factors
exp5_entrenchment_judgment.df$semantically_correct <- factor(exp5_entrenchment_judgment.df$semantically_correct)
exp5_entrenchment_judgment.df$restricted_verb_noun <- factor(exp5_entrenchment_judgment.df$restricted_verb_noun)
exp5_entrenchment_judgment.df$scene_test2 <- factor(exp5_entrenchment_judgment.df$scene_test2)


#preemption
exp5_preemption_production.df <- subset(exp5_production_data.df, condition == "preemption")

# Create columns that we will need to run production analyses in pre-emption
# We want some columns coding which of particle 1, particle 2, 'other' and 'none' was produced

exp5_preemption_production.df$det1 <- ifelse(exp5_preemption_production.df$det_lenient_adapted == "det_construction1", 1, 0)
exp5_preemption_production.df$det2 <- ifelse(exp5_preemption_production.df$det_lenient_adapted == "det_construction2", 1, 0)
exp5_preemption_production.df$other <- ifelse(exp5_preemption_production.df$det_lenient_adapted == "other", 1, 0)
exp5_preemption_production.df$none <- ifelse(exp5_preemption_production.df$det_lenient_adapted == "none", 1, 0)

# Tidy up numeric variables/factors
exp5_preemption_production.df$semantically_correct <- as.numeric(exp5_preemption_production.df$semantically_correct)
exp5_preemption_production.df$scene_test2 <- factor(exp5_preemption_production.df$scene_test2)
exp5_preemption_production.df$verb_noun_type_training2 <- factor(exp5_preemption_production.df$verb_noun_type_training2)
exp5_preemption_production.df$restricted_verb_noun <- factor(exp5_preemption_production.df$restricted_verb_noun)

exp5_preemption_judgment.df <- subset(exp5_judgment_data.df, condition == "preemption")

# Tidy up numeric variables/factors
exp5_preemption_judgment.df$semantically_correct <- factor(exp5_preemption_judgment.df$semantically_correct)
exp5_preemption_judgment.df$restricted_verb_noun <- factor(exp5_preemption_judgment.df$restricted_verb_noun)
exp5_preemption_judgment.df$scene_test2 <- factor(exp5_preemption_judgment.df$scene_test2)
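
The per-experiment blocks above repeat the same recoding steps. A helper along these lines (a sketch; the function name is ours, the column names are from the data) could replace each production block:

prep_production <- function(df) {
  # indicator columns for which determiner (if any) was produced
  df$det1  <- ifelse(df$det_lenient_adapted == "det_construction1", 1, 0)
  df$det2  <- ifelse(df$det_lenient_adapted == "det_construction2", 1, 0)
  df$other <- ifelse(df$det_lenient_adapted == "other", 1, 0)
  df$none  <- ifelse(df$det_lenient_adapted == "none", 1, 0)
  # tidy up numeric variables/factors
  df$semantically_correct <- as.numeric(df$semantically_correct)
  for (v in c("scene_test2", "verb_noun_type_training2", "restricted_verb_noun")) {
    df[[v]] <- factor(df[[v]])
  }
  df
}

# e.g. exp1_entrenchment_production.df <-
#   prep_production(subset(exp1_production_data.df, condition == "entrenchment"))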

Experiment 1

Preregistered data analyses

Question 1: Have participants picked up on the difference in meaning between the two argument-structure constructions?

Production data

#Figure 3
# note the parentheses: & binds more tightly than |
RQ1_graph_productions.df = subset(exp1_entrenchment_production.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))

# filter out responses where participants said something other than det1 or det2
RQ1_graph_productions.df = subset(RQ1_graph_productions.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.graph1 = aggregate(semantically_correct ~ verb_noun_type_training2 + participant_private_id, RQ1_graph_productions.df, FUN=mean)

aggregated.graph1 <- rename(aggregated.graph1, verb = verb_noun_type_training2,
                            correct = semantically_correct)

yarrr::pirateplot(formula = correct  ~ verb,
                  data = aggregated.graph1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "% semantically correct",
                  cex.lab = 1,
                  cex.axis = 1,
                  cex.names = 1,
                  yaxt = "n")

axis(2, at = seq(0, 1, by = 0.25), las=1)
abline(h = 0.50, lty = 2)

#1 alternating verb production

alternating_prod.df = subset(exp1_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

#and filter out responses where participants said something other than det1 or det2
alternating_prod.df = subset(alternating_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_alternating_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, alternating_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_alternating_prod.df$semantically_correct),3)
## [1] 0.945
# average accuracy separately for causative and inchoative scenes
round(tapply(aggregated.means_alternating_prod.df$semantically_correct, aggregated.means_alternating_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.971         0.919
a = lizCenter(alternating_prod.df, list("scene_test2"))

# weakly informative normal(0, 1) priors for the intercept and the predictors
alternating_model <- brm(formula = semantically_correct ~ scene_test2.ct +
                           (1 + scene_test2.ct | participant_private_id),
                         data = a, family = bernoulli(link = logit),
                         prior = c(prior(normal(0, 1), class = Intercept),
                                   prior(normal(0, 1), class = b)),
                         cores = 4, warmup = 2000, iter = 5000, chains = 4,
                         control = list(adapt_delta = 0.99))
posterior_summary(alternating_model, variable = c("b_Intercept", "b_scene_test2.ct" ))
##                    Estimate Est.Error      Q2.5    Q97.5
## b_Intercept       3.1785276 0.3590857  2.551993 3.947845
## b_scene_test2.ct -0.7568256 0.5097310 -1.760023 0.254472
mcmc_plot(alternating_model, variable = "^b_", regex = TRUE)

# posterior probability that each coefficient lies on the other side of zero
samps = as.matrix(as.mcmc(alternating_model))
C1 = mean(samps[, "b_Intercept"] < 0)
C2 = mean(samps[, "b_scene_test2.ct"] > 0)
pMCMC = as.data.frame(c(C1, C2))
pMCMC
##    c(C1, C2)
## 1 0.00000000
## 2 0.06483333
# no evidence of a difference between construction 1 and construction 2

# Final model
# weakly informative normal(0, 1) prior for the intercept
alternating_model_final <- brm(formula = semantically_correct ~ 1 +
                                 (1 | participant_private_id),
                               data = a, family = bernoulli(link = logit),
                               prior = set_prior("normal(0,1)", class = "Intercept"),
                               cores = 4, warmup = 2000, iter = 5000, chains = 4,
                               control = list(adapt_delta = 0.99))
posterior_summary(alternating_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 3.022475 0.3380366 2.438893 3.755779
mcmc_plot(alternating_model_final, variable = "b_Intercept", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0
#2 novel verb production

novel_prod.df = subset(exp1_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

#filter out responses where participants said something other than det1 or det2
novel_prod.df = subset(novel_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_novel_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, novel_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_novel_prod.df$semantically_correct),3)
## [1] 0.955
# average accuracy separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_prod.df$semantically_correct, aggregated.means_novel_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.946         0.964
b = lizCenter(novel_prod.df, list("scene_test2"))  

# weakly informative normal(0, 1) priors for the intercept and the predictors
novel_model <- brm(formula = semantically_correct ~ scene_test2.ct +
                     (1 + scene_test2.ct | participant_private_id),
                   data = b, family = bernoulli(link = logit),
                   prior = c(prior(normal(0, 1), class = Intercept),
                             prior(normal(0, 1), class = b)),
                   cores = 4, warmup = 2000, iter = 5000, chains = 4,
                   control = list(adapt_delta = 0.99))
posterior_summary(novel_model, variable = c("b_Intercept", "b_scene_test2.ct"))
##                    Estimate Est.Error      Q2.5    Q97.5
## b_Intercept      3.47804800 0.3974526  2.784284 4.346230
## b_scene_test2.ct 0.05689666 0.5998077 -1.149150 1.234306
mcmc_plot(novel_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] < 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.4586667
# no evidence of a difference between construction 1 and construction 2
# Final model

# weakly informative normal(0, 1) prior for the intercept
novel_model_final <- brm(formula = semantically_correct ~ 1 +
                           (1 | participant_private_id),
                         data = b, family = bernoulli(link = logit),
                         prior = set_prior("normal(0,1)", class = "Intercept"),
                         cores = 4, warmup = 2000, iter = 5000, chains = 4,
                         control = list(adapt_delta = 0.99))

posterior_summary(novel_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 3.229086  0.361725 2.601586 4.025156
mcmc_plot(novel_model_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0

Judgment data

#Figure 4
# note the parentheses: & binds more tightly than |
RQ1_graph_judgments.df = subset(exp1_entrenchment_judgment.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))

# aggregated dataframe for means
aggregated.graph2 = aggregate(response ~ verb_noun_type_training2 + semantically_correct + participant_private_id, RQ1_graph_judgments.df, FUN=mean)
aggregated.graph2$semantically_correct <- recode(aggregated.graph2$semantically_correct, "1" = "yes","0" = "no")

aggregated.graph2 <- rename(aggregated.graph2, verb = verb_noun_type_training2,
                                           correct = semantically_correct)

yarrr::pirateplot(formula = response ~ correct + verb,
                  data = aggregated.graph2,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

#1 alternating verb judgments

alternating_judgments.df = subset(exp1_entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

# aggregated dataframe for means
aggregated.means_alternating_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, alternating_judgments.df, FUN=mean)
aggregated.means_alternating_judgments$semantically_correct<- recode(aggregated.means_alternating_judgments$semantically_correct, "1" = "yes","0" = "no")
aggregated.means_alternating_judgments$scene_test2 <- recode(aggregated.means_alternating_judgments$scene_test2, "construction1" = "transitive causative","construction2" = "intransitive inchoative")

# average rating for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_alternating_judgments$response, aggregated.means_alternating_judgments$semantically_correct, mean),3)
##    no   yes 
## 2.506 4.837
# average rating separately for causative and noncausative scenes
round(tapply(aggregated.means_alternating_judgments$response, list(aggregated.means_alternating_judgments$semantically_correct, aggregated.means_alternating_judgments$scene_test2), mean),3)
##     transitive causative intransitive inchoative
## no                 2.512                   2.500
## yes                4.779                   4.895
c = lizCenter(alternating_judgments.df, list("scene_test2", "semantically_correct"))  

# weakly informative normal(0, 1) priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments <- brm(formula = response ~ scene_test2.ct * semantically_correct.ct +
                                     (1 + scene_test2.ct * semantically_correct.ct | participant_private_id),
                                   data = c, family = gaussian(),
                                   prior = set_prior("normal(0,1)", class = "b"),
                                   cores = 4, warmup = 2000, iter = 5000, chains = 4,
                                   control = list(adapt_delta = 0.99))
posterior_summary(alternating_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                            Estimate  Est.Error        Q2.5
## b_Intercept                              3.68271272 0.07857383  3.53034940
## b_scene_test2.ct                         0.05163069 0.06823706 -0.08069133
## b_semantically_correct.ct                2.29298499 0.12902228  2.03851295
## b_scene_test2.ct:semantically_correct.ct 0.12884617 0.16448120 -0.19538464
##                                              Q97.5
## b_Intercept                              3.8380462
## b_scene_test2.ct                         0.1863481
## b_semantically_correct.ct                2.5442169
## b_scene_test2.ct:semantically_correct.ct 0.4530072
mcmc_plot(alternating_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] < 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1           0.00000
## 2           0.22425
## 3           0.00000
## 4           0.21525
# no difference between construction 1 and construction 2
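The same pattern recurs for every model below: convert the posterior draws to a matrix and report, per coefficient, the proportion of draws falling on one side of zero. A small hypothetical wrapper (not part of the sourced scripts) makes that intent explicit:

# Hypothetical helper for the recurring pMCMC pattern: the proportion of
# posterior draws for each named coefficient falling on the given side of
# zero (a one-sided posterior probability); pass side = `>` for the
# opposite direction.
p_direction <- function(fit, pars, side = `<`) {
  samps <- as.matrix(as.mcmc(fit))
  sapply(pars, function(p) mean(side(samps[, p], 0)))
}

# e.g. p_direction(alternating_model_judgments,
#                  c("b_Intercept", "b_semantically_correct.ct"))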

# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments_final <- brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=alternating_centered, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(alternating_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept               3.682713 0.07857383 3.530349 3.838046
## b_semantically_correct.ct 2.292985 0.12902228 2.038513 2.544217
mcmc_plot(alternating_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
#2 novel verb judgments

novel_judgments.df = subset(exp1_entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

# aggregated dataframe for means
aggregated.means_novel_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, novel_judgments.df, FUN=mean)


# average rating for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_novel_judgments$response, aggregated.means_novel_judgments$semantically_correct, mean),3)
##     0     1 
## 2.250 4.174
# average rating separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_judgments$response, list(aggregated.means_novel_judgments$semantically_correct, aggregated.means_novel_judgments$scene_test2), mean),3)
##   construction1 construction2
## 0         2.302         2.198
## 1         4.221         4.128
d = lizCenter(novel_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(novel_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               3.19951819 0.11052598  2.9834941
## b_scene_test2.ct                         -0.09820619 0.09523924 -0.2873124
## b_semantically_correct.ct                 1.86935258 0.16242203  1.5439449
## b_scene_test2.ct:semantically_correct.ct  0.01157657 0.20158698 -0.3858934
##                                               Q97.5
## b_Intercept                              3.41933213
## b_scene_test2.ct                         0.08893263
## b_semantically_correct.ct                2.17951563
## b_scene_test2.ct:semantically_correct.ct 0.41492585
mcmc_plot(novel_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.1529167
## 3         0.0000000
## 4         0.4776667
# no difference between construction 1 and construction 2
# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(novel_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate Est.Error     Q2.5    Q97.5
## b_Intercept               3.199518  0.110526 2.983494 3.419332
## b_semantically_correct.ct 1.869353  0.162422 1.543945 2.179516
mcmc_plot(novel_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Question 2: Does statistical pre-emption constrain verb argument construction generalizations in adults (judgment data)?

# Figure 5

# first, filter out semantically incorrect trials

judgments_unattested_novel.df <- subset(exp1_judgment_data.df, semantically_correct == "1")   

# we only want to keep the novel verbs
judgments_novel.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and the unattested trials for the restricted items
judgments_unattested_constr1.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
judgments_unattested_constr2.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

judgments_unattested_novel.df <- rbind(judgments_novel.df, judgments_unattested_constr1.df, judgments_unattested_constr2.df)

aggregated.means = aggregate(response ~ condition + restricted_verb_noun + participant_private_id, judgments_unattested_novel.df, FUN=mean)
aggregated.means<- rename(aggregated.means, restricted = restricted_verb_noun)

yarrr::pirateplot(formula = response ~ restricted + condition,
                  data = aggregated.means,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

judgments_unattested_novel_preemption.df <- subset(judgments_unattested_novel.df, condition == "preemption")
judgments_unattested_novel_preemption.df$restricted_verb_noun <- factor(judgments_unattested_novel_preemption.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(judgments_unattested_novel_preemption.df$response, judgments_unattested_novel_preemption.df$restricted_verb_noun, mean),3)
##   yes    no 
## 2.362 3.026
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_preemption_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                            Estimate  Est.Error       Q2.5
## b_Intercept                               2.6975260 0.10033970  2.5030485
## b_restricted_verb_noun.ct                 0.6533282 0.16990371  0.3214005
## b_scene_test2.ct                         -0.1480895 0.09116533 -0.3289705
## b_restricted_verb_noun.ct:scene_test2.ct -0.1488714 0.18591350 -0.5199957
##                                               Q97.5
## b_Intercept                              2.89552686
## b_restricted_verb_noun.ct                0.98917538
## b_scene_test2.ct                         0.02997544
## b_restricted_verb_noun.ct:scene_test2.ct 0.22118958
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1      0.0000000000
## 2      0.0001666667
## 3      0.0506666667
## 4      0.2084166667
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(judgments_preemption_model)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: response ~ (1 + restricted_verb_noun.ct | participant_private_id) + restricted_verb_noun.ct 
##    Data: d_unattested_novel (Number of observations: 624) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 39) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.59      0.08     0.45     0.77
## sd(restricted_verb_noun.ct)                0.98      0.14     0.75     1.28
## cor(Intercept,restricted_verb_noun.ct)     0.20      0.18    -0.17     0.52
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     2454     4681
## sd(restricted_verb_noun.ct)            1.00     3031     4865
## cor(Intercept,restricted_verb_noun.ct) 1.00     2237     3806
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   2.69      0.10     2.50     2.89 1.00     1564
## restricted_verb_noun.ct     0.65      0.17     0.32     0.98 1.00     2164
##                         Tail_ESS
## Intercept                   3072
## restricted_verb_noun.ct     3469
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.75      0.02     0.70     0.79 1.00    12694     9115
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# BF analyses: we use the difference between attested and unattested in the pilot study reported at rpubs.com/AnnaSamara/333562 as the maximum difference we could expect when comparing ratings for unattested vs. novel constructions (SD = 3.15/2)
Bf(0.17, 0.64, uniform = 0, meanoftheory = 0, sdtheory = 3.15/2, tail = 1)
## $LikelihoodTheory
## [1] 0.4641582
## 
## $Likelihoodnull
## [1] 0.001962597
## 
## $BayesFactor
## [1] 236.5021
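The Bf() helper is sourced from BF.R and not reproduced here. Its output is consistent with a Dienes-style Bayes factor calculator: the likelihood of the observed effect (estimate 0.64, SE 0.17) under the null is compared with its marginal likelihood under a one-tailed, half-normal model of H1 (mean 0, SD = 3.15/2). A sketch under that assumption:

# Sketch of the calculation Bf() appears to perform (an assumption about
# BF.R, not its verbatim contents): Dienes-style BF with a half-normal H1.
Bf_sketch <- function(se, obtained, sdtheory) {
  likelihood_null <- dnorm(obtained, mean = 0, sd = se)
  # marginal likelihood under H1: average the likelihood over the prior
  likelihood_theory <- integrate(
    function(theta) dnorm(obtained, theta, se) * 2 * dnorm(theta, 0, sdtheory),
    lower = 0, upper = Inf
  )$value
  likelihood_theory / likelihood_null
}

# Bf_sketch(0.17, 0.64, 3.15/2) gives roughly 236, matching the output above.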
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.17, 0.64, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# RRs for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.06
print(high_threshold)
## [1] 4
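Bf_range() presumably re-runs the same calculation across the grid of candidate SDs for H1 (again an assumption, based on its inputs and outputs). Using the sketch above:

# Robustness-region scan, sketched with Bf_sketch(); sd = 0 is skipped
# because the half-normal H1 is degenerate there.
sd_grid <- H1RANGE[H1RANGE > 0]
range_sketch <- data.frame(
  sdtheory = sd_grid,
  BF = sapply(sd_grid, function(s) Bf_sketch(0.17, 0.64, s))
)
range(subset(range_sketch, BF > 3)$sdtheory)  # approximate robustness region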

Question 3: Does statistical entrenchment constrain verb argument construction generalizations in adults (judgment data)?

# first, filter out semantically incorrect trials
entrenchment_judgments_unattested_novel.df <- subset(exp1_entrenchment_judgment.df, semantically_correct == "1")   

# we only want to keep the novel verbs

entrenchment_judgments_novel.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and the unattested trials for the restricted items

entrenchment_judgments_unattested_constr1.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
entrenchment_judgments_unattested_constr2.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

entrenchment_judgments_unattested_novel.df <- rbind(entrenchment_judgments_novel.df, entrenchment_judgments_unattested_constr1.df, entrenchment_judgments_unattested_constr2.df)
entrenchment_judgments_unattested_novel.df$restricted_verb_noun <- factor(entrenchment_judgments_unattested_novel.df$restricted_verb_noun, levels = c("yes", "no"))


round(tapply(entrenchment_judgments_unattested_novel.df$response, entrenchment_judgments_unattested_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 4.552 4.174
#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               4.36731817 0.11590658  4.1410813
## b_restricted_verb_noun.ct                -0.36745945 0.16675432 -0.6934754
## b_scene_test2.ct                         -0.09888335 0.08510047 -0.2660195
## b_restricted_verb_noun.ct:scene_test2.ct  0.01103506 0.16759768 -0.3166295
##                                                Q97.5
## b_Intercept                               4.59922671
## b_restricted_verb_noun.ct                -0.03799204
## b_scene_test2.ct                          0.06865547
## b_restricted_verb_noun.ct:scene_test2.ct  0.33760041
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.9870000
## 3         0.1222500
## 4         0.4724167
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df , list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct"))
##                             Estimate Est.Error       Q2.5       Q97.5
## b_Intercept                4.3674112 0.1129804  4.1477781  4.59282312
## b_restricted_verb_noun.ct -0.3658204 0.1624594 -0.6865704 -0.04873543
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

# the effect is in the opposite direction

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.9884167
# BF analyses: we draw the maximum from the difference between attested and unattested in this experiment (attested > unattested provides supporting evidence for entrenchment), used as the maximum we expect when comparing ratings for unattested vs. novel constructions (SD = 0.38/2)
Bf(0.16, -0.36, uniform = 0, meanoftheory = 0, sdtheory = 0.38/2, tail = 1)
## $LikelihoodTheory
## [1] 0.0482932
## 
## $Likelihoodnull
## [1] 0.1983728
## 
## $BayesFactor
## [1] 0.2434466
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.16, -0.36, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# RRs for which BF < 1/3
ev_for_h0 <- subset(data.frame(range_test), BF < 1/3)
low_threshold <- min(ev_for_h0$sdtheory)
high_threshold <- max(ev_for_h0$sdtheory)
print(low_threshold)
## [1] 0
print(high_threshold)
## [1] 4

Question 4: Is the effect of statistical pre-emption larger than entrenchment (judgment data)?

# first, filter out semantically incorrect trials
all_judgment_unattested_novel.df <- subset(exp1_judgment_data.df, semantically_correct == "1")   

# we only want to keep the novel verbs

all_judgment_novel.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and the unattested trials for the restricted items

all_judgment_unattested_constr1.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
all_judgment_unattested_constr2.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

all_judgment_unattested_novel.df <- rbind(all_judgment_novel.df, all_judgment_unattested_constr1.df, all_judgment_unattested_constr2.df)
all_judgment_unattested_novel.df$restricted_verb_noun <- factor(all_judgment_unattested_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(all_judgment_unattested_novel.df$response, list(all_judgment_unattested_novel.df$restricted_verb_noun, all_judgment_unattested_novel.df$condition), mean),3)
##     entrenchment preemption
## yes        4.552      2.362
## no         4.174      3.026
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct *scene_test2.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_condition.ct", "b_scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct", "b_restricted_verb_noun.ct:scene_test2.ct","b_condition.ct:scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct:scene_test2.ct" ))
##                                                          Estimate  Est.Error
## b_Intercept                                            3.28857547 0.07621243
## b_restricted_verb_noun.ct                              0.27916562 0.12132220
## b_condition.ct                                        -1.64356100 0.14600455
## b_scene_test2.ct                                      -0.13282596 0.06367988
## b_restricted_verb_noun.ct:condition.ct                 0.98907390 0.22819432
## b_restricted_verb_noun.ct:scene_test2.ct              -0.09373575 0.12814121
## b_condition.ct:scene_test2.ct                         -0.05399121 0.12431465
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.15015667 0.24184938
##                                                              Q2.5        Q97.5
## b_Intercept                                            3.13780933  3.441599891
## b_restricted_verb_noun.ct                              0.04124959  0.516137541
## b_condition.ct                                        -1.92314425 -1.357362441
## b_scene_test2.ct                                      -0.25727149 -0.005789515
## b_restricted_verb_noun.ct:condition.ct                 0.54340197  1.441283807
## b_restricted_verb_noun.ct:scene_test2.ct              -0.34634971  0.151505767
## b_condition.ct:scene_test2.ct                         -0.30033626  0.184968198
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.63060650  0.316587836
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0) 
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0)
C5=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0) 
C6=mean(samps[,"b_condition.ct:scene_test2.ct"] > 0)
C7=mean(samps[,"b_restricted_verb_noun.ct:condition.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7)
## 1                    0.01100000
## 2                    0.00000000
## 3                    0.01933333
## 4                    0.00000000
## 5                    0.23250000
## 6                    0.33450000
## 7                    0.26908333
#roughly predicted effect size from adult pilot study was 2.91. Use it as max for unattested vs. novel (SD = 2.91/2)
Bf(0.22, 0.99, uniform = 0, meanoftheory = 0, sdtheory = 2.91/2, tail = 1)
## $LikelihoodTheory
## [1] 0.4323971
## 
## $Likelihoodnull
## [1] 7.265337e-05
## 
## $BayesFactor
## [1] 5951.508
H1RANGE = seq(0,4,by=0.01) # upper bound of [5-1] - [0]: the maximum possible effect of pre-emption minus a null effect of entrenchment
range_test <- Bf_range(0.22, 0.99, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# RRs for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.06
print(high_threshold)
## [1] 4

Exploratory data analyses

Effect of statistical pre-emption: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

# Figure 22
judgments_unattested_attested.df <- subset(exp1_judgment_data.df, semantically_correct == "1")   
judgments_unattested_attested.df <- subset(judgments_unattested_attested.df, restricted_verb_noun == "yes")   


aggregated.means1 = aggregate(response ~ condition + attested_unattested + participant_private_id, judgments_unattested_attested.df , FUN=mean)
aggregated.means1<- rename(aggregated.means1, attested = attested_unattested)

aggregated.means1$attested<- recode(aggregated.means1$attested, "1" = "yes","0" = "no")


yarrr::pirateplot(formula = response ~  attested  + condition,
                  data = aggregated.means1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

# analyses
attested_vs_unattested = subset(exp1_preemption_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested$response, attested_vs_unattested$attested_unattested, mean),3)
##     0     1 
## 2.362 4.952
# model with tested construction
#Center variables of interest using the lizCenter function:
d0_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d0_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_preemption1, variable = c("b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                            Estimate Est.Error       Q2.5
## b_scene_test2.ct                        -0.05596002 0.0584396 -0.1712673
## b_attested_unattested.ct                 2.55713610 0.1178657  2.3166948
## b_attested_unattested.ct:scene_test2.ct  0.04364495 0.0947964 -0.1424235
##                                              Q97.5
## b_scene_test2.ct                        0.05804673
## b_attested_unattested.ct                2.78392966
## b_attested_unattested.ct:scene_test2.ct 0.23149070
samps = as.matrix(as.mcmc(attested_unattested_preemption1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1       0.16575
## 2       0.00000
## 3       0.32225
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(attested_unattested_preemption, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                          Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept              3.671873 0.05843434 3.559544 3.790096
## b_attested_unattested.ct 2.554381 0.11976518 2.312809 2.786729
mcmc_plot(attested_unattested_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# this prior (5.00 - 1.85 = 3.15) is drawn from a previous pilot study with 10 adults in the pre-emption condition, preregistered at https://rpubs.com/AnnaSamara/333562
Bf(0.12, 2.55, uniform = 0, meanoftheory = 0, sdtheory = 3.15, tail = 1)
## $LikelihoodTheory
## [1] 0.1824811
## 
## $Likelihoodnull
## [1] 2.92535e-98
## 
## $BayesFactor
## [1] 6.237925e+96
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.12, 2.55, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# RRs for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.01
print(high_threshold)
## [1] 4

Effect of statistical entrenchment: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

attested_vs_unattested_ent = subset(exp1_entrenchment_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_ent$response, attested_vs_unattested_ent$attested_unattested, mean),3)
##     0     1 
## 4.552 4.942
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent1 = lizCenter(attested_vs_unattested_ent, list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d_attested_unattested_ent1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment1, variable = c("b_Intercept","b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                            Estimate  Est.Error        Q2.5
## b_Intercept                              4.74989113 0.06287633  4.62555920
## b_scene_test2.ct                        -0.02856261 0.06077947 -0.14533485
## b_attested_unattested.ct                 0.38195048 0.12721966  0.12775766
## b_attested_unattested.ct:scene_test2.ct  0.14925124 0.10998681 -0.06927129
##                                              Q97.5
## b_Intercept                             4.87503942
## b_scene_test2.ct                        0.09037277
## b_attested_unattested.ct                0.63198775
## b_attested_unattested.ct:scene_test2.ct 0.36454064
mcmc_plot(attested_unattested_entrenchment1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1   0.318916667
## 2   0.001833333
## 3   0.085250000
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent = lizCenter(attested_vs_unattested_ent, list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested_ent, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                           Estimate  Est.Error      Q2.5     Q97.5
## b_Intercept              4.7486139 0.06220395 4.6261423 4.8694740
## b_attested_unattested.ct 0.3844025 0.12597724 0.1344789 0.6327209
mcmc_plot(attested_unattested_entrenchment, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1     0.000
## 2     0.001
# we preregistered that the maximum effect of entrenchment here would be 1, based on adult data suggesting the difference is never more than 1 with novel verbs, i.e. SD = 0.5
Bf(0.13, 0.38, uniform = 0, meanoftheory = 0, sdtheory = 0.5, tail = 1)
## $LikelihoodTheory
## [1] 1.175537
## 
## $Likelihoodnull
## [1] 0.04281328
## 
## $BayesFactor
## [1] 27.45731
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.13, 0.38, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# RRs for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.06
print(high_threshold)
## [1] 4

Entrenchment vs. preemption: ratings for witnessed vs. unwitnessed forms

attested_vs_unattested_across = subset(exp1_judgment_data.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_across$response, list(attested_vs_unattested_across$condition, attested_vs_unattested_across$attested_unattested), mean),3)
##                  0     1
## entrenchment 4.552 4.942
## preemption   2.362 4.952
# model with test construction
#Center variables of interest using the lizCenter function:

d0_attested_unattested_all = lizCenter(attested_vs_unattested_across , list("attested_unattested","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_all <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct*condition.ct, data=d0_attested_unattested_all, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_all, variable = c("b_Intercept","b_attested_unattested.ct", "b_scene_test2.ct", "b_condition.ct", "b_attested_unattested.ct:scene_test2.ct", "b_attested_unattested.ct:condition.ct", "b_scene_test2.ct:condition.ct", "b_attested_unattested.ct:scene_test2.ct:condition.ct"))
##                                                         Estimate  Est.Error
## b_Intercept                                           4.05630275 0.04336671
## b_attested_unattested.ct                              1.78110746 0.08830579
## b_scene_test2.ct                                     -0.04543540 0.04336163
## b_condition.ct                                       -1.05366161 0.08374009
## b_attested_unattested.ct:scene_test2.ct               0.08165893 0.07306988
## b_attested_unattested.ct:condition.ct                 2.11837775 0.16967751
## b_scene_test2.ct:condition.ct                        -0.02832897 0.08556862
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.09968469 0.14825097
##                                                            Q2.5       Q97.5
## b_Intercept                                           3.9716074  4.14198016
## b_attested_unattested.ct                              1.6053157  1.95290431
## b_scene_test2.ct                                     -0.1317791  0.03951129
## b_condition.ct                                       -1.2179338 -0.89025300
## b_attested_unattested.ct:scene_test2.ct              -0.0634400  0.22599938
## b_attested_unattested.ct:condition.ct                 1.7827334  2.45201078
## b_scene_test2.ct:condition.ct                        -0.1949138  0.13902789
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.3898744  0.19031211
samps = as.matrix(as.mcmc(attested_unattested_all))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_condition.ct"] > 0)
C5=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)
C6=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0)
C7=mean(samps[,"b_scene_test2.ct:condition.ct"] > 0)
C8=mean(samps[,"b_attested_unattested.ct:scene_test2.ct:condition.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7,C8))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7, C8)
## 1                         0.0000000
## 2                         0.0000000
## 3                         0.1420833
## 4                         0.0000000
## 5                         0.1266667
## 6                         0.0000000
## 7                         0.3687500
## 8                         0.2478333
#Center variables of interest using the lizCenter function:
df_attested_unattested = lizCenter(attested_vs_unattested_across, list("attested_unattested", "condition"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment_preemption <- brm(formula = response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct * condition.ct, data=df_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment_preemption, variable = c("b_Intercept","b_condition.ct", "b_attested_unattested.ct","b_attested_unattested.ct:condition.ct"))
##                                        Estimate  Est.Error      Q2.5      Q97.5
## b_Intercept                            4.057233 0.04356025  3.972133  4.1422571
## b_condition.ct                        -1.050986 0.08359345 -1.212736 -0.8862566
## b_attested_unattested.ct               1.779617 0.08938987  1.606371  1.9556987
## b_attested_unattested.ct:condition.ct  2.111189 0.16975965  1.774832  2.4400939
mcmc_plot(attested_unattested_entrenchment_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_attested_unattested.ct"] < 0) 
C4=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1                 0
## 2                 0
## 3                 0
## 4                 0
#roughly predicted effect size from adult pilot study: 2.91
Bf(0.17, 2.11, uniform = 0, meanoftheory = 0, sdtheory = 2.91, tail = 1)
## $LikelihoodTheory
## [1] 0.210635
## 
## $Likelihoodnull
## [1] 8.289253e-34
## 
## $BayesFactor
## [1] 2.541061e+32
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.17, 2.11, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# RRs for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.02
print(high_threshold)
## [1] 4

Production data: Effect of statistical pre-emption

# Are participants producing more attested than unattested determiners? We now compare the proportion of attested determiners for the restricted verbs (captured by the intercept) against chance

production_preemption_attested_unattested.df <- subset(exp1_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, restricted_verb_noun =="yes")

round(tapply(production_preemption_attested_unattested.df$attested_unattested, production_preemption_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.994         0.994            NA
production_preemption_attested_unattested.df$verb_noun_type_training2 <- factor(production_preemption_attested_unattested.df$verb_noun_type_training2)

df_prod = lizCenter(production_preemption_attested_unattested.df , list("verb_noun_type_training2"))  

# maximally vague priors for the predictors and the intercept
prod_attested_unattested = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                   Estimate Est.Error     Q2.5   Q97.5
## b_Intercept                    4.675838339 0.4284553  3.91878 5.59129
## b_verb_noun_type_training2.ct -0.002950244 0.6229421 -1.21841 1.22448
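Because the model is on the logit scale, chance production of the attested determiner (50%) corresponds to an intercept of 0, and the posterior intercept can be read back as a proportion with plogis():

# back-transform the logit-scale intercept to a proportion;
# plogis(0) = 0.5 is chance
plogis(4.68)  # ~0.99: the attested determiner is produced almost exclusively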
mcmc_plot(prod_attested_unattested, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.4981667
# same analysis without verb_noun_type_training2

# maximally vague priors for the intercept
prod_attested_unattested_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(prod_attested_unattested_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 4.511537 0.3926437 3.829203 5.361209
mcmc_plot(prod_attested_unattested_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_final))
C1=mean(samps[,"b_Intercept"] < 0)


# We will now compare unattested forms for restricted vs. novel verbs
# Do participants produce the unwitnessed form less for the restricted verbs than for the novel verb?


production_preemption_restricted_novel.df <- subset(exp1_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_novel.df<- subset(production_preemption_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- recode(production_preemption_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.006         0.006         0.530
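As a side note, the three recoding statements above could be written as a single dplyr pass. The sketch below shows the equivalent logic on a hypothetical pre-recode copy, raw.df (shown for clarity only, since production_preemption_restricted_novel.df has already been recoded at this point):

# Sketch of the same recoding in one pass (raw.df is hypothetical: the data
# frame as it stood before the three statements above ran; assumes
# attested_unattested is stored as 0/1). For the novel verb, det1 responses
# are arbitrarily labelled attested (1) and det2 responses unattested (0);
# 1L - x then flips the column so that 1 = unattested.
recoded.df <- raw.df %>%
  mutate(
    attested_unattested = case_when(
      verb_noun_type_training2 == "novel" & det_lenient_adapted == "det_construction1" ~ 1L,
      verb_noun_type_training2 == "novel" & det_lenient_adapted == "det_construction2" ~ 0L,
      TRUE ~ as.integer(attested_unattested)
    ),
    attested_unattested = 1L - attested_unattested
  )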
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.006 0.530
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun)
production_preemption_restricted_novel1.df = lizCenter(production_preemption_restricted_novel.df, list("restricted_verb_noun"))
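lizCenter() comes from the sourced lizCenter.R. Assuming it implements the usual deviation coding (dummy-code each two-level factor and subtract the mean, so that the model's other coefficients are estimated at the grand mean), a minimal equivalent for a single two-level factor would be:

# Minimal sketch of centering one two-level factor (an assumption about what
# the sourced lizCenter() does; lizCenter.R itself is authoritative).
center2 <- function(df, var) {
  x <- as.numeric(df[[var]] == levels(factor(df[[var]]))[2])   # dummy-code 0/1
  df[[paste0(var, ".ct")]] <- x - mean(x, na.rm = TRUE)        # centre on the mean
  df
}
# e.g. center2(production_preemption_restricted_novel.df, "restricted_verb_noun")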

# maximally vague priors for the predictors and the intercept
prod_unattested_novel_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error      Q2.5     Q97.5
## b_Intercept               -3.243702 0.3475156 -3.966945 -2.609154
## b_restricted_verb_noun.ct  3.795713 0.6592448  2.417027  5.017706
mcmc_plot(prod_unattested_novel_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# We will now compare unattested for restricted vs. alternating

production_preemption_restricted_alt.df <- subset(exp1_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_alt.df<- subset(production_preemption_restricted_alt.df, verb_noun_type_training2 != "novel")

# all forms are unwitnessed for the alternating verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- recode(production_preemption_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)


round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.465         0.006         0.006            NA
production_preemption_restricted_alt.df$restricted_verb_noun <- factor(production_preemption_restricted_alt.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.006 0.465
production_preemption_restricted_alt1.df = lizCenter(production_preemption_restricted_alt.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_alt_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_preemption_restricted_alt1.df (Number of observations: 936) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 39) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.29      0.22     0.01     0.82
## sd(restricted_verb_noun.ct)                1.25      0.44     0.39     2.16
## cor(Intercept,restricted_verb_noun.ct)    -0.13      0.56    -0.95     0.92
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     5627     5739
## sd(restricted_verb_noun.ct)            1.00     1581     2008
## cor(Intercept,restricted_verb_noun.ct) 1.01      880     2515
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                  -2.94      0.23    -3.43    -2.53 1.00     9593
## restricted_verb_noun.ct     3.95      0.39     3.21     4.73 1.00     8270
##                         Tail_ESS
## Intercept                   6987
## restricted_verb_noun.ct     7886
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_alt_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Production data: Effect of statistical entrenchment

#a. Are participants producing more attested than unattested dets?
# here, we want to see how often participants produce the unattested det, e.g. the transitive-only det1 for an intransitive-only (det2) verb, in the intransitive condition at test
# and vice versa 

production_entrenchment_attested_unattested.df  <- subset(exp1_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_attested_unattested.df  <- subset(production_entrenchment_attested_unattested.df, restricted_verb_noun =="yes")

#We want to compare attested vs. unattested trials for transitive verbs in the intransitive inchoative construction at test
production_entrenchment_attested_unattested1.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

#And intransitive inchoative verbs in the transitive construction at test. Filter out irrelevant trials
production_entrenchment_attested_unattested2.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_attested_unattested.df <- rbind(production_entrenchment_attested_unattested1.df, production_entrenchment_attested_unattested2.df)

#How much of the time are participants producing attested items?
round(mean(production_entrenchment_attested_unattested.df$attested_unattested),3)
## [1] 0.148
# and separately for each verb type
round(tapply(production_entrenchment_attested_unattested.df$attested_unattested, production_entrenchment_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.174         0.122            NA
production_entrenchment_attested_unattested.df$verb_noun_type_training2 <- factor(production_entrenchment_attested_unattested.df$verb_noun_type_training2)
df_prod_ent = lizCenter((production_entrenchment_attested_unattested.df), list("verb_noun_type_training2"))  


# maximally vague priors for the predictors and the intercept
prod_attested_unattested_ent = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_ent, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                 Estimate Est.Error      Q2.5      Q97.5
## b_Intercept                   -2.7046421 0.4930968 -3.703108 -1.7674734
## b_verb_noun_type_training2.ct -0.6303854 0.4646376 -1.549967  0.3004472
mcmc_plot(prod_attested_unattested_ent, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##    c(C1, C2)
## 1 1.00000000
## 2 0.08491667
# same analysis without the verb_noun_type_training2 predictor


# maximally vague priors for the intercept
prod_attested_unattested_ent_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_attested_unattested_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ 1 + (1 | participant_private_id) 
##    Data: df_prod_ent (Number of observations: 344) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 43) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     2.81      0.63     1.84     4.28 1.00     3377     5378
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -2.66      0.48    -3.63    -1.76 1.00     3503     4869
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(prod_attested_unattested_ent_final, variable = c("b_Intercept"))
##              Estimate Est.Error      Q2.5     Q97.5
## b_Intercept -2.656709 0.4755172 -3.632715 -1.755655
mcmc_plot(prod_attested_unattested_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 1
# c. we will now compare unattested for restricted vs. novel
# Do participants produce the unwitnessed form less for the two non-alternating verbs than for the novel verb? (the “unwitnessed” form has to be set arbitrarily for the novel verb)


production_entrenchment_restricted_novel.df <- subset(exp1_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_novel.df<- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_novel.df$attested_unattested)
production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_novel.df$attested_unattested)

# select trials featuring the novel verb in the intransitive inchoative construction
production_entrenchment_restricted_novel1.df <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "novel"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_novel2.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_novel3.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_novel.df <- rbind(production_entrenchment_restricted_novel1.df, production_entrenchment_restricted_novel2.df, production_entrenchment_restricted_novel3.df)


round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.174         0.122         0.036
# reverse coding to focus on unattested rather than attested for novel vs. restricted
production_entrenchment_restricted_novel.df$attested_unattested<- recode(production_entrenchment_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.852 0.964
# i.e., participants produce unattested forms less often for the restricted verbs than for the novel verb

production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun)
production_entrenchment_restricted_novel1.df = lizCenter(production_entrenchment_restricted_novel.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_novel_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_ent_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error       Q2.5    Q97.5
## b_Intercept               3.1487028 0.3903613  2.4262753 3.975659
## b_restricted_verb_noun.ct 0.7198995 0.6144286 -0.4281981 1.972001
mcmc_plot(prod_unattested_novel_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1    0.0000
## 2    0.1145
# d. we will now compare unattested for restricted vs. alternating
# Do participants produce the unwitnessed form less for the two non-alternating verbs than for the alternating verb? (the “unwitnessed” form has to be set arbitrarily for the alternating verb)


production_entrenchment_restricted_alt.df <- subset(exp1_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_alt.df<- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 != "novel")

# all forms are unwitnessed for the alternating verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_alt.df$attested_unattested)
production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_alt.df$attested_unattested)

# select trials featuring the alternating verb in the intransitive inchoative construction
production_entrenchment_restricted_alt1.df <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "alternating"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_alt2.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_alt3.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_alt.df <- rbind(production_entrenchment_restricted_alt1.df, production_entrenchment_restricted_alt2.df, production_entrenchment_restricted_alt3.df)


round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.081         0.174         0.122            NA
# reverse coding to focus on unattested rather than attested for alternating vs. restricted
production_entrenchment_restricted_alt.df$attested_unattested<- recode(production_entrenchment_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.852 0.919
# i.e., participants produce unattested forms less often for the restricted verbs than for the alternating verb

production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun)
production_entrenchment_restricted_alt1.df= lizCenter(production_entrenchment_restricted_alt.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_alt_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_entrenchment_restricted_alt1.df (Number of observations: 516) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 43) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              2.28      0.45     1.54     3.31
## sd(restricted_verb_noun.ct)                1.66      0.60     0.52     2.94
## cor(Intercept,restricted_verb_noun.ct)    -0.69      0.27    -0.99    -0.02
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     4342     7245
## sd(restricted_verb_noun.ct)            1.00     4977     3723
## cor(Intercept,restricted_verb_noun.ct) 1.00     5090     5452
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   2.88      0.39     2.14     3.69 1.00     5249
## restricted_verb_noun.ct     0.24      0.52    -0.79     1.24 1.00     8331
##                         Tail_ESS
## Intercept                   7623
## restricted_verb_noun.ct     9089
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_alt_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.3251667

Experiment 2

Preregistered data analyses

Question 1: Have participants picked up on the difference in meaning between the two argument-structure constructions?

Production data

#Figure 6
RQ1_graph_productions.df = subset(exp2_entrenchment_production.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))
RQ1_graph_productions.df = subset(RQ1_graph_productions.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.graph1 = aggregate(semantically_correct ~ verb_noun_type_training2 + participant_private_id, RQ1_graph_productions.df, FUN=mean)

aggregated.graph1 <- rename(aggregated.graph1, verb = verb_noun_type_training2,
                            correct = semantically_correct)

yarrr::pirateplot(formula = correct  ~ verb,
                  data = aggregated.graph1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "% semantically correct",
                  cex.lab = 1,
                  cex.axis = 1,
                  cex.names = 1,
                  yaxt = "n")

axis(2, at = seq(0, 1, by = 0.25), las=1)
abline(h = 0.50, lty = 2)

#1 alternating verb production

alternating_prod.df = subset(exp2_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

#and filter out responses where participants said something other than det1 or det2
alternating_prod.df = subset(alternating_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_alternating_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, alternating_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_alternating_prod.df$semantically_correct),3)
## [1] 0.975
# average accuracy separately for causative and inchoative scenes
round(tapply(aggregated.means_alternating_prod.df$semantically_correct, aggregated.means_alternating_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.963         0.988
# maximally vague priors for the intercept and the predictors
a = lizCenter(alternating_prod.df, list("scene_test2"))  

alternating_model <-brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=a, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model, variable = c("b_Intercept", "b_scene_test2.ct" ))
##                   Estimate Est.Error       Q2.5    Q97.5
## b_Intercept      3.6266001 0.3705089  2.9698031 4.408073
## b_scene_test2.ct 0.5183207 0.5865219 -0.6355693 1.684884
mcmc_plot(alternating_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] < 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.1840833
# no difference between construction 1 and construction 2

# Final model
# maximally vague priors for the intercept 
alternating_model_final = brm(formula = semantically_correct~1 + (1|participant_private_id), data=a, family = bernoulli(link = logit),set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5   Q97.5
## b_Intercept 3.441702 0.3323323 2.844819 4.16216
mcmc_plot(alternating_model_final, variable = "b_Intercept", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0
#2 novel verb production

novel_prod.df = subset(exp2_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

#filter out responses where participants said something other than det1 or det2
novel_prod.df = subset(novel_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_novel_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, novel_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_novel_prod.df$semantically_correct),3)
## [1] 0.96
# average accuracy separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_prod.df$semantically_correct, aggregated.means_novel_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.975         0.946
b = lizCenter(novel_prod.df, list("scene_test2"))  

# maximally vague priors for the intercept and the predictors
novel_model <- brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=b, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model, variable = c("b_Intercept", "b_scene_test2.ct"))
##                   Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       3.637522 0.4364774  2.867687 4.5810324
## b_scene_test2.ct -0.313324 0.6080358 -1.474418 0.9012694
mcmc_plot(novel_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1    0.0000
## 2    0.2975
# no difference between construction 1 and construction 2  
# Final model

# maximally vague priors for the intercept 
novel_model_final <- brm(formula = semantically_correct~1+ (1|participant_private_id), data=b, family = bernoulli(link = logit), set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 3.479233 0.4181484 2.733981 4.369653
mcmc_plot(novel_model_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0

Judgment data

#Figure 7
RQ1_graph_judgments.df = subset(exp2_entrenchment_judgment.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))

# aggregated dataframe for means
aggregated.graph2 = aggregate(response ~ verb_noun_type_training2 + semantically_correct + participant_private_id, RQ1_graph_judgments.df, FUN=mean)
aggregated.graph2$semantically_correct <- recode(aggregated.graph2$semantically_correct, "1" = "yes","0" = "no")

aggregated.graph2 <- rename(aggregated.graph2, verb = verb_noun_type_training2,
                                           correct = semantically_correct)

yarrr::pirateplot(formula = response ~ correct + verb,
                  data = aggregated.graph2,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

#1 alternating verb judgments

alternating_judgments.df = subset(exp2_entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

# aggregated dataframe for means
aggregated.means_alternating_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, alternating_judgments.df, FUN=mean)
aggregated.means_alternating_judgments$semantically_correct<- recode(aggregated.means_alternating_judgments$semantically_correct, "1" = "yes","0" = "no")
aggregated.means_alternating_judgments$scene_test2 <- recode(aggregated.means_alternating_judgments$scene_test2, "construction1" = "transitive causative","construction2" = "intransitive inchoative")

# average ratings for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_alternating_judgments$response, aggregated.means_alternating_judgments$semantically_correct, mean),3)
##    no   yes 
## 2.294 4.775
# average ratings separately for causative and noncausative scenes
round(tapply(aggregated.means_alternating_judgments$response, list(aggregated.means_alternating_judgments$semantically_correct, aggregated.means_alternating_judgments$scene_test2), mean),3)
##     transitive causative intransitive inchoative
## no                 2.350                   2.237
## yes                4.812                   4.737
c = lizCenter(alternating_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=c, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(alternating_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               3.54628377 0.07082621  3.4060994
## b_scene_test2.ct                         -0.09247006 0.07814274 -0.2442150
## b_semantically_correct.ct                 2.43166799 0.14131597  2.1463628
## b_scene_test2.ct:semantically_correct.ct  0.04277391 0.15538705 -0.2579325
##                                               Q97.5
## b_Intercept                              3.68571815
## b_scene_test2.ct                         0.06307272
## b_semantically_correct.ct                2.70813867
## b_scene_test2.ct:semantically_correct.ct 0.34549299
mcmc_plot(alternating_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] < 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.8819167
## 3         0.0000000
## 4         0.3910833
# no difference between construction 1 and construction 2

# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=c, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept               3.546284 0.07082621 3.406099 3.685718
## b_semantically_correct.ct 2.431668 0.14131597 2.146363 2.708139
mcmc_plot(alternating_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
#2 novel verb judgments

novel_judgments.df = subset(exp2_entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

# aggregated dataframe for means
aggregated.means_novel_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, novel_judgments.df, FUN=mean)


# average ratings for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_novel_judgments$response, aggregated.means_novel_judgments$semantically_correct, mean),3)
##     0     1 
## 2.163 3.944
# average ratings separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_judgments$response, list(aggregated.means_novel_judgments$semantically_correct, aggregated.means_novel_judgments$scene_test2), mean),3)
##   construction1 construction2
## 0         2.188         2.138
## 1         3.938         3.950
d = lizCenter(novel_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               3.04234418 0.11248308  2.8174563
## b_scene_test2.ct                         -0.02156404 0.09240455 -0.2051258
## b_semantically_correct.ct                 1.72141461 0.17951082  1.3738847
## b_scene_test2.ct:semantically_correct.ct  0.05567715 0.18893781 -0.3130332
##                                              Q97.5
## b_Intercept                              3.2618444
## b_scene_test2.ct                         0.1588175
## b_semantically_correct.ct                2.0730219
## b_scene_test2.ct:semantically_correct.ct 0.4238116
mcmc_plot(novel_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1            0.0000
## 2            0.4065
## 3            0.0000
## 4            0.3815
# no difference between construction 1 and construction 2
# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate Est.Error     Q2.5    Q97.5
## b_Intercept               3.042344 0.1124831 2.817456 3.261844
## b_semantically_correct.ct 1.721415 0.1795108 1.373885 2.073022
mcmc_plot(novel_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Question 2: Does statistical pre-emption constrain verb argument construction generalizations in adults (judgment data)?

#Figure 8

# first, filter out semantically incorrect trials

judgments_unattested_novel.df <- subset(exp2_judgment_data.df, semantically_correct == "1")   

#we only want to keep novel
judgments_novel.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and keep only unattested trials for the restricted verbs
judgments_unattested_constr1.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
judgments_unattested_constr2.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

judgments_unattested_novel.df <- rbind(judgments_novel.df, judgments_unattested_constr1.df, judgments_unattested_constr2.df)

aggregated.means = aggregate(response ~ condition + restricted_verb_noun + participant_private_id, judgments_unattested_novel.df, FUN=mean)
aggregated.means<- rename(aggregated.means, restricted = restricted_verb_noun)

yarrr::pirateplot(formula = response ~ restricted + condition,
                  data = aggregated.means,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

judgments_unattested_novel_preemption.df <- subset(judgments_unattested_novel.df, condition == "preemption")
judgments_unattested_novel_preemption.df$restricted_verb_noun <- factor(judgments_unattested_novel_preemption.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(judgments_unattested_novel_preemption.df$response, judgments_unattested_novel_preemption.df$restricted_verb_noun, mean),3)
##   yes    no 
## 2.259 2.644
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(judgments_preemption_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate  Est.Error         Q2.5
## b_Intercept                               2.45178030 0.09144855  2.268004556
## b_restricted_verb_noun.ct                 0.36794367 0.18401469 -0.001360611
## b_scene_test2.ct                         -0.07957874 0.10031268 -0.279484951
## b_restricted_verb_noun.ct:scene_test2.ct  0.22670332 0.12186523 -0.011716821
##                                              Q97.5
## b_Intercept                              2.6353740
## b_restricted_verb_noun.ct                0.7248496
## b_scene_test2.ct                         0.1148226
## b_restricted_verb_noun.ct:scene_test2.ct 0.4683699
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1        0.00000000
## 2        0.02550000
## 3        0.21150000
## 4        0.03083333
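The chunks below break the interaction down by refitting the model within each scene. As a rough cross-check, approximate simple effects can also be read directly off the interaction model's draws. This sketch assumes the centred scene codes are close to -0.5/+0.5 (only true for near-balanced data, and which scene gets which sign depends on lizCenter's dummy coding, so inspect unique(d_unattested_novel$scene_test2.ct) before interpreting):

# Rough sketch: approximate simple effects of restricted vs. novel within each
# scene, from the already-fitted interaction model (assumes scene codes ~ +/-0.5).
samps <- as.matrix(as.mcmc(judgments_preemption_model))
b  <- samps[, "b_restricted_verb_noun.ct"]
bx <- samps[, "b_restricted_verb_noun.ct:scene_test2.ct"]
quantile(b - 0.5 * bx, c(.025, .5, .975))   # simple effect at the scene coded ~ -0.5
quantile(b + 0.5 * bx, c(.025, .5, .975))   # simple effect at the scene coded ~ +0.5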
# break down the interaction

preemption_1 = subset(judgments_unattested_novel_preemption.df, scene_test2 == "construction1")   
round(tapply(preemption_1$response, preemption_1$restricted_verb_noun, mean),3)
##   yes    no 
## 2.356 2.625
#Center variables of interest using the lizCenter function:
d_unattested_novel_con1 = lizCenter(preemption_1, list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_con1 <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, d_unattested_novel_con1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(judgments_preemption_con1, WAIC=T)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: response ~ (1 + restricted_verb_noun.ct | participant_private_id) + restricted_verb_noun.ct 
##    Data: d_unattested_novel_con1 (Number of observations: 320) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 40) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.62      0.08     0.47     0.80
## sd(restricted_verb_noun.ct)                1.17      0.16     0.89     1.52
## cor(Intercept,restricted_verb_noun.ct)     0.32      0.17    -0.03     0.61
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     2678     5894
## sd(restricted_verb_noun.ct)            1.00     3277     6162
## cor(Intercept,restricted_verb_noun.ct) 1.00     2043     4113
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   2.49      0.10     2.29     2.69 1.00     1868
## restricted_verb_noun.ct     0.26      0.19    -0.13     0.64 1.00     2466
##                         Tail_ESS
## Intercept                   3864
## restricted_verb_noun.ct     4679
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.65      0.03     0.59     0.71 1.00     9356     8987
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(judgments_preemption_con1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_con1))
C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C1
## [1] 0.088
# use the unattested vs. novel difference from the original study as an estimate of the expected size of the difference here (sdtheory)
Bf(0.20, 0.26, uniform = 0, meanoftheory = 0, sdtheory = 0.65, tail = 1)
## $LikelihoodTheory
## [1] 0.9721199
## 
## $Likelihoodnull
## [1] 0.856843
## 
## $BayesFactor
## [1] 1.134537
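Bf() comes from the sourced BF.R. Assuming it follows the standard Dienes (2008) calculator (a point null against an H1 prior that, with uniform = 0 and tail = 1, is a half-normal with scale sdtheory, and with the first two arguments being the standard error and the obtained effect), the following sketch reproduces the computation numerically:

# Sketch of a Dienes-style Bayes factor (an assumption about what the sourced
# Bf() computes; BF.R itself is authoritative). se = standard error of the
# obtained effect; tail = 1 makes the H1 prior a half-normal over positive effects.
Bf_sketch <- function(se, obtained, meanoftheory = 0, sdtheory = 1, tail = 1) {
  theta <- seq(meanoftheory - 5 * sdtheory, meanoftheory + 5 * sdtheory,
               length.out = 2000)
  prior <- dnorm(theta, meanoftheory, sdtheory)
  if (tail == 1) prior[theta < 0] <- 0     # one-tailed: positive effects only
  prior <- prior / sum(prior)              # normalise over the grid
  likelihood_theory <- sum(dnorm(obtained, theta, se) * prior)
  likelihood_null   <- dnorm(obtained, 0, se)
  likelihood_theory / likelihood_null
}
# Bf_sketch(0.20, 0.26, sdtheory = 0.65, tail = 1) gives ~1.13, matching Bf() above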
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.20, 0.26, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)


# find values of sdtheory for which the BF is inconclusive (1/3 < BF < 3)
inconclusive <- subset(data.frame(range_test), BF < 3 & BF > 1/3)
low_threshold <- min(inconclusive$sdtheory)
high_threshold <- max(inconclusive$sdtheory)
print(low_threshold)
## [1] 0.01
print(high_threshold)
## [1] 2.48
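Bf_range() presumably repeats this calculation over a grid of candidate H1 scales; the scan can be reproduced approximately with Bf_sketch from above:

# Approximate reproduction of the robustness scan using Bf_sketch (skip
# sdtheory = 0, where a half-normal H1 is degenerate).
scales <- H1RANGE[H1RANGE > 0]
bfs <- sapply(scales, function(s) Bf_sketch(0.20, 0.26, sdtheory = s, tail = 1))
range(scales[bfs < 3 & bfs > 1/3])   # region of inconclusive evidence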
preemption_2 = subset(judgments_unattested_novel_preemption.df, scene_test2 == "construction2")   
round(tapply(preemption_2$response, preemption_2$restricted_verb_noun, mean),3)
##   yes    no 
## 2.163 2.663
#Center variables of interest using the lizCenter function:
d_unattested_novel_con2 = lizCenter(preemption_2, list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_con2 <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, d_unattested_novel_con2, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(judgments_preemption_con2, WAIC=T)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: response ~ (1 + restricted_verb_noun.ct | participant_private_id) + restricted_verb_noun.ct 
##    Data: d_unattested_novel_con2 (Number of observations: 320) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 40) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.60      0.08     0.47     0.79
## sd(restricted_verb_noun.ct)                1.13      0.15     0.87     1.47
## cor(Intercept,restricted_verb_noun.ct)     0.31      0.16    -0.03     0.61
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     3230     5192
## sd(restricted_verb_noun.ct)            1.00     3762     6484
## cor(Intercept,restricted_verb_noun.ct) 1.00     2775     4966
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   2.41      0.10     2.20     2.60 1.00     2551
## restricted_verb_noun.ct     0.48      0.19     0.11     0.86 1.00     3035
##                         Tail_ESS
## Intercept                   4644
## restricted_verb_noun.ct     4698
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.64      0.03     0.58     0.70 1.00    10785     9340
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(judgments_preemption_con2, variable = "^b_", regex = TRUE)
dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(judgments_preemption_con2))
C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C1
## [1] 0.007416667
# use the unattested vs. novel difference in the original study as an estimate of the unattested vs. novel difference here
Bf(0.19, 0.48, uniform = 0, meanoftheory = 0, sdtheory = 0.65, tail = 1)
## $LikelihoodTheory
## [1] 0.9093003
## 
## $Likelihoodnull
## [1] 0.08635029
## 
## $BayesFactor
## [1] 10.53037
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.19, 0.48, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)


# find values of sdtheory for which BF > 3 (evidence for H1)
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.1
print(high_threshold)
## [1] 3.01

Question 3: Does statistical entrenchment constrain verb argument construction generalizations in adults (judgment data)?

#first, filter out semantically incorrect trials
entrenchment_judgments_unattested_novel.df <- subset(exp2_entrenchment_judgment.df, semantically_correct == "1")   

#we only want to keep novel

entrenchment_judgments_novel.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

#and restricted items

entrenchment_judgments_unattested_constr1.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
entrenchment_judgments_unattested_constr2.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

entrenchment_judgments_unattested_novel.df <- rbind(entrenchment_judgments_novel.df, entrenchment_judgments_unattested_constr1.df, entrenchment_judgments_unattested_constr2.df)
entrenchment_judgments_unattested_novel.df$restricted_verb_noun <- factor(entrenchment_judgments_unattested_novel.df$restricted_verb_noun, levels = c("yes", "no"))


round(tapply(entrenchment_judgments_unattested_novel.df$response, entrenchment_judgments_unattested_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 4.356 3.944
#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate Est.Error       Q2.5
## b_Intercept                               4.15973393 0.1248652  3.9108998
## b_restricted_verb_noun.ct                -0.39966487 0.1744838 -0.7392334
## b_scene_test2.ct                         -0.03534368 0.1162370 -0.2605339
## b_restricted_verb_noun.ct:scene_test2.ct  0.09518764 0.2134859 -0.3302396
##                                                Q97.5
## b_Intercept                               4.40790179
## b_restricted_verb_noun.ct                -0.05662211
## b_scene_test2.ct                          0.19131322
## b_restricted_verb_noun.ct:scene_test2.ct  0.50168825
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.9879167
## 3         0.3778333
## 4         0.3206667
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df , list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct"))
##                             Estimate Est.Error      Q2.5       Q97.5
## b_Intercept                4.1539162 0.1252855  3.905502  4.39834704
## b_restricted_verb_noun.ct -0.4010873 0.1728762 -0.742063 -0.06216213
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

# the effect is in the opposite direction to that predicted by entrenchment

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   0.00000
## 2   0.98925
# drawing a maximum from the attested vs. unattested difference in this experiment (which gave significant evidence for entrenchment)
Bf(0.17, -0.40, uniform = 0, meanoftheory = 0, sdtheory = 0.38/2, tail = 1)
## $LikelihoodTheory
## [1] 0.03663446
## 
## $Likelihoodnull
## [1] 0.1473201
## 
## $BayesFactor
## [1] 0.2486726
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.17, -0.40, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values of sdtheory for which BF < 1/3 (evidence for H0)
ev_for_h0 <- subset(data.frame(range_test), BF < 1/3)
low_threshold <- min(ev_for_h0$sdtheory)
high_threshold <- max(ev_for_h0$sdtheory)
print(low_threshold)
## [1] 0
print(high_threshold)
## [1] 4

Question 4: Is the effect of statistical pre-emption larger than entrenchment (judgment data)?

#first, filter out semantically incorrect trials
all_judgment_unattested_novel.df <- subset(exp2_judgment_data.df, semantically_correct == "1")   

#we only want to keep novel

all_judgment_novel.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "novel")   

#and restricted items

all_judgment_unattested_constr1.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
all_judgment_unattested_constr2.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

all_judgment_unattested_novel.df <- rbind(all_judgment_novel.df, all_judgment_unattested_constr1.df, all_judgment_unattested_constr2.df)
all_judgment_unattested_novel.df$restricted_verb_noun <- factor(all_judgment_unattested_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(all_judgment_unattested_novel.df$response, list(all_judgment_unattested_novel.df$restricted_verb_noun, all_judgment_unattested_novel.df$condition), mean),3)
##     entrenchment preemption
## yes        4.356      2.259
## no         3.944      2.644
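As a quick arithmetic check (ours, not part of the preregistered analysis), the interaction implied by the raw cell means should be close to the model's restricted_verb_noun.ct:condition.ct estimate below, up to the sign of the centred codings:

# (no - yes) difference in preemption minus (no - yes) difference in entrenchment
(2.644 - 2.259) - (3.944 - 4.356)  # ~0.80, cf. the ~0.76 interaction estimate below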
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct *scene_test2.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_condition.ct", "b_scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct", "b_restricted_verb_noun.ct:scene_test2.ct","b_condition.ct:scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct:scene_test2.ct" ))
##                                                          Estimate  Est.Error
## b_Intercept                                            3.01881836 0.07888499
## b_restricted_verb_noun.ct                              0.10835512 0.12869066
## b_condition.ct                                        -1.66901674 0.14754120
## b_scene_test2.ct                                      -0.06405588 0.07553926
## b_restricted_verb_noun.ct:condition.ct                 0.75782967 0.24962351
## b_restricted_verb_noun.ct:scene_test2.ct               0.18524868 0.10483271
## b_condition.ct:scene_test2.ct                         -0.03795273 0.14700148
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct  0.12479257 0.21325009
##                                                             Q2.5       Q97.5
## b_Intercept                                            2.8663346  3.17684687
## b_restricted_verb_noun.ct                             -0.1487069  0.36421494
## b_condition.ct                                        -1.9527179 -1.37676092
## b_scene_test2.ct                                      -0.2116889  0.08375014
## b_restricted_verb_noun.ct:condition.ct                 0.2649321  1.24583743
## b_restricted_verb_noun.ct:scene_test2.ct              -0.0163897  0.38795779
## b_condition.ct:scene_test2.ct                         -0.3288720  0.24954795
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.2962083  0.54802580
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0) 
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0)
C5=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0) 
C6=mean(samps[,"b_condition.ct:scene_test2.ct"] > 0)
C7=mean(samps[,"b_restricted_verb_noun.ct:condition.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7)
## 1                   0.195250000
## 2                   0.000000000
## 3                   0.191250000
## 4                   0.001166667
## 5                   0.036583333
## 6                   0.402500000
## 7                   0.280666667
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun", "condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_restricted_verb_noun.ct", "b_condition.ct", "b_restricted_verb_noun.ct:condition.ct"))
##                                          Estimate Est.Error       Q2.5
## b_restricted_verb_noun.ct               0.1155508 0.1293006 -0.1383229
## b_condition.ct                         -1.6670286 0.1513004 -1.9618560
## b_restricted_verb_noun.ct:condition.ct  0.7652934 0.2486626  0.2847811
##                                             Q97.5
## b_restricted_verb_noun.ct               0.3737962
## b_condition.ct                         -1.3631202
## b_restricted_verb_noun.ct:condition.ct  1.2550545
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] > 0) 
C3=mean(samps[,"b_condition.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1     0.8195833
## 2     0.9990833
## 3     0.0000000
# the roughly predicted effect size from the previous study was 1.0; use it as an estimate of the effect expected here
Bf(0.24, 0.78, uniform = 0, meanoftheory = 0, sdtheory = 1.00, tail = 1)
## $LikelihoodTheory
## [1] 0.5814428
## 
## $Likelihoodnull
## [1] 0.008454367
## 
## $BayesFactor
## [1] 68.77426
H1RANGE = seq(0,4,by=0.01) # upper bound of 4 = [5-1]-[0]: the maximum effect of preemption minus no effect of entrenchment
range_test <- Bf_range(0.24, 0.78, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values of sdtheory for which BF > 3 (evidence for H1)
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.09
print(high_threshold)
## [1] 4

Exploratory data analyses

Effect of statistical pre-emption: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

# Figure 25
judgments_unattested_attested.df <- subset(exp2_judgment_data.df, semantically_correct == "1")   
judgments_unattested_attested.df <- subset(judgments_unattested_attested.df, restricted_verb_noun == "yes")   


aggregated.means1 = aggregate(response ~ condition + attested_unattested + participant_private_id, judgments_unattested_attested.df , FUN=mean)
aggregated.means1<- rename(aggregated.means1, attested = attested_unattested)

aggregated.means1$attested<- recode(aggregated.means1$attested, "1" = "yes","0" = "no")


yarrr::pirateplot(formula = response ~  attested  + condition,
                  data = aggregated.means1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

# analyses
attested_vs_unattested = subset(exp2_preemption_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested$response, attested_vs_unattested$attested_unattested, mean),3)
##     0     1 
## 2.259 4.881
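Quick check (ours): the raw attested-minus-unattested difference should approximate the centred slope estimated below.

4.881 - 2.259  # ~2.62, cf. the b_attested_unattested.ct estimates of ~2.58-2.59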
# model with tested construction
#Center variables of interest using the lizCenter function:
d0_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d0_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_preemption1, variable = c("b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                           Estimate  Est.Error       Q2.5
## b_scene_test2.ct                        -0.1337024 0.06260864 -0.2563912
## b_attested_unattested.ct                 2.5833375 0.11872216  2.3466720
## b_attested_unattested.ct:scene_test2.ct  0.1175276 0.12277555 -0.1273794
##                                                Q97.5
## b_scene_test2.ct                        -0.008456757
## b_attested_unattested.ct                 2.810375854
## b_attested_unattested.ct:scene_test2.ct  0.357050899
samps = as.matrix(as.mcmc(attested_unattested_preemption1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1    0.01766667
## 2    0.00000000
## 3    0.16800000
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(attested_unattested_preemption, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                          Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept              3.581813 0.05437162 3.475852 3.687969
## b_attested_unattested.ct 2.587041 0.11688509 2.354051 2.814774
mcmc_plot(attested_unattested_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# prior drawn from Experiment 1
Bf(0.12, 2.59, uniform = 0, meanoftheory = 0, sdtheory = 2.55, tail = 1)
## $LikelihoodTheory
## [1] 0.1868105
## 
## $Likelihoodnull
## [1] 2.321655e-101
## 
## $BayesFactor
## [1] 8.046437e+99
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.12, 2.59, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values of sdtheory for which BF > 3 (evidence for H1)
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.01
print(high_threshold)
## [1] 4

Effect of statistical entrenchment: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

attested_vs_unattested_ent = subset(exp2_entrenchment_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_ent$response, attested_vs_unattested_ent$attested_unattested, mean),3)
##     0     1 
## 4.356 4.925
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent1 = lizCenter(attested_vs_unattested_ent, list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d_attested_unattested_ent1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment1, variable = c("b_Intercept","b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                            Estimate  Est.Error       Q2.5
## b_Intercept                              4.64388926 0.07554307  4.4971730
## b_scene_test2.ct                        -0.02884403 0.08877478 -0.2024878
## b_attested_unattested.ct                 0.55938530 0.13882717  0.2803921
## b_attested_unattested.ct:scene_test2.ct  0.10877906 0.18546947 -0.2567433
##                                             Q97.5
## b_Intercept                             4.7944757
## b_scene_test2.ct                        0.1500898
## b_attested_unattested.ct                0.8259997
## b_attested_unattested.ct:scene_test2.ct 0.4724480
mcmc_plot(attested_unattested_entrenchment1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1       0.36900
## 2       0.00000
## 3       0.27875
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent = lizCenter(attested_vs_unattested_ent, list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested_ent, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                           Estimate  Est.Error      Q2.5    Q97.5
## b_Intercept              4.6448417 0.07498229 4.5003573 4.791180
## b_attested_unattested.ct 0.5592418 0.13822512 0.2867082 0.831733
mcmc_plot(attested_unattested_entrenchment, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# expect a difference of 0.38 from previous work
Bf(0.14, 0.56, uniform = 0, meanoftheory = 0, sdtheory = 0.38, tail = 1)
## $LikelihoodTheory
## [1] 0.7572747
## 
## $Likelihoodnull
## [1] 0.0009559302
## 
## $BayesFactor
## [1] 792.1862
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.14, 0.56, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values of sdtheory for which BF > 3 (evidence for H1)
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.04
print(high_threshold)
## [1] 4

Entrenchment vs. preemption: ratings for witnessed vs. unwitnessed forms

attested_vs_unattested_across = subset(exp2_judgment_data.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_across$response, list(attested_vs_unattested_across$condition, attested_vs_unattested_across$attested_unattested), mean),3)
##                  0     1
## entrenchment 4.356 4.925
## preemption   2.259 4.881
# model with test construction
#Center variables of interest using the lizCenter function:

d0_attested_unattested_all = lizCenter(attested_vs_unattested_across , list("attested_unattested","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_all <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct*condition.ct, data=d0_attested_unattested_all, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_all, variable = c("b_Intercept","b_attested_unattested.ct", "b_scene_test2.ct", "b_condition.ct", "b_attested_unattested.ct:scene_test2.ct", "b_attested_unattested.ct:condition.ct", "b_scene_test2.ct:condition.ct", "b_attested_unattested.ct:scene_test2.ct:condition.ct"))
##                                                          Estimate  Est.Error
## b_Intercept                                           3.939971396 0.04660329
## b_attested_unattested.ct                              1.907495984 0.09209161
## b_scene_test2.ct                                     -0.099359617 0.05239362
## b_condition.ct                                       -1.033460787 0.09103844
## b_attested_unattested.ct:scene_test2.ct               0.116524683 0.10540988
## b_attested_unattested.ct:condition.ct                 1.965400996 0.18060166
## b_scene_test2.ct:condition.ct                        -0.100171305 0.10522381
## b_attested_unattested.ct:scene_test2.ct:condition.ct  0.006789355 0.21078493
##                                                             Q2.5        Q97.5
## b_Intercept                                           3.84854057  4.031774054
## b_attested_unattested.ct                              1.72606349  2.085926845
## b_scene_test2.ct                                     -0.20234800  0.003929018
## b_condition.ct                                       -1.21103983 -0.855307857
## b_attested_unattested.ct:scene_test2.ct              -0.08848119  0.324011486
## b_attested_unattested.ct:condition.ct                 1.61016450  2.324585561
## b_scene_test2.ct:condition.ct                        -0.30516778  0.105181090
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.40942970  0.416542396
samps = as.matrix(as.mcmc(attested_unattested_all))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_condition.ct"] > 0)
C5=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)
C6=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0)
C7=mean(samps[,"b_scene_test2.ct:condition.ct"] > 0)
C8=mean(samps[,"b_attested_unattested.ct:scene_test2.ct:condition.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7,C8))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7, C8)
## 1                        0.00000000
## 2                        0.00000000
## 3                        0.02941667
## 4                        0.00000000
## 5                        0.13291667
## 6                        0.00000000
## 7                        0.16675000
## 8                        0.48950000
# the roughly predicted effect size from the previous study was 2.11

Bf(0.18, 1.97, uniform = 0, meanoftheory = 0, sdtheory = 2.11, tail = 1)
## $LikelihoodTheory
## [1] 0.2444349
## 
## $Likelihoodnull
## [1] 2.165476e-26
## 
## $BayesFactor
## [1] 1.128781e+25
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.18, 1.97, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values of sdtheory for which BF > 3 (evidence for H1)
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.02
print(high_threshold)
## [1] 4

Production data: Effect of statistical pre-emption

# Are participants producing more attested than unattested dets? We now compare the proportion of attested dets for the restricted verbs (the model intercept) against chance

production_preemption_attested_unattested.df <- subset(exp2_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, restricted_verb_noun =="yes")

round(tapply(production_preemption_attested_unattested.df$attested_unattested, production_preemption_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.997         0.990            NA
production_preemption_attested_unattested.df$verb_noun_type_training2 <- factor(production_preemption_attested_unattested.df$verb_noun_type_training2)

df_prod = lizCenter(production_preemption_attested_unattested.df , list("verb_noun_type_training2"))  

# maximally vague priors for the predictors and the intercept
prod_attested_unattested = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                 Estimate Est.Error      Q2.5     Q97.5
## b_Intercept                    4.5858113 0.3989542  3.874601 5.4370211
## b_verb_noun_type_training2.ct -0.2925743 0.6054233 -1.475858 0.9275525
mcmc_plot(prod_attested_unattested, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.3088333
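The model is on the logit scale, so an intercept of 0 corresponds to chance (50% attested). As a sanity check (ours), the posterior mean intercept maps back onto a proportion in line with the raw means above:

plogis(4.586)  # inverse logit of the intercept: ~0.99, matching the raw attested proportions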
# same analysis without verb_noun_type_training2

# maximally vague priors for the intercept
prod_attested_unattested_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 4.442832 0.3754269 3.781418 5.260213
mcmc_plot(prod_attested_unattested_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_final))
C1=mean(samps[,"b_Intercept"] < 0)


# We will now compare unattested productions for restricted vs. novel verbs
# Do participants produce the unwitnessed form less for the restricted verbs than for the novel verb?


production_preemption_restricted_novel.df <- subset(exp2_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_novel.df<- subset(production_preemption_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- recode(production_preemption_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
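# the recode above reverses the coding so that 1 = production of the unwitnessed (unattested) form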

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.003         0.010         0.522
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.007 0.522
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun)
production_preemption_restricted_novel1.df = lizCenter(production_preemption_restricted_novel.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_novel_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error      Q2.5     Q97.5
## b_Intercept               -3.059238 0.2614348 -3.596574 -2.568375
## b_restricted_verb_noun.ct  3.887547 0.5343414  2.797347  4.912210
mcmc_plot(prod_unattested_novel_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# We will now compare unattested productions for restricted vs. alternating verbs

production_preemption_restricted_alt.df <- subset(exp2_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_alt.df<- subset(production_preemption_restricted_alt.df, verb_noun_type_training2 != "novel")

# all forms are unwitnessed for the alternating verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- recode(production_preemption_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)
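# the recode above reverses the coding so that 1 = production of the unwitnessed (unattested) form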


round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.484         0.003         0.010            NA
production_preemption_restricted_alt.df$restricted_verb_noun <- factor(production_preemption_restricted_alt.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.007 0.484
production_preemption_restricted_alt1.df = lizCenter(production_preemption_restricted_alt.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_alt_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_preemption_restricted_alt1.df (Number of observations: 920) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 39) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.46      0.25     0.03     0.99
## sd(restricted_verb_noun.ct)                0.54      0.36     0.03     1.33
## cor(Intercept,restricted_verb_noun.ct)     0.06      0.55    -0.91     0.95
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     2627     3417
## sd(restricted_verb_noun.ct)            1.00     2436     5599
## cor(Intercept,restricted_verb_noun.ct) 1.00     6126     6322
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                  -2.86      0.23    -3.32    -2.44 1.00    13084
## restricted_verb_noun.ct     4.12      0.35     3.44     4.83 1.00    10639
##                         Tail_ESS
## Intercept                   9293
## restricted_verb_noun.ct     8914
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_final, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(prod_unattested_alt_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Production data: Effect of statistical entrenchment

# a. Are participants producing more attested than unattested dets?
# Here we want to see how often participants produce the unattested det, e.g. the transitive-only det1 with a det2 (intransitive-only) verb in the intransitive condition at test,
# and vice versa

production_entrenchment_attested_unattested.df  <- subset(exp2_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_attested_unattested.df  <- subset(production_entrenchment_attested_unattested.df, restricted_verb_noun =="yes")

#We want to compare attested vs. unattested trials for transitive verbs in the intransitive inchoative construction at test
production_entrenchment_attested_unattested1.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

#And intransitive inchoative verbs in the transitive construction at test. Filter out irrelevant trials
production_entrenchment_attested_unattested2.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_attested_unattested.df <- rbind(production_entrenchment_attested_unattested1.df, production_entrenchment_attested_unattested2.df)

#How much of the time are participants producing attested items?
round(mean(production_entrenchment_attested_unattested.df$attested_unattested),3)
## [1] 0.123
# and separately for each verb type
round(tapply(production_entrenchment_attested_unattested.df$attested_unattested, production_entrenchment_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.138         0.107            NA
production_entrenchment_attested_unattested.df$verb_noun_type_training2 <- factor(production_entrenchment_attested_unattested.df$verb_noun_type_training2)
df_prod_ent = lizCenter(production_entrenchment_attested_unattested.df, list("verb_noun_type_training2"))  


# maximally vague priors for the predictors and the intercept
prod_attested_unattested_ent = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_ent, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                  Estimate Est.Error      Q2.5     Q97.5
## b_Intercept                   -2.74740959 0.4366748 -3.653896 -1.943732
## b_verb_noun_type_training2.ct -0.06057054 0.5242888 -1.045509  1.013271
mcmc_plot(prod_attested_unattested_ent, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 1.0000000
## 2 0.4420833
# same analysis without verb_noun_type_training2


# maximally vague priors for the intercept
prod_attested_unattested_ent_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

summary(prod_attested_unattested_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ 1 + (1 | participant_private_id) 
##    Data: df_prod_ent (Number of observations: 318) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 40) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     2.18      0.53     1.33     3.41 1.00     3981     5810
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -2.67      0.41    -3.53    -1.90 1.00     7307     7989
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(prod_attested_unattested_ent_final, variable = c("b_Intercept"))
##             Estimate Est.Error      Q2.5     Q97.5
## b_Intercept  -2.6704 0.4139377 -3.525481 -1.899866
mcmc_plot(prod_attested_unattested_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 1
# c. we will now compare unattested productions for restricted vs. novel verbs
# Do participants produce the unwitnessed form less for the two non-alternating verbs than for the novel verb? (For the novel verb, the "unwitnessed" form has to be set arbitrarily.)


production_entrenchment_restricted_novel.df <- subset(exp2_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_novel.df<- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_novel.df$attested_unattested)
production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_novel.df$attested_unattested)

# select trials featuring the novel verb in the intransitive inchoative construction
production_entrenchment_restricted_novel1.df <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "novel"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_novel2.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_novel3.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_novel.df <- rbind(production_entrenchment_restricted_novel1.df, production_entrenchment_restricted_novel2.df, production_entrenchment_restricted_novel3.df)


round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.138         0.107         0.051
# reverse coding to focus on unattested rather than attested productions (novel vs. restricted)
production_entrenchment_restricted_novel.df$attested_unattested<- recode(production_entrenchment_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.877 0.949
# i.e., participants produce *unattested forms* less often for the restricted verbs than for the novel verb

production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun)
production_entrenchment_restricted_novel1.df = lizCenter(production_entrenchment_restricted_novel.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_novel_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_ent_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                           Estimate Est.Error     Q2.5    Q97.5
## b_Intercept               3.072808 0.4185109 2.279330 3.943038
## b_restricted_verb_noun.ct 1.012210 0.5076480 0.036819 2.039624
mcmc_plot(prod_unattested_novel_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##    c(C1, C2)
## 1 0.00000000
## 2 0.02166667
# d. we will now compare unattested productions for restricted vs. alternating verbs
# Do participants produce the unwitnessed form less for the two non-alternating verbs than for the alternating verb? (For the alternating verb, the "unwitnessed" form has to be set arbitrarily.)


production_entrenchment_restricted_alt.df <- subset(exp2_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_alt.df<- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 != "novel")

# all forms are unwitnessed for the alternating verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_alt.df$attested_unattested)
production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_alt.df$attested_unattested)

# select trials featuring the alternating verb in the intransitive inchoative construction
production_entrenchment_restricted_alt1.df <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "alternating"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_alt2.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_alt3.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_alt.df <- rbind(production_entrenchment_restricted_alt1.df, production_entrenchment_restricted_alt2.df, production_entrenchment_restricted_alt3.df)


round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.013         0.138         0.107            NA
# reverse coding to focus on unattested rather than attested productions (alternating vs. restricted)
production_entrenchment_restricted_alt.df$attested_unattested<- recode(production_entrenchment_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.877 0.988
# i.e., participants produce *unattested forms* less often for the restricted verbs than for the alternating verb

production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun)
production_entrenchment_restricted_alt1.df= lizCenter(production_entrenchment_restricted_alt.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_alt_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_entrenchment_restricted_alt1.df (Number of observations: 478) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 40) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              1.40      0.41     0.69     2.33
## sd(restricted_verb_noun.ct)                2.40      0.83     0.88     4.18
## cor(Intercept,restricted_verb_noun.ct)    -0.79      0.22    -0.99    -0.21
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     4913     6368
## sd(restricted_verb_noun.ct)            1.00     3927     2515
## cor(Intercept,restricted_verb_noun.ct) 1.00     4404     6407
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   3.34      0.38     2.63     4.12 1.00     8180
## restricted_verb_noun.ct     1.18      0.63    -0.05     2.45 1.00     9613
##                         Tail_ESS
## Intercept                   8507
## restricted_verb_noun.ct     8596
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_ent_final, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(prod_unattested_alt_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1      0.00
## 2      0.03

Experiment 3

Preregistered data analyses

Question 1: Have participants picked up on the difference in meaning between the two argument-structure constructions?

Production data

#Figure 9
RQ1_graph_productions.df = subset(exp3_entrenchment_production.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))
RQ1_graph_productions.df = subset(RQ1_graph_productions.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.graph1 = aggregate(semantically_correct ~ verb_noun_type_training2 + participant_private_id, RQ1_graph_productions.df, FUN=mean)

aggregated.graph1 <- rename(aggregated.graph1, verb = verb_noun_type_training2,
                            correct = semantically_correct)

yarrr::pirateplot(formula = correct  ~ verb,
                  data = aggregated.graph1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "% semantically correct",
                  cex.lab = 1,
                  cex.axis = 1,
                  cex.names = 1,
                  yaxt = "n")

axis(2, at = seq(0, 1, by = 0.25), las=1)
abline(h = 0.50, lty = 2)

#1 alternating verb production

alternating_prod.df = subset(exp3_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

#and filter out responses where participants said something other than det1 or det2
alternating_prod.df = subset(alternating_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_alternating_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, alternating_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_alternating_prod.df$semantically_correct),3)
## [1] 0.972
# average accuracy separately for causative and inchoative scenes
round(tapply(aggregated.means_alternating_prod.df$semantically_correct, aggregated.means_alternating_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.988         0.956
# maximally vague priors for the intercept and the predictors
a = lizCenter(alternating_prod.df, list("scene_test2"))  
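# lizCenter() comes from lizCenter.R (sourced above); the .ct predictors it
# appends are assumed to be centered contrasts. A hypothetical sketch of that
# operation for a two-level factor (not the actual implementation):
center_contrast <- function(x) {
  x <- as.numeric(as.factor(x)) - 1  # code the two levels as 0/1
  x - mean(x, na.rm = TRUE)          # centre so the intercept is the grand mean
}
# e.g. a$scene_test2.ct should behave like center_contrast(alternating_prod.df$scene_test2)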

alternating_model <-brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=a, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
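# to check exactly which priors brms applied (including the defaults used for
# the sd and cor parameters), inspect:
# prior_summary(alternating_model)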

posterior_summary(alternating_model, variable = c("b_Intercept", "b_scene_test2.ct" ))
##                    Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       3.4877233 0.3374883  2.885446 4.1929452
## b_scene_test2.ct -0.6630145 0.5607684 -1.773739 0.4247673
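# sanity check on the scale: the intercept is in log-odds, so plogis(3.49)
# is roughly 0.97, close to the raw mean accuracy of 0.972 reported above
# (small differences reflect shrinkage from the prior and the random effects)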
mcmc_plot(alternating_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] < 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   0.00000
## 2   0.88275
# no difference between construction 1 and construction 2

# Final model
# maximally vague priors for the intercept 
alternating_model_final = brm(formula = semantically_correct~1 + (1|participant_private_id), data=a, family = bernoulli(link = logit),set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept  3.35454 0.3168686 2.785174 4.031713
mcmc_plot(alternating_model_final, variable = "b_Intercept", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0
#2 novel verb production

novel_prod.df = subset(exp3_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

#filter out responses where participants said something other than det1 or det2
novel_prod.df = subset(novel_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_novel_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, novel_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_novel_prod.df$semantically_correct),3)
## [1] 0.959
# average accuracy separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_prod.df$semantically_correct, aggregated.means_novel_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.969         0.950
b = lizCenter(novel_prod.df, list("scene_test2"))  

# maximally vague priors for the intercept and the predictors
novel_model <- brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=b, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model, variable = c("b_Intercept", "b_scene_test2.ct"))
##                    Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       3.4359630 0.3878008  2.750641 4.2567796
## b_scene_test2.ct -0.5991281 0.5815606 -1.791968 0.4926243
mcmc_plot(novel_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   0.00000
## 2   0.14525
# no difference between construction 1 and construction 2  
# Final model

# maximally vague priors for the intercept 
novel_model_final <- brm(formula = semantically_correct~1+ (1|participant_private_id), data=b, family = bernoulli(link = logit), set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 3.273022 0.3686258 2.625439 4.064741
mcmc_plot(novel_model_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0

Judgment data

#Figure 10
RQ1_graph_judgments.df = subset(exp3_entrenchment_judgment.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))

# aggregated dataframe for means
aggregated.graph2 = aggregate(response ~ verb_noun_type_training2 + semantically_correct + participant_private_id, RQ1_graph_judgments.df, FUN=mean)
aggregated.graph2$semantically_correct <- recode(aggregated.graph2$semantically_correct, "1" = "yes","0" = "no")

aggregated.graph2 <- rename(aggregated.graph2, verb = verb_noun_type_training2,
                                           correct = semantically_correct)

yarrr::pirateplot(formula = response ~ correct + verb,
                  data = aggregated.graph2,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

#1 alternating verb judgments

alternating_judgments.df = subset(exp3_entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

# aggregated dataframe for means
aggregated.means_alternating_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, alternating_judgments.df, FUN=mean)
aggregated.means_alternating_judgments$semantically_correct<- recode(aggregated.means_alternating_judgments$semantically_correct, "1" = "yes","0" = "no")
aggregated.means_alternating_judgments$scene_test2 <- recode(aggregated.means_alternating_judgments$scene_test2, "construction1" = "transitive causative","construction2" = "intransitive inchoative")

# average rating for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_alternating_judgments$response, aggregated.means_alternating_judgments$semantically_correct, mean),3)
##    no   yes 
## 2.350 4.862
# average rating separately for causative and noncausative scenes
round(tapply(aggregated.means_alternating_judgments$response, list(aggregated.means_alternating_judgments$semantically_correct, aggregated.means_alternating_judgments$scene_test2), mean),3)
##     transitive causative intransitive inchoative
## no                 2.288                   2.413
## yes                4.888                   4.838
c = lizCenter(alternating_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=c, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(alternating_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               3.62266976 0.06592462  3.4919882
## b_scene_test2.ct                          0.03798969 0.07378541 -0.1053799
## b_semantically_correct.ct                 2.46480734 0.13201640  2.1996277
## b_scene_test2.ct:semantically_correct.ct -0.17314351 0.13841478 -0.4451935
##                                              Q97.5
## b_Intercept                              3.7504905
## b_scene_test2.ct                         0.1837982
## b_semantically_correct.ct                2.7231327
## b_scene_test2.ct:semantically_correct.ct 0.1001944
mcmc_plot(alternating_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] < 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.2984167
## 3         0.0000000
## 4         0.1029167
# no difference between construction 1 and construction 2
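# the same directional posterior probabilities are also available via brms's
# built-in hypothesis() interface (see its Post.Prob column), e.g.:
# hypothesis(alternating_model_judgments, "scene_test2.ct < 0")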

# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=c, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept               3.622670 0.06592462 3.491988 3.750491
## b_semantically_correct.ct 2.464807 0.13201640 2.199628 2.723133
mcmc_plot(alternating_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
#2 novel verb judgments

novel_judgments.df = subset(exp3_entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

# aggregated dataframe for means
aggregated.means_novel_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, novel_judgments.df, FUN=mean)


# average rating for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_novel_judgments$response, aggregated.means_novel_judgments$semantically_correct, mean),3)
##     0     1 
## 2.169 3.825
# average rating separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_judgments$response, list(aggregated.means_novel_judgments$semantically_correct, aggregated.means_novel_judgments$scene_test2), mean),3)
##   construction1 construction2
## 0         2.112         2.225
## 1         3.775         3.875
d = lizCenter(novel_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                            Estimate  Est.Error        Q2.5
## b_Intercept                               2.9752525 0.13724038  2.69912239
## b_scene_test2.ct                          0.1058882 0.07563988 -0.04268074
## b_semantically_correct.ct                 1.5876016 0.19872071  1.19876952
## b_scene_test2.ct:semantically_correct.ct -0.0046011 0.16360039 -0.33124959
##                                              Q97.5
## b_Intercept                              3.2390980
## b_scene_test2.ct                         0.2552098
## b_semantically_correct.ct                1.9730426
## b_scene_test2.ct:semantically_correct.ct 0.3235240
mcmc_plot(novel_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.9200833
## 3         0.0000000
## 4         0.5126667
# no difference between construction 1 and construction 2
# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate Est.Error     Q2.5    Q97.5
## b_Intercept               2.975253 0.1372404 2.699122 3.239098
## b_semantically_correct.ct 1.587602 0.1987207 1.198770 1.973043
mcmc_plot(novel_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Question 2: Does statistical pre-emption constrain verb argument construction generalizations in adults (judgment data)?

#Figure 11

# first, filter out semantically incorrect trials

judgments_unattested_novel.df <- subset(exp3_judgment_data.df, semantically_correct == "1")   

# we only want to keep the novel verb
judgments_novel.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and unattested trials for the restricted items
judgments_unattested_constr1.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
judgments_unattested_constr2.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

judgments_unattested_novel.df <- rbind(judgments_novel.df, judgments_unattested_constr1.df, judgments_unattested_constr2.df)
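# equivalently, in a single dplyr pipe (a sketch assumed to reproduce the
# subset/rbind steps above):
# judgments_unattested_novel.df <- exp3_judgment_data.df %>%
#   filter(semantically_correct == "1",
#          verb_noun_type_training2 == "novel" |
#            (verb_noun_type_training2 %in% c("construction1", "construction2") &
#               attested_unattested == "0"))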

aggregated.means = aggregate(response ~ condition + restricted_verb_noun + participant_private_id, judgments_unattested_novel.df, FUN=mean)
aggregated.means<- rename(aggregated.means, restricted = restricted_verb_noun)

yarrr::pirateplot(formula = response ~ restricted + condition,
                  data = aggregated.means,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

judgments_unattested_novel_preemption.df <- subset(judgments_unattested_novel.df, condition == "preemption")
judgments_unattested_novel_preemption.df$restricted_verb_noun <- factor(judgments_unattested_novel_preemption.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(judgments_unattested_novel_preemption.df$response, judgments_unattested_novel_preemption.df$restricted_verb_noun, mean),3)
##   yes    no 
## 2.397 2.767
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_preemption_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate  Est.Error        Q2.5
## b_Intercept                               2.58163971 0.06389569  2.45435149
## b_restricted_verb_noun.ct                 0.36319317 0.14626526  0.07318734
## b_scene_test2.ct                         -0.01005523 0.06443603 -0.13853533
## b_restricted_verb_noun.ct:scene_test2.ct  0.06078736 0.11792993 -0.16979773
##                                              Q97.5
## b_Intercept                              2.7066125
## b_restricted_verb_noun.ct                0.6483448
## b_scene_test2.ct                         0.1163818
## b_restricted_verb_noun.ct:scene_test2.ct 0.2934798
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1       0.000000000
## 2       0.007083333
## 3       0.440000000
## 4       0.302916667
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(judgments_preemption_model, WAIC=T)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: response ~ (1 + restricted_verb_noun.ct | participant_private_id) + restricted_verb_noun.ct 
##    Data: d_unattested_novel (Number of observations: 1168) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 73) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.50      0.05     0.41     0.60
## sd(restricted_verb_noun.ct)                1.18      0.11     0.98     1.42
## cor(Intercept,restricted_verb_noun.ct)    -0.10      0.13    -0.34     0.16
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     3125     5168
## sd(restricted_verb_noun.ct)            1.00     2541     4850
## cor(Intercept,restricted_verb_noun.ct) 1.00     1862     3179
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   2.58      0.06     2.46     2.70 1.00     2148
## restricted_verb_noun.ct     0.36      0.15     0.07     0.64 1.00     1801
##                         Tail_ESS
## Intercept                   4112
## restricted_verb_noun.ct     3606
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.78      0.02     0.74     0.81 1.00    14597     9171
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##     c(C1, C2)
## 1 0.000000000
## 2 0.008333333
# BF analyses: we use the difference between attested and unattested in Experiment 1 (SD = 0.65) as an estimate of the difference we expect when comparing ratings for unattested vs. novel constructions
Bf(0.14, 0.36, uniform = 0, meanoftheory = 0, sdtheory = 0.65, tail = 1)
## $LikelihoodTheory
## [1] 1.029992
## 
## $Likelihoodnull
## [1] 0.1044603
## 
## $BayesFactor
## [1] 9.860124
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.14, 0.36, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.07
print(high_threshold)
## [1] 2.5
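# For reference: Bf() and Bf_range() are sourced from BF.R and Bf_range.R and
# follow the Dienes-style Bayes factor calculator. A minimal sketch of the
# computation under those assumptions (half-normal H1 prior when tail = 1;
# the first argument is the standard error of the obtained effect, the second
# the obtained effect itself); this is not the actual implementation:
Bf_sketch <- function(se, obtained, meanoftheory = 0, sdtheory = 1, tail = 1) {
  theta <- seq(meanoftheory - 5 * sdtheory, meanoftheory + 5 * sdtheory,
               length.out = 2000)                              # grid over H1 effect sizes
  prior <- dnorm(theta, meanoftheory, sdtheory)
  if (tail == 1) prior[theta < meanoftheory] <- 0              # one-tailed prior
  prior <- prior / sum(prior)
  likelihoodtheory <- sum(dnorm(obtained, theta, se) * prior)  # marginal likelihood under H1
  likelihoodnull <- dnorm(obtained, 0, se)                     # likelihood under H0
  likelihoodtheory / likelihoodnull
}
# Bf_sketch(0.14, 0.36, sdtheory = 0.65, tail = 1) should approximate the
# BayesFactor above; Bf_range() is assumed to repeat this over sdtheoryrange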

Question 3: Does statistical entrenchment constrain verb argument construction generalizations in adults (judgment data)?

# first, filter out semantically incorrect trials
entrenchment_judgments_unattested_novel.df <- subset(exp3_entrenchment_judgment.df, semantically_correct == "1")   

# we only want to keep the novel verb

entrenchment_judgments_novel.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and unattested trials for the restricted items

entrenchment_judgments_unattested_constr1.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
entrenchment_judgments_unattested_constr2.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

entrenchment_judgments_unattested_novel.df <- rbind(entrenchment_judgments_novel.df, entrenchment_judgments_unattested_constr1.df, entrenchment_judgments_unattested_constr2.df)
entrenchment_judgments_unattested_novel.df$restricted_verb_noun <- factor(entrenchment_judgments_unattested_novel.df$restricted_verb_noun, levels = c("yes", "no"))


round(tapply(entrenchment_judgments_unattested_novel.df$response, entrenchment_judgments_unattested_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 4.425 3.825
#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate Est.Error        Q2.5
## b_Intercept                               4.13277600 0.1474779  3.83978739
## b_restricted_verb_noun.ct                -0.58428676 0.1890658 -0.95762558
## b_scene_test2.ct                         -0.07477894 0.1075586 -0.29057374
## b_restricted_verb_noun.ct:scene_test2.ct  0.33681916 0.1997174 -0.05531082
##                                               Q97.5
## b_Intercept                               4.4224546
## b_restricted_verb_noun.ct                -0.2167774
## b_scene_test2.ct                          0.1418832
## b_restricted_verb_noun.ct:scene_test2.ct  0.7301904
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.0010000
## 3         0.2375833
## 4         0.0467500
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df , list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct"))
##                            Estimate Est.Error       Q2.5      Q97.5
## b_Intercept                4.139934 0.1412607  3.8590629  4.4189730
## b_restricted_verb_noun.ct -0.573672 0.1876719 -0.9392259 -0.1980523
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1     0.000
## 2     0.999
# the effect is in the opposite direction


# drawing a maximum based on the difference between attested vs. unattested in this experiment (this was significant evidence for entrenchment)
Bf(0.19, -0.58, uniform = 0, meanoftheory = 0, sdtheory = 0.38/2, tail = 1)
## $LikelihoodTheory
## [1] 0.004503071
## 
## $Likelihoodnull
## [1] 0.01989102
## 
## $BayesFactor
## [1] 0.2263872
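# conventionally, BF > 3 counts as substantial evidence for H1 and BF < 1/3 as
# substantial evidence for the null; the BF of 0.23 here therefore favours the null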
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.19, -0.58, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF < 1/3
ev_for_h0 <- subset(data.frame(range_test), BF < 1/3)
low_threshold <- min(ev_for_h0$sdtheory)
high_threshold <- max(ev_for_h0$sdtheory)
print(low_threshold)
## [1] 0
print(high_threshold)
## [1] 4

Question 4: Is the effect of statistical pre-emption larger than entrenchment (judgment data)?

# first, filter out semantically incorrect trials
all_judgment_unattested_novel.df <- subset(exp3_judgment_data.df, semantically_correct == "1")   

# we only want to keep the novel verb

all_judgment_novel.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "novel")   

# and unattested trials for the restricted items

all_judgment_unattested_constr1.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
all_judgment_unattested_constr2.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

all_judgment_unattested_novel.df <- rbind(all_judgment_novel.df, all_judgment_unattested_constr1.df, all_judgment_unattested_constr2.df)
all_judgment_unattested_novel.df$restricted_verb_noun <- factor(all_judgment_unattested_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(all_judgment_unattested_novel.df$response, list(all_judgment_unattested_novel.df$restricted_verb_noun, all_judgment_unattested_novel.df$condition), mean),3)
##     entrenchment preemption
## yes        4.425      2.397
## no         3.825      2.767
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct *scene_test2.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_condition.ct", "b_scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct", "b_restricted_verb_noun.ct:scene_test2.ct","b_condition.ct:scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct:scene_test2.ct" ))
##                                                          Estimate  Est.Error
## b_Intercept                                            2.91412426 0.06567671
## b_restricted_verb_noun.ct                              0.15209873 0.11719267
## b_condition.ct                                        -1.52200902 0.13489936
## b_scene_test2.ct                                      -0.02412938 0.05589895
## b_restricted_verb_noun.ct:condition.ct                 0.92024713 0.23577817
## b_restricted_verb_noun.ct:scene_test2.ct               0.12305683 0.10300147
## b_condition.ct:scene_test2.ct                          0.06501503 0.11954890
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.27827102 0.22701246
##                                                              Q2.5       Q97.5
## b_Intercept                                            2.78800640  3.04350389
## b_restricted_verb_noun.ct                             -0.07677031  0.38313972
## b_condition.ct                                        -1.78332526 -1.25311094
## b_scene_test2.ct                                      -0.13399631  0.08689568
## b_restricted_verb_noun.ct:condition.ct                 0.45847571  1.37981005
## b_restricted_verb_noun.ct:scene_test2.ct              -0.07948379  0.32550028
## b_condition.ct:scene_test2.ct                         -0.16895439  0.29893148
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.72429751  0.17016414
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0) 
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0)
C5=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0) 
C6=mean(samps[,"b_condition.ct:scene_test2.ct"] < 0)
C7=mean(samps[,"b_restricted_verb_noun.ct:condition.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7)
## 1                  9.658333e-02
## 2                  0.000000e+00
## 3                  3.325833e-01
## 4                  8.333333e-05
## 5                  1.170833e-01
## 6                  2.920833e-01
## 7                  1.094167e-01
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun", "condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_restricted_verb_noun.ct", "b_condition.ct", "b_restricted_verb_noun.ct:condition.ct"))
##                                          Estimate Est.Error        Q2.5
## b_restricted_verb_noun.ct               0.1535984 0.1189095 -0.07934501
## b_condition.ct                         -1.5180610 0.1310689 -1.77349630
## b_restricted_verb_noun.ct:condition.ct  0.9237767 0.2405861  0.44968404
##                                             Q97.5
## b_restricted_verb_noun.ct               0.3889023
## b_condition.ct                         -1.2607409
## b_restricted_verb_noun.ct:condition.ct  1.3954802
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1    0.09716667
## 2    0.00000000
## 3    0.00000000
# roughly predicted effect size from the previous study was 1.0; use it as an estimate of the effect we expect here
Bf(0.23, 0.93, uniform = 0, meanoftheory = 0, sdtheory = 1.00, tail = 1)
## $LikelihoodTheory
## [1] 0.5156481
## 
## $Likelihoodnull
## [1] 0.0004885246
## 
## $BayesFactor
## [1] 1055.521
H1RANGE = seq(0,4,by=0.01) # [5-1] - [0]: max effect of preemption minus no effect of entrenchment
range_test <- Bf_range(0.23, 0.93, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.07
print(high_threshold)
## [1] 4

Exploratory data analyses

Effect of statistical pre-emption: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

# Figure 28
judgments_unattested_attested.df <- subset(exp3_judgment_data.df, semantically_correct == "1")   
judgments_unattested_attested.df <- subset(judgments_unattested_attested.df, restricted_verb_noun == "yes")   


aggregated.means1 = aggregate(response ~ condition + attested_unattested + participant_private_id, judgments_unattested_attested.df , FUN=mean)
aggregated.means1<- rename(aggregated.means1, attested = attested_unattested)

aggregated.means1$attested<- recode(aggregated.means1$attested, "1" = "yes","0" = "no")


yarrr::pirateplot(formula = response ~  attested  + condition,
                  data = aggregated.means1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

# analyses
attested_vs_unattested = subset(exp3_preemption_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested$response, attested_vs_unattested$attested_unattested, mean),3)
##     0     1 
## 2.397 4.899
# model with tested construction
#Center variables of interest using the lizCenter function:
d0_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d0_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_preemption1, variable = c("b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                            Estimate  Est.Error        Q2.5
## b_scene_test2.ct                        -0.01185713 0.04138818 -0.09417495
## b_attested_unattested.ct                 2.47644518 0.10410486  2.27441772
## b_attested_unattested.ct:scene_test2.ct  0.05732708 0.08138795 -0.10360512
##                                             Q97.5
## b_scene_test2.ct                        0.0684634
## b_attested_unattested.ct                2.6837753
## b_attested_unattested.ct:scene_test2.ct 0.2201504
samps = as.matrix(as.mcmc(attested_unattested_preemption1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1     0.3915833
## 2     0.0000000
## 3     0.2399167
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(attested_unattested_preemption, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                          Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept              3.658434 0.05216385 3.555996 3.760696
## b_attested_unattested.ct 2.477269 0.10113318 2.276884 2.675417
mcmc_plot(attested_unattested_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# this prior is drawn from Experiment 1
Bf(0.10, 2.48, uniform = 0, meanoftheory = 0, sdtheory = 2.55, tail = 1)
## $LikelihoodTheory
## [1] 0.1949811
## 
## $Likelihoodnull
## [1] 1.113451e-133
## 
## $BayesFactor
## [1] 1.751143e+132
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.10, 2.48, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.01
print(high_threshold)
## [1] 4

Effect of statistical entrenchment: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

attested_vs_unattested_ent = subset(exp3_entrenchment_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_ent$response, attested_vs_unattested_ent$attested_unattested, mean),3)
##     0     1 
## 4.425 4.950
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent1 = lizCenter(attested_vs_unattested_ent, list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d_attested_unattested_ent1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment1, variable = c("b_Intercept","b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                           Estimate  Est.Error       Q2.5
## b_Intercept                              4.6941470 0.06553314  4.5634605
## b_scene_test2.ct                        -0.1066784 0.09243031 -0.2865172
## b_attested_unattested.ct                 0.5117860 0.12791476  0.2611199
## b_attested_unattested.ct:scene_test2.ct  0.2560031 0.18890377 -0.1197017
##                                              Q97.5
## b_Intercept                             4.82505877
## b_scene_test2.ct                        0.07570406
## b_attested_unattested.ct                0.76000147
## b_attested_unattested.ct:scene_test2.ct 0.62596998
mcmc_plot(attested_unattested_entrenchment1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1  1.234167e-01
## 2  8.333333e-05
## 3  8.650000e-02
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent = lizCenter(attested_vs_unattested_ent, list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested_ent, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                           Estimate  Est.Error      Q2.5     Q97.5
## b_Intercept              4.6921358 0.06774502 4.5571803 4.8256405
## b_attested_unattested.ct 0.5123466 0.13340971 0.2434933 0.7714918
mcmc_plot(attested_unattested_entrenchment, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##      c(C1, C2)
## 1 0.0000000000
## 2 0.0004166667
# expect a difference of 0.38 from previous work
Bf(0.13, 0.51, uniform = 0, meanoftheory = 0, sdtheory = 0.38, tail = 1)
## $LikelihoodTheory
## [1] 0.887002
## 
## $Likelihoodnull
## [1] 0.001396224
## 
## $BayesFactor
## [1] 635.2864
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.13, 0.51, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.04
print(high_threshold)
## [1] 4

Entrenchment vs. preemption: ratings for witnessed vs. unwitnessed forms

attested_vs_unattested_across = subset(exp3_judgment_data.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_across$response, list(attested_vs_unattested_across$condition, attested_vs_unattested_across$attested_unattested), mean),3)
##                  0     1
## entrenchment 4.425 4.950
## preemption   2.397 4.899
# model with test construction
#Center variables of interest using the lizCenter function:

d0_attested_unattested_all = lizCenter(attested_vs_unattested_across , list("attested_unattested","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_all <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct*condition.ct, data=d0_attested_unattested_all, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_all, variable = c("b_Intercept","b_attested_unattested.ct", "b_scene_test2.ct", "b_condition.ct", "b_attested_unattested.ct:scene_test2.ct", "b_attested_unattested.ct:condition.ct", "b_scene_test2.ct:condition.ct", "b_attested_unattested.ct:scene_test2.ct:condition.ct"))
##                                                         Estimate  Est.Error
## b_Intercept                                           3.88199805 0.04332877
## b_attested_unattested.ct                              2.05319090 0.08404544
## b_scene_test2.ct                                     -0.03170012 0.04122243
## b_condition.ct                                       -1.00595460 0.08603263
## b_attested_unattested.ct:scene_test2.ct               0.10093666 0.07979321
## b_attested_unattested.ct:condition.ct                 1.90257573 0.16779775
## b_scene_test2.ct:condition.ct                         0.10139040 0.09115100
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.21577841 0.17907903
##                                                             Q2.5       Q97.5
## b_Intercept                                           3.79662273  3.96754470
## b_attested_unattested.ct                              1.88894974  2.21764378
## b_scene_test2.ct                                     -0.11362697  0.04862692
## b_condition.ct                                       -1.17574020 -0.84006317
## b_attested_unattested.ct:scene_test2.ct              -0.05678313  0.25608670
## b_attested_unattested.ct:condition.ct                 1.57421312  2.23349444
## b_scene_test2.ct:condition.ct                        -0.07658219  0.28050340
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.56495786  0.13548829
samps = as.matrix(as.mcmc(attested_unattested_all))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_condition.ct"] > 0)
C5=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)
C6=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0)
C7=mean(samps[,"b_scene_test2.ct:condition.ct"] < 0)
C8=mean(samps[,"b_attested_unattested.ct:scene_test2.ct:condition.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7,C8))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7, C8)
## 1                         0.0000000
## 2                         0.0000000
## 3                         0.2195833
## 4                         0.0000000
## 5                         0.1045000
## 6                         0.0000000
## 7                         0.1323333
## 8                         0.1161667
#Center variables of interest using the lizCenter function:
df_attested_unattested = lizCenter(attested_vs_unattested_across, list("attested_unattested", "condition"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment_preemption <- brm(formula = response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct * condition.ct, data=df_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment_preemption, variable = c("b_Intercept","b_condition.ct", "b_attested_unattested.ct","b_attested_unattested.ct:condition.ct"))
##                                        Estimate  Est.Error      Q2.5      Q97.5
## b_Intercept                            3.882491 0.04275664  3.800262  3.9670991
## b_condition.ct                        -1.004392 0.08525029 -1.172896 -0.8365452
## b_attested_unattested.ct               2.053470 0.08355460  1.890079  2.2144036
## b_attested_unattested.ct:condition.ct  1.900545 0.16603949  1.573823  2.2279448
mcmc_plot(attested_unattested_entrenchment_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_attested_unattested.ct"] < 0) 
C4=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1                 0
## 2                 0
## 3                 0
## 4                 0
# roughly predicted effect size from the previous study was 2.11

Bf(0.17, 1.90, uniform = 0, meanoftheory = 0, sdtheory = 2.11, tail = 1)
## $LikelihoodTheory
## [1] 0.2519496
## 
## $Likelihoodnull
## [1] 1.761329e-27
## 
## $BayesFactor
## [1] 1.430452e+26
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.17, 1.90, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.02
print(high_threshold)
## [1] 4

Production data: Effect of statistical pre-emption

# Are participants producing more attested than unattested dets? We will now compare the proportion of attested dets (the intercept) for the restricted verbs against chance

production_preemption_attested_unattested.df <- subset(exp3_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, restricted_verb_noun =="yes")

round(tapply(production_preemption_attested_unattested.df$attested_unattested, production_preemption_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.995         0.997            NA
production_preemption_attested_unattested.df$verb_noun_type_training2 <- factor(production_preemption_attested_unattested.df$verb_noun_type_training2)

df_prod = lizCenter(production_preemption_attested_unattested.df , list("verb_noun_type_training2"))  

# maximally vague priors for the predictors and the intercept
prod_attested_unattested = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                Estimate Est.Error      Q2.5    Q97.5
## b_Intercept                   5.0063142 0.3520660  4.377264 5.755723
## b_verb_noun_type_training2.ct 0.1504383 0.5722741 -0.975768 1.268186
mcmc_plot(prod_attested_unattested, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   0.00000
## 2   0.40025
# same analysis without verb_noun_type_training2

# maximally vague priors for the intercept
prod_attested_unattested_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 4.916322 0.3415986 4.300002 5.633669
mcmc_plot(prod_attested_unattested_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_final))
C1=mean(samps[,"b_Intercept"] < 0)


# We will now compare unattested productions for restricted vs. novel verbs
# Do participants produce the unwitnessed form less for the restricted verbs than for the novel verb?


production_preemption_restricted_novel.df <- subset(exp3_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_novel.df<- subset(production_preemption_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- recode(production_preemption_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.005         0.003         0.457
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.004 0.457
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun)
production_preemption_restricted_novel1.df = lizCenter(production_preemption_restricted_novel.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_novel_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error      Q2.5     Q97.5
## b_Intercept               -3.597151 0.2504851 -4.110856 -3.131344
## b_restricted_verb_noun.ct  3.959813 0.4917882  2.959409  4.897177
mcmc_plot(prod_unattested_novel_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# We will now compare unattested productions for restricted vs. alternating verbs

production_preemption_restricted_alt.df <- subset(exp3_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_alt.df<- subset(production_preemption_restricted_alt.df, verb_noun_type_training2 != "novel")

# all forms are unwitnessed for the alternating verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- recode(production_preemption_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)


round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.522         0.005         0.003            NA
production_preemption_restricted_alt.df$restricted_verb_noun <- factor(production_preemption_restricted_alt.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.004 0.522
production_preemption_restricted_alt1.df = lizCenter(production_preemption_restricted_alt.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_alt_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_preemption_restricted_alt1.df (Number of observations: 1748) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 73) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.40      0.24     0.02     0.90
## sd(restricted_verb_noun.ct)                0.98      0.40     0.12     1.69
## cor(Intercept,restricted_verb_noun.ct)     0.24      0.49    -0.77     0.96
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00      962     2545
## sd(restricted_verb_noun.ct)            1.01      639     1843
## cor(Intercept,restricted_verb_noun.ct) 1.01      553      659
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                  -3.14      0.21    -3.57    -2.76 1.00     9840
## restricted_verb_noun.ct     4.76      0.34     4.13     5.45 1.00     8813
##                         Tail_ESS
## Intercept                   7904
## restricted_verb_noun.ct     8177
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_final, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(prod_unattested_alt_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Production data: Effect of statistical entrenchment

#a. Are participants producing more attested than unattested dets?
# Here, we want to see how often participants produce the unattested det, e.g., the transitive-only det1 with an intransitive-only (det2) verb in the intransitive condition at test,
# and vice versa

production_entrenchment_attested_unattested.df  <- subset(exp3_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_attested_unattested.df  <- subset(production_entrenchment_attested_unattested.df, restricted_verb_noun =="yes")

#We want to compare attested vs. unattested trials for transitive verbs in the intransitive inchoative construction at test
production_entrenchment_attested_unattested1.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

#And intransitive inchoative verbs in the transitive construction at test. Filter out irrelevant trials
production_entrenchment_attested_unattested2.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_attested_unattested.df <- rbind(production_entrenchment_attested_unattested1.df, production_entrenchment_attested_unattested2.df)

#How much of the time are participants producing attested items?
round(mean(production_entrenchment_attested_unattested.df$attested_unattested),3)
## [1] 0.144
# and separately for each verb type
round(tapply(production_entrenchment_attested_unattested.df$attested_unattested, production_entrenchment_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.175         0.112            NA
production_entrenchment_attested_unattested.df$verb_noun_type_training2 <- factor(production_entrenchment_attested_unattested.df$verb_noun_type_training2)
df_prod_ent = lizCenter((production_entrenchment_attested_unattested.df), list("verb_noun_type_training2"))  


# maximally vague priors for the predictors and the intercept
prod_attested_unattested_ent = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_ent, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                 Estimate Est.Error      Q2.5      Q97.5
## b_Intercept                   -2.7636165 0.5585478 -3.863347 -1.6618543
## b_verb_noun_type_training2.ct -0.8744874 0.5573047 -1.979615  0.2089291
mcmc_plot(prod_attested_unattested_ent, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##    c(C1, C2)
## 1 1.00000000
## 2 0.05566667
# same analysis without verb_noun_type_training2


# maximally vague priors for the intercept
prod_attested_unattested_ent_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

summary(prod_attested_unattested_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ 1 + (1 | participant_private_id) 
##    Data: df_prod_ent (Number of observations: 320) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 40) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     3.13      0.79     1.95     5.01 1.00     3101     4981
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -2.65      0.52    -3.71    -1.63 1.00     3496     5061
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(prod_attested_unattested_ent_final, variable = c("b_Intercept"))
##              Estimate Est.Error      Q2.5     Q97.5
## b_Intercept -2.650201 0.5242895 -3.708617 -1.630656
mcmc_plot(prod_attested_unattested_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 1
# c. We will now compare unattested for restricted vs. novel
# Do participants produce the unwitnessed form less for the 2 non-alternating verbs than for the novel verb? (For the novel verb, the “unwitnessed” form has to be set arbitrarily.)


production_entrenchment_restricted_novel.df <- subset(exp3_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_novel.df<- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_novel.df$attested_unattested)
production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_novel.df$attested_unattested)

# Select trials featuring the novel verb in the intransitive inchoative construction
production_entrenchment_restricted_novel1.df <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "novel"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_novel2.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_novel3.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_novel.df <- rbind(production_entrenchment_restricted_novel1.df, production_entrenchment_restricted_novel2.df, production_entrenchment_restricted_novel3.df)


round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.175         0.112         0.051
# reverse coding to focus on unattested rather than attested for novel vs. restricted
production_entrenchment_restricted_novel.df$attested_unattested <- recode(production_entrenchment_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.856 0.949
# I.e., participants produce *unattested forms* less often for the restricted verbs than for the novel verb

production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun)
production_entrenchment_restricted_novel1.df = lizCenter(production_entrenchment_restricted_novel.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_novel_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_ent_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error       Q2.5    Q97.5
## b_Intercept               2.9466361 0.3898338  2.2037279 3.744884
## b_restricted_verb_noun.ct 0.3029179 0.5784473 -0.8264741 1.448727
mcmc_plot(prod_unattested_novel_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   0.00000
## 2   0.30275
# d. We will now compare unattested for restricted vs. alternating
# Do participants produce the unwitnessed form less for the 2 non-alternating verbs than for the alternating verb? (For the alternating verb, the “unwitnessed” form has to be set arbitrarily.)


production_entrenchment_restricted_alt.df <- subset(exp3_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_alt.df<- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 != "novel")

# all forms are unwitnessed for the alternating verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_alt.df$attested_unattested)
production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_alt.df$attested_unattested)

# Select trials featuring the alternating verb in the intransitive inchoative construction
production_entrenchment_restricted_alt1.df <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "alternating"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_alt2.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_alt3.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_alt.df <- rbind(production_entrenchment_restricted_alt1.df, production_entrenchment_restricted_alt2.df, production_entrenchment_restricted_alt3.df)


round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.044         0.175         0.112            NA
# reverse coding to focus on unattested rather than attested for alternating vs. restricted
production_entrenchment_restricted_alt.df$attested_unattested <- recode(production_entrenchment_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.856 0.956
# I.e., participants produce *unattested forms* less often for the restricted verbs than for the alternating verb

production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun)
production_entrenchment_restricted_alt1.df= lizCenter(production_entrenchment_restricted_alt.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_alt_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_entrenchment_restricted_alt1.df (Number of observations: 480) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 40) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              1.74      0.46     0.96     2.74
## sd(restricted_verb_noun.ct)                3.45      0.82     2.04     5.25
## cor(Intercept,restricted_verb_noun.ct)    -0.88      0.12    -1.00    -0.56
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     3805     5882
## sd(restricted_verb_noun.ct)            1.00     4967     7102
## cor(Intercept,restricted_verb_noun.ct) 1.00     2934     5931
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   3.13      0.38     2.42     3.91 1.00     4986
## restricted_verb_noun.ct     0.44      0.63    -0.77     1.70 1.00     6085
##                         Tail_ESS
## Intercept                   7777
## restricted_verb_noun.ct     8090
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_ent_final, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(prod_unattested_alt_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.2424167

Experiment 4

Preregistered data analyses

Question 1: Have participants picked up on the difference in meaning between singular/plural marking?

Production data

#Figure 13
RQ1_graph_productions.df = subset(exp4_entrenchment_production.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))
RQ1_graph_productions.df = subset(RQ1_graph_productions.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.graph1 = aggregate(semantically_correct ~ verb_noun_type_training2 + participant_private_id, RQ1_graph_productions.df, FUN=mean)

aggregated.graph1 <- rename(aggregated.graph1, noun = verb_noun_type_training2,
                            correct = semantically_correct)

yarrr::pirateplot(formula = correct  ~ noun,
                  data = aggregated.graph1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "% semantically correct",
                  cex.lab = 1,
                  cex.axis = 1,
                  cex.names = 1,
                  yaxt = "n")

axis(2, at = seq(0, 1, by = 0.25), las=1)
abline(h = 0.50, lty = 2)

#1 alternating verb production

alternating_prod.df = subset(exp4_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

#and filter out responses where participants said something other than det1 or det2
alternating_prod.df = subset(alternating_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_alternating_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, alternating_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_alternating_prod.df$semantically_correct),3)
## [1] 0.977
# average accuracy separately for causative and inchoative scenes
round(tapply(aggregated.means_alternating_prod.df$semantically_correct, aggregated.means_alternating_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.994         0.959
# maximally vague priors for the intercept and the predictors
a = lizCenter(alternating_prod.df, list("scene_test2"))  

alternating_model <-brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=a, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model, variable = c("b_Intercept", "b_scene_test2.ct" ))
##                    Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       4.0171098 0.4589591  3.192981 4.9897792
## b_scene_test2.ct -0.4509232 0.6808850 -1.763323 0.9445847
mcmc_plot(alternating_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.2449167
# no difference between construction 1 and construction 2

# Final model
# maximally vague priors for the intercept 
alternating_model_final = brm(formula = semantically_correct~1 + (1|participant_private_id), data=a, family = bernoulli(link = logit),set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 3.702724 0.4289143 2.951144 4.617194
mcmc_plot(alternating_model_final, variable = "b_Intercept", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0
#2 novel verb production

novel_prod.df = subset(exp4_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

#filter out responses where participants said something other than det1 or det2
novel_prod.df = subset(novel_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_novel_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, novel_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_novel_prod.df$semantically_correct),3)
## [1] 0.939
# average accuracy separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_prod.df$semantically_correct, aggregated.means_novel_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.954         0.924
b = lizCenter(novel_prod.df, list("scene_test2"))  

# maximally vague priors for the intercept and the predictors
novel_model <- brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=b, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model, variable = c("b_Intercept", "b_scene_test2.ct"))
##                    Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       3.6382542 0.5095048  2.678071 4.6654034
## b_scene_test2.ct -0.3787823 0.6130081 -1.558647 0.8553687
mcmc_plot(novel_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.2625833
# no difference between construction 1 and construction 2  
# Final model

# maximally vague priors for the intercept 
novel_model_final <- brm(formula = semantically_correct~1+ (1|participant_private_id), data=b, family = bernoulli(link = logit), set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept  3.55215 0.4919854 2.636328 4.580522
mcmc_plot(novel_model_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0

Judgment data

There are no semantically incorrect trials in Experiment 4, so these analyses are not possible.

Question 2: Does statistical preemption constrain morphological generalizations in adults (judgment data)?

#Figure 14

#no semantically incorrect trials here

#we only want to keep novel
judgments_novel.df <- subset(exp4_judgment_data.df, verb_noun_type_training2 == "novel")   

#and restricted items
judgments_unattested_constr1.df <- subset(exp4_judgment_data.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
judgments_unattested_constr2.df <- subset(exp4_judgment_data.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

judgments_unattested_novel.df <- rbind(judgments_novel.df, judgments_unattested_constr1.df, judgments_unattested_constr2.df)

aggregated.means = aggregate(response ~ condition + restricted_verb_noun + participant_private_id, judgments_unattested_novel.df, FUN=mean)
aggregated.means<- rename(aggregated.means, restricted = restricted_verb_noun)

yarrr::pirateplot(formula = response ~ restricted + condition,
                  data = aggregated.means,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

judgments_unattested_novel_preemption.df <- subset(judgments_unattested_novel.df, condition == "preemption")
judgments_unattested_novel_preemption.df$restricted_verb_noun <- factor(judgments_unattested_novel_preemption.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(judgments_unattested_novel_preemption.df$response, judgments_unattested_novel_preemption.df$restricted_verb_noun, mean),3)
##   yes    no 
## 2.717 3.475
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_preemption_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                            Estimate  Est.Error       Q2.5
## b_Intercept                               3.1094017 0.12089195  2.8688803
## b_restricted_verb_noun.ct                 0.7091007 0.21673296  0.2715548
## b_scene_test2.ct                         -0.2410345 0.07798783 -0.3936505
## b_restricted_verb_noun.ct:scene_test2.ct -0.1763631 0.14927456 -0.4694959
##                                                Q97.5
## b_Intercept                               3.34398121
## b_restricted_verb_noun.ct                 1.12616798
## b_scene_test2.ct                         -0.08738664
## b_restricted_verb_noun.ct:scene_test2.ct  0.11688735
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.0012500
## 3         0.0022500
## 4         0.1169167
# BF analyses: we use the difference between attested and novel in Experiment 1 (SD = 0.65) as an estimate of the difference we expect here

Bf(0.22, 0.72, uniform = 0, meanoftheory = 0, sdtheory = 0.65, tail = 1)
## $LikelihoodTheory
## [1] 0.6698739
## 
## $Likelihoodnull
## [1] 0.008564045
## 
## $BayesFactor
## [1] 78.21934
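Bf() is sourced from BF.R (not shown here). For reference, a minimal sketch of the standard Dienes-style calculation that reproduces the numbers above, under the assumption that BF.R implements it: the first argument is the standard error of the obtained estimate, the second the estimate itself, and with tail = 1 the alternative is a half-normal with SD = sdtheory.

# Sketch only (assumption: BF.R implements the Dienes (2008) calculator).
bf_halfnormal <- function(se, obtained, sdtheory) {
  theta <- seq(0, 10 * sdtheory, length.out = 1e5)        # grid over H1 effect sizes
  prior <- 2 * dnorm(theta, 0, sdtheory)                  # half-normal prior on the effect
  lik_theory <- sum(prior * dnorm(obtained, theta, se)) * diff(theta[1:2])  # marginal likelihood under H1
  lik_null <- dnorm(obtained, 0, se)                      # likelihood under the point null
  lik_theory / lik_null
}
bf_halfnormal(0.22, 0.72, 0.65)   # ~78, matching the Bf() output above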
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.22, 0.72, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# Robustness region (RR): sdtheory values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.08
print(high_threshold)
## [1] 4
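Bf_range() presumably just recomputes the Bayes factor over the grid of sdtheory values; the same robustness region could be traced with the sketch above:

# Illustrative equivalent of the robustness-region scan (same assumption as above):
sapply(seq(0.1, 4, by = 0.1), function(s) bf_halfnormal(0.22, 0.72, s))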

Question 3: Does statistical entrenchment constrain morphological generalizations in adults (judgment data)?

#no semantically incorrect trials here

#we only want to keep novel

entrenchment_judgments_novel.df <- subset(exp4_entrenchment_judgment.df, verb_noun_type_training2 == "novel")   

#and restricted items

entrenchment_judgments_unattested_constr1.df <- subset(exp4_entrenchment_judgment.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
entrenchment_judgments_unattested_constr2.df <- subset(exp4_entrenchment_judgment.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

entrenchment_judgments_unattested_novel.df <- rbind(entrenchment_judgments_novel.df, entrenchment_judgments_unattested_constr1.df, entrenchment_judgments_unattested_constr2.df)
entrenchment_judgments_unattested_novel.df$restricted_verb_noun <- factor(entrenchment_judgments_unattested_novel.df$restricted_verb_noun, levels = c("yes", "no"))


round(tapply(entrenchment_judgments_unattested_novel.df$response, entrenchment_judgments_unattested_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 4.537 4.646
#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               4.59593174 0.09445369  4.4111938
## b_restricted_verb_noun.ct                 0.10755868 0.14090058 -0.1681028
## b_scene_test2.ct                          0.08565752 0.09780645 -0.1123254
## b_restricted_verb_noun.ct:scene_test2.ct -0.11935404 0.20127718 -0.5139401
##                                              Q97.5
## b_Intercept                              4.7826548
## b_restricted_verb_noun.ct                0.3838231
## b_scene_test2.ct                         0.2764040
## b_restricted_verb_noun.ct:scene_test2.ct 0.2775788
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] < 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.2218333
## 3         0.1842500
## 4         0.2730000
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df , list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct"))
##                            Estimate  Est.Error      Q2.5    Q97.5
## b_Intercept               4.5920634 0.09273427  4.409932 4.773527
## b_restricted_verb_noun.ct 0.1056231 0.14115417 -0.172580 0.379502
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)


# Use the unattested vs. novel difference from the original study as an estimate of the difference expected here
Bf(0.14, 0.11, uniform = 0, meanoftheory = 0, sdtheory = 0.38/2, tail = 1)
## $LikelihoodTheory
## [1] 2.237763
## 
## $Likelihoodnull
## [1] 2.092796
## 
## $BayesFactor
## [1] 1.06927
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.14, 0.11, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)


# find values for which BF is inconclusive 
inconclusive <- subset(data.frame(range_test), BF < 3 & BF > 1/3)
low_threshold <- min(inconclusive$sdtheory)
high_threshold <- max(inconclusive$sdtheory)
print(low_threshold)
## [1] 0.01
print(high_threshold)
## [1] 0.88
# find out how many more participants we would need for conclusive evidence for entrenchment (BF > 3)
invisible(Bf_powercalc(0.14,  0.11, uniform=0, meanoftheory=0, sdtheory=0.38/2, tail=1, N=40, min=30, max=400))

#N = 238
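A plausible reading of Bf_powercalc() (sourced from Bf_powercalc.R) is that it holds the obtained estimate fixed and rescales the standard error by sqrt(N_current / N_new); a sketch under that assumption, reusing bf_halfnormal() from above:

# Sketch only -- assumes Bf_powercalc() shrinks the SE with sample size.
Ns <- 30:400
bfs <- sapply(Ns, function(n) bf_halfnormal(0.14 * sqrt(40 / n), 0.11, 0.38 / 2))
Ns[which(bfs > 3)[1]]   # should land near the N = 238 reported above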

Question 4: Is the effect of statistical pre-emption larger than entrenchment (judgment data)?

#no semantically incorrect trials to filter out

#we only want to keep novel

all_judgment_novel.df <- subset(exp4_judgment_data.df, verb_noun_type_training2 == "novel")   

#and restricted items

all_judgment_unattested_constr1.df <- subset(exp4_judgment_data.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
all_judgment_unattested_constr2.df <- subset(exp4_judgment_data.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

all_judgment_unattested_novel.df <- rbind(all_judgment_novel.df, all_judgment_unattested_constr1.df, all_judgment_unattested_constr2.df)
all_judgment_unattested_novel.df$restricted_verb_noun <- factor(all_judgment_unattested_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(all_judgment_unattested_novel.df$response, list(all_judgment_unattested_novel.df$restricted_verb_noun, all_judgment_unattested_novel.df$condition), mean),3)
##     entrenchment preemption
## yes        4.537      2.717
## no         4.646      3.475
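As a quick sanity check on the interaction of interest, the raw difference-of-differences from the cell means above is about 0.65, in line with the restricted × condition estimate reported by the model below (~0.59; the sign depends on the contrast coding):

# Entrenchment gap (no - yes) minus preemption gap (no - yes):
(4.646 - 4.537) - (3.475 - 2.717)   # = -0.649: the gap is ~0.65 points larger under preemption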
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct *scene_test2.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_condition.ct", "b_scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct", "b_restricted_verb_noun.ct:scene_test2.ct","b_condition.ct:scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct:scene_test2.ct" ))
##                                                          Estimate  Est.Error
## b_Intercept                                            3.84696703 0.07364829
## b_restricted_verb_noun.ct                              0.42245850 0.12822068
## b_condition.ct                                        -1.45118040 0.15078511
## b_scene_test2.ct                                      -0.07622597 0.06296661
## b_restricted_verb_noun.ct:condition.ct                 0.58520829 0.25682444
## b_restricted_verb_noun.ct:scene_test2.ct              -0.15637095 0.12441659
## b_condition.ct:scene_test2.ct                         -0.33426741 0.12418818
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.05411566 0.24530265
##                                                              Q2.5       Q97.5
## b_Intercept                                            3.69969605  3.99186858
## b_restricted_verb_noun.ct                              0.17222960  0.67351160
## b_condition.ct                                        -1.74702960 -1.15484353
## b_scene_test2.ct                                      -0.20025323  0.04785169
## b_restricted_verb_noun.ct:condition.ct                 0.07883685  1.07463921
## b_restricted_verb_noun.ct:scene_test2.ct              -0.39977860  0.08746769
## b_condition.ct:scene_test2.ct                         -0.57495042 -0.08944954
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.53421730  0.42732502
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0) 
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0)
C5=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0) 
C6=mean(samps[,"b_condition.ct:scene_test2.ct"] > 0)
C7=mean(samps[,"b_restricted_verb_noun.ct:condition.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7)
## 1                   0.000500000
## 2                   0.000000000
## 3                   0.113416667
## 4                   0.013333333
## 5                   0.104750000
## 6                   0.003166667
## 7                   0.413083333
# The roughly predicted effect size from the previous study was 1.0; use it as an estimate of the effect we expect here
Bf(0.25, 0.57, uniform = 0, meanoftheory = 0, sdtheory = 1.00, tail = 1)
## $LikelihoodTheory
## [1] 0.6551184
## 
## $Likelihoodnull
## [1] 0.1186183
## 
## $BayesFactor
## [1] 5.52291
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.25, 0.57, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.15
print(high_threshold)
## [1] 2.12

Exploratory data analyses


Effect of statistical pre-emption: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

# Figure 31
judgments_unattested_attested.df <- subset(exp4_judgment_data.df, restricted_verb_noun == "yes")   


aggregated.means1 = aggregate(response ~ condition + attested_unattested + participant_private_id, judgments_unattested_attested.df , FUN=mean)
aggregated.means1<- rename(aggregated.means1, attested = attested_unattested)

aggregated.means1$attested<- recode(aggregated.means1$attested, "1" = "yes","0" = "no")


yarrr::pirateplot(formula = response ~  attested  + condition,
                  data = aggregated.means1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

# analyses
attested_vs_unattested = subset(exp4_preemption_judgment.df, restricted_verb_noun == "yes")

round(tapply(attested_vs_unattested$response, attested_vs_unattested$attested_unattested, mean),3)
##     0     1 
## 2.717 4.375
# model with tested construction
#Center variables of interest using the lizCenter function:
d0_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d0_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_preemption1, variable = c("b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                            Estimate  Est.Error        Q2.5
## b_scene_test2.ct                        -0.04775044 0.06946793 -0.18480734
## b_attested_unattested.ct                 1.47600914 0.32181326  0.83358990
## b_attested_unattested.ct:scene_test2.ct  0.20135793 0.13226533 -0.06072156
##                                              Q97.5
## b_scene_test2.ct                        0.08941463
## b_attested_unattested.ct                2.10648349
## b_attested_unattested.ct:scene_test2.ct 0.45970493
samps = as.matrix(as.mcmc(attested_unattested_preemption1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1    0.24616667
## 2    0.00000000
## 3    0.06508333
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(attested_unattested_preemption, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                          Estimate  Est.Error      Q2.5    Q97.5
## b_Intercept              3.559357 0.07709578 3.4098178 3.713241
## b_attested_unattested.ct 1.477298 0.32565601 0.8313471 2.111387
mcmc_plot(attested_unattested_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# prior from previous study with adults: 2.55
Bf(0.32, 1.47, uniform = 0, meanoftheory = 0, sdtheory = 2.55, tail = 1)
## $LikelihoodTheory
## [1] 0.2636105
## 
## $Likelihoodnull
## [1] 3.261384e-05
## 
## $BayesFactor
## [1] 8082.779
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.32, 1.47, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.08
print(high_threshold)
## [1] 4

Effect of statistical entrenchment: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

attested_vs_unattested_ent = subset(exp4_entrenchment_judgment.df, restricted_verb_noun == "yes")

round(tapply(attested_vs_unattested_ent$response, attested_vs_unattested_ent$attested_unattested, mean),3)
##     0     1 
## 4.537 4.850
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent1 = lizCenter(attested_vs_unattested_ent, list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d_attested_unattested_ent1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment1, variable = c("b_Intercept","b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                             Estimate  Est.Error        Q2.5
## b_Intercept                              4.700561025 0.08652214  4.53170592
## b_scene_test2.ct                        -0.008081567 0.08113197 -0.16781335
## b_attested_unattested.ct                 0.304674125 0.14436848  0.01961831
## b_attested_unattested.ct:scene_test2.ct -0.304608299 0.25623289 -0.79901532
##                                             Q97.5
## b_Intercept                             4.8743126
## b_scene_test2.ct                        0.1540354
## b_attested_unattested.ct                0.5838210
## b_attested_unattested.ct:scene_test2.ct 0.1987388
mcmc_plot(attested_unattested_entrenchment1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] > 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1    0.45783333
## 2    0.01966667
## 3    0.11858333
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent = lizCenter(attested_vs_unattested_ent, list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested_ent, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                           Estimate  Est.Error       Q2.5     Q97.5
## b_Intercept              4.7012621 0.08757613 4.52621860 4.8722510
## b_attested_unattested.ct 0.3016032 0.14058478 0.02895671 0.5790088
mcmc_plot(attested_unattested_entrenchment, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##    c(C1, C2)
## 1 0.00000000
## 2 0.01508333
# expect a difference of 0.38 from previous work
Bf(0.14, 0.31, uniform = 0, meanoftheory = 0, sdtheory = 0.38, tail = 1)
## $LikelihoodTheory
## [1] 1.442615
## 
## $Likelihoodnull
## [1] 0.2455251
## 
## $BayesFactor
## [1] 5.875631
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.14, 0.31, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.09
print(high_threshold)
## [1] 1.01

Entrenchment vs. preemption: ratings for witnessed vs. unwitnessed forms

attested_vs_unattested_across = subset(exp4_judgment_data.df, restricted_verb_noun == "yes")

round(tapply(attested_vs_unattested_across$response, list(attested_vs_unattested_across$condition, attested_vs_unattested_across$attested_unattested), mean),3)
##                  0     1
## entrenchment 4.537 4.850
## preemption   2.717 4.375
# model with test construction
#Center variables of interest using the lizCenter function:

d0_attested_unattested_all = lizCenter(attested_vs_unattested_across , list("attested_unattested","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_all <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct*condition.ct, data=d0_attested_unattested_all, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_all, variable = c("b_Intercept","b_attested_unattested.ct", "b_scene_test2.ct", "b_condition.ct", "b_attested_unattested.ct:scene_test2.ct", "b_attested_unattested.ct:condition.ct", "b_scene_test2.ct:condition.ct", "b_attested_unattested.ct:scene_test2.ct:condition.ct"))
##                                                         Estimate  Est.Error
## b_Intercept                                           4.12466518 0.05778405
## b_attested_unattested.ct                              0.94887873 0.17920056
## b_scene_test2.ct                                     -0.02805994 0.04975284
## b_condition.ct                                       -1.12022516 0.11412592
## b_attested_unattested.ct:scene_test2.ct              -0.06424622 0.13914102
## b_attested_unattested.ct:condition.ct                 1.16027590 0.34545581
## b_scene_test2.ct:condition.ct                        -0.04046957 0.10015825
## b_attested_unattested.ct:scene_test2.ct:condition.ct  0.48709007 0.27567267
##                                                             Q2.5      Q97.5
## b_Intercept                                           4.01017907  4.2362692
## b_attested_unattested.ct                              0.59296723  1.3011927
## b_scene_test2.ct                                     -0.12685944  0.0683049
## b_condition.ct                                       -1.34543342 -0.8926197
## b_attested_unattested.ct:scene_test2.ct              -0.33730053  0.2090005
## b_attested_unattested.ct:condition.ct                 0.45861275  1.8285488
## b_scene_test2.ct:condition.ct                        -0.23741886  0.1578364
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.04969375  1.0283325
samps = as.matrix(as.mcmc(attested_unattested_all))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_condition.ct"] > 0)
C5=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] > 0)
C6=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0)
C7=mean(samps[,"b_scene_test2.ct:condition.ct"] > 0)
C8=mean(samps[,"b_attested_unattested.ct:scene_test2.ct:condition.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7,C8))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7, C8)
## 1                      0.0000000000
## 2                      0.0000000000
## 3                      0.2837500000
## 4                      0.0000000000
## 5                      0.3229166667
## 6                      0.0001666667
## 7                      0.3424166667
## 8                      0.0380833333
# The roughly predicted effect size from the previous study was 2.11; use it as an estimate here
Bf(0.34, 1.14, uniform = 0, meanoftheory = 0, sdtheory = 2.11, tail = 1)
## $LikelihoodTheory
## [1] 0.3236811
## 
## $Likelihoodnull
## [1] 0.004248301
## 
## $BayesFactor
## [1] 76.19072
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.34, 1.14, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.12
print(high_threshold)
## [1] 4

Production data: Effect of statistical pre-emption

# Are participants producing more attested than unattested dets? We will now compare the proportion of attested dets (that's the intercept) for the restricted verbs against chance

production_preemption_attested_unattested.df <- subset(exp4_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, restricted_verb_noun =="yes")

round(tapply(production_preemption_attested_unattested.df$attested_unattested, production_preemption_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.915         0.823            NA
production_preemption_attested_unattested.df$verb_noun_type_training2 <- factor(production_preemption_attested_unattested.df$verb_noun_type_training2)

df_prod = lizCenter(production_preemption_attested_unattested.df , list("verb_noun_type_training2"))  

# maximally vague priors for the predictors and the intercept
prod_attested_unattested = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                 Estimate Est.Error      Q2.5    Q97.5
## b_Intercept                    3.0639538 0.6790375  1.660292 4.337627
## b_verb_noun_type_training2.ct -0.7788673 0.6097739 -1.906375 0.495107
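Since 0 on the logit scale corresponds to 50% (chance), the intercept can be read back as a proportion with plogis() (interpretive note, not in the original script):

plogis(3.06)   # ~0.955: attested dets are produced well above chance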
mcmc_plot(prod_attested_unattested, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0002500
## 2 0.1020833
# same analysis without verb_noun_type_training2

# maximally vague priors for the intercept
prod_attested_unattested_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 2.964787 0.6371152 1.632119 4.202644
mcmc_plot(prod_attested_unattested_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_final))
C1=mean(samps[,"b_Intercept"] < 0)


# We will now compare unattested for restricted vs. novel
# Do participants produce the unwitnessed form less for the restricted verbs than for the novel verb


production_preemption_restricted_novel.df <- subset(exp4_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_novel.df<- subset(production_preemption_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_novel.df$attested_unattested)

# reverse coding so that 1 = unattested rather than attested
production_preemption_restricted_novel.df$attested_unattested <- recode(production_preemption_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.085         0.177         0.503
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.131 0.503
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun)
production_preemption_restricted_novel1.df = lizCenter(production_preemption_restricted_novel.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_novel_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error       Q2.5     Q97.5
## b_Intercept               -2.065887 0.4410375 -2.9192262 -1.165670
## b_restricted_verb_noun.ct  1.978798 0.7441411  0.4757647  3.377839
mcmc_plot(prod_unattested_novel_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1    0.0000
## 2    0.0065
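# Hypothetical convenience wrapper (not part of the original scripts) making
# the recurring pMCMC pattern explicit: each C value is simply the proportion
# of posterior draws falling on the side of zero opposite to the prediction.
p_direction <- function(fit, par, direction = "<") {
  draws <- as.matrix(as.mcmc(fit))[, par]
  if (direction == "<") mean(draws < 0) else mean(draws > 0)
}
# e.g. p_direction(prod_unattested_novel_final, "b_restricted_verb_noun.ct", "<")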

Production data: Effect of statistical entrenchment

#a. Are participants producing more attested than unattested dets?
# Here, we want to see how often participants produce the unattested det, e.g. the transitive-only det1 for a det2 (intransitive-only) verb in the intransitive condition at test,
# and vice versa

production_entrenchment_attested_unattested.df  <- subset(exp4_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_attested_unattested.df  <- subset(production_entrenchment_attested_unattested.df, restricted_verb_noun =="yes")

#We want to compare attested vs. unattested trials for transitive verbs in the intransitive inchoative construction at test
production_entrenchment_attested_unattested1.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

#And intransitive inchoative verbs in the transitive construction at test. Filter out irrelevant trials
production_entrenchment_attested_unattested2.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_attested_unattested.df <- rbind(production_entrenchment_attested_unattested1.df, production_entrenchment_attested_unattested2.df)

#How much of the time are participants producing attested items?
round(mean(production_entrenchment_attested_unattested.df$attested_unattested),3)
## [1] 0.123
# and separately for each verb type
round(tapply(production_entrenchment_attested_unattested.df$attested_unattested, production_entrenchment_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.137         0.109            NA
production_entrenchment_attested_unattested.df$verb_noun_type_training2 <- factor(production_entrenchment_attested_unattested.df$verb_noun_type_training2)
df_prod_ent = lizCenter((production_entrenchment_attested_unattested.df), list("verb_noun_type_training2"))  


# maximally vague priors for the predictors and the intercept
prod_attested_unattested_ent = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_ent, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                 Estimate Est.Error      Q2.5     Q97.5
## b_Intercept                   -3.0983951 0.7995118 -4.537806 -1.334261
## b_verb_noun_type_training2.ct -0.3679099 0.7089648 -1.773684  1.046187
mcmc_plot(prod_attested_unattested_ent, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.9990833
## 2 0.2968333
#same analyses without verb_training_type


# maximally vague priors for the intercept
prod_attested_unattested_ent_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

summary(prod_attested_unattested_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ 1 + (1 | participant_private_id) 
##    Data: df_prod_ent (Number of observations: 302) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 39) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     4.09      1.28     2.38     7.31 1.00     2833     4262
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -3.15      0.64    -4.40    -1.84 1.00     3945     4594
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(prod_attested_unattested_ent_final, variable = c("b_Intercept"))
##              Estimate Est.Error      Q2.5    Q97.5
## b_Intercept -3.154072 0.6443672 -4.395901 -1.84278
mcmc_plot(prod_attested_unattested_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0.9998333
# c. We will now compare unattested productions for restricted vs. novel verbs
# Do participants produce the unwitnessed form less for the 2 non-alternating verbs than for the novel verb? (For the novel verb, the “unwitnessed” form has to be assigned arbitrarily.)


production_entrenchment_restricted_novel.df <- subset(exp4_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_novel.df<- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_novel.df$attested_unattested)
production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_novel.df$attested_unattested)

# select trials featuring the novel verb in the intransitive inchoative construction
production_entrenchment_restricted_novel1.df <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "novel"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_novel2.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_novel3.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_novel.df <- rbind(production_entrenchment_restricted_novel1.df, production_entrenchment_restricted_novel2.df, production_entrenchment_restricted_novel3.df)


round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.137         0.109         0.077
# reverse coding to focus on unattested rather than attested for novel vs. restricted
production_entrenchment_restricted_novel.df$attested_unattested<- recode(production_entrenchment_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.877 0.923
# This means that participants produce *unattested forms* less often for the restricted verbs than for the novel verb

production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun)
production_entrenchment_restricted_novel1.df = lizCenter(production_entrenchment_restricted_novel.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_novel_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_ent_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error       Q2.5    Q97.5
## b_Intercept               3.3924598 0.6637945  2.0091988 4.625746
## b_restricted_verb_noun.ct 0.3339008 0.6438076 -0.9065402 1.643224
mcmc_plot(prod_unattested_novel_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0002500
## 2 0.3055833

Experiment 5

Preregistered data analyses

Question 1: Have participants picked up on the difference in meaning between the two argument-structure constructions?

Production data

#Figure 15
RQ1_graph_productions.df = subset(exp5_entrenchment_production.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))
RQ1_graph_productions.df = subset(RQ1_graph_productions.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.graph1 = aggregate(semantically_correct ~ verb_noun_type_training2 + participant_private_id, RQ1_graph_productions.df, FUN=mean)

aggregated.graph1 <- rename(aggregated.graph1, noun = verb_noun_type_training2,
                            correct = semantically_correct)

yarrr::pirateplot(formula = correct  ~ noun,
                  data = aggregated.graph1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "% semantically correct",
                  cex.lab = 1,
                  cex.axis = 1,
                  cex.names = 1,
                  yaxt = "n")

axis(2, at = seq(0, 1, by = 0.25), las=1)
abline(h = 0.50, lty = 2)

#1 alternating verb production

alternating_prod.df = subset(exp5_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

#and filter out responses where participants said something other than det1 or det2
alternating_prod.df = subset(alternating_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_alternating_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, alternating_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_alternating_prod.df$semantically_correct),3)
## [1] 0.958
# average accuracy separately for causative and inchoative scenes
round(tapply(aggregated.means_alternating_prod.df$semantically_correct, aggregated.means_alternating_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.976         0.940
# maximally vague priors for the intercept and the predictors
a = lizCenter(alternating_prod.df, list("scene_test2"))  

alternating_model <-brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=a, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model, variable = c("b_Intercept", "b_scene_test2.ct" ))
##                    Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       3.0880151 0.4236518  2.323647 3.9701259
## b_scene_test2.ct -0.4467743 0.6148872 -1.650930 0.7506924
mcmc_plot(alternating_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.2329167
# no difference between construction 1 and construction 2

# Final model
# maximally vague priors for the intercept 
alternating_model_final = brm(formula = semantically_correct~1 + (1|participant_private_id), data=a, family = bernoulli(link = logit),set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 2.971751 0.4056766 2.251072 3.815451
mcmc_plot(alternating_model_final, variable = "b_Intercept", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0
#2 novel verb production

novel_prod.df = subset(exp5_entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

#filter out responses where participants said something other than det1 or det2
novel_prod.df = subset(novel_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_novel_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, novel_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_novel_prod.df$semantically_correct),3)
## [1] 0.94
# average accuracy separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_prod.df$semantically_correct, aggregated.means_novel_prod.df$scene_test2, mean),3)
## construction1 construction2 
##          0.94          0.94
b = lizCenter(novel_prod.df, list("scene_test2"))  

# maximally vague priors for the intercept and the predictors
novel_model <- brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=b, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model, variable = c("b_Intercept", "b_scene_test2.ct"))
##                     Estimate Est.Error      Q2.5    Q97.5
## b_Intercept       2.88826008 0.4488524  2.066702 3.824652
## b_scene_test2.ct -0.05322225 0.6013003 -1.256747 1.106714
mcmc_plot(novel_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.4671667
# no difference between construction 1 and construction 2  
# Final model

# maximally vague priors for the intercept 
novel_model_final <- brm(formula = semantically_correct~1+ (1|participant_private_id), data=b, family = bernoulli(link = logit), set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 2.800737 0.4336458 2.011482 3.721876
mcmc_plot(novel_model_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0

Judgment data

There are no semantically incorrect trials in Experiment 5; thus, these analyses are not possible.

Question 2: Does statistical preemption constrain morphological generalizations in children (judgment data)?

#Figure 16

#no semantically incorrect trials here

#we only want to keep novel
judgments_novel.df <- subset(exp5_judgment_data.df, verb_noun_type_training2 == "novel")   

#and restricted items
judgments_unattested_constr1.df <- subset(exp5_judgment_data.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
judgments_unattested_constr2.df <- subset(exp5_judgment_data.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

judgments_unattested_novel.df <- rbind(judgments_novel.df, judgments_unattested_constr1.df, judgments_unattested_constr2.df)

aggregated.means = aggregate(response ~ condition + restricted_verb_noun + participant_private_id, judgments_unattested_novel.df, FUN=mean)
aggregated.means<- rename(aggregated.means, restricted = restricted_verb_noun)

yarrr::pirateplot(formula = response ~ restricted + condition,
                  data = aggregated.means,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

judgments_unattested_novel_preemption.df <- subset(judgments_unattested_novel.df, condition == "preemption")
judgments_unattested_novel_preemption.df$restricted_verb_noun <- factor(judgments_unattested_novel_preemption.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(judgments_unattested_novel_preemption.df$response, judgments_unattested_novel_preemption.df$restricted_verb_noun, mean),3)
##   yes    no 
## 2.363 3.141
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(judgments_preemption_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                              Estimate Est.Error       Q2.5
## b_Intercept                               2.748372777 0.1455870  2.4634706
## b_restricted_verb_noun.ct                 0.750722893 0.2049244  0.3434968
## b_scene_test2.ct                          0.092829536 0.1397569 -0.1854070
## b_restricted_verb_noun.ct:scene_test2.ct -0.002370871 0.2474515 -0.4955336
##                                              Q97.5
## b_Intercept                              3.0379794
## b_restricted_verb_noun.ct                1.1474977
## b_scene_test2.ct                         0.3671593
## b_restricted_verb_noun.ct:scene_test2.ct 0.4863203
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] < 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1      0.000000e+00
## 2      8.333333e-05
## 3      2.485833e-01
## 4      4.986667e-01
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
posterior_summary(judgments_preemption_model, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error      Q2.5    Q97.5
## b_Intercept               2.7491049 0.1414916 2.4740545 3.029540
## b_restricted_verb_noun.ct 0.7437583 0.2058178 0.3362302 1.143925
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##      c(C1, C2)
## 1 0.0000000000
## 2 0.0005833333
# BF analyses: we use the difference between attested and novel in Experiment 1 (0.65) as an estimate of the maximum difference we expect here, with half that value as the SD of the half-normal H1

Bf(0.21, 0.75, uniform = 0, meanoftheory = 0, sdtheory = 0.65/2, tail = 1)
## $LikelihoodTheory
## [1] 0.3147016
## 
## $Likelihoodnull
## [1] 0.003228164
## 
## $BayesFactor
## [1] 97.48626
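# Bf() is sourced from BF.R; the sketch below is a hypothetical reconstruction
# of a Dienes-style calculator for the one-tailed, half-normal case used here
# (an assumption about the sourced code, not a copy of it). It integrates the
# likelihood of the observed effect over the H1 prior and divides by the
# likelihood under the null.
Bf_sketch <- function(sd, obtained, meanoftheory = 0, sdtheory = 1) {
  theta <- seq(meanoftheory - 5 * sdtheory, meanoftheory + 5 * sdtheory,
               length.out = 2001)
  incr <- theta[2] - theta[1]
  dist_theta <- dnorm(theta, meanoftheory, sdtheory)
  dist_theta[theta <= 0] <- 0                 # one-tailed: no mass below zero
  dist_theta <- dist_theta * 2                # renormalise the half-normal
  LikelihoodTheory <- sum(dist_theta * dnorm(obtained, theta, sd)) * incr
  Likelihoodnull <- dnorm(obtained, 0, sd)
  LikelihoodTheory / Likelihoodnull
}
# Bf_sketch(0.21, 0.75, sdtheory = 0.65/2) should come out near the value above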
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.21, 0.75, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.07
print(high_threshold)
## [1] 4
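# Bf_range() (sourced from Bf_range.R) presumably just recomputes the Bayes
# factor over a grid of plausible H1 scales; a hypothetical sketch (a grid
# value of exactly 0 may need special-casing in the sourced version):
Bf_range_sketch <- function(sd, obtained, meanoftheory = 0, sdtheoryrange, tail = 1) {
  data.frame(sdtheory = sdtheoryrange,
             BF = sapply(sdtheoryrange, function(s)
               Bf(sd, obtained, uniform = 0, meanoftheory = meanoftheory,
                  sdtheory = s, tail = tail)$BayesFactor))
}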

Question 3: Does statistical entrenchment constrain morphological generalizations in children (judgment data)?

#no semantically incorrect trials here

#we only want to keep novel

entrenchment_judgments_novel.df <- subset(exp5_entrenchment_judgment.df, verb_noun_type_training2 == "novel")   

#and restricted items

entrenchment_judgments_unattested_constr1.df <- subset(exp5_entrenchment_judgment.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
entrenchment_judgments_unattested_constr2.df <- subset(exp5_entrenchment_judgment.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

entrenchment_judgments_unattested_novel.df <- rbind(entrenchment_judgments_novel.df, entrenchment_judgments_unattested_constr1.df, entrenchment_judgments_unattested_constr2.df)
entrenchment_judgments_unattested_novel.df$restricted_verb_noun <- factor(entrenchment_judgments_unattested_novel.df$restricted_verb_noun, levels = c("yes", "no"))


round(tapply(entrenchment_judgments_unattested_novel.df$response, entrenchment_judgments_unattested_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 4.267 4.242
#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate Est.Error        Q2.5
## b_Intercept                               4.25247104 0.1456781  3.96943153
## b_restricted_verb_noun.ct                -0.02348382 0.2181520 -0.44271977
## b_scene_test2.ct                          0.20692034 0.1110676 -0.01408107
## b_restricted_verb_noun.ct:scene_test2.ct  0.04833525 0.1654576 -0.27460885
##                                              Q97.5
## b_Intercept                              4.5382212
## b_restricted_verb_noun.ct                0.4178955
## b_scene_test2.ct                         0.4244873
## b_restricted_verb_noun.ct:scene_test2.ct 0.3732184
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] < 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1        0.00000000
## 2        0.55175000
## 3        0.03158333
## 4        0.38700000
# This prior is based on the final N in the adult study (the attested vs. unattested difference is used as a max)
Bf(0.21, -0.02, uniform = 0, meanoftheory = 0, sdtheory = 0.38/2, tail = 1)
## $LikelihoodTheory
## [1] 1.337387
## 
## $Likelihoodnull
## [1] 1.891129
## 
## $BayesFactor
## [1] 0.7071894
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.21, -0.02, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)


# find values for which BF is inconclusive 
ev_for_h1 <- subset(data.frame(range_test), BF < 3 & BF > 1/3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.01
print(high_threshold)
## [1] 0.54
# find out how many more participants we would need for conclusive evidence for entrenchment (BF > 3)
invisible(Bf_powercalc(0.21, -0.02, uniform=0, meanoftheory=0, sdtheory=0.38/2, tail=1, N=40, min=30, max=400))

#N = 327
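# Bf_powercalc() (sourced from Bf_powercalc.R) presumably exploits the fact
# that the standard error shrinks with sqrt(n): rescale the observed SE by
# sqrt(N/newN) and recompute the Bayes factor for each candidate sample size.
# A hypothetical sketch:
Bf_powercalc_sketch <- function(sd, obtained, meanoftheory = 0, sdtheory = 1,
                                tail = 1, N, min, max) {
  newN <- min:max
  BF <- sapply(newN, function(n)
    Bf(sd * sqrt(N / n), obtained, uniform = 0, meanoftheory = meanoftheory,
       sdtheory = sdtheory, tail = tail)$BayesFactor)
  cbind(N = newN, BF = BF)
}
# the reported N = 327 would then be the smallest n in this table with BF > 3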

Question 4: Is the effect of statistical pre-emption larger than entrenchment (judgment data)?

#no semantically incorrect trials to filter out

#we only want to keep novel

all_judgment_novel.df <- subset(exp5_judgment_data.df, verb_noun_type_training2 == "novel")   

#and restricted items

all_judgment_unattested_constr1.df <- subset(exp5_judgment_data.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
all_judgment_unattested_constr2.df <- subset(exp5_judgment_data.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

all_judgment_unattested_novel.df <- rbind(all_judgment_novel.df, all_judgment_unattested_constr1.df, all_judgment_unattested_constr2.df)
all_judgment_unattested_novel.df$restricted_verb_noun <- factor(all_judgment_unattested_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(all_judgment_unattested_novel.df$response, list(all_judgment_unattested_novel.df$restricted_verb_noun, all_judgment_unattested_novel.df$condition), mean),3)
##     entrenchment preemption
## yes        4.267      2.363
## no         4.242      3.141
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct *scene_test2.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_condition.ct", "b_scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct", "b_restricted_verb_noun.ct:scene_test2.ct","b_condition.ct:scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct:scene_test2.ct" ))
##                                                          Estimate  Est.Error
## b_Intercept                                            3.45980400 0.09746285
## b_restricted_verb_noun.ct                              0.39880646 0.14892904
## b_condition.ct                                        -1.44027048 0.19485890
## b_scene_test2.ct                                       0.14622907 0.08980312
## b_restricted_verb_noun.ct:condition.ct                 0.73218524 0.28739739
## b_restricted_verb_noun.ct:scene_test2.ct               0.01962701 0.14982226
## b_condition.ct:scene_test2.ct                         -0.11505768 0.17683189
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.04924259 0.28717659
##                                                              Q2.5      Q97.5
## b_Intercept                                            3.27088243  3.6519592
## b_restricted_verb_noun.ct                              0.10380641  0.6858692
## b_condition.ct                                        -1.82138134 -1.0637861
## b_scene_test2.ct                                      -0.03080822  0.3224956
## b_restricted_verb_noun.ct:condition.ct                 0.15971748  1.2956458
## b_restricted_verb_noun.ct:scene_test2.ct              -0.27786889  0.3153994
## b_condition.ct:scene_test2.ct                         -0.45855107  0.2372264
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.59976374  0.5100318
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0) 
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] < 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0)
C5=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0) 
C6=mean(samps[,"b_condition.ct:scene_test2.ct"] > 0)
C7=mean(samps[,"b_restricted_verb_noun.ct:condition.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7)
## 1                   0.004666667
## 2                   0.000000000
## 3                   0.051416667
## 4                   0.006500000
## 5                   0.447166667
## 6                   0.252833333
## 7                   0.437833333
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun", "condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_restricted_verb_noun.ct", "b_condition.ct", "b_restricted_verb_noun.ct:condition.ct"))
##                                          Estimate Est.Error        Q2.5
## b_restricted_verb_noun.ct               0.3898703 0.1482730  0.09943112
## b_condition.ct                         -1.4450668 0.1930680 -1.82525827
## b_restricted_verb_noun.ct:condition.ct  0.7282702 0.2904465  0.16669012
##                                             Q97.5
## b_restricted_verb_noun.ct               0.6825214
## b_condition.ct                         -1.0633474
## b_restricted_verb_noun.ct:condition.ct  1.3026055
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1       0.00525
## 2       0.00000
## 3       0.00525
# The roughly predicted effect size from the previous study was 1.0. We use it as an estimate of the max effect we expect here
Bf(0.29, 0.73, uniform = 0, meanoftheory = 0, sdtheory = 1.00/2, tail = 1)
## $LikelihoodTheory
## [1] 0.6125222
## 
## $Likelihoodnull
## [1] 0.05788388
## 
## $BayesFactor
## [1] 10.58191
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.29, 0.73, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.15
print(high_threshold)
## [1] 4

Exploratory data analyses

Effect of statistical pre-emption: Comparison of children’s judgment ratings (acceptability) for witnessed versus unwitnessed forms

# Figure 34
judgments_unattested_attested.df <- subset(exp5_judgment_data.df, restricted_verb_noun == "yes")   


aggregated.means1 = aggregate(response ~ condition + attested_unattested + participant_private_id, judgments_unattested_attested.df, FUN=mean)
aggregated.means1<- rename(aggregated.means1, attested = attested_unattested)

aggregated.means1$attested<- recode(aggregated.means1$attested, "1" = "yes","0" = "no")


yarrr::pirateplot(formula = response ~  attested  + condition,
                  data = aggregated.means1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

# analyses
attested_vs_unattested = subset(exp5_preemption_judgment.df, restricted_verb_noun == "yes")

round(tapply(attested_vs_unattested$response, attested_vs_unattested$attested_unattested, mean),3)
##     0     1 
## 2.363 4.526
# model with tested construction
#Center variables of interest using the lizCenter function:
d0_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d0_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_preemption1, variable = c("b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                             Estimate Est.Error       Q2.5
## b_scene_test2.ct                         0.008965815 0.1041085 -0.1999387
## b_attested_unattested.ct                 2.048207543 0.2421165  1.5700349
## b_attested_unattested.ct:scene_test2.ct -0.160892582 0.2010730 -0.5529902
##                                             Q97.5
## b_scene_test2.ct                        0.2129446
## b_attested_unattested.ct                2.5191328
## b_attested_unattested.ct:scene_test2.ct 0.2332449
samps = as.matrix(as.mcmc(attested_unattested_preemption1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] > 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1     0.5330000
## 2     0.0000000
## 3     0.2078333
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(attested_unattested_preemption, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                          Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept              3.462851 0.08497127 3.300354 3.631902
## b_attested_unattested.ct 2.044481 0.23809666 1.567227 2.506679
mcmc_plot(attested_unattested_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
# prior from previous study with adults: 2.55 as a max
Bf(0.24, 2.04, uniform = 0, meanoftheory = 0, sdtheory = 2.55/2 , tail = 1)
## $LikelihoodTheory
## [1] 0.1786466
## 
## $Likelihoodnull
## [1] 3.402598e-16
## 
## $BayesFactor
## [1] 5.250301e+14
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.24, 2.04, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.04
print(high_threshold)
## [1] 4

Effect of statistical entrenchment: Comparison of children’s judgment ratings (acceptability) for witnessed versus unwitnessed forms

attested_vs_unattested_ent = subset(exp5_entrenchment_judgment.df, restricted_verb_noun == "yes")

round(tapply(attested_vs_unattested_ent$response, attested_vs_unattested_ent$attested_unattested, mean),3)
##     0     1 
## 4.267 4.717
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent1 = lizCenter(attested_vs_unattested_ent, list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d_attested_unattested_ent1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment1, variable = c("b_Intercept","b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                           Estimate Est.Error       Q2.5
## b_Intercept                              4.4927048 0.1253505  4.2443263
## b_scene_test2.ct                         0.1148531 0.1295208 -0.1483785
## b_attested_unattested.ct                 0.4448428 0.1642301  0.1215114
## b_attested_unattested.ct:scene_test2.ct -0.1320314 0.1336832 -0.3973605
##                                             Q97.5
## b_Intercept                             4.7354636
## b_scene_test2.ct                        0.3687156
## b_attested_unattested.ct                0.7684108
## b_attested_unattested.ct:scene_test2.ct 0.1318545
mcmc_plot(attested_unattested_entrenchment1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment1))

C1=mean(samps[,"b_scene_test2.ct"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] > 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1   0.182500000
## 2   0.003916667
## 3   0.162250000
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent = lizCenter(attested_vs_unattested_ent, list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested_ent, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                           Estimate Est.Error      Q2.5     Q97.5
## b_Intercept              4.4996846 0.1221072 4.2577494 4.7352712
## b_attested_unattested.ct 0.4336526 0.1676747 0.1075792 0.7635905
mcmc_plot(attested_unattested_entrenchment, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##     c(C1, C2)
## 1 0.000000000
## 2 0.004416667
# expect a difference of 0.38 from previous work
Bf(0.17, 0.43, uniform = 0, meanoftheory = 0, sdtheory = 0.346/2, tail = 1)
## $LikelihoodTheory
## [1] 0.6592187
## 
## $Likelihoodnull
## [1] 0.0957568
## 
## $BayesFactor
## [1] 6.884301
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.17, 0.43, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.09
print(high_threshold)
## [1] 2.72

Entrenchment vs. preemption: ratings for witnessed vs. unwitnessed forms

attested_vs_unattested_across = subset(exp5_judgment_data.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_across$response, list(attested_vs_unattested_across$condition, attested_vs_unattested_across$attested_unattested), mean),3)
##                  0     1
## entrenchment 4.267 4.717
## preemption   2.363 4.526
# model with test construction
#Center variables of interest using the lizCenter function:

d0_attested_unattested_all = lizCenter(attested_vs_unattested_across , list("attested_unattested","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_all <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct*condition.ct, data=d0_attested_unattested_all, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_all, variable = c("b_Intercept","b_attested_unattested.ct", "b_scene_test2.ct", "b_condition.ct", "b_attested_unattested.ct:scene_test2.ct", "b_attested_unattested.ct:condition.ct", "b_scene_test2.ct:condition.ct", "b_attested_unattested.ct:scene_test2.ct:condition.ct"))
##                                                         Estimate  Est.Error
## b_Intercept                                           3.94286110 0.07307188
## b_attested_unattested.ct                              1.33801706 0.14997445
## b_scene_test2.ct                                      0.05694750 0.08312553
## b_condition.ct                                       -0.99203426 0.14461966
## b_attested_unattested.ct:scene_test2.ct              -0.14898097 0.11918319
## b_attested_unattested.ct:condition.ct                 1.56512041 0.28883242
## b_scene_test2.ct:condition.ct                        -0.11020615 0.16290498
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.02992876 0.22878887
##                                                            Q2.5       Q97.5
## b_Intercept                                           3.8006283  4.08717890
## b_attested_unattested.ct                              1.0435207  1.63156440
## b_scene_test2.ct                                     -0.1066644  0.21915103
## b_condition.ct                                       -1.2725984 -0.70306330
## b_attested_unattested.ct:scene_test2.ct              -0.3789541  0.08264113
## b_attested_unattested.ct:condition.ct                 0.9906394  2.11495107
## b_scene_test2.ct:condition.ct                        -0.4260961  0.21453990
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.4828172  0.41911081
samps = as.matrix(as.mcmc(attested_unattested_all))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] < 0)
C4=mean(samps[,"b_condition.ct"] > 0)
C5=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] > 0)
C6=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0)
C7=mean(samps[,"b_scene_test2.ct:condition.ct"] > 0)
C8=mean(samps[,"b_attested_unattested.ct:scene_test2.ct:condition.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7,C8))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7, C8)
## 1                         0.0000000
## 2                         0.0000000
## 3                         0.2461667
## 4                         0.0000000
## 5                         0.1109167
## 6                         0.0000000
## 7                         0.2446667
## 8                         0.4459167
#Center variables of interest using the lizCenter function:
df_attested_unattested = lizCenter(attested_vs_unattested_across, list("attested_unattested", "condition"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment_preemption <- brm(formula = response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct * condition.ct, data=df_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment_preemption, variable = c("b_Intercept","b_condition.ct", "b_attested_unattested.ct","b_attested_unattested.ct:condition.ct"))
##                                         Estimate  Est.Error      Q2.5
## b_Intercept                            3.9440809 0.07314369  3.799811
## b_condition.ct                        -0.9897911 0.14532782 -1.267836
## b_attested_unattested.ct               1.3280303 0.14801619  1.043710
## b_attested_unattested.ct:condition.ct  1.5467504 0.29133103  0.971104
##                                            Q97.5
## b_Intercept                            4.0875400
## b_condition.ct                        -0.7009976
## b_attested_unattested.ct               1.6229089
## b_attested_unattested.ct:condition.ct  2.1069986
mcmc_plot(attested_unattested_entrenchment_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_attested_unattested.ct"] < 0) 
C4=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1                 0
## 2                 0
## 3                 0
## 4                 0
#max predicted effect size from previous study 2.11
Bf(0.29, 1.56, uniform = 0, meanoftheory = 0, sdtheory = 2.12/2, tail = 1)
## $LikelihoodTheory
## [1] 0.2650902
## 
## $Likelihoodnull
## [1] 7.160226e-07
## 
## $BayesFactor
## [1] 370225.9
H1RANGE = seq(0,4,by=0.01)
range_test <- Bf_range(0.29, 1.56, meanoftheory=0, sdtheoryrange= H1RANGE, tail=1)

# find values for which BF > 3
ev_for_h1 <- subset(data.frame(range_test), BF > 3)
low_threshold <- min(ev_for_h1$sdtheory)
high_threshold <- max(ev_for_h1$sdtheory)
print(low_threshold)
## [1] 0.06
print(high_threshold)
## [1] 4

Production data: Effect of statistical pre-emption

# Are participants producing more attested than unattested dets? We will now compare the proportion of attested dets (the intercept) for the restricted verbs against chance (on the logit scale, an intercept of 0 corresponds to 50%)
production_preemption_attested_unattested.df <- subset(exp5_preemption_production.df, experimenter == "GS")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, restricted_verb_noun =="yes")

round(tapply(production_preemption_attested_unattested.df$attested_unattested, production_preemption_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.940         0.944            NA
production_preemption_attested_unattested.df$verb_noun_type_training2 <- factor(production_preemption_attested_unattested.df$verb_noun_type_training2)

df_prod = lizCenter(production_preemption_attested_unattested.df , list("verb_noun_type_training2"))  

# maximally vague priors for the predictors and the intercept
prod_attested_unattested = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                Estimate Est.Error       Q2.5    Q97.5
## b_Intercept                   3.1341834 0.9960006  0.7411656 4.725943
## b_verb_noun_type_training2.ct 0.1543384 0.5963445 -0.9996258 1.321258
mcmc_plot(prod_attested_unattested, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##     c(C1, C2)
## 1 0.007083333
## 2 0.403583333
#same analyses without verb_training_type

# maximally vague priors for the intercept
prod_attested_unattested_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5   Q97.5
## b_Intercept  3.23289 0.9101538 1.100983 4.69088
mcmc_plot(prod_attested_unattested_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_final))
C1=mean(samps[,"b_Intercept"] < 0)


# We will now compare unattested productions for restricted vs. novel verbs
# Do participants produce the unwitnessed form less for the restricted verbs than for the novel verb?


production_preemption_restricted_novel.df <- subset(exp5_preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_novel.df<- subset(production_preemption_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- recode(production_preemption_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.060         0.056         0.537
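# Because the attested/unattested coding is arbitrary for the novel verb, a quick
# sensitivity check (not part of the original analysis) is to flip that assignment
# and confirm the novel verb stays near chance either way:
flipped.df <- production_preemption_restricted_novel.df
is_novel <- flipped.df$verb_noun_type_training2 == "novel"
flipped.df$attested_unattested[is_novel] <- 1 - flipped.df$attested_unattested[is_novel]
round(tapply(flipped.df$attested_unattested, flipped.df$verb_noun_type_training2, mean), 3)
# the novel mean should move from ~0.537 to ~0.463 (still near 0.5), while the
# restricted verbs' low unattested rates are unchanged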
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.058 0.537
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun)
production_preemption_restricted_novel1.df = lizCenter(production_preemption_restricted_novel.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_novel_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error         Q2.5     Q97.5
## b_Intercept               -2.436134 0.5072066 -3.362708819 -1.339206
## b_restricted_verb_noun.ct  1.889201 0.9335428 -0.008896963  3.632812
mcmc_plot(prod_unattested_novel_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##      c(C1, C2)
## 1 0.0001666667
## 2 0.0254166667

Production data: Effect of statistical entrenchment

#a. Are participants producing more attested than unattested dets?
# Here, we want to see how often participants produce the unattested det, e.g. the transitive-only det1 with a det2 (intransitive-only) verb in the intransitive construction at test,
# and vice versa

production_entrenchment_attested_unattested.df  <- subset(exp5_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_attested_unattested.df  <- subset(production_entrenchment_attested_unattested.df, restricted_verb_noun =="yes")

#We want to compare attested vs. unattested trials for transitive verbs in the intransitive inchoative construction at test
production_entrenchment_attested_unattested1.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

#And intransitive inchoative verbs in the transitive construction at test. Filter out irrelevant trials
production_entrenchment_attested_unattested2.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_attested_unattested.df <- rbind(production_entrenchment_attested_unattested1.df, production_entrenchment_attested_unattested2.df)

#How much of the time are participants producing attested items?
round(mean(production_entrenchment_attested_unattested.df$attested_unattested),3)
## [1] 0.167
# and separately for each verb type
round(tapply(production_entrenchment_attested_unattested.df$attested_unattested, production_entrenchment_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.167         0.167            NA
production_entrenchment_attested_unattested.df$verb_noun_type_training2 <- factor(production_entrenchment_attested_unattested.df$verb_noun_type_training2)
df_prod_ent = lizCenter(production_entrenchment_attested_unattested.df, list("verb_noun_type_training2"))


# maximally vague priors for the predictors and the intercept
prod_attested_unattested_ent = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_ent, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                  Estimate Est.Error       Q2.5     Q97.5
## b_Intercept                   -1.90833199 0.4273206 -2.7805322 -1.082364
## b_verb_noun_type_training2.ct  0.06956874 0.4816382 -0.8592132  1.020713
mcmc_plot(prod_attested_unattested_ent, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   1.00000
## 2   0.44325
#same analysis without the verb_noun_type_training2 predictor


# maximally vague priors for the intercept
prod_attested_unattested_ent_final = brm(formula = attested_unattested ~1 + (1|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), set_prior("normal(0, 1)", class = "Intercept"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

summary(prod_attested_unattested_ent_final, WAIC=T)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ 1 + (1 | participant_private_id) 
##    Data: df_prod_ent (Number of observations: 168) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 21) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     1.59      0.55     0.75     2.89 1.00     3355     5062
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -1.86      0.42    -2.72    -1.06 1.00     5339     6044
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
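# Convergence can also be checked programmatically (illustrative, using the
# standard brms helpers rhat() and neff_ratio()):
max(rhat(prod_attested_unattested_ent_final))        # should be close to 1
min(neff_ratio(prod_attested_unattested_ent_final))  # smallest effective-sample-size ratio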
posterior_summary(prod_attested_unattested_ent_final, variable = c("b_Intercept"))
##              Estimate Est.Error      Q2.5     Q97.5
## b_Intercept -1.864721 0.4167428 -2.723396 -1.064065
mcmc_plot(prod_attested_unattested_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 1
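# For interpretation, the logit-scale intercept can be mapped back to the
# probability scale:
round(plogis(-1.86), 3)
## [1] 0.135
# slightly below the raw attested rate of 0.167 reported above, because the
# population-level intercept conditions on a participant effect of zero rather
# than averaging participant-level proportions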
# c. We will now compare unattested productions for restricted vs. novel verbs
# Do participants produce the unwitnessed form less often for the 2 non-alternating verbs than for the novel verb? (the “unwitnessed” form has to be set arbitrarily here)


production_entrenchment_restricted_novel.df <- subset(exp5_entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_novel.df<- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily set all det1s as attested and all det2s as unattested

production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_novel.df$attested_unattested)
production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_novel.df$attested_unattested)

# select trials featuring the novel verb in the intransitive inchoative construction
production_entrenchment_restricted_novel1.df <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "novel"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_novel2.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_novel3.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_novel.df <- rbind(production_entrenchment_restricted_novel1.df, production_entrenchment_restricted_novel2.df, production_entrenchment_restricted_novel3.df)


round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.167         0.167         0.060
# reverse coding to focus on unattested rather than attested for novel vs. restricted
production_entrenchment_restricted_novel.df$attested_unattested<- recode(production_entrenchment_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.833 0.940
# i.e., participants produce *unattested forms* less often for the restricted verbs than for the novel verb

production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun)
production_entrenchment_restricted_novel1.df = lizCenter(production_entrenchment_restricted_novel.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_novel_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_ent_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error        Q2.5    Q97.5
## b_Intercept               2.2589957 0.4016533 1.493760371 3.065701
## b_restricted_verb_noun.ct 0.9940012 0.5189887 0.000908003 2.038622
mcmc_plot(prod_unattested_novel_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##    c(C1, C2)
## 1 0.00000000
## 2 0.02466667

Analyses across experiments

Preregistered data analyses

Question 1: Have participants picked up on the difference in meaning between the two argument-structure constructions?

Production data

#Figure 17
# note the parentheses: & binds more tightly than |, so they are needed to keep both verb types within the entrenchment condition
RQ1_graph_productions.df = subset(entrenchment_production.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))
RQ1_graph_productions.df = subset(RQ1_graph_productions.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.graph1 = aggregate(semantically_correct ~ verb_noun_type_training2 + participant_private_id, RQ1_graph_productions.df, FUN=mean)

aggregated.graph1 <- rename(aggregated.graph1, verb_noun = verb_noun_type_training2,
                            correct = semantically_correct)

yarrr::pirateplot(formula = correct  ~ verb_noun,
                  data = aggregated.graph1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "% semantically correct",
                  cex.lab = 1,
                  cex.axis = 1,
                  cex.names = 1,
                  yaxt = "n")

axis(2, at = seq(0, 1, by = 0.25), las=1)
abline(h = 0.50, lty = 2)

#1 alternating verb production

alternating_prod.df = subset(entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")

#and filter out responses where participants said something other than det1 or det2
alternating_prod.df = subset(alternating_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_alternating_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, alternating_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_alternating_prod.df$semantically_correct),3)
## [1] 0.966
# average accuracy separately for causative and inchoative scenes
round(tapply(aggregated.means_alternating_prod.df$semantically_correct, aggregated.means_alternating_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.978         0.953
# maximally vague priors for the intercept and the predictors
a = lizCenter(alternating_prod.df, list("scene_test2"))  

alternating_model <-brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=a, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model, variable = c("b_Intercept", "b_scene_test2.ct" ))
##                    Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       3.9295409 0.2865117  3.409887 4.5343269
## b_scene_test2.ct -0.5359587 0.4230030 -1.341275 0.3250266
mcmc_plot(alternating_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.1038333
# no difference between construction 1 and construction 2

# Final model
# maximally vague priors for the intercept 
alternating_model_final = brm(formula = semantically_correct~1 + (1|participant_private_id), data=a, family = bernoulli(link = logit),set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5  Q97.5
## b_Intercept 3.789272 0.2624495 3.315712 4.3421
mcmc_plot(alternating_model_final, variable = "b_Intercept", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0
#2 novel verb production

novel_prod.df = subset(entrenchment_production.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")

#filter out responses where participants said something other than det1 or det2
novel_prod.df = subset(novel_prod.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")

# aggregated dataframe for means
aggregated.means_novel_prod.df = aggregate(semantically_correct ~ scene_test2 + participant_private_id, novel_prod.df, FUN=mean)

# average accuracy across trial types
round(mean(aggregated.means_novel_prod.df$semantically_correct),3)
## [1] 0.952
# average accuracy separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_prod.df$semantically_correct, aggregated.means_novel_prod.df$scene_test2, mean),3)
## construction1 construction2 
##         0.959         0.946
b = lizCenter(novel_prod.df, list("scene_test2"))  

# maximally vague priors for the intercept and the predictors
novel_model <- brm(formula = semantically_correct~scene_test2.ct + (1 + scene_test2.ct|participant_private_id), data=b, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model, variable = c("b_Intercept", "b_scene_test2.ct"))
##                   Estimate Est.Error      Q2.5     Q97.5
## b_Intercept       4.270171 0.3288543  3.672304 4.9591253
## b_scene_test2.ct -0.650139 0.5062990 -1.690828 0.3017423
mcmc_plot(novel_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1   0.00000
## 2   0.09225
# no difference between construction 1 and construction 2  
# Final model

# maximally vague priors for the intercept 
novel_model_final <- brm(formula = semantically_correct~1+ (1|participant_private_id), data=b, family = bernoulli(link = logit), set_prior("normal(0,1)", class="Intercept"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 4.018292  0.286136 3.501378 4.614413
mcmc_plot(novel_model_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 0

Judgment data

#Figure 18
RQ1_graph_judgments.df = subset(entrenchment_judgment.df, condition == "entrenchment" & (verb_noun_type_training2 == "alternating" | verb_noun_type_training2 == "novel"))
# filter out experiments 4 and 5 that have no semantically incorrect trials
RQ1_graph_judgments.df = subset(RQ1_graph_judgments.df, experiment != "exp4")
RQ1_graph_judgments.df = subset(RQ1_graph_judgments.df, experiment != "exp5")


# aggregated dataframe for means
aggregated.graph2 = aggregate(response ~ verb_noun_type_training2 + semantically_correct + participant_private_id, RQ1_graph_judgments.df, FUN=mean)
aggregated.graph2$semantically_correct <- recode(aggregated.graph2$semantically_correct, "1" = "yes","0" = "no")

aggregated.graph2 <- rename(aggregated.graph2, verb_noun = verb_noun_type_training2,
                                           correct = semantically_correct)

yarrr::pirateplot(formula = response ~ correct + verb_noun,
                  data = aggregated.graph2,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

#1 alternating verb judgments

alternating_judgments.df = subset(entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "alternating")
alternating_judgments.df= subset(alternating_judgments.df, experiment != "exp4")
alternating_judgments.df = subset(alternating_judgments.df, experiment != "exp5")

# aggregated dataframe for means
aggregated.means_alternating_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, alternating_judgments.df, FUN=mean)
aggregated.means_alternating_judgments$semantically_correct<- recode(aggregated.means_alternating_judgments$semantically_correct, "1" = "yes","0" = "no")
aggregated.means_alternating_judgments$scene_test2 <- recode(aggregated.means_alternating_judgments$scene_test2, "construction1" = "transitive causative","construction2" = "intransitive inchoative")

# average rating for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_alternating_judgments$response, aggregated.means_alternating_judgments$semantically_correct, mean),3)
##    no   yes 
## 2.386 4.825
# average rating separately for causative and noncausative scenes
round(tapply(aggregated.means_alternating_judgments$response, list(aggregated.means_alternating_judgments$semantically_correct, aggregated.means_alternating_judgments$scene_test2), mean),3)
##     transitive causative intransitive inchoative
## no                 2.386                   2.386
## yes                4.825                   4.825
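# Sanity check on the centered contrast: with a two-level centered predictor, the
# semantically_correct.ct coefficient should approximately recover the raw
# difference between correct and incorrect mean ratings:
round(4.825 - 2.386, 3)
## [1] 2.439
# cf. the model estimate of ~2.42 below; the small gap reflects partial pooling
# and unequal trial counts across participants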
c = lizCenter(alternating_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=c, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(alternating_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                               Estimate  Est.Error        Q2.5
## b_Intercept                               3.6099980477 0.04125045  3.52934295
## b_scene_test2.ct                          0.0001197042 0.04099151 -0.08091563
## b_semantically_correct.ct                 2.4245286009 0.07612683  2.27373456
## b_scene_test2.ct:semantically_correct.ct -0.0006052378 0.08315710 -0.16317897
##                                              Q97.5
## b_Intercept                              3.6916550
## b_scene_test2.ct                         0.0815820
## b_semantically_correct.ct                2.5718158
## b_scene_test2.ct:semantically_correct.ct 0.1628261
mcmc_plot(alternating_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.5005000
## 3         0.0000000
## 4         0.5019167
# no difference between construction 1 and construction 2

# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here)
alternating_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=c, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(alternating_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept               3.609998 0.04125045 3.529343 3.691655
## b_semantically_correct.ct 2.424529 0.07612683 2.273735 2.571816
mcmc_plot(alternating_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(alternating_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
#2 novel verb judgments

novel_judgments.df = subset(entrenchment_judgment.df, condition == "entrenchment" & verb_noun_type_training2 == "novel")
novel_judgments.df = subset(novel_judgments.df, experiment != "exp4")
novel_judgments.df = subset(novel_judgments.df, experiment != "exp5")

# aggregated dataframe for means
aggregated.means_novel_judgments = aggregate(response ~ scene_test2 + semantically_correct + participant_private_id, novel_judgments.df, FUN=mean)


# average rating for semantically correct vs. incorrect trials across causative and noncausative trial types
round(tapply(aggregated.means_novel_judgments$response, aggregated.means_novel_judgments$semantically_correct, mean),3)
##     0     1 
## 2.195 3.986
# average rating separately for causative and noncausative scenes
round(tapply(aggregated.means_novel_judgments$response, list(aggregated.means_novel_judgments$semantically_correct, aggregated.means_novel_judgments$scene_test2), mean),3)
##   construction1 construction2
## 0         2.203         2.187
## 1         3.984         3.988
d = lizCenter(novel_judgments.df, list("scene_test2", "semantically_correct"))  

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments <-brm(formula = response~scene_test2.ct * semantically_correct.ct + (1 + scene_test2.ct*semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_judgments, variable = c("b_Intercept", "b_scene_test2.ct", "b_semantically_correct.ct", "b_scene_test2.ct:semantically_correct.ct"))
##                                             Estimate  Est.Error       Q2.5
## b_Intercept                               3.08415061 0.06705235  2.9519737
## b_scene_test2.ct                         -0.00604950 0.05013327 -0.1035992
## b_semantically_correct.ct                 1.77038714 0.10343266  1.5697442
## b_scene_test2.ct:semantically_correct.ct  0.02142732 0.10576443 -0.1889624
##                                               Q97.5
## b_Intercept                              3.21373016
## b_scene_test2.ct                         0.09131447
## b_semantically_correct.ct                1.97759097
## b_scene_test2.ct:semantically_correct.ct 0.23152009
mcmc_plot(novel_model_judgments, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_scene_test2.ct"] > 0)
C3=mean(samps[,"b_semantically_correct.ct"] < 0)
C4=mean(samps[,"b_scene_test2.ct:semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.4543333
## 3         0.0000000
## 4         0.4171667
# no difference between construction 1 and construction 2
# Final model

# maximally vague priors for the predictors (we don't interpret the intercept here) 
novel_model_judgments_final <-brm(formula = response~semantically_correct.ct + (1 + semantically_correct.ct|participant_private_id), data=d, family = gaussian(), set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(novel_model_judgments_final, variable = c("b_Intercept", "b_semantically_correct.ct"))
##                           Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept               3.084151 0.06705235 2.951974 3.213730
## b_semantically_correct.ct 1.770387 0.10343266 1.569744 1.977591
mcmc_plot(novel_model_judgments_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(novel_model_judgments_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_semantically_correct.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Question 2: Does statistical preemption constrain morphosyntactic generalizations (judgment data)?

#first, filter out semantically incorrect trials

judgments_unattested_novel.df <- subset(combined_judgment_data.df, semantically_correct == "1")   

#we only want to keep novel-verb trials
judgments_novel.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

#and unattested uses of the restricted items
judgments_unattested_constr1.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
judgments_unattested_constr2.df <- subset(judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

judgments_unattested_novel.df <- rbind(judgments_novel.df, judgments_unattested_constr1.df, judgments_unattested_constr2.df)

aggregated.means = aggregate(response ~ condition + restricted_verb_noun + participant_private_id, judgments_unattested_novel.df, FUN=mean)
aggregated.means<- rename(aggregated.means, restricted = restricted_verb_noun)

yarrr::pirateplot(formula = response ~ restricted + condition,
                  data = aggregated.means,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

judgments_unattested_novel_preemption.df <- subset(judgments_unattested_novel.df, condition == "preemption")
judgments_unattested_novel_preemption.df$restricted_verb_noun <- factor(judgments_unattested_novel_preemption.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(judgments_unattested_novel_preemption.df$response, judgments_unattested_novel_preemption.df$restricted_verb_noun, mean),3)
##   yes    no 
## 2.404 2.948
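# The centered restricted_verb_noun contrast in the model below should roughly
# recover this raw difference in mean ratings:
round(2.948 - 2.404, 3)
## [1] 0.544
# cf. the model estimate of ~0.56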
#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_preemption_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                              Estimate  Est.Error       Q2.5
## b_Intercept                               2.694640017 0.04478853  2.6074026
## b_restricted_verb_noun.ct                 0.557126813 0.07939820  0.3994067
## b_scene_test2.ct                         -0.068370221 0.04053798 -0.1481546
## b_restricted_verb_noun.ct:scene_test2.ct  0.008627146 0.07298986 -0.1350547
##                                               Q97.5
## b_Intercept                              2.78227201
## b_restricted_verb_noun.ct                0.71380060
## b_scene_test2.ct                         0.01038846
## b_restricted_verb_noun.ct:scene_test2.ct 0.15126194
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.0000000
## 3         0.0437500
## 4         0.5475833
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel = lizCenter(judgments_unattested_novel_preemption.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_preemption_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(judgments_preemption_model, WAIC=T)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: response ~ (1 + restricted_verb_noun.ct | participant_private_id) + restricted_verb_noun.ct 
##    Data: d_unattested_novel (Number of observations: 3452) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 237) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              0.65      0.03     0.59     0.72
## sd(restricted_verb_noun.ct)                1.16      0.06     1.04     1.29
## cor(Intercept,restricted_verb_noun.ct)    -0.04      0.07    -0.18     0.10
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     2691     4432
## sd(restricted_verb_noun.ct)            1.00     3351     6752
## cor(Intercept,restricted_verb_noun.ct) 1.00     2189     4293
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   2.70      0.04     2.61     2.78 1.00     2201
## restricted_verb_noun.ct     0.56      0.08     0.40     0.72 1.00     2824
##                         Tail_ESS
## Intercept                   4111
## restricted_verb_noun.ct     4258
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma     0.85      0.01     0.83     0.88 1.00    17176     9493
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(judgments_preemption_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_preemption_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Question 3: Does statistical entrenchment constrain morphosyntactic generalizations (judgment data)?

#first, filter out semantically incorrect trials
entrenchment_judgments_unattested_novel.df <- subset(entrenchment_judgment.df, semantically_correct == "1")   

#we only want to keep novel-verb trials

entrenchment_judgments_novel.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "novel")   

#and unattested uses of the restricted items

entrenchment_judgments_unattested_constr1.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
entrenchment_judgments_unattested_constr2.df <- subset(entrenchment_judgments_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

entrenchment_judgments_unattested_novel.df <- rbind(entrenchment_judgments_novel.df, entrenchment_judgments_unattested_constr1.df, entrenchment_judgments_unattested_constr2.df)
entrenchment_judgments_unattested_novel.df$restricted_verb_noun <- factor(entrenchment_judgments_unattested_novel.df$restricted_verb_noun, levels = c("yes", "no"))


round(tapply(entrenchment_judgments_unattested_novel.df$response, entrenchment_judgments_unattested_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 4.425 4.212
#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df, list("restricted_verb_noun","scene_test2"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct*scene_test2.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_scene_test2.ct","b_restricted_verb_noun.ct:scene_test2.ct"))
##                                             Estimate  Est.Error        Q2.5
## b_Intercept                               4.30016661 0.05476121  4.19103522
## b_restricted_verb_noun.ct                -0.25459768 0.07944004 -0.40919335
## b_scene_test2.ct                          0.02568000 0.04629136 -0.06283496
## b_restricted_verb_noun.ct:scene_test2.ct  0.06459904 0.08061273 -0.09187710
##                                                Q97.5
## b_Intercept                               4.40702513
## b_restricted_verb_noun.ct                -0.09854862
## b_scene_test2.ct                          0.11673602
## b_restricted_verb_noun.ct:scene_test2.ct  0.22334317
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] < 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1         0.0000000
## 2         0.9991667
## 3         0.2918333
## 4         0.2118333
# SIMPLIFIED MODEL (FINAL)

#Center variables of interest using the lizCenter function:
d_unattested_novel_entrenchment = lizCenter(entrenchment_judgments_unattested_novel.df , list("restricted_verb_noun"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_entrenchment_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct, data=d_unattested_novel_entrenchment, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_entrenchment_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct"))
##                             Estimate  Est.Error       Q2.5       Q97.5
## b_Intercept                4.3010819 0.05557313  4.1931905  4.41064111
## b_restricted_verb_noun.ct -0.2519378 0.08053942 -0.4096279 -0.09342229
mcmc_plot(judgments_entrenchment_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_entrenchment_model))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

# note: the effect is in the opposite direction to the entrenchment prediction (unattested uses of restricted verbs are rated higher, not lower, than the novel verb)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.9991667

Question 4: Is the effect of statistical pre-emption larger than that of entrenchment (judgment data)?

#first, filter out semantically incorrect trials
all_judgment_unattested_novel.df <- subset(combined_judgment_data.df, semantically_correct == "1")   

#we only want to keep novel-verb trials

all_judgment_novel.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "novel")   

#and unattested uses of the restricted items

all_judgment_unattested_constr1.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction1" & attested_unattested == "0")   
all_judgment_unattested_constr2.df <- subset(all_judgment_unattested_novel.df, verb_noun_type_training2 == "construction2" & attested_unattested == "0")   

all_judgment_unattested_novel.df <- rbind(all_judgment_novel.df, all_judgment_unattested_constr1.df, all_judgment_unattested_constr2.df)
all_judgment_unattested_novel.df$restricted_verb_noun <- factor(all_judgment_unattested_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(all_judgment_unattested_novel.df$response, list(all_judgment_unattested_novel.df$restricted_verb_noun, all_judgment_unattested_novel.df$condition), mean),3)
##     entrenchment preemption
## yes        4.425      2.404
## no         4.212      2.948
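# The restricted_verb_noun.ct:condition.ct interaction tested below corresponds
# (approximately) to the difference between the simple effects visible in the
# raw cell means above:
cell_means <- matrix(c(4.425, 4.212, 2.404, 2.948), nrow = 2,
                     dimnames = list(restricted = c("yes", "no"),
                                     condition = c("entrenchment", "preemption")))
simple_effects <- cell_means["no", ] - cell_means["yes", ]  # novel minus restricted
simple_effects
## entrenchment   preemption 
##       -0.213        0.544
diff(simple_effects)
## preemption 
##      0.757
# cf. the modelled interaction of ~0.80 below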
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here) 
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct*scene_test2.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct *scene_test2.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_Intercept", "b_restricted_verb_noun.ct","b_condition.ct", "b_scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct", "b_restricted_verb_noun.ct:scene_test2.ct","b_condition.ct:scene_test2.ct", "b_restricted_verb_noun.ct:condition.ct:scene_test2.ct" ))
##                                                          Estimate  Est.Error
## b_Intercept                                            3.27272928 0.03504070
## b_restricted_verb_noun.ct                              0.26889815 0.05786497
## b_condition.ct                                        -1.59759283 0.07012373
## b_scene_test2.ct                                      -0.03305468 0.03097753
## b_restricted_verb_noun.ct:condition.ct                 0.80960692 0.11330014
## b_restricted_verb_noun.ct:scene_test2.ct               0.02704369 0.05503256
## b_condition.ct:scene_test2.ct                         -0.09751280 0.06200363
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.05253435 0.11029846
##                                                              Q2.5       Q97.5
## b_Intercept                                            3.20410436  3.34101665
## b_restricted_verb_noun.ct                              0.15452612  0.38235798
## b_condition.ct                                        -1.73669882 -1.45843740
## b_scene_test2.ct                                      -0.09336996  0.02881578
## b_restricted_verb_noun.ct:condition.ct                 0.58637645  1.03173397
## b_restricted_verb_noun.ct:scene_test2.ct              -0.08121869  0.13441623
## b_condition.ct:scene_test2.ct                         -0.22167450  0.02318673
## b_restricted_verb_noun.ct:condition.ct:scene_test2.ct -0.26712046  0.16125639
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0) 
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0)
C5=mean(samps[,"b_restricted_verb_noun.ct:scene_test2.ct"] < 0) 
C6=mean(samps[,"b_condition.ct:scene_test2.ct"] > 0)
C7=mean(samps[,"b_restricted_verb_noun.ct:condition.ct:scene_test2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7)
## 1                    0.00000000
## 2                    0.00000000
## 3                    0.14250000
## 4                    0.00000000
## 5                    0.31008333
## 6                    0.05658333
## 7                    0.32016667
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
df = lizCenter(all_judgment_unattested_novel.df, list("restricted_verb_noun", "condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
judgments_pre_vs_ent_model <- brm(formula = response~(1 +restricted_verb_noun.ct|participant_private_id)+restricted_verb_noun.ct * condition.ct, data=df, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(judgments_pre_vs_ent_model, variable = c("b_restricted_verb_noun.ct", "b_condition.ct", "b_restricted_verb_noun.ct:condition.ct"))
##                                          Estimate  Est.Error       Q2.5
## b_restricted_verb_noun.ct               0.2643104 0.05786706  0.1500529
## b_condition.ct                         -1.5972162 0.06994556 -1.7357241
## b_restricted_verb_noun.ct:condition.ct  0.8042308 0.11591302  0.5751904
##                                             Q97.5
## b_restricted_verb_noun.ct               0.3774092
## b_condition.ct                         -1.4603555
## b_restricted_verb_noun.ct:condition.ct  1.0271897
mcmc_plot(judgments_pre_vs_ent_model, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(judgments_pre_vs_ent_model))

C1=mean(samps[,"b_restricted_verb_noun.ct"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_restricted_verb_noun.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1             0
## 2             0
## 3             0

Exploratory data analyses

Effect of statistical pre-emption: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

judgments_unattested_attested.df <- subset(preemption_judgment.df, semantically_correct == "1")   
judgments_unattested_attested.df <- subset(judgments_unattested_attested.df, restricted_verb_noun == "yes")   


aggregated.means1 = aggregate(response ~ condition + attested_unattested + participant_private_id, judgments_unattested_attested.df , FUN=mean)
aggregated.means1<- rename(aggregated.means1, attested = attested_unattested)

aggregated.means1$attested<- recode(aggregated.means1$attested, "1" = "yes","0" = "no")


yarrr::pirateplot(formula = response ~  attested  + condition,
                  data = aggregated.means1,
                  main = "",
                  theme=2,
                  point.o = .3,
                  gl.col = 'white',
                  ylab = "Rating",
                  cex.lab = 0.8,
                  cex.axis = 1,
                  cex.names = 0.8,
                  yaxt = "n")

axis(2, at = seq(1, 9, by = 1), las=1)

# analyses
attested_vs_unattested = subset(preemption_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested$response, attested_vs_unattested$attested_unattested, mean),3)
##     0     1 
## 2.404 4.774
# model with tested construction
#Center variables of interest using the lizCenter function:
d0_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d0_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_preemption1, variable = c("b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                            Estimate  Est.Error       Q2.5
## b_scene_test2.ct                        -0.04782811 0.02828421 -0.1025756
## b_attested_unattested.ct                 2.31851412 0.08437042  2.1526593
## b_attested_unattested.ct:scene_test2.ct  0.04983503 0.05396899 -0.0567715
##                                               Q97.5
## b_scene_test2.ct                        0.008407984
## b_attested_unattested.ct                2.482373424
## b_attested_unattested.ct:scene_test2.ct 0.155764617
samps = as.matrix(as.mcmc(attested_unattested_preemption1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1    0.04558333
## 2    0.00000000
## 3    0.17575000
#SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested = lizCenter(attested_vs_unattested , list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_preemption <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))


posterior_summary(attested_unattested_preemption, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                          Estimate  Est.Error     Q2.5    Q97.5
## b_Intercept              3.586500 0.02950385 3.528444 3.644153
## b_attested_unattested.ct 2.320058 0.08448532 2.152761 2.483228
mcmc_plot(attested_unattested_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Effect of statistical entrenchment: Comparison of adults’ judgment ratings (acceptability) for witnessed versus unwitnessed forms

attested_vs_unattested_ent = subset(entrenchment_judgment.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_ent$response, attested_vs_unattested_ent$attested_unattested, mean),3)
##     0     1 
## 4.425 4.862
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent1 = lizCenter(attested_vs_unattested_ent, list("attested_unattested","scene_test2"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment1 <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct, data=d_attested_unattested_ent1, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment1, variable = c("b_Intercept","b_scene_test2.ct", "b_attested_unattested.ct", "b_attested_unattested.ct:scene_test2.ct"))
##                                             Estimate  Est.Error        Q2.5
## b_Intercept                              4.651523009 0.03802348  4.57765662
## b_scene_test2.ct                        -0.011653557 0.04125062 -0.09292965
## b_attested_unattested.ct                 0.441290259 0.06126382  0.32128623
## b_attested_unattested.ct:scene_test2.ct  0.004601284 0.08113709 -0.15178408
##                                             Q97.5
## b_Intercept                             4.7274189
## b_scene_test2.ct                        0.0699843
## b_attested_unattested.ct                0.5617672
## b_attested_unattested.ct:scene_test2.ct 0.1633242
mcmc_plot(attested_unattested_entrenchment1, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment1))

C1=mean(samps[,"b_scene_test2.ct"] > 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 
C3=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)  

pMCMC=as.data.frame(c(C1,C2,C3))
pMCMC
##   c(C1, C2, C3)
## 1     0.3873333
## 2     0.0000000
## 3     0.4786667
# SIMPLIFIED MODEL
#Center variables of interest using the lizCenter function:
d_attested_unattested_ent = lizCenter(attested_vs_unattested_ent, list("attested_unattested"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment <- brm(formula =response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct, data=d_attested_unattested_ent, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment, variable = c("b_Intercept", "b_attested_unattested.ct"))
##                           Estimate  Est.Error      Q2.5    Q97.5
## b_Intercept              4.6491487 0.03829200 4.5750013 4.724242
## b_attested_unattested.ct 0.4368054 0.06144661 0.3162615 0.556772
mcmc_plot(attested_unattested_entrenchment, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0) 


pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Entrenchment vs. preemption: ratings for witnessed vs. unwitnessed forms

attested_vs_unattested_across = subset(combined_judgment_data.df, restricted_verb_noun == "yes" & semantically_correct == "1")

round(tapply(attested_vs_unattested_across$response, list(attested_vs_unattested_across$condition, attested_vs_unattested_across$attested_unattested), mean),3)
##                  0     1
## entrenchment 4.425 4.862
## preemption   2.404 4.774
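# The attested-minus-unattested gap in the raw means above is much larger under
# preemption (4.774 - 2.404 = 2.370) than under entrenchment (4.862 - 4.425 = 0.437);
# their difference is what the attested_unattested.ct:condition.ct interaction
# below estimates (~1.87):
round((4.774 - 2.404) - (4.862 - 4.425), 3)
## [1] 1.933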
# model with test construction
#Center variables of interest using the lizCenter function:

d0_attested_unattested_all = lizCenter(attested_vs_unattested_across , list("attested_unattested","scene_test2","condition"))

# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_all <- brm(formula =response~(1 +attested_unattested.ct*scene_test2.ct|participant_private_id)+attested_unattested.ct*scene_test2.ct*condition.ct, data=d0_attested_unattested_all, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_all, variable = c("b_Intercept","b_attested_unattested.ct", "b_scene_test2.ct", "b_condition.ct", "b_attested_unattested.ct:scene_test2.ct", "b_attested_unattested.ct:condition.ct", "b_scene_test2.ct:condition.ct", "b_attested_unattested.ct:scene_test2.ct:condition.ct"))
##                                                         Estimate  Est.Error
## b_Intercept                                           3.96888162 0.02392786
## b_attested_unattested.ct                              1.65086627 0.05526792
## b_scene_test2.ct                                     -0.03449675 0.02425933
## b_condition.ct                                       -1.06229467 0.04644743
## b_attested_unattested.ct:scene_test2.ct               0.02778778 0.04576076
## b_attested_unattested.ct:condition.ct                 1.87119844 0.10772304
## b_scene_test2.ct:condition.ct                        -0.03856385 0.04846482
## b_attested_unattested.ct:scene_test2.ct:condition.ct  0.04895835 0.09244439
##                                                             Q2.5       Q97.5
## b_Intercept                                           3.92176501  4.01497106
## b_attested_unattested.ct                              1.54384583  1.75770001
## b_scene_test2.ct                                     -0.08207548  0.01255682
## b_condition.ct                                       -1.15305530 -0.97134577
## b_attested_unattested.ct:scene_test2.ct              -0.06221017  0.11782945
## b_attested_unattested.ct:condition.ct                 1.66042803  2.08488751
## b_scene_test2.ct:condition.ct                        -0.13422724  0.05589474
## b_attested_unattested.ct:scene_test2.ct:condition.ct -0.13213757  0.22829693
samps = as.matrix(as.mcmc(attested_unattested_all))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_attested_unattested.ct"] < 0)
C3=mean(samps[,"b_scene_test2.ct"] > 0)
C4=mean(samps[,"b_condition.ct"] > 0)
C5=mean(samps[,"b_attested_unattested.ct:scene_test2.ct"] < 0)
C6=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0)
C7=mean(samps[,"b_scene_test2.ct:condition.ct"] > 0)
C8=mean(samps[,"b_attested_unattested.ct:scene_test2.ct:condition.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2,C3,C4,C5,C6,C7,C8))
pMCMC
##   c(C1, C2, C3, C4, C5, C6, C7, C8)
## 1                         0.0000000
## 2                         0.0000000
## 3                         0.0760000
## 4                         0.0000000
## 5                         0.2671667
## 6                         0.0000000
## 7                         0.2155833
## 8                         0.2977500
#Center variables of interest using the lizCenter function:
df_attested_unattested = lizCenter(attested_vs_unattested_across, list("attested_unattested", "condition"))


# maximally vague priors for the predictors (we don't interpret the intercept here)
attested_unattested_entrenchment_preemption <- brm(formula = response~(1 +attested_unattested.ct|participant_private_id)+attested_unattested.ct * condition.ct, data=df_attested_unattested, family=gaussian(),set_prior("normal(0,1)", class="b"),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(attested_unattested_entrenchment_preemption, variable = c("b_Intercept","b_condition.ct", "b_attested_unattested.ct","b_attested_unattested.ct:condition.ct"))
##                                        Estimate  Est.Error      Q2.5      Q97.5
## b_Intercept                            3.970473 0.02365269  3.924609  4.0176949
## b_condition.ct                        -1.058520 0.04770283 -1.151736 -0.9634889
## b_attested_unattested.ct               1.647459 0.05507886  1.538985  1.7543213
## b_attested_unattested.ct:condition.ct  1.870281 0.10778340  1.657054  2.0812288
mcmc_plot(attested_unattested_entrenchment_preemption, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(attested_unattested_entrenchment_preemption))

C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_condition.ct"] > 0)
C3=mean(samps[,"b_attested_unattested.ct"] < 0) 
C4=mean(samps[,"b_attested_unattested.ct:condition.ct"] < 0) 

pMCMC=as.data.frame(c(C1,C2,C3,C4))
pMCMC
##   c(C1, C2, C3, C4)
## 1                 0
## 2                 0
## 3                 0
## 4                 0

Production data: Effect of statistical pre-emption

# Are participants producing more attested than unattested dets? We now compare the proportion of attested dets (the intercept) for the restricted verbs against chance.

production_preemption_attested_unattested.df <- subset(preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_attested_unattested.df <- subset(production_preemption_attested_unattested.df, restricted_verb_noun =="yes")

round(tapply(production_preemption_attested_unattested.df$attested_unattested, production_preemption_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.974         0.958            NA
production_preemption_attested_unattested.df$verb_noun_type_training2 <- factor(production_preemption_attested_unattested.df$verb_noun_type_training2)

df_prod = lizCenter(production_preemption_attested_unattested.df, list("verb_noun_type_training2"))

# maximally vague priors for the predictors and the intercept
prod_attested_unattested = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                Estimate Est.Error      Q2.5    Q97.5
## b_Intercept                   6.3656824 0.4180139  5.580832 7.235998
## b_verb_noun_type_training2.ct 0.1218155 0.5895998 -1.003655 1.315805
mcmc_plot(prod_attested_unattested, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1     0.000
## 2     0.423
# same analysis without verb_noun_type_training2

# maximally vague priors for the intercept
prod_attested_unattested_final = brm(formula = attested_unattested ~ 1 + (1|participant_private_id), data = df_prod, family = bernoulli(link = logit), prior = set_prior("normal(0, 1)", class = "Intercept"), cores = 4, warmup = 2000, iter = 5000, chains = 4, control = list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_final, variable = c("b_Intercept"))
##             Estimate Est.Error     Q2.5    Q97.5
## b_Intercept 6.216118 0.4015867 5.472719 7.052819
mcmc_plot(prod_attested_unattested_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_final))
C1=mean(samps[,"b_Intercept"] < 0)


# We will now compare unattested forms for restricted vs. novel verbs.
# Do participants produce the unwitnessed form less for the restricted verbs than for the novel verb?


production_preemption_restricted_novel.df <- subset(preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_novel.df<- subset(production_preemption_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- ifelse(production_preemption_restricted_novel.df$verb_noun_type_training2 == "novel" & production_preemption_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_novel.df$attested_unattested)

production_preemption_restricted_novel.df$attested_unattested <- recode(production_preemption_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
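Because the attested/unattested coding for the novel verb is arbitrary, a quick cross-tabulation (an optional sanity check, not in the original script) confirms the recoding: after the flip, 1 marks an unattested form, and the novel verb should split roughly evenly between the two codes:

# optional sanity check: recoded attested_unattested (1 = unattested) by verb type
with(production_preemption_restricted_novel.df,
     table(verb_noun_type_training2, attested_unattested))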

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.026         0.042         0.500
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_novel.df$attested_unattested , production_preemption_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.034 0.500
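The summarySE() helper loaded at the top of this script gives the same means along with N, sd, se and a 95% CI, which can be a useful supplement to tapply():

summarySE(production_preemption_restricted_novel.df,
          measurevar = "attested_unattested", groupvars = "restricted_verb_noun")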
production_preemption_restricted_novel.df$restricted_verb_noun <- factor(production_preemption_restricted_novel.df$restricted_verb_noun)
production_preemption_restricted_novel1.df = lizCenter(production_preemption_restricted_novel.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_novel_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error      Q2.5     Q97.5
## b_Intercept               -4.033801 0.2453618 -4.532995 -3.578344
## b_restricted_verb_noun.ct  5.515563 0.4071634  4.727045  6.331727
mcmc_plot(prod_unattested_novel_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_unattested_novel_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0
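These hand-computed one-sided probabilities can be cross-checked against brms' built-in hypothesis() method, which also reports an evidence ratio; shown as an optional alternative, not part of the original analysis:

# optional cross-check of C2: Post.Prob is the posterior probability of the inequality
hypothesis(prod_unattested_novel_final, "restricted_verb_noun.ct < 0")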
# We will now compare unattested forms for restricted vs. alternating verbs.

production_preemption_restricted_alt.df <- subset(preemption_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_preemption_restricted_alt.df<- subset(production_preemption_restricted_alt.df, verb_noun_type_training2 != "novel")

# both forms are witnessed for the alternating verb, so attested vs. unattested is undefined; we arbitrarily code all det1s as attested and all det2s as unattested

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- ifelse(production_preemption_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_preemption_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_preemption_restricted_alt.df$attested_unattested)

production_preemption_restricted_alt.df$attested_unattested <- recode(production_preemption_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)


round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.489         0.026         0.042            NA
production_preemption_restricted_alt.df$restricted_verb_noun <- factor(production_preemption_restricted_alt.df$restricted_verb_noun , levels = c("yes", "no"))

round(tapply(production_preemption_restricted_alt.df$attested_unattested , production_preemption_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.034 0.489
production_preemption_restricted_alt1.df = lizCenter(production_preemption_restricted_alt.df, list("restricted_verb_noun"))

# maximally vague priors for the predictors and the intercept
prod_unattested_alt_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_preemption_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_final)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_preemption_restricted_alt1.df (Number of observations: 5151) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 217) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              1.85      0.20     1.49     2.29
## sd(restricted_verb_noun.ct)                3.08      0.32     2.51     3.78
## cor(Intercept,restricted_verb_noun.ct)    -0.89      0.03    -0.94    -0.82
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     3650     6401
## sd(restricted_verb_noun.ct)            1.00     2999     5850
## cor(Intercept,restricted_verb_noun.ct) 1.00     3236     5741
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                  -3.86      0.23    -4.34    -3.43 1.00     5527
## restricted_verb_noun.ct     5.65      0.36     4.98     6.38 1.00     5591
##                         Tail_ESS
## Intercept                   6707
## restricted_verb_noun.ct     6727
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
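Since adapt_delta was raised to 0.99, it is worth confirming that no divergent transitions remain; nuts_params() (re-exported by brms from bayesplot) makes this a quick optional check:

# optional check: count post-warmup divergent transitions (should be 0)
np <- nuts_params(prod_unattested_alt_final)
sum(np$Value[np$Parameter == "divergent__"])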
mcmc_plot(prod_unattested_alt_final, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(prod_unattested_alt_final))
C1=mean(samps[,"b_Intercept"] > 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1         0
## 2         0

Production data: Effect of statistical entrenchment

# a. Are participants producing more attested than unattested dets?
# Here we want to see how often participants produce the unattested det, e.g. the transitive-only det1 with an intransitive-only (det2) verb in the intransitive condition at test,
# and vice versa.

production_entrenchment_attested_unattested.df  <- subset(entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_attested_unattested.df  <- subset(production_entrenchment_attested_unattested.df, restricted_verb_noun =="yes")

#We want to compare attested vs. unattested trials for transitive verbs in the intransitive inchoative construction at test
production_entrenchment_attested_unattested1.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

#And intransitive inchoative verbs in the transitive construction at test. Filter out irrelevant trials
production_entrenchment_attested_unattested2.df  <- subset(production_entrenchment_attested_unattested.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_attested_unattested.df <- rbind(production_entrenchment_attested_unattested1.df, production_entrenchment_attested_unattested2.df)
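As a stylistic aside, the two subsets plus rbind() could be collapsed into a single subset() with a disjunction, applied to the data frame as it stood before the split; a sketch producing the same rows (order aside):

# one-step equivalent of the two subsets + rbind above
production_entrenchment_attested_unattested.df <- subset(
  production_entrenchment_attested_unattested.df,
  (verb_noun_type_training2 == "construction1" & scene_test2 == "construction2") |
    (verb_noun_type_training2 == "construction2" & scene_test2 == "construction1"))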

#How much of the time are participants producing attested items?
round(mean(production_entrenchment_attested_unattested.df$attested_unattested),3)
## [1] 0.138
# and separately for each verb type
round(tapply(production_entrenchment_attested_unattested.df$attested_unattested, production_entrenchment_attested_unattested.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.158         0.119            NA
production_entrenchment_attested_unattested.df$verb_noun_type_training2 <- factor(production_entrenchment_attested_unattested.df$verb_noun_type_training2)
df_prod_ent = lizCenter(production_entrenchment_attested_unattested.df, list("verb_noun_type_training2"))


# maximally vague priors for the predictors and the intercept
prod_attested_unattested_ent = brm(formula = attested_unattested ~verb_noun_type_training2.ct + (1 + verb_noun_type_training2.ct|participant_private_id), data=df_prod_ent, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)),cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_attested_unattested_ent, variable = c("b_Intercept","b_verb_noun_type_training2.ct"))
##                                 Estimate Est.Error      Q2.5      Q97.5
## b_Intercept                   -3.5808451 0.3253543 -4.254685 -2.9687142
## b_verb_noun_type_training2.ct -0.4322548 0.3984259 -1.219177  0.3664864
mcmc_plot(prod_attested_unattested_ent, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_verb_noun_type_training2.ct"] > 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1     1.000
## 2     0.137
# same analysis without verb_noun_type_training2


# maximally vague priors for the intercept
prod_attested_unattested_ent_final = brm(formula = attested_unattested ~ 1 + (1|participant_private_id), data = df_prod_ent, family = bernoulli(link = logit), prior = set_prior("normal(0, 1)", class = "Intercept"), cores = 4, warmup = 2000, iter = 5000, chains = 4, control = list(adapt_delta = 0.99))

summary(prod_attested_unattested_ent_final)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ 1 + (1 | participant_private_id) 
##    Data: df_prod_ent (Number of observations: 1452) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 183) 
##               Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept)     2.91      0.33     2.32     3.61 1.00     3878     6578
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept    -3.41      0.31    -4.05    -2.84 1.00     4744     7134
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(prod_attested_unattested_ent_final, variable = c("b_Intercept"))
##              Estimate Est.Error      Q2.5     Q97.5
## b_Intercept -3.405354 0.3070352 -4.049147 -2.837391
mcmc_plot(prod_attested_unattested_ent_final, variable = "^b_", regex = TRUE)

samps = as.matrix(as.mcmc(prod_attested_unattested_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C1
## [1] 1
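On the probability scale, this intercept corresponds to roughly a 3% chance of an attested response for a typical participant; plogis() back-transforms the log-odds draws. Note that this subject-level figure differs from the raw mean of .138 above because of the nonlinearity of the logit and between-participant variation:

# back-transform the intercept draws to the probability scale
round(quantile(plogis(samps[, "b_Intercept"]), probs = c(.025, .5, .975)), 3)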
# c. We will now compare unattested forms for restricted vs. novel verbs.
# Do participants produce the unwitnessed form less for the 2 non-alternating verbs than for the novel verb? (the “unwitnessed” form has to be set arbitrarily for the novel verb)


production_entrenchment_restricted_novel.df <- subset(entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_novel.df<- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 != "alternating")

# all forms are unwitnessed for the novel verb, so we arbitrarily code all det1s as attested and all det2s as unattested

production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_novel.df$attested_unattested)
production_entrenchment_restricted_novel.df$attested_unattested <- ifelse(production_entrenchment_restricted_novel.df$verb_noun_type_training2 == "novel" & production_entrenchment_restricted_novel.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_novel.df$attested_unattested)

# Select trials featuring the novel verb in the intransitive inchoative construction
production_entrenchment_restricted_novel1.df <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "novel"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_novel2.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_novel3.df  <- subset(production_entrenchment_restricted_novel.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_novel.df <- rbind(production_entrenchment_restricted_novel1.df, production_entrenchment_restricted_novel2.df, production_entrenchment_restricted_novel3.df)


round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##            NA         0.158         0.119         0.054
# reverse coding to focus on unattested rather than attested for novel vs. restricted
production_entrenchment_restricted_novel.df$attested_unattested<- recode(production_entrenchment_restricted_novel.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_novel.df$attested_unattested , production_entrenchment_restricted_novel.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.862 0.946
# That is, participants produce *unattested* forms less often for the restricted verbs than for the novel verb

production_entrenchment_restricted_novel.df$restricted_verb_noun <- factor(production_entrenchment_restricted_novel.df$restricted_verb_noun)
production_entrenchment_restricted_novel1.df = lizCenter(production_entrenchment_restricted_novel.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_novel_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_novel1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))

posterior_summary(prod_unattested_novel_ent_final, variable = c("b_Intercept","b_restricted_verb_noun.ct"))
##                            Estimate Est.Error       Q2.5    Q97.5
## b_Intercept               3.6444300 0.2741511  3.1398933 4.207701
## b_restricted_verb_noun.ct 0.5625371 0.4442174 -0.2897181 1.469969
mcmc_plot(prod_unattested_novel_ent_final, variable = "^b_", regex = TRUE)
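Beyond the coefficient plot, conditional_effects() (a built-in brms plotting method) displays the model-implied probability of an unattested response at each value of the centered predictor, which can make the restricted vs. novel contrast easier to read:

# optional: plot model-implied probabilities rather than log-odds coefficients
conditional_effects(prod_unattested_novel_ent_final)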

samps = as.matrix(as.mcmc(prod_unattested_novel_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##    c(C1, C2)
## 1 0.00000000
## 2 0.09591667
# d. We will now compare unattested forms for restricted vs. alternating verbs.
# Do participants produce the unwitnessed form less for the 2 non-alternating verbs than for the alternating verb? (the “unwitnessed” form has to be set arbitrarily for the alternating verb)


production_entrenchment_restricted_alt.df <- subset(entrenchment_production.df, det_lenient_adapted == "det_construction1" | det_lenient_adapted == "det_construction2")
production_entrenchment_restricted_alt.df<- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 != "novel")

# both forms are witnessed for the alternating verb, so attested vs. unattested is undefined; we arbitrarily code all det1s as attested and all det2s as unattested

production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction1", 1, production_entrenchment_restricted_alt.df$attested_unattested)
production_entrenchment_restricted_alt.df$attested_unattested <- ifelse(production_entrenchment_restricted_alt.df$verb_noun_type_training2 == "alternating" & production_entrenchment_restricted_alt.df$det_lenient_adapted == "det_construction2", 0, production_entrenchment_restricted_alt.df$attested_unattested)

# Select trials featuring the alternating verb in the intransitive inchoative construction
production_entrenchment_restricted_alt1.df <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "alternating"  & scene_test2 == "construction2")


# Select trials featuring transitive verbs in the intransitive inchoative construction at test
production_entrenchment_restricted_alt2.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction1" & scene_test2 == "construction2")

# Select trials featuring intransitive verbs in the transitive construction at test
production_entrenchment_restricted_alt3.df  <- subset(production_entrenchment_restricted_alt.df, verb_noun_type_training2 == "construction2" & scene_test2 == "construction1")


production_entrenchment_restricted_alt.df <- rbind(production_entrenchment_restricted_alt1.df, production_entrenchment_restricted_alt2.df, production_entrenchment_restricted_alt3.df)


round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$verb_noun_type_training2, mean),3)
##   alternating construction1 construction2         novel 
##         0.047         0.158         0.119            NA
# reverse coding to focus on unattested rather than attested for alternating vs. restricted
production_entrenchment_restricted_alt.df$attested_unattested<- recode(production_entrenchment_restricted_alt.df$attested_unattested, `1` = 0L, `0` = 1L)
production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun, levels = c("yes", "no"))

round(tapply(production_entrenchment_restricted_alt.df$attested_unattested , production_entrenchment_restricted_alt.df$restricted_verb_noun, mean),3)
##   yes    no 
## 0.862 0.953
# That is, participants produce *unattested* forms less often for the restricted verbs than for the alternating verb

production_entrenchment_restricted_alt.df$restricted_verb_noun <- factor(production_entrenchment_restricted_alt.df$restricted_verb_noun)
production_entrenchment_restricted_alt1.df= lizCenter(production_entrenchment_restricted_alt.df, list("restricted_verb_noun"))


# maximally vague priors for the predictors and the intercept
prod_unattested_alt_ent_final = brm(formula = attested_unattested ~restricted_verb_noun.ct + (1 + restricted_verb_noun.ct|participant_private_id), data=production_entrenchment_restricted_alt1.df, family = bernoulli(link = logit), prior = c(prior(normal(0, 1), class = Intercept), prior(normal(0, 1), class = b)), cores=4, warmup = 2000, iter=5000, chains=4, control=list(adapt_delta = 0.99))
summary(prod_unattested_alt_ent_final)
##  Family: bernoulli 
##   Links: mu = logit 
## Formula: attested_unattested ~ restricted_verb_noun.ct + (1 + restricted_verb_noun.ct | participant_private_id) 
##    Data: production_entrenchment_restricted_alt1.df (Number of observations: 2176) 
##   Draws: 4 chains, each with iter = 5000; warmup = 2000; thin = 1;
##          total post-warmup draws = 12000
## 
## Group-Level Effects: 
## ~participant_private_id (Number of levels: 183) 
##                                        Estimate Est.Error l-95% CI u-95% CI
## sd(Intercept)                              2.24      0.24     1.81     2.76
## sd(restricted_verb_noun.ct)                2.34      0.39     1.62     3.18
## cor(Intercept,restricted_verb_noun.ct)    -0.78      0.12    -0.97    -0.49
##                                        Rhat Bulk_ESS Tail_ESS
## sd(Intercept)                          1.00     3600     6476
## sd(restricted_verb_noun.ct)            1.00     3993     6545
## cor(Intercept,restricted_verb_noun.ct) 1.00     2742     3599
## 
## Population-Level Effects: 
##                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept                   3.56      0.26     3.09     4.09 1.00     4231
## restricted_verb_noun.ct     0.27      0.42    -0.53     1.12 1.00     4817
##                         Tail_ESS
## Intercept                   6689
## restricted_verb_noun.ct     7131
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
mcmc_plot(prod_unattested_alt_ent_final, variable = "^b_", regex = TRUE)

dev.off()
## null device 
##           1
samps = as.matrix(as.mcmc(prod_unattested_alt_ent_final))
C1=mean(samps[,"b_Intercept"] < 0)
C2=mean(samps[,"b_restricted_verb_noun.ct"] < 0)

pMCMC=as.data.frame(c(C1,C2))
pMCMC
##   c(C1, C2)
## 1 0.0000000
## 2 0.2573333
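For reference, the hypothetical post_prob_dir() helper sketched earlier in this section would reproduce these two values directly:

# post_prob_dir(prod_unattested_alt_ent_final, "b_Intercept", "<")
# post_prob_dir(prod_unattested_alt_ent_final, "b_restricted_verb_noun.ct", "<")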