Introduction

I chose this paper because it is an experimental application of Bayes’s theorem, a ubiquitous framework in the social sciences. Replicating it will require me to internalize the theorem’s basic intuition as well as its more complex manifestations. The experimental paradigm involves a qualitative and computational exploration of how beliefs are updated with evidence. This is a great opportunity to think about how Bayesian principles fit in with people’s real-world decision-making, an invaluable tool in a social scientist’s arsenal.

This paper explored participants’ judgments about the likelihood that a hypothetical person holds a particular occupation, based on idiosyncratic statistical heuristics. The replication target will be study 5, which consists of three parts: the first part will elicit participants’ prior, posterior, and likelihood estimates about whether a man or a woman who communicated with air traffic control (ATC) is the pilot; the second part will compute a model posterior for each participant and compare it with their reported posterior; and the third part will have participants evaluate the moral character of a third party who makes a Bayesian judgment about the same scenario.

If the results of the original study hold, this study expects to find that participants make Bayesian judgments about the ATC scenario, with both reported and model posteriors favoring the man over the woman as the pilot. A second key finding will be that participants judge a third party who makes the same Bayesian judgment as themselves to be unfair, unjust, inaccurate, and unintelligent. The study is expected to be conducted on Amazon’s crowdsourcing marketplace, Mechanical Turk. Expected challenges include low-quality data due to bots or inattentive participants, and the possibility of participants looking up answers to the filler questions about unrelated statistical phenomena.

Link to the repository
Link to the Qualtrics survey
Link to the preregistration

Methods

Power Analysis

Using the G*Power software, the original effect size was calculated to be Cohen’s f = 0.48, and the planned sample size needed to achieve 80% power was calculated to be 36.
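A sample size of this order can be sanity-checked in base R with the noncentral F distribution. This is a minimal sketch, not the G*Power computation itself: it assumes that 0.48 is Cohen’s f for a one-degree-of-freedom effect tested at alpha = 0.05, and the helper name `power_for_n`, the `n_groups = 4` setting, and the search range are illustrative choices. The exact G*Power settings are not recorded here, so the result may differ slightly from the planned 36.

```r
# Power of an F test with df1 numerator df, given total sample size n
# and effect size f (Cohen's f). Noncentrality parameter: lambda = f^2 * n.
power_for_n <- function(n, f, df1 = 1, n_groups = 4, alpha = 0.05) {
  df2 <- n - n_groups                      # assumed residual degrees of freedom
  f_crit <- qf(1 - alpha, df1, df2)        # critical value under the null
  1 - pf(f_crit, df1, df2, ncp = f^2 * n)  # P(reject | effect of size f)
}

# Smallest total sample size that reaches 80% power (hypothetical search range)
n_candidates <- 10:200
powers <- sapply(n_candidates, power_for_n, f = 0.48)
n_required <- n_candidates[which(powers >= 0.80)[1]]
n_required
```

Changing `n_groups` or `df1` shows how sensitive the required sample size is to the assumed design.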

Planned Sample

Planned sample size = 36

Materials and Procedure

The study will be conducted in October 2020. Participants will be recruited from Amazon Mechanical Turk (MTurk) and compensated $0.71 each. The study will proceed in four parts, the first three of which each correspond to a component of Bayes’s rule.

“In the first part, each participant was randomly assigned to learn that either a man or a woman had communicated with air traffic control during a flight. Participants provided their priors, posteriors, and likelihoods for this scenario”

"Part 1: priors. Participants will be instructed to imagine a man and a woman who work at the same airline. One person is a pilot and the other person is not a pilot, but who is the pilot and who is not is unknown. Participants will estimate the percentage chance that each person is the pilot. Because there are two hypotheses—either the man or the woman is the pilot (and the other is not)—the two estimates must sum to 100%. Thus, each participant will provide his or her subjective prior about each person’s profession (e.g., the man has a 75% chance of being the pilot; the woman has a 25% chance of being the pilot).

Part 2: posteriors. After providing priors, each participant will be randomly assigned to learn one of the following two pieces of data: (a) the man communicated with air traffic control or (b) the woman communicated with air traffic control. After learning this datum, participants will again estimate the percentage chance that each person is the pilot. Thus, each participant will provide his or her subjective posterior.

Part 3: likelihoods. Each participant will estimate two likelihoods: the likelihood of observing the datum given the hypothesis that the target they learned about is the pilot and the likelihood of observing the datum given the hypothesis that the target they learned about is not the pilot. For example, if a participant learned that the woman had communicated with air traffic control, that participant will estimate the percentage of female pilots who communicate with air traffic control and the percentage of female non-pilots who communicate with air traffic control. If a participant learned that the man had communicated with air traffic control, that participant will estimate the percentage of male pilots who communicate with air traffic control and the percentage of male non-pilots who communicate with air traffic control. Thus, each participant will provide his or her subjective likelihood estimates, which will be combined by forming a ratio. Each participant will be randomly assigned to estimate the corresponding likelihoods either before or after providing subjective priors and posteriors. Each participant’s priors and likelihoods will be entered into Bayes’s rule to compute a model posterior, which represents what the participant’s posterior should be from a statistical perspective. This model posterior will be compared with the posterior that the participant actually reported.

Next, participants will learn about person X, who stated the following after learning the same information as participants: “Even though the man and the woman both communicated with air traffic control (ATC), the man is more likely to be a doctor than the woman.” Participants will then complete four Likert-type scales that assess how (a) fair, (b) just, (c) accurate, and (d) intelligent person X’s statement was. Each scale will range from 1 (e.g., extremely unfair) to 7 (e.g., extremely fair)."

Analysis Plan

Participants who provide priors of either 0% or 100% will be excluded, since these priors cannot be updated in accordance with Bayes’s rule.

“Each participant’s priors and likelihoods will be entered into Bayes’s rule to compute a model posterior, which represents what the participant’s posterior should be from a statistical perspective. This model posterior will be compared with the posterior that the participant actually reported.”
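The model posterior described above can be sketched in a few lines of R. This is an illustrative sketch rather than the original authors’ code: the function name `model_posterior` and the example numbers are hypothetical, and inputs are assumed to be proportions between 0 and 1 (survey responses on a 0-100 scale would first be divided by 100).

```r
# Bayes's rule for two complementary hypotheses:
#   P(pilot | datum) = P(datum | pilot) P(pilot) /
#                      [P(datum | pilot) P(pilot) + P(datum | not pilot) P(not pilot)]
model_posterior <- function(prior, lik_pilot, lik_not_pilot) {
  numerator <- lik_pilot * prior
  numerator / (numerator + lik_not_pilot * (1 - prior))
}

# Hypothetical participant: 25% prior that the target is the pilot, believes 90%
# of pilots and 10% of non-pilots communicate with ATC
model_posterior(prior = 0.25, lik_pilot = 0.90, lik_not_pilot = 0.10)  # 0.75

# Priors of exactly 0 or 1 do not move, whatever the likelihoods are
model_posterior(prior = 0, lik_pilot = 0.90, lik_not_pilot = 0.10)     # 0
model_posterior(prior = 1, lik_pilot = 0.90, lik_not_pilot = 0.10)     # 1
```

The last two calls illustrate why participants with priors of 0% or 100% are excluded: no evidence can shift such a prior.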

Key descriptive statistics, such as means and standard errors of judgments among participants in each condition, will be computed, and selected statistics will be plotted.

The key statistical test I will replicate is the two-way ANOVA, fit as a linear model, examining the effects of person X evaluations (1 = negative to 7 = positive) and target gender (male or female) on posterior judgments. These findings are represented visually in Figure 4 of the original paper.

The analysis

Differences from Original Study

The study is expected to be replicated with a much smaller sample than the original 353 participants due to budget constraints. A power analysis will be conducted to determine the sample size at which the effect is expected to be detected. The procedure and analysis will be the same as those used in study 5. A key difference is that the replication study will not ask participants to complete filler tasks consisting of unrelated statistical judgments in the second part of the study, as the authors did. The filler tasks are omitted because their results were determined to be inconsequential to the conclusions.

Methods Addendum (Post Data Collection)

You can comment this section out prior to final report with data collection.

Actual Sample

Sample size, demographics, data exclusions based on rules spelled out in analysis plan

Differences from pre-data collection methods plan

Any differences from what was described as the original plan, or “none”.

Results

Data Preparation

Data preparation following the analysis plan.

Data Wrangling

Rename and Tidy columns

data <- caodata %>% 
   rename(#man condition
          cond1.prior_manPilot.womanFlightAttendant = X1.prior_1,  #prior prob that man is pilot
          cond1.prior_womanPilot.manFlightAttendant = X1.prior_2, #prior prob that woman is pilot
          cond1.post_manPilot.womanFlightAttendant = X1.post_1, #posterior prob that man is pilot
          cond1.post_womanPilot.manFlightAttendant = X1.post_2, #posterior prob woman is pilot
          cond1.lk_percent.male.pilots.comm.ATC = X1.lk_1_1,  
          cond1.lk_percent.male.flightAttendants.comm.ATC = X1.lk_2_1,
          
          ##woman condition
          cond2.prior_manPilot.womanFlightAttendant = X2.prior_1, #prior prob that man is pilot
          cond2.prior_womanPilot.manFlightAttendant = X2.prior_2, #prior prob that woman is pilot
          cond2.post_manPilot.womanFlightAttendant = X2.post_1, #posterior prob that man is pilot
          cond2.post_womanPilot.manFlightAttendant = X2.post_2, #posterior prob woman is pilot
          cond2.lk_percent.female.pilots.comm.ATC = X2.lk_1_1,
          cond2.lk_percent.female.flightAttendants.comm.ATC = X2.lk_2_1
          ) 

# Filter out columns with answers to Trivial Questions
data <- select(data, !starts_with("Q"))

mean(data$age)

## Warning in mean.default(data$age): argument is not numeric or logical: returning
## NA

## [1] NA

data <- data %>%
  mutate_at(vars(age), as.numeric)

## Warning: Problem with `mutate()` input `age`.
## ℹ NAs introduced by coercion
## ℹ Input `age` is `.Primitive("as.double")(age)`.

## Warning in mask$eval_all_mutate(dots[[i]]): NAs introduced by coercion

data <- data %>%
  mutate_at(vars(gender), as.numeric)

## Warning: Problem with `mutate()` input `gender`.
## ℹ NAs introduced by coercion
## ℹ Input `gender` is `.Primitive("as.double")(gender)`.

## Warning: NAs introduced by coercion

count(data$gender)

##    x freq
## 1  1   23
## 2  2   12
## 3 NA    2

Tidy rows and remove blanks

# Remove the two extra header rows (irrelevant column titles) at the top of the export
rowlength <- length(data$ResponseId) # number of rows in the data frame
data <- data[3:rowlength,] # excludes rows 1 and 2

#Removing Survey Preview rows aka Rows with Status 1 and unfinished rows
data <- filter(data, Status==0)
data <- filter(data, Finished==1) #Removing unfinished rows

## Remove empty/ blank responses in both conditions
data_noblank <- data %>%
  filter(!is.na(cond1.prior_manPilot.womanFlightAttendant) | !is.na(cond2.prior_manPilot.womanFlightAttendant))

# How many exclusions?
Blank_excl_total <- length(data$ResponseId) - length(data_noblank$ResponseId) 
Blank_excl_total # X exclusions

## [1] 0

data <- data_noblank

Arranging columns

#first sort by condition
data_arr <- arrange(data, cond1.prior_manPilot.womanFlightAttendant)

#create condition column
datac <- data_arr %>%
  mutate(condition = ifelse(is.na(cond1.prior_manPilot.womanFlightAttendant), "2", "1")
         )

Convert judgments from character to numeric

## ========================================== ##
## Convert judgments from character to numeric
datac <- datac %>% 
      mutate_at(vars(contains('cond')), as.numeric) 

## ========================================== ##
## Convert attributions from character to numeric

datac <- datac %>% #change free responses to different naming convention
  rename(X1.free = c1.free,
         X2.free = c2.free)

datac <- datac %>% #convert all columns with c1 prefix
      mutate_at(vars(contains('c1')), as.numeric) 

datac <- datac %>% #convert all columns with c2 prefix
      mutate_at(vars(contains('c2')), as.numeric) 

## ========================================== ##

Sort Prior responses - creating Prior and Target columns

# new Prior.Man.Pilot column 
datac <- datac %>%
  mutate(Prior.Man.Pilot = if_else(!is.na(cond1.prior_manPilot.womanFlightAttendant), cond1.prior_manPilot.womanFlightAttendant, cond2.prior_manPilot.womanFlightAttendant)) 

# new Prior.WoMan.Pilot column 
datac <- datac %>%
  mutate(Prior.Woman.Pilot = if_else(!is.na(cond1.prior_womanPilot.manFlightAttendant), cond1.prior_womanPilot.manFlightAttendant, cond2.prior_womanPilot.manFlightAttendant)) 

datac$Prior.Man.Pilot + datac$Prior.Woman.Pilot  # #confirm priors sum to 100%

##  [1] 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100
## [20] 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

hist(datac$Prior.Man.Pilot) ## Peek at distribution of priors overall

## ================================================================== ## Prior Target 
datac <- datac %>% 
  mutate(target = if_else(condition == 1, "man", "woman")) #Create target column

datac <- datac %>%
  mutate_at(vars(target), factor) # convert target to factor

datac <- datac %>%
  mutate(Prior.Target.Pilot = ifelse(target == "man", Prior.Man.Pilot, Prior.Woman.Pilot)) ## create prior target column

## ================================================================== ## Able to update

summarise(datac, Prior.Target.Pilot=n()) #count number of observations

## # A tibble: 1 x 1
##   Prior.Target.Pilot
##                <int>
## 1                 35

datac <- datac %>%
  mutate(able.to.update = ifelse(Prior.Target.Pilot == 100 | Prior.Target.Pilot == 0, "no", "yes")) # priors are on a 0-100 scale

count(datac$able.to.update)   ## number of participants who can't update their priors

##     x freq
## 1  no    1
## 2 yes   34

Sort Posterior responses - Creating Posterior and Posterior Target columns

# new posterior man pilot column
datac <- datac %>%
  mutate(Posterior.Man.Pilot = ifelse(!is.na(cond1.post_manPilot.womanFlightAttendant), cond1.post_manPilot.womanFlightAttendant, cond2.post_manPilot.womanFlightAttendant)) 

# new posterior woman pilot column
datac <- datac %>%
  mutate(Posterior.Woman.Pilot = ifelse(!is.na(cond1.post_womanPilot.manFlightAttendant), cond1.post_womanPilot.manFlightAttendant, cond2.post_womanPilot.manFlightAttendant)) 

datac$Posterior.Man.Pilot + datac$Posterior.Woman.Pilot  #confirm posteriors sum 100 > works #sum is 100!!

##  [1] 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100
## [20] 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

hist(datac$Posterior.Man.Pilot) ## Peek at distribution of posteriors overall

## =========================================================== ## Posterior Target 
 datac <- datac %>%
   mutate(Posterior.Target.Pilot= ifelse(target == "man", Posterior.Man.Pilot, Posterior.Woman.Pilot)) #create post target col

  # exclude columns with demographic info -> for analysis
data_clean <- select(datac, ResponseId,
                     starts_with("cond"), 
                     starts_with("c1"),
                     starts_with("c2"),
                     Prior.Man.Pilot,
                     Prior.Woman.Pilot,
                     target,
                     Prior.Target.Pilot,
                     able.to.update,
                     Posterior.Man.Pilot,
                     Posterior.Woman.Pilot,
                     Posterior.Target.Pilot
                     )

Calculating percent prior and posteriors

# -> Compute odds ratio then convert to percent

# create prior odds ratio, divide differently depending on target
data_clean <- data_clean %>% 
  mutate(prior.odds.ratio = ifelse(target == "man", 
                                   Prior.Man.Pilot/ Prior.Woman.Pilot,
                                   Prior.Woman.Pilot/ Prior.Man.Pilot))

# repeat for posterior odds ratio
data_clean <- data_clean %>%
  mutate(posterior.odds.ratio = ifelse(target == "man",
                                       Posterior.Man.Pilot/ Posterior.Woman.Pilot,
                                       Posterior.Woman.Pilot/ Posterior.Man.Pilot))

# --- # 

## prior.odds.ratio --> percent format 
data_clean <- data_clean %>%
  mutate(prior_percent = prior.odds.ratio/ (1 + prior.odds.ratio) ) #convert odds ratio to percent format

# check mean prior percent for each target
data_clean %>%
  group_by(target) %>%
  summarize(mean = mean(prior_percent)) #looks like higher priors for man than woman

## `summarise()` ungrouping output (override with `.groups` argument)

## # A tibble: 2 x 2
##   target  mean
##   <fct>  <dbl>
## 1 man    0.694
## 2 woman  0.248

# --- #

## posterior.odds.ratio --> percent format
data_clean <- data_clean %>%
  mutate(posterior_percent = ifelse(is.infinite(posterior.odds.ratio), 
                                    1,
                                    posterior.odds.ratio/ (1 + posterior.odds.ratio))) #convert to percent, infinities = 1

# check mean posterior for each target
data_clean %>%
  group_by(target) %>%
  summarize(mean = mean(posterior_percent)) #still higher for man than woman. But surprised how woman didn't move much higher

## `summarise()` ungrouping output (override with `.groups` argument)

## # A tibble: 2 x 2
##   target  mean
##   <fct>  <dbl>
## 1 man    0.786
## 2 woman  0.472

## ========================================================================= ##


## Create long form version for plotting purposes
data_clean_long <- data_clean %>%
  pivot_longer(c("prior_percent", "posterior_percent"),
               names_to = "variable",
               values_to = "value")

35*2 ## Each subject gives two responses: prior and reported posterior

## [1] 70

#convert to factor
data_clean_long <- data_clean_long %>%
  mutate_at(vars(target), factor) %>%
  mutate_at(vars(variable), factor)

Barplot showing belief updating - average judgments in each condition

###______________________###
library(scales)

bar <- data_clean %>%
  select(prior_percent, posterior_percent, target)

bar_2 <- bar %>%
  pivot_longer(-c("target"),
               names_to = "type",
               values_to = "value")

bar_2 <- bar_2 %>%
  mutate_at(vars(type), factor) #

stderror <- function(x) sd(x)/sqrt(length(x))

bar_2 <- bar_2 %>%
  mutate(SE = stderror(value)) # SE = 0.0382

dodge <- position_dodge(width = 0.5)

bar_2$type <- factor(bar_2$type, levels = c("prior_percent", "posterior_percent"))

bar_2_plot <- bar_2 %>%
  group_by(target, type) %>%
  summarize(mean = mean(value),
            SE = stderror(value))

## `summarise()` regrouping output by 'target' (override with `.groups` argument)

plot_2 <- ggplot(bar_2_plot,
       aes(x = target, y = mean, fill = type)) +
  geom_bar(width = 0.7, position = position_dodge(width = 0.75), stat = "identity") +
  labs (y = expression(paste(italic(P), "(Target = Pilot)"))) +
  scale_y_continuous(breaks = seq(0, 1, 0.10),
                    label = percent_format()) +
   geom_errorbar(aes(ymax = mean + SE, ymin = mean - SE), position = position_dodge(0.75), width = 0.2) +
  scale_fill_discrete(labels = c("prior",
                                  "posterior")) +
  theme(legend.title = element_blank()) + 
  theme(legend.position = c(0.25, 0.2)) + 
  theme(legend.justification = c(0.25, 0.2)) + 
  theme(legend.direction = "vertical")  
plot_2

# let's visualize - focus on the distribution of posteriors in the woman condition
bxp <- ggplot(bar_2,
       aes(x = target, y = value,  fill = type)) +
  geom_boxplot() +
  theme(legend.title = element_blank()) + 
  theme(legend.position = c(0.25, 0.2)) + 
  theme(legend.justification = c(0.25, 0.2)) + 
  theme(legend.direction = "vertical") 
  
bxp

Person X attributions analyses

# Combine/collapse responses into single columns for both conditions
data_clean <- data_clean %>%
  mutate(query = ifelse(condition == "1", c1.self, c2.self),
         intelligent = ifelse(condition == "1", c1.intel, c2.intel),
         accurate = ifelse(condition == "1", c1.acc, c2.acc),
         fair = ifelse(condition == "1", c1.fair, c2.fair),
         just = ifelse(condition == "1", c1.just, c2.just))

count(data_clean$query)

##   x freq
## 1 1   30
## 2 2    4
## 3 3    1

# 30/35 # 86% agree that both the man and woman equally likely to be a doctor
# 4/35 # 11% agree that the man is more likely to be a doctor
# 1/35 # 3% agree that the woman is more likely to be a doctor

## Create dataset with just composite items
data.composite <- data_clean %>%
  select(intelligent, accurate, fair, just) %>%
  pivot_longer(c("intelligent", "accurate", "fair", "just"),
               names_to = "variable", 
               values_to = "value")

## Compute composite average
data_clean <- data_clean %>%
  mutate(composite.avg = (intelligent + accurate + fair + just) / 4)

stderror(data_clean$composite.avg)

## [1] 0.2814972

t.test(data_clean$composite.avg, mu = 4)

## 
##  One Sample t-test
## 
## data:  data_clean$composite.avg
## t = -3.0957, df = 34, p-value = 0.003917
## alternative hypothesis: true mean is not equal to 4
## 95 percent confidence interval:
##  2.556500 3.700643
## sample estimates:
## mean of x 
##  3.128571

Reproducing figure 4

# will be linear ANOVA of posterior percent vs person X attributions
figure4 <- data_clean %>%
  ggplot(mapping = aes(x = composite.avg, y = posterior_percent, color = target)) + 
  geom_smooth(method = "lm", se = TRUE) +
  theme_classic() +
  scale_x_continuous(breaks = seq(1, 7, 1)) + 
  scale_y_continuous(breaks = seq(0, 1, 0.10),
                    label = percent_format()) + 
  theme(aspect.ratio = 10/7) +
  scale_color_discrete(labels = c("Man communicated w/ATC",
                                  "Woman communicated w/ATC")) +
  annotate("text", c(1.5, 5.7), y = -0.15,
           label = c("Negative", "Positive")) +
  annotate("text", c(3.5), y = -0.2 , label = c("Evaluation of Person X")) +
  coord_cartesian(xlim = c(1, 6), ylim = c(0, 1), clip = "off") +
  labs ( y = expression(paste(italic(P), "(Target = Pilot)"))) +
   theme(plot.margin = unit(c(1, 1, 4, 1), "lines"),
         axis.title.x = element_blank(),
         text=element_text(size=12,  family="sans")
   )
           
figure4

## `geom_smooth()` using formula 'y ~ x'

Figure 4 with data points

# will be linear ANOVA of posterior percent vs person X attributions
figure4_2 <- data_clean %>%
  ggplot(mapping = aes(x = composite.avg, y = posterior_percent, color = target)) + 
  geom_point(width=0.2) +
  geom_smooth(method = "lm", se = TRUE) +
  theme_classic() +
  scale_x_continuous(breaks = seq(1, 7, 1)) + 
  scale_y_continuous(breaks = seq(0, 1, 0.10),
                    label = percent_format()) + 
  theme(aspect.ratio = 10/7) +
  scale_color_discrete(labels = c("Man communicated w/ATC",
                                  "Woman communicated w/ATC")) +
  annotate("text", c(1.5, 5.7), y = -0.15,
           label = c("Negative", "Positive")) +
  annotate("text", c(3.5), y = -0.2 , label = c("Evaluation of Person X")) +
  coord_cartesian(xlim = c(1, 6), ylim = c(0, 1), clip = "off") +
  labs ( y = expression(paste(italic(P), "(Target = Pilot)"))) +
   theme(plot.margin = unit(c(1, 1, 4, 1), "lines"),
         axis.title.x = element_blank(),
         text=element_text(size=12,  family="sans")
   )

## Warning: Ignoring unknown parameters: width

figure4_2

## `geom_smooth()` using formula 'y ~ x'

Normality Checks

# posterior judgments in the man condition seem to vary more. let's see the distribution of evaluations by condition
bxp_eval <- data_clean %>%
  select(composite.avg, target)

bxp_eval <- bxp_eval %>%
  pivot_longer(-c("target"),
               names_to = "variable",
               values_to = "value")

bxp_eval_plot <- ggplot(bxp_eval,
       aes(x = target, y = value, fill = target)) +
  geom_boxplot() +
  geom_jitter(width = 0.2)
bxp_eval_plot

##Target vs posterior percent vs composite.avg scatterplot
 data_clean %>%
  ggplot(mapping = aes(x = composite.avg, y = posterior_percent, color = target)) + 
   geom_point() +
   geom_smooth() +
   scale_x_continuous(breaks = seq(1, 7, 1)) + 
   scale_y_continuous(breaks = seq(0, 1, 0.10),
                    label = percent_format())

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

skewness(bar_2$value, na.rm = TRUE) #coeff is -0.03

## [1] -0.03595821

bar_2 %>%
  group_by(target, type) %>%
  identify_outliers(value) #single non-extreme outlier

## # A tibble: 1 x 6
##   target type          value     SE is.outlier is.extreme
##   <fct>  <fct>         <dbl>  <dbl> <lgl>      <lgl>     
## 1 woman  prior_percent  0.71 0.0383 TRUE       FALSE

bar_2 %>%
  group_by (target) %>%
  levene_test(value ~ type) #non-homogeneity of variance in woman condition (p = 0.03)

## # A tibble: 2 x 5
##   target   df1   df2 statistic      p
##   <fct>  <int> <int>     <dbl>  <dbl>
## 1 man        1    32     0.115 0.737 
## 2 woman      1    34     5.15  0.0297

Analyses

library(car)

## Loading required package: carData

## 
## Attaching package: 'car'

## The following object is masked from 'package:dplyr':
## 
##     recode

## The following object is masked from 'package:purrr':
## 
##     some

library(lsr)
library(MBESS)


## shift composite avgs
data_clean <- data_clean %>%
  mutate(composite.avg.shifted = composite.avg - 1)

## Two-way ANOVA: evaluation of person X, target, and their interaction
## Main effect of target on posterior judgments was statistically significant:
## F(1, 31) = 12.02, p = 0.002, eta squared = 0.26, 95% CI = [0.05, 0.47]
anova(lm(formula = posterior_percent ~ composite.avg.shifted * target, 
         data = data_clean))

## Analysis of Variance Table
## 
## Response: posterior_percent
##                              Df  Sum Sq Mean Sq F value   Pr(>F)   
## composite.avg.shifted         1 0.05859 0.05859  0.7203 0.402566   
## target                        1 0.97816 0.97816 12.0240 0.001563 **
## composite.avg.shifted:target  1 0.14182 0.14182  1.7433 0.196385   
## Residuals                    31 2.52189 0.08135                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#effect size
etaSquared(lm(formula = posterior_percent ~ composite.avg.shifted * target, 
         data = data_clean))

##                                  eta.sq eta.sq.part
## composite.avg.shifted        0.04602877  0.06326683
## target                       0.26433533  0.27947133
## composite.avg.shifted:target 0.03832525  0.05324204

## 95% CI around effect sizes
  ## Main effect of target
ci.pvaf(F.value = 12, 
        df.1 = 1,
        df.2 = 31,
        N = 35, 
        conf.level = 0.95)

## $Lower.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.04610724
## 
## $Probability.Less.Lower.Limit
## [1] 0.025
## 
## $Upper.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.4708296
## 
## $Probability.Greater.Upper.Limit
## [1] 0.025
## 
## $Actual.Coverage
## [1] 0.95

 ## Interaction
ci.pvaf(F.value = 1.7, 
        df.1 = 1,
        df.2 = 31,
        N = 35, 
        conf.level = 0.95)

## $Lower.Limit.Proportion.of.Variance.Accounted.for
## [1] 0
## 
## $Probability.Less.Lower.Limit
## [1] 0
## 
## $Upper.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.2351151
## 
## $Probability.Greater.Upper.Limit
## [1] 0.025
## 
## $Actual.Coverage
## [1] 0.975

  ##----------#

## 95% CI around effect sizes, using the F values and degrees of freedom from the original study
  ## Main effect of target
ci.pvaf(F.value = 84.579, 
        df.1 = 1,
        df.2 = 344,
        N = 348, 
        conf.level = 0.95)

## $Lower.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.1269588
## 
## $Probability.Less.Lower.Limit
## [1] 0.025
## 
## $Upper.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.2673035
## 
## $Probability.Greater.Upper.Limit
## [1] 0.025
## 
## $Actual.Coverage
## [1] 0.95

  ## Interaction
ci.pvaf(F.value = 10.714, 
        df.1 = 1,
        df.2 = 344,
        N = 348, 
        conf.level = 0.95)

## $Lower.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.004800909
## 
## $Probability.Less.Lower.Limit
## [1] 0.025
## 
## $Upper.Limit.Proportion.of.Variance.Accounted.for
## [1] 0.07328726
## 
## $Probability.Greater.Upper.Limit
## [1] 0.025
## 
## $Actual.Coverage
## [1] 0.95

## Original Cohen's f effect size = 0.48
## Current Cohen's f effect size = 0.59
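The report does not show how these Cohen’s f values were obtained. One standard route is the conversion f = sqrt(eta squared / (1 - eta squared)); the sketch below applies it to the rounded eta squared of 0.26 reported above. The helper names `eta_sq_to_f` and `f_to_eta_sq` are hypothetical, and treating this conversion as the one actually used is an assumption.

```r
# Standard conversions between eta squared and Cohen's f
eta_sq_to_f <- function(eta_sq) sqrt(eta_sq / (1 - eta_sq))
f_to_eta_sq <- function(f) f^2 / (1 + f^2)

# Current study: eta squared for the main effect of target, rounded to 0.26
round(eta_sq_to_f(0.26), 2)   # 0.59

# Eta squared implied by the original study's f = 0.48 under this conversion
round(f_to_eta_sq(0.48), 2)   # 0.19
```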

For the finding that participants make Bayesian judgments: a two-way ANOVA will be conducted to investigate whether participants’ model and reported posteriors favor the man or the woman as the pilot. This will also be used to compute the difference between the model and reported posteriors among those who learned that the man versus the woman communicated with ATC.

For the finding about moral judgments criticizing person X: the means of all four scales will be averaged in each condition to form a composite measure of participants’ evaluation of person X.

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

Open the discussion section with a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.

Commentary

Add open-ended commentary (if any) reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt. None of these need to be long.

Replication of People make the Same Bayesian Judgment They Criticize in Others by Cao, Kleiman-Weiner and Banaji (2019, Psychological Science)

Joseph Outa (joouta@stanford.edu)

November 22, 2020