Schaerer et al.’s goal in Experiment 1a was to provide preliminary evidence that having an alternative is not always optimal in a negotiation. While previous research argued that alternatives give negotiators power because they are not completely dependent on the current negotiation, the study aimed to demonstrate that weak alternatives actually lead to worse outcomes than having no alternative offer at all. The mechanism at play is an anchoring effect: the alternative anchors the negotiator’s first offer. Having a weak alternative causes a negotiator to start too low on the first offer, leading to a worse outcome relative to having a strong, attractive alternative and, more importantly, relative to having no alternative at all.
The target finding of the replication was that participants with a weak alternative would make lower first offers than participants with either a strong alternative or no alternative at all in a hypothetical negotiation. A secondary finding was that participants with no alternative would nonetheless feel the least powerful of the three conditions.
The original study reports Cohen’s ds of .76 or higher for the effect of condition on first offers and ds of .42 or higher for the measure of power. Given that the main DV is the first offer, we will make sure we have enough power to capture this effect.
Importantly, while there are three conditions, each effect size is a simple comparison between two conditions. Thus, I will make sure we have enough power to capture the smallest effect size observed for the first offer, which is the one comparing the weak BATNA vs. no BATNA first offers (a t-test using dummy-coded regression). Because the standard deviations differ between groups, I estimated how many participants we needed for 80% power using a package that allows for heterogeneous variance.
I entered the mean difference between the weak BATNA and no BATNA conditions and the standard deviations of both conditions for the first offer DV.
library(samplesize)  # provides n.ttest() (assumed; the warning below matches this package)

n.ttest(power = 0.8, alpha = 0.05, mean.diff = 3.04, sd1 = 5.33, sd2 = 1.74,
        design = "unpaired", variance = "unequal")
## Warning in n.ttest(power = 0.8, alpha = 0.05, mean.diff = 3.04, sd1 =
## 5.33, : Arguments -fraction- and -k- are not used, when variances are
## unequal
## $`Total sample size`
## [1] 46
##
## $`Sample size group 1`
## [1] 34
##
## $`sample size group 2`
## [1] 12
The standard deviations are not equal, so the suggested sample sizes for the two groups differ, but the results of the power analysis indicate that we will achieve 80% power for the weakest effect size with at least 34 people per condition. This seems relatively low, so we decided to use 50 per condition as a conservative test of this effect.
However, just to check, I also ran a power analysis to determine how many people we would need for the secondary DV of perceived power. We can use power.t.test in this case, since the variance is homogeneous.
power.t.test(delta=0.5, sd=1, power=.80, sig.level=0.05)
##
## Two-sample t test power calculation
##
## n = 63.76576
## delta = 0.5
## sd = 1
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
For 80% power, we would need 64 per condition (rounding up from 63.77). Thus, our study is underpowered for this DV. That is acceptable, since the main DV we are interested in is the first offer.
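To quantify the shortfall, here is a quick sketch of the power we would actually achieve for this DV with 50 per condition, under the same assumptions (d = 0.5):
power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.05)  # solves for power; roughly 0.70 with n = 50 per group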
The planned sample size is based on the power analysis: we will run 50 per condition. There were no preselections, as there were none in the original study.
We contacted the authors and got the survey materials with exact wording. See below.
Participants will be randomly assigned to a no BATNA (best alternative to a negotiated agreement), strong BATNA, or weak BATNA condition.
Participants were told:
“Imagine that you just bought an MP3 player and you want to sell some of your old CDs since you no longer need them. One of the CDs you want to sell is the album ‘Forty Licks’ by The Rolling Stones. Even though the CD is used, it is in reasonably good condition (case intact, no scratches on disc).” Then, they were assigned to one of three conditions.
“Someone (‘the buyer’) just approached you and raised an interest in purchasing your CD. The buyer asks you how much you want for the CD. You also know that nobody else has offered you any money for the CD. Thus, if you can’t reach an agreement in the current negotiation, you won’t get any money for the CD. You are negotiating the price for the CD. What is your first offer to the buyer?”
“Someone (‘the buyer’) just approached you and raised an interest in purchasing your CD. The buyer asks you how much you want for the CD. You also know that another buyer has offered you $8 for the CD. Thus, if you can’t reach an agreement in the current negotiation, you will get $8 for the CD. You are negotiating the price for the CD. What is your first offer to the buyer?”
“Immediately after the BATNA manipulation, participants were instructed to make the first offer, which was our key dependent measure. Then, the negotiation was terminated, and participants were asked to indicate the extent to which they felt powerful (1 = powerless, 7 = powerful), in control (1 = no control, 7 = in control), strong (1 = weak, 7 = strong), and confident (1 = unconfident, 7 = confident). Responses to these four items were averaged to create a single measure of perceived power. Finally, participants completed an attention check and provided demographic information.” (See the experiment for details; participants were asked to check specific boxes at the end of a long paragraph of irrelevant instructions.)
The procedure was the same as when the original authors ran the study on MTurk: participants first read the manipulation (a hypothetical negotiation situation) and then answered the dependent variable questions in the same order as in the original paper.
As the original authors did, we excluded participants who had a duplicate IP address, incorrectly answered the attention check, or made first offers with extreme values (> 3 SD from the mean). In the original paper, this amounted to 17/305, or ~6%.
For the first offers, we used dummy coding and regression to compare the mean first offer of the no BATNA condition to the other two conditions. We also used different dummy codes to compare the strong BATNA to the no BATNA condition. For perceived power (the secondary DV), we aggregated the four items to create a single measure, as the original paper did, and then used regression and dummy coding to compare the no alternative condition to the other two conditions, as well as the weak condition relative to the strong condition. In addition, to match the exact analyses of the authors, we also calculated effect sizes and reported confidence intervals for the means of the different conditions.
Fortunately, we did not anticipate many differences between this replication and the original paper. Both studies will be run on MTurk with similar demographics, and the materials will be identical to the original study. The only difference is that we will pay $0.30, as we determined the task only takes around three minutes.
Sample size, demographics, data exclusions based on rules spelled out in analysis plan.
Any differences from what was described as the original plan, or “none”.
Read in the data from .json files, and format into a dataframe.
path <- "~/Box Sync/mturk/"
files <- dir(paste0(path,"production-results/"),
pattern = "*.json")
d.raw <- data.frame()
for (f in files) {
jf <- paste0(path, "production-results/",f)
jd <- fromJSON(paste(readLines(jf), collapse=""))
id <- data.frame(ip = jd$answer$data$fingerprint[[1]]$answer$ip,
workerid = jd$WorkerId,
cond = as.factor(jd$answer$data$cond),
first_offer = as.numeric(jd$answer$data$first_offer),
power= as.numeric(jd$answer$data$power),
control = as.numeric(jd$answer$data$control),
strong = as.numeric(jd$answer$data$strong),
confident = as.numeric(jd$answer$data$confident),
sex =jd$answer$data$sex,
age = jd$answer$data$age,
nat = jd$answer$data$nat,
lang = jd$answer$data$langu,
eth = jd$answer$data$eth,
att_check= jd$answer$data$AC)
d.raw <- bind_rows(d.raw, id)
}
As reported, I removed all IPs that were duplicated, participants who didn’t pass the attention check, and any first offers that were 3 SD above the mean, just as the authors did.
d1 <- d.raw[d.raw$ip == "", ]  # dataset without IP addresses
d2 <- d.raw[d.raw$ip != "", ]  # dataset with IP addresses
d <- d2[!(duplicated(d2$ip) | duplicated(d2$ip, fromLast = TRUE)), ]  # first remove ALL duplicates among IP addresses that were not blank
d <- rbind(d1, d)
Here is the proportion of participants who remain in the study for analysis after removing duplicate IP addresses:
length(d$power)/ length(d.raw$power)
## [1] 1
Now I will aggregate the four power measures to get the perceived power DV.
d = d %>% filter (att_check =="pass" & first_offer < (3*sd(first_offer))) %>%
mutate (agg_power = (power + control + strong + confident)/4)
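As an additional sanity check (not part of the original analysis plan), the overall retention after all exclusions can be computed against the raw data:
nrow(d) / nrow(d.raw)  # overall retention after IP, attention-check, and outlier exclusions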
Let’s look at some histograms first!
library(ggplot2)  # for qplot and ggplot

qplot(d$power)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
qplot(d$first_offer)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Cronbach’s alpha in the original paper was .92.
# calculating Cronbach's alpha
library(psych)  # for ICC

power <- cbind(d$power, d$control, d$strong, d$confident)
ICC(power, missing = FALSE)[[1]]$ICC[6]  # alpha is a subclass of intraclass correlation coefficients; this one (ICC3k) captures Cronbach's alpha
## [1] 0.9430255
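As a cross-check on the ICC-based calculation, the psych package’s alpha() function should return the same value:
psych::alpha(power)$total$raw_alpha  # Cronbach's alpha computed directly; should match the ICC3k value above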
Here is the original graph:
Here are my plots:
d.mean = d %>%
group_by (cond) %>%
summarise (power= mean(agg_power),
first_offer1 = mean(first_offer),
sem_power = sem(agg_power),
sem_fo = sem(first_offer))
d.mean$cond <- factor(d.mean$cond, levels = c("none","weak","strong"))
ggplot(d.mean, aes(x = cond, y = first_offer1)) +
  geom_bar(position = position_dodge(), stat = "identity") +
  geom_errorbar(aes(ymin = first_offer1 - sem_fo, ymax = first_offer1 + sem_fo),
                width = .2,
                position = position_dodge(.9))
ggplot(d.mean, aes(x = cond, y = power, group = 1)) +
  geom_line(position = position_dodge(), stat = "identity") +
  geom_point() +
  geom_errorbar(aes(ymin = power - sem_power, ymax = power + sem_power),
                width = .2,
                position = position_dodge(.9))
The original paper compared the first offers of the weak BATNA condition to both the strong BATNA and no BATNA conditions and found that the weak BATNA first offer (M = $4.57, SD = 1.74, 95% CI = [4.33, 4.92]) was significantly lower than both the strong BATNA (M = $11.02, SD = 1.90, 95% CI = [10.63, 11.42]) and the no BATNA first offers. In addition, they found that the no BATNA first offer (M = $7.61, SD = 5.33, 95% CI = [6.54, 8.68]) was lower than the strong BATNA first offer, using t-tests in dummy-coded regression (all t’s higher than 6.18).
In addition, the original paper found that the weak BATNA participants (M = 5.25, SD = 1.09, 95% CI = [5.03, 5.47]) reported feeling less powerful than the strong BATNA participants (M = 5.75, SD = 0.96, 95% CI = [5.55, 5.95]) and more powerful than the no BATNA participants (M = 4.80, SD = 1.03, 95% CI = [4.59, 5.01]), and that (obviously) the strong BATNA condition felt more powerful than the no BATNA condition, again using t-tests in dummy-coded regression.
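As a rough consistency check on the power-analysis inputs above, the smallest original effect size (weak vs. no BATNA first offers) can be reconstructed from these reported statistics, assuming roughly equal group sizes so the pooled SD is the root mean square of the two SDs:
# reported stats: no BATNA M = 7.61, SD = 5.33; weak BATNA M = 4.57, SD = 1.74
pooled_sd <- sqrt((5.33^2 + 1.74^2) / 2)
(7.61 - 4.57) / pooled_sd  # ~0.77, consistent with the reported d of .76 or higher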
I replicate these analyses below using dummy coding and regression.
d$cond = as.factor(d$cond)
contrasts(d$cond)
## strong weak
## none 0 0
## strong 1 0
## weak 0 1
contrasts(d$cond) = cbind(c(0,1,0),c(1,0,0)) # compare each against the weak BATNA condition
knitr::kable(summary(lm(power ~ cond, data = d))$coef) # note: the DV here is the single 'power' item rather than the agg_power aggregate
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 5.5 | 0.500000 | 11.0000000 | 0.0016089 |
| cond1 | 0.5 | 1.118034 | 0.4472136 | 0.6850376 |
| cond2 | 0.5 | 1.118034 | 0.4472136 | 0.6850376 |
contrasts(d$cond) = cbind(c(0,1,0),c(0,0,1)) # no BATNA is the reference; compare strong vs. none and weak vs. none
knitr::kable(summary(lm(power ~ cond, data = d))$coef) # first beta compares strong vs. no BATNA
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 6.0 | 1.000000 | 6.0000000 | 0.0092727 |
| cond1 | 0.0 | 1.414214 | 0.0000000 | 1.0000000 |
| cond2 | -0.5 | 1.118034 | -0.4472136 | 0.6850376 |
contrasts(d$cond) = cbind(c(0,1,0),c(1,0,0)) # compare each against the weak BATNA condition
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef)
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 4.625 | 1.143369 | 4.045065 | 0.0271980 |
| cond1 | 4.375 | 2.556650 | 1.711224 | 0.1855640 |
| cond2 | -1.125 | 2.556650 | -0.440029 | 0.6896888 |
contrasts(d$cond) = cbind(c(0,1,0),c(0,0,1)) # no BATNA is the reference; compare strong vs. none and weak vs. none
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef) # first beta compares strong vs. no BATNA
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 3.500 | 2.286737 | 1.530565 | 0.2233740 |
| cond1 | 5.500 | 3.233935 | 1.700715 | 0.1875544 |
| cond2 | 1.125 | 2.556650 | 0.440029 | 0.6896888 |
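Equivalently, the weak-reference comparisons can be obtained by releveling the factor so that R’s default treatment (dummy) contrasts use weak as the baseline; a sketch:
# same comparisons as the hand-built dummy codes above, assuming default treatment contrasts apply to the releveled factor
summary(lm(first_offer ~ relevel(cond, ref = "weak"), data = d))$coef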
However, orthogonal contrasts also make sense as an analysis strategy, so I report them here as a supplemental analysis:
contrasts(d$cond)
## [,1] [,2]
## none 0 0
## strong 1 0
## weak 0 1
contrasts(d$cond) = cbind(c(1,1,-2),c(-1,1,0)) # comparing weak to the other two, then strong vs. no BATNA
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef)
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 5.7083333 | 1.1433686 | 4.9925575 | 0.0154554 |
| cond1 | 0.5416667 | 0.6601241 | 0.8205527 | 0.4720290 |
| cond2 | 2.7500000 | 1.6169673 | 1.7007146 | 0.1875544 |
contrasts(d$cond) = cbind(c(-2,1,1),c(0,1,-1)) # comparing no BATNA to the other two, then strong vs. weak
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef)
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 5.708333 | 1.1433686 | 4.992558 | 0.0154554 |
| cond1 | 1.104167 | 0.8732622 | 1.264416 | 0.2953838 |
| cond2 | 2.187500 | 1.2783249 | 1.711224 | 0.1855640 |
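As a quick sanity check that these are indeed orthogonal contrasts, the columns of each contrast matrix should have a dot product of zero:
sum(c(1, 1, -2) * c(-1, 1, 0))   # 0: weak-vs-rest is orthogonal to strong-vs-none
sum(c(-2, 1, 1) * c(0, 1, -1))   # 0: none-vs-rest is orthogonal to strong-vs-weak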
The original paper also calculated effect sizes for all comparisons between conditions. Below I calculate effect sizes for each comparison (strong vs. weak, weak vs. none, and none vs. strong) for both first offer and perceived power.
All effect sizes were higher than .76 for the first offer and higher than .42 for the power DV in the original paper.
library(effsize) # provides cohen.d() (assumed; the output format below matches this package)

# effect sizes: split the data into pairwise condition subsets
d.ws <- d %>% filter(cond == "weak" | cond == "strong"); d.ws$cond <- droplevels(d.ws$cond)
d.wn <- d %>% filter(cond == "weak" | cond == "none"); d.wn$cond <- droplevels(d.wn$cond)
d.sn <- d %>% filter(cond == "none" | cond == "strong"); d.sn$cond <- droplevels(d.sn$cond)

# calculate d for first offer
cohen.d(d.ws$first_offer, d.ws$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.wn$first_offer, d.wn$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.sn$first_offer, d.sn$cond)
## Warning in qt((1 - conf.level)/2, df): NaNs produced
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
# calculate d for power
cohen.d(d.ws$power, d.ws$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.wn$power, d.wn$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.sn$power, d.sn$cond)
## Warning in qt((1 - conf.level)/2, df): NaNs produced
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
In addition, the original paper calculated CIs for each mean of each condition for both power and first offers. I replicate these below.
# 95% confidence interval for a mean, via the t distribution
ci.95 <- function(mean, sd, n) {
  error <- qt(0.975, df = n - 1) * sd / sqrt(n)
  left <- mean - error
  right <- mean + error
  print(left)
  print(right)
}
d.s = d %>% filter(d$cond=="strong")
d.n = d %>% filter(d$cond=="none")
d.w = d %>% filter(d$cond=="weak")
#strong BATNA CI
ci.95(mean(d.s$first_offer), sd(d.s$first_offer), length(d.s$first_offer))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
#weak BATNA CI
ci.95(mean(d.w$first_offer), sd(d.w$first_offer), length(d.w$first_offer))
## [1] 0.9862909
## [1] 8.263709
#no BATNA CI
ci.95(mean(d.n$first_offer), sd(d.n$first_offer), length(d.n$first_offer))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
#strong BATNA CI
ci.95(mean(d.s$power), sd(d.s$power), length(d.s$power))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
#weak BATNA CI
ci.95(mean(d.w$power), sd(d.w$power), length(d.w$power))
## [1] 3.908777
## [1] 7.091223
#no BATNA CI
ci.95(mean(d.n$power), sd(d.n$power), length(d.n$power))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
Any follow-up analyses will be added here
Here will be a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.
Here will be an open-ended commentary reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt.