Schaerer et al.’s goal in Experiment 1a was to provide preliminary evidence that having an alternative is not always optimal in a negotiation. While previous research argued that alternatives give negotiators power because they are not completely dependent on the current negotiation, the study aimed to demonstrate that weak alternatives actually lead to worse outcomes than having no alternative offer at all. The mechanism at play is an anchoring effect: the alternative anchors the negotiator’s first offer. Having a weak alternative causes a negotiator to start too low on the first offer, leading to a worse outcome relative to having a strong, attractive alternative and, more importantly, relative to having no alternative at all.
The target finding of the replication was that participants with a weak alternative would make lower first offers than participants with either a strong alternative or no alternative at all in a hypothetical negotiation. A secondary finding was that participants with no alternative would nonetheless feel the least powerful of the three conditions.
The original study reports Cohen’s ds of .76 or higher for the effect of condition on first offers and ds of .42 or higher for the measure of power. Given that the main DV is the first offer, we will make sure we have enough power to capture this effect.
Importantly, while there are three conditions, each effect size is a simple comparison between two conditions. Thus, I will make sure we have enough power to capture the smallest effect size observed for the first offer, which is the one comparing the weak BATNA vs. no BATNA first offers (a t-test using dummy-coded regression). Because the standard deviations differ between groups, I estimated how many participants we needed for 80% power using a package that allows for heterogeneous variance.
I entered the mean difference between the weak BATNA and no BATNA conditions and the standard deviations of both conditions for the first offer DV.
library(samplesize)  # provides n.ttest() (assumed; the warning below matches this package)

n.ttest(power = 0.8, alpha = 0.05, mean.diff = 3.04, sd1 = 5.33, sd2 = 1.74,
        design = "unpaired", variance = "unequal")
## Warning in n.ttest(power = 0.8, alpha = 0.05, mean.diff = 3.04, sd1 =
## 5.33, : Arguments -fraction- and -k- are not used, when variances are
## unequal
## $`Total sample size`
## [1] 46
##
## $`Sample size group 1`
## [1] 34
##
## $`sample size group 2`
## [1] 12
The standard deviations are not equal, so the suggested sample sizes for the two groups differ, but the results of the power analysis indicate that we will achieve 80% power for the weakest effect size with at least 34 people per condition. This seems relatively low, so we decided to use 50 per condition as a conservative test of this effect.
However, just to check, I also ran a power analysis to determine how many people we would need for the secondary DV of perceived power. We can use power.t.test in this case, since the variance is homogeneous.
power.t.test(delta=0.5, sd=1, power=.80, sig.level=0.05)
##
## Two-sample t test power calculation
##
## n = 63.76576
## delta = 0.5
## sd = 1
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
For 80% power, we would need 64 per condition (rounding up from 63.77). Thus, our study is underpowered for this DV. That is acceptable, since the main DV we are interested in is the first offer.
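To quantify the shortfall, here is a quick sketch of the power we would actually achieve for this DV with 50 per condition, under the same assumptions (d = 0.5):
power.t.test(n = 50, delta = 0.5, sd = 1, sig.level = 0.05)  # solves for power; roughly 0.70 with n = 50 per group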
The planned sample size is based on the power analysis: we will run 50 per condition. There were no preselections, as there were none in the original study.
We contacted the authors and got the survey materials with exact wording. See below.
Participants will be randomly assigned to a no BATNA (best alternative to a negotiated agreement), strong BATNA, or weak BATNA condition.
Participants were told:
“Imagine that you just bought an MP3 player and you want to sell some of your old CDs since you no longer need them. One of the CDs you want to sell is the album ‘Forty Licks’ by The Rolling Stones. Even though the CD is used, it is in reasonably good condition (case intact, no scratches on disc).” Then, they were assigned to one of three conditions.
“Someone (‘the buyer’) just approached you and raised an interest in purchasing your CD. The buyer asks you how much you want for the CD. You also know that nobody else has offered you any money for the CD. Thus, if you can’t reach an agreement in the current negotiation, you won’t get any money for the CD. You are negotiating the price for the CD. What is your first offer to the buyer?”
“Someone (‘the buyer’) just approached you and raised an interest in purchasing your CD. The buyer asks you how much you want for the CD. You also know that another buyer has offered you $8 for the CD. Thus, if you can’t reach an agreement in the current negotiation, you will get $8 for the CD. You are negotiating the price for the CD. What is your first offer to the buyer?”
“Immediately after the BATNA manipulation, participants were instructed to make the first offer, which was our key dependent measure. Then, the negotiation was terminated, and participants were asked to indicate the extent to which they felt powerful (1 = powerless, 7 = powerful), in control (1 = no control, 7 = in control), strong (1 = weak, 7 = strong), and confident (1 = unconfident, 7 = confident). Responses to these four items were averaged to create a single measure of perceived power. Finally, participants completed an attention check and provided demographic information.” (See the experiment for details; participants were asked to check specific boxes at the end of a long paragraph of irrelevant instructions.)
The procedure was the same as when the original authors ran the study on MTurk: participants first read the manipulation (a hypothetical negotiation situation) and then answered the dependent variable questions in the same order as in the original paper.
As the original authors did, we excluded participants who had a duplicate IP address, incorrectly answered the attention check, or made first offers with extreme values (> 3 SD from the mean). In the original paper, this amounted to 17/305, or ~6%.
For the first offers, we used dummy coding and regression to compare the mean first offer of the no BATNA condition to the other two conditions. We also used different dummy codes to compare the strong BATNA to the no BATNA condition. For perceived power (the secondary DV), we aggregated the four items to create a single measure, as the original paper did, and then used regression and dummy coding to compare the no alternative condition to the other two conditions, as well as the weak condition relative to the strong condition. In addition, to match the exact analyses of the authors, we also calculated effect sizes and reported confidence intervals for the means of the different conditions.
Fortunately, we did not anticipate many differences between this replication and the original paper. Both studies will be run on MTurk with similar demographics, and the materials will be identical to the original study. The only difference is that we will pay $0.30, as we determined the task only takes around three minutes.
Sample size, demographics, data exclusions based on rules spelled out in analysis plan.
Any differences from what was described as the original plan, or “none”.
Read in the data from .json files, and format into a dataframe.
path <- "~/Box Sync/mturk/"
files <- dir(paste0(path,"production-results/"),
pattern = "*.json")
d.raw <- data.frame()
for (f in files) {
jf <- paste0(path, "production-results/",f)
jd <- fromJSON(paste(readLines(jf), collapse=""))
id <- data.frame(ip = jd$answer$data$fingerprint[[1]]$answer$ip,
workerid = jd$WorkerId,
cond = as.factor(jd$answer$data$cond),
first_offer = as.numeric(jd$answer$data$first_offer),
power= as.numeric(jd$answer$data$power),
control = as.numeric(jd$answer$data$control),
strong = as.numeric(jd$answer$data$strong),
confident = as.numeric(jd$answer$data$confident),
sex =jd$answer$data$sex,
age = jd$answer$data$age,
nat = jd$answer$data$nat,
lang = jd$answer$data$langu,
eth = jd$answer$data$eth,
att_check= jd$answer$data$AC)
d.raw <- bind_rows(d.raw, id)
}
As reported, I removed all IPs that were duplicated, participants who didn’t pass the attention check, and any first offers that were 3 SD above the mean, just as the authors did.
d1 <- d.raw[d.raw$ip == "", ]  # dataset without IP addresses
d2 <- d.raw[d.raw$ip != "", ]  # dataset with IP addresses
d <- d2[!(duplicated(d2$ip) | duplicated(d2$ip, fromLast = TRUE)), ]  # first remove ALL duplicates among IP addresses that were not blank
d <- rbind(d1, d)
Here is the proportion of participants who remain in the study for analysis after removing duplicate IP addresses:
length(d$power)/ length(d.raw$power)
## [1] 1
Now I will aggregate the four power measures to get the perceived power DV.
d = d %>% filter (att_check =="pass" & first_offer < (3*sd(first_offer))) %>%
mutate (agg_power = (power + control + strong + confident)/4)
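As an additional sanity check (not part of the original analysis plan), the overall retention after all exclusions can be computed against the raw data:
nrow(d) / nrow(d.raw)  # overall retention after IP, attention-check, and outlier exclusions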
Let’s look at some histograms first!
library(ggplot2)  # for qplot and ggplot

qplot(d$power)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
qplot(d$first_offer)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Cronbach’s alpha in the original paper was .92.
# calculating Cronbach's alpha
library(psych)  # for ICC

power <- cbind(d$power, d$control, d$strong, d$confident)
ICC(power, missing = FALSE)[[1]]$ICC[6]  # alpha is a subclass of intraclass correlation coefficients; this one (ICC3k) captures Cronbach's alpha
## [1] 0.9430255
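As a cross-check on the ICC-based calculation, the psych package’s alpha() function should return the same value:
psych::alpha(power)$total$raw_alpha  # Cronbach's alpha computed directly; should match the ICC3k value above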
Here is the original graph:
Here are my plots:
d.mean = d %>%
group_by (cond) %>%
summarise (power= mean(agg_power),
first_offer1 = mean(first_offer),
sem_power = sem(agg_power),
sem_fo = sem(first_offer))
d.mean$cond <- factor(d.mean$cond, levels = c("none","weak","strong"))
ggplot(d.mean, aes(x = cond, y = first_offer1)) +
  geom_bar(position = position_dodge(), stat = "identity") +
  geom_errorbar(aes(ymin = first_offer1 - sem_fo, ymax = first_offer1 + sem_fo),
                width = .2,
                position = position_dodge(.9))
ggplot(d.mean, aes(x = cond, y = power, group = 1)) +
  geom_line(position = position_dodge(), stat = "identity") +
  geom_point() +
  geom_errorbar(aes(ymin = power - sem_power, ymax = power + sem_power),
                width = .2,
                position = position_dodge(.9))
The original paper compared the first offers of the weak BATNA condition to both the strong BATNA and no BATNA conditions and found that the weak BATNA first offer (M = $4.57, SD = 1.74, 95% CI = [4.33, 4.92]) was significantly lower than both the strong BATNA (M = $11.02, SD = 1.90, 95% CI = [10.63, 11.42]) and the no BATNA first offers. In addition, they found that the no BATNA first offer (M = $7.61, SD = 5.33, 95% CI = [6.54, 8.68]) was lower than the strong BATNA first offer, using t-tests in dummy-coded regression (all t’s higher than 6.18).
In addition, the original paper found that the weak BATNA participants (M = 5.25, SD = 1.09, 95% CI = [5.03, 5.47]) reported feeling less powerful than the strong BATNA participants (M = 5.75, SD = 0.96, 95% CI = [5.55, 5.95]) and more powerful than the no BATNA participants (M = 4.80, SD = 1.03, 95% CI = [4.59, 5.01]), and that (obviously) the strong BATNA condition felt more powerful than the no BATNA condition, again using t-tests in dummy-coded regression.
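As a rough consistency check on the power-analysis inputs above, the smallest original effect size (weak vs. no BATNA first offers) can be reconstructed from these reported statistics, assuming roughly equal group sizes so the pooled SD is the root mean square of the two SDs:
# reported stats: no BATNA M = 7.61, SD = 5.33; weak BATNA M = 4.57, SD = 1.74
pooled_sd <- sqrt((5.33^2 + 1.74^2) / 2)
(7.61 - 4.57) / pooled_sd  # ~0.77, consistent with the reported d of .76 or higher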
I replicate these analyses below using dummy coding and regression.
d$cond = as.factor(d$cond)
contrasts(d$cond)
## strong weak
## none 0 0
## strong 1 0
## weak 0 1
contrasts(d$cond) = cbind(c(0,1,0),c(1,0,0)) # compare each against the weak BATNA condition
knitr::kable(summary(lm(power ~ cond, data = d))$coef) # note: the DV here is the single 'power' item rather than the agg_power aggregate
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 5.5 | 0.500000 | 11.0000000 | 0.0016089 |
| cond1 | 0.5 | 1.118034 | 0.4472136 | 0.6850376 |
| cond2 | 0.5 | 1.118034 | 0.4472136 | 0.6850376 |
contrasts(d$cond) = cbind(c(0,1,0),c(0,0,1)) # no BATNA is the reference; compare strong vs. none and weak vs. none
knitr::kable(summary(lm(power ~ cond, data = d))$coef) # first beta compares strong vs. no BATNA
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 6.0 | 1.000000 | 6.0000000 | 0.0092727 |
| cond1 | 0.0 | 1.414214 | 0.0000000 | 1.0000000 |
| cond2 | -0.5 | 1.118034 | -0.4472136 | 0.6850376 |
contrasts(d$cond) = cbind(c(0,1,0),c(1,0,0)) # compare each against the weak BATNA condition
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef)
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 4.625 | 1.143369 | 4.045065 | 0.0271980 |
| cond1 | 4.375 | 2.556650 | 1.711224 | 0.1855640 |
| cond2 | -1.125 | 2.556650 | -0.440029 | 0.6896888 |
contrasts(d$cond) = cbind(c(0,1,0),c(0,0,1)) # no BATNA is the reference; compare strong vs. none and weak vs. none
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef) # first beta compares strong vs. no BATNA
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 3.500 | 2.286737 | 1.530565 | 0.2233740 |
| cond1 | 5.500 | 3.233935 | 1.700715 | 0.1875544 |
| cond2 | 1.125 | 2.556650 | 0.440029 | 0.6896888 |
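Equivalently, the weak-reference comparisons can be obtained by releveling the factor so that R’s default treatment (dummy) contrasts use weak as the baseline; a sketch:
# same comparisons as the hand-built dummy codes above, assuming default treatment contrasts apply to the releveled factor
summary(lm(first_offer ~ relevel(cond, ref = "weak"), data = d))$coef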
However, orthogonal contrasts also make sense as an analysis strategy, so I report them here as a supplemental analysis:
contrasts(d$cond)
## [,1] [,2]
## none 0 0
## strong 1 0
## weak 0 1
contrasts(d$cond) = cbind(c(1,1,-2),c(-1,1,0)) # comparing weak to the other two, then strong vs. no BATNA
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef)
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 5.7083333 | 1.1433686 | 4.9925575 | 0.0154554 |
| cond1 | 0.5416667 | 0.6601241 | 0.8205527 | 0.4720290 |
| cond2 | 2.7500000 | 1.6169673 | 1.7007146 | 0.1875544 |
contrasts(d$cond) = cbind(c(-2,1,1),c(0,1,-1)) # comparing no BATNA to the other two, then strong vs. weak
knitr::kable(summary(lm(first_offer ~ cond, data = d))$coef)
| | Estimate | Std. Error | t value | Pr(>|t|) |
|---|---|---|---|---|
| (Intercept) | 5.708333 | 1.1433686 | 4.992558 | 0.0154554 |
| cond1 | 1.104167 | 0.8732622 | 1.264416 | 0.2953838 |
| cond2 | 2.187500 | 1.2783249 | 1.711224 | 0.1855640 |
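As a quick sanity check that these are indeed orthogonal contrasts, the columns of each contrast matrix should have a dot product of zero:
sum(c(1, 1, -2) * c(-1, 1, 0))   # 0: weak-vs-rest is orthogonal to strong-vs-none
sum(c(-2, 1, 1) * c(0, 1, -1))   # 0: none-vs-rest is orthogonal to strong-vs-weak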
The original paper also calculated effect sizes for all comparisons between conditions. Below I calculate effect sizes for each comparison (strong vs. weak, weak vs. none, and none vs. strong) for both first offer and perceived power.
All effect sizes were higher than .76 for the first offer and higher than .42 for the power DV in the original paper.
library(effsize) # provides cohen.d() (assumed; the output format below matches this package)

# effect sizes: split the data into pairwise condition subsets
d.ws <- d %>% filter(cond == "weak" | cond == "strong"); d.ws$cond <- droplevels(d.ws$cond)
d.wn <- d %>% filter(cond == "weak" | cond == "none"); d.wn$cond <- droplevels(d.wn$cond)
d.sn <- d %>% filter(cond == "none" | cond == "strong"); d.sn$cond <- droplevels(d.sn$cond)

# calculate d for first offer
cohen.d(d.ws$first_offer, d.ws$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.wn$first_offer, d.wn$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.sn$first_offer, d.sn$cond)
## Warning in qt((1 - conf.level)/2, df): NaNs produced
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
# calculate d for power
cohen.d(d.ws$power, d.ws$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.wn$power, d.wn$cond)
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
cohen.d(d.sn$power, d.sn$cond)
## Warning in qt((1 - conf.level)/2, df): NaNs produced
##
## Cohen's d
##
## d estimate: NA (NA)
## 95 percent confidence interval:
## inf sup
## NA NA
In addition, the original paper calculated CIs for each mean of each condition for both power and first offers. I replicate these below.
# 95% confidence interval for a mean, via the t distribution
ci.95 <- function(mean, sd, n) {
  error <- qt(0.975, df = n - 1) * sd / sqrt(n)
  left <- mean - error
  right <- mean + error
  print(left)
  print(right)
}
d.s = d %>% filter(d$cond=="strong")
d.n = d %>% filter(d$cond=="none")
d.w = d %>% filter(d$cond=="weak")
#strong BATNA CI
ci.95(mean(d.s$first_offer), sd(d.s$first_offer), length(d.s$first_offer))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
#weak BATNA CI
ci.95(mean(d.w$first_offer), sd(d.w$first_offer), length(d.w$first_offer))
## [1] 0.9862909
## [1] 8.263709
#no BATNA CI
ci.95(mean(d.n$first_offer), sd(d.n$first_offer), length(d.n$first_offer))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
#strong BATNA CI
ci.95(mean(d.s$power), sd(d.s$power), length(d.s$power))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
#weak BATNA CI
ci.95(mean(d.w$power), sd(d.w$power), length(d.w$power))
## [1] 3.908777
## [1] 7.091223
#no BATNA CI
ci.95(mean(d.n$power), sd(d.n$power), length(d.n$power))
## Warning in qt(0.975, df = n - 1): NaNs produced
## [1] NaN
## [1] NaN
Any follow-up analyses will be added here
Here will be a paragraph summarizing the primary result from the confirmatory analysis and the assessment of whether it replicated, partially replicated, or failed to replicate the original result.
Here will be an open-ended commentary reflecting (a) insights from follow-up exploratory analysis, (b) assessment of the meaning of the replication (or not) - e.g., for a failure to replicate, are the differences between original and present study ones that definitely, plausibly, or are unlikely to have been moderators of the result, and (c) discussion of any objections or challenges raised by the current and original authors about the replication attempt.