Justin Salloum
The present study replicates Experiment 2 from the original research, which assesses whether perceivers use speakers’ hierarchy-induced acoustic cues to make hierarchical inferences about speakers. The researchers found that “perceivers used higher pitch, greater loudness, and greater loudness variability to make accurate inferences of speakers’ hierarchical rank, demonstrating that acoustic cues are systematically used to detect hierarchy.” Of particular interest is the result that “speakers who had been in the high-rank condition — regardless of their sex — were rated as more likely to engage in high-rank behaviors than were those in the low-rank condition.”
Original effect size \(\eta^2\) = 0.603, \(f^2\) = 0.572. The effect size was determined using the F-statistic and the between- and within-subject degrees of freedom.
\(df_1 = 1\)
\(df_2 = 55\)
\(F(1, 55) = 83.67\)
\(\eta^2 = \frac{df_1F}{df_1F + df_2} = 0.603\)
\(f^2 = \frac{\eta^2}{1 - \eta^2} = 0.572\)
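The two formulas above can be checked directly in R. This is a minimal sketch that uses only the reported F statistic and degrees of freedom; the variable names are ours and do not come from the original analysis.

```r
# Values reported for the original study
df1 <- 1
df2 <- 55
f_stat <- 83.67

# Eta squared recovered from the F statistic and its degrees of freedom
eta_sq <- (df1 * f_stat) / (df1 * f_stat + df2)

# f squared computed from eta squared with the formula given above
f_sq <- eta_sq / (1 - eta_sq)

round(eta_sq, 3)  # 0.603
round(f_sq, 3)    # 1.521
```

The first value matches the reported \(\eta^2\). The second does not match the reported \(f^2\) of 0.572, so that figure may have been obtained in a different way than the formula suggests.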
Power analysis was done using the software G*Power. To detect an effect size of \(\eta^2\) = 0.603, the following sample sizes are needed to achieve various levels of power:
| Power | Sample Size Needed |
|---|---|
| 0.8 | 17 |
| 0.9 | 21 |
| 0.95 | 25 |
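A power calculation of this kind can be approximated in base R. The sketch below is not the G*Power procedure itself: it assumes an F test with one numerator degree of freedom, a significance level of 0.05, a noncentrality parameter of \(f^2 \times N\), and the reported \(f^2\) = 0.572 as input. The helper `n_for_power` is a hypothetical name, and because the test family used in G*Power is not stated, the results need not match the table exactly.

```r
# Smallest total N that reaches the target power for an F test with `u`
# numerator degrees of freedom, assuming noncentrality parameter f2 * N.
n_for_power <- function(f2, power, u = 1, alpha = 0.05) {
  n <- u + 2                      # smallest N that leaves one error df
  repeat {
    v <- n - u - 1                # denominator degrees of freedom
    crit <- qf(1 - alpha, u, v)   # critical F value under the null
    achieved <- pf(crit, u, v, ncp = f2 * n, lower.tail = FALSE)
    if (achieved >= power) return(n)
    n <- n + 1
  }
}

# Required N for each power level considered above
needed <- sapply(c(0.8, 0.9, 0.95), function(p) n_for_power(f2 = 0.572, power = p))
needed
```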
All of these sample sizes are reasonable and financially feasible. Note that the sample size in this analysis corresponds to the number of stimuli (speakers) rather than the number of participants in the experiment, because the original research averaged the scores for each speaker over all participants.
Just like in the original study, 40 undergraduates will be randomly selected for the sample, without restriction on age, gender or demographics. However, as a result of our power analysis, each participant will listen to only 24 speakers, as opposed to 60, which was the number of speakers in the original research. (The power analysis called for 25 speakers, but the number must be even to ensure equal representation of each speaker sex.)
Recording of speakers saying aloud the Negotiation Passage: “I’m glad that we are able to meet today and I am looking forward to our negotiation. I know that you and I have different perspectives on some of the key issues and that these differences would need to be resolved for us to come to an agreement.”
“The voices’ baseline acoustics served as the criterion for the subset of voices such that the chosen voices’ baseline values had a smaller average deviation from the mean of their respective sex’s baseline values.”
The following items are used to measure hierarchy-based behavioral inferences:
“Each perceiver listened to a subset of recordings of the Negotiation Passage from Experiment 1 (12 female and 12 male voices). After each recording, perceivers rated the speaker on 12 hierarchy-based behaviors plausible in a negotiation context, using a scale from 1 (not at all) to 7 (very much). Six of these behaviors were associated with high rank, and six with low rank. The order of the speakers and the order of the behaviors were randomized for each perceiver. The low-rank behaviors were reverse-scored, and then scores for all 12 behaviors were averaged to create one composite hierarchical-inference score per perceiver per speaker.”
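The scoring rule described in the quoted procedure can be illustrated with a small example. The ratings below are made up, and the ordering of the items (six high-rank items followed by six low-rank items) is an assumption for illustration only.

```r
# Hypothetical ratings from one perceiver for one speaker on the 1-7 scale.
# Assumption: the first six items are high-rank behaviors, the last six low-rank.
ratings <- c(6, 5, 7, 6, 5, 6,   # high-rank items
             2, 3, 1, 2, 2, 3)   # low-rank items
is_low_rank <- rep(c(FALSE, TRUE), each = 6)

# Reverse-score the low-rank items on a 1-7 scale: 1 <-> 7, 2 <-> 6, and so on
scored <- ifelse(is_low_rank, 8 - ratings, ratings)

# Composite hierarchical-inference score: mean of all 12 scored items
composite <- mean(scored)
round(composite, 2)  # 5.83
```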
The original procedure is followed exactly, with the exception that Experiment 1 is not carried out; the original research used it only to obtain the recordings of the speakers.
The data will be analyzed with the same approach as in the original research. Effect of condition on hierarchy-based behavioral inferences:
“We examined the extent to which perceivers’ hierarchical inferences were consistent with the speakers’ hierarchical rank using a 2 (speaker’s condition: high rank, low rank) × 2 (speaker’s sex: male, female) analysis of variance.”
As in the original research, the current research will look for a main effect of speaker’s condition, as well as a main effect of speaker sex and an interaction between speaker condition and sex.
The biggest difference from the original study is the number of speakers that each participant listens to. In the original study each participant listened to and made inferences about 60 speakers, whereas in this study the number of speakers is reduced to 24. Another key difference is that answering the 12 questions about hierarchy-based behavior is the only inferential task that participants perform in this study. The third difference between the current study and the original study is the setting; the current study will be entirely online and distributed via Amazon Mechanical Turk.
Loading the libraries needed for data analysis.
options(warn=-1)
rm(list=ls())
library(tidyr)
library(dplyr)
library(ggplot2)
library(rjson)
library(tidyjson)
library(lme4)
library(lmerTest)
library(gridExtra)
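# Standard error of the mean; n counts only non-missing values, matching sd(na.rm=TRUE)
sem <- function(x) {sd(x, na.rm=TRUE) / sqrt(sum(!is.na(x)))}
# Half-width of a normal-approximation 95% confidence interval
ci95 <- function(x) {sem(x) * 1.96}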
Data is read from the various json files into a data frame in long form.
wid = 1
# dir() takes a regular expression, not a glob, so match the ".json" suffix explicitly
files = dir("./production-results/", pattern = "\\.json$")
d.raw = data.frame()
for (f in files) {
  jf = paste0("./production-results/", f)
  jd = fromJSON(paste(readLines(jf), collapse=""))
  # One row per speaker rated by this worker
  for (elem in jd$answers$data) {
    id = data.frame(workerId = as.factor(wid),
                    speakerId = elem$speakerId,
                    speakerSex = elem$sex,
                    plev = elem$plev,
                    behaviorScore = elem$behaviorScore)
    d.raw = bind_rows(d.raw, id)
  }
  # Workers are numbered by file order
  wid = wid + 1
}
The original data is simply read in from the csv provided on OSF.
d = read.csv('S2_voice_level_Final.csv')
To prepare the data for analysis, speakerSex and plev are recoded as [‘Male’, ‘Female’] and [‘Low’, ‘High’], respectively. Two sets of analysis will be carried out: one on the data aggregated by speaker, as in the original study, and one on the raw, unaggregated data.
# d.a is the data aggregated by speaker with speakerSex and plev as numeric, while d.af stores speakerSex and plev as text labels. The same applies to d.o and d.of, which hold the original data
d.a = aggregate(d.raw[3:5], list(d.raw$speakerId), mean)
d.a = rename(d.a, speakerId = Group.1)
d.af = d.a
d.a$speakerSex[d.a$speakerSex == -1] = 0
d.a$plev[d.a$plev == -1] = 0
d.af$speakerSex[d.af$speakerSex == -1] = 'Male'
d.af$speakerSex[d.af$speakerSex == 1] = 'Female'
d.af$plev[d.af$plev == -1] = 'Low'
d.af$plev[d.af$plev == 1] = 'High'
d.o = select(d, voice, plev, vsex, newpster)
d.o = rename(d.o, speakerSex = vsex, behaviorScore = newpster)
d.o = d.o[complete.cases(d.o),]
d.of = d.o
d.of$speakerSex[d.of$speakerSex == -1] = 'Male'
d.of$speakerSex[d.of$speakerSex == 1] = 'Female'
d.of$plev[d.of$plev == -1] = 'Low'
d.of$plev[d.of$plev == 1] = 'High'
# d.raw is the raw data with speakerSex and plev as numeric, while d.rawf stores speakerSex and plev as text labels
d.rawf = d.raw
d.raw$speakerSex[d.raw$speakerSex == -1] = 0
d.raw$plev[d.raw$plev == -1] = 0
d.rawf$speakerSex[d.rawf$speakerSex == -1] = 'Male'
d.rawf$speakerSex[d.rawf$speakerSex == 1] = 'Female'
d.rawf$plev[d.rawf$plev == -1] = 'Low'
d.rawf$plev[d.rawf$plev == 1] = 'High'
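The recoding above yields character columns, which sort alphabetically (so 'High' comes before 'Low' on a plot axis). As a possible alternative, and not what the analysis above does, `factor()` can map the codes to labels in one step and fix the level order. The vector `codes` is hypothetical example data.

```r
# Hypothetical -1/1 codes in the style of the raw data
codes <- c(-1, 1, 1, -1)

# factor() maps each code to its label and fixes the order of the levels
sex <- factor(codes, levels = c(-1, 1), labels = c("Male", "Female"))
cond <- factor(codes, levels = c(-1, 1), labels = c("Low", "High"))

levels(sex)   # "Male" "Female"
levels(cond)  # "Low" "High"
```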
Now that the data has been prepared, here’s a quick look at our data (in both forms) that we will analyze:
print(head(d.af))
## speakerId speakerSex plev behaviorScore
## 1 5 Male High 4.481481
## 2 16 Male High 4.555556
## 3 24 Male Low 3.925926
## 4 37 Male High 4.444444
## 5 40 Male High 4.398148
## 6 45 Male Low 3.601852
print(head(d.rawf))
## Source: local data frame [6 x 5]
##
## workerId speakerId speakerSex plev behaviorScore
## (chr) (dbl) (chr) (chr) (dbl)
## 1 1 37 Male High 4.750000
## 2 1 24 Male Low 2.833333
## 3 1 163 Female Low 4.333333
## 4 1 75 Male Low 4.333333
## 5 1 53 Male High 4.333333
## 6 1 155 Female Low 2.833333
We will gather an idea of the distribution of behavior scores in relation to hierarchy condition and speaker sex, and compare with the original results.
bx1 = ggplot(d.of, aes(x = plev, y = behaviorScore, fill = speakerSex)) +
geom_boxplot() +
labs(title = 'Original Results', x = 'Hierarchy Condition', y = 'Behavior Score') +
scale_fill_discrete(name = 'Speaker Sex')
bx2 = ggplot(d.af, aes(x = plev, y = behaviorScore, fill = speakerSex)) +
geom_boxplot() +
labs(title = 'Replication Results', x = 'Hierarchy Condition', y = 'Behavior Score') +
scale_fill_discrete(name = 'Speaker Sex')
grid.arrange(bx1, bx2, ncol = 2)
We will use bar plots to get an idea of the average behavior scores between hierarchy condition and speaker sex, and compare with the original results.
# stat = 'summary' with fun = mean draws one bar per group at the group mean
# (the argument is named fun.y in ggplot2 versions before 3.3.0)
bp1 = ggplot(d.of, aes(x = plev, y = behaviorScore, fill = speakerSex)) +
geom_bar(position = 'dodge', stat = 'summary', fun = mean) +
labs(title = 'Original Results', x = 'Hierarchy Condition', y = 'Behavior Score') +
scale_fill_discrete(name = 'Speaker Sex')
bp2 = ggplot(d.af, aes(x = plev, y = behaviorScore, fill = speakerSex)) +
geom_bar(position = 'dodge', stat = 'summary', fun = mean) +
labs(title = 'Replication Results', x = 'Hierarchy Condition', y = 'Behavior Score') +
scale_fill_discrete(name = 'Speaker Sex')
grid.arrange(bp1, bp2, ncol = 2)
Additive Model
rs1.11 = aov(behaviorScore ~ plev + speakerSex, data = d.af)
summary(rs1.11)
## Df Sum Sq Mean Sq F value Pr(>F)
## plev 1 1.8828 1.8828 26.579 4.16e-05 ***
## speakerSex 1 0.2242 0.2242 3.164 0.0897 .
## Residuals 21 1.4876 0.0708
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
rs1.12 = lm(behaviorScore ~ plev + speakerSex, data = d.a)
summary(rs1.12)
##
## Call:
## lm(formula = behaviorScore ~ plev + speakerSex, data = d.a)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.44892 -0.13807 0.00783 0.11001 0.61404
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.8582 0.1002 38.513 < 2e-16 ***
## plev 0.5275 0.1102 4.787 9.93e-05 ***
## speakerSex -0.1960 0.1102 -1.779 0.0897 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2662 on 21 degrees of freedom
## Multiple R-squared: 0.5862, Adjusted R-squared: 0.5467
## F-statistic: 14.87 on 2 and 21 DF, p-value: 9.48e-05
Interactive Model
rs1.21 = aov(behaviorScore ~ plev * speakerSex, data = d.af)
summary(rs1.21)
## Df Sum Sq Mean Sq F value Pr(>F)
## plev 1 1.8828 1.8828 25.621 5.97e-05 ***
## speakerSex 1 0.2242 0.2242 3.050 0.0961 .
## plev:speakerSex 1 0.0178 0.0178 0.243 0.6277
## Residuals 20 1.4698 0.0735
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
rs1.22 = lm(behaviorScore ~ plev * speakerSex, data = d.a)
summary(rs1.22)
##
## Call:
## lm(formula = behaviorScore ~ plev * speakerSex, data = d.a)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.41667 -0.12440 -0.00529 0.10741 0.64630
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.8259 0.1212 31.558 < 2e-16 ***
## plev 0.5828 0.1587 3.672 0.00151 **
## speakerSex -0.1407 0.1587 -0.887 0.38581
## plev:speakerSex -0.1106 0.2245 -0.493 0.62766
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2711 on 20 degrees of freedom
## Multiple R-squared: 0.5911, Adjusted R-squared: 0.5298
## F-statistic: 9.638 on 3 and 20 DF, p-value: 0.000383
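For comparison with the original effect size, the same conversion from F to \(\eta^2\) can be applied to the effect of plev in the interactive model. This is a sketch using the values printed in the aov() table above; the variable names are ours.

```r
# Values taken from the aov() table for plev in the interactive model
f_plev <- 25.621
df1 <- 1
df2 <- 20

# Same conversion used for the original effect size
eta_sq_rep <- (df1 * f_plev) / (df1 * f_plev + df2)
round(eta_sq_rep, 3)  # 0.562

# Cross-check against the sums of squares: SS_effect / (SS_effect + SS_residual)
ss_plev <- 1.8828
ss_resid <- 1.4698
eta_sq_ss <- ss_plev / (ss_plev + ss_resid)
round(eta_sq_ss, 3)  # 0.562
```

Both routes give roughly 0.56, compared with 0.603 reported for the original study.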
All the data analysis thus far replicates the analysis that was done in the original study. From here on, the analysis follows up on the original by examining the raw data (unaggregated, in long form) to look for other effects.
Just like with the aggregated data, we will gather an idea of the distribution of and average behavior scores in relation to hierarchy condition and speaker sex.
ggplot(d.rawf, aes(x = plev, y = behaviorScore, fill = speakerSex)) +
geom_boxplot() +
labs(title = 'Replication Results (raw)', x = 'Hierarchy Condition', y = 'Behavior Score') +
scale_fill_discrete(name = 'Speaker Sex')
# stat = 'summary' with fun = mean plots the mean score of each group
ggplot(d.rawf, aes(x = plev, y = behaviorScore, fill = speakerSex)) +
geom_bar(position = 'dodge', stat = 'summary', fun = mean) +
labs(title = 'Replication Results (raw)', x = 'Hierarchy Condition', y = 'Behavior Score') +
scale_fill_discrete(name = 'Speaker Sex')
Random effect of speakerId
rs2.2 = lmer(behaviorScore ~ (plev * speakerSex) + (1 | speakerId), data = d.raw )
summary(rs2.2)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula: behaviorScore ~ (plev * speakerSex) + (1 | speakerId)
## Data: d.raw
##
## REML criterion at convergence: 416.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.57592 -0.63208 0.05622 0.62009 2.38285
##
## Random effects:
## Groups Name Variance Std.Dev.
## speakerId (Intercept) 0.03277 0.1810
## Residual 0.36645 0.6053
## Number of obs: 216, groups: speakerId, 24
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.8259 0.1212 20.0000 31.558 < 2e-16 ***
## plev 0.5828 0.1587 20.0000 3.672 0.00151 **
## speakerSex -0.1407 0.1587 20.0000 -0.887 0.38581
## plev:speakerSex -0.1106 0.2245 20.0000 -0.493 0.62766
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) plev spkrSx
## plev -0.764
## speakerSex -0.764 0.583
## plev:spkrSx 0.540 -0.707 -0.707
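One way to read the random-effects table is through the share of variance attributable to speakers (an intraclass correlation). The sketch below uses the variance components printed above; it is an illustrative follow-up rather than part of the original analysis plan.

```r
# Variance components reported under "Random effects" above
var_speaker <- 0.03277
var_resid <- 0.36645

# Share of the total variance attributable to differences between speakers
icc_speaker <- var_speaker / (var_speaker + var_resid)
round(icc_speaker, 3)  # 0.082
```

Under these numbers, about 8% of the variance in behavior scores lies between speakers after accounting for condition and sex.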
Random effect of speakerId and workerId
rs2.3 = lmer(behaviorScore ~ (plev * speakerSex) + (1 | workerId) + (1 | speakerId), data = d.raw )
summary(rs2.3)
## Linear mixed model fit by REML t-tests use Satterthwaite approximations
## to degrees of freedom [lmerMod]
## Formula:
## behaviorScore ~ (plev * speakerSex) + (1 | workerId) + (1 | speakerId)
## Data: d.raw
##
## REML criterion at convergence: 416.5
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.57592 -0.63208 0.05622 0.62009 2.38285
##
## Random effects:
## Groups Name Variance Std.Dev.
## speakerId (Intercept) 0.03277 0.1810
## workerId (Intercept) 0.00000 0.0000
## Residual 0.36645 0.6053
## Number of obs: 216, groups: speakerId, 24; workerId, 9
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.8259 0.1212 20.0000 31.558 < 2e-16 ***
## plev 0.5828 0.1587 20.0000 3.672 0.00151 **
## speakerSex -0.1407 0.1587 20.0000 -0.887 0.38581
## plev:speakerSex -0.1106 0.2245 20.0000 -0.493 0.62766
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) plev spkrSx
## plev -0.764
## speakerSex -0.764 0.583
## plev:spkrSx 0.540 -0.707 -0.707