For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Ko, Sadler & Galinsky, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

A sense of power has often been tied to how we perceive each other’s voices. Social hierarchy is embedded in the structure of society and provides a metric by which people relate to one another. In 1956, the Brunswik Lens Model was introduced to examine how vocal cues might influence hierarchy. In “The Sound of Power: Conveying and Detecting Hierarchical Rank Through Voice,” Ko and colleagues investigated how manipulating hierarchical rank within a situation might affect vocal acoustic cues. Using the Brunswik Lens Model, they measured six acoustic cues (pitch mean and variability, loudness mean and variability, and resonance mean and variability) to isolate potential differences between the voices of individuals of different hierarchical rank. In the first experiment, Ko, Sadler & Galinsky examined the vocal acoustic cues of individuals before and after they were assigned a hierarchical rank, in a sample of 161 subjects (80 male). Each of the six hierarchy-based acoustic cues was analyzed with a 2 (high vs. low rank condition) x 2 (male vs. female) analysis of covariance, controlling for the baseline of the respective acoustic cue.
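
As a reading aid, the 2 x 2 analysis of covariance described above can be written as a linear model. The notation is an assumption made for illustration and does not come from the paper: $y_i$ is the postmanipulation value of one acoustic cue for speaker $i$, $x_i$ is the baseline value of the same cue, and $c_i$ and $s_i$ are condition and sex, each coded $-1$ or $+1$.

```latex
$$
y_i = \beta_0 + \beta_1 c_i + \beta_2 s_i + \beta_3 c_i s_i + \beta_4 x_i + \varepsilon_i
$$
```

Under this sketch, the effect of condition is the test of $\beta_1$. With 161 speakers and 5 estimated coefficients, the residual degrees of freedom are $161 - 5 = 156$, which is consistent with the reported $F(1, 156)$.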


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

The impact of hierarchical rank on speakers’ acoustic cues. Each of the six hierarchy-based (i.e., postmanipulation) acoustic variables was submitted to a 2 (condition: high rank, low rank) × 2 (speaker’s sex: female, male) between-subjects analysis of covariance, controlling for the corresponding baseline acoustic variable. Table 4 presents the adjusted means by condition. Condition had a significant effect on pitch, pitch variability, and loudness variability. Speakers’ voices in the high-rank condition had higher pitch, F(1, 156) = 4.48, p < .05; were more variable in loudness, F(1, 156) = 4.66, p < .05; and were more monotone (i.e., less variable in pitch), F(1, 156) = 4.73, p < .05, compared with speakers’ voices in the low-rank condition (all other Fs < 1; see the Supplemental Material for additional analyses of covariance involving pitch and loudness). (from Ko et al., 2015, p. 6; emphasis added)

The adjusted means for these analyses are reported in Table 4 (Table4_AdjustedMeans.png, included in the same folder as this Rmd file).


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formatting
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(psych)
# library(car) # for ANCOVA
# library(compute.es) # for ANCOVA
# library(lsmeans) # for ANCOVA

Step 2: Load data

# Just Experiment 1
d <- read_csv("/Users/alexpereira/Library/CloudStorage/OneDrive-Personal/Academics/Stanford/Courses/Fall_2023/PSYCH_251/problem_sets/ps3/Group B/Choice 2/data/S1_voice_level_Final.csv")
# DT::datatable(d)

spec(d)
## cols(
##   voice = col_double(),
##   form_smean = col_double(),
##   form_svar = col_double(),
##   form_rmean = col_double(),
##   form_rvar = col_double(),
##   intense_smean = col_double(),
##   intense_svar = col_double(),
##   intense_rmean = col_double(),
##   intense_rvar = col_double(),
##   pitch_smean = col_double(),
##   pitch_svar = col_double(),
##   pitch_rmean = col_double(),
##   pitch_rvar = col_double(),
##   pow = col_double(),
##   age = col_double(),
##   sex = col_character(),
##   race = col_character(),
##   native = col_character(),
##   feelpower = col_double(),
##   plev = col_double(),
##   vsex = col_double(),
##   pitch_rmeanMD = col_double(),
##   pitch_rvarMD = col_double(),
##   intense_rmeanMD = col_double(),
##   intense_rvarMD = col_double(),
##   formant_rmeanMD = col_double(),
##   formant_rvarMD = col_double(),
##   pitch_smeanMD = col_double(),
##   pitch_svarMD = col_double(),
##   intense_smeanMD = col_double(),
##   intense_svarMD = col_double(),
##   formant_smeanMD = col_double(),
##   formant_svarMD = col_double(),
##   Zpitch_rmean = col_double(),
##   Zpitch_rvar = col_double(),
##   Zform_rmean = col_double(),
##   Zform_rvar = col_double(),
##   Zintense_rmean = col_double(),
##   Zintense_rvar = col_double(),
##   Zpitch_smean = col_double(),
##   Zpitch_svar = col_double(),
##   Zform_smean = col_double(),
##   Zform_svar = col_double(),
##   Zintense_smean = col_double(),
##   Zintense_svar = col_double()
## )
view(d)

Step 3: Tidy data

# Column details sourced from document "Codebook_all_data_sets". We care about:
## voice (voice identification number), plev (hierarchy rank), vsex (speaker sex), form_smean (form mean, script), form_svar (form variance, script), form_rmean (form mean, rainbow), form_rvar (form variance, rainbow), intense_smean (intensity mean, script), intense_svar (intensity variance, script), intense_rmean (intensity mean, rainbow), intense_rvar (intensity variance, rainbow), pitch_smean (mean pitch, script), pitch_svar (pitch variance, script), pitch_rmean (mean pitch, rainbow), pitch_rvar (pitch variance, rainbow)

d_filtered = d %>%
  select(voice, plev, vsex, form_smean, form_svar, form_rmean, form_rvar, pitch_smean, pitch_svar, pitch_rmean, pitch_rvar, intense_smean, intense_svar, intense_rmean, intense_rvar)

d_filtered$plev = ifelse(d$plev == 1, 1, 0)

view(d_filtered)
head(d_filtered)
## # A tibble: 6 × 15
##   voice  plev  vsex form_smean form_svar form_rmean form_rvar pitch_smean
##   <dbl> <dbl> <dbl>      <dbl>     <dbl>      <dbl>     <dbl>       <dbl>
## 1     1     1    -1      1043.    38805.      1259.    68275.        116.
## 2     2     1    -1      1083.    35609.      1278.    54612.        146.
## 3     3     1    -1      1092.    25979.      1334.    55175.        106.
## 4     4     1    -1      1268.    71346.      1298.    74340.        102.
## 5     5     1    -1      1071.    38246.      1256.    67846.        122.
## 6     6     1    -1      1095.    40716.      1278.    76674.        137.
## # ℹ 7 more variables: pitch_svar <dbl>, pitch_rmean <dbl>, pitch_rvar <dbl>,
## #   intense_smean <dbl>, intense_svar <dbl>, intense_rmean <dbl>,
## #   intense_rvar <dbl>
summary(d_filtered)
##      voice            plev             vsex             form_smean  
##  Min.   :  1.0   Min.   :0.0000   Min.   :-1.000000   Min.   :1013  
##  1st Qu.: 41.0   1st Qu.:0.0000   1st Qu.:-1.000000   1st Qu.:1070  
##  Median :101.0   Median :1.0000   Median : 1.000000   Median :1096  
##  Mean   : 91.4   Mean   :0.5093   Mean   : 0.006211   Mean   :1129  
##  3rd Qu.:141.0   3rd Qu.:1.0000   3rd Qu.: 1.000000   3rd Qu.:1162  
##  Max.   :181.0   Max.   :1.0000   Max.   : 1.000000   Max.   :1380  
##    form_svar        form_rmean     form_rvar       pitch_smean    
##  Min.   : 25979   Min.   :1150   Min.   : 37436   Min.   : 89.71  
##  1st Qu.: 34293   1st Qu.:1261   1st Qu.: 55266   1st Qu.:113.63  
##  Median : 38621   Median :1295   Median : 62655   Median :156.98  
##  Mean   : 42912   Mean   :1293   Mean   : 64131   Mean   :157.08  
##  3rd Qu.: 44206   3rd Qu.:1326   3rd Qu.: 70318   3rd Qu.:193.81  
##  Max.   :107176   Max.   :1404   Max.   :129383   Max.   :254.74  
##    pitch_svar       pitch_rmean       pitch_rvar     intense_smean  
##  Min.   :  44.42   Min.   : 87.16   Min.   :  30.9   Min.   :48.04  
##  1st Qu.: 771.57   1st Qu.:110.23   1st Qu.: 806.2   1st Qu.:56.38  
##  Median :1542.60   Median :154.72   Median :1682.6   Median :59.11  
##  Mean   :1537.64   Mean   :149.57   Mean   :1752.5   Mean   :59.02  
##  3rd Qu.:2091.64   3rd Qu.:183.38   3rd Qu.:2414.0   3rd Qu.:61.51  
##  Max.   :4706.32   Max.   :228.92   Max.   :7345.5   Max.   :71.08  
##   intense_svar    intense_rmean    intense_rvar   
##  Min.   : 61.05   Min.   :46.50   Min.   : 82.37  
##  1st Qu.:155.32   1st Qu.:54.47   1st Qu.:145.72  
##  Median :186.90   Median :57.50   Median :174.09  
##  Mean   :190.13   Mean   :57.46   Mean   :182.75  
##  3rd Qu.:218.15   3rd Qu.:60.02   3rd Qu.:212.87  
##  Max.   :334.19   Max.   :70.23   Max.   :349.52

Step 4: Run analysis

Pre-processing

Descriptive statistics

In the paper, the adjusted means by condition are reported (see Table 4, or Table4_AdjustedMeans.png, included in the same folder as this Rmd file). Reproduce these values below:

## I'm not following how each mean was "adjusted by its corresponding baseline". It can't be just the difference between baseline (rainbow) and measure (script), because the values are too small, i.e.,

high_rank = subset(d_filtered, plev == 1)
high_rank_mean_pitch = mean(high_rank$pitch_smean - high_rank$pitch_rmean)
low_rank = subset(d_filtered, plev != 1)
low_rank_mean_pitch = mean(low_rank$pitch_smean - low_rank$pitch_rmean)

print(paste0("High-Rank Condition Mean Pitch: ", high_rank_mean_pitch))
## [1] "High-Rank Condition Mean Pitch: 9.02192102414634"
print(paste0("Low-Rank Condition Mean Pitch: ", low_rank_mean_pitch))
## [1] "Low-Rank Condition Mean Pitch: 5.94897239962025"
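
One possible reading of "adjusted" is that Table 4 reports covariate-adjusted (estimated marginal) means from the ANCOVA itself, rather than difference scores. I have not confirmed this against the paper's materials, so the code below is only a sketch under that assumption. `adjusted_means()` is a hypothetical helper written for this report: it fits the 2 x 2 model with the baseline as a covariate, then predicts the outcome for each condition at the grand mean of the baseline, weighting the two sexes equally.

```r
# Hypothetical helper (not from the paper): covariate-adjusted means by condition.
# Assumes `cond` and `sex` each take exactly two numeric values.
adjusted_means <- function(data, outcome, baseline, cond = "plev", sex = "vsex") {
  cond_raw <- data[[cond]]
  sex_raw  <- data[[sex]]
  df <- data.frame(
    y    = data[[outcome]],
    base = data[[baseline]],
    # Effect coding: the larger level becomes +1, the smaller level becomes -1
    cond = ifelse(cond_raw == max(cond_raw, na.rm = TRUE), 1, -1),
    sex  = ifelse(sex_raw == max(sex_raw, na.rm = TRUE), 1, -1)
  )
  df <- df[complete.cases(df), ]

  # 2 x 2 ANCOVA expressed as a linear model
  fit <- lm(y ~ cond * sex + base, data = df)

  # Predict at the grand-mean baseline; sex = 0 averages over both sexes equally
  grid <- data.frame(cond = c(-1, 1), sex = 0, base = mean(df$base))
  data.frame(
    level         = c(min(cond_raw, na.rm = TRUE), max(cond_raw, na.rm = TRUE)),
    adjusted_mean = unname(predict(fit, newdata = grid))
  )
}

# Usage with the tidied data from Step 3 (not run here):
# adjusted_means(d_filtered, outcome = "pitch_smean", baseline = "pitch_rmean")
```

If this reading is right, the values returned for each cue should be on the same scale as the raw measurements, not near zero like the difference scores above. Whether they match Table 4 still has to be checked.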

Inferential statistics

The impact of hierarchical rank on speakers’ acoustic cues. Each of the six hierarchy-based (i.e., postmanipulation) acoustic variables was submitted to a 2 (condition: high rank, low rank) × 2 (speaker’s sex: female, male) between-subjects analysis of covariance, controlling for the corresponding baseline acoustic variable. […] Condition had a significant effect on pitch, pitch variability, and loudness variability. Speakers’ voices in the high-rank condition had higher pitch, F(1, 156) = 4.48, p < .05; were more variable in loudness, F(1, 156) = 4.66, p < .05; and were more monotone (i.e., less variable in pitch), F(1, 156) = 4.73, p < .05, compared with speakers’ voices in the low-rank condition (all other Fs < 1; see the Supplemental Material for additional analyses of covariance involving pitch and loudness).

# reproduce the above results here

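# Baseline (rainbow) vs. script values for each mean cue, coloured by condition.
# factor(plev) gives a discrete two-colour legend instead of a continuous scale.
ggplot(d_filtered, aes(x = pitch_rmean, y = pitch_smean, colour = factor(plev))) + 
  geom_point()

ggplot(d_filtered, aes(x = form_rmean, y = form_smean, colour = factor(plev))) + 
  geom_point()

ggplot(d_filtered, aes(x = intense_rmean, y = intense_smean, colour = factor(plev))) +
  geom_point()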
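
The plots above do not test anything. A minimal sketch of the 2 x 2 ANCOVA itself, using only base R, might look like the following. `ancova_2x2()` is a hypothetical helper written for this report, not the authors' procedure. It assumes that both factors have exactly two levels and that effect coding (-1/+1) is appropriate. With that coding, every effect has one degree of freedom, so the squared t statistic of each coefficient equals the F statistic for that term with all other terms in the model.

```r
# Hypothetical helper (not from the paper): 2 x 2 ANCOVA via lm().
# Assumes `cond` and `sex` each take exactly two numeric values.
ancova_2x2 <- function(data, outcome, baseline, cond = "plev", sex = "vsex") {
  cond_raw <- data[[cond]]
  sex_raw  <- data[[sex]]
  df <- data.frame(
    y    = data[[outcome]],
    base = data[[baseline]],
    # Effect coding: the larger level becomes +1, the smaller level becomes -1
    cond = ifelse(cond_raw == max(cond_raw, na.rm = TRUE), 1, -1),
    sex  = ifelse(sex_raw == max(sex_raw, na.rm = TRUE), 1, -1)
  )
  df <- df[complete.cases(df), ]

  fit <- lm(y ~ cond * sex + base, data = df)
  co  <- summary(fit)$coefficients

  # Drop the intercept row; each remaining term has 1 numerator df, so F = t^2
  data.frame(
    term = rownames(co)[-1],
    F    = unname(co[-1, "t value"]^2),
    df1  = 1,
    df2  = fit$df.residual,
    p    = unname(co[-1, "Pr(>|t|)"]),
    row.names = NULL
  )
}

# Usage with the tidied data from Step 3 (not run here):
# ancova_2x2(d_filtered, outcome = "pitch_smean",   baseline = "pitch_rmean")
# ancova_2x2(d_filtered, outcome = "pitch_svar",    baseline = "pitch_rvar")
# ancova_2x2(d_filtered, outcome = "intense_svar",  baseline = "intense_rvar")
```

The row labelled `cond` is the one to compare against the reported F(1, 156) values. I have not verified that this reproduces them, and it assumes the script passage is the postmanipulation measure and the rainbow passage is the baseline.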

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

No! (1) I might have missed something in the methodology required to accurately reproduce the descriptive statistics. I was confused by how they were “adjusted”. (2) I also struggled with reproducing the 2x2 ANCOVA, likely because I don’t understand ANCOVA very well (but also because the “adjusted” problem seemed to carry over).

How difficult was it to reproduce your results?

I found it quite difficult, because both my coding and my statistics knowledge are gappy (but improving).

What aspects made it difficult? What aspects made it easy?

Generally, the high number of measurements and the complicated methodology made it harder to track the reasoning behind the data analysis. Relevant information also seemed scattered over a number of files, materials, and locations. Consequently, I am not sure whether I missed something important (for example, regarding the ‘mean adjustments’) or whether it was something the authors forgot to detail. The 2x2 ANCOVA also confused me, but that might be down to my own statistical knowledge.