For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Ko, Sadler & Galinsky, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

A sense of power has often been tied to how we perceive each other’s voice. Social hierarchy is embedded into the structure of society and provides a metric by which others relate to one another. In 1956, the Brunswik Lens Model was introduced to examine how vocal cues might influence hierarchy. In “The Sound of Power: Conveying and Detecting Hierarchical Rank Through Voice,” Ko and colleagues investigated how manipulation of hierarchal rank within a situation might impact vocal acoustic cues. Using the Brunswik Model, six acoustic metrics were utilized (pitch mean & variability, loudness mean & variability, and resonance mean & variability) to isolate a potential contribution between individuals of different hierarchal rank. In the first experiment, Ko, Sadler & Galinsky examined the vocal acoustic cues of individuals before and after being assigned a hierarchal rank in a sample of 161 subjects (80 male). Each of the six hierarchy acoustic cues were analyzed with a 2 (high vs. low rank condition) x 2 (male vs. female) analysis of covariance, controlling for the baseline of the respective acoustic cue.


Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

The impact of hierarchical rank on speakers’ acoustic cues. Each of the six hierarchy-based (i.e., postmanipulation) acoustic variables was submitted to a 2 (condition: high rank, low rank) × 2 (speaker’s sex: female, male) between-subjects analysis of covariance, controlling for the corresponding baseline acoustic variable. Table 4 presents the adjusted means by condition. Condition had a significant effect on pitch, pitch variability, and loudness variability. Speakers’ voices in the high-rank condition had higher pitch, F(1, 156) = 4.48, p < .05; were more variable in loudness, F(1, 156) = 4.66, p < .05; and were more monotone (i.e., less variable in pitch), F(1, 156) = 4.73, p < .05, compared with speakers’ voices in the low-rank condition (all other Fs < 1; see the Supplemental Material for additional analyses of covariance involving pitch and loudness). (from Ko et al., 2015, p. 6; emphasis added)

The adjusted means for these analyses are reported in Table 4 (Table4_AdjustedMeans.png, included in the same folder as this Rmd file).


Step 1: Load packages

library(tidyverse) # for data munging
library(knitr) # for kable table formating
library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files
library(readxl) # import excel files

# #optional packages:
# library(psych)
# library(car) # for ANCOVA
# library(compute.es) # for ANCOVA
# library(lsmeans) # for ANCOVA

Step 2: Load data

# Just Experiment 1
d <-read_csv("data/S1_voice_level_Final.csv")
# DT::datatable(d)

Step 3: Tidy data

# vsex: -1 means male, 1 means female
# plev: 1 means 'High-Rank', -1 means 'Low-Rank'
# dv: pitch_rmean, pitch_rvar, intense_rmean, intense_rvar,form_rmean, form_rvar
d <- d %>%
  mutate(
    condition = factor(plev, levels = c(-1, 1), labels = c("Low Rank", "High Rank")),
    speaker_sex = factor(sex, levels = c("M", "F"), labels = c("Male", "Female"))
  )

Step 4: Run analysis

Pre-processing

d_clean <- d %>%
  filter(!is.na(pitch_rmean), !is.na(pitch_smean),
         !is.na(pitch_rvar), !is.na(pitch_svar),
         !is.na(intense_rmean), !is.na(intense_smean),
         !is.na(intense_rvar), !is.na(intense_svar),
         !is.na(form_rmean), !is.na(form_smean),
         !is.na(form_rvar), !is.na(form_svar),
         !is.na(condition), !is.na(speaker_sex))

# Check sample size
cat("Total N after cleaning:", nrow(d_clean), "\n")
## Total N after cleaning: 161
cat("Males:", sum(d_clean$speaker_sex == "Male"), "\n")
## Males: 80
cat("Females:", sum(d_clean$speaker_sex == "Female"), "\n")
## Females: 81
cat("High Rank:", sum(d_clean$condition == "High Rank"), "\n")
## High Rank: 82
cat("Low Rank:", sum(d_clean$condition == "Low Rank"), "\n")
## Low Rank: 79

Descriptive statistics

In the paper, the adjusted means by condition are reported (see Table 4, or Table4_AdjustedMeans.png, included in the same folder as this Rmd file). Reproduce these values below:

library(emmeans)
# Function to get adjusted means and effect size from ANCOVA model
get_adjusted_means_ancova <- function(dv, covariate, data) {
  # Fit ANCOVA model (factorial design as in paper)
  formula_str <- paste(dv, "~ condition * speaker_sex +", covariate)
  model <- lm(as.formula(formula_str), data = data)
  
  # Calculate mean of the baseline covariate
  mean_cov <- mean(data[[covariate]], na.rm = TRUE)
  
  # Create prediction data at mean of covariate for both conditions and both sexes
  pred_data <- expand.grid(
    condition = c("Low Rank", "High Rank"),
    speaker_sex = c("Male", "Female")
  )
  pred_data[[covariate]] <- mean_cov
  
  # Get predicted values
  pred_data$predicted <- predict(model, newdata = pred_data)
  
  # Marginalize over sex (average across male and female) to get condition-level adjusted means
  adjusted_means <- aggregate(predicted ~ condition, data = pred_data, FUN = mean)
  
  # Calculate eta-squared for condition effect using Type II SS
  # Eta-squared = SS_condition / SS_total
  library(car)
  anova_table <- Anova(model, type = 2)
  ss_condition <- anova_table["condition", "Sum Sq"]
  ss_total <- sum(anova_table[, "Sum Sq"])
  eta_squared <- ss_condition / ss_total
  
  return(list(
    low_rank = adjusted_means$predicted[adjusted_means$condition == "Low Rank"],
    high_rank = adjusted_means$predicted[adjusted_means$condition == "High Rank"],
    eta_squared = eta_squared,
    model = model
  ))
}

# Calculate for all six acoustic variables
pitch_mean_stats <- get_adjusted_means_ancova("pitch_smean", "pitch_rmean", d_clean)
pitch_var_stats <- get_adjusted_means_ancova("pitch_svar", "pitch_rvar", d_clean)
intense_mean_stats <- get_adjusted_means_ancova("intense_smean", "intense_rmean", d_clean)
intense_var_stats <- get_adjusted_means_ancova("intense_svar", "intense_rvar", d_clean)
form_mean_stats <- get_adjusted_means_ancova("form_smean", "form_rmean", d_clean)
form_var_stats <- get_adjusted_means_ancova("form_svar", "form_rvar", d_clean)

# Create Table 4
table4_replication <- data.frame(
  `Acoustic cue` = c(
    "Pitch (f₀, in Hz)",
    "Pitch variability (Hz)",
    "Loudness (dB)",
    "Loudness variability (dB)",
    "Resonance (Df, in Hz)",
    "Resonance variability (Hz)"
  ),
  `High-rank condition` = c(
    pitch_mean_stats$high_rank,
    pitch_var_stats$high_rank,
    intense_mean_stats$high_rank,
    intense_var_stats$high_rank,
    form_mean_stats$high_rank,
    form_var_stats$high_rank
  ),
  `Low-rank condition` = c(
    pitch_mean_stats$low_rank,
    pitch_var_stats$low_rank,
    intense_mean_stats$low_rank,
    intense_var_stats$low_rank,
    form_mean_stats$low_rank,
    form_var_stats$low_rank
  ),
  `Effect of condition: η²` = c(
    pitch_mean_stats$eta_squared,
    pitch_var_stats$eta_squared,
    intense_mean_stats$eta_squared,
    intense_var_stats$eta_squared,
    form_mean_stats$eta_squared,
    form_var_stats$eta_squared
  ),
  check.names = FALSE
)

kable(table4_replication, digits = 2,
      caption = "Table 4: Adjusted Means for Hierarchy-Based Acoustic Cues")
Table 4: Adjusted Means for Hierarchy-Based Acoustic Cues
Acoustic cue High-rank condition Low-rank condition Effect of condition: η²
Pitch (f₀, in Hz) 158.61 155.52 0.01
Pitch variability (Hz) 1425.02 1648.37 0.02
Loudness (dB) 59.34 58.67 0.01
Loudness variability (dB) 196.73 183.48 0.02
Resonance (Df, in Hz) 1129.38 1128.81 0.00
Resonance variability (Hz) 42170.78 43654.54 0.00

Inferential statistics

The impact of hierarchical rank on speakers’ acoustic cues. Each of the six hierarchy-based (i.e., postmanipulation) acoustic variables was submitted to a 2 (condition: high rank, low rank) × 2 (speaker’s sex: female, male) between-subjects analysis of covariance, controlling for the corresponding baseline acoustic variable. […] Condition had a significant effect on pitch, pitch variability, and loudness variability. Speakers’ voices in the high-rank condition had higher pitch, F(1, 156) = 4.48, p < .05; were more variable in loudness, F(1, 156) = 4.66, p < .05; and were more monotone (i.e., less variable in pitch), F(1, 156) = 4.73, p < .05, compared with speakers’ voices in the low-rank condition (all other Fs < 1; see the Supplemental Material for additional analyses of covariance involving pitch and loudness).

# reproduce the above results here
library(car)

# Function to run ANCOVA and extract results
run_ancova <- function(dv, covariate, data) {
  formula_str <- paste(dv, "~ condition * speaker_sex +", covariate)
  model <- lm(as.formula(formula_str), data = data)
  
  # Use Type II ANOVA (typical for ANCOVA)
  anova_result <- Anova(model, type = 2)
  
  return(list(model = model, anova = anova_result))
}

# Run ANCOVA for all six acoustic variables
cat("====== ANCOVA RESULTS ======\n\n")
## ====== ANCOVA RESULTS ======
# 1. Pitch Mean
cat("1. Pitch Mean (pitch_rmean)\n")
## 1. Pitch Mean (pitch_rmean)
pitch_mean_ancova <- run_ancova("pitch_smean", "pitch_rmean", d_clean)
print(pitch_mean_ancova$anova)
## Anova Table (Type II tests)
## 
## Response: pitch_smean
##                       Sum Sq  Df  F value  Pr(>F)    
## condition                379   1   4.4744 0.03599 *  
## speaker_sex              460   1   5.4218 0.02117 *  
## pitch_rmean            43334   1 511.2208 < 2e-16 ***
## condition:speaker_sex    222   1   2.6156 0.10783    
## Residuals              13223 156                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cat("\n")
# 2. Pitch Variability  
cat("2. Pitch Variability (pitch_rvar)\n")
## 2. Pitch Variability (pitch_rvar)
pitch_var_ancova <- run_ancova("pitch_svar", "pitch_rvar", d_clean)
print(pitch_var_ancova$anova)
## Anova Table (Type II tests)
## 
## Response: pitch_svar
##                         Sum Sq  Df F value    Pr(>F)    
## condition              2006828   1  4.7340   0.03108 *  
## speaker_sex           26142260   1 61.6676 6.133e-13 ***
## pitch_rvar            27760551   1 65.4850 1.548e-13 ***
## condition:speaker_sex    37857   1  0.0893   0.76546    
## Residuals             66131879 156                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cat("\n")
# 3. Loudness/Intensity Mean
cat("3. Loudness Mean (intense_rmean)\n")
## 3. Loudness Mean (intense_rmean)
intense_mean_ancova <- run_ancova("intense_smean", "intense_rmean", d_clean)
print(intense_mean_ancova$anova)
## Anova Table (Type II tests)
## 
## Response: intense_smean
##                        Sum Sq  Df  F value    Pr(>F)    
## condition               17.69   1   2.1833 0.1415291    
## speaker_sex            104.18   1  12.8542 0.0004499 ***
## intense_rmean         1195.54   1 147.5165 < 2.2e-16 ***
## condition:speaker_sex   11.69   1   1.4424 0.2315772    
## Residuals             1264.30 156                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cat("\n")
# 4. Loudness/Intensity Variability
cat("4. Loudness Variability (intense_rvar)\n")
## 4. Loudness Variability (intense_rvar)
intense_var_ancova <- run_ancova("intense_svar", "intense_rvar", d_clean)
print(intense_var_ancova$anova)
## Anova Table (Type II tests)
## 
## Response: intense_svar
##                       Sum Sq  Df  F value    Pr(>F)    
## condition               7052   1   4.6687   0.03224 *  
## speaker_sex            29860   1  19.7693 1.651e-05 ***
## intense_rvar          184927   1 122.4323 < 2.2e-16 ***
## condition:speaker_sex     40   1   0.0262   0.87160    
## Residuals             235630 156                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cat("\n")
# 5. Resonance/Formant Mean
cat("5. Resonance Mean (form_rmean)\n")
## 5. Resonance Mean (form_rmean)
form_mean_ancova <- run_ancova("form_smean", "form_rmean", d_clean)
print(form_mean_ancova$anova)
## Anova Table (Type II tests)
## 
## Response: form_smean
##                        Sum Sq  Df F value    Pr(>F)    
## condition                  13   1  0.0018 0.9662798    
## speaker_sex             80559   1 11.5306 0.0008684 ***
## form_rmean              89319   1 12.7845 0.0004656 ***
## condition:speaker_sex     170   1  0.0243 0.8763623    
## Residuals             1089894 156                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cat("\n")
# 6. Resonance/Formant Variability
cat("6. Resonance Variability (form_rvar)\n")
## 6. Resonance Variability (form_rvar)
form_var_ancova <- run_ancova("form_svar", "form_rvar", d_clean)
print(form_var_ancova$anova)
## Anova Table (Type II tests)
## 
## Response: form_svar
##                           Sum Sq  Df F value Pr(>F)
## condition             8.8735e+07   1  0.4764 0.4911
## speaker_sex           4.2843e+07   1  0.2300 0.6322
## form_rvar             4.9801e+08   1  2.6739 0.1040
## condition:speaker_sex 1.5279e+07   1  0.0820 0.7749
## Residuals             2.9055e+10 156
cat("\n")
# Create summary table of condition effects
cat("====== SUMMARY OF CONDITION EFFECTS ======\n\n")
## ====== SUMMARY OF CONDITION EFFECTS ======
summary_results <- data.frame(
  Variable = c("Pitch Mean", "Pitch Variability", 
               "Loudness Mean", "Loudness Variability",
               "Resonance Mean", "Resonance Variability"),
  F_value = c(
    pitch_mean_ancova$anova["condition", "F value"],
    pitch_var_ancova$anova["condition", "F value"],
    intense_mean_ancova$anova["condition", "F value"],
    intense_var_ancova$anova["condition", "F value"],
    form_mean_ancova$anova["condition", "F value"],
    form_var_ancova$anova["condition", "F value"]
  ),
  p_value = c(
    pitch_mean_ancova$anova["condition", "Pr(>F)"],
    pitch_var_ancova$anova["condition", "Pr(>F)"],
    intense_mean_ancova$anova["condition", "Pr(>F)"],
    intense_var_ancova$anova["condition", "Pr(>F)"],
    form_mean_ancova$anova["condition", "Pr(>F)"],
    form_var_ancova$anova["condition", "Pr(>F)"]
  ),
  df1 = rep(1, 6),
  df2 = rep(nrow(d_clean) - 4, 6)  # n - (number of predictors including covariate)
)

summary_results$Significance <- ifelse(summary_results$p_value < 0.05, "***", 
                                       ifelse(summary_results$p_value < 0.1, "*", "ns"))

knitr::kable(summary_results, digits = 3, 
      caption = "Summary of ANCOVA Results: Effect of Condition on Acoustic Variables")
Summary of ANCOVA Results: Effect of Condition on Acoustic Variables
Variable F_value p_value df1 df2 Significance
Pitch Mean 4.474 0.036 1 157 ***
Pitch Variability 4.734 0.031 1 157 ***
Loudness Mean 2.183 0.142 1 157 ns
Loudness Variability 4.669 0.032 1 157 ***
Resonance Mean 0.002 0.966 1 157 ns
Resonance Variability 0.476 0.491 1 157 ns

Step 5: Reflection

Were you able to reproduce the results you attempted to reproduce? If not, what part(s) were you unable to reproduce?

Yes, I was able to reproduce the results.

How difficult was it to reproduce your results?

It was difficult to reproduce the results

What aspects made it difficult? What aspects made it easy?

The difficulty comes from finding the exact formula and variables to put into the formula of ANCOVA.