
Satchel Grant

For this exercise, please try to reproduce the results from Experiment 1 of the associated paper (Ko, Sadler & Galinsky, 2015). The PDF of the paper is included in the same folder as this Rmd file.

Methods summary:

A sense of power has often been tied to how we perceive one another’s voices. Social hierarchy is embedded in the structure of society and provides a metric by which people relate to one another. The Brunswik lens model, introduced in 1956, offers a framework for examining how vocal cues relate to hierarchy. In “The Sound of Power: Conveying and Detecting Hierarchical Rank Through Voice,” Ko and colleagues investigated how manipulating hierarchical rank within a situation affects vocal acoustic cues. Following the Brunswik model, they used six acoustic metrics (pitch mean and variability, loudness mean and variability, and resonance mean and variability) to isolate potential differences between individuals of different hierarchical rank. In the first experiment, Ko, Sadler & Galinsky examined the vocal acoustic cues of individuals before and after they were assigned a hierarchical rank in a sample of 161 subjects (80 male). Each of the six hierarchy-based acoustic cues was analyzed with a 2 (high vs. low rank condition) × 2 (male vs. female) analysis of covariance, controlling for the baseline of the respective acoustic cue.
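
Written out, the analysis described above corresponds to a linear model of roughly the following form. This is a sketch of a standard 2 × 2 analysis of covariance, not an equation taken from the paper; the symbols C_i (condition code), S_i (sex code), and the coefficient names are notation assumed here for illustration.

```latex
Y_i^{\text{post}} = \beta_0 + \beta_1 C_i + \beta_2 S_i + \beta_3 C_i S_i + \beta_4 Y_i^{\text{base}} + \varepsilon_i
```

With 161 speakers and five estimated coefficients, the residual degrees of freedom are 161 − 5 = 156, which agrees with the F(1, 156) tests reported in the paper.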

Target outcomes:

Below is the specific result you will attempt to reproduce (quoted directly from the results section of Experiment 1):

The impact of hierarchical rank on speakers’ acoustic cues. Each of the six hierarchy-based (i.e., postmanipulation) acoustic variables was submitted to a 2 (condition: high rank, low rank) × 2 (speaker’s sex: female, male) between-subjects analysis of covariance, controlling for the corresponding baseline acoustic variable. Table 4 presents the adjusted means by condition. Condition had a significant effect on pitch, pitch variability, and loudness variability. Speakers’ voices in the high-rank condition had higher pitch, F(1, 156) = 4.48, p < .05; were more variable in loudness, F(1, 156) = 4.66, p < .05; and were more monotone (i.e., less variable in pitch), F(1, 156) = 4.73, p < .05, compared with speakers’ voices in the low-rank condition (all other Fs < 1; see the Supplemental Material for additional analyses of covariance involving pitch and loudness). (from Ko et al., 2015, p. 6; emphasis added)

The adjusted means for these analyses are reported in Table 4 (Table4_AdjustedMeans.png, included in the same folder as this Rmd file).

Step 1: Load Packages

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.formula.api import ols
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)

Step 2: Load Data

init_df = pd.read_csv("data/S1_voice_level_Final.csv")
init_df.head()

	voice	form_smean	form_svar	form_rmean	form_rvar	intense_smean	intense_svar	intense_rmean	intense_rvar	pitch_smean	pitch_svar	pitch_rmean	pitch_rvar	pow	age	sex	race	native	feelpower	plev	vsex	pitch_rmeanMD	pitch_rvarMD	intense_rmeanMD	intense_rvarMD	formant_rmeanMD	formant_rvarMD	pitch_smeanMD	pitch_svarMD	intense_smeanMD	intense_svarMD	formant_smeanMD	formant_svarMD	Zpitch_rmean	Zpitch_rvar	Zform_rmean	Zform_rvar	Zintense_rmean	Zintense_rvar	Zpitch_smean	Zpitch_svar	Zform_smean	Zform_svar	Zintense_smean	Zintense_svar
0	1	1042.869962	38804.93761	1259.208907	68275.23613	65.490990	157.335432	58.700351	174.085733	116.027174	858.067860	108.828760	2481.725366	1	19	M	X	Y	6	1	-1	-40.741240	729.275367	1.240351	-8.664267	-33.401093	4144.636134	-41.052826	-679.572140	6.470990	-32.794568	-86.400038	-4107.182388	-0.241662	0.619236	-0.601796	-0.111502	0.048117	-0.024327	-0.101165	-0.228936	-0.684037	-0.280327	1.593384	-0.697216
1	2	1083.029813	35609.35816	1277.910104	54611.71739	65.475869	300.968230	63.890231	271.350034	145.696029	1404.354463	133.178193	1231.082125	3	21	M	C	Y	5	1	-1	-16.391807	-521.367875	6.430231	88.600034	-14.699896	-9518.882613	-11.383971	-133.285537	6.455869	110.838230	-46.240187	-7302.761841	1.352745	-0.194782	-0.186358	-1.125049	1.355314	1.652535	1.683224	0.438988	-0.228418	-0.485594	1.589696	1.679766
2	3	1092.225925	25978.96309	1333.830754	55175.29165	53.598913	71.173535	53.019391	87.359022	105.506334	706.219256	108.263955	228.722077	3	18	M	A	Y	3	1	-1	-41.306045	-1523.727924	-4.440609	-95.390978	41.220754	-8955.308346	-51.573666	-831.420744	-5.421087	-118.956465	-37.044075	-16933.156910	-0.278646	-0.847197	1.055891	-1.083244	-1.382771	-1.519519	-0.733924	-0.414595	-0.124087	-1.104199	-1.306822	-2.123111
3	4	1267.553623	71345.58644	1297.806593	74340.49653	60.709749	201.159387	59.757294	198.791744	102.102796	449.830366	107.596595	1744.493863	3	22	M	C	Y	6	1	-1	-41.973405	-7.956137	2.297294	16.041744	5.196593	10209.896530	-54.977203	-1087.809634	1.689749	11.029387	138.283623	28433.466440	-0.322345	0.139387	0.255632	0.338414	0.314333	0.401611	-0.938625	-0.728072	1.865026	1.809911	0.427348	0.028028
4	5	1071.457828	38245.57579	1255.766713	67846.20803	60.072514	203.682245	57.495309	202.390153	122.473164	478.070070	115.059753	946.264262	4	20	M	C	Y	6	1	-1	-34.510247	-806.185738	0.035309	19.640153	-36.843287	3715.608029	-34.606836	-1059.569930	1.052514	13.552245	-57.812172	-4666.544210	0.166345	-0.380164	-0.678263	-0.143327	-0.255402	0.463649	0.286520	-0.693544	-0.359704	-0.316257	0.271941	0.069778
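
Before going further, it can help to confirm that the loaded file matches the sample described in the methods summary (161 speakers, 80 male). The helper below is a hypothetical sketch: the name check_sample and its defaults are not part of the original analysis, and it assumes that male speakers are marked "M" in the sex column, as in the rows shown above.

```python
import pandas as pd

def check_sample(df, expected_n=161, expected_male=80):
    """Compare row and male counts against the values reported in the paper."""
    n_rows = len(df)
    n_male = int((df["sex"] == "M").sum())  # assumes "M" marks male speakers
    return {
        "n_rows": n_rows,
        "n_male": n_male,
        "matches_paper": n_rows == expected_n and n_male == expected_male,
    }

# Tiny synthetic frame, for illustration only (not the study data)
toy = pd.DataFrame({"voice": [1, 2, 3], "sex": ["M", "F", "M"]})
print(check_sample(toy, expected_n=3, expected_male=2))
```

On the real data this would be called as check_sample(init_df).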

Step 3: Tidy Data

From the codebook, plev denotes the “power level” (rank) and vsex denotes the speaker’s sex. Both are coded as -1 or 1, as the checks below confirm. There isn’t much to tidy because the data already has one row per speaker.

set(init_df['plev'])

{-1, 1}

set(init_df['vsex'])

{-1, 1}

df = init_df.copy()
# plev > 0 is treated as the high-rank condition; vsex < 0 as male
df["high_rank"] = init_df["plev"] > 0
df["is_male"] = init_df["vsex"] < 0
df.loc[:,["voice","high_rank","is_male"]].head()

	voice	high_rank	is_male
0	1	True	True
1	2	True	True
2	3	True	True
3	4	True	True
4	5	True	True
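
The boolean flags above rest on two coding assumptions: that plev = 1 marks the high-rank condition and that vsex = -1 marks male speakers. The second can be checked against the sex column. The sketch below cross-tabulates the two on a small synthetic frame; the function name crosscheck_codes is hypothetical and the toy values are made up for illustration.

```python
import pandas as pd

def crosscheck_codes(df, label_col="sex", code_col="vsex"):
    """Count rows for every (label, code) pair; a consistent coding leaves
    exactly one nonzero cell per label."""
    table = pd.crosstab(df[label_col], df[code_col])
    consistent = bool(((table > 0).sum(axis=1) == 1).all())
    return table, consistent

# Synthetic values for illustration only (not the study data)
toy = pd.DataFrame({
    "sex":  ["M", "M", "F", "F", "F"],
    "vsex": [-1, -1, 1, 1, 1],
})
table, consistent = crosscheck_codes(toy)
print(table)
print("coding consistent:", consistent)
```

On the real data this would be called as crosscheck_codes(init_df). No comparable label column is visible for plev, so the high-rank coding still has to be taken from the codebook.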

Step 4: Run analysis

Preprocessing

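To find the adjusted means, we first need to fit a regression line for each post-manipulation acoustic cue, using its baseline measure and the rank condition as predictors.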

pitch_model = ols("pitch_smean ~ pitch_rmean + high_rank", data=df).fit()

pitch_model.summary()

OLS Regression Results
Dep. Variable:	pitch_smean	R-squared:	0.954
Model:	OLS	Adj. R-squared:	0.954
Method:	Least Squares	F-statistic:	1643.
Date:	Sun, 25 Oct 2020	Prob (F-statistic):	1.84e-106
Time:	22:43:58	Log-Likelihood:	-587.36
No. Observations:	161	AIC:	1181.
Df Residuals:	158	BIC:	1190.
Df Model:	2
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	-0.8672	2.970	-0.292	0.771	-6.734	5.000
high_rank[T.True]	3.3123	1.482	2.235	0.027	0.385	6.240
pitch_rmean	1.0448	0.018	57.303	0.000	1.009	1.081

Omnibus:	4.363	Durbin-Watson:	1.850
Prob(Omnibus):	0.113	Jarque-Bera (JB):	5.135
Skew:	0.143	Prob(JB):	0.0767
Kurtosis:	3.827	Cond. No.	632.

# Predicted pitch from the fitted model; coefficients copied from the summary above
df['adj_pitch_smean'] = -0.8672 + 1.0448*df['pitch_rmean'] + 3.3123*df['high_rank']

form_model = ols("form_smean ~ form_rmean + high_rank", data=df).fit()
form_model.summary()

OLS Regression Results
Dep. Variable:	form_smean	R-squared:	0.091
Model:	OLS	Adj. R-squared:	0.079
Method:	Least Squares	F-statistic:	7.908
Date:	Mon, 26 Oct 2020	Prob (F-statistic):	0.000533
Time:	00:22:49	Log-Likelihood:	-944.23
No. Observations:	161	AIC:	1894.
Df Residuals:	158	BIC:	1904.
Df Model:	2
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	345.7924	198.209	1.745	0.083	-45.688	737.273
high_rank[T.True]	-0.4636	13.595	-0.034	0.973	-27.316	26.388
form_rmean	0.6063	0.153	3.967	0.000	0.304	0.908

Omnibus:	21.534	Durbin-Watson:	2.002
Prob(Omnibus):	0.000	Jarque-Bera (JB):	27.089
Skew:	1.005	Prob(JB):	1.31e-06
Kurtosis:	2.966	Cond. No.	3.78e+04

# Predicted resonance from the fitted model; coefficients copied from the summary above
df['adj_form_smean'] = 345.79 + 0.6063*df['form_rmean'] - 0.4636*df['high_rank']

intense_model = ols("intense_smean ~ intense_rmean + high_rank", data=df).fit()
intense_model.summary()

OLS Regression Results
Dep. Variable:	intense_smean	R-squared:	0.453
Model:	OLS	Adj. R-squared:	0.446
Method:	Least Squares	F-statistic:	65.36
Date:	Mon, 26 Oct 2020	Prob (F-statistic):	2.07e-21
Time:	00:26:36	Log-Likelihood:	-401.41
No. Observations:	161	AIC:	808.8
Df Residuals:	158	BIC:	818.1
Df Model:	2
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	20.0121	3.426	5.840	0.000	13.244	26.780
high_rank[T.True]	0.6182	0.466	1.327	0.187	-0.302	1.539
intense_rmean	0.6734	0.059	11.334	0.000	0.556	0.791

Omnibus:	3.127	Durbin-Watson:	1.869
Prob(Omnibus):	0.209	Jarque-Bera (JB):	2.681
Skew:	-0.296	Prob(JB):	0.262
Kurtosis:	3.222	Cond. No.	848.

# Predicted loudness from the fitted model; coefficients copied from the summary above
df['adj_intense_smean'] = 20.0121 + 0.6734*df['intense_rmean'] + 0.6182*df['high_rank']

Descriptive statistics

In the paper, the adjusted means by condition are reported (see Table 4, or Table4_AdjustedMeans.png, included in the same folder as this Rmd file). Reproduce these values below:

# Mean of each adjusted variable by condition; a list of columns is required here
# (selecting several columns with a bare tuple fails in pandas 2.0 and later)
fig4 = df.groupby("high_rank")[["adj_pitch_smean", "adj_intense_smean", "adj_form_smean"]].mean().T

def cohens_d(df,variable):
    # Absolute standardized difference between the high- and low-rank groups
    high = df.loc[df["high_rank"]]
    low = df.loc[~df["high_rank"]]
    high_mean = np.mean(high[variable])
    low_mean = np.mean(low[variable])
    # np.std defaults to ddof=0 (population SD); pass ddof=1 if the sample SD
    # is intended for the pooled formula below
    high_var = np.std(high[variable])**2
    low_var = np.std(low[variable])**2
    high_n = len(high)
    low_n = len(low)
    pooled_sd = np.sqrt(((high_n-1)*high_var+(low_n-1)*low_var)/(high_n+low_n-2))
    return np.abs((high_mean-low_mean)/pooled_sd)

fig4["Effect Size"] = [cohens_d(df,"adj_pitch_smean"),cohens_d(df,"adj_intense_smean"),cohens_d(df,"adj_form_smean")]
fig4

high_rank	False	True	Effect Size
adj_pitch_smean	158.246405	155.971723	0.053690
adj_intense_smean	58.665304	59.363631	0.264511
adj_form_smean	1131.181271	1127.411797	0.140069
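
Adjusted means in an analysis of covariance are conventionally the model's predictions for each condition with the covariate held at its grand mean, rather than group averages of fitted values. The sketch below shows one way to compute them with statsmodels, assuming the full 2 × 2 model with the -1/1 codes plev and vsex and weighting the two sexes equally. Whether this matches the procedure behind Table 4 is an assumption; the helper name adjusted_means and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

def adjusted_means(df, post, base, cond="plev", sex="vsex"):
    """Predicted post-manipulation value per condition, with the baseline
    covariate fixed at its grand mean and the two sexes weighted equally."""
    model = ols(f"{post} ~ {cond} * {sex} + {base}", data=df).fit()
    # One row per (condition, sex) cell, all at the covariate's grand mean
    grid = pd.DataFrame(
        [(c, s) for c in (-1, 1) for s in (-1, 1)], columns=[cond, sex]
    )
    grid[base] = df[base].mean()
    grid["pred"] = model.predict(grid)
    return grid.groupby(cond)["pred"].mean()

# Synthetic, noise-free data for illustration only (not the study data)
rng = np.random.default_rng(0)
n = 40
toy = pd.DataFrame({
    "plev": np.tile([-1, 1], n // 2),
    "vsex": np.repeat([-1, 1], n // 2),
    "pitch_rmean": rng.normal(150, 30, n),
})
toy["pitch_smean"] = 5 + 2 * toy["plev"] + 0.5 * toy["vsex"] + toy["pitch_rmean"]
means = adjusted_means(toy, "pitch_smean", "pitch_rmean")
print(means)
```

On the study data this would be called as adjusted_means(init_df, "pitch_smean", "pitch_rmean"), and likewise for the other cues, taking the s-prefixed columns as post-manipulation measures and the r-prefixed columns as baselines, as the models above do.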

Inferential statistics

The impact of hierarchical rank on speakers’ acoustic cues. Each of the six hierarchy-based (i.e., postmanipulation) acoustic variables was submitted to a 2 (condition: high rank, low rank) × 2 (speaker’s sex: female, male) between-subjects analysis of covariance, controlling for the corresponding baseline acoustic variable. […] Condition had a significant effect on pitch, pitch variability, and loudness variability. Speakers’ voices in the high-rank condition had higher pitch, F(1, 156) = 4.48, p < .05; were more variable in loudness, F(1, 156) = 4.66, p < .05; and were more monotone (i.e., less variable in pitch), F(1, 156) = 4.73, p < .05, compared with speakers’ voices in the low-rank condition (all other Fs < 1; see the Supplemental Material for additional analyses of covariance involving pitch and loudness).
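
The F tests quoted above could be reproduced along the following lines. This is a sketch under stated assumptions, not the authors' procedure: it fits the full 2 × 2 model using the -1/1 codes plev and vsex together with the baseline covariate, and relies on the fact that for a single-degree-of-freedom term the partial F statistic equals the squared t statistic of its coefficient. The helper name condition_f_test and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

def condition_f_test(df, post, base, cond="plev", sex="vsex"):
    """Partial F test for the condition effect in a 2 x 2 ANCOVA."""
    model = ols(f"{post} ~ {cond} * {sex} + {base}", data=df).fit()
    return {
        "F": float(model.tvalues[cond] ** 2),  # 1-df term: F equals t squared
        "df_num": 1,
        "df_resid": int(model.df_resid),
        "p": float(model.pvalues[cond]),
    }

# Synthetic data for illustration only (not the study data)
rng = np.random.default_rng(1)
n = 161
toy = pd.DataFrame({
    "plev": rng.choice([-1, 1], n),
    "vsex": rng.choice([-1, 1], n),
    "pitch_rmean": rng.normal(150, 30, n),
})
toy["pitch_smean"] = toy["pitch_rmean"] + 2 * toy["plev"] + rng.normal(0, 8, n)
result = condition_f_test(toy, "pitch_smean", "pitch_rmean")
print(result)
```

Running this for each of the six cues on init_df and comparing the output with the reported F(1, 156) values would complete the inferential check.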

Reproducibility Report: Group B Choice 2