In their 2012 study, Livingston, Rosette, and Washington found that, contrary to commonly held beliefs and longstanding findings in the literature on women and leadership, there are some circumstances under which certain women may not face the so-called “agency penalty” for acting dominantly, a stereotype-incongruent type of behavior. They suggest that the experience of this agency penalty is moderated by race such that while White women (and Black men) experience a penalty for acting dominantly, Black women (and White men) do not. Further, they propose that this effect may be a function of Black women’s unique position in the social hierarchy as individuals with two subordinate identities. That is, the prescriptive and proscriptive norms that Black women face may differ from those faced by White women or Black men, with whom they share certain identities. Black women may face weaker proscriptive norms against dominant behaviors than White women or Black men, and the prescriptive norm for communal behaviors may be weaker for Black women than for White women.
Using a simple experimental design, Livingston et al. recruited 84 participants for an online experiment. Their main finding was a significant three-way interaction between target race (White vs. Black), target gender (male vs. female), and target behavior (dominant vs. communal) on leader status evaluations, F(1, 76) = 11.53, p < .001, \(\eta^2\) = .13. Within the female targets, they found a significant two-way interaction, F(1, 35) = 4.77, p < .04, \(\eta^2\) = .12, such that while White women were punished (rated less favorably) for engaging in dominant behaviors, Black women were not. They also found a significant two-way interaction among the male targets, F(1, 41) = 7.16, p < .02, \(\eta^2\) = .15, such that while Black men were punished (rated less favorably) for engaging in dominant behaviors, White men were not. Further, Livingston et al. found a significant three-way interaction in how participants tended to make attributions about target behaviors, F(1, 75) = 7.87, p < .006, \(\eta^2\) = .10. Among female targets, there was a marginally significant two-way interaction, F(1, 34) = 2.95, p < .10, \(\eta^2\) = .08, such that for White women, dominant behaviors were attributed more to the person than to the situation when compared to communal behaviors, but this difference was not significant for Black women. For men, they found a significant effect of race on attributions, t(20) = 2.24, p < .04, such that participants tended to attribute dominant behaviors more to the person than to the situation for Black targets than for White targets. Finally, they found that within Black targets, there was a marginally significant effect of gender, t(18) = 1.86, p < .08, such that participants tended to attribute dominant behaviors more to the person than to the situation for Black men than for Black women.
Figure 1. Plot of original findings reported by Livingston, Rosette, & Washington (2012)
Livingston et al. reported an effect size for the main three-way interaction on leader status evaluations of \(\eta^2\) = .13. We found that this translates to 93.8% statistical power. Assuming that the effect size Livingston et al. reported is the true effect size, this power estimate seems plausible, even with the small sample size (N = 84). Using their effect size and our desired 95% power, we found that we would need approximately 90 participants split between the 8 conditions to detect the effect. To be conservative, we decided to use twice the original sample size as our final sample size (N = 168). If Livingston et al.’s effect size is the true size of the effect, this number of participants should be more than enough to detect it. The larger sample also guards against an overestimated original effect: if the true effect size is smaller than what Livingston et al. found, we may still be able to detect the effect, if it exists. Further, the sample is a reasonable size and, given that the survey takes approximately 3 minutes to complete, should not be too costly to collect.
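The power figures above can be approximated in base R with the noncentral F distribution. This is a minimal sketch, not the calculation we actually ran: it assumes the reported \(\eta^2\) can be treated as a partial \(\eta^2\), converts it to Cohen's f², and uses the common convention that the noncentrality parameter equals f² times the total sample size. The function name `power_f_test` is hypothetical.

```r
# Power of a 1-df F test in a 2 x 2 x 2 between-subjects design.
# Assumptions (not taken from the original article): eta^2 is treated as
# partial eta^2, and the noncentrality parameter is lambda = f^2 * N.
power_f_test <- function(eta2, n_total, n_cells = 8, df1 = 1, alpha = 0.05) {
  f2     <- eta2 / (1 - eta2)        # Cohen's f^2
  df2    <- n_total - n_cells        # error degrees of freedom
  lambda <- f2 * n_total             # noncentrality parameter
  f_crit <- qf(1 - alpha, df1, df2)  # critical F under the null
  pf(f_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}

# Power of the original study (N = 84) if eta^2 = .13 is the true effect
power_original <- power_f_test(eta2 = 0.13, n_total = 84)

# Smallest total N that reaches 95% power under the same assumptions
candidate_n <- 16:400
powers      <- sapply(candidate_n, function(n) power_f_test(0.13, n))
n_for_95    <- candidate_n[which(powers >= 0.95)[1]]

print(round(power_original, 3))
print(n_for_95)
```

Under these assumptions the two printed values should fall close to the power and sample size quoted above; small discrepancies would reflect differences in how the effect size is converted.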
The planned sample will be drawn from the Amazon Mechanical Turk (mTurk) population. Livingston et al. used mTurk or a similar pool as a “nationally representative” pool, and we will do the same. No participants will be excluded on the basis of race or gender. We built a manipulation check into our version of the study that may not have been included in the original: a question asking participants to report the manipulated race of the target. Participants who answer this question incorrectly will be excluded from analyses. As stated above, we will aim for a sample size of N = 168.
The following is the passage participants will read.
The instructions page will read:
On the following page, you will read a brief biography and account of a workplace encounter. Please take a moment to read and consider it carefully. Take your time. You will be asked to answer a series of questions about this passage at a later time.
Then the vignette will read:
[Molly/Aliyah/Mark/Darnell]* Johnson is a Senior Vice President at a major consulting firm, where [she/he] has worked for twenty years. [She/He] obtained [her/his] BA in Economics from University of Southern California and [her/his] MBA from London Business School. [She/he is currently an active member of the National Association of Business Executives./She/he is currently an active member of the National Association of Black Business Executives.] The company sets forth as its mission: “To provide our clients with the highest quality services to address their business needs. We do so by recruiting and retaining the most diverse, passionate, and knowledgeable professionals and by providing a collaborative and open work environment that enables our employees to thrive.” Values that the company emphasizes include honesty and integrity, intellectual rigor, accountability, and the pursuit of excellence.
Recently, [Molly/Aliyah/Mark/Darnell] Johnson’s group nearly missed a deadline on a major project they were working on for an important client, with whom the company has had a longstanding relationship. [Ms./Mr.] Johnson discovered that this potentially costly mistake was largely due to one of [her/his] subordinates. When [she/he] called this employee into [her/his] office to discuss this matter, [she/he] issued reprimands, saying, [“I demand that you take steps to improve your performance.”/“I encourage you to take steps to improve your performance.”]
When asked by [her/his] peers about how [she/he] was dealing with this mistake, [she/he] stated, [“I am a tough, determined boss and intend to do everything in my power to ensure that my employees’ performance improves.”/“I am a caring, committed boss and intend to do everything in my power to ensure that my employees’ performance improves.”]
*Molly and Mark are intended to be interpreted as White, as these names are more stereotypically White. Aliyah and Darnell are intended to be interpreted as Black, as these names are more stereotypically Black. Further cues to race are included in the business executive organizations targets are members of.
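To make the mapping from conditions to vignette text explicit, the eight cells and their bracketed substitutions can be enumerated as follows. This is only an illustrative sketch: the object names are hypothetical, and the random assignment shown at the end is a stand-in for the randomization handled by the survey software.

```r
# All eight cells of the 2 (race) x 2 (gender) x 2 (behavior) design
conditions <- expand.grid(
  race     = c("white", "black"),
  gender   = c("female", "male"),
  behavior = c("dominant", "communal"),
  stringsAsFactors = FALSE
)

# First names used in the vignette to signal race and gender
first_names <- c(white_female = "Molly", black_female = "Aliyah",
                 white_male   = "Mark",  black_male   = "Darnell")
key <- paste(conditions$race, conditions$gender, sep = "_")
conditions$name <- unname(first_names[key])

# Text substitutions that differ by behavior and by race
conditions$verb <- ifelse(conditions$behavior == "dominant",
                          "demand", "encourage")
conditions$self_description <- ifelse(conditions$behavior == "dominant",
                                      "tough, determined", "caring, committed")
conditions$organization <- ifelse(
  conditions$race == "black",
  "National Association of Black Business Executives",
  "National Association of Business Executives"
)

# Hypothetical balanced random assignment of 168 participants to the cells
set.seed(1)
assignment <- sample(rep(seq_len(nrow(conditions)), length.out = 168))

print(conditions[, c("race", "gender", "behavior", "name", "verb")])
```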
We were unable to obtain the original materials used in the 2012 study, as the author who was in possession of the materials could not be reached. However, the above materials were synthesized using details gleaned from the original study and other similar studies that Livingston and colleagues have run, either together or individually, and represent a conceptual replication of the original study.
Final survey to be administered can be found here: https://stanfordgsb.qualtrics.com/SE/?SID=SV_3UTmvTzvBqY0TXv
Quoted from the original article, the procedure is as follows (with certain modifications noted):
Participants were randomly assigned to one of eight conditions in a 2 (leader’s race: White vs. Black) × 2 (leader’s gender: male vs. female) × 2 (leader’s behavior: dominant vs. communal) between-subjects design. All participants were shown a description and photograph of a fictitious senior vice president who worked for a Fortune 500 company. The photographs were matched on perceived age, attractiveness, and “babyfaceness,” [1] and the descriptions included identical information about the leader’s education, company tenure, and leadership mission. The materials described a meeting between the leader and a subordinate employee who did not meet the company’s expectations. Dominant leaders were described as communicating their disappointment by demanding action (i.e., “I demand that you take steps to improve your performance”) and expressing assertiveness (i.e., “I am a tough, determined boss and intend to do everything in my power to ensure that your performance improves”). Communal leaders were described as encouraging the subordinate (i.e., “I encourage you to take steps to improve your performance”) and communicating compassion (i.e., “I am a caring, committed boss and intend to do everything in my power to ensure that your performance improves”).
Participants rated the leader on the following questions: “How well do you think the leader handled the situation with the employee?” “How effective is the leader at maximizing the employee’s performance?” “How much do you think the leader is admired by his or her employees?” and “How respected is this leader by the other executives at the company?” (Cronbach’s α = .89). We also assessed participants’ expectations regarding the leader’s salary by asking them to indicate what they thought the leader’s annual salary should be, using a scale ranging from 1 ($100,000) to 9 ($500,000). We combined these variables into a single composite score assessing the leader’s status in the organization.
Finally, we assessed attributions for the leader’s behavior using the following item: “How much does the leader’s reaction reflect something about his/her personality versus something about the situation?” (1 = definitely the personality, 7 = definitely the situation). If dominance is proscribed for Black men and White women, then internal attributions should be higher for Black men and White women who “break the rules” by behaving dominantly than for Black men and White women who behave normatively by adhering to prescribed stereotypes. [2]
[1] In our experiment, we used only the text manipulation: target names, together with mention of membership in an organization either for Black executives or simply for executives, served as the manipulation of perceived race. Previous research has found that this type of manipulation is often sufficient to manipulate participants’ perceptions of race (e.g. see Bertrand & Mullainathan, 2004), and we expected that the organization for Black executives would prime race effectively even if the names alone did not. Because we were not able to obtain the original materials, we did not want to introduce more confounds by selecting our own faces to use. We did not know where the original authors obtained these stimuli nor what they looked like, besides that they were matched on age, attractiveness, and babyfaceness (though the authors did not mention what age they were matched at, etc.). In order to achieve a cleaner manipulation, we simply used names and mentions of the organizations individuals were members of to manipulate race.
[2] Although the authors did not explicitly state this, we assume that they collected relevant demographic information, including but not limited to participant race and gender. In addition to these two variables, we also asked about participant age, educational background, work experience, social liberalism or conservatism, and economic liberalism or conservatism. While the authors did not report that they controlled for any of these demographic factors, it is possible that some of them, especially political beliefs, affect how targets are evaluated. Therefore, we include them so that we can run additional analyses beyond the main ones conducted by Livingston et al.
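The quoted procedure combines four rating items and the salary item into a single composite and reports Cronbach's α for internal consistency. As a hedged illustration of what that statistic measures, the sketch below computes α by hand on simulated data; the data and the function name `cronbach_alpha` are hypothetical and are not part of either study.

```r
# Cronbach's alpha from its definition:
# alpha = k / (k - 1) * (1 - sum of item variances / variance of the sum score)
cronbach_alpha <- function(items) {
  items     <- as.matrix(items)
  k         <- ncol(items)
  item_vars <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Simulated example: five noisy indicators of one latent evaluation
set.seed(42)
n      <- 200
latent <- rnorm(n)
items  <- sapply(1:5, function(i) latent + rnorm(n, sd = 0.8))

composite <- rowMeans(items)        # composite score, one per participant
alpha_sim <- cronbach_alpha(items)

# Sanity check: perfectly redundant items give alpha = 1
alpha_identical <- cronbach_alpha(cbind(latent, latent, latent))

print(round(alpha_sim, 2))
print(alpha_identical)
```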
Livingston et al. (2012) did not specify what exclusion criteria, if any, were used, or what rules were applied in data cleaning. For our purposes, we will exclude all participants who fail the manipulation check (the question that asks about perceived target race).
Our main analysis will be examining the three-way interaction between target race, target gender, and target behavior on leader status evaluations. The original authors used an ANOVA to examine this interaction, but we will be using a linear model, which conceptually achieves the same thing and can be used more flexibly in R. We will be using the appropriate linear models throughout our analyses where the original authors used ANOVAs or t-tests.
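The equivalence between the ANOVA and the linear model for the key test can be checked directly: for a one-degree-of-freedom term entered last, the ANOVA F statistic equals the square of the corresponding regression t statistic. The sketch below demonstrates this on simulated data with no true effects; the data frame `sim` and its variable names are hypothetical.

```r
# Simulated 2 x 2 x 2 between-subjects data (pure noise, for illustration)
set.seed(2012)
n_per_cell <- 20
sim <- expand.grid(
  rep      = seq_len(n_per_cell),
  race     = c("white", "black"),
  gender   = c("male", "female"),
  behavior = c("dominant", "communal")
)
sim$rating <- rnorm(nrow(sim), mean = 5, sd = 1.5)

fit <- lm(rating ~ race * gender * behavior, data = sim)

# t statistic for the three-way interaction coefficient
three_way <- "raceblack:genderfemale:behaviorcommunal"
t_value   <- coef(summary(fit))[three_way, "t value"]

# F statistic for the same term from the ANOVA table
f_value <- anova(fit)["race:gender:behavior", "F value"]

print(c(t_squared = t_value^2, F = f_value))
```

Because the three-way interaction is the highest-order term, its test does not depend on how the lower-order terms are coded, which is why we treat the linear model as interchangeable with the ANOVA for this comparison.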
In additional, exploratory analyses that are not meant to speak to the replicability of the original finding but perhaps may extend the original finding, we will also factor in political liberalism/conservatism as a covariate in our analyses after running the main analyses.
The differences from the original study reported here are assumed, but it is difficult to tell how similar or different this study is from the original, as we don’t have the original materials. However, we have made every attempt to keep this study as conceptually close to the original as possible.
One difference is that in our study, we use only the name manipulation and added another text manipulation, and not the photograph manipulation, as noted above. Again, this is because we were unsure of what pictures were used and where these images were selected from (or if the experimenters took the photographs themselves), and we did not want to introduce too many differences into the experiment. However, this change is not anticipated to have a huge impact on the results, as name manipulations have been shown previously to be effective in manipulating perceived race (e.g. Bertrand & Mullainathan, 2004). Therefore, the photographs may not be necessary to achieve the desired effect and in fact, in our case, may do more harm than good. As such, we decided that it might be a cleaner manipulation to only include name manipulations and manipulations of leadership organizations individuals were members of rather than names and photographs.
Another difference, of course, is the passage itself. As much as possible, we used the exact wording of the original study (e.g. the parts quoted in the article itself). We also made sure to include each element mentioned in the article, using the article itself as well as other studies carried out by Livingston and colleagues, together or individually, to guide our reconstruction of the passage's content. Although the wording and content of this passage will inevitably differ from the materials used in the original paper, we believe that the passage is a fairly good conceptual replication of the original, meaning that it likely manipulates the same constructs and should produce the same effect, if the effect is real.
Other differences are minor and mostly technical. We added some demographic questions that may or may not have been asked in the original study, but this should not impact the main effect of interest in any way because these are not manipulations and are measured at the end of the experiment. We also included a timer on the page of the survey with the passage designed to ensure that participants spend at least 30 seconds on that page reading the passage (but they’re allowed to spend longer if necessary). This is to ensure that participants spend enough time on the page to read the passage and should not change the results, as it is simply to ensure that the manipulation is effective. Finally, we also included a manipulation check, which Livingston et al did not. This manipulation check examines if participants attended to the manipulated race of the target and should not impact the main effect of interest, as it is asked after the main measures are administered.
In the final sample, we obtained 163 participants from Amazon Mechanical Turk (mTurk). Although a total number of 168 HITs on mTurk were captured, some responses in the survey were not completed fully or participants reported errors and quit early.
In our primary analyses, we excluded those who failed the manipulation check (i.e. answered the question “What was the race/ethnicity of the individual you read about?” incorrectly). This left us with a total of 131 participants. Of these participants, 101 were White, 6 were Black, 15 were Asian, 8 were Hispanic, and 1 was Native American. Further, 89 participants were male and 42 were female. Our sample was likely similar in racial composition to the one obtained by Livingston et al. (2012), who reported a nationally representative (i.e. likely majority White) sample. However, our sample was majority male, whereas the sample of Livingston and colleagues was composed of 64% women and 36% men.
The full sample, which we use in our exploratory analyses, was made up of 121 White participants, 7 Black participants, 23 Asian participants, 10 Hispanic participants, and 2 Native American participants. Of this sample, 110 were male and 53 were female. Again, the racial composition of our sample generally matches what Livingston et al. obtained, but the gender composition does not.
Although a total of 163 participants were collected, technical errors on mTurk meant that they were collected in two consecutive batches. The first batch captured 80 participants before a technical glitch on mTurk forced us to cancel it. A second batch of 88 participants was then collected on mTurk. All participants were recruited within the same 24 hours, and the second batch was made live as soon as the glitch was caught. All other methods were the same.
Further, during a presentation of the replication project to peers, it was brought up that when selecting only for those who passed the manipulation check, we may introduce sampling bias such that those who are kept in the data set may be abnormally sensitive to racial stereotypes. As such, analyses will be run with both the full data set (see section titled “Exploratory analyses”) and with those who failed the manipulation check taken out (see section titled “Confirmatory analyses”), and results will be reported and discussed for both versions of analysis.
Data preparation following the analysis plan.
### Data Preparation
#### Load Relevant Libraries and Functions
library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag(): dplyr, stats
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(ggplot2)
#### Import data
d = read.csv("~/Desktop/livingston2012/anonymized-results.csv")
#### Data exclusion / filtering
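# Keep only rows with manipcheck == 0, i.e. the participants retained for the confirmatory analyses
d.tidy = d %>%
  filter(manipcheck == 0) %>%
  dplyr::select(subid, T_race, T_gender, T_behavior, lead_well, lead_max, lead_admire, lead_respect, salary, PvsS, race, age, gender, educ, workex, pol_s, pol_e)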
#### Prepare data for analysis - create columns etc.
d.tidy$T_race = as.factor(d.tidy$T_race)
levels(d.tidy$T_race)[1] = "white"
levels(d.tidy$T_race)[2] = "black"
d.tidy$T_gender = as.factor(d.tidy$T_gender)
levels(d.tidy$T_gender)[1] = "male"
levels(d.tidy$T_gender)[2] = "female"
d.tidy$T_behavior = as.factor(d.tidy$T_behavior)
levels(d.tidy$T_behavior)[1] = "dominant"
levels(d.tidy$T_behavior)[2] = "communal"
d.tidy$race = as.factor(d.tidy$race)
levels(d.tidy$race)[1] = "white"
levels(d.tidy$race)[2] = "black"
levels(d.tidy$race)[3] = "asian"
levels(d.tidy$race)[4] = "hispanic"
levels(d.tidy$race)[5] = "native american"
levels(d.tidy$race)[6] = "other"
d.tidy$gender = as.factor(d.tidy$gender)
levels(d.tidy$gender)[1] = "male"
levels(d.tidy$gender)[2] = "female"
levels(d.tidy$gender)[3] = "other"
d.tidy$lead_well = as.numeric(d.tidy$lead_well)
d.tidy$lead_max = as.numeric(d.tidy$lead_max)
d.tidy$lead_admire = as.numeric(d.tidy$lead_admire)
d.tidy$lead_respect = as.numeric(d.tidy$lead_respect)
d.tidy$salary = as.numeric(d.tidy$salary)
d.tidy$PvsS = as.numeric(d.tidy$PvsS)
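d.tidy$leadev = rowMeans(d.tidy[, c("lead_well", "lead_max", "lead_admire", "lead_respect", "salary")]) #make composite from the four rating items and salary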
lead = matrix(c(d.tidy$lead_well, d.tidy$lead_max, d.tidy$lead_admire, d.tidy$lead_respect,d.tidy$salary), ncol=5)
alpha(lead)
##
## Reliability analysis
## Call: alpha(x = lead)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
## 0.86 0.86 0.88 0.55 6.1 0.018 5.2 1.6
##
## lower alpha upper 95% confidence boundaries
## 0.83 0.86 0.9
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## V1 0.79 0.78 0.80 0.48 3.6 0.029
## V2 0.81 0.80 0.82 0.50 4.0 0.027
## V3 0.79 0.78 0.80 0.47 3.5 0.030
## V4 0.83 0.83 0.84 0.54 4.8 0.024
## V5 0.92 0.92 0.91 0.75 12.2 0.011
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## V1 131 0.91 0.91 0.91 0.85 5.8 1.9
## V2 131 0.88 0.87 0.86 0.80 5.6 2.0
## V3 131 0.92 0.92 0.92 0.86 5.7 2.2
## V4 131 0.81 0.81 0.76 0.69 6.0 1.8
## V5 131 0.47 0.49 0.29 0.27 3.1 1.8
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 7 8 9 miss
## [1,] 0.02 0.02 0.14 0.08 0.12 0.21 0.23 0.12 0.06 0
## [2,] 0.02 0.05 0.11 0.10 0.17 0.15 0.21 0.14 0.05 0
## [3,] 0.02 0.09 0.08 0.09 0.15 0.17 0.15 0.18 0.07 0
## [4,] 0.02 0.02 0.07 0.07 0.18 0.19 0.24 0.15 0.05 0
## [5,] 0.19 0.26 0.20 0.16 0.08 0.05 0.04 0.02 0.01 0
d.women = d.tidy %>%
filter(T_gender == "female")
d.men = d.tidy %>%
filter(T_gender == "male")
#full dataset
d.tidy2 = d %>%
dplyr::select(subid, T_race, T_gender, T_behavior, lead_well, lead_max, lead_admire, lead_respect, salary, PvsS, race, age, gender,educ, workex,pol_s, pol_e)
d.tidy2$T_race = as.factor(d.tidy2$T_race)
levels(d.tidy2$T_race)[1] = "white"
levels(d.tidy2$T_race)[2] = "black"
d.tidy2$T_gender = as.factor(d.tidy2$T_gender)
levels(d.tidy2$T_gender)[1] = "male"
levels(d.tidy2$T_gender)[2] = "female"
d.tidy2$T_behavior = as.factor(d.tidy2$T_behavior)
levels(d.tidy2$T_behavior)[1] = "dominant"
levels(d.tidy2$T_behavior)[2] = "communal"
d.tidy2$race = as.factor(d.tidy2$race)
levels(d.tidy2$race)[1] = "white"
levels(d.tidy2$race)[2] = "black"
levels(d.tidy2$race)[3] = "asian"
levels(d.tidy2$race)[4] = "hispanic"
levels(d.tidy2$race)[5] = "native american"
levels(d.tidy2$race)[6] = "other"
d.tidy2$gender = as.factor(d.tidy2$gender)
levels(d.tidy2$gender)[1] = "male"
levels(d.tidy2$gender)[2] = "female"
levels(d.tidy2$gender)[3] = "other"
d.tidy2$lead_well = as.numeric(d.tidy2$lead_well)
d.tidy2$lead_max = as.numeric(d.tidy2$lead_max)
d.tidy2$lead_admire = as.numeric(d.tidy2$lead_admire)
d.tidy2$lead_respect = as.numeric(d.tidy2$lead_respect)
d.tidy2$salary = as.numeric(d.tidy2$salary)
d.tidy2$PvsS = as.numeric(d.tidy2$PvsS)
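d.tidy2$leadev = rowMeans(d.tidy2[, c("lead_well", "lead_max", "lead_admire", "lead_respect", "salary")]) #make composite from the four rating items and salary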
lead = matrix(c(d.tidy2$lead_well, d.tidy2$lead_max, d.tidy2$lead_admire, d.tidy2$lead_respect,d.tidy2$salary), ncol=5)
alpha(lead)
##
## Reliability analysis
## Call: alpha(x = lead)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd
## 0.87 0.86 0.87 0.56 6.3 0.016 5.3 1.6
##
## lower alpha upper 95% confidence boundaries
## 0.83 0.87 0.9
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## V1 0.80 0.80 0.81 0.50 3.9 0.025
## V2 0.81 0.81 0.82 0.52 4.3 0.024
## V3 0.79 0.79 0.79 0.48 3.7 0.027
## V4 0.84 0.84 0.84 0.56 5.1 0.021
## V5 0.92 0.92 0.90 0.74 11.2 0.010
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## V1 163 0.90 0.90 0.89 0.83 5.9 2.0
## V2 163 0.87 0.87 0.84 0.78 5.7 2.0
## V3 163 0.92 0.92 0.92 0.86 5.7 2.2
## V4 163 0.80 0.80 0.74 0.68 6.0 1.9
## V5 163 0.53 0.54 0.35 0.33 3.3 1.9
##
## Non missing response frequency for each item
## 1 2 3 4 5 6 7 8 9 miss
## [1,] 0.02 0.02 0.12 0.07 0.15 0.20 0.21 0.13 0.07 0
## [2,] 0.02 0.05 0.10 0.09 0.17 0.16 0.20 0.15 0.06 0
## [3,] 0.02 0.10 0.06 0.10 0.16 0.17 0.15 0.16 0.08 0
## [4,] 0.02 0.02 0.07 0.07 0.17 0.18 0.25 0.15 0.07 0
## [5,] 0.20 0.23 0.20 0.14 0.09 0.07 0.06 0.01 0.01 0
d.women2 = d.tidy2 %>%
filter(T_gender == "female")
d.men2 = d.tidy2 %>%
filter(T_gender == "male")
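The two preparation passes above differ only in whether the manipulation-check filter is applied, so they could be folded into one helper. The following is a sketch under stated assumptions rather than the code we ran: `prepare_data` and `relabel` are hypothetical names, the toy data frame is invented, and its 1/2 coding of the condition columns is an assumption about the raw file. Participant race and gender could be relabeled the same way.

```r
# Relabel factor levels by sorted position, mirroring the approach used above.
# Caveat: if a category is absent from the data, the labels shift.
relabel <- function(x, labels) {
  f <- as.factor(x)
  stopifnot(nlevels(f) <= length(labels))
  levels(f) <- labels
  f
}

prepare_data <- function(d, passed_only = TRUE) {
  if (passed_only) {
    d <- d[d$manipcheck == 0, , drop = FALSE]  # confirmatory sample only
  }
  d$T_race     <- relabel(d$T_race,     c("white", "black"))
  d$T_gender   <- relabel(d$T_gender,   c("male", "female"))
  d$T_behavior <- relabel(d$T_behavior, c("dominant", "communal"))

  items    <- c("lead_well", "lead_max", "lead_admire", "lead_respect", "salary")
  d[items] <- lapply(d[items], as.numeric)
  d$leadev <- rowMeans(d[items])               # composite leader status score
  d
}

# Toy data (invented values) to show the helper in use
toy <- data.frame(
  manipcheck = c(0, 1, 0, 0),
  T_race = c(1, 2, 2, 1), T_gender = c(1, 2, 1, 2), T_behavior = c(1, 1, 2, 2),
  lead_well = c(5, 6, 7, 8), lead_max = c(5, 6, 7, 8),
  lead_admire = c(4, 5, 6, 7), lead_respect = c(6, 7, 8, 9),
  salary = c(3, 2, 4, 5)
)

confirmatory <- prepare_data(toy)                       # failed checks removed
exploratory  <- prepare_data(toy, passed_only = FALSE)  # full sample
print(confirmatory[, c("T_race", "T_gender", "T_behavior", "leadev")])
```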
The analyses as specified in the analysis plan.
###descriptives
summary(d.tidy$race)
## white black asian hispanic
## 101 6 15 8
## native american other
## 1 0
summary(d.tidy$gender)
## male female other
## 89 42 0
hist(d.tidy$leadev)
hist(d.tidy$PvsS)
###main analyses
rs1 = summary(lm(leadev ~ T_race*T_gender*T_behavior, d.tidy)); rs1 #main 3 way interaction (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_gender * T_behavior, data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0316 -1.1444 0.0909 1.1133 3.2667
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 4.4222 0.3574 12.374
## T_raceblack 0.6444 0.5054 1.275
## T_genderfemale 0.3111 0.4870 0.639
## T_behaviorcommunal 1.4094 0.4987 2.826
## T_raceblack:T_genderfemale -0.1244 0.7199 -0.173
## T_raceblack:T_behaviorcommunal -0.5560 0.7787 -0.714
## T_genderfemale:T_behaviorcommunal -0.3111 0.6922 -0.449
## T_raceblack:T_genderfemale:T_behaviorcommunal -0.6865 1.0950 -0.627
## Pr(>|t|)
## (Intercept) <2e-16 ***
## T_raceblack 0.2047
## T_genderfemale 0.5241
## T_behaviorcommunal 0.0055 **
## T_raceblack:T_genderfemale 0.8630
## T_raceblack:T_behaviorcommunal 0.4765
## T_genderfemale:T_behaviorcommunal 0.6539
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.5319
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.516 on 123 degrees of freedom
## Multiple R-squared: 0.1132, Adjusted R-squared: 0.06269
## F-statistic: 2.242 on 7 and 123 DF, p-value: 0.03516
rs2 = summary(lm(leadev ~ T_race*T_behavior, d.women)); rs2 #2 way for women (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior, data = d.women)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0316 -1.2316 -0.1212 1.3130 3.2667
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.7333 0.3445 13.740 <2e-16 ***
## T_raceblack 0.5200 0.5337 0.974 0.3337
## T_behaviorcommunal 1.0982 0.4998 2.197 0.0318 *
## T_raceblack:T_behaviorcommunal -1.2425 0.8016 -1.550 0.1262
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.579 on 62 degrees of freedom
## Multiple R-squared: 0.07332, Adjusted R-squared: 0.02848
## F-statistic: 1.635 on 3 and 62 DF, p-value: 0.1903
rs3 = summary(lm(leadev ~ T_race*T_behavior, d.men)); rs3 #2 way for men (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior, data = d.men)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.2316 -0.8667 0.2800 0.9684 2.5684
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.4222 0.3418 12.938 < 2e-16 ***
## T_raceblack 0.6444 0.4834 1.333 0.18741
## T_behaviorcommunal 1.4094 0.4770 2.955 0.00444 **
## T_raceblack:T_behaviorcommunal -0.5560 0.7447 -0.747 0.45815
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.45 on 61 degrees of freedom
## Multiple R-squared: 0.1568, Adjusted R-squared: 0.1153
## F-statistic: 3.781 on 3 and 61 DF, p-value: 0.01485
rs4 = summary(lm(PvsS ~ T_race*T_gender*T_behavior,d.tidy)); rs4 #3 way interaction (attribution of behavior)
##
## Call:
## lm(formula = PvsS ~ T_race * T_gender * T_behavior, data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.3158 -1.0526 -0.0556 0.9474 3.7273
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 2.6111 0.3322 7.859
## T_raceblack 0.4444 0.4699 0.946
## T_genderfemale 0.2937 0.4528 0.649
## T_behaviorcommunal 0.7047 0.4636 1.520
## T_raceblack:T_genderfemale 0.5175 0.6692 0.773
## T_raceblack:T_behaviorcommunal -1.1602 0.7239 -1.603
## T_genderfemale:T_behaviorcommunal -0.5568 0.6436 -0.865
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.4184 1.0180 0.411
## Pr(>|t|)
## (Intercept) 1.66e-12 ***
## T_raceblack 0.346
## T_genderfemale 0.518
## T_behaviorcommunal 0.131
## T_raceblack:T_genderfemale 0.441
## T_raceblack:T_behaviorcommunal 0.112
## T_genderfemale:T_behaviorcommunal 0.389
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.41 on 123 degrees of freedom
## Multiple R-squared: 0.06745, Adjusted R-squared: 0.01438
## F-statistic: 1.271 on 7 and 123 DF, p-value: 0.2702
rs5 = summary(lm(PvsS ~ T_race*T_behavior,d.women)); rs5 #2 way for women (attribution of behavior)
##
## Call:
## lm(formula = PvsS ~ T_race * T_behavior, data = d.women)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.2727 -0.9048 -0.0526 0.9474 3.7273
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9048 0.3094 9.387 1.62e-13 ***
## T_raceblack 0.9619 0.4794 2.007 0.0492 *
## T_behaviorcommunal 0.1479 0.4490 0.329 0.7430
## T_raceblack:T_behaviorcommunal -0.7418 0.7200 -1.030 0.3069
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.418 on 62 degrees of freedom
## Multiple R-squared: 0.06676, Adjusted R-squared: 0.02161
## F-statistic: 1.478 on 3 and 62 DF, p-value: 0.2291
rs6 = summary(lm(PvsS ~ T_race*T_behavior,d.men)); rs6 #2 way for men (attribution of behavior)
##
## Call:
## lm(formula = PvsS ~ T_race * T_behavior, data = d.men)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.3158 -1.0556 -0.3158 0.6842 2.9444
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.6111 0.3302 7.907 6.35e-11 ***
## T_raceblack 0.4444 0.4670 0.952 0.345
## T_behaviorcommunal 0.7047 0.4608 1.529 0.131
## T_raceblack:T_behaviorcommunal -1.1602 0.7195 -1.613 0.112
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.401 on 61 degrees of freedom
## Multiple R-squared: 0.04792, Adjusted R-squared: 0.001098
## F-statistic: 1.023 on 3 and 61 DF, p-value: 0.3886
rs7 = summary(lm(PvsS ~ T_race,d.tidy)); rs7 #main effect of race
##
## Call:
## lm(formula = PvsS ~ T_race, data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.241 -0.974 0.026 1.026 3.759
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.9740 0.1617 18.389 <2e-16 ***
## T_raceblack 0.2667 0.2519 1.059 0.292
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.419 on 129 degrees of freedom
## Multiple R-squared: 0.008615, Adjusted R-squared: 0.0009303
## F-statistic: 1.121 on 1 and 129 DF, p-value: 0.2917
Comparison of main findings with original study.
original = data.frame(
race = factor(c("White","White", "Black", "Black","White","White", "Black", "Black"), levels=c("White","Black")),
gender = factor(c("Male", "Male","Male","Male", "Female","Female","Female","Female"), levels=c("Male","Female")),
behavior = factor(c("Dominant", "Communal"), levels=c("Dominant","Communal")),
leadev = c(3.11,3.25,1.97,4.06,2.23,3.85,3.05,3.32),
sd = c(1.36,1.24,.98,1.27,.7,1.24,.8,1.02)
)
ggplot(data=original, aes(x=race, y = leadev, fill=behavior, ymin=leadev - sd, ymax=leadev + sd)) + # refer to columns by name inside aes() rather than with original$
facet_wrap(~gender)+
geom_bar(position="dodge", stat="identity") +
geom_errorbar(position="dodge") +
labs(title = "Original plot of main finding from Livingston et al. (2012)",
x = "Leader type",
y = "Leader status")
Original means and standard deviations of leader status score as function of leader’s race, gender, and behavior – as reported in Livingston et al., 2012. Error bars represent standard deviations.
group = d.tidy %>%
group_by(T_race,T_gender, T_behavior)
d.tidy.graph = group %>%
summarise(m_leadev = mean(leadev), sd_leadev = sd(leadev))
ggplot(d.tidy.graph, aes(x=T_race, y = m_leadev, fill=T_behavior, ymin=(m_leadev - sd_leadev), ymax=(m_leadev + sd_leadev))) +
facet_wrap(~T_gender)+
geom_bar(position="dodge", stat="identity") +
geom_errorbar(position="dodge") +
labs(title = "Results from replication of Livingston et al. (2012)",
x = "Leader type",
y = "Leader status")
Means and standard deviations of leader status score as function of leader’s race, gender, and behavior – as found in our replication. Error bars represent standard deviations.
###descriptives
summary(d.tidy2$race)
## white black asian hispanic
## 121 7 23 10
## native american other
## 2 0
summary(d.tidy2$gender)
## male female other
## 110 53 0
hist(d.tidy2$leadev)
hist(d.tidy2$PvsS)
###main analyses with full dataset
rs8 = summary(lm(leadev ~ T_race*T_gender*T_behavior, d.tidy2));rs8 #main 3 way interaction (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_gender * T_behavior, data = d.tidy2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0500 -1.0350 -0.0333 1.1967 3.2364
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 4.2200 0.3382 12.477
## T_raceblack 0.8300 0.4783 1.735
## T_genderfemale 0.5436 0.4673 1.163
## T_behaviorcommunal 1.7365 0.4625 3.755
## T_raceblack:T_genderfemale -0.5436 0.6543 -0.831
## T_raceblack:T_behaviorcommunal -0.5532 0.6748 -0.820
## T_genderfemale:T_behaviorcommunal -0.7365 0.6495 -1.134
## T_raceblack:T_genderfemale:T_behaviorcommunal -0.1540 0.9603 -0.160
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## T_raceblack 0.084690 .
## T_genderfemale 0.246502
## T_behaviorcommunal 0.000245 ***
## T_raceblack:T_genderfemale 0.407337
## T_raceblack:T_behaviorcommunal 0.413614
## T_genderfemale:T_behaviorcommunal 0.258565
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.872836
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.513 on 155 degrees of freedom
## Multiple R-squared: 0.1503, Adjusted R-squared: 0.1119
## F-statistic: 3.917 on 7 and 155 DF, p-value: 0.0005821
rs9 = summary(lm(leadev ~ T_race*T_behavior, d.women2)); rs9 #2 way for women (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior, data = d.women2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0500 -1.1429 -0.0964 1.3500 3.2364
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.7636 0.3330 14.307 <2e-16 ***
## T_raceblack 0.2864 0.4610 0.621 0.5363
## T_behaviorcommunal 1.0000 0.4709 2.124 0.0369 *
## T_raceblack:T_behaviorcommunal -0.7071 0.7054 -1.003 0.3192
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.562 on 78 degrees of freedom
## Multiple R-squared: 0.05927, Adjusted R-squared: 0.02309
## F-statistic: 1.638 on 3 and 78 DF, p-value: 0.1873
rs10 = summary(lm(leadev ~ T_race*T_behavior, d.men2)); rs10 #2 way for men (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior, data = d.men2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4333 -0.9565 0.1667 0.9500 2.4435
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.2200 0.3267 12.916 < 2e-16 ***
## T_raceblack 0.8300 0.4621 1.796 0.076372 .
## T_behaviorcommunal 1.7365 0.4467 3.887 0.000214 ***
## T_raceblack:T_behaviorcommunal -0.5532 0.6519 -0.849 0.398731
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.461 on 77 degrees of freedom
## Multiple R-squared: 0.2326, Adjusted R-squared: 0.2027
## F-statistic: 7.779 on 3 and 77 DF, p-value: 0.0001329
rs11 = summary(lm(PvsS ~ T_race*T_gender*T_behavior,d.tidy2)); rs11 #3 way interaction (attribution of behavior)
##
## Call:
## lm(formula = PvsS ~ T_race * T_gender * T_behavior, data = d.tidy2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9167 -1.0455 -0.0455 1.0455 3.3333
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 2.750000 0.336651 8.169
## T_raceblack 0.400000 0.476096 0.840
## T_genderfemale 0.204545 0.465150 0.440
## T_behaviorcommunal 0.510870 0.460309 1.110
## T_raceblack:T_genderfemale 0.562121 0.651262 0.863
## T_raceblack:T_behaviorcommunal 0.005797 0.671673 0.009
## T_genderfemale:T_behaviorcommunal -0.419960 0.646487 -0.650
## T_raceblack:T_genderfemale:T_behaviorcommunal -0.299087 0.955801 -0.313
## Pr(>|t|)
## (Intercept) 1.02e-13 ***
## T_raceblack 0.402
## T_genderfemale 0.661
## T_behaviorcommunal 0.269
## T_raceblack:T_genderfemale 0.389
## T_raceblack:T_behaviorcommunal 0.993
## T_genderfemale:T_behaviorcommunal 0.517
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.755
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.506 on 155 degrees of freedom
## Multiple R-squared: 0.06526, Adjusted R-squared: 0.02305
## F-statistic: 1.546 on 7 and 155 DF, p-value: 0.1557
rs12 = summary(lm(PvsS ~ T_race*T_behavior,d.women2)); rs12 #2 way for women (attribution of behavior)
##
## Call:
## lm(formula = PvsS ~ T_race * T_behavior, data = d.women2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9167 -0.9545 0.0000 1.0833 3.2857
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.95455 0.31530 9.370 2.07e-14 ***
## T_raceblack 0.96212 0.43652 2.204 0.0305 *
## T_behaviorcommunal 0.09091 0.44591 0.204 0.8390
## T_raceblack:T_behaviorcommunal -0.29329 0.66798 -0.439 0.6618
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.479 on 78 degrees of freedom
## Multiple R-squared: 0.08039, Adjusted R-squared: 0.04502
## F-statistic: 2.273 on 3 and 78 DF, p-value: 0.08664
rs13 = summary(lm(PvsS ~ T_race*T_behavior,d.men2)); rs13 #2 way for men (attribution of behavior)
##
## Call:
## lm(formula = PvsS ~ T_race * T_behavior, data = d.men2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.667 -1.261 -0.150 0.850 3.333
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.750000 0.342579 8.027 9.01e-12 ***
## T_raceblack 0.400000 0.484480 0.826 0.412
## T_behaviorcommunal 0.510870 0.468415 1.091 0.279
## T_raceblack:T_behaviorcommunal 0.005797 0.683501 0.008 0.993
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.532 on 77 degrees of freedom
## Multiple R-squared: 0.04292, Adjusted R-squared: 0.00563
## F-statistic: 1.151 on 3 and 77 DF, p-value: 0.334
rs14 = summary(lm(PvsS ~ T_race,d.tidy2)); rs14 #main effect of race
##
## Call:
## lm(formula = PvsS ~ T_race, data = d.tidy2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.6184 -1.0115 -0.0115 0.9885 3.3816
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.0115 0.1605 18.761 <2e-16 ***
## T_raceblack 0.6069 0.2351 2.582 0.0107 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.497 on 161 degrees of freedom
## Multiple R-squared: 0.03975, Adjusted R-squared: 0.03379
## F-statistic: 6.665 on 1 and 161 DF, p-value: 0.01072
group2 = d.tidy2 %>%
group_by(T_race,T_gender, T_behavior)
d.tidy.graph2 = group2 %>%
summarise(m_leadev = mean(leadev), sd_leadev = sd(leadev))
ggplot(d.tidy.graph2, aes(x=T_race, y = m_leadev, fill=T_behavior, ymin=(m_leadev - sd_leadev), ymax=(m_leadev + sd_leadev))) +
facet_wrap(~T_gender)+
geom_bar(position="dodge", stat="identity") +
geom_errorbar(position="dodge") +
labs(title = "Results from replication of Livingston et al. (2012) - full dataset",
x = "Leader type",
y = "Leader status")
summary(lm(leadev ~ gender, d.tidy))
##
## Call:
## lm(formula = leadev ~ gender, data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7888 -1.0634 0.0619 1.2619 3.2112
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.1888 0.1665 31.166 <2e-16 ***
## genderfemale 0.1493 0.2940 0.508 0.612
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.571 on 129 degrees of freedom
## Multiple R-squared: 0.001996, Adjusted R-squared: -0.005741
## F-statistic: 0.2579 on 1 and 129 DF, p-value: 0.6124
summary(lm(leadev ~ pol_s, d.tidy))
##
## Call:
## lm(formula = leadev ~ pol_s, data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.8373 -1.1275 0.1627 1.3529 3.1627
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.37263 0.28729 18.701 <2e-16 ***
## pol_s -0.04510 0.08371 -0.539 0.591
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.57 on 129 degrees of freedom
## Multiple R-squared: 0.002245, Adjusted R-squared: -0.00549
## F-statistic: 0.2903 on 1 and 129 DF, p-value: 0.591
summary(lm(leadev ~ T_race*T_gender*T_behavior + pol_s, d.tidy)) #main 3 way interaction (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_gender * T_behavior + pol_s,
## data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9808 -1.0829 0.0617 1.1071 3.1917
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 4.52553 0.43375 10.433
## T_raceblack 0.67226 0.51135 1.315
## T_genderfemale 0.31849 0.48898 0.651
## T_behaviorcommunal 1.38510 0.50367 2.750
## pol_s -0.03576 0.08448 -0.423
## T_raceblack:T_genderfemale -0.16282 0.72794 -0.224
## T_raceblack:T_behaviorcommunal -0.55918 0.78134 -0.716
## T_genderfemale:T_behaviorcommunal -0.26955 0.70147 -0.384
## T_raceblack:T_genderfemale:T_behaviorcommunal -0.69995 1.09917 -0.637
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## T_raceblack 0.19109
## T_genderfemale 0.51605
## T_behaviorcommunal 0.00687 **
## pol_s 0.67281
## T_raceblack:T_genderfemale 0.82339
## T_raceblack:T_behaviorcommunal 0.47556
## T_genderfemale:T_behaviorcommunal 0.70145
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.52545
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.521 on 122 degrees of freedom
## Multiple R-squared: 0.1145, Adjusted R-squared: 0.05639
## F-statistic: 1.971 on 8 and 122 DF, p-value: 0.05559
summary(lm(leadev ~ T_race*T_gender*T_behavior + gender, d.tidy))
##
## Call:
## lm(formula = leadev ~ T_race * T_gender * T_behavior + gender,
## data = d.tidy)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0928 -1.1033 0.0237 1.0765 3.2767
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 4.3752 0.3816 11.467
## T_raceblack 0.6562 0.5083 1.291
## T_genderfemale 0.3480 0.4994 0.697
## T_behaviorcommunal 1.4230 0.5019 2.835
## genderfemale 0.1057 0.2930 0.361
## T_raceblack:T_genderfemale -0.1614 0.7296 -0.221
## T_raceblack:T_behaviorcommunal -0.5661 0.7820 -0.724
## T_genderfemale:T_behaviorcommunal -0.3591 0.7074 -0.508
## T_raceblack:T_genderfemale:T_behaviorcommunal -0.6452 1.1049 -0.584
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## T_raceblack 0.19913
## T_genderfemale 0.48717
## T_behaviorcommunal 0.00536 **
## genderfemale 0.71894
## T_raceblack:T_genderfemale 0.82535
## T_raceblack:T_behaviorcommunal 0.47048
## T_genderfemale:T_behaviorcommunal 0.61256
## T_raceblack:T_genderfemale:T_behaviorcommunal 0.56036
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.522 on 122 degrees of freedom
## Multiple R-squared: 0.1141, Adjusted R-squared: 0.05601
## F-statistic: 1.964 on 8 and 122 DF, p-value: 0.05652
summary(lm(leadev ~ T_race*T_behavior*pol_s, d.women)) #2 way for women (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior * pol_s, data = d.women)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1145 -1.2446 -0.0511 1.2647 3.3484
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.61258 0.73855 6.245 5.35e-08
## T_raceblack 1.56691 1.64294 0.954 0.344
## T_behaviorcommunal 1.01028 1.10474 0.914 0.364
## pol_s 0.03901 0.20949 0.186 0.853
## T_raceblack:T_behaviorcommunal -1.59305 2.08978 -0.762 0.449
## T_raceblack:pol_s -0.36978 0.54432 -0.679 0.500
## T_behaviorcommunal:pol_s 0.01931 0.29291 0.066 0.948
## T_raceblack:T_behaviorcommunal:pol_s 0.13844 0.65816 0.210 0.834
##
## (Intercept) ***
## T_raceblack
## T_behaviorcommunal
## pol_s
## T_raceblack:T_behaviorcommunal
## T_raceblack:pol_s
## T_behaviorcommunal:pol_s
## T_raceblack:T_behaviorcommunal:pol_s
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.62 on 58 degrees of freedom
## Multiple R-squared: 0.08693, Adjusted R-squared: -0.02327
## F-statistic: 0.7888 on 7 and 58 DF, p-value: 0.5995
summary(lm(leadev ~ T_race*T_behavior*gender, d.women))
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior * gender, data = d.women)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0000 -1.2779 -0.0384 1.3455 3.2632
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 4.73684 0.36749 12.890
## T_raceblack 0.20316 0.62581 0.325
## T_behaviorcommunal 1.11770 0.60689 1.842
## genderfemale -0.03684 1.19080 -0.031
## T_raceblack:T_behaviorcommunal -0.57199 0.99572 -0.574
## T_raceblack:genderfemale 0.97684 1.47911 0.660
## T_behaviorcommunal:genderfemale -0.01770 1.40428 -0.013
## T_raceblack:T_behaviorcommunal:genderfemale -1.95801 1.93644 -1.011
## Pr(>|t|)
## (Intercept) <2e-16 ***
## T_raceblack 0.7466
## T_behaviorcommunal 0.0706 .
## genderfemale 0.9754
## T_raceblack:T_behaviorcommunal 0.5679
## T_raceblack:genderfemale 0.5116
## T_behaviorcommunal:genderfemale 0.9900
## T_raceblack:T_behaviorcommunal:genderfemale 0.3161
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.602 on 58 degrees of freedom
## Multiple R-squared: 0.1075, Adjusted R-squared: -0.0002635
## F-statistic: 0.9976 on 7 and 58 DF, p-value: 0.4424
summary(lm(leadev ~ T_race*T_behavior*pol_s, d.men)) #2 way for men (leader evaluations)
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior * pol_s, data = d.men)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1649 -0.8585 -0.0129 0.9569 2.5415
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.76000 0.64474 7.383 7.32e-10
## T_raceblack 1.00140 1.12539 0.890 0.377
## T_behaviorcommunal 0.37122 0.87201 0.426 0.672
## pol_s -0.11692 0.18967 -0.616 0.540
## T_raceblack:T_behaviorcommunal 0.84489 1.58778 0.532 0.597
## T_raceblack:pol_s -0.07255 0.30111 -0.241 0.810
## T_behaviorcommunal:pol_s 0.43375 0.29005 1.495 0.140
## T_raceblack:T_behaviorcommunal:pol_s -0.60894 0.47149 -1.292 0.202
##
## (Intercept) ***
## T_raceblack
## T_behaviorcommunal
## pol_s
## T_raceblack:T_behaviorcommunal
## T_raceblack:pol_s
## T_behaviorcommunal:pol_s
## T_raceblack:T_behaviorcommunal:pol_s
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.442 on 57 degrees of freedom
## Multiple R-squared: 0.2212, Adjusted R-squared: 0.1256
## F-statistic: 2.313 on 7 and 57 DF, p-value: 0.03792
summary(lm(leadev ~ T_race*T_behavior*gender, d.men))
##
## Call:
## lm(formula = leadev ~ T_race * T_behavior * gender, data = d.men)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.3077 -0.9833 0.0167 1.0571 2.4923
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 4.2800 0.4713 9.082
## T_raceblack 0.7033 0.6381 1.102
## T_behaviorcommunal 1.6277 0.6268 2.597
## genderfemale 0.3200 0.7069 0.453
## T_raceblack:T_behaviorcommunal -0.8682 0.9462 -0.918
## T_raceblack:genderfemale -0.0700 1.0271 -0.068
## T_behaviorcommunal:genderfemale -0.5610 1.0201 -0.550
## T_raceblack:T_behaviorcommunal:genderfemale 0.9015 1.6290 0.553
## Pr(>|t|)
## (Intercept) 1.13e-12 ***
## T_raceblack 0.275
## T_behaviorcommunal 0.012 *
## genderfemale 0.652
## T_raceblack:T_behaviorcommunal 0.363
## T_raceblack:genderfemale 0.946
## T_behaviorcommunal:genderfemale 0.585
## T_raceblack:T_behaviorcommunal:genderfemale 0.582
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.49 on 57 degrees of freedom
## Multiple R-squared: 0.1678, Adjusted R-squared: 0.06561
## F-statistic: 1.642 on 7 and 57 DF, p-value: 0.1423
We first collapsed the five leader status evaluation ratings into one composite dependent variable (leadev), which showed adequate reliability, \(\alpha\) = .86. We therefore used this composite as the dependent variable in all subsequent analyses.
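For readers who want to check a reliability figure like this one, the following is a minimal sketch of Cronbach's alpha in base R. The item data are simulated stand-ins, not the study's ratings, and the helper name `cronbach_alpha` and the use of `rowMeans` to form the composite are assumptions made for illustration.

```r
# Cronbach's alpha for a matrix or data frame whose columns are scale items.
# alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score)
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_vars <- apply(items, 2, var)   # variance of each item
  total_var <- var(rowSums(items))    # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Hypothetical data: five items that share one latent evaluation plus noise.
set.seed(1)
latent  <- rnorm(100)
ratings <- sapply(1:5, function(i) latent + rnorm(100, sd = 0.8))

alpha_hat <- cronbach_alpha(ratings)
composite <- rowMeans(ratings)        # one composite score per participant (assumed averaging)
```

With real data, `ratings` would be replaced by the five rating columns of the tidy data frame.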
In our replication, we found a main effect of behavior on leader status evaluations, b = 1.409, t(123) = 2.83, p = 0.006. That is, on average, those in the communal condition (M = 5.71, SD = 1.51) were evaluated significantly more favorably than those in the dominant condition (M = 4.85, SD = 1.51). However, there was no significant three-way interaction between target race, target gender, and target behavior, b = -0.686, t(123) = -0.63, p = 0.532. Further, when we analyzed the two-way interactions separately for male and female targets, we found that for female targets, there was a significant main effect of behavior on leader status evaluations, b = 1.098, t(62) = 2.2, p = 0.032. Those in the communal condition (M = 5.57, SD = 1.52) were rated significantly more favorably than those in the dominant condition (M = 4.95, SD = 1.64). However, there was no significant interaction between target race and target behavior, b = -1.242, t(62) = -1.55, p = 0.126. For male targets, there was also a significant main effect of behavior on leader status evaluations, b = 1.409, t(61) = 2.95, p = 0.004. Those in the communal condition (M = 5.86, SD = 1.51) were rated significantly more favorably than those in the dominant condition (M = 4.74, SD = 1.4). However, again, there was no significant interaction between target race and target behavior, b = -0.556, t(61) = -0.75, p = 0.458. No other main effects or interactions in these analyses were significant. Overall, with regard to Livingston et al.’s main finding of an interactive effect of target race, target gender, and target behavior on leader status evaluations, we were unable to replicate their results. We did, however, replicate the main effect of behavior on leader status evaluations.
Further, the bar graphs show that the difference in ratings between dominant and communal Black female targets was smaller than the difference between dominant and communal White female targets, but this difference (the interaction) did not achieve statistical significance.
We also found that, when looking at how individuals attributed targets’ behavior (i.e., whether participants attributed behavior more to the situation or to the person), there was no significant interaction between target race, target gender, and target behavior, b = -0.557, t(123) = -0.87, p = 0.389. There were also no significant simple effects or two-way interactions. When looking only at female targets, there was also no significant interaction of target race and behavior, b = -0.742, t(62) = -1.03, p = 0.307. There was, however, a significant main effect of race on dispositional or situational attributions of behavior, b = 0.962, t(62) = 2.01, p = 0.049. Specifically, for White women, behaviors were attributed as more dispositional (M = 2.98, SD = 1.17) than behaviors for Black women (M = 3.62, SD = 1.72). When looking only at male targets, we found no significant interaction between target race and behavior, b = -1.16, t(61) = -1.61, p = 0.112. The main effects of race and behavior were also not significant. Across all targets, there was also no main effect of race on attributions, b = 0.267, t(129) = 1.06, p = 0.292. Overall, with these analyses, we were also unable to replicate Livingston et al.’s results. We were unable to find any of the interactive effects that Livingston et al. reported, though the significant results we did find seemed to be in the same direction as Livingston et al.’s findings. Specifically, although Livingston et al. reported differences in attribution based on an interaction between race and behavior within female targets and we did not find this interaction, Livingston et al. did report that the mean rating for White women was more dispositional (M = 3.52) than the mean rating for Black women (M = 3.93). While it is unclear whether this difference achieved statistical significance in the original study, our results are consistent with that trend, and in our data the difference does achieve statistical significance.
In our exploratory analyses with the full dataset, most of the results remained the same. Some minor differences in results are reported here. When we reran the full, three-way interaction between target race, target gender, and target behavior, we found that in addition to the main effect of behavior, there was also a marginally significant effect of target race, b = 0.83, t(155) = 1.74, p = 0.085. Black targets were rated slightly more favorably on average (M = 5.38, SD = 1.48) than White targets (M = 5.21, SD = 1.71). Further, while results for female targets remained largely the same, when we looked at male targets, we found that in addition to the main effect of behavior, there was again a marginally significant main effect of target race on leader status evaluations, b = 0.83, t(77) = 1.8, p = 0.076. On average, Black targets were rated slightly more favorably (M = 5.61, SD = 1.42) than White targets (M = 5.15, SD = 1.79). Finally, the one remaining difference when we reran our analyses with the full dataset was that a significant effect of race on attributions emerged, b = 0.607, t(161) = 2.58, p = 0.011. Specifically, behaviors were attributed more to disposition for White targets (M = 3.01, SD = 1.31) than for Black targets (M = 3.62, SD = 1.69). Again, these results do not directly replicate Livingston et al.’s original findings. Further, many of the results found in these exploratory analyses are only marginally significant, so we interpret them with caution.
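Because these models use R's default treatment (dummy) coding, each coefficient is an offset from a reference cell, and the fitted cell means can be rebuilt by adding up the relevant coefficients. The sketch below does this for the full-dataset leader evaluation model, using the estimates printed in the `rs8` output above. The reference level names (`white`, `male`, `dominant`) are an assumption inferred from the coefficient labels, since they are not shown directly in the output.

```r
# Coefficient estimates copied from the rs8 summary (full dataset, leadev).
coefs <- c(
  "(Intercept)"                                   =  4.2200,
  "T_raceblack"                                   =  0.8300,
  "T_genderfemale"                                =  0.5436,
  "T_behaviorcommunal"                            =  1.7365,
  "T_raceblack:T_genderfemale"                    = -0.5436,
  "T_raceblack:T_behaviorcommunal"                = -0.5532,
  "T_genderfemale:T_behaviorcommunal"             = -0.7365,
  "T_raceblack:T_genderfemale:T_behaviorcommunal" = -0.1540
)

# All eight design cells; the first level of each factor is the assumed reference.
grid <- expand.grid(
  T_race     = factor(c("white", "black"),       levels = c("white", "black")),
  T_gender   = factor(c("male", "female"),       levels = c("male", "female")),
  T_behavior = factor(c("dominant", "communal"), levels = c("dominant", "communal"))
)

# Dummy-coded design matrix for the same formula as the fitted model.
X <- model.matrix(~ T_race * T_gender * T_behavior, data = grid)
stopifnot(all(colnames(X) %in% names(coefs)))

# Predicted mean of each cell = sum of the coefficients switched on in that cell.
grid$predicted_mean <- as.vector(X %*% coefs[colnames(X)])
grid
```

For example, the reference cell (White, male, dominant) is just the intercept, 4.22, while the Black, female, communal cell adds all eight coefficients together. The same arithmetic shows why a lower-order coefficient such as `T_behaviorcommunal` describes the behavior difference within the reference cell rather than an average over all cells.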
Finally, in a series of exploratory analyses, we wondered if participant characteristics moderated the effect of target race, gender, and behavior on leader status evaluations. Specifically, we were interested in participant gender and participants’ social conservatism. Although we did find some significant effects of target race, gender, and behavior when factoring in either participant gender or social conservatism, these results did not differ from the results we previously found without moderation.
We largely failed to replicate Livingston et al.’s original results, and there are several possible reasons for this. Because the original sample size was quite small (N = 84), there were only approximately 10 participants per condition, meaning that even if the effect is a real one, the chances of detecting it would be quite low. Even though our achieved sample size (N = 131) was larger than Livingston et al.’s, it still provided only approximately 16 participants per condition. With so few participants per condition, even if the effect is true, the chances that we would capture it again are rather slim, and it is difficult to draw definitive conclusions about this effect. A further replication with a much larger sample size and more statistical power would be necessary to more conclusively demonstrate a replication (or lack thereof) of this result.
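How slim those chances are depends heavily on the true effect size, which is unknown. The following is a rough sketch, not part of the original analysis, of how power for a 1-df interaction test in a 2 x 2 x 2 between-subjects design could be approximated in base R. It assumes equal cell sizes, the noncentrality convention \(\lambda = f^2 N\), and that a reported \(\eta^2\) can be treated as a partial \(\eta^2\). The function names are hypothetical.

```r
# Approximate power of a 1-df F test (e.g. the three-way interaction).
# f2: Cohen's f^2; n_total: total N; n_cells: number of design cells.
interaction_power <- function(f2, n_total, n_cells = 8, alpha = 0.05) {
  df1  <- 1
  df2  <- n_total - n_cells
  crit <- qf(1 - alpha, df1, df2)                  # critical F under the null
  pf(crit, df1, df2, ncp = f2 * n_total, lower.tail = FALSE)
}

# Convert a (partial) eta-squared into Cohen's f^2.
eta2_to_f2 <- function(eta2) eta2 / (1 - eta2)

# Participants per cell in the original study and in this replication.
n_per_cell <- c(original = 84, replication = 131) / 8

# Candidate effect sizes: Cohen's small and medium conventions (f = .10, .25)
# and the eta-squared reported for the original three-way interaction,
# taken at face value.
effect_sizes <- c(small = 0.10^2, medium = 0.25^2, reported = eta2_to_f2(0.13))
sample_sizes <- c(84, 131, 300, 600)

power_table <- sapply(effect_sizes, function(f2) interaction_power(f2, sample_sizes))
rownames(power_table) <- paste0("N=", sample_sizes)
round(power_table, 2)
```

Reading across a row shows how strongly the conclusion depends on the assumed effect size. Reading down a column gives a rough sense of the sample size a better-powered replication would need.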
Another possibility is that the manipulations and materials in the original study differed somewhat from ours. The original study used photos along with the vignettes, which makes for a stronger manipulation of race and gender, and perhaps the salience of these variables helped produce the effect. However, since we were unable to obtain the original materials, we resorted to different methods, which may have made for a weaker manipulation and made it more difficult for us to obtain the effect.
Additionally, the demographic composition of the sample that we collected was somewhat different from the one that Livingston et al. reported. While they had a majority female sample, we had a majority male sample. This difference could have contributed to the discrepancy if participant gender affects how individuals perceive others. Of course, this conclusion is tentative, and this factor may play a smaller role in our failure to replicate than the others.
Finally, in the process of reviewing Livingston et al.’s paper in depth, I noticed a few oddities that, while not mistakes, represent quirks in their data reporting that bear addressing. First, when recreating Livingston et al.’s original graph, it came to my attention that standard deviations were used for the error bars rather than standard errors, which is perhaps the more common practice. Second, these error bars appeared to extend only half a standard deviation above and below the means, which again seems to go against convention. Third, and this is something I have begun to notice in many papers I read and not just this one, documentation of the methods and materials in the paper was quite sparse, making the study difficult to replicate faithfully, especially since I was not able to obtain the original materials.
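To make the difference between these error bar conventions concrete, the sketch below compares a full standard deviation, half a standard deviation, and a standard error for the original cell SDs (the `sd` column of the `original` data frame defined earlier). The per-cell sample size is an assumption: it takes N = 84 split evenly over eight cells, which the paper may not have had exactly.

```r
# Cell SDs as entered in the `original` data frame earlier in this report.
cell_sd <- c(1.36, 1.24, 0.98, 1.27, 0.70, 1.24, 0.80, 1.02)

# Assumed equal cell sizes: 84 participants over 8 cells.
n_per_cell <- 84 / 8

# Standard error of a cell mean: SD divided by the square root of the cell n.
cell_se <- cell_sd / sqrt(n_per_cell)

# Half-widths of the three candidate error bars, side by side.
error_bars <- data.frame(
  full_sd = cell_sd,
  half_sd = cell_sd / 2,
  se      = round(cell_se, 2)
)
error_bars
```

Under this assumption the standard errors are smaller than half a standard deviation in every cell, so the three conventions would produce visibly different bars on the same plot.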