IAT Scoring Made Easy: An Automated R Script to Analyze Implicit Association Test Output

This publication is an R Markdown document, rendered as HTML, that contains R code developed by Daniel Storage to analyze Implicit Association Test (IAT) output data. See Greenwald, McGhee, and Schwartz (1998) for more information about what the IAT is and how to use it. The script scores IAT data in accordance with the widely used scoring algorithm recommended by Greenwald, Nosek, and Banaji (2003). It is open-source and can be used by anyone. For questions regarding use of this R script, email Daniel Storage at storage2@illinois.edu.

The Implicit Association Test (IAT) is a computer-based task that allows you to measure people’s associations implicitly (Greenwald et al., 1998). Do people think men are better suited for science than women? Do people associate intelligence with some race groups more than others? Are people more willing (i.e., quicker) to pair African Americans with negative concepts (e.g., weapons) than European Americans? These are all questions that the IAT can answer, even if the participants are unwilling or unable to admit to holding such negative associations. Despite previous reports calling into question the validity of the IAT (Tetlock & Mitchell, 2009), experts agree that the IAT is a useful and meaningful predictor of behavior (e.g., Jost et al., 2009).

This publication details R code that can be used to analyze data output from the IAT. The script takes an average of 2.5 seconds to compile, depending on the computer used to run it.

# This script is designed to automatically score IAT output files 
# The procedure used is that of Greenwald et al. (2003)

Data files should consist of six columns organized in order as follows: Block (0-6), trial (0-19 for training blocks, 0-39 for test blocks), category (dependent on your IAT), the type of item within that category (dependent on your IAT), a dummy variable indicating whether the participant was correct or incorrect on that trial (0=correct, 1=incorrect), and the participant’s reaction time (in milliseconds). Import your data into R, labeling it “IAT” (without quotes) and leaving out headings.

Note that this code assumes your IAT consists of 220 trials in total. If you are running a brief IAT (which consists of three categories to sort instead of four; see Sriram & Greenwald, 2009), you may need to make slight alterations to the code to suit the structure of your IAT (e.g., by changing “i < 221” to “i < 140” in Step 1 below).

# Upload raw data file and label it IAT 
IAT <- read.csv("IATsampledataset.txt", header=FALSE)

# Change headings of the dataset
colnames(IAT) <- c("Block", "Trial", "Category", "Cat_Item", "Correct", "RT")
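Before scoring, it can help to confirm that the imported file matches the six-column layout described above. The following is only a minimal sketch: `IAT_demo` is made-up toy data (not the sample dataset), and `check_iat()` is a hypothetical helper that is not part of the original script.

```r
# Hypothetical toy data standing in for an imported IAT file (not real data)
IAT_demo <- data.frame(
  Block    = c(0, 0, 2, 2),
  Trial    = c(0, 1, 0, 1),
  Category = c("Male", "Female", "Science", "Arts"),
  Cat_Item = c("Man", "Woman", "Physics", "Poetry"),
  Correct  = c(0, 1, 0, 0),
  RT       = c(712, 954, 688, 10432)
)

# check_iat() is a hypothetical helper: it stops with an error if the
# data frame does not match the layout this script expects
check_iat <- function(d) {
  stopifnot(
    ncol(d) == 6,                 # six columns, in the documented order
    all(d$Block %in% 0:6),        # blocks are indexed 0-6
    all(d$Correct %in% c(0, 1)),  # 0 = correct, 1 = incorrect
    is.numeric(d$RT),             # reaction times in milliseconds
    all(d$RT >= 0)
  )
  invisible(TRUE)
}

check_iat(IAT_demo)
```

Calling `check_iat(IAT)` on your own data right after renaming the columns would catch a mis-ordered or mis-coded file before any effect size is computed.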

In accordance with the scoring algorithm used by Greenwald et al. (2003), any trial with a reaction time over 10,000 milliseconds (10 seconds) must be deleted. Given that a response should take only a second or two, a reaction time over 10,000 ms most likely reflects inattention (or that the participant stepped away from the computer), so the trial should not be included in calculating the participant’s IAT effect size.

# Step 1: Delete any reaction times > 10,000 ms ####
i <- 1 # define i counting variable for while loop, starting at the first trial
while (i < 221) { # define while loop for Step 1 (loops over all 220 trials)
  if (IAT$RT[i] > 10000) {IAT$RT[i] <- 0} # flag trials over 10,000 ms with a 0
  i <- i + 1
}
IAT2 <- subset(IAT, RT!=0) # new data frame, excluding trials over 10,000 ms
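The same Step 1 filter can also be written without a loop, which avoids depending on a fixed trial count. This is a sketch on made-up data; `IAT_demo` and `IAT2_demo` are hypothetical names used only for illustration.

```r
# Hypothetical toy data: four trials, one of them slower than 10,000 ms
IAT_demo <- data.frame(
  Block   = c(2, 2, 3, 3),
  Correct = c(0, 0, 1, 0),
  RT      = c(650, 12000, 800, 720)
)

# Keep only trials at or below the 10,000 ms cutoff
IAT2_demo <- IAT_demo[IAT_demo$RT <= 10000, ]

nrow(IAT2_demo) # number of trials that survive the filter
```

Because the comparison is applied to the whole `RT` column at once, this version works unchanged for IATs with a different number of trials.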

Again following the instructions of Greenwald et al. (2003), any participant who responded faster than 300 ms on more than 10% of trials must be excluded. Participants who meet this criterion likely rushed through the IAT (e.g., by rapidly pressing buttons) and have therefore produced unreliable IAT effect sizes that do not accurately represent their associations.

The following chunk of code calculates the proportion of trials on which the participant responded faster than 300 ms, and prints a warning message at the end of the script if the participant should be excluded on this basis.

# Step 2: Check for exclusion based on response speed (10% trials < 300 ms) ####
SpeedCount <- length(which(IAT2$RT<300)) # count number of RTs under 300
SpeedCount # display the number of RTs under 300

## [1] 0

SpeedProp <- SpeedCount/nrow(IAT2) # calculate proportion of RTs under 300
SpeedProp # display proportion of RTs under 300

## [1] 0

if (SpeedProp > 0.10) {
  print("STOP ANALYZING AND EXCLUDE")
  Exclude <- "Yes"
}
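As a compact illustration of the Step 2 rule, the proportion of fast trials is simply the mean of a logical vector. The reaction times below are invented for the example, and `rt_demo`, `SpeedProp_demo`, and `Exclude_demo` are hypothetical names.

```r
# Hypothetical reaction times (ms): 2 of the 10 trials are faster than 300 ms
rt_demo <- c(250, 280, 640, 710, 590, 820, 760, 680, 905, 730)

# Proportion of trials under 300 ms (mean of TRUE/FALSE values)
SpeedProp_demo <- mean(rt_demo < 300)

# Always define the flag, so later code can rely on it existing
Exclude_demo <- if (SpeedProp_demo > 0.10) "Yes" else "No"

SpeedProp_demo
Exclude_demo
```

Here 20% of the trials are under 300 ms, so this made-up participant would be flagged for exclusion.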

The next step in the scoring algorithm involves computing means for all correct trials in our key test blocks (blocks 2, 3, 5, and 6, indexed from 0-6).

# Step 3: Compute means of correct trials in blocks 2, 3, 5, and 6 ####
Block2trials <- subset(IAT2, Block==2) # subset data frame for only Block 2 trials 
Block2correct <- subset(Block2trials, Correct==0) # subset data frame for only correct trials
Block2correctMean <- mean(Block2correct$RT) # mean of Block 2 correct trials 
Block3trials <- subset(IAT2, Block==3) # subset data frame for only Block 3 trials 
Block3correct <- subset(Block3trials, Correct==0) # subset data frame for only correct trials
Block3correctMean <- mean(Block3correct$RT) # mean of Block 3 correct trials 
Block5trials <- subset(IAT2, Block==5) # subset data frame for only Block 5 trials 
Block5correct <- subset(Block5trials, Correct==0) # subset data frame for only correct trials
Block5correctMean <- mean(Block5correct$RT) # mean of Block 5 correct trials 
Block6trials <- subset(IAT2, Block==6) # subset data frame for only Block 6 trials 
Block6correct <- subset(Block6trials, Correct==0) # subset data frame for only correct trials
Block6correctMean <- mean(Block6correct$RT) # mean of Block 6 correct trials
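The four per-block means of correct trials can also be obtained in a single call. The sketch below uses base R's `tapply()` on invented data; `IAT2_demo` and `correct_means` are hypothetical names, not objects from the script above.

```r
# Hypothetical toy data covering the four key blocks
IAT2_demo <- data.frame(
  Block   = c(2, 2, 2, 3, 3, 5, 5, 6, 6, 6),
  Correct = c(0, 1, 0, 0, 0, 0, 1, 0, 0, 1),
  RT      = c(600, 900, 700, 650, 750, 800, 1200, 700, 900, 1500)
)

# Select correct trials (Correct == 0) in blocks 2, 3, 5, and 6
key <- IAT2_demo$Block %in% c(2, 3, 5, 6) & IAT2_demo$Correct == 0

# Mean RT of correct trials, one value per block
correct_means <- tapply(IAT2_demo$RT[key], IAT2_demo$Block[key], mean)
correct_means
```

The result is a named vector, so the mean for Block 2, for example, is available as `correct_means["2"]`.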

We now need to replace the reaction time of each incorrect trial with the mean reaction time of the correct trials in the same block (computed in Step 3) plus 600 ms (the penalty for an incorrect response).

# Step 4: Replace incorrect trials with average RT by block + 600 ####
newBlock2 <- Block2correctMean + 600
newBlock3 <- Block3correctMean + 600
newBlock5 <- Block5correctMean + 600
newBlock6 <- Block6correctMean + 600

i <- 1 # define i counting variable for while loop 
while (i < nrow(IAT2) + 1) { # create while loop for Block 2 incorrect trial replacement
  if (IAT2$Block[i] == 2 && IAT2$Correct[i] == 1) {
    IAT2$RT[i] <- newBlock2 }
  i <- i + 1
}

i <- 1 # define i counting variable for while loop 
while (i < nrow(IAT2) + 1) { # create while loop for Block 3 incorrect trial replacement
  if (IAT2$Block[i] == 3 && IAT2$Correct[i] == 1) {
    IAT2$RT[i] <- newBlock3 }
  i <- i + 1
}

i <- 1 # define i counting variable for while loop 
while (i < nrow(IAT2) + 1) { # create while loop for Block 5 incorrect trial replacement
  if (IAT2$Block[i] == 5 && IAT2$Correct[i] == 1) {
    IAT2$RT[i] <- newBlock5 }
  i <- i + 1
}

i <- 1 # define i counting variable for while loop 
while (i < nrow(IAT2) + 1) { # create while loop for Block 6 incorrect trial replacement
  if (IAT2$Block[i] == 6 && IAT2$Correct[i] == 1) {
    IAT2$RT[i] <- newBlock6 }
  i <- i + 1
}
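The four while loops above follow the same pattern, so the replacement can be sketched more compactly with logical indexing. This is an illustrative alternative on made-up data, not the original implementation; `IAT2_demo`, `in_block`, and `penalty` are hypothetical names.

```r
# Hypothetical toy data covering the four key blocks
IAT2_demo <- data.frame(
  Block   = c(2, 2, 2, 3, 3, 5, 5, 6, 6, 6),
  Correct = c(0, 1, 0, 0, 0, 0, 1, 0, 0, 1),
  RT      = c(600, 900, 700, 650, 750, 800, 1200, 700, 900, 1500)
)

for (b in c(2, 3, 5, 6)) {
  in_block <- IAT2_demo$Block == b
  # Mean RT of the block's correct trials plus the 600 ms error penalty
  penalty <- mean(IAT2_demo$RT[in_block & IAT2_demo$Correct == 0]) + 600
  # Overwrite the RT of every incorrect trial in this block
  IAT2_demo$RT[in_block & IAT2_demo$Correct == 1] <- penalty
}

IAT2_demo$RT
```

In this toy example the single error in Block 2 becomes 1250 ms (650 + 600), and the errors in Blocks 5 and 6 each become 1400 ms (800 + 600). Block 3 has no errors and is left unchanged.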

The chunk of code below calculates two standard deviations: one across all trials in blocks 2 and 5 combined, and one across all trials in blocks 3 and 6 combined. These values will be used in our final IAT effect size calculation.

# Step 5: Calculate stdevs for all trials in blocks 2&5 and 3&6 ####
IAT3 <- subset(IAT2, Block==2 | Block==5) # pool all trials in blocks 2&5
sd25 <- sd(IAT3$RT) # stdev of all trials in blocks 2&5
IAT4 <- subset(IAT2, Block==3 | Block==6) # pool all trials in blocks 3&6
sd36 <- sd(IAT4$RT) # stdev of all trials in blocks 3&6

We now calculate the means for all of our key test blocks. These values will again be used in our final IAT effect size calculation.

# Step 6: Calculate means for all trials in blocks 2, 3, 5, and 6 #### 
IAT5 <- subset(IAT2, Block==2) # only Block 2 trials
block2mean <- mean(IAT5$RT) # block 2 mean
IAT6 <- subset(IAT2, Block==3) # only Block 3 trials
block3mean <- mean(IAT6$RT) # block 3 mean 
IAT7 <- subset(IAT2, Block==5) # only Block 5 trials
block5mean <- mean(IAT7$RT) # block 5 mean
IAT8 <- subset(IAT2, Block==6) # only Block 6 trials
block6mean <- mean(IAT8$RT) # block 6 mean

We now compute the mean difference between our key test blocks. These values will again be used in our final IAT effect size calculation.

# Step 7: Compute mean differences in test trials #### 
meandiff1 <- block5mean - block2mean # first mean difference (blocks 5 - 2)
meandiff2 <- block6mean - block3mean # second mean difference (blocks 6 - 3)

We now divide each of the differences computed in Step 7 by the standard deviation of all trials within the blocks that were used to compute those differences. These values will again be used in our final IAT effect size calculation.

# Step 8: Divide each difference by its associated stdev #### 
value1 <- meandiff1/sd25
value2 <- meandiff2/sd36
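To make Steps 5 through 8 concrete, here is a small worked sketch on invented reaction times, assuming the error penalties from Step 4 have already been applied. The helpers `block_mean()` and `sd_all_trials()` and all `_demo` objects are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data: three trials in each of the four key blocks
IAT2_demo <- data.frame(
  Block = rep(c(2, 3, 5, 6), each = 3),
  RT    = c(600, 700, 800,    # block 2
            650, 750, 850,    # block 3
            800, 900, 1000,   # block 5
            700, 900, 1100)   # block 6
)

# Mean RT of all trials in one block (Step 6)
block_mean <- function(b) mean(IAT2_demo$RT[IAT2_demo$Block == b])

# Standard deviation across all trials in two blocks combined (Step 5)
sd_all_trials <- function(b1, b2) {
  sd(IAT2_demo$RT[IAT2_demo$Block %in% c(b1, b2)])
}

# Steps 7 and 8: mean difference divided by its associated stdev
value1_demo <- (block_mean(5) - block_mean(2)) / sd_all_trials(2, 5)
value2_demo <- (block_mean(6) - block_mean(3)) / sd_all_trials(3, 6)

value1_demo
value2_demo
```

In this example the mean differences are 200 ms and 150 ms, and each is scaled by the variability of the trials that produced it, which is what makes the two values comparable before they are averaged.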

The final step below computes the overall IAT effect size for the participant (the D measure of Greenwald et al., 2003). It can be interpreted in roughly the same way as a standardized mean difference such as Cohen’s d.

Note that the sign (positive vs. negative) of this effect size depends both on how your IAT is structured and on whether you have counterbalanced conditions (as you should). If your IAT starts with a “congruent” block (i.e., participants need to sort categories on the left that produce what you believe is an association congruent with their beliefs), no further action is required and you can use the value produced in Step 9. If your IAT starts with an “incongruent” block (i.e., participants need to sort categories on the left that produce what you believe is an association incongruent with their beliefs), then you must flip the sign (i.e., from positive to negative or from negative to positive) on the effect size produced in Step 9.

# Step 9: Average these two values to get your IAT effect size ####
IATeffect <- (value1+value2)/2 # calculate IAT effect size 
IATeffect # print IAT effect size

## [1] -0.2378309

if (SpeedProp > 0.10) { 
  print("SUBJECT WENT TOO FAST, EXCLUDE!")
  }
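If a participant's IAT started with the incongruent block, the sign flip described before Step 9 can be sketched as follows. The flag `started_incongruent` is an assumed variable that you would set yourself for each participant; it is not produced by the script, and the `_demo` and `_final` names are hypothetical.

```r
# Effect size printed in Step 9 above, copied here for illustration
IATeffect_demo <- -0.2378309

# Assumed flag: TRUE if this participant's IAT began with the incongruent block
started_incongruent <- TRUE

# Flip the sign only for participants who started with the incongruent block
IATeffect_final <- if (started_incongruent) -IATeffect_demo else IATeffect_demo
IATeffect_final
```

Keeping the flag explicit makes it easier to apply the same rule consistently when conditions are counterbalanced across participants.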

You now have your final effect size for this participant (about -0.24), scored in accordance with the guidelines provided by Greenwald et al. (2003). The average IAT effect size for social group attitudes is 0.29, and the average IAT effect size for social group stereotypes is 0.32 (Nosek et al., 2007). As an extra reference point, see below for the average IAT effect size for different types of IATs:

Average IAT effect sizes for different IATs: 
Race attitude: 0.49
Weight attitude: 0.35
Race-weapons stereotype: 0.37
Gender-science stereotype: 0.37

Note: The code will print “SUBJECT WENT TOO FAST, EXCLUDE!” if and only if the participant responded faster than 300 ms on more than 10% of trials. If not, only the participant’s IAT effect size will print.

barplot(IATeffect, width=0.2, xlim=c(0,1.2), col="darkorange", names.arg=c("Participant A"), ylab="IAT effect size", ylim=c(-0.3,0.3))