Replication of “Poverty Impedes Cognitive Function” by Mani et al. (2013, Science)

Introduction

Justification for choice of study

For this project, I will replicate Mani et al.’s 2013 Science paper, which examined how economic scarcity affects cognitive function in adults. I chose this paper because I am interested in how poverty-related stressors affect brain development and learning. The paper provides causal evidence that financial stress, an environmental factor, affects cognition; to the best of my knowledge, this has not yet been examined in school-aged youth. However, before examining whether economic scarcity also affects the cognitive systems integral to learning in children and adolescents, I first want to examine the validity of the scarcity effect in an adult population. If my replication reproduces results similar to those of the original paper, I hope the effect can then be investigated in younger populations.

Description of stimuli and procedures

In this study, participants will first be randomly assigned to review and respond to a series of financially difficult or easy scenarios. After a brief break, they will complete two cognitive tasks: the first will consist of a series of Raven’s matrices puzzles, often used as a proxy for fluid reasoning; the second will gauge cognitive control using the spatial incompatibility task. In the spatial incompatibility task, participants will see an object on one side of the screen and must press a button on either the same side or the opposite side, depending on the trial’s instructions. Participants’ cognitive control will be measured based on both speed and accuracy.
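The response rule of the spatial incompatibility task can be illustrated with a short scoring sketch. This is only an illustration under assumed names: the function `score_trial`, its arguments, the "left"/"right" coding, and the reaction times are hypothetical and are not taken from the task code or the study data.

```r
# Sketch: score trials of a spatial incompatibility task.
# rule = "same"     -> correct key is on the side where the stimulus appeared
# rule = "opposite" -> correct key is on the other side
# Argument names and the "left"/"right" coding are hypothetical.
score_trial <- function(stimulus_side, rule, response_side) {
  other_side <- ifelse(stimulus_side == "left", "right", "left")
  correct_side <- ifelse(rule == "same", stimulus_side, other_side)
  as.integer(response_side == correct_side)
}

# Made-up trials for illustration (not study data)
trials <- data.frame(
  stimulus_side = c("left", "left", "right", "right"),
  rule          = c("same", "opposite", "same", "opposite"),
  response_side = c("left", "right", "left", "left"),
  rt            = c(512, 640, 701, 588),  # reaction times in ms
  stringsAsFactors = FALSE
)
trials$accuracy <- score_trial(trials$stimulus_side,
                               trials$rule,
                               trials$response_side)

mean(trials$accuracy)                   # proportion correct
mean(trials$rt[trials$accuracy == 1])   # mean RT on correct trials only
```

Accuracy and reaction time are kept as separate columns so that both speed and accuracy can be summarized per participant.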

Anticipated challenges

There are a few challenges in replicating this particular study in an online format. First, participants in the original study were shoppers at a New Jersey mall, and their data will be compared with data from participants recruited on Prolific. Because the original paper relied on a convenience sample, this difference in populations could affect whether the results replicate. Additionally, neither the original paper nor its supplemental materials describe in sufficient detail how the authors ensured that participants comprehended and carefully examined each scenario, and there is no mention of how long participants had to review them. As a result, it may be difficult to compose similar comprehension checks and to induce a scarcity effect in a remote format. Furthermore, with regard to the analysis, it is not clear how the 2013 paper grouped participants into “rich” and “poor” categories. I would like to preregister this decision before running any analysis so that it cannot be influenced by the data, because how participants are assigned to these two groups could affect the final results.

Links

Project repository (on Github): Link to repository

Preregistration (on OSF): Link to registration report

Original paper (as hosted in your repo): Link to paper

Link to Task: Click HERE to access my task

Methods

Power Analysis

With an original effect size of approximately d = .88, 95% power, and a .05 alpha level, the planned minimum sample size is 20 total participants.
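The calculation behind this figure is not shown here, and the required sample size depends strongly on which test is assumed. As a rough cross-check, the sketch below uses base R's `power.t.test` and assumes a two-sided, two-sample t-test on a standardized effect (sd = 1). That test family is an assumption and is not necessarily the one used for the number above; an interaction test in a 2 x 2 design would need its own calculation.

```r
# Sketch: a priori power analysis with base R (stats::power.t.test).
# Assumption: two-sided, two-sample t-test; d is expressed in SD units (sd = 1).
d_original <- 0.88   # approximate effect size taken from the text above

power_result <- power.t.test(delta = d_original,
                             sd = 1,
                             sig.level = 0.05,
                             power = 0.95,
                             type = "two.sample",
                             alternative = "two.sided")

# Round up because participants come in whole numbers
n_per_group <- ceiling(power_result$n)
n_per_group
```

Running the same call with a different `type` (for example `"paired"` or `"one.sample"`) shows how sensitive the planned sample size is to the assumed design.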

Planned Sample

Participants from low and high SES backgrounds will be randomly assigned to either a hard or an easy condition. In the original study, 101 adults (mean age = 35.3 years; 65 F) participated in the first experiment. While the number of participants in each group is not explicitly reported, it is likely that approximately 50 participants were assigned to each condition and then further divided into high and low SES, for roughly 25 participants per group.

Therefore, I plan to recruit at least 25 participants per group (50 low and 50 high SES participants, each randomly assigned to the hard or easy condition). Additionally, there will be a preference for adults around the age of 35.

Exclusion Criteria

While the original study did not document any exclusion criteria, I will exclude participants if there is evidence that they performed the task at chance level (accuracy ~50%). Additionally, after viewing each scenario, participants will be asked whether they had sufficient time to review it before proceeding to the next one. Data from participants who indicate that they did not have enough time to deliberate on a response will be excluded. Finally, any participant who does not report income data will be excluded from the final analysis, as there will be no way to determine which SES group they belong to.
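The three exclusion rules can be sketched as a filter on a per-participant summary table. Everything below is illustrative: the column names, the toy values, and in particular the numeric cutoff used to operationalize "chance level" are assumptions that would need to be fixed in the preregistration.

```r
# Sketch: apply the three exclusion rules to a per-participant summary table.
# All column names and values are hypothetical placeholders, not study data.
participants <- data.frame(
  id          = c("p1", "p2", "p3", "p4"),
  mean_acc    = c(0.97, 0.52, 0.93, 0.95),   # proportion correct on the task
  enough_time = c(TRUE, TRUE, FALSE, TRUE),  # self-report after the scenarios
  income      = c(30000, 45000, 60000, NA),  # NA = income not reported
  stringsAsFactors = FALSE
)

chance_cutoff <- 0.55  # assumed operationalization of "about 50%"

# Record a reason for each exclusion so the count per rule can be reported
participants$exclude_reason <- NA_character_
participants$exclude_reason[is.na(participants$income)] <- "missing income"
participants$exclude_reason[!participants$enough_time] <- "insufficient time"
participants$exclude_reason[participants$mean_acc <= chance_cutoff] <-
  "chance-level accuracy"

included <- participants[is.na(participants$exclude_reason), ]
table(participants$exclude_reason, useNA = "ifany")
```

Keeping the reason in its own column, rather than dropping rows immediately, makes it straightforward to report how many participants each rule removed.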

Materials

All participants will be recruited through Prolific, a platform for recruiting participants for online research. After participants provide informed consent, the scenarios and the subsequent comprehension check will be administered on Qualtrics. After submitting the survey, participants will be automatically redirected to Cognition.Run, where they will complete a behavioral task based on Diamond et al. (2003) that was programmed using jsPsych (de Leeuw, 2015).

Procedure

“In experiment 1, participants (n = 101) were presented with four hypothetical scenarios a few minutes apart. Each scenario described a financial problem the participants might experience. For example: “Your car is having some trouble and requires $X to be fixed. You can pay in full, take a loan, or take a chance and forego the service at the moment… How would you go about making this decision?” These scenarios, by touching on monetary issues, are meant to trigger thoughts of the participant’s own finances. They are intended to bring to the forefront any nascent, easy to activate, financial concerns. After viewing each scenario, and while thinking about how they might go about solving the problem, participants performed two computer-based tasks used to measure cognitive function: Raven’s Progressive Matrices and a spatial compatibility task. The Raven’s test involves a sequence of shapes with one shape missing (27). Participants must choose which of several alternatives best fits in the missing space. Raven’s test is a common component in IQ tests and is used to measure “fluid intelligence,” the capacity to think logically and solve problems in novel situations, independent of acquired knowledge (28, 29). The spatial incompatibility task requires participants to respond quickly and often contrary to their initial impulse. Presented with figures on the screen, they must press the same side in response to some stimuli but press the opposite side in response to others. The speed and accuracy of response measures cognitive control (30), the ability to guide thought and action in accordance with internal goals (31). Both are nonverbal tasks, intended to minimize the potential impact of literacy skills. Upon completion of these tasks, participants responded to the original scenario by typing their answers on the computer or speaking to a tape recorder and then moved on to the next scenario (an analysis of participants’ responses to the scenarios is available in table S1).
We also collected participants’ income information at the end of the experiment.”

The procedure will follow steps similar to those of the original study, with a few differences given the online nature of this project. First, after reviewing each scenario, participants will complete a comprehension check to ensure that the scenario was read carefully. Because the survey is completed remotely and without time restrictions, a failure to replicate could reflect participants not reviewing the scenarios adequately or seriously rather than the absence of a scarcity effect. Therefore, to ensure that each participant examined the scenarios carefully, a 4-item questionnaire will ask individuals to report back values that they read. Second, I will only compare results from the cognitive control task (i.e., the spatial incompatibility task).

Additionally, a major change in my project is that participants will complete substantially more trials than in the original study. According to the supplemental materials from Mani et al. (2013), participants completed 10 practice trials of the hearts and flowers task and could proceed to the experimental task only after answering each one correctly. However, the authors report that participants completed only three experimental trials. For my task, participants will complete 3 blocks of XX trials.

Analysis Plan

Ultimately, I aim to produce two figures: a bar chart comparing overall accuracy on the cognitive control task between each group; and a scatter plot fitting a regression model for participants in each condition.
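The second figure implies a regression of accuracy on income within each condition. A minimal sketch of how such a model could be specified in base R follows. The data are simulated for illustration only, and the column names (`mean_acc`, `income`, `group`) are assumed to mirror a per-participant summary table; none of the numbers are study results.

```r
# Sketch: does income predict accuracy differently in each condition?
# Simulated values for illustration only; NOT study data.
set.seed(42)
n <- 40
toy <- data.frame(
  group  = factor(rep(c("EASY", "HARD"), each = n / 2),
                  levels = c("EASY", "HARD")),
  income = round(runif(n, min = 10000, max = 100000), -3)
)
toy$mean_acc <- pmin(1, 0.93 + rnorm(n, sd = 0.03))

# Interaction model: the income:groupHARD term tests whether the
# income slope differs between the hard and easy conditions
fit <- lm(mean_acc ~ income * group, data = toy)

# Separate per-condition fits, one regression line per condition
fit_easy <- lm(mean_acc ~ income, data = subset(toy, group == "EASY"))
fit_hard <- lm(mean_acc ~ income, data = subset(toy, group == "HARD"))

coef(summary(fit))
```

The single interaction model and the two per-condition fits answer slightly different questions: the former tests the difference in slopes directly, while the latter gives the slope shown by each regression line in the scatter plot.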

Step 1: There must be a balanced number of participants within each of my 4 groups AND there must be a balanced number of participants from various income backgrounds (i.e., not too many high SES in the easy condition).

Step 2: I will use reported household income and household size to compute an income-to-needs ratio (INR) and use the U.S. Federal Poverty Guidelines to determine whether each participant is high or low SES. I will probably recruit in at least two waves: first high SES participants and then low SES participants. This will help ensure that I accomplish Step 1.

Step 3: To ensure that participants carefully read and internalized each scenario, I will ask them to complete a simple comprehension check. I will also use Qualtrics timing information to determine how long each participant took to complete the survey before submitting it. This is paramount to ensure that participants are thinking about their finances before engaging in the cognitive control task.

Step 4: For the cognitive control task, I will need to know two things: their accuracy and reaction time on each trial.
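The income-to-needs classification in Step 2 can be sketched as a small helper. The threshold values follow those that appear in the data preparation code later in this report (12,880 for a one-person household, rising by 4,540 per additional person); the function names are hypothetical. Note one assumption: an income exactly at the threshold (INR = 1) is classified as "RICH" here, a boundary choice that would need to be stated in the preregistration.

```r
# Sketch: income-to-needs ratio (INR) from household income and size.
# Threshold values mirror those in the data preparation code of this report;
# the helper names poverty_threshold() and classify_ses() are hypothetical.
poverty_threshold <- function(household_size,
                              base = 12880,
                              per_person = 4540) {
  base + per_person * (household_size - 1)
}

classify_ses <- function(income, household_size) {
  inr <- income / poverty_threshold(household_size)
  # INR below 1 means income is under the poverty line for that household size
  ifelse(inr < 1, "POOR", "RICH")
}

# Example with made-up values (not study data)
example <- data.frame(income = c(10000, 30000, 26500),
                      household_size = c(1, 4, 4))
example$ses_group <- classify_ses(example$income, example$household_size)
example
```

Computing the threshold from a formula avoids one hand-written condition per household size and makes the boundary behavior explicit.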

An additional analysis of interest is whether reaction time differs between groups. Even if accuracy is intact across conditions, could scarcity also cause people to respond more slowly? If so, would this manifest as longer reaction times for low-income participants in the hard condition compared with all other groups?

Differences from Original Study

The first major difference is the nature of the study: in the original experiment, participants reviewed scenarios and engaged in two cognitive tasks at a mall. For my study, all participants will engage entirely online and will only do one of the cognitive tasks – the cognitive control task by Diamond et al. (2004). Given the online nature, participants will also complete a comprehension check after reviewing scenarios.

Methods Addendum (Post Data Collection)

Actual Sample

A total of 50 participants were recruited via Prolific. Five participants either did not provide adequate information regarding income or omitted this information entirely. This left 45 participants in the analysis described below (Mean Income = $33,222; Mean Age = 26.05; Female = 21, Male = 21, Non-Binary = 2, Undisclosed = 1).

Differences from pre-data collection methods plan

The original study analyzed accuracy on only 3 trials from the cognitive control experiment. In my study, however, the analysis includes accuracy from 40 trials across 4 blocks. A separate analysis compares performance on only the first three trials of the task for each participant, but the results presented in the overall analysis include data from every trial.

Results

Data preparation

Data preparation following the analysis plan.

############################################
#####   PREPARE DATA FOR ANALYSIS     ######
############################################ 
library(tidyverse)

## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.1.4     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1

## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(ggpubr)

#### Import data files from Cognition.Run

setwd("~/Desktop/mani2013/data")
easy_data <- read_csv("easy_data.csv") # raw data files for easy condition

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_character(),
##   run_id = col_double(),
##   condition = col_double(),
##   trial_index = col_double(),
##   time_elapsed = col_double(),
##   recorded_at = col_datetime(format = "")
## )
## ℹ Use `spec()` for the full column specifications.

## Warning: 3930 parsing failures.
## row col   expected     actual            file
##   1  -- 38 columns 37 columns 'easy_data.csv'
##   2  -- 38 columns 37 columns 'easy_data.csv'
##   3  -- 38 columns 37 columns 'easy_data.csv'
##   4  -- 38 columns 37 columns 'easy_data.csv'
##   5  -- 38 columns 37 columns 'easy_data.csv'
## ... ... .......... .......... ...............
## See problems(...) for more details.

hard_data <- read_csv("hard_data.csv") # raw data files for hard condition

## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_character(),
##   run_id = col_double(),
##   condition = col_double(),
##   trial_index = col_double(),
##   time_elapsed = col_double(),
##   recorded_at = col_datetime(format = "")
## )
## ℹ Use `spec()` for the full column specifications.

# filter columns to ensure DF are the same size
easy_data_filtered <- easy_data %>%
  select(c(PROLIFIC_PID,
           run_id,
           trial_index,
           subject_id,
           group,
           rt,
           response,
           task,
           correct_response,
           accuracy,
           block))

hard_data_filtered <- hard_data %>%
  select(c(PROLIFIC_PID,
           run_id,
           trial_index,
           subject_id,
           group,
           rt,
           response,
           task,
           correct_response,
           accuracy,
           block))

data <- rbind(easy_data_filtered, hard_data_filtered) # merge into one file

############################################
#####   CLEANING DATA FOR ANALYSIS     #####
############################################ 

# remove JSON text from survey responses
data$response <- gsub('\\{"Q0":"','', data$response)
data$response <- gsub('"\\}', '', data$response)
data$response <- gsub("100000+", "100000", data$response, fixed = TRUE) # fixed = TRUE: match the "+" literally, not as a regex quantifier

# filter data and remove NAs

data_filtered <- data %>%
  filter(group == "HARD" |
           group == "EASY",
         !is.na(PROLIFIC_PID))

data_filtered <- within(data_filtered, response[response == "100000+"] <- 100000)
data_filtered <- within(data_filtered, response[response == "10+"] <- 10)

######   INCOME AND SES ANALYSIS     #####

# determine SES groups using federal poverty guidelines
income_data <- data_filtered %>%
  select(c(PROLIFIC_PID,
           run_id,
           task,
           response)) %>%
  filter(task == "household_size" |
           task == "income") %>%
  filter(run_id != 36) %>% # something happening with participant ID
  pivot_wider(names_from = "task", 
              values_from = "response") %>%
  mutate(ses_group = case_when(as.numeric(income) < 12880 & as.numeric(household_size) == 1 ~ "POOR",
                               as.numeric(income) < 17420 & as.numeric(household_size) == 2 ~ "POOR",
                               as.numeric(income) < 21960 & as.numeric(household_size) == 3 ~ "POOR",
                               as.numeric(income) < 26500 & as.numeric(household_size) == 4 ~ "POOR",
                               as.numeric(income) < 31040 & as.numeric(household_size) == 5 ~ "POOR",
                               as.numeric(income) < 35580 & as.numeric(household_size) == 6 ~ "POOR",
                               as.numeric(income) < 40120 & as.numeric(household_size) == 7 ~ "POOR",
                               as.numeric(income) < 44660 & as.numeric(household_size) == 8 ~ "POOR",
                               as.numeric(income) < 49200 & as.numeric(household_size) == 9 ~ "POOR",
                               as.numeric(income) < 53740 & as.numeric(household_size) == 10 ~ "POOR",
                               as.numeric(income) > 12880 & as.numeric(household_size) == 1 ~ "RICH",
                               as.numeric(income) > 17420 & as.numeric(household_size) == 2 ~ "RICH",
                               as.numeric(income) > 21960 & as.numeric(household_size) == 3 ~ "RICH",
                               as.numeric(income) > 26500 & as.numeric(household_size) == 4 ~ "RICH",
                               as.numeric(income) > 31040 & as.numeric(household_size) == 5 ~ "RICH",
                               as.numeric(income) > 35580 & as.numeric(household_size) == 6 ~ "RICH",
                               as.numeric(income) > 40120 & as.numeric(household_size) == 7 ~ "RICH",
                               as.numeric(income) > 44660 & as.numeric(household_size) == 8 ~ "RICH",
                               as.numeric(income) > 49200 & as.numeric(household_size) == 9 ~ "RICH",
                               as.numeric(income) > 53740 & as.numeric(household_size) == 10 ~ "RICH"))

income_data$ses_group <- factor(income_data$ses_group, levels = c("POOR",
                                                                  "RICH"))

data_filtered <- merge(data_filtered, income_data, by = c('PROLIFIC_PID')) # add to main dataframe

####   AGE, RACE, & GENDER ANALYSIS     ####
demo_data <- data_filtered %>%
  filter(task == "demographic") %>%
  separate(response, c("Age", "DOB", "Race", "Gender"), sep = "([,])") %>%
  select(PROLIFIC_PID,
         Age,
         DOB,
         Race,
         Gender)

# remove JSON text from survey responses
demo_data$Age <- gsub('"', '', as.character(demo_data$Age))
demo_data$DOB <- gsub('"Q1":"', '', demo_data$DOB)
demo_data$DOB <- gsub('"', '', demo_data$DOB)
demo_data$Race <- gsub('"Q2":"', '', demo_data$Race)
demo_data$Race <- gsub('"', '', demo_data$Race)
demo_data$Gender <- gsub('"Q3":"', '', demo_data$Gender)

# updating Gender Column
# Recode free-text gender responses. Female variants are matched first so that
# the "male" pattern cannot fire on the substring inside "female".
demo_data$Gender <- gsub('"Q3":', "", demo_data$Gender)
demo_data$Gender <- gsub("[Ff]emale|Woman|femme", "F", demo_data$Gender)
demo_data$Gender <- gsub("[Mm]ale|Man", "M", demo_data$Gender)
demo_data$Gender <- gsub("non binary", "NB", demo_data$Gender)

# add demo data to main dataframe
data_filtered <- merge(data_filtered, demo_data, by = c('PROLIFIC_PID')) 

###########   CLEANING DATA    #############

# this separates data by block per participant
clean_data <- data_filtered %>%
  filter(task == "response") %>%
  group_by(PROLIFIC_PID, group, ses_group, block, 
           income, Age, Gender) %>%
  summarize(mean_acc = mean(as.numeric(accuracy)),
            mean_rt = mean(as.numeric(rt)),
            acc_sd = sd(as.numeric(accuracy)),
            acc_n_obs = length(as.numeric(accuracy)),
            acc_sem = acc_sd / sqrt(acc_n_obs),
            acc_ci = acc_sem * 1.96,
            rt_sd = sd(as.numeric(rt)),
            rt_n_obs = length(as.numeric(rt)),
            rt_sem = rt_sd / sqrt(rt_n_obs),
            rt_ci = rt_sem * 1.96
            )

## `summarise()` has grouped output by 'PROLIFIC_PID', 'group', 'ses_group', 'block', 'income', 'Age'. You can override using the `.groups` argument.

# this has the overall average performance between all trials for each participant
cleaner_data <- data_filtered %>%
  filter(task == "response") %>%
  group_by(PROLIFIC_PID, group, ses_group, 
           income, Age, Gender) %>%
  summarize(mean_acc = mean(as.numeric(accuracy)),
            mean_rt = mean(as.numeric(rt)),
            acc_sd = sd(as.numeric(accuracy)),
            acc_n_obs = length(as.numeric(accuracy)),
            acc_sem = acc_sd / sqrt(acc_n_obs),
            acc_ci = acc_sem * 1.96,
            rt_sd = sd(as.numeric(rt)),
            rt_n_obs = length(as.numeric(rt)),
            rt_sem = rt_sd / sqrt(rt_n_obs),
            rt_ci = rt_sem * 1.96
  )

## `summarise()` has grouped output by 'PROLIFIC_PID', 'group', 'ses_group', 'income', 'Age'. You can override using the `.groups` argument.

# change age and income to numeric
cleaner_data$Age <- as.numeric(cleaner_data$Age)
cleaner_data$income <- as.numeric(cleaner_data$income)

############################################
###########   TASK ANALYSIS    #############
############################################ 

#########   DEMOGRAPHIC ANALYSIS    ############
age_graph <- ggplot(cleaner_data, aes(x = Age)) +
  geom_bar() + 
  ggtitle("Distribution by age") + 
  ylab("# of Participants")
age_graph

## Warning: Removed 1 rows containing non-finite values (stat_count).

# What is the average age? 26.05 (rounded)
mean(cleaner_data$Age, na.rm = TRUE) # some did not report age

## [1] 26.04545

income_graph <- ggplot(cleaner_data, aes(x = income)) + 
  geom_bar() +
  ggtitle("Income Distribution") + 
  ylab("# of Participants") + xlab("Approximate Total Income (USD)")
income_graph

# What is the average income? 33,222.22 
mean(cleaner_data$income) # no NA since they were removed

## [1] 33222.22

# What is the total gender breakdown? 21 F, 21 M, 2 NB, 1 omitted
gender_table <- demo_data %>%
  group_by(Gender) %>%
  summarise(total = n())

#########   OVERALL ANALYSIS    ############

# this is to create Figure 1: comparing between SES and conditions
task_perf <- cleaner_data %>%
  group_by(group, ses_group) %>%
  summarize(accuracy = mean(mean_acc),
            rt = mean(mean_rt),
            acc_sd = sd(mean_acc),
            acc_n_obs = length(PROLIFIC_PID),
            acc_sem = acc_sd / sqrt(acc_n_obs),
            acc_ci = acc_sem * 1.96,
            rt_sd = sd(mean_rt),
            rt_n_obs = length(PROLIFIC_PID),
            rt_sem = rt_sd / sqrt(rt_n_obs),
            rt_ci = rt_sem * 1.96,
            mean_age = mean(as.numeric(Age), na.rm = TRUE))

## `summarise()` has grouped output by 'group'. You can override using the `.groups` argument.

block_perf <- clean_data %>%
  group_by(group, ses_group, block) %>%
  summarize(accuracy = mean(mean_acc),
            rt = mean(mean_rt),
            acc_sd = sd(mean_acc),
            acc_n_obs = length(PROLIFIC_PID),
            acc_sem = acc_sd / sqrt(acc_n_obs),
            acc_ci = acc_sem * 1.96,
            rt_sd = sd(mean_rt),
            rt_n_obs = length(PROLIFIC_PID),
            rt_sem = rt_sd / sqrt(rt_n_obs),
            rt_ci = rt_sem * 1.96)

## `summarise()` has grouped output by 'group', 'ses_group'. You can override using the `.groups` argument.

##########   ORIGINAL ANALYSIS    ############

# get only the first three trials from each participant
# (the first three "response" rows of block 1, in recorded order)
original_data <- data_filtered %>%
  filter(task == "response",
         block == 1) %>%
  group_by(PROLIFIC_PID) %>%
  filter(row_number() <= 3) %>%
  ungroup()

# analyze by group and SES
original_perf <- original_data %>%
  group_by(group, ses_group) %>%
  summarize(mean_acc = mean(as.numeric(accuracy)),
            mean_rt = mean(as.numeric(rt)),
            acc_sd = sd(as.numeric(accuracy)),
            acc_n_obs = length(PROLIFIC_PID),
            acc_sem = acc_sd / sqrt(acc_n_obs),
            acc_ci = acc_sem * 1.96,
            rt_sd = sd(as.numeric(rt)),
            rt_n_obs = length(PROLIFIC_PID),
            rt_sem = rt_sd / sqrt(rt_n_obs),
            rt_ci = rt_sem * 1.96)

## `summarise()` has grouped output by 'group'. You can override using the `.groups` argument.

Confirmatory analysis

My analysis will be successful if I am able to reproduce the following:

No significant differences emerge between participants in the easy condition based on a t-test
A t-test reveals that low SES participants in the hard condition will have lower accuracy
A two-way analysis of variance (ANOVA) will highlight a robust interaction between income and condition
Regression models will confirm that income is predictive of accuracy for participants in the hard condition only (and not for participants in the easy condition)
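One way the two t-tests in the criteria above could be run is sketched below, using Welch's t-test from base R. The data frame is simulated purely for illustration and is not study data; the column names are assumed to match the per-participant summary (`group`, `ses_group`, `mean_acc`).

```r
# Sketch: SES comparison within each condition using Welch t-tests.
# The data frame is simulated purely for illustration; it is NOT study data.
set.seed(1)
toy <- data.frame(
  group     = rep(c("EASY", "HARD"), each = 20),
  ses_group = rep(rep(c("POOR", "RICH"), each = 10), times = 2),
  mean_acc  = pmin(1, rnorm(40, mean = 0.95, sd = 0.03)),
  stringsAsFactors = FALSE
)

# Rich vs. poor within the easy condition (no difference expected)
t_easy <- t.test(mean_acc ~ ses_group, data = subset(toy, group == "EASY"))

# Rich vs. poor within the hard condition (difference expected)
t_hard <- t.test(mean_acc ~ ses_group, data = subset(toy, group == "HARD"))

c(easy_p = t_easy$p.value, hard_p = t_hard$p.value)
```

`t.test` defaults to the Welch correction for unequal variances, which is a reasonable choice when group sizes are small and unequal.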

############################################
######   ACCURACY AND RT FIGURES     #######
############################################  
task_perf$group <- factor(task_perf$group, levels = c('HARD', 'EASY'))

# Create Figure 1 - Accuracy Bar Chart
figure1 <- ggplot(task_perf, aes(x = ses_group, y = accuracy, 
                                 fill = group)) +
  geom_bar(position="dodge", stat="identity") + # ADD COLOR-BLIND PALETTE
  geom_errorbar(aes(ymin = accuracy - acc_ci, 
                    ymax = accuracy + acc_ci), width=.2,
                position=position_dodge(.9)) +
  ggtitle("Figure 1: Accuracy by Group") +
  ylab("Mean Accuracy (proportion correct)") + xlab("SOCIOECONOMIC STATUS") +
  coord_cartesian(ylim=c(.50,1)) + 
  scale_fill_manual(values=c("#999999","#D3D3D3"))
figure1

# Create Figure 2 - Accuracy Regression Chart
figure2 <- ggplot(cleaner_data, 
                  aes(x = income, 
                      y = mean_acc,
                      color = group)) +
  geom_jitter() + # jittered points only; adding geom_point() as well would draw each observation twice
  scale_color_manual(values=c("#000000", "#999999")) +
  geom_smooth(method=lm, aes(fill = group)) +
  ggtitle("Figure 2: Accuracy by Income and Group") +
  labs(y = "Mean Accuracy", x = "Income (USD)")
figure2

## `geom_smooth()` using formula 'y ~ x'

############################################
#######    STATISTICAL ANALYSIS     ########
############################################ 

#######  ANALYSIS OF OVERALL DATA   ########

### TWO-WAY ANOVA
accuracy_aov <- aov(mean_acc ~ group * ses_group,
                    cleaner_data)
summary(accuracy_aov)

##                 Df  Sum Sq   Mean Sq F value Pr(>F)
## group            1 0.00071 0.0007115   0.733  0.397
## ses_group        1 0.00071 0.0007144   0.736  0.396
## group:ses_group  1 0.00123 0.0012307   1.268  0.267
## Residuals       41 0.03979 0.0009704

### Regression Analysis
accuracy_model <- lm(mean_acc ~ group * ses_group, 
                     cleaner_data)
summary(accuracy_model)

## 
## Call:
## lm(formula = mean_acc ~ group * ses_group, data = cleaner_data)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.068182 -0.015909  0.006818  0.018333  0.037500 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              0.981667   0.008043 122.047   <2e-16 ***
## groupHARD               -0.015758   0.012366  -1.274    0.210    
## ses_groupRICH           -0.019167   0.013638  -1.405    0.167    
## groupHARD:ses_groupRICH  0.021439   0.019038   1.126    0.267    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03115 on 41 degrees of freedom
## Multiple R-squared:  0.06259,    Adjusted R-squared:  -0.006002 
## F-statistic: 0.9125 on 3 and 41 DF,  p-value: 0.4433

Exploratory analyses

After a more careful review of the task design, I decided to alter the task slightly in order to examine a few additional ideas.

Reaction Time Analysis

First, in addition to analyzing accuracy, I also wanted to observe differences in reaction time (RT) between SES groups and condition (Figures 3 and 4).

# Create Figure 3 - RT Bar Chart
figure3 <- ggplot(task_perf, aes(x = ses_group, y = rt, 
                                 fill = group)) +
  geom_bar(position="dodge", stat="identity") + 
  geom_errorbar(aes(ymin = rt - rt_ci, 
                    ymax = rt + rt_ci), width=.2,
                position=position_dodge(.9)) +
  ggtitle("Figure 3: Reaction Time by Group") +
  ylab("RT (ms)") + xlab("SOCIOECONOMIC STATUS") + 
  scale_fill_manual(values=c("#999999", "#D3D3D3")) # same HARD/EASY colors as Figure 1
figure3

### TWO-WAY ANOVA
rt_aov <- aov(mean_rt ~ group * ses_group,
                    cleaner_data)
summary(rt_aov)

##                 Df  Sum Sq Mean Sq F value Pr(>F)
## group            1   55471   55471   1.030  0.316
## ses_group        1  104886  104886   1.948  0.170
## group:ses_group  1   49788   49788   0.925  0.342
## Residuals       41 2207217   53835

# Create Figure 4 - RT Regression Chart
figure4 <- ggplot(cleaner_data, 
                  aes(x = income, 
                      y = mean_rt,
                      color = group)) +
  geom_jitter() + # jittered points only; adding geom_point() as well would draw each observation twice
  scale_color_manual(values=c("#000000", "#999999")) +
  geom_smooth(method=lm, aes(fill = group)) +
  ggtitle("Figure 4: Reaction Time by Income and Group") +
  labs(y = "Mean Reaction Time (ms)", x = "Income (USD)")
figure4

## `geom_smooth()` using formula 'y ~ x'

### Regression Analysis
rt_model <- lm(mean_rt ~ group * ses_group, 
                     cleaner_data)
summary(rt_model)

## 
## Call:
## lm(formula = mean_rt ~ group * ses_group, data = cleaner_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -461.35 -170.44  -62.45  104.38  595.99 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               843.10      59.91  14.073   <2e-16 ***
## groupHARD                -112.72      92.10  -1.224    0.228    
## ses_groupRICH            -168.91     101.58  -1.663    0.104    
## groupHARD:ses_groupRICH   136.36     141.80   0.962    0.342    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 232 on 41 degrees of freedom
## Multiple R-squared:  0.08693,    Adjusted R-squared:  0.02012 
## F-statistic: 1.301 on 3 and 41 DF,  p-value: 0.287

Analysis of Block Data

Next, the original study calculated accuracy for each participant based on data from only three trials. I wanted to see whether performance differed over the course of the task and whether the effects of scarcity would diminish with more trials. To examine this, I compared accuracy between conditions and groups by block (Figure 5) and looked for differences in RT (Figure 6).

# Create Figure 5 - Accuracy by Block
figure5 <- ggplot(block_perf, aes(x = ses_group, 
                                  y = accuracy,
                                  fill = group)) +
  geom_bar(position="dodge", stat="identity") + 
  geom_errorbar(aes(ymin = accuracy - acc_ci, 
                    ymax = accuracy + acc_ci), width=.2,
                position=position_dodge(.9)) +
  facet_wrap(~ block) +
  ggtitle("Figure 5: Accuracy by Block") +
  ylab("Accuracy (mean proportion correct)") + xlab("SOCIOECONOMIC STATUS")
figure5

# regression analysis for block, group, and SES group on accuracy

accuracy_block_model <- lm(mean_acc ~ group + ses_group + block, clean_data)
summary(accuracy_block_model)

## 
## Call:
## lm(formula = mean_acc ~ group + ses_group + block, data = clean_data)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.262239  0.008827  0.023703  0.033317  0.045926 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.973395   0.009810  99.220   <2e-16 ***
## groupHARD     -0.006712   0.008497  -0.790    0.431    
## ses_groupRICH -0.008164   0.008600  -0.949    0.344    
## block2         0.004444   0.011870   0.374    0.709    
## block3        -0.004444   0.011870  -0.374    0.709    
## block4         0.017778   0.011870   1.498    0.136    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.05631 on 174 degrees of freedom
## Multiple R-squared:  0.03185,    Adjusted R-squared:  0.00403 
## F-statistic: 1.145 on 5 and 174 DF,  p-value: 0.3386

# Create Figure 6 - Reaction Time by Block
figure6 <- ggplot(block_perf, aes(x = ses_group, 
                                  y = rt, 
                                  fill = group)) +
  geom_bar(position="dodge", stat="identity") +
  geom_errorbar(aes(ymin = rt - rt_ci, 
                    ymax = rt + rt_ci), width=.2,
                position=position_dodge(.9)) +
  facet_wrap(~ block) + 
  ggtitle("Figure 6: Reaction Time by Block") +
  ylab("RT (ms)") + xlab("SOCIOECONOMIC STATUS")
figure6

# regression analysis for block, group, income on RT
rt_block_model <- lm(mean_rt ~ group + ses_group + block, clean_data)
summary(rt_block_model)

## 
## Call:
## lm(formula = mean_rt ~ group + ses_group + block, data = clean_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -484.25 -163.10  -47.68  117.86 1232.77 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     854.05      43.24  19.750  < 2e-16 ***
## groupHARD       -55.18      37.45  -1.473  0.14246    
## ses_groupRICH   -98.93      37.91  -2.610  0.00985 ** 
## block2          -18.64      52.32  -0.356  0.72213    
## block3          -61.45      52.32  -1.175  0.24178    
## block4          -61.05      52.32  -1.167  0.24491    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 248.2 on 174 degrees of freedom
## Multiple R-squared:  0.06708,    Adjusted R-squared:  0.04027 
## F-statistic: 2.502 on 5 and 174 DF,  p-value: 0.03234

Figure 5 and the accompanying regression do not suggest any differences in accuracy between conditions, SES groups, or blocks (all P > 0.13). Thus, I see no evidence in my data set that accuracy shifted as participants progressed through the task.

Interestingly, however, a pattern does emerge for RT. The linear regression shows a significant effect of income (P = 0.010): participants in the high-income group responded about 99 ms faster than those in the low-income group (estimate = -98.93), that is, low-income participants were slower on the task.
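The block models above include block only as a main effect, so they do not by themselves show whether a condition effect changes across blocks. One way to ask that question more directly is to compare the additive model against one that adds a group-by-block interaction. The sketch below is illustrative only: the `toy` data frame is simulated placeholder data (not my study data), and its column names are assumed to mirror those used in `clean_data`.

```r
# Illustrative sketch on SIMULATED placeholder data (not the study data).
# Column names are assumed to mirror clean_data: group, ses_group, block, mean_acc.
set.seed(1)
toy <- expand.grid(id = 1:40, block = factor(1:4))
toy$group     <- factor(ifelse(toy$id %% 2 == 0, "HARD", "EASY"))
toy$ses_group <- factor(ifelse(toy$id <= 20, "POOR", "RICH"))
toy$mean_acc  <- pmin(1, rnorm(nrow(toy), mean = 0.97, sd = 0.03))

# Additive model, same form as the block models above
additive_model <- lm(mean_acc ~ group + ses_group + block, data = toy)

# Same model plus a group-by-block interaction
interaction_model <- lm(mean_acc ~ group * block + ses_group, data = toy)

# F-test of whether the interaction terms improve the fit
model_comparison <- anova(additive_model, interaction_model)
model_comparison
```

Because each participant contributes one row per block, the rows are not independent; a mixed-effects model with a random intercept per participant would be a natural extension, which this sketch does not attempt.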

Analysis following original paper

Finally, I wanted to see what patterns would emerge if I faithfully replicated the analysis documented in Mani et al. (2013), i.e., accuracy over three trials. Figure 7 shows performance by condition and SES group using only the first three trials for each participant.

# Create Figure 7 - Accuracy Bar Chart (Original Analysis)
figure7 <- ggplot(original_perf, aes(x = ses_group, y = mean_acc, 
                                 fill = group)) +
  geom_bar(position="dodge", stat="identity") + # ADD COLOR-BLIND PALETTE
  geom_errorbar(aes(ymin = mean_acc - acc_ci, 
                    ymax = mean_acc + acc_ci), width=.2,
                position=position_dodge(.9)) +
  ggtitle("Figure 7: Accuracy by Group — Original Analysis Plan") +
  ylab("Mean Accuracy (%)") + xlab("SOCIOECONOMIC STATUS") +
  coord_cartesian(ylim=c(.50,1)) + 
  scale_fill_manual(values=c("#999999","#D3D3D3"))
figure7

accuracy_original_model <- aov(accuracy ~ group * ses_group, original_data)
summary(accuracy_original_model)

##                  Df Sum Sq Mean Sq F value Pr(>F)
## group             1  0.006 0.00586   0.162  0.688
## ses_group         1  0.034 0.03387   0.934  0.336
## group:ses_group   1  0.027 0.02686   0.741  0.391
## Residuals       131  4.748 0.03625

None of the effects in this analysis reached statistical significance (all P > 0.33).

Discussion

Summary of Replication Attempt

The original paper reported that a two-way analysis of variance (ANOVA) “revealed a robust interaction between income and condition”, with P < .001 for performance on the cognitive control task. My analysis did not reproduce this finding: my two-way ANOVA showed no significant interaction between income and condition [F(1,41) = 1.268, P = 0.267]. Figures 1 and 2 likewise suggest that no such group differences would emerge from either a t-test or a regression analysis of the kind the authors conducted. As a result, this experiment failed to replicate the results reported by Mani et al. (2013).

Commentary

The exploratory analysis suggests that the participants I classified as “poor” may not come from poor backgrounds as defined by the authors of the original study. For example, the spread of results in Figure X shows that many of the individuals who reported income below $10,000 are between the ages of 18 and 22 (the approximate age of college students). Although they may not earn income from full-time employment, this does not imply that they live in economically scarce conditions or that their current life is dependent on anything from school. As a result, I believe that the failure to replicate stems from recruitment criteria that were not stringent enough to target working-class adults who are financially dependent on their income. One way to examine this in the future is to restrict participation to individuals who are employed rather than students.

However, while the participant profile may have contributed to the results in this replication report, the ceiling effect observed in Figure 1 is also worth mentioning. Virtually all participants answered every trial correctly, and only a handful made any errors. In fact, the lowest accuracy in my data is 90%, which means that the least accurate participants missed only 4 of the 40 trials. This is substantially different from the approximately 66% average accuracy that the original study led me to anticipate for participants from low-income backgrounds in the hard condition. But such an average is more plausible given the original design of analyzing only 3 trials. Indeed, with only 3 responses, the accuracy of a single participant can take only one of four values: 0%, 33%, 67%, or 100%. In my study, a participant who missed two trials would have an accuracy of 95%; in the original study, missing two trials would result in an accuracy of 33%. So even if my participants did come from low-income backgrounds, such results might materialize only if I faithfully replicated the original analysis. Nevertheless, if the effect did replicate under these conditions, it would be worthwhile to examine how additional trials affect performance and whether they reduce any differences in accuracy over time.
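The arithmetic behind this comparison can be checked with a short sketch. The two helper functions below are hypothetical names introduced only for illustration; they are not part of my analysis script.

```r
# Hypothetical helpers for illustration only (not part of the analysis script).

# All accuracy values a participant can obtain when scored on n_trials trials
possible_accuracy <- function(n_trials) {
  (0:n_trials) / n_trials
}

# Accuracy after missing n_missed out of n_trials trials
accuracy_after_misses <- function(n_trials, n_missed) {
  (n_trials - n_missed) / n_trials
}

round(possible_accuracy(3) * 100)   # 0 33 67 100: only four possible scores
accuracy_after_misses(3, 2)         # two misses out of 3 trials: about 0.33
accuracy_after_misses(40, 2)        # two misses out of 40 trials: 0.95
accuracy_after_misses(40, 4)        # four misses out of 40 trials: 0.90
```

With 3 trials, a single error moves a participant by a third of the scale, whereas with 40 trials the same error moves them by 2.5 percentage points, so group means from the two designs are not directly comparable.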