We aim to analyze the effect sizes of experiments using the Linda Problem paradigm, introduced by Tversky and Kahneman in their seminal 1983 paper.
The following PRISMA diagram summarizes the paper selection process.
# my_prisma_plot <- prisma2(
#   found = 432,                # Unique papers found through database searches (Google Scholar and JSTOR)
#   found_other = 0,            # Papers found through other sources
#   screened = 432,             # Papers screened by scanning the title and abstract
#   screen_exclusions = 295,    # Of those screened, papers excluded
#   full_text = 127,            # Of the remainder, papers screened by reading the full text
#   full_text_exclusions = 75,  # Of those read in full, papers excluded
#   quantitative = 52,          # Final count of unique papers in this meta-analysis
#   width = 800, height = 800)
#
# my_prisma_plot
The forest plot below displays the effect size of each experiment from the 52 unique papers we included.
Read in data
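The data read-in code is not shown here; a minimal sketch of the step, assuming the coded effect sizes live in a CSV (the path, column layout, and use of readr are assumptions, not confirmed by this document):

library(metafor)   # rma(), forest(), funnel()
library(tidyverse) # read_csv() and the dplyr/ggplot2 calls below
# Hypothetical path; each row is one experiment with its counts and moderator codes
ma_data <- read_csv("data/ma_data.csv")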
ma_model <- rma(data = ma_data, measure = "PLO", xi = n_at_least_one_error, ni = n_1)
# measure = "PLO" converts each proportion into a logit-transformed proportion (log odds)
ma_model
##
## Random-Effects Model (k = 200; tau^2 estimator: REML)
##
## tau^2 (estimated amount of total heterogeneity): 1.1895 (SE = 0.1354)
## tau (square root of estimated tau^2 value): 1.0906
## I^2 (total heterogeneity / total variability): 95.05%
## H^2 (total variability / sampling variability): 20.22
##
## Test for Heterogeneity:
## Q(df = 199) = 2220.8108, p-val < .0001
##
## Model Results:
##
## estimate se zval pval ci.lb ci.ub
## 0.4270 0.0826 5.1702 <.0001 0.2652 0.5889 ***
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# pdf("plots/forest_plot.pdf", height = 35, width = 13.5)
forest(ma_model,
       header = TRUE,
       slab = ma_data$short_cite,
       col = "red",
       cex = .6,
       xlab = "Effect Size",
       top = 0)
First, the proportion of participants who committed at least one conjunction fallacy in each experiment was converted to an effect size. These effect sizes were scaled using the logit-transformed-proportion method and then plotted.
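For concreteness, metafor's escalc() applies the same "PLO" transform as the rma() call above: a proportion p = xi/ni becomes log(p / (1 - p)) with sampling variance 1/xi + 1/(ni - xi). A minimal sketch with made-up counts:

# Hypothetical counts: 30 of 50 participants committed at least one fallacy
escalc(measure = "PLO", xi = 30, ni = 50)
# yi = log(30/20) ~ 0.405; vi = 1/30 + 1/20 ~ 0.083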
Each row represents one effect size and its corresponding 95% confidence interval. The size of each plotted square is proportional to the weight the study receives in the meta-analysis, which generally increases with sample size: larger samples yield larger squares.
We see a small, positive meta-analytic effect size of 0.43. There are no obvious outliers in this dataset; the smallest plotted effect size is -2.44 and the largest is 4.80. There is a curious alignment of three identical effect sizes of -2.40 from Wells (1985). Revisiting the paper confirmed that these effect sizes are calculated correctly; the homogeneity is likely due to small sample sizes.
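To make the log-odds estimate easier to interpret, it can be back-transformed to the proportion scale with metafor's inverse-logit helper:

predict(ma_model, transf = transf.ilogit)
# pred ~ 0.605: roughly 60% of participants commit at least one conjunction fallacy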
This positive meta-analytic effect size suggests that the conjunction fallacy does indeed occur. Below, possible moderators of this effect are included and analyzed.
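Before turning to moderators, we examine publication bias. The code for the funnel plot described next is not shown in this chunk; it was presumably produced with metafor's funnel(), along these lines:

# pdf("plots/funnel_plot.pdf", height = 6, width = 6)
funnel(ma_model, xlab = "Effect Size")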
This plot displays the relationship between each effect size and its standard error. The white triangular region corresponds to a 95% pseudo-confidence region around the meta-analytic effect size (0.43), which is marked with a vertical line.
Overall, we observe an asymmetrical distribution around the meta-analytic effect size. Below a standard error of 0.721, the distribution appears symmetrical, with points falling roughly evenly inside and outside the 95% confidence region.
Above a standard error of 0.721, only positive effect sizes are reported, and they tend to lie outside the 95% confidence region around the meta-analytic effect size.
Because a smaller standard error signals a larger sample size, this pattern suggests that experiments with small samples and positive effect sizes tend to publish their results, while those with small samples and negative effect sizes tend to forgo publishing their findings. Experiments with large samples appear to publish regardless of the sign of the effect size.
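This visual impression of small-study asymmetry can be checked formally. One standard option (not reported in this document, so treat the sketch as illustrative) is Egger's regression test, implemented in metafor as regtest():

regtest(ma_model)
# a significant test statistic is consistent with the funnel-plot asymmetry described above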
cis_by_linda <- ma_data %>%
group_by(linda_problem) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/linda_problem_yes_no.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = linda_problem,
y = d_calc,
color = linda_problem)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Linda Problem") +
ggtitle("Effect Size by Linda Problem") +
geom_pointrange(data = cis_by_linda,
aes(x = linda_problem,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
When plotting the effect sizes of experiments that used the exact wording of the original Linda Problem alongside the effect sizes of those that used other conjunction-fallacy problems, we see substantial overlap in both the effect sizes and their confidence intervals.
This suggests that the original effect found by Tversky and Kahneman in their 1983 seminal paper was not solely due to the wording of the original Linda Problem, but rather stems from an underlying susceptibility to the much broader conjunction fallacy.
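Beyond the visual comparison, rma() accepts a mods argument that turns the random-effects model into a mixed-effects meta-regression, giving a formal test of the moderator. A sketch using the same columns as above:

ma_model_linda <- rma(data = ma_data, measure = "PLO",
                      xi = n_at_least_one_error, ni = n_1,
                      mods = ~ linda_problem)
ma_model_linda # the QM statistic tests whether the moderator explains heterogeneity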
cis_by_text <- ma_data %>%
group_by(text_format) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/text_format.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = text_format,
y = d_calc,
color = text_format)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Text Format") +
ggtitle("Effect Size by Text Format") +
geom_pointrange(data = cis_by_text,
aes(x = text_format,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
Text format corresponds to the medium through which participants accessed the experimental materials. There were three possibilities: booklet, computer, and missing. Booklet was recorded if the materials were presented on physical paper, computer if they were accessed digitally, and missing if the text format was not reported.
We observe that the mean effect sizes and confidence intervals for each text format largely overlap, suggesting that text format has a limited effect on whether participants commit the conjunction fallacy.
cis_by_computer <- ma_data %>%
group_by(computer_medium) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/computer_online_vs_in-lab.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = computer_medium,
y = d_calc,
color = computer_medium)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Computer Medium") +
ggtitle("Effect Size by Computer Medium") +
geom_pointrange(data = cis_by_computer,
aes(x = computer_medium,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
Computer medium is a moderator coded only for experiments that presented materials to participants through a digital format. In-lab corresponds to experiments conducted on a lab device, and online corresponds to experiments conducted on a personal electronic device.
Because in-lab experiments typically involve more interaction with a human experimenter, we were curious whether this interaction might influence the effect size.
We see a larger mean effect size for experiments performed in the lab than for experiments performed on a personal electronic device with no direct human interaction.
When an experimenter is present, some verbal exchange between experimenter and participant is likely, if only because social cues invite it. This exchange may include task-relevant instructions, which could influence participants' responses.
cis_by_participant_status <- ma_data %>%
group_by(naive_or_informed) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/naive_or_informed.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = naive_or_informed,
y = d_calc,
color = naive_or_informed)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Participant Status") +
ggtitle("Effect Size by Level of Participant Knowledge") +
geom_pointrange(data = cis_by_participant_status,
aes(x = naive_or_informed,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 15) +
theme(legend.position = "none")
This moderator corresponds to participants' prior statistical knowledge. The key to avoiding a conjunction fallacy is knowledge of probability: two conditions holding at the same time can never be more likely than either condition holding alone. We therefore reasoned that prior statistical knowledge might prevent a participant from committing a conjunction fallacy.
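To spell out the rule: for any two events, P(A and B) = P(A) * P(B|A) <= P(A). A toy calculation with made-up probabilities (purely illustrative, not data from this meta-analysis):

p_teller <- 0.05                 # P(Linda is a bank teller)
p_feminist_given_teller <- 0.30  # P(active feminist | bank teller)
p_teller * p_feminist_given_teller # 0.015, necessarily <= 0.05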
We observe that the mean effect sizes and confidence intervals for naive and informed participants largely overlap, suggesting that prior statistical knowledge has a limited effect on whether participants commit the conjunction fallacy.
cis_by_stimuli_relevance <- ma_data %>%
group_by(stimuli_context_specific) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/stimuli_context_specific.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = stimuli_context_specific,
y = d_calc,
color = stimuli_context_specific)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Stimuli Context Specific") +
ggtitle("Effect Size by Stimuli Relevance") +
geom_pointrange(data = cis_by_stimuli_relevance,
aes(x = stimuli_context_specific,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
cis_by_testing_type <- ma_data %>%
group_by(group_or_individual_testing) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/group_or_individual.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = group_or_individual_testing,
y = d_calc,
color = group_or_individual_testing)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Group or Individual") +
ggtitle("Effect Size by Testing Type") +
geom_pointrange(data = cis_by_testing_type,
aes(x = group_or_individual_testing,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
In this moderator, 'individual' corresponds to a participant who was tested individually, and 'group' corresponds to participants who were given experimental materials or instructions collectively. Experiments conducted in groups had participants either fill out the final answers individually or complete them collaboratively.
cis_by_group_size <- ma_data %>%
group_by(group_size) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/group_size.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = group_size,
y = d_calc,
color = group_size)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Group Size") +
ggtitle("Effect Size by Group Size") +
geom_pointrange(data = cis_by_group_size,
aes(x = group_size,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
cis_by_language <- ma_data %>%
group_by(language) %>%
summarize(mean = mean(d_calc),
sd = sd(d_calc),
n = n()) %>%
mutate(ci_range_95 = 1.96 * (sd/sqrt(n)),
ci_lower = mean - ci_range_95,
ci_upper = mean + ci_range_95)
# pdf("plots/language.pdf", height = 6, width = 6)
ggplot(ma_data, aes(x = language,
y = d_calc,
color = language)) +
geom_violin() +
geom_point(alpha = .4) +
ylab("Effect Size") +
xlab("Group or Individual") +
ggtitle("Effect Size by Testing Type") +
geom_pointrange(data = cis_by_language,
aes(x = language,
y = mean, ymin = ci_lower,
ymax = ci_upper),
color = "black") +
geom_hline(aes(yintercept = 0), linetype = 2) +
theme_classic(base_size = 16) +
theme(legend.position = "none")
## Warning: Removed 2 rows containing missing values (geom_segment).