Replication of Hasan et al. (2025, PNAS Nexus)

Author

Shashanka Subrahmanya (ssbrahma@stanford.edu)

Published

December 15, 2025

Introduction

Depression is a highly prevalent internalizing mental health disorder (see footnote 1) that is often under-detected and under-treated. A core symptom of depression is the presence of cognitive distortions that shape how individuals think about themselves and the world. In the current age of social media, where users consume algorithmically curated content, these distortions may be reinforced because users are selectively drawn towards cognitively distorted posts (negativity bias), raising the possibility that online environments can induce or amplify distorted thinking. However, a gold-standard psychological treatment for depression and related internalizing disorders, Cognitive Behavioral Therapy (CBT), is built on the same premise. CBT therapists typically use psychoeducation, modeling, and structured skills practice to help individuals identify, evaluate, and reframe distorted thoughts into more balanced alternatives. Recent work has extended these principles to “microinterventions” delivered via digital platforms, suggesting that brief, scalable CBT-inspired exercises, such as teaching users to recognize distorted language in social media content, could leverage the same algorithmic infrastructure that spreads harmful content to instead disseminate corrective, resilience-building material at scale.

With this context, Hasan et al. (2025) test whether a one-shot intervention could train individuals to detect cognitive distortions in social media posts and whether this training would change how they interacted with such content via likes and retweets. I intend to replicate the study’s primary objective of assessing how training influenced engagement with distorted versus nondistorted tweets, and the secondary objective of examining how recognition and interaction patterns varied across levels of depressive symptoms.

More details about this replication can be found at the associated GitHub repository, preregistration, and the hosted experiment.

Methods

Power Analysis

Hasan et al. (2025) show from a generalized mixed effects regression that there was a significant decrease in the liking and retweeting rates of distorted content when the interactions followed training (like: \(e^{\beta}\) = 0.553, 95% CI = [0.475–0.645]; retweet: \(e^{\beta}\) = 0.424, 95% CI = [0.343–0.524]). Based on a power analysis (estimating Cohen’s d from the odds ratios), we need a sample size of 149 for likes and 72 for retweets at \(\alpha = .05\).
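The conversion and sample-size calculation described above can be sketched as follows. This is a minimal illustration rather than the exact script used for the replication: it assumes the standard logit conversion \(d = \ln(\mathrm{OR}) \cdot \sqrt{3}/\pi\), a two-sided two-sample t-test, and 80% power (the target power is not stated above). The helper names `or_to_d` and `n_per_group` are hypothetical.

```r
# Convert an odds ratio to Cohen's d via the logit method:
# d = ln(OR) * sqrt(3) / pi
or_to_d <- function(or) log(or) * sqrt(3) / pi

# Odds ratios reported by Hasan et al. (2025) for distorted content after training
d_like    <- or_to_d(0.553)
d_retweet <- or_to_d(0.424)

# Required n per group for a two-sided, two-sample t-test.
# ASSUMPTION: power = 0.80; the text above only fixes alpha = .05.
n_per_group <- function(d, alpha = 0.05, power = 0.80) {
  res <- power.t.test(delta = abs(d), sd = 1, sig.level = alpha,
                      power = power, type = "two.sample",
                      alternative = "two.sided")
  ceiling(res$n)
}

n_like    <- n_per_group(d_like)
n_retweet <- n_per_group(d_retweet)
print(c(like = n_like, retweet = n_retweet))
```

Note that `power.t.test` reports the sample size per group. Under the assumptions above the results should land close to the figures quoted in the text; a different power target or test type would shift them.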

Planned Sample

In line with the original paper, we restrict the sample to participants who are within the US and above 18 years of age, divided approximately equally between the conditions.

Materials

Hasan et al. (2025) prompt a large language model (LLM) using OpenAI’s ChatGPT interface to generate two sets of sentiment-matched tweets: one set of 60 tweets with cognitively distorted content and another set of 60 tweets without distorted content. A sample prompt was “Now generate 10 more that contain cognitive distortions with similar sentence construction and sentiment.” A licensed clinical psychologist then evaluated these tweets, and some were modified so that they fell in the correct category. A random subset of 30 tweets from each category was selected for the experiment. Next, the authors annotated the sentiment of each tweet on a -1 to 1 scale using a sentiment analysis tool called VADER (Hutto & Gilbert, 2014). The tweets were further modified so that the composite sentiment scores for both sets of tweets had similar means (\(M_{\text{distorted}}\) = −.30 and \(M_{\text{non-distorted}}\) = −.15, t(58) = 1.45, p = 0.15) and similar distributions (Kolmogorov-Smirnov test with D(0) = .23, p = .39). This modification was again validated by a licensed clinical psychologist, and the resulting tweets were used for the experiment.
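The two sentiment-matching checks named above (a two-sample t-test on the means and a Kolmogorov-Smirnov test on the distributions) can be illustrated in base R. The scores below are simulated placeholders, not the actual VADER scores of the stimuli, and the equal-variance t-test is an assumption inferred from the reported 58 degrees of freedom for 30 + 30 tweets.

```r
set.seed(42)

# PLACEHOLDER data: 30 simulated compound sentiment scores per category,
# drawn inside the [-1, 1] range. These are NOT the real stimulus scores.
distorted     <- runif(30, min = -0.80, max = 0.20)
non_distorted <- runif(30, min = -0.65, max = 0.35)

# Compare means. ASSUMPTION: Student's t-test with pooled variance,
# which gives df = 30 + 30 - 2 = 58.
tt <- t.test(distorted, non_distorted, var.equal = TRUE)

# Compare whole distributions with a two-sample Kolmogorov-Smirnov test.
ks <- ks.test(distorted, non_distorted)

cat("t-test df:", unname(tt$parameter), " p =", tt$p.value, "\n")
cat("KS D:", unname(ks$statistic), " p =", ks$p.value, "\n")
```

Non-significant p-values on both tests would be read as the two sets being acceptably matched on sentiment, which is the logic the original authors applied.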

In the interest of time, we reuse the materials provided in Hasan et al. (2025).

Procedure

We follow precisely the same experimental procedure as Hasan et al. (2025).

Participants provided their informed consent before participating in the experiment. Their Twitter handle and demographic information were collected before the main task. Participants were assigned to one of two counterbalanced conditions, Interaction Before Training or Interaction After Training, to study the impact of training on interaction. The assigned condition determined the order in which they saw the interaction and training-identification blocks.

Fig. 1. Experimental procedure. Participants assigned to the “Interaction After Training” condition completed the training-identification block before the interaction block, and participants in the “Interaction Before Training” condition completed the blocks in the opposite order.

During the training-identification block, participants first learned about cognitive distortions using the previously described training method. Following this, in the identification block, they were presented with 30 randomly chosen tweets, counterbalanced across the two categories, and asked to evaluate the probability (on a scale of 0–100) that each tweet contained a distortion. Participants provided their probability judgment by typing an integer between 0 and 100 into a text box. No feedback was given on their judgments. After providing their judgment, participants could interact with the tweet using the “like” or “retweet” buttons as described for the interaction block, using an interface similar to Twitter. In the interaction block, participants were presented with tweets generated from the remaining 30 randomly sampled stimuli, also counterbalanced across the two categories.

After the main task, we collected participant data on their mental health and social media use. The full details about the questions including question text are in the Supplementary methods. At the end of the experiment, participants were provided with resources for mental health support.

Analysis Plan

We also follow the same analysis plan as Hasan et al. (2025).

We used a generalized mixed effects regression for our analysis. We tested for differences in (i) accuracy, (ii) liking, and (iii) retweeting. For the accuracy regression, we used behavioral data from the training-identification block. For the liking and retweeting regressions, we used the behavioral data from the interaction block. We treated subject ID as a random effect. Since every participant saw a different randomly sampled set of stimuli, we controlled for this by treating every stimulus as a random effect. The Depressive Symptoms score and the Twitter Use Score (TUS) were centered and standardized before being used in the model. The models were fit using the Bound Optimization BY Quadratic Approximation (BOBYQA) optimizer through the glmer function in the lme4 R package.
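Centering and standardizing a predictor means subtracting its mean and dividing by its standard deviation. A minimal sketch with made-up PHQ-9 totals follows; the values are illustrative, and it is an assumption that the `PHQ_9_aggregate_scaled` column used in the models was built this way.

```r
# ILLUSTRATIVE data: six hypothetical participants with PHQ-9 totals
demo <- data.frame(
  subject_ID      = 1:6,
  PHQ_9_aggregate = c(2, 5, 9, 12, 17, 21)
)

# scale() centers on the mean and divides by the sample SD;
# as.numeric() drops the matrix attributes it returns.
demo$PHQ_9_aggregate_scaled <- as.numeric(scale(demo$PHQ_9_aggregate))

# The same transformation written out by hand
manual <- (demo$PHQ_9_aggregate - mean(demo$PHQ_9_aggregate)) /
  sd(demo$PHQ_9_aggregate)

print(demo)
```

After this step a one-unit change in the predictor corresponds to one standard deviation, which makes coefficients for differently scaled questionnaires comparable.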

To determine the best model of each dependent variable, we conducted a nested model comparison using five models:

  1. base: y ∼ block order * is distorted
  2. depression severity (alone): y ∼ block order * is distorted + Depressive Symptoms
  3. twitter use score (alone): y ∼ block order * is distorted + Twitter Use Score
  4. independent effects: y ∼ block order * is distorted * Depressive Symptoms + block order * is distorted * Twitter Use Score
  5. full model: y ∼ block order * is distorted * Depressive Symptoms * Twitter Use Score

However, we focus only on the model responsible for the main result:

  1. independent effects: y ∼ block order * is distorted * Depressive Symptoms + block order * is distorted * Twitter Use Score
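A nested comparison of this kind could be run along the following lines. This is a sketch on simulated data, comparing only the base model with the depression-severity model; it assumes a likelihood-ratio test via `anova()`, which may differ from the criterion used in the original paper. All variable names and effect sizes in the simulation are hypothetical.

```r
library(lme4)

set.seed(1)
n_subj <- 40
n_stim <- 20

# SIMULATED design: every subject rates every stimulus
sim <- expand.grid(subject_ID = factor(seq_len(n_subj)),
                   stimulus   = factor(seq_len(n_stim)))
subj <- as.integer(sim$subject_ID)
stim <- as.integer(sim$stimulus)

sim$block_order  <- rep(c(0, 1), length.out = n_subj)[subj]  # between subjects
sim$is_distorted <- as.integer(stim <= n_stim / 2)           # between stimuli
sim$phq_scaled   <- as.numeric(scale(rnorm(n_subj)))[subj]   # standardized score

# Simulated log-odds of liking, with subject and stimulus random intercepts
eta <- -0.5 - 0.8 * sim$is_distorted +
  rnorm(n_subj, sd = 0.5)[subj] + rnorm(n_stim, sd = 0.5)[stim]
sim$liked <- rbinom(nrow(sim), size = 1, prob = plogis(eta))

ctrl <- glmerControl(optimizer = "bobyqa")

# Model 1 (base)
m_base <- glmer(liked ~ block_order * is_distorted +
                  (1 | stimulus) + (1 | subject_ID),
                data = sim, family = binomial, control = ctrl)

# Model 2 (depression severity alone)
m_dep <- glmer(liked ~ block_order * is_distorted + phq_scaled +
                 (1 | stimulus) + (1 | subject_ID),
               data = sim, family = binomial, control = ctrl)

# Likelihood-ratio test plus AIC/BIC for the two nested models
cmp <- anova(m_base, m_dep)
print(cmp)
```

The same pattern extends to the remaining models by passing them to `anova()` in order of increasing complexity.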

Differences from Original Study

To respect participants’ autonomy and right to choose, the post-experiment survey was made optional in this replication. As a result, the Twitter Use Score variable was not transformed as in the original paper, since a majority of the participants did not respond to the question.

Methods Addendum (Post Data Collection)

Actual Sample

Although the power analysis indicated a sample size of 149 for the liking effect, we collected data from only 105 participants to reduce the cost of this replication, with 53 participants interacting before training and 52 after training. A detailed visualization of the participant demographics is available in Fig. 2.

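# The code below displays the distributions of the main study variables;
# other variables are omitted for brevity.
library(dplyr)      # %>% pipe
library(ggplot2)    # plotting
library(patchwork)  # combining plots with `+` and `&`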

# Age
age <- agg_data %>%
  ggplot(aes(x = Age)) +
  geom_histogram(color = "black") +
  labs(
    y = "Frequency", 
    title = "Age")

# Gender
gender <- agg_data %>%
  ggplot(aes(x = Gender)) +
  geom_bar(color = "black") +
  labs(
    y = "Frequency",
    title = "Gender")


# Accuracy of identifying cognitive distortion
acc <- agg_data %>%
  ggplot(mapping = aes(x = acc)) +
  geom_histogram(
    color = "black", 
    breaks = seq(0, 1, 0.1)) +
  labs(
    y = "Frequency",
    title = "Accuracy")

# Depression symptoms
phq9 <- agg_data %>%
  ggplot(aes(x = PHQ_9_aggregate)) +
  geom_histogram(color = "black") +
  labs(
    y = "Frequency", 
    title = "Depression symptoms")

# Like rate
like <- agg_data %>%
  ggplot(aes(x = like_rate)) +
  geom_histogram(color = "black") +
  labs(
    y = "Frequency", 
    title = "Like rate")

# Retweet rate
retweet <- agg_data %>%
  ggplot(aes(x = retweet_rate)) +
  geom_histogram(color = "black") +
  labs(
    y = "Frequency", 
    title = "Retweet rate")

# Finally stitch everything into a nice plot
(age + gender + acc + phq9 + like + retweet) &
  plot_annotation(
    tag_levels = 'A',
    caption = "Fig. 2. Distributions of variables used in the replication") &
  theme_minimal() &
  theme(
    plot.title = element_text(hjust = 0.5, size = 16),
    plot.caption = element_text(hjust = 0.5, size = 14),
    axis.line = element_line(),
    axis.text = element_text(size = 12),
    axis.title = element_blank()
  )

Differences from pre-data collection methods plan

None.

Results

Data preparation

Data from Prolific were transformed into an analysis-ready version using the code shared by Hasan et al. (2025), available at analysis/clean_demographics_data.ipynb and analysis/clean_stimulus_data.ipynb (run in that order).

Confirmatory analysis

Liking and retweeting

Table 1a. Results of generalized mixed effects regression predicting accuracy, liking, and retweeting from the original paper
library(lme4)          # glmer, glmerControl
library(modelsummary)  # modelsummary

# for deterministic results
set.seed(1)

# Accuracy Model
model.MA <- glmer(
  is_correct ~ 1 + Age + Gender + 
    block_order_dummy * is_distorted * PHQ_9_aggregate_scaled + 
    block_order_dummy * is_distorted * Twitter_usage_composite_scaled + 
    (1 | stimulus) + (1 | subject_ID),
  data = identification_data, family = binomial,
  control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1000000)))

# Like Model
model.ML <- glmer(
  liked ~ 1 + Age + Gender + 
    block_order_dummy * is_distorted * PHQ_9_aggregate_scaled + 
    block_order_dummy * is_distorted * Twitter_usage_composite_scaled + 
    (1 | stimulus) + (1 | subject_ID),
  data = interaction_data, family = binomial,
  control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1000000)))

# Retweet Model
model.MR <- glmer(
  retweet ~ 1 + Age + Gender + 
    block_order_dummy * is_distorted * PHQ_9_aggregate_scaled + 
    block_order_dummy * is_distorted * Twitter_usage_composite_scaled + 
    (1 | stimulus) + (1 | subject_ID),
  data = interaction_data, family = binomial,
  control = glmerControl(optimizer = "bobyqa", optCtrl=list(maxfun=1000000)))

# Generate Table 1b
modelsummary(
  list(
    Accuracy = model.MA,
    Like = model.ML,
    Retweet  = model.MR),
  fmt = 2,
  estimate  = "{estimate}{stars} [{conf.low}, {conf.high}]",
  statistic = NULL,
  stars = TRUE,
  coef_map = name_mappings,
  title = "Table 1b. Results of generalized mixed effects regression predicting accuracy, liking, and retweeting, from Tweet distortion, user depression, training block, and their interactions (n= 838).",
  output = "kableExtra"
)
Table 1b. Results of generalized mixed effects regression predicting accuracy, liking, and retweeting, from Tweet distortion, user depression, training block, and their interactions (n= 838).

| Term | Accuracy | Like | Retweet |
|---|---|---|---|
| Intercept | 2.74*** [1.81, 3.66] | −1.47* [−2.76, −0.18] | −2.11* [−3.96, −0.27] |
| Woman | 0.45* [0.01, 0.88] | 0.59+ [−0.05, 1.24] | 0.05 [−0.89, 0.99] |
| Nonbinary | 0.58 [−1.66, 2.83] | 0.95 [−2.14, 4.05] | 1.58 [−2.61, 5.78] |
| Interaction after training | −0.32 [−0.85, 0.21] | −0.08 [−0.73, 0.57] | −0.43 [−1.39, 0.52] |
| Is distorted | −1.45*** [−2.01, −0.88] | −1.72*** [−2.37, −1.08] | −1.09** [−1.76, −0.41] |
| Depression severity | −1.29+ [−2.70, 0.12] | 1.19 [−0.59, 2.97] | 3.26* [0.71, 5.80] |
| Twitter Use Score | −2.54*** [−3.98, −1.10] | 0.61 [−1.35, 2.58] | 2.59+ [−0.08, 5.27] |
| Interaction after training:Is distorted | 0.23 [−0.23, 0.70] | −0.17 [−0.65, 0.31] | −0.19 [−0.87, 0.49] |
| Interaction after training:Depression severity | 2.70** [0.82, 4.57] | −1.43 [−3.77, 0.92] | −2.18 [−5.52, 1.16] |
| Is distorted:Depression severity | 1.91** [0.69, 3.12] | 2.61*** [1.37, 3.86] | 1.57+ [−0.01, 3.14] |
| Interaction after training:Twitter Use Score | 1.90 [−0.40, 4.19] | 3.86* [0.91, 6.82] | 1.81 [−2.26, 5.87] |
| Is distorted:Twitter Use Score | 1.32* [0.16, 2.48] | 1.69** [0.47, 2.90] | 0.68 [−0.67, 2.03] |
| Interaction after training:Is distorted:Depression severity | −3.37*** [−4.99, −1.75] | −1.10 [−2.70, 0.50] | −1.07 [−3.14, 1.00] |
| Interaction after training:Is distorted:Twitter Use Score | −0.07 [−2.08, 1.94] | −2.42* [−4.43, −0.40] | −1.83 [−4.24, 0.58] |
| Num.Obs. | 2850 | 2850 | 2850 |
| R2 Marg. | 0.124 | 0.197 | 0.188 |
| R2 Cond. | 0.406 | 0.596 | 0.671 |
| AIC | 2383.8 | 2441.4 | 1625.2 |
| BIC | 2485.0 | 2542.7 | 1726.5 |
| ICC | 0.3 | 0.5 | 0.6 |
| RMSE | 0.33 | 0.33 | 0.26 |
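Because these are logistic models, the estimates in Table 1b are on the log-odds scale, whereas the effects quoted from the original paper in the Power Analysis section are odds ratios (\(e^{\beta}\)). The sketch below exponentiates the “Is distorted” estimates and their confidence limits, copied from the Like and Retweet columns of the table; the object names are illustrative.

```r
# "Is distorted" estimates and 95% CI limits from Table 1b (log-odds scale)
estimate <- c(like = -1.72, retweet = -1.09)
ci_low   <- c(like = -2.37, retweet = -1.76)
ci_high  <- c(like = -1.08, retweet = -0.41)

# Exponentiate to obtain odds ratios with their confidence limits
or_table <- round(exp(cbind(OR = estimate, low = ci_low, high = ci_high)), 2)
print(or_table)
```

An odds ratio below 1 indicates lower odds of engagement for distorted tweets. Because the model includes interactions, this coefficient is the effect of distortion at the reference level of block order and at a value of zero on the scaled covariates, so it is not directly comparable to the training effect reported in the original paper.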

Training individuals to detect cognitive distortions

As in Hasan et al. (2025), participants were trained to detect cognitive distortions in tweets using the materials released with the original paper, which included a document of fewer than 250 words and a collection of 30 tweets.

I could replicate

Fig. 2a. Spearman correlations between the different measures of interest from the original paper
columns <- c(
  "Depressive Symptoms" = "PHQ_9_aggregate",
  "Twitter Use Score" = "Twitter_usage_composite",
  "Like Rate" = "like_rate",
  "Retweet Rate" = "retweet_rate",
  "Liked Ratio \nDistorted/Total-Liked" = "likedr_dist_nondist",
  "Retweet Ratio \nDistorted/Total-Retweeted" = "retweetr_dist_nondist",
  "Accuracy" = "acc",
  "Slope" = "slope"
)

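# get the cross-correlations between the above columns
# all_of() selects (and renames) columns from the named character vector
correlations <- agg_data %>%
  select(all_of(columns)) %>%
  cor(method = "spearman", use = "pairwise.complete.obs")

# matching matrix of p-values for the significance markers
significance <- agg_data %>%
  select(all_of(columns)) %>%
  cor_pmat(method = "spearman")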
  
correlations %>%
  ggcorrplot(
    type = "upper", 
    lab = TRUE,
    p.mat = significance,
    sig.level = c(0.05, 0.01, 0.001)) +
  labs(
    caption = "Fig. 2b. Spearman correlations between the main variables of interest in the replication sample") +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 16),
    plot.caption = element_text(hjust = 0.5, size = 14),
    axis.text = element_text(size = 12),
    axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
    axis.title = element_blank()
  )

Impact of intervention

Fig. 3a. The impact of cognitive distortion psychoeducation on liking and retweeting distorted and nondistorted content from the original paper
df <- interaction_data %>%
  mutate(is_distorted = ifelse(is_distorted == "True", "Distorted", "Undistorted")) %>%
  select(
    is_distorted, 
    block_order, 
    stimulus,
    Liked = liked,
    Retweeted = retweet) %>%
  pivot_longer(
    cols = c(Liked, Retweeted),
    names_to = "interaction",
    values_to = "value") %>%
  group_by(is_distorted, block_order, stimulus, interaction) %>%
  summarise(value = mean(value, na.rm = TRUE))

# plot
df %>%
  ggplot(
    aes(
      x = block_order, 
      y = value, 
      group = stimulus)) +
  facet_grid(interaction ~ is_distorted, scales = "free") +
  geom_line(
    alpha = 0.2, 
    aes(color = is_distorted)) +
  stat_summary(
    fun = mean, 
    geom = "line", 
    aes(color = is_distorted, group = 1),
    size = 1.2,
    alpha = 0.9) +  # mean line
  labs(
    x = "Order of training and interaction",
    y = "Rate",
    color = "Is distorted",
    caption = "Fig. 3b. Impact of intervention on Like and Retweet rates in the replication sample"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 16),
    plot.caption = element_text(hjust = 0.5, size = 14),
    axis.line = element_line(),
    axis.text = element_text(size = 12),
    axis.title = element_text(size = 14),
    axis.title.x = element_blank(),
    strip.text = element_text(size = 14),
    legend.position = "none"
  )

Exploratory analyses

Any follow-up analyses desired (not required).

Discussion

Summary of Replication Attempt

The confirmatory analyses replicated the core qualitative pattern in which distorted tweets were liked and retweeted less often than nondistorted tweets, as shown in Fig. 3.

In the generalized mixed-effects regression, the interaction between distortion and depression severity remained positive for both liking and retweeting, indicating that participants with higher depression severity showed a stronger tendency to engage with distorted (relative to nondistorted) content. The interaction between training block order and distortion, although small in effect size and not statistically significant, was directionally consistent with a reduction in engagement with distorted tweets after training.

Overall, the results support classifying the replication as successful, at least in terms of the primary claims, even though the precision of the estimates is lower and some interactions are less stable than in the original paper.

Commentary

Compared to Hasan et al. (2025), this replication used a substantially smaller Prolific sample, which reduced statistical power and likely contributed to wider confidence intervals and fewer significant training effects. Remaining differences in the outcomes, despite the reuse of the original stimuli, procedure, and analysis code, can be attributed to platform differences. Together, these factors help explain why the main qualitative patterns replicated while some higher-order interactions appeared weaker or less stable than in the original paper.

References

Hasan, E., Epping, G., Lorenzo-Luaces, L., Bollen, J., & Trueblood, J. S. (2025). One-shot intervention reduces online engagement with distorted content. PNAS Nexus, 4(3), pgaf068. https://doi.org/10.1093/pnasnexus/pgaf068
Hutto, C., & Gilbert, E. (2014). VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. Proceedings of the International AAAI Conference on Web and Social Media, 8(1), 216–225. https://doi.org/10.1609/icwsm.v8i1.14550

Footnotes

  1. https://www.who.int/news-room/fact-sheets/detail/depression