Reproducibility Report for Uncovering the structure of self-regulation through data-driven ontology discover by Eisenberg et al. (2019, Nature Communication)

Author

Mia Jimenez-Fuentes (miafuen@stanford.edu)

Published

December 14, 2025

Introduction

In this report I attempt to reproduce 2 key findings presented in Uncovering the structure of self-regulation through data-driven ontology discover, where Eisenberg et al. (2019) explore the idea of defining and measuring the concept of self-regulation. The overarching motivation for this paper was to explore the inconsistency in the way that concepts like self-regulation are defined in the literature. Given the various ways that researchers define and measure self-regulation has led to theoretical fragmentation in the field. The key analyses of interest here are factor anlalyses which identify latent dimensions underlying nearly 200 task and survey variables, and the clustering analyses, which map how those measures group together.

Justification for choice of study

In Uncovering the structure of self-regulation through data-driven ontology discover, Eisenberg et al. (2019) provides a compelling example of how large-scale, complicated, data-driven methods can be used to clarify the meaning and measurement of psychological constructs that are otherwise incredibly difficult to pin down. In this paper, the authors explore the idea of self-regulation. This concept has been tied to various outcomes ranging from health to education, but the field has consistently debated how to measure it and, more broadly, if it is a coherent construct at all. This idea comes up frequently in discussions around constructs like these, especially so in conversations around skills of executive functions. This paper tackles this issue by integrating a broad battery of tasks, surveys, and real-world outcomes, offering a unique opportunity to examine how different measures related to one another.

Reproducing the results of this study will be valuable for several reasons. To begin, the authors’ methods are broad and complex - involving large-scale data collection, exploratory factor analysis, clustering, and predictive modeling. Reproducing their approach offers an opportunity to practice key skills in dimensional reduction, psychometrics, and prediction. Furthermore, this paper raises important conceptual issues about the reliability and validity of common measures in psychology - issues central to my own developing research interests. Overall, this work is methodologically complex and incredibly theoretically relevant. It will allow me to deepen my understanding of advanced data techniques while also informing my ongoing exploration about the measurement and meaning of related constructs in research.

Anticipated challenges

One challenge I anticipate surrounds the complexity of the analytic pipeline - the authors rely on several advanced methods and small deviations in parameter choices or implementation could lead to different results. Secondly, the dataset includes nearly 200 independent variables derived from the survey and behavioral tasks and reproducing the results will require careful preprocessing to match the authors’ original approach.

Links

The following links are for the project repository on Github, the original paper (which lives in the Github folder), as well as the link to the OSF preregistration of the project. * Project repository: https://github.com/psych251/eisenberg2019 * Original paper: https://github.com/psych251/eisenberg2019/blob/main/original_paper/eisenberg2019pdf.pdf * Preregistration link: https://osf.io/ecbvf/overview

Methods

Description of the steps required to reproduce the results

In order to reproduce the findings from Eisenberg 2019, there are a few key steps. First, I will need to organize all data acess and setup steps. This means, identifying the data used in the original study. The processed data file is called meaningful_varialbes_imputed.csv where the authors have done some preprocessing already to the data. From here, I will confirm descriptives about the data - that there are the right number of variables, participants, and that the data matches the file used in the original paper. Furthermore, in order to have a good foundation for the methods used and decisions made, I will identify and explore the original scripts- those within the dimensional_structure and graph_analysis folders will be most relevenat. Second, I will reproriduce Figure 2 which is their dimensional structure plot. To do this, I will compute an inter-variable correlation matrix and run exploratory factor analysis using extraction and rotation methods. Using parallel analysis, I’ll then determine the dimensional structure and plot. Finally, I will reproduce figures 3 and 4, the ontological graphs. For this, I will construct network representaiotn using the correlation matrix, apply graph-based cluserting and hierarchical grouping, and visualize the structures.

Differences from original study

The main differences from the original study surround the computing environment. The original authors use a mix of Python and R, whereas here I will stick to R for my computing. Furthermore, original imputation was not reproduced as part of this report. Instead, I am starting from the authors’ imputed dataset.

Project Progress Check 1

Measure of success

The primary success criterion is whether I can produce the two figures – specifically the sparse partial-correlation network with a clearly separated task vs survey DV structure and the subsequent factor/cluster based ontology which involves extracting latent dimensions of hierarchical communities from the space constructed in figure 2. Both outcomes will be evaluated by structure/ visual alignment with the original paper and quantitatively (by checking counts of things like dependent variables (DVs) and subjects).

Pipeline progress

All findings have been roughly reproduced! In terms of data access and setup, I have identified and loaded the author’s finalized imputed data, confirmed participant count and dependent variable structure are consistent with the original report. In terms of figure 2, I have successfully computed the partial correlation network, confirmed dependent variable count and split, and have plotted. Originally, the dependent variable split was incorrect and I found that the authors used grepl (the method I use below) to identify the surveys, but then in a later, more processed file of the data have mislabled 5 surveys as tasks. I have gone ahead and followed their labeling. Finally, I have also roughly reproduced figures 3 and 4.

Results

Data preparation

Data preparation following the analysis plan.

Key analysis

Figure 2 – connectivity matrix

set.seed(2025)

# connectivity matrix 
cor_mat <- cor(df, use = "pairwise.complete.obs", method = "spearman") # base correlation
n_subj <- nrow(df)
adj_mat <- EBICglasso(S = cor_mat, n = nrow(df), gamma = 0.5) # partial correlation adjacency matrix

# threshold edge list 
Adj_thr <- adj_mat
Adj_thr[abs(Adj_thr) < threshold_abs] <- 0
# cat("→ Retained edges ≥", threshold_abs, ":", 
#     sum(Adj_thr[upper.tri(Adj_thr)] != 0), "\n")

qg_unthr <- qgraph(
  adj_mat,
  layout = "spring",
  DoNotPlot = TRUE,
  vsize = 3
)

layout_matrix <- qg_unthr$layout

layout_matrix[is_survey_paper_match, 2] <- layout_matrix[is_survey_paper_match, 2] + 0.8
layout_matrix[is_task_paper_match, 2] <- layout_matrix[is_task_paper_match, 2] - 0.8

node_colors <- ifelse(is_survey_paper_match, "indianred", "steelblue")
edge_col <- adjustcolor("gray", alpha.f = 0.15)

png(
  file.path(out_dir_fig2, "figure2unlabeled.png"),
  width = 1400, height = 1800, res = 220
)

par(mar = c(1, 1, 1, 6))
qgraph(
  Adj_thr,
  layout = layout_matrix,
  color = node_colors,
  labels = FALSE,
  legend = FALSE,
  vsize = 3,
  border.width = 0,
  edge.color = edge_col,
  posCol = edge_col,
  negCol = edge_col,
  fade = TRUE,
  title = "EBICglasso Network (|partial r| ≥ 0.05)"
)

# finding coordiantes/ anchors for the text annotations 
xr <- range(layout_matrix[, 1])
yr <- range(layout_matrix[, 2])
right_edge <- max(layout_matrix[, 1])
x_anno <- right_edge - 1.5
y_mid <- mean(yr)
y_task <- y_mid + 0.03 * diff(yr)
y_survey <- y_mid - 0.03 * diff(yr)

text(
  x_anno, y_task,
  labels = paste0(n_task_paper_match, " Task DVs"),
  col = "steelblue", cex = 1.5, adj = 0
)
text(
  x_anno, y_survey,
  labels = paste0(n_survey_paper_match, " Survey DVs"),
  col = "indianred", cex = 1.5, adj = 0
)

dev.off()

quartz_off_screen 
                2

original_fig2 <- here("results", "figure2", "figure2original.png")
reproduced_fig2 <- here("results", "figure2", "figure2unlabeled.png")

original_fig2 <- image_read(original_fig2)
reproduced_fig2 <- image_read(reproduced_fig2)

# viewing side by side 
image_append(c(original_fig2, reproduced_fig2))

# spearman correlation matrix
cor_file <- file.path(out_dir_fig2, "correlation_matrix_spearman.csv")
# write.csv(cor_mat, cor_file, row.names = TRUE)

# thresholded edge list - long-format (source, target, weight)
edges_df <- which(Adj_thr != 0, arr.ind = TRUE)
edges_export <- data.frame(
  Var1 = rownames(Adj_thr)[edges_df[,1]],
  Var2 = colnames(Adj_thr)[edges_df[,2]],
  Weight = Adj_thr[edges_df]
)
edges_export <- edges_export[edges_df[,1] < edges_df[,2], ]
edges_file <- file.path(out_dir_fig2, "edges_glasso_abs_ge_0.05.csv")
# write.csv(edges_export, edges_file, row.names = FALSE)

# full unthresholded adjacency matrix
adj_full_file <- file.path(out_dir_fig2, "adjacency_glasso_unthresholded.csv")
# write.csv(adj_mat, adj_full_file, row.names = TRUE)

Figure 3 setup

# justifying 12 factors 

# parallel analysis 
# fa.parallel(df_survey, fa = "fa", fm = "minres", main = "Survey Parallel Analysis")
# says 12! 

# BIC across candidate factor counts
k_seq <- 1:20
bic_vals_survey <- sapply(k_seq, function(k) {
  fa(df_survey_paper_match, nfactors = k, fm = "minres", rotate = "oblimin")$BIC
})

plot(k_seq, bic_vals_survey, type = "b",
     xlab = "Number of factors", ylab = "BIC",
     main = "Survey BIC across factor solutions")
abline(v = 12, col = "red", lty = 2)

Rough reproduction of Fig 3!

df_survey <- df_survey_paper_match

# 12 factor EFA
efa_survey <- fa(
  df_survey,
  nfactors = 12,
  fm = "minres",
  rotate = "oblimin"
)

L_raw <- as.matrix(efa_survey$loadings)
dv_names <- rownames(L_raw)

# cluster dvvs in factor-loading space using dynamictreecut like authors did 
dist_survey <- dist(L_raw)
hc_survey <- hclust(dist_survey, method = "ward.D2")

clusters <- cutreeDynamic(
  dendro = hc_survey,
  distM = as.matrix(dist_survey),
  deepSplit = 2,
  pamRespectsDendro = FALSE,
  minClusterSize = 3
)

 ..cutHeight not given, setting it to 2.89  ===>  99% of the (truncated) height range in dendro.
 ..done.

clusters <- as.numeric(clusters)

valid <- clusters > 0
clusters <- clusters[valid]
L_raw <- L_raw[valid, ]
dv_names <- dv_names[valid]

# reorder by dendogram 
ord <- hc_survey$order
L_ord <- L_raw[ord, ]
clusters_ord <- clusters[ord]
dv_names_ord <- dv_names[ord]

# relabel clusters sequentially
cluster_levels <- unique(clusters_ord)
cluster_map <- setNames(seq_along(cluster_levels), cluster_levels)
clusters_ord <- cluster_map[as.character(clusters_ord)]

# dominant factor loading and cluster mapping 
max_loading_factor <- apply(L_ord, 1, function(x) which.max(abs(x)))
names(max_loading_factor) <- dv_names_ord

# paper factor names (fixed order based on plot)
paper_factor_names <- c(
  "Financial risk-taking",
  "Eating control",
  "Reward sensitivity",
  "Emotional control",
  "Mindfulness",
  "Impulsivity",
  "Goal-directedness",
  "Ethical/health risk-taking",
  "Risk perception",
  "Sensation seeking",
  "Sociability",
  "Agreeableness"
)

names(paper_factor_names) <- paper_factor_names

factor_to_cluster_label <- c(
  "Financial risk-taking" = "Financial Risk-Taking",
  "Eating control" = "Eating",
  "Reward sensitivity" = "Reward Sensitivity",
  "Emotional control" = "Behavioral Inhibition",
  "Mindfulness" = "Mindfulness",
  "Impulsivity" = "Impulsivity",
  "Goal-directedness" = "Goal-Directedness",
  "Ethical/health risk-taking" = "Ethical/Health Risk-Taking",
  "Risk perception" = "Risk Perception",
  "Sensation seeking" = "Sensation Seeking",
  "Sociability" = "Sociability",
  "Agreeableness" = NA_character_
)

cluster_ids <- sort(unique(clusters_ord))

cluster_label_map <- setNames(
  sapply(cluster_ids, function(cl) {
    dvs <- dv_names_ord[clusters_ord == cl]
    dom_factor_idx <- names(
      sort(table(max_loading_factor[dvs]), decreasing = TRUE)
    )[1]
    dom_factor_name <- names(paper_factor_names)[as.numeric(dom_factor_idx)]
    factor_to_cluster_label[[dom_factor_name]]
  }),
  cluster_ids
)

# heatmap matrix
heat_input <- t(L_ord)
rownames(heat_input) <- paper_factor_names
colnames(heat_input) <- dv_names_ord

# dendogram colored by clusters
dend <- as.dendrogram(hc_survey)
pal <- brewer.pal(length(unique(clusters_ord)), "Set3")

dend_colored <- color_branches(
  dend,
  clusters = clusters_ord,
  col = pal
)

# boundatries for clusters 
cluster_breaks <- which(diff(clusters_ord) != 0)

# heatmap
col_fun <- colorRamp2(c(-1, 0, 1), c("steelblue", "white", "indianred"))

png(
  file.path(out_dir_fig3, "figure3reproduction.png"),
  width = 4800, height = 2000, res = 220
)

bottom_space <- HeatmapAnnotation(
  spacer = anno_empty(height = unit(12, "mm")),
  show_annotation_name = FALSE,
  border = FALSE
)

ht <- Heatmap(
  heat_input,
  name = "Factor loading",
  col = col_fun,
  cluster_rows = FALSE,
  cluster_columns = dend_colored,
  show_row_names = TRUE,
  show_column_names = FALSE,
  bottom_annotation = bottom_space,
  row_names_gp = gpar(fontsize = 10),
  heatmap_legend_param = list(
    at = c(-1, 0, 1),
    labels = c("-1", "0", "1")
  )
)

ht_drawn <- draw(
  ht,
  heatmap_legend_side = "right",
  padding = unit(c(5, 5, 25, 5), "mm")
)

# cluster labels
# dashed cluster separators
decorate_annotation("spacer", {
  r <- rle(clusters_ord)
  ends <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  mids <- (starts + ends) / 2
  n <- length(clusters_ord)

  for (i in seq_along(r$values)) {
    cl <- r$values[i]
    lab <- cluster_label_map[as.character(cl)]
    col <- pal[cl]
    if (is.na(lab)) next
    
    x0 <- (starts[i] - 1) / n
    x1 <- ends[i] / n
    xm <- (mids[i] - 0.5) / n
    # bracket
    grid.lines(
      x = unit(c(x0, x1), "npc"),
      y = unit(0.65, "npc"),
      gp = gpar(col = col, lwd = 2)
    )
    # rotated label
    grid.text(
      lab,
      x = unit(xm, "npc"),
      y = unit(0.25, "npc"),
      rot = 15,
      just = "right",
      gp = gpar(col = col, fontsize = 9)
    )
  }
})

dev.off()

quartz_off_screen 
                2

original_fig3 <- here("results", "figure3", "fig3_original.png")
reproduced_fig3 <- here("results", "figure3", "figure3reproduction.png")

original_fig3 <- image_read(original_fig3)
reproduced_fig3 <- image_read(reproduced_fig3)

# viewing side by side 
image_append(c(original_fig3, reproduced_fig3))

Figure 4 setup

Rough reproduction of Fig 4!

df_task <- df_task_paper_match 

# 5 factor EFA
efa_task <- fa(
  df_task,
  nfactors = 5,
  fm = "minres",
  rotate = "oblimin"
)

L_raw <- as.matrix(efa_task$loadings)
dv_names <- rownames(L_raw)

# cluster dvvs in factor-loading space using dynamictreecut like authors did 
dist_task <- dist(L_raw)
hc_task <- hclust(dist_task, method = "ward.D2")

clusters <- cutreeDynamic(
  dendro = hc_task,
  distM = as.matrix(dist_task),
  deepSplit = 2,
  pamRespectsDendro = FALSE,
  minClusterSize = 3
)

 ..cutHeight not given, setting it to 3.25  ===>  99% of the (truncated) height range in dendro.
 ..done.

clusters <- as.numeric(clusters)
# drop unassigned
valid <- clusters > 0
clusters <- clusters[valid]
L_raw <- L_raw[valid, ]
dv_names <- dv_names[valid]

# reorder by dendogram 
ord <- hc_task$order
L_ord <- L_raw[ord, ]
clusters_ord <- clusters[ord]
dv_names_ord <- dv_names[ord]

# merging
perc_resp_merge <- c(13, 14)
clusters_ord[clusters_ord %in% perc_resp_merge] <- min(perc_resp_merge)
cluster_levels <- sort(unique(clusters_ord))
cluster_map <- setNames(seq_along(cluster_levels), cluster_levels)
clusters_ord <- cluster_map[as.character(clusters_ord)]

# paper factor names (fixed order based on plot)
paper_task_factor_names <- c(
  "Speeded IP",
  "Caution",
  "Perc/Resp",
  "Discounting",
  "Strategic IP"
)
names(paper_task_factor_names) <- paper_task_factor_names

# mapping tasks and afctors
cluster_label_map <- c(
  "1" = "Discounting",
  "2" = "Speeded IP",
  "3" = "Inhib. threshold",
  "4" = "Cold / model-based",
  "5" = "Information proc.",
  "6" = "Speeded IP",
  "7" = "Inhib. perc/resp",
  "8" = "Conflict proc.",
  "9" = "Speeded IP",
  "10" = "Shifting",
  "11" = "Caution",
  "12" = "Conflict proc.",
  "13" = "Perc/Resp",
  "14" = "Hot / model-free",
  "15" = "Strategic IP"
)

# heatmap matrix
heat_input <- t(L_ord)
rownames(heat_input) <- paper_task_factor_names
colnames(heat_input) <- dv_names_ord

# dendogram colored by clusters
dend <- as.dendrogram(hc_task)
pal <- brewer.pal(15, "Set3")
dend_colored <- dendextend::color_branches(
  dend,
  clusters = clusters_ord,
  col = pal
)

# plotting
png(
  file.path(out_dir_fig4, "figure4reproduction.png"),
  width = 4800, height = 2000, res = 220
)

bottom_space <- HeatmapAnnotation(
  spacer = anno_empty(height = unit(14, "mm")),
  show_annotation_name = FALSE,
  border = FALSE
)

ht <- Heatmap(
  heat_input,
  name = "Factor loading",
  col = colorRamp2(c(-1, 0, 1), c("steelblue", "white", "indianred")),
  cluster_rows = FALSE,
  cluster_columns = dend_colored,
  show_row_names = TRUE,
  show_column_names = FALSE,
  bottom_annotation = bottom_space,
  row_names_gp = gpar(fontsize = 14),
  heatmap_legend_param = list(at = c(-1, 0, 1))
)

draw(
  ht,
  heatmap_legend_side = "right",
  padding = unit(c(5, 5, 30, 5), "mm")
)

# bottom labels
decorate_annotation("spacer", {
  r <- rle(clusters_ord)
  ends <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  mids <- (starts + ends) / 2
  n <- length(clusters_ord)

  for (i in seq_along(r$values)) {
    cl <- r$values[i]
    lab <- cluster_label_map[as.character(cl)]
    col <- pal[cl]

    x0 <- (starts[i] - 1) / n
    x1 <- ends[i] / n
    xm <- (mids[i] - 0.5) / n

    grid.lines(
      x = unit(c(x0, x1), "npc"),
      y = unit(0.7, "npc"),
      gp = gpar(col = col, lwd = 2)
    )

    grid.text(
      lab,
      x = unit(xm, "npc"),
      y = unit(0.20, "npc"),
      rot = 15,
      just = "right",
      gp = gpar(col = col, fontsize = 9)
    )
  }
})
dev.off()

quartz_off_screen 
                2

original_fig4 <- here("results", "figure4", "fig4_original.png")
reproduced_fig4 <- here("results", "figure4", "figure4reproduction.png")

original_fig4 <- image_read(original_fig4)
reproduced_fig4 <- image_read(reproduced_fig4)

# viewing side by side 
image_append(c(original_fig4, reproduced_fig4))

Exploratory analyses

set.seed(2025)

# connectivity matrix 
cor_mat <- cor(df, use = "pairwise.complete.obs", method = "spearman") # base correlation
n_subj <- nrow(df)
adj_mat <- EBICglasso(S = cor_mat, n = nrow(df), gamma = 0.5) # partial correlation adjacency matrix

# threshold edge list 
Adj_thr <- adj_mat
Adj_thr[abs(Adj_thr) < threshold_abs] <- 0
# cat("→ Retained edges ≥", threshold_abs, ":", 
#     sum(Adj_thr[upper.tri(Adj_thr)] != 0), "\n")

qg_unthr <- qgraph(
  adj_mat,
  layout = "spring",
  DoNotPlot = TRUE,
  vsize = 3
)

layout_matrix <- qg_unthr$layout

layout_matrix[is_survey_original, 2] <- layout_matrix[is_survey_original, 2] + 0.8
layout_matrix[is_task_original, 2] <- layout_matrix[is_task_original, 2] - 0.8

node_colors <- ifelse(is_survey_original, "indianred", "steelblue")
edge_col <- adjustcolor("gray", alpha.f = 0.15)

par(mar = c(1, 1, 1, 6))
qgraph(
  Adj_thr,
  layout = layout_matrix,
  color = node_colors,
  labels = FALSE,
  legend = FALSE,
  vsize = 3,
  border.width = 0,
  edge.color = edge_col,
  posCol = edge_col,
  negCol = edge_col,
  fade = TRUE,
  title = "EBICglasso Network (|partial r| ≥ 0.05)"
)

# finding coordiantes/ anchors for the text annotations 
xr <- range(layout_matrix[, 1])
yr <- range(layout_matrix[, 2])
right_edge <- max(layout_matrix[, 1])
x_anno <- right_edge - 1.5
y_mid <- mean(yr)
y_task <- y_mid + 0.03 * diff(yr)
y_survey <- y_mid - 0.03 * diff(yr)

text(
  x_anno, y_task,
  labels = paste0(n_task, " Task DVs"),
  col = "steelblue", cex = 1.5, adj = 0
)
text(
  x_anno, y_survey,
  labels = paste0(n_survey, " Survey DVs"),
  col = "indianred", cex = 1.5, adj = 0
)

Discussion

Summary of Reproduction Attempt

Eisenberg et al. (2019) demonstrated that self-regulation is not a unitary, unified construct but instead is comprised of a multi-dimensional ontology that can be recovered in a data-driven manner. Using exploratory factory analysis across a large battery of both survey and behavioral data, the authors identified district latent factors and organized the task and survey level measures into interpretable clusters. The goal of this reproduction exercise was to evlaute if the survey and task ontologies that were built by Eisenberg et al. (2019) could be recovered using their general analytic framework - exploratory factor analysis and hierarchical clustering. using the matched survey and task datasets provided by the authors in their repo, we successfully reproduced the core qualitative structure of both ontologies. For survey data, a 12-factor solution was supported by parallel analysis and BIC and hierarchical clustering yielded interpretable clusters that aligned closely with the themes reported in the original paper. For task data, a 5-factor solution similarly produced coherent clusters like those reported in the original piece.

While there are some visual and aesthetic features of the figures that were not reproduced here - ordering of the clusters and the apparent staircase nature of the factor loading, the overall concepts were consistent. Furthermore, the task analysis yielded one additional cluster prior to post-hoc inspection, but even then the ontology structure (defined like the original authors by factor projections, cluster coherence, and conceptual interpretation), was consistent with the original report. Thus, the primary results of the Eisenberg et al. (2019) report were success reproduced with minor deviations.

Commentary

Several insights emerged from follow-up exploration of the reproduced ontologies. First, both survey and task related analyses revealed closely related constructs that can split into multiple empirically distinct clusters. In the task ontology, two inhibition-related clusters emerged that differed in whether they were dominated by threshold parameters versus perceptual or response related drift components. This highlights that self-regulation as a concept may occupy graded space rather than discrete categories. Importantly, there were minor differences in this reproduction that do not change the final findings or interpretations (that self regulation is best characterized as a multi-dimensional ontology spanning separable. ut related processes). One broader challenge that emerged for me from this reproduction was that presentation-level decisions (ie cluster ordering, label consolidation) play a significant role in how ontologies are interpreted, and are often under-specified or deeply burried in methodological desciprtions.