Follow-up Study
Critical Reflection on Comparison of GGM and DCG as Causal Discovery Tools
1 Summary of Main Analysis
In the main analysis, we aimed to investigate the utility of statistical network models – Gaussian graphical model (GGM) – as tools for causal discovery in cyclic settings compared to the directed cyclic graph models (DCG) by means of a simulation study.
We used the cyclic causal discovery (CCD) algorithm to estimate DCGs (Richardson, 1996b) and compare them to the Gaussian graphical models (GGM). The comparison is made based on the overall density and degree centrality. As shown in Figure 1, the output of the CCD algorithm is a PAG (partial ancestral graph), which represents the Markov-equivalent class of DCGs – DCGs that are statistically equivalent under the estimated conditional independencies from observational data. Therefore, we computed the average density and average degree centrality given all the equivalent DCGs per simulated condition and compared those to the density and degree centrality of GGMs and the true model, respectively.
The results showed that the DCGs approximated the true cyclic models better in terms of both density and degree centrality compared to GGMs. GGMs often overestimated the density and correspondingly resulted in high degrees for almost all nodes in the considered models. The conclusion based on these results could be that statistical network models perform poorly as causal discovery tools in cyclic settings and hence, it shall be preferred to use the purpose-built causal discovery methods when one is interested in the underlying causal mechanisms.
2 Reflection on Limitations
However, we cannot just naively draw such conclusion merely based on these results, as there are several limitations in this simulation study. Below, I list two crucial limitations.
We did not explicitly account for the uncertainty in estimation with the CCD algorithm. As explained above, the algorithm in general cannot uniquely identify one true graph, but instead provides a set of equivalent DCGs (as a form of PAG). The size of the equivalent set reflects the level of uncertainty in its estimation, such that the larger the set is, the higher the uncertainty in estimation. Besides the size, the variation in the equivalent set can also imply the instability in estimation; the higher the variation, the more unstable the estimates are. It is a critical aspect to further investigate, as it could be the case that even when the average density and degree centrality are closely aligned to those of the true model, the size of the set or the variation in the set is large, indicating the lack of precision in its estimates.
There is little evidence on practical applicability of the CCD algorithm, as we only tested it out on rather small simulated models (\(p = 4, 5, 6\)) with a very large sample size (\(N = 10^6\)). The typical psychological data, however, usually consist of much fewer observations while including more variables (Constantin et al., 2021). Thus, whether the CCD algorithm can be applied to such empirical data and utilized in psychological research in practice is yet questionable.
3 Follow-up Analyses
In this section, we perform the follow-up analyses to investigate the aforementioned limitations in the original study. To examine the overall uncertainty/instability in estimation as to the first limitation, in Section 3.1, we check the size of equivalent class from each of the simulated models and investigate the variation in the density. Additionally, in Section 3.2, we look into the variation in degree centrality per each set of DCGs. Regarding the practical applicability of the CCD algorithm, in Section 3.3, we test the algorithm on empirical data (McNally et al., 2017) and check if it produces a reasonable output.
3.1 Variation in Density of DCGs
Figure 2 shows the size of the equivalence class and the distribution of density per set of DCGs from each of the simulated models. First thing that stands out is that the dense models (right column of Figure 2) seem to have a slightly higher variation in density (i.e., the spread of distributions is wider), and accordingly the discrepancy between the true density and average density of DCGs is larger in the dense models than in the sparse models. This indicates that the CCD algorithm tends to be less stable when the true causal structure is dense and correspondingly, the resulting estimates are more likely to deviate from the true values.
Secondly, it could be seen that the size of equivalence class is overall quite large across all conditions, except for the sparse model with 4 nodes case (Figure 2 (a)). There is no specific pattern observed between the size of equivalence class and the types of simulated models, but these fairly large equivalence classes suggest the lack of certainty in its estimation.
3.2 Variation in Degree Centrality of DCGs
Figure 3 shows the average degree per node with 95% confidence interval in each of the simulated models. Here, we see that the 95% confidence intervals in the dense models are relatively wider (right column of Figure 3) than the ones in the sparse models (left column of Figure 3), which indicate that there is less stability in degree estimation with the dense models. This is in accordance with the variation in density as shown in Figure 2), where the dense models are shown to have more variations in the estimated densities.
3.3 Practical Applicability of CCD
We use the data provided by McNally et al. (2017), which contains information on depression and OCD symptoms for 408 patients. Here, we only use the depression symptom scores, which consist of 408 observations for 16 depression symptoms. Figure 4 shows the partial ancestral graph (PAG) for the depression symptoms estimated by the CCD algorithm.
A couple of features are apparent. First, there are two clusters (islands), one comprising symptoms related to sleeping problems, and the other one comprising symptoms related to appetite issue. Secondly, there exists a cycle (feedback loop): anhedonia \(\rightarrow\) fatigue \(\rightarrow\) retard \(\rightarrow\) sad \(\rightarrow\) anhedonia, which seems reasonable in a substantive sense. Here, we do not know the true underlying causal structure, but the overall findings from this causal graph is deemed rather informative and sensible, mostly aligning with our expectations.
Just for a comparison, the statistical network model (GGM) is also estimated using graphical LASSO (Epskamp & Fried, 2018) as shown in Figure 5. In accordance with the causal model, similar clustering structures are observed with symptoms related to sleep and appetite. Even though with this statistical network model we could additionally examine the sign of relations (edge color: blue/red = positive/negative) and the strength of relations (thickness of edges) between symptoms, we can hardly infer any causal relations or detect the presence of feedback loops, as we did with the causal model.
4 Conclusion
All in all, with these additional analyses, we learn that:
- There typically exists a considerable size of equivalence class and the variation in estimates is relatively large when the causal structure is dense.
- The CCD algorithm is indeed applicable in practice to typical psychological data and can be used to help discovering some interesting causal dynamics in psychological processes.
Although the results from the first part of analyses imply quite some uncertainty and instability in estimation with the CCD algorithm, the core conclusion from the original simulation study (i.e., causal models are preferred to statistical network models when causal hypothesis is of interest) remains the same for the following reasons. First, even though the size of equivalence class can be larger than desired, the causal models still tend to be more informative than the undirected statistical network models when it comes to inferring directions of causal relations. Second, the difficulty in estimation with dense causal structures is equally problematic in statistical network models. In fact, the original simulation study showed that causal models outperformed the statistical network models under the dense conditions, in terms of approximating the true density and degree centrality.
Outside of their use for causal discovery, statistical network models can be very useful as descriptive tools; to explore multivariate statistical relationships (Epskamp et al., 2018); to visualize clustering structures (Golino & Epskamp, 2017). However, if one is interested in the network approach to search for causal mechanisms, then the focus should be on estimating causal models rather than statistical network models.
5 References
6 Session info
sessionInfo()R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.0
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] Rgraphviz_2.40.0 graph_1.74.0 BiocGenerics_0.42.0
[4] DOT_0.1 rcausal_1.2.1 devtools_2.4.5
[7] usethis_2.1.6 rJava_1.0-6 kableExtra_1.3.4
[10] purrr_0.3.4 ggpubr_0.4.0 dplyr_1.0.9
[13] ggplot2_3.3.6 pcalg_2.7-7 qgraph_1.9.2
loaded via a namespace (and not attached):
[1] colorspace_2.0-3 ggsignif_0.6.3 deldir_1.0-6
[4] ellipsis_0.3.2 htmlTable_2.4.1 corpcor_1.6.10
[7] base64enc_0.1-3 fs_1.5.2 clue_0.3-61
[10] rstudioapi_0.13 lavaan_0.6-12 remotes_2.4.2
[13] fansi_1.0.3 xml2_1.3.3 splines_4.2.2
[16] mnormt_2.1.0 cachem_1.0.6 robustbase_0.95-0
[19] knitr_1.40 glasso_1.11 pkgload_1.3.0
[22] Formula_1.2-4 jsonlite_1.8.0 broom_1.0.0
[25] cluster_2.1.4 png_0.1-7 sfsmisc_1.1-13
[28] shiny_1.7.2 compiler_4.2.2 httr_1.4.3
[31] backports_1.4.1 assertthat_0.2.1 Matrix_1.5-1
[34] fastmap_1.1.0 cli_3.4.1 later_1.3.0
[37] prettyunits_1.1.1 htmltools_0.5.2 tools_4.2.2
[40] igraph_1.3.5 gtable_0.3.0 glue_1.6.2
[43] reshape2_1.4.4 V8_4.2.1 Rcpp_1.0.9
[46] carData_3.0-5 vctrs_0.4.1 svglite_2.1.0
[49] nlme_3.1-160 psych_2.2.5 xfun_0.31
[52] stringr_1.4.1 ps_1.7.1 rvest_1.0.2
[55] mime_0.12 ggm_2.5 miniUI_0.1.1.1
[58] lifecycle_1.0.1 gtools_3.9.3 rstatix_0.7.0
[61] DEoptimR_1.0-11 scales_1.2.0 promises_1.2.0.1
[64] parallel_4.2.2 RBGL_1.72.0 RColorBrewer_1.1-3
[67] curl_4.3.2 yaml_2.3.5 memoise_2.0.1
[70] pbapply_1.5-0 gridExtra_2.3 bdsmatrix_1.3-6
[73] rpart_4.1.19 fastICA_1.2-3 latticeExtra_0.6-30
[76] stringi_1.7.8 checkmate_2.1.0 pkgbuild_1.3.1
[79] rlang_1.0.5 pkgconfig_2.0.3 systemfonts_1.0.4
[82] evaluate_0.15 lattice_0.20-45 htmlwidgets_1.5.4
[85] processx_3.7.0 tidyselect_1.1.2 plyr_1.8.7
[88] magrittr_2.0.3 R6_2.5.1 profvis_0.3.7
[91] generics_0.1.3 Hmisc_4.7-0 DBI_1.1.3
[94] pillar_1.7.0 foreign_0.8-83 withr_2.5.0
[97] survival_3.4-0 abind_1.4-5 nnet_7.3-18
[100] tibble_3.1.7 crayon_1.5.1 car_3.1-0
[103] interp_1.1-2 fdrtool_1.2.17 utf8_1.2.2
[106] urlchecker_1.0.1 rmarkdown_2.16 jpeg_0.1-9
[109] data.table_1.14.2 pbivnorm_0.6.0 callr_3.7.1
[112] digest_0.6.29 webshot_0.5.3 xtable_1.8-4
[115] httpuv_1.6.5 tidyr_1.2.0 stats4_4.2.2
[118] munsell_0.5.0 viridisLite_0.4.0 sessioninfo_1.2.2