Follow-up Study

Critical Reflection on Comparison of GGM and DCG as Causal Discovery Tools

Author

Kyuri Park

Published

January 14, 2023


1 Summary of Main Analysis

In the main analysis, we aimed to investigate the utility of statistical network models – Gaussian graphical model (GGM) – as tools for causal discovery in cyclic settings compared to the directed cyclic graph models (DCG) by means of a simulation study.

We used the cyclic causal discovery (CCD) algorithm to estimate DCGs (Richardson, 1996b) and compare them to the Gaussian graphical models (GGM). The comparison is made based on the overall density and degree centrality. As shown in Figure 1, the output of the CCD algorithm is a PAG (partial ancestral graph), which represents the Markov-equivalent class of DCGs – DCGs that are statistically equivalent under the estimated conditional independencies from observational data. Therefore, we computed the average density and average degree centrality given all the equivalent DCGs per simulated condition and compared those to the density and degree centrality of GGMs and the true model, respectively.

The results showed that the DCGs approximated the true cyclic models better in terms of both density and degree centrality compared to GGMs. GGMs often overestimated the density and correspondingly resulted in high degrees for almost all nodes in the considered models. The conclusion based on these results could be that statistical network models perform poorly as causal discovery tools in cyclic settings and hence, it shall be preferred to use the purpose-built causal discovery methods when one is interested in the underlying causal mechanisms.

Figure 1. Summary of the CCD algorithm operation. Given the estimated statistical independencies from observational data, CCD outputs a partial ancestral graph (PAG). It can be seen that in this example, the PAG represents two different directed cyclic graphs (DCG), of which we compute the density and degree centrality. Then, the average value of density and degree centrality are subsequently used for the comparison with the GGM.


2 Reflection on Limitations

However, we cannot just naively draw such conclusion merely based on these results, as there are several limitations in this simulation study. Below, I list two crucial limitations.

  1. We did not explicitly account for the uncertainty in estimation with the CCD algorithm. As explained above, the algorithm in general cannot uniquely identify one true graph, but instead provides a set of equivalent DCGs (as a form of PAG). The size of the equivalent set reflects the level of uncertainty in its estimation, such that the larger the set is, the higher the uncertainty in estimation. Besides the size, the variation in the equivalent set can also imply the instability in estimation; the higher the variation, the more unstable the estimates are. It is a critical aspect to further investigate, as it could be the case that even when the average density and degree centrality are closely aligned to those of the true model, the size of the set or the variation in the set is large, indicating the lack of precision in its estimates.

  2. There is little evidence on practical applicability of the CCD algorithm, as we only tested it out on rather small simulated models (\(p = 4, 5, 6\)) with a very large sample size (\(N = 10^6\)). The typical psychological data, however, usually consist of much fewer observations while including more variables (Constantin et al., 2021). Thus, whether the CCD algorithm can be applied to such empirical data and utilized in psychological research in practice is yet questionable.


3 Follow-up Analyses

In this section, we perform the follow-up analyses to investigate the aforementioned limitations in the original study. To examine the overall uncertainty/instability in estimation as to the first limitation, in Section 3.1, we check the size of equivalent class from each of the simulated models and investigate the variation in the density. Additionally, in Section 3.2, we look into the variation in degree centrality per each set of DCGs. Regarding the practical applicability of the CCD algorithm, in Section 3.3, we test the algorithm on empirical data (McNally et al., 2017) and check if it produces a reasonable output.

3.1 Variation in Density of DCGs

Figure 2 shows the size of the equivalence class and the distribution of density per set of DCGs from each of the simulated models. First thing that stands out is that the dense models (right column of Figure 2) seem to have a slightly higher variation in density (i.e., the spread of distributions is wider), and accordingly the discrepancy between the true density and average density of DCGs is larger in the dense models than in the sparse models. This indicates that the CCD algorithm tends to be less stable when the true causal structure is dense and correspondingly, the resulting estimates are more likely to deviate from the true values.

Secondly, it could be seen that the size of equivalence class is overall quite large across all conditions, except for the sparse model with 4 nodes case (Figure 2 (a)). There is no specific pattern observed between the size of equivalence class and the types of simulated models, but these fairly large equivalence classes suggest the lack of certainty in its estimation.

Figure 2. Distribution of density per set of DCGs. In (a), two DCGs in the equivalence class have the same density that is identical to the true density (the two lines overlap). The dense model with 4 nodes in (b) seems to be the most difficult case for CCD, given that it has the largest equivalence class (1,659 DCGs) with the highest discrepancy in density among all considered cases. Typically, the CCD algorithm seems to struggle more to estimate the dense models, seeing that there are more variations and higher deviations from the true values in the dense condition.

3.2 Variation in Degree Centrality of DCGs

Figure 3 shows the average degree per node with 95% confidence interval in each of the simulated models. Here, we see that the 95% confidence intervals in the dense models are relatively wider (right column of Figure 3) than the ones in the sparse models (left column of Figure 3), which indicate that there is less stability in degree estimation with the dense models. This is in accordance with the variation in density as shown in Figure 2), where the dense models are shown to have more variations in the estimated densities.

Figure 3. Average degree for every node with 95% confidence interval per set of DCGs. In (a), the standard error (SE) is zero and accordingly the confidence interval (CI) doesn’t exist. For the rest, the values of SE are also very small (\(.001 - .01\)), which makes it hard to visualize the CIs. For the ease of comparison, the SEs are scaled up. Hence, note that the width of CIs here can only be compared across different cases in a relative sense.

3.3 Practical Applicability of CCD

We use the data provided by McNally et al. (2017), which contains information on depression and OCD symptoms for 408 patients. Here, we only use the depression symptom scores, which consist of 408 observations for 16 depression symptoms. Figure 4 shows the partial ancestral graph (PAG) for the depression symptoms estimated by the CCD algorithm.

A couple of features are apparent. First, there are two clusters (islands), one comprising symptoms related to sleeping problems, and the other one comprising symptoms related to appetite issue. Secondly, there exists a cycle (feedback loop): anhedonia \(\rightarrow\) fatigue \(\rightarrow\) retard \(\rightarrow\) sad \(\rightarrow\) anhedonia, which seems reasonable in a substantive sense. Here, we do not know the true underlying causal structure, but the overall findings from this causal graph is deemed rather informative and sensible, mostly aligning with our expectations.

Just for a comparison, the statistical network model (GGM) is also estimated using graphical LASSO (Epskamp & Fried, 2018) as shown in Figure 5. In accordance with the causal model, similar clustering structures are observed with symptoms related to sleep and appetite. Even though with this statistical network model we could additionally examine the sign of relations (edge color: blue/red = positive/negative) and the strength of relations (thickness of edges) between symptoms, we can hardly infer any causal relations or detect the presence of feedback loops, as we did with the causal model.

Figure 4. PAG estimated by CCD algorithm for depression symptoms. In the PAG representation, there exists two types of underlining that are used in a triple of nodes: solid underlining (A – B – C) and dotted underlining. The colored nodes (in blue) refer to the presence of solid underlinings and the dashed nodes refer to the presence of dotted underlinings on the corresponding nodes. These underlinings are used to help determining directions of causal relations in a PAG. For more information, see Richardson (1996a).

Figure 5. Statistical network model constructed via graphical LASSO for depression symptoms.


4 Conclusion

All in all, with these additional analyses, we learn that:

  1. There typically exists a considerable size of equivalence class and the variation in estimates is relatively large when the causal structure is dense.
  2. The CCD algorithm is indeed applicable in practice to typical psychological data and can be used to help discovering some interesting causal dynamics in psychological processes.

Although the results from the first part of analyses imply quite some uncertainty and instability in estimation with the CCD algorithm, the core conclusion from the original simulation study (i.e., causal models are preferred to statistical network models when causal hypothesis is of interest) remains the same for the following reasons. First, even though the size of equivalence class can be larger than desired, the causal models still tend to be more informative than the undirected statistical network models when it comes to inferring directions of causal relations. Second, the difficulty in estimation with dense causal structures is equally problematic in statistical network models. In fact, the original simulation study showed that causal models outperformed the statistical network models under the dense conditions, in terms of approximating the true density and degree centrality.

Outside of their use for causal discovery, statistical network models can be very useful as descriptive tools; to explore multivariate statistical relationships (Epskamp et al., 2018); to visualize clustering structures (Golino & Epskamp, 2017). However, if one is interested in the network approach to search for causal mechanisms, then the focus should be on estimating causal models rather than statistical network models.


5 References

Constantin, M., Schuurman, N. K., & Vermunt, J. (2021). A General Monte Carlo Method for Sample Size Analysis in the Context of Network Models. PsyArXiv. https://doi.org/10.31234/osf.io/j5v7u
Epskamp, S., & Fried, E. I. (2018). A Tutorial on Regularized Partial Correlation Networks. Psychological Methods, 23(4), 617–634. https://doi.org/10.1037/met0000167
Epskamp, S., Waldorp, L. J., Mõttus, R., & Borsboom, D. (2018). The Gaussian Graphical Model in Cross-Sectional and Time-Series Data. Multivariate Behavioral Research, 53(4), 453–480. https://doi.org/10.1080/00273171.2018.1454823
Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLOS ONE, 12(6), e0174035. https://doi.org/10.1371/journal.pone.0174035
McNally, R., Mair, P., Mugno, B., & Riemann, B. (2017). Co-morbid obsessive–compulsive disorder and depression: A bayesian network approach. Psychological Medicine, 47(7), 1204–1214. https://doi.org/10.1017/S0033291716003287
Richardson, T. (1996a). A discovery algorithm for directed cyclic graphs. Proceedings of the Twelfth International Conference on Uncertainty in Artificial Intelligence, 454–461.
Richardson, T. (1996b). Discovering cyclic causal structure. Carnegie Mellon [Department of Philosophy].

6 Session info

sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.0

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] Rgraphviz_2.40.0    graph_1.74.0        BiocGenerics_0.42.0
 [4] DOT_0.1             rcausal_1.2.1       devtools_2.4.5     
 [7] usethis_2.1.6       rJava_1.0-6         kableExtra_1.3.4   
[10] purrr_0.3.4         ggpubr_0.4.0        dplyr_1.0.9        
[13] ggplot2_3.3.6       pcalg_2.7-7         qgraph_1.9.2       

loaded via a namespace (and not attached):
  [1] colorspace_2.0-3    ggsignif_0.6.3      deldir_1.0-6       
  [4] ellipsis_0.3.2      htmlTable_2.4.1     corpcor_1.6.10     
  [7] base64enc_0.1-3     fs_1.5.2            clue_0.3-61        
 [10] rstudioapi_0.13     lavaan_0.6-12       remotes_2.4.2      
 [13] fansi_1.0.3         xml2_1.3.3          splines_4.2.2      
 [16] mnormt_2.1.0        cachem_1.0.6        robustbase_0.95-0  
 [19] knitr_1.40          glasso_1.11         pkgload_1.3.0      
 [22] Formula_1.2-4       jsonlite_1.8.0      broom_1.0.0        
 [25] cluster_2.1.4       png_0.1-7           sfsmisc_1.1-13     
 [28] shiny_1.7.2         compiler_4.2.2      httr_1.4.3         
 [31] backports_1.4.1     assertthat_0.2.1    Matrix_1.5-1       
 [34] fastmap_1.1.0       cli_3.4.1           later_1.3.0        
 [37] prettyunits_1.1.1   htmltools_0.5.2     tools_4.2.2        
 [40] igraph_1.3.5        gtable_0.3.0        glue_1.6.2         
 [43] reshape2_1.4.4      V8_4.2.1            Rcpp_1.0.9         
 [46] carData_3.0-5       vctrs_0.4.1         svglite_2.1.0      
 [49] nlme_3.1-160        psych_2.2.5         xfun_0.31          
 [52] stringr_1.4.1       ps_1.7.1            rvest_1.0.2        
 [55] mime_0.12           ggm_2.5             miniUI_0.1.1.1     
 [58] lifecycle_1.0.1     gtools_3.9.3        rstatix_0.7.0      
 [61] DEoptimR_1.0-11     scales_1.2.0        promises_1.2.0.1   
 [64] parallel_4.2.2      RBGL_1.72.0         RColorBrewer_1.1-3 
 [67] curl_4.3.2          yaml_2.3.5          memoise_2.0.1      
 [70] pbapply_1.5-0       gridExtra_2.3       bdsmatrix_1.3-6    
 [73] rpart_4.1.19        fastICA_1.2-3       latticeExtra_0.6-30
 [76] stringi_1.7.8       checkmate_2.1.0     pkgbuild_1.3.1     
 [79] rlang_1.0.5         pkgconfig_2.0.3     systemfonts_1.0.4  
 [82] evaluate_0.15       lattice_0.20-45     htmlwidgets_1.5.4  
 [85] processx_3.7.0      tidyselect_1.1.2    plyr_1.8.7         
 [88] magrittr_2.0.3      R6_2.5.1            profvis_0.3.7      
 [91] generics_0.1.3      Hmisc_4.7-0         DBI_1.1.3          
 [94] pillar_1.7.0        foreign_0.8-83      withr_2.5.0        
 [97] survival_3.4-0      abind_1.4-5         nnet_7.3-18        
[100] tibble_3.1.7        crayon_1.5.1        car_3.1-0          
[103] interp_1.1-2        fdrtool_1.2.17      utf8_1.2.2         
[106] urlchecker_1.0.1    rmarkdown_2.16      jpeg_0.1-9         
[109] data.table_1.14.2   pbivnorm_0.6.0      callr_3.7.1        
[112] digest_0.6.29       webshot_0.5.3       xtable_1.8-4       
[115] httpuv_1.6.5        tidyr_1.2.0         stats4_4.2.2       
[118] munsell_0.5.0       viridisLite_0.4.0   sessioninfo_1.2.2