A simulation study was performed to identify the best-performing estimator for a longitudinal TMLE analysis of the effect of second-line diabetes drugs on dementia risk in Danish National Registry data. The analysis has four key features: many time points (10), a rare outcome (dementia prevalence: 1.9%), competing risk from death, and a high degree of administrative censoring. Three simulations were completed: 1) a simple simulation without positivity violations, rare outcomes, long-term follow-up, or competing risks, as a sanity check that the estimators were implemented correctly, especially since we modified the LTMLE package code; 2) a realistic simulation in terms of dementia prevalence and diabetes drug patterns, but with scrambled outcomes and competing risks, to check estimator performance under a known null association; and 3) a realistic simulation with a protective effect of GLP1 usage on dementia and death, with the truth calculated as the counterfactual 5-year risk of dementia prior to death when continuously on GLP1 versus not, and with the effect of GLP1 on death removed to eliminate the competing risk.
True RR: 0.57
True RD: -0.3
Notes:
estimator | bias | variance | mse | bias_se_ratio | oracle.coverage | coverage_ic | coverage_tmle | coverage_cv_boot | coverage_cv_boot_1000iter | coverage_iptw | coverage_iptw_boot |
---|---|---|---|---|---|---|---|---|---|---|---|
GLM | -0.001 | 0.007 | 0.007 | -0.015 | 94.7 | 95.0 | 95.0 | 93.7 | NA | 95.7 | 93.5 |
LASSO | -0.001 | 0.007 | 0.007 | -0.014 | 94.5 | 94.9 | 94.9 | 93.7 | 94.5 | 95.7 | 93.5 |
estimator | bias | variance | mse | bias_se_ratio | oracle.coverage | coverage_ic | coverage_tmle | coverage_cv_boot | coverage_cv_boot_1000iter | coverage_iptw | coverage_iptw_boot_1000iter |
---|---|---|---|---|---|---|---|---|---|---|---|
GLM | 0.00063 | 0.00159 | 0.00159 | 0.01567 | 95.5 | 95.3 | 95.3 | 94.8 | NA | 95.3 | NA |
LASSO | 0.00066 | 0.00159 | 0.00159 | 0.01661 | 95.5 | 95.3 | 95.3 | 94.8 | 95.2 | 95.3 | 95.2 |
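
The summary columns in the tables above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `summarize` and the toy inputs are hypothetical, and the definitions are assumptions (bias as mean estimate minus truth, MSE as squared bias plus variance, the bias-to-SE ratio taken against the empirical standard deviation of the estimates, and oracle coverage as the share of replicates whose interval built from that empirical standard deviation contains the truth).

```python
import math
import statistics


def summarize(estimates, truth, z=1.96):
    """Summarize replicate point estimates against a known truth.

    Assumed definitions (not confirmed by the report):
      bias            = mean(estimates) - truth
      variance        = sample variance of the estimates
      mse             = bias^2 + variance
      bias_se_ratio   = bias / empirical SD of the estimates
      oracle_coverage = % of replicates where estimate +/- z * empirical SD
                        contains the truth
    """
    bias = statistics.fmean(estimates) - truth
    variance = statistics.variance(estimates)
    sd = math.sqrt(variance)
    covered = sum(1 for e in estimates if e - z * sd <= truth <= e + z * sd)
    return {
        "bias": bias,
        "variance": variance,
        "mse": bias ** 2 + variance,
        "bias_se_ratio": bias / sd,
        "oracle_coverage": 100.0 * covered / len(estimates),
    }


# Toy replicate estimates for illustration only (not the study's data).
toy_estimates = [0.1, 0.2, 0.3]
summary = summarize(toy_estimates, truth=0.2)
print(summary)
```

Because the oracle interval uses the true sampling variability of the estimator, its coverage isolates the effect of bias and non-normality from the quality of any particular variance estimator.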
True RR: 1
True RD: 0
Notes:
estimator | Qint | DetQ | bias | variance | mse | bias_se_ratio | oracle.coverage |
---|---|---|---|---|---|---|---|
LASSO | No | No | -0.00006 | 2e-05 | 2e-05 | -0.01181 | 96.0 |
LASSO | Yes | No | -0.00005 | 2e-05 | 2e-05 | -0.01101 | 96.5 |
GLM | Yes | No | -0.00004 | 2e-05 | 2e-05 | -0.00754 | 96.5 |
GLM | No | Yes | 0.00061 | 3e-05 | 3e-05 | 0.10875 | 97.0 |
GLM | No | No | 0.00103 | 9e-05 | 9e-05 | 0.10823 | 98.0 |
estimator | Qint | DetQ | bias | variance | mse | bias_se_ratio | oracle.coverage |
---|---|---|---|---|---|---|---|
GLM | Yes | No | -0.053 | 0.116 | 0.119 | -0.155 | 95.5 |
LASSO | Yes | No | -0.053 | 0.114 | 0.117 | -0.156 | 96.0 |
LASSO | No | No | -0.053 | 0.114 | 0.117 | -0.156 | 96.5 |
GLM | No | Yes | -0.013 | 0.126 | 0.126 | -0.038 | 97.0 |
GLM | No | No | -0.023 | 0.168 | 0.169 | -0.057 | 98.0 |
Notes:
variance_estimator | coverage | mean_ci_width |
---|---|---|
ic | 51.00000 | 0.00722 |
tmle | 100.00000 | 0.11535 |
bootstrap | 90.85366 | 0.01300 |
variance_estimator | coverage | mean_ci_width |
---|---|---|
ic | 51.50000 | 0.50639 |
tmle | 100.00000 | 8.38962 |
bootstrap | 90.85366 | 1.14126 |
Note: CI widths for relative risks are on the log scale.
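
As a small sketch of what a log-scale width means, assume a Wald-type interval built on the log relative risk and then exponentiated. The helper names and the numbers below are hypothetical and chosen only for illustration.

```python
import math


def rr_ci_from_log_se(rr_hat, se_log, z=1.96):
    """Wald interval for a relative risk, built on the log scale (assumed form)."""
    log_rr = math.log(rr_hat)
    return math.exp(log_rr - z * se_log), math.exp(log_rr + z * se_log)


def log_scale_ci_width(rr_lower, rr_upper):
    """Width of a relative-risk interval measured on the log scale."""
    return math.log(rr_upper) - math.log(rr_lower)


# Illustrative values only: a point estimate of 0.5 with log-scale SE 0.2.
lower, upper = rr_ci_from_log_se(rr_hat=0.5, se_log=0.2)
width = log_scale_ci_width(lower, upper)  # 2 * 1.96 * 0.2 on the log scale
print(lower, upper, width)
```

Under this form the log-scale width depends only on the standard error, not on the point estimate, which makes widths comparable across estimators with different biases.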
True RD: -0.009683665
True RR: 0.5148661
Notes:
estimator | bias | variance | mse | oracle.coverage |
---|---|---|---|---|
LASSO, Det-Q, AUC fit | -0.002080 | 6.0e-06 | 1.0e-05 | 84.50000 |
LASSO, Lambda: 1se | -0.001631 | 1.0e-05 | 1.3e-05 | 91.50000 |
Elastic Net, Lambda: 1se | -0.001450 | 9.0e-06 | 1.2e-05 | 92.00000 |
GLM, LASSO prescreen | 0.002793 | 4.9e-05 | 5.7e-05 | 92.78351 |
LASSO, Q-intercept | -0.001583 | 1.1e-05 | 1.3e-05 | 93.00000 |
LASSO, Det-Q, Lambda: 1se | -0.001109 | 8.0e-06 | 9.0e-06 | 93.50000 |
GLM | 0.002819 | 5.6e-05 | 6.4e-05 | 93.50000 |
GLM, LASSO prescreen, Det-Q | 0.002795 | 5.1e-05 | 5.9e-05 | 93.87755 |
Ridge, Det-Q | 0.000446 | 1.1e-05 | 1.1e-05 | 94.00000 |
Elastic Net, Det-Q, Lambda: 1se | -0.000899 | 8.0e-06 | 8.0e-06 | 94.50000 |
LASSO, Det-Q | 0.000267 | 1.4e-05 | 1.4e-05 | 94.50000 |
Ridge, Lambda: 1se | -0.000978 | 8.0e-06 | 9.0e-06 | 94.50000 |
Ridge | -0.000118 | 1.3e-05 | 1.3e-05 | 94.50000 |
LASSO, AUC fit | -0.001365 | 1.2e-05 | 1.4e-05 | 95.00000 |
LASSO | -0.000265 | 1.7e-05 | 1.7e-05 | 95.00000 |
Ridge, Det-Q, Lambda: 1se | -0.000536 | 6.0e-06 | 7.0e-06 | 95.50000 |
estimator | bias | variance | mse | oracle.coverage |
---|---|---|---|---|
LASSO, Det-Q, AUC fit | -0.762 | 0.209 | 0.790 | 34.000 |
Ridge, Det-Q, Lambda: 1se | -0.574 | 0.228 | 0.558 | 65.500 |
LASSO, Det-Q, Lambda: 1se | -0.594 | 0.250 | 0.603 | 69.000 |
Ridge, Lambda: 1se | -0.558 | 0.239 | 0.550 | 69.500 |
Elastic Net, Det-Q, Lambda: 1se | -0.580 | 0.250 | 0.585 | 71.000 |
LASSO, Lambda: 1se | -0.569 | 0.268 | 0.592 | 74.000 |
Elastic Net, Lambda: 1se | -0.561 | 0.265 | 0.579 | 75.000 |
LASSO, Q-intercept | -0.577 | 0.287 | 0.619 | 78.500 |
LASSO, AUC fit | -0.465 | 0.282 | 0.498 | 84.500 |
Ridge | -0.337 | 0.278 | 0.392 | 93.000 |
GLM, LASSO prescreen | -0.022 | 0.459 | 0.459 | 93.299 |
Ridge, Det-Q | -0.315 | 0.263 | 0.362 | 93.500 |
GLM | -0.005 | 0.469 | 0.469 | 93.500 |
LASSO, Det-Q | -0.326 | 0.308 | 0.414 | 95.000 |
LASSO | -0.341 | 0.328 | 0.445 | 95.000 |
GLM, LASSO prescreen, Det-Q | -0.025 | 0.443 | 0.443 | 95.408 |
Notes:
variance_estimator | coverage | mean_ci_width | power | bias_se_ratio_emp |
---|---|---|---|---|
ic, Det-Q | 67.0 | 0.00736 | 92.0 | 0.14223 |
tmle | 99.5 | 0.02129 | 49.0 | -0.05020 |
ic | 62.0 | 0.00737 | 91.0 | -0.14089 |
Bootstrap, Det Q function | 87.0 | 0.01346 | 68.5 | NA |
Bootstrap, Det Q function, 500 iterations | 89.0 | 0.01338 | 69.5 | NA |
Bootstrap | 85.5 | 0.01454 | 68.5 | NA |
Bootstrap-Ridge | 87.5 | 0.01289 | 72.0 | NA |
variance_estimator | coverage | mean_ci_width | power | bias_se_ratio_emp |
---|---|---|---|---|
ic, Det-Q | 55.0 | 0.866 | 92.0 | -1.475 |
ic | 48.5 | 0.841 | 90.5 | -1.591 |
tmle | 100.0 | 3.579 | 0.5 | -0.374 |
Bootstrap, Det Q function | 76.5 | 1.952 | 68.5 | NA |
Bootstrap, Det Q function, 500 iterations | 77.5 | 1.955 | 69.5 | NA |
Bootstrap | 75.5 | 1.988 | 68.5 | NA |
Bootstrap-IPTW | 100.0 | 17.888 | 0.0 | NA |
Bootstrap-Ridge | 76.5 | 1.870 | 72.0 | NA |
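
The coverage, mean CI width, and power columns above can be computed from per-replicate confidence intervals along these lines. This is a sketch under assumptions: `ci_summary` is a hypothetical helper, power is assumed to be the share of intervals that exclude the null value, and the intervals are made-up numbers rather than study output.

```python
def ci_summary(intervals, truth, null_value):
    """Summarize a list of (lower, upper) confidence intervals.

    coverage      : % of intervals containing the true parameter value
    power         : % of intervals excluding the null value (assumed definition)
    mean_ci_width : average of upper - lower
    """
    n = len(intervals)
    covered = sum(1 for lo, hi in intervals if lo <= truth <= hi)
    rejected = sum(1 for lo, hi in intervals if not (lo <= null_value <= hi))
    mean_width = sum(hi - lo for lo, hi in intervals) / n
    return {
        "coverage": 100.0 * covered / n,
        "power": 100.0 * rejected / n,
        "mean_ci_width": mean_width,
    }


# Made-up risk-difference intervals for illustration only.
toy_intervals = [
    (-0.015, -0.005),
    (-0.012, 0.001),
    (-0.008, -0.002),
    (-0.020, -0.011),
]
result = ci_summary(toy_intervals, truth=-0.01, null_value=0.0)
print(result)
```

For relative risks the same function applies with `null_value=1.0` on the natural scale, or with `null_value=0.0` if the intervals are kept on the log scale.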
The primary analysis examined the effect of continuous GLP1 usage on dementia risk after 5 years, with longitudinal data discretized into 6-month time nodes. The imperfect performance of estimators in simulations may arise from the rare outcome (~2% prevalence after 5 years), positivity issues in long-term follow-up (with an increasingly small number of individuals continuously on GLP1), or the high degree of administrative censoring (~50% after 5 years). We ran simulations for every length of follow-up from 6 months (time=1) to 5 years (time=10). Oracle coverage is close to nominal at all times (93.5% to 98%), while IC coverage becomes increasingly anti-conservative and TMLE coverage increasingly conservative over time. Interestingly, variance increases more over time for the RD estimates, while bias increases more for the RR estimates.
time | bias | variance | mse | bias_se_ratio | bias_se_ratio_emp | oracle.coverage | IC_coverage | TMLE_coverage | IC_mean_ci_width | TMLE_mean_ci_width |
---|---|---|---|---|---|---|---|---|---|---|
1 | -0.00021 | 0e+00 | 0e+00 | -0.24209 | -0.35093 | 96.0 | 71.0 | 85.0 | 0.00232 | 0.00285 |
2 | -0.00021 | 0e+00 | 0e+00 | -0.17300 | -0.23397 | 95.5 | 77.5 | 77.5 | 0.00357 | 0.00357 |
3 | 0.00019 | 0e+00 | 0e+00 | 0.11499 | 0.17473 | 96.5 | 76.5 | 96.5 | 0.00426 | 0.00655 |
4 | 0.00070 | 0e+00 | 0e+00 | 0.37575 | 0.56369 | 95.5 | 78.5 | 99.0 | 0.00487 | 0.00838 |
5 | 0.00027 | 1e-05 | 1e-05 | 0.11295 | 0.18745 | 96.0 | 78.5 | 98.5 | 0.00569 | 0.01069 |
6 | 0.00064 | 1e-05 | 1e-05 | 0.25730 | 0.41259 | 95.0 | 78.0 | 99.5 | 0.00607 | 0.01228 |
7 | -0.00019 | 1e-05 | 1e-05 | -0.06014 | -0.11447 | 95.5 | 72.5 | 97.0 | 0.00656 | 0.01434 |
8 | -0.00114 | 1e-05 | 1e-05 | -0.33070 | -0.65342 | 93.5 | 64.5 | 97.5 | 0.00685 | 0.01649 |
9 | -0.00072 | 1e-05 | 1e-05 | -0.19364 | -0.39991 | 94.0 | 64.0 | 98.0 | 0.00710 | 0.01844 |
10 | -0.00045 | 2e-05 | 2e-05 | -0.11002 | -0.24138 | 94.5 | 61.5 | 99.5 | 0.00737 | 0.02129 |
time | bias | variance | mse | bias_se_ratio | bias_se_ratio_emp | oracle.coverage | IC_coverage | TMLE_coverage | IC_mean_ci_width | TMLE_mean_ci_width |
---|---|---|---|---|---|---|---|---|---|---|
1 | -0.289 | 0.383 | 0.467 | -0.467 | -0.717 | 96.0 | 75.5 | 96.0 | 1.581 | 2.199 |
2 | -0.207 | 0.244 | 0.287 | -0.419 | -0.616 | 96.5 | 78.5 | 78.5 | 1.319 | 1.319 |
3 | -0.151 | 0.284 | 0.306 | -0.283 | -0.475 | 98.0 | 73.0 | 97.5 | 1.243 | 2.351 |
4 | -0.076 | 0.274 | 0.280 | -0.144 | -0.246 | 97.0 | 74.0 | 98.0 | 1.205 | 2.720 |
5 | -0.194 | 0.269 | 0.307 | -0.374 | -0.740 | 96.0 | 71.0 | 98.5 | 1.029 | 2.517 |
6 | -0.161 | 0.259 | 0.285 | -0.317 | -0.621 | 95.0 | 72.0 | 99.5 | 1.019 | 2.852 |
7 | -0.312 | 0.304 | 0.401 | -0.566 | -1.287 | 95.0 | 62.5 | 98.5 | 0.950 | 2.892 |
8 | -0.407 | 0.317 | 0.482 | -0.724 | -1.791 | 93.5 | 52.5 | 99.0 | 0.891 | 3.030 |
9 | -0.369 | 0.321 | 0.458 | -0.651 | -1.659 | 94.0 | 53.0 | 99.0 | 0.873 | 3.300 |
10 | -0.352 | 0.328 | 0.452 | -0.614 | -1.639 | 94.5 | 48.0 | 100.0 | 0.841 | 3.579 |