Summary

A simulation study was performed to identify the best-performing estimator for a longitudinal TMLE analysis of the effect of second-line diabetes drugs on dementia risk in the Danish National Registry data. The data have four key features: many time points (10), a rare outcome (dementia prevalence: 1.9%), competing risk from death, and a high degree of administrative censoring. Three simulations were completed: (1) a simple simulation without positivity violations, rare outcomes, long-term follow-up, or competing risks, as a sanity check that the estimators were implemented correctly, especially because we modified the ltmle package code; (2) a simulation that is realistic in terms of dementia prevalence and diabetes drug patterns but has scrambled outcomes and competing risks, to check estimator performance under a known null association; and (3) a realistic simulation with a protective effect of GLP1 usage on dementia and death, where the truth is the counterfactual 5-year risk of dementia prior to death when continuously on GLP1 versus not, calculated with the effect of GLP1 on death removed to eliminate the competing risk.

Scenario 1: Simple simulation

True RR: 0.57

True RD: -0.3

Notes:

Both the GLM and LASSO estimators perform well, which serves as a sanity check on the ltmle LASSO implementation.
All variance estimators perform well: the IC, TMLE, IPTW, and 1000-iteration bootstrap estimators perform similarly.
Increasing bootstrap iterations from 200 to 1000 moved coverage closer to 95% for RR estimation; 200 iterations were sufficient for RD.
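The performance columns in the tables of this report can be sketched as follows. This is a minimal illustration, not the code used for the study: the function name `summarize_estimates` and the simulated inputs are hypothetical, and it assumes that "oracle" coverage means building each replicate's confidence interval from the empirical standard deviation of the estimates across replicates. The relations mse = bias^2 + variance and bias_se_ratio = bias / sqrt(variance) are consistent with the tabulated values.

```python
import math
import random
import statistics


def summarize_estimates(estimates, truth, z=1.96):
    """Summarize Monte Carlo estimates against a known truth.

    Assumption: "oracle" coverage uses the empirical SD of the estimates
    across replicates as the standard error of every replicate's CI.
    """
    bias = statistics.fmean(estimates) - truth
    variance = statistics.pvariance(estimates)  # Monte Carlo variance
    sd = math.sqrt(variance)
    covered = sum(1 for est in estimates if abs(est - truth) <= z * sd)
    return {
        "bias": bias,
        "variance": variance,
        "mse": bias ** 2 + variance,
        "bias_se_ratio": bias / sd,
        "oracle_coverage": 100.0 * covered / len(estimates),
    }


# Hypothetical replicate estimates: a true RD of -0.3 plus Gaussian noise.
rng = random.Random(1)
fake_estimates = [rng.gauss(-0.3, 0.04) for _ in range(1000)]
summary = summarize_estimates(fake_estimates, truth=-0.3)
print(summary)
```

For relative risks, the same function would be applied to log(RR) estimates with the log of the true RR as the truth, assuming the RR metrics are computed on the log scale.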

Relative risk performance

estimator	bias	variance	mse	bias_se_ratio	oracle.coverage	coverage_ic	coverage_tmle	coverage_cv_boot	coverage_cv_boot_1000iter	coverage_iptw	coverage_iptw_boot
GLM	-0.001	0.007	0.007	-0.015	94.7	95.0	95.0	93.7	NA	95.7	93.5
LASSO	-0.001	0.007	0.007	-0.014	94.5	94.9	94.9	93.7	94.5	95.7	93.5

Risk difference performance

estimator	bias	variance	mse	bias_se_ratio	oracle.coverage	coverage_ic	coverage_tmle	coverage_cv_boot	coverage_cv_boot_1000iter	coverage_iptw	coverage_iptw_boot_1000iter
GLM	0.00063	0.00159	0.00159	0.01567	95.5	95.3	95.3	94.8	NA	95.3	NA
LASSO	0.00066	0.00159	0.00159	0.01661	95.5	95.3	95.3	94.8	95.2	95.3	95.2

Scenario 2: Realistic simulation, null outcome

True RR: 1

True RD: 0

Notes:

Outcome, death, and censoring were all jointly scrambled.
Oracle coverage is close to nominal but slightly above 95% for all estimators (95.5% to 98.0%), and the GLM estimators with Qint = No have higher variance.

RD oracle coverage of different estimators

estimator	Qint	DetQ	bias	variance	mse	bias_se_ratio	oracle.coverage
LASSO	No	No	-0.00006	2e-05	2e-05	-0.01181	96.0
LASSO	Yes	No	-0.00005	2e-05	2e-05	-0.01101	96.5
GLM	Yes	No	-0.00004	2e-05	2e-05	-0.00754	96.5
GLM	No	Yes	0.00061	3e-05	3e-05	0.10875	97.0
GLM	No	No	0.00103	9e-05	9e-05	0.10823	98.0

RR oracle coverage of different estimators

estimator	Qint	DetQ	bias	variance	mse	bias_se_ratio	oracle.coverage
GLM	Yes	No	-0.053	0.116	0.119	-0.155	95.5
LASSO	Yes	No	-0.053	0.114	0.117	-0.156	96.0
LASSO	No	No	-0.053	0.114	0.117	-0.156	96.5
GLM	No	Yes	-0.013	0.126	0.126	-0.038	97.0
GLM	No	No	-0.023	0.168	0.169	-0.057	98.0

Performance of different variance estimators on null data

Notes:

Only LASSO estimator results are shown; all estimators are assessed in the realistic simulated data below.
This is a sanity check of estimation performance on data with a known null association between GLP1 and dementia.
The IC variance estimator is anti-conservative and the TMLE variance estimator is conservative.
The bootstrap is anti-conservative, but less so than the IC variance estimator.
The TMLE variance estimator is very conservative, with mean CI widths roughly 7 to 9 times those of the bootstrap.
The IPTW estimator is uniformly biased, with overly wide confidence intervals in all simulations (not shown).

Risk difference performance

variance_estimator	coverage	mean_ci_width
ic	51.00000	0.00722
tmle	100.00000	0.11535
bootstrap	90.85366	0.01300

Relative risk performance

variance_estimator	coverage	mean_ci_width
ic	51.50000	0.50639
tmle	100.00000	8.38962
bootstrap	90.85366	1.14126

Note: CI widths for relative risks are on the log scale.
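As a rough illustration of how the coverage and mean_ci_width columns could be computed, the sketch below evaluates Wald intervals (estimate plus or minus 1.96 standard errors) against a known truth. The function name `wald_ci_performance` and the four replicate values are hypothetical; the only detail taken from this report is that relative-risk intervals are handled on the log scale.

```python
import math


def wald_ci_performance(estimates, std_errors, truth, z=1.96):
    """Return (coverage in percent, mean CI width) for Wald intervals."""
    if len(estimates) != len(std_errors):
        raise ValueError("estimates and std_errors must have equal length")
    hits = 0
    total_width = 0.0
    for est, se in zip(estimates, std_errors):
        lower, upper = est - z * se, est + z * se
        if lower <= truth <= upper:
            hits += 1
        total_width += upper - lower
    n = len(estimates)
    return 100.0 * hits / n, total_width / n


# Hypothetical relative-risk replicates under the null (true RR = 1).
rr_hat = [0.8, 1.1, 1.6, 0.95]
se_log_rr = [0.25, 0.30, 0.20, 0.28]

# Work on the log scale: log(RR) estimates, truth log(1) = 0.
coverage, mean_width = wald_ci_performance(
    [math.log(rr) for rr in rr_hat], se_log_rr, truth=math.log(1.0)
)
print(coverage, mean_width)
```

With these made-up inputs, three of the four intervals contain log(1) = 0, so coverage is 75%, and the mean width is 2 x 1.96 times the average standard error.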

Scenario 3: Realistic simulation, protective effect of GLP1 on dementia

True RD: -0.009683665

True RR: 0.5148661
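The two truth values jointly determine the underlying counterfactual risks. The sketch below solves RD = p1 - p0 and RR = p1 / p0 for the two risks, assuming p1 is the 5-year dementia risk under continuous GLP1 use and p0 the risk without it (this labeling and the function name `risks_from_rd_rr` are assumptions, not taken from the analysis code).

```python
def risks_from_rd_rr(rd, rr):
    """Solve rd = p1 - p0 and rr = p1 / p0 for (p1, p0)."""
    if rr == 1:
        raise ValueError("risks are not identified from RD and RR when RR = 1")
    p0 = rd / (rr - 1.0)  # from rd = p0 * (rr - 1)
    p1 = rr * p0
    return p1, p0


p1, p0 = risks_from_rd_rr(rd=-0.009683665, rr=0.5148661)
print(round(p1, 4), round(p0, 4))
```

Under this reading, the implied risks are about 1.0% with continuous GLP1 use and about 2.0% without.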

Comparison of different estimators’ performance

Notes:

Based on these results, we chose the LASSO estimator with Q-prediction (a modeled Q rather than an intercept-only Q) and no deterministic Q function.
Several of the estimators have comparable performance, but the chosen estimator performs best in both RR and RD estimation, with 95.0% oracle coverage for each.
Ridge regressions have lower MSE, but their oracle coverage is not exactly 95%.
Including the deterministic Q function marginally decreases bias and variance, so we should use it in the bootstrap estimator.

Risk difference

estimator	bias	variance	mse	oracle.coverage
LASSO, Det-Q, AUC fit	-0.002080	6.0e-06	1.0e-05	84.50000
LASSO, Det-Q, AUC fit	-0.002080	6.0e-06	1.0e-05	84.50000
LASSO, Lambda: 1se	-0.001631	1.0e-05	1.3e-05	91.50000
Elastic Net, Lambda: 1se	-0.001450	9.0e-06	1.2e-05	92.00000
GLM, LASSO prescreen	0.002793	4.9e-05	5.7e-05	92.78351
LASSO, Q-intercept	-0.001583	1.1e-05	1.3e-05	93.00000
LASSO, Det-Q, Lambda: 1se	-0.001109	8.0e-06	9.0e-06	93.50000
GLM	0.002819	5.6e-05	6.4e-05	93.50000
GLM, LASSO prescreen, Det-Q	0.002795	5.1e-05	5.9e-05	93.87755
Ridge, Det-Q	0.000446	1.1e-05	1.1e-05	94.00000
Elastic Net, Det-Q, Lambda: 1se	-0.000899	8.0e-06	8.0e-06	94.50000
LASSO, Det-Q	0.000267	1.4e-05	1.4e-05	94.50000
Ridge, Lambda: 1se	-0.000978	8.0e-06	9.0e-06	94.50000
Ridge	-0.000118	1.3e-05	1.3e-05	94.50000
LASSO, AUC fit	-0.001365	1.2e-05	1.4e-05	95.00000
LASSO	-0.000265	1.7e-05	1.7e-05	95.00000
Ridge, Det-Q, Lambda: 1se	-0.000536	6.0e-06	7.0e-06	95.50000

Relative Risk

estimator	bias	variance	mse	oracle.coverage
LASSO, Det-Q, AUC fit	-0.762	0.209	0.790	34.000
LASSO, Det-Q, AUC fit	-0.762	0.209	0.790	34.000
Ridge, Det-Q, Lambda: 1se	-0.574	0.228	0.558	65.500
LASSO, Det-Q, Lambda: 1se	-0.594	0.250	0.603	69.000
Ridge, Lambda: 1se	-0.558	0.239	0.550	69.500
Elastic Net, Det-Q, Lambda: 1se	-0.580	0.250	0.585	71.000
LASSO, Lambda: 1se	-0.569	0.268	0.592	74.000
Elastic Net, Lambda: 1se	-0.561	0.265	0.579	75.000
LASSO, Q-intercept	-0.577	0.287	0.619	78.500
LASSO, AUC fit	-0.465	0.282	0.498	84.500
Ridge	-0.337	0.278	0.392	93.000
GLM, LASSO prescreen	-0.022	0.459	0.459	93.299
Ridge, Det-Q	-0.315	0.263	0.362	93.500
GLM	-0.005	0.469	0.469	93.500
LASSO, Det-Q	-0.326	0.308	0.414	95.000
LASSO	-0.341	0.328	0.445	95.000
GLM, LASSO prescreen, Det-Q	-0.025	0.443	0.443	95.408

Comparison of different variance estimators

Notes:

Showing LASSO estimator results with modeled Q (rather than intercept-only)
The IC variance estimator is anti-conservative and the TMLE variance estimator is conservative
The bootstrap is anti-conservative but less so than the IC variance estimator
The IPTW estimator is uniformly biased, with overly wide confidence intervals in all simulations (not shown)
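For readers unfamiliar with the bootstrap variance estimator, a generic nonparametric bootstrap can be sketched as below. This is only an illustration under loud assumptions: the estimator here is a crude difference in outcome proportions standing in for the LTMLE fit, individuals are resampled with replacement, and the standard error is the SD of the bootstrap estimates. The report does not state which bootstrap variant or interval type was used, and all names and data (`risk_difference`, `bootstrap_se`, `records`) are hypothetical.

```python
import random
import statistics


def risk_difference(records):
    """Crude RD: outcome proportion among treated minus among untreated.

    Stand-in estimator only; the study itself uses LTMLE.
    """
    treated = [y for a, y in records if a == 1]
    control = [y for a, y in records if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def bootstrap_se(records, estimator, n_boot=200, seed=0):
    """SD of the estimator over resamples of individuals with replacement."""
    rng = random.Random(seed)
    n = len(records)
    draws = []
    for _ in range(n_boot):
        resample = [records[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(resample))
    return statistics.stdev(draws)


# Hypothetical data: (treatment, outcome) pairs with a rare outcome.
data_rng = random.Random(42)
records = [
    (i % 2, int(data_rng.random() < (0.01 if i % 2 else 0.02)))
    for i in range(2000)
]
se = bootstrap_se(records, risk_difference, n_boot=200)
print(risk_difference(records), se)
```

The `n_boot` argument corresponds to the number of bootstrap iterations; more iterations reduce Monte Carlo noise in the standard error at a proportional computational cost.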

Risk difference coverage

variance_estimator	coverage	mean_ci_width	power	bias_se_ratio_emp
ic, Det-Q	67.0	0.00736	92.0	0.14223
tmle	99.5	0.02129	49.0	-0.05020
ic	62.0	0.00737	91.0	-0.14089
Bootstrap, Det Q function	87.0	0.01346	68.5	NA
Bootstrap, Det Q function, 500 iterations	89.0	0.01338	69.5	NA
Bootstrap	85.5	0.01454	68.5	NA
Bootstrap-Ridge	87.5	0.01289	72.0	NA

Relative risk coverage

variance_estimator	coverage	mean_ci_width	power	bias_se_ratio_emp
ic, Det-Q	55.0	0.866	92.0	-1.475
ic	48.5	0.841	90.5	-1.591
tmle	100.0	3.579	0.5	-0.374
Bootstrap, Det Q function	76.5	1.952	68.5	NA
Bootstrap, Det Q function, 500 iterations	77.5	1.955	69.5	NA
Bootstrap	75.5	1.988	68.5	NA
Bootstrap-IPTW	100.0	17.888	0.0	NA
Bootstrap-Ridge	76.5	1.870	72.0	NA
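The power column is not defined in this report. A plausible reading, assumed here, is the percentage of simulation replicates whose confidence interval excludes the null value (0 for RD, and 0 on the log scale for RR). The sketch below uses that definition with hypothetical inputs to show why very wide intervals yield high coverage but almost no power.

```python
def power_percent(estimates, std_errors, null_value=0.0, z=1.96):
    """Percent of replicates whose Wald CI excludes the null value.

    Assumed definition of power; not confirmed by the report.
    """
    rejections = 0
    for est, se in zip(estimates, std_errors):
        lower, upper = est - z * se, est + z * se
        if upper < null_value or lower > null_value:
            rejections += 1
    return 100.0 * rejections / len(estimates)


# Hypothetical log(RR) estimates under a protective effect.
log_rr_hat = [-0.9, -0.6, -0.7, -0.3]

narrow = power_percent(log_rr_hat, [0.2] * 4)  # tight standard errors
wide = power_percent(log_rr_hat, [0.9] * 4)    # very conservative standard errors
print(narrow, wide)
```

With the tight standard errors three of the four intervals exclude zero (75% power), while with the conservative standard errors none do (0% power).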

Comparison of variance estimator performance over time

The primary analysis examined the effect of continuous GLP1 usage on dementia risk after 5 years, with longitudinal data discretized into 6-month time nodes. The imperfect performance of the estimators in simulations may arise from the rare outcome (~2% prevalence after 5 years), positivity issues during long-term follow-up (an increasingly small number of individuals remain continuously on GLP1), or the high degree of administrative censoring (~50% after 5 years). We ran simulations for every length of follow-up from 6 months (time = 1) to 5 years (time = 10). Oracle coverage is good at all times (93.5% to 98.0%), while IC coverage becomes increasingly anti-conservative and TMLE coverage increasingly conservative over time. Interestingly, variance increases more over time in the RD estimates, while bias increases more in the RR estimates.

Risk difference

time	bias	variance	mse	bias_se_ratio	bias_se_ratio_emp	oracle.coverage	IC_coverage	TMLE_coverage	IC_mean_ci_width	TMLE_mean_ci_width
1	-0.00021	0e+00	0e+00	-0.24209	-0.35093	96.0	71.0	85.0	0.00232	0.00285
2	-0.00021	0e+00	0e+00	-0.17300	-0.23397	95.5	77.5	77.5	0.00357	0.00357
3	0.00019	0e+00	0e+00	0.11499	0.17473	96.5	76.5	96.5	0.00426	0.00655
4	0.00070	0e+00	0e+00	0.37575	0.56369	95.5	78.5	99.0	0.00487	0.00838
5	0.00027	1e-05	1e-05	0.11295	0.18745	96.0	78.5	98.5	0.00569	0.01069
6	0.00064	1e-05	1e-05	0.25730	0.41259	95.0	78.0	99.5	0.00607	0.01228
7	-0.00019	1e-05	1e-05	-0.06014	-0.11447	95.5	72.5	97.0	0.00656	0.01434
8	-0.00114	1e-05	1e-05	-0.33070	-0.65342	93.5	64.5	97.5	0.00685	0.01649
9	-0.00072	1e-05	1e-05	-0.19364	-0.39991	94.0	64.0	98.0	0.00710	0.01844
10	-0.00045	2e-05	2e-05	-0.11002	-0.24138	94.5	61.5	99.5	0.00737	0.02129

Relative risk

time	bias	variance	mse	bias_se_ratio	bias_se_ratio_emp	oracle.coverage	IC_coverage	TMLE_coverage	IC_mean_ci_width	TMLE_mean_ci_width
1	-0.289	0.383	0.467	-0.467	-0.717	96.0	75.5	96.0	1.581	2.199
2	-0.207	0.244	0.287	-0.419	-0.616	96.5	78.5	78.5	1.319	1.319
3	-0.151	0.284	0.306	-0.283	-0.475	98.0	73.0	97.5	1.243	2.351
4	-0.076	0.274	0.280	-0.144	-0.246	97.0	74.0	98.0	1.205	2.720
5	-0.194	0.269	0.307	-0.374	-0.740	96.0	71.0	98.5	1.029	2.517
6	-0.161	0.259	0.285	-0.317	-0.621	95.0	72.0	99.5	1.019	2.852
7	-0.312	0.304	0.401	-0.566	-1.287	95.0	62.5	98.5	0.950	2.892
8	-0.407	0.317	0.482	-0.724	-1.791	93.5	52.5	99.0	0.891	3.030
9	-0.369	0.321	0.458	-0.651	-1.659	94.0	53.0	99.0	0.873	3.300
10	-0.352	0.328	0.452	-0.614	-1.639	94.5	48.0	100.0	0.841	3.579

Diabetes-dementia updated simulation results

Andrew Mertens

2022-11-17
