Introduction

Improving the generalizability of RCT results is important for applying them to real-world populations.

Common methods for generalizing trial results to a population, such as propensity score methods, require individual-level population data, which are hard to obtain because of cost, time, and data protection.

Methods are needed that can assess generalizability and generalize trial results to a population using aggregated data.

Methods - overview

Use real-world data to assess the accuracy of four methods of generalizing trial results by comparing:

  • the predicted risk from the generalized data (based on individual data from a trial), and

  • the predicted risk from a gold standard derived from individual-level trial and population data

Methods - data

Trial:

  1. Justification for the Use of Statins in Primary Prevention: an Intervention Trial Evaluating Rosuvastatin (JUPITER)

  2. 17,802 subjects from 26 countries

  3. Risk of cardiovascular disease with daily rosuvastatin 20 mg vs. placebo

  4. Individual data are available for this analysis

Target population data (UK population)

  1. Clinical Practice Research Datalink

  2. Approximates the population of England eligible for this trial

  3. Individual data for the population are assumed unavailable when evaluating the four methods, but are available for determining the gold standard

Methods - covariates

Binary: sex, current smoking, chronic kidney disease, use of aspirin and anti-hypertensive drugs

Continuous:

  • age and BMI

  • high-sensitivity C-reactive protein (marker of acute inflammation, which is a sign of a serious infection, an injury, or chronic disease)

  • high-density lipoprotein cholesterol (“good” cholesterol, absorbs cholesterol in the blood and carries it back to the liver)

  • low-density lipoprotein cholesterol (high levels of LDL cholesterol raise the risk of heart disease and stroke)

Methods to generalize trial results to the target population

More details on Method 1

Given summary statistics of the covariates in the total target population, simulate hypothetical individual data:

  1. assume no correlation between covariates (each covariate simulated independently)

  2. binary variables: based on proportion from the target population

  3. continuous variables: based on mean (SD) or as categorical variables

  4. simulate a sample 100 times the true population size

  5. obtain the sampling probability/weight for each trial subject using a logistic model

  6. estimate treatment effect to compare with the gold standard
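
The simulation step (items 1 to 4) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `simulate_population`, the summary values, and the choice of a normal distribution for continuous covariates are all assumptions made for the example.

```python
import random

def simulate_population(n_true, binary_props, continuous_stats, scale=100, seed=0):
    """Simulate hypothetical individual-level data from summary statistics.

    Each covariate is drawn independently (no correlation between covariates).
    binary_props:     {name: proportion with value 1 in the target population}
    continuous_stats: {name: (mean, sd)}; a normal distribution is assumed here
    Returns a list of dicts with scale * n_true rows.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(scale * n_true):
        # Binary covariates: Bernoulli draw with the reported proportion.
        row = {name: int(rng.random() < p) for name, p in binary_props.items()}
        # Continuous covariates: normal draw with the reported mean and SD.
        for name, (mean, sd) in continuous_stats.items():
            row[name] = rng.gauss(mean, sd)
        rows.append(row)
    return rows

# Hypothetical summary statistics, for illustration only.
population = simulate_population(
    n_true=200,
    binary_props={"male": 0.6, "smoker": 0.2},
    continuous_stats={"age": (65.0, 8.0), "bmi": (28.0, 4.5)},
)
```

The simulated rows would then be stacked with the trial data to fit the logistic model for the sampling probability (item 5).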

More details on Method 2

Patients in the trial with individual data were reweighted so that their average covariate values match those reported in the target population.

  1. works in situations where trial data are limited

  2. use the method of moments to estimate the logistic model for the sampling probability (weights)

  3. previously developed method, with R code available

  4. estimate treatment effect to compare with the gold standard
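
A minimal sketch of one common formulation of method-of-moments weighting: choose coefficients `alpha` that minimize the sum of exp(alpha · (x_i − target mean)), so that the weighted trial means equal the target means. The function names, the damped Newton solver, and the toy data are assumptions for illustration; this is not the published R code.

```python
import math

def solve_linear(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def raw_weights(alpha, centered):
    """Weight exp(alpha . c_i) for each centered covariate row c_i."""
    return [math.exp(sum(a * c for a, c in zip(alpha, row))) for row in centered]

def moment_weights(trial_x, target_means, max_iter=100, tol=1e-10):
    """Weights that make weighted trial covariate means match target_means."""
    k = len(target_means)
    centered = [[x - t for x, t in zip(row, target_means)] for row in trial_x]
    alpha = [0.0] * k
    for _ in range(max_iter):
        w = raw_weights(alpha, centered)
        # Gradient of the convex objective sum(w); zero when moments match.
        grad = [sum(wi * row[j] for wi, row in zip(w, centered)) for j in range(k)]
        if max(abs(g) for g in grad) < tol * sum(w):
            break
        hess = [[sum(wi * row[j] * row[l] for wi, row in zip(w, centered))
                 for l in range(k)] for j in range(k)]
        step = solve_linear(hess, grad)
        t = 1.0
        # Halve the Newton step until the objective does not increase.
        while t > 1e-8:
            candidate = [a - t * s for a, s in zip(alpha, step)]
            if sum(raw_weights(candidate, centered)) <= sum(w):
                break
            t /= 2
        alpha = candidate
    return raw_weights(alpha, centered)

# Hypothetical trial covariates (age, current smoker) and target means.
trial_x = [[50, 0], [60, 1], [70, 0], [55, 1], [65, 0], [45, 1]]
target_means = [60.0, 0.4]
weights = moment_weights(trial_x, target_means)
weighted_means = [
    sum(w * row[j] for w, row in zip(weights, trial_x)) / sum(weights)
    for j in range(len(target_means))
]
```

After fitting, `weighted_means` should agree with `target_means` up to numerical tolerance. Because only means are matched, this balances marginal moments, not the joint covariate distribution.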

More details on Method 3

Combine stratum-specific drug effects across strata, weighted by the population stratum proportions, to obtain the overall effect:

  1. works with only binary/categorical covariates

  2. works with only 1 variable at a time

  3. calculate the treatment effect in each subgroup (defined by the covariate) in the trial

  4. calculate the overall treatment effect by weighting the subgroup effects according to the proportions of the strata in the population

  5. repeat steps 3 and 4 for each covariate

  6. summarize the post-stratification estimates of treatment effect across all covariates by taking their average
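
Steps 3 to 6 can be sketched as below. The function names, the effect scale (a risk difference), and all numbers are hypothetical and chosen only to show the arithmetic.

```python
def post_stratified_effect(stratum_effects, population_props):
    """Weight stratum-specific effects by the population stratum proportions."""
    return sum(stratum_effects[s] * population_props[s] for s in stratum_effects)

def average_over_covariates(effects_by_covariate, props_by_covariate):
    """Post-stratify on each covariate separately, then average the estimates."""
    estimates = [
        post_stratified_effect(effects_by_covariate[c], props_by_covariate[c])
        for c in effects_by_covariate
    ]
    return sum(estimates) / len(estimates)

# Hypothetical stratum-specific risk differences estimated in the trial.
effects = {
    "sex": {"male": -0.010, "female": -0.006},
    "smoker": {"yes": -0.014, "no": -0.007},
}
# Hypothetical stratum proportions in the target population.
props = {
    "sex": {"male": 0.5, "female": 0.5},
    "smoker": {"yes": 0.2, "no": 0.8},
}

overall = average_over_covariates(effects, props)
```

Here the sex-specific estimate is -0.008 and the smoking-specific estimate is -0.0084, so `overall` is their average, -0.0082.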

More details on Method 4

Combine the overall relative treatment effect from the trial with absolute outcome risk estimates from the target population:

  1. obtain risk of disease in the target population (literature) as the baseline risk

  2. identify the relative risk observed in the trial

  3. multiply the baseline risk by the relative risk to obtain the expected risk under treatment in the target population; the expected risk reduction is the baseline risk minus this value

  4. yields absolute risk only, so not applicable in our case
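
As a worked example of this arithmetic: multiplying the baseline risk by the relative risk gives the expected risk under treatment, and the risk reduction is the difference from baseline. The function names and input values are hypothetical and are not JUPITER or CPRD results.

```python
def expected_treated_risk(baseline_risk, relative_risk):
    """Expected risk under treatment in the target population."""
    return baseline_risk * relative_risk

def absolute_risk_reduction(baseline_risk, relative_risk):
    """Baseline risk minus the expected risk under treatment."""
    return baseline_risk - expected_treated_risk(baseline_risk, relative_risk)

baseline = 0.10  # hypothetical baseline risk from the literature
rr = 0.60        # hypothetical relative risk observed in the trial

treated_risk = expected_treated_risk(baseline, rr)
risk_reduction = absolute_risk_reduction(baseline, rr)
```

With these inputs the expected risk under treatment is 0.06 and the absolute risk reduction is 0.04.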

Results summary

Methods 1 and 2 yielded the estimates closest to the gold-standard estimates when continuous effect modifiers were represented as categorical variables (which reduces the problem of non-normality in the data).

Limitations of these two methods:

Both methods rely on marginal covariate summaries, but covariate balance in joint distributions is important because the goal is to reweight RCT participants to the target population on all effect modifiers, including those specific to a certain covariate pattern.

References

  • Hong, J. L., Webster-Clark, M., Jonsson Funk, M., Stürmer, T., Dempster, S. E., Cole, S. R., Herr, I., & LoCasale, R. (2019). Comparison of Methods to Generalize Randomized Clinical Trial Results Without Individual-Level Data for the Target Population. American Journal of Epidemiology, 188(2), 426–437. https://doi.org/10.1093/aje/kwy233

  • Phillippo, D. M., Ades, A. E., Dias, S., Palmer, S., Abrams, K. R., & Welton, N. J. (2018). Methods for Population-Adjusted Indirect Comparisons in Health Technology Appraisal. Medical Decision Making, 38(2), 200–211. https://doi.org/10.1177/0272989X17725740

  • Signorovitch, J. E., Wu, E. Q., Yu, A. P., Gerrits, C. M., Kantor, E., Bao, Y., Gupta, S. R., & Mulani, P. M. (2010). Comparative effectiveness without head-to-head trials: a method for matching-adjusted indirect comparisons applied to psoriasis treatment with adalimumab or etanercept. PharmacoEconomics, 28(10), 935–945. https://doi.org/10.2165/11538370-000000000-00000

  • https://www.mayoclinic.org/tests-procedures/c-reactive-protein-test/about/pac-20385228

  • https://www.cdc.gov/cholesterol/ldl_hdl.htm#:~:text=HDL%20(high%2Ddensity%20lipoprotein),for%20heart%20disease%20and%20stroke.

Appendix I - gold standard

The gold standard was calculated by combining the individual data from the trial and the population and estimating the sampling probability with multivariable logistic regression including all effect modifiers.

Sampling weights were calculated as the inverse odds of the sampling probability [(1-p)/p].

These weights were used to reweight the trial data into a pseudo-RCT population, in which the effect of the drug was then calculated.
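
The weighting step can be sketched as follows, assuming the sampling probability p for each trial subject has already been estimated by the logistic regression. The function names, the choice of a risk difference as the effect measure, and the toy data are assumptions for illustration.

```python
def inverse_odds_weight(p):
    """Weight for a trial subject with sampling probability p: (1 - p) / p."""
    return (1.0 - p) / p

def weighted_risk(outcomes, weights):
    """Weighted proportion of subjects with the outcome."""
    return sum(y * w for y, w in zip(outcomes, weights)) / sum(weights)

def weighted_risk_difference(subjects):
    """subjects: list of (treated, outcome, p) tuples for trial participants."""
    arms = {0: ([], []), 1: ([], [])}
    for treated, outcome, p in subjects:
        arms[treated][0].append(outcome)
        arms[treated][1].append(inverse_odds_weight(p))
    # Weighted risk in the treated arm minus weighted risk in the control arm.
    return weighted_risk(*arms[1]) - weighted_risk(*arms[0])

# Hypothetical trial subjects: (treated, outcome, sampling probability).
subjects = [
    (1, 0, 0.5), (1, 1, 0.2), (1, 0, 0.8),
    (0, 1, 0.5), (0, 1, 0.2), (0, 0, 0.8),
]
rd = weighted_risk_difference(subjects)
```

Subjects with a low sampling probability (under-represented in the trial relative to the population) receive larger weights, which is what shifts the trial sample toward the target population.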