New Modelling Extension for Multi-Locational Studies

The eight-step process in Masselot and Gasparrini’s new paper in Statistical Methods in Medical Research.

Derek Weix

Drexel University

March 24, 2025

Introduction

Masselot and Gasparrini (2025)

  • Title: Modelling extensions for multi-location studies in environmental epidemiology.
  • Journal: Statistical Methods in Medical Research (SMMR)
  • Authors: Pierre Masselot and Antonio Gasparrini

Overview of Paper:

  • This paper proposes a cohesive eight-stage framework that combines ideas from previous papers with a few new ones.
  • All code and data needed to reproduce the analysis are on GitHub.

Novel Contributions

Old Ideas:

New Ideas:

  • Uses spatial regression to assess risk in unobserved areas.
  • Uses group- and location-specific excess mortality rates, which can be transformed into a standardized excess mortality rate.
  • Proposes a new method of Monte Carlo simulation that accounts for additional uncertainty when deriving empirical confidence intervals.

The Eight-Stage Framework

Stages of Analysis:

  1. Location/Group-Specific First Stage Model
  2. Modelling Demographic Differences
  3. Dimension Reduction
  4. Predictive Meta-Regression Model
  5. Geospatial Model
  6. Risk Prediction
  7. Standardization of Impacts
  8. Uncertainty Assessment

First Stage Model

For location \(i\) and group \(a\), fit the following model: \[g(E[y_{iat}]) = \alpha_{ia} + f(x_{it}, l;\boldsymbol{\theta}_{ia}) + \sum_{j=1}^Js_j(t;\boldsymbol{\varphi}_{iaj})+\sum_{q=1}^Qh_q(z_{iaqt};\boldsymbol{\gamma}_{iaq}),\] where \(y_{iat}\) is the health outcome, \(x_{it}\) is the exposure, \(f\) is the exposure-response function over lag \(l\), the \(s_j\) are functions of time \(t\), and the \(h_q\) are functions of additional covariates \(z_{iaqt}\). This formulation accommodates time series, case time series, and case-crossover designs, among others.

\[~\]

This is a standard distributed lag non-linear model (DLNM). Our goal is to model the coefficients \(\boldsymbol{\theta}_{ia}\) using demographic differences and indices of vulnerability.

Modelling Demographic Differences

Process goal:

  • Get an age variable, \(A_{ia}\), attributable to each estimate \(\hat{\boldsymbol{\theta}}_{ia}\).
  • \(A_{ia}\) should represent the average age of death.
  • This variable will be used in the meta-regression.

\[A_{ia} = \left(\sum_{k=l}^ud_{ik}\right)^{-1} \sum_{k=l}^u o_{ik}d_{ik}\]

  • \(o_{ik}\) and \(d_{ik}\) are the age and the number of deaths for age \(k\) at location \(i\).
  • \(l\) and \(u\) are the lower and upper bound of age for group \(a\) in location \(i\).
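The formula for \(A_{ia}\) is a death-weighted mean of age, which can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the counts below are hypothetical.

```python
def mean_age_of_death(ages, deaths):
    """Death-weighted mean age for one location and age group.

    ages   -- the ages o_ik for k = l..u
    deaths -- the death counts d_ik at those ages
    """
    if len(ages) != len(deaths):
        raise ValueError("ages and deaths must have the same length")
    total = sum(deaths)
    if total == 0:
        raise ValueError("no deaths recorded in this age group")
    return sum(o * d for o, d in zip(ages, deaths)) / total


# Hypothetical counts for a 65-69 age group at one location.
ages = [65, 66, 67, 68, 69]
deaths = [10, 12, 15, 20, 23]
A_ia = mean_age_of_death(ages, deaths)  # 5394 / 80 = 67.425
```

Because deaths are concentrated at the older ages in this example, \(A_{ia}\) sits above the midpoint of the age band (67).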

Composite Indices of Vulnerability

Suppose we have \(\mathbf{v}_i\), a vector of \(P\) local characteristics for location \(i\).

\(~\)

We reduce this to \(\mathbf{w}_i\), a vector of \(K \ll P\) composite characteristics, using a transformation \(\mathbf{R}\).

\[\mathbf{w}_i = \mathbf{R}'\mathbf{v}_i\]

\(~\)

Options for dimension reduction:

  • Principal Component Analysis (PCA)
  • Canonical Correlation Analysis (CCA)
  • Partial Least Squares (PLS)
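As a rough illustration of the PCA option with \(K = 1\), the sketch below finds the leading loading vector by power iteration and projects a location's characteristics onto it. It assumes the variables are centred but not rescaled (in practice they would usually be standardized first), and the function names and data are hypothetical. It is not the procedure used in the paper.

```python
import math


def first_loading(rows, n_iter=200):
    """Return column means and the leading PCA loading of `rows` (n x P)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (P x P).
    cov = [[sum(c[j] * c[k] for c in centred) / (n - 1) for k in range(p)]
           for j in range(p)]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, provided the start vector is not orthogonal to it.
    vec = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        new = [sum(cov[j][k] * vec[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            raise ValueError("data have no variance")
        vec = [x / norm for x in new]
    return means, vec


def composite_index(v, means, loading):
    """w_i = R'v_i for a single component, applied to centred v_i."""
    return sum(r * (x - m) for r, x, m in zip(loading, v, means))


# Hypothetical characteristics for four locations (P = 3).
rows = [[1, 2, 2], [2, 4, 4], [3, 6, 6], [4, 8, 8]]
means, loading = first_loading(rows)
w_last = composite_index(rows[-1], means, loading)
```

In this toy data all three variables move together, so one component captures all of the variation. The sign of a loading vector is arbitrary in general.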

Predictive Meta-Regression Model

Thus, we have the components for our meta-regression:

  • \(\boldsymbol{\theta}_{ia}\) is the response and
  • \(A_{ia}\) and \(\mathbf{w}_i\) become the design matrix, \(\mathbf{X}_{ia}\).

\(~\)

The general form of the meta-regression is:

\[\hat{\boldsymbol{\theta}}_{ia} = \mathbf{X}_{ia}\boldsymbol{\beta} + \mathbf{Z}_i\mathbf{b}_i + \epsilon_{ia}\]

where \(\boldsymbol{\beta}\) is the vector of fixed effects, \(\mathbf{b}_i \sim N(0, \Psi)\) is the location-specific random effect, and \(\epsilon_{ia} \sim N(0, \mathbf{S}_{ia})\) are the residuals.

Predictive Meta-Regression Model

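  • Best linear unbiased prediction (BLUP): \(\hat{\boldsymbol{\theta}}_{ia}^b = \mathbf{X}_{ia} \hat{\boldsymbol{\beta}} + \mathbf{Z}_i\hat{\Psi}\mathbf{Z}_i'\Sigma_{ia}^{-1}(\hat{\boldsymbol{\theta}}_{ia}-\mathbf{X}_{ia}\hat{\boldsymbol{\beta}})\), where \(\Sigma_{ia} = \mathbf{Z}_i\hat{\Psi}\mathbf{Z}_i' + \mathbf{S}_{ia}\) is the marginal covariance of \(\hat{\boldsymbol{\theta}}_{ia}\)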
  • Fixed effect prediction: \(\hat{\boldsymbol{\theta}}_{ia}^f = \mathbf{X}_{ia} \hat{\boldsymbol{\beta}}\)
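To see what the BLUP does, consider the simplest case: a single coefficient with \(\mathbf{Z}_i = 1\). Assuming the usual marginal variance \(\Sigma_{ia} = \psi + s_{ia}\), the BLUP shrinks the first-stage estimate toward the fixed-effect prediction by the factor \(\psi/(\psi + s_{ia})\). The sketch below is a scalar illustration under those assumptions, with hypothetical names and numbers.

```python
def fixed_prediction(x, beta):
    """Fixed-effect prediction X_ia * beta for a scalar coefficient."""
    return sum(xj * bj for xj, bj in zip(x, beta))


def blup(theta_hat, s, x, beta, psi):
    """Scalar BLUP: shrink the first-stage estimate toward the fixed part.

    theta_hat -- first-stage estimate
    s         -- its within-location variance (S_ia)
    psi       -- between-location variance (Psi)
    Assumes Z_i = 1 and Sigma_ia = psi + s.
    """
    fixed = fixed_prediction(x, beta)
    weight = psi / (psi + s)
    return fixed + weight * (theta_hat - fixed)


# Hypothetical design row: intercept, A_ia, one composite index w_i.
x = [1.0, 70.0, 0.5]
beta = [0.10, 0.002, -0.05]
theta_f = fixed_prediction(x, beta)                           # 0.215
theta_b_precise = blup(0.40, s=0.01, x=x, beta=beta, psi=0.04)  # weight 0.8
theta_b_noisy = blup(0.40, s=0.16, x=x, beta=beta, psi=0.04)    # weight 0.2
```

A precise first-stage estimate (small \(s_{ia}\)) stays close to its observed value of 0.40, while a noisy one is pulled most of the way to the fixed-effect prediction.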

Spatialization of Risk

BLUP residuals, \(\hat{\boldsymbol{\xi}}_i\), can capture patterns unexplained by the mixed-effects meta-regression.

\[\hat{\boldsymbol{\xi}}_i = \hat{\boldsymbol{\theta}}_{ia}^b - \hat{\boldsymbol{\theta}}_{ia}^f\]

Computing \(\hat{\boldsymbol{\xi}}_i\) requires first-stage estimates, so it is available only for observed locations.

\(~\)

For unobserved locations, we can estimate \(\hat{\boldsymbol{\xi}}_i^*\) using a geostatistical method such as:

  • Kriging
  • Integrated Nested Laplace Approximations (INLA)

Prediction to Unobserved Locations

The spatial estimates of the BLUP residuals allow us to estimate the BLUP, even for locations where mortality was not directly observed.

\[~\]

For locations that are unobserved, we use:

  • \(\hat{\boldsymbol{\theta}}_{ia}^{f*}\), estimated from \(A_{ia}\) and \(\mathbf{w}_i\), and
  • \(\hat{\boldsymbol{\xi}}_i^*\), estimated with Kriging,

to get the BLUP:

\[\hat{\boldsymbol{\theta}}_{ia}^{b*} = \hat{\boldsymbol{\theta}}_{ia}^{f*} + \hat{\boldsymbol{\xi}}_i^*.\]
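The two-part prediction can be sketched for a scalar coefficient. Note that the interpolation step below uses inverse-distance weighting as a simple stand-in: it is not kriging, it does not model spatial covariance, and it gives no prediction variance. All names, coordinates, and values are hypothetical.

```python
import math


def idw_residual(target, observed, power=2.0):
    """Inverse-distance-weighted average of observed BLUP residuals.

    target   -- (x, y) coordinates of the unobserved location
    observed -- list of ((x, y), xi_hat) pairs for observed locations
    """
    num = den = 0.0
    for (ox, oy), xi in observed:
        dist = math.hypot(target[0] - ox, target[1] - oy)
        if dist == 0.0:
            return xi  # target coincides with an observed location
        w = dist ** -power
        num += w * xi
        den += w
    return num / den


def predict_unobserved(theta_f_star, xi_star):
    """BLUP-like prediction: fixed-effect part plus interpolated residual."""
    return theta_f_star + xi_star


# Two observed locations with hypothetical residuals; the target is midway.
observed = [((0.0, 0.0), 0.10), ((2.0, 0.0), -0.02)]
xi_star = idw_residual((1.0, 0.0), observed)     # equal weights: 0.04
theta_b_star = predict_unobserved(0.215, xi_star)
```

A kriging or INLA fit would replace `idw_residual` and would also supply \(V(\hat{\boldsymbol{\xi}}_i^*)\), which the uncertainty stage needs.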

RMSE of First-Stage and Predicted vs BLUP

Age Standardization of Impacts

Previously, impacts were reported as the attributable fraction and attributable number:

  • \(AF^x_{ia} = 1 - \exp{(-f(x, l; \hat{\boldsymbol{\theta}}_{ia}^{b*}))}\),
  • \(AN^x_{ia} = AF^x_{ia}d_{ia}\) where \(d_{ia}\) is the total number of deaths.

Excess mortality rate:

  • \(E^*_{ia} = AN_{ia}/p_{ia}\), where \(p_{ia}\) is the population of age group \(a\) at location \(i\) over the study period.

Standardized excess mortality rate:

  • \(E^*_i = \left(\sum_a w_a\right)^{-1}\sum_aE^*_{ia}w_a\), where the weight \(w_a\) is the group-specific proportion of a reference population.
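The chain from attributable fraction to standardized excess rate can be worked through numerically. The sketch below treats \(f(x, l; \hat{\boldsymbol{\theta}}_{ia}^{b*})\) as a single log relative risk per group, which is a simplification; the function names and all numbers are hypothetical.

```python
import math


def attributable_fraction(log_rr):
    """AF = 1 - exp(-f), with f the log relative risk at exposure x."""
    return 1.0 - math.exp(-log_rr)


def excess_rate(log_rr, deaths, population):
    """E*_ia = AN_ia / p_ia, with AN_ia = AF_ia * d_ia."""
    attributable_number = attributable_fraction(log_rr) * deaths
    return attributable_number / population


def standardized_rate(rates, weights):
    """Weighted average of group rates using reference-population weights."""
    return sum(e * w for e, w in zip(rates, weights)) / sum(weights)


# Two hypothetical age groups at one location:
# (log relative risk, deaths d_ia, population p_ia).
groups = [
    (math.log(1.25), 200, 50_000),  # AF = 0.2, AN = 40,  rate = 0.0008
    (math.log(2.0), 900, 30_000),   # AF = 0.5, AN = 450, rate = 0.015
]
rates = [excess_rate(*g) for g in groups]
E_std = standardized_rate(rates, weights=[0.7, 0.3])  # 0.00506
```

Applying the same reference weights at every location makes the resulting rates comparable across locations with different age structures.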

Uncertainty Assessment

Empirical confidence intervals (eCIs) quantify the uncertainty of our estimates of \(E^*_i\).

\(~\)

The established method resamples from \(\hat{\boldsymbol{\theta}}_{ia}^{b*}\) with its corresponding covariance matrix \(V(\hat{\boldsymbol{\theta}}_{ia}^{b*})\), but this ignores the dependence between predictions that arises from their shared fixed-effect component.

\(~\)

Therefore, we sample \(\boldsymbol{\beta}\) from \(N(\hat{\boldsymbol{\beta}}, \mathbf{V}_{\boldsymbol{\beta}})\) and the spatial residual from the distribution of \(\hat{\boldsymbol{\xi}}_i^*\) with covariance \(V(\hat{\boldsymbol{\xi}}_i^*)\), then recompute the prediction for each draw.

\(~\)

This more fully accounts for the uncertainty in our estimates.
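A minimal sketch of this simulation for one location follows. It makes several simplifying assumptions that are not in the paper: \(\boldsymbol{\theta}_{ia}\) is a scalar log relative risk, \(\mathbf{V}_{\boldsymbol{\beta}}\) is diagonal (a full covariance would need a multivariate normal draw), and \(\hat{\xi}_i^*\) is normal with a known variance. All names and numbers are hypothetical.

```python
import math
import random


def standardized_rate(beta, xi, groups, weights):
    """Standardized excess rate at one location for given beta and xi.

    groups -- list of (x, deaths, population), one entry per age group,
              where x is the design row (intercept, A_ia, ...).
    """
    total = 0.0
    for (x, deaths, population), w in zip(groups, weights):
        theta = sum(xj * bj for xj, bj in zip(x, beta)) + xi
        af = 1.0 - math.exp(-theta)
        total += w * af * deaths / population
    return total / sum(weights)


def empirical_ci(beta_hat, beta_var, xi_hat, xi_var, groups, weights,
                 n_sim=2000, level=0.95, seed=1):
    """Percentile interval from joint draws of beta and xi."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sim):
        # One draw of beta and xi per replicate, shared by all age groups,
        # so the dependence between group-level predictions is preserved.
        beta = [rng.gauss(b, math.sqrt(v)) for b, v in zip(beta_hat, beta_var)]
        xi = rng.gauss(xi_hat, math.sqrt(xi_var))
        draws.append(standardized_rate(beta, xi, groups, weights))
    draws.sort()
    lower = draws[int(n_sim * (1.0 - level) / 2.0)]
    upper = draws[int(n_sim * (1.0 + level) / 2.0) - 1]
    return lower, upper


groups = [([1.0, 70.0], 200, 50_000), ([1.0, 85.0], 900, 30_000)]
weights = [0.7, 0.3]
beta_hat, beta_var = [0.10, 0.002], [4e-4, 1e-8]
xi_hat, xi_var = 0.02, 1e-4

point = standardized_rate(beta_hat, xi_hat, groups, weights)
lower, upper = empirical_ci(beta_hat, beta_var, xi_hat, xi_var, groups, weights)
```

The key design choice is that each replicate reuses the same \(\boldsymbol{\beta}\) draw for every group; drawing each \(\hat{\boldsymbol{\theta}}_{ia}^{b*}\) independently would treat the group-level predictions as unrelated.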

Review of Eight-Step Process: 1-3

Review of Eight-Step Process: 4-8

Bibliography

Gasparrini, A., B. Armstrong, and M. G. Kenward. 2012. “Multivariate Meta‐analysis for Non‐linear and Other Multi‐parameter Associations.” Statistics in Medicine 31 (29): 3821–39. https://doi.org/10.1002/sim.5471.
Masselot, Pierre, and Antonio Gasparrini. 2025. “Modelling Extensions for Multi-Location Studies in Environmental Epidemiology.” Statistical Methods in Medical Research, February, 09622802241313284. https://doi.org/10.1177/09622802241313284.
Sera, Francesco, Benedict Armstrong, Marta Blangiardo, and Antonio Gasparrini. 2019. “An Extended Mixed‐effects Framework for Meta‐analysis.” Statistics in Medicine 38 (29): 5429–44. https://doi.org/10.1002/sim.8362.