Epidemiology 202: Homework 1

Paper: Birth Characteristics and Subsequent Risks of Maternal Cardiovascular Disease: Effects of Gestational Age and Fetal Growth

1. Would you describe the study population as open or closed? Explain your answer. How was membership in the study population defined?

This is a closed cohort. Membership in the study population was defined by an “event” (having a first singleton birth during the enrollment period) rather than a “state”, and once included, women remained members of the cohort.

2. From the viewpoint of the investigators, is the time frame that is observed between exposure and outcome retrospective, cross-sectional, or prospective? (1-2 sentences)

It is prospective by the exposure-outcome temporality definition. The exposure was measured at the time of childbirth, i.e., before the occurrence of outcome events.

3. Describe the study population (2-3 sentences).

All women who had a first singleton delivery in Sweden between 1/1/1983 and 12/31/2005 were potentially eligible. Women who immigrated, women who had a CVD event before the first birth, and women with missing data (gestational age, birth weight, or a record in the Medical Birth Register) were excluded, leaving 923,686 women as the study sample.

4. What are the 2 main exposures of interest and how are they ascertained and defined in their analysis? (1-2 sentences)

Delivery of a preterm infant (exposure 1) and delivery of a small for gestational age (SGA) infant (exposure 2) at the first birth were the exposures of interest. Both were ascertained from the Medical Birth Register: gestational age was estimated from ultrasonography or last menstrual period (preterm birth defined as birth at < 37 weeks of gestation), and birth weight was used to classify SGA (defined as < 90% of expected birth weight).

5. The major outcome of interest is maternal cardiovascular disease. How did the authors ascertain and define the outcome? (1-2 sentences)

Incident cardiovascular diseases were ascertained from the ICD codes in the Hospital Discharge Register or the Cause of Death Register, and were defined as the first hospitalization or death caused by coronary heart disease (unstable angina or acute myocardial infarction), cerebrovascular events (cerebral infarction, cerebral hemorrhage, subarachnoid hemorrhage, transient ischemic attack, or other acute stroke), or heart failure.

6a. Describe the study hypothesis, including the direction of association.

It was hypothesized that the maternal risk of cardiovascular disease increases with the degree of prematurity and with the severity of small for gestational age status of the first-born infant.

6b. In general, how can a prospective study design help address the limitations of retrospective and cross-sectional studies?

In a prospective study, participants can be confirmed to be free of the outcome at enrollment, so temporality (exposure precedes outcome) can be established directly. Appropriate temporality is essential for causal inference. In addition, because exposure is measured before the outcome occurs, misclassification of exposure is unlikely to depend on outcome status (i.e., it is unlikely to be differential).

7a. The primary measure of association is the Hazard Ratio. For the purposes of this homework, you may assume that this is synonymous with the Incidence Rate Ratio. Why do you think this measure was chosen? If the authors chose to calculate a cumulative incidence ratio for this study would it be valid? Why or why not? (2-3 sentences)

A rate measure rather than a risk measure was chosen because follow-up duration differed across participants. Enrollment spanned 22 years (1983-2005), but follow-up ended on 12/31/2005 at the latest, and earlier at a CVD event, emigration, or death. A 22-year cumulative incidence (risk) would be valid only if everybody had been followed for the full 22 years; calculating it here would overestimate the time at risk and therefore underestimate the risk.
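The underestimation can be illustrated with a small worked example. The numbers in this sketch (a constant rate of 1 per 1,000 person-years, with half of the cohort followed for 20 years and half for only 2) are hypothetical and are not taken from the paper.

```r
## Hypothetical cohort: every value here is an illustrative assumption.
rate      <- 0.001          # constant CVD rate per person-year
n         <- c(500, 500)    # women enrolled early vs. late
follow.up <- c(20, 2)       # years of follow-up available to each group

## Expected events and person-years under a constant rate
risk.by.group <- 1 - exp(-rate * follow.up)
events        <- n * risk.by.group
person.years  <- events / rate   # expected time at risk until event or end of follow-up

naive.risk <- sum(events) / sum(n)   # treats everyone as if followed for 20 years
true.risk  <- 1 - exp(-rate * 20)    # risk if all had been followed for 20 years
est.rate   <- sum(events) / sum(person.years)

round(c(naive.risk = naive.risk, true.risk = true.risk, est.rate = est.rate), 4)
```

Under these assumptions the naive cumulative incidence (about 0.011) is roughly half of the 20-year risk (about 0.020), while the rate based on person-time recovers the assumed 0.001 per person-year.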

7b. What was the final multivariable + smoking adjusted hazard ratio that compares the hazard of cardiovascular disease in women who had very preterm babies, with women having babies at term not taking into account birth weight for gestational age? Give an interpretation for this quantity.

2.57 (95% confidence interval 1.97, 3.34). After adjustment for the multivariable covariates and smoking, the hazard of cardiovascular disease in women who had very preterm babies is 2.57 times that in women who had term babies.

8a. What are the essential properties of confounding variables?

A confounding variable must be associated with both the outcome of interest and the predictor of interest. It cannot be a causal intermediate or a common effect of the exposure and the outcome.

8b. Draw a Directed Acyclic Graph (DAG) representing the relationship between preterm birth and cardiovascular disease in mothers, with maternal age represented as a confounder. Label all variables and explain the meaning of each arrow on the graph.

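library(dagR)
## Node 0 = exposure, node -1 = outcome, node 1 = the single covariate (maternal age).
## arcs: 1 -> 0 (age -> preterm birth) and 1 -> -1 (age -> CVD); assocs = 0 marks both as directed.
dag.dat <- dag.init(outcome = NULL, exposure = NULL, covs = c(1), arcs = c(1,0, 1,-1),
                    assocs = c(0,0), xgap = 0.04, ygap = 0.05, len = 0.1, y.name = "Later Life Maternal CVD",
                    x.name = "Preterm birth", cov.names = c("Maternal Age"))

junk <- dag.draw(dag.dat, numbering = TRUE)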

[Figure: DAG with nodes Maternal Age, Preterm birth, and Later Life Maternal CVD]

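Arrows:

Maternal Age -> Preterm birth: maternal age affects the risk of preterm birth.
Maternal Age -> Later Life Maternal CVD: maternal age affects the risk of later-life CVD, which makes it a common cause (confounder).
Preterm birth -> Later Life Maternal CVD: the hypothesized causal effect of preterm birth on maternal CVD, i.e., the relationship under study.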

8c. In one of their models, the authors adjusted for smoking in their analysis. Do you think smoking is a confounder of the relationship between preterm birth and CVD in mothers? Why or why not? (2-3 sentences)

Yes. Smoking can potentially cause preterm birth and can also cause CVD, so it is a common cause of both the predictor of interest and the outcome of interest. These characteristics meet the definition of a confounder.

8d. What potential confounders did the authors include in their multivariable adjusted model for the association between preterm birth and maternal cardiovascular morbidity?

The model was adjusted for maternal age, birth year, highest income and education level before first delivery, country of birth, pregestational diabetes mellitus, pregestational hypertension, gestational diabetes mellitus, gestational hypertension, and preeclampsia/eclampsia.

8e. Did adjustment for these confounders (including smoking) change the interpretation of the results, for the association between gestational age and CVD risk, when comparing the group that was moderately preterm to the group that delivered at term? If so, how?

Adjustment for these potential confounders diminished the ratio seen in the crude analysis, indicating that part of the crude association was explained by confounding.

9a. The authors suggest that a possible mechanism giving rise to the association, is related to an inflammatory pathway that causes both preterm birth and CVD. Is this hypothesis consistent with a causal effect of preterm birth on CVD?

No. They also wrote “it has been suggested that preterm birth and CVD share common antecedents.” This explanation suggests a common cause. If it is true, the association arises from confounding, not from a causal relationship.

9b. If the correlation between preterm birth and CVD is not due to a causal relationship, could the study still have clinical relevance? If so, briefly, suggest a clinically relevant interpretation of the paper which does not depend on maternal cardiovascular morbidity being a causal effect of preterm birth. If not, explain why.

It is still clinically relevant. The postulated common cause, an inflammatory pathway, is not directly observable. Preterm birth can act as a surrogate marker of the underlying inflammatory state, i.e., observing preterm birth makes it possible to predict the risk of later-life CVD events.

10. Please consider the dichotomized study data below. You are concerned about the association of preterm births on CVD. Do you find evidence of effect measure modification by gestational age on the multiplicative scale? Do you find evidence of effect measure modification by gestational age on the additive scale? [No need for formal statistical testing, or confidence interval estimation]

Multiplicative scale

The rate ratio for SGA is 1.33 among term births and 1.57 among preterm births, an 18% relative increase (interaction term 1.18). Thus, effect measure modification is present on the multiplicative scale.

Additive scale

The rate difference for SGA is approximately 3 times as large in the preterm stratum as in the term stratum (about 28.6 vs. 9.5 per 100,000 person-years). Thus, effect measure modification is present on the additive scale.

sga.preterm.dat <- read.table(header = TRUE, text = "
event   py      sga     preterm
2131    7348276 No      No
215     427677  No      Yes
1001    2600683 Yes     No
195     247245  Yes     Yes
")

sga.preterm.dat
  event      py sga preterm
1  2131 7348276  No      No
2   215  427677  No     Yes
3  1001 2600683 Yes      No
4   195  247245 Yes     Yes
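The helper ..glm.OR() called in the models that follow is not defined in this document. Judging from its output, it tabulates exponentiated coefficients with 95% confidence limits. A minimal stand-in under that assumption is sketched here. The name rate.ratio.table is hypothetical, and Wald limits from confint.default() are used, which may differ slightly from whatever the original helper computes.

```r
## Data as given in the homework
sga.preterm.dat <- data.frame(
    event   = c(2131, 215, 1001, 195),
    py      = c(7348276, 427677, 2600683, 247245),
    sga     = c("No", "No", "Yes", "Yes"),
    preterm = c("No", "Yes", "No", "Yes"))

## Hypothetical stand-in for ..glm.OR():
## exponentiated coefficients with Wald 95% confidence limits
rate.ratio.table <- function(model, digits = 2) {
    est <- cbind(RR = coef(model), confint.default(model))
    round(exp(est), digits)
}

## Crude SGA rate ratio from a Poisson model with person-years as the offset
pois.sga <- glm(event ~ sga, offset = log(py), family = poisson, data = sga.preterm.dat)
rr.sga   <- rate.ratio.table(pois.sga)
rr.sga
```

Because the model has a single binary predictor, the fitted rate ratio equals the crude ratio of pooled rates (about 1.39), which gives a quick check on the helper.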

## SGA Yes vs No rate ratio
pois.sga <- glm(event ~ sga, offset = log(py), family = poisson, data = sga.preterm.dat)
## summary(pois.sga)
..glm.OR(pois.sga)
              RR 2.5 % 97.5 %
(Intercept) 0.00   0.0   0.00
sgaYes      1.39   1.3   1.49

## Preterm Yes vs No rate ratio
pois.preterm <- glm(event ~ preterm, offset = log(py), family = poisson, data = sga.preterm.dat)
## summary(pois.preterm)
..glm.OR(pois.preterm)
              RR 2.5 % 97.5 %
(Intercept) 0.00  0.00   0.00
pretermYes  1.93  1.74   2.14

## No effect measure (multiplicative) modification model
pois.sga.preterm <- glm(event ~ sga + preterm, offset = log(py), family = poisson, data = sga.preterm.dat)
## summary(pois.sga.preterm)
..glm.OR(pois.sga.preterm)
              RR 2.5 % 97.5 %
(Intercept) 0.00  0.00   0.00
sgaYes      1.36  1.26   1.45
pretermYes  1.87  1.68   2.07

## Effect measure (multiplicative) modification model: SGA = Yes RR 1.33 (1.57 if Preterm = Yes)
pois.sga.preterm.interaction <-
    glm(event ~ sga * preterm, offset = log(py), family = poisson, data = sga.preterm.dat)
## summary(pois.sga.preterm.interaction)
..glm.OR(pois.sga.preterm.interaction)
                    RR 2.5 % 97.5 %
(Intercept)       0.00  0.00   0.00
sgaYes            1.33  1.23   1.43
pretermYes        1.73  1.50   1.99
sgaYes:pretermYes 1.18  0.96   1.46

## Plot the fitted rates by SGA and preterm status
library(effects)
plot(effect("sga:preterm", pois.sga.preterm.interaction), multiline = TRUE)

[Figure: effect plot of fitted CVD rates by SGA and preterm status from the interaction model]
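Both answers to question 10 can also be checked by hand from the four rows of data. The sketch below computes the SGA rate ratio and rate difference within each gestational age stratum; the object names are illustrative.

```r
## Events and person-years arranged as SGA (rows) by gestational age (columns)
labels <- list(sga = c("No", "Yes"), birth = c("Term", "Preterm"))
events <- matrix(c(2131, 1001, 215, 195), ncol = 2, dimnames = labels)
py     <- matrix(c(7348276, 2600683, 427677, 247245), ncol = 2, dimnames = labels)

## Rates per 100,000 person-years
rates <- events / py * 1e5

## SGA Yes vs. No within each gestational age stratum
rate.ratio      <- rates["Yes", ] / rates["No", ]
rate.difference <- rates["Yes", ] - rates["No", ]

round(rbind(rate.ratio, rate.difference), 2)

## Ratio of the two rate ratios (the multiplicative interaction term)
round(unname(rate.ratio["Preterm"] / rate.ratio["Term"]), 2)
```

The rate ratios come out near 1.33 and 1.57, and the rate differences near 9.5 and 28.6 per 100,000 person-years, consistent with the interaction model and with the two answers above.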

11. In a cohort study, selection bias can result from differential loss to follow-up. You are planning a similar cohort study of birth characteristics and CVD risk in a population which has a high rate of emigration (moving out of the country). Describe a scenario in your new study which could threaten the validity of your study due to selection bias.

If emigration (loss to follow-up) is a common effect of birth characteristics and of a third variable that also causes later-life maternal CVD events, conditioning on emigration by observing only those who remain in the country opens a path between exposure and outcome, resulting in selection bias. For example, if mothers who had preterm infants are more likely to emigrate for better perinatal care, and rich people, who have lower later-life CVD risk, are also more likely to emigrate for a better life, then observing events in those who remain in the country will result in selection bias, as shown in the simulation below.

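Conditions for simulation: 500 women in each SES-by-birth stratum; a 10-year CVD risk of 0.2 in poor women and 0.1 in rich women regardless of birth type; emigration proportions of 0.4 (poor, preterm), 0.2 (poor, term), 0.8 (rich, preterm), and 0.4 (rich, term); emigrants contribute half of their events and person-years.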

Preparation of matrices for simulation.

Labels                           <- list(SES = c("Poor","Rich"), Birth = c("Preterm","Term"))

cohort.sample.size               <- matrix(rep(500,4),                    ncol = 2, dimnames = Labels)
ten.year.risks                   <- matrix(c(0.2,0.2,0.1,0.1), byrow = T, ncol = 2, dimnames = Labels)
emigration.proportions           <- matrix(c(0.4,0.2,0.8,0.4), byrow = T, ncol = 2, dimnames = Labels)

list(cohort.sample.size     = cohort.sample.size,
     ten.year.risks         = ten.year.risks,
     emigration.proportions = emigration.proportions)
$cohort.sample.size
      Birth
SES    Preterm Term
  Poor     500  500
  Rich     500  500

$ten.year.risks
      Birth
SES    Preterm Term
  Poor     0.2  0.2
  Rich     0.1  0.1

$emigration.proportions
      Birth
SES    Preterm Term
  Poor     0.4  0.2
  Rich     0.8  0.4

No emigration scenario

No effect of preterm vs. term birth is observed if there is no loss to follow-up.

events                           <- cohort.sample.size * ten.year.risks
person.years                     <- cohort.sample.size * 10
events.per.1000PY                <- margin.table(events, 2) / margin.table(person.years, 2) * 1000

list("CVD events per 1000 PY" = events.per.1000PY)
$`CVD events per 1000 PY`
Birth
Preterm    Term 
     15      15 

Emigration scenario

A small spurious association (rate ratio about 1.03) appears between preterm birth and CVD event rates over the 10-year follow-up duration.

cohort.remaining                 <- cohort.sample.size * (1 - emigration.proportions)
events.in.cohort.remaining       <- cohort.remaining * ten.year.risks
person.years.in.cohort.remaining <- cohort.remaining * 10

cohort.emigrated                 <- cohort.sample.size * emigration.proportions
events.in.cohort.emigrated       <- cohort.emigrated * ten.year.risks * 0.5 # Observed events (50%)
person.years.in.cohort.emigrated <- cohort.emigrated * 10 * 0.5             # Observed person-years (50%)

events.total                     <- events.in.cohort.remaining + events.in.cohort.emigrated
person.years.total               <- person.years.in.cohort.remaining + person.years.in.cohort.emigrated
events.per.1000PY                <- margin.table(events.total, 2) / margin.table(person.years.total, 2) * 1000

list("CVD events per 1000 PY" = events.per.1000PY)
$`CVD events per 1000 PY`
Birth
Preterm    Term 
 15.714  15.294 
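One way to see where the spurious association comes from is to repeat the emigration scenario within each SES stratum. The sketch below rebuilds the same matrices and assumes, as the simulation above does, that emigrants contribute half of their events and person-years. The names observed.fraction, rates.by.ses, crude.rates, and crude.rate.ratio are introduced here for illustration.

```r
Labels <- list(SES = c("Poor", "Rich"), Birth = c("Preterm", "Term"))
cohort.sample.size     <- matrix(rep(500, 4), ncol = 2, dimnames = Labels)
ten.year.risks         <- matrix(c(0.2, 0.2, 0.1, 0.1), byrow = TRUE, ncol = 2, dimnames = Labels)
emigration.proportions <- matrix(c(0.4, 0.2, 0.8, 0.4), byrow = TRUE, ncol = 2, dimnames = Labels)

## Emigrants are observed for half of the follow-up; everyone else for all of it
observed.fraction  <- 1 - 0.5 * emigration.proportions
events.total       <- cohort.sample.size * ten.year.risks * observed.fraction
person.years.total <- cohort.sample.size * 10 * observed.fraction

## Rates per 1000 person-years within each SES stratum
rates.by.ses <- events.total / person.years.total * 1000
rates.by.ses

## Crude rates pool the SES strata with different person-time weights
crude.rates      <- colSums(events.total) / colSums(person.years.total) * 1000
crude.rate.ratio <- unname(crude.rates["Preterm"] / crude.rates["Term"])
round(crude.rate.ratio, 3)
```

Within each SES stratum the preterm and term rates are equal (20 and 10 per 1,000 person-years), so the crude rate ratio of about 1.03 arises only because emigration leaves the preterm and term groups with different mixes of rich and poor person-time.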

12. On a scale between 0 and 10 inclusive, please indicate your degree of confidence in the following statement: “Delivery of a preterm or SGA first born infant causes increased risk of CVD hospitalization or death.” Justify your answer, briefly, in terms of the principles of epidemiologic research.

2/10. The current study showed an association between delivery of a preterm or SGA first-born infant and increased maternal risk of CVD hospitalization or death. However, the authors explain the association by a common cause, an unobserved inflammatory pathway, thereby effectively stating that the association resulted from confounding by an unobserved variable. Nonetheless, this study does not rule out other pathways that connect the exposure to the outcome causally, so I assign 2 for that possibility.