Designing Population Health Studies

March 31 and April 2, 2025
Eric Delmelle

Chapter Overview

  • Research design fundamentals in epidemiology and public health
  • Measurement validity and reliability
  • Error and bias in population health research
  • Study design approaches:
    • Cross-sectional
    • Case-control
    • Cohort
    • Experimental
  • Qualitative methods and their integration with quantitative approaches
  • Ethical considerations in population health research

1 A Matter of Measurement

Primary vs. Secondary Data

Primary data:
- Data collected specifically for the purpose of the study.

Secondary data:
- Data collected for other purposes but reorganized and reanalyzed.

Examples:
- Health insurance claims data
- Employment records
- National health surveys

2 Primary vs. Secondary Data

  • Primary data
    • Collected specifically for a new study
    • Controlled by the researcher
    • Tailored to answer specific research questions
  • Secondary data
    • Pre-existing data collected for another purpose
    • Often large-scale and readily available
    • May require cleaning or transformation

While primary data offers control and precision, secondary data can save time and resources.

3 Levels of Measurement

| Level | Description | Example | Operations |
|---|---|---|---|
| Nominal | Categories with no ranking | Blood type, sex | Equality/inequality |
| Ordinal | Ordered categories | Health self-rating: excellent, good, fair, poor | Greater than/less than |
| Interval | Equal units, no true zero | Temperature in °C | Addition/subtraction |
| Ratio | Equal units with true zero | Weight, blood pressure | Multiplication/division |

Understanding measurement levels is crucial for selecting appropriate statistical analyses. A variable can always be reduced to a lower level of measurement (continuous to categorical), but not elevated (categorical to continuous).
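
As a minimal sketch of reducing a ratio-scale variable to an ordinal one, the snippet below bins age in years into ordered groups. The cut points and labels are illustrative assumptions, not values from the lecture.

```python
from bisect import bisect_right

# Hypothetical cut points (years) and ordered labels, chosen only for illustration.
CUT_POINTS = [18, 40, 65]
LABELS = ["<18", "18-39", "40-64", "65+"]


def age_group(age_years: float) -> str:
    """Map a ratio-scale age to an ordinal age group."""
    # bisect_right counts how many cut points are <= age_years,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(CUT_POINTS, age_years)]


ages = [12, 18, 39.5, 64, 65, 90]
groups = [age_group(a) for a in ages]
print(groups)
```

The reverse mapping is not possible: from the label "40-64" alone, the exact age cannot be recovered.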

4 Ecological Studies and Fallacy

  • Unit of analysis: Group (e.g., city-level data)
  • Examples:
    • Community fluoride levels and dental caries
    • Countries’ smoking rates and lung cancer rates
  • Ecological fallacy: Attributing group-level associations to individuals
  • Example:
    • Classrooms with a higher share of women had higher class-average grades
    • But within every classroom, men averaged higher grades than women
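
Individual grades by classroom (F = female, M = male):

| Classroom A | Classroom B | Classroom C |
|---|---|---|
| F 70 | F 65 | F 65 |
| F 70 | F 70 | F 70 |
| F 70 | F 70 | F 80 |
| F 75 | F 75 | F 80 |
| F 70 | F 80 | M 70 |
| F 80 | F 85 | M 75 |
| F 80 | M 80 | M 75 |
| F 80 | M 80 | M 80 |
| M 95 | M 85 | M 85 |
| M 100 | M 90 | M 90 |

Class means:

| Group | Classroom A | Classroom B | Classroom C |
|---|---|---|---|
| Female (F) | 74 | 74 | 74 |
| Male (M) | 98 | 84 | 79 |
| All students (FM) | 79 | 78 | 77 |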
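
The classroom example can be checked with a short script. This is only a sketch: it re-enters the grades from the table above and computes the group-level summary (share of women, class mean) next to the individual-level summary (mean by sex within each classroom). Variable names are my own.

```python
from statistics import mean

# Grades copied from the classroom table, keyed by classroom and sex.
classrooms = {
    "A": {"F": [70, 70, 70, 75, 70, 80, 80, 80], "M": [95, 100]},
    "B": {"F": [65, 70, 70, 75, 80, 85], "M": [80, 80, 85, 90]},
    "C": {"F": [65, 70, 80, 80], "M": [70, 75, 75, 80, 85, 90]},
}

summary = {}
for name, by_sex in classrooms.items():
    all_grades = by_sex["F"] + by_sex["M"]
    summary[name] = {
        "pct_female": 100 * len(by_sex["F"]) / len(all_grades),  # group-level exposure
        "class_mean": mean(all_grades),                           # group-level outcome
        "female_mean": mean(by_sex["F"]),                         # individual-level view
        "male_mean": mean(by_sex["M"]),
    }

for name, s in summary.items():
    print(
        f"Classroom {name}: {s['pct_female']:.0f}% female, "
        f"class mean {s['class_mean']:.1f}, "
        f"F mean {s['female_mean']:.1f}, M mean {s['male_mean']:.1f}"
    )
```

Across classrooms, the class mean rises with the share of women (77, 78, 79), yet inside each classroom the male mean exceeds the female mean. Inferring "women score higher" from the group-level pattern would be the ecological fallacy.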

5 Variables and Levels of Measurement

  • Categorical variables:
    • Dichotomous (e.g., male/female)
    • Polytomous (e.g., blood type)
    • Nominal (no implied order)
    • Ordinal (ranked, e.g., “good” > “fair”)
  • Continuous variables:
    • Interval scale (e.g., temperature in Celsius)
    • Ratio scale (e.g., body weight, height)

Note: Continuous variables can be converted to categorical, but not vice versa.

6 Types of Research Design

Key concepts to distinguish studies:

  • Purpose: Descriptive vs. analytical
  • Investigator control: Observational vs. interventional
  • Directionality: Forward vs. backward
  • Sample selection: Based on exposure, disease, or neither
  • Timing: Prospective vs. retrospective

Study types:

  • Cross-sectional
  • Case-control
  • Retrospective cohort
  • Prospective cohort
  • Randomized controlled trial (RCT)

7 Basic Terminology: Exposure and Disease

  • E = “Exposure”
    • Risk factor (e.g., smoking, occupational hazard)
    • Intervention (e.g., drug, prevention program)
  • D = “Disease” or outcome
    • Disease, injury, death
    • Any health-related outcome
  • E₀ and D₀ = Absence of exposure/disease
  • E₁ and D₁ = Presence of exposure/disease

8 Study Designs Summary

| Study Type | Purpose | Control | Directionality | Sample Selection | Timing |
|---|---|---|---|---|---|
| Cross-sectional | Descriptive/Analytical | Observational | Concurrent | Representative sample | Retrospective |
| Case-control | Analytical | Observational | Backward | Based on disease | Retrospective |
| Retrospective cohort | Analytical | Observational | Forward | Based on exposure | Retrospective |
| Prospective cohort | Analytical | Observational | Forward | Based on exposure | Prospective |
| Randomized controlled trial | Analytical | Interventional | Forward | Based on exposure | Prospective |

9 Cross-Sectional Studies

  • Also called prevalence studies
  • Exposure and outcome assessed simultaneously
  • Can be descriptive or analytical
  • Provides a “snapshot” of a population
  • Relatively quick and inexpensive

Limitations:

  • Cannot establish temporal sequence

  • Only includes survivors of disease

  • Not suitable for rare diseases

A snapshot in time: both exposure and outcome measured simultaneously

10 In-class exercise

11 Case-Control Analysis

  • Cannot directly compute relative risk
  • Use odds ratio (OR) as an estimate:

\[OR = \frac{ad}{bc}\]

  • In a 2×2 table:

\[ \begin{array}{|c|c|c|} \hline & D_1 & D_0 \\ \hline E_1 & a & b \\ \hline E_0 & c & d \\ \hline \end{array} \]

  • When disease is rare, OR ≈ RR
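
A minimal sketch of the cross-product calculation, using the cell labels a, b, c, d from the 2×2 table above. The counts are hypothetical and serve only to show the arithmetic.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio for a 2x2 table.

    a: exposed cases (E1, D1)      b: exposed controls (E1, D0)
    c: unexposed cases (E0, D1)    d: unexposed controls (E0, D0)
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical case-control study: 100 cases and 100 controls.
# 30 of the cases and 10 of the controls were exposed.
or_estimate = odds_ratio(a=30, b=10, c=70, d=90)
print(f"OR = {or_estimate:.2f}")
```

Here the odds of exposure among cases (30/70) are compared with the odds of exposure among controls (10/90), which is the same quantity as ad/bc.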

What is happening here?

12 Cohort Studies

  • Start with exposure status (E₁ and E₀)
  • Follow forward to observe outcome
  • Two types:
    • Prospective: Start now, follow into future
    • Retrospective: Look back at historical exposure
  • Can study multiple outcomes
  • Allow direct computation of incidence and relative risk

Following groups forward from exposure to outcome

13 Cohort Analysis

  • Relative Risk (RR) quantifies association between exposure and outcome:

\[RR = \frac{a/(a+b)}{c/(c+d)}\]

  • In a 2×2 table:

\[ \begin{array}{|c|c|c|} \hline & D_1 & D_0 \\ \hline E_1 & a & b \\ \hline E_0 & c & d \\ \hline \end{array} \]

  • Allows for:
    • Direct incidence calculation
    • Assessment of multiple outcomes
    • Use with rare exposures
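
The sketch below computes incidence in each exposure group and the relative risk from the 2×2 cells. It also compares RR with the cross-product ratio ad/bc for a rare and a common outcome. Both cohorts are hypothetical and were chosen so that RR is 2 in each.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = incidence in exposed / incidence in unexposed."""
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed


def cross_product_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio ad/bc, computed from the same cells for comparison."""
    return (a * d) / (b * c)


# Hypothetical cohorts as (a, b, c, d).
rare_outcome = (20, 9980, 10, 9990)    # incidence 0.2% vs 0.1%
common_outcome = (400, 600, 200, 800)  # incidence 40% vs 20%

rr_rare = relative_risk(*rare_outcome)
or_rare = cross_product_ratio(*rare_outcome)
rr_common = relative_risk(*common_outcome)
or_common = cross_product_ratio(*common_outcome)

print(f"Rare outcome:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
print(f"Common outcome: RR = {rr_common:.3f}, OR = {or_common:.3f}")
```

With the rare outcome the two measures nearly coincide. With the common outcome the odds ratio overstates the relative risk, which is why OR ≈ RR is only relied on when disease is rare.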

Challenges:

  • Time-consuming and costly

  • Potential for loss to follow-up

  • Not ideal for rare diseases

  • Diagnostic changes over time may affect results

14 Randomized Controlled Trials (RCTs)

  • Gold standard for assessing causal relationships
  • Participants randomly allocated to intervention (E₁) or control (E₀)
  • Minimizes confounding and bias through:
    • Randomization
    • Blinding (single, double)
  • Can be conducted at individual or group level

Phases of clinical trials:

  • Phase I: Safety and dosage (small sample)

  • Phase II: Effectiveness and side effects

  • Phase III: Confirm effectiveness, monitor adverse reactions (RCT)

  • Phase IV: Post-marketing surveillance

Randomization helps balance known and unknown confounders.
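
As an illustration only, the simulation below allocates hypothetical participants to intervention and control by simple randomization and compares one baseline characteristic (age) between arms. The sample size, age range, and seed are assumptions made for the sketch.

```python
import random
from statistics import mean

rng = random.Random(2025)  # fixed seed so the sketch is reproducible

# Hypothetical participants with one baseline characteristic (age in years).
participants = [{"id": i, "age": rng.uniform(20, 80)} for i in range(10_000)]

# Simple randomization: shuffle a copy, then split it in half.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
intervention = shuffled[:half]  # E1
control = shuffled[half:]       # E0

mean_age_intervention = mean(p["age"] for p in intervention)
mean_age_control = mean(p["age"] for p in control)

print(f"Intervention: n = {len(intervention)}, mean age = {mean_age_intervention:.1f}")
print(f"Control:      n = {len(control)}, mean age = {mean_age_control:.1f}")
```

Age was never used in the allocation, yet the two arms end up with similar mean ages. The same mechanism applies to characteristics that were not measured at all, although balance is only expected on average and small trials can be imbalanced by chance.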

15 Validity and Reliability

Measurement Validity

  • Face validity: Appears reasonable on the surface

  • Content validity: Covers full scope of concept

  • Construct validity: Reflects theoretical concept

  • Criterion validity: Correlates with external standard

  • Concurrent validity: Correlates with present condition

  • Predictive validity: Forecasts future outcome

Study Validity

  • Internal validity: Results valid for study sample

  • External validity: Results generalize to other populations

Reliability

  • Test-retest: Same test, different times

  • Inter-observer: Different observers agree

  • Intra-observer: Same observer consistent over time

16 Reliability vs. Validity

  • Reliability = consistency of measurement
  • Validity = whether the tool measures what it is intended to measure (accuracy)
  • A tool can be reliable but not valid
  • A tool cannot be valid if it is not reliable

Target analogy for validity and reliability

17 Types of Error

Random Error

  • Caused by chance or sampling variation

  • Affects precision

  • Can be reduced by increasing sample size

  • Produces unpredictable fluctuations

Systematic Error (Bias)

  • Consistent, repeatable error due to flaws in design or measurement

  • Affects validity

  • Not reduced by increasing sample size

  • Must be addressed in study design
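
The contrast between the two kinds of error can be simulated. In this sketch every measurement equals a true value plus a fixed bias plus random noise. All three numbers are hypothetical.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed for reproducibility

TRUE_VALUE = 120.0  # hypothetical true value (for example, systolic BP in mmHg)
BIAS = 5.0          # hypothetical systematic error: instrument reads 5 units high
NOISE_SD = 10.0     # hypothetical random error per measurement


def sample_mean(n: int) -> float:
    """Mean of n measurements that share the same bias and have independent noise."""
    total = sum(TRUE_VALUE + BIAS + rng.gauss(0, NOISE_SD) for _ in range(n))
    return total / n


def summarize(n: int, repeats: int = 200) -> tuple:
    """Average and spread of the sample mean over repeated studies of size n."""
    means = [sample_mean(n) for _ in range(repeats)]
    return mean(means), stdev(means)


small_avg, small_spread = summarize(25)
large_avg, large_spread = summarize(2500)

print(f"n = 25:   average estimate {small_avg:.1f}, spread {small_spread:.2f}")
print(f"n = 2500: average estimate {large_avg:.1f}, spread {large_spread:.2f}")
```

Increasing the sample size shrinks the spread of the estimates (better precision), but the estimates stay centered about 5 units above the true value. The bias is untouched by sample size and has to be removed through design or measurement.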

18 Major Types of Bias

Selection Bias

  • Systematic differences between those selected and not selected
  • Examples:
    • Low response rate
    • Healthy worker effect
    • Volunteer bias
    • Berkson’s bias (hospital sampling)
    • Loss to follow-up
    • Survivor bias

Information Bias

  • Measurement error or misclassification
  • Examples:
    • Recall bias
    • Observer/interviewer bias
    • Social desirability bias
    • Instrument bias
    • Diagnostic suspicion bias

Confounding

  • A third variable distorts the exposure-outcome relationship
  • A confounder must be:
    1. Associated with exposure
    2. Risk factor for the outcome
    3. Not an intermediate step in the causal path

19 Controlling for Confounding

At Design Stage

  1. Randomization
    • Evenly distributes confounders across groups
  2. Restriction
    • Limit study to specific subgroup
  3. Matching
    • Pair participants with similar characteristics

At Analysis Stage

  1. Stratification
    • Analyze within homogeneous strata
    • Mantel-Haenszel summary estimate
  2. Multivariable Modeling
    • Include confounders as covariates
    • Logistic regression, Cox models
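
A sketch of stratification with a Mantel-Haenszel summary odds ratio, using the standard estimator (sum of a·d/n over strata divided by sum of b·c/n). The two strata are hypothetical and were built so that the confounder is linked to both exposure and outcome while the exposure has no effect within either stratum.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio for one 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata) -> float:
    """Mantel-Haenszel summary OR over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical strata of a confounder, each as (a, b, c, d).
strata = [
    (80, 120, 20, 30),  # confounder present: exposure and disease both common
    (5, 45, 20, 180),   # confounder absent: exposure and disease both uncommon
]

# Crude table: add the strata together cell by cell, ignoring the confounder.
crude = tuple(sum(cells) for cells in zip(*strata))

crude_or = odds_ratio(*crude)
stratum_ors = [odds_ratio(*s) for s in strata]
mh_or = mantel_haenszel_or(strata)

print(f"Crude OR: {crude_or:.2f}")
print(f"Stratum-specific ORs: {[round(x, 2) for x in stratum_ors]}")
print(f"Mantel-Haenszel OR: {mh_or:.2f}")
```

The crude odds ratio suggests a strong association, but it disappears within each stratum and in the Mantel-Haenszel summary. In this constructed example the crude association is produced entirely by the confounder.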

20 Effect Modification (Interaction)

  • Occurs when the effect of exposure differs across levels of a third variable
  • Not the same as confounding
  • Can be additive or multiplicative
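  • Notation: \(RR_E\) and \(RR_M\) are the relative risks for the exposure alone and the modifier alone; \(RR_{EM}\) is the relative risk when both are present (reference: neither present)
  • Additive Model (expected joint effect with no interaction):
    \[RR_{EM} = RR_E + RR_M - 1\]
  • Multiplicative Model (expected joint effect with no interaction):
    \[RR_{EM} = RR_E \times RR_M\]
  • Interpretation:
    • If observed \(RR_{EM}\) > expected → synergism
    • If observed \(RR_{EM}\) < expected → antagonism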

Example: Asbestos and smoking on lung cancer risk

Interaction models
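
A small sketch of how an observed joint relative risk can be compared with the value expected under the additive and multiplicative models above. The relative risks used here are hypothetical and are not the asbestos and smoking figures.

```python
def expected_joint_rr(rr_e: float, rr_m: float) -> dict:
    """Expected joint RR under each no-interaction model."""
    return {
        "additive": rr_e + rr_m - 1,
        "multiplicative": rr_e * rr_m,
    }


def classify(observed: float, expected: float, tol: float = 1e-9) -> str:
    """Label the departure of the observed joint RR from the expected value."""
    if observed > expected + tol:
        return "synergism"
    if observed < expected - tol:
        return "antagonism"
    return "no interaction"


# Hypothetical relative risks, each relative to people with neither factor.
rr_e, rr_m, rr_em_observed = 3.0, 4.0, 12.0

expected = expected_joint_rr(rr_e, rr_m)
verdicts = {scale: classify(rr_em_observed, value) for scale, value in expected.items()}

for scale, value in expected.items():
    print(f"{scale}: expected {value:.1f}, observed {rr_em_observed:.1f} -> {verdicts[scale]}")
```

The same observed value shows synergism on the additive scale and no interaction on the multiplicative scale, so the scale should always be stated when reporting effect modification.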

21 Qualitative Methods

  • Originates from the social sciences
  • Explores perceptions, beliefs, experiences
  • Often used to:
    • Understand lived experiences
    • Explore context and meaning
    • Inform survey and tool development
  • Common techniques:
    • In-depth interviews
    • Focus groups
    • Participant observation
    • Document analysis

Mixed methods combine qualitative insight with quantitative data.

22 Types of Qualitative Methods

Participant Observation

  • Researcher immersed in environment
  • Captures real behavior and interactions
  • Varies in degree of participation

Interviews and Focus Groups

  • Structured: Same questions for all
  • Semi-structured: Flexible, guided
  • In-depth: Deep exploration
  • Focus groups: Group discussion dynamics

Document Analysis

  • Systematic review of written or visual materials
  • Examples:
    • Public records, policy reports
    • Personal journals, media, videos

23 Integrating Qualitative & Quantitative Methods

Ways to integrate qualitative methods:

  1. Pre-study: To develop hypotheses or instruments

  2. During study: To explain unexpected results

  3. Post-study: To interpret and validate findings

  4. Standalone: As an alternative or complement to quantitative

Example: Regional Health Needs Assessment

  • Quantitative: Epidemiologic data, service access

  • Qualitative: Focus groups with residents, interviews with key informants

Combining methods gives depth and context.

24 Ethical Considerations

Four Ethical Principles:

  1. Autonomy – Respect for individuals

  2. Beneficence – Maximize benefits

  3. Non-maleficence – Do no harm

  4. Justice – Fair treatment and burden distribution

Research Protections:

  • Informed consent

  • Confidentiality

  • Data security

  • Vulnerable populations

  • Institutional Review Boards (IRBs)

  • Research integrity and transparency

25 Key Takeaways

  • Choose the study design that best answers your research question
  • Recognize and control for bias and confounding
  • Understand the role of validity and reliability
  • Combine methods when appropriate
  • Always prioritize ethics and participant rights

“No method is inherently superior—it all depends on the research question.” – T. Kue Young