1. Introduction

The fundamental task of epidemiology is to quantify the occurrence of disease. We do this to describe the health of a population, identify groups at high risk, and evaluate the effectiveness of health interventions.

To measure disease, we don’t just count individuals; we relate those counts to the size of the population and the passage of time.


2. Fractions in Epidemiology

Before we measure specific diseases, we must understand the three mathematical building blocks:

A. Ratio

A fraction where the numerator is not part of the denominator.

\[\text{Ratio} = \frac{x}{y}\]

  • Example: The ratio of male to female births in a hospital (e.g., 1.05:1).

B. Proportion

A fraction where the numerator is part of the denominator. Usually expressed as a percentage.

\[\text{Proportion} = \frac{x}{x + y}\]

  • Example: 15 students in a class of 100 have the flu (15%).

C. Rate

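A measure of how quickly events occur: the number of events divided by the population at risk multiplied by the time over which that population is observed. Unlike a proportion, a rate carries units of “per unit of time” and is not bounded by 1.

  • Formula: \(\frac{\text{Events}}{\text{Population at risk} \times \text{Time period}}\)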
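As a minimal sketch in R, the three building blocks can be computed from simple counts. All counts below are hypothetical and chosen only to mirror the examples in this section; the variable names are illustrative, and the person-time figure assumes every student stays at risk for the full 30 days, which is a simplification.

```r
# Hypothetical counts, for illustration only
male_births   <- 105
female_births <- 100

# Ratio: the numerator is NOT part of the denominator
sex_ratio <- male_births / female_births     # 1.05, i.e. 1.05:1

# Proportion: the numerator IS part of the denominator
flu_cases  <- 15
class_size <- 100                            # includes the 15 cases
flu_proportion <- flu_cases / class_size     # 0.15, i.e. 15%

# Rate: events per unit of person-time at risk
# Assumption: 100 students, each observed for 30 days
person_days <- 100 * 30
flu_rate <- flu_cases / person_days          # 0.005 cases per person-day
```

The ratio and the proportion are unitless, while the rate is expressed per person-day; changing the time unit changes its numeric value.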


3. Measures of Morbidity

Morbidity refers to the state of being diseased or unhealthy within a population. The two primary measures are Prevalence and Incidence.

3.1 Prevalence

Prevalence measures the burden of disease. It answers: What proportion of the population has the disease at a specific point in time?

Point Prevalence

\[P = \frac{\text{Number of existing cases at a point in time}}{\text{Total population at that point in time}}\]

  • Real-Life Example: On January 1st, 2023, 500 people in a town of 10,000 were living with Type 2 Diabetes.
    • \(\text{Prevalence} = 500 / 10{,}000 = 5\%\)
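The same calculation can be sketched in R. The helper name `point_prevalence` is hypothetical (it is not from any package), and the inputs are the diabetes figures from the example above.

```r
# Point prevalence: existing cases divided by total population at one point in time
point_prevalence <- function(existing_cases, population) {
  stopifnot(population > 0, existing_cases >= 0, existing_cases <= population)
  existing_cases / population
}

p <- point_prevalence(existing_cases = 500, population = 10000)

# Report as a percentage and per 1,000 people
sprintf("Prevalence: %.1f%% (%d per 1,000)", 100 * p, as.integer(round(1000 * p)))
```

The input check reflects the definition of a proportion: existing cases are part of the population, so they can never exceed it.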

3.2 Incidence

Incidence measures the flow of new cases and, from it, the risk of becoming a case. It answers: How many people who were at risk developed the disease during a specific period?

A. Cumulative Incidence (Risk)

Used when the entire population is followed for the same amount of time.

\[CI = \frac{\text{Number of new cases during a period}}{\text{Total population at risk at the start of the period}}\]

B. Incidence Rate (Incidence Density)

Used when people are followed for different lengths of time (using Person-Time).

\[IR = \frac{\text{Number of new cases}}{\text{Sum of person-time at risk}}\]
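A short R sketch may help contrast the two incidence measures. Both cohorts below are hypothetical, and the function names are illustrative rather than taken from a package.

```r
# Cumulative incidence (risk): new cases / population at risk at the start
cumulative_incidence <- function(new_cases, at_risk_at_start) {
  new_cases / at_risk_at_start
}

# Incidence rate: new cases / total person-time at risk
incidence_rate <- function(new_cases, person_time) {
  new_cases / sum(person_time)
}

# Hypothetical cohort 1: 1,000 disease-free people, all followed for 1 year
ci <- cumulative_incidence(new_cases = 20, at_risk_at_start = 1000)   # 0.02

# Hypothetical cohort 2: unequal follow-up (years at risk per person), 1 new case
follow_up_years <- c(1.0, 0.5, 2.0, 1.5, 1.0)
ir <- incidence_rate(new_cases = 1, person_time = follow_up_years)    # 1/6 per person-year
```

Cumulative incidence is a proportion and needs a stated follow-up period to be interpretable; the incidence rate is expressed per unit of person-time.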


4. The Relationship between Prevalence and Incidence

Think of a bathtub:

  • Incidence is the water flowing in from the faucet (new cases).
  • Prevalence is the level of water in the tub (total existing cases).
  • Mortality/Recovery is the water leaving through the drain.

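\[\text{Prevalence} \approx \text{Incidence} \times \text{Duration}\]

This approximation holds when the disease is relatively rare and both incidence and average duration are roughly constant over time (a steady state).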

Real-Life Example: HIV/AIDS

In the 1990s, when highly active antiretroviral therapy (HAART) was introduced, the Prevalence of HIV increased. This wasn’t because more people were getting infected (Incidence was stable), but because fewer people were dying (Duration of life increased).
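The effect of a longer duration can be sketched numerically in R. The figures below are hypothetical and are not real HIV surveillance data; they only illustrate how prevalence rises when incidence is held constant and duration increases.

```r
# Steady-state approximation: prevalence is about incidence rate x mean duration.
# Reasonable only when prevalence is low and rates are roughly constant.
approx_prevalence <- function(incidence_rate, mean_duration) {
  incidence_rate * mean_duration
}

# Hypothetical incidence: 2 new cases per 1,000 person-years, unchanged throughout
incidence <- 2 / 1000

# Hypothetical mean duration of disease, in years, before and after better treatment
before_treatment <- approx_prevalence(incidence, mean_duration = 5)    # 0.01 (1%)
after_treatment  <- approx_prevalence(incidence, mean_duration = 20)   # 0.04 (4%)
```

With the faucet unchanged and the drain slowed, the water level in the tub rises fourfold in this illustration.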


5. Visualizing the Data in R

Let’s visualize the difference between Incidence and Prevalence using a simulated dataset of a flu outbreak in a small dorm.

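# Load ggplot2 for plotting
library(ggplot2)

# Simulated outbreak: number of new cases on each of 15 days
days <- 1:15
new_cases <- c(0, 2, 5, 8, 12, 10, 5, 3, 1, 0, 0, 0, 0, 0, 0)

# Cumulative recoveries by day; active cases = cumulative cases - cumulative recoveries
recovered <- c(0, 0, 0, 1, 2, 4, 6, 8, 10, 12, 14, 15, 15, 15, 15)
active_cases <- cumsum(new_cases) - recovered

df <- data.frame(Day = days, New_Cases = new_cases, Active_Cases = active_cases)

# Solid line: daily new cases (incidence); dashed line: active cases (prevalence)
# Note: `linewidth` requires ggplot2 >= 3.4.0; use `size` on older versions
ggplot(df, aes(x = Day)) +
  geom_line(aes(y = New_Cases, color = "Incidence (New)"), linewidth = 1.2) +
  geom_line(aes(y = Active_Cases, color = "Prevalence (Current)"), linewidth = 1.2, linetype = "dashed") +
  labs(title = "Incidence vs. Prevalence during a Flu Outbreak",
       y = "Number of People",
       color = "Metric") +
  theme_minimal()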


6. Measures of Mortality

Mortality measures the occurrence of death in a population.

| Measure | Formula |
|---|---|
| Crude Death Rate | \(\frac{\text{Total deaths in a year}}{\text{Mid-year population}} \times 1{,}000\) |
| Case Fatality Rate (CFR) | \(\frac{\text{Deaths from a specific disease}}{\text{Number of people with that disease}} \times 100\) |
| Infant Mortality Rate | \(\frac{\text{Deaths under 1 year of age in a year}}{\text{Number of live births in the same year}} \times 1{,}000\) |

Case Study: COVID-19

In the early stages of a pandemic, the Case Fatality Rate (CFR) often appears higher because only the sickest individuals are tested, so the denominator of confirmed cases is small. As testing expands to mild and asymptomatic people, the denominator grows and the CFR usually drops.
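A small R sketch shows how the denominator alone can move the CFR. The outbreak numbers are hypothetical and are not COVID-19 data; the function name `case_fatality` is illustrative.

```r
# Case fatality: deaths among identified cases, as a percentage
case_fatality <- function(deaths, cases) {
  stopifnot(cases > 0, deaths >= 0, deaths <= cases)
  100 * deaths / cases
}

# Hypothetical outbreak with 50 deaths and a growing count of detected cases
deaths <- 50
cfr_early <- case_fatality(deaths, cases = 1000)   # only severe cases tested: 5%
cfr_later <- case_fatality(deaths, cases = 5000)   # milder cases also detected: 1%
```

The number of deaths is identical in both lines; only the count of detected cases changes.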


7. Practice Example: Calculating Person-Years

Scenario: 5 healthy men were followed for a study on heart disease for 5 years.

  • Subject A: Followed 5 years, no disease.
  • Subject B: Developed disease at year 2.
  • Subject C: Lost to follow-up at year 3.
  • Subject D: Followed 5 years, no disease.
  • Subject E: Developed disease at year 4.

Calculation:

  1. Total Person-Years = \(5 (A) + 2 (B) + 3 (C) + 5 (D) + 4 (E) = 19 \text{ person-years}\).
  2. New Cases = 2 (B and E).
  3. Incidence Rate = \(2 / 19 \approx 0.105\) cases per person-year (or 10.5 cases per 100 person-years).
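The calculation above can be reproduced in R. This is a minimal sketch that encodes the five subjects from the scenario in a data frame; the column names are illustrative choices.

```r
# Follow-up data from the scenario: years at risk and whether disease developed
cohort <- data.frame(
  subject   = c("A", "B", "C", "D", "E"),
  years     = c(5, 2, 3, 5, 4),
  developed = c(FALSE, TRUE, FALSE, FALSE, TRUE)
)

person_years <- sum(cohort$years)          # 19 person-years
new_cases    <- sum(cohort$developed)      # 2 (subjects B and E)
ir           <- new_cases / person_years   # about 0.105 per person-year

round(100 * ir, 1)                         # 10.5 per 100 person-years
```

Subject C contributes 3 person-years to the denominator even though he was lost to follow-up, because he was observed and at risk for that time.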


8. Summary Checklist

  • Use Prevalence for resource planning, hospital bed allocation, and chronic disease burden.
  • Use Incidence for researching the causes (etiology) of disease and the effectiveness of prevention programs.
  • Always define your population at risk (e.g., you cannot include people who have already had their gallbladder removed in a study of gallbladder disease incidence).
