Epidemiology: Beyond the Basics: Analysis of Age, Birth Cohort, and Period Effects

References

Prepare data

## Enter Table 1-2
tab1.2 <- read.table(header = TRUE, text = "
group midpoint  s1975   s1985   s1995   s2005
10-19   15      17      28      NA      NA
20-29   25      14      23      35      NA
30-39   35      12      19      30      45
40-49   45      10      18      26      40
50-59   55      NA      15      22      36
60-69   65      NA      NA      20      31
70-79   75      NA      NA      NA      27
")

## Melt for graphing and casting
library(reshape2)
tab1.2.melt <- melt(data          = tab1.2,
                    id.vars       = c("group","midpoint"),
                    variable.name = "year",
                    value.name    = "prevalence"
                    )
## Change year to numeric
tab1.2.melt$year <- as.numeric(gsub("s","", tab1.2.melt$year))

## Create a birth cohort indicator
tab1.2.melt$cohort <- with(tab1.2.melt, year - midpoint)
tab1.2.melt$cohort <- factor(tab1.2.melt$cohort)

## Change year to categorical
tab1.2.melt$year <- factor(tab1.2.melt$year)

## Show
tab1.2.melt
   group midpoint year prevalence cohort
1  10-19       15 1975         17   1960
2  20-29       25 1975         14   1950
3  30-39       35 1975         12   1940
4  40-49       45 1975         10   1930
5  50-59       55 1975         NA   1920
6  60-69       65 1975         NA   1910
7  70-79       75 1975         NA   1900
8  10-19       15 1985         28   1970
9  20-29       25 1985         23   1960
10 30-39       35 1985         19   1950
11 40-49       45 1985         18   1940
12 50-59       55 1985         15   1930
13 60-69       65 1985         NA   1920
14 70-79       75 1985         NA   1910
15 10-19       15 1995         NA   1980
16 20-29       25 1995         35   1970
17 30-39       35 1995         30   1960
18 40-49       45 1995         26   1950
19 50-59       55 1995         22   1940
20 60-69       65 1995         20   1930
21 70-79       75 1995         NA   1920
22 10-19       15 2005         NA   1990
23 20-29       25 2005         NA   1980
24 30-39       35 2005         45   1970
25 40-49       45 2005         40   1960
26 50-59       55 2005         36   1950
27 60-69       65 2005         31   1940
28 70-79       75 2005         27   1930

Table 1-2

tab1.2
  group midpoint s1975 s1985 s1995 s2005
1 10-19       15    17    28    NA    NA
2 20-29       25    14    23    35    NA
3 30-39       35    12    19    30    45
4 40-49       45    10    18    26    40
5 50-59       55    NA    15    22    36
6 60-69       65    NA    NA    20    31
7 70-79       75    NA    NA    NA    27

Figure 1-2: Cros-sectional effect of age at each survey year

This graph is useful in describing the age distribution of a disease at a given year.

In each cross-sectional survey grouped together by lines, the prevalence of the hypothetical disease is lower in the older people. But does it mean the prevalence decreases as a birth cohort ages?

## Graph
library(ggplot2)

## Configure prevalence vs age plot
plot.prev.age <-
    ggplot(data = tab1.2.melt,
                 mapping = aes_string(x = "midpoint", y = "prevalence")) +
    layer(geom = "point") +
    theme_bw() +
    theme(legend.key = element_blank())

## Plot grouping by the survey years
fig1.2 <- plot.prev.age + layer(geom = "line", mapping = aes_string(group = "year", color = "year"))
fig1.2

plot of chunk unnamed-chunk-4

Figure 1-3: Longitudinal effect of age for each birth cohort with cross-sectional effect for survey yer 2005 only

This graph is useful in describing the effect of aging in a group of people who were born at the same time.

In each birth cohort grouped together by broken lines, the prevalence of the hypothetical disease increases as they age. This is the longitudinal effect of aging in indidividuals as they age.

## Plot grouping by the birth cohorts
fig1.3 <- plot.prev.age +
    ## Add cohort grouping
    layer(geom = "line", mapping = aes_string(group = "cohort", color = "cohort"), lty = 2) +
    ## Add survey year grouping for year 2005 only
    layer(data = subset(tab1.2.melt, year == 2005), geom = "line",
          mapping = aes_string(group = "year"))
fig1.3

plot of chunk unnamed-chunk-5

Table 1-3

This presentation shows the experience of each birth cohort (rows) over time (columns).

## Array casting
tab1.3 <- acast(data = tab1.2.melt, cohort ~ midpoint, value.var = "prevalence")
## Change to xtabs table not to show NA
class(tab1.3) <- c("xtabs", "table")
## Show
tab1.3
     15 25 35 45 55 65 75
1900                     
1910                     
1920                     
1930          10 15 20 27
1940       12 18 22 31   
1950    14 19 26 36      
1960 17 23 30 40         
1970 28 35 45            
1980                     
1990                     

## Create a table using xtabs
tab1.3 <- xtabs(prevalence ~ cohort + midpoint, data = tab1.2.melt)
## Remove 0
tab1.3[tab1.3 == 0] <- NA
## Show
tab1.3
      midpoint
cohort 15 25 35 45 55 65 75
  1900                     
  1910                     
  1920                     
  1930          10 15 20 27
  1940       12 18 22 31   
  1950    14 19 26 36      
  1960 17 23 30 40         
  1970 28 35 45            
  1980                     
  1990                     

Figure 1-4:

This graph is useful in detecting an unusual birth cohort. If there is a spike, that specific birth cohort more severely affected than other cohorts.

This is grouped by the age at the surveys. In each age group, those who were born more recently (more right) have higher prevalence, i.e., at the same age, those who were born more recently had higher prevalences of this hypothetical disease.

## Change age midpoint to categorical
tab1.2.melt$midpoint <- factor(tab1.2.melt$midpoint)

## Configure prevalence vs birth cohort plot
plot.prev.cohort <-
    ggplot(data = subset(tab1.2.melt, !is.na(prevalence)),
                 mapping = aes_string(x = "cohort", y = "prevalence")) +
    layer(geom = "point") +
    theme_bw() +
    theme(legend.key = element_blank())

## Plot grouping by age midpoints
fig1.4 <- plot.prev.cohort +
    layer(geom = "line", mapping = aes_string(group = "midpoint", color = "midpoint"), lty = 4)
fig1.4

plot of chunk unnamed-chunk-7

Additional figure

This graph is useful in detecting an unusual survey year. If there is a spike, that specific survey year was more severely affected than other survey years.

This one has the same grouping as Figure 1-4, but the X-axis is by the calendar year, thus, the ordering of the lines are reverse of Figure 1-4.

## Configure prevalence vs survey year plot
plot.prev.year <-
    ggplot(data = subset(tab1.2.melt, !is.na(prevalence)),
                 mapping = aes_string(x = "year", y = "prevalence")) +
    layer(geom = "point") +
    theme_bw() +
    theme(legend.key = element_blank())

## Plot grouping by age midpoints
fig.add <- plot.prev.year +
    layer(geom = "line", mapping = aes_string(group = "midpoint", color = "midpoint"), lty = 4)
fig.add

plot of chunk unnamed-chunk-8