pacman::p_load(dslabs, tidyverse, highcharter, RColorBrewer)

Vaccine introduction for contagious diseases in the U.S.

I was intrigued by the Measles heatmap, with the line showing the introduction of the vaccine. And I wanted to show that same effect of discrete events for each of the diseases in the us_contagious_diseases data.

data("us_contagious_diseases")
summary(us_contagious_diseases)              # look at the data set
##         disease            state            year      weeks_reporting
##  Hepatitis A:2346   Alabama   :  315   Min.   :1928   Min.   : 0.00  
##  Measles    :3825   Alaska    :  315   1st Qu.:1950   1st Qu.:31.00  
##  Mumps      :1785   Arizona   :  315   Median :1975   Median :46.00  
##  Pertussis  :2856   Arkansas  :  315   Mean   :1971   Mean   :37.38  
##  Polio      :2091   California:  315   3rd Qu.:1990   3rd Qu.:50.00  
##  Rubella    :1887   Colorado  :  315   Max.   :2011   Max.   :52.00  
##  Smallpox   :1275   (Other)   :14175                                 
##      count          population      
##  Min.   :     0   Min.   :   86853  
##  1st Qu.:     7   1st Qu.: 1018755  
##  Median :    69   Median : 2749249  
##  Mean   :  1492   Mean   : 4107584  
##  3rd Qu.:   525   3rd Qu.: 4996229  
##  Max.   :132342   Max.   :37607525  
##                   NA's   :214

I used the same calculation for rate, but summarized by year and disease in order to create a line graph grouped by disease across the time line.

diseases <- us_contagious_diseases %>% 
  mutate(rate = count / population * 10000 * 52 / weeks_reporting) %>%   # calc the cases per 10K for reported weeks 
  group_by(year, disease) %>%         
  summarise(cases = sum(rate, na.rm = TRUE)) %>%                         # sum the cases by year, leave out N/A values
  arrange(year, disease)
## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.

I added vertical plot-lines for the year the vaccines were introduced. Smallpox vaccination has actually been going on for a very long time. Children were required to be vaccinated by most U.S. schools by 1922, and smallpox was eradicated in the U.S. by the 1950s. D.A. Henderson ran the World Health Organization’s smallpox world-wide eradication program from 1967 to 1977. (Later, Henderson was the dean of Johns Hopkins School of Public Health, and my mother-in-law worked with him as associate dean.)

cols <- brewer.pal(7, "Dark2")                                          # choose a broad, bold color palette

# Create a line graph of disease cases per year

highchart() %>%                                                         
  hc_add_theme(hc_theme_smpl()) %>% 
  hc_add_series(data = diseases,
                type = "line",
                hcaes(x = year, y = cases, group = disease)) %>% 
  hc_colors(cols) %>% 
  hc_yAxis(title = list(text = "Cases (per 10K population)", style = list(fontWeight = "bold"))) %>%
  
  # add vertical lines for the date of each vaccine
  
  hc_xAxis(title = list(text = "Year", style = list(fontWeight = "bold")),
      plotLines = list(
        list(label = list(text = "Pertussis"), dashStyle = "dot", width = 1.5, value = 1939, color = "blue"),
        list(label = list(text = "Polio"), dashStyle = "dot", width = 1.5, value = 1955, color = "blue"),
        list(label = list(text = "Measles"), dashStyle = "dot", width = 1.5, value = 1963, color = "blue"),
        list(label = list(text = "Mumps"), dashStyle = "dot", width = 1.5, value = 1967, color = "blue"),
        list(label = list(text = "Rubella"), dashStyle = "dot", width = 1.5, value = 1969, color = "blue"),
        list(label = list(text = "Hepatitis A"), dashStyle = "dot", width = 1.5, value = 1995, color = "blue")      
      )) %>% 
  
  # position the legend, use small circles and wide lines on the line plot
  
  hc_legend(align = "right", verticalAlign = "top", layout = "vertical") %>% 
  hc_plotOptions(series = list(marker = list(symbol = "circle", radius = 2), lineWidth = 3)) %>% 
  
  # add title, sub-title and tooltip
  
  hc_title(text = "Vaccine introduction for contagious diseases in the U.S.",
           margin = 20, align = "center", style = list(color = "darkblue")) %>% 
  hc_subtitle(text = "by 1922 smallpox vaccine was required by many U.S. schools", align = "left") %>% 
  hc_tooltip(shared = TRUE,
             borderColor = "black",
             pointFormat = "{point.disease}: {point.cases:.2f}<br>")

This plot does show the reduction in each disease after the vaccines were introduced. Measles cases outnumber the other diseases by quite a bit before 1970, so the scale pushes the lines for other diseases down to the bottom of the plot. And it’s difficult to see details and differences as the lines start converging to zero. Putting the y-axis on a logarithm scale gives a better view of the other diseases, and shows details as the cases trend downward.

# measles cases greatly outnumber the other diseases, use logarithmic scale on the y-axis
# plot vertical lines for the introduction dates of vaccines

highchart() %>%                                                         
  hc_add_theme(hc_theme_smpl()) %>% 
  hc_add_series(data = diseases,
                type = "line",
                hcaes(x = year, y = cases, group = disease)) %>% 
  hc_colors(cols) %>% 
  
  # use log scale on y-axis, and use max = 1200 to make it tall enough for the plotLine lables to be legible
  
  hc_yAxis(type = "logarithmic", max = 12000, title = list(text = "Cases (per 10K population, log scale)", style = list(fontWeight = "bold"))) %>%
    
  # add vertical lines for the date of each vaccine
  
  hc_xAxis(title = list(text = "Year", style = list(fontWeight = "bold")),
      plotLines = list(
        list(label = list(text = "Pertussis"), dashStyle = "dot", width = 1.5, value = 1939, color = "blue"),
        list(label = list(text = "Polio"), dashStyle = "dot", width = 1.5, value = 1955, color = "blue"),
        list(label = list(text = "Measles"), dashStyle = "dot", width = 1.5, value = 1963, color = "blue"),
        list(label = list(text = "Mumps"), dashStyle = "dot", width = 1.5, value = 1967, color = "blue"),
        list(label = list(text = "Rubella"), dashStyle = "dot", width = 1.5, value = 1969, color = "blue"),
        list(label = list(text = "Hepatitis A"), dashStyle = "dot", width = 1.5, value = 1995, color = "blue")      
      )) %>% 
  
  # position the legend, use small circles and wide lines on the line plot
  
  hc_legend(align = "right", verticalAlign = "top", layout = "vertical") %>% 
  hc_plotOptions(series = list(marker = list(symbol = "circle", radius = 2), lineWidth = 3)) %>% 
  
  # tile, sub-title and tooltip
  
  hc_title(text = "Vaccine introduction for contagious diseases in the U.S.",
           margin = 20, align = "center", style = list(color = "darkblue")) %>% 
  hc_subtitle(text = "by 1922 smallpox vaccine was required by many U.S. schools", align = "left") %>% 
  hc_tooltip(shared = TRUE,
             borderColor = "black",
             pointFormat = "{point.disease}: {point.cases:.2f}<br>") 

On this graph, it looks as though pertussis (whooping cough), has not receded greatly, but that’s the log scale working. Using the tool-tip shows a 97% reduction from 887 cases per 10K population in 1938 to 23 cases in 2011. Whooping cough vaccines had a rocky history up until an effective one was introduced in 1939. Pertussis remains a major problem in the developing world, the World Health Organization estimates that pertussis caused 89,000 deaths worldwide in 2012.

The plot would be more effective if earlier data were available for all of the diseases.