pacman::p_load(dslabs, tidyverse, highcharter, RColorBrewer)
I was intrigued by the Measles heatmap, with the line showing the introduction of the vaccine. And I wanted to show that same effect of discrete events for each of the diseases in the us_contagious_diseases data.
data("us_contagious_diseases")
summary(us_contagious_diseases) # look at the data set
## disease state year weeks_reporting
## Hepatitis A:2346 Alabama : 315 Min. :1928 Min. : 0.00
## Measles :3825 Alaska : 315 1st Qu.:1950 1st Qu.:31.00
## Mumps :1785 Arizona : 315 Median :1975 Median :46.00
## Pertussis :2856 Arkansas : 315 Mean :1971 Mean :37.38
## Polio :2091 California: 315 3rd Qu.:1990 3rd Qu.:50.00
## Rubella :1887 Colorado : 315 Max. :2011 Max. :52.00
## Smallpox :1275 (Other) :14175
## count population
## Min. : 0 Min. : 86853
## 1st Qu.: 7 1st Qu.: 1018755
## Median : 69 Median : 2749249
## Mean : 1492 Mean : 4107584
## 3rd Qu.: 525 3rd Qu.: 4996229
## Max. :132342 Max. :37607525
## NA's :214
I used the same calculation for rate, but summarized by year and disease in order to create a line graph grouped by disease across the time line.
diseases <- us_contagious_diseases %>%
mutate(rate = count / population * 10000 * 52 / weeks_reporting) %>% # calc the cases per 10K for reported weeks
group_by(year, disease) %>%
summarise(cases = sum(rate, na.rm = TRUE)) %>% # sum the cases by year, leave out N/A values
arrange(year, disease)
## `summarise()` has grouped output by 'year'. You can override using the `.groups` argument.
I added vertical plot-lines for the year the vaccines were introduced. Smallpox vaccination has actually been going on for a very long time. Children were required to be vaccinated by most U.S. schools by 1922, and smallpox was eradicated in the U.S. by the 1950s. D.A. Henderson ran the World Health Organization’s smallpox world-wide eradication program from 1967 to 1977. (Later, Henderson was the dean of Johns Hopkins School of Public Health, and my mother-in-law worked with him as associate dean.)
cols <- brewer.pal(7, "Dark2") # choose a broad, bold color palette
# Create a line graph of disease cases per year
highchart() %>%
hc_add_theme(hc_theme_smpl()) %>%
hc_add_series(data = diseases,
type = "line",
hcaes(x = year, y = cases, group = disease)) %>%
hc_colors(cols) %>%
hc_yAxis(title = list(text = "Cases (per 10K population)", style = list(fontWeight = "bold"))) %>%
# add vertical lines for the date of each vaccine
hc_xAxis(title = list(text = "Year", style = list(fontWeight = "bold")),
plotLines = list(
list(label = list(text = "Pertussis"), dashStyle = "dot", width = 1.5, value = 1939, color = "blue"),
list(label = list(text = "Polio"), dashStyle = "dot", width = 1.5, value = 1955, color = "blue"),
list(label = list(text = "Measles"), dashStyle = "dot", width = 1.5, value = 1963, color = "blue"),
list(label = list(text = "Mumps"), dashStyle = "dot", width = 1.5, value = 1967, color = "blue"),
list(label = list(text = "Rubella"), dashStyle = "dot", width = 1.5, value = 1969, color = "blue"),
list(label = list(text = "Hepatitis A"), dashStyle = "dot", width = 1.5, value = 1995, color = "blue")
)) %>%
# position the legend, use small circles and wide lines on the line plot
hc_legend(align = "right", verticalAlign = "top", layout = "vertical") %>%
hc_plotOptions(series = list(marker = list(symbol = "circle", radius = 2), lineWidth = 3)) %>%
# add title, sub-title and tooltip
hc_title(text = "Vaccine introduction for contagious diseases in the U.S.",
margin = 20, align = "center", style = list(color = "darkblue")) %>%
hc_subtitle(text = "by 1922 smallpox vaccine was required by many U.S. schools", align = "left") %>%
hc_tooltip(shared = TRUE,
borderColor = "black",
pointFormat = "{point.disease}: {point.cases:.2f}<br>")
This plot does show the reduction in each disease after the vaccines were introduced. Measles cases outnumber the other diseases by quite a bit before 1970, so the scale pushes the lines for other diseases down to the bottom of the plot. And it’s difficult to see details and differences as the lines start converging to zero. Putting the y-axis on a logarithm scale gives a better view of the other diseases, and shows details as the cases trend downward.
# measles cases greatly outnumber the other diseases, use logarithmic scale on the y-axis
# plot vertical lines for the introduction dates of vaccines
highchart() %>%
hc_add_theme(hc_theme_smpl()) %>%
hc_add_series(data = diseases,
type = "line",
hcaes(x = year, y = cases, group = disease)) %>%
hc_colors(cols) %>%
# use log scale on y-axis, and use max = 1200 to make it tall enough for the plotLine lables to be legible
hc_yAxis(type = "logarithmic", max = 12000, title = list(text = "Cases (per 10K population, log scale)", style = list(fontWeight = "bold"))) %>%
# add vertical lines for the date of each vaccine
hc_xAxis(title = list(text = "Year", style = list(fontWeight = "bold")),
plotLines = list(
list(label = list(text = "Pertussis"), dashStyle = "dot", width = 1.5, value = 1939, color = "blue"),
list(label = list(text = "Polio"), dashStyle = "dot", width = 1.5, value = 1955, color = "blue"),
list(label = list(text = "Measles"), dashStyle = "dot", width = 1.5, value = 1963, color = "blue"),
list(label = list(text = "Mumps"), dashStyle = "dot", width = 1.5, value = 1967, color = "blue"),
list(label = list(text = "Rubella"), dashStyle = "dot", width = 1.5, value = 1969, color = "blue"),
list(label = list(text = "Hepatitis A"), dashStyle = "dot", width = 1.5, value = 1995, color = "blue")
)) %>%
# position the legend, use small circles and wide lines on the line plot
hc_legend(align = "right", verticalAlign = "top", layout = "vertical") %>%
hc_plotOptions(series = list(marker = list(symbol = "circle", radius = 2), lineWidth = 3)) %>%
# tile, sub-title and tooltip
hc_title(text = "Vaccine introduction for contagious diseases in the U.S.",
margin = 20, align = "center", style = list(color = "darkblue")) %>%
hc_subtitle(text = "by 1922 smallpox vaccine was required by many U.S. schools", align = "left") %>%
hc_tooltip(shared = TRUE,
borderColor = "black",
pointFormat = "{point.disease}: {point.cases:.2f}<br>")
On this graph, it looks as though pertussis (whooping cough), has not receded greatly, but that’s the log scale working. Using the tool-tip shows a 97% reduction from 887 cases per 10K population in 1938 to 23 cases in 2011. Whooping cough vaccines had a rocky history up until an effective one was introduced in 1939. Pertussis remains a major problem in the developing world, the World Health Organization estimates that pertussis caused 89,000 deaths worldwide in 2012.
The plot would be more effective if earlier data were available for all of the diseases.