Hepatitis A is a highly contagious liver infection caused by the hepatitis A virus. The disease can spread through contaminated food or water, or through close contact with an infected person. It is a vaccine-preventable disease: the vaccine was introduced in 1995, which began the effort to eradicate it. The heatmap below depicts how the disease has progressed across the US since the 1960s.
# Here we load the dslabs data package, list the data sets it contains, and load the tidyverse
library("dslabs")
data(package="dslabs")
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3 ✓ purrr 0.3.4
## ✓ tibble 3.1.0 ✓ dplyr 1.0.2
## ✓ tidyr 1.1.2 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# We open the help page for the data set to see what it records (our interest is in the details about Hepatitis A)
?us_contagious_diseases
# We load RColorBrewer package that we'll use in our heatmap
library(RColorBrewer)
data("us_contagious_diseases")
the_disease <- "Hepatitis A"
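# Here we exclude Alaska and Hawaii (states without data prior to their admission to the US) and keep only the Hepatitis A records.
# We then compute the annualized rate of infections per 10k inhabitants (counts scaled up to a full 52 reporting weeks)
# and order the states by that rate.
# Finally we draw the heatmap, mark 1995 (the year the vaccine was introduced) with a vertical line, and add appropriate labels.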
us_contagious_diseases %>%
  filter(!state %in% c("Hawaii", "Alaska") & disease == the_disease) %>%
  mutate(infections_per_10k = count / population * 10000 * 52 / weeks_reporting) %>%
  mutate(state = reorder(state, infections_per_10k)) %>%
  ggplot(aes(year, state, fill = infections_per_10k)) +
  geom_tile(color = "grey50") +
  scale_x_continuous(expand = c(0, 0)) +
  scale_fill_gradientn(colors = brewer.pal(9, "Blues"), trans = "sqrt") +
  geom_vline(xintercept = 1995, color = "blue") +
  theme_minimal() +
  theme(panel.grid = element_blank()) +
  ggtitle(the_disease) +
  ylab("US States") +
  xlab("Years disease reported")
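The rate used for the fill colour can be checked by hand on a few made-up rows. The sketch below is only an illustration under stated assumptions: `annualized_rate_per_10k` is a hypothetical helper (it is not part of dslabs), the `toy` numbers are invented, and treating a year with zero reporting weeks as missing is our own choice rather than something the data set prescribes.

```r
# Hypothetical helper (not part of dslabs): annualized infections per 10,000 people.
annualized_rate_per_10k <- function(count, population, weeks_reporting) {
  # Scale the reported count up to a 52-week year, then express it per 10,000 inhabitants.
  rate <- count / population * 10000 * 52 / weeks_reporting
  # Assumption: a year with zero reporting weeks carries no information,
  # so we mark it as missing instead of keeping the NaN or Inf from dividing by zero.
  rate[weeks_reporting == 0] <- NA_real_
  rate
}

# Made-up numbers for illustration only.
toy <- data.frame(
  count           = c(100, 50, 0),
  population      = c(1e6, 1e6, 1e6),
  weeks_reporting = c(52, 26, 0)
)

toy$infections_per_10k <- annualized_rate_per_10k(
  toy$count, toy$population, toy$weeks_reporting
)

print(toy)
```

The first two rows come out at about 1 infection per 10k: 100 cases over a full year and 50 cases over half a year describe the same annual pace. The third row has no reporting weeks, so its rate is left as `NA`.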