The goal of this project is to identify notable trends in opioid-related
overdose deaths in Washington, D.C., over nine years (2015-2023). To do this, I will
use a dataset published by the National Center for Health Statistics,
which contains overdose death statistics for all fifty states and
Washington, D.C., over this time span. The dataset includes relevant
information such as the type of substance present in the overdose, the
month, year, and state in which it was reported, and the percent
completeness of aggregated reports for a given month in a given state.
The dataset also breaks down reported overdose deaths by the type of
opioid involved: synthetic opioids, non-synthetic and semi-synthetic
opioids, heroin, methadone, and an overall category covering all
opioid-related overdoses regardless of type. In addition, it includes
information on cocaine overdoses, which I will use to compare overall
opioid and cocaine trends and gain a better understanding of drug use
in Washington, D.C.
My approach begins with cleaning the dataset and making sure all the
variables I need are in a usable format. I will then assign the relevant
variables to their own dataframes, which makes them easier to work with
when performing calculations and creating the visualizations.
library(dplyr)
library(ggplot2)
library(lubridate)
drug_overdose_info <- read.csv("DrugOverdoseRates.csv")
#examining the layout of the dataset
str(drug_overdose_info)
## 'data.frame': 78120 obs. of 12 variables:
## $ State : chr "AK" "AK" "AK" "AK" ...
## $ Year : int 2015 2015 2015 2015 2015 2015 2015 2015 2015 2015 ...
## $ Month : chr "January" "February" "March" "April" ...
## $ Period : chr "12 month-ending" "12 month-ending" "12 month-ending" "12 month-ending" ...
## $ Indicator : chr "Cocaine (T40.5)" "Cocaine (T40.5)" "Cocaine (T40.5)" "Cocaine (T40.5)" ...
## $ Data.Value : chr "" "" "" "" ...
## $ Percent.Complete : int 100 100 100 100 100 100 100 100 100 100 ...
## $ Percent.Pending.Investigation: num 0 0 0 0 0 0 0 0 0 0 ...
## $ State.Name : chr "Alaska" "Alaska" "Alaska" "Alaska" ...
## $ Footnote : chr "Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality." "Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality." "Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality." "Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality." ...
## $ Footnote.Symbol : chr "**" "**" "**" "**" ...
## $ Predicted.Value : chr "" "" "" "" ...
head(drug_overdose_info)
## State Year Month Period Indicator Data.Value
## 1 AK 2015 January 12 month-ending Cocaine (T40.5)
## 2 AK 2015 February 12 month-ending Cocaine (T40.5)
## 3 AK 2015 March 12 month-ending Cocaine (T40.5)
## 4 AK 2015 April 12 month-ending Cocaine (T40.5)
## 5 AK 2015 May 12 month-ending Cocaine (T40.5)
## 6 AK 2015 June 12 month-ending Cocaine (T40.5)
## Percent.Complete Percent.Pending.Investigation State.Name
## 1 100 0 Alaska
## 2 100 0 Alaska
## 3 100 0 Alaska
## 4 100 0 Alaska
## 5 100 0 Alaska
## 6 100 0 Alaska
## Footnote
## 1 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 2 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 3 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 4 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 5 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 6 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## Footnote.Symbol Predicted.Value
## 1 **
## 2 **
## 3 **
## 4 **
## 5 **
## 6 **
# Clean up the column names, convert the data_value column (number of
# deaths) from character to numeric, and use lubridate to build a date
# column in date format.
names(drug_overdose_info) <- gsub("[(). \\-]", "_", names(drug_overdose_info))
names(drug_overdose_info) <- tolower(names(drug_overdose_info))
drug_overdose_info <- drug_overdose_info %>%
  mutate(data_value = as.numeric(data_value))
## Warning: There was 1 warning in `mutate()`.
## ℹ In argument: `data_value = as.numeric(data_value)`.
## Caused by warning:
## ! NAs introduced by coercion
drug_overdose_info <- drug_overdose_info %>%
  mutate(date = make_date(year, match(month, month.name), 1))
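The coercion warning above deserves a second look. `as.numeric("")` returns `NA` silently, so the warning suggests that some non-empty strings failed to parse. One plausible cause, which is an assumption I have not checked against the file, is thousands separators such as "1,234". A minimal sketch with made-up values:

```r
# Made-up raw values: an empty cell, a plain number, and a number with a comma.
raw <- c("", "87", "1,234")

# Plain coercion turns the comma-formatted value into NA (and warns about it).
naive <- suppressWarnings(as.numeric(raw))

# Removing commas first keeps the value; the empty cell still becomes NA.
cleaned <- as.numeric(gsub(",", "", raw, fixed = TRUE))

print(naive)
print(cleaned)
```

If the real file does contain such values, they would mostly affect states with counts of 1,000 or more, so the DC series used below may be unaffected.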
# Assign the six drug types relevant to my report to their own dataframes,
# filtering each one to DC and excluding any incomplete reports.
dc_nonsynth_df <- drug_overdose_info %>%
  filter(state == "DC" & indicator == "Natural & semi-synthetic opioids (T40.2)" & footnote != "Underreported due to incomplete data.")
dc_synth_df <- drug_overdose_info %>%
  filter(state == "DC" & indicator == "Synthetic opioids, excl. methadone (T40.4)" & footnote != "Underreported due to incomplete data.")
dc_opioid_df <- drug_overdose_info %>%
  filter(state == "DC" & indicator == "Opioids (T40.0-T40.4,T40.6)" & footnote != "Underreported due to incomplete data.")
dc_cocaine_df <- drug_overdose_info %>%
  filter(state == "DC" & indicator == "Cocaine (T40.5)" & footnote != "Underreported due to incomplete data.")
dc_methadone_df <- drug_overdose_info %>%
  filter(state == "DC" & indicator == "Methadone (T40.3)" & footnote != "Underreported due to incomplete data.")
dc_heroin_df <- drug_overdose_info %>%
  filter(state == "DC" & indicator == "Heroin (T40.1)" & footnote != "Underreported due to incomplete data.")
head(drug_overdose_info)
## state year month period indicator data_value
## 1 AK 2015 January 12 month-ending Cocaine (T40.5) NA
## 2 AK 2015 February 12 month-ending Cocaine (T40.5) NA
## 3 AK 2015 March 12 month-ending Cocaine (T40.5) NA
## 4 AK 2015 April 12 month-ending Cocaine (T40.5) NA
## 5 AK 2015 May 12 month-ending Cocaine (T40.5) NA
## 6 AK 2015 June 12 month-ending Cocaine (T40.5) NA
## percent_complete percent_pending_investigation state_name
## 1 100 0 Alaska
## 2 100 0 Alaska
## 3 100 0 Alaska
## 4 100 0 Alaska
## 5 100 0 Alaska
## 6 100 0 Alaska
## footnote
## 1 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 2 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 3 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 4 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 5 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## 6 Numbers may differ from published reports using final data. See Technical Notes. Data not shown due to low data quality.
## footnote_symbol predicted_value date
## 1 ** 2015-01-01
## 2 ** 2015-02-01
## 3 ** 2015-03-01
## 4 ** 2015-04-01
## 5 ** 2015-05-01
## 6 ** 2015-06-01
sum(is.na(drug_overdose_info$data_value))
## [1] 32743
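The six filters above compare `footnote` to one exact string, yet the footnotes printed by `head()` consist of several sentences joined together. If the underreporting note can also appear alongside other sentences (an assumption; I have not inspected the full set of footnote values), an exact comparison would let those rows through. The following is a minimal sketch of a pattern-based alternative. The helper name `get_dc_series()` and the toy rows, including the combined footnote string, are hypothetical.

```r
# Toy rows standing in for the cleaned data; all values are invented.
toy <- data.frame(
  state = c("DC", "DC", "DC", "AK"),
  indicator = c("Heroin (T40.1)", "Heroin (T40.1)", "Cocaine (T40.5)", "Heroin (T40.1)"),
  data_value = c(10, 12, 7, 3),
  footnote = c(
    "Numbers may differ from published reports using final data. See Technical Notes.",
    "Numbers may differ from published reports using final data. See Technical Notes. Underreported due to incomplete data.",
    "",
    ""
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep one state's rows for one indicator, dropping any
# row whose footnote contains the underreporting note anywhere in the string.
get_dc_series <- function(df, indicator_name, state_code = "DC") {
  flagged <- grepl("Underreported due to incomplete data", df$footnote, fixed = TRUE)
  df[df$state == state_code & df$indicator == indicator_name & !flagged, , drop = FALSE]
}

toy_heroin <- get_dc_series(toy, "Heroin (T40.1)")
print(toy_heroin$data_value)
```

With these toy rows, the exact-match comparison would keep both DC heroin rows, while the pattern-based helper keeps only the one without the underreporting note.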
Now that the data are cleaned and organized, I will create new
dataframes that highlight essential trends in overdose deaths. For
example, I will create a dataframe called dc_diff_df that holds the
difference between the number of synthetic opioid overdoses and the
number of non-synthetic and semi-synthetic opioid overdoses. I will also
build dataframes for overall opioid overdoses and cocaine overdoses to
calculate yearly rather than monthly trends. Additionally, I will create
dataframes that calculate each drug's share of total overdose deaths
for three time spans: the full nine years, 2015-2019, and 2020-2023.
Together, these give a clearer picture of the breakdown and trends of
drug-related overdoses in DC than the monthly figures alone.
# Calculate the difference between synthetic and non-synthetic & semi-synthetic opioid overdoses.
dc_diff_df <- data.frame(
  state = dc_synth_df$state,
  month = dc_synth_df$month,
  year = dc_synth_df$year,
  date = dc_synth_df$date,
  diff_value = dc_synth_df$data_value - dc_nonsynth_df$data_value
)
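The subtraction above pairs rows by position, which is only safe when both dataframes contain the same months in the same order. A minimal sketch of a date-keyed version follows, using base `merge()` and invented toy series (the names `toy_synth`, `toy_nonsynth`, and `toy_diff` are hypothetical):

```r
# Toy series; the months and counts are invented for illustration.
toy_synth <- data.frame(
  date = as.Date(c("2015-01-01", "2015-02-01", "2015-03-01")),
  data_value = c(15, 18, 20)
)
toy_nonsynth <- data.frame(
  date = as.Date(c("2015-01-01", "2015-03-01")),  # February is missing on purpose
  data_value = c(30, 28)
)

# Join on date so each difference uses matching months; unmatched months are dropped.
toy_diff <- merge(toy_synth, toy_nonsynth, by = "date", suffixes = c("_synth", "_nonsynth"))
toy_diff$diff_value <- toy_diff$data_value_synth - toy_diff$data_value_nonsynth

print(toy_diff[, c("date", "diff_value")])
```

When the two series already line up row for row, this gives the same result as the positional subtraction; when they do not, it avoids silently comparing different months.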
# Calculate the yearly totals of opioid overdoses and their raw and percent changes over the previous year.
yearly_opioid_deaths <- dc_opioid_df %>%
  mutate(year = year(date)) %>%
  group_by(year) %>%
  summarise(total_deaths = sum(data_value))
yearly_opioid_deaths <- yearly_opioid_deaths %>%
  mutate(
    change = total_deaths - lag(total_deaths),
    percent_change = (change / lag(total_deaths)) * 100
  )
# Calculate the yearly totals of cocaine overdoses and their raw and percent changes over the previous year.
yearly_cocaine_deaths <- dc_cocaine_df %>%
  mutate(year = year(date)) %>%
  group_by(year) %>%
  summarise(total_deaths = sum(data_value))
yearly_cocaine_deaths <- yearly_cocaine_deaths %>%
  mutate(
    change = total_deaths - lag(total_deaths),
    percent_change = (change / lag(total_deaths)) * 100
  )
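One caveat about the yearly totals: `str()` reports the `Period` column as "12 month-ending", which suggests that each monthly value is already a rolling 12-month count rather than a single month's deaths. If that holds for every row used here (an assumption), adding up twelve of them counts most deaths several times, and the December value is itself the calendar-year figure. A minimal sketch of that alternative with invented numbers:

```r
# Toy rolling 12-month counts for two years; all numbers are invented.
toy <- data.frame(
  year = rep(c(2015, 2016), each = 12),
  month = rep(month.name, times = 2),
  data_value = c(seq(100, 155, by = 5), seq(160, 270, by = 10)),
  stringsAsFactors = FALSE
)

# Take the December row of each year as that year's total.
annual <- toy[toy$month == "December", c("year", "data_value")]
names(annual)[2] <- "total_deaths"

# Raw and percent change over the previous year (NA for the first year).
previous <- c(NA, head(annual$total_deaths, -1))
annual$change <- annual$total_deaths - previous
annual$percent_change <- 100 * annual$change / previous

print(annual)
```

The summed version can still serve as a trend index, but its values should not be read as literal annual death counts under this assumption.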
# Calculate each drug's share of total overdoses over the entire nine-year span.
drug_totals <- data.frame(
  drug_type = c("Heroin", "Cocaine", "Methadone", "Non-synthetic & Semi-synthetic Opioids", "Synthetic Opioids"),
  total_deaths = c(
    sum(dc_heroin_df$data_value, na.rm = TRUE),
    sum(dc_cocaine_df$data_value, na.rm = TRUE),
    sum(dc_methadone_df$data_value, na.rm = TRUE),
    sum(dc_nonsynth_df$data_value, na.rm = TRUE),
    sum(dc_synth_df$data_value, na.rm = TRUE)
  )
)
drug_totals <- drug_totals %>%
  mutate(percent = round(100 * total_deaths / sum(total_deaths), 1),
         legend = paste0(drug_type, " (", percent, "%)")) %>%
  arrange(desc(percent)) %>%
  mutate(legend = factor(legend, levels = legend))
# Calculate each drug's share of total overdoses between 2015 and 2019.
drug_totals_pre2020 <- data.frame(
  drug_type = c("Heroin", "Cocaine", "Methadone", "Non-synthetic & Semi-synthetic Opioids", "Synthetic Opioids"),
  total_deaths = c(
    sum(filter(dc_heroin_df, year < 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_cocaine_df, year < 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_methadone_df, year < 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_nonsynth_df, year < 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_synth_df, year < 2020)$data_value, na.rm = TRUE)
  )
)
drug_totals_pre2020 <- drug_totals_pre2020 %>%
  mutate(percent = round(100 * total_deaths / sum(total_deaths), 1),
         legend = paste0(drug_type, " (", percent, "%)")) %>%
  arrange(desc(percent)) %>%
  mutate(legend = factor(legend, levels = legend))
# Calculate each drug's share of total overdoses between 2020 and 2023.
drug_totals_post2020 <- data.frame(
  drug_type = c("Heroin", "Cocaine", "Methadone", "Non-synthetic & Semi-synthetic Opioids", "Synthetic Opioids"),
  total_deaths = c(
    sum(filter(dc_heroin_df, year >= 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_cocaine_df, year >= 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_methadone_df, year >= 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_nonsynth_df, year >= 2020)$data_value, na.rm = TRUE),
    sum(filter(dc_synth_df, year >= 2020)$data_value, na.rm = TRUE)
  )
)
drug_totals_post2020 <- drug_totals_post2020 %>%
  mutate(percent = round(100 * total_deaths / sum(total_deaths), 1),
         legend = paste0(drug_type, " (", percent, "%)")) %>%
  arrange(desc(percent)) %>%
  mutate(legend = factor(legend, levels = legend))
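The three blocks above differ only in which years they sum over, so the share calculation could be written once. The sketch below shows one way to do that in base R; `share_table()` is a hypothetical helper, not part of the original analysis, and the totals passed to it are invented.

```r
# Hypothetical helper: given named totals, return each item's share of the
# overall total, sorted from largest to smallest, with a legend label.
share_table <- function(totals) {
  out <- data.frame(
    drug_type = names(totals),
    total_deaths = as.numeric(totals),
    stringsAsFactors = FALSE
  )
  out$percent <- round(100 * out$total_deaths / sum(out$total_deaths), 1)
  out <- out[order(-out$percent), ]
  labels <- paste0(out$drug_type, " (", out$percent, "%)")
  out$legend <- factor(labels, levels = labels)  # keeps the legend in sorted order
  rownames(out) <- NULL
  out
}

# Invented totals, only to show the shape of the result.
toy_shares <- share_table(c(Heroin = 20, Cocaine = 30, "Synthetic Opioids" = 50))
print(toy_shares)
```

Each time span would then need only its own vector of sums, for example the `year < 2020` sums for the 2015-2019 table.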
This section of the report uses visualizations to show the trends in drug overdoses in DC.
ggplot(drug_totals, aes(x = "", y = total_deaths, fill = legend)) +
geom_bar(stat = "identity", width = 1, color = "white") +
coord_polar(theta = "y") +
labs(
title = "Overdose Deaths in DC by Substance \n(2015–2023)",
fill = "Drug Type"
) +
theme_void() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
legend.title = element_text(face = "bold"),
legend.text = element_text(size = 10)
) +
scale_fill_brewer(palette = "Set1")
As the pie chart above shows, opioids account for roughly 70% of the
reported overdose deaths in Washington, D.C., between 2015 and 2023.
Synthetic opioids are the largest contributor, responsible for 45% of
the overdose deaths in these nine years. However, a single chart
covering nine years cannot show how this breakdown changed over time.
To get a better picture, we must break the period into shorter
intervals.
ggplot(drug_totals_pre2020, aes(x = "", y = total_deaths, fill = legend)) +
geom_bar(stat = "identity", width = 1, color = "white") +
coord_polar(theta = "y") +
labs(
title = "Overdose Deaths in DC by Substance \n(2015–2019)",
fill = "Drug Type"
) +
theme_void() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
legend.title = element_text(face = "bold"),
legend.text = element_text(size = 10)
) +
scale_fill_brewer(palette = "Set2")
The breakdown between 2015 and 2019 is noticeably different.
Synthetic opioids were considerably less prevalent before 2020 (though
still the leading cause of death), while heroin was substantially more
prevalent, accounting for nearly 28% of overdose deaths in this
period.
ggplot(drug_totals_post2020, aes(x = "", y = total_deaths, fill = legend)) +
geom_bar(stat = "identity", width = 1, color = "white") +
coord_polar(theta = "y") +
labs(
title = "Overdose Deaths in DC by Substance \n(2020–2023)",
fill = "Drug Type"
) +
theme_void() +
theme(
plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
legend.title = element_text(face = "bold"),
legend.text = element_text(size = 10)
) +
scale_fill_brewer(palette = "Set3")
The second half of the time span, 2020 through 2023, shows a
concerning trend in synthetic opioids. Heroin, which was responsible
for nearly 28% of overdose deaths between 2015 and 2019, accounts for
roughly 10% in 2020-2023, a drop of approximately 18 percentage points.
Synthetic opioids seem to have replaced heroin: their share of overdose
deaths rose by around 15 percentage points. Cocaine's share also
increased, albeit less dramatically, by around 7 percentage points.
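A change between two shares can be expressed either in percentage points (the simple difference) or as a relative change (the difference divided by the starting share). The sketch below works through both for the heroin figures quoted above, using the rounded values from the text rather than the underlying data:

```r
# Shares quoted in the text (percent of overdose deaths attributed to heroin).
heroin_pre  <- 28  # 2015-2019, "nearly 28%"
heroin_post <- 10  # 2020-2023, "roughly 10%"

# Difference between the two shares, in percentage points.
point_change <- heroin_post - heroin_pre

# Change relative to the earlier share, in percent.
relative_change <- 100 * (heroin_post - heroin_pre) / heroin_pre

print(point_change)
print(round(relative_change, 1))
```

The two readings differ considerably here, so it helps to state which one a figure refers to.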
Now that we have a general understanding of the proportional
breakdown of the most prevalent substances, we need to take a closer
look at the trends of specific substances over the nine-year timespan.
ggplot(dc_opioid_df, aes(x = date, y = data_value)) +
geom_line(color = "darkred", linewidth = 1.3) +
scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
labs(
title = "Opioid Overdose Trends for DC",
x = "Date",
y = "Deaths"
) +
theme_minimal()
As the figure above shows, opioid overdose deaths have risen steeply
over the nine years this data covers. Reported deaths went from under a
hundred in the first half of 2015 to over 500 in the latter part of
2023 (the dataset's Period column labels each value as a 12-month-ending
count). To claim that this trend is concerning would be an
understatement. We see a sharp increase from 2016-2018, a decrease from
2018-2019, and then another dramatic increase from 2019-2021. Between
2021 and 2023, the trend stabilized; however, over the course of 2023,
we see another marked increase in opioid overdoses.
ggplot(dc_synth_df, aes(x = date, y = data_value)) +
geom_line(color = "steelblue", linewidth = 1.3) +
scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
labs(
title = "Synthetic Opioid Overdose Trends for DC",
x = "Date",
y = "Deaths"
) +
theme_minimal()
Given that synthetic opioids account for the majority of opioid
overdose deaths in DC, it is not surprising that the trend for
synthetic opioid overdoses closely matches the trend for all opioid
overdoses. There are sharp increases during the same periods:
2016-2018, 2019-2021, and the course of 2023. It also mirrors the
decline between 2018 and 2019. We can reasonably conclude that
synthetic opioids are driving the overall opioid trend seen in the
previous figure.
ggplot(dc_heroin_df, aes(x = date, y = data_value)) +
geom_line(color = "lightsalmon", linewidth = 1.3) +
scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
labs(
title = "Heroin Overdose Trends for DC",
x = "Date",
y = "Deaths"
) +
theme_minimal()
Heroin shows a much different trend than synthetic opioids. While it
has a similar spike between 2016 and 2018, it declines much more
substantially between 2018 and 2019. It stabilizes around 100 deaths
per 12-month period between 2019 and 2021, followed by a sharp decline
in 2022 and 2023 to its lowest reported level of the entire nine-year
span. While there is a slight increase during 2023, the final value is
within striking distance of the first reports in 2015. This differs
from synthetic opioid overdoses, which were more than thirty times
higher in the final reports in 2023 (493) than in the initial reports
in 2015 (15).
ggplot() +
geom_line(data = dc_heroin_df, aes(x = date, y = data_value, color = "Heroin"), linewidth = 1.2) +
geom_line(data = dc_synth_df, aes(x = date, y = data_value, color = "Synthetic Opioids"), linewidth = 1.2) +
labs(
title = "DC Drug Overdose Trends: Heroin vs Synthetic Opioids",
x = "Date",
y = "Number of Deaths",
color = "Drug Type"
) +
scale_color_manual(values = c("Heroin" = "red", "Synthetic Opioids" = "blue")) +
theme_minimal()
This graph further illustrates the difference in trends between
heroin and synthetic opioids. Both spike between 2016 and 2018, but
they diverge after that, with heroin declining steadily and synthetic
opioids rising rapidly. Although this is speculation and more research
would be required to confirm it, it appears that around 2018-2020
heroin became scarcer and synthetic opioids became drastically more
widespread.
ggplot() +
geom_line(data = dc_nonsynth_df, aes(x = date, y = data_value, color = "Non-Synthetic & Semi-Synthetic Opioids"), linewidth = 1.2) +
geom_line(data = dc_synth_df, aes(x = date, y = data_value, color = "Synthetic Opioids"), linewidth = 1.2) +
labs(
title = "DC Drug Overdose Death Trends: \nNon-Synthetic & Semi-Synthetic vs Synthetic Opioids",
x = "Date",
y = "Number of Deaths",
color = "Drug Type"
) +
scale_color_manual(values = c("Non-Synthetic & Semi-Synthetic Opioids" = "orange", "Synthetic Opioids" = "green2")) +
theme_minimal()
This figure highlights the grave disparity between synthetic
opioid-related deaths and their non-synthetic and semi-synthetic
counterparts. While synthetic opioid overdoses have risen
dramatically, non-synthetic and semi-synthetic opioid overdoses have
remained mostly stable throughout the nine years, never exceeding 50
deaths per 12-month period. This further illustrates that synthetic
opioids should be the primary focus of concern for the D.C. community.
ggplot(yearly_opioid_deaths, aes(x = year, y = total_deaths)) +
  geom_line(color = "lightgreen", linewidth = 1.3) +
  geom_point(size = 1.5) +
  # The first year has no previous year, so its percent_change is NA; leave it unlabeled
  geom_text(aes(label = ifelse(is.na(percent_change), "", paste0(round(percent_change), "%"))), vjust = -1, size = 3) +
  scale_x_continuous(breaks = seq(min(yearly_opioid_deaths$year), max(yearly_opioid_deaths$year), by = 1)) +
  labs(title = "Year-Over-Year Opioid Overdose Trends in DC", x = "Year", y = "Total Deaths") +
  theme_minimal()
This figure shows which years had the largest proportional increase
in deaths relative to the previous year. From 2016 to 2017, opioid
overdoses increased by 89%. Curiously, every year except 2017-2018
showed an increase in overdoses. Further research would be required to
determine what caused the decrease from 2017 to 2018.
But aside from opioids, how have cocaine-related overdoses trended
during this timespan? Has it only been opioids on the rise or has
cocaine also surged?
ggplot() +
geom_line(data = dc_cocaine_df, aes(x = date, y = data_value, color = "Cocaine"), linewidth = 1.2) +
geom_line(data = dc_synth_df, aes(x = date, y = data_value, color = "Synthetic Opioids"), linewidth = 1.2) +
labs(
title = "DC Drug Overdose Death Trends: Cocaine vs Synthetic Opioids",
x = "Date",
y = "Number of Deaths",
color = "Drug Type"
) +
scale_color_manual(values = c("Cocaine" = "darkorchid", "Synthetic Opioids" = "salmon")) +
theme_minimal()
Cocaine follows a strikingly similar trend to synthetic opioids, with
the main difference being that its spikes and decreases seem to lag
slightly behind. This raises the question of whether overall drug use
in D.C. has increased over the years, or whether, given the similarity
to the synthetic opioid trend, cocaine is being contaminated by the
synthetic opioids flowing into the D.C. area. Further research on this
question is required.
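The impression that one series lags another can be checked with a cross-correlation, which is not part of the original analysis. The sketch below uses `ccf()` from base R's stats package on an invented pair of series in which the second is an exact two-step delayed copy of the first. The names `leader` and `follower` are hypothetical.

```r
set.seed(1)
x <- rnorm(60)        # invented series, not overdose data

leader   <- x[3:60]
follower <- x[1:58]   # follower[t] equals leader[t - 2], so it trails by two steps

# ccf(a, b) at lag k estimates the correlation between a[t + k] and b[t].
cc <- ccf(leader, follower, lag.max = 6, plot = FALSE)

# The lag with the highest correlation; a negative value means `leader` moves first.
best_lag <- cc$lag[which.max(cc$acf)]
print(best_lag)
```

Applied to the real data, the cocaine and synthetic opioid series would first need to be aligned by date, and because both trend strongly upward, differencing them beforehand would probably be necessary for the result to be meaningful.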
ggplot(yearly_cocaine_deaths, aes(x = year, y = total_deaths)) +
  geom_line(color = "turquoise1", linewidth = 1.3) +
  geom_point(size = 1.5) +
  # The first year has no previous year, so its percent_change is NA; leave it unlabeled
  geom_text(aes(label = ifelse(is.na(percent_change), "", paste0(round(percent_change), "%"))), vjust = -1, size = 3) +
  scale_x_continuous(breaks = seq(min(yearly_cocaine_deaths$year), max(yearly_cocaine_deaths$year), by = 1)) +
  labs(title = "Year-Over-Year Cocaine Overdose Trends in DC", x = "Year", y = "Total Deaths") +
  theme_minimal()
This figure shows which years had the largest proportional increase
in cocaine overdoses relative to the previous year. The largest
increase came between 2016 and 2017, when deaths more than doubled
relative to the previous year. As with opioids, every year showed
growth except 2017-2018. Again, further research would be required to
determine what caused the decrease from 2017 to 2018.
The breakdown and trends of drug overdoses in D.C. between 2015 and
2023 show us a few key findings. The most concerning finding is the
sharp rise in opioid-related deaths, seemingly caused by the meteoric
rise of synthetic opioids in the region. For January of 2015, the
dataset reports only 15 synthetic opioid-related overdose deaths. For
December of 2023, it reports 493. That is roughly a 3,187% increase, or
nearly 33 times the 2015 figure. Opioid deaths, except for 2017 to
2018, have only been on the rise, and the trend shows no signs of
stopping.
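As a check on the arithmetic, a percent increase subtracts the starting value before dividing by it, whereas dividing the final value by the starting value gives a ratio. A short sketch using the two synthetic opioid figures quoted above:

```r
first_value <- 15   # January 2015, as quoted above
last_value  <- 493  # December 2023, as quoted above

# How many times larger the final value is than the first.
ratio <- last_value / first_value

# Percent increase: the change relative to the starting value.
percent_increase <- 100 * (last_value - first_value) / first_value

print(round(ratio, 2))
print(round(percent_increase))
```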
In contrast, heroin and non-synthetic and semi-synthetic opioids have
not seen the same increase as synthetic opioids. Heroin overdoses
spiked alongside synthetic opioid overdoses during the first major
increase in opioid overdoses between 2016 and 2018. However, instead of
continuing to rise like synthetic opioids, heroin overdoses trended
back down to their previous baseline and have remained mostly stable
since. Non-synthetic and semi-synthetic opioids have remained stable
over the entire nine-year span. It is clear that synthetic opioids are
the driving force behind the increase in opioid overdoses, and that is
where D.C. should direct its attention.
Future research should focus on why synthetic opioid overdoses have
increased so starkly in D.C., whether these trends are reflected in
other states, and how the rise can be addressed. Finding out what
caused the decrease between 2017 and 2018 could reveal other important
variables that are not present in this dataset.
Another notable trend that warrants further research is the similar
trend in cocaine overdoses in D.C. Follow-up research should examine
its cause: whether it is simply an overall increase in drug use in
D.C., or whether another factor is at play, such as the cocaine supply
being contaminated with these highly lethal synthetic opioids.
Compute lagged or leading values - lead-lag. dplyr. (n.d.). https://dplyr.tidyverse.org/reference/lead-lag.html
National Center for Health Statistics. (2018, June 3). VSRR Provisional Drug Overdose Death Counts. https://data.cdc.gov/National-Center-for-Health-Statistics/VSRR-Provisional-Drug-Overdose-Death-Counts/xkb8-kh2a/about_data
Rinker, T. (2013, September 19). Paste, paste0, and sprintf.
TRinker’s R Blog. https://trinkerrstuff.wordpress.com/2013/09/15/paste-paste0-and-sprintf-2/