ONDCP Preliminary Strategy Analysis

Part 1: Frequency-based analysis of ONDCP strategy documents

This part of the document is based on a simple analysis of target word frequencies within each of the Office of National Drug Control Policy (ONDCP) strategy documents contained within the “ONDP Spending by Function” spreadsheet. These documents span the years 1989-1991, 1993-2016, and 2019-2025.

Target word frequencies were extracted from each of the strategy documents using Python in Google Colab. Code facilitating the extraction can be found here. The raw results from the extraction are located in the “strategy_counts” worksheet of the aforementioned spreadsheet, located here.

Because of extreme variance in the word count of each strategy document (ranging from 2,000 to 100,000 words), each target word frequency is expressed as the percentage of each document’s total word count represented by the target word.
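The normalization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the extraction code linked above; the function name `term_share`, the tokenizing pattern, and the sample sentence are all hypothetical.

```python
import re

def term_share(text: str, pattern: str) -> float:
    """Percentage of a document's words that fully match a target-word pattern."""
    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if re.fullmatch(pattern, w))
    return 100 * hits / len(words)

# Made-up sample text, not taken from a strategy document.
sample = "Fentanyl seizures rose. Heroin and fentanyl were both cited."
share = term_share(sample, r"fentanyl")
print(f"{share:.2f}% of words match")
```

Expressing counts this way lets a 2,000-word document and a 100,000-word document be placed on the same axis.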

It should be noted that this document is intended to serve as a tool/tech workflow demonstrator and a jumping-off point for future analyses, rather than an exhaustive analysis in its own right.

a) Substances in ONDCP strategy documents

This graph reflects conventional wisdom surrounding American drug control strategy. Documents from the ’90s emphasize heroin, cocaine, and marijuana, with a small but escalating focus on meth. By the mid-2000s, mentions of “prescription” (which include both prescription drugs and drug control initiatives like PDMPs) begin to supplant cocaine, peaking between 2009 and 2015.

Interestingly, mentions of fentanyl in strategy documents only began in earnest (57 mentions, 0.12%) in 2016 — long after fentanyl entered the drug supply in the United States. As the crisis created by synthetic opioids grew, however, mentions of “cocaine”, “fentanyl” and “opioid” began to crowd out all other substance types, including heroin and meth.

Of additional interest are the (faintly visible) mentions of fentanyl in the 2007 strategy document. Closer inspection reveals that these mentions reflect the discovery of fentanyl in the drug supply of Chicago, Detroit and Philadelphia in the summer of 2006. Rather than recommending an increase in treatment services or harm reduction practices, the strategy document focused on primary prevention through community “fentanyl awareness forums” and interdiction through DEA operations against fentanyl producers.

Pictured: The fentanyl notice in the 2007 strategy document.

b) Prevention strategy within ONDCP strategy documents

Interdiction was the ONDCP’s first instinct when presented with the first hint of the fentanyl threat. This raises the question: what drug control strategies has ONDCP prioritized over time?

The following graph compares mentions of primary prevention (screenings, drug awareness efforts, etc.), secondary prevention (treatment, recovery, etc.) and tertiary prevention (harm reduction, syringe exchange programs, etc.) within ONDCP documents.

These trends paint a curious picture: from the earliest strategy documents to the present, secondary prevention efforts have received the most emphasis from ONDCP. Tertiary efforts only received significant mentions beginning in 2012, with mentions remaining limited through both the Obama and Trump administrations.

Commensurate with the Biden administration’s expanded focus on harm reduction services, mentions of tertiary prevention efforts exploded during the Biden-era strategy documents - a trend reversed within the first strategy document of the second Trump administration.

The following graph will further explore the ONDCP focus on tertiary prevention by tracking mentions of the term “harm reduction”.

Here, the trend becomes even more stark: other than a sporadic presence during the early Bush administration, mentions of harm reduction in national strategy documents were virtually nonexistent until the Biden administration.

Further investigation reveals that the only reason that Bush-era strategy documents even mentioned harm reduction was to condemn it. A passage from the 2002 ONDCP strategy document is quite telling:

A probe of MOUD mentions reveals similar drug war logic. Until 2020, methadone and naltrexone were the sole forms of MOUD mentioned within strategy documents. The term MOUD itself was not mentioned until the 2021 strategy document, and curiously, buprenorphine (either as buprenorphine or suboxone) has never been mentioned within a strategy document.

The detour examining specific forms of prevention is best concluded by determining the extent to which demand-focused prevention (the aforementioned primary/secondary/tertiary methods) competed with supply-focused prevention (that is, interdiction).

The following graph also chronicles the ONDCP focus on prevention strategies, with a twist: cataloguing words associated with drug-war era interdiction strategies (words like “border”, “surveillance”, and “smuggling”).

Including words associated with interdiction makes the longevity of the drug war clear. Throughout the ’90s and 2000s, up to the mid-2010s, interdiction was much more heavily represented in strategy documents than any single type of prevention - and sometimes more than all forms of prevention combined.

Part 2: Using topic modeling to more systematically analyze strategy documents with BERTopic

The word-frequency-based method used up to this point is good for simple analyses. As displayed above, frequency-based methods have yielded useful conclusions when supplied with appropriately specific target words.

This being said, frequency-based analyses are of questionable validity. Categories for analysis (e.g. primary/secondary prevention) are dependent on the completeness of the words associated with each category. Even analyses of specific words require proper regular expressions that match these words (e.g. use(r|rs) vs. user). All told, it is difficult to develop a complete portrait of each strategy document by nipping at parts of it with a frequency-based analysis.
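To make the regular-expression point concrete, here is a small sketch comparing a literal pattern with the alternation mentioned above. The sample sentence is made up, and the word-boundary anchors (`\b`) are my own assumption about how whole-word matching would be enforced.

```python
import re

# Made-up sentence for illustration.
text = "Drug users and each user of treatment services; overuse is not a match."

# Literal pattern: only the singular form is matched as a whole word.
singular_only = re.findall(r"\buser\b", text.lower())

# Alternation: matches both "user" and "users" as whole words.
both_forms = re.findall(r"\buse(?:rs|r)\b", text.lower())

print(singular_only)
print(both_forms)
```

A count based on the literal pattern would silently miss every plural, which is exactly the kind of gap that undermines frequency-based categories.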

Solving this problem means developing a mechanism for analyzing the whole set of strategy documents at once. With a smaller corpus of documents, we would likely employ some form of content analysis by hand. However, the total corpus of strategy documents is in excess of 3,000 pages, which makes hand-coding challenging.

Enter topic modeling. Topic modeling uses machine learning to determine the representation of topics in a body of documents. Different topic models operate differently, but most topic models seek to uncover topics by approximating the relationship between words in a document. “Topics” are collections of related words - it is up to the user to label each topic. For example, a topic might contain the words “border”, “surveillance”, and “control” - we might brand this the “interdiction” topic.

Different documents contain different topics in different proportions. Different topic modeling strategies represent the topics contained within documents in different ways. Some models can function without significant training (unsupervised learning), and other models need pre-formed topics to be supplied by the user (supervised learning). Some models can be initially supplied with possible topics, but have the ability to uncover further hidden topics from the data.

Choosing a topic model to analyze strategy documents means balancing training requirements, the capacity of the model to identify complex and heavily inter-related topics, and the ability to create accessible visualizations from model data.

Satisfying this set of parameters led me to choose BERTopic to analyze the strategy documents.

a) What is BERTopic?

BERTopic (documentation here) is a Python-implemented topic modeling technique based on a language representation model known as BERT (Bidirectional Encoder Representations from Transformers). A granular explanation of how both BERT and BERTopic work is beyond the scope of this document (the creators explain BERT here and BERTopic here), but for purposes of clarity, I will give a cursory overview of how BERTopic functions.

Essentially, the BERTopic model receives information on relationships between words across the English language (this process is known as pre-training) and starts with default “ideas” on word relationships within a document. These default ideas generated from pre-training are called embeddings.

With these embeddings as a base, BERTopic then clusters the documents with similar embeddings using an algorithm called HDBSCAN (whose inner workings are beyond the scope of this document). For example, documents that are heavily comprised of words like “methadone” and “buprenorphine” will have similar embeddings and will be clustered together. Necessarily, this requires that each document is assigned to only a single cluster at a time.

A neat measure called TF-IDF (Term Frequency-Inverse Document Frequency) is used to turn clustered documents into topics. TF-IDF multiplies the frequency of a term in a document (a measure of how important the term is to the document) by the inverse of the likelihood of that term cropping up in other documents. The highest-scoring words can best differentiate that specific cluster from other clusters - forming a topic. Basically, it identifies the most important words by finding words that are really likely to show up in a topic but less likely to show up in other topics.
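The TF-IDF idea can be sketched with the standard library alone. This uses the textbook formula (term frequency times the log of inverse document frequency) and treats each cluster as one pooled document; BERTopic itself uses a class-based variant whose exact weighting differs, so this is an approximation of the concept, not of the library's internals. The clusters and their words are hypothetical.

```python
import math
from collections import Counter

# Hypothetical clusters: each holds the pooled words of the pages assigned to it.
clusters = {
    "treatment": "methadone treatment clinic methadone recovery drug".split(),
    "border": "border smuggling surveillance border drug".split(),
    "youth": "school youth prevention school drug".split(),
}

def tf_idf(clusters):
    """Score each word per cluster: term frequency * log(N / clusters containing it)."""
    n = len(clusters)
    # Number of clusters in which each word appears at least once.
    df = Counter(word for words in clusters.values() for word in set(words))
    scores = {}
    for name, words in clusters.items():
        counts = Counter(words)
        scores[name] = {
            w: (c / len(words)) * math.log(n / df[w]) for w, c in counts.items()
        }
    return scores

scores = tf_idf(clusters)
for name, word_scores in scores.items():
    top = max(word_scores, key=word_scores.get)
    print(name, "->", top)
```

Note that “drug” scores zero everywhere: it appears in every cluster, so it cannot distinguish one cluster from another.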

With this basic understanding of BERTopic, we can now transition to figuring out how it was implemented with reference to the strategy documents.

b) How was BERTopic used?

For this analysis, BERTopic was used to create a topic model that spanned the entire corpus of ONDCP strategy documents pertaining to the following years: 1989-1992, 1993-2003, 2005-2016, and 2019-2025.

Prior to the analysis, each strategy document was separated into individual pages. This was done because the full strategy documents were individually too big for BERTopic to create descriptive topics. This can be traced to BERTopic’s reliance on clustering each document into one (and only one) cluster - the strategy documents contain far too many topics for BERTopic to effectively cluster them.

The initial result of a BERTopic model attempting to cluster full strategy documents. “Topic -1” refers to documents that could not be otherwise classified.

After being separated into pages, blank pages were removed, and the text was pre-processed by removing digits and converting all letters to lowercase (though the latter processing step didn’t occur completely). The topic model was then created with the BERTopic() function. All of the code behind this analysis can be found here. Results have been visualized with BERTopic’s native visualization solutions, which use Plotly.
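The pre-processing steps described above might look roughly like the following. This is a sketch under my own assumptions, not the linked analysis code: the function name `preprocess_pages` and the sample pages are hypothetical, and pages are assumed to arrive as a list of strings.

```python
import re

def preprocess_pages(pages):
    """Strip digits, lowercase the text, and drop pages left blank."""
    cleaned = []
    for page in pages:
        text = re.sub(r"\d+", "", page).lower().strip()
        # Pages that are empty (or contained only digits) are dropped.
        if text:
            cleaned.append(text)
    return cleaned

# Made-up pages for illustration.
raw_pages = ["National Drug Control Strategy 1989", "   ", "Page 12: Interdiction"]
docs = preprocess_pages(raw_pages)
print(docs)
```

A list like `docs` is the kind of input that `BERTopic().fit_transform(docs)` accepts, returning one topic assignment per page.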

It is important to stress that because topic modeling with BERTopic involves randomness, re-running this analysis may yield similar conclusions, but not the exact same data.

c) Conclusions from the BERTopic analysis

1) Global topic modeling

The first task in understanding the results of the BERTopic analysis is to understand how the topics are distributed across documents (pages). Because BERTopic assigns each document (page) to exactly one topic, it is relatively simple to determine the most common topics.

Breakdown of topics in the topic model, organized by how many documents correspond to that topic. This graph was made in Plotly with BERTopic’s native visualizer.

From this breakdown, we can see that “topic -1” is the most common result. Remember, however, that “topic -1” contains documents (pages) that could not otherwise be classified - this means that 843 out of 3,004 strategy document pages (or about 28.1%) contained no recognizable topics. The largest comprehensible topic is topic 0, with 137 pages belonging to it, followed by topic 1 with 87 pages, and so on. The higher the topic number, the fewer pages belong to it.
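For readers who want to reproduce this kind of tally, here is a minimal sketch. The list of assignments is hypothetical (one topic id per page, with -1 for unclassified pages); only the 843 and 3,004 figures come from the analysis above.

```python
from collections import Counter

def topic_sizes(assignments):
    """Count pages per topic, largest first."""
    return Counter(assignments).most_common()

def outlier_share(n_outliers, n_pages):
    """Percentage of pages left in the outlier topic (-1)."""
    return 100 * n_outliers / n_pages

# Hypothetical page-level assignments, one topic id per page.
assignments = [-1, 0, 0, 1, -1, 0, 2, -1, -1, 1]
print(topic_sizes(assignments))

# Figures reported above: 843 unclassified pages out of 3,004.
print(f"{outlier_share(843, 3004):.1f}%")
```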

Now that we’re aware of topic distributions, we need an idea of what the topics represent.

BERTopic classified each document (page) by topic, and obtained the words most relevant to each topic by TF-IDF score. Here, we see both the strengths and limitations of BERTopic.

4 out of the 5 most common topics correspond to recognizable elements of ONDCP strategy over time. The most common topic contains words like “prevention”, “youth”, and “school” - which we can readily interpret as pertaining to primary prevention efforts among children. The next topic, with words like “justice”, “courts”, and “offenders” can be readily interpreted as pertaining to a criminal-legal approach to drug use and control. This trend can be repeated to summarize most of the 8 most common topics: topic 3 pertains to border interdiction, topic 4 pertains to cocaine-focused interdiction, and topic 5 pertains to recovery.

Now, there are some challenges associated with BERTopic. Topic 2 appears to be filled with corrupted values (as is topic 27), for reasons that I don’t currently understand. This is why the bar chart for topic 2 shows no words. Additionally, because the entirety of the ONDCP strategy documents was used to train the topic model, some topics are more artifacts of the way that strategy documents are presented - for example, the mention of “NCJRS” in topic 6 likely pertains to the initial/final strategy document pages that mention the National Criminal Justice Reference Service as the publisher of the strategy documents.

In addition to snappy visualizations, we can understand topics in a more systematic fashion, as well. We can set the number of words in each topic (right now, the default is 10), and obtain the most relevant words (by TF-IDF score) for every topic like so:

Topic 19, which consists of words related to crime and local/state law enforcement.
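As a rough sketch of how such a listing could be pulled programmatically: a fitted BERTopic model exposes `get_topic(topic_id)`, which returns (word, score) pairs, and the number of words kept per topic is set with the `top_n_words` argument when the model is created. The helper `top_words` and the `StubModel` class below are hypothetical, and the stub's words and scores are invented so that the example runs without BERTopic installed.

```python
def top_words(topic_model, topic_id, n=10):
    """Return up to n (word, score) pairs for one topic, highest score first."""
    # Assumes topic_model is a fitted model whose get_topic(topic_id)
    # returns a list of (word, score) tuples, as BERTopic's does.
    pairs = topic_model.get_topic(topic_id)
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:n]

class StubModel:
    """Hypothetical stand-in for a fitted model; words and scores are invented."""
    def get_topic(self, topic_id):
        return [("police", 0.05), ("crime", 0.08), ("local", 0.03)]

print(top_words(StubModel(), 19, n=2))
```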

2) Dynamic topic modeling

Where BERTopic really shows its superiority as a topic modeling strategy is in its ability to show the evolution of topics over time. BERTopic supports a dynamic topic model based on the original (or global) topic model. All you have to do is supply a list of dates (can be years, months, etc.) that correspond to the list of documents, and you can see how the topic model changes over time.
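The core of the idea, pairing each page's topic with a date and tallying by period, can be sketched without the library. In BERTopic this step is handled by `topics_over_time(docs, timestamps)`, which also recomputes each topic's word representation per period; the stand-in below only counts. The function name, topic labels, and sample data are hypothetical.

```python
from collections import Counter, defaultdict

def topic_share_by_year(topics, years):
    """For each year, the share of pages assigned to each topic."""
    counts = defaultdict(Counter)
    for topic, year in zip(topics, years):
        counts[year][topic] += 1
    return {
        year: {t: c / sum(ctr.values()) for t, c in ctr.items()}
        for year, ctr in sorted(counts.items())
    }

# Hypothetical assignments: topic 3 = interdiction, topic 5 = recovery.
topics = [3, 3, 5, 3, 5, 5, 5, 3]
years = [1990, 1990, 1990, 1990, 2022, 2022, 2022, 2022]
shares = topic_share_by_year(topics, years)
print(shares)
```

Using shares rather than raw counts keeps years with long strategy documents from dominating the comparison.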

Through this method, we can glean all sorts of useful information. First, and most obviously, we can see which global topics (occurring across all periods) are most common at which times.

8 of the most common comprehensible (not corrupt values or random artifacts) topics were selected to create this visualization. Readily apparent are changes in drug policy throughout presidential administrations. The H.W. Bush and Clinton Administrations contained a significant emphasis on interdiction efforts against drug trafficking - as evidenced by the most frequently mentioned global topics being topic 3 (literally containing the word “interdiction”) and topic 4 (with the words “Colombia”, “cocaine”, and “Peru”). By contrast, Biden administration documents focused very heavily on recovery, and very little on interdiction.

We can also determine how general topics changed over time. For this exercise, we’ll focus on topic 0 (referring to youth prevention outcomes) in 2008 and 2020. Comparing these two years shows what youth prevention looked like in each era.

Here, differences in prevention priorities make themselves clear. In 2008, prevention was administered through programs that focused on being 100% drug free - the drug-free communities (DFC) and CC-ASAP (a local Kentucky prevention program emphasizing drug-free environments for youth). In contrast, “prevention” in 2020 was more generally rooted in reducing drug use, especially among youth.

This leads to the most interesting application of dynamic topic modeling: out of the individual pages, we can reverse-engineer a topic model representing each individual strategy document, and compare them - yielding rich data on how ONDCP strategy has changed over time.

All told, approaches to strategy document analysis based in topic modeling have the potential to uncover a significant amount of information with a comparatively limited expenditure of time and resources.

Part 3: Future Directions

a) Validation

Part of the original reasoning for using topic modeling was the prospect of approaching document analysis with a method that could stand up to scrutiny. Because the results that I have put forward are so preliminary, I have not applied validation methods of any kind.

One useful form of validation involves sampling a set of pages (10-15% of the document set), having knowledgeable humans sort them into the topic categories created by the algorithmic model, and comparing the accuracy of human topic modeling with algorithmic modeling. Accuracy, in this case, would be the percentage of cases in which humans and the algorithm agree. This method was used by Ortega et al. to validate topic modeling of CDC abstracts - a paper that coincidentally used BERTopic.
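A minimal sketch of this validation workflow follows. It is my own illustration, not the procedure from Ortega et al.: the function names, the fixed random seed, and the example labels are all hypothetical.

```python
import random

def sample_pages(page_ids, fraction=0.10, seed=0):
    """Draw a reproducible random sample of pages for human coding."""
    rng = random.Random(seed)
    k = max(1, round(len(page_ids) * fraction))
    return sorted(rng.sample(page_ids, k))

def percent_agreement(human, model):
    """Percentage of sampled pages where human and model labels match."""
    if len(human) != len(model):
        raise ValueError("label lists must be the same length")
    matches = sum(h == m for h, m in zip(human, model))
    return 100 * matches / len(human)

# 10% of the 3,004 pages in the corpus.
sampled = sample_pages(list(range(3004)), fraction=0.10)
print(len(sampled))

# Made-up labels for five sampled pages.
human_labels = [0, 3, 5, 1, 0]
model_labels = [0, 3, 4, 1, -1]
print(percent_agreement(human_labels, model_labels))
```

Simple percent agreement does not correct for matches that occur by chance, so it is best read as a first-pass check.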

Additionally, there are mathematical methods of validating BERTopic models (measures of topic coherence), but these are presently beyond my understanding.

b) Supervised topic modeling

This topic model was essentially unsupervised (on my part). Even though BERTopic requires pre-training to determine default word embeddings (see 2(a)), I did not pass any data to guide the actual topic modeling itself. This doesn’t have to be the case. BERTopic supports supervised topic modeling, where you basically manually input topics and the model tries to find documents that correspond with those topics. A manual input would look something like this (from BERTopic documentation).
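One way to express such an input is BERTopic's `seed_topic_list` parameter, which the library calls guided topic modeling. The sketch below is illustrative rather than copied from the documentation: the seed words are my own picks based on terms used earlier in this document, and `build_guided_model` is a hypothetical helper that imports BERTopic only when called.

```python
# Hypothetical seed words, one list per topic of interest.
seed_topic_list = [
    ["harm", "reduction", "syringe", "exchange"],
    ["border", "surveillance", "smuggling", "interdiction"],
    ["treatment", "recovery", "methadone", "naltrexone"],
]

def build_guided_model(seed_topics):
    """Create a BERTopic model nudged toward the seeded topics."""
    # Third-party dependency, imported lazily so the list above
    # can be inspected without BERTopic installed.
    from bertopic import BERTopic
    return BERTopic(seed_topic_list=seed_topics)

for words in seed_topic_list:
    print(", ".join(words))
```

Seeding nudges the model toward these themes without forcing every page into one of them, so unexpected topics can still surface.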

Because we’re working with a lot of topics that we’re familiar with (e.g. we know which words correspond with harm reduction), supervised topic modeling may provide data that is more acutely tailored to our needs - allowing us to develop a picture of federal drug control strategy that assesses when certain trends (e.g. the disparate focus on crack) were at their peak.