Introduction

How often do you stop to consider what will happen if you get caught?

Maybe the question made you think of the buddy you copied answers from back in college. Maybe you “accidentally” walked out of a store with a candy bar a few days ago. Maybe you even bought substances under the table. Everyone has done something wrong at some point, consciously or not, but not everyone faces equitable consequences for their actions.

The U.S. criminal justice system is deeply intertwined with our daily lives, yet it often feels distant to the average person. News outlets tend to report on the most dramatic cases, and their coverage is arguably biased by who the perpetrator is. Society likes to believe the system is just, but the validity and equity of a ruling are often contested on demographic grounds. There’s subjectivity in punishment, just as there is in rewards.

This project takes a closer look at variation in 2018 federal criminal sentencing, using data from the United States Sentencing Commission. Compared with today, 2018 was a relatively “simpler” time with a considerably less polarized political landscape. It was arguably clearer where society stood on matters such as abortion, drug trafficking, and sexual misconduct.

This analysis examines factors that may influence sentencing for four major drug types: meth, crack, cocaine, and heroin. As a college student, my close exposure to drugs and drug users motivated me to explore this topic.

To ground the data and understand the social landscape in 2018, the app below subsets U.S. state populations by racial demographics — a factor often cited in criminal appeals involving claims of racial bias in jury selection or trial outcomes.

[Interactive Shiny app: U.S. state populations by racial demographic. Not rendered in this static document.]

The federal court system in the U.S. is divided regionally through the U.S. Courts of Appeals, which are organized into 13 circuits (11 numbered circuits, plus D.C. and the Federal Circuit). Each numbered circuit covers a group of states and plays a major role in how federal law is interpreted regionally. This map contextualizes demographics for each of the 50 U.S. states, excluding D.C. and territories, both relative to the entire U.S. and within each circuit. The data set had minimal drug case observations specific to D.C., justifying the exclusion. The Federal Circuit was excluded as well: it mainly handles appeals, and the focus here is on original jurisdiction rulings.

Things to note: some states have much larger populations than others (e.g. Texas vs. Rhode Island), so pay close attention to the counts listed at the bottom of the output. Also, white individuals appear to make up more of the observation set, given the more balanced gradient for that racial demographic.

The table below assigns the US states to each circuit for easier viewing and comparison.

States by Federal Circuit

| Circuit | States |
|---|---|
| First | Maine, Massachusetts, New Hampshire, Rhode Island |
| Second | Connecticut, New York, Vermont |
| Third | Delaware, New Jersey, Pennsylvania |
| Fourth | Maryland, North Carolina, South Carolina, Virginia, West Virginia |
| Fifth | Louisiana, Mississippi, Texas |
| Sixth | Kentucky, Michigan, Ohio, Tennessee |
| Seventh | Illinois, Indiana, Wisconsin |
| Eighth | Arkansas, Iowa, Minnesota, Missouri, Nebraska, North Dakota, South Dakota |
| Ninth | Alaska, Arizona, California, Hawaii, Idaho, Montana, Nevada, Oregon, Washington |
| Tenth | Colorado, Kansas, New Mexico, Oklahoma, Utah, Wyoming |
| Eleventh | Alabama, Florida, Georgia |
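
For anyone reproducing the grouping, here is a minimal sketch of how the state-to-circuit assignment above could be stored as a lookup and joined onto case records. The object names (`circuit_states`, `circuit_lookup`, `cases`) and the toy rows are hypothetical; this is not the code behind the original table.

```r
# Circuit -> states, copied from the table above (50 states, no D.C.).
circuit_states <- list(
  First    = c("Maine", "Massachusetts", "New Hampshire", "Rhode Island"),
  Second   = c("Connecticut", "New York", "Vermont"),
  Third    = c("Delaware", "New Jersey", "Pennsylvania"),
  Fourth   = c("Maryland", "North Carolina", "South Carolina",
               "Virginia", "West Virginia"),
  Fifth    = c("Louisiana", "Mississippi", "Texas"),
  Sixth    = c("Kentucky", "Michigan", "Ohio", "Tennessee"),
  Seventh  = c("Illinois", "Indiana", "Wisconsin"),
  Eighth   = c("Arkansas", "Iowa", "Minnesota", "Missouri", "Nebraska",
               "North Dakota", "South Dakota"),
  Ninth    = c("Alaska", "Arizona", "California", "Hawaii", "Idaho",
               "Montana", "Nevada", "Oregon", "Washington"),
  Tenth    = c("Colorado", "Kansas", "New Mexico", "Oklahoma", "Utah",
               "Wyoming"),
  Eleventh = c("Alabama", "Florida", "Georgia")
)

# Long format: one row per state with its circuit.
circuit_lookup <- stack(circuit_states)
names(circuit_lookup) <- c("state", "circuit")
circuit_lookup$circuit <- as.character(circuit_lookup$circuit)

# Toy case records (made up) to show the join.
cases <- data.frame(state = c("Texas", "Virginia", "Iowa"))
cases <- merge(cases, circuit_lookup, by = "state", all.x = TRUE)
```

Keeping the mapping in one place means every later plot groups states the same way.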
[Interactive tree map: case counts by circuit across all drug types, with a per-state breakdown. Not rendered in this static document.]

These tree maps help identify the circuits with the most cases across all drug types. The visualization is interactive: it can open a pop-up tree map that breaks each circuit down into its states. The circuits with the most cases in the data set are Circuits 8, 5, and 9, in that order. This makes sense given that Circuits 8 and 9 cover seven and nine states respectively, and Circuit 5 includes Texas. As they say, everything is bigger in Texas!
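
A tree map like the one described above needs case counts at two levels, circuit and state within circuit. The sketch below shows one way to compute them with `dplyr::count()`. The column names `circuit` and `state` and the toy rows are assumptions, not the actual USSC variables.

```r
library(dplyr)

# Toy records standing in for the sentencing data (values are made up).
cases <- data.frame(
  circuit = c("Fifth", "Fifth", "Fifth", "Eighth", "Eighth", "Ninth"),
  state   = c("Texas", "Texas", "Louisiana", "Iowa", "Missouri", "Arizona")
)

# Top level of the tree map: cases per circuit, largest first.
by_circuit <- dplyr::count(cases, circuit, name = "n_cases", sort = TRUE)

# Drill-down level: cases per state within each circuit.
by_state <- dplyr::count(cases, circuit, state, name = "n_cases")
```

The `by_state` table has the shape that tree map functions typically expect: grouping columns plus a numeric size column.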

To introduce some social commentary: crack cocaine and powder cocaine are essentially the same drug in different forms. Historically, Black and other minority defendants have been more likely to be convicted of crack offenses (which carry harsher guidelines), whereas their white counterparts tend to be convicted under powder cocaine guidelines (https://pmc.ncbi.nlm.nih.gov/articles/PMC4533860/#:~:text=African%20Americans%20are%20also%20more,Vagins%20and%20McCurdy%2C%202006). The tree map below explores only crack and powder cocaine cases.

This tree map excludes heroin and meth, and it looks very different from the previous map, which included them. The leading circuits here are Circuits 4, 11, and 5, with Circuit 4 the largest by 30 cases. As a reminder, the states in the 4th Circuit are Maryland, North Carolina, South Carolina, Virginia, and West Virginia, all relatively close to the nation’s capital. Laws near D.C. are often assumed to be strict, so let’s frame a follow-up question to investigate whether proximity to D.C. affects the severity of cocaine sentencing.

“Is the median sentence for crack and powder cocaine offenses harsher in the 4th Circuit (states close to D.C.) compared to other federal circuits?”

To visualize this, an overlaid density plot with median markers helps compare 4th Circuit sentences to the aggregate of all other circuits.
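
As a rough illustration of the plot described above, here is a minimal `ggplot2` sketch of two overlaid densities with a dashed median line for each group. The data are simulated and the column names (`group`, `sentence_months`) are hypothetical; it is not the code or data behind the original figure.

```r
library(ggplot2)

set.seed(1)
# Synthetic sentence lengths in months; NOT the USSC data.
sentences <- data.frame(
  group = rep(c("4th Circuit", "Other circuits"), times = c(200, 800)),
  sentence_months = c(rgamma(200, shape = 3, scale = 40),
                      rgamma(800, shape = 3, scale = 33))
)

# One median per group, used for the vertical markers.
medians <- aggregate(sentence_months ~ group, data = sentences, FUN = median)

p <- ggplot(sentences,
            aes(x = sentence_months, fill = group, colour = group)) +
  geom_density(alpha = 0.35) +
  # geom_vline does not inherit the plot aesthetics, so map colour here too.
  geom_vline(data = medians,
             aes(xintercept = sentence_months, colour = group),
             linetype = "dashed") +
  labs(x = "Sentence length (months)", y = "Density",
       fill = NULL, colour = NULL)

print(p)
```

Semi-transparent fills keep both curves visible where they overlap, and the dashed lines make the gap between the medians easy to read.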

Circuit 4 v. Others Analysis

The density plot shows that the median sentence in the 4th Circuit is about 20 months higher than in the other circuits. That is nearly two years; if you’re going to prison, two years is a BIG difference. Both distributions are right-skewed: most sentences cluster at the lower end, with a long tail of unusually long sentences stretching to the right. The other circuits peak earlier, which is additional evidence that they tend to sentence lower than the 4th Circuit. One caveat: the 4th Circuit has a flatter peak, meaning its sentences are spread a bit more evenly than those of the other circuits.

Let’s further crack down on the 4th Circuit. How do these densities look per state? North Carolina, Virginia, and South Carolina each have enough data points (40+) for me to be comfortable analyzing them, but Maryland does not (<30), so I will exclude it from this comparison.
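
The sample-size rule above can be applied in code rather than by hand. The sketch below keeps only states with at least 40 cases. The per-state counts are made up to mirror the thresholds mentioned above, and the column names are hypothetical. `dplyr::filter()` is called with its namespace because other loaded packages can mask `filter()`.

```r
library(dplyr)

set.seed(42)
# Made-up case counts per state, chosen to mirror the 40+ / <30 split.
n_by_state <- c("North Carolina" = 60, "Virginia" = 45,
                "South Carolina" = 42, "Maryland" = 25)

cases <- data.frame(
  state = rep(names(n_by_state), times = n_by_state),
  sentence_months = round(rgamma(sum(n_by_state), shape = 3, scale = 40))
)

min_n <- 40  # minimum number of cases needed to plot a state's density

# Keep only the rows belonging to states that meet the threshold.
cases_kept <- cases %>%
  group_by(state) %>%
  dplyr::filter(n() >= min_n) %>%
  ungroup()

# States that fell below the threshold.
dropped <- setdiff(unique(cases$state), unique(cases_kept$state))
```

Making the threshold an explicit variable also makes it easy to check how sensitive the comparison is to the cutoff.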

I was hoping to see stark differences between the states that would lead to a striking insight, but these plots are informative nonetheless. All three states appear unimodal, peaking between 100 and 150 months, and North Carolina has a thicker right tail than the others. With that in mind, I want to shift back to the crack versus powder debate, this time from a broader, nationwide view.

Crack v Powder Cocaine … does it matter?

Again, while there is no substantial difference between crack and powder cocaine apart from whether it is in a smokable form (crack) or not, minority individuals are often charged VERY differently. “African Americans are also more likely to be convicted for crack offenses, while powder cocaine convictions are more common in affluent white communities (USSC, 1995; Vagins and McCurdy, 2006)” (https://pmc.ncbi.nlm.nih.gov/articles/PMC4533860/#:~:text=African%20Americans%20are%20also%20more,Vagins%20and%20McCurdy%2C%202006).

Knowing this, let’s dive deeper into the data set. Is there actually a clear difference in sentencing between crack cocaine and powder cocaine across the U.S.?

[Interactive data explorer: crack and powder cocaine cases across all circuits. Not rendered in this static document.]

This explorer lets readers (and me) take a deeper look at factors that could bear on the crack v. powder debate across all circuits. One interesting finding is that, surprisingly, few of the cases in this data set involve non-citizens. Additionally, …..

I want to take a step back now and look at the general trends between sentence length and drug type. We’ve compared powder and crack cocaine, but what about other drug types? I will take a closer look at heroin and meth as well.

Comparative Analysis of all Drug Types

I want to take a deeper dive into other demographic features, so let’s make some more plots. Citizenship status is especially interesting to me. With rising concerns about immigration, at least from the Trump administration (which was in office in 2018), I wanted to see whether the data reflected a bias against noncitizens.

Note: set a fixed width argument in geom_bar so that all bars are the same size. Maybe use different shades of blue for the citizenship statuses, or flip the encoding so that color represents drug type.

Plot idea: did judges depart from the guidelines differently depending on the race of the defendant (sentence length by race GIVEN a particular reason for departing)?

Plot idea: plea v. trial differences within circuits. Choose only one drug at a time, maybe with a tab for each drug and a drop-down for the circuits (shinyapps)? For example, drop-down 1 on tab 1, drop-down 2 on tab 2, drop-down 3 on tab 3.

Plot idea: maybe a scatterplot of drug weight against sentence length for a specific drug of choice? It could be facet-wrapped by drug.

This plot aims to show the impact of weapon charges on the sentence length of individuals convicted of crack and powder cocaine offenses. While I expected to see a bigger difference, I’m surprised that the sentencing is quite similar overall.

Plot idea: look at education status as a proportion (out of the number of people sentenced for heroin, split by education level). Is there a relationship between education status and the severity of the drug offense (e.g. cocaine at the high school level, heroin at the college level)?
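
The education idea above comes down to a within-group proportion: for each drug, the share of cases at each education level. Here is a minimal sketch, assuming hypothetical `drug` and `education` columns and made-up rows.

```r
library(dplyr)

# Toy records (made up); real education categories would come from the data.
cases <- data.frame(
  drug = c("heroin", "heroin", "heroin", "heroin", "meth", "meth"),
  education = c("High school", "High school", "Some college",
                "College graduate", "High school", "Some college")
)

# Count cases per drug and education level, then divide by the drug total.
edu_share <- cases %>%
  count(drug, education, name = "n_cases") %>%
  group_by(drug) %>%
  mutate(share = n_cases / sum(n_cases)) %>%
  ungroup()
```

Because the shares sum to 1 within each drug, drugs with very different case counts can be compared on the same scale.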