Virginia’s new expungement law contains a large set of conditions that classify each record in the court data as eligible for automatic record clearing, eligible for expungement by petition, eligible for automatic or petition clearing once an ongoing waiting period ends, or not eligible for expungement. In this document, we alter these conditions to reflect several hypothetical revisions to the expungement law and address the following questions:

  1. How many records qualify for each type of expungement as the law is currently written, and how do these numbers break down by race?

  2. How would these counts change if the lifetime limit of two expungements for convictions is dropped?

  3. How would these counts change if the seven-year wait period for expungement of misdemeanor convictions is reduced to three years?

  4. How would these counts change if the seven-year wait period for expungement of misdemeanor convictions is reduced to three years, and the ten-year wait period for expungement of felony convictions is reduced to five years?

  5. How would these counts change if the following code sections are added to the list of convictions eligible for automatic expungement?

  • B.46.2-301: Driving while license, permit, or privilege to drive suspended or revoked (which accounts for 9.5% of all cases)
  • A.46.2-862: Exceeding speed limit, reckless driving (which accounts for 9% of all cases), also C.46.2-862 (which accounts for 4.4% of all cases)
  • 46.2-300: Driving without license prohibited; penalties (which accounts for 6.3% of all cases)
  • 18.2-95: Grand larceny defined; how punished (which accounts for 2.5% of all cases)
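To make the wait-period questions concrete, the sketch below shows how shortening a wait period can move a single conviction from "pending" to past its wait period. It is purely illustrative: the function `wait_status`, its arguments, the dates, and the assumption that the clock starts on the disposition date are all hypothetical, and the actual eligibility rules involve many more conditions than this.

```r
# Hypothetical illustration only, not the classification code used for the
# results in this document. Assumes the wait period starts on the
# disposition date.
wait_status <- function(disposition_date, wait_years, as_of = Sys.Date()) {
  # date on which the wait period ends
  eligible_on <- seq(as.Date(disposition_date),
                     by = paste(wait_years, "years"),
                     length.out = 2)[2]
  if (as_of >= eligible_on) "wait period elapsed" else "pending"
}

# A made-up misdemeanor conviction that is four years old:
as_of <- as.Date("2021-06-01")
wait_status("2017-06-01", wait_years = 7, as_of = as_of)  # "pending"
wait_status("2017-06-01", wait_years = 3, as_of = as_of)  # "wait period elapsed"
```

Under the seven-year rule this record would still be waiting, while under a three-year rule its wait would already be over, which is the kind of shift the scenario counts are meant to capture.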

The most difficult technical challenge is the computation time needed to categorize large numbers of cases by expungability outcome. Generating these results for all cases in the data (roughly 9 million cases from 3 million individuals) takes about 5 days per hypothetical condition. For efficiency, and to enable a more useful feedback process, we instead use a random sample of 1,000 individuals, who account for 2,811 cases in total. A random sample of 1,000 from the whole population gives a margin of error of about +/- 3 percentage points (at the conventional 95% confidence level) for all results reported in this document.
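The margin of error quoted above can be checked with the standard formula for a sample proportion. The sketch below assumes a simple random sample, the worst case of p = 0.5, and a 95% confidence level; it uses base R only.

```r
# Worst-case margin of error for a proportion from a simple random sample.
# The 95% confidence level is an assumption made for this check.
n <- 1000                          # sampled individuals
p <- 0.5                           # proportion that maximizes the variance
z <- qnorm(0.975)                  # two-sided 95% critical value, about 1.96
moe <- z * sqrt(p * (1 - p) / n)   # margin of error as a proportion
round(100 * moe, 1)                # 3.1 percentage points
```

Because the sample is drawn by individual and the percentages are computed over cases, which cluster within individuals, the margin of error for case-level percentages may differ somewhat from this figure.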

We begin by loading the results and the needed packages:

library(tidyverse)
library(DT)
results <- read_csv("results.csv")

## Counts of Expungement Outcomes for the Law as Currently Written

The counts under the status quo are as follows:

r <- results %>%
  group_by(expungable) %>%
  summarize(count = n(),
            percent = round(100*n()/nrow(results),2)) %>%
  mutate(percentstr = paste(as.character(percent), "%", sep=""))
g <- ggplot(r, aes(x=expungable, y=count, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = percentstr), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Count")
g

A plurality of cases under the status quo are still not eligible for expungement. That said, nearly 20% of cases already qualify for automatic expungement, and an additional 12% will become eligible once a seven-year wait period has elapsed.

We can break down these results by race as well:

r <- results %>%
  group_by(expungable, Race) %>%
  summarize(count = n(), .groups = "drop") %>%
  # share of each outcome within each race
  group_by(Race) %>%
  mutate(totalrace = sum(count)) %>%
  ungroup() %>%
  mutate(percent = round(100*count/totalrace,2),
         percent = paste(as.character(percent), "%", sep=""))
datatable(r)

## Removing the Lifetime Limit

r <- results %>%
  group_by(expungable) %>%
  summarize(currentpercent = round(100*n()/nrow(results),2))
r2 <- results %>%
  group_by(expungable_no_lifetimelimit) %>%
  summarize(newpercent = round(100*n()/nrow(results),2)) %>%
  rename(expungable = expungable_no_lifetimelimit)
r <- inner_join(r, r2, by="expungable") %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
datatable(r)
r <- r %>%
  select(expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=expungable, y=diff_percent, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = diff_percent), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Removing the Lifetime Limit")
g
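The same comparison of current law against a scenario is repeated for every scenario in this document, with only the scenario column changing. A minimal sketch of a helper that wraps the pattern is shown here; the function name `compare_scenario`, the toy data, and its outcome labels are hypothetical and are not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper: percent of cases in each outcome under current law
# and under a scenario column, plus the change in percentage points.
compare_scenario <- function(data, scenario_col, baseline_col = "expungable") {
  current <- data %>%
    count(outcome = .data[[baseline_col]], name = "n_current")
  new <- data %>%
    count(outcome = .data[[scenario_col]], name = "n_new")
  # full_join keeps outcomes that appear under only one of the two columns
  full_join(current, new, by = "outcome") %>%
    mutate(across(c(n_current, n_new), ~ coalesce(.x, 0L)),
           currentpercent = round(100 * n_current / nrow(data), 2),
           newpercent = round(100 * n_new / nrow(data), 2),
           diff_percent = round(newpercent - currentpercent, 2)) %>%
    select(outcome, currentpercent, newpercent, diff_percent)
}

# Toy data with made-up outcome labels, for illustration only
toy <- data.frame(
  expungable = c("Automatic", "Not eligible", "Not eligible", "Petition"),
  expungable_scenario = c("Automatic", "Automatic", "Not eligible", "Petition")
)
out <- compare_scenario(toy, "expungable_scenario")
out
```

With the real data, a call such as `compare_scenario(results, "expungable_no_lifetimelimit")` would be the analogue of the table above. One design difference from the code in this document: the sketch uses `full_join` rather than `inner_join`, so an outcome that occurs under only one of the two columns is kept with a zero count instead of being dropped.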

# current law: share of each outcome within each race
r <- results %>%
  group_by(expungable, Race) %>%
  summarize(currentcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(currentcount)) %>%
  ungroup() %>%
  mutate(currentpercent = round(100*currentcount/totalrace,2))
# scenario: the same shares with the lifetime limit removed
r2 <- results %>%
  group_by(expungable_no_lifetimelimit, Race) %>%
  summarize(newcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(newcount)) %>%
  ungroup() %>%
  mutate(newpercent = round(100*newcount/totalrace,2)) %>%
  rename(expungable = expungable_no_lifetimelimit)
r <- inner_join(r, r2, by=c("expungable", "Race")) %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
r <- select(r, expungable, Race, currentpercent, newpercent, diff_percent)
datatable(r)
r <- r %>%
  select(Race, expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=Race, y=diff_percent, fill=expungable)) +
  geom_col(position="dodge") +
  labs(x = "Race", fill = "Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Removing the Lifetime Limit") +
  coord_flip()
g

## Shortening the 7- and 10-Year Wait Periods to 3 and 5 Years, Respectively

r <- results %>%
  group_by(expungable) %>%
  summarize(currentpercent = round(100*n()/nrow(results),2))
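r2 <- results %>%
  # column name as it appears in results.csv (note the "7_to_5" in the name,
  # whereas the scenario described here reduces the 7-year wait to 3 years)
  group_by(expungable_7_to_5_and_10_to_5) %>%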
  summarize(newpercent = round(100*n()/nrow(results),2)) %>%
  rename(expungable = expungable_7_to_5_and_10_to_5)
r <- inner_join(r, r2, by="expungable") %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
datatable(r)
r <- r %>%
  select(expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=expungable, y=diff_percent, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = diff_percent), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Reducing the 7 and 10 year waiting periods to 3 and 5 years")
g

# current law: share of each outcome within each race
r <- results %>%
  group_by(expungable, Race) %>%
  summarize(currentcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(currentcount)) %>%
  ungroup() %>%
  mutate(currentpercent = round(100*currentcount/totalrace,2))
# scenario: the same shares with the shortened wait periods
r2 <- results %>%
  group_by(expungable_7_to_5_and_10_to_5, Race) %>%
  summarize(newcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(newcount)) %>%
  ungroup() %>%
  mutate(newpercent = round(100*newcount/totalrace,2)) %>%
  rename(expungable = expungable_7_to_5_and_10_to_5)
r <- inner_join(r, r2, by=c("expungable", "Race")) %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
r <- select(r, expungable, Race, currentpercent, newpercent, diff_percent)
datatable(r)
r <- r %>%
  select(Race, expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=Race, y=diff_percent, fill=expungable)) +
  geom_col(position="dodge") +
  labs(x = "Race", fill = "Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Reducing the 7 and 10 year waiting periods to 3 and 5 years") +
  coord_flip()
g

## Adding Code Sections to Those Eligible for Automatic Expungement

### B.46.2-301: Driving while license, permit, or privilege to drive suspended or revoked (which accounts for 9.5% of all cases)

r <- results %>%
  group_by(expungable) %>%
  summarize(currentpercent = round(100*n()/nrow(results),2))
r2 <- results %>%
  group_by(expungableB462301) %>%
  summarize(newpercent = round(100*n()/nrow(results),2)) %>%
  rename(expungable = expungableB462301)
r <- inner_join(r, r2, by="expungable") %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
datatable(r)
r <- r %>%
  select(expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=expungable, y=diff_percent, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = diff_percent), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding B.46.2-301 to Automatic Expungements")
g

# current law: share of each outcome within each race
r <- results %>%
  group_by(expungable, Race) %>%
  summarize(currentcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(currentcount)) %>%
  ungroup() %>%
  mutate(currentpercent = round(100*currentcount/totalrace,2))
# scenario: the same shares with B.46.2-301 added to automatic expungement
r2 <- results %>%
  group_by(expungableB462301, Race) %>%
  summarize(newcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(newcount)) %>%
  ungroup() %>%
  mutate(newpercent = round(100*newcount/totalrace,2)) %>%
  rename(expungable = expungableB462301)
r <- inner_join(r, r2, by=c("expungable", "Race")) %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
r <- select(r, expungable, Race, currentpercent, newpercent, diff_percent)
datatable(r)
r <- r %>%
  select(Race, expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=Race, y=diff_percent, fill=expungable)) +
  geom_col(position="dodge") +
  labs(x = "Race", fill = "Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding B.46.2-301 to Automatic Expungements") +
  coord_flip()
g

### A.46.2-862: Exceeding speed limit, reckless driving (which accounts for 9% of all cases)

r <- results %>%
  group_by(expungable) %>%
  summarize(currentpercent = round(100*n()/nrow(results),2))
r2 <- results %>%
  group_by(expungableA462862) %>%
  summarize(newpercent = round(100*n()/nrow(results),2)) %>%
  rename(expungable = expungableA462862)
r <- inner_join(r, r2, by="expungable") %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
datatable(r)
r <- r %>%
  select(expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=expungable, y=diff_percent, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = diff_percent), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding A.46.2-862 to Automatic Expungements")
g

# current law: share of each outcome within each race
r <- results %>%
  group_by(expungable, Race) %>%
  summarize(currentcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(currentcount)) %>%
  ungroup() %>%
  mutate(currentpercent = round(100*currentcount/totalrace,2))
# scenario: the same shares with A.46.2-862 added to automatic expungement
r2 <- results %>%
  group_by(expungableA462862, Race) %>%
  summarize(newcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(newcount)) %>%
  ungroup() %>%
  mutate(newpercent = round(100*newcount/totalrace,2)) %>%
  rename(expungable = expungableA462862)
r <- inner_join(r, r2, by=c("expungable", "Race")) %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
r <- select(r, expungable, Race, currentpercent, newpercent, diff_percent)
datatable(r)
r <- r %>%
  select(Race, expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=Race, y=diff_percent, fill=expungable)) +
  geom_col(position="dodge") +
  labs(x = "Race", fill = "Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding A.46.2-862 to Automatic Expungements") +
  coord_flip()
g

### 46.2-300: Driving without license prohibited; penalties (which accounts for 6.3% of all cases)

r <- results %>%
  group_by(expungable) %>%
  summarize(currentpercent = round(100*n()/nrow(results),2))
r2 <- results %>%
  group_by(expungable462300) %>%
  summarize(newpercent = round(100*n()/nrow(results),2)) %>%
  rename(expungable = expungable462300)
r <- inner_join(r, r2, by="expungable") %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
datatable(r)
r <- r %>%
  select(expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=expungable, y=diff_percent, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = diff_percent), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding 46.2-300 to Automatic Expungements")
g

# current law: share of each outcome within each race
r <- results %>%
  group_by(expungable, Race) %>%
  summarize(currentcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(currentcount)) %>%
  ungroup() %>%
  mutate(currentpercent = round(100*currentcount/totalrace,2))
# scenario: the same shares with 46.2-300 added to automatic expungement
r2 <- results %>%
  group_by(expungable462300, Race) %>%
  summarize(newcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(newcount)) %>%
  ungroup() %>%
  mutate(newpercent = round(100*newcount/totalrace,2)) %>%
  rename(expungable = expungable462300)
r <- inner_join(r, r2, by=c("expungable", "Race")) %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
r <- select(r, expungable, Race, currentpercent, newpercent, diff_percent)
datatable(r)
r <- r %>%
  select(Race, expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=Race, y=diff_percent, fill=expungable)) +
  geom_col(position="dodge") +
  labs(x = "Race", fill = "Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding 46.2-300 to Automatic Expungements") +
  coord_flip()
g

### 18.2-95: Grand larceny defined; how punished (which accounts for 2.5% of all cases)

r <- results %>%
  group_by(expungable) %>%
  summarize(currentpercent = round(100*n()/nrow(results),2))
r2 <- results %>%
  group_by(expungable18295) %>%
  summarize(newpercent = round(100*n()/nrow(results),2)) %>%
  rename(expungable = expungable18295)
r <- inner_join(r, r2, by="expungable") %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
datatable(r)
r <- r %>%
  select(expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=expungable, y=diff_percent, fill=expungable)) +
  geom_col() +
  theme(legend.position="none") +
  geom_text(aes(label = diff_percent), vjust = 0) +
  xlab("Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding 18.2-95 to Automatic Expungements")
g

# current law: share of each outcome within each race
r <- results %>%
  group_by(expungable, Race) %>%
  summarize(currentcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(currentcount)) %>%
  ungroup() %>%
  mutate(currentpercent = round(100*currentcount/totalrace,2))
# scenario: the same shares with 18.2-95 added to automatic expungement
r2 <- results %>%
  group_by(expungable18295, Race) %>%
  summarize(newcount = n(), .groups = "drop") %>%
  group_by(Race) %>%
  mutate(totalrace = sum(newcount)) %>%
  ungroup() %>%
  mutate(newpercent = round(100*newcount/totalrace,2)) %>%
  rename(expungable = expungable18295)
r <- inner_join(r, r2, by=c("expungable", "Race")) %>%
  mutate(diff_percent = round(newpercent - currentpercent,2),
         currentpercent = paste(as.character(currentpercent), "%", sep=""),
         newpercent = paste(as.character(newpercent), "%", sep=""),
         diff_percent = paste(as.character(diff_percent), "%", sep=""))
r <- select(r, expungable, Race, currentpercent, newpercent, diff_percent)
datatable(r)
r <- r %>%
  select(Race, expungable, diff_percent) %>%
  mutate(diff_percent = str_replace(diff_percent, "%", ""),
         diff_percent = as.numeric(diff_percent))
g <- ggplot(r, aes(x=Race, y=diff_percent, fill=expungable)) +
  geom_col(position="dodge") +
  labs(x = "Race", fill = "Expungability Outcome") +
  ylab("Change in percent relative to current law") +
  ggtitle("Adding 18.2-95 to Automatic Expungements") +
  coord_flip()
g