Daily Counts vs. Yearly Flows

BJS reports don’t tell us how many unique people flow through jail in a given year. However, as of 7/1/16, Connecticut has published a daily census of every inmate held in jail while awaiting trial.1

I downloaded the data on 1-8-19 and saved it as CT-jail_1-8-19.csv. It has 2.88 million rows.

Read in the csv and clean it up.
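The loading code isn't shown here, so below is a minimal sketch of what this step might look like. To keep it self-contained it reads a tiny made-up csv from a string instead of CT-jail_1-8-19.csv. The raw header names and the month/day/year date format are my assumptions about the file, not something confirmed above; only identifier and download_date are names that appear in the later code.

```r
# Toy stand-in for the real file; the real call would be read.csv("CT-jail_1-8-19.csv").
# Header names and date format below are ASSUMPTIONS about the raw file.
csv_text <- "DOWNLOAD DATE,IDENTIFIER,LATEST ADMISSION DATE
01/01/2018,ZZA001,12/20/2017
01/01/2018,ZZA002,11/02/2017
01/02/2018,ZZA001,12/20/2017"

CT <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# read.csv turns spaces in headers into dots; normalize to lower snake case
names(CT) <- tolower(gsub(".", "_", names(CT), fixed = TRUE))

# Parse the date columns (assumed month/day/year strings) into Date objects
CT$download_date <- as.Date(CT$download_date, format = "%m/%d/%Y")
CT$latest_admission_date <- as.Date(CT$latest_admission_date, format = "%m/%d/%Y")

str(CT)
```

Storing the dates as Date objects is what lets later steps filter by day or by year with ordinary comparisons.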

While it’s 2.88 million rows, how many unique people are in the data?

#search unique by identifier
[1] 30785

There are 30,785 unique people. But maybe some people have gone in and out of jail for different offenses? So, how many unique person-admissions are there?

#search unique rows by identifier and latest admission date
[1] 42396

42,396 unique person-admissions.
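The calls behind these two counts aren't shown, so here is a minimal sketch of one way to get them, run on a made-up five-row table. The column name latest_admission_date is an assumption taken from the comment above; identifier is the name used later in the post.

```r
# Toy data: person A is admitted twice, person B once (made-up values)
CT <- data.frame(
  identifier = c("A", "A", "A", "B", "B"),
  latest_admission_date = as.Date(c("2017-01-05", "2017-01-05", "2018-03-10",
                                    "2017-06-01", "2017-06-01")),
  stringsAsFactors = FALSE
)

# Unique people: distinct identifiers
n_people <- length(unique(CT$identifier))

# Unique person-admissions: distinct (identifier, admission date) pairs
n_admissions <- nrow(unique(CT[, c("identifier", "latest_admission_date")]))

n_people      # 2
n_admissions  # 3
```

The second count is larger whenever someone shows up with more than one admission date, which is exactly the in-and-out pattern the question is about.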

Let’s compare the 1/1/2018 count and the count for all of 2018.

[1] 3283

There were 3,283 unique people in the CT jail on 1-1-2018.

nrow(unique(CT[CT$download_date >= "2018-01-01" & CT$download_date < "2019-01-01", "identifier"]))
[1] 15972

But there were 15,972 unique people in the CT jail during all of 2018.

[1] 4.865062

So, in CT for the pretrial jail population, about 5 times as many pretrial individuals flow through the jail system over the year as are represented in a daily count.
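To make the comparison concrete, here is a small sketch of the three steps (one-day count, full-year count, ratio) on made-up data. It uses a plain data.frame, so it counts with length(unique(...)) rather than the nrow(unique(...)) form above; the toy identifiers and dates are invented for illustration.

```r
# Toy data: made-up identifiers and dates
CT <- data.frame(
  identifier = c("A", "B", "A", "C", "D", "E"),
  download_date = as.Date(c("2018-01-01", "2018-01-01", "2018-01-02",
                            "2018-05-15", "2018-11-30", "2019-01-02"))
)

# Unique people held on a single day
on_day <- length(unique(CT$identifier[CT$download_date == as.Date("2018-01-01")]))

# Unique people held at any point during 2018
in_2018 <- CT$download_date >= as.Date("2018-01-01") &
  CT$download_date < as.Date("2019-01-01")
over_year <- length(unique(CT$identifier[in_2018]))

over_year / on_day   # 2 in this toy example

# The same ratio with the counts reported above
15972 / 3283         # 4.865062
```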

Recall that Prof. Pfaff said on the Justice in America podcast, “it’s probably on the order of six, seven million unique individuals get sent to jail for at least some period of time every single year.”

[1] 8
[1] 9.333333

Given that he was estimating about 750,000 people are in jail on a given day, he estimates that about 8-9 times (see above) as many people flow through jail over the year as are represented in a daily count.
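The two numbers above are just the quoted annual range divided by the daily count. A minimal sketch of that arithmetic, where the variable names are mine:

```r
daily_count <- 750000   # people in jail on a given day, per the estimate above
annual_low  <- 6e6      # low end of the quoted annual range
annual_high <- 7e6      # high end of the quoted annual range

annual_low / daily_count    # 8
annual_high / daily_count   # 9.333333
```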

Regardless of the specifics, the main point remains: daily counts massively underrepresent the population that experiences jail over a year, because most jail stays are short. In the case of CT pretrial (using numbers from 2018), about 5 times as many pretrial people flow through the jail system over the year as are represented in a daily count.

Spotting (Holiday) Seasonality

Before making graphs, I call my custom theme, as per usual.

#Load more libraries
library(ggplot2); library(ggrepel); library(extrafont); library(ggthemes); library(reshape); library(grid)
library(RColorBrewer) #provides brewer.pal(), used in the theme below
#Define theme for my visuals
my_theme <- function() {
  # Define colors for the chart
  palette <- brewer.pal("Greys", n=9)
  color.background = palette[2]
  color.grid.major = palette[4]
  color.panel = palette[3]
  color.axis.text = palette[9]
  color.axis.title = palette[9]
  color.title = palette[9]
  # Create basic construction of chart
  theme_bw(base_size=9, base_family="Palatino") + 
  # Set the entire chart region to a light gray color
  theme(panel.background=element_rect(fill=color.panel, color=color.background)) +
  theme(plot.background=element_rect(fill=color.background, color=color.background)) +
  theme(panel.border=element_rect(color=color.background)) +
  # Format grid
  theme(panel.grid.major=element_line(color=color.grid.major,size=.25)) +
  theme(panel.grid.minor=element_blank()) +
  theme(axis.ticks=element_blank()) +
  # Format legend
  theme(legend.position="right") +
  theme(legend.background = element_rect(fill=color.background)) +
  theme(legend.text = element_text(size=7,color=color.axis.title)) + 
  theme(legend.title = element_text(size=0,face="bold", color=color.axis.title)) + 
  #Format facet labels
  theme(strip.text.x = element_text(size = 8, face="bold"))+
  # Format title and axes labels these and tick marks
  theme(plot.title=element_text(color=color.title, size=18, face="bold", hjust=0)) +
  theme(axis.text.x=element_text(size=8,color=color.axis.text)) +
  theme(axis.text.y=element_text(size=8,color=color.axis.text)) +
  theme(axis.title.x=element_text(size=10,color=color.axis.title, vjust=-1)) +
  theme(axis.title.y=element_text(size=10,color=color.axis.title, vjust=1.8)) +
  #Format title and facet_wrap title
  theme(strip.text = element_text(size=8), plot.title = element_text(size = 14, face = "bold", colour = "black", vjust = 1, hjust=0.5))+
  # Plot margins
  theme(plot.margin = unit(c(.2, .2, .2, .2), "cm"))
}

Show a line chart of the total number of inmates over time.

CTdays<- CT %>% 
  group_by(download_date) %>% 
  summarise(n = n())
ggplot(data=CTdays, aes(x=download_date, y=n)) + 
  geom_line()+ #draw the daily counts as a line; without a geom the panel is empty
  my_theme()+ theme(plot.title = element_text(hjust = 0))+
  labs(x="", y="Number of Pretrial Inmates")+
  ggtitle("Connecticut Pretrial Inmate Population (7/1/16-1/8/19)", subtitle = "Data Available via Connecticut Open Data | Visualization via Alex Albright (thelittledataset.com)")

2017-8-24 is an extreme outlier, so I remove it.

CTdays<- subset(CTdays, n<6000)
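As a quick check before dropping rows by a threshold, it can help to confirm which day is actually the spike. Here is a minimal sketch on made-up daily counts; the numbers are invented for illustration and are not the real CT counts.

```r
# Toy daily counts with one obvious spike (made-up numbers)
CTdays <- data.frame(
  download_date = as.Date("2017-08-22") + 0:4,
  n = c(3400, 3390, 6800, 3410, 3405)
)

# Which day has the largest count?
CTdays[which.max(CTdays$n), ]

# Drop it using the same threshold rule as above
CTdays_clean <- subset(CTdays, n < 6000)
nrow(CTdays_clean)   # 4
```

A fixed cutoff like 6000 works here because the spike is far from every other day; with subtler outliers a rule based on the spread of the data would be safer.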

Graph by day now with 2017-8-24 removed (as I assume it must be an error).

ggplot(data=CTdays, aes(x=download_date, y=n)) + 
  geom_line()+ #draw the daily counts as a line; without a geom the panel is empty
  my_theme()+ theme(plot.title = element_text(hjust = 0))+
  labs(x="Date", y="Number of Pretrial Inmates", caption="\nData plotted by day. Red lines mark the beginning/end of years.")+
  geom_vline(xintercept = as.numeric(as.Date(
    c("2017-01-01","2018-01-01", "2019-01-01")
    )), linetype=4, color="red")+
  scale_x_date(date_breaks = "1 month", date_labels = "%m-%y")+
  theme(axis.text.x = element_text(angle = 90, hjust = 1))+
  ggtitle("Connecticut Pretrial Inmate Population 7/1/16-1/8/19", subtitle = "Data Available via Connecticut Open Data | Visualization via Alex Albright (thelittledataset.com)")
ggsave("time-pop0.png", width=7, height=4.5, dpi=900)