load("Data/cjtc_internal.rda")
cjtc <- cjtc_internal

# indices for plotting

## all yr-mo in data
yr_mo_all <- paste0(sort(rep(2002:2025, 12)), "-", rep(01:12, 24))
seqnum_all <- 1:(length(yr_mo_all))

seqnum_5051 <- seqnum_all[yr_mo_all == "2021-6"]
seqnum_current <- seqnum_all[yr_mo_all == "2022-6"]

index_all <- data.frame(seqnum = seqnum_all,  yr_mo = yr_mo_all) %>%
  filter(seqnum < (max(seqnum)-2)) %>% # only thru 9/2025 
  mutate(period = case_when(
    seqnum <= seqnum_5051 ~ "pre.5051",
    seqnum > seqnum_current ~ "current",
    TRUE ~ "interim"
  ))

## post 5051
index_post_5051 <- index_all %>% filter(period != "pre.5051")

Introduction

Overview

The WA State process for investigating law enforcement officers for misconduct and possible decertification was revised in 2021 by Senate Bill 5051 (see the “SB 5051” tab above for details). The process is tracked internally by CJTC with a “certification database”, and this report presents basic descriptive statistics from the internal version of that database (see the “On the data” tab above for details).

For some of the tables and plots we split the data into 3 periods by the date the case was received:

  • pre-5051 (before July 2021),
  • “interim” (July 2021 - June 2022, a period when the new system was being set up), and
  • “current” (after June 2022, when the system was fully functional)

In other places, we combine the second two periods into “post-5051”.

A list of officers that have received multiple complaints or agency notices of misconduct can be found in the “Officers” section of this report. The list can be searched, sorted, filtered and downloaded in several different formats.


Navigating this report:

  • You can use the TOC on the left to view specific sections based on content.

  • Within sections, look for the tabs in the document to view all of the different tables and graphs. Tabs are typically just below the section headers, but some sections have additional nested tabs below.


This is a first draft, so there may be some errors in the precise numbers presented. Comments/suggestions/questions are welcomed and appreciated.


SB 5051


On the data

The CJTC maintains a database to track publicly filed complaints, agency notifications of officer misconduct (“Form 1915”), and Division initiated investigations that require review and may lead to officer decertification. Note that decertification, and especially involuntary decertification, is a very rare outcome (you can verify this in the Outcomes section of this report).

Case processing involves 3 general steps: Intake (entered into the database, initial review), Assessment (either closed without investigation or forwarded to investigation) and Determination (if forwarded for investigation, either decline to charge or charge with a hearing and outcome). At any step in the process, an officer may voluntarily surrender their certification.

There are two versions of the database:

  • An internal version that includes all cases

  • A public version that excludes cases in the “Intake” process, and is available online at the CJTC website.

The format and case coverage of the two datasets is different, but they otherwise include almost the same information.

This report is based on the Internal Dataset

The internal dataset was obtained via PDR by Dom Campese of the WCPA, and contains all cases up through 9/24/2025.

One of the key differences between the internal and public dataset formats is how the individual records in the dataset are defined. In the public dataset, each record is a “case-person”, defined by a case number and a person’s name. In the internal dataset, one record is used per offense (a complaint or form 1915 can lead to multiple offenses being charged for the same case-person), with additional records used to track attorney- or union-represented officers, some complainants and witnesses, and a few other categories of persons. As a result there are many more records in the internal dataset than the public (5829 vs. 2200 as of 9/24/2025) .

For this report, we “de-duplicated” the internal dataset with the goal of obtaining a set of “case-person” records equivalent to the public dataset. This reduced the number of records in the internal dataset down from 5829 to 3931. The de-duplicated internal dataset still has more records than the public dataset because it includes cases that are in the Intake process.


Case Status

Summary

tab <- cjtc %>%
  group_by(Status = as.character(status3)) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Status), -1, rank(Number)),
         Status = replace_na(Status, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Status ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Case Status:  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Status == "Missing"), italic=T) %>%
  row_spec(which(tab$Status == "Total"), bold=T) 
Case Status: all cases
Status Number Percent
Closed 2003 51.0%
Intake and review 1739 44.2%
Active investigation 183 4.7%
Missing 6 0.2%
Total 3931 100.0%

By period

p <- ggplot(cjtc %>% filter(!is.na(status3))) +
  
  aes(x=period_r, 
      group=status3, fill=status3) + 
  
  geom_bar(alpha = 0.7, color="grey") +
  
  scale_fill_brewer(palette = "Spectral") +
  
  labs(title = "Case status by period received",
       x = "Period Received",
       y = "Number of cases",
       fill = "Status")

ggplotly(p, tooltip = c("count", "group")) #%>% reverse_legend_labels()

Detailed CJTC coding

tab <- cjtc %>%
  group_by(Status = as.character(status)) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Status), -1, rank(Number)),
         Status = replace_na(Status, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Status ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1)) 
  
tab %>% 
  kbl(caption = "Case Status (original detail):  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Status == "Missing"), italic=T) %>%
  row_spec(which(tab$Status == "Total"), bold=T) 
Case Status (original detail): all cases
Status Number Percent
Closed 2003 51.0%
Intake Review 1333 33.9%
Intake 406 10.3%
Investigation 90 2.3%
Admin Review 54 1.4%
Hearing Coordinator 13 0.3%
AAG (Legal) 9 0.2%
Settlement 8 0.2%
Assistant Director Review 5 0.1%
Appeal - Post Discipline 2 0.1%
Discipline - Post Hearing 1 0.0%
Open - Pending Expiration 1 0.0%
Missing 6 0.2%
Total 3931 100.0%

Outcomes

In this section we restrict the focus to closed cases, and their results.

Overall

Summary

tab <- cjtc %>%
  filter(status3 == "Closed") %>%
  group_by(Outcome = as.character(outcat)) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Outcome), -1, rank(Number)),
         Outcome = replace_na(Outcome, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Outcome ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1)) 
  
tab %>%
  kbl(caption = "Case Outcome:  all closed cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Outcome == "Missing"), italic=T) %>%
  row_spec(which(tab$Outcome == "Total"), bold=T) 
Case Outcome: all closed cases
Outcome Number Percent
Admin Closure 740 36.9%
Duplicate 492 24.6%
Revoked/Denied Cert/Disciplined 449 22.4%
Declined/Dismissed/Settlement 314 15.7%
Other 8 0.4%
Total 2003 100.0%

Original CJTC detail

tab <- cjtc %>%
  filter(status3 == "Closed") %>%
  mutate(case_determination = if_else(status3 != "Closed", "Still Open", case_determination)) %>%
  group_by(Outcome = as.character(case_determination)) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Outcome), -1, rank(Number)),
         Outcome = replace_na(Outcome, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Outcome ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1)) #%>%

tab %>%    
  kbl(caption = "Case Outcome (original detail):  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Outcome == "Missing"), italic=T) %>%
  row_spec(which(tab$Outcome == "Total"), bold=T) 
Case Outcome (original detail): all cases
Outcome Number Percent
Administrative Closure - Lack of Jurisdiction 623 31.1%
Administrative Closure - Duplicate Case 492 24.6%
Revocation - Default 295 14.7%
Decline 293 14.6%
Revocation - Surrendered Certification 82 4.1%
Revocation - Hearing 52 2.6%
Administrative Closure - Not Certified or Deceased 48 2.4%
Administrative Closure - Insufficient Information 37 1.8%
Charge(s) Dismissed 21 1.0%
Denied Certification 17 0.8%
Administrative Closure - Implausible 16 0.8%
Administrative Closure - Revoked 15 0.7%
Eligibility Reinstated 8 0.4%
Denied Reinstatement 2 0.1%
Administrative Closure - Lack of Jurisdiction; Administrative Closure - Not Certified or Deceased 1 0.0%
Revocation - Summary Judgment 1 0.0%
Total 2003 100.0%

By period received

p <- ggplot(cjtc %>% filter(status3 == "Closed" & outcat != "Other"), 
          aes(x=period_r, group=desc(outcat), 
              fill=outcat)) + 
  
  geom_bar(alpha = 0.7, color="grey") +
  
  scale_fill_brewer(palette = "Spectral") +
  
  labs(title = "Closed case outcome by period received",
       x = "Period Received",
       y = "Number of closed cases",
       fill = "Outcome")

ggplotly(p, tooltip= c("count", "fill")) %>% reverse_legend_labels()

By year - recd post 5051

xval = max(cjtc$yr_r)
yval = nrow(cjtc %>% filter(yr_r == max(cjtc$yr_r) & 
                              period_r != "pre.5051" & 
                              status3 == "Closed")) * 1.2

p <- ggplot(cjtc %>% filter(period_r != "pre.5051" & status3 == "Closed"), 
          aes(x=yr_r, group=desc(outcat), 
              fill=outcat)) + 
  
  geom_bar(alpha = 0.7, color="grey") +
  
  scale_fill_brewer(palette = "Spectral") +
  
  annotate("text", x=xval, y=yval, label= "Yr to date", size=3) +
  
  labs(title = "Closed case outcomes by year received (post-5051)",
       x = "Year received",
       y = "Number of closed cases",
       fill = "Outcome")

ggplotly(p, tooltip= c("count", "fill")) %>% reverse_legend_labels()

Processing time

By status and period

The “boxplots” below show the range of case processing times (in years) by period, separately for closed and still open cases. For closed cases, this is the difference between case date received and date closed. For open cases it is the difference between case date received and the current date.

  • The box shows the middle 50% of the processing times, with a line at the median.
  • The “whiskers” extend to roughly the lower 5% and upper 95% of processing times.
  • Any dots beyond the whiskers are cases with unusually low or high processing times.
  • The width of the box represents the number of cases in that group.
ggplot(cjtc %>% filter(!is.na(status3) & case_length_yr >= 0) %>%
         mutate(status2 = if_else(status3=="Closed", "Closed", "Open")), 
       aes(x=period_r, y=case_length_yr, fill=period_r)) +
  geom_boxplot(alpha = 0.5, varwidth = T) +
  
  labs(title = "Case processing times by case status and period received",
       x = "Period received",
       y = "Time in years",
       fill = "Period received") +
  
  scale_y_continuous(breaks = seq(0,round(max(cjtc$case_length_yr)), 1)) +

  facet_wrap("status2")

Post 5051

Closed cases: By outcome

ggplot(cjtc %>% filter(period_r != "pre.5051" & status3 == "Closed" & case_length_yr >= 0), 
       aes(x=period_r, y=case_length_yr, fill=period_r)) +
  geom_boxplot(alpha = 0.5, varwidth = T) +
  
  labs(title = "Closed case processing times by outcome and period received",
       x = "Period received",
       y = "Time to close in years",
       fill = "Period received") +

  facet_wrap("outcat")

Open cases: By status

ggplot(cjtc %>% 
         filter(period_r != "pre.5051" & status3 != "Closed" & case_length_yr >= 0) %>%
         mutate(Status = if_else(grepl("Intake", status), status, "Investigation")), 
       aes(x=period_r, y=case_length_yr, fill=period_r)) +
  geom_boxplot(alpha = 0.5, varwidth = T) +
  
  labs(title = "Open case processing times by status and period received",
       x = "Period received",
       y = "Time since received in years",
       fill = "Period received") +

  facet_wrap("Status")

____

Caseload dynamics over time

rcd <- cjtc %>% 
  group_by(yr_mo = yr_mo_r) %>%
  summarize(received = n())

clo <- cjtc %>% 
  group_by(yr_mo = yr_mo_c) %>%
  summarize(closed = n())

# index_all defined in top code block
monthly_df <- left_join(index_all, rcd) %>%
  left_join(clo) %>%
  arrange(seqnum) %>%
  mutate(across(c(received,closed), ~tidyr::replace_na(.x, 0)),
         net_number = closed - received,
         net_pct = 100*(closed/received -1),
         in_out_ratio = round(received/closed, 2),
         cumulative_backlog = -cumsum(net_number)) 

In the post-5051 period, there has been a steady growth in the size of the open caseload, and as the Case Status section showed, most of these case are in the “Intake” part of the process, not in active investigation. The graphs in the tabs below show the trend in the size of the open caseload in different ways.

Received vs. Closed

This plot shows the number of cases received and the number of cases closed each month. It does a good job identifying the explosion of cases received post 5051, but not as good a job identifying the relative number of cases received vs. closed. For that, the other tabs in this section are better.

p <- monthly_df %>% 
  select(seqnum, yr_mo, received, closed) %>%
  pivot_longer(cols = c(received:closed),
               names_to = "Measure",
               values_to = "num") %>%
  
  ggplot(aes(x=seqnum, y=num, group=Measure, color=Measure)) +
  
  geom_point(alpha = 0.7, size = 1) +
  geom_line(alpha = 0.5) +
  
  labs(title = "Number of cases received and closed by month",
       x = "Year",
       y = "Number of cases") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(monthly_df), 12),
    label = as.character(2002:2025) # monthly_df$yr_mo[seq(1, nrow(monthly_df), 12)]
  )  +
  
  theme(axis.text.x = element_text(size = 5))
  
ggplotly(p, tooltip= c("color", "num"))

Ratio of cases received:closed

This plot shows the ratio of cases received to closed each month (the points), with a smoothed line showing the moving average over time (the gray band around the line measures the uncertainty in the average).

A value of 1 (on the y-axis) means the two numbers are equal, and this is highlighted in the plot by the gray horizontal line. Values below that line mean more cases are closed than received; values above that line mean more cases are received than closed. A value of 0.5, for example, would mean that the number of received cases was half as large as the number of closed cases. While a value of 5 would mean the number of received cases was 5 times as large.

The 3 periods are represented by the colors of the lines/points.

Overall, the smoothed ratios indicate more cases received than closed for much of the time, but the average ratios rise post-5051:

  • pre-5051 average: 1.4

  • interim average: 4.4

  • current average: 2.8

p <- monthly_df %>% 
  select(seqnum, yr_mo, period, in_out_ratio) %>%
  
  ggplot(aes(x=seqnum, y=in_out_ratio, color=period, label=yr_mo)) +
  
  geom_smooth() +
  geom_point(alpha = 0.5) +
  geom_hline(yintercept = 1, color = "darkgrey") +
  
  labs(title = "Ratio of cases received to cases closed by month",
       x = "Year",
       y = "Ratio of cases received:closed") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(monthly_df), 12),
    label = as.character(2002:2025) # monthly_df$yr_mo[seq(1, nrow(monthly_df), 12)]
  )  +

  coord_cartesian(ylim = c(0, NA)) +
  
  theme(axis.text.x = element_text(size = 5))

ggplotly(p, tooltip= c("yr_mo", "color", "in_out_ratio"))

Monthly net throughput

This graph shows monthly throughput as the difference between the number of cases received vs. closed – a negative or positive value. Negative values represent months where more cases came in than were closed. Positive represents months with more closures than incoming.

p <- monthly_df %>% 
  mutate(posneg = if_else(net_number >= 0, "positive", "negative")) %>%
 
  ggplot(aes(x=seqnum, y=net_number, fill=posneg)) +
  
  geom_bar(stat = "identity",
           #fill = "blue",
           alpha = 0.7) +
  
  labs(title = "Net throughput: cases closed - cases received by month",
       x = "Year",
       y = "Net number of cases",
       fill = "Net throughput") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(monthly_df), 12),
    label = as.character(2002:2025)
  )  +
  
  theme(axis.text.x = element_text(size = 5))
  
ggplotly(p, tooltip="net_number")

Cumulative open caseload

This shows the growth in the monthly number of open cases (intake + active investigation) over time.

All cases

p <- monthly_df %>% 

  ggplot(aes(x=seqnum, y=cumulative_backlog)) +
  
  geom_line(stat = "identity",
            color = "grey") +
  geom_point(stat = "identity",
             color = "blue",
             alpha = 0.7,
             size = 1) +
  
  labs(title = "Cumulative open caseload by month",
       x = "Year",
       y = "Number of open cases") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(monthly_df), 12),
    label = as.character(2002:2025)
  )  +
  
  theme(axis.text.x = element_text(size = 5))
  
ggplotly(p, tooltip="cumulative_backlog")

Post 5051

    # date_received < as.Date("2021-06-20") ~ "pre.5051",
    # date_received > as.Date("2022-06-30") ~ "current",

p <- monthly_df %>% filter(yr_mo > "2021-6") %>%

  ggplot(aes(x=seqnum, y=cumulative_backlog)) +
  
  geom_line(stat = "identity",
            color = "grey") +
  geom_point(stat = "identity",
             color = "blue",
             alpha = 0.7,
             size = 1) +
  
  geom_vline(xintercept = monthly_df$seqnum[monthly_df$yr_mo == "2022-7"],
             color = "darkgrey") +
  
  annotate("text",
           x = monthly_df$seqnum[monthly_df$yr_mo == "2022-1"],
           y = max(monthly_df$cumulative_backlog),
           label = "Interim period", size = 3) +

    annotate("text",
           x = monthly_df$seqnum[monthly_df$yr_mo == "2024-1"],
           y = max(monthly_df$cumulative_backlog),
           label = "Current period", size = 3) +
  
  labs(title = "Cumulative open caseload by month - post 5051",
       x = "Year",
       y = "Number of open cases") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(monthly_df), 12),
    label = as.character(2002:2025)
  )  +
  
  theme(axis.text.x = element_text(size = 5))
  
ggplotly(p, tooltip = "cumulative_backlog")

Source of complaint

Overall

  tab <- cjtc %>%
  group_by(Source = opened_reason) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Source), -1, rank(Number)),
         Source = replace_na(Source, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Source ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Case Source:  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Source == "Missing"), italic=T) %>%
  row_spec(which(tab$Source == "Total"), bold=T) 
Case Source: all cases
Source Number Percent
Complaint 1488 37.9%
CJ Form 1915 Reporting 1427 36.3%
NOS Misconduct 957 24.3%
Division Initiated 49 1.2%
Request for Reinstatement 10 0.3%
Total 3931 100.0%

Post 5051

tab <- cjtc %>%
  filter(period_r != "pre.5051") %>%
  group_by(Source = opened_reason) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Source), -1, rank(Number)),
         Source = replace_na(Source, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Source ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Case Source:  post 5051",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Source == "Missing"), italic=T) %>%
  row_spec(which(tab$Source == "Total"), bold=T) 
Case Source: post 5051
Source Number Percent
Complaint 1488 44.0%
CJ Form 1915 Reporting 1427 42.2%
NOS Misconduct 421 12.4%
Division Initiated 49 1.4%
Total 3385 100.0%

Conduct investigated

Offenses are designated by the RCW that applies, and are coded by CJTC staff during the intake review process. Pre-5051 offenses are coded with a wide range of RCWs. Post-5051, cases are coded by the RCW section added by 5051 (43.101.105) and classified into 2 categories: mandatory vs. discretionary decertification, if sustained. In addition, CJTC has proposed a priority ranking for each offense to help triage the existing caseload.

Conduct investigated can be tabulated by the offenses charged, or by the case. Both are shown in this section.

By offense

The tables in this subsection count the number of times each offense appears in the certification database. The total number of offenses does not track the total number of cases, because a single subject may have committed several alleged offenses, and because most cases in intake status have not yet been assigned an offense. So the tables below are best used to understand what types of offenses are commonly found in the data. The full list of offenses is found on the “Original CJTC” tab.

By period received

tab <- offense_list_internal %>%
  group_by(Period = period_r) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number)) %>%
  #arrange(desc(Number)) %>%

  bind_rows(data.frame(Period ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Number of offenses by period",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Period == "Total"), bold=T) 
Number of offenses by period
Period Number Percent
pre.5051 1237 66.2%
interim 179 9.6%
current 452 24.2%
Total 1868 100.0%

Classification

Offenses are only classified into mandatory vs. discretionary for post-5051 cases, so we exclude the offenses found in the pre-5051 period. There are still a few missing cases, when the RCW listed is not from RCW section 43.101.105.

tab <- offense_list_internal %>%
  filter(period_r != "pre.5051") %>%
  group_by(Classification = offense_classification) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/nrow(offense_list_internal)) %>%
  #arrange(desc(Number)) %>%

  bind_rows(data.frame(Classification ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Classification of Offense",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Classification == "Not classified"), italic=T) %>%
  row_spec(which(tab$Classification == "Total"), bold=T) 
Classification of Offense
Classification Number Percent
Mandatory 57 3.1%
Discretionary 557 29.8%
Not classified 17 0.9%
Total 631 33.8%

Priority ranking

The proposed priority ranking of an offense depends on the employment status of the offender. CJTC staff have proposed priority ranks for all of the RCW offense categories established by 5051, except for two: voluntary surrender of certification and separated for misconduct. These cases are defined by RCW and appear in the dataset, so we include them in the table:

  • 43.101.105(k) “Has been suspended or discharged, has resigned or retired in lieu of discharge, or has separated from the agency after the alleged misconduct occurred, for any conduct listed in this section”

  • 43.101.105(l) “Has voluntarily surrendered the person’s certification as a peace officer or corrections officer.”.

Here again, offenses are only ranked for post-5051 cases, so we exclude the offenses found in the pre-5051 period.

Employed

tab <- offense_list_internal %>%
  filter(period_r != "pre.5051") %>%
  mutate(
    Offense_priority = factor(
      case_when(
        is.na(offense_priority_emp) ~ "Not ranked",
        offense_priority_emp < 4 ~ as.character(offense_priority_emp),
        offense_priority_emp == 4 ~ "Separated for misconduct",
        offense_priority_emp == 5 ~ "Surrendered certification")),
    Offense_priority = fct_relevel(Offense_priority, "Not ranked", after = Inf )
  ) %>%
  group_by(Priority = Offense_priority) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number)) %>%
  #arrange(desc(Number)) %>%

  bind_rows(data.frame(Priority ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Offense Priority: Employed",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Priority == "Not ranked"), italic=T) %>%
  row_spec(which(tab$Priority == "Total"), bold=T)
Offense Priority: Employed
Priority Number Percent
1 122 19.3%
2 327 51.8%
Separated for misconduct 91 14.4%
Surrendered certification 74 11.7%
Not ranked 17 2.7%
Total 631 100.0%

Separated

tab <- offense_list_internal %>%
  filter(period_r != "pre.5051") %>%
  mutate(
    Offense_priority = factor(
      case_when(
        is.na(offense_priority_sep) ~ "Not ranked",
        offense_priority_sep < 4 ~ as.character(offense_priority_sep),
        offense_priority_sep == 4 ~ "Separated for misconduct",
        offense_priority_sep == 5 ~ "Surrendered certification")),
    Offense_priority = fct_relevel(Offense_priority, "Not ranked", after = Inf )
  ) %>%
  group_by(Priority = Offense_priority) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number)) %>%
  #arrange(desc(Number)) %>%

  bind_rows(data.frame(Priority ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Offense Priority: Separated",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
    row_spec(which(tab$Priority == "Not ranked"), italic=T) %>%
  row_spec(which(tab$Priority == "Total"), bold=T)
Offense Priority: Separated
Priority Number Percent
2 52 8.2%
3 397 62.9%
Separated for misconduct 91 14.4%
Surrendered certification 74 11.7%
Not ranked 17 2.7%
Total 631 100.0%

Original CJTC RCW code

  • Offenses defined by 5051 will begin with RCW 43.101.105
  • When a description of the RCW has been included in the data, we include it in the table.
tab <- offense_list_internal %>%

  group_by(RCW=offense_rcw, Description=offense_desc) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/nrow(offense_list_internal)) %>%
  arrange(desc(Number)) %>%

  bind_rows(data.frame(RCW ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Offense investigated",
      align = "llrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$RCW == "Total"), bold=T)
Offense investigated
RCW Description Number Percent
RCW 43.101.105(3)(j)(iv) Unethical or Unprofessional Conduct 271 14.5%
RCW 9A.76.175 155 8.3%
RCW 43.101.105(3)(k) Separated for Misconduct 109 5.8%
RCW 43.101.010(8)(b) 101 5.4%
RCW 43.101.105(3)(l) Voluntary Surrender of Certification 80 4.3%
RCW 43.101.010(8) 74 4.0%
RCW 43.101.105(1)(d) 71 3.8%
RCW 43.101.010(9) 63 3.4%
RCW 43.101.010(7)(b) 58 3.1%
RCW 43.101.105(2)(d) False or Misleading Statements 56 3.0%
RCW 9A.80.010 Official Misconduct 50 2.7%
RCW 43.101.105(d) 46 2.5%
RCW 43.101.105(3)(e) Excessive Force 43 2.3%
RCW 43.101.105(4) 38 2.0%
RCW 42.20.100 Failure of Duty 27 1.4%
RCW 43.101.105(3)(h) Discrimination 27 1.4%
FormerRCW 43.101.105(1)(d) (2011) 24 1.3%
RCW 43.101.105(3)(j)(i) Committed a Felony 24 1.3%
RCW 43.101.010(7) 22 1.2%
RCW 9A.72.040 21 1.1%
RCW 43.101.010(7)(a) 18 1.0%
RCW 9A.36.041 18 1.0%
RCW 43.101.105(1)(c) 17 0.9%
RCW 43.101.105(3)(j)(iii) Unsafe Practices Involving Firearms, Weapons, or Vehicles 16 0.9%
RCW 9.41.040(2)(a)(i) 16 0.9%
RCW 42.20.040 False Report 15 0.8%
RCW 43.101.105(3)(f) Sexual Harassment 15 0.8%
RCW 43.101.105(c) 15 0.8%
RCW 43.101.105(3)(j)(ii) Violation of Rights 14 0.7%
RCW 43.101.105(1)(a) 13 0.7%
RCW 43.101.010(8)(a)(i)(B) 11 0.6%
RCW 43.101.010(8)(a)(iv) 11 0.6%
RCW 43.101.105(2)(a)(i)(B) Convicted of a Gross Misdemeanor or Domestic Violence Offense 11 0.6%
RCW 43.101.010(8)(b) (2008) 9 0.5%
RCW 43.101.105(3)(g) Use of Position for Personal Gain 9 0.5%
RCW 43.101.010(7)(c) 8 0.4%
RCW 43.101.010(8)(a)(i) 8 0.4%
RCW 43.101.010(b) 8 0.4%
RCW 43.101.105(3)(b) Administrative Error 7 0.4%
RCW 43.101.010(7)(a)(i) 6 0.3%
RCW 9A.56.030(1)(a) 6 0.3%
RCW 9A.56.050 6 0.3%
RCW 10.99.020(5) 5 0.3%
RCW 43.101.010(7)(a)(ii) 5 0.3%
RCW 43.101.010(8)(a)(ii) 5 0.3%
RCW 43.101.105(2)(e) Prohibited from Possessing Weapons 5 0.3%
RCW 43.101.105(3) 5 0.3%
RCW 43.101.105(3)(c) Falsified Information on Application 5 0.3%
RCW 43.101.105(3)(i) Extremism 5 0.3%
RCW 43.101.115(4) 5 0.3%
RCW 69.50.4013 5 0.3%
RCW 43.101.010(8)(a) and (b) (2008) 4 0.2%
RCW 43.101.105(1)(b) 4 0.2%
RCW 43.101.105(2)(a)(i)(A) Convicted of a Felony Offense 4 0.2%
RCW 10.99.020 3 0.2%
RCW 43.101.010 (8)(a)(i)(B) 3 0.2%
RCW 43.101.010(8)(a)(i)(A)(II) 3 0.2%
RCW 43.101.010(8)(a)(ii) and (b) (2008) 3 0.2%
RCW 43.101.010(8)(b) (2020) 3 0.2%
RCW 43.101.010(8)(iii) 3 0.2%
RCW 43.101.105(3)(d)(i) Interfered with a Certification Investigation 3 0.2%
RCW 43.101.105(d) (2011) 3 0.2%
RCW 69.50.403(1)(c) 3 0.2%
RCW 9A.44.160 3 0.2%
RCW 9A.44.160(1)(b) 3 0.2%
RCW 9A.46.110(1) 3 0.2%
RCW 9A.52.110 3 0.2%
RCW 9A.60.020 3 0.2%
RCW 9A.72.020 3 0.2%
RCW 9A.76.020 3 0.2%
RCW 10.99.020(3) 2 0.1%
RCW 43.101.010(4) 2 0.1%
RCW 43.101.010(7)(iv) 2 0.1%
RCW 43.101.010(8)(a) 2 0.1%
RCW 43.101.010(8)(a) and (b) 2 0.1%
RCW 43.101.095 2 0.1%
RCW 43.101.105(3)(a) Failed to MeetRequirements for Certification 2 0.1%
RCW 43.101.155(1) 2 0.1%
RCW 43.101.200 2 0.1%
RCW 69.41.030 2 0.1%
RCW 69.50.4013(1) 2 0.1%
RCW 69.50.4014 2 0.1%
RCW 9.41.040 2 0.1%
RCW 9.73.030 2 0.1%
RCW 9A.28.020 2 0.1%
RCW 9A.36.031(1)(g) 2 0.1%
RCW 9A.40.040 2 0.1%
RCW 9A.44.050 2 0.1%
RCW 9A.46.020(1)(a)(i) 2 0.1%
RCW 9A.46.020(2)(b) 2 0.1%
RCW 9A.52.030 2 0.1%
RCW 9A.52.100 2 0.1%
RCW 9A.52.120 2 0.1%
RCW 9A.56.020(1)(a) 2 0.1%
RCW 9A.56.040 2 0.1%
RCW 9A.76.175 False or Misleading Statements 2 0.1%
RCW 10.99.040 1 0.1%
RCW 26.50.010(7) 1 0.1%
RCW 26.50.110(1) 1 0.1%
RCW 42.20.050 1 0.1%
RCW 43.101.010 (8)(b) 1 0.1%
RCW 43.101.010(6) 1 0.1%
RCW 43.101.010(7)(a)(iii) 1 0.1%
RCW 43.101.010(7)(a)(iv) 1 0.1%
RCW 43.101.010(7)(b) and (8) 1 0.1%
RCW 43.101.010(8)(a)(i)(A) 1 0.1%
RCW 43.101.010(8)(a)(i)(A)(IV) 1 0.1%
RCW 43.101.010(8)(a)(i)(B) (2008) 1 0.1%
RCW 43.101.010(8)(a)(i)(B)(b) 1 0.1%
RCW 43.101.010(8)(a)(i)(ii)(iv) 1 0.1%
RCW 43.101.010(8)(a)(iii)(iv) 1 0.1%
RCW 43.101.010(8)(b) and (9) 1 0.1%
RCW 43.101.010(9) (2008) 1 0.1%
RCW 43.101.105(1) 1 0.1%
RCW 43.101.105(2)(a)(ii)(B) 1 0.1%
RCW 43.101.105(2)(c)(i) Terminated for Failure to Intervene in Use of Force 1 0.1%
RCW 43.101.105(3)(d)(ii) Interfered with a Certification Investigation 1 0.1%
RCW 43.101.105(3)(d)(iii) Interfered with a Certification Investigation 1 0.1%
RCW 43.101.157 1 0.1%
RCW 43.89.010 1 0.1%
RCW 46.52.010(1 1 0.1%
RCW 46.61.465 1 0.1%
RCW 46.61.500 1 0.1%
RCW 46.61.502 1 0.1%
RCW 46.61.520 1 0.1%
RCW 49.50.4013(1) 1 0.1%
RCW 50.36.010 1 0.1%
RCW 69.50.101(g) 1 0.1%
RCW 69.50.204(c)(22) 1 0.1%
RCW 69.50.206(b)(4) 1 0.1%
RCW 69.50.401 1 0.1%
RCW 69.50.401(1)(2)(c) 1 0.1%
RCW 69.50.401(1)(a) 1 0.1%
RCW 69.50.403(a) 1 0.1%
RCW 9.35.020 1 0.1%
RCW 9.35.020(1) 1 0.1%
RCW 9.35.020(3) 1 0.1%
RCW 9.41.020(2)(a)(i) 1 0.1%
RCW 9.68A.070(1) 1 0.1%
RCW 9.68A.070(2)(a) 1 0.1%
RCW 9.68A.090(1) 1 0.1%
RCW 9.94A.535(2)(f) 1 0.1%
RCW 9A.020(1)(b) 1 0.1%
RCW 9A.04.040 1 0.1%
RCW 9A.04.110(23) 1 0.1%
RCW 9A.08.020 1 0.1%
RCW 9A.28.040 1 0.1%
RCW 9A.28.040(3)(c) 1 0.1%
RCW 9A.36.021 1 0.1%
RCW 9A.36.021(1) 1 0.1%
RCW 9A.36.021(1)(a) 1 0.1%
RCW 9A.36.021(1)(c) 1 0.1%
RCW 9A.36.021(2)(a) 1 0.1%
RCW 9A.36.021(c) 1 0.1%
RCW 9A.36.031(f) 1 0.1%
RCW 9A.36.041(1) 1 0.1%
RCW 9A.36.050 1 0.1%
RCW 9A.40.010(1) 1 0.1%
RCW 9A.44.010(1)(c) 1 0.1%
RCW 9A.44.050(1)(B) 1 0.1%
RCW 9A.44.050(1)(F) 1 0.1%
RCW 9A.44.060 1 0.1%
RCW 9A.44.076 1 0.1%
RCW 9A.44.079 1 0.1%
RCW 9A.44.083 1 0.1%
RCW 9A.44.115 1 0.1%
RCW 9A.44.170(1)(b) 1 0.1%
RCW 9A.46.110 1 0.1%
RCW 9A.46.110(5)(b)(ii) 1 0.1%
RCW 9A.52.025(1) 1 0.1%
RCW 9A.52.080 1 0.1%
RCW 9A.56.020 1 0.1%
RCW 9A.56.020(1) 1 0.1%
RCW 9A.56.020(1)(b) 1 0.1%
RCW 9A.56.130 1 0.1%
RCW 9A.56.160 1 0.1%
RCW 9A.56.170 1 0.1%
RCW 9A.56.300 1 0.1%
RCW 9A.64.020(1) 1 0.1%
RCW 9A.72.030 1 0.1%
RCW 9A.72.110 1 0.1%
RCW 9A.72.150 1 0.1%
RCW 9A.76.070 1 0.1%
RCW 9A.82.050 1 0.1%
RCW 9A.83.020 1 0.1%
RCW 9A.83.020(1) 1 0.1%
RCW 9A.83.020(4) 1 0.1%
RCW 9A.88.080(1)(b) 1 0.1%
RCW 9A.88.110 1 0.1%
RCW 9A.90.040 1 0.1%
Total 1868 100.0%

By case

In this subsection we examine offenses by case. Multiple offenses may be coded for each case (officers may also have multiple cases in the dataset, that is a different issue and is covered in the last section of the report). So classification and ranking work a bit differently.

Perhaps the most important feature, however, is the number of cases that have no data on offense: over 80%. We show how this varies by period and status first.

Missing by period received

This table is useful for understanding how many cases are missing offense data.

tab <- cjtc %>%
  group_by(Period = period_r) %>%
  summarize(Number = n(),
            Missing = sum(is.na(num_offenses))) %>%
  mutate(Pct.Missing = Missing/Number) %>%
  #arrange(desc(Number)) %>%

  bind_rows(data.frame(Period ="Total",
                       Number = sum(.$Number),
                       Missing = sum(.$Missing),
                       Pct.Missing = sum(.$Missing)/sum(.$Number))) %>%
  mutate(Pct.Missing = scales::percent(Pct.Missing, acc=.1))
  
tab %>%
  kbl(caption = "Missing offense data by period",
      align = "lrrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Period == "Total"), bold=T) 
Missing offense data by period
Period Number Missing Pct.Missing
pre.5051 546 124 22.7%
interim 331 244 73.7%
current 3054 2827 92.6%
Total 3931 3195 81.3%

Missing by case status

tab <- cjtc %>%
  filter(!is.na(status3)) %>%
  group_by(Status = status3) %>%
  summarize(Number = n(),
            Missing = sum(is.na(num_offenses))) %>%
  mutate(Pct.Missing = Missing/Number) %>%
  #arrange(desc(Number)) %>%

  bind_rows(data.frame(Status ="Total",
                       Number = sum(.$Number),
                       Missing = sum(.$Missing),
                       Pct.Missing = sum(.$Missing)/sum(.$Number))) %>%
  mutate(Pct.Missing = scales::percent(Pct.Missing, acc=.1))
  
tab %>%
  kbl(caption = "Missing offense data by case status",
      align = "lrrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Status == "Total"), bold=T) 
Missing offense data by case status
Status Number Missing Pct.Missing
Intake and review 1739 1727 99.3%
Active investigation 183 141 77.0%
Closed 2003 1321 66.0%
Total 3925 3189 81.2%

Number of offenses

We exclude the missing cases from this table.

tab <- cjtc %>%
  filter(!is.na(num_offenses)) %>%
  group_by(Number = num_offenses) %>%
  summarize(Persons = n()) %>%
  mutate(Number = if_else(is.na(Number), "Not coded", as.character(Number)),
         Percent = Persons/sum(Persons)) %>%
  #arrange(desc(Persons)) %>%

  bind_rows(data.frame(Number ="Total",
                       Persons = sum(.$Persons),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Number of Offenses",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Number == "Not coded"), italic=T) %>%
  row_spec(which(tab$Number == "Total"), bold=T) 
Number of Offenses
Number Persons Percent
1 245 33.3%
2 137 18.6%
3 186 25.3%
4 102 13.9%
5 34 4.6%
6 18 2.4%
7 10 1.4%
8 2 0.3%
9 1 0.1%
10 1 0.1%
Total 736 100.0%

Classification

Since subjects can commit multiple offenses, with multiple classifications, here we identify those with at least one offense that would require mandatory decertification if sustained.

We exclude both the pre-5051 cases (since these are not classified) and the missing cases from the table.

tab <- cjtc %>%
  filter(!is.na(num_offenses) & period_r != "pre.5051") %>%
  mutate(Classification = case_when(
    num_mandatory_offenses > 0 ~ "At least one mandatory",
    num_discretionary_offenses > 1 ~ "Multiple discretionary",
    num_discretionary_offenses == 1 ~ "Single discretionary")
    ) %>%

  group_by(Classification) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Classification), -1, rank(Number)),
         Classification = replace_na(Classification, "Not classified")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Classification ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Case classification at the officer level:  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Classification == "Not classified"), italic=T) %>%
  row_spec(which(tab$Classification == "Total"), bold=T)
Case classification at the officer level: all cases
Classification Number Percent
Single discretionary 126 40.1%
Multiple discretionary 125 39.8%
At least one mandatory 56 17.8%
Not classified 7 2.2%
Total 314 100.0%

Priority ranking

There does not seem to be a way to identify the current employment status of officer in the certification database.


Documents

The documents for each case provide critical information. They are, however, often missing, even for closed cases.

All cases

tab <- cjtc %>%

  mutate(Available = if_else(is.na(url), "No", "Yes")
         ) %>%
  
  group_by(Available) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number)) %>%
  arrange(desc(Number)) %>%

  bind_rows(data.frame(Available ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1)) 
  
tab %>% kbl(caption = "Documents available?",
            align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Available == "Missing"), italic=T) %>%
  row_spec(which(tab$Available == "Total"), bold=T)
Documents available?
Available Number Percent
No 2507 63.8%
Yes 1424 36.2%
Total 3931 100.0%

By period

p <- ggplot(cjtc %>% mutate(Documents = if_else(is.na(url), "No", "Yes")), 
            aes(x=period_r, group=Documents, 
                fill=Documents)) + 
  
  geom_bar(alpha = 0.7, color="grey") +
  
  scale_fill_brewer(palette = "Spectral") +
  
  labs(title = "Document availability by period received",
       x = "Period Received",
       y = "Number of cases",
       fill = "Documents")

ggplotly(p, tooltip= c("count", "group")) #%>% reverse_legend_labels()

By status (recd post 5051)

p <- ggplot(cjtc %>% 
              filter(period_r != "pre.5051" & !is.na(status3)) %>%
              mutate(Documents = if_else(is.na(url), "No", "Yes")), 
            aes(x=status3, group=Documents, 
                fill=Documents)) + 
  
  geom_bar(alpha = 0.7, color="grey") +
  
  scale_fill_brewer(palette = "Spectral") +
  
  labs(title = "Document availability by case status",
       x = "Case status",
       y = "Number of cases",
       fill = "Documents")

ggplotly(p, tooltip= c("count", "group")) #%>% reverse_legend_labels()

Availability over time

closed_docs <- cjtc %>% 
  filter(status3=="Closed" & period_r != "pre.5051") %>%
  group_by(yr_mo = yr_mo_c) %>%
  summarize(num = n(),
            docs = sum(!is.na(url))) %>%
  mutate(status = "Closed",
         pct.docs = docs/num)

investigation_docs <- cjtc %>% 
  filter(grepl("investigation", status3) & period_r != "pre.5051") %>%
  group_by(yr_mo = yr_mo_r) %>%
  summarize(num = n(),
            docs = sum(!is.na(url))) %>%
  mutate(status = "Active investigation",
         pct.docs = docs/num)


monthly_df_closed <- left_join(index_post_5051, closed_docs) %>%
  arrange(seqnum) %>%
  mutate(across(num:docs, ~tidyr::replace_na(.x, 0)),
         cumulative_closed = cumsum(num),
         cumulative_docs = cumsum(docs),
         cumulative_pct_docs = cumulative_docs/cumulative_closed)

monthly_df_investigation <- left_join(index_post_5051, investigation_docs) %>%
  arrange(seqnum) %>%
  mutate(across(num:docs, ~tidyr::replace_na(.x, 0)),
         cumulative_investigation = cumsum(num),
         cumulative_docs = cumsum(docs),
         cumulative_pct_docs = cumulative_docs/cumulative_investigation)

Closed cases (closed post 5051)

    # date_received < as.Date("2021-06-20") ~ "pre.5051",
    # date_received > as.Date("2022-06-30") ~ "current",

p1 <- monthly_df_closed %>%
  pivot_longer(cols = cumulative_closed:cumulative_pct_docs,
               names_to = "Measure",
               values_to = "cumsum") %>%
  filter(Measure != "cumulative_pct_docs") %>%
  mutate(Measure = sub("cumulative_", "", Measure)) %>%

  ggplot(aes(x=seqnum, y=cumsum, group=Measure, color=Measure)) +
  
  geom_line(stat = "identity") +
  geom_point(stat = "identity",
             alpha = 0.7,
             size = 1) +
  
  geom_vline(xintercept = monthly_df_closed$seqnum[monthly_df_closed$yr_mo == "2022-7"],
             color = "darkgrey") +
  
  annotate("text",
           x = monthly_df_closed$seqnum[monthly_df_closed$yr_mo == "2022-1"],
           y = max(monthly_df_closed$cumulative_closed),
           label = "Interim", size = 3) +

    annotate("text",
           x = monthly_df_closed$seqnum[monthly_df_closed$yr_mo == "2024-1"],
           y = max(monthly_df_closed$cumulative_closed),
           label = "Current period", size = 3) +
  
  labs(title = "Cumulative closed cases and docs",
       x = "Year closed",
       y = "Number of cases") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(index_all), 12),
    label = as.character(2002:2025)
  ) +
  
  theme(text = element_text(size = 8))
  
p2 <- monthly_df_closed %>%

  ggplot(aes(x=seqnum, y=100*cumulative_pct_docs)) +
  
  geom_bar(stat = "identity",
           fill = "blue",
           alpha = 0.7) +
  
  geom_vline(xintercept = monthly_df_closed$seqnum[monthly_df_closed$yr_mo == "2022-7"],
             color = "darkgrey") +
  
  annotate("text",
           x = monthly_df_closed$seqnum[monthly_df_closed$yr_mo == "2022-1"],
           y = 100+5,
           label = "Interim", size = 3) +

    annotate("text",
           x = monthly_df_closed$seqnum[monthly_df_closed$yr_mo == "2024-1"],
           y = 100+5,
           label = "Current period", size = 3) +
  
  labs(title = "Percent with documents",
       x = "Year closed",
       y = "Percent of cases") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(index_all), 12),
    label = as.character(2002:2025)
  ) +
  
  theme(text = element_text(size = 8))

# p1plotly <- ggplotly(p1)
# p2plotly <- ggplotly(p2)

patchwork <- p1+p2+
  plot_layout(widths = c(3, 4))
patchwork + plot_annotation(
  title = "Cases closed post-5051 and document availability",
  subtitle = "By month closed") 

Active investigation cases (recd post 5051)

    # date_received < as.Date("2021-06-20") ~ "pre.5051",
    # date_received > as.Date("2022-06-30") ~ "current",

p1 <- monthly_df_investigation %>%
  pivot_longer(cols = cumulative_investigation:cumulative_pct_docs,
               names_to = "Measure",
               values_to = "cumsum") %>%
  filter(Measure != "cumulative_pct_docs") %>%
  mutate(Measure = sub("cumulative_", "", Measure)) %>%

  ggplot(aes(x=seqnum, y=cumsum, group=Measure, color=Measure)) +
  
  geom_line(stat = "identity") +
  geom_point(stat = "identity",
             alpha = 0.7,
             size = 1) +
  
  geom_vline(xintercept = 
               monthly_df_investigation$seqnum[monthly_df_investigation$yr_mo == "2022-7"],
             color = "darkgrey") +
  
  annotate("text",
           x = monthly_df_investigation$seqnum[monthly_df_investigation$yr_mo == "2022-1"],
           y = max(monthly_df_investigation$cumulative_investigation),
           label = "Interim", size = 3) +

    annotate("text",
           x = monthly_df_investigation$seqnum[monthly_df_investigation$yr_mo == "2024-1"],
           y = max(monthly_df_investigation$cumulative_investigation),
           label = "Current period", size = 3) +
  
  labs(title = "Cumulative cases and docs",
       x = "Year received",
       y = "Number of cases under investigation") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(index_all), 12),
    label = as.character(2002:2025)
  )+
  
  theme(text = element_text(size = 8))
  
p2 <- monthly_df_investigation %>%

  ggplot(aes(x=seqnum, y=100*cumulative_pct_docs)) +
  
  geom_bar(stat = "identity",
           fill = "blue",
           alpha = 0.7) +
  
  geom_vline(xintercept = 
               monthly_df_investigation$seqnum[monthly_df_investigation$yr_mo == "2022-7"],
             color = "darkgrey") +
  
  annotate("text",
           x = monthly_df_investigation$seqnum[monthly_df_investigation$yr_mo == "2022-1"],
           y = 100+5,
           label = "Interim", size = 3) +
  
  annotate("text",
           x = monthly_df_investigation$seqnum[monthly_df_investigation$yr_mo == "2024-1"],
           y = 100+5,
           label = "Current period", size = 3) +
  
  labs(title = "Pct of cases with docs",
       x = "Year Received",
       y = "Percent of cases") +
  
  scale_x_continuous(
    breaks = seq(1, nrow(index_all), 12),
    label = as.character(2002:2025)
  ) +
  
  theme(text = element_text(size = 8))

patchwork <- p1+p2+
  plot_layout(widths = c(3, 4))
patchwork + plot_annotation(
  title = "Active cases under investigation post-5051 and document availability",
  subtitle = "By month received") 


Employers

Agency type

Overall

tab <- cjtc %>%

  group_by(Type = organization_type) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Type), -1, rank(Number)),
         Type = replace_na(Type, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Type ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Employer agency type",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Type == "Missing"), italic=T) %>%
  row_spec(which(tab$Type == "Total"), bold=T) 
Employer agency type
Type Number Percent
Law Enforcement Agency 3462 88.1%
WA State Agency 234 6.0%
Correctional Facility 177 4.5%
WA State Agency; Law Enforcement Agency 9 0.2%
Law Enforcement Agency; Correctional Facility 5 0.1%
Law Enforcement Agency; NA 4 0.1%
Correctional Facility; Law Enforcement Agency 1 0.0%
Law Enforcement Agency; WA State Agency 1 0.0%
Police Department 1 0.0%
Missing 37 0.9%
Total 3931 100.0%

Post 5051

tab <- cjtc %>%
  filter(period_r != "pre.5051") %>%
  
  group_by(Type = organization_type) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Type), -1, rank(Number)),
         Type = replace_na(Type, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Type ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))
  
tab %>%
  kbl(caption = "Employer agency type",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Type == "Missing"), italic=T) %>%
  row_spec(which(tab$Type == "Total"), bold=T) 
Employer agency type
Type Number Percent
Law Enforcement Agency 2945 87.0%
WA State Agency 209 6.2%
Correctional Facility 174 5.1%
WA State Agency; Law Enforcement Agency 9 0.3%
Law Enforcement Agency; Correctional Facility 5 0.1%
Law Enforcement Agency; NA 4 0.1%
Correctional Facility; Law Enforcement Agency 1 0.0%
Law Enforcement Agency; WA State Agency 1 0.0%
Police Department 1 0.0%
Missing 36 1.1%
Total 3385 100.0%

Agency Name

Overall

tbl <- cjtc  %>%
  filter(!is.na(agency)) %>%
  
  mutate(employer = gsub("County Sheriff's Office", "Cnty SO", agency),
         employer = gsub("Police Department", "PD", employer)
         ) %>%
  
  group_by(employer) %>%
  summarise(Number = n()) %>%
  mutate(Percent = Number/sum(Number)) %>%
  arrange(Number) %>%
  filter(Number > 19)

ggplot(tbl,
       aes(x = reorder(employer, Percent), y = Percent, 
           label = Number)) +
  geom_text(aes(y = Percent), size = 3, nudge_y = .002) +
  geom_bar(stat="identity", fill="blue", alpha=.5) +
  labs(title = "Cases by Employer: Agencies with 20+ cases",
       caption = "CJTC certification database",
       x = "",
       y = "Percent of Total") +
  
  theme(axis.text.y = element_text(size=7)) +
  
  scale_y_continuous(labels = scales::percent_format(acc=1)) +
  coord_flip()

Post-5051

tbl2 <- cjtc  %>%
  filter(period_r != "pre.5051") %>%
  filter(!is.na(agency)) %>%
  
  mutate(employer = gsub("County Sheriff's Office", "Cnty SO", agency),
         employer = gsub("Police Department", "PD", employer)
         ) %>%
  
  group_by(employer) %>%
  summarise(Number = n()) %>%
  mutate(Percent = Number/sum(Number)) %>%
  arrange(Number) %>%
  filter(Number > 19)

ggplot(tbl2,
       aes(x = reorder(employer, Percent), y = Percent, 
           label = Number)) +
  geom_text(aes(y = Percent), size = 3, nudge_y = .002) +
  geom_bar(stat="identity", fill="blue", alpha=.5) +
  labs(title = "Cases by Employer post 5051: Agencies with 20+ cases",
       caption = "CJTC certification database",
       x = "",
       y = "Percent of Total") +
  
  theme(axis.text.y = element_text(size=7)) +
  
  scale_y_continuous(labels = scales::percent_format(acc=1)) +
  coord_flip()


Officers

Certification status

tab <- cjtc %>%

  group_by(Certification.status = certification_status) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Certification.status), -1, rank(Number)),
         Certification.status = replace_na(Certification.status, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Certification.status ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Certification status of subject:  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Certification.status == "Missing"), italic=T) %>%
  row_spec(which(tab$Certification.status == "Total"), bold=T)
Certification status of subject: all cases
Certification.status Number Percent
Certified 2782 70.8%
Not Certified 182 4.6%
Lapsed 79 2.0%
Expired 29 0.7%
Certified; NA 4 0.1%
Missing 855 21.8%
Total 3931 100.0%

Occupation type

tab <- cjtc %>%

  group_by(Occupation = occupation_type) %>%
  summarize(Number = n()) %>%
  mutate(Percent = Number/sum(Number),
         index = if_else(is.na(Occupation), -1, rank(Number)),
         Occupation = replace_na(Occupation, "Missing")
         ) %>%
  arrange(desc(index)) %>%
  select(-index) %>%

  bind_rows(data.frame(Occupation ="Total",
                       Number = sum(.$Number),
                       Percent = sum(.$Percent))) %>%
  mutate(Percent = scales::percent(Percent, acc=.1))

tab %>%  
  kbl(caption = "Occupation type of subject:  all cases",
      align = "lrr") %>%
  kable_styling(bootstrap_options = "striped") %>%
  row_spec(which(tab$Occupation == "Missing"), italic=T) %>%
  row_spec(which(tab$Occupation == "Total"), bold=T)
Occupation type of subject: all cases
Occupation Number Percent
Peace Officer 3101 78.9%
Corrections Officer 221 5.6%
Recruit 91 2.3%
Non-Certified 88 2.2%
Peace Officer; NA 4 0.1%
Missing 426 10.8%
Total 3931 100.0%

Officers with multiple cases

Here we pull out and list by name officers with more than one case in the database – i.e., they have been the subject of multiple complaints and/or agency notifications of misconduct. Note that when the name is missing in the database, we are unable to identify these officers.

The information shown in the columns of the table for each officer is aggregated from all of their cases in the dataset; the list of distinct entries for each type of information is given, separated by “;”. The exception is the “outcomes” column – here if their certification was revoked for any of their cases, we list just the revocation.

The table is interactive: data can be searched, sorted, filtered and downloaded using one of the buttons or boxes shown at the top of the table.

cjtc_name <- cjtc %>%
  filter(!is.na(first_name) & first_name != "Missing" &
           !grepl("REDACT", first_name) & !grepl("Unknown", first_name) &
           !is.na(last_name) & last_name != "Missing" & 
           !grepl("REDACT", last_name) & !grepl("Complainant", first_name)) %>%
  mutate(employer = gsub("County Sheriff's Office", "CSO", agency),
         employer = gsub("Police Department", "PD", employer),
         employer = gsub("Department of Corrections", "DOC", employer),
         source = case_when(
           grepl("1915", opened_reason) ~ "1915 report",
           grepl("Request", opened_reason) ~ "Reinstate rqst",
           grepl("Division", opened_reason) ~ "Division",
           TRUE ~ opened_reason),
         name = paste(last_name, first_name, sep=", ")
           ) %>%
  arrange(date_received) %>%
  select(name, employer, source, date_received, status, case_determination, outcat) %>%
  group_by(name) %>%
  summarise(nrec = n(),
            across(employer:outcat, ~paste(unique(.x), collapse="; ")), .groups = "drop") %>%
  mutate(across(where(is.character), ~if_else(. == "NA", NA, .))) %>% # recover NAs 
  mutate(outcat = gsub("Revoked\\/", "", outcat),
         final_outcome = if_else(grepl("Revo", case_determination), "Cert Revoked", outcat))


cjtc_name %>%
  filter(nrec > 1) %>%
  arrange(desc(nrec)) %>%
  select(name, cases=nrec, sources=source, dates=date_received, employers=employer, 
         statuses=status, outcomes=final_outcome) %>%
  
  DT::datatable(rownames = F,
                caption = paste("Officers with more than one record in the database as of", 
                                max(cjtc$date_received)),
                filter = 'top',
                escape = FALSE, 
                extensions = 'Buttons', 
                options = 
                  list(dom = 'Bfrtip',
                       buttons = c('copy', 'csv', 'excel', 'pdf', 'print'))
  )