2020-10-27

For data pertaining to violent offending by race see also: https://rpubs.com/johnbradford/violent-crime

Census Data

Below I obtain population data from the US Census to calculate population rates of interracial violent crime. To be consistent with the original NCVS excel file on the BJS website, population data for White, Black, and Asian are all non-Hispanic. I use the tidycensus R package to download the population data from the 2014-2018 5-Year American Community Survey.

library(tidyverse)
library(tidycensus)
library(gt)


##CENSUS DATA 
##American Community Survey (ACS)
#acs_vars <- tidycensus::load_variables(2018, "acs5")

census_data <- tidycensus::get_acs(geography = "us", geometry = FALSE, year = 2018, survey = "acs5",
          variables = c(Total = "B01003_001",
                        White = "B03002_003",
                        Black = "B03002_004",
                        Asian = "B03002_006",
                        `Native American` = "B03002_005",
                        Hawaiian = "B03002_007",
                        Hispanic = "B03002_012")) %>% 
  dplyr::select(variable, estimate) %>% 
  tidyr::pivot_wider(names_from = variable, values_from = estimate) %>% 
  tidyr::pivot_longer(-Total, names_to = "Race", values_to = "Population") 



##display in table

census_data %>% 
  dplyr::mutate(Percent = Population/Total, 
                Race = factor(Race, levels = c("White", "Hispanic", "Black", "Asian", "Native American", "Hawaiian"))) %>% 
  dplyr::select(-Total) %>% 
  dplyr::arrange(Race) %>% 
  gt::gt(.) %>% 
    tab_header(
    title = "Population by Race",
    subtitle = "US Census 5-Year American Community Survey (2014-2018)"
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2),
    rows = NULL,
    decimals = 2,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = TRUE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  fmt_percent(
    data = .,
    columns = c(3),
    placement = "right",
    decimals = 2,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  
  tab_style(
    style = cell_text(size = "medium", align = "center", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 1:3))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 1), cells_column_labels(columns = 1))
  ) %>% 
  
  tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_column_labels(columns = 1:3))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 2:3), 
                     cells_title(groups = c("title")))
  ) %>% 
  tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 
  tab_source_note(source_note = md("White, Black, Asian, Native American, and Native Hawaiian include only non-Hispanic; Hispanic estimate includes all races.")) 
Population by Race
US Census 5-Year American Community Survey (2014-2018)
Race Population Percent
White 197.18M 61.07%
Hispanic 57.52M 17.81%
Black 39.72M 12.30%
Asian 17.37M 5.38%
Native American 2.14M 0.66%
Hawaiian 525.10K 0.16%
White, Black, Asian, Native American, and Native Hawaiian include only non-Hispanic; Hispanic estimate includes all races.

I. Nonfatal Interracial Violence

Data pertaining to interracial violent victimizations (excluding murder) can be obtained from the National Crime Victimization Survey (NCVS). The Bureau of Justice Statistics (BJS) provides the following description on their website:

“The Bureau of Justice Statistics’ (BJS) National Crime Victimization Survey (NCVS) is the nation’s primary source of information on criminal victimization. Each year, data are obtained from a nationally representative sample of about 240,000 interviews on criminal victimization, involving 160,000 unique persons in about 95,000 households…. The NCVS collects information on nonfatal personal crimes (i.e., rape or sexual assault, robbery, aggravated and simple assault, and personal larceny) … both reported and not reported to police.”

The BJS has made available on their website data pertaining to interracial violence for 2018 and 2019, which can be downloaded as excel files.

Unfortunately, the BJS has not published analogous files for earlier versions of the survey from 2015 and 2016, and the format of the data they do publish changes from year to year. However, the raw data can be obtained from the Inter-university Consortium for Political and Social Research (ICPSR). The drawback is that these data are not easily accessible to the general public. The files are huge, require code books to interpret, and are generally unmanageable without some prior experience in coding and/or data analysis.

My presentation of the data available on the web is completely reproducible in R. The code below will automatically download, unzip, read and format the comma separated values (csv) files contained on the BJS website. My presentation of the ICPSR data is, in contrast, mostly reproducible, but requires that you first download the .tsv formatted files from the following links:

For practitioners of R, note that because the files are so large, I load and process the files using the data.table package, which might not be as familiar as the tidyverse or base R. For more information on data.table syntax, see: https://rdatatable.gitlab.io/data.table/index.html .

A. Online Data (2018, 2019)

All of the following tables and graphs are derived from the National Crime Victimization Survey (NCVS) available on the BJS website at Criminal Victimization, 2019 and Criminal Victimization, 2018.

Functions

process_online_ncvs_data <- function(df){
 
  require(dplyr)
  require(tidyr)
  require(data.table)
  
  dt <- data.table::as.data.table(df)
  dt[, `:=` (`Total Victimizations` = sum(Victimizations))]
  dt[, `:=` (`Total Offender Race` = sum(Victimizations)), by = .(`Offender Race`)]
  dt[, index := .I]
  dt[, Interracial_flag := ifelse(`Victim Race` == `Offender Race`, "No", "Yes")]
  all_races <- unique(c(dt$`Victim Race`))
  all_pairs_list <- combn(all_races, 2, simplify = FALSE)
  
  
  for(i in seq_along(all_pairs_list)){
    race_pairs <- all_pairs_list[[i]]
    race1 <- race_pairs[1]; race2 <- race_pairs[2]
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1), 
       `Total Interracial Pairs` := sum(Victimizations)]
    
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),  
       `Race Pair` := paste0(race1, "-", race2), by = index]
  }
  
  
  dt[, `:=` (
    `Percent of Total` = Victimizations/`Total Victimizations`,
    `Percent of Offender Race` = Victimizations/`Total Offender Race`,
    `Percent of Victim Race` = Victimizations/`Total Victim Race`,
    `Percent Interracial` = ifelse(Interracial_flag == "Yes", 
                                   Victimizations/`Total Interracial`, 
                                   NA_real_),
    `Percent Interracial Pairs` = ifelse(Interracial_flag == "Yes", 
                                         Victimizations/`Total Interracial Pairs`, 
                                         NA_real_))]
  
  race_levels <- c("White", "Black", "Hispanic", "Asian")
  race_levels <- race_levels[which(race_levels %in% 
                                     unique(dt[, c(`Offender Race`, 
                                                   `Victim Race`)]))]
  dt <- dt[`Offender Race` %in% race_levels & `Victim Race` %in% race_levels]
  dt[, `:=`(
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels))]
  
  gVars <- c("Offender Race", "Victim Race")
  gVars <- gVars[which(gVars %in% names(dt))]
  data.table::setorderv(dt, cols = gVars)
  
  ##reorganize columns
  dt[, index := NULL]
  dt[is.na(`Race Pair`) & Interracial_flag == "No", `Race Pair` := `Victim Race`]
  
  ##rearranging columns
  keep_vars <- unique(c(gVars, "Race Pair", "Victimizations", 
                        "Total Victimizations", "Percent of Total",
                        "Total Victim Race", "Percent of Victim Race",
                        "Total Offender Race", "Percent of Offender Race",
                        "Percent of Offender Race", 
                        "Total Interracial",  "Percent Interracial",  
                        "Total Interracial Pairs", "Percent Interracial Pairs"))
  dt <- dt[, ..keep_vars]
  
  ## I relabel some columns to save space
  col_names <- names(dt)
  new_names <- col_names %>% gsub("Percent ", "% ", .) %>% 
    gsub("Victimizations", "Victims", .) %>% 
    gsub("Interracial", "Inter-racial", .)
  names(dt) <- new_names

  dt

  
}

Get Data

2019 Data

## Online BJS Files
##National Crime Victimization Survey


library(tidyverse)
library(data.table)


## 2019 Data
temp <- tempfile()
download.file("https://www.bjs.gov/content/pub/sheets/cv19.zip", temp)


ncvs_t15 <- readr::read_csv(unz(temp, filename="cv19t15.csv"), 
                            skip = 11, n_max = 4) %>% 
  dplyr::select(-matches("X")) %>% 
  dplyr::rename(`Victim Race` = `Victim race/ethnicity`) %>% 
  dplyr::rename_all(., list(~gsub("\\/.*", "", .))) %>% 
  dplyr::mutate(`Victim Race` = gsub("\\/.*", "", `Victim Race`)) %>% 
  dplyr::rename( Total = `Number of violent incidents`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black", "Hispanic", "Other"))

dt_ncvs_19 <- ncvs_t15 %>% 
    dplyr::rename(`Total Victim Race` = Total) %>% 
    tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), 
                        names_to = "Offender Race", 
                        values_to = "Victimizations") %>% 
    dplyr::filter(`Offender Race` != "Total") %>% 
    dplyr::mutate(
Percent = Victimizations/`Total Victim Race`,
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])
) %>% 
  ##function defined above
  process_online_ncvs_data(.)

2018 Data

## Online BJS Files
##National Crime Victimization Survey


library(tidyverse)
library(data.table)


temp <- tempfile()
download.file("https://www.bjs.gov/content/pub/sheets/cv18.zip", temp)


ncvs_t14 <- readr::read_csv(unz(temp, filename="cv18t14.csv"), 
                            skip = 11, n_max = 4) %>% 
  dplyr::select(-matches("X")) %>% 
  dplyr::rename(`Victim Race` = `Victim race/ethnicity`) %>% 
  dplyr::mutate(`Asian/a` = as.numeric(gsub("<", "", `Asian/a`))) %>% 
  dplyr::select(-Total) %>% 
  dplyr::rename_all(., list(~gsub("\\/.*", "", .))) %>% 
  dplyr::mutate_at(3:8, ~./100) %>% 
  dplyr::mutate(`Victim Race` = gsub("\\/.*", "", `Victim Race`)) %>% 
  dplyr::rename(`Multiple` = `Multiple offenders of various races`) %>% 
  dplyr::mutate( Victimizations = `Number of violent incidents`) %>% 
  dplyr::select(-`Number of violent incidents`)


dt_ncvs_18 <- ncvs_t14 %>% 
    dplyr::rename(`Total Victim Race` = Victimizations) %>% 
    tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), names_to = "Offender Race", values_to = "Percent") %>% 
    dplyr::filter(`Offender Race` != "Total") %>% 
    dplyr::mutate(
Victimizations = `Total Victim Race` * (Percent),
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])
) %>% 

  ##function defined above
  process_online_ncvs_data(.)

2019

library(gt)
library(extrafont)

gt(ncvs_t15) %>% 
  tab_header(
    title = "Nonfatal Violent incidents by victim and offender race (2019)"
    #subtitle = ""
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2:6),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 

  tab_spanner(
    label = "Offender Race",
    columns = vars(White, Black, Hispanic, Other)) %>% 
  tab_footnote(
    footnote = "Includes Asians, Native Hawaiians and Other Pacific Islanders, American Indians and Alaska Natives, persons of two or more races, and multiple offenders of various races.",
    locations = cells_column_labels(vars(Other))) %>%
  
  tab_footnote(
    footnote = "If the victim perceived any of the offenders in a multiple offender incident to be of Hispanic origin, they are classified as Hispanic.",
    locations = cells_column_labels(vars(Hispanic))) %>%
  
  tab_footnote(
    footnote = "Excludes persons of Hispanic origin",
    locations = cells_column_labels(vars(Black, White))) %>%
  
  tab_options(
    footnotes.font.size = 11
  ) %>% 
  
  
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 2:6))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 1))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:6), 
                     cells_column_spanners(spanners ="Offender Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 
  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2019.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686)   \nFilename: cv19t15.csv")) 
Nonfatal Violent incidents by victim and offender race (2019)
Victim Race Total Offender Race
White1 Black1 Hispanic2 Other3
White 2,796,710 1,722,230 472,570 463,520 138,380
Black 494,610 89,980 346,260 43,730 14,640
Hispanic 774,310 170,840 249,030 334,600 19,840
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.
Filename: cv19t15.csv

1 Excludes persons of Hispanic origin

2 If the victim perceived any of the offenders in a multiple offender incident to be of Hispanic origin, they are classified as Hispanic.

3 Includes Asians, Native Hawaiians and Other Pacific Islanders, American Indians and Alaska Natives, persons of two or more races, and multiple offenders of various races.

library(gt)


m <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(`Total Offending` = `White` + `Black` + `Hispanic`) %>% 
  dplyr::ungroup()



gt(m) %>% 
  tab_header(
    title = "Nonfatal Violent Victimizations by Race of Offender and Victim (2019)"
    #subtitle = "Transposed"
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2:5),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  tab_spanner(
    label = "Victim Race",
    columns = vars(White, Black, Hispanic)) %>% 
  
  summary_rows(
    groups = NULL,
    columns = vars(`White`,`Black`, `Hispanic`, `Total Offending`),
    fns = list(`Total` = ~sum(.x, na.rm=TRUE)),
     decimals = 0,
    use_seps = TRUE,
    sep_mark = ",") %>% 
  
  tab_options(
    footnotes.font.size = 11
  ) %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 1:5), 
                     cells_stub(),
                     cells_grand_summary(columns = 1:5, rows = "Total"))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 2))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:5), 
                     cells_column_spanners(spanners ="Victim Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 

  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2019.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046)   \nFilename: cv19t15.csv")) 
Nonfatal Violent Victimizations by Race of Offender and Victim (2019)
Offender Race Victim Race Total Offending
White Black Hispanic
White 1,722,230 89,980 170,840 1,983,050
Black 472,570 346,260 249,030 1,067,860
Hispanic 463,520 43,730 334,600 841,850
Total 2,658,320 479,970 754,470 3,892,760
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.
Filename: cv19t15.csv

Keep in mind that these figures include only victimizations for which the race of both the offender and victim are known and that the 2019 BJS table includes only White, Black, and Hispanic as categories. Asians, which were included in 2018 as a separate category, are removed and lumped with ‘Other’. The column definitions are as follows:

  • Victims: frequency of victimizations involving Offender Race and Victim Race.
  • Total Victim Race: sum of victimizations involving Victim Race across all offender races.
  • Total Offender Race: sum of victimizations perpetrated by Offender Race across all victim races.
  • Total Inter-racial: sum of victimizations in which the race of offender and the race victim are different.
  • Total Inter-racial Pairs: sum of interracial victimizations involving the two races as either victim or offender.

For example, reading across the second row involving White offenders and Black victims, we see that these involved only 2.21% of all victimizations (in which the race of offender and victim are both known); 18% of victimizations of Black victims; 4.5% of victimizations involving White offenders; 5.4% of all interracial violent victimizations; and approximately 16% of violent interracial victimizations between Blacks and Whites, the other 84% involving Black offenders and White victims.

library(reactable)


reactable::reactable(dt_ncvs_19, 
                     striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, pagination = FALSE, 
                     compact = TRUE, resizable = TRUE, 
                     filterable = TRUE,
                    groupBy = NULL,
  ##Default formatting
   defaultColDef = colDef(
     #minWidth = 85, 
     maxWidth = 65,
     align = "center",
     style = list(fontSize = 11, 
                  fontFamily = "'Roboto Condensed', 
                  'Arial Narrow', Merriweather,  Arial, 
                  Helvetica, sans-serif"),
     format = list(cell = colFormat(separators = TRUE, 
                                    percent = FALSE, 
                                    digits = 0)),
     headerStyle =  list(fontWeight = "bold", fontSize = 12),
     footerStyle =  list(fontWeight = "bold"))
                     
  ,columns = list(
     `Offender Race` = colDef(maxWidth = 75, 
                              minWidth = NULL, 
                              align = "left", 
                              aggregate = "unique"),
     `Race Pair` = colDef(maxWidth = 90, 
                          minWidth = NULL, 
                          align = "left", 
                          aggregate = "unique"),
     `Victim Race` = colDef(maxWidth = 65, 
                            minWidth = NULL, 
                            align = "left", 
                            aggregate = "unique"),
     
     `% of Total` = colDef(maxWidth = 50, 
                           align = "center", 
                           format = colFormat(separators = FALSE,
                                              percent = TRUE, 
                                              digits = 2)),
     `Total Victim Race` = colDef(maxWidth = 60),
      `Total Inter-racial Pairs` = colDef(maxWidth = 55),
      `Total Inter-racial` = colDef(maxWidth = 60),
     `% of Victim Race` = colDef(maxWidth = 60, 
                                 align = "center", 
                                 format = colFormat(separators = FALSE, 
                                                    percent = TRUE, 
                                                    digits = 2)),
     `% of Offender Race` = colDef(maxWidth = 75, 
                                   align = "center", 
                                   format = colFormat(separators = FALSE, 
                                                      percent = TRUE, 
                                                      digits = 2)),
     `% Inter-racial` = colDef(maxWidth = 50, 
                               align = "center", 
                               format = colFormat(separators = FALSE, 
                                                  percent = TRUE, 
                                                  digits = 2)),
    
     `% Inter-racial Pairs` = colDef(maxWidth = 55, 
                                     align = "center", 
                                     format = colFormat(separators = FALSE, 
                                                        percent = TRUE, 
                                                        digits = 2))
       )
     )

Converting from long to wide format…

m_19 <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(
    Victims = round(Victims),
    `Victim Race` = paste0(`Victim Race`, " Victims")) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(
    `Total Offending` = `White Victims` + `Black Victims` + `Hispanic Victims`, 
                
    `% of Victims` = list(c(`White Victims`,`Black Victims`, `Hispanic Victims`))
                
    # ,`Victim Counts` = list(c(`White Victims`,`Black Victims`,
    #                                 `Hispanic Victims`,`Asian Victims`))
 ) 


bar_colors <- c("#56B4E9",  "#000000", "#D55E00")
bar_color_list <- list(bar_colors, bar_colors, bar_colors, bar_colors)

reactable::reactable(m_19, striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, compact = FALSE, resizable = FALSE,
                     
 # rowStyle = function(index) {if (index == 5)  list(fontWeight = "bold")},
 # rowClass = function(index) {if (index == 5) {"bold"}},
                     
 defaultColDef = colDef(
   format = colFormat(separators = TRUE, digits = 0), 
   footerStyle =  list(fontWeight = "bold"),
   footer = function(values) {
      if (!is.numeric(values)) return()
      sparkline(values, type = "pie", width = 80, height = 30, 
                sliceColors = c("#56B4E9",  "#000000", "#D55E00"))
    }),
 
   columns = list(
     `Offender Race` = colDef(footer = "% of Offenders"),
     `% of Victims` = colDef(cell = function(values) {
        if (!is.numeric(values)) return()
        sparkline(values, type = "pie", width = 80, height = 30, 
                  sliceColors = c("#56B4E9",  "#000000", "#D55E00"))
      })
   # ,`Victim Counts` = colDef(cell = function(values, index) {
   #      sparkline(values, type = "bar", 
   #                barWidth = 10,
   #                #barColor = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"),
   #                barColor = bar_colors[[index]],
   #                stackedBarColor = bar_colors[[index]],
   #                chartRangeMin = 0, 
   #                chartRangeMax = unlist(m$`Victim Counts`[[index]])
   #                )
   #    })
   )
)

Violent Victimizations by Race of Offender and Victim (2019)

#devtools::install_github("mattflor/chorddiag")
library(chorddiag)

m <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims)

matrix_names <- list(Offender = m$`Offender Race`, Victim = names(m)[-1])

m <- as.matrix(m[,-1])
dimnames(m) <- matrix_names


groupColors <- c("#56B4E9",  "#000000", "#D55E00")
chorddiag(m, groupColors = groupColors, groupThickness = 0.1, groupPadding = 2, groupnamePadding = 30)
library(plotly)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"))
 

g <- ggplot(d_gg, 
            aes(x = `Victim Race`, 
                y = Victims, 
                fill = `Offender Race`)) +
  geom_col(alpha = 0.75, aes(text = Label)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Violent incidents by victim and offender race (2019)") +
  labs(
    x = "Victim Race", 
    y = "Number of violent incidents") +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed") 


sources_label <- "Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>"


p <- plotly::ggplotly(g, width = NULL, height = NULL, #plotly_graph_heights, 
                      tooltip = c("text"), dynamicTicks = FALSE) %>% 
    plotly::config(scrollZoom = FALSE) %>% plotly::layout(dragmode='pan') %>% 
    #plotly::hide_legend() %>% 
    layout(annotations = 
      list(x = 0.5, y = -0.36, text = sources_label, 
        showarrow = F, xref='paper', yref='paper', 
        xanchor='auto', yanchor='auto', xshift=0, yshift=0,
        font=list(size=12, color="black"))
 )

p
# install.packages("remotes")
# remotes::install_github("coolbutuseless/ggpattern")
library(ggpattern)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"))
 

ggplot(d_gg, aes(x = `Victim Race`, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Hispanic="#D55E00",Black="#000000",Asian="#F0E442")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,90)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Nonfatal Violent incidents by victim and offender race (2019)", 
          subtitle = "National Crime Victimization Survey") +
  labs(
    x = "Victim Race", y = "Number of violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed") 

##rpubs not publishing due to file size
#install.packages("treemapify")
library(treemapify)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", 
                   signs::signs(Victims, accuracy = 100, format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    colorText = "white" #ifelse(`Offender Race` == "Asian", "Black", "White")
    )


ggplot(d_gg, aes(area = Victims, label = Label, fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00")) + #"#F0E442")) +
  scale_color_manual(values = c("white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(aes(color = "white"), place = "topleft", 
                    reflow = T, family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", grow = T, alpha = 0.4, 
                             colour = "white", fontface = "italic", 
                             min.size = 0, 
                             family = "Roboto Condensed", reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents within and between races (2019)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) 

Focusing specifically on crime involving Blacks and Whites:

library(treemapify)


d_gg <- ncvs_t15 %>% 
    dplyr::rename(`Total Victim Race` = Total) %>% 
    tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), 
                        names_to = "Offender Race", 
                        values_to = "Victimizations") %>% 
    #dplyr::filter(`Offender Race` != "Total") %>% 
    dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                  `Offender Race` %in% c("White", "Black")) %>% 
    dplyr::mutate(
  Percent = Victimizations/`Total Victim Race`,
  `Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])) %>% 
  
  process_online_ncvs_data(.) %>% 

  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100, 
                                                format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    #colorText = ifelse(`Offender Race` == "Asian", "Black", "White"),
     `Victim Race` = factor(`Victim Race`, levels = c("White",  "Black")), 
    `Offender Race` = factor(`Offender Race`, levels = c("White",  "Black"))
    )



ggplot(d_gg, aes(area = Victims, label = Label, fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_color_manual(values = c("black", "white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(color = "white", place = "topleft", 
                    reflow = T, family = "Roboto Condensed") +
  #geom_treemap_text(aes(color = colorText), place = "topleft", reflow = T, family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", grow = T, 
                             alpha = 0.4, colour = "white", 
                             fontface = "italic", min.size = 0, 
                             family = "Roboto Condensed", reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents among Blacks and Whites (2019)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) 

race_levels <- c("White", "Black", "Hispanic")

library(ggpattern)

d_gg <- dt_ncvs_19 %>% 
  tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
   
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    ##Label = paste0(rateLabel, "\n(", countLabel, ")"),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic",  
                                                 "Black-Hispanic"
                                                 # "White-Asian",
                                                 # "Black-Asian", 
                                                 # "Hispanic-Asian"
                                                 ))
    
    
    )
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, 
                     expand = expansion(mult = c(0, 0.25)), 
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Nonfatal Interracial Violent Crime (2019)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Number of violent incidents",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
  geom_text(aes(y = Victims, label = Label), 
            vjust = 0, nudge_y = 50, family = "Roboto Condensed", 
            fill = NA, size = 4, fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 3)

library(ggpattern)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"))
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill =  `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  #scale_fill_manual(values = c(  "#000000", "#56B4E9", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2)),
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Nonfatal Black-White Interracial Violence (2019)", 
          subtitle = NULL) +
          #subtitle = "") +
  labs(
    x = "", y = "Number of violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
  geom_text(aes(y = `Victims`, label = Label), 
            vjust = 0, nudge_y = 5000, family = "Roboto Condensed", 
            fill = NA, size = 5, fontface = "bold") 

race_levels <- c("White", "Black", "Hispanic")

library(ggpattern)

d_gg <- dt_ncvs_19 %>% 
  tibble::as_tibble() %>% 
  dplyr::left_join(census_data %>% 
                     dplyr::select(`Offender Race` = Race, Population), 
                   by = c("Offender Race")) %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
    `Victimization Rate` = 1e06*Victims / Population, 
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    rateLabel = paste0(signs::signs(`Victimization Rate`, 
                                    format = scales::comma, 
                                    accuracy = 1), 
                       " per 1M ", 
                       `Offender Race`, "s"),
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(rateLabel, "\n", countLabel, " (", 
                   percent_interracial_pair, ")"),
    #Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic", 
                                                 "Black-Hispanic" 
                                                 # "White-Asian", 
                                                 # "Black-Asian", 
                                                 # "Hispanic-Asian"
                                                 ))
    
    
    )
 

ggplot(d_gg, aes(x = Category, 
                 y = `Victimization Rate`, 
                 fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9", 
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.25))) +  #, labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Rate of Interracial Nonfatal Violent Crime by Offender Population (2019)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Interracial Nonfatal Violent Crime Rate",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
  geom_text(aes(y = `Victimization Rate`, label = Label), 
            vjust = 0, nudge_y = 50, 
            family = "Roboto Condensed", 
            fill = NA, size = 3.5, 
            fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 3)

2018

The original table provided by BJS looks like the following:

## Online BJS Files
##National Crime Victimization Survey
##Download NCVS 2018 (released 2019)


library(tidyverse)
library(data.table)
library(gt)
library(extrafont)


gt(ncvs_t14) %>% 
  tab_header(
    title = "Violent incidents by victim and offender race (2018)"
    #subtitle = ""
  ) %>% 
  fmt_number(
    data = .,
    columns = c(8),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  fmt_percent(
    data = .,
    columns = c(2:7),
    placement = "right",
    decimals = 2,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  
  tab_spanner(
    label = "Offender Race",
    columns = vars(White, Black, Hispanic, Asian, Other, Multiple)) %>% 
  tab_footnote(
    footnote = "Includes Native Americans, Hawaiians, and multiracial individuals.",
    locations = cells_column_labels(vars(Other))) %>%
  tab_footnote(
    footnote = "Multiple offenders of various races.",
    locations = cells_column_labels(vars(Multiple))) %>%

  tab_options(
    footnotes.font.size = 11
  ) %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 2:8))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 1))
  ) %>% 

  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:8), 
                     cells_column_spanners(spanners ="Offender Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 
  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2018.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686)   \nFilename: cv18t14.csv")) 
Violent incidents by victim and offender race (2018)
Victim Race Offender Race Victimizations
White Black Hispanic Asian Other1 Multiple2
White 62.10% 15.30% 10.20% 2.20% 8.10% 2.10% 3,581,360
Black 10.60% 70.30% 7.90% 0.10% 9.30% 1.90% 563,940
Hispanic 28.20% 15.30% 45.40% 0.60% 7.40% 3.00% 734,410
Asian 24.10% 27.50% 7.00% 24.10% 14.40% 2.90% 182,230
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.
Filename: cv18t14.csv

1 Includes Native Americans, Hawaiians, and multiracial individuals.

2 Multiple offenders of various races.

From these percentages, we can derive the frequencies (counts). Below I show the frequencies of violent victimizations and transpose the table so that the rows show Offender Race and the columns show Vicim Race.

library(gt)


m <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(`Total Offending` = `White` + `Black` + `Hispanic` + `Asian`) %>% 
  dplyr::ungroup()



gt(m) %>% 
  tab_header(
    title = "Violent Victimizations by Race of Offender and Victim (2018)"
    #subtitle = "Transposed"
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2:6),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  tab_spanner(
    label = "Victim Race",
    columns = vars(White, Black, Hispanic, Asian)) %>% 
  summary_rows(
    groups = NULL,
    columns = vars(`White`,`Black`, `Hispanic`, `Asian`, `Total Offending`),
    fns = list(`Total` = ~sum(.x, na.rm=TRUE)),
     decimals = 0,
    use_seps = TRUE,
    sep_mark = ",") %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 1:7), 
                     cells_stub(),
                     cells_grand_summary(columns = 1:7, rows = "Total"))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 2))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:6), 
                     cells_column_spanners(spanners ="Victim Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 

  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2018.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686)   \nFilename: cv18t14.csv")) 
Violent Victimizations by Race of Offender and Victim (2018)
Offender Race Victim Race Total Offending
White Black Hispanic Asian
White 2,224,025 59,778 207,104 43,917 2,534,824
Black 547,948 396,450 112,365 50,113 1,106,876
Hispanic 365,299 44,551 333,422 12,756 756,028
Asian 78,790 564 4,406 43,917 127,677
Total 3,216,062 501,343 657,297 150,703 4,525,405
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.
Filename: cv18t14.csv

From the table above provided by the BJS, we can derive the frequencies of both offenders and victims by race. Below I have removed Offenders classified as ‘Other’ or ‘Multiple’ race. but not before calculating the total number of victimizations including them. Keep in mind is that these figures include only victimizations for which the race of both the offender and victim are known.

For example, reading across the second row involving White offenders and Black victims, we see that these involved only 1.18% of all victimizations (in which the race of offender and victim are both known); 10.6% of victimizations of Black victims; 2.36% of victimizations involving White offenders; 2.9% of all interracial violent victimizations; and approximately 10% of violent interracial victimizations between Blacks and Whites, the other 90% involving Black offenders and White victims.

library(reactable)


reactable::reactable(dt_ncvs_18, striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, pagination = FALSE, 
                     compact = TRUE, resizable = TRUE, 
                     filterable = TRUE,
                       groupBy = NULL,
   defaultColDef = colDef(
     #minWidth = 85, 
     maxWidth = 65,
     align = "center",
     style = list(fontSize = 11, 
                  fontFamily = "'Roboto Condensed', 
                  'Arial Narrow', Merriweather,  Arial, 
                  Helvetica, sans-serif"),
     format = list(cell = colFormat(separators = TRUE, 
                                    percent = FALSE, digits = 0)),
     headerStyle =  list(fontWeight = "bold", fontSize = 12),
     footerStyle =  list(fontWeight = "bold"))
   
   ,columns = list(
     `Offender Race` = colDef(maxWidth = 75, 
                              minWidth = NULL, 
                              align = "left", 
                              aggregate = "unique"),
     `Race Pair` = colDef(maxWidth = 90, 
                          minWidth = NULL, 
                          align = "left", 
                          aggregate = "unique"),
     `Victim Race` = colDef(maxWidth = 65, 
                            minWidth = NULL, 
                            align = "left", 
                            aggregate = "unique"),
     
     `% of Total` = colDef(maxWidth = 50, 
                           align = "center", 
                           format = colFormat(separators = FALSE, 
                                              percent = TRUE, 
                                              digits = 2)),
     `Total Victim Race` = colDef(maxWidth = 60),
      `Total Inter-racial Pairs` = colDef(maxWidth = 55),
      `Total Inter-racial` = colDef(maxWidth = 60),
     `% of Victim Race` = colDef(maxWidth = 60, 
                                 align = "center", 
                                 format = colFormat(separators = FALSE, 
                                                    percent = TRUE, 
                                                    digits = 2)),
     `% of Offender Race` = colDef(maxWidth = 75, 
                                   align = "center", 
                                   format = colFormat(separators = FALSE, 
                                                      percent = TRUE, 
                                                      digits = 2)),
     `% Inter-racial` = colDef(maxWidth = 50, 
                               align = "center", 
                               format = colFormat(separators = FALSE, 
                                                  percent = TRUE, 
                                                  digits = 2)),
    
     `% Inter-racial Pairs` = colDef(maxWidth = 55, 
                                     align = "center", 
                                     format = colFormat(separators = FALSE, 
                                                        percent = TRUE, 
                                                        digits = 2))
     )
                     )

The table above is in ‘long-format’. We can simplify greatly by pivoting to wide format as shown below. The pie charts on the far right under the column header % of Victims show the racial make-up of victims of offenders of each race.

For example, reading from left to right across the top row, we see that roughly 88% of the victims of White offenders are themselves White. The pie graphs in the last last row % of Offenders, by contrast, show the racial make-up of the offenders who victimized each racial group. For example, in the Black Victims column we see that roughly 79% of Black victims of violence are victimized by other Blacks.

library(reactable)
library(sparkline)


m <-  dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(
    Victims = round(Victims),
    `Victim Race` = paste0(`Victim Race`, " Victims")) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(
    `Total Offending` = `White Victims` + `Black Victims` + 
      `Hispanic Victims` + `Asian Victims`, 
                
    `% of Victims` = list(c(`White Victims`,`Black Victims`,
                                   `Hispanic Victims`,`Asian Victims`))
                
    # ,`Victim Counts` = list(c(`White Victims`,`Black Victims`,
    #                                 `Hispanic Victims`,`Asian Victims`))
 ) 


bar_colors <- c("#56B4E9",  "#000000", "#D55E00", "#F0E442")
bar_color_list <- list(bar_colors, bar_colors, bar_colors, bar_colors)

reactable::reactable(m, striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, compact = FALSE, resizable = FALSE,
                     
 # rowStyle = function(index) {if (index == 5)  list(fontWeight = "bold")},
 # rowClass = function(index) {if (index == 5) {"bold"}},
                     
 defaultColDef = colDef(
   format = colFormat(separators = TRUE, digits = 0), 
   footerStyle =  list(fontWeight = "bold"),
   footer = function(values) {
      if (!is.numeric(values)) return()
      sparkline(values, type = "pie", width = 80, height = 30, 
                sliceColors = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"))
    }),
 
   columns = list(
     `Offender Race` = colDef(footer = "% of Offenders"),
     `% of Victims` = colDef(cell = function(values) {
        if (!is.numeric(values)) return()
        sparkline(values, type = "pie", width = 80, height = 30, 
                  sliceColors = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"))
      })
   # ,`Victim Counts` = colDef(cell = function(values, index) {
   #      sparkline(values, type = "bar", 
   #                barWidth = 10,
   #                #barColor = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"),
   #                barColor = bar_colors[[index]],
   #                stackedBarColor = bar_colors[[index]],
   #                chartRangeMin = 0, 
   #                chartRangeMax = unlist(m$`Victim Counts`[[index]])
   #                )
   #    })
   )
)

Violent Victimizations by Race of Offender and Victim (2018)

The information above can be encoded in a chord diagram, shown below. The ‘total’ for each race, depicted by the color and size of the outside segments correspond to the offender race and not the race of the victim.

library(chorddiag)

m <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims)

matrix_names <- list(Offender = m$`Offender Race`, Victim = names(m)[-1])

m <- as.matrix(m[,-1])
dimnames(m) <- matrix_names


groupColors <- c("#56B4E9",  "#000000", "#D55E00", "#F0E442")
chorddiag(m, groupColors = groupColors, groupThickness = 0.1, 
          groupPadding = 2, groupnamePadding = 30)

Chord diagram showing violent victimizations by race of offender and victim (2018)

Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. : cv18t14.csv

library(plotly)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"))
 

g <- ggplot(d_gg, aes(x = `Victim Race`, y = Victims, 
                      fill = `Offender Race`)) +
  geom_col(alpha = 0.75, aes(text = Label)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Violent incidents by victim and offender race (2018)") +
  labs(
    x = "Victim Race", 
    y = "Number of violent incidents") +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed") 


sources_label <- "Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>"


p <- plotly::ggplotly(g, width = NULL, height = NULL, #plotly_graph_heights, 
                      tooltip = c("text"), dynamicTicks = FALSE) %>% 
    plotly::config(scrollZoom = FALSE) %>% plotly::layout(dragmode='pan') %>% 
    #plotly::hide_legend() %>% 
    layout(annotations = 
      list(x = 0.5, y = -0.36, text = sources_label, 
        showarrow = F, xref='paper', yref='paper', 
        xanchor='auto', yanchor='auto', xshift=0, yshift=0,
        font=list(size=12, color="black"))
 )

p

Here is a more printer-friendly version using the ggpattern package.

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"))
 

ggplot(d_gg, aes(x = `Victim Race`, y = Victims, 
                 fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", 
                                                "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Hispanic="#D55E00",Black="#000000",Asian="#F0E442")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,90)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Nonfatal Violent incidents by victim and offender race (2018)", 
          subtitle = "National Crime Victimization Survey") +
  labs(
    x = "Victim Race", y = "Number of violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed") 

Below I use the treemapify package.

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100, 
                                                format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    colorText = ifelse(`Offender Race` == "Asian", "Black", "White")
    )


ggplot(d_gg, aes(area = Victims, 
                 label = Label, 
                 fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_color_manual(values = c("black", "white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(aes(color = colorText), 
                    place = "topleft", reflow = T, 
                    family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", 
                             grow = T, alpha = 0.4, colour = "white", 
                             fontface = "italic", min.size = 0, 
                             family = "Roboto Condensed", 
                             reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents within and between races (2018)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) 

Focusing specifically on crime involving Blacks and Whites.

library(treemapify)


d_gg <-  ncvs_t14 %>% 
  dplyr::rename(`Total Victim Race` = Victimizations) %>% 
  tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), 
                      names_to = "Offender Race", 
                      values_to = "Percent") %>% 
  #dplyr::filter(`Offender Race` != "Total") %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  dplyr::mutate(
      Victimizations = `Total Victim Race` * (Percent),
      `Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])) %>% 

  process_online_ncvs_data(.) %>% 

  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100, 
                                                format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    #colorText = ifelse(`Offender Race` == "Asian", "Black", "White"),
     `Victim Race` = factor(`Victim Race`, levels = c("White",  "Black")), 
    ##for ordering graphs
    `Offender Race` = factor(`Offender Race`, levels = c("White",  "Black"))
    )


ggplot(d_gg, aes(area = Victims, 
                 label = Label, 
                 fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_color_manual(values = c("black", "white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(color = "white", place = "topleft", 
                    reflow = T, family = "Roboto Condensed") +
  #geom_treemap_text(aes(color = colorText), place = "topleft", reflow = T, family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", grow = T, 
                             alpha = 0.4, colour = "white", 
                             fontface = "italic", min.size = 0, 
                             family = "Roboto Condensed", 
                             reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents among Blacks and Whites (2018)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) 

race_levels <- c("White", "Black", "Hispanic", "Asian")

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    ##Label = paste0(rateLabel, "\n(", countLabel, ")"),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", "White-Hispanic", 
                                                 "White-Asian", "Black-Hispanic", 
                                                 "Black-Asian", "Hispanic-Asian"))
    
    
    )
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", 
                                                "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, 
                     expand = expansion(mult = c(0, 0.25)), 
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Nonfatal Interracial Violent Crime (2018)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Number of violent incidents",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
  geom_text(aes(y = Victims, label = Label), 
            vjust = 0, nudge_y = 50, 
            family = "Roboto Condensed", 
            fill = NA, size = 4, 
            fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 2)

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"))
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill =  `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  #scale_fill_manual(values = c(  "#000000", "#56B4E9", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2)),
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(
    linetype = 2), panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
    axis.title.y = element_text(size = 14), 
        legend.position = "none", 
    axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Black-White Nonfatal Interracial Violence (2018)", 
          subtitle = NULL) +
          #subtitle = "") +
  labs(
    x = "", y = "Number of nonfatal violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
  geom_text(aes(y = `Victims`, label = Label), 
            vjust = 0, nudge_y = 5000, 
            family = "Roboto Condensed", 
            fill = NA, 
            size = 5, fontface = "bold") 

race_levels <- c("White", "Black", "Hispanic", "Asian")

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::left_join(census_data %>% 
                     dplyr::select(`Offender Race` = Race, 
                                   Population), 
                   by = c("Offender Race")) %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
    `Victimization Rate` = 1e06*Victims / Population, 
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    rateLabel = paste0(signs::signs(`Victimization Rate`, 
                                    format = scales::comma, accuracy = 1), 
                       " per 1 million ", 
                       `Offender Race`, "s"),
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(rateLabel, "\n", countLabel, " (", 
                   percent_interracial_pair, ")"),
    #Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic", 
                                                 "White-Asian", 
                                                 "Black-Hispanic", 
                                                 "Black-Asian", 
                                                 "Hispanic-Asian"))
    
    
    )
 

ggplot(d_gg, aes(x = Category, 
                 y = `Victimization Rate`, 
                 fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", 
                                                "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9", 
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, 
                     expand = expansion(mult = c(0, 0.25))) +  
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Rate of Interracial Nonfatal Violent Crime by Offender Population (2018)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Interracial Nonfatal Violent Crime Rate",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
  geom_text(aes(y = `Victimization Rate`, label = Label), 
            vjust = 0, nudge_y = 50, 
            family = "Roboto Condensed", fill = NA, 
            size = 4, fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 2)

race_levels <- c("White", "Black", "Hispanic", "Asian")

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::left_join(census_data %>% dplyr::select(`Offender Race` = Race, 
                                                 Population), 
                   by = c("Offender Race")) %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
    `Victimization Rate` = 1e06*Victims / Population, 
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    rateLabel = paste0(signs::signs(`Victimization Rate`, 
                                    format = scales::comma, accuracy = 1), 
                       " per 1 million ", 
                       `Offender Race`, "s"),
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(rateLabel, "\n", countLabel, " (", 
                   percent_interracial_pair, ")"),
    #Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic", 
                                                 "White-Asian", 
                                                 "Black-Hispanic", 
                                                 "Black-Asian", 
                                                 "Hispanic-Asian"))
    
    
    )
 


ggplot(d_gg, aes(x = Category, y = `Victimization Rate`, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill =  `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Black="#000000", Hispanic = "#D55E00", Asian = "#F0E442")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2))) +  #, labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), axis.title.y = element_text(size = 14), 
        legend.position = "none", axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Rate of Black-White Interracial Violent Crime by Offender Population (2018)", subtitle = "Does not include murder") +
          #subtitle = "") +
  labs(
    x = "", y = "Interracial Violent Crime Rate",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
  geom_text(aes(y = `Victimization Rate`, label = Label), vjust = 0, nudge_y = 50, family = "Roboto Condensed", fill = NA, size = 5, fontface = "bold") 

B. ICPSR Data (2015, 2016, 2018)

All of the following tables and graphs are derived from the National Crime Victimization Surveys (NCVS) available at the Inter-university Consortium for Political and Social Research (ICPSR). The ICPSR data can be found at the following links:

Functions

The code I use to derive the victim and offender counts can be found below.

library(tidyverse)
library(data.table)

get_crime <- function(x){
  data.table::fcase(
    ##Violent Crimes 1-20
    x == 1, "Completed rape",
    x == 2, "Attempted rape",
    x == 3, "Sexual attack with serious assault",
    x == 4, "Sexual attack with minor assault",
    x == 5, "Completed robbery with injury from serious assault",
    x == 6, "Completed robbery with injury from minor assault",
    x == 7, "Completed robbery without injury from minor assault",
    x == 8, "Attempted robbery with injury from serious assault",
    x == 9, "Attempted robbery with injury from minor assault",
    x == 10, "Attempted robbery without injury",
    x == 11, "Completed aggravated assault with injury",
    x == 12, "Attempted aggravated assault with weapon",
    x == 13, "Threatened assault with weapon",
    x == 14, "Simple assault completed with injury",
    x == 15, "Sexual assault without injury",
    x == 16, "Unwanted sexual contact without force",
    x == 17, "Assault without weapon without injury",
    x == 18, "Verbal threat of rape",
    x == 19, "Verbal threat of sexual assault",
    x == 20, "Verbal threat of assault",
    
    ##Purse snatching/pocket picking 21-23
    x == 21, "Completed purse snatching",
    x == 22, "Attempted purse snatching",
    x == 23, "Pocket picking (completed only)",
    
    ##Property Crimes
    x == 31, "Completed burglary, forcible entry",
    x == 32, "Completed burglary, unlawful entry without force",
    x == 33, "Attempted forcible entry",
    x == 40, "Completed motor vehicle theft",
    x == 41, "Attempted motor vehicle theft",
    x == 54, "Completed theft less than $10",
    x == 55, "Completed theft $10 to $49",
    x == 56, "Completed theft $50 to $249",
    x == 57, "Completed theft $250 or greater",
    x == 58, "Completed theft value NA",
    x == 59, "Attempted theft",
    default = NA_character_ )
}

recode_serious_crimes <- function(x){
  data.table::fcase(
    x %in% 1:4, "Rape/Sexual attack",
    x %in% 5:10, "Robbery",
    x %in% 11:13, "Assault",
    x %in% 14:20, "Minor violent crime",
    x %in% 21:23, "Purse snatch/pocket picking",
    x %in% 31:59, "Property crimes",
    default = NA_character_ )
}


get_race <- function(x){
  data.table::fcase(
    x == 1, "White",
    x == 2, "Black",
    x == 3, "American Indian",
    x == 4, "Asian",
    x == 5, "Hawaiian",
    x == 6, "White-Black",
    x == 7, "White-American Indian",
    x == 8, "White-Asian",
    x == 9, "White-Hawaiian",
    x == 10, "Black-American Indian",
    x == 11, "Black-Asian",
    x == 12, "Black-Hawaiian",
    x == 13, "American Indian-Asian",
    x == 14, "Asian-Hawaiian",
    x == 15, "White-Black-American Indian",
    x == 16, "White-Black-Asian",
    x == 17, "White-American Indian-Asian",
    x == 18, "White-Asian-Hawaiian",
    x == 19, "2 or 3 races",
    x == 20, "4 or 5 races",
    default = NA_character_)
}

get_race_category <- function(race, hisp){
  data.table::fcase(
    race %chin% c("White") & hisp == "Non-Hispanic", "White",
    race %chin% c("White") & hisp == "Hispanic", "Hispanic", 
    ##Hispanic getting at mestizos, i.e. Amerindian with Iberian (Spanish) admixture - 
    ## what most people think of as "Hispanic", but classified as "White"
    ##- ignoring Black and Asian hispanic.  There is a very small percentage of white hispanics with no amerindian ancestry who according to this 
    ## criterion should be classified as 'White', but its negligible.  
    race %chin% c("Black", "White-Black", "Black-Asian", "Black-Hawaiian", "Black-American Indian"), "Black",
    race %chin% c("Asian", "Hawaiian", "Asian-Hawaiian"), "Asian", ##Asian/PAcific Islander/Hawaiian 
    race %chin% c("American Indian"), "American Indian", 
    ##all others, it not missing, "Other"
    !is.na(race), "Other",
    default = NA_character_
  )
}


get_race_category_v2 <- function(race, hisp){
  data.table::fcase(
    hisp == "Hispanic", "Hispanic", 
    race %chin% c("White") & hisp == "Non-Hispanic", "White",
    race %chin% c("Black") & hisp == "Non-Hispanic", "Black",
    race %chin% c("Asian") & hisp == "Non-Hispanic", "Asian",
    !is.na(race), "Other",
    default = NA_character_
  )
}



process_ncvs_002_files <- function(dt){
  require(data.table)
  require(magrittr)
  
    dt[, .(
      IDHH,
      HH_WEIGHT = V2116,
      WGTHHCY, ##ADJUSTED HOUSEHOLD WEIGHT - COLLECTION YEAR
      
      hh_Race0 = data.table::fcase(
        V2031 == 1, "White", ##White, but not specifying Hispanic
        V2031 == 2, "Black",
        V2031 == 6, "White-Black",
        V2031 %in% c(3,21), "Other",
        V2031 == 22, "White-Other",
        V2031 == 23, "Black-Other",
        V2031 == 24, "White-Black-Other",
        default = NA_character_),
      pp_Race0 = get_race(V2040A),
      pp_Hisp = fcase(V2041 == 1, "Hispanic", V2041 == 2, "Non-Hispanic", default = NA_character_),
      ref_Race0 = get_race(V2049A),
      ref_Hisp = fcase(V2050 == 1, "Hispanic", V2050 == 2, "Non-Hispanic", default = NA_character_))] %>% 
  
    .[, `:=`(
      hh_Race = data.table::fcase(
        hh_Race0 %chin% c("White","White-Other"), "White/Hispanic",
        hh_Race0 %chin% c("Black", "Black-Other", "White-Black"), "Black", 
        ##Note:  due to social construction, etc.  White-Black is coded as Black, but this is negligible
        hh_Race0 %chin% c("Other", "White-Black-Other"), "Other",
        default = NA_character_),
      pp_Race = get_race_category(pp_Race0, pp_Hisp),
      ref_Race = get_race_category(ref_Race0, ref_Hisp)
  )]
  
}

process_ncvs_003_files <- function(dt){
  require(data.table)
  require(magrittr)
  
  dt[, .(
        YEARQ,
        IDHH, IDPER, 
        WGTPERCY,
        PP_weight = V3080,
        Num_incidents = V3081, ##Num Incidents
        Race0 = get_race(V3023A), 
        Hisp = data.table::fcase(V3024 == 1, "Hispanic", V3024 == 2, "Non-Hispanic", default = NA_character_))] %>% 
    
  .[, `:=` (
    Race = get_race_category(Race0, Hisp),
    Race_v2 = get_race_category_v2(Race0, Hisp)
    )]
}

process_ncvs_004_files <- function(dt){
  
  require(data.table)
  require(magrittr)
  
    ##  In 2015, incident weight is called V4527
    ## "Beginning in 2016, the incident weight (V4527) was adjusted to account for series victimizations and was renamed SERIES_IWEIGHT."
  if("V4527" %in% names(dt) & !"SERIES_IWEIGHT" %in% names(dt)){
    if(max(dt$V4527, na.rm=TRUE) > 0){
      data.table::setnames(dt, old = c("V4527"), new = c("SERIES_IWEIGHT"), skip_absent = FALSE) 
    }
  }
    
  
   ##Rule V4022 != 1 excludes cases outside US

   dt[V4022 != 1] %>% 
     .[, V4016 := fifelse(V4016 %in% 11:996, 10, V4016)] %>%  ##recode series, no greater than 10]
   
     .[, .(
          YEARQ,
          Year = V4015,
          Incident_num = V4012,
          Crime = get_crime(V4529),
          Crime_type = recode_serious_crimes(V4529),
          Crime_code = V4529,
          IDHH, 
          IDPER, 
          WGTVICCY = fifelse(V4019 == 2, WGTVICCY * V4016, WGTVICCY), ##adust for series
          SERIES_WEIGHT, 
          Incident_Weight = SERIES_IWEIGHT,
          
          off_Race0 = data.table::fcase(
            V4246C == 1 | V4281  == 1 | V4285A == 2, "Black",
            V4246B == 1 | V4280 == 1 | V4285A == 1, "White",
            V4246D == 1 | V4282A == 1 | V4285A == 3, "Native American",
            V4246E == 1 | V4282B == 1 | V4285A == 4, "Asian",
            V4246F == 1 | V4282C == 1 | V4285A == 5, "Hawaiian",
            default = NA_character_),
          
          ##Excluding multiple offenders of various races
          off_Race0_v2 = data.table::fcase(
            V4285A %in% 1:5, "Multiple",
            V4246C == 1 | V4281  == 1, "Black",
            V4246B == 1 | V4280 == 1,  "White",
            V4246D == 1 | V4282A == 1, "Native American",
            V4246E == 1 | V4282B == 1, "Asian",
            V4246F == 1 | V4282C == 1, "Hawaiian",
            default = NA_character_),
          
          off_Hisp = data.table::fcase(
            V4237A == 1 | V4252A == 1 | V4252B %in% c(1), "Hispanic",
            V4237A == 2 | V4252A == 2 | V4252B %in% c(2), "Non-Hispanic",
            default = NA_character_))] %>% 
    
    .[, `:=`(
          `Offender Race` = data.table::fcase(
            off_Race0 == "White" & off_Hisp == "Non-Hispanic", "White",
            off_Race0 == "White" & is.na(off_Hisp), "White", ## Counting White with missing Hispanic designation as White
            
            off_Race0 == "White" & off_Hisp == "Hispanic", "Hispanic",
            off_Race0 == "Black", "Black",
            off_Race0 %chin% c("Asian", "Hawaiian"), "Asian",
            off_Race0 == "Native American", "Native American",
            default = NA_character_),
          
          ##Excluding all hispanic; Other = Hawaiian, Native American
          `Offender Race v2` = data.table::fcase(
            off_Hisp == "Hispanic", "Hispanic",
            off_Race0 == "White" & off_Hisp == "Non-Hispanic", "White",
            off_Race0 == "Black" & off_Hisp == "Non-Hispanic", "Black",
            off_Race0 == "Asian" & off_Hisp == "Non-Hispanic", "Asian",
            off_Race0 %chin% c("Native American", "Hawaiian"), "Other",
            default = NA_character_),
          
          ##version 2 + excluding multiple

          `Offender Race v3` = data.table::fcase(
            off_Hisp == "Hispanic", "Hispanic",
            off_Race0_v2 == "White" & off_Hisp == "Non-Hispanic", "White",
            off_Race0_v2 == "Black" & off_Hisp == "Non-Hispanic", "Black",
            off_Race0_v2 == "Asian" & off_Hisp == "Non-Hispanic", "Asian",
            off_Race0_v2 == "Multiple", "Multiple Races",
            off_Race0_v2 %chin% c("Native American", "Hawaiian"), "Other",
            default = NA_character_)
          )]
}



merge_and_summarise_interracial <- function(dt_004, dt_003, keep_codes = 1:20, grouping_vars = c("Weight_type"), 
                                            filter_interracial = TRUE, filter_races = FALSE, keep_races = c("White", "Black")){
  require(data.table)
  
  weight_cols <- c("SERIES_WEIGHT", "Incident_Weight", "WGTPERCY", "UNWEIGHTED")
  
  dt <- dt_004[Crime_code %in% keep_codes] 
  dt <- data.table::merge.data.table(x = dt, y = dt_003, by = c("IDPER", "IDHH", "YEARQ"), all.x = TRUE) 
  dt <- dt[!is.na(`Offender Race`) & !is.na(Race)]
  dt[, UNWEIGHTED := 1]
  
  keep_vars <- unique(c(weight_cols, grouping_vars, "Crime", "Crime_code", "Race", "Offender Race"))
  keep_vars <- keep_vars[which(keep_vars != "Weight_type")]
  dt <- dt[, ..keep_vars]
  
  weight_cols <- weight_cols[which(weight_cols %in% names(dt))]
  dt <- data.table::melt.data.table(data = dt, measure.vars = weight_cols, variable.name = "Weight_type", value.name = "Weight")
  
  gVars <- unique(c("Weight_type", "Offender Race", "Race", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  
  dt <- dt[, .(`Victimizations` = sum(Weight, na.rm=TRUE)), by = gVars]
  
  data.table::setnames(dt, old = "Race", new = "Victim Race", skip_absent = FALSE)
  ##new variable identifying pairs - each cell contains a vector of two
  dt[, index := .I]
  
  gVars <- unique(c("Weight_type", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  dt[, `:=` (`Total Victimizations` = sum(Victimizations)), by = gVars] 
  
  gVars <- unique(c("Weight_type", "Offender Race", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  dt[, `:=` (`Total Offender Race` = sum(Victimizations)), by = gVars]
  
  gVars <- unique(c("Weight_type", "Victim Race", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  dt[, `:=` (`Total Victim Race` = sum(Victimizations)), by = gVars]

  dt[, Interracial_flag := ifelse(`Victim Race` == `Offender Race`, "No", "Yes")]

  
  gVars <- unique(c("Weight_type", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  
  ##FILTERING TO ONLY INTERRACIAL CRIMES
  if(isTRUE(filter_interracial)){
    dt <- dt[`Victim Race` != `Offender Race`]
  }
  dt[, `:=` (`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])), by = gVars]
  
  all_races <- unique(c(dt$`Offender Race`))
  all_pairs_list <- combn(all_races, 2, simplify = FALSE)
  
  for(i in seq_along(all_pairs_list)){
    race_pairs <- all_pairs_list[[i]]
    race1 <- race_pairs[1]; race2 <- race_pairs[2]
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1), 
       `Total Interracial Pairs` := sum(Victimizations), by = gVars]
    
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),  
       `Race Pair` := paste0(race1, "-", race2), by = index]
  }
  
  ##FILTERING TO ONLY PAIRS
  if(isTRUE(filter_interracial) & isTRUE(filter_races)){
    if(length(unique(keep_races) == 2)){
      race1 <-unique(keep_races)[1]; race2 <- unique(keep_races)[2]
      dt <- dt[ (`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1)]
    }
  }
  
  dt[, `:=` (
    `Percent of Total` = Victimizations/`Total Victimizations`,
    `Percent of Offender Race` = Victimizations/`Total Offender Race`,
    `Percent of Victim Race` = Victimizations/`Total Victim Race`,
    `Percent Interracial` = ifelse(Interracial_flag == "Yes", Victimizations/`Total Interracial`, NA_real_),
    `Percent Interracial Pairs` = ifelse(Interracial_flag == "Yes", Victimizations/`Total Interracial Pairs`, NA_real_))]
  
  ##last filter, removing 'Other' - including only Victim races that also appear as Offender Races and vice-versa
  u_races <- unique(dt$`Offender Race`)
  dt <- dt[`Victim Race` %chin% u_races]
  
  ##for ordering purposes in graphs, reordering race categories
  race_levels <- c("White", "Black", "Hispanic", "Asian", "Native American")
  race_levels <- race_levels[which(race_levels %in% u_races)]
  
  dt[, `:=`(
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels))]
  
  
  
  gVars <- unique(c("Weight_type", grouping_vars, "Offender Race", "Victim Race"))
  gVars <- gVars[which(gVars %in% names(dt))]
  data.table::setorderv(dt, cols = gVars)
  
  ##reorganize columns
  dt[, index := NULL]
  dt[is.na(`Race Pair`) & Interracial_flag == "No", `Race Pair` := `Victim Race`]
  
  ##rearranging columns
  keep_vars <- unique(c(gVars, "Race Pair", "Victimizations", 
                        "Total Victimizations", "Percent of Total",
                        "Total Victim Race", "Percent of Victim Race",
                        "Total Offender Race", "Percent of Offender Race",
                        "Percent of Offender Race", 
                        "Total Interracial",  "Percent Interracial",  
                        "Total Interracial Pairs", "Percent Interracial Pairs"))
  dt <- dt[, ..keep_vars]
  return(dt)
}


show_crime_summary_interracial <- function(df, grouping_vars = NULL, use_weight = c("series", "incident", "person", "unweighted")){
  require(data.table)
  require(reactable)
  
  use_weight = match.arg(use_weight)
  use_weight = switch(use_weight, "series" = "SERIES", 
                      "incident" = "INCIDENT", 
                      "person" = "WGTPERCY",
                      "unweighted" = "UNWEIGHTED")
  
  dt <- data.table::as.data.table(df)
  data.table::setnames(dt, skip_absent = TRUE,
                       old = c("Crime_type", "Weight_type"), 
                       new = c("Offense", "Weight"))
  
  dt <- dt[!is.na(`Offender Race`)]
  dt[, Weight := dplyr::recode(Weight, "Incident_Weight" = "INCIDENT", "SERIES_WEIGHT" = "SERIES")]
  
  if("Offense" %in% names(dt)){
    dt[, Offense := factor(Offense, levels = c("Robbery", "Assault", "Rape/Sexual attack"))]
  }
  
  dt <- dt[Weight == use_weight]
  dt[, Weight := NULL]
  
  order_vars <- c( "Offender Race","Victim Race","Race Pair", "Offense", "Year" )
  order_vars <- order_vars[which(order_vars %in% names(dt))]
  data.table::setorderv(dt, cols = order_vars)
  
  grouping_vars <- grouping_vars[which(grouping_vars %in% names(dt))]
  if(length(grouping_vars) == 0){grouping_vars <- NULL}
  
  col_names <- names(dt)
  new_names <- col_names %>% gsub("Percent ", "% ", .) %>% gsub("Victimizations", "Victims", .) %>% gsub("Interracial", "Inter-racial", .)
  names(dt) <- new_names

  reactable::reactable(dt, striped = TRUE, highlight = TRUE, bordered = TRUE, pagination = FALSE, compact = TRUE, resizable = TRUE, filterable = TRUE,
                       groupBy = NULL,
                       defaultColDef = colDef(
                         #minWidth = 85, 
                         maxWidth = 65,
                         align = "center",
                         style = list(fontSize = 11, fontFamily = "'Roboto Condensed', 'Arial Narrow', Merriweather,  Arial, Helvetica, sans-serif"),
                         format = list(cell = colFormat(separators = TRUE, percent = FALSE, digits = 0)),
                         headerStyle =  list(fontWeight = "bold", fontSize = 12),
                         footerStyle =  list(fontWeight = "bold"))
                       ,columns = list(
                         `Offender Race` = colDef(maxWidth = 75, minWidth = NULL, align = "left", aggregate = "unique"),
                         `Race Pair` = colDef(maxWidth = 90, minWidth = NULL, align = "left", aggregate = "unique"),
                         `Victim Race` = colDef(maxWidth = 65, minWidth = NULL, align = "left", aggregate = "unique"),
                         
                         `% of Total` = colDef(maxWidth = 50, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         `Total Victim Race` = colDef(maxWidth = 60),
                         `Total Inter-racial Pairs` = colDef(maxWidth = 55),
                         `Total Inter-racial` = colDef(maxWidth = 60),
                         `% of Victim Race` = colDef(maxWidth = 60, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         `% of Offender Race` = colDef(maxWidth = 75, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         `% Inter-racial` = colDef(maxWidth = 50, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         
                         `% Inter-racial Pairs` = colDef(maxWidth = 55, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2))
                       )
  )
  
  
  
}

Loading data…

##2015 Data

nc_15_003 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2015-36448/DS0003/36448-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_15_004 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2015-36448/DS0004/36448-0004-Data.tsv")) %>% process_ncvs_004_files(.)

##2016 Data

nc_16_003 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2016-36828/DS0003/36828-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_16_004 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2016-36828/DS0004/36828-0004-Data.tsv")) %>% process_ncvs_004_files(.)

##2018 Data

nc_18_003 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2018-37297/DS0003/37297-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_18_004 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2018-37297/DS0004/37297-0004-Data.tsv")) %>% process_ncvs_004_files(.)

2015 - Major Offenses

d_gg_15 <- merge_and_summarise_interracial(nc_15_004, nc_15_003, keep_codes = 1:13, grouping_vars = c("Weight_type")) 

show_crime_summary_interracial(d_gg_15, use_weight = "series")

2016 - Major Offenses

d_gg_16 <- merge_and_summarise_interracial(nc_16_004, nc_16_003, keep_codes = 1:13, grouping_vars = c("Weight_type")) 

show_crime_summary_interracial(d_gg_16, use_weight = "series")

2018 - Major Offenses

d_gg_18 <- merge_and_summarise_interracial(nc_18_004, nc_18_003, keep_codes = 1:13, grouping_vars = c("Weight_type")) 

show_crime_summary_interracial(d_gg_18, use_weight = "series")

Appendix: Codebook Notes

A. Race Codes

Victim Race Codes (Column V3023A found in Victim files XXXXX-0003-Data.tsv:

  1. White only
  2. Black only
  3. American Indian
  4. Asian only
  5. Hawaiian
  6. White-Black
  7. White-American Indian
  8. White-Asian
  9. White-Hawaiian
  10. Black-American Indian
  11. Black-Asian
  12. Black-Hawaiian
  13. American Indian-Asian
  14. Asian-Hawaiian
  15. White-Black-American Indian
  16. White-Black-Asian
  17. White-American Indian-Asian
  18. White-Asian-Hawaiian
  19. 2 or 3 races
  20. 4 or 5 races

Offender Race/Ethnicity Codes (found in offender files XXXXX-0004-Data.tsv:

  • V4246A - SINGLE OFFENDER RACE MISSING FLAG - 1 = Identified Race, 9 = No race info

  • V4279 - MULTIPLE OFFENDER RACE MISSING FLAG -

  • V4283 - MULTIPLE OFFENDERS: DON’T KNOW

  • V4247 - SINGLE OFF ONLY CRIME AGAINST RESP OR HH

    • “Was this the only time this offender committed a crime or made threats against you or your household?”
    • 1 - Yes (only time)
    • 2 - No (there were other times)
    • 3 - Don’t know
  • V4246C - SINGLE OFFENDER RACE: BLACK OR AFRICAN AMERICAN

  • V4281 - MULT OFF RACE: BLACK

  • V4246B - SINGLE OFFENDER RACE: WHITE

  • V4280 - MULT OFF RACE: WHITE

  • V4237A - SINGLE OFFENDER HISPANIC/LATINO

  • V4252A - MULTIPLE OFFENDERS HISPANIC/LATINO

  • V4246D - SINGLE OFFENDER RACE: AMERICAN INDIAN OR ALASKA NATIVE

  • V4282A - MULT OFF RACE: AMERICAN INDIAN OR ALASKA NATIVE

  • V4246E - SINGLE OFFENDER RACE: ASIAN

  • V4282B - MULT OFF RACE: ASIAN

  • V4246F - SINGLE OFFENDER RACE: NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER

  • V4282C - MULT OFF RACE: NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER

  • V4285A - MULTIPLE OFFENDER RACE OF MOST: 1 = “mostly white”; 2 = “mostly black..”, 3 = "mostly Nat Am, 4 = mostly Asian; 5 = Mostly Native Hawaiian or Pacific Islander

  • V4252B - MULTIPLE OFFENDERS HISPANIC/NON-HISPANIC: 1 = “Mostly Hispanic or Latino”; 2 = “Mostly non-Hispanic”; 3 = “Equal number of Hispanic and non-Hispanic”; 4 = “Don’t know”

B. Weights

“INCIDENTS VS. VICTIMIZATIONS: For personal crimes, the NCVS makes a basic distinction between”incidents" and “victimizations.” The number of victimizations reflects how many criminal acts were experienced by NCVS survey respondents; the number of incidents reflects the number of criminal acts committed against NCVS respondents and others present during such incidents, as reported by survey respondents. If there was more than one victim, the incident weight is adjusted to compensate for the possibility that the incident could be reported several times by multiple victims and thus be over-counted."

“To calculate a victimization count to match recent BJS reports, users should use variable ‘SERIES_WEIGHT’”

  • SERIES_WEIGHT = WGTVICCY, where IF V4019 = 2 THEN WGTVICCY = WGTVICCY * v4016, ELSE WGTVICCY = WGTVICCY.

C. Crime Codes - V4529

  • TYPES OF CRIME, LISTED IN ORDER OF SERIOUSNESS (The Seriousness Hierarchy) p. 579 2018

  • ALL VIOLENT CRIMES - 1-20

  • MAJOR VIOLENT CRIMES - 1,2,3,5,6,7,8,9,10,11,12,13

  1. Completed rape

  2. Attempted rape

  3. Sexual attack with serious assault

  4. Sexual attack with minor assault

  5. Completed robbery with injury from serious assault

  6. Completed robbery with injury from minor assault

  7. Completed robbery without injury from minor assault

  8. Attempted robbery with injury from serious assault

  9. Attempted robbery with injury from minor assault

  10. Attempted robbery without injury

  11. Completed aggravated assault with injury

  12. Attempted aggravated assault with weapon

  13. Threatened assault with weapon

  14. Simple assault completed with injury

  15. Sexual assault without injury

  16. Unwanted sexual contact without force

  17. Assault without weapon without injury

  18. Verbal threat of rape

  19. Verbal threat of sexual assault

  20. Verbal threat of assault

D. Misc.

  • “To be consistent with BJS publications for all years, exclude crimes committed outside the U.S.” (2018 - p. 595)

    • V4022 - IN WHAT CITY, TOWN, VILLAGE - 1 = “Outside U.S.”
    • Rule: exclude if V4022 = 1, 9 or is missing.
  • “To be consistent with BJS publications from 2012 on … include series crimes…. Include up to 10 occurrences in a series crime”

    • v4016 - HOW MANY TIMES INCIDENT OCCUR LAST 6 MOS
    • Rule: Recode to 10 values of V4016 from 11 to 996
  • Person Victimization Counts consistent with BJS publications 2012+

    • V4019 - CHECK ITEM D: ENOUGH DETAIL TO DISTINGUISH INCIDENTS

      • 1 = Yes (not a “series”)
      • 2 = No (is a “series”)
    • Rule: If V4019 = 2, THEN WGTVICCY = WGTVICCY * v4016.

    • SERIES_WEIGHT and SERIES_IWEIGHT already take this into account.

  • V4015 - YEAR INCIDENT OCCURRED (for 2018 file, either 2017 or 2018)

  • YEARQ - YEAR AND QUARTER OF INTERVIEW.

  • V4012 - Incident number. (Accounts for most duplicated IDs in _004 files)

    • “When multiple incidents are reported in response to a single screen question, this variable indicates which of these incidents is being discussed”

II. Interracial Homicides (2015-2019)

Homicide data containing race/ethnicity data for both victims and offenders can be found in the FBI’s Expanded Homicide Data Tables. The code below downloads and processes the .csv files available at the following URL’s:

Functions

###HOMICIDES 2018
##Expanded Homicide Data Table 6
##Race, Sex, and Ethnicity of Victim by Race, Sex, and Ethnicity of Offender, 2018

##function to format homicide tables
format_homicide_tables <- function(url){
  require(htmltab)
  year <- as.numeric(substr(url, start = 38, stop = 41))
  d_url <- htmltab(doc = url, which = "//*[@id='tablecontainer']")
  
  d <- d_url[c(1:4,12),c(1:6,10)]
  
  new_names <- dplyr::case_when( 
    grepl("Black", names(d), ignore.case = TRUE) ~ "Black",
    grepl("White", names(d), ignore.case = TRUE) ~ "White/Hisp",
    grepl("Other", names(d), ignore.case = TRUE) ~ "Other",
    grepl("Unknown", names(d), ignore.case = TRUE) ~ "Unknown",
    grepl("Hispanic|Latino", names(d), ignore.case = TRUE) ~ "Hispanic",
    grepl("victim", names(d), ignore.case = TRUE) ~ "Victim Race",
    TRUE ~ trimws(names(d)))
  
  names(d) <- new_names
  
  d <- d %>% 
    ##remove commas and convert to numeric
    dplyr::mutate_at(2:7, list(~ as.numeric(gsub(",", "", .)))) %>% 
    dplyr::mutate(
      White = `White/Hisp` - Hispanic,
                  
      `Victim Race` = dplyr::case_when( 
        grepl("Black", `Victim Race`, ignore.case = TRUE) ~ "Black",
        grepl("White", `Victim Race`, ignore.case = TRUE) ~ "White/Hisp",
        grepl("Other", `Victim Race`, ignore.case = TRUE) ~ "Other",
        grepl("Unknown", `Victim Race`, ignore.case = TRUE) ~ "Unknown",
        grepl("Hispanic|Latino", `Victim Race`, ignore.case = TRUE) ~ "Hispanic",
        TRUE ~ trimws(`Victim Race`))) %>% 
    
    tidyr::pivot_longer(-c(`Victim Race`, Total), names_to = "Offender Race", values_to = "Homicides") 
  
    ##to isolate the White from the White/Hispanic for Victims...
    d_wh <- d %>% dplyr::filter(`Offender Race` != "White/Hisp") %>% 
      dplyr::select(-Total) %>% 
      tidyr::pivot_wider(names_from = `Victim Race`, 
                         values_from = Homicides) %>% 
      dplyr::mutate(White = `White/Hisp` - Hispanic) %>% 
      tidyr::pivot_longer(-`Offender Race`, 
                          names_to = "Victim Race", 
                          values_to = "Homicides") %>% 
      dplyr::filter(`Victim Race` == "White") %>% 
      dplyr::group_by(`Victim Race`) %>% 
      dplyr::mutate(Total = sum(Homicides))
      dplyr::bind_rows(d, d_wh) %>% 
        dplyr::mutate(Year = year)
  
}

url_2015 <- "https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/expanded_homicide_data_table_6_murder_race_and_sex_of_vicitm_by_race_and_sex_of_offender_2015.xls"
url_2016 <- "https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/tables/expanded-homicide-data-table-3.xls"
url_2017 <- "https://ucr.fbi.gov/crime-in-the-u.s/2017/crime-in-the-u.s.-2017/tables/expanded-homicide-data-table-6.xls"
url_2018 <- "https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-6.xls"
url_2019 <- "https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/expanded-homicide-data-table-6.xls"

homicides <- purrr::map_dfr(c(url_2015, url_2016, url_2017, url_2018, url_2019), 
                            format_homicide_tables)

Tables

d_h <- homicides %>% 
  ##collapse other and unknown together
  dplyr::mutate(`Victim Race` = dplyr::recode(`Victim Race`, 
                                              "Unknown" = "Other"), 
                `Offender Race` = dplyr::recode(`Offender Race`, 
                                                "Unknown" = "Other")) %>% 
  dplyr::group_by(`Victim Race`, `Offender Race`) %>% 
  dplyr::summarise(Homicides = sum(Homicides)) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black", "Hispanic", "Other"), 
                `Offender Race` %in% c("White", "Black", "Hispanic", "Other")) %>% 
  dplyr::group_by(`Victim Race`) %>% 
  dplyr::mutate(`Total Homicides` = sum(Homicides)) %>% 
  dplyr::ungroup() %>% 
  dplyr::mutate(
    Percent = Homicides/`Total Homicides`, 
    Label = paste0(signs::signs(Homicides, 
                                format = scales::comma, 
                                accuracy = 1), 
                   " (", 
                   signs::signs(Percent, format = scales::percent), ")")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Label, `Total Homicides`) %>% 
  tidyr::pivot_wider(names_from = `Offender Race`, 
                     values_from = Label) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(`Victim Race` = factor(`Victim Race`, 
                                       levels = c("White",  "Black",  
                                                  "Hispanic", "Other"))) %>% 
  dplyr::arrange(`Victim Race`) %>% 
  dplyr::select(`Victim Race`, White, Black, Hispanic, Other, `Total Homicides`) %>% 
  dplyr::mutate(`Total Homicides` = paste0(signs::signs(`Total Homicides`, 
                                                        format = scales::comma, accuracy = 1)))

gt(d_h) %>% 
    tab_header(
    title = "Homicides by Race/Ethnicity of Victims and Offenders (2015-2019)"
    #subtitle = ""
  ) %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "center"),
    locations = cells_body()
  ) %>% 
  
  tab_options(
    footnotes.font.size = 11
  ) %>% 

  
  tab_spanner(
    label = "Offender Race",
    columns = vars(White, Black, Hispanic, Other)) %>% 
  tab_footnote(
    footnote = "Non-Hispanic white victim and offender counts estimated by subtracting Hispanic ethnicity counts from original White counts including both Hispanic and Non-Hispanic Whites.", locations = cells_column_labels(vars(White))) %>%
  
  tab_footnote(
    footnote = "Total homicides summed across years 2015-2019. These tables are only include incidents where some information about the offender is known by law enforcement.  The actual victim counts by race are roughly double what is shown here.", 
    locations = cells_column_labels(vars(`Total Homicides`))) %>%

  tab_source_note(source_note = md("Source:  FBI's Expanded Homicide Data Tables. [2015 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/expanded_homicide_data_table_6_murder_race_and_sex_of_vicitm_by_race_and_sex_of_offender_2015.xls), [2016 Table-3](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/tables/expanded-homicide-data-table-3.xls), [2017 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2017/crime-in-the-u.s.-2017/tables/expanded-homicide-data-table-6.xls), [2018 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-6.xls), [2019 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/expanded-homicide-data-table-6.xls)")) 
Homicides by Race/Ethnicity of Victims and Offenders (2015-2019)
Victim Race Offender Race Total Homicides2
White1 Black Hispanic Other
White 9,277 (75.197%) 1,963 (15.911%) 692 (5.609%) 405 (3.283%) 12,337
Black 740 (5.162%) 12,751 (88.950%) 476 (3.321%) 368 (2.567%) 14,335
Hispanic 562 (12.461%) 726 (16.098%) 3,029 (67.162%) 193 (4.279%) 4,510
Other 378 (22.487%) 308 (18.322%) 150 (8.923%) 845 (50.268%) 1,681
Source: FBI's Expanded Homicide Data Tables. 2015 Table-6, 2016 Table-3, 2017 Table-6, 2018 Table-6, 2019 Table-6

1 Non-Hispanic white victim and offender counts estimated by subtracting Hispanic ethnicity counts from original White counts including both Hispanic and Non-Hispanic Whites.

2 Total homicides summed across years 2015-2019. These tables are only include incidents where some information about the offender is known by law enforcement. The actual victim counts by race are roughly double what is shown here.