2020-10-27

For data pertaining to violent offending by race see also: https://rpubs.com/johnbradford/violent-crime

Census Data

Below I obtain population data from the US Census to calculate population rates of interracial violent crime. To be consistent with the original NCVS excel file on the BJS website, population data for White, Black, and Asian are all non-Hispanic. I use the tidycensus R package to download the population data from the 2014-2018 5-Year American Community Survey.

library(tidyverse)
library(tidycensus)
library(gt)


##CENSUS DATA 
##American Community Survey (ACS)
#acs_vars <- tidycensus::load_variables(2018, "acs5")

census_data <- tidycensus::get_acs(geography = "us", geometry = FALSE, year = 2018, survey = "acs5",
          variables = c(Total = "B01003_001",
                        White = "B03002_003",
                        Black = "B03002_004",
                        Asian = "B03002_006",
                        `Native American` = "B03002_005",
                        Hawaiian = "B03002_007",
                        Hispanic = "B03002_012")) %>% 
  dplyr::select(variable, estimate) %>% 
  tidyr::pivot_wider(names_from = variable, values_from = estimate) %>% 
  tidyr::pivot_longer(-Total, names_to = "Race", values_to = "Population") 



##display in table

census_data %>% 
  dplyr::mutate(Percent = Population/Total, 
                Race = factor(Race, levels = c("White", "Hispanic", "Black", "Asian", "Native American", "Hawaiian"))) %>% 
  dplyr::select(-Total) %>% 
  dplyr::arrange(Race) %>% 
  gt::gt(.) %>% 
    tab_header(
    title = "Population by Race",
    subtitle = "US Census 5-Year American Community Survey (2014-2018)"
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2),
    rows = NULL,
    decimals = 2,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = TRUE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  fmt_percent(
    data = .,
    columns = c(3),
    placement = "right",
    decimals = 2,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  
  tab_style(
    style = cell_text(size = "medium", align = "center", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 1:3))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 1), cells_column_labels(columns = 1))
  ) %>% 
  
  tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_column_labels(columns = 1:3))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 2:3), 
                     cells_title(groups = c("title")))
  ) %>% 
  tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 
  tab_source_note(source_note = md("White, Black, Asian, Native American, and Native Hawaiian include only non-Hispanic; Hispanic estimate includes all races."))

Population by Race
US Census 5-Year American Community Survey (2014-2018)
Race	Population	Percent
White	197.18M	61.07%
Hispanic	57.52M	17.81%
Black	39.72M	12.30%
Asian	17.37M	5.38%
Native American	2.14M	0.66%
Hawaiian	525.10K	0.16%
White, Black, Asian, Native American, and Native Hawaiian include only non-Hispanic; Hispanic estimate includes all races.

I. Nonfatal Interracial Violence

Data pertaining to interracial violent victimizations (excluding murder) can be obtained from the National Crime Victimization Survey (NCVS). The Bureau of Justice Statistics (BJS) provides the following description on their website:

“The Bureau of Justice Statistics’ (BJS) National Crime Victimization Survey (NCVS) is the nation’s primary source of information on criminal victimization. Each year, data are obtained from a nationally representative sample of about 240,000 interviews on criminal victimization, involving 160,000 unique persons in about 95,000 households…. The NCVS collects information on nonfatal personal crimes (i.e., rape or sexual assault, robbery, aggravated and simple assault, and personal larceny) … both reported and not reported to police.”

The BJS has made available on their website data pertaining to interracial violence for 2018 and 2019, which can be downloaded as excel files.

Criminal Victimization, 2019, click the link Data tables to download the zipped excel files. Data pertaining to interracial violence can be found in Table 15 (file cv19t15.csv).
Criminal Victimization, 2018, click the link Data tables to download the zipped excel files. Data pertaining to interracial violence can be found in Table 14 (file cv18t14.csv).

Unfortunately, the BJS has not published analogous files for earlier versions of the survey from 2015 and 2016, and the format of the data they do publish changes from year to year. However, the raw data can be obtained from the Inter-university Consortium for Political and Social Research (ICPSR). The drawback is that these data are not easily accessible to the general public. The files are huge, require code books to interpret, and are generally unmanageable without some prior experience in coding and/or data analysis.

My presentation of the data available on the web is completely reproducible in R. The code below will automatically download, unzip, read and format the comma separated values (csv) files contained on the BJS website. My presentation of the ICPSR data is, in contrast, mostly reproducible, but requires that you first download the .tsv formatted files from the following links:

National Crime Victimization Survey, 2015 (ICPSR 36448) - https://doi.org/10.3886/ICPSR36448.v1
National Crime Victimization Survey, [United States], 2016: Revised Version (ICPSR 37296) - https://doi.org/10.3886/ICPSR37296.v1
National Crime Victimization Survey, [United States], 2018 (ICPSR 37297) - https://doi.org/10.3886/ICPSR37297.v1

For practitioners of R, note that because the files are so large, I load and process the files using the data.table package, which might not be as familiar as the tidyverse or base R. For more information on data.table syntax, see: https://rdatatable.gitlab.io/data.table/index.html .

A. Online Data (2018, 2019)

All of the following tables and graphs are derived from the National Crime Victimization Survey (NCVS) available on the BJS website at Criminal Victimization, 2019 and Criminal Victimization, 2018.

Functions

process_online_ncvs_data <- function(df){
 
  require(dplyr)
  require(tidyr)
  require(data.table)
  
  dt <- data.table::as.data.table(df)
  dt[, `:=` (`Total Victimizations` = sum(Victimizations))]
  dt[, `:=` (`Total Offender Race` = sum(Victimizations)), by = .(`Offender Race`)]
  dt[, index := .I]
  dt[, Interracial_flag := ifelse(`Victim Race` == `Offender Race`, "No", "Yes")]
  all_races <- unique(c(dt$`Victim Race`))
  all_pairs_list <- combn(all_races, 2, simplify = FALSE)
  
  
  for(i in seq_along(all_pairs_list)){
    race_pairs <- all_pairs_list[[i]]
    race1 <- race_pairs[1]; race2 <- race_pairs[2]
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1), 
       `Total Interracial Pairs` := sum(Victimizations)]
    
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),  
       `Race Pair` := paste0(race1, "-", race2), by = index]
  }
  
  
  dt[, `:=` (
    `Percent of Total` = Victimizations/`Total Victimizations`,
    `Percent of Offender Race` = Victimizations/`Total Offender Race`,
    `Percent of Victim Race` = Victimizations/`Total Victim Race`,
    `Percent Interracial` = ifelse(Interracial_flag == "Yes", 
                                   Victimizations/`Total Interracial`, 
                                   NA_real_),
    `Percent Interracial Pairs` = ifelse(Interracial_flag == "Yes", 
                                         Victimizations/`Total Interracial Pairs`, 
                                         NA_real_))]
  
  race_levels <- c("White", "Black", "Hispanic", "Asian")
  race_levels <- race_levels[which(race_levels %in% 
                                     unique(dt[, c(`Offender Race`, 
                                                   `Victim Race`)]))]
  dt <- dt[`Offender Race` %in% race_levels & `Victim Race` %in% race_levels]
  dt[, `:=`(
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels))]
  
  gVars <- c("Offender Race", "Victim Race")
  gVars <- gVars[which(gVars %in% names(dt))]
  data.table::setorderv(dt, cols = gVars)
  
  ##reorganize columns
  dt[, index := NULL]
  dt[is.na(`Race Pair`) & Interracial_flag == "No", `Race Pair` := `Victim Race`]
  
  ##rearranging columns
  keep_vars <- unique(c(gVars, "Race Pair", "Victimizations", 
                        "Total Victimizations", "Percent of Total",
                        "Total Victim Race", "Percent of Victim Race",
                        "Total Offender Race", "Percent of Offender Race",
                        "Percent of Offender Race", 
                        "Total Interracial",  "Percent Interracial",  
                        "Total Interracial Pairs", "Percent Interracial Pairs"))
  dt <- dt[, ..keep_vars]
  
  ## I relabel some columns to save space
  col_names <- names(dt)
  new_names <- col_names %>% gsub("Percent ", "% ", .) %>% 
    gsub("Victimizations", "Victims", .) %>% 
    gsub("Interracial", "Inter-racial", .)
  names(dt) <- new_names

  dt

  
}

Get Data

2019 Data

## Online BJS Files
##National Crime Victimization Survey


library(tidyverse)
library(data.table)


## 2019 Data
temp <- tempfile()
download.file("https://www.bjs.gov/content/pub/sheets/cv19.zip", temp)


ncvs_t15 <- readr::read_csv(unz(temp, filename="cv19t15.csv"), 
                            skip = 11, n_max = 4) %>% 
  dplyr::select(-matches("X")) %>% 
  dplyr::rename(`Victim Race` = `Victim race/ethnicity`) %>% 
  dplyr::rename_all(., list(~gsub("\\/.*", "", .))) %>% 
  dplyr::mutate(`Victim Race` = gsub("\\/.*", "", `Victim Race`)) %>% 
  dplyr::rename( Total = `Number of violent incidents`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black", "Hispanic", "Other"))

dt_ncvs_19 <- ncvs_t15 %>% 
    dplyr::rename(`Total Victim Race` = Total) %>% 
    tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), 
                        names_to = "Offender Race", 
                        values_to = "Victimizations") %>% 
    dplyr::filter(`Offender Race` != "Total") %>% 
    dplyr::mutate(
Percent = Victimizations/`Total Victim Race`,
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])
) %>% 
  ##function defined above
  process_online_ncvs_data(.)

2018 Data

## Online BJS Files
##National Crime Victimization Survey


library(tidyverse)
library(data.table)


temp <- tempfile()
download.file("https://www.bjs.gov/content/pub/sheets/cv18.zip", temp)


ncvs_t14 <- readr::read_csv(unz(temp, filename="cv18t14.csv"), 
                            skip = 11, n_max = 4) %>% 
  dplyr::select(-matches("X")) %>% 
  dplyr::rename(`Victim Race` = `Victim race/ethnicity`) %>% 
  dplyr::mutate(`Asian/a` = as.numeric(gsub("<", "", `Asian/a`))) %>% 
  dplyr::select(-Total) %>% 
  dplyr::rename_all(., list(~gsub("\\/.*", "", .))) %>% 
  dplyr::mutate_at(3:8, ~./100) %>% 
  dplyr::mutate(`Victim Race` = gsub("\\/.*", "", `Victim Race`)) %>% 
  dplyr::rename(`Multiple` = `Multiple offenders of various races`) %>% 
  dplyr::mutate( Victimizations = `Number of violent incidents`) %>% 
  dplyr::select(-`Number of violent incidents`)


dt_ncvs_18 <- ncvs_t14 %>% 
    dplyr::rename(`Total Victim Race` = Victimizations) %>% 
    tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), names_to = "Offender Race", values_to = "Percent") %>% 
    dplyr::filter(`Offender Race` != "Total") %>% 
    dplyr::mutate(
Victimizations = `Total Victim Race` * (Percent),
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])
) %>% 

  ##function defined above
  process_online_ncvs_data(.)

2019

library(gt)
library(extrafont)

gt(ncvs_t15) %>% 
  tab_header(
    title = "Nonfatal Violent incidents by victim and offender race (2019)"
    #subtitle = ""
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2:6),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 

  tab_spanner(
    label = "Offender Race",
    columns = vars(White, Black, Hispanic, Other)) %>% 
  tab_footnote(
    footnote = "Includes Asians, Native Hawaiians and Other Pacific Islanders, American Indians and Alaska Natives, persons of two or more races, and multiple offenders of various races.",
    locations = cells_column_labels(vars(Other))) %>%
  
  tab_footnote(
    footnote = "If the victim perceived any of the offenders in a multiple offender incident to be of Hispanic origin, they are classified as Hispanic.",
    locations = cells_column_labels(vars(Hispanic))) %>%
  
  tab_footnote(
    footnote = "Excludes persons of Hispanic origin",
    locations = cells_column_labels(vars(Black, White))) %>%
  
  tab_options(
    footnotes.font.size = 11
  ) %>% 
  
  
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 2:6))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 1))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:6), 
                     cells_column_spanners(spanners ="Offender Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 
  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2019.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686)   \nFilename: cv19t15.csv"))

Nonfatal Violent incidents by victim and offender race (2019)

Victim Race	Total	Offender Race
Victim Race	Total	White¹	Black¹	Hispanic²	Other³
White	2,796,710	1,722,230	472,570	463,520	138,380
Black	494,610	89,980	346,260	43,730	14,640
Hispanic	774,310	170,840	249,030	334,600	19,840
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019. Filename: cv19t15.csv
¹ Excludes persons of Hispanic origin ² If the victim perceived any of the offenders in a multiple offender incident to be of Hispanic origin, they are classified as Hispanic. ³ Includes Asians, Native Hawaiians and Other Pacific Islanders, American Indians and Alaska Natives, persons of two or more races, and multiple offenders of various races.

library(gt)


m <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(`Total Offending` = `White` + `Black` + `Hispanic`) %>% 
  dplyr::ungroup()



gt(m) %>% 
  tab_header(
    title = "Nonfatal Violent Victimizations by Race of Offender and Victim (2019)"
    #subtitle = "Transposed"
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2:5),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  tab_spanner(
    label = "Victim Race",
    columns = vars(White, Black, Hispanic)) %>% 
  
  summary_rows(
    groups = NULL,
    columns = vars(`White`,`Black`, `Hispanic`, `Total Offending`),
    fns = list(`Total` = ~sum(.x, na.rm=TRUE)),
     decimals = 0,
    use_seps = TRUE,
    sep_mark = ",") %>% 
  
  tab_options(
    footnotes.font.size = 11
  ) %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 1:5), 
                     cells_stub(),
                     cells_grand_summary(columns = 1:5, rows = "Total"))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 2))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:5), 
                     cells_column_spanners(spanners ="Victim Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 

  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2019.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046)   \nFilename: cv19t15.csv"))

Nonfatal Violent Victimizations by Race of Offender and Victim (2019)

	Offender Race	Victim Race			Total Offending
	Offender Race	White	Black	Hispanic	Total Offending
	White	1,722,230	89,980	170,840	1,983,050
	Black	472,570	346,260	249,030	1,067,860
	Hispanic	463,520	43,730	334,600	841,850
Total	—	2,658,320	479,970	754,470	3,892,760
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019. Filename: cv19t15.csv

Keep in mind that these figures include only victimizations for which the race of both the offender and victim are known and that the 2019 BJS table includes only White, Black, and Hispanic as categories. Asians, which were included in 2018 as a separate category, are removed and lumped with ‘Other’. The column definitions are as follows:

Victims: frequency of victimizations involving Offender Race and Victim Race.
Total Victim Race: sum of victimizations involving Victim Race across all offender races.
Total Offender Race: sum of victimizations perpetrated by Offender Race across all victim races.
Total Inter-racial: sum of victimizations in which the race of offender and the race victim are different.
Total Inter-racial Pairs: sum of interracial victimizations involving the two races as either victim or offender.

For example, reading across the second row involving White offenders and Black victims, we see that these involved only 2.21% of all victimizations (in which the race of offender and victim are both known); 18% of victimizations of Black victims; 4.5% of victimizations involving White offenders; 5.4% of all interracial violent victimizations; and approximately 16% of violent interracial victimizations between Blacks and Whites, the other 84% involving Black offenders and White victims.

library(reactable)


reactable::reactable(dt_ncvs_19, 
                     striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, pagination = FALSE, 
                     compact = TRUE, resizable = TRUE, 
                     filterable = TRUE,
                    groupBy = NULL,
  ##Default formatting
   defaultColDef = colDef(
     #minWidth = 85, 
     maxWidth = 65,
     align = "center",
     style = list(fontSize = 11, 
                  fontFamily = "'Roboto Condensed', 
                  'Arial Narrow', Merriweather,  Arial, 
                  Helvetica, sans-serif"),
     format = list(cell = colFormat(separators = TRUE, 
                                    percent = FALSE, 
                                    digits = 0)),
     headerStyle =  list(fontWeight = "bold", fontSize = 12),
     footerStyle =  list(fontWeight = "bold"))
                     
  ,columns = list(
     `Offender Race` = colDef(maxWidth = 75, 
                              minWidth = NULL, 
                              align = "left", 
                              aggregate = "unique"),
     `Race Pair` = colDef(maxWidth = 90, 
                          minWidth = NULL, 
                          align = "left", 
                          aggregate = "unique"),
     `Victim Race` = colDef(maxWidth = 65, 
                            minWidth = NULL, 
                            align = "left", 
                            aggregate = "unique"),
     
     `% of Total` = colDef(maxWidth = 50, 
                           align = "center", 
                           format = colFormat(separators = FALSE,
                                              percent = TRUE, 
                                              digits = 2)),
     `Total Victim Race` = colDef(maxWidth = 60),
      `Total Inter-racial Pairs` = colDef(maxWidth = 55),
      `Total Inter-racial` = colDef(maxWidth = 60),
     `% of Victim Race` = colDef(maxWidth = 60, 
                                 align = "center", 
                                 format = colFormat(separators = FALSE, 
                                                    percent = TRUE, 
                                                    digits = 2)),
     `% of Offender Race` = colDef(maxWidth = 75, 
                                   align = "center", 
                                   format = colFormat(separators = FALSE, 
                                                      percent = TRUE, 
                                                      digits = 2)),
     `% Inter-racial` = colDef(maxWidth = 50, 
                               align = "center", 
                               format = colFormat(separators = FALSE, 
                                                  percent = TRUE, 
                                                  digits = 2)),
    
     `% Inter-racial Pairs` = colDef(maxWidth = 55, 
                                     align = "center", 
                                     format = colFormat(separators = FALSE, 
                                                        percent = TRUE, 
                                                        digits = 2))
       )
     )

Converting from long to wide format…

m_19 <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(
    Victims = round(Victims),
    `Victim Race` = paste0(`Victim Race`, " Victims")) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(
    `Total Offending` = `White Victims` + `Black Victims` + `Hispanic Victims`, 
                
    `% of Victims` = list(c(`White Victims`,`Black Victims`, `Hispanic Victims`))
                
    # ,`Victim Counts` = list(c(`White Victims`,`Black Victims`,
    #                                 `Hispanic Victims`,`Asian Victims`))
 ) 


bar_colors <- c("#56B4E9",  "#000000", "#D55E00")
bar_color_list <- list(bar_colors, bar_colors, bar_colors, bar_colors)

reactable::reactable(m_19, striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, compact = FALSE, resizable = FALSE,
                     
 # rowStyle = function(index) {if (index == 5)  list(fontWeight = "bold")},
 # rowClass = function(index) {if (index == 5) {"bold"}},
                     
 defaultColDef = colDef(
   format = colFormat(separators = TRUE, digits = 0), 
   footerStyle =  list(fontWeight = "bold"),
   footer = function(values) {
      if (!is.numeric(values)) return()
      sparkline(values, type = "pie", width = 80, height = 30, 
                sliceColors = c("#56B4E9",  "#000000", "#D55E00"))
    }),
 
   columns = list(
     `Offender Race` = colDef(footer = "% of Offenders"),
     `% of Victims` = colDef(cell = function(values) {
        if (!is.numeric(values)) return()
        sparkline(values, type = "pie", width = 80, height = 30, 
                  sliceColors = c("#56B4E9",  "#000000", "#D55E00"))
      })
   # ,`Victim Counts` = colDef(cell = function(values, index) {
   #      sparkline(values, type = "bar", 
   #                barWidth = 10,
   #                #barColor = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"),
   #                barColor = bar_colors[[index]],
   #                stackedBarColor = bar_colors[[index]],
   #                chartRangeMin = 0, 
   #                chartRangeMax = unlist(m$`Victim Counts`[[index]])
   #                )
   #    })
   )
)

Violent Victimizations by Race of Offender and Victim (2019)

#devtools::install_github("mattflor/chorddiag")
library(chorddiag)

m <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims)

matrix_names <- list(Offender = m$`Offender Race`, Victim = names(m)[-1])

m <- as.matrix(m[,-1])
dimnames(m) <- matrix_names


groupColors <- c("#56B4E9",  "#000000", "#D55E00")
chorddiag(m, groupColors = groupColors, groupThickness = 0.1, groupPadding = 2, groupnamePadding = 30)

library(plotly)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"))
 

g <- ggplot(d_gg, 
            aes(x = `Victim Race`, 
                y = Victims, 
                fill = `Offender Race`)) +
  geom_col(alpha = 0.75, aes(text = Label)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Violent incidents by victim and offender race (2019)") +
  labs(
    x = "Victim Race", 
    y = "Number of violent incidents") +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed") 


sources_label <- "Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>"


p <- plotly::ggplotly(g, width = NULL, height = NULL, #plotly_graph_heights, 
                      tooltip = c("text"), dynamicTicks = FALSE) %>% 
    plotly::config(scrollZoom = FALSE) %>% plotly::layout(dragmode='pan') %>% 
    #plotly::hide_legend() %>% 
    layout(annotations = 
      list(x = 0.5, y = -0.36, text = sources_label, 
        showarrow = F, xref='paper', yref='paper', 
        xanchor='auto', yanchor='auto', xshift=0, yshift=0,
        font=list(size=12, color="black"))
 )

p

# install.packages("remotes")
# remotes::install_github("coolbutuseless/ggpattern")
library(ggpattern)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"))
 

ggplot(d_gg, aes(x = `Victim Race`, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Hispanic="#D55E00",Black="#000000",Asian="#F0E442")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,90)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Nonfatal Violent incidents by victim and offender race (2019)", 
          subtitle = "National Crime Victimization Survey") +
  labs(
    x = "Victim Race", y = "Number of violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed")

##rpubs not publishing due to file size
#install.packages("treemapify")
library(treemapify)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", 
                   signs::signs(Victims, accuracy = 100, format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    colorText = "white" #ifelse(`Offender Race` == "Asian", "Black", "White")
    )


ggplot(d_gg, aes(area = Victims, label = Label, fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00")) + #"#F0E442")) +
  scale_color_manual(values = c("white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(aes(color = "white"), place = "topleft", 
                    reflow = T, family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", grow = T, alpha = 0.4, 
                             colour = "white", fontface = "italic", 
                             min.size = 0, 
                             family = "Roboto Condensed", reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents within and between races (2019)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>"))

Focusing specifically on crime involving Blacks and Whites:

library(treemapify)


d_gg <- ncvs_t15 %>% 
    dplyr::rename(`Total Victim Race` = Total) %>% 
    tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), 
                        names_to = "Offender Race", 
                        values_to = "Victimizations") %>% 
    #dplyr::filter(`Offender Race` != "Total") %>% 
    dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                  `Offender Race` %in% c("White", "Black")) %>% 
    dplyr::mutate(
  Percent = Victimizations/`Total Victim Race`,
  `Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])) %>% 
  
  process_online_ncvs_data(.) %>% 

  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2019 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100, 
                                                format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    #colorText = ifelse(`Offender Race` == "Asian", "Black", "White"),
     `Victim Race` = factor(`Victim Race`, levels = c("White",  "Black")), 
    `Offender Race` = factor(`Offender Race`, levels = c("White",  "Black"))
    )



ggplot(d_gg, aes(area = Victims, label = Label, fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_color_manual(values = c("black", "white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(color = "white", place = "topleft", 
                    reflow = T, family = "Roboto Condensed") +
  #geom_treemap_text(aes(color = colorText), place = "topleft", reflow = T, family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", grow = T, 
                             alpha = 0.4, colour = "white", 
                             fontface = "italic", min.size = 0, 
                             family = "Roboto Condensed", reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents among Blacks and Whites (2019)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>"))

race_levels <- c("White", "Black", "Hispanic")

library(ggpattern)

d_gg <- dt_ncvs_19 %>% 
  tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
   
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    ##Label = paste0(rateLabel, "\n(", countLabel, ")"),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic",  
                                                 "Black-Hispanic"
                                                 # "White-Asian",
                                                 # "Black-Asian", 
                                                 # "Hispanic-Asian"
                                                 ))
    
    
    )
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, 
                     expand = expansion(mult = c(0, 0.25)), 
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Nonfatal Interracial Violent Crime (2019)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Number of violent incidents",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
  geom_text(aes(y = Victims, label = Label), 
            vjust = 0, nudge_y = 50, family = "Roboto Condensed", 
            fill = NA, size = 4, fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 3)

library(ggpattern)

d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"))
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill =  `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  #scale_fill_manual(values = c(  "#000000", "#56B4E9", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2)),
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Nonfatal Black-White Interracial Violence (2019)", 
          subtitle = NULL) +
          #subtitle = "") +
  labs(
    x = "", y = "Number of violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
  geom_text(aes(y = `Victims`, label = Label), 
            vjust = 0, nudge_y = 5000, family = "Roboto Condensed", 
            fill = NA, size = 5, fontface = "bold")

race_levels <- c("White", "Black", "Hispanic")

library(ggpattern)

d_gg <- dt_ncvs_19 %>% 
  tibble::as_tibble() %>% 
  dplyr::left_join(census_data %>% 
                     dplyr::select(`Offender Race` = Race, Population), 
                   by = c("Offender Race")) %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
    `Victimization Rate` = 1e06*Victims / Population, 
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    rateLabel = paste0(signs::signs(`Victimization Rate`, 
                                    format = scales::comma, 
                                    accuracy = 1), 
                       " per 1M ", 
                       `Offender Race`, "s"),
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(rateLabel, "\n", countLabel, " (", 
                   percent_interracial_pair, ")"),
    #Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic", 
                                                 "Black-Hispanic" 
                                                 # "White-Asian", 
                                                 # "Black-Asian", 
                                                 # "Hispanic-Asian"
                                                 ))
    
    
    )
 

ggplot(d_gg, aes(x = Category, 
                 y = `Victimization Rate`, 
                 fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9", 
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.25))) +  #, labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Rate of Interracial Nonfatal Violent Crime by Offender Population (2019)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Interracial Nonfatal Violent Crime Rate",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
  geom_text(aes(y = `Victimization Rate`, label = Label), 
            vjust = 0, nudge_y = 50, 
            family = "Roboto Condensed", 
            fill = NA, size = 3.5, 
            fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 3)

2018

The original table provided by BJS looks like the following:

## Online BJS Files
##National Crime Victimization Survey
##Download NCVS 2018 (released 2019)


library(tidyverse)
library(data.table)
library(gt)
library(extrafont)


gt(ncvs_t14) %>% 
  tab_header(
    title = "Violent incidents by victim and offender race (2018)"
    #subtitle = ""
  ) %>% 
  fmt_number(
    data = .,
    columns = c(8),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  fmt_percent(
    data = .,
    columns = c(2:7),
    placement = "right",
    decimals = 2,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  
  tab_spanner(
    label = "Offender Race",
    columns = vars(White, Black, Hispanic, Asian, Other, Multiple)) %>% 
  tab_footnote(
    footnote = "Includes Native Americans, Hawaiians, and multiracial individuals.",
    locations = cells_column_labels(vars(Other))) %>%
  tab_footnote(
    footnote = "Multiple offenders of various races.",
    locations = cells_column_labels(vars(Multiple))) %>%

  tab_options(
    footnotes.font.size = 11
  ) %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 2:8))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 1))
  ) %>% 

  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:8), 
                     cells_column_spanners(spanners ="Offender Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 
  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2018.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686)   \nFilename: cv18t14.csv"))

Violent incidents by victim and offender race (2018)

Victim Race	Offender Race						Victimizations
Victim Race	White	Black	Hispanic	Asian	Other¹	Multiple²	Victimizations
White	62.10%	15.30%	10.20%	2.20%	8.10%	2.10%	3,581,360
Black	10.60%	70.30%	7.90%	0.10%	9.30%	1.90%	563,940
Hispanic	28.20%	15.30%	45.40%	0.60%	7.40%	3.00%	734,410
Asian	24.10%	27.50%	7.00%	24.10%	14.40%	2.90%	182,230
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. Filename: cv18t14.csv
¹ Includes Native Americans, Hawaiians, and multiracial individuals. ² Multiple offenders of various races.

From these percentages, we can derive the frequencies (counts). Below I show the frequencies of violent victimizations and transpose the table so that the rows show Offender Race and the columns show Vicim Race.

library(gt)


m <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(`Total Offending` = `White` + `Black` + `Hispanic` + `Asian`) %>% 
  dplyr::ungroup()



gt(m) %>% 
  tab_header(
    title = "Violent Victimizations by Race of Offender and Victim (2018)"
    #subtitle = "Transposed"
  ) %>% 
  fmt_number(
    data = .,
    columns = c(2:6),
    rows = NULL,
    decimals = 0,
    drop_trailing_zeros = FALSE,
    use_seps = TRUE,
    scale_by = 1,
    suffixing = FALSE,
    pattern = "{x}",
    sep_mark = ",",
    dec_mark = ".",
    locale = NULL) %>% 
  tab_spanner(
    label = "Victim Race",
    columns = vars(White, Black, Hispanic, Asian)) %>% 
  summary_rows(
    groups = NULL,
    columns = vars(`White`,`Black`, `Hispanic`, `Asian`, `Total Offending`),
    fns = list(`Total` = ~sum(.x, na.rm=TRUE)),
     decimals = 0,
    use_seps = TRUE,
    sep_mark = ",") %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
    locations = list(cells_body(columns = 1:7), 
                     cells_stub(),
                     cells_grand_summary(columns = 1:7, rows = "Total"))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "left", font = "Arial Narrow", size = "medium"), 
    locations = list(cells_body(columns = 2))
  ) %>% 
  
  tab_style(
    style = cell_text(align = "center", font = "Arial Narrow"), 
    locations = list(cells_column_labels(columns = 1:6), 
                     cells_column_spanners(spanners ="Victim Race" ), 
                     cells_title(groups = c("title")),
                     cells_stubhead())
  ) %>% 
  
    tab_style(
    style = cell_text(weight = "bolder"), 
    locations = list(cells_title(groups = c("title")))
  ) %>% 

  tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2018.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686)   \nFilename: cv18t14.csv"))

Violent Victimizations by Race of Offender and Victim (2018)

	Offender Race	Victim Race				Total Offending
	Offender Race	White	Black	Hispanic	Asian	Total Offending
	White	2,224,025	59,778	207,104	43,917	2,534,824
	Black	547,948	396,450	112,365	50,113	1,106,876
	Hispanic	365,299	44,551	333,422	12,756	756,028
	Asian	78,790	564	4,406	43,917	127,677
Total	—	3,216,062	501,343	657,297	150,703	4,525,405
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. Filename: cv18t14.csv

From the table above provided by the BJS, we can derive the frequencies of both offenders and victims by race. Below I have removed Offenders classified as ‘Other’ or ‘Multiple’ race. but not before calculating the total number of victimizations including them. Keep in mind is that these figures include only victimizations for which the race of both the offender and victim are known.

For example, reading across the second row involving White offenders and Black victims, we see that these involved only 1.18% of all victimizations (in which the race of offender and victim are both known); 10.6% of victimizations of Black victims; 2.36% of victimizations involving White offenders; 2.9% of all interracial violent victimizations; and approximately 10% of violent interracial victimizations between Blacks and Whites, the other 90% involving Black offenders and White victims.

library(reactable)


reactable::reactable(dt_ncvs_18, striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, pagination = FALSE, 
                     compact = TRUE, resizable = TRUE, 
                     filterable = TRUE,
                       groupBy = NULL,
   defaultColDef = colDef(
     #minWidth = 85, 
     maxWidth = 65,
     align = "center",
     style = list(fontSize = 11, 
                  fontFamily = "'Roboto Condensed', 
                  'Arial Narrow', Merriweather,  Arial, 
                  Helvetica, sans-serif"),
     format = list(cell = colFormat(separators = TRUE, 
                                    percent = FALSE, digits = 0)),
     headerStyle =  list(fontWeight = "bold", fontSize = 12),
     footerStyle =  list(fontWeight = "bold"))
   
   ,columns = list(
     `Offender Race` = colDef(maxWidth = 75, 
                              minWidth = NULL, 
                              align = "left", 
                              aggregate = "unique"),
     `Race Pair` = colDef(maxWidth = 90, 
                          minWidth = NULL, 
                          align = "left", 
                          aggregate = "unique"),
     `Victim Race` = colDef(maxWidth = 65, 
                            minWidth = NULL, 
                            align = "left", 
                            aggregate = "unique"),
     
     `% of Total` = colDef(maxWidth = 50, 
                           align = "center", 
                           format = colFormat(separators = FALSE, 
                                              percent = TRUE, 
                                              digits = 2)),
     `Total Victim Race` = colDef(maxWidth = 60),
      `Total Inter-racial Pairs` = colDef(maxWidth = 55),
      `Total Inter-racial` = colDef(maxWidth = 60),
     `% of Victim Race` = colDef(maxWidth = 60, 
                                 align = "center", 
                                 format = colFormat(separators = FALSE, 
                                                    percent = TRUE, 
                                                    digits = 2)),
     `% of Offender Race` = colDef(maxWidth = 75, 
                                   align = "center", 
                                   format = colFormat(separators = FALSE, 
                                                      percent = TRUE, 
                                                      digits = 2)),
     `% Inter-racial` = colDef(maxWidth = 50, 
                               align = "center", 
                               format = colFormat(separators = FALSE, 
                                                  percent = TRUE, 
                                                  digits = 2)),
    
     `% Inter-racial Pairs` = colDef(maxWidth = 55, 
                                     align = "center", 
                                     format = colFormat(separators = FALSE, 
                                                        percent = TRUE, 
                                                        digits = 2))
     )
                     )

The table above is in ‘long-format’. We can simplify greatly by pivoting to wide format as shown below. The pie charts on the far right under the column header % of Victims show the racial make-up of victims of offenders of each race.

For example, reading from left to right across the top row, we see that roughly 88% of the victims of White offenders are themselves White. The pie graphs in the last last row % of Offenders, by contrast, show the racial make-up of the offenders who victimized each racial group. For example, in the Black Victims column we see that roughly 79% of Black victims of violence are victimized by other Blacks.

library(reactable)
library(sparkline)


m <-  dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(
    Victims = round(Victims),
    `Victim Race` = paste0(`Victim Race`, " Victims")) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>% 
  dplyr::group_by(`Offender Race`) %>% 
  dplyr::mutate(
    `Total Offending` = `White Victims` + `Black Victims` + 
      `Hispanic Victims` + `Asian Victims`, 
                
    `% of Victims` = list(c(`White Victims`,`Black Victims`,
                                   `Hispanic Victims`,`Asian Victims`))
                
    # ,`Victim Counts` = list(c(`White Victims`,`Black Victims`,
    #                                 `Hispanic Victims`,`Asian Victims`))
 ) 


bar_colors <- c("#56B4E9",  "#000000", "#D55E00", "#F0E442")
bar_color_list <- list(bar_colors, bar_colors, bar_colors, bar_colors)

reactable::reactable(m, striped = TRUE, highlight = TRUE, 
                     bordered = TRUE, compact = FALSE, resizable = FALSE,
                     
 # rowStyle = function(index) {if (index == 5)  list(fontWeight = "bold")},
 # rowClass = function(index) {if (index == 5) {"bold"}},
                     
 defaultColDef = colDef(
   format = colFormat(separators = TRUE, digits = 0), 
   footerStyle =  list(fontWeight = "bold"),
   footer = function(values) {
      if (!is.numeric(values)) return()
      sparkline(values, type = "pie", width = 80, height = 30, 
                sliceColors = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"))
    }),
 
   columns = list(
     `Offender Race` = colDef(footer = "% of Offenders"),
     `% of Victims` = colDef(cell = function(values) {
        if (!is.numeric(values)) return()
        sparkline(values, type = "pie", width = 80, height = 30, 
                  sliceColors = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"))
      })
   # ,`Victim Counts` = colDef(cell = function(values, index) {
   #      sparkline(values, type = "bar", 
   #                barWidth = 10,
   #                #barColor = c("#56B4E9",  "#000000", "#D55E00", "#F0E442"),
   #                barColor = bar_colors[[index]],
   #                stackedBarColor = bar_colors[[index]],
   #                chartRangeMin = 0, 
   #                chartRangeMax = unlist(m$`Victim Counts`[[index]])
   #                )
   #    })
   )
)

Violent Victimizations by Race of Offender and Victim (2018)

The information above can be encoded in a chord diagram, shown below. The ‘total’ for each race, depicted by the color and size of the outside segments correspond to the offender race and not the race of the victim.

library(chorddiag)

m <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Victims) %>% 
  dplyr::mutate(Victims = round(Victims)) %>% 
  tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims)

matrix_names <- list(Offender = m$`Offender Race`, Victim = names(m)[-1])

m <- as.matrix(m[,-1])
dimnames(m) <- matrix_names


groupColors <- c("#56B4E9",  "#000000", "#D55E00", "#F0E442")
chorddiag(m, groupColors = groupColors, groupThickness = 0.1, 
          groupPadding = 2, groupnamePadding = 30)

Chord diagram showing violent victimizations by race of offender and victim (2018)

Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. : cv18t14.csv

library(plotly)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"))
 

g <- ggplot(d_gg, aes(x = `Victim Race`, y = Victims, 
                      fill = `Offender Race`)) +
  geom_col(alpha = 0.75, aes(text = Label)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Violent incidents by victim and offender race (2018)") +
  labs(
    x = "Victim Race", 
    y = "Number of violent incidents") +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed") 


sources_label <- "Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>"


p <- plotly::ggplotly(g, width = NULL, height = NULL, #plotly_graph_heights, 
                      tooltip = c("text"), dynamicTicks = FALSE) %>% 
    plotly::config(scrollZoom = FALSE) %>% plotly::layout(dragmode='pan') %>% 
    #plotly::hide_legend() %>% 
    layout(annotations = 
      list(x = 0.5, y = -0.36, text = sources_label, 
        showarrow = F, xref='paper', yref='paper', 
        xanchor='auto', yanchor='auto', xshift=0, yshift=0,
        font=list(size=12, color="black"))
 )

p

Here is a more printer-friendly version using the ggpattern package.

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"))
 

ggplot(d_gg, aes(x = `Victim Race`, y = Victims, 
                 fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", 
                                                "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Hispanic="#D55E00",Black="#000000",Asian="#F0E442")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,90)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06), 
                     #labels = scales::unit_format(scale = .001, suffix = "k")
                     labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 12), 
        axis.title.y = element_text(size = 12)) +
  ggtitle("Nonfatal Violent incidents by victim and offender race (2018)", 
          subtitle = "National Crime Victimization Survey") +
  labs(
    x = "Victim Race", y = "Number of violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
  geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1), 
            aes(y = `Total Victim Race`, label = victimLabel), 
            vjust = 0, family = "Roboto Condensed")

Below I use the treemapify package.

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100, 
                                                format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    colorText = ifelse(`Offender Race` == "Asian", "Black", "White")
    )


ggplot(d_gg, aes(area = Victims, 
                 label = Label, 
                 fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_color_manual(values = c("black", "white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(aes(color = colorText), 
                    place = "topleft", reflow = T, 
                    family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", 
                             grow = T, alpha = 0.4, colour = "white", 
                             fontface = "italic", min.size = 0, 
                             family = "Roboto Condensed", 
                             reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents within and between races (2018)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>"))

Focusing specifically on crime involving Blacks and Whites.

library(treemapify)


d_gg <-  ncvs_t14 %>% 
  dplyr::rename(`Total Victim Race` = Victimizations) %>% 
  tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), 
                      names_to = "Offender Race", 
                      values_to = "Percent") %>% 
  #dplyr::filter(`Offender Race` != "Total") %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  dplyr::mutate(
      Victimizations = `Total Victim Race` * (Percent),
      `Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])) %>% 

  process_online_ncvs_data(.) %>% 

  dplyr::mutate(
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_offender_label = signs::signs(`% of Offender Race`, 
                                          format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, format = scales::percent),
    Label = paste0(countLabel, "\n", 
                   percent_victim_label, " of ", 
                   `Victim Race`, " victims\n", 
                   percent_offender_label, " of ", 
                   `Offender Race`, " offender victimizations\n", 
                   percent_total_label, " of all 2018 victimizations"),
    
    Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100, 
                                                format = scales::comma),
                   "\n",
                   signs::signs(`% of Total`, accuracy = 1, 
                                format = scales::percent)),
    #colorText = ifelse(`Offender Race` == "Asian", "Black", "White"),
     `Victim Race` = factor(`Victim Race`, levels = c("White",  "Black")), 
    ##for ordering graphs
    `Offender Race` = factor(`Offender Race`, levels = c("White",  "Black"))
    )


ggplot(d_gg, aes(area = Victims, 
                 label = Label, 
                 fill = `Offender Race`, 
                 subgroup = Type,
                 subgroup2 = `Offender Race`, 
                 subgroup3 = `Victim Race`)) +
  geom_treemap(alpha = 0.95) +
  scale_fill_manual(values = c("#56B4E9",  "#000000", "#D55E00", "#F0E442")) +
  scale_color_manual(values = c("black", "white")) +
  #ggthemes::scale_fill_colorblind() +
  #ggthemes::scale_fill_ptol() +
  geom_treemap_text(color = "white", place = "topleft", 
                    reflow = T, family = "Roboto Condensed") +
  #geom_treemap_text(aes(color = colorText), place = "topleft", reflow = T, family = "Roboto Condensed") +
  geom_treemap_subgroup_border(size = 6, alpha = 1) +
  geom_treemap_subgroup_text(place = "bottom", grow = T, 
                             alpha = 0.4, colour = "white", 
                             fontface = "italic", min.size = 0, 
                             family = "Roboto Condensed", 
                             reflow = TRUE) +
  hrbrthemes::theme_ipsum_rc(base_size = 12) +
  ggtitle("Nonfatal Violent incidents among Blacks and Whites (2018)", 
          subtitle = NULL) +
  labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>"))

race_levels <- c("White", "Black", "Hispanic", "Asian")

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    ##Label = paste0(rateLabel, "\n(", countLabel, ")"),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", "White-Hispanic", 
                                                 "White-Asian", "Black-Hispanic", 
                                                 "Black-Asian", "Hispanic-Asian"))
    
    
    )
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", 
                                                "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, 
                     expand = expansion(mult = c(0, 0.25)), 
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Nonfatal Interracial Violent Crime (2018)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Number of violent incidents",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
  geom_text(aes(y = Victims, label = Label), 
            vjust = 0, nudge_y = 50, 
            family = "Roboto Condensed", 
            fill = NA, size = 4, 
            fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 2)

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  #dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::mutate(
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(countLabel, " (", percent_interracial_pair, ")"))
 

ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill =  `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
                                                  Black="#000000")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  #scale_fill_manual(values = c(  "#000000", "#56B4E9", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2)),
                     labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(
    linetype = 2), panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
    axis.title.y = element_text(size = 14), 
        legend.position = "none", 
    axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Black-White Nonfatal Interracial Violence (2018)", 
          subtitle = NULL) +
          #subtitle = "") +
  labs(
    x = "", y = "Number of nonfatal violent incidents",
    caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
  geom_text(aes(y = `Victims`, label = Label), 
            vjust = 0, nudge_y = 5000, 
            family = "Roboto Condensed", 
            fill = NA, 
            size = 5, fontface = "bold")

race_levels <- c("White", "Black", "Hispanic", "Asian")

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::left_join(census_data %>% 
                     dplyr::select(`Offender Race` = Race, 
                                   Population), 
                   by = c("Offender Race")) %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
    `Victimization Rate` = 1e06*Victims / Population, 
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, 
                               format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    rateLabel = paste0(signs::signs(`Victimization Rate`, 
                                    format = scales::comma, accuracy = 1), 
                       " per 1 million ", 
                       `Offender Race`, "s"),
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(rateLabel, "\n", countLabel, " (", 
                   percent_interracial_pair, ")"),
    #Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic", 
                                                 "White-Asian", 
                                                 "Black-Hispanic", 
                                                 "Black-Asian", 
                                                 "Hispanic-Asian"))
    
    
    )
 

ggplot(d_gg, aes(x = Category, 
                 y = `Victimization Rate`, 
                 fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill = `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none",  "stripe", 
                                                "circle", "crosshatch" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9", 
                                                  Black="#000000", 
                                                  Hispanic = "#D55E00", 
                                                  Asian = "#F0E442")) +
  #ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
   ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, 
                     expand = expansion(mult = c(0, 0.25))) +  
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), 
        panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), 
        axis.title.y = element_text(size = 14), 
        legend.position = "none", 
        axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Rate of Interracial Nonfatal Violent Crime by Offender Population (2018)", 
          subtitle = NULL) +  
           #subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
          #subtitle = "") +
  labs(
    x = "", y = "Interracial Nonfatal Violent Crime Rate",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
  geom_text(aes(y = `Victimization Rate`, label = Label), 
            vjust = 0, nudge_y = 50, 
            family = "Roboto Condensed", fill = NA, 
            size = 4, fontface = "bold") +
  facet_wrap(~`Race Pair`, scales = "free", ncol = 2)

race_levels <- c("White", "Black", "Hispanic", "Asian")

library(ggpattern)

d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>% 
  dplyr::left_join(census_data %>% dplyr::select(`Offender Race` = Race, 
                                                 Population), 
                   by = c("Offender Race")) %>% 
  dplyr::filter(`Offender Race` != `Victim Race`) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black"), 
                `Offender Race` %in% c("White", "Black")) %>% 
  dplyr::filter(`Offender Race` %in% race_levels,
                `Victim Race` %in% race_levels) %>% 
  dplyr::mutate(
    `Victimization Rate` = 1e06*Victims / Population, 
     Category = paste0(`Offender Race`, " on ", `Victim Race`),
    Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
    victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
    countLabel = signs::signs(Victims, format = scales::comma),
    rateLabel = paste0(signs::signs(`Victimization Rate`, 
                                    format = scales::comma, accuracy = 1), 
                       " per 1 million ", 
                       `Offender Race`, "s"),
    
    percent_interracial_pair = signs::signs(`% Inter-racial Pairs`, 
                                            format = scales::percent),
    percent_victim_label = signs::signs(`% of Victim Race`, 
                                        format = scales::percent),
    percent_total_label = signs::signs(`% of Total`, 
                                       format = scales::percent),
    Label = paste0(rateLabel, "\n", countLabel, " (", 
                   percent_interracial_pair, ")"),
    #Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels),
    
    `Race Pair` = factor(`Race Pair`, levels = c("White-Black", 
                                                 "White-Hispanic", 
                                                 "White-Asian", 
                                                 "Black-Hispanic", 
                                                 "Black-Asian", 
                                                 "Hispanic-Asian"))
    
    
    )
 


ggplot(d_gg, aes(x = Category, y = `Victimization Rate`, fill = `Offender Race`)) +
  ggpattern::geom_col_pattern(
    aes(pattern = `Offender Race`, 
        fill =  `Offender Race`, 
        pattern_fill = `Offender Race`,
        pattern_angle = `Offender Race`, 
        pattern_spacing = `Offender Race`
        ), 
    colour          = 'white',
    pattern_density = 0.35,
    pattern_colour  = 'white'
  ) +
  ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
  ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Black="#000000", Hispanic = "#D55E00", Asian = "#F0E442")) +
  ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
  ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
  #scale_fill_manual(values = c("#56B4E9",  "#000000")) +
  scale_fill_manual(values = c(   "#56B4E9","#000000", "#D55E00", "#F0E442")) +
  scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2))) +  #, labels = scales::unit_format(scale = .001, suffix = "k")) +
                     #labels = c("0", "1M", "2M", "3M")) +
  hrbrthemes::theme_ipsum_rc(base_size = 13) +
  theme(panel.grid.major.y = element_line(linetype = 2), panel.grid.major.x = element_blank(), 
        axis.title.x = element_text(size = 14), axis.title.y = element_text(size = 14), 
        legend.position = "none", axis.text.x = element_text(size = 13, face = "bold")
        ) +
  ggtitle("Rate of Black-White Interracial Violent Crime by Offender Population (2018)", subtitle = "Does not include murder") +
          #subtitle = "") +
  labs(
    x = "", y = "Interracial Violent Crime Rate",
    caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
  geom_text(aes(y = `Victimization Rate`, label = Label), vjust = 0, nudge_y = 50, family = "Roboto Condensed", fill = NA, size = 5, fontface = "bold")

B. ICPSR Data (2015, 2016, 2018)

All of the following tables and graphs are derived from the National Crime Victimization Surveys (NCVS) available at the Inter-university Consortium for Political and Social Research (ICPSR). The ICPSR data can be found at the following links:

National Crime Victimization Survey, 2015 (ICPSR 36448) - https://doi.org/10.3886/ICPSR36448.v1
National Crime Victimization Survey, [United States], 2016: Revised Version (ICPSR 37296) - https://doi.org/10.3886/ICPSR37296.v1
National Crime Victimization Survey, [United States], 2018 (ICPSR 37297) - https://doi.org/10.3886/ICPSR37297.v1

Functions

The code I use to derive the victim and offender counts can be found below.

library(tidyverse)
library(data.table)

get_crime <- function(x){
  data.table::fcase(
    ##Violent Crimes 1-20
    x == 1, "Completed rape",
    x == 2, "Attempted rape",
    x == 3, "Sexual attack with serious assault",
    x == 4, "Sexual attack with minor assault",
    x == 5, "Completed robbery with injury from serious assault",
    x == 6, "Completed robbery with injury from minor assault",
    x == 7, "Completed robbery without injury from minor assault",
    x == 8, "Attempted robbery with injury from serious assault",
    x == 9, "Attempted robbery with injury from minor assault",
    x == 10, "Attempted robbery without injury",
    x == 11, "Completed aggravated assault with injury",
    x == 12, "Attempted aggravated assault with weapon",
    x == 13, "Threatened assault with weapon",
    x == 14, "Simple assault completed with injury",
    x == 15, "Sexual assault without injury",
    x == 16, "Unwanted sexual contact without force",
    x == 17, "Assault without weapon without injury",
    x == 18, "Verbal threat of rape",
    x == 19, "Verbal threat of sexual assault",
    x == 20, "Verbal threat of assault",
    
    ##Purse snatching/pocket picking 21-23
    x == 21, "Completed purse snatching",
    x == 22, "Attempted purse snatching",
    x == 23, "Pocket picking (completed only)",
    
    ##Property Crimes
    x == 31, "Completed burglary, forcible entry",
    x == 32, "Completed burglary, unlawful entry without force",
    x == 33, "Attempted forcible entry",
    x == 40, "Completed motor vehicle theft",
    x == 41, "Attempted motor vehicle theft",
    x == 54, "Completed theft less than $10",
    x == 55, "Completed theft $10 to $49",
    x == 56, "Completed theft $50 to $249",
    x == 57, "Completed theft $250 or greater",
    x == 58, "Completed theft value NA",
    x == 59, "Attempted theft",
    default = NA_character_ )
}

recode_serious_crimes <- function(x){
  data.table::fcase(
    x %in% 1:4, "Rape/Sexual attack",
    x %in% 5:10, "Robbery",
    x %in% 11:13, "Assault",
    x %in% 14:20, "Minor violent crime",
    x %in% 21:23, "Purse snatch/pocket picking",
    x %in% 31:59, "Property crimes",
    default = NA_character_ )
}


get_race <- function(x){
  data.table::fcase(
    x == 1, "White",
    x == 2, "Black",
    x == 3, "American Indian",
    x == 4, "Asian",
    x == 5, "Hawaiian",
    x == 6, "White-Black",
    x == 7, "White-American Indian",
    x == 8, "White-Asian",
    x == 9, "White-Hawaiian",
    x == 10, "Black-American Indian",
    x == 11, "Black-Asian",
    x == 12, "Black-Hawaiian",
    x == 13, "American Indian-Asian",
    x == 14, "Asian-Hawaiian",
    x == 15, "White-Black-American Indian",
    x == 16, "White-Black-Asian",
    x == 17, "White-American Indian-Asian",
    x == 18, "White-Asian-Hawaiian",
    x == 19, "2 or 3 races",
    x == 20, "4 or 5 races",
    default = NA_character_)
}

get_race_category <- function(race, hisp){
  data.table::fcase(
    race %chin% c("White") & hisp == "Non-Hispanic", "White",
    race %chin% c("White") & hisp == "Hispanic", "Hispanic", 
    ##Hispanic getting at mestizos, i.e. Amerindian with Iberian (Spanish) admixture - 
    ## what most people think of as "Hispanic", but classified as "White"
    ##- ignoring Black and Asian hispanic.  There is a very small percentage of white hispanics with no amerindian ancestry who according to this 
    ## criterion should be classified as 'White', but its negligible.  
    race %chin% c("Black", "White-Black", "Black-Asian", "Black-Hawaiian", "Black-American Indian"), "Black",
    race %chin% c("Asian", "Hawaiian", "Asian-Hawaiian"), "Asian", ##Asian/PAcific Islander/Hawaiian 
    race %chin% c("American Indian"), "American Indian", 
    ##all others, it not missing, "Other"
    !is.na(race), "Other",
    default = NA_character_
  )
}


get_race_category_v2 <- function(race, hisp){
  data.table::fcase(
    hisp == "Hispanic", "Hispanic", 
    race %chin% c("White") & hisp == "Non-Hispanic", "White",
    race %chin% c("Black") & hisp == "Non-Hispanic", "Black",
    race %chin% c("Asian") & hisp == "Non-Hispanic", "Asian",
    !is.na(race), "Other",
    default = NA_character_
  )
}



process_ncvs_002_files <- function(dt){
  require(data.table)
  require(magrittr)
  
    dt[, .(
      IDHH,
      HH_WEIGHT = V2116,
      WGTHHCY, ##ADJUSTED HOUSEHOLD WEIGHT - COLLECTION YEAR
      
      hh_Race0 = data.table::fcase(
        V2031 == 1, "White", ##White, but not specifying Hispanic
        V2031 == 2, "Black",
        V2031 == 6, "White-Black",
        V2031 %in% c(3,21), "Other",
        V2031 == 22, "White-Other",
        V2031 == 23, "Black-Other",
        V2031 == 24, "White-Black-Other",
        default = NA_character_),
      pp_Race0 = get_race(V2040A),
      pp_Hisp = fcase(V2041 == 1, "Hispanic", V2041 == 2, "Non-Hispanic", default = NA_character_),
      ref_Race0 = get_race(V2049A),
      ref_Hisp = fcase(V2050 == 1, "Hispanic", V2050 == 2, "Non-Hispanic", default = NA_character_))] %>% 
  
    .[, `:=`(
      hh_Race = data.table::fcase(
        hh_Race0 %chin% c("White","White-Other"), "White/Hispanic",
        hh_Race0 %chin% c("Black", "Black-Other", "White-Black"), "Black", 
        ##Note:  due to social construction, etc.  White-Black is coded as Black, but this is negligible
        hh_Race0 %chin% c("Other", "White-Black-Other"), "Other",
        default = NA_character_),
      pp_Race = get_race_category(pp_Race0, pp_Hisp),
      ref_Race = get_race_category(ref_Race0, ref_Hisp)
  )]
  
}

process_ncvs_003_files <- function(dt){
  require(data.table)
  require(magrittr)
  
  dt[, .(
        YEARQ,
        IDHH, IDPER, 
        WGTPERCY,
        PP_weight = V3080,
        Num_incidents = V3081, ##Num Incidents
        Race0 = get_race(V3023A), 
        Hisp = data.table::fcase(V3024 == 1, "Hispanic", V3024 == 2, "Non-Hispanic", default = NA_character_))] %>% 
    
  .[, `:=` (
    Race = get_race_category(Race0, Hisp),
    Race_v2 = get_race_category_v2(Race0, Hisp)
    )]
}

process_ncvs_004_files <- function(dt){
  
  require(data.table)
  require(magrittr)
  
    ##  In 2015, incident weight is called V4527
    ## "Beginning in 2016, the incident weight (V4527) was adjusted to account for series victimizations and was renamed SERIES_IWEIGHT."
  if("V4527" %in% names(dt) & !"SERIES_IWEIGHT" %in% names(dt)){
    if(max(dt$V4527, na.rm=TRUE) > 0){
      data.table::setnames(dt, old = c("V4527"), new = c("SERIES_IWEIGHT"), skip_absent = FALSE) 
    }
  }
    
  
   ##Rule V4022 != 1 excludes cases outside US

   dt[V4022 != 1] %>% 
     .[, V4016 := fifelse(V4016 %in% 11:996, 10, V4016)] %>%  ##recode series, no greater than 10]
   
     .[, .(
          YEARQ,
          Year = V4015,
          Incident_num = V4012,
          Crime = get_crime(V4529),
          Crime_type = recode_serious_crimes(V4529),
          Crime_code = V4529,
          IDHH, 
          IDPER, 
          WGTVICCY = fifelse(V4019 == 2, WGTVICCY * V4016, WGTVICCY), ##adust for series
          SERIES_WEIGHT, 
          Incident_Weight = SERIES_IWEIGHT,
          
          off_Race0 = data.table::fcase(
            V4246C == 1 | V4281  == 1 | V4285A == 2, "Black",
            V4246B == 1 | V4280 == 1 | V4285A == 1, "White",
            V4246D == 1 | V4282A == 1 | V4285A == 3, "Native American",
            V4246E == 1 | V4282B == 1 | V4285A == 4, "Asian",
            V4246F == 1 | V4282C == 1 | V4285A == 5, "Hawaiian",
            default = NA_character_),
          
          ##Excluding multiple offenders of various races
          off_Race0_v2 = data.table::fcase(
            V4285A %in% 1:5, "Multiple",
            V4246C == 1 | V4281  == 1, "Black",
            V4246B == 1 | V4280 == 1,  "White",
            V4246D == 1 | V4282A == 1, "Native American",
            V4246E == 1 | V4282B == 1, "Asian",
            V4246F == 1 | V4282C == 1, "Hawaiian",
            default = NA_character_),
          
          off_Hisp = data.table::fcase(
            V4237A == 1 | V4252A == 1 | V4252B %in% c(1), "Hispanic",
            V4237A == 2 | V4252A == 2 | V4252B %in% c(2), "Non-Hispanic",
            default = NA_character_))] %>% 
    
    .[, `:=`(
          `Offender Race` = data.table::fcase(
            off_Race0 == "White" & off_Hisp == "Non-Hispanic", "White",
            off_Race0 == "White" & is.na(off_Hisp), "White", ## Counting White with missing Hispanic designation as White
            
            off_Race0 == "White" & off_Hisp == "Hispanic", "Hispanic",
            off_Race0 == "Black", "Black",
            off_Race0 %chin% c("Asian", "Hawaiian"), "Asian",
            off_Race0 == "Native American", "Native American",
            default = NA_character_),
          
          ##Excluding all hispanic; Other = Hawaiian, Native American
          `Offender Race v2` = data.table::fcase(
            off_Hisp == "Hispanic", "Hispanic",
            off_Race0 == "White" & off_Hisp == "Non-Hispanic", "White",
            off_Race0 == "Black" & off_Hisp == "Non-Hispanic", "Black",
            off_Race0 == "Asian" & off_Hisp == "Non-Hispanic", "Asian",
            off_Race0 %chin% c("Native American", "Hawaiian"), "Other",
            default = NA_character_),
          
          ##version 2 + excluding multiple

          `Offender Race v3` = data.table::fcase(
            off_Hisp == "Hispanic", "Hispanic",
            off_Race0_v2 == "White" & off_Hisp == "Non-Hispanic", "White",
            off_Race0_v2 == "Black" & off_Hisp == "Non-Hispanic", "Black",
            off_Race0_v2 == "Asian" & off_Hisp == "Non-Hispanic", "Asian",
            off_Race0_v2 == "Multiple", "Multiple Races",
            off_Race0_v2 %chin% c("Native American", "Hawaiian"), "Other",
            default = NA_character_)
          )]
}



merge_and_summarise_interracial <- function(dt_004, dt_003, keep_codes = 1:20, grouping_vars = c("Weight_type"), 
                                            filter_interracial = TRUE, filter_races = FALSE, keep_races = c("White", "Black")){
  require(data.table)
  
  weight_cols <- c("SERIES_WEIGHT", "Incident_Weight", "WGTPERCY", "UNWEIGHTED")
  
  dt <- dt_004[Crime_code %in% keep_codes] 
  dt <- data.table::merge.data.table(x = dt, y = dt_003, by = c("IDPER", "IDHH", "YEARQ"), all.x = TRUE) 
  dt <- dt[!is.na(`Offender Race`) & !is.na(Race)]
  dt[, UNWEIGHTED := 1]
  
  keep_vars <- unique(c(weight_cols, grouping_vars, "Crime", "Crime_code", "Race", "Offender Race"))
  keep_vars <- keep_vars[which(keep_vars != "Weight_type")]
  dt <- dt[, ..keep_vars]
  
  weight_cols <- weight_cols[which(weight_cols %in% names(dt))]
  dt <- data.table::melt.data.table(data = dt, measure.vars = weight_cols, variable.name = "Weight_type", value.name = "Weight")
  
  gVars <- unique(c("Weight_type", "Offender Race", "Race", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  
  dt <- dt[, .(`Victimizations` = sum(Weight, na.rm=TRUE)), by = gVars]
  
  data.table::setnames(dt, old = "Race", new = "Victim Race", skip_absent = FALSE)
  ##new variable identifying pairs - each cell contains a vector of two
  dt[, index := .I]
  
  gVars <- unique(c("Weight_type", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  dt[, `:=` (`Total Victimizations` = sum(Victimizations)), by = gVars] 
  
  gVars <- unique(c("Weight_type", "Offender Race", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  dt[, `:=` (`Total Offender Race` = sum(Victimizations)), by = gVars]
  
  gVars <- unique(c("Weight_type", "Victim Race", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  dt[, `:=` (`Total Victim Race` = sum(Victimizations)), by = gVars]

  dt[, Interracial_flag := ifelse(`Victim Race` == `Offender Race`, "No", "Yes")]

  
  gVars <- unique(c("Weight_type", grouping_vars))
  gVars <- gVars[which(gVars %in% names(dt))]
  
  ##FILTERING TO ONLY INTERRACIAL CRIMES
  if(isTRUE(filter_interracial)){
    dt <- dt[`Victim Race` != `Offender Race`]
  }
  dt[, `:=` (`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])), by = gVars]
  
  all_races <- unique(c(dt$`Offender Race`))
  all_pairs_list <- combn(all_races, 2, simplify = FALSE)
  
  for(i in seq_along(all_pairs_list)){
    race_pairs <- all_pairs_list[[i]]
    race1 <- race_pairs[1]; race2 <- race_pairs[2]
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1), 
       `Total Interracial Pairs` := sum(Victimizations), by = gVars]
    
    dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),  
       `Race Pair` := paste0(race1, "-", race2), by = index]
  }
  
  ##FILTERING TO ONLY PAIRS
  if(isTRUE(filter_interracial) & isTRUE(filter_races)){
    if(length(unique(keep_races) == 2)){
      race1 <-unique(keep_races)[1]; race2 <- unique(keep_races)[2]
      dt <- dt[ (`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1)]
    }
  }
  
  dt[, `:=` (
    `Percent of Total` = Victimizations/`Total Victimizations`,
    `Percent of Offender Race` = Victimizations/`Total Offender Race`,
    `Percent of Victim Race` = Victimizations/`Total Victim Race`,
    `Percent Interracial` = ifelse(Interracial_flag == "Yes", Victimizations/`Total Interracial`, NA_real_),
    `Percent Interracial Pairs` = ifelse(Interracial_flag == "Yes", Victimizations/`Total Interracial Pairs`, NA_real_))]
  
  ##last filter, removing 'Other' - including only Victim races that also appear as Offender Races and vice-versa
  u_races <- unique(dt$`Offender Race`)
  dt <- dt[`Victim Race` %chin% u_races]
  
  ##for ordering purposes in graphs, reordering race categories
  race_levels <- c("White", "Black", "Hispanic", "Asian", "Native American")
  race_levels <- race_levels[which(race_levels %in% u_races)]
  
  dt[, `:=`(
    `Offender Race` = factor(`Offender Race`, levels = race_levels),
    `Victim Race` = factor(`Victim Race`, levels = race_levels))]
  
  
  
  gVars <- unique(c("Weight_type", grouping_vars, "Offender Race", "Victim Race"))
  gVars <- gVars[which(gVars %in% names(dt))]
  data.table::setorderv(dt, cols = gVars)
  
  ##reorganize columns
  dt[, index := NULL]
  dt[is.na(`Race Pair`) & Interracial_flag == "No", `Race Pair` := `Victim Race`]
  
  ##rearranging columns
  keep_vars <- unique(c(gVars, "Race Pair", "Victimizations", 
                        "Total Victimizations", "Percent of Total",
                        "Total Victim Race", "Percent of Victim Race",
                        "Total Offender Race", "Percent of Offender Race",
                        "Percent of Offender Race", 
                        "Total Interracial",  "Percent Interracial",  
                        "Total Interracial Pairs", "Percent Interracial Pairs"))
  dt <- dt[, ..keep_vars]
  return(dt)
}


show_crime_summary_interracial <- function(df, grouping_vars = NULL, use_weight = c("series", "incident", "person", "unweighted")){
  require(data.table)
  require(reactable)
  
  use_weight = match.arg(use_weight)
  use_weight = switch(use_weight, "series" = "SERIES", 
                      "incident" = "INCIDENT", 
                      "person" = "WGTPERCY",
                      "unweighted" = "UNWEIGHTED")
  
  dt <- data.table::as.data.table(df)
  data.table::setnames(dt, skip_absent = TRUE,
                       old = c("Crime_type", "Weight_type"), 
                       new = c("Offense", "Weight"))
  
  dt <- dt[!is.na(`Offender Race`)]
  dt[, Weight := dplyr::recode(Weight, "Incident_Weight" = "INCIDENT", "SERIES_WEIGHT" = "SERIES")]
  
  if("Offense" %in% names(dt)){
    dt[, Offense := factor(Offense, levels = c("Robbery", "Assault", "Rape/Sexual attack"))]
  }
  
  dt <- dt[Weight == use_weight]
  dt[, Weight := NULL]
  
  order_vars <- c( "Offender Race","Victim Race","Race Pair", "Offense", "Year" )
  order_vars <- order_vars[which(order_vars %in% names(dt))]
  data.table::setorderv(dt, cols = order_vars)
  
  grouping_vars <- grouping_vars[which(grouping_vars %in% names(dt))]
  if(length(grouping_vars) == 0){grouping_vars <- NULL}
  
  col_names <- names(dt)
  new_names <- col_names %>% gsub("Percent ", "% ", .) %>% gsub("Victimizations", "Victims", .) %>% gsub("Interracial", "Inter-racial", .)
  names(dt) <- new_names

  reactable::reactable(dt, striped = TRUE, highlight = TRUE, bordered = TRUE, pagination = FALSE, compact = TRUE, resizable = TRUE, filterable = TRUE,
                       groupBy = NULL,
                       defaultColDef = colDef(
                         #minWidth = 85, 
                         maxWidth = 65,
                         align = "center",
                         style = list(fontSize = 11, fontFamily = "'Roboto Condensed', 'Arial Narrow', Merriweather,  Arial, Helvetica, sans-serif"),
                         format = list(cell = colFormat(separators = TRUE, percent = FALSE, digits = 0)),
                         headerStyle =  list(fontWeight = "bold", fontSize = 12),
                         footerStyle =  list(fontWeight = "bold"))
                       ,columns = list(
                         `Offender Race` = colDef(maxWidth = 75, minWidth = NULL, align = "left", aggregate = "unique"),
                         `Race Pair` = colDef(maxWidth = 90, minWidth = NULL, align = "left", aggregate = "unique"),
                         `Victim Race` = colDef(maxWidth = 65, minWidth = NULL, align = "left", aggregate = "unique"),
                         
                         `% of Total` = colDef(maxWidth = 50, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         `Total Victim Race` = colDef(maxWidth = 60),
                         `Total Inter-racial Pairs` = colDef(maxWidth = 55),
                         `Total Inter-racial` = colDef(maxWidth = 60),
                         `% of Victim Race` = colDef(maxWidth = 60, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         `% of Offender Race` = colDef(maxWidth = 75, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         `% Inter-racial` = colDef(maxWidth = 50, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
                         
                         `% Inter-racial Pairs` = colDef(maxWidth = 55, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2))
                       )
  )
  
  
  
}

Loading data…

##2015 Data

nc_15_003 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2015-36448/DS0003/36448-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_15_004 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2015-36448/DS0004/36448-0004-Data.tsv")) %>% process_ncvs_004_files(.)

##2016 Data

nc_16_003 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2016-36828/DS0003/36828-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_16_004 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2016-36828/DS0004/36828-0004-Data.tsv")) %>% process_ncvs_004_files(.)

##2018 Data

nc_18_003 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2018-37297/DS0003/37297-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_18_004 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2018-37297/DS0004/37297-0004-Data.tsv")) %>% process_ncvs_004_files(.)

2015 - Major Offenses

d_gg_15 <- merge_and_summarise_interracial(nc_15_004, nc_15_003, keep_codes = 1:13, grouping_vars = c("Weight_type")) 

show_crime_summary_interracial(d_gg_15, use_weight = "series")

2016 - Major Offenses

d_gg_16 <- merge_and_summarise_interracial(nc_16_004, nc_16_003, keep_codes = 1:13, grouping_vars = c("Weight_type")) 

show_crime_summary_interracial(d_gg_16, use_weight = "series")

2018 - Major Offenses

d_gg_18 <- merge_and_summarise_interracial(nc_18_004, nc_18_003, keep_codes = 1:13, grouping_vars = c("Weight_type")) 

show_crime_summary_interracial(d_gg_18, use_weight = "series")

Appendix: Codebook Notes

A. Race Codes

Victim Race Codes (Column V3023A found in Victim files XXXXX-0003-Data.tsv:

White only
Black only
American Indian
Asian only
Hawaiian
White-Black
White-American Indian
White-Asian
White-Hawaiian
Black-American Indian
Black-Asian
Black-Hawaiian
American Indian-Asian
Asian-Hawaiian
White-Black-American Indian
White-Black-Asian
White-American Indian-Asian
White-Asian-Hawaiian
2 or 3 races
4 or 5 races

Offender Race/Ethnicity Codes (found in offender files XXXXX-0004-Data.tsv:

V4246A - SINGLE OFFENDER RACE MISSING FLAG - 1 = Identified Race, 9 = No race info
V4279 - MULTIPLE OFFENDER RACE MISSING FLAG -
V4283 - MULTIPLE OFFENDERS: DON’T KNOW
V4247 - SINGLE OFF ONLY CRIME AGAINST RESP OR HH
- “Was this the only time this offender committed a crime or made threats against you or your household?”
- 1 - Yes (only time)
- 2 - No (there were other times)
- 3 - Don’t know
V4246C - SINGLE OFFENDER RACE: BLACK OR AFRICAN AMERICAN
V4281 - MULT OFF RACE: BLACK
V4246B - SINGLE OFFENDER RACE: WHITE
V4280 - MULT OFF RACE: WHITE
V4237A - SINGLE OFFENDER HISPANIC/LATINO
V4252A - MULTIPLE OFFENDERS HISPANIC/LATINO
V4246D - SINGLE OFFENDER RACE: AMERICAN INDIAN OR ALASKA NATIVE
V4282A - MULT OFF RACE: AMERICAN INDIAN OR ALASKA NATIVE
V4246E - SINGLE OFFENDER RACE: ASIAN
V4282B - MULT OFF RACE: ASIAN
V4246F - SINGLE OFFENDER RACE: NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
V4282C - MULT OFF RACE: NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
V4285A - MULTIPLE OFFENDER RACE OF MOST: 1 = “mostly white”; 2 = “mostly black..”, 3 = "mostly Nat Am, 4 = mostly Asian; 5 = Mostly Native Hawaiian or Pacific Islander
V4252B - MULTIPLE OFFENDERS HISPANIC/NON-HISPANIC: 1 = “Mostly Hispanic or Latino”; 2 = “Mostly non-Hispanic”; 3 = “Equal number of Hispanic and non-Hispanic”; 4 = “Don’t know”

B. Weights

“INCIDENTS VS. VICTIMIZATIONS: For personal crimes, the NCVS makes a basic distinction between”incidents" and “victimizations.” The number of victimizations reflects how many criminal acts were experienced by NCVS survey respondents; the number of incidents reflects the number of criminal acts committed against NCVS respondents and others present during such incidents, as reported by survey respondents. If there was more than one victim, the incident weight is adjusted to compensate for the possibility that the incident could be reported several times by multiple victims and thus be over-counted."

“To calculate a victimization count to match recent BJS reports, users should use variable ‘SERIES_WEIGHT’”

SERIES_WEIGHT = WGTVICCY, where IF V4019 = 2 THEN WGTVICCY = WGTVICCY * v4016, ELSE WGTVICCY = WGTVICCY.

C. Crime Codes - `V4529`

TYPES OF CRIME, LISTED IN ORDER OF SERIOUSNESS (The Seriousness Hierarchy) p. 579 2018
ALL VIOLENT CRIMES - 1-20
MAJOR VIOLENT CRIMES - 1,2,3,5,6,7,8,9,10,11,12,13

Completed rape
Attempted rape
Sexual attack with serious assault
Sexual attack with minor assault
Completed robbery with injury from serious assault
Completed robbery with injury from minor assault
Completed robbery without injury from minor assault
Attempted robbery with injury from serious assault
Attempted robbery with injury from minor assault
Attempted robbery without injury
Completed aggravated assault with injury
Attempted aggravated assault with weapon
Threatened assault with weapon
Simple assault completed with injury
Sexual assault without injury
Unwanted sexual contact without force
Assault without weapon without injury
Verbal threat of rape
Verbal threat of sexual assault
Verbal threat of assault

D. Misc.

“To be consistent with BJS publications for all years, exclude crimes committed outside the U.S.” (2018 - p. 595)
- V4022 - IN WHAT CITY, TOWN, VILLAGE - 1 = “Outside U.S.”
- Rule: exclude if V4022 = 1, 9 or is missing.
“To be consistent with BJS publications from 2012 on … include series crimes…. Include up to 10 occurrences in a series crime”
- v4016 - HOW MANY TIMES INCIDENT OCCUR LAST 6 MOS
- Rule: Recode to 10 values of V4016 from 11 to 996
Person Victimization Counts consistent with BJS publications 2012+
- V4019 - CHECK ITEM D: ENOUGH DETAIL TO DISTINGUISH INCIDENTS
  - 1 = Yes (not a “series”)
  - 2 = No (is a “series”)
- Rule: If V4019 = 2, THEN WGTVICCY = WGTVICCY * v4016.
- SERIES_WEIGHT and SERIES_IWEIGHT already take this into account.
V4015 - YEAR INCIDENT OCCURRED (for 2018 file, either 2017 or 2018)
YEARQ - YEAR AND QUARTER OF INTERVIEW.
V4012 - Incident number. (Accounts for most duplicated IDs in _004 files)
- “When multiple incidents are reported in response to a single screen question, this variable indicates which of these incidents is being discussed”

II. Interracial Homicides (2015-2019)

Homicide data containing race/ethnicity data for both victims and offenders can be found in the FBI’s Expanded Homicide Data Tables. The code below downloads and processes the .csv files available at the following URL’s:

Functions

###HOMICIDES 2018
##Expanded Homicide Data Table 6
##Race, Sex, and Ethnicity of Victim by Race, Sex, and Ethnicity of Offender, 2018

##function to format homicide tables
format_homicide_tables <- function(url){
  require(htmltab)
  year <- as.numeric(substr(url, start = 38, stop = 41))
  d_url <- htmltab(doc = url, which = "//*[@id='tablecontainer']")
  
  d <- d_url[c(1:4,12),c(1:6,10)]
  
  new_names <- dplyr::case_when( 
    grepl("Black", names(d), ignore.case = TRUE) ~ "Black",
    grepl("White", names(d), ignore.case = TRUE) ~ "White/Hisp",
    grepl("Other", names(d), ignore.case = TRUE) ~ "Other",
    grepl("Unknown", names(d), ignore.case = TRUE) ~ "Unknown",
    grepl("Hispanic|Latino", names(d), ignore.case = TRUE) ~ "Hispanic",
    grepl("victim", names(d), ignore.case = TRUE) ~ "Victim Race",
    TRUE ~ trimws(names(d)))
  
  names(d) <- new_names
  
  d <- d %>% 
    ##remove commas and convert to numeric
    dplyr::mutate_at(2:7, list(~ as.numeric(gsub(",", "", .)))) %>% 
    dplyr::mutate(
      White = `White/Hisp` - Hispanic,
                  
      `Victim Race` = dplyr::case_when( 
        grepl("Black", `Victim Race`, ignore.case = TRUE) ~ "Black",
        grepl("White", `Victim Race`, ignore.case = TRUE) ~ "White/Hisp",
        grepl("Other", `Victim Race`, ignore.case = TRUE) ~ "Other",
        grepl("Unknown", `Victim Race`, ignore.case = TRUE) ~ "Unknown",
        grepl("Hispanic|Latino", `Victim Race`, ignore.case = TRUE) ~ "Hispanic",
        TRUE ~ trimws(`Victim Race`))) %>% 
    
    tidyr::pivot_longer(-c(`Victim Race`, Total), names_to = "Offender Race", values_to = "Homicides") 
  
    ##to isolate the White from the White/Hispanic for Victims...
    d_wh <- d %>% dplyr::filter(`Offender Race` != "White/Hisp") %>% 
      dplyr::select(-Total) %>% 
      tidyr::pivot_wider(names_from = `Victim Race`, 
                         values_from = Homicides) %>% 
      dplyr::mutate(White = `White/Hisp` - Hispanic) %>% 
      tidyr::pivot_longer(-`Offender Race`, 
                          names_to = "Victim Race", 
                          values_to = "Homicides") %>% 
      dplyr::filter(`Victim Race` == "White") %>% 
      dplyr::group_by(`Victim Race`) %>% 
      dplyr::mutate(Total = sum(Homicides))
      dplyr::bind_rows(d, d_wh) %>% 
        dplyr::mutate(Year = year)
  
}

url_2015 <- "https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/expanded_homicide_data_table_6_murder_race_and_sex_of_vicitm_by_race_and_sex_of_offender_2015.xls"
url_2016 <- "https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/tables/expanded-homicide-data-table-3.xls"
url_2017 <- "https://ucr.fbi.gov/crime-in-the-u.s/2017/crime-in-the-u.s.-2017/tables/expanded-homicide-data-table-6.xls"
url_2018 <- "https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-6.xls"
url_2019 <- "https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/expanded-homicide-data-table-6.xls"

homicides <- purrr::map_dfr(c(url_2015, url_2016, url_2017, url_2018, url_2019), 
                            format_homicide_tables)

Tables

d_h <- homicides %>% 
  ##collapse other and unknown together
  dplyr::mutate(`Victim Race` = dplyr::recode(`Victim Race`, 
                                              "Unknown" = "Other"), 
                `Offender Race` = dplyr::recode(`Offender Race`, 
                                                "Unknown" = "Other")) %>% 
  dplyr::group_by(`Victim Race`, `Offender Race`) %>% 
  dplyr::summarise(Homicides = sum(Homicides)) %>% 
  dplyr::filter(`Victim Race` %in% c("White", "Black", "Hispanic", "Other"), 
                `Offender Race` %in% c("White", "Black", "Hispanic", "Other")) %>% 
  dplyr::group_by(`Victim Race`) %>% 
  dplyr::mutate(`Total Homicides` = sum(Homicides)) %>% 
  dplyr::ungroup() %>% 
  dplyr::mutate(
    Percent = Homicides/`Total Homicides`, 
    Label = paste0(signs::signs(Homicides, 
                                format = scales::comma, 
                                accuracy = 1), 
                   " (", 
                   signs::signs(Percent, format = scales::percent), ")")) %>% 
  dplyr::select(`Victim Race`, `Offender Race`, Label, `Total Homicides`) %>% 
  tidyr::pivot_wider(names_from = `Offender Race`, 
                     values_from = Label) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(`Victim Race` = factor(`Victim Race`, 
                                       levels = c("White",  "Black",  
                                                  "Hispanic", "Other"))) %>% 
  dplyr::arrange(`Victim Race`) %>% 
  dplyr::select(`Victim Race`, White, Black, Hispanic, Other, `Total Homicides`) %>% 
  dplyr::mutate(`Total Homicides` = paste0(signs::signs(`Total Homicides`, 
                                                        format = scales::comma, accuracy = 1)))

gt(d_h) %>% 
    tab_header(
    title = "Homicides by Race/Ethnicity of Victims and Offenders (2015-2019)"
    #subtitle = ""
  ) %>% 
  
  tab_style(
    style = cell_text(size = "small", align = "center"),
    locations = cells_body()
  ) %>% 
  
  tab_options(
    footnotes.font.size = 11
  ) %>% 

  
  tab_spanner(
    label = "Offender Race",
    columns = vars(White, Black, Hispanic, Other)) %>% 
  tab_footnote(
    footnote = "Non-Hispanic white victim and offender counts estimated by subtracting Hispanic ethnicity counts from original White counts including both Hispanic and Non-Hispanic Whites.", locations = cells_column_labels(vars(White))) %>%
  
  tab_footnote(
    footnote = "Total homicides summed across years 2015-2019. These tables are only include incidents where some information about the offender is known by law enforcement.  The actual victim counts by race are roughly double what is shown here.", 
    locations = cells_column_labels(vars(`Total Homicides`))) %>%

  tab_source_note(source_note = md("Source:  FBI's Expanded Homicide Data Tables. [2015 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/expanded_homicide_data_table_6_murder_race_and_sex_of_vicitm_by_race_and_sex_of_offender_2015.xls), [2016 Table-3](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/tables/expanded-homicide-data-table-3.xls), [2017 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2017/crime-in-the-u.s.-2017/tables/expanded-homicide-data-table-6.xls), [2018 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-6.xls), [2019 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/expanded-homicide-data-table-6.xls)"))

Homicides by Race/Ethnicity of Victims and Offenders (2015-2019)

Victim Race	Offender Race				Total Homicides²
Victim Race	White¹	Black	Hispanic	Other	Total Homicides²
White	9,277 (75.197%)	1,963 (15.911%)	692 (5.609%)	405 (3.283%)	12,337
Black	740 (5.162%)	12,751 (88.950%)	476 (3.321%)	368 (2.567%)	14,335
Hispanic	562 (12.461%)	726 (16.098%)	3,029 (67.162%)	193 (4.279%)	4,510
Other	378 (22.487%)	308 (18.322%)	150 (8.923%)	845 (50.268%)	1,681
Source: FBI's Expanded Homicide Data Tables. 2015 Table-6, 2016 Table-3, 2017 Table-6, 2018 Table-6, 2019 Table-6
¹ Non-Hispanic white victim and offender counts estimated by subtracting Hispanic ethnicity counts from original White counts including both Hispanic and Non-Hispanic Whites. ² Total homicides summed across years 2015-2019. These tables are only include incidents where some information about the offender is known by law enforcement. The actual victim counts by race are roughly double what is shown here.

Interracial Violence (2015-2019)

John Hamilton Bradford, Ph.D.

Census Data

I. Nonfatal Interracial Violence

A. Online Data (2018, 2019)

Functions

Get Data

2019

2018

B. ICPSR Data (2015, 2016, 2018)

Functions

2015 - Major Offenses

2016 - Major Offenses

2018 - Major Offenses

Appendix: Codebook Notes

A. Race Codes

B. Weights

C. Crime Codes - `V4529`

D. Misc.

II. Interracial Homicides (2015-2019)

Functions

Tables

Interracial Violence (2015-2019)

John Hamilton Bradford, Ph.D.

Census Data

I. Nonfatal Interracial Violence

A. Online Data (2018, 2019)

Functions

Get Data

2019

2018

B. ICPSR Data (2015, 2016, 2018)

Functions

2015 - Major Offenses

2016 - Major Offenses

2018 - Major Offenses

Appendix: Codebook Notes

A. Race Codes

B. Weights

C. Crime Codes - V4529

D. Misc.

II. Interracial Homicides (2015-2019)

Functions

Tables

C. Crime Codes - `V4529`