2020-10-27
For data pertaining to violent offending by race see also: https://rpubs.com/johnbradford/violent-crime
Below I obtain population data from the US Census to calculate population rates of interracial violent crime. To be consistent with the original NCVS excel file on the BJS website, population data for White, Black, and Asian are all non-Hispanic. I use the tidycensus R package to download the population data from the 2014-2018 5-Year American Community Survey.
library(tidyverse)
library(tidycensus)
library(gt)
##CENSUS DATA
##American Community Survey (ACS)
#acs_vars <- tidycensus::load_variables(2018, "acs5")
census_data <- tidycensus::get_acs(geography = "us", geometry = FALSE, year = 2018, survey = "acs5",
variables = c(Total = "B01003_001",
White = "B03002_003",
Black = "B03002_004",
Asian = "B03002_006",
`Native American` = "B03002_005",
Hawaiian = "B03002_007",
Hispanic = "B03002_012")) %>%
dplyr::select(variable, estimate) %>%
tidyr::pivot_wider(names_from = variable, values_from = estimate) %>%
tidyr::pivot_longer(-Total, names_to = "Race", values_to = "Population")
##display in table
census_data %>%
dplyr::mutate(Percent = Population/Total,
Race = factor(Race, levels = c("White", "Hispanic", "Black", "Asian", "Native American", "Hawaiian"))) %>%
dplyr::select(-Total) %>%
dplyr::arrange(Race) %>%
gt::gt(.) %>%
tab_header(
title = "Population by Race",
subtitle = "US Census 5-Year American Community Survey (2014-2018)"
) %>%
fmt_number(
data = .,
columns = c(2),
rows = NULL,
decimals = 2,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
scale_by = 1,
suffixing = TRUE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
fmt_percent(
data = .,
columns = c(3),
placement = "right",
decimals = 2,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
tab_style(
style = cell_text(size = "medium", align = "center", font = "Roboto Condensed"),
locations = list(cells_body(columns = 1:3))
) %>%
tab_style(
style = cell_text(align = "left", font = "Arial Narrow", size = "medium"),
locations = list(cells_body(columns = 1), cells_column_labels(columns = 1))
) %>%
tab_style(
style = cell_text(weight = "bolder"),
locations = list(cells_column_labels(columns = 1:3))
) %>%
tab_style(
style = cell_text(align = "center", font = "Arial Narrow"),
locations = list(cells_column_labels(columns = 2:3),
cells_title(groups = c("title")))
) %>%
tab_style(
style = cell_text(weight = "bolder"),
locations = list(cells_title(groups = c("title")))
) %>%
tab_source_note(source_note = md("White, Black, Asian, Native American, and Native Hawaiian include only non-Hispanic; Hispanic estimate includes all races.")) | Population by Race | ||
|---|---|---|
| US Census 5-Year American Community Survey (2014-2018) | ||
| Race | Population | Percent |
| White | 197.18M | 61.07% |
| Hispanic | 57.52M | 17.81% |
| Black | 39.72M | 12.30% |
| Asian | 17.37M | 5.38% |
| Native American | 2.14M | 0.66% |
| Hawaiian | 525.10K | 0.16% |
| White, Black, Asian, Native American, and Native Hawaiian include only non-Hispanic; Hispanic estimate includes all races. | ||
Data pertaining to interracial violent victimizations (excluding murder) can be obtained from the National Crime Victimization Survey (NCVS). The Bureau of Justice Statistics (BJS) provides the following description on their website:
“The Bureau of Justice Statistics’ (BJS) National Crime Victimization Survey (NCVS) is the nation’s primary source of information on criminal victimization. Each year, data are obtained from a nationally representative sample of about 240,000 interviews on criminal victimization, involving 160,000 unique persons in about 95,000 households…. The NCVS collects information on nonfatal personal crimes (i.e., rape or sexual assault, robbery, aggravated and simple assault, and personal larceny) … both reported and not reported to police.”
The BJS has made available on their website data pertaining to interracial violence for 2018 and 2019, which can be downloaded as excel files.
Criminal Victimization, 2019, click the link Data tables to download the zipped excel files. Data pertaining to interracial violence can be found in Table 15 (file cv19t15.csv).
Criminal Victimization, 2018, click the link Data tables to download the zipped excel files. Data pertaining to interracial violence can be found in Table 14 (file cv18t14.csv).
Unfortunately, the BJS has not published analogous files for earlier versions of the survey from 2015 and 2016, and the format of the data they do publish changes from year to year. However, the raw data can be obtained from the Inter-university Consortium for Political and Social Research (ICPSR). The drawback is that these data are not easily accessible to the general public. The files are huge, require code books to interpret, and are generally unmanageable without some prior experience in coding and/or data analysis.
My presentation of the data available on the web is completely reproducible in R. The code below will automatically download, unzip, read and format the comma separated values (csv) files contained on the BJS website. My presentation of the ICPSR data is, in contrast, mostly reproducible, but requires that you first download the .tsv formatted files from the following links:
For practitioners of R, note that because the files are so large, I load and process the files using the data.table package, which might not be as familiar as the tidyverse or base R. For more information on data.table syntax, see: https://rdatatable.gitlab.io/data.table/index.html .
All of the following tables and graphs are derived from the National Crime Victimization Survey (NCVS) available on the BJS website at Criminal Victimization, 2019 and Criminal Victimization, 2018.
process_online_ncvs_data <- function(df){
require(dplyr)
require(tidyr)
require(data.table)
dt <- data.table::as.data.table(df)
dt[, `:=` (`Total Victimizations` = sum(Victimizations))]
dt[, `:=` (`Total Offender Race` = sum(Victimizations)), by = .(`Offender Race`)]
dt[, index := .I]
dt[, Interracial_flag := ifelse(`Victim Race` == `Offender Race`, "No", "Yes")]
all_races <- unique(c(dt$`Victim Race`))
all_pairs_list <- combn(all_races, 2, simplify = FALSE)
for(i in seq_along(all_pairs_list)){
race_pairs <- all_pairs_list[[i]]
race1 <- race_pairs[1]; race2 <- race_pairs[2]
dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),
`Total Interracial Pairs` := sum(Victimizations)]
dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),
`Race Pair` := paste0(race1, "-", race2), by = index]
}
dt[, `:=` (
`Percent of Total` = Victimizations/`Total Victimizations`,
`Percent of Offender Race` = Victimizations/`Total Offender Race`,
`Percent of Victim Race` = Victimizations/`Total Victim Race`,
`Percent Interracial` = ifelse(Interracial_flag == "Yes",
Victimizations/`Total Interracial`,
NA_real_),
`Percent Interracial Pairs` = ifelse(Interracial_flag == "Yes",
Victimizations/`Total Interracial Pairs`,
NA_real_))]
race_levels <- c("White", "Black", "Hispanic", "Asian")
race_levels <- race_levels[which(race_levels %in%
unique(dt[, c(`Offender Race`,
`Victim Race`)]))]
dt <- dt[`Offender Race` %in% race_levels & `Victim Race` %in% race_levels]
dt[, `:=`(
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels))]
gVars <- c("Offender Race", "Victim Race")
gVars <- gVars[which(gVars %in% names(dt))]
data.table::setorderv(dt, cols = gVars)
##reorganize columns
dt[, index := NULL]
dt[is.na(`Race Pair`) & Interracial_flag == "No", `Race Pair` := `Victim Race`]
##rearranging columns
keep_vars <- unique(c(gVars, "Race Pair", "Victimizations",
"Total Victimizations", "Percent of Total",
"Total Victim Race", "Percent of Victim Race",
"Total Offender Race", "Percent of Offender Race",
"Percent of Offender Race",
"Total Interracial", "Percent Interracial",
"Total Interracial Pairs", "Percent Interracial Pairs"))
dt <- dt[, ..keep_vars]
## I relabel some columns to save space
col_names <- names(dt)
new_names <- col_names %>% gsub("Percent ", "% ", .) %>%
gsub("Victimizations", "Victims", .) %>%
gsub("Interracial", "Inter-racial", .)
names(dt) <- new_names
dt
}2019 Data
## Online BJS Files
##National Crime Victimization Survey
library(tidyverse)
library(data.table)
## 2019 Data
temp <- tempfile()
download.file("https://www.bjs.gov/content/pub/sheets/cv19.zip", temp)
ncvs_t15 <- readr::read_csv(unz(temp, filename="cv19t15.csv"),
skip = 11, n_max = 4) %>%
dplyr::select(-matches("X")) %>%
dplyr::rename(`Victim Race` = `Victim race/ethnicity`) %>%
dplyr::rename_all(., list(~gsub("\\/.*", "", .))) %>%
dplyr::mutate(`Victim Race` = gsub("\\/.*", "", `Victim Race`)) %>%
dplyr::rename( Total = `Number of violent incidents`) %>%
dplyr::filter(`Victim Race` %in% c("White", "Black", "Hispanic", "Other"))
dt_ncvs_19 <- ncvs_t15 %>%
dplyr::rename(`Total Victim Race` = Total) %>%
tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`),
names_to = "Offender Race",
values_to = "Victimizations") %>%
dplyr::filter(`Offender Race` != "Total") %>%
dplyr::mutate(
Percent = Victimizations/`Total Victim Race`,
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])
) %>%
##function defined above
process_online_ncvs_data(.)2018 Data
## Online BJS Files
##National Crime Victimization Survey
library(tidyverse)
library(data.table)
temp <- tempfile()
download.file("https://www.bjs.gov/content/pub/sheets/cv18.zip", temp)
ncvs_t14 <- readr::read_csv(unz(temp, filename="cv18t14.csv"),
skip = 11, n_max = 4) %>%
dplyr::select(-matches("X")) %>%
dplyr::rename(`Victim Race` = `Victim race/ethnicity`) %>%
dplyr::mutate(`Asian/a` = as.numeric(gsub("<", "", `Asian/a`))) %>%
dplyr::select(-Total) %>%
dplyr::rename_all(., list(~gsub("\\/.*", "", .))) %>%
dplyr::mutate_at(3:8, ~./100) %>%
dplyr::mutate(`Victim Race` = gsub("\\/.*", "", `Victim Race`)) %>%
dplyr::rename(`Multiple` = `Multiple offenders of various races`) %>%
dplyr::mutate( Victimizations = `Number of violent incidents`) %>%
dplyr::select(-`Number of violent incidents`)
dt_ncvs_18 <- ncvs_t14 %>%
dplyr::rename(`Total Victim Race` = Victimizations) %>%
tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`), names_to = "Offender Race", values_to = "Percent") %>%
dplyr::filter(`Offender Race` != "Total") %>%
dplyr::mutate(
Victimizations = `Total Victim Race` * (Percent),
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])
) %>%
##function defined above
process_online_ncvs_data(.)library(gt)
library(extrafont)
gt(ncvs_t15) %>%
tab_header(
title = "Nonfatal Violent incidents by victim and offender race (2019)"
#subtitle = ""
) %>%
fmt_number(
data = .,
columns = c(2:6),
rows = NULL,
decimals = 0,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
scale_by = 1,
suffixing = FALSE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
tab_spanner(
label = "Offender Race",
columns = vars(White, Black, Hispanic, Other)) %>%
tab_footnote(
footnote = "Includes Asians, Native Hawaiians and Other Pacific Islanders, American Indians and Alaska Natives, persons of two or more races, and multiple offenders of various races.",
locations = cells_column_labels(vars(Other))) %>%
tab_footnote(
footnote = "If the victim perceived any of the offenders in a multiple offender incident to be of Hispanic origin, they are classified as Hispanic.",
locations = cells_column_labels(vars(Hispanic))) %>%
tab_footnote(
footnote = "Excludes persons of Hispanic origin",
locations = cells_column_labels(vars(Black, White))) %>%
tab_options(
footnotes.font.size = 11
) %>%
tab_style(
style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
locations = list(cells_body(columns = 2:6))
) %>%
tab_style(
style = cell_text(align = "left", font = "Arial Narrow", size = "medium"),
locations = list(cells_body(columns = 1))
) %>%
tab_style(
style = cell_text(align = "center", font = "Arial Narrow"),
locations = list(cells_column_labels(columns = 1:6),
cells_column_spanners(spanners ="Offender Race" ),
cells_title(groups = c("title")),
cells_stubhead())
) %>%
tab_style(
style = cell_text(weight = "bolder"),
locations = list(cells_title(groups = c("title")))
) %>%
tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2019.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686) \nFilename: cv19t15.csv")) | Nonfatal Violent incidents by victim and offender race (2019) | |||||
|---|---|---|---|---|---|
| Victim Race | Total | Offender Race | |||
| White1 | Black1 | Hispanic2 | Other3 | ||
| White | 2,796,710 | 1,722,230 | 472,570 | 463,520 | 138,380 |
| Black | 494,610 | 89,980 | 346,260 | 43,730 | 14,640 |
| Hispanic | 774,310 | 170,840 | 249,030 | 334,600 | 19,840 |
| Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019. Filename: cv19t15.csv |
|||||
|
1
Excludes persons of Hispanic origin
2
If the victim perceived any of the offenders in a multiple offender incident to be of Hispanic origin, they are classified as Hispanic.
3
Includes Asians, Native Hawaiians and Other Pacific Islanders, American Indians and Alaska Natives, persons of two or more races, and multiple offenders of various races.
|
|||||
library(gt)
m <- dt_ncvs_19 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Victims) %>%
dplyr::mutate(Victims = round(Victims)) %>%
tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>%
dplyr::group_by(`Offender Race`) %>%
dplyr::mutate(`Total Offending` = `White` + `Black` + `Hispanic`) %>%
dplyr::ungroup()
gt(m) %>%
tab_header(
title = "Nonfatal Violent Victimizations by Race of Offender and Victim (2019)"
#subtitle = "Transposed"
) %>%
fmt_number(
data = .,
columns = c(2:5),
rows = NULL,
decimals = 0,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
scale_by = 1,
suffixing = FALSE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
tab_spanner(
label = "Victim Race",
columns = vars(White, Black, Hispanic)) %>%
summary_rows(
groups = NULL,
columns = vars(`White`,`Black`, `Hispanic`, `Total Offending`),
fns = list(`Total` = ~sum(.x, na.rm=TRUE)),
decimals = 0,
use_seps = TRUE,
sep_mark = ",") %>%
tab_options(
footnotes.font.size = 11
) %>%
tab_style(
style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
locations = list(cells_body(columns = 1:5),
cells_stub(),
cells_grand_summary(columns = 1:5, rows = "Total"))
) %>%
tab_style(
style = cell_text(align = "left", font = "Arial Narrow", size = "medium"),
locations = list(cells_body(columns = 2))
) %>%
tab_style(
style = cell_text(align = "center", font = "Arial Narrow"),
locations = list(cells_column_labels(columns = 1:5),
cells_column_spanners(spanners ="Victim Race" ),
cells_title(groups = c("title")),
cells_stubhead())
) %>%
tab_style(
style = cell_text(weight = "bolder"),
locations = list(cells_title(groups = c("title")))
) %>%
tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2019.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046) \nFilename: cv19t15.csv")) | Nonfatal Violent Victimizations by Race of Offender and Victim (2019) | |||||
|---|---|---|---|---|---|
| Offender Race | Victim Race | Total Offending | |||
| White | Black | Hispanic | |||
| White | 1,722,230 | 89,980 | 170,840 | 1,983,050 | |
| Black | 472,570 | 346,260 | 249,030 | 1,067,860 | |
| Hispanic | 463,520 | 43,730 | 334,600 | 841,850 | |
| Total | — | 2,658,320 | 479,970 | 754,470 | 3,892,760 |
| Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019. Filename: cv19t15.csv |
|||||
Keep in mind that these figures include only victimizations for which the race of both the offender and victim are known and that the 2019 BJS table includes only White, Black, and Hispanic as categories. Asians, which were included in 2018 as a separate category, are removed and lumped with ‘Other’. The column definitions are as follows:
Victims: frequency of victimizations involving Offender Race and Victim Race.Total Victim Race: sum of victimizations involving Victim Race across all offender races.Total Offender Race: sum of victimizations perpetrated by Offender Race across all victim races.Total Inter-racial: sum of victimizations in which the race of offender and the race victim are different.Total Inter-racial Pairs: sum of interracial victimizations involving the two races as either victim or offender.For example, reading across the second row involving White offenders and Black victims, we see that these involved only 2.21% of all victimizations (in which the race of offender and victim are both known); 18% of victimizations of Black victims; 4.5% of victimizations involving White offenders; 5.4% of all interracial violent victimizations; and approximately 16% of violent interracial victimizations between Blacks and Whites, the other 84% involving Black offenders and White victims.
library(reactable)
reactable::reactable(dt_ncvs_19,
striped = TRUE, highlight = TRUE,
bordered = TRUE, pagination = FALSE,
compact = TRUE, resizable = TRUE,
filterable = TRUE,
groupBy = NULL,
##Default formatting
defaultColDef = colDef(
#minWidth = 85,
maxWidth = 65,
align = "center",
style = list(fontSize = 11,
fontFamily = "'Roboto Condensed',
'Arial Narrow', Merriweather, Arial,
Helvetica, sans-serif"),
format = list(cell = colFormat(separators = TRUE,
percent = FALSE,
digits = 0)),
headerStyle = list(fontWeight = "bold", fontSize = 12),
footerStyle = list(fontWeight = "bold"))
,columns = list(
`Offender Race` = colDef(maxWidth = 75,
minWidth = NULL,
align = "left",
aggregate = "unique"),
`Race Pair` = colDef(maxWidth = 90,
minWidth = NULL,
align = "left",
aggregate = "unique"),
`Victim Race` = colDef(maxWidth = 65,
minWidth = NULL,
align = "left",
aggregate = "unique"),
`% of Total` = colDef(maxWidth = 50,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`Total Victim Race` = colDef(maxWidth = 60),
`Total Inter-racial Pairs` = colDef(maxWidth = 55),
`Total Inter-racial` = colDef(maxWidth = 60),
`% of Victim Race` = colDef(maxWidth = 60,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`% of Offender Race` = colDef(maxWidth = 75,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`% Inter-racial` = colDef(maxWidth = 50,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`% Inter-racial Pairs` = colDef(maxWidth = 55,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2))
)
)Converting from long to wide format…
m_19 <- dt_ncvs_19 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Victims) %>%
dplyr::mutate(
Victims = round(Victims),
`Victim Race` = paste0(`Victim Race`, " Victims")) %>%
tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>%
dplyr::group_by(`Offender Race`) %>%
dplyr::mutate(
`Total Offending` = `White Victims` + `Black Victims` + `Hispanic Victims`,
`% of Victims` = list(c(`White Victims`,`Black Victims`, `Hispanic Victims`))
# ,`Victim Counts` = list(c(`White Victims`,`Black Victims`,
# `Hispanic Victims`,`Asian Victims`))
)
bar_colors <- c("#56B4E9", "#000000", "#D55E00")
bar_color_list <- list(bar_colors, bar_colors, bar_colors, bar_colors)
reactable::reactable(m_19, striped = TRUE, highlight = TRUE,
bordered = TRUE, compact = FALSE, resizable = FALSE,
# rowStyle = function(index) {if (index == 5) list(fontWeight = "bold")},
# rowClass = function(index) {if (index == 5) {"bold"}},
defaultColDef = colDef(
format = colFormat(separators = TRUE, digits = 0),
footerStyle = list(fontWeight = "bold"),
footer = function(values) {
if (!is.numeric(values)) return()
sparkline(values, type = "pie", width = 80, height = 30,
sliceColors = c("#56B4E9", "#000000", "#D55E00"))
}),
columns = list(
`Offender Race` = colDef(footer = "% of Offenders"),
`% of Victims` = colDef(cell = function(values) {
if (!is.numeric(values)) return()
sparkline(values, type = "pie", width = 80, height = 30,
sliceColors = c("#56B4E9", "#000000", "#D55E00"))
})
# ,`Victim Counts` = colDef(cell = function(values, index) {
# sparkline(values, type = "bar",
# barWidth = 10,
# #barColor = c("#56B4E9", "#000000", "#D55E00", "#F0E442"),
# barColor = bar_colors[[index]],
# stackedBarColor = bar_colors[[index]],
# chartRangeMin = 0,
# chartRangeMax = unlist(m$`Victim Counts`[[index]])
# )
# })
)
)Violent Victimizations by Race of Offender and Victim (2019)
#devtools::install_github("mattflor/chorddiag")
library(chorddiag)
m <- dt_ncvs_19 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Victims) %>%
dplyr::mutate(Victims = round(Victims)) %>%
tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims)
matrix_names <- list(Offender = m$`Offender Race`, Victim = names(m)[-1])
m <- as.matrix(m[,-1])
dimnames(m) <- matrix_names
groupColors <- c("#56B4E9", "#000000", "#D55E00")
chorddiag(m, groupColors = groupColors, groupThickness = 0.1, groupPadding = 2, groupnamePadding = 30)library(plotly)
d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2019 victimizations"))
g <- ggplot(d_gg,
aes(x = `Victim Race`,
y = Victims,
fill = `Offender Race`)) +
geom_col(alpha = 0.75, aes(text = Label)) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06),
#labels = scales::unit_format(scale = .001, suffix = "k")
labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12)) +
ggtitle("Violent incidents by victim and offender race (2019)") +
labs(
x = "Victim Race",
y = "Number of violent incidents") +
geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1),
aes(y = `Total Victim Race`, label = victimLabel),
vjust = 0, family = "Roboto Condensed")
sources_label <- "Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>"
p <- plotly::ggplotly(g, width = NULL, height = NULL, #plotly_graph_heights,
tooltip = c("text"), dynamicTicks = FALSE) %>%
plotly::config(scrollZoom = FALSE) %>% plotly::layout(dragmode='pan') %>%
#plotly::hide_legend() %>%
layout(annotations =
list(x = 0.5, y = -0.36, text = sources_label,
showarrow = F, xref='paper', yref='paper',
xanchor='auto', yanchor='auto', xshift=0, yshift=0,
font=list(size=12, color="black"))
)
p# install.packages("remotes")
# remotes::install_github("coolbutuseless/ggpattern")
library(ggpattern)
d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`,
format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2019 victimizations"))
ggplot(d_gg, aes(x = `Victim Race`, y = Victims, fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none", "stripe", "circle", "crosshatch" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Hispanic="#D55E00",Black="#000000",Asian="#F0E442")) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,90)) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06),
#labels = scales::unit_format(scale = .001, suffix = "k")
labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12)) +
ggtitle("Nonfatal Violent incidents by victim and offender race (2019)",
subtitle = "National Crime Victimization Survey") +
labs(
x = "Victim Race", y = "Number of violent incidents",
caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1),
aes(y = `Total Victim Race`, label = victimLabel),
vjust = 0, family = "Roboto Condensed") ##rpubs not publishing due to file size
#install.packages("treemapify")
library(treemapify)
d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2019 victimizations"),
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
Label = paste0(Category, "\n",
signs::signs(Victims, accuracy = 100, format = scales::comma),
"\n",
signs::signs(`% of Total`, accuracy = 1,
format = scales::percent)),
colorText = "white" #ifelse(`Offender Race` == "Asian", "Black", "White")
)
ggplot(d_gg, aes(area = Victims, label = Label, fill = `Offender Race`,
subgroup = Type,
subgroup2 = `Offender Race`,
subgroup3 = `Victim Race`)) +
geom_treemap(alpha = 0.95) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00")) + #"#F0E442")) +
scale_color_manual(values = c("white")) +
#ggthemes::scale_fill_colorblind() +
#ggthemes::scale_fill_ptol() +
geom_treemap_text(aes(color = "white"), place = "topleft",
reflow = T, family = "Roboto Condensed") +
geom_treemap_subgroup_border(size = 6, alpha = 1) +
geom_treemap_subgroup_text(place = "bottom", grow = T, alpha = 0.4,
colour = "white", fontface = "italic",
min.size = 0,
family = "Roboto Condensed", reflow = TRUE) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
ggtitle("Nonfatal Violent incidents within and between races (2019)",
subtitle = NULL) +
labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) Focusing specifically on crime involving Blacks and Whites:
library(treemapify)
d_gg <- ncvs_t15 %>%
dplyr::rename(`Total Victim Race` = Total) %>%
tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`),
names_to = "Offender Race",
values_to = "Victimizations") %>%
#dplyr::filter(`Offender Race` != "Total") %>%
dplyr::filter(`Victim Race` %in% c("White", "Black"),
`Offender Race` %in% c("White", "Black")) %>%
dplyr::mutate(
Percent = Victimizations/`Total Victim Race`,
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])) %>%
process_online_ncvs_data(.) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`,
format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2019 victimizations"),
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100,
format = scales::comma),
"\n",
signs::signs(`% of Total`, accuracy = 1,
format = scales::percent)),
#colorText = ifelse(`Offender Race` == "Asian", "Black", "White"),
`Victim Race` = factor(`Victim Race`, levels = c("White", "Black")),
`Offender Race` = factor(`Offender Race`, levels = c("White", "Black"))
)
ggplot(d_gg, aes(area = Victims, label = Label, fill = `Offender Race`,
subgroup = Type,
subgroup2 = `Offender Race`,
subgroup3 = `Victim Race`)) +
geom_treemap(alpha = 0.95) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_color_manual(values = c("black", "white")) +
#ggthemes::scale_fill_colorblind() +
#ggthemes::scale_fill_ptol() +
geom_treemap_text(color = "white", place = "topleft",
reflow = T, family = "Roboto Condensed") +
#geom_treemap_text(aes(color = colorText), place = "topleft", reflow = T, family = "Roboto Condensed") +
geom_treemap_subgroup_border(size = 6, alpha = 1) +
geom_treemap_subgroup_text(place = "bottom", grow = T,
alpha = 0.4, colour = "white",
fontface = "italic", min.size = 0,
family = "Roboto Condensed", reflow = TRUE) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
ggtitle("Nonfatal Violent incidents among Blacks and Whites (2019)",
subtitle = NULL) +
labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) race_levels <- c("White", "Black", "Hispanic")
library(ggpattern)
d_gg <- dt_ncvs_19 %>%
tibble::as_tibble() %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Offender Race` %in% race_levels,
`Victim Race` %in% race_levels) %>%
dplyr::mutate(
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
##Label = paste0(rateLabel, "\n(", countLabel, ")"),
Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels),
`Race Pair` = factor(`Race Pair`, levels = c("White-Black",
"White-Hispanic",
"Black-Hispanic"
# "White-Asian",
# "Black-Asian",
# "Hispanic-Asian"
))
)
ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none", "stripe", "circle", "crosshatch" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
Black="#000000",
Hispanic = "#D55E00",
Asian = "#F0E442")) +
#ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
#scale_fill_manual(values = c("#56B4E9", "#000000")) +
scale_fill_manual(values = c( "#56B4E9","#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL,
expand = expansion(mult = c(0, 0.25)),
labels = scales::unit_format(scale = .001, suffix = "k")) +
#labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
legend.position = "none",
axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Nonfatal Interracial Violent Crime (2019)",
subtitle = NULL) +
#subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
#subtitle = "") +
labs(
x = "", y = "Number of violent incidents",
caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
geom_text(aes(y = Victims, label = Label),
vjust = 0, nudge_y = 50, family = "Roboto Condensed",
fill = NA, size = 4, fontface = "bold") +
facet_wrap(~`Race Pair`, scales = "free", ncol = 3)library(ggpattern)
d_gg <- dt_ncvs_19 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Victim Race` %in% c("White", "Black"),
`Offender Race` %in% c("White", "Black")) %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, " (", percent_interracial_pair, ")"))
ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
Black="#000000")) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
scale_fill_manual(values = c("#56B4E9", "#000000")) +
#scale_fill_manual(values = c( "#000000", "#56B4E9", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2)),
labels = scales::unit_format(scale = .001, suffix = "k")) +
#labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
legend.position = "none",
axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Nonfatal Black-White Interracial Violence (2019)",
subtitle = NULL) +
#subtitle = "") +
labs(
x = "", y = "Number of violent incidents",
caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=7046/>")) +
geom_text(aes(y = `Victims`, label = Label),
vjust = 0, nudge_y = 5000, family = "Roboto Condensed",
fill = NA, size = 5, fontface = "bold") race_levels <- c("White", "Black", "Hispanic")
library(ggpattern)
d_gg <- dt_ncvs_19 %>%
tibble::as_tibble() %>%
dplyr::left_join(census_data %>%
dplyr::select(`Offender Race` = Race, Population),
by = c("Offender Race")) %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Offender Race` %in% race_levels,
`Victim Race` %in% race_levels) %>%
dplyr::mutate(
`Victimization Rate` = 1e06*Victims / Population,
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
rateLabel = paste0(signs::signs(`Victimization Rate`,
format = scales::comma,
accuracy = 1),
" per 1M ",
`Offender Race`, "s"),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(rateLabel, "\n", countLabel, " (",
percent_interracial_pair, ")"),
#Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels),
`Race Pair` = factor(`Race Pair`, levels = c("White-Black",
"White-Hispanic",
"Black-Hispanic"
# "White-Asian",
# "Black-Asian",
# "Hispanic-Asian"
))
)
ggplot(d_gg, aes(x = Category,
y = `Victimization Rate`,
fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none", "stripe", "circle", "crosshatch" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
Black="#000000",
Hispanic = "#D55E00",
Asian = "#F0E442")) +
#ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
#scale_fill_manual(values = c("#56B4E9", "#000000")) +
scale_fill_manual(values = c( "#56B4E9","#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.25))) + #, labels = scales::unit_format(scale = .001, suffix = "k")) +
#labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
legend.position = "none",
axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Rate of Interracial Nonfatal Violent Crime by Offender Population (2019)",
subtitle = NULL) +
#subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
#subtitle = "") +
labs(
x = "", y = "Interracial Nonfatal Violent Crime Rate",
caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2019.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
geom_text(aes(y = `Victimization Rate`, label = Label),
vjust = 0, nudge_y = 50,
family = "Roboto Condensed",
fill = NA, size = 3.5,
fontface = "bold") +
facet_wrap(~`Race Pair`, scales = "free", ncol = 3)The original table provided by BJS looks like the following:
## Online BJS Files
##National Crime Victimization Survey
##Download NCVS 2018 (released 2019)
library(tidyverse)
library(data.table)
library(gt)
library(extrafont)
gt(ncvs_t14) %>%
tab_header(
title = "Violent incidents by victim and offender race (2018)"
#subtitle = ""
) %>%
fmt_number(
data = .,
columns = c(8),
rows = NULL,
decimals = 0,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
scale_by = 1,
suffixing = FALSE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
fmt_percent(
data = .,
columns = c(2:7),
placement = "right",
decimals = 2,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
tab_spanner(
label = "Offender Race",
columns = vars(White, Black, Hispanic, Asian, Other, Multiple)) %>%
tab_footnote(
footnote = "Includes Native Americans, Hawaiians, and multiracial individuals.",
locations = cells_column_labels(vars(Other))) %>%
tab_footnote(
footnote = "Multiple offenders of various races.",
locations = cells_column_labels(vars(Multiple))) %>%
tab_options(
footnotes.font.size = 11
) %>%
tab_style(
style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
locations = list(cells_body(columns = 2:8))
) %>%
tab_style(
style = cell_text(align = "left", font = "Arial Narrow", size = "medium"),
locations = list(cells_body(columns = 1))
) %>%
tab_style(
style = cell_text(align = "center", font = "Arial Narrow"),
locations = list(cells_column_labels(columns = 1:8),
cells_column_spanners(spanners ="Offender Race" ),
cells_title(groups = c("title")),
cells_stubhead())
) %>%
tab_style(
style = cell_text(weight = "bolder"),
locations = list(cells_title(groups = c("title")))
) %>%
tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2018.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686) \nFilename: cv18t14.csv")) | Violent incidents by victim and offender race (2018) | |||||||
|---|---|---|---|---|---|---|---|
| Victim Race | Offender Race | Victimizations | |||||
| White | Black | Hispanic | Asian | Other1 | Multiple2 | ||
| White | 62.10% | 15.30% | 10.20% | 2.20% | 8.10% | 2.10% | 3,581,360 |
| Black | 10.60% | 70.30% | 7.90% | 0.10% | 9.30% | 1.90% | 563,940 |
| Hispanic | 28.20% | 15.30% | 45.40% | 0.60% | 7.40% | 3.00% | 734,410 |
| Asian | 24.10% | 27.50% | 7.00% | 24.10% | 14.40% | 2.90% | 182,230 |
| Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. Filename: cv18t14.csv |
|||||||
|
1
Includes Native Americans, Hawaiians, and multiracial individuals.
2
Multiple offenders of various races.
|
|||||||
From these percentages, we can derive the frequencies (counts). Below I show the frequencies of violent victimizations and transpose the table so that the rows show Offender Race and the columns show Vicim Race.
library(gt)
m <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Victims) %>%
dplyr::mutate(Victims = round(Victims)) %>%
tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>%
dplyr::group_by(`Offender Race`) %>%
dplyr::mutate(`Total Offending` = `White` + `Black` + `Hispanic` + `Asian`) %>%
dplyr::ungroup()
gt(m) %>%
tab_header(
title = "Violent Victimizations by Race of Offender and Victim (2018)"
#subtitle = "Transposed"
) %>%
fmt_number(
data = .,
columns = c(2:6),
rows = NULL,
decimals = 0,
drop_trailing_zeros = FALSE,
use_seps = TRUE,
scale_by = 1,
suffixing = FALSE,
pattern = "{x}",
sep_mark = ",",
dec_mark = ".",
locale = NULL) %>%
tab_spanner(
label = "Victim Race",
columns = vars(White, Black, Hispanic, Asian)) %>%
summary_rows(
groups = NULL,
columns = vars(`White`,`Black`, `Hispanic`, `Asian`, `Total Offending`),
fns = list(`Total` = ~sum(.x, na.rm=TRUE)),
decimals = 0,
use_seps = TRUE,
sep_mark = ",") %>%
tab_style(
style = cell_text(size = "small", align = "right", font = "Roboto Condensed"),
locations = list(cells_body(columns = 1:7),
cells_stub(),
cells_grand_summary(columns = 1:7, rows = "Total"))
) %>%
tab_style(
style = cell_text(align = "left", font = "Arial Narrow", size = "medium"),
locations = list(cells_body(columns = 2))
) %>%
tab_style(
style = cell_text(align = "center", font = "Arial Narrow"),
locations = list(cells_column_labels(columns = 1:6),
cells_column_spanners(spanners ="Victim Race" ),
cells_title(groups = c("title")),
cells_stubhead())
) %>%
tab_style(
style = cell_text(weight = "bolder"),
locations = list(cells_title(groups = c("title")))
) %>%
tab_source_note(source_note = md("Source: [Bureau of Justice Statistics, National Crime Victimization Survey, 2018.](https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686) \nFilename: cv18t14.csv")) | Violent Victimizations by Race of Offender and Victim (2018) | ||||||
|---|---|---|---|---|---|---|
| Offender Race | Victim Race | Total Offending | ||||
| White | Black | Hispanic | Asian | |||
| White | 2,224,025 | 59,778 | 207,104 | 43,917 | 2,534,824 | |
| Black | 547,948 | 396,450 | 112,365 | 50,113 | 1,106,876 | |
| Hispanic | 365,299 | 44,551 | 333,422 | 12,756 | 756,028 | |
| Asian | 78,790 | 564 | 4,406 | 43,917 | 127,677 | |
| Total | — | 3,216,062 | 501,343 | 657,297 | 150,703 | 4,525,405 |
| Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. Filename: cv18t14.csv |
||||||
From the table above provided by the BJS, we can derive the frequencies of both offenders and victims by race. Below I have removed Offenders classified as ‘Other’ or ‘Multiple’ race. but not before calculating the total number of victimizations including them. Keep in mind is that these figures include only victimizations for which the race of both the offender and victim are known.
For example, reading across the second row involving White offenders and Black victims, we see that these involved only 1.18% of all victimizations (in which the race of offender and victim are both known); 10.6% of victimizations of Black victims; 2.36% of victimizations involving White offenders; 2.9% of all interracial violent victimizations; and approximately 10% of violent interracial victimizations between Blacks and Whites, the other 90% involving Black offenders and White victims.
library(reactable)
reactable::reactable(dt_ncvs_18, striped = TRUE, highlight = TRUE,
bordered = TRUE, pagination = FALSE,
compact = TRUE, resizable = TRUE,
filterable = TRUE,
groupBy = NULL,
defaultColDef = colDef(
#minWidth = 85,
maxWidth = 65,
align = "center",
style = list(fontSize = 11,
fontFamily = "'Roboto Condensed',
'Arial Narrow', Merriweather, Arial,
Helvetica, sans-serif"),
format = list(cell = colFormat(separators = TRUE,
percent = FALSE, digits = 0)),
headerStyle = list(fontWeight = "bold", fontSize = 12),
footerStyle = list(fontWeight = "bold"))
,columns = list(
`Offender Race` = colDef(maxWidth = 75,
minWidth = NULL,
align = "left",
aggregate = "unique"),
`Race Pair` = colDef(maxWidth = 90,
minWidth = NULL,
align = "left",
aggregate = "unique"),
`Victim Race` = colDef(maxWidth = 65,
minWidth = NULL,
align = "left",
aggregate = "unique"),
`% of Total` = colDef(maxWidth = 50,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`Total Victim Race` = colDef(maxWidth = 60),
`Total Inter-racial Pairs` = colDef(maxWidth = 55),
`Total Inter-racial` = colDef(maxWidth = 60),
`% of Victim Race` = colDef(maxWidth = 60,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`% of Offender Race` = colDef(maxWidth = 75,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`% Inter-racial` = colDef(maxWidth = 50,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2)),
`% Inter-racial Pairs` = colDef(maxWidth = 55,
align = "center",
format = colFormat(separators = FALSE,
percent = TRUE,
digits = 2))
)
)The table above is in ‘long-format’. We can simplify greatly by pivoting to wide format as shown below. The pie charts on the far right under the column header % of Victims show the racial make-up of victims of offenders of each race.
For example, reading from left to right across the top row, we see that roughly 88% of the victims of White offenders are themselves White. The pie graphs in the last last row % of Offenders, by contrast, show the racial make-up of the offenders who victimized each racial group. For example, in the Black Victims column we see that roughly 79% of Black victims of violence are victimized by other Blacks.
library(reactable)
library(sparkline)
m <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Victims) %>%
dplyr::mutate(
Victims = round(Victims),
`Victim Race` = paste0(`Victim Race`, " Victims")) %>%
tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims) %>%
dplyr::group_by(`Offender Race`) %>%
dplyr::mutate(
`Total Offending` = `White Victims` + `Black Victims` +
`Hispanic Victims` + `Asian Victims`,
`% of Victims` = list(c(`White Victims`,`Black Victims`,
`Hispanic Victims`,`Asian Victims`))
# ,`Victim Counts` = list(c(`White Victims`,`Black Victims`,
# `Hispanic Victims`,`Asian Victims`))
)
bar_colors <- c("#56B4E9", "#000000", "#D55E00", "#F0E442")
bar_color_list <- list(bar_colors, bar_colors, bar_colors, bar_colors)
reactable::reactable(m, striped = TRUE, highlight = TRUE,
bordered = TRUE, compact = FALSE, resizable = FALSE,
# rowStyle = function(index) {if (index == 5) list(fontWeight = "bold")},
# rowClass = function(index) {if (index == 5) {"bold"}},
defaultColDef = colDef(
format = colFormat(separators = TRUE, digits = 0),
footerStyle = list(fontWeight = "bold"),
footer = function(values) {
if (!is.numeric(values)) return()
sparkline(values, type = "pie", width = 80, height = 30,
sliceColors = c("#56B4E9", "#000000", "#D55E00", "#F0E442"))
}),
columns = list(
`Offender Race` = colDef(footer = "% of Offenders"),
`% of Victims` = colDef(cell = function(values) {
if (!is.numeric(values)) return()
sparkline(values, type = "pie", width = 80, height = 30,
sliceColors = c("#56B4E9", "#000000", "#D55E00", "#F0E442"))
})
# ,`Victim Counts` = colDef(cell = function(values, index) {
# sparkline(values, type = "bar",
# barWidth = 10,
# #barColor = c("#56B4E9", "#000000", "#D55E00", "#F0E442"),
# barColor = bar_colors[[index]],
# stackedBarColor = bar_colors[[index]],
# chartRangeMin = 0,
# chartRangeMax = unlist(m$`Victim Counts`[[index]])
# )
# })
)
)Violent Victimizations by Race of Offender and Victim (2018)
The information above can be encoded in a chord diagram, shown below. The ‘total’ for each race, depicted by the color and size of the outside segments correspond to the offender race and not the race of the victim.
library(chorddiag)
m <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` %in% c("White", "Black", "Hispanic", "Asian")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Victims) %>%
dplyr::mutate(Victims = round(Victims)) %>%
tidyr::pivot_wider(names_from = `Victim Race`, values_from = Victims)
matrix_names <- list(Offender = m$`Offender Race`, Victim = names(m)[-1])
m <- as.matrix(m[,-1])
dimnames(m) <- matrix_names
groupColors <- c("#56B4E9", "#000000", "#D55E00", "#F0E442")
chorddiag(m, groupColors = groupColors, groupThickness = 0.1,
groupPadding = 2, groupnamePadding = 30)Chord diagram showing violent victimizations by race of offender and victim (2018)
Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018. : cv18t14.csv
library(plotly)
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2018 victimizations"))
g <- ggplot(d_gg, aes(x = `Victim Race`, y = Victims,
fill = `Offender Race`)) +
geom_col(alpha = 0.75, aes(text = Label)) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06),
#labels = scales::unit_format(scale = .001, suffix = "k")
labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12)) +
ggtitle("Violent incidents by victim and offender race (2018)") +
labs(
x = "Victim Race",
y = "Number of violent incidents") +
geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1),
aes(y = `Total Victim Race`, label = victimLabel),
vjust = 0, family = "Roboto Condensed")
sources_label <- "Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>"
p <- plotly::ggplotly(g, width = NULL, height = NULL, #plotly_graph_heights,
tooltip = c("text"), dynamicTicks = FALSE) %>%
plotly::config(scrollZoom = FALSE) %>% plotly::layout(dragmode='pan') %>%
#plotly::hide_legend() %>%
layout(annotations =
list(x = 0.5, y = -0.36, text = sources_label,
showarrow = F, xref='paper', yref='paper',
xanchor='auto', yanchor='auto', xshift=0, yshift=0,
font=list(size=12, color="black"))
)
pHere is a more printer-friendly version using the ggpattern package.
library(ggpattern)
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2018 victimizations"))
ggplot(d_gg, aes(x = `Victim Race`, y = Victims,
fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none", "stripe",
"circle", "crosshatch" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Hispanic="#D55E00",Black="#000000",Asian="#F0E442")) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,90)) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, breaks = c(0, 1e06, 2e06, 3e06),
#labels = scales::unit_format(scale = .001, suffix = "k")
labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 12),
axis.title.y = element_text(size = 12)) +
ggtitle("Nonfatal Violent incidents by victim and offender race (2018)",
subtitle = "National Crime Victimization Survey") +
labs(
x = "Victim Race", y = "Number of violent incidents",
caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
geom_text(data = d_gg %>% dplyr::group_by(`Victim Race`) %>% dplyr::slice(1),
aes(y = `Total Victim Race`, label = victimLabel),
vjust = 0, family = "Roboto Condensed") Below I use the treemapify package.
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`,
format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2018 victimizations"),
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100,
format = scales::comma),
"\n",
signs::signs(`% of Total`, accuracy = 1,
format = scales::percent)),
colorText = ifelse(`Offender Race` == "Asian", "Black", "White")
)
ggplot(d_gg, aes(area = Victims,
label = Label,
fill = `Offender Race`,
subgroup = Type,
subgroup2 = `Offender Race`,
subgroup3 = `Victim Race`)) +
geom_treemap(alpha = 0.95) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_color_manual(values = c("black", "white")) +
#ggthemes::scale_fill_colorblind() +
#ggthemes::scale_fill_ptol() +
geom_treemap_text(aes(color = colorText),
place = "topleft", reflow = T,
family = "Roboto Condensed") +
geom_treemap_subgroup_border(size = 6, alpha = 1) +
geom_treemap_subgroup_text(place = "bottom",
grow = T, alpha = 0.4, colour = "white",
fontface = "italic", min.size = 0,
family = "Roboto Condensed",
reflow = TRUE) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
ggtitle("Nonfatal Violent incidents within and between races (2018)",
subtitle = NULL) +
labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) Focusing specifically on crime involving Blacks and Whites.
library(treemapify)
d_gg <- ncvs_t14 %>%
dplyr::rename(`Total Victim Race` = Victimizations) %>%
tidyr::pivot_longer(-c(`Victim Race`, `Total Victim Race`),
names_to = "Offender Race",
values_to = "Percent") %>%
#dplyr::filter(`Offender Race` != "Total") %>%
dplyr::filter(`Victim Race` %in% c("White", "Black"),
`Offender Race` %in% c("White", "Black")) %>%
dplyr::mutate(
Victimizations = `Total Victim Race` * (Percent),
`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])) %>%
process_online_ncvs_data(.) %>%
dplyr::mutate(
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_offender_label = signs::signs(`% of Offender Race`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`, format = scales::percent),
Label = paste0(countLabel, "\n",
percent_victim_label, " of ",
`Victim Race`, " victims\n",
percent_offender_label, " of ",
`Offender Race`, " offender victimizations\n",
percent_total_label, " of all 2018 victimizations"),
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
Label = paste0(Category, "\n", signs::signs(Victims, accuracy = 100,
format = scales::comma),
"\n",
signs::signs(`% of Total`, accuracy = 1,
format = scales::percent)),
#colorText = ifelse(`Offender Race` == "Asian", "Black", "White"),
`Victim Race` = factor(`Victim Race`, levels = c("White", "Black")),
##for ordering graphs
`Offender Race` = factor(`Offender Race`, levels = c("White", "Black"))
)
ggplot(d_gg, aes(area = Victims,
label = Label,
fill = `Offender Race`,
subgroup = Type,
subgroup2 = `Offender Race`,
subgroup3 = `Victim Race`)) +
geom_treemap(alpha = 0.95) +
scale_fill_manual(values = c("#56B4E9", "#000000", "#D55E00", "#F0E442")) +
scale_color_manual(values = c("black", "white")) +
#ggthemes::scale_fill_colorblind() +
#ggthemes::scale_fill_ptol() +
geom_treemap_text(color = "white", place = "topleft",
reflow = T, family = "Roboto Condensed") +
#geom_treemap_text(aes(color = colorText), place = "topleft", reflow = T, family = "Roboto Condensed") +
geom_treemap_subgroup_border(size = 6, alpha = 1) +
geom_treemap_subgroup_text(place = "bottom", grow = T,
alpha = 0.4, colour = "white",
fontface = "italic", min.size = 0,
family = "Roboto Condensed",
reflow = TRUE) +
hrbrthemes::theme_ipsum_rc(base_size = 12) +
ggtitle("Nonfatal Violent incidents among Blacks and Whites (2018)",
subtitle = NULL) +
labs(caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n <https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) race_levels <- c("White", "Black", "Hispanic", "Asian")
library(ggpattern)
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Offender Race` %in% race_levels,
`Victim Race` %in% race_levels) %>%
dplyr::mutate(
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
##Label = paste0(rateLabel, "\n(", countLabel, ")"),
Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels),
`Race Pair` = factor(`Race Pair`, levels = c("White-Black", "White-Hispanic",
"White-Asian", "Black-Hispanic",
"Black-Asian", "Hispanic-Asian"))
)
ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none", "stripe",
"circle", "crosshatch" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
Black="#000000",
Hispanic = "#D55E00",
Asian = "#F0E442")) +
#ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
#scale_fill_manual(values = c("#56B4E9", "#000000")) +
scale_fill_manual(values = c( "#56B4E9","#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL,
expand = expansion(mult = c(0, 0.25)),
labels = scales::unit_format(scale = .001, suffix = "k")) +
#labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
legend.position = "none",
axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Nonfatal Interracial Violent Crime (2018)",
subtitle = NULL) +
#subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
#subtitle = "") +
labs(
x = "", y = "Number of violent incidents",
caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
geom_text(aes(y = Victims, label = Label),
vjust = 0, nudge_y = 50,
family = "Roboto Condensed",
fill = NA, size = 4,
fontface = "bold") +
facet_wrap(~`Race Pair`, scales = "free", ncol = 2)library(ggpattern)
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Victim Race` %in% c("White", "Black"),
`Offender Race` %in% c("White", "Black")) %>%
#dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::mutate(
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(countLabel, " (", percent_interracial_pair, ")"))
ggplot(d_gg, aes(x = Category, y = Victims, fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
Black="#000000")) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
scale_fill_manual(values = c("#56B4E9", "#000000")) +
#scale_fill_manual(values = c( "#000000", "#56B4E9", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2)),
labels = scales::unit_format(scale = .001, suffix = "k")) +
#labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(
linetype = 2), panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
legend.position = "none",
axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Black-White Nonfatal Interracial Violence (2018)",
subtitle = NULL) +
#subtitle = "") +
labs(
x = "", y = "Number of nonfatal violent incidents",
caption = sprintf("Source: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>")) +
geom_text(aes(y = `Victims`, label = Label),
vjust = 0, nudge_y = 5000,
family = "Roboto Condensed",
fill = NA,
size = 5, fontface = "bold") race_levels <- c("White", "Black", "Hispanic", "Asian")
library(ggpattern)
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::left_join(census_data %>%
dplyr::select(`Offender Race` = Race,
Population),
by = c("Offender Race")) %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Offender Race` %in% race_levels,
`Victim Race` %in% race_levels) %>%
dplyr::mutate(
`Victimization Rate` = 1e06*Victims / Population,
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`,
format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
rateLabel = paste0(signs::signs(`Victimization Rate`,
format = scales::comma, accuracy = 1),
" per 1 million ",
`Offender Race`, "s"),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(rateLabel, "\n", countLabel, " (",
percent_interracial_pair, ")"),
#Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels),
`Race Pair` = factor(`Race Pair`, levels = c("White-Black",
"White-Hispanic",
"White-Asian",
"Black-Hispanic",
"Black-Asian",
"Hispanic-Asian"))
)
ggplot(d_gg, aes(x = Category,
y = `Victimization Rate`,
fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none", "stripe",
"circle", "crosshatch" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",
Black="#000000",
Hispanic = "#D55E00",
Asian = "#F0E442")) +
#ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.25, 0.35)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
#scale_fill_manual(values = c("#56B4E9", "#000000")) +
scale_fill_manual(values = c( "#56B4E9","#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL,
expand = expansion(mult = c(0, 0.25))) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(linetype = 2),
panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14),
axis.title.y = element_text(size = 14),
legend.position = "none",
axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Rate of Interracial Nonfatal Violent Crime by Offender Population (2018)",
subtitle = NULL) +
#subtitle = "US Census 5-Year American Community Survey (2014-2018)") +
#subtitle = "") +
labs(
x = "", y = "Interracial Nonfatal Violent Crime Rate",
caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
geom_text(aes(y = `Victimization Rate`, label = Label),
vjust = 0, nudge_y = 50,
family = "Roboto Condensed", fill = NA,
size = 4, fontface = "bold") +
facet_wrap(~`Race Pair`, scales = "free", ncol = 2)race_levels <- c("White", "Black", "Hispanic", "Asian")
library(ggpattern)
d_gg <- dt_ncvs_18 %>% tibble::as_tibble() %>%
dplyr::left_join(census_data %>% dplyr::select(`Offender Race` = Race,
Population),
by = c("Offender Race")) %>%
dplyr::filter(`Offender Race` != `Victim Race`) %>%
dplyr::filter(`Victim Race` %in% c("White", "Black"),
`Offender Race` %in% c("White", "Black")) %>%
dplyr::filter(`Offender Race` %in% race_levels,
`Victim Race` %in% race_levels) %>%
dplyr::mutate(
`Victimization Rate` = 1e06*Victims / Population,
Category = paste0(`Offender Race`, " on ", `Victim Race`),
Type = ifelse(`Victim Race` == `Offender Race`, "Within-Race", "Between-Race"),
victimLabel = signs::signs(`Total Victim Race`, format = scales::comma),
countLabel = signs::signs(Victims, format = scales::comma),
rateLabel = paste0(signs::signs(`Victimization Rate`,
format = scales::comma, accuracy = 1),
" per 1 million ",
`Offender Race`, "s"),
percent_interracial_pair = signs::signs(`% Inter-racial Pairs`,
format = scales::percent),
percent_victim_label = signs::signs(`% of Victim Race`,
format = scales::percent),
percent_total_label = signs::signs(`% of Total`,
format = scales::percent),
Label = paste0(rateLabel, "\n", countLabel, " (",
percent_interracial_pair, ")"),
#Label = paste0(countLabel, " (", percent_interracial_pair, ")"),
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels),
`Race Pair` = factor(`Race Pair`, levels = c("White-Black",
"White-Hispanic",
"White-Asian",
"Black-Hispanic",
"Black-Asian",
"Hispanic-Asian"))
)
ggplot(d_gg, aes(x = Category, y = `Victimization Rate`, fill = `Offender Race`)) +
ggpattern::geom_col_pattern(
aes(pattern = `Offender Race`,
fill = `Offender Race`,
pattern_fill = `Offender Race`,
pattern_angle = `Offender Race`,
pattern_spacing = `Offender Race`
),
colour = 'white',
pattern_density = 0.35,
pattern_colour = 'white'
) +
ggpattern::scale_pattern_discrete(choices = c("none","stripe" )) +
ggpattern::scale_pattern_fill_manual(values = c(White="#56B4E9",Black="#000000", Hispanic = "#D55E00", Asian = "#F0E442")) +
ggpattern::scale_pattern_spacing_discrete(range = c(0.01, 0.10)) +
ggpattern::scale_pattern_angle_discrete(range = c(0,45)) +
#scale_fill_manual(values = c("#56B4E9", "#000000")) +
scale_fill_manual(values = c( "#56B4E9","#000000", "#D55E00", "#F0E442")) +
scale_y_continuous(minor_breaks = NULL, expand = expansion(mult = c(0, 0.2))) + #, labels = scales::unit_format(scale = .001, suffix = "k")) +
#labels = c("0", "1M", "2M", "3M")) +
hrbrthemes::theme_ipsum_rc(base_size = 13) +
theme(panel.grid.major.y = element_line(linetype = 2), panel.grid.major.x = element_blank(),
axis.title.x = element_text(size = 14), axis.title.y = element_text(size = 14),
legend.position = "none", axis.text.x = element_text(size = 13, face = "bold")
) +
ggtitle("Rate of Black-White Interracial Violent Crime by Offender Population (2018)", subtitle = "Does not include murder") +
#subtitle = "") +
labs(
x = "", y = "Interracial Violent Crime Rate",
caption = sprintf("Sources: Bureau of Justice Statistics, National Crime Victimization Survey, 2018.\n<https://www.bjs.gov/index.cfm?ty=pbdetail&iid=6686/>\nUS Census 5-Year American Community Survey (2014-2018)")) +
geom_text(aes(y = `Victimization Rate`, label = Label), vjust = 0, nudge_y = 50, family = "Roboto Condensed", fill = NA, size = 5, fontface = "bold") All of the following tables and graphs are derived from the National Crime Victimization Surveys (NCVS) available at the Inter-university Consortium for Political and Social Research (ICPSR). The ICPSR data can be found at the following links:
The code I use to derive the victim and offender counts can be found below.
library(tidyverse)
library(data.table)
get_crime <- function(x){
data.table::fcase(
##Violent Crimes 1-20
x == 1, "Completed rape",
x == 2, "Attempted rape",
x == 3, "Sexual attack with serious assault",
x == 4, "Sexual attack with minor assault",
x == 5, "Completed robbery with injury from serious assault",
x == 6, "Completed robbery with injury from minor assault",
x == 7, "Completed robbery without injury from minor assault",
x == 8, "Attempted robbery with injury from serious assault",
x == 9, "Attempted robbery with injury from minor assault",
x == 10, "Attempted robbery without injury",
x == 11, "Completed aggravated assault with injury",
x == 12, "Attempted aggravated assault with weapon",
x == 13, "Threatened assault with weapon",
x == 14, "Simple assault completed with injury",
x == 15, "Sexual assault without injury",
x == 16, "Unwanted sexual contact without force",
x == 17, "Assault without weapon without injury",
x == 18, "Verbal threat of rape",
x == 19, "Verbal threat of sexual assault",
x == 20, "Verbal threat of assault",
##Purse snatching/pocket picking 21-23
x == 21, "Completed purse snatching",
x == 22, "Attempted purse snatching",
x == 23, "Pocket picking (completed only)",
##Property Crimes
x == 31, "Completed burglary, forcible entry",
x == 32, "Completed burglary, unlawful entry without force",
x == 33, "Attempted forcible entry",
x == 40, "Completed motor vehicle theft",
x == 41, "Attempted motor vehicle theft",
x == 54, "Completed theft less than $10",
x == 55, "Completed theft $10 to $49",
x == 56, "Completed theft $50 to $249",
x == 57, "Completed theft $250 or greater",
x == 58, "Completed theft value NA",
x == 59, "Attempted theft",
default = NA_character_ )
}
recode_serious_crimes <- function(x){
data.table::fcase(
x %in% 1:4, "Rape/Sexual attack",
x %in% 5:10, "Robbery",
x %in% 11:13, "Assault",
x %in% 14:20, "Minor violent crime",
x %in% 21:23, "Purse snatch/pocket picking",
x %in% 31:59, "Property crimes",
default = NA_character_ )
}
get_race <- function(x){
data.table::fcase(
x == 1, "White",
x == 2, "Black",
x == 3, "American Indian",
x == 4, "Asian",
x == 5, "Hawaiian",
x == 6, "White-Black",
x == 7, "White-American Indian",
x == 8, "White-Asian",
x == 9, "White-Hawaiian",
x == 10, "Black-American Indian",
x == 11, "Black-Asian",
x == 12, "Black-Hawaiian",
x == 13, "American Indian-Asian",
x == 14, "Asian-Hawaiian",
x == 15, "White-Black-American Indian",
x == 16, "White-Black-Asian",
x == 17, "White-American Indian-Asian",
x == 18, "White-Asian-Hawaiian",
x == 19, "2 or 3 races",
x == 20, "4 or 5 races",
default = NA_character_)
}
get_race_category <- function(race, hisp){
data.table::fcase(
race %chin% c("White") & hisp == "Non-Hispanic", "White",
race %chin% c("White") & hisp == "Hispanic", "Hispanic",
##Hispanic getting at mestizos, i.e. Amerindian with Iberian (Spanish) admixture -
## what most people think of as "Hispanic", but classified as "White"
##- ignoring Black and Asian hispanic. There is a very small percentage of white hispanics with no amerindian ancestry who according to this
## criterion should be classified as 'White', but its negligible.
race %chin% c("Black", "White-Black", "Black-Asian", "Black-Hawaiian", "Black-American Indian"), "Black",
race %chin% c("Asian", "Hawaiian", "Asian-Hawaiian"), "Asian", ##Asian/PAcific Islander/Hawaiian
race %chin% c("American Indian"), "American Indian",
##all others, it not missing, "Other"
!is.na(race), "Other",
default = NA_character_
)
}
get_race_category_v2 <- function(race, hisp){
data.table::fcase(
hisp == "Hispanic", "Hispanic",
race %chin% c("White") & hisp == "Non-Hispanic", "White",
race %chin% c("Black") & hisp == "Non-Hispanic", "Black",
race %chin% c("Asian") & hisp == "Non-Hispanic", "Asian",
!is.na(race), "Other",
default = NA_character_
)
}
process_ncvs_002_files <- function(dt){
require(data.table)
require(magrittr)
dt[, .(
IDHH,
HH_WEIGHT = V2116,
WGTHHCY, ##ADJUSTED HOUSEHOLD WEIGHT - COLLECTION YEAR
hh_Race0 = data.table::fcase(
V2031 == 1, "White", ##White, but not specifying Hispanic
V2031 == 2, "Black",
V2031 == 6, "White-Black",
V2031 %in% c(3,21), "Other",
V2031 == 22, "White-Other",
V2031 == 23, "Black-Other",
V2031 == 24, "White-Black-Other",
default = NA_character_),
pp_Race0 = get_race(V2040A),
pp_Hisp = fcase(V2041 == 1, "Hispanic", V2041 == 2, "Non-Hispanic", default = NA_character_),
ref_Race0 = get_race(V2049A),
ref_Hisp = fcase(V2050 == 1, "Hispanic", V2050 == 2, "Non-Hispanic", default = NA_character_))] %>%
.[, `:=`(
hh_Race = data.table::fcase(
hh_Race0 %chin% c("White","White-Other"), "White/Hispanic",
hh_Race0 %chin% c("Black", "Black-Other", "White-Black"), "Black",
##Note: due to social construction, etc. White-Black is coded as Black, but this is negligible
hh_Race0 %chin% c("Other", "White-Black-Other"), "Other",
default = NA_character_),
pp_Race = get_race_category(pp_Race0, pp_Hisp),
ref_Race = get_race_category(ref_Race0, ref_Hisp)
)]
}
process_ncvs_003_files <- function(dt){
require(data.table)
require(magrittr)
dt[, .(
YEARQ,
IDHH, IDPER,
WGTPERCY,
PP_weight = V3080,
Num_incidents = V3081, ##Num Incidents
Race0 = get_race(V3023A),
Hisp = data.table::fcase(V3024 == 1, "Hispanic", V3024 == 2, "Non-Hispanic", default = NA_character_))] %>%
.[, `:=` (
Race = get_race_category(Race0, Hisp),
Race_v2 = get_race_category_v2(Race0, Hisp)
)]
}
process_ncvs_004_files <- function(dt){
require(data.table)
require(magrittr)
## In 2015, incident weight is called V4527
## "Beginning in 2016, the incident weight (V4527) was adjusted to account for series victimizations and was renamed SERIES_IWEIGHT."
if("V4527" %in% names(dt) & !"SERIES_IWEIGHT" %in% names(dt)){
if(max(dt$V4527, na.rm=TRUE) > 0){
data.table::setnames(dt, old = c("V4527"), new = c("SERIES_IWEIGHT"), skip_absent = FALSE)
}
}
##Rule V4022 != 1 excludes cases outside US
dt[V4022 != 1] %>%
.[, V4016 := fifelse(V4016 %in% 11:996, 10, V4016)] %>% ##recode series, no greater than 10]
.[, .(
YEARQ,
Year = V4015,
Incident_num = V4012,
Crime = get_crime(V4529),
Crime_type = recode_serious_crimes(V4529),
Crime_code = V4529,
IDHH,
IDPER,
WGTVICCY = fifelse(V4019 == 2, WGTVICCY * V4016, WGTVICCY), ##adust for series
SERIES_WEIGHT,
Incident_Weight = SERIES_IWEIGHT,
off_Race0 = data.table::fcase(
V4246C == 1 | V4281 == 1 | V4285A == 2, "Black",
V4246B == 1 | V4280 == 1 | V4285A == 1, "White",
V4246D == 1 | V4282A == 1 | V4285A == 3, "Native American",
V4246E == 1 | V4282B == 1 | V4285A == 4, "Asian",
V4246F == 1 | V4282C == 1 | V4285A == 5, "Hawaiian",
default = NA_character_),
##Excluding multiple offenders of various races
off_Race0_v2 = data.table::fcase(
V4285A %in% 1:5, "Multiple",
V4246C == 1 | V4281 == 1, "Black",
V4246B == 1 | V4280 == 1, "White",
V4246D == 1 | V4282A == 1, "Native American",
V4246E == 1 | V4282B == 1, "Asian",
V4246F == 1 | V4282C == 1, "Hawaiian",
default = NA_character_),
off_Hisp = data.table::fcase(
V4237A == 1 | V4252A == 1 | V4252B %in% c(1), "Hispanic",
V4237A == 2 | V4252A == 2 | V4252B %in% c(2), "Non-Hispanic",
default = NA_character_))] %>%
.[, `:=`(
`Offender Race` = data.table::fcase(
off_Race0 == "White" & off_Hisp == "Non-Hispanic", "White",
off_Race0 == "White" & is.na(off_Hisp), "White", ## Counting White with missing Hispanic designation as White
off_Race0 == "White" & off_Hisp == "Hispanic", "Hispanic",
off_Race0 == "Black", "Black",
off_Race0 %chin% c("Asian", "Hawaiian"), "Asian",
off_Race0 == "Native American", "Native American",
default = NA_character_),
##Excluding all hispanic; Other = Hawaiian, Native American
`Offender Race v2` = data.table::fcase(
off_Hisp == "Hispanic", "Hispanic",
off_Race0 == "White" & off_Hisp == "Non-Hispanic", "White",
off_Race0 == "Black" & off_Hisp == "Non-Hispanic", "Black",
off_Race0 == "Asian" & off_Hisp == "Non-Hispanic", "Asian",
off_Race0 %chin% c("Native American", "Hawaiian"), "Other",
default = NA_character_),
##version 2 + excluding multiple
`Offender Race v3` = data.table::fcase(
off_Hisp == "Hispanic", "Hispanic",
off_Race0_v2 == "White" & off_Hisp == "Non-Hispanic", "White",
off_Race0_v2 == "Black" & off_Hisp == "Non-Hispanic", "Black",
off_Race0_v2 == "Asian" & off_Hisp == "Non-Hispanic", "Asian",
off_Race0_v2 == "Multiple", "Multiple Races",
off_Race0_v2 %chin% c("Native American", "Hawaiian"), "Other",
default = NA_character_)
)]
}
merge_and_summarise_interracial <- function(dt_004, dt_003, keep_codes = 1:20, grouping_vars = c("Weight_type"),
filter_interracial = TRUE, filter_races = FALSE, keep_races = c("White", "Black")){
require(data.table)
weight_cols <- c("SERIES_WEIGHT", "Incident_Weight", "WGTPERCY", "UNWEIGHTED")
dt <- dt_004[Crime_code %in% keep_codes]
dt <- data.table::merge.data.table(x = dt, y = dt_003, by = c("IDPER", "IDHH", "YEARQ"), all.x = TRUE)
dt <- dt[!is.na(`Offender Race`) & !is.na(Race)]
dt[, UNWEIGHTED := 1]
keep_vars <- unique(c(weight_cols, grouping_vars, "Crime", "Crime_code", "Race", "Offender Race"))
keep_vars <- keep_vars[which(keep_vars != "Weight_type")]
dt <- dt[, ..keep_vars]
weight_cols <- weight_cols[which(weight_cols %in% names(dt))]
dt <- data.table::melt.data.table(data = dt, measure.vars = weight_cols, variable.name = "Weight_type", value.name = "Weight")
gVars <- unique(c("Weight_type", "Offender Race", "Race", grouping_vars))
gVars <- gVars[which(gVars %in% names(dt))]
dt <- dt[, .(`Victimizations` = sum(Weight, na.rm=TRUE)), by = gVars]
data.table::setnames(dt, old = "Race", new = "Victim Race", skip_absent = FALSE)
##new variable identifying pairs - each cell contains a vector of two
dt[, index := .I]
gVars <- unique(c("Weight_type", grouping_vars))
gVars <- gVars[which(gVars %in% names(dt))]
dt[, `:=` (`Total Victimizations` = sum(Victimizations)), by = gVars]
gVars <- unique(c("Weight_type", "Offender Race", grouping_vars))
gVars <- gVars[which(gVars %in% names(dt))]
dt[, `:=` (`Total Offender Race` = sum(Victimizations)), by = gVars]
gVars <- unique(c("Weight_type", "Victim Race", grouping_vars))
gVars <- gVars[which(gVars %in% names(dt))]
dt[, `:=` (`Total Victim Race` = sum(Victimizations)), by = gVars]
dt[, Interracial_flag := ifelse(`Victim Race` == `Offender Race`, "No", "Yes")]
gVars <- unique(c("Weight_type", grouping_vars))
gVars <- gVars[which(gVars %in% names(dt))]
##FILTERING TO ONLY INTERRACIAL CRIMES
if(isTRUE(filter_interracial)){
dt <- dt[`Victim Race` != `Offender Race`]
}
dt[, `:=` (`Total Interracial` = sum(Victimizations[which(`Victim Race` != `Offender Race`)])), by = gVars]
all_races <- unique(c(dt$`Offender Race`))
all_pairs_list <- combn(all_races, 2, simplify = FALSE)
for(i in seq_along(all_pairs_list)){
race_pairs <- all_pairs_list[[i]]
race1 <- race_pairs[1]; race2 <- race_pairs[2]
dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),
`Total Interracial Pairs` := sum(Victimizations), by = gVars]
dt[(`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1),
`Race Pair` := paste0(race1, "-", race2), by = index]
}
##FILTERING TO ONLY PAIRS
if(isTRUE(filter_interracial) & isTRUE(filter_races)){
if(length(unique(keep_races) == 2)){
race1 <-unique(keep_races)[1]; race2 <- unique(keep_races)[2]
dt <- dt[ (`Victim Race` == race1 & `Offender Race` == race2) | (`Victim Race` == race2 & `Offender Race` == race1)]
}
}
dt[, `:=` (
`Percent of Total` = Victimizations/`Total Victimizations`,
`Percent of Offender Race` = Victimizations/`Total Offender Race`,
`Percent of Victim Race` = Victimizations/`Total Victim Race`,
`Percent Interracial` = ifelse(Interracial_flag == "Yes", Victimizations/`Total Interracial`, NA_real_),
`Percent Interracial Pairs` = ifelse(Interracial_flag == "Yes", Victimizations/`Total Interracial Pairs`, NA_real_))]
##last filter, removing 'Other' - including only Victim races that also appear as Offender Races and vice-versa
u_races <- unique(dt$`Offender Race`)
dt <- dt[`Victim Race` %chin% u_races]
##for ordering purposes in graphs, reordering race categories
race_levels <- c("White", "Black", "Hispanic", "Asian", "Native American")
race_levels <- race_levels[which(race_levels %in% u_races)]
dt[, `:=`(
`Offender Race` = factor(`Offender Race`, levels = race_levels),
`Victim Race` = factor(`Victim Race`, levels = race_levels))]
gVars <- unique(c("Weight_type", grouping_vars, "Offender Race", "Victim Race"))
gVars <- gVars[which(gVars %in% names(dt))]
data.table::setorderv(dt, cols = gVars)
##reorganize columns
dt[, index := NULL]
dt[is.na(`Race Pair`) & Interracial_flag == "No", `Race Pair` := `Victim Race`]
##rearranging columns
keep_vars <- unique(c(gVars, "Race Pair", "Victimizations",
"Total Victimizations", "Percent of Total",
"Total Victim Race", "Percent of Victim Race",
"Total Offender Race", "Percent of Offender Race",
"Percent of Offender Race",
"Total Interracial", "Percent Interracial",
"Total Interracial Pairs", "Percent Interracial Pairs"))
dt <- dt[, ..keep_vars]
return(dt)
}
show_crime_summary_interracial <- function(df, grouping_vars = NULL, use_weight = c("series", "incident", "person", "unweighted")){
require(data.table)
require(reactable)
use_weight = match.arg(use_weight)
use_weight = switch(use_weight, "series" = "SERIES",
"incident" = "INCIDENT",
"person" = "WGTPERCY",
"unweighted" = "UNWEIGHTED")
dt <- data.table::as.data.table(df)
data.table::setnames(dt, skip_absent = TRUE,
old = c("Crime_type", "Weight_type"),
new = c("Offense", "Weight"))
dt <- dt[!is.na(`Offender Race`)]
dt[, Weight := dplyr::recode(Weight, "Incident_Weight" = "INCIDENT", "SERIES_WEIGHT" = "SERIES")]
if("Offense" %in% names(dt)){
dt[, Offense := factor(Offense, levels = c("Robbery", "Assault", "Rape/Sexual attack"))]
}
dt <- dt[Weight == use_weight]
dt[, Weight := NULL]
order_vars <- c( "Offender Race","Victim Race","Race Pair", "Offense", "Year" )
order_vars <- order_vars[which(order_vars %in% names(dt))]
data.table::setorderv(dt, cols = order_vars)
grouping_vars <- grouping_vars[which(grouping_vars %in% names(dt))]
if(length(grouping_vars) == 0){grouping_vars <- NULL}
col_names <- names(dt)
new_names <- col_names %>% gsub("Percent ", "% ", .) %>% gsub("Victimizations", "Victims", .) %>% gsub("Interracial", "Inter-racial", .)
names(dt) <- new_names
reactable::reactable(dt, striped = TRUE, highlight = TRUE, bordered = TRUE, pagination = FALSE, compact = TRUE, resizable = TRUE, filterable = TRUE,
groupBy = NULL,
defaultColDef = colDef(
#minWidth = 85,
maxWidth = 65,
align = "center",
style = list(fontSize = 11, fontFamily = "'Roboto Condensed', 'Arial Narrow', Merriweather, Arial, Helvetica, sans-serif"),
format = list(cell = colFormat(separators = TRUE, percent = FALSE, digits = 0)),
headerStyle = list(fontWeight = "bold", fontSize = 12),
footerStyle = list(fontWeight = "bold"))
,columns = list(
`Offender Race` = colDef(maxWidth = 75, minWidth = NULL, align = "left", aggregate = "unique"),
`Race Pair` = colDef(maxWidth = 90, minWidth = NULL, align = "left", aggregate = "unique"),
`Victim Race` = colDef(maxWidth = 65, minWidth = NULL, align = "left", aggregate = "unique"),
`% of Total` = colDef(maxWidth = 50, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
`Total Victim Race` = colDef(maxWidth = 60),
`Total Inter-racial Pairs` = colDef(maxWidth = 55),
`Total Inter-racial` = colDef(maxWidth = 60),
`% of Victim Race` = colDef(maxWidth = 60, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
`% of Offender Race` = colDef(maxWidth = 75, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
`% Inter-racial` = colDef(maxWidth = 50, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2)),
`% Inter-racial Pairs` = colDef(maxWidth = 55, align = "center", format = colFormat(separators = FALSE, percent = TRUE, digits = 2))
)
)
}Loading data…
##2015 Data
nc_15_003 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2015-36448/DS0003/36448-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_15_004 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2015-36448/DS0004/36448-0004-Data.tsv")) %>% process_ncvs_004_files(.)
##2016 Data
nc_16_003 <- data.table::fread(paste0(myDataFolder, "DATA/NCVS-ICPSR-2016-36828/DS0003/36828-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_16_004 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2016-36828/DS0004/36828-0004-Data.tsv")) %>% process_ncvs_004_files(.)
##2018 Data
nc_18_003 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2018-37297/DS0003/37297-0003-Data.tsv")) %>% process_ncvs_003_files(.)
nc_18_004 <- data.table::fread(paste0(myDataFolder,"DATA/NCVS-ICPSR-2018-37297/DS0004/37297-0004-Data.tsv")) %>% process_ncvs_004_files(.)d_gg_15 <- merge_and_summarise_interracial(nc_15_004, nc_15_003, keep_codes = 1:13, grouping_vars = c("Weight_type"))
show_crime_summary_interracial(d_gg_15, use_weight = "series")d_gg_16 <- merge_and_summarise_interracial(nc_16_004, nc_16_003, keep_codes = 1:13, grouping_vars = c("Weight_type"))
show_crime_summary_interracial(d_gg_16, use_weight = "series")d_gg_18 <- merge_and_summarise_interracial(nc_18_004, nc_18_003, keep_codes = 1:13, grouping_vars = c("Weight_type"))
show_crime_summary_interracial(d_gg_18, use_weight = "series")Victim Race Codes (Column V3023A found in Victim files XXXXX-0003-Data.tsv:
Offender Race/Ethnicity Codes (found in offender files XXXXX-0004-Data.tsv:
V4246A - SINGLE OFFENDER RACE MISSING FLAG - 1 = Identified Race, 9 = No race info
V4279 - MULTIPLE OFFENDER RACE MISSING FLAG -
V4283 - MULTIPLE OFFENDERS: DON’T KNOW
V4247 - SINGLE OFF ONLY CRIME AGAINST RESP OR HH
V4246C - SINGLE OFFENDER RACE: BLACK OR AFRICAN AMERICAN
V4281 - MULT OFF RACE: BLACK
V4246B - SINGLE OFFENDER RACE: WHITE
V4280 - MULT OFF RACE: WHITE
V4237A - SINGLE OFFENDER HISPANIC/LATINO
V4252A - MULTIPLE OFFENDERS HISPANIC/LATINO
V4246D - SINGLE OFFENDER RACE: AMERICAN INDIAN OR ALASKA NATIVE
V4282A - MULT OFF RACE: AMERICAN INDIAN OR ALASKA NATIVE
V4246E - SINGLE OFFENDER RACE: ASIAN
V4282B - MULT OFF RACE: ASIAN
V4246F - SINGLE OFFENDER RACE: NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
V4282C - MULT OFF RACE: NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER
V4285A - MULTIPLE OFFENDER RACE OF MOST: 1 = “mostly white”; 2 = “mostly black..”, 3 = "mostly Nat Am, 4 = mostly Asian; 5 = Mostly Native Hawaiian or Pacific Islander
V4252B - MULTIPLE OFFENDERS HISPANIC/NON-HISPANIC: 1 = “Mostly Hispanic or Latino”; 2 = “Mostly non-Hispanic”; 3 = “Equal number of Hispanic and non-Hispanic”; 4 = “Don’t know”
“INCIDENTS VS. VICTIMIZATIONS: For personal crimes, the NCVS makes a basic distinction between”incidents" and “victimizations.” The number of victimizations reflects how many criminal acts were experienced by NCVS survey respondents; the number of incidents reflects the number of criminal acts committed against NCVS respondents and others present during such incidents, as reported by survey respondents. If there was more than one victim, the incident weight is adjusted to compensate for the possibility that the incident could be reported several times by multiple victims and thus be over-counted."
“To calculate a victimization count to match recent BJS reports, users should use variable ‘SERIES_WEIGHT’”
SERIES_WEIGHT = WGTVICCY, where IF V4019 = 2 THEN WGTVICCY = WGTVICCY * v4016, ELSE WGTVICCY = WGTVICCY.V4529TYPES OF CRIME, LISTED IN ORDER OF SERIOUSNESS (The Seriousness Hierarchy) p. 579 2018
ALL VIOLENT CRIMES - 1-20
MAJOR VIOLENT CRIMES - 1,2,3,5,6,7,8,9,10,11,12,13
Completed rape
Attempted rape
Sexual attack with serious assault
Sexual attack with minor assault
Completed robbery with injury from serious assault
Completed robbery with injury from minor assault
Completed robbery without injury from minor assault
Attempted robbery with injury from serious assault
Attempted robbery with injury from minor assault
Attempted robbery without injury
Completed aggravated assault with injury
Attempted aggravated assault with weapon
Threatened assault with weapon
Simple assault completed with injury
Sexual assault without injury
Unwanted sexual contact without force
Assault without weapon without injury
Verbal threat of rape
Verbal threat of sexual assault
Verbal threat of assault
“To be consistent with BJS publications for all years, exclude crimes committed outside the U.S.” (2018 - p. 595)
“To be consistent with BJS publications from 2012 on … include series crimes…. Include up to 10 occurrences in a series crime”
Person Victimization Counts consistent with BJS publications 2012+
V4019 - CHECK ITEM D: ENOUGH DETAIL TO DISTINGUISH INCIDENTS
Rule: If V4019 = 2, THEN WGTVICCY = WGTVICCY * v4016.
SERIES_WEIGHT and SERIES_IWEIGHT already take this into account.
V4015 - YEAR INCIDENT OCCURRED (for 2018 file, either 2017 or 2018)
YEARQ - YEAR AND QUARTER OF INTERVIEW.
V4012 - Incident number. (Accounts for most duplicated IDs in _004 files)
Homicide data containing race/ethnicity data for both victims and offenders can be found in the FBI’s Expanded Homicide Data Tables. The code below downloads and processes the .csv files available at the following URL’s:
###HOMICIDES 2018
##Expanded Homicide Data Table 6
##Race, Sex, and Ethnicity of Victim by Race, Sex, and Ethnicity of Offender, 2018
##function to format homicide tables
format_homicide_tables <- function(url){
require(htmltab)
year <- as.numeric(substr(url, start = 38, stop = 41))
d_url <- htmltab(doc = url, which = "//*[@id='tablecontainer']")
d <- d_url[c(1:4,12),c(1:6,10)]
new_names <- dplyr::case_when(
grepl("Black", names(d), ignore.case = TRUE) ~ "Black",
grepl("White", names(d), ignore.case = TRUE) ~ "White/Hisp",
grepl("Other", names(d), ignore.case = TRUE) ~ "Other",
grepl("Unknown", names(d), ignore.case = TRUE) ~ "Unknown",
grepl("Hispanic|Latino", names(d), ignore.case = TRUE) ~ "Hispanic",
grepl("victim", names(d), ignore.case = TRUE) ~ "Victim Race",
TRUE ~ trimws(names(d)))
names(d) <- new_names
d <- d %>%
##remove commas and convert to numeric
dplyr::mutate_at(2:7, list(~ as.numeric(gsub(",", "", .)))) %>%
dplyr::mutate(
White = `White/Hisp` - Hispanic,
`Victim Race` = dplyr::case_when(
grepl("Black", `Victim Race`, ignore.case = TRUE) ~ "Black",
grepl("White", `Victim Race`, ignore.case = TRUE) ~ "White/Hisp",
grepl("Other", `Victim Race`, ignore.case = TRUE) ~ "Other",
grepl("Unknown", `Victim Race`, ignore.case = TRUE) ~ "Unknown",
grepl("Hispanic|Latino", `Victim Race`, ignore.case = TRUE) ~ "Hispanic",
TRUE ~ trimws(`Victim Race`))) %>%
tidyr::pivot_longer(-c(`Victim Race`, Total), names_to = "Offender Race", values_to = "Homicides")
##to isolate the White from the White/Hispanic for Victims...
d_wh <- d %>% dplyr::filter(`Offender Race` != "White/Hisp") %>%
dplyr::select(-Total) %>%
tidyr::pivot_wider(names_from = `Victim Race`,
values_from = Homicides) %>%
dplyr::mutate(White = `White/Hisp` - Hispanic) %>%
tidyr::pivot_longer(-`Offender Race`,
names_to = "Victim Race",
values_to = "Homicides") %>%
dplyr::filter(`Victim Race` == "White") %>%
dplyr::group_by(`Victim Race`) %>%
dplyr::mutate(Total = sum(Homicides))
dplyr::bind_rows(d, d_wh) %>%
dplyr::mutate(Year = year)
}
url_2015 <- "https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/expanded_homicide_data_table_6_murder_race_and_sex_of_vicitm_by_race_and_sex_of_offender_2015.xls"
url_2016 <- "https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/tables/expanded-homicide-data-table-3.xls"
url_2017 <- "https://ucr.fbi.gov/crime-in-the-u.s/2017/crime-in-the-u.s.-2017/tables/expanded-homicide-data-table-6.xls"
url_2018 <- "https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-6.xls"
url_2019 <- "https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/expanded-homicide-data-table-6.xls"
homicides <- purrr::map_dfr(c(url_2015, url_2016, url_2017, url_2018, url_2019),
format_homicide_tables)d_h <- homicides %>%
##collapse other and unknown together
dplyr::mutate(`Victim Race` = dplyr::recode(`Victim Race`,
"Unknown" = "Other"),
`Offender Race` = dplyr::recode(`Offender Race`,
"Unknown" = "Other")) %>%
dplyr::group_by(`Victim Race`, `Offender Race`) %>%
dplyr::summarise(Homicides = sum(Homicides)) %>%
dplyr::filter(`Victim Race` %in% c("White", "Black", "Hispanic", "Other"),
`Offender Race` %in% c("White", "Black", "Hispanic", "Other")) %>%
dplyr::group_by(`Victim Race`) %>%
dplyr::mutate(`Total Homicides` = sum(Homicides)) %>%
dplyr::ungroup() %>%
dplyr::mutate(
Percent = Homicides/`Total Homicides`,
Label = paste0(signs::signs(Homicides,
format = scales::comma,
accuracy = 1),
" (",
signs::signs(Percent, format = scales::percent), ")")) %>%
dplyr::select(`Victim Race`, `Offender Race`, Label, `Total Homicides`) %>%
tidyr::pivot_wider(names_from = `Offender Race`,
values_from = Label) %>%
dplyr::ungroup() %>%
dplyr::mutate(`Victim Race` = factor(`Victim Race`,
levels = c("White", "Black",
"Hispanic", "Other"))) %>%
dplyr::arrange(`Victim Race`) %>%
dplyr::select(`Victim Race`, White, Black, Hispanic, Other, `Total Homicides`) %>%
dplyr::mutate(`Total Homicides` = paste0(signs::signs(`Total Homicides`,
format = scales::comma, accuracy = 1)))
gt(d_h) %>%
tab_header(
title = "Homicides by Race/Ethnicity of Victims and Offenders (2015-2019)"
#subtitle = ""
) %>%
tab_style(
style = cell_text(size = "small", align = "center"),
locations = cells_body()
) %>%
tab_options(
footnotes.font.size = 11
) %>%
tab_spanner(
label = "Offender Race",
columns = vars(White, Black, Hispanic, Other)) %>%
tab_footnote(
footnote = "Non-Hispanic white victim and offender counts estimated by subtracting Hispanic ethnicity counts from original White counts including both Hispanic and Non-Hispanic Whites.", locations = cells_column_labels(vars(White))) %>%
tab_footnote(
footnote = "Total homicides summed across years 2015-2019. These tables are only include incidents where some information about the offender is known by law enforcement. The actual victim counts by race are roughly double what is shown here.",
locations = cells_column_labels(vars(`Total Homicides`))) %>%
tab_source_note(source_note = md("Source: FBI's Expanded Homicide Data Tables. [2015 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/expanded_homicide_data_table_6_murder_race_and_sex_of_vicitm_by_race_and_sex_of_offender_2015.xls), [2016 Table-3](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/tables/expanded-homicide-data-table-3.xls), [2017 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2017/crime-in-the-u.s.-2017/tables/expanded-homicide-data-table-6.xls), [2018 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/expanded-homicide-data-table-6.xls), [2019 Table-6](https://ucr.fbi.gov/crime-in-the-u.s/2019/crime-in-the-u.s.-2019/tables/expanded-homicide-data-table-6.xls)")) | Homicides by Race/Ethnicity of Victims and Offenders (2015-2019) | |||||
|---|---|---|---|---|---|
| Victim Race | Offender Race | Total Homicides2 | |||
| White1 | Black | Hispanic | Other | ||
| White | 9,277 (75.197%) | 1,963 (15.911%) | 692 (5.609%) | 405 (3.283%) | 12,337 |
| Black | 740 (5.162%) | 12,751 (88.950%) | 476 (3.321%) | 368 (2.567%) | 14,335 |
| Hispanic | 562 (12.461%) | 726 (16.098%) | 3,029 (67.162%) | 193 (4.279%) | 4,510 |
| Other | 378 (22.487%) | 308 (18.322%) | 150 (8.923%) | 845 (50.268%) | 1,681 |
| Source: FBI's Expanded Homicide Data Tables. 2015 Table-6, 2016 Table-3, 2017 Table-6, 2018 Table-6, 2019 Table-6 | |||||
|
1
Non-Hispanic white victim and offender counts estimated by subtracting Hispanic ethnicity counts from original White counts including both Hispanic and Non-Hispanic Whites.
2
Total homicides summed across years 2015-2019. These tables are only include incidents where some information about the offender is known by law enforcement. The actual victim counts by race are roughly double what is shown here.
|
|||||