Analyzing and visualizing data on race at the census-tract level in New York City.
library(tidyverse)
library(tidycensus)
library(sf)
library(scales)
library(RColorBrewer)
library(plotly)
library(knitr)
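Borough boundaries are represented with a GeoJSON file that was downloaded from NYC Open Data.
The data for this analysis was obtained from the U.S. Census Bureau’s 2020 Decennial Census, accessed with the tidycensus R package. Census data includes tract-level counts of the total population and of residents identifying as Hispanic or Latino, White alone, Black alone, and Asian alone (the last three not Hispanic or Latino).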
dp_2020 <- load_variables(2020, "pl", cache = TRUE)
boros <- st_read("~/Desktop/methods1/part2/data/raw/geo/Borough_Boundaries.geojson")
raw_race <- get_decennial(geography = "tract",
                          variables = c(total_pop = "P2_001N",
                                        hisp_lat = "P2_002N",
                                        white_alone = "P2_005N",
                                        black_alone = "P2_006N",
                                        asian_alone = "P2_008N"),
                          state = "New York",
                          geometry = TRUE,
                          year = 2020,
                          output = "wide")
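# Compute each group's share of the tract population, then keep only NYC tracts
ny_race <- raw_race |>
  mutate(pct_hisp_lat = hisp_lat / total_pop * 100,
         pct_white_alone = white_alone / total_pop * 100,
         pct_black_alone = black_alone / total_pop * 100,
         pct_asian_alone = asian_alone / total_pop * 100) |>
  # Drop unpopulated tracts, whose shares are undefined (0 / 0)
  filter(total_pop > 0) |>
  # NAME looks like "Census Tract 1, Kings County, New York"
  separate(NAME, into = c("tract", "county", "state"), sep = ", ") |>
  filter(county %in% c("Kings County", "Bronx County", "New York County",
                       "Queens County", "Richmond County"))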
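To make the cleaning steps concrete, here is a minimal sketch that applies the same logic to a small made-up table instead of the census download. The tract names and counts in `toy` are invented for illustration only, and the sketch assumes `NAME` follows the "tract, county, state" pattern.

```r
library(dplyr)
library(tidyr)

# Made-up rows standing in for the get_decennial() output (illustrative values only)
toy <- tibble(
  NAME = c("Census Tract 1, Kings County, New York",
           "Census Tract 2, Bronx County, New York",
           "Census Tract 3, Albany County, New York",
           "Census Tract 4, Queens County, New York"),
  total_pop = c(200, 100, 50, 0),
  hisp_lat  = c(50, 60, 5, 0)
)

nyc_counties <- c("Kings County", "Bronx County", "New York County",
                  "Queens County", "Richmond County")

toy_nyc <- toy |>
  # 0 / 0 yields NaN for the unpopulated tract
  mutate(pct_hisp_lat = hisp_lat / total_pop * 100) |>
  # Unpopulated tracts are dropped so the NaN share never reaches a map
  filter(total_pop > 0) |>
  # Split NAME into its three comma-separated parts
  separate(NAME, into = c("tract", "county", "state"), sep = ", ") |>
  # Keep only the five counties that make up New York City
  filter(county %in% nyc_counties)

toy_nyc
```

In this toy table the Albany County tract and the unpopulated Queens tract are both removed, leaving one Kings County tract and one Bronx County tract.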
ggplot(ny_race, aes(fill = pct_hisp_lat)) +
  geom_sf() +
  theme_void() +
  scale_fill_gradient(name = "Percent Hispanic and Latino",
                      low = "darkseagreen1",
                      high = "darkgreen",
                      labels = percent_format(scale = 1)) +
  labs(title = "Percent Hispanic and Latino in NYC by Census Tract",
       subtitle = "Source: US Census Bureau 2020") +
  geom_sf(data = boros,
          color = "black", fill = NA, lwd = .3)
This map shows the percentage of the population identifying as Hispanic and Latino, which appears to be concentrated in the westernmost parts of NYC, particularly in the Bronx and Queens, with one area of high concentration in Kings County.
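The other maps in this analysis repeat the same plot with a different column and palette. As a sketch of how that repetition could be factored out, the hypothetical helper `map_share()` below wraps the shared layers. It is not part of the original analysis, and the two unit squares used to exercise it are made-up geometry, not census tracts.

```r
library(ggplot2)
library(sf)
library(scales)

# Hypothetical helper: one choropleth per share column
map_share <- function(tracts, column, legend, low, high, title, outline = NULL) {
  p <- ggplot(tracts, aes(fill = .data[[column]])) +
    geom_sf() +
    theme_void() +
    scale_fill_gradient(name = legend, low = low, high = high,
                        labels = percent_format(scale = 1)) +
    labs(title = title, subtitle = "Source: US Census Bureau 2020")
  # Overlay boundary outlines only when they are supplied
  if (!is.null(outline)) {
    p <- p + geom_sf(data = outline, color = "black", fill = NA, lwd = 0.3)
  }
  p
}

# Two made-up unit squares stand in for census tracts
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0))))
}
toy_tracts <- st_sf(pct_demo = c(10, 80),
                    geometry = st_sfc(square(0), square(1)))

p <- map_share(toy_tracts, "pct_demo", legend = "Percent (demo)",
               low = "thistle1", high = "purple4", title = "Toy example")
```

With the real data, a call such as `map_share(ny_race, "pct_black_alone", "Percent Black, not Hispanic or Latino", "thistle1", "purple4", "Percent Black not Hispanic or Latino in NYC by Census Tract", outline = boros)` would be expected to reproduce the corresponding map.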
ggplot(ny_race, aes(fill = pct_black_alone)) +
  geom_sf() +
  theme_void() +
  scale_fill_gradient(name = "Percent Black, not Hispanic or Latino",
                      low = "thistle1",
                      high = "purple4",
                      labels = percent_format(scale = 1)) +
  labs(title = "Percent Black not Hispanic or Latino in NYC by Census Tract",
       subtitle = "Source: US Census Bureau 2020") +
  geom_sf(data = boros,
          color = "black", fill = NA, lwd = .3)
This map shows the percentage of the population identifying as Black, not Hispanic or Latino, which appears to be concentrated in three distinct areas: throughout the Bronx, in eastern Queens, and in eastern Kings County.
ggplot(ny_race, aes(fill = pct_white_alone)) +
  geom_sf() +
  theme_void() +
  scale_fill_gradient(name = "Percent White, not Hispanic or Latino",
                      low = "lightskyblue1",
                      high = "royalblue4",
                      labels = percent_format(scale = 1)) +
  labs(title = "Percent White not Hispanic or Latino in NYC by Census Tract",
       subtitle = "Source: US Census Bureau 2020") +
  geom_sf(data = boros,
          color = "black", fill = NA, lwd = .3)
This map shows the percentage of the population identifying as White, not Hispanic or Latino. While this population is dispersed more widely across the city relative to the other populations examined in this analysis, slightly higher concentrations can be observed in Richmond County, southern Kings County, throughout Manhattan, and southeastern Queens.
ggplot(ny_race, aes(fill = pct_asian_alone)) +
  geom_sf() +
  theme_void() +
  scale_fill_gradient(name = "Percent Asian, not Hispanic or Latino",
                      low = "wheat",
                      high = "orangered4",
                      labels = percent_format(scale = 1)) +
  labs(title = "Percent Asian not Hispanic or Latino in NYC by Census Tract",
       subtitle = "Source: US Census Bureau 2020") +
  geom_sf(data = boros,
          color = "black", fill = NA, lwd = .3)
This map shows the percentage of the population identifying as Asian, not Hispanic or Latino, which appears to be concentrated in southern Manhattan, southern Kings County, and throughout northern Queens.
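Visual impressions from choropleth maps can be checked against county-level totals. The sketch below assumes a table shaped like `ny_race` with the geometry dropped, and sums tract counts within each county before computing the share, since averaging tract percentages would weight small and large tracts equally. The rows in `toy_tracts` are made up for illustration and are not census figures.

```r
library(dplyr)

# Made-up tract counts (illustrative values only)
toy_tracts <- tibble(
  county    = c("Bronx County", "Bronx County", "Queens County", "Queens County"),
  total_pop = c(1000, 3000, 2000, 2000),
  hisp_lat  = c(800, 1200, 500, 700)
)

county_share <- toy_tracts |>
  group_by(county) |>
  # Sum counts first so each resident, not each tract, carries equal weight
  summarise(total_pop = sum(total_pop),
            hisp_lat = sum(hisp_lat),
            .groups = "drop") |>
  mutate(pct_hisp_lat = hisp_lat / total_pop * 100) |>
  arrange(desc(pct_hisp_lat))

county_share
```

With the real data, the same pipeline could start from `sf::st_drop_geometry(ny_race)` so that the summary does not also merge tract polygons.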