Load in data, review regional distribution, filter for NYC only.
note: No good data for this in census
## [1] No Yes Skipped Question
## [4] Prefer not to answer
## 4 Levels: No Prefer not to answer ... Yes
## V11_clean n percent
## 1 Yes 1719 16.997923
## 2 No 7704 76.179175
## 3 Prefer not to answer 470 4.647483
## 4 Skipped 220 2.175418
## 5 Other 0 0.000000
Taking five year 2025 ACS data for NYC
## Getting data from the 2019-2023 5-year ACS
## Getting data from the 2019-2023 5-year ACS
## Warning: Removed 3 rows containing missing values or values outside the scale range
## (`geom_col()`).
Combined ACS “Asian” and “Native Hawaiian/Pacific Islander” to match survey
Kept “Arab or Middle Eastern” separate (not in ACS standard categories)
Included “Multiracial” from ACS (not in your survey)
ACS treats Hispanic origin separately from race
Your survey combines them (respondents can select both)
The “Other” category in ACS may include some Middle Eastern respondents
ACS doesn’t have “Prefer not to answer” category
## Getting data from the 2019-2023 5-year ACS
2016–2020 ACS 5-year
#filter for hpd sampling
nyc_data <- data %>%
filter(PUMA %in% nyc_puma_codes, GQ==1) %>% # GQ==1 means household, not group quarters
mutate(
is_renter = OWNERSHP == 2,
household_size = PERNUM,
ami_100_limit = case_when(
household_size == 1 ~ 82400,
household_size == 2 ~ 94250,
household_size == 3 ~ 106050,
household_size == 4 ~ 117900,
household_size == 5 ~ 127400,
TRUE ~ 127400 + (household_size - 5) * 8000
),
under_100_ami = HHINCOME < ami_100_limit
) %>%
filter(is_renter, under_100_ami) %>%
filter(household_size < 9)
# weighted totals
nyc_data %>%
group_by(OCC) %>%
summarise(weighted_total = sum(PERWT, na.rm = TRUE)) %>%
arrange(desc(weighted_total))
## # A tibble: 507 × 2
## OCC weighted_total
## <int+lbl> <dbl>
## 1 0 [] 498433
## 2 4720 23264
## 3 4220 17652
## 4 9130 15201
## 5 6260 15101
## 6 4760 15084
## 7 5240 14353
## 8 4110 13564
## 9 4230 12871
## 10 4020 12330
## # ℹ 497 more rows
# HPD Artist definition
# Create a named vector of occupation codes and labels
artist_occ_labels <- c(
"2600" = "ENT-Artists And Related Workers",
"2640" = "ENT-Other Designers",
"2700" = "ENT-Actors",
"2740" = "ENT-Dancers And Choreographers",
"2752" = "ENT-Musicians and Singers",
"2770" = "ENT-Entertainers And Performers, Sports and Related Workers, All Other",
"2850" = "ENT-Writers And Authors",
"2910" = "ENT-Photographers"
)
# Filter existing NYC PUMS data
nyc_artists <- nyc_data %>%
filter(OCC %in% as.numeric(names(artist_occ_labels))) %>%
mutate(artist_label = artist_occ_labels[as.character(OCC)])
Race/Ethnicity: NYC Artists vs All NYC Residents | ||||
Race/Ethnicity | Count (All NYC) | % (All NYC) | Count (Narrow Artists) | % (Narrow Artists) |
---|---|---|---|---|
American Indian or Alaska Native | 2,395 | 6.4 | 18 | 9.1 |
Asian (Non-Hispanic) | 4,383 | 11.6 | 25 | 12.6 |
Black or African American (Non-Hispanic) | 6,671 | 17.7 | 17 | 8.6 |
Hispanic or Latino | 9,111 | 24.2 | 25 | 12.6 |
Some Other Race | 1,124 | 3.0 | 6 | 3.0 |
Two or More Races | 121 | 0.3 | 2 | 1.0 |
White (Non-Hispanic) | 13,859 | 36.8 | 105 | 53.0 |
# Weighted race breakdown for all NYC residents
race_all_wt <- nyc_data %>%
group_by(race_group) %>%
summarise(count_all = sum(PERWT, na.rm = TRUE)) %>%
mutate(percent_all = count_all / sum(count_all))
# Weighted race breakdown for NYC artists
race_artists_wt <- nyc_artists %>%
group_by(race_group) %>%
summarise(count_artists = sum(PERWT, na.rm = TRUE)) %>%
mutate(percent_artists = count_artists / sum(count_artists))
# Join weighted data
race_compare_wt <- full_join(race_all_wt, race_artists_wt, by = "race_group") %>%
mutate(
percent_all = percent_all * 100,
percent_artists = percent_artists * 100,
) %>%
select(race_group, count_all, percent_all,
count_artists, percent_artists)
race_compare_wt %>%
gt() %>%
cols_label(
race_group = "Race/Ethnicity",
count_all = "Weighted Count (All NYC)",
percent_all = "% (All NYC)",
count_artists = "Weighted Count (Artists)",
percent_artists = "% (Artists)"
) %>%
fmt_number(
columns = c(count_all, count_artists),
decimals = 0,
use_seps = TRUE
) %>%
fmt_number(
columns = c(percent_all, percent_artists),
decimals = 1
) %>%
tab_header(
title = "Weighted Race/Ethnicity: NYC Artists vs All NYC Residents"
)
Weighted Race/Ethnicity: NYC Artists vs All NYC Residents | ||||
Race/Ethnicity | Weighted Count (All NYC) | % (All NYC) | Weighted Count (Artists) | % (Artists) |
---|---|---|---|---|
American Indian or Alaska Native | 69,605 | 6.4 | 509 | 10.6 |
Asian (Non-Hispanic) | 108,879 | 10.0 | 580 | 12.0 |
Black or African American (Non-Hispanic) | 217,290 | 20.0 | 515 | 10.7 |
Hispanic or Latino | 283,698 | 26.2 | 705 | 14.6 |
Some Other Race | 33,181 | 3.1 | 98 | 2.0 |
Two or More Races | 2,799 | 0.3 | 25 | 0.5 |
White (Non-Hispanic) | 368,501 | 34.0 | 2,390 | 49.6 |