Setup

Load in data, review regional distribution, filter for NYC only.

Demographics

Age

Race

Income

Gender Identity/Sexuality

note: No good data for this in census

Discipline

rank 1 (V1_1)

Income Sources Analysis (Multiple Response)

Additional Income Sources Analysis (V11)

## [1] No                   Yes                  Skipped Question    
## [4] Prefer not to answer
## 4 Levels: No                   Prefer not to answer ... Yes

##              V11_clean    n   percent
## 1                  Yes 1719 16.997923
## 2                   No 7704 76.179175
## 3 Prefer not to answer  470  4.647483
## 4              Skipped  220  2.175418
## 5                Other    0  0.000000

Comparing to ACS data

Taking five year 2025 ACS data for NYC

Age

Parse

## Getting data from the 2019-2023 5-year ACS

combine with survey data

Race

Harmonize race categories

## Getting data from the 2019-2023 5-year ACS

compare %

viz

## Warning: Removed 3 rows containing missing values or values outside the scale range
## (`geom_col()`).

Notes:

Category Alignment:

Combined ACS “Asian” and “Native Hawaiian/Pacific Islander” to match survey

Kept “Arab or Middle Eastern” separate (not in ACS standard categories)

Included “Multiracial” from ACS (not in your survey)

Hispanic Handling:

ACS treats Hispanic origin separately from race

Your survey combines them (respondents can select both)

Special Notes:

The “Other” category in ACS may include some Middle Eastern respondents

ACS doesn’t have “Prefer not to answer” category

Income

## Getting data from the 2019-2023 5-year ACS

IPUMS

extract

2016–2020 ACS 5-year

clean

filter

#filter for hpd sampling 
nyc_data <- data %>%
  filter(PUMA %in% nyc_puma_codes, GQ==1) %>%  # GQ==1 means household, not group quarters
  mutate(
    is_renter = OWNERSHP == 2,
    household_size = PERNUM,
    ami_100_limit = case_when(
      household_size == 1 ~ 82400,
      household_size == 2 ~ 94250,
      household_size == 3 ~ 106050,
      household_size == 4 ~ 117900,
      household_size == 5 ~ 127400,
      TRUE ~ 127400 + (household_size - 5) * 8000
    ),
    under_100_ami = HHINCOME < ami_100_limit
  ) %>%
  filter(is_renter, under_100_ami) %>% 
  filter(household_size < 9)

# weighted totals
nyc_data %>%
  group_by(OCC) %>%
  summarise(weighted_total = sum(PERWT, na.rm = TRUE)) %>%
  arrange(desc(weighted_total))

## # A tibble: 507 × 2
##    OCC       weighted_total
##    <int+lbl>          <dbl>
##  1    0 []           498433
##  2 4720               23264
##  3 4220               17652
##  4 9130               15201
##  5 6260               15101
##  6 4760               15084
##  7 5240               14353
##  8 4110               13564
##  9 4230               12871
## 10 4020               12330
## # ℹ 497 more rows

# HPD Artist definition

# Create a named vector of occupation codes and labels
artist_occ_labels <- c(
  "2600" = "ENT-Artists And Related Workers",
  "2640" = "ENT-Other Designers",
  "2700" = "ENT-Actors",
  "2740" = "ENT-Dancers And Choreographers",
  "2752" = "ENT-Musicians and Singers",
  "2770" = "ENT-Entertainers And Performers, Sports and Related Workers, All Other",
  "2850" = "ENT-Writers And Authors",
  "2910" = "ENT-Photographers"
)

# Filter existing NYC PUMS data
nyc_artists <- nyc_data %>%
  filter(OCC %in% as.numeric(names(artist_occ_labels))) %>%
  mutate(artist_label = artist_occ_labels[as.character(OCC)])

PUMS breakdown

race breakdown

race viz

unweighted

Race/Ethnicity	Count (All NYC)	% (All NYC)	Count (Narrow Artists)	% (Narrow Artists)
Race/Ethnicity: NYC Artists vs All NYC Residents
American Indian or Alaska Native	2,395	6.4	18	9.1
Asian (Non-Hispanic)	4,383	11.6	25	12.6
Black or African American (Non-Hispanic)	6,671	17.7	17	8.6
Hispanic or Latino	9,111	24.2	25	12.6
Some Other Race	1,124	3.0	6	3.0
Two or More Races	121	0.3	2	1.0
White (Non-Hispanic)	13,859	36.8	105	53.0

weighted

# Weighted race breakdown for all NYC residents
race_all_wt <- nyc_data %>%
  group_by(race_group) %>%
  summarise(count_all = sum(PERWT, na.rm = TRUE)) %>%
  mutate(percent_all = count_all / sum(count_all))

# Weighted race breakdown for NYC artists
race_artists_wt <- nyc_artists %>%
  group_by(race_group) %>%
  summarise(count_artists = sum(PERWT, na.rm = TRUE)) %>%
  mutate(percent_artists = count_artists / sum(count_artists))

# Join weighted data
race_compare_wt <- full_join(race_all_wt, race_artists_wt, by = "race_group") %>%
  mutate(
    percent_all = percent_all * 100,
    percent_artists = percent_artists * 100,
  ) %>%
  select(race_group, count_all, percent_all, 
         count_artists, percent_artists)

race_compare_wt %>%
  gt() %>%
  cols_label(
    race_group = "Race/Ethnicity",
    count_all = "Weighted Count (All NYC)",
    percent_all = "% (All NYC)",
    count_artists = "Weighted Count (Artists)",
    percent_artists = "% (Artists)"
  ) %>%
  fmt_number(
    columns = c(count_all, count_artists),
    decimals = 0,
    use_seps = TRUE
  ) %>%
  fmt_number(
    columns = c(percent_all, percent_artists),
    decimals = 1
  ) %>%
  tab_header(
    title = "Weighted Race/Ethnicity: NYC Artists vs All NYC Residents"
  )

Race/Ethnicity	Weighted Count (All NYC)	% (All NYC)	Weighted Count (Artists)	% (Artists)
Weighted Race/Ethnicity: NYC Artists vs All NYC Residents
American Indian or Alaska Native	69,605	6.4	509	10.6
Asian (Non-Hispanic)	108,879	10.0	580	12.0
Black or African American (Non-Hispanic)	217,290	20.0	515	10.7
Hispanic or Latino	283,698	26.2	705	14.6
Some Other Race	33,181	3.1	98	2.0
Two or More Races	2,799	0.3	25	0.5
White (Non-Hispanic)	368,501	34.0	2,390	49.6

NYS Artists Survey

2025-03-19

Setup

Demographics

Age

Race

Income

Gender Identity/Sexuality

Discipline

rank 1 (V1_1)

Income Sources Analysis (Multiple Response)

Additional Income Sources Analysis (V11)

Comparing to ACS data

Age

Parse

combine with survey data

Race

Harmonize race categories

compare %

viz

Notes:

Category Alignment:

Hispanic Handling:

Special Notes:

Income

IPUMS

extract

clean

filter

PUMS breakdown

race breakdown

race viz

unweighted

weighted

Cross Variable Analysis (Survey)

race / income

reshape

viz

by income quartile