I downloaded some basic demographics tables for block groups from Social Explorer. The first data table has only voting-age citizens by race and ethnicity. The second has full population numbers for race and ethnicity, income in ranges (to calculate approximate medians with cumulative numbers), and education (collapsed).

Import citizenship information

The Census Bureau does a special voting tabulation that includes the estimated number of voting-age citizens (approximately the same as eligible voters, except that it doesn’t subtract formerly incarcerated people who have not had their voting rights restored).

This is the ACS 2018 five-year file, which means it’s an average of 2014-2018. Here’s the layout:

Variables

      FIPS:           FIPS
      NAME:           Name of Area
      QNAME:          Qualifying Name
      NATION:         Nation
      STATE:          State
      COUNTY:         County
      CS:             County Subdivision
      CT:             Census Tract
      BG:             Block Group
      PLACE:          Place
      CD:             Congressional District
      SLDC:           State Legislative District
      T004_001:       US Citizens 18 Years and Over
      T004_002:       US Citizens 18 Years and Over: Not Hispanic or Latino
      T004_003:       US Citizens 18 Years and Over: Not Hispanic or Latino: White Alone
      T004_004:       US Citizens 18 Years and Over: Not Hispanic or Latino: Black or African American Alone
      T004_005:       US Citizens 18 Years and Over: Not Hispanic or Latino: American Indian or Alaska Native Alone
      T004_006:       US Citizens 18 Years and Over: Not Hispanic or Latino: Asian Alone
      T004_007:       US Citizens 18 Years and Over: Not Hispanic or Latino: Native Hawaiian or Other Pacific Islander Alone
      T004_008:       US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**
      T004_009:       US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: Black or African American and White
      T004_010:       US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: American Indian or Alaska Native and White
      T004_011:       US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: American Indian or Alaska Native and Black or African American
      T004_012:       US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: Asian and White
      T004_013:       US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: Remainder of Two or More Race Responses
      T004_014:       US Citizens 18 Years and Over: Hispanic or Latino
      

Note that these don’t add up: each cell is rounded to a multiple of 5, so the sum of the cells can exceed the reported total. In the main census data, this information is suppressed in most block groups because the cells are too small. For that reason, just pick up the total block group citizenship and don’t worry about race / ethnicity. (The nativity / foreign-born question on the main census is suppressed most of the time, too.)
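
To see why rounded cells can sum past the total, here is a minimal sketch with made-up counts. It assumes plain rounding to the nearest multiple of 5, which may be simpler than the Census Bureau's actual disclosure rules; the helper `round_to_5` and the counts are hypothetical.

```r
# Hypothetical helper: round a count to the nearest multiple of 5.
round_to_5 <- function(x) 5 * round(x / 5)

# Made-up true counts for four race / ethnicity cells in one block group.
true_cells <- c(13, 13, 8, 3)

rounded_cells <- round_to_5(true_cells)       # each cell rounded on its own
rounded_total <- round_to_5(sum(true_cells))  # the total rounded on its own

sum(rounded_cells)  # sum of the published cells
rounded_total       # published total
```

With these counts the rounded cells sum to 45 while the rounded total is 35, which is the kind of mismatch the table shows.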

library(tidyverse)  # read_tsv(), select(), and %>% are used throughout

orig_blockgroup_citizen <- read_tsv("us_citizen_block_grp.txt") %>%
  select(fips = Geo_FIPS, voting_age_citizens = SE_T004_001)

Those are rounded, but they’ll be OK for the proportions.
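
As a rough check on that claim, the sketch below bounds how far a share can move because of rounding. The numbers are illustrative, and it again assumes plain nearest-5 rounding, so the unrounded count is within 2 of the reported one.

```r
# Illustrative values: a reported (rounded) citizen count and a population total.
reported_citizens <- 900
total_pop <- 1169

# Under nearest-5 rounding, the unrounded count lies within 2 of the reported one.
possible_true <- (reported_citizens - 2):(reported_citizens + 2)

share_reported <- reported_citizens / total_pop
share_range <- range(possible_true / total_pop)

# Largest gap between the reported share and any share consistent with it.
max_gap <- max(abs(share_range - share_reported))
max_gap
```

For a block group of this size the gap is under two tenths of a percentage point; it grows as the counts get smaller.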

Demographics

These are also sometimes suppressed, but most of the block groups are filled in. Here’s the record layout:

  FIPS:           FIPS
  GEOID:          Geographic Identifier
  NAME:           Name of Area
  QName:          Qualifying Name
  STUSAB:         State/U.S.-Abbreviation (USPS)
  SUMLEV:         Summary Level
  GEOCOMP:        Geographic Component
  FILEID:         File Identification
  LOGRECNO:       Logical Record Number
  US:             US
  REGION:         Region
  DIVISION:       Division
  STATECE:        State (Census Code)
  STATE:          State (FIPS)
  COUNTY:         County
  COUSUB:         County Subdivision (FIPS)
  PLACE:          Place (FIPS Code)
  PLACESE:        Place (State FIPS + Place FIPS)
  TRACT:          Census Tract
  BLKGRP:         Block Group
  CONCIT:         Consolidated City
  AIANHH:         American Indian Area/Alaska Native Area/Hawaiian Home Land (Census)
  AIANHHFP:       American Indian Area/Alaska Native Area/Hawaiian Home Land (FIPS)
  AIHHTLI:        American Indian Trust Land/Hawaiian Home Land Indicator
  AITSCE:         American Indian Tribal Subdivision (Census)
  AITS:           American Indian Tribal Subdivision (FIPS)
  ANRC:           Alaska Native Regional Corporation (FIPS)
  CBSA:           Metropolitan and Micropolitan Statistical Area
  CSA:            Combined Statistical Area
  METDIV:         Metropolitan Division
  MACC:           Metropolitan Area Central City
  MEMI:           Metropolitan/Micropolitan Indicator Flag
  NECTA:          New England City and Town Combined Statistical Area
  CNECTA:         New England City and Town Area
  NECTADIV:       New England City and Town Area Division
  UA:             Urban Area
  UACP:           Urban Area Central Place
  CDCURR:         Current Congressional District ***
  SLDU:           State Legislative District Upper
  SLDL:           State Legislative District Lower
  VTD:            Voting District
  ZCTA3:          ZIP Code Tabulation Area (3-digit)
  ZCTA5:          ZIP Code Tabulation Area (5-digit)
  SUBMCD:         Subbarrio (FIPS)
  SDELM:          School District (Elementary)
  SDSEC:          School District (Secondary)
  SDUNI:          School District (Unified)
  UR:             Urban/Rural
  PCI:            Principal City Indicator
  TAZ:            Traffic Analysis Zone
  UGA:            Urban Growth Area
  BTTR:           Tribal Tract
  BTBG:           Tribal Block Group
  PUMA5:          Public Use Microdata Area - 5% File
  PUMA1:          Public Use Microdata Area - 1% File
  A00001_001:     Total Population
  B01001_001:     Total Population:
  B01001_002:     Total Population: Under 18 Years
  B01001_003:     Total Population: 18 to 34 Years
  B01001_004:     Total Population: 35 to 64 Years
  B01001_005:     Total Population: 65 and Over
  B04001_001:     Total Population
  B04001_002:     Total Population: Not Hispanic or Latino
  B04001_003:     Total Population: Not Hispanic or Latino: White Alone
  B04001_004:     Total Population: Not Hispanic or Latino: Black or African American Alone
  B04001_005:     Total Population: Not Hispanic or Latino: American Indian and Alaska Native Alone
  B04001_006:     Total Population: Not Hispanic or Latino: Asian Alone
  B04001_007:     Total Population: Not Hispanic or Latino: Native Hawaiian and Other Pacific Islander Alone
  B04001_008:     Total Population: Not Hispanic or Latino: Some Other Race Alone
  B04001_009:     Total Population: Not Hispanic or Latino: Two or More Races
  B04001_010:     Total Population: Hispanic or Latino
  A10008_001:     Households:
  A10008_002:     Households: Family Households
  A10008_003:     Households: Family Households: Married-Couple Family
  A10008_004:     Households: Family Households: Other Family
  A10008_005:     Households: Family Households: Other Family: Male Householder, No Wife Present
  A10008_006:     Households: Family Households: Other Family: Female Householder, No Husband Present
  A10008_007:     Households: Nonfamily Households
  A10008_008:     Households: Nonfamily Households: Male Householder
  A10008_009:     Households: Nonfamily Households: Female Householder
  B12001_001:     Population 25 Years and Over
  B12001_002:     Population 25 Years and Over: Less than High School
  B12001_003:     Population 25 Years and Over: High School Diploma
  B12001_004:     Population 25 Years and Over: Bachelor's Degree or Better
  A14006_001:     Median Household Income (In 2018 Inflation Adjusted Dollars)
  A14018_001:     Aggregate Household Income (In 2018 Inflation Adjusted Dollars)

orig_blockgroup_demo <- read_tsv("demographics_blockgroups.txt")


select_blockgroup_demo <- 
  orig_blockgroup_demo %>%
   select ( fips = Geo_FIPS,
            total_pop = SE_A00001_001,
            under18 = SE_B01001_002,   # B01001_002 is "Under 18 Years" in the layout; _003 is 18 to 34
            age65up = SE_B01001_005,
            white_nh = SE_B04001_003, 
            black_nh = SE_B04001_004, 
            native_am_nh = SE_B04001_005, 
            hispanic = SE_B04001_010, 
            households = SE_A10008_001,
            pop_25up = SE_B12001_001,
            no_highschool = SE_B12001_002,
            college_or_more = SE_B12001_004, 
            median_hh_inc = SE_A14006_001,
            total_hh_inc = SE_A14018_001) %>%
  left_join ( orig_blockgroup_citizen, by=c("fips" = "fips"))
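
Because this is a left join, any block group without a row in the citizenship table comes through with NA for voting_age_citizens. A minimal sketch of how that could be checked, using toy tables with invented FIPS codes and counts rather than the real files:

```r
library(dplyr)

# Toy stand-ins for the two tables; FIPS codes and counts are invented.
demo <- tibble(fips = c("040130001001", "040130001002", "040130001003"),
               total_pop = c(1200, 800, 950))
citizen <- tibble(fips = c("040130001001", "040130001003"),
                  voting_age_citizens = c(900, 700))

# Same shape of join as above: every demographics row is kept.
joined <- left_join(demo, citizen, by = "fips")

# Rows in the demographics table with no match in the citizenship table.
unmatched <- anti_join(demo, citizen, by = "fips")

sum(is.na(joined$voting_age_citizens))
unmatched
```

Running the same `anti_join()` on the real tables would list any block groups that are missing citizenship counts.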

Join to the crosswalk

precinct_demographics <- 
  maricopa_block_group_to_precinct %>%
  left_join (select_blockgroup_demo, by=c("full_geo_code" = "fips")) %>%
  group_by (precinct_num, precinct_chr, precinct_name ) %>%
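  # est_median_inc is computed first, while households still holds the block-group values
  summarise ( est_median_inc = weighted.mean (median_hh_inc, households, na.rm = TRUE ), 
              across (c( total_pop : college_or_more, total_hh_inc, voting_age_citizens), ~ sum(.x, na.rm = TRUE)), .groups = "drop") 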

Spot-check one precinct with a join:

maricopa_block_group_to_precinct %>%
  filter ( str_detect ( precinct_name, "ARCADIA")) %>%
  select ( fip = full_geo_code, precinct_name ) %>%
  inner_join (select_blockgroup_demo , by=c("fip"="fips"))
##            fip precinct_name total_pop under18 age65up white_nh black_nh
## 1 040131110003       ARCADIA      1169     208      95      785       76
## 2 040131110005       ARCADIA       804      93     173      704       65
## 3 040131110001       ARCADIA       943     178      13      599       21
## 4 040131110002       ARCADIA       812     139     212      675        0
## 5 040131110004       ARCADIA       692     151     126      671        0
##   native_am_nh hispanic households pop_25up no_highschool college_or_more
## 1           13       21        468      836            26             550
## 2            0       35        375      570             0             320
## 3            0      323        418      567            22             154
## 4            0      112        366      657             0             277
## 5            0       21        310      543             0             225
##   median_hh_inc total_hh_inc voting_age_citizens
## 1        126667     64987400                 900
## 2        102721     49733600                 665
## 3            NA     25500400                 665
## 4         93571     41985000                 655
## 5        117143     36159300                 545

That looks right.
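
The printed rows can also be used to reproduce the precinct-level income figures by hand. This sketch assumes these five block groups make up the whole ARCADIA precinct and that each is counted in full, which the crosswalk may or may not do. The household-weighted mean of medians is only an approximation of the precinct median.

```r
# Values copied from the five ARCADIA rows printed in the check.
median_hh_inc <- c(126667, 102721, NA, 93571, 117143)
households    <- c(468, 375, 418, 366, 310)
total_hh_inc  <- c(64987400, 49733600, 25500400, 41985000, 36159300)

# Household-weighted mean of the block-group medians; na.rm = TRUE drops
# the block group with a missing median together with its weight.
est_median_inc <- weighted.mean(median_hh_inc, households, na.rm = TRUE)

# Mean household income from the aggregate, which uses all five block groups.
avg_hh_income <- round(sum(total_hh_inc) / sum(households), 0)

est_median_inc
avg_hh_income
```

Under those assumptions the weighted figure comes out near 110,837 and the aggregate-based mean is 112,734. The first leaves out the block group whose median is missing, while the second includes its households and income.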

Put the latitude and longitude back in for precincts

precinct_demographics <- 
  precinct_demographics %>%
  mutate ( avg_hh_income = round(total_hh_inc / households, 0) ) %>%
  left_join ( select (precinct_latlon , precinct_num, X, Y) ,
              by=c("precinct_num" = "precinct_num") )