I downloaded some basic demographic tables for block groups from Social Explorer. The first data table just has voting-age citizens by race and ethnicity. The second has full population numbers for race and ethnicity, income in ranges (to calculate approximate medians with cumulative numbers), and education (collapsed).
The Census Bureau does a special voting tabulation that includes the estimated number of voting-age citizens. That is approximately the same as eligible voters, but it doesn’t back out people who were formerly incarcerated and have not gotten their voting rights back.
This is the ACS 2018 five-year file, which means the estimates are pooled over 2014-2018. Here’s the layout:
Variables
FIPS: FIPS
NAME: Name of Area
QNAME: Qualifying Name
NATION: Nation
STATE: State
COUNTY: County
CS: County Subdivision
CT: Census Tract
BG: Block Group
PLACE: Place
CD: Congressional District
SLDC: State Legislative District
T004_001: US Citizens 18 Years and Over
T004_002: US Citizens 18 Years and Over: Not Hispanic or Latino
T004_003: US Citizens 18 Years and Over: Not Hispanic or Latino: White Alone
T004_004: US Citizens 18 Years and Over: Not Hispanic or Latino: Black or African American Alone
T004_005: US Citizens 18 Years and Over: Not Hispanic or Latino: American Indian or Alaska Native Alone
T004_006: US Citizens 18 Years and Over: Not Hispanic or Latino: Asian Alone
T004_007: US Citizens 18 Years and Over: Not Hispanic or Latino: Native Hawaiian or Other Pacific Islander Alone
T004_008: US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**
T004_009: US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: Black or African American and White
T004_010: US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: American Indian or Alaska Native and White
T004_011: US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: American Indian or Alaska Native and Black or African American
T004_012: US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: Asian and White
T004_013: US Citizens 18 Years and Over: Not Hispanic or Latino: Two or More Races**: Remainder of Two or More Race Responses
T004_014: US Citizens 18 Years and Over: Hispanic or Latino
Note that these don’t add up: each cell is rounded to a multiple of 5, so the sum of the cells can be more than the published total. The race and ethnicity detail is suppressed in most block groups in the main census data, because the cells are too small. For that reason, just pick up the total block group citizenship, without worrying about race / ethnicity. (The nativity / foreign-born question on the main census is suppressed most of the time, too.)
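To see how rounded cells can overshoot the total, here is a minimal sketch with made-up counts. The numbers and the `round_to_5` helper are illustrative assumptions, and "nearest multiple of 5" is a simplification that may not match the Census Bureau's exact rounding rule.

```r
# Hypothetical unrounded counts for one block group (not real data).
true_cells <- c(white_nh = 423, black_nh = 38, asian_nh = 13, hispanic = 108)

# Assumed rounding rule: nearest multiple of 5.
round_to_5 <- function(x) 5 * round(x / 5)

rounded_cells <- round_to_5(true_cells)       # each cell rounded on its own
rounded_total <- round_to_5(sum(true_cells))  # the total is rounded separately

sum(rounded_cells)  # sum of the rounded cells
rounded_total       # rounded total
```

With these counts every cell happens to round up while the total rounds down, so the cells sum to more than the total even though nothing is wrong with the data.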
orig_blockgroup_citizen <- read_tsv( "us_citizen_block_grp.txt") %>%
select (fips=Geo_FIPS, voting_age_citizens = SE_T004_001)
Those are rounded, but they’ll be OK for the proportions.
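One thing worth guarding against in the read step: Arizona block group FIPS codes start with `04`, and the later joins only match if that leading zero survives. A minimal sketch, assuming the column names `Geo_FIPS` and `SE_T004_001` used in the `select()` above, with a throwaway temp file standing in for the real download:

```r
library(readr)

# Stand-in for the Social Explorer file (two illustrative rows).
tmp <- tempfile(fileext = ".txt")
writeLines(c("Geo_FIPS\tSE_T004_001",
             "040131110003\t900",
             "040131110005\t665"), tmp)

# Declare the column types so the FIPS code is kept as text,
# leading zero included, rather than left to type guessing.
citizen_check <- read_tsv(tmp,
                          col_types = cols(Geo_FIPS = col_character(),
                                           SE_T004_001 = col_double()))

citizen_check$Geo_FIPS
```

Declaring `col_types` also makes the read reproducible if a later download has different values in its first rows.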
The second table, with the full demographics, is also suppressed in places, but most of the block groups are filled in. Here’s the record layout:
FIPS: FIPS
GEOID: Geographic Identifier
NAME: Name of Area
QName: Qualifying Name
STUSAB: State/U.S.-Abbreviation (USPS)
SUMLEV: Summary Level
GEOCOMP: Geographic Component
FILEID: File Identification
LOGRECNO: Logical Record Number
US: US
REGION: Region
DIVISION: Division
STATECE: State (Census Code)
STATE: State (FIPS)
COUNTY: County
COUSUB: County Subdivision (FIPS)
PLACE: Place (FIPS Code)
PLACESE: Place (State FIPS + Place FIPS)
TRACT: Census Tract
BLKGRP: Block Group
CONCIT: Consolidated City
AIANHH: American Indian Area/Alaska Native Area/Hawaiian Home Land (Census)
AIANHHFP: American Indian Area/Alaska Native Area/Hawaiian Home Land (FIPS)
AIHHTLI: American Indian Trust Land/Hawaiian Home Land Indicator
AITSCE: American Indian Tribal Subdivision (Census)
AITS: American Indian Tribal Subdivision (FIPS)
ANRC: Alaska Native Regional Corporation (FIPS)
CBSA: Metropolitan and Micropolitan Statistical Area
CSA: Combined Statistical Area
METDIV: Metropolitan Division
MACC: Metropolitan Area Central City
MEMI: Metropolitan/Micropolitan Indicator Flag
NECTA: New England City and Town Combined Statistical Area
CNECTA: New England City and Town Area
NECTADIV: New England City and Town Area Division
UA: Urban Area
UACP: Urban Area Central Place
CDCURR: Current Congressional District ***
SLDU: State Legislative District Upper
SLDL: State Legislative District Lower
VTD: Voting District
ZCTA3: ZIP Code Tabulation Area (3-digit)
ZCTA5: ZIP Code Tabulation Area (5-digit)
SUBMCD: Subbarrio (FIPS)
SDELM: School District (Elementary)
SDSEC: School District (Secondary)
SDUNI: School District (Unified)
UR: Urban/Rural
PCI: Principal City Indicator
TAZ: Traffic Analysis Zone
UGA: Urban Growth Area
BTTR: Tribal Tract
BTBG: Tribal Block Group
PUMA5: Public Use Microdata Area - 5% File
PUMA1: Public Use Microdata Area - 1% File
A00001_001: Total Population
B01001_001: Total Population:
B01001_002: Total Population: Under 18 Years
B01001_003: Total Population: 18 to 34 Years
B01001_004: Total Population: 35 to 64 Years
B01001_005: Total Population: 65 and Over
B04001_001: Total Population
B04001_002: Total Population: Not Hispanic or Latino
B04001_003: Total Population: Not Hispanic or Latino: White Alone
B04001_004: Total Population: Not Hispanic or Latino: Black or African American Alone
B04001_005: Total Population: Not Hispanic or Latino: American Indian and Alaska Native Alone
B04001_006: Total Population: Not Hispanic or Latino: Asian Alone
B04001_007: Total Population: Not Hispanic or Latino: Native Hawaiian and Other Pacific Islander Alone
B04001_008: Total Population: Not Hispanic or Latino: Some Other Race Alone
B04001_009: Total Population: Not Hispanic or Latino: Two or More Races
B04001_010: Total Population: Hispanic or Latino
A10008_001: Households:
A10008_002: Households: Family Households
A10008_003: Households: Family Households: Married-Couple Family
A10008_004: Households: Family Households: Other Family
A10008_005: Households: Family Households: Other Family: Male Householder, No Wife Present
A10008_006: Households: Family Households: Other Family: Female Householder, No Husband Present
A10008_007: Households: Nonfamily Households
A10008_008: Households: Nonfamily Households: Male Householder
A10008_009: Households: Nonfamily Households: Female Householder
B12001_001: Population 25 Years and Over
B12001_002: Population 25 Years and Over: Less than High School
B12001_003: Population 25 Years and Over: High School Diploma
B12001_004: Population 25 Years and Over: Bachelor's Degree or Better
A14006_001: Median Household Income (In 2018 Inflation Adjusted Dollars)
A14018_001: Aggregate Household Income (In 2018 Inflation Adjusted Dollars)
orig_blockgroup_demo <- read_tsv("demographics_blockgroups.txt")
select_blockgroup_demo <-
orig_blockgroup_demo %>%
select ( fips = Geo_FIPS,
total_pop = SE_A00001_001,
under18 = SE_B01001_002, # Under 18 Years per the layout (B01001_003 is 18 to 34)
age65up = SE_B01001_005,
white_nh = SE_B04001_003,
black_nh = SE_B04001_004,
native_am_nh = SE_B04001_005,
hispanic = SE_B04001_010,
households = SE_A10008_001,
pop_25up = SE_B12001_001,
no_highschool = SE_B12001_002,
college_or_more = SE_B12001_004,
median_hh_inc = SE_A14006_001,
total_hh_inc = SE_A14018_001) %>%
left_join ( orig_blockgroup_citizen, by=c("fips" = "fips"))
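Because a `left_join()` silently leaves `NA` for unmatched keys and multiplies rows for duplicated keys, it can help to check both before trusting the combined table. This is a sketch on toy stand-ins for the two block-group tables (`demo_toy` and `citizen_toy` are hypothetical names with illustrative values), not a check that was run on the real files:

```r
library(dplyr)

# Toy stand-ins for the demographics and citizenship tables.
demo_toy <- tibble(fips = c("040131110001", "040131110002", "040131110003"),
                   total_pop = c(943, 812, 1169))
citizen_toy <- tibble(fips = c("040131110001", "040131110003"),
                      voting_age_citizens = c(665, 900))

# Block groups in the demographics table with no citizenship row.
unmatched <- anti_join(demo_toy, citizen_toy, by = "fips")

# Keys that appear more than once on the right side would duplicate rows.
dup_keys <- citizen_toy %>%
  count(fips) %>%
  filter(n > 1)

nrow(unmatched)
nrow(dup_keys)
```

On the real data the same two checks would use `select_blockgroup_demo` and `orig_blockgroup_citizen` in place of the toy tables.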
precinct_demographics <-
maricopa_block_group_to_precinct %>%
left_join (select_blockgroup_demo, by=c("full_geo_code" = "fips")) %>%
group_by (precinct_num, precinct_chr, precinct_name ) %>%
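# est_median_inc is a household-weighted mean of block group medians: an approximation, not a true precinct median
summarise ( est_median_inc = weighted.mean (median_hh_inc, households, na.rm=TRUE ),
across (c( total_pop : college_or_more, total_hh_inc, voting_age_citizens), ~ sum(.x, na.rm=TRUE)), .groups="drop")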
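One side effect of `na.rm` in the summary above: a precinct whose block groups are all suppressed comes out as 0 for the sums and `NaN` for the weighted mean, rather than `NA`. A small base R sketch of that behavior, with a hypothetical `sum_or_na()` helper as one possible way to keep the distinction (the values are made up):

```r
# All block groups suppressed for a hypothetical precinct.
suppressed <- c(NA_real_, NA_real_)
weights    <- c(120, 80)

sum(suppressed, na.rm = TRUE)                     # 0, not NA
weighted.mean(suppressed, weights, na.rm = TRUE)  # NaN (0 divided by 0)

# Hypothetical helper: return NA when nothing was observed,
# otherwise sum the values that are present.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

sum_or_na(suppressed)    # NA
sum_or_na(c(10, NA, 5))  # 15
```

Whether a zero or an `NA` is the better result depends on how the precinct totals are used downstream, so this is a choice rather than a fix.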
Spot-check one precinct with a join:
maricopa_block_group_to_precinct %>%
filter ( str_detect ( precinct_name, "ARCADIA")) %>%
select ( fip = full_geo_code, precinct_name ) %>%
inner_join (select_blockgroup_demo , by=c("fip"="fips"))
## fip precinct_name total_pop under18 age65up white_nh black_nh
## 1 040131110003 ARCADIA 1169 208 95 785 76
## 2 040131110005 ARCADIA 804 93 173 704 65
## 3 040131110001 ARCADIA 943 178 13 599 21
## 4 040131110002 ARCADIA 812 139 212 675 0
## 5 040131110004 ARCADIA 692 151 126 671 0
## native_am_nh hispanic households pop_25up no_highschool college_or_more
## 1 13 21 468 836 26 550
## 2 0 35 375 570 0 320
## 3 0 323 418 567 22 154
## 4 0 112 366 657 0 277
## 5 0 21 310 543 0 225
## median_hh_inc total_hh_inc voting_age_citizens
## 1 126667 64987400 900
## 2 102721 49733600 665
## 3 NA 25500400 665
## 4 93571 41985000 655
## 5 117143 36159300 545
That looks right. Now add average household income and the X and Y columns from precinct_latlon:
precinct_demographics <-
precinct_demographics %>%
mutate ( avg_hh_income = round(total_hh_inc / households, 0) ) %>%
left_join ( select (precinct_latlon , precinct_num, X, Y) ,
by=c("precinct_num" = "precinct_num") )
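As a worked example of the two income measures, the arithmetic can be reproduced by hand from the five ARCADIA rows printed in the spot check. This assumes those five rows are the only block groups contributing to the group, which the name filter does not guarantee, so treat the results as an illustration of the calculation rather than the precinct's final figures.

```r
# Values copied from the five ARCADIA rows in the spot check.
median_hh_inc <- c(126667, 102721, NA, 93571, 117143)
households    <- c(468, 375, 418, 366, 310)
total_hh_inc  <- c(64987400, 49733600, 25500400, 41985000, 36159300)

# Household-weighted mean of the block group medians.
# With na.rm = TRUE the NA median and its 418 households are both dropped.
est_median_inc <- weighted.mean(median_hh_inc, households, na.rm = TRUE)

# Mean household income from the aggregate, which uses all five rows.
avg_hh_income <- round(sum(total_hh_inc) / sum(households), 0)

est_median_inc  # about 110837
avg_hh_income   # 112734
```

The two figures are built from different sets of households here: the block group with the missing median drops out of `est_median_inc` but still counts toward `avg_hh_income`.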