This EDA looks at the 2019 data from the Census of Governments Annual Survey of Public Employment & Payroll (ASPEP). The survey is annual, with data going back to at least 1992. It contains employee counts and payroll at various levels of government, including municipalities, counties, and states. This EDA focuses on municipal governments.
The data cover 1,561 counties (of about 3,143 in total), 2,918 municipalities (of about 19,522 as of 2012), 2,961 special districts (of about 37K), and 540 townships (of about 16K). However, the data seem to include all of the larger counties and municipalities, so if we are looking at the top 200 or 300 it shouldn’t be a problem.
## # A tibble: 6 × 2
##   lvl_govt                        n
##   <chr>                       <int>
## 1 County                       1561
## 2 Independent School District  3366
## 3 Municipality                 2918
## 4 Special District             2961
## 5 State                          50
## 6 Township                      540
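To put the coverage figures side by side, here is a minimal base R sketch that divides the counts in the table above by the approximate totals quoted in the text. The vector names `in_sample`, `universe`, and `coverage_pct` are made up for illustration, and the totals for special districts and townships are the rounded 37K and 16K figures, so the resulting shares are only rough.

```r
# Governments present in the 2019 ASPEP data (from the count table above)
in_sample <- c(County = 1561, Municipality = 2918,
               `Special District` = 2961, Township = 540)

# Approximate totals quoted in the text (37K and 16K are rounded figures,
# and the municipality total is from 2012)
universe <- c(County = 3143, Municipality = 19522,
              `Special District` = 37000, Township = 16000)

# Share of each government type that appears in the data, in percent
coverage_pct <- round(100 * in_sample / universe[names(in_sample)], 1)
print(coverage_pct)
```

On these numbers, about half of all counties are in the data but only around 15% of municipalities, which is why the point about the larger governments being covered matters.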
This is still being looked into, but for the functions we care about, none of the data flags seem problematic. Flags C, K, R, T, U, V, and Z denote that the value comes from reported data, and flags A, B, D, G, J, P, Q, and X denote that the value was imputed.
## # A tibble: 8 × 2
##   fte_flag     n
##   <chr>    <int>
## 1 C         1635
## 2 G         5275
## 3 P           39
## 4 Q          135
## 5 R        16977
## 6 T         1013
## 7 V            1
## 8 X          195
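As a quick check on how much of the FTE data is imputed, the flag counts above can be grouped into reported and imputed. This is a sketch in base R, using the flag groupings described in the text; the object names (`reported_flags`, `imputed_flags`, `flag_counts`, `by_source`, `share_pct`) are hypothetical, and the counts are copied by hand from the table rather than read from the ASPEP file.

```r
# Flag groups as described in the text above
reported_flags <- c("C", "K", "R", "T", "U", "V", "Z")
imputed_flags  <- c("A", "B", "D", "G", "J", "P", "Q", "X")

# Flag counts copied from the fte_flag table above
flag_counts <- data.frame(
  fte_flag = c("C", "G", "P", "Q", "R", "T", "V", "X"),
  n        = c(1635, 5275, 39, 135, 16977, 1013, 1, 195)
)

# Label each flag; anything outside both groups is left as NA for review
flag_counts$source <- ifelse(
  flag_counts$fte_flag %in% reported_flags, "reported",
  ifelse(flag_counts$fte_flag %in% imputed_flags, "imputed", NA_character_)
)

# Row totals and percentage share by source
by_source <- tapply(flag_counts$n, flag_counts$source, sum)
share_pct <- round(100 * by_source / sum(flag_counts$n), 1)
print(by_source)
print(share_pct)
```

For the rows summarised in the table, this gives roughly 78% reported and 22% imputed, with flag G accounting for most of the imputed values.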
Note that payroll is reported as a 31-day monthly equivalent value for the month of March. Annual pay here was crudely estimated as 12 * Payroll / # FTEs.
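The estimate above can be written as a small helper. This is only a sketch: the function name `estimate_annual_pay`, the column names `march_payroll` and `fte`, and the example values are all hypothetical and not taken from the ASPEP file. Returning `NA` where the FTE count is zero or missing is an added assumption, made so that the division does not produce `Inf` or `NaN`.

```r
# Crude annual pay per FTE: 12 * March payroll / FTE count.
# Returns NA where FTE is zero, negative, or missing (an assumption made here
# to avoid Inf/NaN results from the division).
estimate_annual_pay <- function(march_payroll, fte) {
  ifelse(is.na(fte) | fte <= 0, NA_real_, 12 * march_payroll / fte)
}

# Hypothetical values for illustration only (not from the ASPEP data)
example <- data.frame(
  march_payroll = c(500000, 1200000, 0),
  fte           = c(100, 200, 0)
)
example$annual_pay <- estimate_annual_pay(example$march_payroll, example$fte)
print(example)
```

Because the input is a single month scaled by 12, the result ignores anything that varies over the year, which is part of why the estimate is described as crude.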