The most recent data we have access to is the ACS 2019 estimates, available as both one-year and five-year estimates. This data was released in fall 2020. More information on the data release timelines and contents is here. The one-year data only includes estimates for geographic areas with populations of 65,000 or more, so we’ll want to use the 2015-2019 five-year estimates to capture data at the county level.

I think the ACS variable that best reflects our research interest is the number of people in each county with limited English proficiency (LEP). We can use this variable to compare the total number of people with LEP in each county as well as the percentage of each county’s population with LEP. The Census Bureau defines a respondent as LEP if they are age 5 or older, speak a language other than English at home, and self-identify as speaking English less than “very well”. This means they responded with either “well”, “not well”, or “not at all”. Alternatively, we could use variables that capture the number of households with LEP or the number of households where English is not spoken at home.
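As a minimal sketch of the two measures described above (the LEP count and the LEP percentage), the Python snippet below computes an LEP share and ranks counties by it. The function name, field names, and all figures are hypothetical and are not actual ACS values. Using the population age 5 and over as the denominator is an assumption based on the LEP definition.

```python
def lep_percent(lep_count: int, pop_5_and_over: int) -> float:
    """Percent of the population age 5+ who speak English less than 'very well'."""
    if pop_5_and_over <= 0:
        raise ValueError("population must be positive")
    return 100.0 * lep_count / pop_5_and_over


# Hypothetical counties with made-up numbers, for illustration only.
counties = [
    {"name": "County A", "lep": 1200, "pop5": 10000},
    {"name": "County B", "lep": 300, "pop5": 20000},
]

for county in counties:
    county["lep_pct"] = lep_percent(county["lep"], county["pop5"])

# Rank counties from highest to lowest LEP share.
ranked = sorted(counties, key=lambda c: c["lep_pct"], reverse=True)
```

Ranking by count instead of percentage only requires changing the sort key to `c["lep"]`.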

I retrieved the LEP data from ACS Table S1601. The table below shows the top few counties by percent of population with LEP, most of which are in Texas. The histogram shows the distribution of the data for all U.S. counties: in the vast majority of counties, less than 10% of people have LEP. Still, it seems that translation services would be essential for a CAH that serves a population with 10% LEP.

The value for each county is an estimate, so it comes with a margin of error. It would be important to incorporate the margin of error into any statistical analysis using this data, but I didn’t do much with it here because, for the moment, we’re just planning to use the data to create a list of counties. For reference, the map below is another way to view the data for all U.S. counties. It shows which counties have a higher LEP population, which counties have a higher percentage of LEP residents, and which counties have CAHs.
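If the margin of error is needed later, one way to approximate it for a percentage derived from two ACS estimates is the Census Bureau's standard approximation for a proportion, sketched below in Python. The function name and the input figures are hypothetical. Where the published table already reports a margin of error for the percentage, that published value should be preferred over this approximation.

```python
import math


def moe_proportion(x: float, moe_x: float, y: float, moe_y: float) -> float:
    """Approximate margin of error for the proportion p = x / y.

    x, moe_x: numerator estimate (e.g. LEP count) and its margin of error.
    y, moe_y: denominator estimate (e.g. population 5+) and its margin of error.
    """
    p = x / y
    radicand = moe_x ** 2 - (p ** 2) * (moe_y ** 2)
    if radicand < 0:
        # Fallback when the value under the root is negative:
        # use the ratio form, which adds the two terms instead.
        radicand = moe_x ** 2 + (p ** 2) * (moe_y ** 2)
    return math.sqrt(radicand) / y


# Made-up example values, not real ACS estimates.
moe = moe_proportion(x=1200, moe_x=150, y=10000, moe_y=200)
print(f"LEP share: {1200 / 10000:.1%} +/- {moe:.1%}")
```

The result is on the proportion scale, so multiply by 100 to express it in percentage points.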

The previous figures showed the LEP data across the country, but we’re specifically looking for counties that have CAHs and are considered rural. Before getting into how we define rural, we can take a quick look at LEP rates among counties with CAHs. The table below is the same as the first, but only includes counties with CAHs. There were 1,147 counties with CAHs compared to the overall county total of just over 3,000.

Selecting Rural Counties

OMB Definition

There are a few different ways to classify rural and urban areas, as discussed here. I tried using a couple of these to restrict the counties we consider. First, I used a definition similar to that of the Office of Management and Budget, which classifies counties as metro or nonmetro. Other government agencies use this same classification but break the two groups into subcategories. The USDA provides “rural-urban continuum codes”, which contain three metro and six nonmetro categories. The National Center for Health Statistics uses a six-level scheme, which is available here. These categories were last updated in 2013 and are planned to be updated again in 2023.

The OMB data is broken into the categories listed below. I wasn’t sure exactly which categories made the most sense to classify as rural, but ultimately chose to divide urban and rural similarly to the USDA (see the next set of classification data). All counties in metro areas (only the first two categories) were classified as urban, while the rest were classified as rural. This is certainly an imperfect way to do things, but it seemed to make general sense to me based on how MN counties were ranked on this scale. Based on this system, 1,236 counties are considered urban while 1,996 are considered rural.

  • 1: Large - in a metro area with at least 1 million residents
  • 2: Small - in a metro area with fewer than 1 million residents
  • 3: Micropolitan adjacent to a large metro area
  • 4: Noncore adjacent to a large metro area
  • 5: Micropolitan adjacent to a small metro area
  • 6: Noncore adjacent to a small metro and contains a town of at least 2,500 residents
  • 7: Noncore adjacent to a small metro and does not contain a town of at least 2,500 residents
  • 8: Micropolitan not adjacent to a metro area
  • 9: Noncore adjacent to micro area and contains a town of 2,500-19,999 residents
  • 10: Noncore adjacent to micro area and does not contain a town of at least 2,500 residents
  • 11: Noncore not adjacent to a metro/micro area and contains a town of 2,500 or more residents
  • 12: Noncore not adjacent to a metro/micro area and does not contain a town of at least 2,500 residents
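The urban/rural split described above (the first two categories are urban, the rest rural) can be sketched as a small Python function. The function name is hypothetical, and the cutoff simply mirrors the choice described in the text.

```python
def uic_to_rural_urban(uic: int) -> str:
    """Collapse the 12-level code into the binary split used in this memo.

    Codes 1-2 are metro categories (treated as urban); codes 3-12 are
    micropolitan or noncore categories (treated as rural).
    """
    if not 1 <= uic <= 12:
        raise ValueError(f"unexpected code: {uic}")
    return "urban" if uic <= 2 else "rural"


# Label each of the 12 codes with its binary classification.
labels = {code: uic_to_rural_urban(code) for code in range(1, 13)}
```

Moving the cutoff (for example, treating micropolitan counties adjacent to a large metro as urban) would only require changing the comparison in the return statement.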

The table below shows how many counties are in each group.

## # A tibble: 12 x 2
## # Groups:   uic_2013 [12]
##    uic_2013     n
##       <dbl> <int>
##  1        1   472
##  2        2   764
##  3        3   132
##  4        4   149
##  5        5   245
##  6        6   344
##  7        7   164
##  8        8   269
##  9        9   184
## 10       10   189
## 11       11   125
## 12       12   184

I also used the USDA data and collapsed its nine categories into a simple metro (the first three) or nonmetro (the rest) indicator. We could refine this scheme. The original categories were:

  • 1: Metro - Counties in metro areas of 1 million population or more
  • 2: Metro - Counties in metro areas of 250,000 to 1 million population
  • 3: Metro - Counties in metro areas of fewer than 250,000 population
  • 4: Nonmetro - Urban population of 20,000 or more, adjacent to a metro area
  • 5: Nonmetro - Urban population of 20,000 or more, not adjacent to a metro area
  • 6: Nonmetro - Urban population of 2,500 to 19,999, adjacent to a metro area
  • 7: Nonmetro - Urban population of 2,500 to 19,999, not adjacent to a metro area
  • 8: Nonmetro - Completely rural or less than 2,500 urban population, adjacent to a metro area
  • 9: Nonmetro - Completely rural or less than 2,500 urban population, not adjacent to a metro area

The table below shows the number of counties that fall into each USDA category. 1,236 counties fall into one of the urban classifications, while 1,996 are considered rural. This is the same rural-urban distribution as the OMB data, just broken down into different subcategories. The map below shows where rural and urban counties are in the U.S. according to both datasets (the two classifications are identical at this level).

## # A tibble: 9 x 2
## # Groups:   rucc_2013 [9]
##   rucc_2013     n
##       <dbl> <int>
## 1         1   472
## 2         2   395
## 3         3   369
## 4         4   217
## 5         5    98
## 6         6   597
## 7         7   436
## 8         8   220
## 9         9   428
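As a quick check on the metro/nonmetro totals, the Python sketch below collapses the nine category counts from the table above into the binary indicator. The counts are copied from the table; the variable and function names are hypothetical.

```python
# County counts per rucc_2013 category, copied from the table above.
rucc_counts = {1: 472, 2: 395, 3: 369, 4: 217, 5: 98, 6: 597, 7: 436, 8: 220, 9: 428}


def rucc_is_metro(code: int) -> bool:
    """Categories 1-3 are metro; categories 4-9 are nonmetro."""
    return code <= 3


metro_total = sum(n for code, n in rucc_counts.items() if rucc_is_metro(code))
nonmetro_total = sum(n for code, n in rucc_counts.items() if not rucc_is_metro(code))

print(metro_total, nonmetro_total)  # 1236 1996
```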

Using the definition of rural outlined above, I filtered the data to only include counties that were both rural and had CAHs. The table below shows those counties listed according to the percentage of their population with LEP. It also includes the CAH in each county. This is a good starting point for a list of counties for study. In addition to choosing counties with a high percent of residents with LEP, I assume we will also want to prioritize geographic diversity. Another thing to consider is the extent to which patients, especially at CAHs in very small counties, come from nearby counties. We may also want to think about whether we’d like a diversity of types of rural counties, using either the OMB or USDA subclassifications.
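The filtering step described above can be sketched in Python as follows. The records, identifiers, and field names are hypothetical placeholders rather than values from the actual dataset.

```python
# Hypothetical county records; none of these values come from the real data.
counties = [
    {"id": "C1", "rural": True, "has_cah": True, "lep_pct": 14.2},
    {"id": "C2", "rural": False, "has_cah": True, "lep_pct": 22.0},
    {"id": "C3", "rural": True, "has_cah": False, "lep_pct": 18.5},
    {"id": "C4", "rural": True, "has_cah": True, "lep_pct": 3.1},
]

# Keep counties that are both rural and have a CAH, highest LEP share first.
candidates = sorted(
    (c for c in counties if c["rural"] and c["has_cah"]),
    key=lambda c: c["lep_pct"],
    reverse=True,
)

# Optional cutoff, e.g. counties where at least 10% of residents have LEP.
high_lep = [c for c in candidates if c["lep_pct"] >= 10.0]
```

Other selection criteria, such as geographic diversity, could be layered on as extra fields and filters in the same way.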

Overall, it seems like there’s a pretty good selection of counties to choose from. There are about 60 CAH counties where at least 10 percent of the population has LEP. One note about the table: because of the type of join I used, any county with more than one CAH is listed more than once (once per CAH).
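The duplication comes from joining one county row to many CAH rows. The Python sketch below illustrates the effect and one possible way to collapse the result back to one row per county. All names and values are hypothetical, and this is not necessarily how the original join was written.

```python
from collections import defaultdict

# Hypothetical inputs: county attributes keyed by id, and (county id, CAH name) pairs.
counties = {
    "C1": {"name": "County A", "lep_pct": 12.5},
    "C2": {"name": "County B", "lep_pct": 10.1},
}
cahs = [("C1", "Hospital One"), ("C1", "Hospital Two"), ("C2", "Hospital Three")]

# One-to-many join: one output row per CAH, so County A appears twice.
joined = [{"county_id": cid, **counties[cid], "cah": cah} for cid, cah in cahs]

# Collapse to one row per county by gathering the CAH names into a list.
cahs_by_county = defaultdict(list)
for cid, cah in cahs:
    cahs_by_county[cid].append(cah)

collapsed = [
    {"county_id": cid, **counties[cid], "cahs": names}
    for cid, names in cahs_by_county.items()
]
```

Counting distinct county ids in the joined rows, rather than counting rows, gives the number of counties without changing the join.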

Census Bureau Definition

Another option for defining rural comes from the Census Bureau, which breaks things into urbanized areas, urban clusters, and rural areas. This is a bit different because it deals with smaller geographic units than counties. However, the Bureau has data for download that tracks the percentage of people in each county living in a rural area. They define rural areas as those outside of urbanized areas and urban clusters, which are defined further in the previous links. If we wanted to, we could rank counties by percent rural rather than classifying them as either rural or not. The Bureau further classifies counties as urban if they are less than 50% rural, mostly rural if they’re 50-99% rural, and rural if they’re 100% rural. The map below shows how counties are divided into these three groups.
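The three-way grouping described above can be sketched as a small Python function. The function name is hypothetical, and treating any value from 50 up to (but not including) 100 as "mostly rural" is an assumption about how fractional percentages are handled.

```python
def census_rurality(pct_rural: float) -> str:
    """Classify a county by the percent of its population living in rural areas."""
    if not 0 <= pct_rural <= 100:
        raise ValueError(f"percent rural out of range: {pct_rural}")
    if pct_rural < 50:
        return "urban"
    if pct_rural < 100:  # assumption: 99.5 counts as mostly rural, not rural
        return "mostly rural"
    return "rural"


# Example: keep only the non-urban counties (made-up values).
pct_rural_by_county = {"C1": 12.0, "C2": 63.4, "C3": 100.0}
non_urban = [cid for cid, pct in pct_rural_by_county.items()
             if census_rurality(pct) != "urban"]
```

Sorting on the raw percentage instead of the label would give the ranking by percent rural mentioned above.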

The table below shows the data once it’s filtered to exclude urban counties according to this definition and to exclude those without CAHs. It includes a slightly smaller set of “rural” counties with CAHs than the OMB and USDA definitions: 952 compared to 1,078 (keeping in mind that counties with multiple CAHs are counted multiple times).

Other measures of rural

The USDA has another definition of rural based on rural-urban commuting area (RUCA) codes, which are apparently particularly good at finding rural areas within metro counties. FORHP uses this for determining rural eligibility. This data is available at the tract and ZIP level, so it could be aggregated to the county level. One use for this could be to double-check that the CAHs in the counties we choose are located in a rural tract (are there any CAHs that aren’t?).

There are also a few other ways that various government agencies determine rurality, discussed more here. One additional way of thinking about the problem that could be useful is the “Frontier and Remote Area Codes”, which place more emphasis on the geographic isolation of an area. This ties in well with the rules for CAHs. This data is given at the PUMA level and could probably be joined with the raw ACS data.

urban influence codes - https://www.ers.usda.gov/data-products/urban-influence-codes/