Directions

Task 1

Import the hospital POI data (link) and make sure it is tidy and ready for analysis.

Response: In this step, I first removed potential duplicates by keeping only one record for each hospital id, then checked for missing values across all fields. The results showed that:
- The hospital POI dataset has 119 unique records.
- No missing values in all fields (id, name, address, primary_type, types, status, rating, rating_count, and geometry).
- The dataset is clean, complete, and tidy, and thus ready for spatial analysis.

# import the dataset
hosp0 <- st_read("https://raw.githubusercontent.com/ujhwang/urban-analytics-2025/main/Assignment/mini_3/hospital_11counties.geojson")

## Reading layer `hospital_11counties' from data source 
##   `https://raw.githubusercontent.com/ujhwang/urban-analytics-2025/main/Assignment/mini_3/hospital_11counties.geojson' 
##   using driver `GeoJSON'
## Simple feature collection with 119 features and 8 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -84.73147 ymin: 33.42719 xmax: -83.92052 ymax: 34.24585
## Geodetic CRS:  WGS 84

# print the dataset
print(head(hosp0, 5))

## Simple feature collection with 5 features and 8 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -84.50847 ymin: 33.42719 xmax: -84.09618 ymax: 33.50894
## Geodetic CRS:  WGS 84
##                            id                                            name
## 1 ChIJZfvWaSfv9IgRO0Er_vksXXU                                Piedmont Fayette
## 2 ChIJg5T0Tsxb9IgRmDdy4cVBydU                              Primary Pediatrics
## 3 ChIJeZlNG9RE9IgRsiEsG6aOOPI Aylo Health - Primary Care at McDonough, Hwy 81
## 4 ChIJn4tmsLD69IgR3Jdd7nzpAkY         Southeast Medical Group at Fayetteville
## 5 ChIJA3catM1b9IgRa7fpG9cm3Iw                          Resurgens Orthopaedics
##                                             address primary_type
## 1        1255 Hwy 54 W, Fayetteville, GA 30214, USA     hospital
## 2   110-A Regency Park Dr, McDonough, GA 30253, USA     hospital
## 3       65 Old Jackson Rd, McDonough, GA 30252, USA     hospital
## 4 105 Carnegie Pl #103, Fayetteville, GA 30214, USA     hospital
## 5     156 Foster Dr Ste B, McDonough, GA 30253, USA     hospital
##                                                    types      status rating
## 1        hospital,health,point_of_interest,establishment OPERATIONAL    2.6
## 2        hospital,health,point_of_interest,establishment OPERATIONAL    4.0
## 3 hospital,doctor,health,point_of_interest,establishment OPERATIONAL    4.7
## 4 hospital,doctor,health,point_of_interest,establishment OPERATIONAL    4.5
## 5 hospital,doctor,health,point_of_interest,establishment OPERATIONAL    4.8
##   rating_count                   geometry
## 1         1096 POINT (-84.50847 33.45259)
## 2           69 POINT (-84.17159 33.43241)
## 3         1165 POINT (-84.09618 33.42719)
## 4          739 POINT (-84.42877 33.50894)
## 5         3180  POINT (-84.2052 33.46362)

# 1. Remove duplicated rows
hosp_unique <- hosp0 %>% distinct(id, .keep_all=T)
# 2. Handle missing values
# Count the number of NAs in each column
hosp_unique %>% map_dbl(., ~sum(is.na(.x)))

##           id         name      address primary_type        types       status 
##            0            0            0            0            0            0 
##       rating rating_count     geometry 
##            0            0            0

print(paste0("Before: ", nrow(hosp0)))

## [1] "Before: 119"

print(paste0("After: ", nrow(hosp_unique)))

## [1] "After: 119"

Task 2

Explore variables in the Census ACS 5-year estimates, carefully select those that are relevant for an equity analysis, and provide justification for each variable you select.

Think about how each variable captures different circumstances and needs for healthcare.
Response: I chose to analyze healthcare equity with a focus on older adults. Older populations often have higher and more urgent healthcare needs due to chronic conditions, mobility limitations, and greater dependence on timely access to hospitals. To capture both demographic and socioeconomic aspects of vulnerability, I selected two key variables from the Census ACS 5-year estimates:
- Population aged 65 and over (tract level): This variable directly identifies where older adults are concentrated across Metro Atlanta. Since older adults have higher healthcare utilization rates, the spatial distribution of seniors highlights areas with greater baseline demand for medical services. Comparing this distribution with hospital locations allows us to evaluate whether healthcare infrastructure aligns with the needs of aging communities.
- Median household income of households headed by older adults (tract level): Household income serves as a proxy for socioeconomic status, which is strongly associated with health outcomes through the well-established “wealth–health” gradient. In general, wealthier households are able to afford healthier living environments, better preventive care, and greater treatment options, while lower-income households face barriers such as delayed care, reliance on overburdened facilities, and increased exposure to risk factors. By specifically focusing on the income of older householders, I capture the unique economic conditions of seniors—many of whom live on fixed incomes after retirement. This is particularly important because limited financial resources compound health vulnerabilities, reducing the ability of older adults to travel longer distances or afford specialized care.
References:

[1] McMaughan, D. J., Oloruntoba, O., & Smith, M. L. (2020). Socioeconomic status and access to healthcare: interrelated drivers for healthy aging. Frontiers in public health, 8, 231.

[2] Penning, M. J., & Zheng, C. (2016). Income inequities in health care utilization among adults aged 50 and older. Canadian Journal on Aging/La Revue canadienne du vieillissement, 35(1), 55-69.

[3] Chi, Z., Lun, H., Ma, J., & Zhou, Y. (2024). Income inequality and healthcare utilization of the older adults-based on a study in three provinces and six cities in China. Frontiers in Public Health, 12, 1435162.

[4] Cheng, L., Yang, M., De Vos, J., & Witlox, F. (2020). Examining geographical accessibility to multi-tier hospital care services for the elderly: A focus on spatial equity. Journal of Transport & Health, 19, 100926.

census_api_key(Sys.getenv("CENSUS_API_KEY"))

## To install your API key for use in future sessions, run this function with `install = TRUE`.

vars23 <- load_variables(2023, "acs5", cache = TRUE)

Task 3

Download ACS data for the selected variables at the Census Tract level for the 11 counties using the tidycensus.

Response: I used the tidycensus package to pull ACS 5-year estimates at the census tract level for the 11 Metro Atlanta counties. Because my equity analysis focuses on older adults, I downloaded:
- Population 65+ (from table B01001, summed across all 65+ bins) and total population (B01003_001) to compute % seniors.
- Median household income for senior-headed households (B19049_003 for age 65–74; B19049_004 for age 75+).
Handling API issues: When I initially queried all counties in one request (with geometry), I hit intermittent Census API rate/timeout errors. To make the pull robust, I queried county-by-county and inserted a short Sys.sleep() between requests to respect rate limits. I also pulled attributes without geometry for the income table (lighter requests) and merged to tract geometries afterward.

# counties
counties <- c("Cherokee","Clayton","Cobb","DeKalb","Douglas","Fayette","Forsyth","Fulton","Gwinnett","Henry","Rockdale")

# 1. Population 65+
age_vars <- c(
  # Male 65+
  "B01001_020","B01001_021","B01001_022","B01001_023","B01001_024","B01001_025",
  # Female 65+
  "B01001_044","B01001_045","B01001_046","B01001_047","B01001_048","B01001_049",
  # Total population
  "B01003_001"
)

age_long <- map_dfr(counties, function(cty){
  message("Pulling age 65+ with geometry: ", cty)
  out <- suppressMessages(
    get_acs(
      geography = "tract",
      state = "GA",
      county = cty,
      variables = age_vars,
      year = 2023,
      survey = "acs5",
      geometry = TRUE
    )
  )
  Sys.sleep(1) # pause to avoid rate limit
  out
})

age_sf <- age_long %>%
  mutate(
    is_male65 = variable %in% c("B01001_020","B01001_021","B01001_022","B01001_023","B01001_024","B01001_025"),
    is_fem65  = variable %in% c("B01001_044","B01001_045","B01001_046","B01001_047","B01001_048","B01001_049"),
    is_total  = variable == "B01003_001"
  ) %>%
  group_by(GEOID) %>%
  summarise(
    male65plus   = sum(ifelse(is_male65, estimate, 0), na.rm = TRUE),
    female65plus = sum(ifelse(is_fem65, estimate, 0), na.rm = TRUE),
    total_pop    = sum(ifelse(is_total, estimate, 0), na.rm = TRUE),
    geometry     = first(geometry)
  ) %>%
  mutate(
    age65plus    = male65plus + female65plus,
    pct_65plus   = if_else(total_pop > 0, 100 * age65plus / total_pop, NA_real_),
    female_share_65plus = if_else(age65plus > 0, 100 * female65plus / age65plus, NA_real_)
  )

# 2. Senior-related household income
inc_df <- purrr::map_dfr(counties, function(cty){
  message("Pulling senior household income medians: ", cty)
  out <- suppressMessages(
    get_acs(
      geography = "tract",
      state = "GA",
      county = cty,
      variables = c(
        hhinc_65_74 = "B19049_003",
        hhinc_75p   = "B19049_004"
      ),
      year = 2023,
      survey = "acs5",
      geometry = FALSE,
      output = "wide"
    )
  )
  Sys.sleep(1)
  out
}) %>%
  dplyr::select(GEOID,
                hhinc_65_74 = hhinc_65_74E,
                hhinc_75p   = hhinc_75pE)

acs_senior_income <- age_sf %>%
  left_join(inc_df, by = "GEOID")

# check the dataset
print(head(acs_senior_income, 5))

## Simple feature collection with 5 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.65586 ymin: 34.21361 xmax: -84.25759 ymax: 34.41259
## Geodetic CRS:  NAD83
## # A tibble: 5 × 10
##   GEOID    male65plus female65plus total_pop                  geometry age65plus
##   <chr>         <dbl>        <dbl>     <dbl>        <MULTIPOLYGON [°]>     <dbl>
## 1 1305709…        380          374      4968 (((-84.39222 34.37329, -…       754
## 2 1305709…        372          381      5166 (((-84.49538 34.29371, -…       753
## 3 1305709…        151          153      1958 (((-84.42115 34.30266, -…       304
## 4 1305709…        259          212      2472 (((-84.56155 34.29755, -…       471
## 5 1305709…        611          861      7217 (((-84.65586 34.2139, -8…      1472
## # ℹ 4 more variables: pct_65plus <dbl>, female_share_65plus <dbl>,
## #   hhinc_65_74 <dbl>, hhinc_75p <dbl>

Task 4

Prepare the data for analysis:

Check for missing values and handle them appropriately.
Create new variables if needed (e.g., proportions, densities).
Join the POI data with the ACS data.
Ensure coordinate reference systems (CRS) are appropriate for distance calculations.
Response: In preparing the data, I first addressed missing values. Tracts with total_pop = 0 were removed because they represent non-residential areas. I also dropped tracts with missing senior household income, since income is highly heterogeneous across space and cannot be reliably imputed from surrounding tracts without bias. When downloading the ACS data in the previous step, I directly computed the proportion of older adults (pct_65plus) rather than storing every single age group variable. With over 1,000 tracts in the dataset, this approach avoided generating an unnecessarily large dataset while still retaining the key demographic measure for my analysis. After constructing new variables such as the hospital accessibility score, I joined the ACS tract-level data with the hospital POI dataset. Finally, I reprojected all spatial layers to a local projected CRS (EPSG:32616) to ensure distance-based measures are accurate.

Check for missing values

# Count the number of NAs in each column
acs_senior_income %>% map_dbl(., ~sum(is.na(.x)))

##               GEOID          male65plus        female65plus           total_pop 
##                   0                   0                   0                   0 
##            geometry           age65plus          pct_65plus female_share_65plus 
##                   0                   0                   3                   6 
##         hhinc_65_74           hhinc_75p 
##                 104                 105

# check for missing values
acs_clean <- acs_senior_income %>%
  # 1. drop rows where pct_65plus is NA (total_pop = 0 tracts)
  filter(!is.na(pct_65plus)) %>%
  # 2. drop rows where pct_65plus == 0 (tract has no seniors)
  filter(pct_65plus > 0) %>%
  # 3. drop rows where either income variable is NA
  filter(!is.na(hhinc_65_74) & !is.na(hhinc_75p))

# Compare before and after
n_before <- nrow(acs_senior_income)
n_after  <- nrow(acs_clean)
cat("Rows before:", n_before, "\nRows after:", n_after, "\nRemoved:", n_before - n_after, "\n")

## Rows before: 1248 
## Rows after: 1056 
## Removed: 192

# Check missing values after cleaning
acs_clean %>% map_dbl(., ~sum(is.na(.x)))

##               GEOID          male65plus        female65plus           total_pop 
##                   0                   0                   0                   0 
##            geometry           age65plus          pct_65plus female_share_65plus 
##                   0                   0                   0                   0 
##         hhinc_65_74           hhinc_75p 
##                   0                   0

print(head(acs_clean, 5))

## Simple feature collection with 5 features and 9 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.65586 ymin: 34.21361 xmax: -84.25759 ymax: 34.41259
## Geodetic CRS:  NAD83
## # A tibble: 5 × 10
##   GEOID    male65plus female65plus total_pop                  geometry age65plus
##   <chr>         <dbl>        <dbl>     <dbl>        <MULTIPOLYGON [°]>     <dbl>
## 1 1305709…        380          374      4968 (((-84.39222 34.37329, -…       754
## 2 1305709…        372          381      5166 (((-84.49538 34.29371, -…       753
## 3 1305709…        151          153      1958 (((-84.42115 34.30266, -…       304
## 4 1305709…        259          212      2472 (((-84.56155 34.29755, -…       471
## 5 1305709…        611          861      7217 (((-84.65586 34.2139, -8…      1472
## # ℹ 4 more variables: pct_65plus <dbl>, female_share_65plus <dbl>,
## #   hhinc_65_74 <dbl>, hhinc_75p <dbl>

Join the POI data with the ACS data

## Make sure the CRS of the two sf objects are in the same PCS
acs_clean %<>% st_transform(32616)
hosp_unique %<>% st_transform(32616)

# Join hospitals with tracts
tract_hosp <- st_join(acs_clean, hosp_unique)

# Get the presence of hospitals at the tract level
geoid_tracts_with_hosp <- tract_hosp %>% 
  filter(!is.na(name)) %>% 
  pull(GEOID) %>% 
  unique()

# Join the presence of hospitals to `tract`
acs_clean %<>% mutate(hosp = GEOID %in% geoid_tracts_with_hosp)

Ensure CRS and the spatial join

st_crs(acs_clean)

## Coordinate Reference System:
##   User input: EPSG:32616 
##   wkt:
## PROJCRS["WGS 84 / UTM zone 16N",
##     BASEGEOGCRS["WGS 84",
##         ENSEMBLE["World Geodetic System 1984 ensemble",
##             MEMBER["World Geodetic System 1984 (Transit)"],
##             MEMBER["World Geodetic System 1984 (G730)"],
##             MEMBER["World Geodetic System 1984 (G873)"],
##             MEMBER["World Geodetic System 1984 (G1150)"],
##             MEMBER["World Geodetic System 1984 (G1674)"],
##             MEMBER["World Geodetic System 1984 (G1762)"],
##             MEMBER["World Geodetic System 1984 (G2139)"],
##             MEMBER["World Geodetic System 1984 (G2296)"],
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]],
##             ENSEMBLEACCURACY[2.0]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["UTM zone 16N",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",-87,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9996,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Navigation and medium accuracy spatial referencing."],
##         AREA["Between 90°W and 84°W, northern hemisphere between equator and 84°N, onshore and offshore. Belize. Canada - Manitoba; Nunavut; Ontario. Costa Rica. Cuba. Ecuador - Galapagos. El Salvador. Guatemala. Honduras. Mexico. Nicaragua. United States (USA)."],
##         BBOX[0,-90,84,-84]],
##     ID["EPSG",32616]]

st_crs(hosp_unique)

## Coordinate Reference System:
##   User input: EPSG:32616 
##   wkt:
## PROJCRS["WGS 84 / UTM zone 16N",
##     BASEGEOGCRS["WGS 84",
##         ENSEMBLE["World Geodetic System 1984 ensemble",
##             MEMBER["World Geodetic System 1984 (Transit)"],
##             MEMBER["World Geodetic System 1984 (G730)"],
##             MEMBER["World Geodetic System 1984 (G873)"],
##             MEMBER["World Geodetic System 1984 (G1150)"],
##             MEMBER["World Geodetic System 1984 (G1674)"],
##             MEMBER["World Geodetic System 1984 (G1762)"],
##             MEMBER["World Geodetic System 1984 (G2139)"],
##             MEMBER["World Geodetic System 1984 (G2296)"],
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]],
##             ENSEMBLEACCURACY[2.0]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["UTM zone 16N",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",-87,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9996,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Navigation and medium accuracy spatial referencing."],
##         AREA["Between 90°W and 84°W, northern hemisphere between equator and 84°N, onshore and offshore. Belize. Canada - Manitoba; Nunavut; Ontario. Costa Rica. Cuba. Ecuador - Galapagos. El Salvador. Guatemala. Honduras. Mexico. Nicaragua. United States (USA)."],
##         BBOX[0,-90,84,-84]],
##     ID["EPSG",32616]]

tm_shape(acs_clean) + tm_fill(fill="hosp") +
  tm_shape(hosp_unique) + tm_dots()

## Registered S3 method overwritten by 'jsonify':
##   method     from    
##   print.json jsonlite

Task 5

Define and compute your operational measure of healthcare access (the dependent variable).

Identify suitable metric(s) and justify your choices. Example metrics include:
- The number of hospitals within X miles.
- The presence of a hospital within X miles.
- The distance to the nearest hospital.
- The average distance to the X nearest hospitals.
Compute your chosen measure(s).
Response: For this study, I operationalized healthcare access using hospitals within a 5-mile radius of each census tract centroid. The choice of 5 miles is guided by both the literature and practical considerations: in urban regions like Metro Atlanta, five miles roughly corresponds to a 10–15 minute drive, which is considered a reasonable threshold for timely access to acute care, especially for older adults who may have mobility limitations. I first used the count of hospitals within 5 miles as a baseline measure. A higher count reflects greater availability and redundancy in healthcare services. Relying only on the nearest hospital can underestimate access in dense areas where multiple facilities are clustered, so the count metric better captures the range of options available to residents. However, access is not only about the quantity of hospitals but also their quality. To incorporate this dimension, I calculated the average rating of hospitals within 5 miles, based on the ratings attribute in the POI dataset. This ensures that tracts with many low-rated hospitals are not automatically considered “well served.” Finally, I proposed a composite measure that blends quantity and quality: the hospital access score = (count of hospitals within 5 miles) × (average rating of those hospitals). This composite reflects both how many hospitals are nearby and whether they are of acceptable quality. Areas with more hospitals and higher ratings score higher, while areas with no nearby hospitals or only poorly rated hospitals score lower. This combined metric provides a more nuanced measure of healthcare access equity, particularly relevant for assessing the needs of vulnerable older populations.

# Create tract centroids
tract_cent <- st_centroid(acs_clean)

# Compute distances matrix (in meters)
dist_m <- st_distance(tract_cent, hosp_unique)
dist_km <- units::drop_units(as.matrix(dist_m)) / 1000  # convert to km

# Identify hospitals within 5 miles (~8.05 km)
within5mi <- dist_km <= 8.05

# Count of hospitals within 5 miles
hosp_count_5mi <- apply(within5mi, 1, sum, na.rm = TRUE)

# Average rating of hospitals within 5 miles
hosp_avg_rating_5mi <- sapply(1:nrow(within5mi), function(i){
  idx <- which(within5mi[i, ])
  if (length(idx) == 0) return(NA_real_)  # no hospitals nearby
  mean(hosp_unique$rating[idx], na.rm = TRUE)
})

# Attach to tract data
acs_access <- acs_clean %>%
  mutate(
    hosp_count_5mi = hosp_count_5mi,
    hosp_avg_rating_5mi = hosp_avg_rating_5mi
  )

acs_access <- acs_access %>%
  mutate(
    hosp_access_score = if_else(
      !is.na(hosp_avg_rating_5mi),
      hosp_count_5mi * hosp_avg_rating_5mi,
      0  # set to 0 if no hospitals / no ratings
    )
  )

Task 6

Analyze the spatial distribution of hospitals from an equity perspective:

Conduct exploratory data analysis and present at least two graphs and at least two maps.
Fit appropriate regression model(s) to test associations between healthcare access and socioeconomic factors.
Conclude with your verdict: “Is the spatial distribution of hospitals in Metro Atlanta equitable?”

Explore the distribution

Map 1: Percentage of older adults (65+) with hospital locations

This map shows the relative aging of each tract. Darker blue areas represent tracts where seniors make up a larger share of the population.
Hospitals (black dots) are heavily clustered in the urban core and along major corridors.
Noticeably, some tracts with high % seniors (darker blue, especially in the northern and southern edges) have few or no hospitals nearby, suggesting potential inequities: these “aging communities” may be underserved despite having a higher relative need.

tm_shape(acs_access) + tm_fill(fill="pct_65plus") +
  tm_shape(hosp_unique) + tm_dots()

Map 2: Number of older adults (65+) with hospital locations

This map highlights the absolute senior population. Darker blue tracts contain the largest numbers of older residents (1,500–2,500+).
These tracts are concentrated in suburban areas, where hospitals are present but more dispersed than in the city center.
Compared to Map 1, this reveals a contrast: some areas with relatively few seniors (% terms) still host large numbers in absolute terms, which may justify why hospitals are sited there (serving more people overall).
Hospitals are more closely aligned with where the most seniors live in total rather than with where seniors make up the highest share of the local population.

tm_shape(acs_access) + tm_fill(fill="age65plus") +
  tm_shape(hosp_unique) + tm_dots()

Map 3: Hospital access score (count × rating)

This composite measure blends the quantity of nearby hospitals with their quality (ratings).
Higher scores (darker blue) are concentrated in central Atlanta and adjacent tracts, reflecting both high hospital density and generally favorable ratings.
Peripheral tracts show much lighter shades, meaning residents have fewer hospitals within 5 miles and thus lower overall access.
When overlaid with Maps 1 and 2, this suggests that many tracts with high percentages of seniors (Map 1) or large absolute senior populations (Map 2) fall into low-access zones (Map 3), underscoring inequities in the distribution of healthcare resources.

tm_shape(acs_access) + tm_fill(fill="hosp_access_score") +
  tm_shape(hosp_unique) + tm_dots()

Explore the association

Graph 1 & 2: Hospital access score vs. % seniors / number of seniors

Tracts with a higher proportion of seniors tend to have slightly lower hospital access scores, indicating that “aging” communities may be underserved.
Tracts with larger absolute numbers of seniors generally show lower access scores, suggesting misalignment between total senior population size and healthcare provision.

ggplot(acs_access, aes(x = pct_65plus, y = hosp_access_score)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", color = "red") +
  labs(x = "% population 65+",
       y = "Hospital access score (count × rating)",
       title = "Hospital access score vs. % seniors across tracts")

ggplot(acs_access, aes(x = age65plus, y = hosp_access_score)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", color = "green") +
  labs(x = "Number of seniors (65+)",
       y = "Hospital access score (count × rating)",
       title = "Hospital access score vs. number of seniors")

Graph 3 & 4: Hospital access score vs. Median household income for 65–74 households / 75+ households

Access scores rise only marginally with income among 65–74 households / 75+ households, pointing to a weak positive relationship.

ggplot(acs_access, aes(x = hhinc_65_74, y = hosp_access_score)) +
  geom_point(alpha = 0.5, color = "lightgreen") +
  geom_smooth(method = "lm", color = "darkgreen") +
  labs(x = "Median household income (65–74)",
       y = "Hospital access score",
       title = "Hospital access score vs. Median household income for 65–74 households")

ggplot(acs_access, aes(x = hhinc_75p, y = hosp_access_score)) +
  geom_point(alpha = 0.5, color = "steelblue") +
  geom_smooth(method = "lm", color = "darkblue") +
  labs(x = "Median household income (75+)", y = "Hospital access score",
       title = "Hospital access score vs. Median household income for 75+ households")

Run the model

Linear regression model

Model fit: The regression explains about 12% of variation in hospital access scores (Adj. R² = 0.12; F-test p < 0.001).
Senior population effects:
- Absolute number of seniors (age65plus): Significant negative effect (β = –0.021, p < 0.001) → tracts with more seniors have lower access.
- Share of seniors (pct_65plus): Significant positive effect (β = 0.354, p < 0.001) → tracts where seniors make up a larger proportion of residents are better served.
- Gender composition (female share among seniors): Not significant.
Income effects:
- Households headed by seniors 75+: Significant positive effect (β = 6.81e-05, p < 0.001) → higher income aligns with better access.
- Households headed by seniors 65–74: Not significant.
Access is associated with relative aging and higher income among the oldest seniors, but not with the absolute number of seniors, suggesting that hospital distribution does not fully align with areas of greatest need.

reg_data <- acs_access %>%
  st_drop_geometry() %>%
  select(GEOID, hosp_count_5mi, hosp_access_score, age65plus, pct_65plus, female_share_65plus, hhinc_65_74, hhinc_75p)

m1 <- lm(hosp_access_score ~ age65plus + pct_65plus + female_share_65plus + hhinc_65_74 + hhinc_75p, data = reg_data)

summary(m1)

## 
## Call:
## lm(formula = hosp_access_score ~ age65plus + pct_65plus + female_share_65plus + 
##     hhinc_65_74 + hhinc_75p, data = reg_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.759 -11.567  -1.737   9.915  45.088 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          1.797e+01  2.605e+00   6.898 9.10e-12 ***
## age65plus           -2.087e-02  2.207e-03  -9.455  < 2e-16 ***
## pct_65plus           3.537e-01  1.027e-01   3.444 0.000595 ***
## female_share_65plus  2.508e-02  3.697e-02   0.678 0.497803    
## hhinc_65_74         -8.073e-06  1.253e-05  -0.644 0.519470    
## hhinc_75p            6.810e-05  1.187e-05   5.738 1.25e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.6 on 1050 degrees of freedom
## Multiple R-squared:  0.1239, Adjusted R-squared:  0.1197 
## F-statistic:  29.7 on 5 and 1050 DF,  p-value: < 2.2e-16

Conclusion

The linear regression model shows mixed associations between healthcare access and senior demographics.
With an R² of 0.12, the model explains only a small share of variation, but the patterns indicate that hospital access is more closely aligned with affluence and relative aging than with absolute need.
The spatial distribution of hospitals in Metro Atlanta cannot be considered fully equitable: while some older and wealthier communities benefit, tracts with the largest senior populations remain underserved.
Equity means recognizing that different groups face different circumstances and health demands, and ensuring that resources are allocated in proportion to those needs. For older adults in particular, healthcare demands are higher due to chronic conditions, mobility limitations, and greater reliance on timely medical services. From this perspective, an equitable distribution of hospitals in Metro Atlanta would require aligning healthcare access with the unique vulnerabilities of senior populations, not just with overall population size or affluence.

Mini-Assignment 3

Yuhan Xu

2025-09-28

Healthcare Equity Analysis