Loading the necessary libraries

library(sf)
## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(tidycensus)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
library(tmap)
## Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
## remotes::install_github('r-tmap/tmap')
library(tidyr)

Setting the Census API key

tidycensus::census_api_key(Sys.getenv("Census_API"))
## To install your API key for use in future sessions, run this function with `install = TRUE`.
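
Sys.getenv() returns an empty string when a variable is unset, so it can help to check the key before requesting data. A minimal sketch, assuming the environment variable name `Census_API` used above; the helper `has_census_key` is hypothetical and not part of tidycensus:

```r
# Hypothetical helper: TRUE when the environment variable holds a non-empty value.
has_census_key <- function(var = "Census_API") {
  nzchar(Sys.getenv(var))
}

if (!has_census_key()) {
  message("Census_API is not set; add it to ~/.Renviron and restart R.")
}
```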

Importing the Yelp data using the st_read function

hospital_data <- st_read("https://raw.githubusercontent.com/ujhwang/urban-analytics-2024/main/Assignment/mini_3/yelp_hospital.geojson")
## Reading layer `yelp_hospital' from data source 
##   `https://raw.githubusercontent.com/ujhwang/urban-analytics-2024/main/Assignment/mini_3/yelp_hospital.geojson' 
##   using driver `GeoJSON'
## Simple feature collection with 129 features and 23 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -84.56242 ymin: 33.60009 xmax: -84.08677 ymax: 34.0701
## Geodetic CRS:  WGS 84

Downloading the ACS data for the Fulton and DeKalb census tracts using tidycensus

acs_tract <- suppressMessages(
  get_acs(geography = "tract",
          state = "GA",
          county = c("Fulton", "DeKalb"),
          # Four variables. Apart from median income, these codes return
          # counts, not percentages, and B27001_001 is the table total;
          # the names are kept because later code refers to them.
          variables = c(median_income = "B19013_001",
                        percent_poverty = "B17001_002",
                        percent_uninsured = "B27001_001",
                        percent_black = "B02009_001"),
          year = 2021,
          survey = "acs5", # American Community Survey 5-year estimate
          geometry = TRUE, # returns sf objects
          output = "wide") # wide vs. long
)
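
ACS count estimates such as B17001_002 need a denominator before they can be read as percentages. A minimal sketch with made-up numbers, assuming a hypothetical data frame whose `below_povertyE` and `poverty_universeE` columns stand in for a numerator such as B17001_002 and its table total B17001_001:

```r
# Toy values for illustration only; these are not real ACS estimates.
tracts <- data.frame(
  GEOID             = c("A", "B", "C"),
  below_povertyE    = c(200, 50, 0),
  poverty_universeE = c(1000, 400, 0)
)

# Convert counts to percentages; tracts with an empty universe become NA
# instead of NaN so they can be dropped or flagged explicitly.
tracts$pct_poverty <- ifelse(
  tracts$poverty_universeE > 0,
  100 * tracts$below_povertyE / tracts$poverty_universeE,
  NA_real_
)
```

The same pattern would apply to the uninsured and race variables, each with the total of its own table as the denominator.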

Tidying the data by dropping rows with missing values, transforming the ACS data to the same CRS as the Yelp data, and then joining the two datasets

acs_tract_clean = acs_tract %>%
  drop_na()

hospital_data_clean = hospital_data %>%
  drop_na()

acs_tract_clean = st_transform(acs_tract_clean, crs = st_crs(hospital_data_clean))

#Joining the datasets
hospital_acs = st_join(acs_tract_clean, hospital_data_clean)
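
By default st_join() performs a left join on st_intersects, so a tract appears once per matching hospital and tracts without a hospital are kept with NA attributes. A small sketch on made-up geometries, assuming a projected CRS (EPSG:32616) chosen only for illustration; `make_square`, `toy_tracts`, and `toy_hospitals` are hypothetical names:

```r
library(sf)

# Hypothetical helper: a square polygon with its lower-left corner at (x0, y0).
make_square <- function(x0, y0, side = 1000) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + side, y0), c(x0 + side, y0 + side),
                        c(x0, y0 + side), c(x0, y0))))
}

# Two toy "tracts" and two toy "hospitals"; all coordinates are made up.
toy_tracts <- st_sf(
  tract = c("t1", "t2"),
  geometry = st_sfc(make_square(740000, 3730000),
                    make_square(745000, 3730000), crs = 32616)
)
toy_hospitals <- st_sf(
  name = c("h1", "h2"),
  geometry = st_sfc(st_point(c(740200, 3730200)),
                    st_point(c(740800, 3730800)), crs = 32616)
)

joined <- st_join(toy_tracts, toy_hospitals)
nrow(joined)  # 3: t1 appears twice (one row per hospital), t2 once with name = NA
```

Because of this row duplication, per-tract statistics are easier to compute from a count column than from the joined table.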

Use spatial operations to analyze the proximity of Census Tracts to hospitals:

#Create a 0.25 mile buffer (402.336 meters)
acs_tract_buffer <- st_buffer(acs_tract_clean, dist = 402.336)
# Intersect with hospitals
hospitals_within_buffer <- st_intersects(acs_tract_buffer,hospital_data_clean)
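
The buffer distance comes from 0.25 mi × 1609.344 m/mi = 402.336 m. The sketch below repeats the buffer-and-count step on made-up geometries in a projected CRS (EPSG:32616, an assumption for illustration), where the distance unit is unambiguously metres; `miles_to_m`, `tract`, and `pts` are hypothetical names:

```r
library(sf)

miles_to_m <- function(mi) mi * 1609.344  # international mile in metres
buffer_m <- miles_to_m(0.25)              # 402.336

# Toy 1 km square tract and three toy points; all coordinates are made up.
tract <- st_sfc(st_polygon(list(rbind(
  c(740000, 3730000), c(741000, 3730000), c(741000, 3731000),
  c(740000, 3731000), c(740000, 3730000)))), crs = 32616)
pts <- st_sfc(
  st_point(c(740500, 3730500)),  # inside the tract
  st_point(c(741300, 3730500)),  # 300 m east of the tract edge
  st_point(c(742000, 3730500)),  # 1000 m east of the tract edge
  crs = 32616)

tract_buffer <- st_buffer(tract, dist = buffer_m)
n_within <- lengths(st_intersects(tract_buffer, pts))
n_within  # 2: the inside point and the one 300 m away
```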

Visualizing hospital location within the buffer zone

tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(acs_tract_buffer) +  # ACS Census tracts with buffer
  tm_polygons(col = "red", border.col = "cyan", alpha = 0.5, title = "Census Tract Buffer (0.25 mile)") +
  tm_shape(hospital_data) +  # Hospital locations
  tm_dots(col = "yellow", size = 0.05, title = "Hospitals") +
  tm_layout(title = "Hospital Locations with 0.25-mile Buffer", legend.outside = TRUE)
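# Pairwise distances between every Census Tract and every hospital
# (a tracts x hospitals matrix); the row-wise minimum is the distance
# from each tract to its nearest hospital
distances_to_hospital <- st_distance(acs_tract_clean, hospital_data_clean)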
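
st_distance() returns one row per tract and one column per hospital, so a further reduction is needed to get a single nearest-hospital distance per tract. A minimal sketch on a made-up matrix; `toy_dist` is hypothetical and omits the measurement units that the real result carries:

```r
# Toy distance matrix in metres: 3 tracts (rows) x 2 hospitals (columns).
toy_dist <- matrix(c(   0,  850,
                     1200,  300,
                     2500, 4000),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(c("t1", "t2", "t3"), c("h1", "h2")))

nearest_m   <- apply(toy_dist, 1, min)        # distance to the nearest hospital
nearest_idx <- apply(toy_dist, 1, which.min)  # column index of that hospital
```

A distance of 0 means the hospital lies inside the tract polygon.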

This graph will show the number of hospitals within 0.25 miles of Census Tracts in relation to a key socioeconomic variable, such as the percentage of the population below the poverty line.

# Create a new variable for the number of hospitals within the buffer
acs_tract_clean$num_hospitals <- lengths(hospitals_within_buffer)

# Scatter plot of the poverty variable against the hospital count

ggplot(acs_tract_clean, aes(x = percent_povertyE, y = num_hospitals)) +
  geom_point(color = "green", alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, color = "red") + # Linear regression line
  labs(title = "Number of Hospitals within 0.25 Miles vs. Percentage of Population Below Poverty Line",
       x = "Percentage of Population Below Poverty Line",
       y = "Number of Hospitals within 0.25 Miles") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

Similarly, this graph will show the number of hospitals within 0.25 miles of Census Tracts in relation to another key socioeconomic variable: the percentage of the population who are uninsured.

ggplot(acs_tract_clean, aes(x = percent_uninsuredE, y = num_hospitals)) +
  geom_point(color = "yellow", alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, color = "blue") + # Linear regression line
  labs(title = "Number of Hospitals within 0.25 Miles vs. Percentage of Population Uninsured",
       x = "Percentage of Population Uninsured",
       y = "Number of Hospitals within 0.25 Miles") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

This graph will show disparities in hospital access across racial groups, such as the African American population, in relation to hospital proximity.

# Plot showing the relationship between African American population proportion and number of hospitals within 0.25 miles
ggplot(acs_tract_clean, aes(x = percent_blackE, y = num_hospitals)) +
  geom_point(color = "green", size = 2) +
  geom_smooth(method = "lm", se = FALSE, color = "darkgreen") +  # Add a linear trendline
  labs(title = "Hospital Access for African American Population",
       x = "African American Population (%)", y = "Number of Hospitals within 0.25 Mile") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'

Finally, a scatter plot showing the relationship between hospital access and median household income

ggplot(acs_tract_clean, aes(x = median_incomeE, y = num_hospitals)) +
  geom_point(color = "black", size = 2) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +  # Linear trendline; white would be invisible on theme_minimal()
  labs(title = "Number of Hospitals within 0.25 Mile vs. Median Income",
       x = "Median Household Income ($)", y = "Number of Hospitals within 0.25 Mile") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
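
A trendline alone does not say how strong a relationship is. One way to attach a number to each plot is a rank correlation, sketched here on made-up values; `income` and `n_hosp` are hypothetical stand-ins for `median_incomeE` and `num_hospitals`. Spearman's method is an assumption, picked because hospital counts are discrete and skewed; it is not a method used in the analysis above.

```r
# Toy values for illustration only; these are not results from the ACS or Yelp data.
income <- c(30000, 45000, 60000, 75000, 90000)
n_hosp <- c(0, 1, 1, 3, 4)

# Spearman's rho: Pearson correlation of the ranks, with ties given average ranks.
rho <- cor(income, n_hosp, method = "spearman")

# On the real data the call would take the two columns directly, e.g.
# cor(acs_tract_clean$median_incomeE, acs_tract_clean$num_hospitals, method = "spearman")
```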

Conclusion

For this analysis, Yelp data was used to gather the locations of hospitals in Fulton and DeKalb counties, while demographic data was obtained from the U.S. Census Bureau at the Census tract level. The selected socioeconomic variables for the study included the percentage of African American residents, the percentage of uninsured residents, median household income, and the percentage of the population below the poverty line. The goal was to assess whether the distribution of hospitals was equitable across these factors.

The findings revealed a disparity in hospital distribution. Census tracts with higher median incomes had a greater number of hospitals, while tracts with higher percentages of uninsured residents and populations below the poverty line had fewer hospitals. This indicates that healthcare access is less favorable for lower-income and uninsured populations, highlighting a potential inequity in healthcare access in these areas. Although the results suggest that wealthier areas are better served, the presence of hospitals across most tracts means that some areas may still have reasonable access despite economic disadvantages. However, the concentration of healthcare facilities in wealthier regions raises questions about whether more vulnerable populations are receiving the same level of healthcare services.