In this assignment, you will simulate a journey that begins at a Census Tract centroid, drives to the nearest MARTA station, and then take the MARTA to the Midtown station. You will then compare how travel time varies by different Census Tracts. The steps and data required for this analysis are outlined below.
Step 1. Download the required GTFS data. Convert it to sf format, extract MARTA rail stations, and clean the stop names to remove duplicates. Also, extract the destination station.
Step 2. Download the required Census data. Convert Census Tract polygons into centroids and create a subset for analysis.
Step 3. Download the required OSM data. Convert it into an sfnetwork object and clean the network.
Step 4. Simulate a park-and-ride trip (from a test origin → closest station → Midtown station).
Step 5. Turn the simulation process from Step 4 into a reusable function.
Step 6. Apply the function from Step 5 iteratively to all home locations.
Step 7.Finally, create maps and plots to analyze whether there are any disparities in transit accessibility when commuting to Midtown.
Importing the necessary packages is part of this assignment. Add any required packages to the code chunk below as you progress through the tasks.
library(tidyverse)
library(gtfstools)
library(dplyr)
library(tidytransit)
library(gtfsrouter)
library(sf)
library(tidycensus)
library(osmdata)
library(nominatimlite)
library(tmap)
library(sfnetworks)
library(tidygraph)
library(leaflet)
library(htmltools)
library(ggplot2)
# TASK ////////////////////////////////////////////////////////////////////////
# Download MARTA GTFS data and assign it to `gtfs` object
gtfs <- read_gtfs("https://www.itsmarta.com/google_transit_feed/google_transit.zip")
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Edit stop_name to append serial numbers (1, 2, etc.) to remove duplicate names
stop_dist <-stop_group_distances(gtfs$stops, by='stop_name') %>%
  filter(dist_max > 200)
gtfs$stops <- gtfs$stops %>% 
  group_by(stop_name) %>% 
  mutate(stop_name = case_when(stop_name %in% stop_dist$stop_name ~ paste0(stop_name, " (", seq(1,n()), ")"),
                               TRUE ~ stop_name))
# Create a transfer table
gtfs <- gtfsrouter::gtfs_transfer_table(gtfs, 
                                        d_limit = 200, 
                                        min_transfer_time = 120)
# NOTE: Converting to sf format uses stop_lat and stop_lon columns contained in gtfs$stops.
#       In the conversion process, stop_lat and stop_lon are converted into a geometry column, and
#       the output sf object do not have the lat lon column anymore.
#       But many other functions in tidytransit look for stop_lat and stop_lon.
#       So I re-create them using mutate().
gtfs <- gtfs %>% gtfs_as_sf(crs = 4326)
gtfs$stops <- gtfs$stops %>% 
  ungroup() %>% 
  mutate(stop_lat = st_coordinates(.)[,2],
         stop_lon = st_coordinates(.)[,1]) 
# Get stop_id for rails and buses
rail_stops <- gtfs$routes %>% 
  filter(route_type %in% c(1)) %>% 
  inner_join(gtfs$trips, by = "route_id") %>% 
  inner_join(gtfs$stop_times, by = "trip_id") %>% 
  inner_join(gtfs$stops, by = "stop_id") %>% 
  group_by(stop_id) %>% 
  slice(1) %>% 
  pull(stop_id)
# Extract MARTA rail stations
station <- gtfs$stops %>% filter(stop_id %in% rail_stops)
# Extract Midtown Station
midtown <- gtfs$stops %>% filter(stop_id == "134")
# Create a bounding box to which we limit our analysis
bbox <- st_bbox(c(xmin = -84.45241, ymin = 33.72109, xmax = -84.35009, ymax = 33.80101), crs = st_crs(4326)) %>% 
  st_as_sfc()
# =========== NO MODIFY ZONE ENDS HERE ========================================
# TASK ////////////////////////////////////////////////////////////////////////
# Specify Census API key whichever you prefer using census_api_key() function
census_api_key(Sys.getenv("MY_API_KEY"))
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Using get_acs() function, download Census Tract level data for 2022 for Fulton, DeKalb, and Clayton in GA.
# and assign it to `census` object.
# Make sure you set geometry = TRUE.
# Required data from the Census ACS:
#  1) Median Household Income (name the column `hhinc`)
#  2) Minority Population (%) (name the column `pct_minority`)
# Note: You may need to download two or more Census ACS variables to calculate minority population (%). "Minority" here can refer to either racial minorities or racial+ethnic minorities -- it's your choice.
acs_variables <- load_variables(year = 2022, dataset = "acs5", cache = TRUE)
census_raw <- get_acs(
  geography = "tract",
  state = "GA",
  county = c("Fulton", "DeKalb", "Clayton"),
  variables = c("B19013_001E", "B03002_001E", "B03002_012E", "B03002_003E", "B03002_004E", "B03002_006E"),
  year = 2022,
  survey = "acs5",
  geometry = TRUE,
  output = "wide"
)
##   |                                                                              |                                                                      |   0%  |                                                                              |=                                                                     |   1%  |                                                                              |=                                                                     |   2%  |                                                                              |==                                                                    |   2%  |                                                                              |==                                                                    |   3%  |                                                                              |==                                                                    |   4%  |                                                                              |===                                                                   |   4%  |                                                                              |====                                                                  |   5%  |                                                                              |====                                                                  |   6%  |                                                                              |=====                                                                 |   7%  |                                                                              |=====                                                                 |   8%  |                                                                              |======                                                                |   8%  |                                                                              |======                                                                |   9%  |                                                                              |=======                                                               |  10%  |                                                                              |========                                                              |  11%  |                                                                              |=========                                                             |  12%  |                                                                              |=========                                                             |  13%  |                                                                              |==========                                                            |  14%  |                                                                              |===========                                                           |  15%  |                                                                              |===========                                                           |  16%  |                                                                              |============                                                          |  17%  |                                                                              |=============                                                         |  18%  |                                                                              |=============                                                         |  19%  |                                                                              |==============                                                        |  20%  |                                                                              |===============                                                       |  21%  |                                                                              |===============                                                       |  22%  |                                                                              |================                                                      |  23%  |                                                                              |=================                                                     |  24%  |                                                                              |=================                                                     |  25%  |                                                                              |==================                                                    |  25%  |                                                                              |==================                                                    |  26%  |                                                                              |===================                                                   |  27%  |                                                                              |===================                                                   |  28%  |                                                                              |====================                                                  |  28%  |                                                                              |====================                                                  |  29%  |                                                                              |=====================                                                 |  30%  |                                                                              |=====================                                                 |  31%  |                                                                              |======================                                                |  31%  |                                                                              |======================                                                |  32%  |                                                                              |=======================                                               |  33%  |                                                                              |=======================                                               |  34%  |                                                                              |========================                                              |  34%  |                                                                              |=========================                                             |  35%  |                                                                              |=========================                                             |  36%  |                                                                              |==========================                                            |  37%  |                                                                              |===========================                                           |  38%  |                                                                              |===========================                                           |  39%  |                                                                              |============================                                          |  39%  |                                                                              |============================                                          |  40%  |                                                                              |=============================                                         |  41%  |                                                                              |=============================                                         |  42%  |                                                                              |==============================                                        |  42%  |                                                                              |==============================                                        |  43%  |                                                                              |===============================                                       |  44%  |                                                                              |===============================                                       |  45%  |                                                                              |================================                                      |  45%  |                                                                              |================================                                      |  46%  |                                                                              |=================================                                     |  47%  |                                                                              |=================================                                     |  48%  |                                                                              |==================================                                    |  48%  |                                                                              |==================================                                    |  49%  |                                                                              |===================================                                   |  50%  |                                                                              |===================================                                   |  51%  |                                                                              |====================================                                  |  51%  |                                                                              |====================================                                  |  52%  |                                                                              |=====================================                                 |  53%  |                                                                              |======================================                                |  54%  |                                                                              |======================================                                |  55%  |                                                                              |=======================================                               |  56%  |                                                                              |========================================                              |  57%  |                                                                              |=========================================                             |  58%  |                                                                              |=========================================                             |  59%  |                                                                              |==========================================                            |  59%  |                                                                              |==========================================                            |  60%  |                                                                              |===========================================                           |  61%  |                                                                              |===========================================                           |  62%  |                                                                              |============================================                          |  62%  |                                                                              |============================================                          |  63%  |                                                                              |=============================================                         |  64%  |                                                                              |=============================================                         |  65%  |                                                                              |==============================================                        |  65%  |                                                                              |==============================================                        |  66%  |                                                                              |===============================================                       |  67%  |                                                                              |================================================                      |  68%  |                                                                              |================================================                      |  69%  |                                                                              |=================================================                     |  70%  |                                                                              |==================================================                    |  71%  |                                                                              |==================================================                    |  72%  |                                                                              |===================================================                   |  73%  |                                                                              |====================================================                  |  74%  |                                                                              |====================================================                  |  75%  |                                                                              |=====================================================                 |  76%  |                                                                              |======================================================                |  77%  |                                                                              |======================================================                |  78%  |                                                                              |=======================================================               |  79%  |                                                                              |========================================================              |  80%  |                                                                              |=========================================================             |  81%  |                                                                              |==========================================================            |  82%  |                                                                              |==========================================================            |  83%  |                                                                              |===========================================================           |  84%  |                                                                              |============================================================          |  85%  |                                                                              |============================================================          |  86%  |                                                                              |=============================================================         |  87%  |                                                                              |==============================================================        |  88%  |                                                                              |==============================================================        |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              |===================================================================   |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================| 100%
census_raw$hh_inc <- census_raw$B19013_001E
census_raw$tot_indv <- census_raw$B03002_001E
census_raw$hisp <- census_raw$B03002_012E
census_raw$non_hisp_white <- census_raw$B03002_003E
census_raw$non_hisp_black <- census_raw$B03002_004E
census_raw$non_hisp_asian <- census_raw$B03002_006E
census_raw$non_hisp_other <- census_raw$tot_indv - (census_raw$hisp + 
                                                 census_raw$non_hisp_white + 
                                                 census_raw$non_hisp_black + 
                                                 census_raw$non_hisp_asian)
census <- census_raw %>%
  mutate(
    pct_non_hisp_white = non_hisp_white / tot_indv * 100,
    pct_non_hisp_black = non_hisp_black / tot_indv * 100,
    pct_non_hisp_asian = non_hisp_asian / tot_indv * 100,
    pct_non_hisp_other = non_hisp_other / tot_indv * 100,
    pct_minority = (tot_indv - non_hisp_white) / tot_indv * 100
  ) %>%
  select(GEOID, NAME, hh_inc, pct_minority, geometry)
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Convert the CRS of `census` to GCS
census <- census %>%
  st_transform(census, crs = 4326)
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# get centroids of `census` polygons, 
# extract centroids that fall inside the `bbox`, 
# and assign it to `home` object.
home <- st_centroid(census) %>%
  filter(!if_any(everything(), is.na)) %>% 
  st_intersection(bbox)
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Get OSM road data for all road types for the `bbox` area.
# 3. Convert the OSM data into an sf object
# 4. Convert osmdata polygons into lines
osm_road <- opq(bbox) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "motorway_link",
                            "trunk", "trunk_link", 
                            "primary", "primary_link",
                            "secondary", "secondary_link",
                            "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line() 
osm_road <- osm_road$osm_lines[bbox,] %>% select(osm_id, highway)
  
names(osm_road)
## [1] "osm_id"   "highway"  "geometry"
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert osm_road$osm_line into sfnetwork
# 2. Activate edges
# 3. Clean the network using the methods we learned
# 4. Assign the cleaned network to an object named 'osm'
osm <- osm_road %>%
  as_sfnetwork(directed = FALSE) %>%   
  activate("edges") %>%                
  filter(!edge_is_multiple()) %>%     
  filter(!edge_is_loop()) %>%         
  convert(sfnetworks::to_spatial_subdivision) %>%  
  convert(sfnetworks::to_spatial_smooth)   
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Add a new column named 'length' to the edges part of the object `osm`.
osm <- osm %>% 
  mutate(length = as.numeric(st_length(.)))
  # ...
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Extract the first row from `home` object and store it `home_1`
home_1 <- home[1,]
# =========== NO MODIFY ZONE ENDS HERE ========================================
# TASK ////////////////////////////////////////////////////////////////////////
# Find the shortest paths from `home_1` to all other stations
start_p <- geo_lite_sf("home_1") %>% st_geometry()
station_sf <- st_as_sf(station, coords = c("stop_lon", "stop_lat"), crs = 4326)
start_p    <- st_transform(start_p, st_crs(osm))
station_sf <- st_transform(station_sf, st_crs(osm))
paths <- st_network_paths(osm, from = start_p, to = station_sf, type = "shortest")
# //TASK //////////////////////////////////////////////////////////////////////
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get shortest network distances from `home_1` to all other stations.
dist_all <- map_dbl(1:nrow(paths), function(x){
  osm %>%
    activate("nodes") %>% 
    slice(paths$node_paths[[x]]) %>% 
    st_as_sf("edges") %>% 
    pull(length) %>% 
    sum()
}) %>% unlist()
# Replace zeros with a large value.
if (any(dist_all == 0)){
  dist_all[dist_all == 0] <- max(dist_all)
}
# TASK ////////////////////////////////////////////////////////////////////////
# Find the closest station and assign the value to `closest_station`
closest_id <- which.min(dist_all)
closest_value <- dist_all[closest_id]
# Extract paths and stations
paths_list <- map(closest_id, ~ osm %>%
  activate("nodes") %>%
  slice(paths$node_paths[[.x]]))
closest_station <- map(closest_id, ~ station %>% slice(.x))
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Get the distance to the closest station.
closest_dist <- dist_all[closest_id]
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Calculate travel time based on the `closest_dist` assuming we drive at 20 miles/hour speed.
# Assign the value to `trvt_osm_m` object.
trvt_osm_m <- closest_dist / 20
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Create a subset of stop_times data table for date = 2025-11-10, 
# minimum departure time of 7AM, maximum departure time of 10AM.
# Assign the output to `am_stop_time` object
am_stop_time <- filter_stop_times(gtfs_obj = gtfs, 
                                  extract_date = "2025-11-10", 
                                  min_departure_time = 3600*7,
                                  max_arrival_time = 3600*10)
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Calculate travel times from the `closest_station` to all other stations 
#    during time specified in `am_stop_time`. Allow ONE transfer.
# 2. Filter the row where the destination is Midtown station. 
# 3. Assign it to `trvt` object.
trvt <- travel_times(
  filtered_stop_times = am_stop_time, 
  stop_name = "MIDTOWN STATION",           
  time_range = 3600,                   
  arrival = FALSE,                    
  max_transfers = 1,                 
  return_coords = TRUE        
)
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Divide the calculated travel time by 60 to convert the unit from seconds to minutes.
trvt_gtfs_m <- trvt$travel_time/60
# Add the travel time from home to the nearest station and
# the travel time from the nearest station to Midtown station
total_trvt <- trvt_osm_m + trvt_gtfs_m
# =========== NO MODIFY ZONE ENDS HERE ========================================
# Function definition (do not modify other parts of the code in this code chunk except for those inside the TASK section)
get_trvt <- function(home, osm, station, midtown){
  
    # Find shortest paths from `home` to all stations
    start_p <- st_geometry(home)
    station_sf <- st_as_sf(station, coords = c("stop_lon", "stop_lat"), crs = 4326)
    
    start_p    <- st_transform(start_p, st_crs(osm))
    station_sf <- st_transform(station_sf, st_crs(osm))
    
    paths <- st_network_paths(osm, from = start_p, to = station_sf, type = "shortest")
  
    dist_all <- map_dbl(1:nrow(paths), function(x){
      osm %>%
        activate("nodes") %>% 
        slice(paths$node_paths[[x]]) %>% 
        st_as_sf("edges") %>% 
        pull(length) %>% 
        sum()
    })
  
    dist_all[dist_all == 0] <- max(dist_all)
    closest_id <- which.min(dist_all)
    closest_dist <- dist_all[closest_id]
    trvt_osm_m <- closest_dist / 20
    
    am_stop_time <- filter_stop_times(
      gtfs_obj = gtfs, 
      extract_date = "2025-11-10", 
      min_departure_time = 3600*7,
      max_arrival_time = 3600*10
    )
    
    trvt <- travel_times(
      filtered_stop_times = am_stop_time, 
      stop_name = midtown,  # dynamic
      time_range = 3600,                   
      arrival = FALSE,                    
      max_transfers = 1,                 
      return_coords = TRUE        
    )
  
    trvt_gtfs_m <- trvt$travel_time / 60
    total_trvt <- trvt_osm_m + trvt_gtfs_m
  return(total_trvt)
}
# Prepare an empty vector
total_trvt <- vector("numeric", nrow(home))
# Apply the function to all Census Tracts
for (i in 1:nrow(home)) {
  total_trvt[i] <- get_trvt(home[i,], osm, station, midtown)
  
  # Print progress
  message("Iteration ", i, " of ", nrow(home), " completed.")
}
# cbind the calculated travel time to `home`
home <- home %>% 
  cbind(trvt = total_trvt)
Create two maps and two plots by following the instructions in the code chunk below. Write a brief (maximum 200 words) description summarizing your observations from the maps and plots
# TASK ////////////////////////////////////////
# Create an interactive map displaying `census` (polygons) and `home` (points), effectively visualizing household income and travel time to Midtown Station, respectively.
# --- Color palettes ---
income_pal <- colorQuantile(palette = "YlGnBu", domain = census$hh_inc)
trvt_pal   <- colorQuantile(palette = "Reds", domain = home$trvt)
# --- Ensure geometries are sf ---
census <- st_make_valid(census)   # POLYGONS
home   <- st_make_valid(home)     # POINTS
# --- Leaflet map ---
leaflet() %>%
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  
  # Census polygons colored by income
  addPolygons(
    data = census,
    fillColor = ~income_pal(hh_inc),
    color = "white",
    weight = 0.5,
    fillOpacity = 0.7,
    popup = ~paste0(
      "<b>", NAME, "</b><br>",
      "Median HH Income: $", formatC(hh_inc, format="f", big.mark=",", digits=0), "<br>"
    )
  ) %>%
  
  # Home points colored by travel time
  addCircles(
    data = home,
    fillColor = ~trvt_pal(trvt),
    stroke = FALSE,
    radius = 300,
    fillOpacity = 0.7,
    popup = ~lapply(
      paste0(
        "Travel Time from <br> <strong>MIDTOWN: ", round(trvt, 1), " minutes</strong>"
      ),
      htmltools::HTML
    )
  ) %>%
  
  # Legends
  addLegend("bottomleft", pal = income_pal, values = census$hh_inc, title = "Household Income", opacity = 0.7) %>%
  addLegend("bottomright", pal = trvt_pal, values = home$trvt, title = "Travel Time (min)", opacity = 0.7)
# Create a scatter plot with a trend line showing the relationship between household income (x-axis) and travel time to Midtown Station (y-axis).
# Scatter plot with trend line
ggplot(home, aes(x = hh_inc, y = trvt)) +
  geom_point(color = "steelblue", alpha = 0.6, size = 2) +   
  geom_smooth(method = "lm", color = "red", se = TRUE) +    
  labs(
    title = "Relationship between Household Income and Travel Time to Midtown",
    x = "Median Household Income ($)",
    y = "Travel Time to Midtown Station (minutes)"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    axis.title = element_text(size = 12)
  )
## `geom_smooth()` using formula = 'y ~ x'
# Create a scatter plot with a trend line showing the relationship between the percentage of minority population (x-axis) and travel time to Midtown Station (y-axis).
# Scatter plot with trend line
ggplot(home, aes(x = trvt, y = pct_minority)) +
  geom_point(color = "darkorange", alpha = 0.6, size = 2) +  
  geom_smooth(method = "lm", color = "blue", se = TRUE) + 
  labs(
    title = "Relationship between % Minority Population and Travel Time to Midtown",
    x = "Percentage of Minority Population",
    y = "Travel Time to Midtown Station (minutes)"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 14, face = "bold"),
    axis.title = element_text(size = 12)
  )
## `geom_smooth()` using formula = 'y ~ x'
# //TASK //////////////////////////////////////
For the interactive map, we can see that most of the census tracts around Midtown Station do not fall in higher income quantiles. This is expected of most cities in the US as we know that more higher income households are in census tracts that are located in the suburbs. We also observe that census tracts close to Midtown experience less travel time based on proximity and the travel time increases as we move farther away. Since MARTA is stretched more vertically, we can observe that stations that are not far from Midtown but are spread more horizontally have a longer travel time than stations that are on the same longitude, but closer to Midtown Station.
The scatter plot confirms the relationship observed in the interactive map. We observe that higher income households experience longer travel time to Midtown primarily as they are located in the suburbs. The trend line moves with a upward trend, indicating a positive correlation between household income and travel time.
The scatter plot showing the relationship between the % of minority population and travel time to Midtown indicates a positive trend, though the relationship is relatively weak. The trend line suggests that census tracts with higher minority populations tend to have shorter travel times to Midtown. This pattern aligns with general trends observed in many U.S. cities, where suburban areas are predominantly White, while minority populations are more concentrated in neighborhoods closer to the city center.