library(tidyverse)
library(sf)
library(spData)
library(tmap)
library(tidycensus)
library(tigris)

Before data analysis, let’s prepare main data sets.
- Fixed guideway transit facilities in the US (updated as of 1/1/2019) from the TOD database.
- The Urban Area boundary shapefile from the US Census (tigris).
- The number of workers commuting by public transit, walking, and biking at the block group level from the US Census American Community Survey (tidycensus).

Download tod_database_download.csv on https://toddata.cnt.org/downloads.php. Save it on your working directory and read in as a csv file.

# getwd()
stations_df <- read_csv("tod_database_download.csv") 

Create an sf object out of the csv file.
- how to plot xy coordinates: https://ryanpeek.github.io/2017-08-03-converting-XY-data-with-sf-package/
- crs of gtfs data: https://developers.google.com/transit/gtfs/reference/#stopstxt

Then, project it for the Altanta area.
- NAD83/Georgia West crs: http://epsg.io/26967

stations_sf <- st_as_sf(stations_df, coords = c("Longitude", "Latitude"), crs = 4326)
stations_sf <- stations_sf %>%
  filter(Buffer == "Existing Transit") %>%
  st_transform(26967)

tmap_mode("view")
tm_shape(us_states) +
  tm_borders() + 
  tm_shape(stations_sf) + 
  tm_dots()

Next, download the Urban Area boundary for the Atlanta region, delineated by the US Census Bureau. For this task, we use tigris::urban_areas to retrieve a UA boundary shapefile.

ua <- st_as_sf(
  tigris::urban_areas(cb = TRUE, year = 2017)
  )
ua_atl <- ua %>%
  filter(UACE10 == "03817") %>%
  st_transform(26967)

Spatially subset those stations within the boundary of the Atlanta UA ua_atl.

stations_atl <- stations_sf[ua_atl, ]

# for the full lost of basemaps, visit http://leaflet-extras.github.io/leaflet-providers/preview/
tmap_mode("view") + 
  tm_shape(stations_atl) + tm_dots() + 
  tm_shape(ua_atl) + tm_borders() 

Download the US Census ACS 5-year estimates for # of workers living in individual block groups and commuting by various means of transportation. To obtain your Census API key, visit http://api.census.gov/data/key_signup.html, and insert your key to the current environment.

census_api_key("YOUR API KEY GOES HERE")
var_acs17 <- load_variables(2017, "acs5", cache = TRUE)
View(var_acs17)
acs17 <- get_acs(state = "GA", county = c("Fulton", "DeKalb"), 
                 geography = "block group", year = 2017, 
                 variables = c("B08301_001", "B08301_010", "B08301_018", "B08301_019"), 
                 output = "wide",
                 geometry = TRUE) 
blkgrp_atl <- acs17 %>%
  st_transform(26967)

blkgrp_atl$area_whole <- unclass(st_area(blkgrp_atl))

Creat half-mile buffers for individual stations. Check the unit of the crs in stations_atl.

buffer_atl <- st_buffer(stations_atl, dist = 800)

Clip census block groups by the buffer polygons.

seg_atl <- st_intersection(buffer_atl, blkgrp_atl)
seg_atl$area_seg <- unclass(st_area(seg_atl))
seg_atl$area_prop <- seg_atl$area_seg / seg_atl$area_whole
hist(seg_atl$area_prop)

# View(stations_atl)

seg_midtown <- seg_atl %>%
  filter(Station.Name == "MIDTOWN STATION")
tmap_mode("view")
tm_shape(seg_midtown) +
  tm_polygons(col = "area_prop", alpha = 0.5)

Questions

Report your answers here: Google Forms

Get yourself familiar with the data and operations here.
  1. How many existing fixed-guideway facilities in the US?
nrow(stations_sf)
## [1] 4417
  1. How many existing fixed-guideway facilities in the Atlanta UA?
nrow(stations_atl)
## [1] 41
  1. What do "B08301_001", "B08301_010", "B08301_018", and "B08301_019" indicate?
var_acs17 %>%
  filter(name %in% c("B08301_001", "B08301_010", "B08301_018", "B08301_019")) %>%
  dplyr::select(name, label)
## # A tibble: 4 x 2
##   name       label                                                     
##   <chr>      <chr>                                                     
## 1 B08301_001 Estimate!!Total                                           
## 2 B08301_010 Estimate!!Total!!Public transportation (excluding taxicab)
## 3 B08301_018 Estimate!!Total!!Bicycle                                  
## 4 B08301_019 Estimate!!Total!!Walked
Focus on the "MIDTOWN STATION".
  1. How many blockgroup segments are there in the helf mile buffer from the "MIDTOWN STATION"?
nrow(seg_midtown)
## [1] 13
  1. How many resident workers are there in the helf mile buffer from the "MIDTOWN STATION"?
  2. How many transit commuters are there in the helf mile buffer from the "MIDTOWN STATION"?
  3. How many biking commuters are there in the helf mile buffer from the "MIDTOWN STATION"?
  4. How many walking commuters are there in the helf mile buffer from the "MIDTOWN STATION"?
  5. Compute the percent of transit commuters (as a proportion of all commuters) in the helf mile buffer from the "MIDTOWN STATION"?
  6. Compute the percent of active commuters (bikers/walkers combined as a proportion of all commuters) in the helf mile buffer from the "MIDTOWN STATION"?
union_midtown <- 
  seg_midtown %>% 
  mutate(workers = B08301_001E * area_prop, 
         workers_tr = B08301_010E * area_prop, 
         workers_bk = B08301_018E * area_prop,
         workers_wk = B08301_019E * area_prop) %>%
  summarize(n_seg = n(), 
            workers_sum = sum(workers), 
            workers_sum_tr = sum(workers_tr),
            workers_sum_bk = sum(workers_bk),
            workers_sum_wk = sum(workers_wk), 
            workers_pct_tr = workers_sum_tr/workers_sum*100, 
            workers_pct_active = (workers_sum_bk + workers_sum_bk)/workers_sum*100)  

tmap_mode("view")
tm_shape(union_midtown) +
  tm_borders()
Think about all station areas in the Atlanta region.
  1. Which station has the the highest number of blockgroup segments?
  2. Which station has the the largest resident workers in the helf mile buffers?
  3. Which station has the the largest transit commuters in the helf mile buffer?
  4. Which station has the the largest biking commuters in the helf mile buffer?
  5. Which station has the the largest walking commuters in the helf mile buffer?
  6. Which station has the the highest percent of transit commuters (as a proportion of all commuters) in the helf mile buffer?
  7. Which station has the the highest percent of active commuters (bikers/walkers combined as a proportion of all commuters) in the helf mile buffer?
union_atl <-
  seg_atl %>% 
  group_by(Station.Name) %>% 
  mutate(workers = B08301_001E * area_prop, 
         workers_tr = B08301_010E * area_prop, 
         workers_bk = B08301_018E * area_prop,
         workers_wk = B08301_019E * area_prop
         ) %>%
  summarize(n_seg = n(), 
            workers_sum = sum(workers), 
            workers_sum_tr = sum(workers_tr),
            workers_sum_bk = sum(workers_bk),
            workers_sum_wk = sum(workers_wk), 
            workers_pct_tr = workers_sum_tr/workers_sum*100, 
            workers_pct_active = (workers_sum_bk + workers_sum_bk)/workers_sum*100) %>%
  arrange(desc(n_seg))

tmap_mode("view")
tm_shape(union_atl) +
  tm_borders()