Section 1. The most walkable/unwalkable census tract in Atlanta, GA

Based on my personal experience, I selected census tract 13121001101 as the most walkable area and census tract 13121008902 as the least walkable area within the city of Atlanta.

Census tract 13121001100, located in Midtown Atlanta (bounded by 10th Street to 14th Street, West Peachtree Street and Piedmont Avenue NE), features a highly walkable-scale layout. The dense sidewalk infrastructure makes it easy for pedestrians to navigate, while frequent four-way stops prioritize pedestrian crossings over vehicular traffic.

On the other hand, census tract 13121008903 (bounded by Marietta Boulevard, Bolton Road, and a private railroad) faces significant walkability challenges. The area is divided by a railroad, and commercial and industrial sites located in the center of the census tract disrupts connectivity of the neighborhood. The frequent presence of super-blocks makes people inevitably walk longer distances, and the lack of sidewalk infrastructure further limits pedestrian route options.

Section 2. OSM, GSV, and computer vision.

Fill out the template to complete the script.

library(tidyverse)
library(tidycensus)
library(osmdata)
library(sfnetworks)
library(units)
library(sf)
library(tidygraph)
library(tmap)
library(here)

Step 1. Get OSM data and clean it.

  1. Using tidycensus package, download the Census Tract polygon for Fulton and DeKalb counties.
  2. Extract two Census Tracts, each of which will be your most walkable and least walkable Census Tracts.
  3. Using their bounding boxes, get OSM data.
  4. Convert them into sfnetwork object and clean it.
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Set up your api key here
census_api_key(Sys.getenv("census_api"))
# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract", 
                 variables = c('tot_pop' = 'B01001_001'),
                 year = 2022,
                 state = "GA", 
                 county = c("Fulton", "DeKalb"), 
                 geometry = TRUE)
# =========== NO MODIFY ZONE ENDS HERE R========================================

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Write the GEOID of walkable & unwalkable Census Tracts. e.g., tr1_ID <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using tr1_ID & tr2_ID
# 3. Create their bounding boxes using st_bbox(), and 
# 4. assign them to tract_1_bb and tract_1_bb, respectively.

# For the walkable Census Tract(s)
tr1_ID <- c("13121001101")
tract_1_bb <- tract %>% 
  filter(GEOID == tr1_ID) %>%
  st_bbox()

# The most unwalkable Census Tract
tr2_ID <- c("13121008903")
tract_2_bb <- tract %>% 
  filter(GEOID == tr2_ID) %>%
  st_bbox()

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding box
osm_1 <- opq(bbox = tract_1_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "trunk", "primary", 
                            "secondary", "tertiary", "unclassified",
                            "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()

osm_2 <- opq(bbox = tract_2_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "trunk", "primary", 
                            "secondary", "tertiary", "unclassified",
                            "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert osm_1 and osm_2 to sfnetworks objects (set directed = FALSE)
# 2. Clean network by (1) deleting parallel lines/loops, (2) create missing nodes, and (3) remove pseudo nodes, 
# 3. Add a new column named length using edge_length() function.

net1 <- osm_1$osm_lines %>% 
  # Drop redundant columns 
  select(osm_id, highway) %>%
  sfnetworks::as_sfnetwork(directed = FALSE) %>%  
  activate("edges") %>%
  filter(!edge_is_multiple()) %>%
  filter(!edge_is_loop()) %>% 
  convert(., sfnetworks::to_spatial_subdivision) %>% 
  convert(., sfnetworks::to_spatial_smooth) %>% 
  mutate(length = edge_length())

net2 <- osm_2$osm_lines %>% 
  # Drop redundant columns 
  select(osm_id, highway) %>%
  sfnetworks::as_sfnetwork(directed = FALSE) %>%  
  activate("edges") %>%
  filter(!edge_is_multiple()) %>%
  filter(!edge_is_loop()) %>% 
  convert(., sfnetworks::to_spatial_subdivision) %>% 
  convert(., sfnetworks::to_spatial_smooth) %>% 
  mutate(length = edge_length())
  
# //TASK //////////////////////////////////////////////////////////////////////
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# OSM for the walkable part
edges_1 <- net1 %>% 
  # Extract 'edges'
  st_as_sf("edges") %>% 
  # Drop segments that are too short (100m)
  mutate(length = as.vector(length)) %>% 
  filter(length > 100) %>% 
  # Add a unique ID for each edge
  mutate(edge_id = seq(1,nrow(.)),
         is_walkable = "walkable")

# OSM for the unwalkable part
edges_2 <- net2 %>% 
  # Extract 'edges'
  st_as_sf("edges") %>%
  # Drop segments that are too short (100m)
  mutate(length = as.vector(length)) %>% 
  filter(length > 100) %>% 
  # Add a unique ID for each edge
  mutate(edge_id = seq(1,nrow(.)),
         is_walkable = "unwalkable")

# Merge the two
edges <- bind_rows(edges_1, edges_2)

# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 2. Define getAzimuth() function.

getAzimuth <- function(line){
  
  # This function takes one edge (i.e., a street segment) as an input and
  # outputs a data frame with four points (start, mid1, mid2, and end) and their azimuth.

  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. From `line` object, extract the coordinates using st_coordinates() and extract the first two rows.
  # 2. Use atan2() function to calculate the azimuth in degree. Make sure to adjust the value such that 0 is north, 90 is east, 180 is south, and 270 is west.
  # 1
  start_p <- line %>% 
    st_coordinates() %>% 
    .[1:2, 1:2]

  # 2
  start_azi <- atan2(
    start_p[2,"X"] - start_p[1, "X"], 
    start_p[2,"Y"] - start_p[1, "Y"]
  ) * 180/pi

  # //TASK //////////////////////////////////////////////////////////////////////

  # TASK ////////////////////////////////////////////////////////////////////////
  # Repeat what you did above, but for last two rows (instead of the first two rows).
  # Remember to flip the azimuth so that the camera would be looking at the street that's being measured
  end_p <- line %>% 
    st_coordinates() %>% 
    .[(nrow(.)-1):nrow(.),1:2]
    
  end_azi <- atan2(
    end_p[2,"X"] - end_p[1, "X"], 
    end_p[2,"Y"] - end_p[1, "Y"]
  ) * 180/pi
    
  end_azi <- if (end_azi < 180) {end_azi + 180} else {end_azi - 180}
  # //TASK //////////////////////////////////////////////////////////////////////
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. From `line` object, use st_line_sample() function to generate points at 0.45 and 0.55 locations. These two points will be used to calculate the azimuth.
  # 2. Use st_case() function to convert 'MULTIPOINT' object to 'POINT' object.
  # 3. Extract coordinates using st_coordinates().
  # 4. Use atan2() functino to Calculate azimuth.
  # 5. Use st_line_sample() again to generate a point at 0.5 location and get its coordinates. This point will be the location at which GSV image will be downloaded.
  
  mid_p <- line %>%
    st_geometry() %>% 
    .[[1]] %>% 
    st_line_sample(sample = c(0.45, 0.55)) %>% 
    st_cast("POINT") %>% 
    st_coordinates()
  
  mid_azi <- atan2(
    mid_p[2,"X"] - mid_p[1, "X"],
    mid_p[2,"Y"] - mid_p[1, "Y"]
  ) * 180/pi
  
  mid_p <- line %>%
    st_geometry() %>% 
    .[[1]] %>% 
    st_line_sample(sample = 0.5) %>% 
    st_coordinates() %>% 
    .[1,1:2]

    # //TASK //////////////////////////////////////////////////////////////////////
 
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  return(tribble(
    ~type,    ~X,            ~Y,             ~azi,
    "start",   start_p[1,"X"], start_p[1,"Y"], start_azi,
    "mid1",    mid_p["X"],   mid_p["Y"],   mid_azi,
    "mid2",    mid_p["X"],   mid_p["Y"],   ifelse(mid_azi < 180, mid_azi + 180, mid_azi - 180),
    "end",     end_p[2,"X"],   end_p[2,"Y"],   end_azi))
  # =========== NO MODIFY ZONE ENDS HERE ========================================

}

Step 3. Apply the function to all street segments

We can apply getAzimuth() function to the edges object. We finally append edges object to make use of the columns in edges object (e.g., is_walkable column). When you are finished with this code chunk, you will be ready to download GSV images.

# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzimuth() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzimuth()
edges_azi <- edges %>% 
  st_geometry() %>% 
  map_df(getAzimuth) 

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
edges_azi <- edges_azi %>% 
  bind_cols(edges %>% 
              st_drop_geometry() %>% 
              slice(rep(1:nrow(edges),each=4))) %>% 
  st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>% 
  mutate(node_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 4. Define a function that formats request URL and download images.

getImage <- function(iterrow){
  # This function takes one row of edges_azi and downloads GSV image using the information from edges_azi.
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # Finish this function definition.
  # 1. Extract required information from the row of edges_azi, including 
  #    type (i.e., start, mid1, mid2, end), location, heading, edge_id, node_id, and key.
  # 2. Format the full URL and store it in `request`. Refer to this page: https://developers.google.com/maps/documentation/streetview/request-streetview
  # 3. Format the full path (including the file name) of the image being downloaded and store it in `fpath`
  
  type <- iterrow$type
  location <- paste0(iterrow$Y %>% round(4), ",", iterrow$X %>% round(4))
  heading <- iterrow$azi %>% round(1)
  edge_id <- iterrow$edge_id
  node_id <- iterrow$node_id
  key <- Sys.getenv("google_api")
  endpoint <- "https://maps.googleapis.com/maps/api/streetview"
  request <- glue::glue("https://maps.googleapis.com/maps/api/streetview?size=640x640&location={location}&heading={heading}&fov=90&pitch=0&key={key}")
  fname <- glue::glue("GSV-nid_{node_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg")
  fpath <- here("E:/Georgia_Tech_MCRP/2024_FALL/CP8883_Urban_Analytics/Assignment/major_assignment_2/image", fname)
  # //TASK //////////////////////////////////////////////////////////////////////

  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  # Download images
  if (!file.exists(fpath)){
    download.file(request, fpath, mode = 'wb') 
  }
  # =========== NO MODIFY ZONE ENDS HERE ========================================
}

Step 5. Download GSV images

Before you download GSV images, make sure the row number of edges_azi is not too large! The row number of edges_azi will be the number of GSV images you will be downloading. Before you download images, always double-check your Google Cloud Console’s Billing tab to make sure that you will not go above the free credit of $200 each month. The price is $7 per 1000 images.

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Loop!
for (i in seq(1,nrow(edges_azi))){
  getImage(edges_azi[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================

ZIP THE DOWNLOADED IMAGES AND NAME IT ‘gsv_images.zip’ FOR STEP 6.

Step 6. Apply computer vision

Now, use Google Colab to apply the semantic segmentation model. Zip your images and upload the images to your Colab session.

Step 7. Merging the processed data back to R

Once all of the images are processed and saved in your Colab session as a CSV file, download the CSV file and merge it back to edges.

# TASK ////////////////////////////////////////////////////////////////////////
# Read the downloaded CSV file from Google Colab
seg_output <- read.csv("E:/Georgia_Tech_MCRP/2024_FALL/CP8883_Urban_Analytics/Assignment/major_assignment_2/seg_output_changmin.csv")

# //TASK ////////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Join the seg_output object back to edges_azi object using node_id as the join key.
edges_seg_output <- edges_azi %>%
  inner_join(seg_output, by=c("node_id" = "img_id")) %>%
  select(type, X, Y, node_id, building, sky, tree, road, sidewalk, is_walkable) %>%
  mutate(across(c(building, sky, tree, road, sidewalk), function(x) x/(640*640)))
# =========== NO MODIFY ZONE ENDS HERE ========================================

Section 3. Summarise and analyze the results.

Analysis 1 - Create interactive map(s) to visualize the spatial distribution of the streetscape.

You need to create maps of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. In total, you will have 10 maps.

Provide a brief description of your findings from the maps.

Above all, it is notable that the walkable census tract contains significantly more POIs than the unwalkable census tract, highlighting its popularity as a destination for travelers. When comparing the percentage of building coverage for POIs between the two census tracts, POIs in the walkable census tract have a higher percentage of building coverage than those in the unwalkable census tract. This suggests that the land use pattern in the walkable census tract is considerably denser than in the unwalkable census tract.

In contrast, when comparing the percentage of sky visibility for POIs, the unwalkable census tract shows a higher percentage of sky coverage than the walkable census tract. A possible explanation for this result is the lower density of the built environment in the unwalkable census tract, leading to greater sky visibility.

For POIs in the walkable census tract, the percentage of road coverage varies significantly. POIs near major roads such as West Peachtree St, Peachtree St, Piedmont Avenue, and 10th to 14th streets exhibit a high percentage of road coverage, while POIs in residential areas show a lower percentage. In contrast, most POIs in the unwalkable census tract have a higher percentage of road coverage, as the neighborhood is surrounded by major arterials like Marietta Boulevard and Bolton Road NW.

When comparing the percentage of tree coverage, most POIs in both census tracts show a low percentage of tree coverage. However, POIs with a high percentage of tree coverage are concentrated in specific areas, likely representing undeveloped spaces or greenspaces such as parks.

Lastly, when comparing the percentage of sidewalk coverage, there is a stark contrast between the two census tracts. In the walkable census tract, the percentage of sidewalk coverage for POIs ranges from 0 to 35%, whereas in the unwalkable census tract, it ranges from 0 to 4%. This clearly indicates that a major contributing factor to the walkability of the area is the presence of sidewalks.

Preparation

# TASK ////////////////////////////////////////////////////////////////////////
# Create interactive map(s) to visualize the `edges_seg_output` objects. 

# Preparation

tmap_mode("view")

edges_seg_final <- edges_seg_output %>% 
  mutate(building_ratio = building*100) %>% 
  mutate(sky_ratio = sky*100) %>% 
  mutate(tree_ratio = tree*100) %>% 
  mutate(road_ratio = road*100) %>% 
  mutate(sidewalk_ratio = sidewalk*100)

w <- edges_seg_final %>% filter(is_walkable == "walkable")
uw <- edges_seg_final %>% filter(is_walkable == "unwalkable")

# //TASK //////////////////////////////////////////////////////////////////////

% of Building

# Building
building_w <- tm_shape(edges_seg_final %>% filter(is_walkable == "walkable")) +
  tm_dots(col = "building_ratio", style = "pretty") + tm_layout(title = "% of Building - Walkable Tract")

building_uw <- tm_shape(edges_seg_final %>% filter(is_walkable == "unwalkable")) +
  tm_dots(col = "building_ratio", style = "pretty")+ tm_layout(title = "% of Building - Unwalkable Tract")

#map both together
tmap_arrange(building_w, building_uw) 

% of sky

sky_w <- tm_shape(edges_seg_final %>% filter(is_walkable == "walkable")) +
  tm_dots(col = "sky_ratio", style = "pretty") + tm_layout(title = "% of Sky - Walkable Tract")

sky_uw <- tm_shape(edges_seg_final %>% filter(is_walkable == "unwalkable")) +
  tm_dots(col = "sky_ratio", style = "pretty") + tm_layout(title = "% of Sky - Unwalkable Tract")

tmap_arrange(sky_w, sky_uw) 

% of tree

tree_w <- tm_shape(edges_seg_final %>% filter(is_walkable == "walkable")) +
  tm_dots(col = "tree_ratio", style = "pretty") + tm_layout(title = "% of Tree - Walkable Tract")

tree_uw <- tm_shape(edges_seg_final %>% filter(is_walkable == "unwalkable")) +
  tm_dots(col = "tree_ratio", style = "pretty")+ tm_layout(title = "% of Tree - Unwalkable Tract")

tmap_arrange(tree_w, tree_uw) 

% of road

road_w <- tm_shape(edges_seg_final %>% filter(is_walkable == "walkable")) +
  tm_dots(col = "road_ratio", style = "pretty") + tm_layout(title = "% of Road - Walkable Tract")

road_uw <- tm_shape(edges_seg_final %>% filter(is_walkable == "unwalkable")) +
  tm_dots(col = "road_ratio", style = "pretty")+ tm_layout(title = "% of Road - Unwalkable Tract")

tmap_arrange(road_w, road_uw) 

% of sidewalk

sidewalk_w <- tm_shape(edges_seg_final %>% filter(is_walkable == "walkable")) +
  tm_dots(col = "sidewalk_ratio", style = "pretty") + tm_layout(title = "% of Sidewalk - Walkable Tract")

sidewalk_uw <- tm_shape(edges_seg_final %>% filter(is_walkable == "unwalkable")) +
  tm_dots(col = "sidewalk_ratio", style = "pretty") + tm_layout(title = "% of Sidewalk - Unwalkable Tract")

tmap_arrange(sidewalk_w, sidewalk_uw) 

Analysis 2 - Compare the means.

You need to calculate the mean of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. For example, you need to calculate the mean of building category for each of walkable and unwalkable Census Tracts. Then, you need to calculate the mean of sky category for each of walkable and unwalkable Census Tracts. In total, you will have 10 mean values.

Provide a brief description of your findings.

The mean values of five categories for the walkable and unwalkable census tracts support the visual analysis results of the neighborhood characteristics of the two areas. First, the walkable census tract has a significantly higher mean value for the percentage of building compared to the unwalkable census tract, whereas the unwalkable census tract has a much higher mean value for the percentage of sky. This indicates a clear contrast in land use intensity: the walkable census tract features a dense built environment, while the unwalkable census tract exhibits low-density neighborhood characteristics. Although the mean values for road and tree coverage are similar between the two census tracts, the mean value for sidewalk coverage shows a clear difference, which highlights the scarcity of sidewalks in the unwalkable census tract.

Detailed analysis result

# TASK ////////////////////////////////////////////////////////////////////////
# Perform the calculation as described above.
# As long as you can deliver the message clearly, you can use any format/package you want.

w_building_ratio_avg <- mean(w$building_ratio)
uw_building_ratio_avg <- mean(uw$building_ratio)

w_sky_ratio_avg <- mean(w$sky_ratio)
uw_sky_ratio_avg <- mean(uw$sky_ratio)

w_tree_ratio_avg <- mean(w$tree_ratio)
uw_tree_ratio_avg <- mean(uw$tree_ratio)

w_road_ratio_avg <- mean(w$road_ratio)
uw_road_ratio_avg <- mean(uw$road_ratio)

w_sidewalk_ratio_avg <- mean(w$sidewalk_ratio)
uw_sidewalk_ratio_avg <- mean(uw$sidewalk_ratio)

#display means

category <- c("Buildings", "Sky", "Trees", "Roads", "Sidewalks")
w_avg <- c(w_building_ratio_avg, w_sky_ratio_avg, w_tree_ratio_avg, w_road_ratio_avg, w_sidewalk_ratio_avg)
uw_avg <- c(uw_building_ratio_avg, uw_sky_ratio_avg, uw_tree_ratio_avg, uw_road_ratio_avg, uw_sidewalk_ratio_avg)

avg <- data.frame(category, w_avg, uw_avg)

print(avg)
##    category     w_avg    uw_avg
## 1 Buildings 26.248616  3.256244
## 2       Sky 12.319424 32.240930
## 3     Trees 14.877636 16.373202
## 4     Roads 32.477466 40.618053
## 5 Sidewalks  6.389503  1.083215
# //TASK //////////////////////////////////////////////////////////////////////

Analysis 3 - Draw boxplot

Provide a brief description of your findings.

When comparing boxplots of the two census tracts, we can get a clearer understanding of their neighborhood characteristics. First of all, the boxplots for the percentage of road and tree are quite similar between the two census tracts with similar levels of variability. However, the walkable census tract shows a significantly higher percentage of building compared to the unwalkable census tract. Also, the percentage of building for POIs in the walkable census tract shows significant variability, whereas the percentages of building for POIs in the unwalkable census tract are very similar to one another. This result may reflect a mixed-use land pattern in the walkable census tract, while the unwalkable census tract has a homogeneous, low-density development pattern.

A similar conclusion can be drawn when comparing the boxplots for the percentage of sky. The boxplot for the unwalkable census tract shows a much higher percentage of sky than that of the walkable census tract. The variability in the percentage of sky in the unwalkable census tract might be interconnected with the variability in the percentage of tree, whereas for the walkable census tract, the variability may be driven by the variability in the percentage of building.

Lastly, when comparing boxplots for the percentage of sidewalk of two census tracts, it is evident that POIs in the unwalkable census tract consistently have a very low percentage of sidewalk while POIs in the walkable census tract show relatively higher percentage of sidewalk with greater variability. This result may also indicate the presence of pedestrian-friendly infrastructure and diverse built environment & dynamic land use pattern of the walkable census tract.

# TASK ////////////////////////////////////////////////////////////////////////

# Five Categories
plotlist <- c("building", "sky", "tree", "road", "sidewalk")

# Boxplot Presentation
boxplot_edge <- edges_seg_output %>%
  st_drop_geometry() %>% 
  gather(key = "Category", value = "Proportion", all_of(plotlist))

# Plot with modified color palette and styles
ggplot(boxplot_edge, aes(x = is_walkable, y = Proportion, fill = is_walkable)) + 
  geom_boxplot(outlier.shape = NA, alpha = 0.5, width = 0.5, lwd = 0.5) + 
  stat_summary(fun = "mean", geom = "point", shape = 20, size = 4, color = "darkred", fill = "white") +  
  
  # Change summary point style
  facet_wrap(~ Category, scales = "free_y") +
  scale_fill_manual(values = c("walkable" = "#4daf4a", "unwalkable" = "#e41a1c")) +  # New color scheme
  labs(x = " ", y = "Proportion") +
  theme_bw() +  # Change to a different theme
  theme(
    legend.position = "none",
    strip.text = element_text(size = 13, face = "italic"),  # Italicize facet labels and increase font size
    panel.border = element_rect(color = "black", fill = NA, size = 0.1),  # Blue borders for panels
  )
## Warning: The `size` argument of `element_rect()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

# //TASK //////////////////////////////////////////////////////////////////////