Major Assignment 2

Jaegeon Lee

2024-11-15

Introduction to the assignment

This assignment consists of three main sections.

In the first section, you need to select one Census Tract that you think is the most walkable and another that you think is the least walkable within Fulton and DeKalb Counties, GA. As long as they are within the two counties, you can pick any two Census Tracts you want. If the area you want to use as the walkable/unwalkable area is not well covered by a single Census Tract, you can select multiple tracts (e.g., selecting three adjacent tracts as one walkable area). The definition of ‘walkable’ can be your own: you can choose solely based on your experience (e.g., had the best/worst walking experience because …), refer to Walk Score, or use any other mix of criteria you want. After you make the selection, provide a short write-up of why you chose those Census Tracts.

The second section is the main part of this assignment, in which you prepare OSM data, download GSV images, and apply the computer vision technique we learned in class (i.e., semantic segmentation).

In the third section, you will summarize and analyze the output and report your findings. After you apply computer vision to the images, you will have, for each image, the number of pixels in each of the 150 categories. You will focus on the following categories in your analysis: building, sky, tree, road, and sidewalk. Specifically, you will (1) create maps to visualize the spatial distribution of different objects, (2) compare the mean of each category between the two Census Tracts, and (3) draw boxplots to compare the distributions.
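
The comparison of means in (2) can be sketched in base R as below. This is only an illustration under stated assumptions: the pixel counts are made-up numbers, `seg` is a hypothetical table name, and the shares are computed over the five focus categories only, whereas the real output has 150 categories and the denominator may differ.

```r
# Hypothetical pixel counts for four images (made-up numbers, for illustration only)
seg <- data.frame(
  is_walkable = c("walkable", "walkable", "unwalkable", "unwalkable"),
  building    = c(30, 50, 10, 20),
  sky         = c(20, 10, 40, 50),
  tree        = c(30, 20, 10, 10),
  road        = c(15, 15, 35, 15),
  sidewalk    = c(5, 5, 5, 5)
)

cats <- c("building", "sky", "tree", "road", "sidewalk")

# Convert counts to shares of the five categories within each image (an assumption)
shares <- seg[cats] / rowSums(seg[cats])

# Mean share of each category per group
means <- aggregate(shares, by = list(is_walkable = seg$is_walkable), FUN = mean)
print(means)
```

The same `shares` table, combined with the `is_walkable` column, is the kind of input a boxplot per category would need for (3).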

Section 1. Choose your Census Tracts.

Provide a brief description of your census tracts. Why do you think the Census Tracts are walkable and unwalkable? What were the contributing factors?

I selected Inman Park as a walkable neighborhood. According to Walk Score (https://www.walkscore.com/GA/Atlanta/Inman_Park), it ranks as the 6th most walkable neighborhood in Atlanta. I have been to Inman Park once and was impressed by the many restaurants and shops along its streets, which were full of both residents and visitors. I chose the census tract with GEOID “13121001302”, which covers the most central and dense area of the Inman Park neighborhood.

Meanwhile, I selected Doraville as an example of an unwalkable neighborhood. Doraville is one of Atlanta's early suburbs, and wide roads such as Peachtree cut through it. It has a MARTA rail station, Doraville station, but when I visited, my impression was that there were no shops or amenities even near the station. According to Walk Score (https://www.walkscore.com/GA/Atlanta/Inman_Park), it ranked 34th in the Atlanta metropolitan region. I chose the census tract with GEOID “13089021301”, which covers Doraville station, where presumably more people walk to and from than in other parts of the Doraville neighborhood.

Section 2. OSM, GSV, and computer vision.

Fill out the template to complete the script.

library(tidyverse)
library(tidycensus)
library(osmdata)
library(sfnetworks)
library(units)
library(sf)
library(tidygraph)
library(tmap)
library(here)

Step 1. Get OSM data and clean it.

The getbb() function, which we used in the class to download OSM data, isn’t suitable for downloading just two Census Tracts. We will instead use an alternative method.

  1. Using tidycensus package, download the Census Tract polygon for Fulton and DeKalb counties.
  2. Extract two Census Tracts, each of which will be your most walkable and least walkable Census Tracts.
  3. Using their bounding boxes, get OSM data.
  4. Convert them into sfnetwork objects and clean them.
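
Step 2 may involve more than one tract per area, as the introduction allows. A minimal sketch of that selection in base R follows, assuming a toy data frame (`toy_tract`, a hypothetical name) in place of the `tract` object; the two IDs are the ones given as an example in the script's hints, not the tracts chosen in this report. `%in%` matches every listed GEOID, while `==` recycles the ID vector and can silently miss rows.

```r
# Toy stand-in for the `tract` table; only the GEOID column matters here
toy_tract <- data.frame(
  GEOID = c("13121001205", "13121001206", "13121002300", "13121004200"),
  stringsAsFactors = FALSE
)

# Two tracts treated as one area (IDs taken from the hint in the script)
ids <- c("13121001206", "13121001205")

# `==` compares element by element after recycling `ids`, so order matters
n_equal <- sum(toy_tract$GEOID == ids)

# `%in%` tests membership, which is what a multi-tract filter needs
n_in <- sum(toy_tract$GEOID %in% ids)

print(c(equal = n_equal, in_set = n_in))
```

With a single ID, as in this report, `filter(GEOID == tr1_ID)` and `filter(GEOID %in% tr1_ID)` select the same row.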
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Set up your api key here
census_api_key(Sys.getenv("census_api"))
## To install your API key for use in future sessions, run this function with `install = TRUE`.
# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract", 
                 variables = c('tot_pop' = 'B01001_001'),
                 year = 2020,
                 state = "GA", 
                 county = c("Fulton", "DeKalb"), 
                 geometry = TRUE) 
## Getting data from the 2016-2020 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
# =========== NO MODIFY ZONE ENDS HERE ========================================

verification

tract 
## Simple feature collection with 530 features and 5 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.85071 ymin: 33.50251 xmax: -84.02371 ymax: 34.18629
## Geodetic CRS:  NAD83
## First 10 features:
##          GEOID                                        NAME variable estimate
## 1  13089020500    Census Tract 205, DeKalb County, Georgia  tot_pop     3347
## 2  13121010601 Census Tract 106.01, Fulton County, Georgia  tot_pop     3673
## 3  13121002300     Census Tract 23, Fulton County, Georgia  tot_pop     1384
## 4  13121004200     Census Tract 42, Fulton County, Georgia  tot_pop     2675
## 5  13121003100     Census Tract 31, Fulton County, Georgia  tot_pop     2290
## 6  13121009601  Census Tract 96.01, Fulton County, Georgia  tot_pop     3209
## 7  13121003600     Census Tract 36, Fulton County, Georgia  tot_pop     1132
## 8  13089021813 Census Tract 218.13, DeKalb County, Georgia  tot_pop     2286
## 9  13089021409 Census Tract 214.09, DeKalb County, Georgia  tot_pop     3986
## 10 13089023422 Census Tract 234.22, DeKalb County, Georgia  tot_pop     5945
##     moe                       geometry
## 1   574 MULTIPOLYGON (((-84.34919 3...
## 2   960 MULTIPOLYGON (((-84.46957 3...
## 3   322 MULTIPOLYGON (((-84.42613 3...
## 4   577 MULTIPOLYGON (((-84.42334 3...
## 5   397 MULTIPOLYGON (((-84.35705 3...
## 6   606 MULTIPOLYGON (((-84.38269 3...
## 7   267 MULTIPOLYGON (((-84.4065 33...
## 8   268 MULTIPOLYGON (((-84.2442 33...
## 9  1030 MULTIPOLYGON (((-84.30349 3...
## 10  715 MULTIPOLYGON (((-84.2865 33...
tmap_mode('view')
## tmap mode set to interactive viewing
tm_shape(tract) + 
  tm_polygons(col = "estimate", palette = 'GnBu', title = "Total population", alpha = 0.3)
# TASK ////////////////////////////////////////////////////////////////////////
# The purpose of this TASK is to create one bounding box for walkable Census Tract and another bounding box for unwalkable Census Tract.
# As long as you generate what's needed for the subsequent code, you are good. The numbered list of tasks below is to provide some hints.
# 1. Write the GEOID of walkable & unwalkable Census Tracts. e.g., tr1_ID <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using tr1_ID & tr2_ID
# 3. Create their bounding boxes using st_bbox(), and 
# 4. assign them to tract_1_bb and tract_2_bb, respectively.

# For the walkable Census Tract(s)
tr1_ID <- "13121001302"

tract_1_bb <- tract %>% 
  filter(GEOID == tr1_ID) %>%
  st_bbox() 

# For the unwalkable Census Tract(s)  
tr2_ID <- "13089021301"

tract_2_bb <- tract %>% 
  filter(GEOID == tr2_ID) %>%
  st_bbox() 
# //TASK //////////////////////////////////////////////////////////////////////

  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding box
osm_1 <- opq(bbox = tract_1_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "trunk", "primary", 
                            "secondary", "tertiary", "unclassified",
                            "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()

osm_2 <- opq(bbox = tract_2_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "trunk", "primary", 
                            "secondary", "tertiary", "unclassified",
                            "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================



# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert osm_1 and osm_2 to sfnetworks objects (set directed = FALSE)
# 2. Clean the network by (1) deleting parallel lines and loops, (2) create missing nodes, and (3) remove pseudo nodes, 
# 3. Add a new column named length using edge_length() function.
net1 <- osm_1$osm_lines %>% 
  select(osm_id, highway) %>% 
  sfnetworks::as_sfnetwork(directed = FALSE) %>% 
  activate("edges") %>%
  filter(!edge_is_multiple()) %>% # remove duplicated edges
  filter(!edge_is_loop()) %>% # remove loops
  convert(., sfnetworks::to_spatial_subdivision) %>% # subdivide edges
  convert(., sfnetworks::to_spatial_smooth) %>% # delete pseudo nodes
  mutate(length = edge_length()) %>%
  select(osm_id, highway, length) 
## Warning: to_spatial_subdivision assumes attributes are constant over geometries

net2 <- osm_2$osm_lines %>% 
  select(osm_id, highway) %>% 
  sfnetworks::as_sfnetwork(directed = FALSE) %>% 
  activate("edges") %>%
  filter(!edge_is_multiple()) %>% # remove duplicated edges
  filter(!edge_is_loop()) %>% # remove loops
  convert(., sfnetworks::to_spatial_subdivision) %>% # subdivide edges
  convert(., sfnetworks::to_spatial_smooth) %>% # delete pseudo nodes
  mutate(length = edge_length()) %>%
  select(osm_id, highway, length) 
## Warning: to_spatial_subdivision assumes attributes are constant over geometries
# //TASK //////////////////////////////////////////////////////////////////////
  
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# OSM for the walkable part
edges_1 <- net1 %>% 
  # Extract 'edges'
  st_as_sf("edges") %>% 
  # Drop segments that are too short (100m)
  mutate(length = as.vector(length)) %>% 
  filter(length > 100) %>% 
  # Add a unique ID for each edge
  mutate(edge_id = seq(1,nrow(.)),
         is_walkable = "walkable")

# OSM for the unwalkable part
edges_2 <- net2 %>% 
  # Extract 'edges'
  st_as_sf("edges") %>%
  # Drop segments that are too short (100m)
  mutate(length = as.vector(length)) %>% 
  filter(length > 100) %>% 
  # Add a unique ID for each edge
  mutate(edge_id = seq(1,nrow(.)),
         is_walkable = "unwalkable")

# Merge the two
edges <- bind_rows(edges_1, edges_2)
# =========== NO MODIFY ZONE ENDS HERE ========================================

verification

tmap_mode('view')
## tmap mode set to interactive viewing
tm_shape(tract) + 
  tm_polygons(col = "estimate", palette = 'GnBu', title = "Total population", alpha = 0.3) +
tm_shape(edges) +
  tm_lines()
# Transform to EPSG:32616 (UTM zone 16N) because st_line_sample() fails on
# longitude/latitude data with the error below:
#   Error in st_line_sample(., sample = c(0.45, 0.55)) : 
#   st_line_sample for longitude/latitude not supported; use st_segmentize?
edges <- edges %>%
  st_transform(32616)
edges
## Simple feature collection with 422 features and 7 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: 742483.2 ymin: 3738759 xmax: 752834.8 ymax: 3757083
## Projected CRS: WGS 84 / UTM zone 16N
## # A tibble: 422 × 8
##     from    to osm_id  highway     length                       geometry edge_id
##  * <int> <int> <chr>   <chr>        <dbl>               <LINESTRING [m]>   <int>
##  1     1     2 9244983 residential   146. (743550.1 3740664, 743557.1 3…       1
##  2     3     4 9244993 tertiary      806. (743238 3739086, 743241.1 373…       2
##  3     4     5 9244993 tertiary      219. (743299.4 3739887, 743299 373…       3
##  4     6     7 9246351 residential   172. (742838.8 3740126, 742843.6 3…       4
##  5     8     9 9247478 residential   633. (743952 3740623, 744042.2 374…       5
##  6    14    15 9248893 residential   138. (743658.9 3739755, 743657.9 3…       6
##  7    16    17 9250084 residential   147. (742840.4 3739882, 742840.2 3…       7
##  8     2    18 9250146 residential   139. (743692.6 3740529, 743697.1 3…       8
##  9     2    19 9250146 residential   208. (743696.2 3740668, 743694.6 3…       9
## 10    25    26 9250715 residential   298. (743475.5 3740370, 743481.9 3…      10
## # ℹ 412 more rows
## # ℹ 1 more variable: is_walkable <chr>

Step 2. Define getAzimuth() function.
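
The azimuth convention that getAzimuth() relies on can be sketched in base R as follows. This is a sketch rather than part of the template: `bearing()` and `flip()` are hypothetical helper names. `atan2(dx, dy)` returns the angle measured clockwise from north in the range (-180, 180], and `%% 360` moves it into [0, 360) so that west becomes 270 instead of -90.

```r
# Hypothetical helper: compass bearing in degrees from point 1 to point 2
bearing <- function(x1, y1, x2, y2) {
  (atan2(x2 - x1, y2 - y1) * 180 / pi) %% 360
}

# Hypothetical helper: the opposite direction, kept within [0, 360)
flip <- function(azi) {
  (azi + 180) %% 360
}

north <- bearing(0, 0, 0, 1)
east  <- bearing(0, 0, 1, 0)
south <- bearing(0, 0, 0, -1)
west  <- bearing(0, 0, -1, 0)

print(round(c(north = north, east = east, south = south, west = west), 6))
print(round(flip(east), 6))
```

Passing the x difference first and the y difference second is what makes 0 point north; the usual mathematical order, `atan2(dy, dx)`, would instead measure counterclockwise from east.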

getAzimuth <- function(line){
  # This function takes one edge (i.e., a street segment) as an input and
  # outputs a data frame with four points (start, mid1, mid2, and end) and their azimuth.
  
  
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. From `line` object, extract the coordinates using st_coordinates() and extract the first two rows.
  # 2. Use atan2() function to calculate the azimuth in degree. 
  #    Make sure to adjust the value such that 0 is north, 90 is east, 180 is south, and 270 is west.
  # 1
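  start_p <- line %>% 
    st_coordinates() %>% 
    .[1:2,1:2]

  # 2
  # atan2(dx, dy) is measured clockwise from north; %% 360 maps negative
  # values (e.g., -90 for west) into the 0-360 range required above
  start_azi <- (atan2(start_p[2,"X"] - start_p[1, "X"],
                      start_p[2,"Y"] - start_p[1, "Y"])*180/pi) %% 360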
  # //TASK //////////////////////////////////////////////////////////////////////

    
    
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # Repeat what you did above, but for last two rows (instead of the first two rows).
  # Remember to flip the azimuth so that the camera would be looking at the street that's being measured
  end_p <- line %>% 
    st_coordinates() %>% 
    .[(nrow(.)-1):nrow(.),1:2]
    
  end_azi <- atan2(end_p[2,"X"] - end_p[1, "X"],
                   end_p[2,"Y"] - end_p[1, "Y"])*180/pi
  
  end_azi <- if (end_azi < 180) {end_azi + 180} else {end_azi - 180}
  # //TASK //////////////////////////////////////////////////////////////////////
  
  
  
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. From `line` object, use st_line_sample() function to generate points at 0.45 and 0.55 locations. These two points will be used to calculate the azimuth.
  # 2. Use st_cast() function to convert 'MULTIPOINT' object to 'POINT' object.
  # 3. Extract coordinates using st_coordinates().
  # 4. Use atan2() function to calculate azimuth.
  # 5. Use st_line_sample() again to generate a point at 0.5 location and get its coordinates. This point will be the location at which GSV image will be downloaded.
  
  mid_p <- line %>% 
    st_line_sample(sample = c(0.45, 0.55)) %>% 
    st_cast("POINT") %>% 
    st_coordinates()
  
  mid_azi <- atan2(mid_p[2,"X"] - mid_p[1, "X"],
                   mid_p[2,"Y"] - mid_p[1, "Y"])*180/pi
  
  mid_p <- line %>% 
    st_line_sample(sample = c(0.50)) %>% 
    st_cast("POINT") %>% 
    st_coordinates()
  # //TASK //////////////////////////////////////////////////////////////////////
 
    
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  return(tribble(
    ~type,    ~X,            ~Y,             ~azi,
    "start",   start_p[1,"X"], start_p[1,"Y"], start_azi,
    "mid1",    mid_p[1,"X"],   mid_p[1,"Y"],   mid_azi,
    "mid2",    mid_p[1,"X"],   mid_p[1,"Y"],   ifelse(mid_azi < 180, mid_azi + 180, mid_azi - 180),
    "end",     end_p[2,"X"],   end_p[2,"Y"],   end_azi))
  # =========== NO MODIFY ZONE ENDS HERE ========================================

}
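
One detail worth checking in the function above: atan2() returns angles in (-180, 180], so a segment heading west of north gets a negative azimuth (the Step 3 output includes values such as -0.200). The sketch below shows one way to wrap headings into the 0 to 360 range that the task comment asks for. The helper names `azimuth_deg()` and `flip_azimuth()` are hypothetical and are not part of the assignment template.

```r
# Hypothetical helpers (not from the template): compass azimuth in [0, 360).

azimuth_deg <- function(x1, y1, x2, y2) {
  # atan2(dx, dy) measures the angle clockwise from north, in (-180, 180]
  raw <- atan2(x2 - x1, y2 - y1) * 180 / pi
  # The modulo wraps negative angles, e.g. -90 becomes 270
  raw %% 360
}

flip_azimuth <- function(azi) {
  # Point the camera in the opposite direction, staying within [0, 360)
  (azi + 180) %% 360
}

north <- azimuth_deg(0, 0, 0, 1)
east  <- azimuth_deg(0, 0, 1, 0)
south <- azimuth_deg(0, 0, 0, -1)
west  <- azimuth_deg(0, 0, -1, 0)
print(c(north = north, east = east, south = south, west = west))
```

Using the modulo operator avoids separate if/else branches for the flip and keeps every heading non-negative.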

verification

edges[5, ] %>%
  st_geometry() %>%
  getAzimuth()
## # A tibble: 4 × 4
##   type        X        Y   azi
##   <chr>   <dbl>    <dbl> <dbl>
## 1 start 743952. 3740623.  87.6
## 2 mid1  744269. 3740638.  87.5
## 3 mid2  744269. 3740638. 267. 
## 4 end   744585. 3740661. 270.
# Transform the coordinates back to EPSG:4326, because the Google API expects latitude/longitude in degrees
edges <- edges %>%
  st_transform(4326)
edges
## Simple feature collection with 422 features and 7 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: -84.38145 ymin: 33.76099 xmax: -84.26535 ymax: 33.92455
## Geodetic CRS:  WGS 84
## # A tibble: 422 × 8
##     from    to osm_id  highway     length                       geometry edge_id
##  * <int> <int> <chr>   <chr>        <dbl>               <LINESTRING [°]>   <int>
##  1     1     2 9244983 residential   146. (-84.36986 33.77812, -84.3697…       1
##  2     3     4 9244993 tertiary      806. (-84.37366 33.76398, -84.3736…       2
##  3     4     5 9244993 tertiary      219. (-84.37278 33.77118, -84.3727…       3
##  4     6     7 9246351 residential   172. (-84.37768 33.77344, -84.3776…       4
##  5     8     9 9247478 residential   633. (-84.36554 33.77766, -84.3645…       5
##  6    14    15 9248893 residential   138. (-84.36894 33.76991, -84.3689…       6
##  7    16    17 9250084 residential   147. (-84.37774 33.77123, -84.3777…       7
##  8     2    18 9250146 residential   139. (-84.36836 33.77687, -84.3682…       8
##  9     2    19 9250146 residential   208. (-84.36829 33.77812, -84.3682…       9
## 10    25    26 9250715 residential   298. (-84.37075 33.77548, -84.3706…      10
## # ℹ 412 more rows
## # ℹ 1 more variable: is_walkable <chr>

Step 3. Apply the function to all street segments

We can now apply the getAzimuth() function to every edge in the edges object. Afterwards, we bind the columns of edges back onto the result so that its attributes (e.g., the is_walkable column) are available for each point. When you are finished with this code chunk, you will be ready to download GSV images.
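
A caveat about this step: edges was transformed to EPSG:4326 above, so getAzimuth() now receives longitude/latitude in degrees. At Atlanta's latitude a degree of longitude is shorter than a degree of latitude, so a plain atan2() on degree differences can skew headings that are not aligned with the axes. The sketch below illustrates the size of the effect using a simple cosine correction. `bearing_lonlat()` is a hypothetical helper, and the flat-earth approximation it uses is only an assumption that holds for short segments. Another option would be to compute azimuths in the projected CRS and transform only the point locations.

```r
# Hypothetical helper (not from the template): approximate bearing, in degrees
# clockwise from north, between two nearby lon/lat points.
bearing_lonlat <- function(lon1, lat1, lon2, lat2) {
  mean_lat <- (lat1 + lat2) / 2 * pi / 180
  dx <- (lon2 - lon1) * cos(mean_lat)  # scale longitude degrees to latitude degrees
  dy <- lat2 - lat1
  (atan2(dx, dy) * 180 / pi) %% 360
}

# A segment that moves 0.001 degrees east and 0.001 degrees north.
planar    <- (atan2(0.001, 0.001) * 180 / pi) %% 360          # treats degrees as planar units
corrected <- bearing_lonlat(-84.366, 33.777, -84.365, 33.778) # accounts for latitude

print(c(planar = planar, corrected = corrected))
```

The planar version reports a due north-east heading, while the corrected bearing is a few degrees closer to north.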

# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzimuth() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzimuth()
edges_azi <- edges %>% 
  st_geometry() %>%
  map_df(getAzimuth, .progress = TRUE)
edges_azi
## # A tibble: 1,688 × 4
##    type      X     Y     azi
##    <chr> <dbl> <dbl>   <dbl>
##  1 start -84.4  33.8 101.   
##  2 mid1  -84.4  33.8  89.5  
##  3 mid2  -84.4  33.8 270.   
##  4 end   -84.4  33.8 270.   
##  5 start -84.4  33.8  23.2  
##  6 mid1  -84.4  33.8   7.21 
##  7 mid2  -84.4  33.8 187.   
##  8 end   -84.4  33.8 180.   
##  9 start -84.4  33.8  -0.200
## 10 mid1  -84.4  33.8  -0.195
## # ℹ 1,678 more rows
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
edges_azi <- edges_azi %>% 
  bind_cols(edges %>% 
              st_drop_geometry() %>% 
              slice(rep(1:nrow(edges),each=4))) %>% 
  st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>% 
  mutate(node_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
#edges_azi %>%
#  st_write("assets/edges_azi_inman_doravile_4gsv.geojson")
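
The bind_cols() call above relies on each edge producing exactly four points, in the same order as the rows of edges. The toy example below, with made-up data, shows how rep(..., each = 4) repeats each edge's attributes four times so they line up with the start, mid1, mid2, and end rows. The data frame `edge_attrs` is a hypothetical stand-in for the attribute table of edges.

```r
# Toy attribute table standing in for edges (values are made up).
edge_attrs <- data.frame(
  edge_id     = 1:2,
  is_walkable = c("walkable", "unwalkable")
)

# Repeat each row index four times: 1 1 1 1 2 2 2 2
idx <- rep(seq_len(nrow(edge_attrs)), each = 4)

# One attribute row per point, in the same order as the getAzimuth() output
expanded <- edge_attrs[idx, ]
expanded$type <- rep(c("start", "mid1", "mid2", "end"), times = nrow(edge_attrs))
print(expanded)
```

If any edge returned fewer or more than four rows, this positional alignment would silently shift, so checking that nrow(edges_azi) equals 4 * nrow(edges) is a cheap safeguard.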

Step 4. Define a function that formats request URL and download images.

getImage <- function(iterrow){
  # This function takes one row of edges_azi and downloads GSV image using the information from edges_azi.
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # Finish this function definition.
  # 1. Extract required information from the row of edges_azi, including 
  #    type (i.e., start, mid1, mid2, end), location, heading, edge_id, node_id, and key.
  # 2. Format the full URL and store it in `request`. Refer to this page: https://developers.google.com/maps/documentation/streetview/request-streetview
  # 3. Format the full path (including the file name) of the image being downloaded and store it in `fpath`
  type <- iterrow$type
  location <- paste0(iterrow$Y %>% round(5), ",", iterrow$X %>% round(5))
  heading <- iterrow$azi %>% round(1)
  edge_id <- iterrow$edge_id
  node_id <- iterrow$node_id
  highway <- iterrow$highway
  key <- Sys.getenv("google_api")
  
  endpoint <- "https://maps.googleapis.com/maps/api/streetview"
  
  furl <- glue::glue("{endpoint}?size=640x640&location={location}&heading={heading}&fov=90&pitch=0&key={key}")
  fname <- glue::glue("GSV-nid_{node_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg") # Don't change this code for fname
  fpath <- here("assignments", "gsv_images", fname)
  # //TASK //////////////////////////////////////////////////////////////////////

  
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  # Download images
  if (!file.exists(fpath)){
    download.file(furl, fpath, mode = 'wb') 
  }
  # =========== NO MODIFY ZONE ENDS HERE ========================================
}
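
To make the request format concrete, the sketch below builds the same kind of URL with base R only. It can also build a request for the Street View metadata endpoint, which returns a JSON status (such as OK or ZERO_RESULTS) that can be checked before an image is downloaded. `build_gsv_url()` is a hypothetical helper and "YOUR_API_KEY" is a placeholder. To my understanding metadata requests are not billed, but this should be verified against Google's current documentation.

```r
# Hypothetical helper (not from the template): format a Street View Static API URL.
build_gsv_url <- function(lat, lon, heading, key, metadata = FALSE) {
  endpoint <- "https://maps.googleapis.com/maps/api/streetview"
  location <- sprintf("%.5f,%.5f", lat, lon)  # the API expects "lat,lng"
  if (metadata) {
    # The metadata endpoint only needs a location and a key
    return(paste0(endpoint, "/metadata?location=", location, "&key=", key))
  }
  paste0(endpoint, "?size=640x640&location=", location,
         "&heading=", sprintf("%.1f", heading),
         "&fov=90&pitch=0&key=", key)
}

img_url  <- build_gsv_url(33.77766, -84.36554, 87.6, key = "YOUR_API_KEY")
meta_url <- build_gsv_url(33.77766, -84.36554, 87.6, key = "YOUR_API_KEY", metadata = TRUE)
print(img_url)
print(meta_url)
```

Printing the URL for one row before running the full loop is an easy way to confirm that latitude comes before longitude and that the key was read from the environment.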

Step 5. Download GSV images

Before you download GSV images, make sure that edges_azi does not have too many rows! The number of rows in edges_azi is the number of GSV images you will download. Always double-check the Billing tab of your Google Cloud Console first to make sure that you will not exceed the free credit of $200 per month. The price is $7 per 1,000 images.
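
As a quick check of the numbers above, the following sketch estimates the cost of this download from the 1,688 rows in edges_azi, using the price and credit quoted in this section. The variable names are illustrative only.

```r
# Figures taken from this document: 1,688 points, $7 per 1,000 images, $200 monthly credit.
n_images       <- 1688
price_per_1000 <- 7
monthly_credit <- 200

est_cost   <- n_images / 1000 * price_per_1000            # estimated cost in dollars
max_images <- floor(monthly_credit / price_per_1000 * 1000) # images covered by the credit

print(c(est_cost = est_cost, max_images = max_images))
```

At these rates one full download costs about $12, so the credit covers it with room to spare, but repeated re-downloads of a large network could add up.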

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Loop!
for (i in seq(1,nrow(edges_azi))){
  getImage(edges_azi[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================
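
The loop above can stop with an error if a single download fails (for example, because of a network problem), which leaves the remaining images missing. The sketch below shows one possible pattern for recording failures and continuing. `download_all()` and `fake_fetch()` are hypothetical names, and the simulated failure stands in for a real call such as getImage(edges_azi[i, ]).

```r
# Hypothetical wrapper (not from the template): run fetch_one(i) for i = 1..n,
# collecting the indices that raised an error instead of stopping.
download_all <- function(n, fetch_one) {
  failed <- integer(0)
  for (i in seq_len(n)) {
    ok <- tryCatch({
      fetch_one(i)
      TRUE
    }, error = function(e) FALSE)
    if (!ok) failed <- c(failed, i)
  }
  failed
}

# Stand-in for a real download: every third request fails.
fake_fetch <- function(i) {
  if (i %% 3 == 0) stop("simulated network error")
  invisible(NULL)
}

failed <- download_all(7, fake_fetch)
print(failed)
```

Because getImage() skips files that already exist, rerunning only the failed indices afterwards would not trigger duplicate downloads.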

ZIP THE DOWNLOADED IMAGES AND NAME IT ‘gsv_images.zip’ FOR STEP 6.

Step 6. Apply computer vision

Now, use Google Colab to apply the semantic segmentation model. Zip your images and upload the images to your Colab session.

Step 7. Merging the processed data back to R

Once all of the images are processed and saved in your Colab session as a CSV file, download the CSV file and merge it back to edges.

# TASK ////////////////////////////////////////////////////////////////////////
# Read the downloaded CSV file from Google Colab
seg_output <- read.csv(
  "seg_output.csv"
)

# //TASK ////////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Join the seg_output object back to edges_azi object using node_id as the join key.
edges_seg_output <- edges_azi %>% 
  inner_join(seg_output, by=c("node_id" = "img_id")) %>% 
  select(type, X, Y, node_id, building, sky, tree, road, sidewalk, is_walkable) %>% 
  mutate(across(c(building, sky, tree, road, sidewalk), function(x) x/(640*640)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
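
Two things happen in the chunk above: an inner join, which silently drops any point without a matching image, and a conversion from pixel counts to proportions of a 640 x 640 image. The toy example below, with made-up numbers, illustrates both using base R. The data frames `nodes` and `seg` are hypothetical stand-ins for edges_azi and seg_output.

```r
# Made-up data: four points, but segmentation output for only three images.
nodes <- data.frame(node_id = 1:4)
seg   <- data.frame(img_id = c(1, 2, 4),
                    sky    = c(204800, 102400, 409600))  # pixel counts

# Points that would be dropped by an inner join
missing_ids <- setdiff(nodes$node_id, seg$img_id)

# Inner join (merge() keeps only matching keys by default)
merged <- merge(nodes, seg, by.x = "node_id", by.y = "img_id")

# Convert pixel counts to the share of a 640 x 640 image
merged$sky <- merged$sky / (640 * 640)

print(missing_ids)
print(merged)
```

Comparing the row counts before and after the join is a simple way to find out whether any downloads or segmentations are missing.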

Section 3. Summarise and analyze the results.

At the beginning of this assignment, you defined one Census Tract as walkable and the other as unwalkable. The key to the following analysis is the comparison between walkable and unwalkable Census Tracts.

Analysis 1 - Create interactive map(s) to visualize the spatial distribution of the streetscape.

You need to create maps of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. In total, you will have 10 maps.

edges_seg_output <- edges_seg_output %>%
  mutate(building = building*100,
         sky = sky*100,
         tree = tree*100,
         road = road*100,
         sidewalk = sidewalk*100) %>%
  mutate(total = building + sky + tree + road + sidewalk)
edges_seg_output
## Simple feature collection with 1688 features and 11 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -84.38145 ymin: 33.76099 xmax: -84.26535 ymax: 33.92455
## Geodetic CRS:  WGS 84
## # A tibble: 1,688 × 12
##    type      X     Y node_id building   sky  tree  road sidewalk is_walkable
##  * <chr> <dbl> <dbl>   <int>    <dbl> <dbl> <dbl> <dbl>    <dbl> <chr>      
##  1 start -84.4  33.8       1    5.52   15.0 38.9   35.0    3.25  walkable   
##  2 mid1  -84.4  33.8       2    5.30   21.7 24.9   33.3    1.28  walkable   
##  3 mid2  -84.4  33.8       3    8.48   24.5 18.1   40.1    2.70  walkable   
##  4 end   -84.4  33.8       4    7.47   30.0 14.8   34.6    7.75  walkable   
##  5 start -84.4  33.8       5   10.8    35.4  6.33  39.0    3.06  walkable   
##  6 mid1  -84.4  33.8       6   18.5    26.3  7.91  39.5    3.77  walkable   
##  7 mid2  -84.4  33.8       7   11.5    30.8  9.03  35.7    0.359 walkable   
##  8 end   -84.4  33.8       8    4.08   33.0 12.2   42.6    2.80  walkable   
##  9 start -84.4  33.8       9   10.0    38.1  8.22  39.4    2.83  walkable   
## 10 mid1  -84.4  33.8      10    0.453  18.8 33.7   35.2    6.77  walkable   
## # ℹ 1,678 more rows
## # ℹ 2 more variables: geometry <POINT [°]>, total <dbl>

I found an outlier in Doraville where sky takes up 100% of the image, probably because the GSV image at that location is faulty. To exclude such cases, I keep only the points where sky is below 90%. This removes 67 of the 1,688 points, leaving 1,621.

edges_seg_output <- edges_seg_output %>%
  filter(sky < 90)
edges_seg_output
## Simple feature collection with 1621 features and 11 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -84.38145 ymin: 33.76099 xmax: -84.26535 ymax: 33.92455
## Geodetic CRS:  WGS 84
## # A tibble: 1,621 × 12
##    type      X     Y node_id building   sky  tree  road sidewalk is_walkable
##  * <chr> <dbl> <dbl>   <int>    <dbl> <dbl> <dbl> <dbl>    <dbl> <chr>      
##  1 start -84.4  33.8       1    5.52   15.0 38.9   35.0    3.25  walkable   
##  2 mid1  -84.4  33.8       2    5.30   21.7 24.9   33.3    1.28  walkable   
##  3 mid2  -84.4  33.8       3    8.48   24.5 18.1   40.1    2.70  walkable   
##  4 end   -84.4  33.8       4    7.47   30.0 14.8   34.6    7.75  walkable   
##  5 start -84.4  33.8       5   10.8    35.4  6.33  39.0    3.06  walkable   
##  6 mid1  -84.4  33.8       6   18.5    26.3  7.91  39.5    3.77  walkable   
##  7 mid2  -84.4  33.8       7   11.5    30.8  9.03  35.7    0.359 walkable   
##  8 end   -84.4  33.8       8    4.08   33.0 12.2   42.6    2.80  walkable   
##  9 start -84.4  33.8       9   10.0    38.1  8.22  39.4    2.83  walkable   
## 10 mid1  -84.4  33.8      10    0.453  18.8 33.7   35.2    6.77  walkable   
## # ℹ 1,611 more rows
## # ℹ 2 more variables: geometry <POINT [°]>, total <dbl>
edges_seg_output_walkable <- edges_seg_output %>%
           filter(is_walkable == "walkable")

edges_seg_output_unwalkable <- edges_seg_output %>%
           filter(is_walkable == "unwalkable")
draw_map <- function(edges_seg_output_is_walkable, element){
  # Draw the tract polygon and the GSV points colored by the chosen element (%).
  
  bbox <- edges_seg_output_is_walkable %>%
    st_bbox()
  
  tm_shape(tract) + 
    tm_polygons() +
    tm_shape(edges_seg_output_is_walkable %>%
               mutate(building = round(building, 0),
                      sky = round(sky, 0),
                      tree = round(tree, 0),
                      road = round(road, 0),
                      sidewalk = round(sidewalk, 0))) +
    tm_dots(col = element, title = paste(element, "(%)"), style = "jenks", n = 5) +
    tm_view(bbox = bbox) +
    tm_layout(legend.outside = TRUE, legend.text.size = 0.2)
}
m1 <- draw_map(edges_seg_output_walkable, "building")
m2 <- draw_map(edges_seg_output_walkable, "sky")
m3 <- draw_map(edges_seg_output_walkable, "tree")
m4 <- draw_map(edges_seg_output_walkable, "road")
m5 <- draw_map(edges_seg_output_walkable, "sidewalk")

tmap_arrange(m1, m2, m3, m4, m5, nrow = 1)




m1 <- draw_map(edges_seg_output_unwalkable, "building")
m2 <- draw_map(edges_seg_output_unwalkable, "sky")
m3 <- draw_map(edges_seg_output_unwalkable, "tree")
m4 <- draw_map(edges_seg_output_unwalkable, "road")
m5 <- draw_map(edges_seg_output_unwalkable, "sidewalk")

tmap_arrange(m1, m2, m3, m4, m5, nrow = 1)




Provide a brief description of your findings from the maps.

Overall, the maps alone make it hard to see how the two study areas differ in the elements visible from the street. Still, it is worth noting that more buildings are visible along the streets in the walkable neighborhood, Inman Park, than in the unwalkable neighborhood, Doraville. This suggests an association between a dense distribution of buildings and walkability. In addition, Doraville has a hole in the middle with no observation points, which appears to be the location of the Doraville station. The area around the station has high percentages of sky but low percentages of tree, which may indicate that it is largely barren. While a clear view of the sky may be aesthetically pleasing, the lack of tree shade may make walking feel less comfortable.

Analysis 2 - Compare the means.

You need to calculate the mean of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. For example, you need to calculate the mean of building category for each of walkable and unwalkable Census Tracts. Then, you need to calculate the mean of sky category for each of walkable and unwalkable Census Tracts. In total, you will have 10 mean values. Provide a brief description of your findings.

edges_seg_output_walkable_means <- edges_seg_output_walkable %>%
  st_drop_geometry() %>%
  summarise(building = mean(building),
         sky = mean(sky),
         tree = mean(tree),
         road = mean(road),
         sidewalk = mean(sidewalk)) %>%
  mutate(is_walkable = "walkable")

edges_seg_output_unwalkable_means <- edges_seg_output_unwalkable %>%
  st_drop_geometry() %>%
  summarise(building = mean(building),
         sky = mean(sky),
         tree = mean(tree),
         road = mean(road),
         sidewalk = mean(sidewalk)) %>%
  mutate(is_walkable = "unwalkable")
edges_seg_output_means <- edges_seg_output_walkable_means %>%
  bind_rows(edges_seg_output_unwalkable_means) %>%
  relocate(is_walkable)
edges_seg_output_means
## # A tibble: 2 × 6
##   is_walkable building   sky  tree  road sidewalk
##   <chr>          <dbl> <dbl> <dbl> <dbl>    <dbl>
## 1 walkable        6.51  24.2  21.6  35.5     3.18
## 2 unwalkable      2.70  26.8  22.4  35.1     1.37

As discussed above, the walkable area has higher percentages of building (6.51% vs. 2.70%) and sidewalk (3.18% vs. 1.37%). This suggests that the sidewalk network is closely associated with a dense distribution of fairly tall (multi-story) buildings, and that these streets may improve people’s access to the facilities in those buildings. Interestingly, the percentage of sky is on average higher in the unwalkable area (26.8% vs. 24.2%). This is consistent with the idea that an open view of the sky does not necessarily make people feel comfortable walking, at least in my personal experience. Additionally, although roads for cars are usually considered unfavorable to walking because of noise and air pollution, the percentage of road is on average slightly higher in the walkable area (35.5% vs. 35.1%). This suggests that a higher percentage of road does not necessarily mean lower walkability, especially when other elements are taken into account.
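
To put the comparison in numbers, the sketch below recomputes the gaps between the two tracts from the rounded means printed in the table above. The vectors are typed in by hand from that table, so the results carry the same rounding.

```r
# Mean percentages copied from the printed table (rounded values).
walkable   <- c(building = 6.51, sky = 24.2, tree = 21.6, road = 35.5, sidewalk = 3.18)
unwalkable <- c(building = 2.70, sky = 26.8, tree = 22.4, road = 35.1, sidewalk = 1.37)

# Difference in percentage points (walkable minus unwalkable)
diff_pp <- walkable - unwalkable

# Ratio of the means (walkable divided by unwalkable)
ratio <- walkable / unwalkable

print(round(diff_pp, 2))
print(round(ratio, 2))
```

Building and sidewalk show the largest relative gaps (more than twice as high in the walkable tract), whereas sky, tree, and road differ by less than three percentage points.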

Analysis 3 - Draw boxplot

You need to draw boxplots of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas so that you can compare their distributions. Provide a brief description of your findings.

# Boxplots for the walkable tract
g1 <- edges_seg_output_walkable %>%
  st_drop_geometry() %>%
  pivot_longer(cols = c("building", "sky", "tree", "road", "sidewalk"),
               names_to = "element", values_to = "percentage") %>%
  ggplot(aes(x = element, y = percentage)) +
    geom_boxplot() +
    labs(title = "Walkable")
# Boxplots for the unwalkable tract
g2 <- edges_seg_output_unwalkable %>%
  st_drop_geometry() %>%
  pivot_longer(cols = c("building", "sky", "tree", "road", "sidewalk"),
               names_to = "element", values_to = "percentage") %>%
  ggplot(aes(x = element, y = percentage)) +
    geom_boxplot() +
    labs(title = "Unwalkable")
# Stack the two panels vertically so the elements line up
ggpubr::ggarrange(g1, g2, nrow = 2)

The boxplots provide the most detailed information on the distributions of the street elements. Because sky and tree have few outliers beyond the whiskers (which extend up to 1.5 times the interquartile range, IQR, from the box), their mean percentages are reasonably reliable indicators of the street environment. Conversely, the large number of outliers for building, road, and sidewalk means we should be more cautious when interpreting their mean values. With these nuances in mind, we can conclude that the more walkable neighborhood has higher percentages of building and sidewalk along its streets. This further suggests that access to various facilities and amenities, together with a supportive pedestrian environment, is a key factor in walkability.
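
For reference, the outlier rule behind the boxplot points can be sketched as follows: values more than 1.5 times the IQR below the first quartile or above the third quartile are drawn as individual points. The data are made up, and `find_outliers()` is a hypothetical helper, not the exact routine used by ggplot2.

```r
# Hypothetical helper: flag values beyond 1.5 * IQR from the quartiles.
find_outliers <- function(x, coef = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - coef * iqr
  upper <- q[2] + coef * iqr
  x[x < lower | x > upper]
}

# Made-up percentages: most values are small, one is extreme.
toy <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)
outliers <- find_outliers(toy)

# The extreme value pulls the mean well above the median.
print(outliers)
print(c(mean = mean(toy), median = median(toy)))
```

In this toy case the single extreme value raises the mean to almost twice the median, which is why means for elements with many outliers deserve extra caution.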