Introduction to the assignment

This assignment consists of three main sections.

In the first section, you need to select one Census Tract that you think is the most walkable and another one that you think is least walkable within Fulton and DeKalb Counties, GA. As long as they are within the two counties, you can pick any two Census Tracts you want. If the area you want to use as walkable/unwalkable area is not well-covered by a single Census Tract, you can select multiple tracts (e.g., selecting three adjacent tracts as one walkable area). The definition of ‘walkable’ can be your own - you can choose solely based on your experience (e.g., had best/worst walking experience because …), refer to Walk Score, or any other mix of criteria you want. After you make the selection, provide a short write-up of why you chose those Census Tracts.

The second section is the main part of this assignment, in which you prepare OSM data, download GSV images, and apply the computer vision technique we learned in class (i.e., semantic segmentation).

In the third section, you will summarise and analyze the output and report your findings. After you apply computer vision to the images, you will have the number of pixels in each image that belong to each of the 150 categories. You will focus on the following categories in your analysis: building, sky, tree, road, and sidewalk. Specifically, you will (1) create maps to visualize the spatial distribution of different objects, (2) compare the mean of each category between the two Census Tracts, and (3) draw boxplots to compare the distributions.

Section 0. Loading the necessary libraries

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(tidycensus)
library(osmdata)
## Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
library(sfnetworks)
library(units)
## udunits database from C:/Users/srini/AppData/Local/R/win-library/4.4/units/share/udunits/udunits2.xml
library(sf)
## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(tidygraph)
## 
## Attaching package: 'tidygraph'
## 
## The following object is masked from 'package:stats':
## 
##     filter
library(tmap)
## Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
## remotes::install_github('r-tmap/tmap')
library(here)
## here() starts at C:/Users/srini/OneDrive/Documents/Urban Analytics
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows

Section 1. Choose your Census Tracts.

Provide a brief description of your census tracts. Why do you think the Census Tracts are walkable and unwalkable? What were the contributing factors?

Whether a Census Tract is walkable usually comes down to how it’s designed and connected. Areas with lots of nearby shops, schools, and public spaces—plus sidewalks and safe streets—naturally encourage walking. Income levels, car access, and even local policies also make a difference. Walkable neighborhoods simply make it easier and safer for people to get where they need to go on foot.

For my project, I chose Census Tract 12.05 as the hypothetically more walkable tract because it is located in Midtown and has a denser street grid. The hypothetically less walkable Census Tract is 211.01 in Brookhaven, which I expect to be less walkable because of its suburban sprawl.

Section 2. OSM, GSV, and computer vision.

Step 1. Get OSM data and clean it.

The getbb() function, which we used in the class to download OSM data, isn’t suitable for downloading just two Census Tracts. We will instead use an alternative method.

  1. Using tidycensus package, download the Census Tract polygon for Fulton and DeKalb counties.
  2. Extract two Census Tracts, each of which will be your most walkable and least walkable Census Tracts.
  3. Using their bounding boxes, get OSM data.
  4. Convert them into sfnetwork objects and clean them.
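Step 2 of the list above can be sketched as follows. When a study area spans several tracts, `%in%` matches every listed GEOID, whereas `==` compares element by element (recycling the shorter vector) and can silently return nothing. This is only an illustration: a plain data frame stands in for the `tract` sf object, and the GEOIDs and population values are made up for the example.

```r
# Toy stand-in for the tract table; a real run would use the sf object from get_acs().
toy_tract <- data.frame(
  GEOID   = c("13121001205", "13121001206", "13089021101", "13089021102"),
  tot_pop = c(100, 200, 300, 400)  # made-up values for illustration
)

# Hypothetical multi-tract selection, deliberately listed in a different order
ids <- c("13121001206", "13121001205")

# `==` pairs rows with the recycled `ids` vector, so nothing matches here
picked_eq <- toy_tract[toy_tract$GEOID == ids, ]

# `%in%` tests membership, so both tracts are kept regardless of order
picked_in <- toy_tract[toy_tract$GEOID %in% ids, ]

nrow(picked_eq)
nrow(picked_in)
```

With a single GEOID the two forms behave the same, which is why the difference only shows up once several adjacent tracts are combined into one area.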
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Set up your api key here
census_api_key(Sys.getenv("census_api"))
## To install your API key for use in future sessions, run this function with `install = TRUE`.
  # **YOUR CODE HERE..**

# //TASK //////////////////////////////////////////////////////////////////////



# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract", 
                 variables = c('tot_pop' = 'B01001_001'),
                 year = 2022,
                 state = "GA", 
                 county = c("Fulton", "DeKalb"), 
                 geometry = TRUE)
## Getting data from the 2018-2022 5-year ACS
## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
# =========== NO MODIFY ZONE ENDS HERE ========================================



# TASK ////////////////////////////////////////////////////////////////////////
# The purpose of this TASK is to create one bounding box for walkable Census Tract and another bounding box for unwalkable Census Tract.
# As long as you generate what's needed for the subsequent codes, you are good. The numbered list of tasks below is to provide some hints.
# 1. Write the GEOID of walkable & unwalkable Census Tracts. e.g., tr1_ID <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using tr1_ID & tr2_ID
# 3. Create their bounding boxes using st_bbox(), and 
# 4. assign them to tract_1_bb and tract_2_bb, respectively.

# For the walkable Census Tract(s)
# 1. 
tr1_ID <- c("13121001205")
  # **YOUR CODE HERE..** --> For example, tr1_ID <- c("13121001205", "13121001206").

# 2~4
tract_1_bb <- tract %>%
  filter(GEOID %in% tr1_ID) %>% # %in% also works when several GEOIDs are listed
  st_bbox()
  # **YOUR CODE HERE..**

# For the unwalkable Census Tract(s)  
# 1.
tr2_ID <- c("13089021101") 
  # **YOUR CODE HERE..**

# 2~4
tract_2_bb <- tract %>%
  filter(GEOID %in% tr2_ID) %>% # %in% also works when several GEOIDs are listed
  st_bbox()
  # **YOUR CODE HERE..**
# //TASK //////////////////////////////////////////////////////////////////////

  
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding box
osm_1 <- opq(bbox = tract_1_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "trunk", "primary", 
                            "secondary", "tertiary", "unclassified",
                            "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()

osm_2 <- opq(bbox = tract_2_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("motorway", "trunk", "primary", 
                            "secondary", "tertiary", "unclassified",
                            "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert osm_1 and osm_2 to sfnetworks objects (set directed = FALSE)
# 2. Clean the network by (1) deleting parallel lines and loops, (2) create missing nodes, and (3) remove pseudo nodes, 
# 3. Add a new column named length using edge_length() function.
net1 <- osm_1$osm_lines %>% 
  # Drop redundant columns 
  select(osm_id, highway) %>%
  sfnetworks::as_sfnetwork(directed = FALSE) %>%
  activate("edges") %>%
  filter(!edge_is_multiple()) %>%
  filter(!edge_is_loop()) %>%
  convert(., sfnetworks::to_spatial_subdivision) %>%
  convert(., sfnetworks::to_spatial_smooth) %>% 
  mutate(length = edge_length())
## Warning: to_spatial_subdivision assumes attributes are constant over geometries
  # **YOUR CODE HERE..**

net2 <- osm_2$osm_lines %>% 
  # Drop redundant columns 
  select(osm_id, highway) %>%
  sfnetworks::as_sfnetwork(directed = FALSE) %>%
  activate("edges") %>%
  filter(!edge_is_multiple()) %>%
  filter(!edge_is_loop()) %>%
  convert(., sfnetworks::to_spatial_subdivision) %>%
  convert(., sfnetworks::to_spatial_smooth) %>% 
  mutate(length = edge_length())
## Warning: to_spatial_subdivision assumes attributes are constant over geometries
  # **YOUR CODE HERE..**
  
# //TASK //////////////////////////////////////////////////////////////////////
  
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# OSM for the walkable part
edges_1 <- net1 %>% 
  # Extract 'edges'
  st_as_sf("edges") %>% 
  # Drop segments that are too short (100m)
  mutate(length = as.vector(length)) %>% 
  filter(length > 100) %>% 
  # Add a unique ID for each edge
  mutate(edge_id = seq(1,nrow(.)),
         is_walkable = "walkable")

# OSM for the unwalkable part
edges_2 <- net2 %>% 
  # Extract 'edges'
  st_as_sf("edges") %>%
  # Drop segments that are too short (100m)
  mutate(length = as.vector(length)) %>% 
  filter(length > 100) %>% 
  # Add a unique ID for each edge
  mutate(edge_id = seq(1,nrow(.)),
         is_walkable = "unwalkable")

# Merge the two
edges <- bind_rows(edges_1, edges_2)
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 2. Define getAzimuth() function.

getAzimuth <- function(line){
  # This function takes one edge (i.e., a street segment) as an input and
  # outputs a data frame with four points (start, mid1, mid2, and end) and their azimuth.
  
  
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. From `line` object, extract the coordinates using st_coordinates() and extract the first two rows.
  # 2. Use atan2() function to calculate the azimuth in degree. 
  #    Make sure to adjust the value such that 0 is north, 90 is east, 180 is south, and 270 is west.
  # 1
  start_p <- line %>%
    st_coordinates() %>% 
    .[1:2,1:2]
    # **YOUR CODE HERE..**

  # 2
  # %% 360 wraps negative angles so the result is always in 0-360 (e.g., west = 270)
  start_azi <- (atan2(start_p[2,"X"] - start_p[1, "X"],
                      start_p[2,"Y"] - start_p[1, "Y"])*180/pi) %% 360
    # **YOUR CODE HERE..** --> For example, atan2()..
  # //TASK //////////////////////////////////////////////////////////////////////

    
    
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # Repeat what you did above, but for last two rows (instead of the first two rows).
  # Remember to flip the azimuth so that the camera would be looking at the street that's being measured
  end_p <- line %>%
    st_coordinates() %>% 
    .[(nrow(.)-1):nrow(.),1:2]
    # **YOUR CODE HERE..**
    
  # %% 360 wraps negative angles so the result is always in 0-360 before flipping
  end_azi <- (atan2(end_p[2,"X"] - end_p[1, "X"],
                    end_p[2,"Y"] - end_p[1, "Y"])*180/pi) %% 360
    # **YOUR CODE HERE..** --> For example, atan2()..
    
  end_azi <- if (end_azi < 180) {end_azi + 180} else {end_azi - 180}
  # //TASK //////////////////////////////////////////////////////////////////////
  
  
  
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. From `line` object, use st_line_sample() function to generate points at 0.45 and 0.55 locations. These two points will be used to calculate the azimuth.
  # 2. Use st_cast() function to convert 'MULTIPOINT' object to 'POINT' object.
  # 3. Extract coordinates using st_coordinates().
  # 4. Use atan2() function to calculate azimuth.
  # 5. Use st_line_sample() again to generate a point at 0.5 location and get its coordinates. This point will be the location at which GSV image will be downloaded.
  
  mid_p <- line %>%
    st_geometry() %>% 
    .[[1]] %>% 
    st_line_sample(sample = c(0.45, 0.55)) %>% 
    st_cast("POINT") %>% 
    st_coordinates()
    # **YOUR CODE HERE..** --> For 0.45 & 0.55 points
  
  # %% 360 wraps negative angles so the result is always in 0-360
  mid_azi <- (atan2(mid_p[2,"X"] - mid_p[1, "X"],
                    mid_p[2,"Y"] - mid_p[1, "Y"])*180/pi) %% 360
    # **YOUR CODE HERE..** For example, atan2()..
  
  mid_p <- line %>%
    st_geometry() %>% 
    .[[1]] %>% 
    st_line_sample(sample = 0.5) %>% 
    st_coordinates() %>% 
    .[1,1:2]
    # **YOUR CODE HERE..** --> For 0.5 point
  # //TASK //////////////////////////////////////////////////////////////////////
 
    
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  return(tribble(
    ~type,    ~X,            ~Y,             ~azi,
    "start",   start_p[1,"X"], start_p[1,"Y"], start_azi,
    "mid1",    mid_p["X"],   mid_p["Y"],   mid_azi,
    "mid2",    mid_p["X"],   mid_p["Y"],   ifelse(mid_azi < 180, mid_azi + 180, mid_azi - 180),
    "end",     end_p[2,"X"],   end_p[2,"Y"],   end_azi))
  # =========== NO MODIFY ZONE ENDS HERE ========================================

}
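
The azimuth arithmetic used in getAzimuth() can be checked on its own with a few known directions. The sketch below is a simplified illustration, not part of the assignment code: `bearing()` and `flip()` are hypothetical helper names, and coordinates are treated as planar, which is a reasonable approximation only for short street segments.

```r
# Bearing from point 1 to point 2, in degrees clockwise from north (0-360).
# atan2(dx, dy), rather than atan2(dy, dx), makes 0 point north and 90 point east.
bearing <- function(x1, y1, x2, y2) {
  (atan2(x2 - x1, y2 - y1) * 180 / pi) %% 360
}

# Reverse a bearing so the camera looks back along the segment.
flip <- function(azi) {
  (azi + 180) %% 360
}

north <- bearing(0, 0, 0, 1)   # moving up the Y axis
east  <- bearing(0, 0, 1, 0)   # moving along the X axis
west  <- bearing(0, 0, -1, 0)  # atan2 alone would give a negative angle here
back  <- flip(east)            # looking the opposite way from east
```

The `%% 360` step matters for westward segments: without it, atan2() returns a negative angle rather than a value between 180 and 360.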

Step 3. Apply the function to all street segments

We can now apply the getAzimuth() function to every edge in the edges object. We then bind the attribute columns of edges (e.g., the is_walkable column) back onto the result, repeating each edge's row four times so that it lines up with that edge's four sample points. When you are finished with this code chunk, you will be ready to download GSV images.
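
Because getAzimuth() returns four rows per edge, the edge attributes have to be repeated four times before they are bound on. A minimal base R sketch of that alignment, using made-up toy tables rather than the real edges object:

```r
# Toy edge attributes: two street segments
toy_edges <- data.frame(edge_id = 1:2,
                        is_walkable = c("walkable", "unwalkable"))

# Four sample points per edge, listed in edge order
toy_points <- data.frame(type = rep(c("start", "mid1", "mid2", "end"), times = 2))

# Repeat each edge row four times so row i of the points lines up with its edge
idx <- rep(seq_len(nrow(toy_edges)), each = 4)
toy_joined <- cbind(toy_points, toy_edges[idx, ])
```

This only works if the points are still in the same order as the edges, which is why the binding happens immediately after the function is applied.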

# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzimuth() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzimuth()
edges_azi <- edges %>%
  st_geometry() %>% 
  map_df(getAzimuth) 
  # **YOUR CODE HERE..**

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
edges_azi <- edges_azi %>% 
  bind_cols(edges %>% 
              st_drop_geometry() %>% 
              slice(rep(1:nrow(edges),each=4))) %>% 
  st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>% 
  mutate(node_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 4. Define a function that formats the request URL and downloads images

getImage <- function(iterrow){
  # This function takes one row of edges_azi and downloads GSV image using the information from edges_azi.
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # Finish this function definition.
  # 1. Extract required information from the row of edges_azi, including 
  #    type (i.e., start, mid1, mid2, end), location, heading, edge_id, node_id, and key.
  # 2. Format the full URL and store it in `request`. Refer to this page: https://developers.google.com/maps/documentation/streetview/request-streetview
  # 3. Format the full path (including the file name) of the image being downloaded and store it in `fpath`
  type <- iterrow$type
  location <- paste0(iterrow$Y %>% round(4), ",", iterrow$X %>% round(4))
  heading <- iterrow$azi %>% round(1)
  edge_id <- iterrow$edge_id
  node_id <- iterrow$node_id
  key <- Sys.getenv("google_api")
  
  # URL template; glue fills {location}, {heading}, and {key} from the variables above
  endpoint <- "https://maps.googleapis.com/maps/api/streetview?size=640x640&location={location}&heading={heading}&fov=90&pitch=0&key={key}"
  
  furl <- glue::glue(endpoint)
  
  fname <- glue::glue("GSV-nid_{node_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg") # Don't change this code for fname
  # here() already starts at the project folder, so only the sub-folder is needed
  fpath <- here("Image", fname)
  # //TASK //////////////////////////////////////////////////////////////////////

  
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  # Download images
  if (!file.exists(fpath)){
    download.file(furl, fpath, mode = 'wb') 
  }
  # =========== NO MODIFY ZONE ENDS HERE ========================================
}
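
To inspect a request URL before spending API quota, the formatting step can be run on its own. The sketch below uses base R sprintf() instead of glue; `build_gsv_url()` is a hypothetical helper, the coordinates are arbitrary, and "YOUR_KEY" is a placeholder rather than a working key.

```r
# Format a Street View Static API request from a latitude, longitude, and heading.
# The query parameters mirror the ones used in getImage().
build_gsv_url <- function(lat, lon, heading, key) {
  sprintf(
    "https://maps.googleapis.com/maps/api/streetview?size=640x640&location=%s,%s&heading=%s&fov=90&pitch=0&key=%s",
    round(lat, 4), round(lon, 4), round(heading, 1), key
  )
}

# Arbitrary example values; note that location is latitude first, then longitude
url <- build_gsv_url(33.78123, -84.38456, 270.04, "YOUR_KEY")
print(url)
```

Printing the URL makes it easy to confirm that latitude comes before longitude, which is the reverse of the X, Y order stored in edges_azi.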

Step 5. Download GSV images

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Loop!
for (i in seq(1,nrow(edges_azi))){
  getImage(edges_azi[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 6. Apply computer vision

Now, use Google Colab to apply the semantic segmentation model. Zip your images and upload the images to your Colab session.

Step 7. Merging the processed data back to R

Once all of the images are processed and saved in your Colab session as a CSV file, download the CSV file and merge it back to edges.

# TASK ////////////////////////////////////////////////////////////////////////
# Read the downloaded CSV file from Google Colab
# here() resolves the path relative to the project folder
seg_output <- read.csv(here("seg_output.csv"))
  # **YOUR CODE HERE..**


# //TASK ////////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Join the seg_output object back to edges_azi object using node_id as the join key.
edges_seg_output <- edges_azi %>% 
  inner_join(seg_output, by=c("node_id" = "img_id")) %>% 
  select(type, X, Y, node_id, building, sky, tree, road, sidewalk, is_walkable) %>% 
  mutate(across(c(building, sky, tree, road, sidewalk), function(x) x/(640*640)))
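
The join and the conversion from pixel counts to proportions can be illustrated on a toy table. The sketch below uses base R merge() in place of inner_join(), and the pixel counts are made up rather than real model output.

```r
# Made-up pixel counts for two 640x640 images
toy_seg <- data.frame(img_id   = c(1, 2),
                      building = c(102400, 0),
                      sky      = c(204800, 409600))

# Three sample points; node 3 has no processed image
toy_nodes <- data.frame(node_id     = c(1, 2, 3),
                        is_walkable = c("walkable", "unwalkable", "unwalkable"))

# Inner join: node 3 drops out because it has no match in toy_seg
toy_out <- merge(toy_nodes, toy_seg, by.x = "node_id", by.y = "img_id")

# Convert counts to a share of the 640 * 640 = 409600 pixels in each image
cols <- c("building", "sky")
toy_out[cols] <- toy_out[cols] / (640 * 640)
```

Because the join is an inner join, any point whose image failed to download or was not processed disappears from the analysis, so it is worth comparing the row counts before and after.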

Section 3. Summarise and analyze the results.

At the beginning of this assignment, you defined one Census Tract as walkable and the other as unwalkable. The key to the following analysis is the comparison between walkable and unwalkable Census Tracts.

Analysis 1 - Create interactive map(s) to visualize the spatial distribution of the streetscape.

You need to create maps of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. In total, you will have 10 maps.

Provide a brief description of your findings from the maps.

# TASK ////////////////////////////////////////////////////////////////////////
# Create interactive map(s) to visualize the `edges_seg_output` objects. 
# As long as you can deliver the message clearly, you can use any format/package you want.


# //TASK //////////////////////////////////////////////////////////////////////

edges_seg_output_final <- edges_seg_output %>% 
  mutate(pct_building = building*100) %>% 
  mutate(pct_sky = sky*100) %>% 
  mutate(pct_tree = tree*100) %>% 
  mutate(pct_road = road*100) %>% 
  mutate(pct_sidewalk = sidewalk*100)
  
walkable <- edges_seg_output_final %>% 
  filter(is_walkable == "walkable")

unwalkable <- edges_seg_output_final %>% 
  filter(is_walkable == "unwalkable")

Amount of Buildings

tmap_mode("view")
## tmap mode set to interactive viewing
b_w <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "walkable"))+
  tm_dots(col = "pct_building", style = "pretty") + tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Buildings(Walkable)")

b_u <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "unwalkable"))+
  tm_dots(col = "pct_building", style = "pretty")+ tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Buildings(Unwalkable)")

#map both together
tmap_arrange(b_w,b_u) 

Analysis: The maps show that images in the Midtown tract contain a much larger share of building pixels than images in the Brookhaven tract. In Midtown, the building share per image ranges from about 0-60%, while in Brookhaven it ranges from about 0-30%. This difference is likely due to Midtown’s taller buildings, which fill more of each image, whereas Brookhaven consists mostly of 1-2 story buildings and single-family houses. Additionally, Midtown’s buildings appear more organized, which may positively contribute to the area’s walkability.

Amount of Sky

tmap_mode("view")
## tmap mode set to interactive viewing
s_w <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "walkable"))+
  tm_dots(col = "pct_sky", style = "pretty") + tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Sky(Walkable)")

s_u <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "unwalkable"))+
  tm_dots(col = "pct_sky", style = "pretty")+ tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Sky(Unwalkable)")

#map both together
tmap_arrange(s_w,s_u)

Analysis: Brookhaven has more locations where open sky fills a large share of the image than Midtown does. This suggests more natural daylight at street level and potentially a more pleasant walking environment during the daytime.

Amount of Tree

tmap_mode("view")
## tmap mode set to interactive viewing
t_w <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "walkable"))+
  tm_dots(col = "pct_tree", style = "pretty") + tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Tree(Walkable)")

t_u <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "unwalkable"))+
  tm_dots(col = "pct_tree", style = "pretty")+ tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Tree(Unwalkable)")

#map both together
tmap_arrange(t_w,t_u) 

Analysis: The map indicates that tree concentration in Midtown varies across areas, while in Brookhaven, it appears more evenly distributed. This may be because suburbs, like Brookhaven, have lower-density development with more open space, whereas Midtown prioritizes buildings and infrastructure, limiting room for trees. Overall, the hypothetically unwalkable census tract seems to have more trees than the walkable one.

Amount of Road

tmap_mode("view")
## tmap mode set to interactive viewing
r_w <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "walkable"))+
  tm_dots(col = "pct_road", style = "pretty") + tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Roads(Walkable)")

r_u <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "unwalkable"))+
  tm_dots(col = "pct_road", style = "pretty")+ tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Roads(Unwalkable)")

#map both together
tmap_arrange(r_w,r_u) 

Analysis: The map reveals that nearly all images display around 30%-50% road coverage, which is expected, as Google Street View images are captured along roadways. This road-centric perspective means that both maps, despite representing different tracts (Midtown and Brookhaven), show a similar proportion of road area in the images.

Amount of Sidewalks

tmap_mode("view")
## tmap mode set to interactive viewing
sw_w <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "walkable"))+
  tm_dots(col = "pct_sidewalk", style = "pretty") + tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Sidewalk(Walkable)")

sw_u <- tm_shape(edges_seg_output_final %>% 
                  filter(is_walkable == "unwalkable"))+
  tm_dots(col = "pct_sidewalk", style = "pretty")+ tm_layout(legend.width = 0.25)+ tm_layout(title = "Percent of Sidewalk(Unwalkable)")

#map both together
tmap_arrange(sw_w,sw_u) 

Analysis: Both maps suggest that the amount of sidewalk coverage is similar on average across the Midtown and Brookhaven tracts, though Midtown has a slightly higher concentration of sidewalks. Since sidewalks are a key factor in evaluating a tract’s walkability, these findings indicate that neither area has particularly high walkability despite the presence of sidewalks.

Conclusion: It’s difficult to draw clear conclusions from the maps alone. The five categories chosen might not effectively indicate walkability, or the algorithm analyzing the Street View images may have errors. Additionally, Street View photos taken from a car in the middle of the road may not accurately capture walkable features. It’s also possible that Midtown isn’t as walkable as assumed, and the suburbs of Brookhaven might be more walkable than anticipated. As is often the case, further analysis is required.

Analysis 2 - Compare the means.

You need to calculate the mean of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. For example, you need to calculate the mean of building category for each of walkable and unwalkable Census Tracts. Then, you need to calculate the mean of sky category for each of walkable and unwalkable Census Tracts. In total, you will have 10 mean values. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Perform the calculation as described above.
# As long as you can deliver the message clearly, you can use any format/package you want.
#Walkable values
w_building_pctmean <- mean(walkable$pct_building)

w_sky_pctmean <- mean(walkable$pct_sky)

w_tree_pctmean <- mean(walkable$pct_tree)

w_road_pctmean <- mean(walkable$pct_road)

w_sidewalk_pctmean <- mean(walkable$pct_sidewalk)

#Unwalkable Values

u_building_pctmean <- mean(unwalkable$pct_building)

u_sky_pctmean <- mean(unwalkable$pct_sky)

u_tree_pctmean <- mean(unwalkable$pct_tree)

u_road_pctmean <- mean(unwalkable$pct_road)

u_sidewalk_pctmean <- mean(unwalkable$pct_sidewalk)

#display means

a <- c("Buildings", "Sky", "Trees", "Roads", "Sidewalks")

walkable_means <- c(w_building_pctmean, w_sky_pctmean, w_tree_pctmean, w_road_pctmean, w_sidewalk_pctmean)

unwalkable_means <- c(u_building_pctmean, u_sky_pctmean, u_tree_pctmean, u_road_pctmean, u_sidewalk_pctmean)

df_means <- data.frame(a, walkable_means, unwalkable_means)

print(df_means)
##           a walkable_means unwalkable_means
## 1 Buildings      22.532415         2.264462
## 2       Sky      13.825158        26.211629
## 3     Trees      17.344912        26.074733
## 4     Roads      34.994066        31.325809
## 5 Sidewalks       6.239912         1.535506
# //TASK //////////////////////////////////////////////////////////////////////

Analysis: This table compares the average proportions of different features (buildings, sky, trees, roads, sidewalks) in walkable and unwalkable tracts. Walkable tracts have a higher average percentage of buildings (22.5%) and sidewalks (6.2%) than unwalkable tracts, which have only 2.3% buildings and 1.5% sidewalks. Unwalkable tracts, however, show a higher percentage of sky (26.2%) and trees (26.1%), likely indicating the more open, less dense spaces typical of suburban areas. Roads are fairly similar between the two (35.0% versus 31.3%), consistent with Google Street View’s road-based image capture. This pattern is consistent with the expectation that denser areas with more sidewalks and buildings are more walkable, although a comparison of two hand-picked areas cannot establish that as a general rule.

The higher percentage of sidewalks in Midtown compared to Brookhaven supports the previous assumption that Midtown is the more walkable area, with sidewalks being a crucial factor in determining walkability.
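The ten means can also be produced in a single grouped summary instead of ten separate `mean()` calls. The following is a minimal sketch, not the method used above: `toy` is a made-up table, and the column names (`is_walkable`, `pct_building` through `pct_sidewalk`) are assumed to match those in the segmentation output.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the segmentation output (values are made up for illustration)
toy <- tibble(
  is_walkable  = c("walkable", "walkable", "unwalkable", "unwalkable"),
  pct_building = c(20, 30, 2, 4),
  pct_sky      = c(10, 14, 25, 27),
  pct_tree     = c(15, 19, 24, 28),
  pct_road     = c(34, 36, 30, 32),
  pct_sidewalk = c(5, 7, 1, 2)
)

means_table <- toy %>%
  group_by(is_walkable) %>%
  # One mean per pct_ column within each group
  summarise(across(starts_with("pct_"), mean), .groups = "drop") %>%
  # Reshape so each category is a row and each area is a column
  pivot_longer(-is_walkable, names_to = "category", values_to = "mean_pct") %>%
  pivot_wider(names_from = is_walkable, values_from = mean_pct)

print(means_table)
```

The result has one row per category and one column per area, the same layout as `df_means`. If the real table is an `sf` object, dropping the geometry first (for example with `sf::st_drop_geometry()`) keeps the summary from carrying geometry along.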

Analysis 3 - Draw boxplot

You need to draw boxplots that compare the distributions of the proportion of building, sky, tree, road, and sidewalk between walkable and unwalkable areas. For example, you need to draw a boxplot of the building category for each of the walkable and unwalkable Census Tracts, and then do the same for the sky category, and so on. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Create boxplot(s) using ggplot2 package.

# Reshape to long format: one row per image and category
edges_seg_output_longer <- edges_seg_output_final %>%
  pivot_longer(
    cols = pct_building:pct_sidewalk,
    names_to = "Image_Elements",
    values_to = "Percentages"
  )

# One panel per category, comparing walkable and unwalkable tracts
ggplot(data = edges_seg_output_longer) +
  geom_boxplot(mapping = aes(x = is_walkable, y = Percentages),
               color = "black", outlier.size = 0.5, linewidth = 0.25) +
  labs(x = "",
       y = "Percentage",
       title = "Boxplots Comparing Image Elements",
       subtitle = "Walkable vs. Unwalkable Census Tracts") +
  facet_wrap(~Image_Elements, scales = "free_y",
             labeller = labeller(Image_Elements =
               c("pct_building" = "Building",
                 "pct_sky" = "Sky",
                 "pct_tree" = "Trees",
                 "pct_road" = "Roads",
                 "pct_sidewalk" = "Sidewalks"))) +
  theme_bw()

# //TASK //////////////////////////////////////////////////////////////////////

Analysis: The box plots are consistent with the previous two analyses. Roads are distributed similarly across walkable and unwalkable census tracts, while buildings and sidewalks differ notably, with higher percentages in walkable areas. Sky and trees have distribution patterns similar to each other, and both take up a noticeably larger share of the image in unwalkable tracts than in walkable ones. Overall, walkable tracts tend to have a higher presence of buildings and sidewalks, whereas unwalkable tracts are characterized by more natural elements like sky and trees.
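The differences described above are read off the plots by eye. One way to put numbers behind them is to print the quartiles that define each box and run a Welch two-sample t-test on a category. The sketch below uses made-up sidewalk percentages in a hypothetical `toy` data frame; it is not part of the original analysis, and because neighbouring Street View images are not independent observations, any p-value obtained this way should be treated as a rough indication only.

```r
# Made-up sidewalk percentages for illustration only
toy <- data.frame(
  is_walkable  = rep(c("walkable", "unwalkable"), each = 5),
  pct_sidewalk = c(5.8, 6.4, 7.1, 5.2, 6.9, 1.2, 1.9, 0.8, 2.1, 1.5)
)

# Quartiles per group: the numbers behind the box edges and the median line
quartiles <- lapply(
  split(toy$pct_sidewalk, toy$is_walkable),
  quantile, probs = c(0.25, 0.5, 0.75)
)
print(quartiles)

# Welch two-sample t-test (does not assume equal variances in the two groups)
test_result <- t.test(pct_sidewalk ~ is_walkable, data = toy)
print(test_result$p.value)
```

The same two steps could be repeated for each of the five categories, replacing `pct_sidewalk` with the column of interest.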