Task Description

Exploring Walkability Through Street View and Computer Vision

This assignment is divided into three main sections.

In the first section, you will select two Census Tracts within Fulton and DeKalb Counties, GA — one that you believe is the most walkable and another that is the least walkable. You may choose any tracts within these two counties. If the area you want to analyze is not well represented by a single tract, you may select multiple adjacent tracts (e.g., two contiguous tracts as one “walkable area”). The definition of walkable is up to you — it can be based on your personal experience (e.g., places where you’ve had particularly good or bad walking experiences), Walk Score data, or any combination of criteria. After making your selections, provide a brief explanation of why you chose those tracts.

The second section is the core of this assignment. You will prepare OpenStreetMap (OSM) data, download Google Street View (GSV) images, and apply the computer vision technique covered in class — semantic segmentation.

In the third section, you will summarize and analyze the results. After applying computer vision to the images, you will obtain pixel counts for 19 different object categories. Using the data, you will:

Create maps to visualize the spatial distribution of these objects,
Draw boxplots to compare their distributions between the walkable and unwalkable tracts, and
Perform t-tests to examine the differences in mean values and their statistical significance.
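
As a rough sketch of the comparison described in this section, the block below draws a boxplot and runs a two-sample t-test on simulated values. Nothing here is real segmentation output: the column names `is_walkable` and `pct_sidewalk`, the group sizes, and the means are assumptions chosen for illustration only.

```r
# Simulated data standing in for per-image sidewalk proportions (made-up values)
set.seed(42)
toy <- data.frame(
  is_walkable  = rep(c(TRUE, FALSE), each = 30),
  pct_sidewalk = c(rnorm(30, mean = 0.06, sd = 0.02),
                   rnorm(30, mean = 0.03, sd = 0.02))
)

# Boxplot comparing the two groups
boxplot(pct_sidewalk ~ is_walkable, data = toy,
        xlab = "Walkable tract?", ylab = "Sidewalk proportion")

# t.test() with a formula runs Welch's unequal-variance test by default
res <- t.test(pct_sidewalk ~ is_walkable, data = toy)
res$estimate  # group means
res$p.value   # significance of the difference in means
```

With real data, the same call would be repeated for each object category of interest.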

Section 0. Packages

Importing the necessary packages is part of this assignment. Add any required packages to the code chunk below as you progress through the tasks.

library(tidytransit)
library(gtfsrouter)

## Registered S3 method overwritten by 'gtfsrouter':
##   method       from  
##   summary.gtfs gtfsio

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(stringr)
library(sf)

## Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE

library(tidycensus)
library(osmdata)

## Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright

library(sfnetworks)
library(tidygraph)

## 
## Attaching package: 'tidygraph'

## The following object is masked from 'package:stats':
## 
##     filter

library(units)

## udunits database from /usr/share/xml/udunits/udunits2.xml

library(purrr)
library(hms)
library(leaflet)
library(htmltools)
library(tm)

## Loading required package: NLP

library(tmap)

## Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
## remotes::install_github('r-tmap/tmap')

Section 1. Choose your Census Tracts.

Use the Census Tract map in the following code chunk to identify the GEOIDs of the tracts you consider walkable and unwalkable.

# TASK ////////////////////////////////////////////////////////////////////////
# Set up your api key here (replace the placeholder with your own key; do not share a real key)
census_api_key("YOUR_CENSUS_API_KEY", install = TRUE, overwrite = TRUE)

## Your original .Renviron will be backed up and stored in your R HOME directory if needed.

## Your API key has been stored in your .Renviron and can be accessed by Sys.getenv("CENSUS_API_KEY"). 
## To use now, restart R or run `readRenviron("~/.Renviron")`

## [1] "YOUR_CENSUS_API_KEY"

# //TASK //////////////////////////////////////////////////////////////////////
tm_mode <- function(mode = c("plot","view")) {
  mode <- match.arg(mode)
  tmap_mode(mode)
}
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract", 
                 variables = c('pop' = 'B01001_001'),
                 year = 2023,
                 state = "GA", 
                 county = c("Fulton", "DeKalb"), 
                 geometry = TRUE)

## Getting data from the 2019-2023 5-year ACS

## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.

tm_mode("view")

## tmap mode set to interactive viewing

tm_basemap("OpenStreetMap") +
  tm_shape(tract) + 
  tm_polygons(fill_alpha = 0.2)

# =========== NO MODIFY ZONE ENDS HERE ========================================

Once you have the GEOIDs, create two Census Tract objects – one representing your most walkable area and the other your least walkable area.

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Specify the GEOIDs of your walkable and unwalkable Census Tracts. 
#    e.g., tr_id_walkable <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using `tr_id_walkable` and `tr_id_unwalkable`

# For the walkable Census Tract(s)
tr_id_walkable <- c(
  "13121005000"  
)
tract_walkable <- tract %>% 
  dplyr::filter(GEOID %in% tr_id_walkable)

# For the unwalkable Census Tract(s)
tr_id_unwalkable <- c(
  "13121002600" 
)

tract_unwalkable <- tract %>% 
  dplyr::filter(GEOID %in% tr_id_unwalkable)


# //TASK //////////////////////////////////////////////////////////////////////


# TASK ////////////////////////////////////////////////////////////////////////
# Create an interactive map showing `tract_walkable` and `tract_unwalkable`

tmap_mode("view")

## tmap mode set to interactive viewing

tm_basemap("OpenStreetMap") +
  tm_shape(tract) +
  tm_polygons(col = "grey90", border.col = "white", alpha = 0.2) +
  tm_shape(tract_walkable) +
  tm_borders(col = "darkgreen", lwd = 3) +
  tm_fill(col = "green", alpha = 0.3) +
  tm_shape(tract_unwalkable) +
  tm_borders(col = "darkred", lwd = 3) +
  tm_fill(col = "red", alpha = 0.3)

# //TASK //////////////////////////////////////////////////////////////////////

Explanation of Selected Census Tracts

Walkable Census Tract (Green Area, GEOID: 13121005000)

I selected this tract because it sits in a part of Atlanta with strong walkability characteristics: dense street grids, short block lengths, mixed-use development, and close proximity to amenities such as shops, restaurants, and transit stops. Sidewalks are continuous, crossings are frequent, and traffic speeds are generally lower. These characteristics align with common indicators of walkability — fine-grained street networks, pedestrian infrastructure, and nearby destinations.

Unwalkable Census Tract (Red Area, GEOID: 13121002600)

I chose this tract because I live in the area. It appears much less walkable because of limited pedestrian infrastructure, longer blocks, and a more automobile-oriented urban form. Land uses are separated, which lengthens the distance between origins and destinations. Sidewalk coverage is inconsistent, intersections are spaced far apart, and roads carry faster traffic. These factors reduce pedestrian safety and comfort and make the tract feel unwalkable.

Section 2. OSM, GSV, and Computer Vision.

Step 1. Get and clean OSM data.

To obtain the OSM network for your selected Census Tracts: (1) Create bounding boxes. (2) Use the bounding boxes to download OSM data. (3) Convert the data into an sfnetwork object and clean it.

# TASK ////////////////////////////////////////////////////////////////////////

# For the walkable Census Tract(s)
tract_walkable_bb <- tract_walkable %>%
  sf::st_bbox() %>%
  sf::st_as_sfc()

# For the unwalkable Census Tract(s)  
tract_unwalkable_bb <- tract_unwalkable %>%
  sf::st_bbox() %>%
  sf::st_as_sfc()

# //TASK //////////////////////////////////////////////////////////////////////


# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding boxes
osm_walkable <- opq(bbox = tract_walkable_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("primary", "secondary", "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()

osm_unwalkable <- opq(bbox = tract_unwalkable_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("primary", "secondary", "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================


# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert `osm_walkable` and `osm_unwalkable` into sfnetwork objects (as undirected networks),
# 2. Clean the network by (1) deleting parallel lines and loops, (2) creating missing nodes, and (3) removing pseudo nodes (make sure the `summarise_attributes` argument is set to 'first' when doing so).

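net_walkable <- osm_walkable$osm_lines %>%
  select(osm_id, highway, geometry) %>%
  sfnetworks::as_sfnetwork(directed = FALSE) %>%
  tidygraph::activate("edges") %>%
  # (1) drop parallel (multiple) edges and loops
  dplyr::filter(!tidygraph::edge_is_multiple(), !tidygraph::edge_is_loop()) %>%
  # (2) create missing nodes where edges share interior points
  tidygraph::convert(sfnetworks::to_spatial_subdivision, .clean = TRUE) %>%
  # (3) remove pseudo nodes, keeping the first attribute value of merged edges
  tidygraph::convert(sfnetworks::to_spatial_smooth,
                     .clean = TRUE, summarise_attributes = "first")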

## Warning: to_spatial_subdivision assumes attributes are constant over geometries

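net_unwalkable <- osm_unwalkable$osm_lines %>%
  select(osm_id, highway, geometry) %>%
  sfnetworks::as_sfnetwork(directed = FALSE) %>%
  tidygraph::activate("edges") %>%
  # (1) drop parallel (multiple) edges and loops
  dplyr::filter(!tidygraph::edge_is_multiple(), !tidygraph::edge_is_loop()) %>%
  # (2) create missing nodes where edges share interior points
  tidygraph::convert(sfnetworks::to_spatial_subdivision, .clean = TRUE) %>%
  # (3) remove pseudo nodes, keeping the first attribute value of merged edges
  tidygraph::convert(sfnetworks::to_spatial_smooth,
                     .clean = TRUE, summarise_attributes = "first")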

## Warning: to_spatial_subdivision assumes attributes are constant over geometries

# //TASK //////////////////////////////////////////////////////////////////////
  
  
# TASK //////////////////////////////////////////////////////////////////////
# Using `net_walkable` and `net_unwalkable`,
# 1. Activate the edge component of each network.
# 2. Create a `length` column.
# 3. Filter out short (<300 feet) segments.
# 4. Randomly sample 100 rows per road type.
# 5. Assign the results to `edges_walkable` and `edges_unwalkable`, respectively.

# OSM for the walkable part
edges_walkable <- net_walkable %>%
  tidygraph::activate("edges") %>%
  dplyr::mutate(length = as.numeric(sf::st_length(geometry))) %>%
  dplyr::filter(length >= 91.44) %>%   # 300 feet = 91.44 m
  sf::st_as_sf() %>%
  dplyr::group_by(highway) %>%
  # sample without replacement so no segment is drawn twice;
  # road types with fewer than 100 segments keep all of them
  dplyr::slice_sample(n = 100) %>%
  dplyr::ungroup()
# OSM for the unwalkable part
edges_unwalkable <- net_unwalkable %>%
  tidygraph::activate("edges") %>%
  dplyr::mutate(length = as.numeric(sf::st_length(geometry))) %>%
  dplyr::filter(length >= 91.44) %>%   # 300 feet = 91.44 m
  sf::st_as_sf() %>%
  dplyr::group_by(highway) %>%
  # sample without replacement so no segment is drawn twice;
  # road types with fewer than 100 segments keep all of them
  dplyr::slice_sample(n = 100) %>%
  dplyr::ungroup()
# //TASK //////////////////////////////////////////////////////////////////////
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Merge the two
edges <- bind_rows(edges_walkable %>% mutate(is_walkable = TRUE), 
                   edges_unwalkable %>% mutate(is_walkable = FALSE)) %>% 
  mutate(edge_id = seq(1,nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
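
The length filter and per-road-type sampling can be checked on a small table before running them on the real network. The sketch below uses base R and made-up lengths. `toy_edges` and `sample_per_group()` are hypothetical names, and sampling is done without replacement so that no segment is picked twice.

```r
# Toy edge table; column names mirror the ones used above, values are made up
set.seed(1)
toy_edges <- data.frame(
  highway = c(rep("residential", 6), rep("primary", 2)),
  length  = c(50, 95, 120, 200, 310, 80, 400, 60)  # metres
)

# 300 ft expressed in metres (1 ft = 0.3048 m)
min_length_m <- 300 * 0.3048

# Keep only segments that are at least 300 ft long
long_edges <- toy_edges[toy_edges$length >= min_length_m, ]

# Sample at most `n` rows per group, without replacement
sample_per_group <- function(df, group_col, n) {
  parts  <- split(df, df[[group_col]])
  picked <- lapply(parts, function(g) {
    g[sample(nrow(g), min(n, nrow(g))), , drop = FALSE]
  })
  do.call(rbind, picked)
}

sampled <- sample_per_group(long_edges, "highway", n = 3)
table(sampled$highway)
```

A road type with fewer qualifying segments than `n` simply contributes all of its segments.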

Step 2. Define `getAzimuth()` function.

getAzimuth <- function(line){

  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. Use the `st_line_sample()` function to sample three points at locations 0.48, 0.5, and 0.52 along the line. These points will be used to calculate the azimuth.
  # 2. Use `st_cast()` function to convert the 'MULTIPOINT' object into a 'POINT' object.
  # 3. Extract coordinates using `st_coordinates()`.
  # 4. Assign the coordinates of the midpoint to `mid_p`.
  # 5. Calculate the azimuths from the midpoint in both directions and save them as `mid_azi_1` and `mid_azi_2`, respectively.
  
  # 1-3
  mid_p3 <- line %>%
    sf::st_line_sample(sample = c(0.48, 0.50, 0.52)) %>%
    sf::st_cast("POINT") %>%
    sf::st_coordinates()
  
  # 4
  mid_p <- mid_p3[2, c("X", "Y")]
  
  # 5
  mid_azi_1 <- geosphere::bearing(
    p1 = c(mid_p["X"], mid_p["Y"]),
    p2 = c(mid_p3[1, "X"], mid_p3[1, "Y"])
  )
  
  mid_azi_2 <- geosphere::bearing(
    p1 = c(mid_p["X"], mid_p["Y"]),
    p2 = c(mid_p3[3, "X"], mid_p3[3, "Y"])
  )
  # //TASK //////////////////////////////////////////////////////////////////////
 
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  return(tribble(
    ~type,    ~X,            ~Y,             ~azi,
    "mid1",    mid_p["X"],   mid_p["Y"],      mid_azi_1,
    "mid2",    mid_p["X"],   mid_p["Y"],      mid_azi_2,
  # =========== NO MODIFY ZONE ENDS HERE ========================================
  ))
}
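
To sanity-check the two headings returned for each midpoint, it can help to compute an azimuth by hand. The sketch below is a planar approximation written in base R. It is not `geosphere::bearing()`, and the helper name `azimuth_deg()` and the coordinates are assumptions for illustration. It also wraps the result into the 0 to 360 range.

```r
# Planar azimuth in degrees, clockwise from north, wrapped to [0, 360).
# `from` and `to` are c(x, y) pairs (e.g., longitude, latitude).
azimuth_deg <- function(from, to) {
  dx <- to[1] - from[1]  # east-west offset
  dy <- to[2] - from[2]  # north-south offset
  (atan2(dx, dy) * 180 / pi) %% 360
}

mid   <- c(-84.39, 33.75)  # made-up midpoint
north <- c(-84.39, 33.76)
east  <- c(-84.38, 33.75)
west  <- c(-84.40, 33.75)

azi_n <- azimuth_deg(mid, north)  # pointing north
azi_e <- azimuth_deg(mid, east)   # pointing east
azi_w <- azimuth_deg(mid, west)   # pointing west

# Headings looking in opposite directions along a street differ by 180 degrees
(azi_w - azi_e) %% 360
```

Over the few metres between the sampled points, the planar result should be close to a geodesic bearing for a north-south or east-west street, but it will deviate for diagonal streets because a degree of longitude is shorter than a degree of latitude.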

Step 3. Apply the function to all street segments

# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzimuth() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzimuth()
edges_azi <- edges %>%
  sf::st_geometry() %>%
  purrr::map_dfr(getAzimuth)

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
edges_azi <- edges_azi %>% 
  bind_cols(edges %>% 
              st_drop_geometry() %>% 
              slice(rep(1:nrow(edges),each=2))) %>% 
  st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>% 
  mutate(img_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
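
The row duplication in the chunk above relies on `rep(..., each = 2)`: each edge yields two azimuth rows, so each edge's attributes must appear twice, in the same order. A minimal base R illustration, with hypothetical edge IDs:

```r
# Hypothetical edge IDs standing in for the rows of `edges`
edge_ids <- c(11, 12, 13)

# Repeat each row index twice so attributes line up with the
# "mid1" and "mid2" rows produced for every edge
row_index <- rep(seq_along(edge_ids), each = 2)
row_index

# Attributes after duplication: one copy per viewing direction
edge_ids[row_index]
```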

Step 4. Define a function that formats the request URL and downloads images.

getImage <- function(iterrow){
  # This function takes one row of `edges_azi` and downloads GSV image using the information from the row.
  
  # TASK ////////////////////////////////////////////////////////////////////////
  type    <- iterrow$type[1]
  location <- paste(iterrow$Y[1], iterrow$X[1], sep = ",")   # lat,lon
  heading  <- iterrow$azi[1]
  edge_id  <- iterrow$edge_id[1]
  img_id   <- iterrow$img_id[1]

  key <- Sys.getenv("GOOGLE_API")
  
  endpoint <- "https://maps.googleapis.com/maps/api/streetview"
  
  request <- glue::glue(
    "{endpoint}?size=640x640&location={location}&heading={heading}&pitch=0&fov=90&key={key}"
  )
  
  fname <- glue::glue(
    "GSV-nid_{img_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg"
  )
  
  img_dir <- "gsv_images" 
  fpath   <- file.path(img_dir, fname)
  
  furl <- request
  # //TASK //////////////////////////////////////////////////////////////////////

  
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  # Download images
  if (!file.exists(fpath)){
    download.file(furl, fpath, mode = 'wb') 
  }
  # =========== NO MODIFY ZONE ENDS HERE ========================================
}
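
Before downloading hundreds of images, it may be worth printing one request URL and inspecting it. The sketch below builds a URL with base R only. `build_gsv_url()` is a hypothetical helper, the coordinates are made up, and `"YOUR_API_KEY"` is a placeholder. Wrapping the heading into the 0 to 360 range is an extra precaution assumed here, not something the function above does.

```r
# Build a Street View Static API request URL from its parts
build_gsv_url <- function(lat, lon, heading, key,
                          size = "640x640", fov = 90, pitch = 0) {
  heading  <- round(heading %% 360, 1)  # wrap negative bearings into [0, 360)
  endpoint <- "https://maps.googleapis.com/maps/api/streetview"
  sprintf("%s?size=%s&location=%s,%s&heading=%s&pitch=%s&fov=%s&key=%s",
          endpoint, size, lat, lon, heading, pitch, fov, key)
}

# Location is given as latitude,longitude (Y first, then X)
url <- build_gsv_url(lat = 33.7756, lon = -84.3963,
                     heading = -45, key = "YOUR_API_KEY")
url
```

Pasting the printed URL into a browser with a valid key is a quick way to confirm that the location order and heading produce the expected view.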

Step 5. Download GSV images

# Create the image folder once, in the current working directory
# (getImage() writes to the relative path "gsv_images")
dir.create("gsv_images", showWarnings = FALSE, recursive = TRUE)

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
for (i in seq(1,nrow(edges_azi))){
  getImage(edges_azi[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================

ZIP THE DOWNLOADED IMAGES AND NAME IT ‘gsv_images.zip’ FOR STEP 6.
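
Before zipping, it may be worth checking that every expected image was actually saved, since a failed request would leave a gap that only shows up later as missing rows after the join. The sketch below is a minimal illustration using dummy files in a temporary folder; in the assignment the folder would be gsv_images and the expected IDs would come from edges_azi$img_id. The file names follow the pattern used in getImage().

```r
# Illustration with dummy files; the folder name and IDs are stand-ins.
img_dir <- file.path(tempdir(), "gsv_images_demo")
dir.create(img_dir, showWarnings = FALSE)
expected_ids <- 1:4   # stand-in for edges_azi$img_id

# Create fake downloads for ids 1, 2 and 4 (id 3 is deliberately left out).
for (id in c(1, 2, 4)) {
  file.create(file.path(
    img_dir,
    sprintf("GSV-nid_%d-eid_%d-type_residential-Location_33.7,-84.4-heading_90.jpg", id, id)
  ))
}

# Recover img_id from each file name and compare with the expected set.
files       <- list.files(img_dir, pattern = "\\.jpg$")
found_ids   <- as.integer(sub("^GSV-nid_([0-9]+)-.*$", "\\1", files))
missing_ids <- setdiff(expected_ids, found_ids)
print(missing_ids)
```

If missing_ids is empty, the folder can be zipped, for example with utils::zip(), which relies on a zip program being available on the system.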

Step 6. Apply computer vision

Use this Google Colab script to apply the pretrained semantic segmentation model to your GSV images.

Step 7. Merging the processed data back to R

Once all of the images are processed and saved in your Colab session as a CSV file, download the CSV file and merge it back to edges_azi.

# TASK ////////////////////////////////////////////////////////////////////////
# Read the downloaded CSV file containing the semantic segmentation results.
seg_output <- readr::read_csv("~/data/seg_output")

## Rows: 440 Columns: 20
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (20): img_id, road, sidewalk, building, wall, fence, pole, traffic light...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# //TASK ////////////////////////////////////////////////////////////////////////

# TASK ////////////////////////////////////////////////////////////////////////  
# 1. Join the `seg_output` data to `edges_azi`.
# 2. Calculate the proportion of predicted pixels for the following categories: `building`, `sky`, `road`, and `sidewalk`. If there are other categories you are interested in, feel free to include their proportions as well.
# 3. Calculate the proportion of greenness using the `vegetation` and `terrain` categories.
# 4. Calculate the building-to-street ratio. For the street, use `road` and `sidewalk` pixels; including `car` pixels is optional.
edges_seg_output <- edges_azi %>% 
  dplyr::left_join(seg_output, by = "img_id") %>% 
  dplyr::mutate(
    # choose denominator: total pixels for relevant classes
    total_pix = rowSums(dplyr::across(
      dplyr::all_of(c("building", "sky", "road", "sidewalk", "vegetation", "terrain"))
    ), na.rm = TRUE),
    
    prop_building = building  / total_pix,
    prop_sky      = sky       / total_pix,
    prop_road     = road      / total_pix,
    prop_sidewalk = sidewalk  / total_pix,
    
    # 3) greenness = vegetation + terrain
    prop_green = (vegetation + terrain) / total_pix,
    
    # 4) street = road + sidewalk; building-to-street ratio
    street_pix        = road + sidewalk,
    prop_street       = street_pix / total_pix,
    bldg_street_ratio = dplyr::if_else(street_pix > 0,
                                       building / street_pix,
                                       NA_real_)
  )
  
# //TASK ////////////////////////////////////////////////////////////////////////
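
The proportions above use the six selected classes as the denominator, so they are shares of those classes rather than shares of the whole image. The sketch below uses made-up pixel counts (seg_toy and its values are illustrative, with car standing in for the remaining classes) to show how the greenness share changes when every class column is included in the denominator instead.

```r
# Toy pixel counts for two images; columns mimic a few of the 19 classes.
seg_toy <- data.frame(
  img_id     = 1:2,
  road       = c(100, 50),
  sidewalk   = c(20, 0),
  building   = c(30, 10),
  sky        = c(40, 80),
  vegetation = c(10, 40),
  terrain    = c(0, 10),
  car        = c(50, 10)
)

subset_cols <- c("building", "sky", "road", "sidewalk", "vegetation", "terrain")
all_cols    <- setdiff(names(seg_toy), "img_id")

# Two candidate denominators: selected classes only vs. every class column.
total_subset <- unname(rowSums(seg_toy[subset_cols]))
total_all    <- unname(rowSums(seg_toy[all_cols]))

green <- seg_toy$vegetation + seg_toy$terrain
res <- data.frame(
  img_id       = seg_toy$img_id,
  green_subset = green / total_subset,
  green_all    = green / total_all
)
print(res)
```

Either denominator is defensible as long as it is stated; the subset version inflates every share slightly whenever the excluded classes cover part of the image.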

Section 3. Summarize and analyze the results.

Analysis 1 - Visualize Spatial Distribution

# TASK ////////////////////////////////////////////////////////////////////////
# Plot interactive map(s)
library(leaflet)

walk_pts  <- edges_seg_output %>% dplyr::filter(is_walkable)
unwalk_pts <- edges_seg_output %>% dplyr::filter(!is_walkable)

make_seg_map <- function(df, value_col, title, label_suffix = "%") {
  # Proportions are displayed as percentages; ratios (label_suffix = "") stay unscaled.
  mult   <- if (identical(label_suffix, "%")) 100 else 1
  digits <- if (mult == 100) 1 else 3
  df2 <- df %>% dplyr::mutate(val = .data[[value_col]] * mult)
  
  pal <- colorNumeric("viridis", domain = df2$val, na.color = "transparent")
  
  leaflet(df2) |>
    addProviderTiles(providers$CartoDB.Positron) |>
    addCircleMarkers(
      radius = 5, stroke = FALSE,
      fillOpacity = 0.85,
      fillColor = ~pal(val),
      popup = ~paste0(title, ": ", round(val, digits), label_suffix)
    ) |>
    addLegend(
      "bottomright", pal = pal, values = ~val,
      title = title,
      labFormat = labelFormat(suffix = label_suffix)
    )
}

# SIDEWALK
make_seg_map(walk_pts,  "prop_sidewalk", "Sidewalk share Walkable Tract")

make_seg_map(unwalk_pts,"prop_sidewalk", "Sidewalk share Unwalkable Tract")

# GREENNESS
make_seg_map(walk_pts,  "prop_green", "Greenness share Walkable Tract ")

make_seg_map(unwalk_pts,"prop_green", "Greenness share Unwalkable Tract")

# BUILDING-TO-STREET
make_seg_map(walk_pts,  "bldg_street_ratio", "Building–Street Ratio Walkable Tract ", label_suffix="")

make_seg_map(unwalk_pts,"bldg_street_ratio", "Building–Street Ratio Unwalkable Tract", label_suffix="")

# //TASK //////////////////////////////////////////////////////////////////////

Analysis Summary

Across the three mapped features (building–street ratio, greenness, and sidewalk share), spatial differences appear between the walkable and unwalkable tracts.

Walkable tract (13121005000):

- Shows more greenness, with more segments displaying medium to high vegetation proportions.

- Sidewalk pixels appear on many segments, although the mean sidewalk share (about 3.5%) is slightly lower than in the unwalkable tract (about 4.4%), as the t-test table in Analysis 3 shows.

- Building–street ratios are lower, meaning the streetscape is less dominated by large building surfaces.

Unwalkable tract (13121002600):

- Displays lower greenness overall, with more segments near the low end of the vegetation scale.

- Sidewalk share is not lower here; pixel share measures how much of the image is sidewalk, not sidewalk quality or continuity.

- Building–street ratios are higher, indicating more visually “enclosed” street edges with buildings dominating road views.

Overall, the spatial maps suggest that the walkable tract provides more greenery and a less building-dominated streetscape, while sidewalk share alone does not separate the two tracts in the expected direction.

Analysis 2 - Boxplot

# TASK ////////////////////////////////////////////////////////////////////////
# Create boxplot(s) using ggplot2 package.
library(ggplot2)

## 
## Attaching package: 'ggplot2'

## The following object is masked from 'package:NLP':
## 
##     annotate

library(tidyr)

plot_vars <- c(
  "prop_building","prop_sky","prop_road",
  "prop_sidewalk","prop_green","bldg_street_ratio"
)

edges_long <- edges_seg_output %>%
  st_drop_geometry() %>%
  select(is_walkable, all_of(plot_vars)) %>%
  pivot_longer(cols = all_of(plot_vars), names_to = "metric", values_to = "value")

ggplot(edges_long, aes(x = is_walkable, y = value, fill = is_walkable)) +
  geom_boxplot(alpha = 0.8) +
  facet_wrap(~ metric, scales = "free_y") +
  scale_fill_manual(values = c("FALSE"="#ff8c69", "TRUE"="#4daf4a"),
                    labels = c("Unwalkable","Walkable")) +
  labs(x = "Tract Type", y = "Proportion / Ratio",
       title = "Distribution of Visual Features by Walkability") +
  theme_minimal()

## Warning: Removed 108 rows containing non-finite outside the scale range
## (`stat_boxplot()`).

# //TASK //////////////////////////////////////////////////////////////////////
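
Statements about medians and spread are easier to defend when the numbers behind the boxplot are printed alongside it. The sketch below is a minimal illustration on made-up values (the toy data frame is not the assignment data) showing one way to get the median and interquartile range per group in base R.

```r
# Made-up greenness shares for three walkable and three unwalkable segments.
toy <- data.frame(
  is_walkable = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  prop_green  = c(0.30, 0.25, 0.35, 0.20, 0.10, 0.30)
)

# Median and interquartile range by group, i.e. the box centre and box height.
medians <- tapply(toy$prop_green, toy$is_walkable, median)
iqrs    <- tapply(toy$prop_green, toy$is_walkable, IQR)

summary_tbl <- data.frame(
  is_walkable = names(medians),
  median      = as.vector(medians),
  iqr         = as.vector(iqrs)
)
print(summary_tbl)
```

Applied to edges_long, the same idea (grouping by metric and is_walkable) would give a table that can be cited directly in the interpretation.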

Boxplot Interpretation (Distribution Comparison)

The boxplots clearly show distributional differences between walkable and unwalkable tracts:

Greenness (prop_green): Walkable areas have higher greenness; the mean share is about 0.29 versus 0.21 in the unwalkable tract.

Sidewalks (prop_sidewalk): Sidewalk shares are low in both tracts, and the mean is slightly lower in the walkable tract (about 0.035 versus 0.044), which runs against the expected direction.

Road Proportion (prop_road): Walkable areas have slightly higher road proportions, possibly reflecting more connected, finer-grained street networks.

Building Proportion (prop_building): Unwalkable areas have a somewhat higher proportion of building pixels, although the difference in means is small (about 0.075 versus 0.061).

Building–Street Ratio: Unwalkable segments tend to have higher ratios, reinforcing that buildings dominate more of the street-facing view.

Overall, the distributions indicate that walkable segments are greener and less building-dominated, while sidewalk share does not favor the walkable tract.

Analysis 3 - Mean Comparison (t-test)

# TASK ////////////////////////////////////////////////////////////////////////
# Perform t-tests and report both the differences in means and their statistical significance.
# As long as you can deliver the message clearly, you can use any format/package you want.

test_vars <- c(
  "prop_building","prop_sky","prop_road",
  "prop_sidewalk","prop_green","bldg_street_ratio"
)

t_results <- purrr::map_dfr(test_vars, function(v){
  df <- edges_seg_output %>%
    st_drop_geometry() %>%
    select(is_walkable, !!sym(v)) %>%
    filter(!is.na(.data[[v]]))
  
  y <- df[[v]]
  g <- df$is_walkable
  
  t_out <- t.test(y ~ g)
  
  tibble::tibble(
    metric = v,
    walkable_mean = mean(y[g], na.rm = TRUE),
    unwalkable_mean = mean(y[!g], na.rm = TRUE),
    difference = walkable_mean - unwalkable_mean,
    p_value = t_out$p.value
  )
})

t_results

## # A tibble: 6 × 5
##   metric            walkable_mean unwalkable_mean difference  p_value
##   <chr>                     <dbl>           <dbl>      <dbl>    <dbl>
## 1 prop_building            0.0609          0.0746   -0.0137  5.44e- 2
## 2 prop_sky                 0.229           0.312    -0.0831  5.34e-14
## 3 prop_road                0.381           0.362     0.0194  8.36e- 3
## 4 prop_sidewalk            0.0347          0.0444   -0.00971 3.28e- 3
## 5 prop_green               0.294           0.207     0.0872  3.24e- 9
## 6 bldg_street_ratio        0.155           0.202    -0.0473  3.21e- 2

# //TASK //////////////////////////////////////////////////////////////////////
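
Because six t-tests are run on the same images, a multiple-comparison adjustment is one optional robustness check. The sketch below applies the Holm method through base R's p.adjust() to the p-values reported in the table above; the choice of Holm and the 0.05 threshold are assumptions made for illustration, not part of the assignment.

```r
# p-values copied from the t_results table above.
p_raw <- c(
  prop_building     = 5.44e-2,
  prop_sky          = 5.34e-14,
  prop_road         = 8.36e-3,
  prop_sidewalk     = 3.28e-3,
  prop_green        = 3.24e-9,
  bldg_street_ratio = 3.21e-2
)

# Holm adjustment controls the family-wise error rate across the six tests.
p_holm <- p.adjust(p_raw, method = "holm")

adj_tbl <- data.frame(
  metric         = names(p_raw),
  p_raw          = unname(p_raw),
  p_holm         = unname(p_holm),
  significant_05 = unname(p_holm < 0.05)
)
print(adj_tbl)
```

With these inputs, prop_sky, prop_green, prop_sidewalk, and prop_road stay below 0.05 after adjustment, while prop_building and bldg_street_ratio do not.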

Summary

Five of the six differences are statistically significant at the 0.05 level, meaning they are unlikely to reflect sampling variation alone; the exception is prop_building (p ≈ 0.054). The smallest p-value belongs to prop_sky (p ≈ 5.3e-14), and the largest difference in means is greenness (0.087, p ≈ 3.2e-09).

Overall, these metrics indicate that the walkable tract is greener and has a lower building-to-street ratio, although it shows less visible sky and a slightly lower sidewalk share than the unwalkable tract.

Final Interpretation

Together, the spatial distribution maps, boxplots, and t-tests paint a largely consistent picture. The walkable tract (13121005000) exhibits substantially more greenness and a lower building-to-street ratio, both of which are commonly cited contributors to walkability. In contrast, the unwalkable tract (13121002600) has more building-dominated streetscapes and less vegetation. Two results run against expectations: the unwalkable tract shows more visible sky and a slightly higher sidewalk share. The differences are statistically significant at the 0.05 level for five of the six metrics, with building share the exception. The results suggest that vegetation and a lower building-to-street ratio are associated with perceived walkability in these two tracts, while sidewalk pixel share alone is not a reliable indicator.