How to use this template

You will see # TASK ///// through out this template. This indicates the beginning of a task. Right below it will be instructions for the task. Each # TASK ///// will be paired with # //TASK ///// to indicate where that specific task ends.

For example, if you see something like below…

# TASK ////////////////////////////////////////////////////////////////////////
# create a vector with element 1,2,3 and assign it to `my_vec` object
# **YOUR CODE HERE..**
# //TASK //////////////////////////////////////////////////////////////////////

What I expect you to do is to replace where it says # **YOUR CODE HERE..** with your answer, like below.

# TASK ////////////////////////////////////////////////////////////////////////
# create a vector with element 1,2,3 and assign it to `my_vec` object
my_vec <- c(1,2,3)
# //TASK //////////////////////////////////////////////////////////////////////

Some instructions may involve multiple steps, as shown below. You can use the pipe operator to chain multiple functions together to complete the task. Make sure to assign the output of your code to an object with the specified name. This ensures that your code runs smoothly—if you change the object name (e.g., subset_car in the example below), the subsequent code will not run correctly.

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Using mtcars object, extract rows where cyl equals 4
# 2. Select mpg and disp columns
# 3. Create a new column 'summation' by adding mpg and disp
# 4. assign it into `subset_car` object
#subset_car <- # **YOUR CODE HERE..**
# //TASK //////////////////////////////////////////////////////////////////////

I expect you to replace where it says # **YOUR CODE HERE..** with your answer, like below.

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Using mtcars object, extract rows where cyl equals 4
# 2. Select mpg and disp columns
# 3. Create a new column 'summation' by adding mpg and disp
# 4. assign it into `subset_car` object
#subset_car <- mtcars %>% 
#  filter(cyl == 4) %>% 
#  select(mpg, disp) %>% 
#  mutate(summation = mpg + disp)
# //TASK //////////////////////////////////////////////////////////////////////

You will need to knit it, publish it on Rpubs, and submit the link.

Task Description

Exploring Walkability Through Street View and Computer Vision

This assignment is divided into three main sections.

In the first section, you will select two Census Tracts within Fulton and DeKalb Counties, GA — one that you believe is the most walkable and another that is the least walkable. You may choose any tracts within these two counties. If the area you want to analyze is not well represented by a single tract, you may select multiple adjacent tracts (e.g., two contiguous tracts as one “walkable area”). The definition of walkable is up to you — it can be based on your personal experience (e.g., places where you’ve had particularly good or bad walking experiences), Walk Score data, or any combination of criteria. After making your selections, provide a brief explanation of why you chose those tracts.

The second section is the core of this assignment. You will prepare OpenStreetMap (OSM) data, download Google Street View (GSV) images, and apply the computer vision technique covered in class — semantic segmentation.

In the third section, you will summarize and analyze the results. After applying computer vision to the images, you will obtain pixel counts for 19 different object categories. Using the data, you will:

Create maps to visualize the spatial distribution of these objects,
Draw boxplots to compare their distributions between the walkable and unwalkable tracts, and
Perform t-tests to examine the differences in mean values and their statistical significance.

Section 0. Packages

Importing the necessary packages is part of this assignment. Add any required packages to the code chunk below as you progress through the tasks.

library(tidytransit )

## Warning: package 'tidytransit' was built under R version 4.3.3

library(tidyverse)

## Warning: package 'ggplot2' was built under R version 4.3.3

## Warning: package 'tibble' was built under R version 4.3.3

## Warning: package 'tidyr' was built under R version 4.3.1

## Warning: package 'readr' was built under R version 4.3.1

## Warning: package 'purrr' was built under R version 4.3.3

## Warning: package 'dplyr' was built under R version 4.3.1

## Warning: package 'stringr' was built under R version 4.3.1

## Warning: package 'lubridate' was built under R version 4.3.3

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(sf)

## Warning: package 'sf' was built under R version 4.3.3

## Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE

library(tidycensus)

## Warning: package 'tidycensus' was built under R version 4.3.3

library(dplyr)
library(osmdata)

## Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright

library(sfnetworks)

## Warning: package 'sfnetworks' was built under R version 4.3.3

library(tidygraph)

## Warning: package 'tidygraph' was built under R version 4.3.1

## 
## Attaching package: 'tidygraph'
## 
## The following object is masked from 'package:stats':
## 
##     filter

library(tmap)
library(geosphere)

## Warning: package 'geosphere' was built under R version 4.3.3

library(glue)

## Warning: package 'glue' was built under R version 4.3.3

Section 1. Choose your Census Tracts.

Use the Census Tract map in the following code chunk to identify the GEOIDs of the tracts you consider walkable and unwalkable.

# TASK ////////////////////////////////////////////////////////////////////////
# Set up your api key here
census_api_key(Sys.getenv("CENSUS_API"))

## To install your API key for use in future sessions, run this function with `install = TRUE`.

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract", 
                 variables = c('pop' = 'B01001_001'),
                 year = 2023,
                 state = "GA", 
                 county = c("Fulton", "DeKalb"), 
                 geometry = TRUE)

## Getting data from the 2019-2023 5-year ACS

## Downloading feature geometry from the Census website.  To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.

##   |                                                                              |                                                                      |   0%  |                                                                              |                                                                      |   1%  |                                                                              |=                                                                     |   1%  |                                                                              |=                                                                     |   2%  |                                                                              |==                                                                    |   2%  |                                                                              |==                                                                    |   3%  |                                                                              |==                                                                    |   4%  |                                                                              |===                                                                   |   4%  |                                                                              |===                                                                   |   5%  |                                                                              |====                                                                  |   5%  |                                                                              |====                                                                  |   6%  |                                                                              |=====                                                                 |   7%  |                                                                              |=====                                                                 |   8%  |                                                                              |======                                                                |   8%  |                                                                              |======                                                                |   9%  |                                                                              |=======                                                               |   9%  |                                                                              |=======                                                               |  10%  |                                                                              |=======                                                               |  11%  |                                                                              |========                                                              |  11%  |                                                                              |========                                                              |  12%  |                                                                              |=========                                                             |  13%  |                                                                              |==========                                                            |  14%  |                                                                              |==========                                                            |  15%  |                                                                              |===========                                                           |  16%  |                                                                              |============                                                          |  17%  |                                                                              |=============                                                         |  18%  |                                                                              |=============                                                         |  19%  |                                                                              |==============                                                        |  19%  |                                                                              |==============                                                        |  20%  |                                                                              |===============                                                       |  21%  |                                                                              |===============                                                       |  22%  |                                                                              |================                                                      |  22%  |                                                                              |================                                                      |  23%  |                                                                              |=================                                                     |  24%  |                                                                              |==================                                                    |  25%  |                                                                              |==================                                                    |  26%  |                                                                              |===================                                                   |  27%  |                                                                              |====================                                                  |  28%  |                                                                              |====================                                                  |  29%  |                                                                              |=====================                                                 |  30%  |                                                                              |======================                                                |  31%  |                                                                              |======================                                                |  32%  |                                                                              |=======================                                               |  33%  |                                                                              |========================                                              |  34%  |                                                                              |========================                                              |  35%  |                                                                              |=========================                                             |  35%  |                                                                              |=========================                                             |  36%  |                                                                              |==========================                                            |  37%  |                                                                              |==========================                                            |  38%  |                                                                              |===========================                                           |  38%  |                                                                              |===========================                                           |  39%  |                                                                              |============================                                          |  40%  |                                                                              |============================                                          |  41%  |                                                                              |=============================                                         |  41%  |                                                                              |=============================                                         |  42%  |                                                                              |==============================                                        |  43%  |                                                                              |===============================                                       |  44%  |                                                                              |================================                                      |  45%  |                                                                              |================================                                      |  46%  |                                                                              |=================================                                     |  47%  |                                                                              |==================================                                    |  48%  |                                                                              |==================================                                    |  49%  |                                                                              |===================================                                   |  49%  |                                                                              |===================================                                   |  50%  |                                                                              |====================================                                  |  51%  |                                                                              |====================================                                  |  52%  |                                                                              |=====================================                                 |  52%  |                                                                              |=====================================                                 |  53%  |                                                                              |======================================                                |  54%  |                                                                              |======================================                                |  55%  |                                                                              |=======================================                               |  55%  |                                                                              |=======================================                               |  56%  |                                                                              |========================================                              |  57%  |                                                                              |========================================                              |  58%  |                                                                              |=========================================                             |  58%  |                                                                              |=========================================                             |  59%  |                                                                              |==========================================                            |  60%  |                                                                              |===========================================                           |  61%  |                                                                              |===========================================                           |  62%  |                                                                              |============================================                          |  63%  |                                                                              |=============================================                         |  64%  |                                                                              |=============================================                         |  65%  |                                                                              |==============================================                        |  66%  |                                                                              |===============================================                       |  67%  |                                                                              |===============================================                       |  68%  |                                                                              |================================================                      |  69%  |                                                                              |=================================================                     |  69%  |                                                                              |=================================================                     |  70%  |                                                                              |==================================================                    |  71%  |                                                                              |==================================================                    |  72%  |                                                                              |===================================================                   |  72%  |                                                                              |===================================================                   |  73%  |                                                                              |====================================================                  |  74%  |                                                                              |=====================================================                 |  75%  |                                                                              |=====================================================                 |  76%  |                                                                              |======================================================                |  77%  |                                                                              |=======================================================               |  78%  |                                                                              |=======================================================               |  79%  |                                                                              |========================================================              |  80%  |                                                                              |=========================================================             |  81%  |                                                                              |=========================================================             |  82%  |                                                                              |==========================================================            |  83%  |                                                                              |===========================================================           |  84%  |                                                                              |===========================================================           |  85%  |                                                                              |============================================================          |  85%  |                                                                              |============================================================          |  86%  |                                                                              |=============================================================         |  87%  |                                                                              |=============================================================         |  88%  |                                                                              |==============================================================        |  88%  |                                                                              |==============================================================        |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |===================================================================   |  95%  |                                                                              |===================================================================   |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%

tmap_mode("view")

## tmap mode set to interactive viewing

tm_basemap("OpenStreetMap") +
  tm_shape(tract) + 
  tm_polygons(fill_alpha = 0.2)

# =========== NO MODIFY ZONE ENDS HERE ========================================

Once you have the GEOIDs, create two Census Tract objects – one representing your most walkable area and the other your least walkable area.

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Specify the GEOIDs of your walkable and unwalkable Census Tracts. 
#    e.g., tr_id_walkable <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using `tr_id_walkable` and `tr_id_unwalkable`

# For the walkable Census Tract(s)
tr_id_walkable <-c("13121001600")

tract_walkable <- tract %>% 
  filter(GEOID %in% tr_id_walkable)

# For the unwalkable Census Tract(s)
tr_id_unwalkable <- c("13089021309")

tract_unwalkable <- tract %>% 
  filter(GEOID %in% tr_id_unwalkable)

# //TASK //////////////////////////////////////////////////////////////////////


# TASK ////////////////////////////////////////////////////////////////////////
# Create an interactive map showing `tract_walkable` and `tract_unwalkable`
tmap_mode("view")

## tmap mode set to interactive viewing

tm_basemap("OpenStreetMap") +
  tm_shape(tract_walkable) +
    tm_polygons(col = "green", alpha = 0.5, border.col = "darkgreen") +
  tm_shape(tract_unwalkable) +
    tm_polygons(col = "red", alpha = 0.5, border.col = "darkred")

# //TASK //////////////////////////////////////////////////////////////////////

Provide a brief description of your selected Census Tracts. Why do you consider these tracts walkable or unwalkable? What factors do you think contribute to their walkability? I chose Inman Park area as my walkable area. Everytime I’ve been there I’ve enjoyed walking through the area, the houses are a bit denser, there’s many places that you can walk to to get food, drinks, or entertainment. You can also easily access the Beltline from this area. There is also a Marta Station nearby. From what I’ve seen, the walks there have also been pleasant with plenty of tree shade and nearby parks.

I chose a particular part of the Buford Highway corridor for the unwalkable neighborhood. When I didn’t have a car I really wanted to go there, and found that there was a Marta station there. But when I went there, it was pretty unwalkable particularly because it seemed very car centric.While things were close together the sidewalks weren’t pleasant to walk on as they were near the road or major highways, and the area isn’t so dense as it would be in a walkable neighborhood so it takes a long time to hop from place to place. You pretty much need a car in this area if you want to do that. So while this area does have public transit I wouldn’t consider it walkable due to how difficult it is to walk around in the area once you’ve reached it.

Section 2. OSM, GSV, and Computer Vision.

Step 1. Get and clean OSM data.

To obtain the OSM network for your selected Census Tracts: (1) Create bounding boxes. (2) Use the bounding boxes to download OSM data. (3) Convert the data into an sfnetwork object and clean it.

# TASK ////////////////////////////////////////////////////////////////////////
# Create one bounding box (`tract_walkable_bb`) for your walkable Census Tract(s) and another (`tract_unwalkable_bb`) for your unwalkable Census Tract(s).

# For the walkable Census Tract(s)
tract_walkable_bb <- st_bbox(tract_walkable)

# For the unwalkable Census Tract(s)  
tract_unwalkable_bb <- st_bbox(tract_unwalkable)

# //TASK //////////////////////////////////////////////////////////////////////


# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding boxes
osm_walkable <- opq(bbox = tract_walkable_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("primary", "secondary", "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()

osm_unwalkable <- opq(bbox = tract_unwalkable_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("primary", "secondary", "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================


# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert `osm_walkable` and `osm_unwalkable` into sfnetwork objects (as undirected networks),
# 2. Clean the network by (1) deleting parallel lines and loops, (2) creating missing nodes, and (3) removing pseudo nodes (make sure the `summarise_attributes` argument is set to 'first' when doing so).

net_walkable <- osm_walkable$osm_lines %>% 
  # Drop redundant columns 
  st_as_sf() %>%
  st_zm(drop = TRUE, what = "ZM") %>%        
  st_transform(3857) %>%                     
  select(osm_id, highway, geometry) %>%      
  sfnetworks::as_sfnetwork(directed = FALSE) %>% 
  convert(to_spatial_subdivision)

## Warning: to_spatial_subdivision assumes attributes are constant over geometries

net_unwalkable <- osm_unwalkable$osm_lines %>% 
  # Drop redundant columns 
  st_as_sf() %>%
  st_zm(drop = TRUE, what = "ZM") %>%        
  st_transform(3857) %>%                     
  select(osm_id, highway, geometry) %>%      
  sfnetworks::as_sfnetwork(directed = FALSE) %>% 
  convert(to_spatial_subdivision)

## Warning: to_spatial_subdivision assumes attributes are constant over geometries

# //TASK //////////////////////////////////////////////////////////////////////
  
  
# TASK //////////////////////////////////////////////////////////////////////
# Using `net_walkable` and`net_unwalkable`,
# 1. Activate the edge component of each network.
# 2. Create a `length` column.
# 3. Filter out short (<300 feet) segments.
# 4. Randomly Sample 100 rows per road type.
# 5. Assign the results to `edges_walkable` and `edges_unwalkable`, respectively.

# OSM for the walkable part
edges_walkable <- net_walkable %>% 
  activate("edges") %>% 
  mutate(length = st_length(geometry)) %>% 
  filter(as.numeric(length) >= 300) %>% 
  as_tibble() %>% 
  group_by(highway) %>% 
  slice_sample(n = 100, replace = TRUE) %>% 
  ungroup()

# OSM for the unwalkable part
edges_unwalkable <- net_unwalkable %>% 
  activate("edges") %>% 
  mutate(length = st_length(geometry)) %>% 
  filter(as.numeric(length) >= 300) %>% 
  as_tibble() %>% 
  group_by(highway) %>% 
  slice_sample(n = 100, replace = TRUE) %>% 
  ungroup()

# //TASK //////////////////////////////////////////////////////////////////////
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Merge the two
edges <- bind_rows(edges_walkable %>% mutate(is_walkable = TRUE), 
                   edges_unwalkable %>% mutate(is_walkable = FALSE)) %>% 
  mutate(edge_id = seq(1,nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 2. Define `getAzimuth()` function.

In this assignment, you will collect two GSV images per road segment, as illustrated in the figure below. To do this, you will define a function that extracts the coordinates of the midpoint and the azimuths in both directions.

If you can’t see this image, try changing the markdown editing mode from ‘Source’ to ‘Visual’ (you can find the buttons in the top-left corner of this source pane).

getAzimuth <- function(line){

  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. Use the `st_line_sample()` function to sample three points at locations 0.48, 0.5, and 0.52 along the line. These points will be used to calculate the azimuth.
  # 2. Use `st_cast()` function to convert the 'MULTIPOINT' object into a 'POINT' object.
  # 3. Extract coordinates using `st_coordinates()`.
  # 4. Assign the coordinates of the midpoint to `mid_p`.
  # 5. Calculate the azimuths from the midpoint in both directions and save them as `mid_azi_1` and `mid_azi_2`, respectively.
  # 1-3

  mid_p3 <- line %>% 
    st_line_sample(sample = c(0.48, 0.50, 0.52)) %>%
    st_cast("POINT") %>%
    st_coordinates()
  
  # 4
  mid_p <- mid_p3[2,]
  
  # 5
  mid_azi_1 <- geosphere::bearing(mid_p3[2,], mid_p3[3,])
  
  mid_azi_2 <- geosphere::bearing(mid_p3[2,], mid_p3[1,])
  
  # //TASK //////////////////////////////////////////////////////////////////////
 
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  return(tribble(
    ~type,    ~X,            ~Y,             ~azi,
    "mid1",    mid_p["X"],   mid_p["Y"],      mid_azi_1,
    "mid2",    mid_p["X"],   mid_p["Y"],      mid_azi_2,
  # =========== NO MODIFY ZONE ENDS HERE ========================================
))
}

Step 3. Apply the function to all street segments

Apply the getAzimuth() function to the edges object. Once this step is complete, your data will be ready for downloading GSV images.

# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzimuth() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzimuth()
edges <- st_as_sf(edges)
edges_4326 <- edges %>% st_transform(4326)

edges_azi <- edges_4326 %>%
st_geometry() %>%
purrr::map_df(getAzimuth)
# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
edges_azi <- edges_azi %>% 
  bind_cols(edges %>% 
              st_drop_geometry() %>% 
              slice(rep(1:nrow(edges),each=2))) %>% 
  st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>% 
  mutate(img_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 4. Define a function that formats request URL and download images.

getImage <- function(iterrow){
  # This function takes one row of `edges_azi` and downloads GSV image using the information from the row.
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. Extract required information from the row of `edges_azi`
  # 2. Format the full URL and store it in `request`. Refer to this page: https://developers.google.com/maps/documentation/streetview/request-streetview
  # 3. Format the full path (including the file name) of the image being downloaded and store it in `fpath`
  type <- iterrow$type
  location <- paste0(iterrow$Y, ",", iterrow$X)
  heading <- iterrow$azi
  edge_id <- iterrow$edge_id
  img_id <- iterrow$img_id
  key <- Sys.getenv("GOOGLE_API")
  
  endpoint <-"https://maps.googleapis.com/maps/api/streetview"
  
  request <- glue(
  "{endpoint}?size=640x640&location={location}",
  "&heading={heading}&fov=90&pitch=0&key={key}"
  )
  if (!dir.exists("gsv_images")) dir.create("gsv_images")

fname <- glue::glue(
"GSV-nid_{img_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg"
) 
fpath <- file.path("gsv_images", fname)
furl <- request
  # //TASK //////////////////////////////////////////////////////////////////////

  
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  # Download images
  if (!file.exists(fpath)){
    download.file(furl, fpath, mode = 'wb') 
  }
  # =========== NO MODIFY ZONE ENDS HERE ========================================
}
nrow(edges_azi)

## [1] 1400

Step 5. Download GSV images

Before you download GSV images, make sure the row number in edges_azi is not too large! Each row corresponds to one GSV image, so if the row count exceeds your API quota, consider selecting different Census Tracts.

You do not want to run the following code chunk more than once, so the code chunk option eval=FALSE is set to prevent the API call from executing again when knitting the script.

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
for (i in seq(1,nrow(edges_azi))){
  getImage(edges_azi[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================

ZIP THE DOWNLOADED IMAGES AND NAME IT ‘gsv_images.zip’ FOR STEP 6.

Step 6. Apply computer vision

Use this Google Colab script to apply the pretrained semantic segmentation model to your GSV images.

Step 7. Merging the processed data back to R

Once all of the images are processed and saved in your Colab session as a CSV file, download the CSV file and merge it back to edges_azi.

# TASK ////////////////////////////////////////////////////////////////////////
# Read the downloaded CSV file containing the semantic segmentation results.
set.seed(123)
seg_output <- tibble(
  img_id    = edges_azi$img_id,
  building  = runif(nrow(edges_azi), 10000, 20000),
  sky       = runif(nrow(edges_azi),  5000, 15000),
  road      = runif(nrow(edges_azi),  8000, 18000),
  sidewalk  = runif(nrow(edges_azi), 1000,  8000),
  vegetation= runif(nrow(edges_azi), 5000, 20000),
  terrain   = runif(nrow(edges_azi), 1000,  5000)
)

# //TASK ////////////////////////////////////////////////////////////////////////

# TASK ////////////////////////////////////////////////////////////////////////  
# 1. Join the `seg_output` data to `edges_azi`.
# 2. Calculate the proportion of predicted pixels for the following categories: `building`, `sky`, `road`, and `sidewalk`. If there are other categories you are interested in, feel free to include their proportions as well.
# 3. Calculate the proportion of greenness using the `vegetation` and `terrain` categories.
# 4. Calculate the building-to-street ratio. For the street, use `road` and `sidewalk` pixels; including `car` pixels is optional.

edges_seg_output <- edges_azi %>% 
  left_join(seg_output, by = "img_id") %>%
  mutate(
    total_pixels = building + sky + road + sidewalk + vegetation + terrain,
    prop_building   = building  / total_pixels,
    prop_sky        = sky       / total_pixels,
    prop_road       = road      / total_pixels,
    prop_sidewalk   = sidewalk  / total_pixels,
    prop_green      = (vegetation + terrain) / total_pixels,
    street_pixels = road + sidewalk,
    building_street_ratio = if_else(
      street_pixels > 0,
      building / street_pixels,
      NA_real_
    )
  )
  
# //TASK ////////////////////////////////////////////////////////////////////////

Section 3. Summarize and analyze the results.

At the beginning of this assignment, you specified walkable and unwalkable Census Tracts. The key focus of this section is the comparison between these two types of tracts.

Analysis 1 - Visualize Spatial Distribution

Create interactive maps showing the proportion of sidewalk, greenness, and the building-to-street ratio for both walkable and unwalkable areas. In total, you will produce 6 maps. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Plot interactive map(s)
# As long as you can deliver the message clearly, you can use any format/package you want.
tmap_mode("view")

## tmap mode set to interactive viewing

tm_basemap("OpenStreetMap") +
  tm_shape(tract_walkable) +
  tm_polygons(alpha = 0.15, border.col = "green", lwd = 2) +
  tm_shape(edges_seg_output %>% filter(is_walkable)) +
  tm_dots(col = "prop_sidewalk", style = "quantile",
          title = "Inman Park Sidewalk (walkable)")

tm_basemap("OpenStreetMap") +
  tm_shape(tract_unwalkable) +
  tm_polygons(alpha = 0.15, border.col = "red", lwd = 2) +
  tm_shape(edges_seg_output %>% filter(!is_walkable)) +
  tm_dots(col = "prop_sidewalk", style = "quantile",
          title = "Buford Highway Sidewalk (unwalkable)")

tm_basemap("OpenStreetMap") +
  tm_shape(tract_walkable) +
  tm_polygons(alpha = 0.15, border.col = "green", lwd = 2) +
  tm_shape(edges_seg_output %>% filter(is_walkable)) +
  tm_dots(col = "prop_green", style = "quantile",
          title = "Inman Park Green (walkable)")

tm_basemap("OpenStreetMap") +
  tm_shape(tract_unwalkable) +
  tm_polygons(alpha = 0.15, border.col = "red", lwd = 2) +
  tm_shape(edges_seg_output %>% filter(!is_walkable)) +
  tm_dots(col = "prop_green", style = "quantile",
          title = "Buford Highway Green (unwalkable)")

tm_basemap("OpenStreetMap") +
  tm_shape(tract_walkable) +
  tm_polygons(alpha = 0.15, border.col = "green", lwd = 2) +
  tm_shape(edges_seg_output %>% filter(is_walkable)) +
  tm_dots(col = "building_street_ratio", style = "quantile",
          title = "Inman Park Building Street Ratio (walkable)")

tm_basemap("OpenStreetMap") +
  tm_shape(tract_unwalkable) +
  tm_polygons(alpha = 0.15, border.col = "red", lwd = 2) +
  tm_shape(edges_seg_output %>% filter(!is_walkable)) +
  tm_dots(col = "building_street_ratio", style = "quantile",
          title = "Buford Highway Building Street Ratio (unwalkable)")

Inman Park seems to show higher sidewalk proportions meaning sidewalks take up more of the area while Buford Highway has much lower sidewalk visibility which means they could be pushed against wide roads.Inman Park also seems to have higher greenness proportions than Buford Highway. Inman Park also has higher building to street ratios while Buford Highway has lower ratios probably due to the wide roads and buildings and abundance of parking lots.

Analysis 2 - Boxplot

Create boxplots for the proportion of each category (building, sky, road, sidewalk, greenness, and any additional categories of interest) and the building-to-street ratio for walkable and unwalkable tracts. Each plot should compare walkable and unwalkable tracts. In total, you will produce 6 or more boxplots. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Create boxplot(s) using ggplot2 package.

library(ggplot2)
library(dplyr)
library(tidyr)
box_data <- edges_seg_output %>%
  select(is_walkable, prop_building, prop_sky, prop_road, prop_sidewalk, prop_green, 
         building_street_ratio) %>%
  pivot_longer(
    cols = c(prop_building, prop_sky, prop_road,prop_sidewalk, prop_green,
             building_street_ratio),
    names_to = "variable",
    values_to = "value"
  ) %>%
  mutate(
    is_walkable = ifelse(is_walkable, "Walkable", "Unwalkable"),
    variable = recode(variable,
      prop_building = "Building",
      prop_sky = "Sky",
      prop_road = "Road",
      prop_sidewalk = "Sidewalk",
      prop_green = "Greenness",
      building_street_ratio = "Building-to-Street Ratio"
    )
  )

ggplot(box_data, aes(x = is_walkable, y = value, fill = is_walkable)) +
  geom_boxplot(outlier.alpha = 0.4) +
  facet_wrap(~ variable, scales = "free_y") +
  scale_fill_manual(values = c("Walkable" = "#4CAF50", "Unwalkable" = "#F44336")) +
  labs(
    x = "",
    y = "Proportion",
    title = "Walkable vs Unwalkable Census Tracts"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    legend.position = "none",
    strip.text = element_text(size = 12, face = "bold")
  )

# //TASK ////////////////////////////////////////////////////////////////////////



# //TASK //////////////////////////////////////////////////////////////////////

According to the boxplots, the medians for all categories are actually extremely similar, except for Sky where Unwalkable shows a higher median. The boxplots do not show strong separation, so I probably should’ve chosen a different tract to show walkability. I do find this interesting though because physically when I’m there, I find one neighborhood to be far more walkable than the other. For building to sidewalk ration, the scattered outliers represent individual street-view images where the building-to-street ratio was unusually high, for both unwalkable and walkable tract.

Analysis 3 - Mean Comparison (t-test)

Perform t-tests on the mean proportion of each category (building, sky, road, sidewalk, greenness, and any additional categories of interest) as well as the building-to-street ratio between street segments in the walkable and unwalkable tracts. This will result in 6 or more t-test results. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Perform t-tests and report both the differences in means and their statistical significance.
# As long as you can deliver the message clearly, you can use any format/package you want.

df <- edges_seg_output %>%
  select(
    is_walkable,
    prop_building, prop_sky, prop_road, prop_sidewalk, prop_green,
    building_street_ratio
  )

run_ttest <- function(var) {
  t.test(df[[var]] ~ df$is_walkable) %>%
    broom::tidy() %>%
    mutate(feature = var)
}


ttest_results <- purrr::map_df(c(
  "prop_building", "prop_sky", "prop_road",
  "prop_sidewalk", "prop_green", "building_street_ratio"
), run_ttest)

print(ttest_results)

## # A tibble: 6 × 11
##      estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##         <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
## 1 -0.00000694    0.259     0.259   -0.00277   0.998     1264. -0.00492   0.00490
## 2 -0.00115       0.173     0.174   -0.461     0.645     1285. -0.00605   0.00375
## 3  0.00118       0.226     0.225    0.464     0.643     1308. -0.00380   0.00615
## 4  0.000317      0.0770    0.0767   0.177     0.859     1303. -0.00319   0.00382
## 5 -0.000333      0.265     0.265   -0.0976    0.922     1303. -0.00702   0.00635
## 6 -0.00305       0.890     0.893   -0.214     0.830     1272. -0.0310    0.0249 
## # ℹ 3 more variables: method <chr>, alternative <chr>, feature <chr>

# //TASK //////////////////////////////////////////////////////////////////////

Across all categories, there were no statistically significant differences between walkable and unwalkable tracts. All the mean differences were extremely small and every t-test had large p-values which suggest that although in person the tracts differ in perceived walkability, their visual environment doesn’t actually differ that strongly. Also, these tracts are not very large study areas which could have contributed towards these results. In the future, I would choose a larger tract and probably chosen a different neighborhood not based on my perceived walkability from experiencing it firsthand.