Task Description

Exploring Walkability Through Street View and Computer Vision

This assignment is divided into three main sections.

In the first section, you will select two Census Tracts within Fulton and DeKalb Counties, GA — one that you believe is the most walkable and another that is the least walkable. You may choose any tracts within these two counties. If the area you want to analyze is not well represented by a single tract, you may select multiple adjacent tracts (e.g., two contiguous tracts as one “walkable area”). The definition of walkable is up to you — it can be based on your personal experience (e.g., places where you’ve had particularly good or bad walking experiences), Walk Score data, or any combination of criteria. After making your selections, provide a brief explanation of why you chose those tracts.

The second section is the core of this assignment. You will prepare OpenStreetMap (OSM) data, download Google Street View (GSV) images, and apply the computer vision technique covered in class — semantic segmentation.

In the third section, you will summarize and analyze the results. After applying computer vision to the images, you will obtain pixel counts for 19 different object categories. Using the data, you will:

Create maps to visualize the spatial distribution of these objects,
Draw boxplots to compare their distributions between the walkable and unwalkable tracts, and
Perform t-tests to examine the differences in mean values and their statistical significance.

Section 0. Packages

Importing the necessary packages is part of this assignment. Add any required packages to the code chunk below as you progress through the tasks.

library(tidyverse)
library(magrittr)
library(sf)
library(tmap)
library(here)
library(osmdata)
library(sfnetworks)
library(units)
library(tidygraph)
library(progress)
library(nominatimlite)
library(geosphere)
library(fs)
library(tidycensus)
ttm()

Section 1. Choose your Census Tracts.

Use the Census Tract map in the following code chunk to identify the GEOIDs of the tracts you consider walkable and unwalkable.

# TASK ////////////////////////////////////////////////////////////////////////
# Set up your api key here
census_api_key(
  Sys.getenv("CENSUS_API_KEY")
)
# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract", 
                 variables = c('pop' = 'B01001_001'),
                 year = 2023,
                 state = "GA", 
                 county = c("Fulton", "DeKalb"), 
                 geometry = TRUE)

tmap_mode("view")
tm_basemap("OpenStreetMap") +
  tm_shape(tract) + 
  tm_polygons(fill_alpha = 0.2)
# =========== NO MODIFY ZONE ENDS HERE ========================================

Once you have the GEOIDs, create two Census Tract objects – one representing your most walkable area and the other your least walkable area.

# TASK ////////////////////////////////////////////////////////////////////////
# 1. Specify the GEOIDs of your walkable and unwalkable Census Tracts. 
#    e.g., tr_id_walkable <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using `tr_id_walkable` and `tr_id_unwalkable`

# For the walkable Census Tract(s)
tr_id_walkable <- c("13121001002", "13089022501")

tract_walkable <- tract %>%
  filter(GEOID %in% tr_id_walkable)

# For the unwalkable Census Tract(s)
tr_id_unwalkable <- c("13121012000", "13089022403")

tract_unwalkable <- tract %>%
  filter(GEOID %in% tr_id_unwalkable)

# //TASK //////////////////////////////////////////////////////////////////////


# TASK ////////////////////////////////////////////////////////////////////////
# Create an interactive map showing `tract_walkable` and `tract_unwalkable`
tmap_mode("view")

# Create the map
tm_shape(tract_walkable) +
  tm_polygons(
    col = "green",
    alpha = 0.6,
    border.col = "black",
    id = "GEOID",
    popup.vars = TRUE,
    title = "Walkable Tracts"
  ) +
tm_shape(tract_unwalkable) +
  tm_polygons(
    col = "red",
    alpha = 0.6,
    border.col = "black",
    id = "GEOID",
    popup.vars = TRUE,
    title = "Unwalkable Tracts"
  ) +
tm_layout(
  legend.outside = TRUE,
  main.title = "Walkable vs. Unwalkable Census Tracts",
  main.title.size = 1.2
)

# //TASK //////////////////////////////////////////////////////////////////////

In each of Fulton and DeKalb Counties, I selected two census tracts, one walkable and one unwalkable, based on my lived experience. In Fulton County, I picked the walkable tract around the Georgia Tech campus. From my experience it is very walkable, and university campuses are usually walkable because many students live within walking or biking distance. The unwalkable tract I picked lies a bit south of the downtown area, where I see many roadway intersections and the I-75 HOV lane, which indicates that the area is mostly accessible by automobile and has few pedestrians.

In DeKalb County, I picked the walkable tract around downtown Decatur. From my lived experience it is very walkable, and Google Maps shows many shops and cafes close to each other, which also indicates that the area is walkable. The unwalkable tract I picked lies to the left of downtown Decatur on the map, where I see a lot of roadways but a very low density of streets and roads, which can make the area unwalkable.

Section 2. OSM, GSV, and Computer Vision.

Step 1. Get and clean OSM data.

To obtain the OSM network for your selected Census Tracts: (1) Create bounding boxes. (2) Use the bounding boxes to download OSM data. (3) Convert the data into an sfnetwork object and clean it.

# TASK ////////////////////////////////////////////////////////////////////////
# Create one bounding box (`tract_walkable_bb`) for your walkable Census Tract(s) and another (`tract_unwalkable_bb`) for your unwalkable Census Tract(s).

# For the walkable Census Tract(s)
tract_walkable_bb <- st_bbox(tract_walkable)

# For the unwalkable Census Tract(s)  
tract_unwalkable_bb <- st_bbox(tract_unwalkable)

tract_walkable_bb_poly <- st_as_sfc(tract_walkable_bb)
tract_unwalkable_bb_poly <- st_as_sfc(tract_unwalkable_bb)

tmap_mode("view")

tm_shape(tract_walkable) + tm_polygons(col = "green", alpha = 0.6) +
tm_shape(tract_unwalkable) + tm_polygons(col = "red", alpha = 0.6) +
tm_shape(tract_walkable_bb_poly) + tm_borders(col = "darkgreen", lwd = 3) +
tm_shape(tract_unwalkable_bb_poly) + tm_borders(col = "darkred", lwd = 3)

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding boxes
osm_walkable <- opq(bbox = tract_walkable_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("primary", "secondary", "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()

osm_unwalkable <- opq(bbox = tract_unwalkable_bb) %>%
  add_osm_feature(key = 'highway', 
                  value = c("primary", "secondary", "tertiary", "residential")) %>%
  osmdata_sf() %>% 
  osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================


# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert `osm_walkable` and `osm_unwalkable` into sfnetwork objects (as undirected networks),
# 2. Clean the network by (1) deleting parallel lines and loops, (2) creating missing nodes, and (3) removing pseudo nodes (make sure the `summarise_attributes` argument is set to 'first' when doing so).

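net_walkable <- osm_walkable$osm_lines %>% 
  # Drop redundant columns 
  select(osm_id, highway) %>% 
  as_sfnetwork(directed = FALSE) %>%
  activate("edges") %>%  
  # (1) Delete parallel lines and loops
  filter(!edge_is_multiple()) %>%
  filter(!edge_is_loop()) %>%
  # (2) Create missing nodes where edges touch at interior points
  convert(to_spatial_subdivision) %>%
  # (3) Remove pseudo nodes, keeping the first attribute value of merged edges
  convert(to_spatial_smooth, summarise_attributes = "first")

net_unwalkable <- osm_unwalkable$osm_lines %>% 
  # Drop redundant columns 
  select(osm_id, highway) %>% 
  as_sfnetwork(directed = FALSE) %>%
  activate("edges") %>%  
  # (1) Delete parallel lines and loops
  filter(!edge_is_multiple()) %>%
  filter(!edge_is_loop()) %>%
  # (2) Create missing nodes where edges touch at interior points
  convert(to_spatial_subdivision) %>%
  # (3) Remove pseudo nodes, keeping the first attribute value of merged edges
  convert(to_spatial_smooth, summarise_attributes = "first")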
  
# //TASK //////////////////////////////////////////////////////////////////////
  
  
# TASK //////////////////////////////////////////////////////////////////////
# Using `net_walkable` and`net_unwalkable`,
# 1. Activate the edge component of each network.
# 2. Create a `length` column.
# 3. Filter out short (<300 feet) segments.
# 4. Randomly Sample 100 rows per road type.
# 5. Assign the results to `edges_walkable` and `edges_unwalkable`, respectively.

# OSM for the walkable part
edges_walkable <- net_walkable %>%
  activate("edges") %>%
  mutate(
    length = as.numeric(st_length(.))         # st_length() returns meters for lon/lat data
  ) %>%
  filter(length >= 300 * 0.3048) %>%          # keep segments of at least 300 ft (91.44 m)
  as_tibble() %>%                             # extract the edge table for sampling
  group_by(highway) %>%
  slice_sample(n = 100) %>%                   # no replacement: at most 100 distinct segments per road type
  ungroup()

# OSM for the unwalkable part
edges_unwalkable <- net_unwalkable %>%
  activate("edges") %>%
  mutate(
    length = as.numeric(st_length(.))         # st_length() returns meters for lon/lat data
  ) %>%
  filter(length >= 300 * 0.3048) %>%          # keep segments of at least 300 ft (91.44 m)
  as_tibble() %>%                             # extract the edge table for sampling
  group_by(highway) %>%
  slice_sample(n = 100) %>%                   # no replacement: at most 100 distinct segments per road type
  ungroup()

# //TASK //////////////////////////////////////////////////////////////////////
  
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Merge the two
edges <- bind_rows(edges_walkable %>% mutate(is_walkable = TRUE), 
                   edges_unwalkable %>% mutate(is_walkable = FALSE)) %>% 
  mutate(edge_id = seq(1,nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
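
The length filter and per-road-type sampling described in the task above can be illustrated with a small base R sketch. The `toy_edges` table and the `sample_per_group()` helper are hypothetical and exist only for this illustration; the sketch assumes lengths are stored in meters and that sampling should be done without replacement.

```r
# Toy edge table; lengths are in meters, as st_length() returns for lon/lat data.
toy_edges <- data.frame(
  highway  = c(rep("residential", 5), rep("primary", 2)),
  length_m = c(50, 95, 120, 300, 80, 92, 400)
)

# 1 ft is exactly 0.3048 m, so a 300 ft cutoff is 91.44 m.
threshold_m <- 300 * 0.3048
kept <- toy_edges[toy_edges$length_m >= threshold_m, ]

# Hypothetical helper: sample at most `n` rows per group, without replacement,
# so that no segment is drawn twice.
sample_per_group <- function(df, group_col, n) {
  pieces <- lapply(split(df, df[[group_col]]), function(g) {
    g[sample.int(nrow(g), size = min(n, nrow(g))), , drop = FALSE]
  })
  do.call(rbind, pieces)
}

set.seed(1)
sampled <- sample_per_group(kept, "highway", n = 2)
table(sampled$highway)
```

Sampling without replacement keeps each street segment at most once, which avoids requesting the same image twice and avoids counting the same segment repeatedly in later statistics.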

Step 2. Define `getAzimuth()` function.

In this assignment, you will collect two GSV images per road segment, as illustrated in the figure below. To do this, you will define a function that extracts the coordinates of the midpoint and the azimuths in both directions.
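
To make the idea of an azimuth concrete, here is a minimal base R sketch of the initial bearing between two longitude/latitude points. It uses the spherical formula, which is an assumption of this sketch; `geosphere::bearing()` may use a different (ellipsoidal) computation, so results can differ slightly. The helper names `initial_bearing()` and `normalize_heading()` are hypothetical.

```r
# Initial bearing (degrees, -180 to 180) from p1 to p2 on a sphere.
# Points are given as c(longitude, latitude) in degrees.
initial_bearing <- function(p1, p2) {
  to_rad <- pi / 180
  lon1 <- p1[1] * to_rad
  lat1 <- p1[2] * to_rad
  lon2 <- p2[1] * to_rad
  lat2 <- p2[2] * to_rad
  dlon <- lon2 - lon1
  y <- sin(dlon) * cos(lat2)
  x <- cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dlon)
  unname(atan2(y, x) / to_rad)
}

# Map a bearing onto the 0 to 360 range used for compass headings.
normalize_heading <- function(deg) (deg + 360) %% 360

mid <- c(0, 0)
north <- initial_bearing(mid, c(0, 1))    # looking toward increasing latitude
east  <- initial_bearing(mid, c(1, 0))    # looking toward increasing longitude
west  <- initial_bearing(mid, c(-1, 0))   # opposite direction of `east`
normalize_heading(c(north, east, west))
```

The two headings collected per segment point in roughly opposite directions along the street, so they differ by about 180 degrees on a straight segment.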

getAzimuth <- function(line){

  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. Use the `st_line_sample()` function to sample three points at locations 0.48, 0.5, and 0.52 along the line. These points will be used to calculate the azimuth.
  # 2. Use `st_cast()` function to convert the 'MULTIPOINT' object into a 'POINT' object.
  # 3. Extract coordinates using `st_coordinates()`.
  # 4. Assign the coordinates of the midpoint to `mid_p`.
  # 5. Calculate the azimuths from the midpoint in both directions and save them as `mid_azi_1` and `mid_azi_2`, respectively.
  
  # 1-3
  mid_p3 <- line %>% 
    st_line_sample(sample = c(0.48, 0.5, 0.52)) %>% 
    st_cast("POINT") %>%
    st_coordinates()
  # 4
  mid_p <- mid_p3[2, ]
  
  # 5
  mid_azi_1 <- geosphere::bearing(mid_p3[2, ], mid_p3[3, ])
  
  mid_azi_2 <- geosphere::bearing(mid_p3[2, ], mid_p3[1, ])
  
  # //TASK //////////////////////////////////////////////////////////////////////
 
  
  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  return(tribble(
    ~type,    ~X,            ~Y,             ~azi,
    "mid1",    mid_p["X"],   mid_p["Y"],      mid_azi_1,
    "mid2",    mid_p["X"],   mid_p["Y"],      mid_azi_2))
  # =========== NO MODIFY ZONE ENDS HERE ========================================

}

Step 3. Apply the function to all street segments

Apply the getAzimuth() function to the edges object. Once this step is complete, your data will be ready for downloading GSV images.

# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzimuth() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzimuth()
edges_azi <- edges %>% 
  st_geometry() %>% 
  map_df(getAzimuth, .progress = TRUE)

# //TASK //////////////////////////////////////////////////////////////////////

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
edges_azi <- edges_azi %>% 
  bind_cols(edges %>% 
              st_drop_geometry() %>% 
              slice(rep(1:nrow(edges),each=2))) %>% 
  st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>% 
  mutate(img_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 4. Define a function that formats request URL and download images.

getImage <- function(iterrow){
  # This function takes one row of `edges_azi` and downloads GSV image using the information from the row.
  
  # TASK ////////////////////////////////////////////////////////////////////////
  # 1. Extract required information from the row of `edges_azi`
  # 2. Format the full URL and store it in `request`. Refer to this page: https://developers.google.com/maps/documentation/streetview/request-streetview
  # 3. Format the full path (including the file name) of the image being downloaded and store it in `fpath`
  type <- iterrow$type
  location <- paste0(iterrow$Y %>% round(5), ",", iterrow$X %>% round(5))
  heading <- iterrow$azi %>% round(1)
  edge_id <- iterrow$edge_id
  img_id <- iterrow$img_id      # use the img_id you just created
  highway <- iterrow$highway
  key <- Sys.getenv('GOOGLE_API_KEY')
  
  endpoint <- "https://maps.googleapis.com/maps/api/streetview"
  
  request <- glue::glue("{endpoint}?size=640x640&location={location}&heading={heading}&fov=90&pitch=0&key={key}")
  fname <- glue::glue("GSV-nid_{img_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg") # Don't change this code for fname
  # Make sure the output folder exists; download.file() fails if it is missing
  dir.create("GSV_images", showWarnings = FALSE)
  fpath <- file.path("GSV_images", fname)
  # //TASK //////////////////////////////////////////////////////////////////////

  # =========== NO MODIFICATION ZONE STARTS HERE ===============================
  # Download images
  if (!file.exists(fpath)){
    download.file(request, fpath, mode = 'wb') 
  }
  # =========== NO MODIFY ZONE ENDS HERE ========================================
}
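
As a small illustration of how the request URL is assembled, the base R sketch below builds the query string from a named vector of parameters. The helper `build_gsv_url()` and the placeholder key `"DEMO_KEY"` are hypothetical; the parameter names (`size`, `location`, `heading`, `fov`, `pitch`, `key`) are the ones used in `getImage()` above.

```r
# Hypothetical helper: assemble a Street View Static API request URL.
build_gsv_url <- function(lat, lon, heading, key,
                          endpoint = "https://maps.googleapis.com/maps/api/streetview") {
  params <- c(
    size     = "640x640",
    location = paste0(round(lat, 5), ",", round(lon, 5)),  # latitude first, then longitude
    heading  = round(heading, 1),
    fov      = 90,
    pitch    = 0,
    key      = key
  )
  # Join "name=value" pairs with "&" and append them to the endpoint.
  paste0(endpoint, "?", paste0(names(params), "=", params, collapse = "&"))
}

url <- build_gsv_url(lat = 33.775618, lon = -84.396281, heading = 45.5, key = "DEMO_KEY")
url
```

Keeping the parameters in one named vector makes it easy to check that latitude precedes longitude in `location`, which is a common source of misplaced images.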

Step 5. Download GSV images

Before you download GSV images, make sure the row number in edges_azi is not too large! Each row corresponds to one GSV image, so if the row count exceeds your API quota, consider selecting different Census Tracts.
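
One way to check the request volume before running the download loop is to count how many image files are still missing, since `getImage()` skips files that already exist. The sketch below is a hypothetical illustration in a temporary folder; `count_pending_requests()` and the `max_requests` budget are assumptions, not values from the Google API.

```r
# Hypothetical helper: count the images that would still be requested.
count_pending_requests <- function(file_names, out_dir) {
  sum(!file.exists(file.path(out_dir, file_names)))
}

# Toy demonstration in a temporary folder.
out_dir <- file.path(tempdir(), "GSV_images_demo")
dir.create(out_dir, showWarnings = FALSE)
file_names <- paste0("img_", 1:4, ".jpg")
invisible(file.create(file.path(out_dir, file_names[1])))  # pretend one image already exists

pending <- count_pending_requests(file_names, out_dir)

max_requests <- 1000  # assumed personal budget, not an official quota
if (pending > max_requests) {
  stop("Too many requests: ", pending, " images pending; consider fewer tracts.")
}
pending
```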

# =========== NO MODIFICATION ZONE STARTS HERE ===============================
for (i in seq(1,nrow(edges_azi))){
  getImage(edges_azi[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================

Step 6. Apply computer vision

Use this Google Colab script to apply the pretrained semantic segmentation model to your GSV images.

Step 7. Merging the processed data back to R

Once all of the images are processed and saved in your Colab session as a CSV file, download the CSV file and merge it back to edges_azi.

# TASK ////////////////////////////////////////////////////////////////////////
# Read the downloaded CSV file containing the semantic segmentation results.
seg_output <- read.csv("seg_output.csv")
# //TASK ////////////////////////////////////////////////////////////////////////

# TASK ////////////////////////////////////////////////////////////////////////  
# 1. Join the `seg_output` data to `edges_azi`.
# 2. Calculate the proportion of predicted pixels for the following categories: `building`, `sky`, `road`, and `sidewalk`. If there are other categories you are interested in, feel free to include their proportions as well.
# 3. Calculate the proportion of greenness using the `vegetation` and `terrain` categories.
# 4. Calculate the building-to-street ratio. For the street, use `road` and `sidewalk` pixels; including `car` pixels is optional.

edges_seg_output <- edges_azi %>%
  inner_join(seg_output, by= "img_id")

#percentages of building, sky, road, and sidewalk, greenness 
edges_seg_output %<>% 
  mutate(pct_building = building/(768*768),
         pct_sky = sky/(768*768),
         pct_road = road/(768*768),
         pct_sidewalk = sidewalk/(768*768),
         prop_greenness = (vegetation + terrain) / (768*768),
         building_street_ratio = building / (road + sidewalk))
  
# //TASK ////////////////////////////////////////////////////////////////////////
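
The join and the derived measures above can be checked on a toy table. The sketch below uses base R with invented pixel counts; the column names mirror those in the text, `total_px` is an assumed image size for the toy data, and returning `NA` when no street pixels are present is a design choice of this sketch rather than part of the assignment.

```r
# Toy segmentation output (other categories omitted); counts are invented.
seg_toy <- data.frame(
  img_id     = 1:3,
  building   = c(100, 0, 50),
  road       = c(200, 0, 100),
  sidewalk   = c(50, 0, 0),
  vegetation = c(25, 300, 10),
  terrain    = c(25, 100, 40)
)
meta_toy <- data.frame(img_id = 1:4, is_walkable = c(TRUE, TRUE, FALSE, FALSE))

total_px <- 400  # assumed number of pixels per image in this toy example

# Inner join: img_id 4 has no segmentation result and is dropped.
merged <- merge(meta_toy, seg_toy, by = "img_id")

merged$prop_greenness <- (merged$vegetation + merged$terrain) / total_px

# Guard against division by zero when an image has no road or sidewalk pixels.
street <- merged$road + merged$sidewalk
merged$building_street_ratio <- ifelse(street > 0, merged$building / street, NA_real_)

merged[, c("img_id", "prop_greenness", "building_street_ratio")]
```

Without the guard, an image with no street pixels would produce `Inf` or `NaN`, which would then propagate into group means and t-tests.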

Section 3. Summarize and analyze the results.

At the beginning of this assignment, you specified walkable and unwalkable Census Tracts. The key focus of this section is the comparison between these two types of tracts.

Analysis 1 - Visualize Spatial Distribution

Create interactive maps showing the proportion of sidewalk, greenness, and the building-to-street ratio for both walkable and unwalkable areas. In total, you will produce 6 maps. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Plot interactive map(s)

# Split into walkable and unwalkable
walkable_areas <- edges_seg_output %>% filter(is_walkable == TRUE)
unwalkable_areas <- edges_seg_output %>% filter(is_walkable == FALSE)

# tmap interactive mode
tmap_mode("view")

# Greenness proportion: walkable
tm_basemap("OpenStreetMap") +
  tm_shape(walkable_areas) +
  tm_dots(size = 0.7,
          fill = "prop_greenness", 
          fill.scale = tm_scale(values = c("darkgray", "yellow", "green", "darkgreen")))

# Greenness proportion: unwalkable
tm_basemap("OpenStreetMap") +
  tm_shape(unwalkable_areas) +
  tm_dots(
    size = 0.7,
    fill = "prop_greenness",
    fill.scale = tm_scale(values = c("darkgray", "yellow", "green", "darkgreen"))
  )

# Sidewalk proportion: walkable
tm_basemap("OpenStreetMap") +
  tm_shape(walkable_areas) +
  tm_dots(size = 0.7,
          col = "pct_sidewalk",        
          palette = c("#ffffd4", "#fed98e", "#fe9929", "#cc4c02"),
          title = "Sidewalk %")

# Sidewalk proportion: unwalkable
tm_basemap("OpenStreetMap") +
  tm_shape(unwalkable_areas) +
  tm_dots(size = 0.7,
          col = "pct_sidewalk",         
          palette = c("#ffffd4", "#fed98e", "#fe9929", "#cc4c02"),
          title = "Sidewalk %")

# Building-to-street ratio: walkable
tm_basemap("OpenStreetMap") +
  tm_shape(walkable_areas) +
  tm_dots(
    size = 0.7,
    col = "building_street_ratio",
    style = "fixed",
    breaks = c(0, 0.25, 0.5, 0.75, 1), 
    palette = c("#EED8AE", "#C4A484", "#8B5A2B", "brown"),
    title = "Building-Street Ratio"
  )

# Building-to-street ratio: unwalkable
tm_basemap("OpenStreetMap") +
  tm_shape(unwalkable_areas) +
  tm_dots(
    size = 0.7,
    col = "building_street_ratio",
    style = "fixed",
    breaks = c(0, 0.25, 0.5, 0.75, 1), 
    palette = c("#EED8AE", "#C4A484", "#8B5A2B", "brown"),
    title = "Building-Street Ratio"
  )

# //TASK //////////////////////////////////////////////////////////////////////

From these maps, we can observe the following:

1. Greenness: The proportion of greenness increases as we move from the city center and Midtown toward suburbs such as Decatur. This is expected, since city centers and Midtown areas usually have more skyscrapers. Although I considered the Georgia Tech census tract to be walkable, I observe more greenness in the walkable areas of Decatur. It would be interesting to explore whether this relationship is causal. The unwalkable areas show a similar pattern: less greenness near the city center and more toward the suburbs.
2. Sidewalks: These results are somewhat puzzling, because the model picks up only a very small percentage of sidewalk pixels in both walkable and unwalkable areas. I wonder whether the model is trained well enough to detect sidewalks. I believe we saw in the class lecture as well that the model did not do well at detecting sidewalks.
3. Building-to-street ratio: The ratio appears much higher in the Midtown area near Georgia Tech. This is expected, since city center areas are much denser, with fewer open spaces and more area covered by retail, offices, and similar uses.

Analysis 2 - Boxplot

Create boxplots for the proportion of each category (building, sky, road, sidewalk, greenness, and any additional categories of interest) and the building-to-street ratio for walkable and unwalkable tracts. Each plot should compare walkable and unwalkable tracts. In total, you will produce 6 or more boxplots. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Create boxplot(s) using ggplot2 package.

# Create boxplots with walkability on x-axis
edges_seg_output %>%
  pivot_longer(
    cols = c(building_street_ratio, prop_greenness, pct_building, 
             pct_sky, pct_road, pct_sidewalk), 
    names_to = 'variable', 
    values_to = 'value'
  ) %>%
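  ggplot(aes(x = is_walkable, y = value, fill = is_walkable)) +
  geom_boxplot(alpha = 0.7, outlier.size = 0.7, color = "gray30") +
  # Show readable group names on the x-axis instead of FALSE/TRUE
  scale_x_discrete(labels = c("FALSE" = "Unwalkable", "TRUE" = "Walkable")) +
  scale_fill_manual(values = c("TRUE" = "#91cf60", "FALSE" = "#fc8d59"),
                    labels = c("Unwalkable", "Walkable")) +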
  facet_wrap(~variable, scales = "free_y", nrow = 2) +
  theme_bw(base_size = 13) +
  labs(
    title = "Distribution of Built Environment and Visual Features by Walkability",
    x = "Walkability",
    y = "Proportion / Ratio",
    fill = "Walkability"
  ) +
  theme(
    legend.position = "none",
    strip.background = element_rect(fill = "#f0f0f0", color = "gray70"),
    strip.text = element_text(face = "bold"),
    axis.text.x = element_text(angle = 0, hjust = 0.5)
  )

# //TASK //////////////////////////////////////////////////////////////////////

For sidewalks, the median is higher in unwalkable tracts than in walkable tracts. This is puzzling, as we would expect the percentage of sidewalk to be higher in walkable areas. Several factors could have contributed to this result, possibly including the model's weak sidewalk detection and the fact that the bounding boxes cover more than the selected tracts. In contrast, the median proportion of greenness is higher in walkable areas than in unwalkable areas, which is the expected result. For sky, the median is higher in unwalkable areas than in walkable areas. I assume this is expected, since more visible sky could mean less shade and more heat, making an area less walkable. The remaining variables (building, road, and building-to-street ratio) look quite similar between walkable and unwalkable areas. More robust statistical tests are needed to determine whether these differences are significant.
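
To complement the visual reading of the boxplots, group medians can also be tabulated directly. The sketch below uses base R `aggregate()` on a small synthetic table; the values are invented for illustration and are not the assignment's results.

```r
# Synthetic example data; the proportions are invented for illustration only.
toy <- data.frame(
  is_walkable  = rep(c(TRUE, FALSE), each = 5),
  pct_sidewalk = c(0.01, 0.02, 0.03, 0.02, 0.10, 0.03, 0.04, 0.05, 0.04, 0.02),
  pct_sky      = c(0.10, 0.20, 0.15, 0.25, 0.30, 0.30, 0.35, 0.25, 0.40, 0.20)
)

# One row per group, one median per variable (FALSE is listed before TRUE).
medians <- aggregate(cbind(pct_sidewalk, pct_sky) ~ is_walkable,
                     data = toy, FUN = median)
medians
```

Note that the median is robust to outliers, so it can rank the two groups differently than the mean used by a t-test.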

Analysis 3 - Mean Comparison (t-test)

Perform t-tests on the mean proportion of each category (building, sky, road, sidewalk, greenness, and any additional categories of interest) as well as the building-to-street ratio between street segments in the walkable and unwalkable tracts. This will result in 6 or more t-test results. Provide a brief description of your findings.

# TASK ////////////////////////////////////////////////////////////////////////
# Perform t-tests and report both the differences in means and their statistical significance.
# As long as you can deliver the message clearly, you can use any format/package you want.
# Pull the values of each variable for walkable and unwalkable segments

walkable_building <- edges_seg_output %>% filter(is_walkable == TRUE) %>% pull(pct_building)
unwalkable_building <- edges_seg_output %>% filter(is_walkable == FALSE) %>% pull(pct_building)

walkable_sky <- edges_seg_output %>% filter(is_walkable == TRUE) %>% pull(pct_sky)
unwalkable_sky <- edges_seg_output %>% filter(is_walkable == FALSE) %>% pull(pct_sky)

walkable_road <- edges_seg_output %>% filter(is_walkable == TRUE) %>% pull(pct_road)
unwalkable_road <- edges_seg_output %>% filter(is_walkable == FALSE) %>% pull(pct_road)

walkable_sidewalk <- edges_seg_output %>% filter(is_walkable == TRUE) %>% pull(pct_sidewalk)
unwalkable_sidewalk <- edges_seg_output %>% filter(is_walkable == FALSE) %>% pull(pct_sidewalk)

walkable_greenness <- edges_seg_output %>% filter(is_walkable == TRUE) %>% pull(prop_greenness)
unwalkable_greenness <- edges_seg_output %>% filter(is_walkable == FALSE) %>% pull(prop_greenness)

walkable_b2s_ratio <- edges_seg_output %>% filter(is_walkable == TRUE) %>% pull(building_street_ratio)
unwalkable_b2s_ratio <- edges_seg_output %>% filter(is_walkable == FALSE) %>% pull(building_street_ratio)

t_test_building <- t.test(walkable_building, unwalkable_building)
t_test_sky <- t.test(walkable_sky, unwalkable_sky)
t_test_road <- t.test(walkable_road, unwalkable_road)
t_test_sidewalk <- t.test(walkable_sidewalk, unwalkable_sidewalk)
t_test_greenness <- t.test(walkable_greenness, unwalkable_greenness)
t_test_b2s_ratio <- t.test(walkable_b2s_ratio, unwalkable_b2s_ratio)

results_list <- list(
  Building = list(w = walkable_building, u = unwalkable_building, t = t_test_building),
  Sky = list(w = walkable_sky, u = unwalkable_sky, t = t_test_sky),
  Road = list(w = walkable_road, u = unwalkable_road, t = t_test_road),
  Sidewalk = list(w = walkable_sidewalk, u = unwalkable_sidewalk, t = t_test_sidewalk),
  Greenness = list(w = walkable_greenness, u = unwalkable_greenness, t = t_test_greenness),
  B2S_Ratio = list(w = walkable_b2s_ratio, u = unwalkable_b2s_ratio, t = t_test_b2s_ratio)
)

# Function to extract only t-statistic and p-value
extract_summary <- function(name, obj) {
  data.frame(
    category = name,
    t_stat = as.numeric(obj$t$statistic),
    p_value = obj$t$p.value
  )
}

# Apply and combine
final_summary_table <- do.call(rbind,
                               mapply(extract_summary,
                                      names(results_list),
                                      results_list,
                                      SIMPLIFY = FALSE))

final_summary_table

##            category     t_stat     p_value
## Building   Building  2.3963161 0.016664653
## Sky             Sky -1.8791802 0.060367691
## Road           Road  1.6906825 0.091088770
## Sidewalk   Sidewalk -2.8962253 0.003818248
## Greenness Greenness -0.6176407 0.536883677
## B2S_Ratio B2S_Ratio  2.8964662 0.003842006

# //TASK //////////////////////////////////////////////////////////////////////

At the 0.05 significance level, the t-tests for Building (p = 0.017), Sidewalk (p = 0.004), and B2S_Ratio (p = 0.004) are significant, which indicates that the mean values of these features differ between walkable and unwalkable census tracts. Sky (p = 0.060) and Road (p = 0.091) fall just short of that threshold, and Greenness (p = 0.537) shows no significant difference.

For example, the significant p-value for Building indicates that the percentage of building area differs between walkable and unwalkable census tracts. Because the walkable group is the first argument to t.test(), the positive t-statistic (2.396) means the mean is higher in walkable areas, which suggests that a greater presence of buildings is positively associated with walkability. In other words, we reject the null hypothesis of no difference in means and conclude that walkable census tracts tend to have a higher percentage of building area than unwalkable tracts. By the same logic, the negative t-statistic for Sidewalk (-2.896) means the mean sidewalk proportion is lower in walkable tracts, consistent with the boxplots. While we observe these associations, we do not have enough information to establish a causal relationship between the variables.
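
Since the task asks for both the difference in means and its significance, a summary that reports the group means alongside the test statistic may be easier to read. The sketch below is one possible format, using synthetic values and a hypothetical helper `summarise_t_test()`; it relies on the fact that `t.test()` runs Welch's two-sample test by default.

```r
# Hypothetical helper: summarise a two-sample comparison in one row.
summarise_t_test <- function(x, y) {
  tt <- t.test(x, y)  # Welch two-sample t-test (default: var.equal = FALSE)
  data.frame(
    mean_x  = mean(x),
    mean_y  = mean(y),
    diff    = mean(x) - mean(y),      # positive when the first group is larger
    t_stat  = unname(tt$statistic),
    p_value = tt$p.value,
    ci_low  = tt$conf.int[1],         # 95% confidence interval for the difference
    ci_high = tt$conf.int[2]
  )
}

# Synthetic proportions, invented for illustration only.
walkable_toy   <- c(0.30, 0.35, 0.40, 0.45, 0.50)
unwalkable_toy <- c(0.20, 0.25, 0.30, 0.35, 0.40)

res <- summarise_t_test(walkable_toy, unwalkable_toy)
res
```

The sign of `diff` and `t_stat` follows the argument order, so passing the walkable group first means positive values indicate a higher mean in walkable tracts.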

Note: The census tracts I picked were not contiguous, and I realized only later in the process that each bounding box covered the entire area between the two selected tracts, so each category was calculated for the whole covered area. I tried to mitigate this issue by visually focusing on the areas I initially identified as walkable and unwalkable, and I reported my findings and conclusions accordingly.
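
The bounding-box issue described in this note can be reproduced on toy geometry. The sketch below uses the `sf` package with two invented square "tracts" separated by a gap, in planar toy coordinates without a CRS; `make_square()` is a hypothetical helper. It shows that a point in the gap falls inside the bounding box but is removed when filtering against the tract polygons themselves.

```r
library(sf)

# Hypothetical helper: a square polygon with its lower-left corner at (xmin, ymin).
make_square <- function(xmin, ymin, size = 1) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
    c(xmin, ymin + size), c(xmin, ymin)
  )))
}

# Two toy "tracts" separated by a gap.
tracts <- st_sf(id = c("A", "B"),
                geometry = st_sfc(make_square(0, 0), make_square(3, 0)))
bb_poly <- st_as_sfc(st_bbox(tracts))

# Three toy sample points: inside tract A, in the gap, inside tract B.
pts <- st_sf(pt = 1:3,
             geometry = st_sfc(st_point(c(0.5, 0.5)),
                               st_point(c(2.0, 0.5)),
                               st_point(c(3.5, 0.5))))

# Keep points that intersect the bounding box versus the tracts themselves.
in_bbox   <- pts[lengths(st_intersects(pts, bb_poly)) > 0, ]
in_tracts <- pts[lengths(st_intersects(pts, tracts)) > 0, ]

nrow(in_bbox)    # every point falls inside the bounding box
nrow(in_tracts)  # the point in the gap is dropped
```

Applying a similar intersection filter to the sampled street segments, using the selected tract polygons rather than their bounding box, would be one way to restrict the analysis to the intended areas.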