In the first section, you need to select one Census Tract that you think is the most walkable and another one that you think is least walkable within Fulton and DeKalb Counties, GA. As long as the two Census Tracts are within the two counties, you can pick any two you want. If the area you want to use as walkable/unwalkable area is not well-covered by one Census Tract, you can select multiple tracts (e.g., selecting three adjacent tracts as one walkable area). The definition of ‘walkable’ can be your own - you can choose solely based on your experience (e.g., had best/worst walking experience), refer to Walk Score, or any other mix of criteria you want. After you make the selection, provide a short write-up with a map explaining why you chose those Census Tracts.
The second section is the main part of this assignment, in which you prepare OSM data, download GSV images, and apply computer vision (i.e., semantic segmentation).
In the third section, you will summarize and analyze the output and report your findings. After you apply computer vision to the images, you will have, for each image, the number of pixels assigned to each of the 150 categories. Your analysis will focus on the following categories: building, sky, tree, road, and sidewalk. Specifically, you will (1) create maps to visualize the spatial distribution of these elements, (2) compare the mean of each category between the two Census Tracts, and (3) draw box plots to compare the distributions.
library(tidyverse)
library(tidycensus)
library(osmdata)
library(sfnetworks)
library(units)
library(sf)
library(tidygraph)
library(tmap)
library(here)
ttm()
Select walkable Census Tract(s) and unwalkable Census Tract(s) within Fulton and DeKalb counties.
To search for Census Tracts, you can use an approach similar to Step 1 of ‘Module4_getting_GSV_images.Rmd’. This time the unit of analysis is the Census Tract rather than the city, and the search boundary is the two counties rather than metro Atlanta.
Provide a brief description and visualization of your Census Tracts. Why do you think the Census Tracts are walkable and unwalkable? What were the contributing factors?
tracts <- tidycensus::get_acs(geography = "tract",
variables = c("hhinc" = 'B19013_001'),
year = 2020,
output = "wide",
state = "GA",
county = c("Fulton", "DeKalb"),
geometry = TRUE)
Most Walkable: FIPS = 13121001002. Reasoning: This census tract, where Georgia Tech is located, has a notably high walkability score, which matches my personal experience of walking there.
Least Walkable: FIPS = 13121007710. Reasoning: This census tract, which contains Cowart Lake and the Piedmont Driving Club, has one of the lowest walkability scores.
selected_tracts <- tracts %>%
filter(as.character(GEOID) %in% c("13121001002", "13121007710"))
tmap_mode("view")
tm_basemap("Esri.WorldTopoMap") +
tm_shape(tracts) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")+
tm_layout(legend.show = FALSE)+
tm_shape(selected_tracts) +
tm_polygons(col = 'GEOID', alpha = 1, palette = "Set2")
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Set up your api key here
tidycensus::census_api_key(Sys.getenv('census_api'))
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download Census Tract polygon for Fulton and DeKalb
tract <- get_acs("tract",
variables = c('tot_pop' = 'B01001_001'),
year = 2020,
state = "GA",
county = c("Fulton", "DeKalb"),
geometry = TRUE)
# =========== NO MODIFY ZONE ENDS HERE ========================================
# TASK ////////////////////////////////////////////////////////////////////////
# The purpose of this TASK is to create one bounding box for walkable Census Tract and another bounding box for unwalkable Census Tract.
# As long as you generate what's needed for the subsequent codes, you are good. The numbered list of tasks below is to provide some hints.
# 1. Write the GEOID of walkable & unwalkable Census Tracts. e.g., tr1_ID <- c("13121001205", "13121001206")
# 2. Extract the selected Census Tracts using tr1_ID & tr2_ID
# 3. Create their bounding boxes using st_bbox(), and
# 4. Assign them to tract_1_bb and tract_2_bb, respectively.
# 5. Change the coordinate system to GCS, if necessary.
# For the walkable Census Tract(s)
# 1.
tr1_ID <- "13121001002" # **YOUR CODE HERE..** --> For example, tr1_ID <- c("13121001205", "13121001206").
# 2~4
tract_1_bb <- tract %>%
filter(as.character(GEOID) %in% tr1_ID) %>%
st_transform(4326) %>% # reproject to WGS84 (GCS) before taking the bounding box
st_bbox() %>%
st_as_sfc()
# **YOUR CODE HERE..**
# For the unwalkable Census Tract(s)
# 1.
tr2_ID <- '13121007710' # **YOUR CODE HERE..**
# 2~4
tract_2_bb <- tract %>%
filter(as.character(GEOID) %in% tr2_ID) %>%
st_transform(4326) %>% # reproject to WGS84 (GCS) before taking the bounding box
st_bbox() %>%
st_as_sfc()
# **YOUR CODE HERE..**
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Get OSM data for the two bounding box
osm_1 <- opq(bbox = tract_1_bb) %>%
add_osm_feature(key = 'highway',
value = c("motorway", "trunk", "primary",
"secondary", "tertiary", "unclassified",
"residential")) %>%
osmdata_sf() %>%
osm_poly2line()
osm_2 <- opq(bbox = tract_2_bb) %>%
add_osm_feature(key = 'highway',
value = c("motorway", "trunk", "primary",
"secondary", "tertiary", "unclassified",
"residential")) %>%
osmdata_sf() %>%
osm_poly2line()
# =========== NO MODIFY ZONE ENDS HERE ========================================
# TASK ////////////////////////////////////////////////////////////////////////
# 1. Convert osm_1 and osm_2 to sfnetworks objects (set directed = FALSE)
# 2. Clean the network by (1) deleting parallel lines and loops, (2) creating missing nodes, and (3) removing pseudo nodes.
# 3. Add a new column named length using edge_length() function.
net1 <- osm_1$osm_lines %>%
sfnetworks::as_sfnetwork(directed = FALSE) %>%
activate("edges") %>%
filter(!edge_is_multiple()) %>%
filter(!edge_is_loop())%>%
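convert(.,sfnetworks::to_spatial_subdivision)%>% # create missing nodes
convert(.,sfnetworks::to_spatial_smooth)%>% # remove pseudo nodes, as the task requires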
mutate(length = edge_length())
# **YOUR CODE HERE..**
net2 <- osm_2$osm_lines %>%
sfnetworks::as_sfnetwork(directed = FALSE) %>%
activate("edges") %>%
filter(!edge_is_multiple()) %>%
filter(!edge_is_loop())%>%
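convert(.,sfnetworks::to_spatial_subdivision)%>% # create missing nodes
convert(.,sfnetworks::to_spatial_smooth)%>% # remove pseudo nodes, as the task requires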
mutate(length = edge_length())
# **YOUR CODE HERE..**
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# OSM for the walkable part
edges_1 <- net1 %>%
# Extract 'edges'
st_as_sf("edges") %>%
# Drop redundant columns
select(osm_id, highway, length) %>%
# Drop segments that are too short (100m)
mutate(length = as.vector(length)) %>%
filter(length > 100) %>%
# Add a unique ID for each edge
mutate(edge_id = seq(1,nrow(.)),
is_walkable = "walkable")
# OSM for the unwalkable part
edges_2 <- net2 %>%
# Extract 'edges'
st_as_sf("edges") %>%
# Drop redundant columns
select(osm_id, highway, length) %>%
# Drop segments that are too short (100m)
mutate(length = as.vector(length)) %>%
filter(length > 100) %>%
# Add a unique ID for each edge
mutate(edge_id = seq(1,nrow(.)),
is_walkable = "unwalkable")
# Merge the two
edges <- bind_rows(edges_1, edges_2)
# =========== NO MODIFY ZONE ENDS HERE ========================================
getAzi <- function(line){
# This function takes one edge (i.e., a street segment) as an input and
# outputs a data frame with four points (start, mid1, mid2, and end) and their azimuth.
# TASK ////////////////////////////////////////////////////////////////////////
# 1. From `line` object, extract the coordinates using st_coordinates() and extract the first two rows.
# 2. Use atan2() function to calculate the azimuth in degree.
# Make sure to adjust the value such that 0 is north, 90 is east, 180 is south, and 270 is west.
# 1
start_p <- line %>%
st_coordinates() %>%
.[1:2,1:2]
start_azi <- (atan2(start_p[2,"X"] - start_p[1, "X"],
start_p[2,"Y"] - start_p[1, "Y"])*180/pi) %% 360 # wrap into [0, 360) so that west is 270 rather than -90
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# Repeat what you did above, but for last two rows (instead of the first two rows).
# Remember to flip the azimuth so that the camera would be looking at the street that's being measured
end_p <- line %>%
st_coordinates() %>%
.[(nrow(.)-1):nrow(.),1:2]
end_azi <- atan2(end_p[2,"X"] - end_p[1, "X"],
end_p[2,"Y"] - end_p[1, "Y"])*180/pi
end_azi <- if (end_azi < 180) {end_azi + 180} else {end_azi - 180}
# //TASK //////////////////////////////////////////////////////////////////////
# TASK ////////////////////////////////////////////////////////////////////////
# 1. From `line` object, use st_line_sample() function to generate points at 45%, 50% and 55% locations.
# 2. Use st_cast() function to convert 'MULTIPOINT' object to 'POINT' object.
# 3. Extract coordinates using st_coordinates().
# 4. Use the 50% location to define `mid_p` object.
# 5. Use the 45% and 55% points and atan2() function to calculate azimuth `mid_azi`.
mid_p3 <- line %>%
st_line_sample(sample = c(0.45, 0.5, 0.55)) %>%
st_cast("POINT") %>%
st_coordinates()
mid_p <- mid_p3[2,]
mid_azi <- (atan2(mid_p3[3,"X"] - mid_p3[1, "X"],
mid_p3[3,"Y"] - mid_p3[1, "Y"])*180/pi) %% 360 # wrap into [0, 360) so that west is 270 rather than -90
mid_azi2 <- ifelse(mid_azi < 180, mid_azi + 180, mid_azi - 180)
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
return(tribble(
~type, ~X, ~Y, ~azi,
"start", start_p[1,"X"], start_p[1,"Y"], start_azi,
"mid1", mid_p["X"], mid_p["Y"], mid_azi,
"mid2", mid_p["X"], mid_p["Y"], mid_azi2,
"end", end_p[2,"X"], end_p[2,"Y"], end_azi))
# =========== NO MODIFY ZONE ENDS HERE ========================================
}
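The azimuth convention used in getAzi() can be checked in isolation. The sketch below is a minimal illustration, not part of the assignment template: `bearing_deg()` and `flip_bearing()` are hypothetical helper names, and the coordinates are toy values treated as planar X/Y.

```r
# Hypothetical helpers (not in the assignment template).
# Compass bearing in degrees from point 1 to point 2, treating X/Y as planar.
bearing_deg <- function(x1, y1, x2, y2) {
  # atan2(dx, dy) measures the angle clockwise from north, in (-180, 180]
  azi <- atan2(x2 - x1, y2 - y1) * 180 / pi
  # Wrap negative angles into [0, 360)
  azi %% 360
}

# Reverse a bearing so the camera looks back along the segment
flip_bearing <- function(azi) (azi + 180) %% 360

# Toy checks on the four cardinal directions
north <- bearing_deg(0, 0, 0, 1)
east  <- bearing_deg(0, 0, 1, 0)
south <- bearing_deg(0, 0, 0, -1)
west  <- bearing_deg(0, 0, -1, 0)
print(c(north = north, east = east, south = south, west = west))

# Flipping a westward bearing should point east
print(flip_bearing(west))
```

Passing `dx` as the first argument of `atan2()` and `dy` as the second is what makes 0 correspond to north; the `%% 360` step keeps every heading non-negative.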
We can now apply the getAzi() function to every edge in the edges object. Finally, we append the edges object to the result so that its columns (e.g., is_walkable) are available for each point. When you are finished with this code chunk, you will be ready to download GSV images.
# TASK ////////////////////////////////////////////////////////////////////////
# Apply getAzi() function to all edges.
# Remember that you need to pass edges object to st_geometry() before you apply getAzi()
endp_azi <- edges %>%
st_geometry() %>%
map_df(getAzi, .progress = T) # **YOUR CODE HERE..**
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
endp <- endp_azi %>%
bind_cols(edges %>%
st_drop_geometry() %>%
slice(rep(1:nrow(edges),each=4))) %>%
st_as_sf(coords = c("X", "Y"), crs = 4326, remove=FALSE) %>%
mutate(node_id = seq(1, nrow(.)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
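The `slice(rep(1:nrow(edges), each=4))` step above repeats each edge row four times so that the edge attributes line up with the four points returned per edge. A minimal base R sketch of the same alignment, using toy data frames (`edges_toy` and `points_toy` are hypothetical stand-ins, not objects from the assignment):

```r
# Toy stand-ins for the edge attributes and for the getAzi() output (made-up values)
edges_toy <- data.frame(edge_id = 1:2,
                        is_walkable = c("walkable", "unwalkable"))
points_toy <- data.frame(type = rep(c("start", "mid1", "mid2", "end"), times = 2))

# Repeat each edge row four times so row i of points_toy matches its own edge
idx <- rep(seq_len(nrow(edges_toy)), each = 4)
aligned <- cbind(points_toy, edges_toy[idx, ])
rownames(aligned) <- NULL
print(aligned)
```

This only works because getAzi() returns exactly four rows per edge, in edge order; if either assumption changed, a keyed join would be the safer choice.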
get_image <- function(iterrow){
# This function takes one row of endp and downloads GSV image using the information from endp.
# TASK ////////////////////////////////////////////////////////////////////////
# Finish this function definition.
# 1. Extract required information from the row of endp, including
# type (i.e., start, mid1, mid2, end), location, heading, edge_id, node_id, source (i.e., outdoor vs. default) and key.
# 2. Format the full URL and store it in furl.
# 3. Format the full path (including the file name) of the image being downloaded and store it in fpath
type <- iterrow$type
location <- paste0(iterrow$Y %>% round(5), ",", iterrow$X %>% round(5))
heading <- iterrow$azi %>% round(1)
edge_id <- iterrow$edge_id
node_id <- iterrow$node_id
key <- Sys.getenv("google_api")
furl <- glue::glue("https://maps.googleapis.com/maps/api/streetview?size=640x640&location={location}&heading={heading}&fov=90&pitch=0&key={key}") # **YOUR CODE HERE..**
fname <- glue::glue("GSV-nid_{node_id}-eid_{edge_id}-type_{type}-Location_{location}-heading_{heading}.jpg") # Don't change this code for fname
fpath <- file.path("/Users/helenalindsay/Documents/Fall_23/CP8883/Major3/downloaded_image", fname) # absolute path, so file.path() is used instead of here()
# //TASK //////////////////////////////////////////////////////////////////////
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Download images
if (!file.exists(fpath)){
download.file(furl, fpath, mode = 'wb')
}
# =========== NO MODIFY ZONE ENDS HERE ========================================
}
Before you download GSV images, make sure that the number of rows in endp is not too large! Each row of endp corresponds to one GSV image, so the row count is the number of images you will download. Always double-check the Billing tab of your Google Cloud Console first to make sure that you will not exceed the free credit of $200 each month. The price is $7 per 1000 images.
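The cost check described above is simple arithmetic and can be sketched as follows. The image count is a hypothetical placeholder (in practice it would be `nrow(endp)`); the price and credit are the figures quoted in the text.

```r
# Assumed inputs: n_images stands in for nrow(endp); prices are those quoted above
n_images <- 1500        # hypothetical image count
price_per_1000 <- 7     # USD per 1,000 images
monthly_credit <- 200   # USD of free credit per month

# Estimated charge for this download
estimated_cost <- n_images / 1000 * price_per_1000

# Largest number of images the monthly credit can cover
max_free_images <- floor(monthly_credit / price_per_1000 * 1000)

cat("Estimated cost (USD):", estimated_cost, "\n")
cat("Images covered by the credit:", max_free_images, "\n")

# Stop early if the download would exceed the free credit
stopifnot(estimated_cost <= monthly_credit)
```

The check ignores any images already downloaded earlier in the month, so the Billing tab remains the authoritative source.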
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Loop!
for (i in seq(1,nrow(endp))){
get_image(endp[i,])
}
# =========== NO MODIFY ZONE ENDS HERE ========================================
ZIP THE DOWNLOADED IMAGES AND NAME IT ‘gsv_images.zip’ FOR STEP 6.
Now, use Google Colab to apply the semantic segmentation model.
Merge the segmentation output to edges.
# Read the downloaded CSV file from Google Drive
seg_output <- read.csv("/Users/helenalindsay/Documents/Fall_23/CP8883/Major3/major3_images.csv")
# =========== NO MODIFICATION ZONE STARTS HERE ===============================
# Join the segmentation result to endp object.
seg_output_nodes <- endp %>% inner_join(seg_output, by=c("node_id"="img_id")) %>%
select(type, X, Y, node_id, building, sky, tree, road, sidewalk) %>%
mutate(across(c(building, sky, tree, road, sidewalk), function(x) x/(640*640)))
# =========== NO MODIFY ZONE ENDS HERE ========================================
At the beginning of this assignment, you defined one Census Tract as walkable and the other as unwalkable. The key to the following analysis is the comparison between walkable/unwalkable Census Tracts.
Create maps of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. In total, you will have 10 maps.
Below the maps, provide a brief description of your findings from the maps.
# TASK ////////////////////////////////////////////////////////////////////////
# Create map(s) to visualize the `seg_output_nodes` object.
# As long as you can deliver the message clearly, you can use any format/package you want.
# Put the tracts and the segmentation points in the same CRS (EPSG:4326) before the spatial join
selected_tracts <- st_transform(selected_tracts, crs = 4326)
seg_output_nodes <- st_transform(seg_output_nodes, crs = 4326)
joined <- st_join(seg_output_nodes, selected_tracts, join = st_intersects)
joined <- st_transform(joined, crs = 4326)
tract_1_bb <- st_transform(tract_1_bb, crs = 4326)
tract_2_bb <- st_transform(tract_2_bb, crs = 4326)
walkable <- st_intersection(joined, tract_1_bb)
unwalkable <- st_intersection(joined, tract_2_bb)
tmap_mode("view")
tmap_options(
basemaps = c("Esri.WorldTopoMap"),
bg.color = "lightgray" # Adjust background color if needed
)
walkable_map_build <- tm_basemap() +
tm_shape(walkable) +
tm_dots(col = 'building', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
unwalkable_map_build <- tm_basemap() +
tm_shape(unwalkable) +
tm_dots(col = 'building', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
walkable_map_sky <- tm_basemap() +
tm_shape(walkable) +
tm_dots(col = 'sky', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
unwalkable_map_sky <- tm_basemap() +
tm_shape(unwalkable) +
tm_dots(col = 'sky', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
walkable_map_tree <- tm_basemap() +
tm_shape(walkable) +
tm_dots(col = 'tree', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
unwalkable_map_tree <- tm_basemap() +
tm_shape(unwalkable) +
tm_dots(col = 'tree', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
walkable_map_road <- tm_basemap() +
tm_shape(walkable) +
tm_dots(col = 'road', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
unwalkable_map_road <- tm_basemap() +
tm_shape(unwalkable) +
tm_dots(col = 'road', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
walkable_map_side <- tm_basemap() +
tm_shape(walkable) +
tm_dots(col = 'sidewalk', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
unwalkable_map_side <- tm_basemap() +
tm_shape(unwalkable) +
tm_dots(col = 'sidewalk', alpha = 1, palette = "Set2", size = 0.05)+
tm_shape(selected_tracts) +
tm_borders(lwd = 4) +
tm_polygons(col = 'GEOID', alpha = 0, palette = "Set2")
tmap_arrange(walkable_map_build, unwalkable_map_build, ncol = 2)
tmap_arrange(walkable_map_sky, unwalkable_map_sky, ncol = 2)
tmap_arrange(walkable_map_tree, unwalkable_map_tree, ncol = 2)
tmap_arrange(walkable_map_road, unwalkable_map_road, ncol = 2)
tmap_arrange(walkable_map_side, unwalkable_map_side, ncol = 2)
# //TASK //////////////////////////////////////////////////////////////////////
Contrary to conventional expectations, the maps do not show consistent differences between the walkable and unwalkable areas. The proportions of sky, tree, road, and sidewalk look similar in both tracts. The one visible difference is in buildings: judging from the map legends, the building proportion ranges from 0 to 0.25 in the unwalkable area but only from 0 to 0.14 in the walkable area (the Georgia Tech campus). This wider range suggests that the presence or density of buildings is more variable across the unwalkable area.
Calculate the mean of the proportion of building, sky, tree, road, and sidewalk for walkable and unwalkable areas. In total, you will have 10 mean values.
After the calculation, provide a brief description of your findings.
# TASK ////////////////////////////////////////////////////////////////////////
# Perform the calculation as described above.
# As long as you can deliver the message clearly, you can use any format/package you want.
library(kableExtra)
build_mean <- mean(walkable$building)
sky_mean <- mean(walkable$sky)
tree_mean <- mean(walkable$tree)
road_mean <- mean(walkable$road)
side_mean <- mean(walkable$sidewalk)
non_build_mean <- mean(unwalkable$building)
non_sky_mean <- mean(unwalkable$sky)
non_tree_mean <- mean(unwalkable$tree)
non_road_mean <- mean(unwalkable$road)
non_side_mean <- mean(unwalkable$sidewalk)
mean_values <- data.frame(
Attribute = c("Building", "Sky", "Tree", "Road", "Sidewalk"),
Walkable = c(build_mean, sky_mean, tree_mean, road_mean, side_mean),
Unwalkable = c(non_build_mean, non_sky_mean, non_tree_mean, non_road_mean, non_side_mean))
kable(mean_values, caption = "Mean Proportions of Features in Walkable vs Unwalkable Areas") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
| Attribute | Walkable | Unwalkable |
|---|---|---|
| Building | 0.0184103 | 0.0220717 |
| Sky | 0.2722483 | 0.2460552 |
| Tree | 0.2629425 | 0.2615108 |
| Road | 0.3031522 | 0.3290750 |
| Sidewalk | 0.0109588 | 0.0097878 |
# //TASK //////////////////////////////////////////////////////////////////////
Contrary to expectations, the mean proportions differ only slightly between the walkable and unwalkable areas. The largest gaps are in sky (0.272 walkable vs. 0.246 unwalkable) and road (0.303 vs. 0.329), each about 0.026. Buildings are marginally more prominent in the unwalkable area (0.022 vs. 0.018), while trees are nearly identical (0.263 vs. 0.262). Sidewalks take up a slightly larger share of the image in the walkable area (0.011 vs. 0.010), but the gap is too small to link confidently to better pedestrian infrastructure. No formal statistical test was run, so these comparisons are descriptive only.
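If a formal check of a difference in means were wanted, one option (not part of the original analysis) is a Welch two-sample t-test. The sketch below runs on made-up vectors; `sky_walkable` and `sky_unwalkable` are hypothetical stand-ins for columns such as `walkable$sky` and `unwalkable$sky`, and the output says nothing about the real data.

```r
# Toy data only: made-up stand-ins for per-image sky proportions
set.seed(1)
sky_walkable   <- runif(40, min = 0.15, max = 0.40)
sky_unwalkable <- runif(40, min = 0.12, max = 0.38)

# Welch two-sample t-test (t.test() assumes unequal variances by default)
res <- t.test(sky_walkable, sky_unwalkable)

# Group means and the p-value for the difference between them
print(res$estimate)
print(res$p.value)
```

Images taken along the same street segment are unlikely to be independent, so any p-value from such a test should be read as a rough guide rather than a firm conclusion.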
Draw box plots comparing the proportion of building, sky, tree, road, and sidewalk between walkable and unwalkable areas. Each plot presents two boxes: one for walkable areas and the other for unwalkable areas. In total, you will have 5 plots.
After drawing the plots, provide a brief description of your findings.
# TASK ////////////////////////////////////////////////////////////////////////
# Create box plot(s) using geom_boxplot() function from ggplot2 package.
# Use `seg_output_nodes` object to draw the box plots.
# You will find `pivot_longer` function useful.
#tract_2_bb <- st_transform(tract_2_bb, crs = 4326)
endp_sub <- endp %>%
select(c(is_walkable,node_id))%>%
st_drop_geometry()
seg_output_nodes_join <- inner_join(seg_output_nodes, endp_sub, by = 'node_id')
df_long <- seg_output_nodes_join %>%
pivot_longer(
cols = c(building, sky, tree, road, sidewalk),
names_to = "Feature",
values_to = "Proportion"
)
# Create box plots: one panel per feature, each comparing walkable vs. unwalkable
ggplot(df_long, aes(x = is_walkable, y = Proportion, fill = is_walkable)) +
geom_boxplot() +
facet_wrap(~ Feature, scales = "free_y") +
theme_minimal() +
labs(
title = "Distribution of Proportions by Walkability",
x = "Walkability",
y = "Proportion"
) +
scale_fill_brewer(palette = "Set2")
# //TASK //////////////////////////////////////////////////////////////////////
The box plots indicate that, contrary to expectations, the distributions of all five features are broadly similar in the walkable and unwalkable areas. The central values for each feature appear consistent regardless of the walkability classification. Sidewalk proportions show no notable gap between the two tracts, which challenges the common assumption that good pedestrian infrastructure, represented by sidewalks, is a reliable indicator of walkability. Building proportions likewise show no clear distinction, suggesting that the urban and development characteristics reflected in building density do not vary much with walkability, and vice versa. The same holds for trees and sky: greater tree cover and a higher proportion of sky, often associated with suburban or rural settings, do not appear to be reliable predictors of walkability in this analysis.
In summary, contrary to conventional expectations, the examined features do not show consistent differences between the walkable and unwalkable areas. The lack of marked differences in the proportions of sidewalks, buildings, trees, sky, and roads challenges preconceived notions about the factors that influence walkability in the studied tracts. Because the comparison covers only two tracts, further investigation with additional variables is needed to gain a comprehensive understanding of walkability in these areas.