Disclaimer: The contents of this document come from 12. Transportation of Geocomputation with R (Lovelace, Nowosad, & Muenchow, 2019). This document is prepared for CP6521 Advanced GIS, a graduate-level city planning elective course at Georgia Tech in Spring 2019. For any question, contact the instructor, Yongsung Lee, Ph.D. via yongsung.lee(at)gatech.edu.
This document is also published on RPubs.
library(tidyverse)
library(sf)
library(spDataLarge)
library(stplanr) # geographic transport data package
library(tmap)     

Objectives of today’s tutorial

  1. Describe the geographical pattern of transport behavior in the city.
  2. Identify key public transport nodes and routes along which cycling to rail stations could be encouraged, as the first stage in multi-model trips.
  3. Analyze travel ‘desire lines’, to find where many people drive short distances.
  4. Identify cycle route locations that will encourage less car driving and more cycling.

12.2 A case study of Bristol

12.3 Transport Zone

bristol_region = osmdata::getbb("Bristol", format_out = "sf_polygon")

Possible issues with this simple approach
1. The first OSM boundary returned by OSM may not be the official boundary used by local authorities.
2. Even if OSM returns the official boundary, this may be inappropriate for transport research because they bear little relation to where people travel.

Alternative: Travel to Work Areas (TTWAs)
1. First defined as contiguous zones within which 75% of the population travels to work
2. Because Bristol is a major employer attracting travel from surrounding towns, its TTWA is substantially larger than the city bounds
3. In this tutorial, we use officially defined zones of intermediate geographic resolution (their official name is Middle layer Super Output Areas or MSOAs). Each houses around 8,000 people.

spDataLarge::bristol_ttwa  # 1 feature
spDataLarge::bristol_zones # 102 features
tmap_mode("view")
tm_basemap("OpenStreetMap.Mapnik") + 
  tm_shape(bristol_zones) +
  tm_fill(col = "orange", alpha = 0.5) + 
  tm_borders(col = "white", lwd = 1) + 
  tm_shape(bristol_ttwa) +
  tm_borders(col = "black", lwd = 2)  
names(bristol_zones) # contains no attribute data on transport
nrow(bristol_zones) # 102 rows  
nrow(bristol_od)    # 2910 rows 
zones_attr = bristol_od %>% 
  group_by(o) %>%                    # by zone of origin
  summarize_if(is.numeric, sum) %>%  # compute total number of people living in each zone by mode of transport
  dplyr::rename(geo_code = o)        # Renamed o to geo_code, which is the common key with bristol_zones 
summary(zones_attr$geo_code %in% bristol_zones$geo_code)
zones_joined = left_join(bristol_zones, zones_attr, by = "geo_code")
sum(zones_joined$all) # all commuters within Bristol 
names(zones_joined)   # all column names 
zones_od = bristol_od %>% 
  group_by(d) %>% 
  summarize_if(is.numeric, sum) %>% 
  dplyr::select(geo_code = d, all_dest = all) %>% 
  inner_join(zones_joined, ., by = "geo_code")
# https://github.com/Robinlovelace/geocompr/blob/master/code/12-zones.R
tmap_mode("plot")
tm_shape(zones_od) + tm_fill(c("all", "all_dest"), 
                             palette = viridis::plasma(4),
                             breaks = c(0, 2000, 4000, 10000, 40000),
                             title = "Trips")  +
  tm_borders(col = "black", lwd = 0.5) + 
  tm_facets(free.scales = FALSE, ncol = 2) +
  tm_layout(panel.labels = c("Zone of origin", "Zone of destination"))

12.4 Desire lines

First, identify top five od pairs with the largest commuters

od_top5 = bristol_od %>% 
  arrange(desc(all)) %>% 
  top_n(5, wt = all)
## # A tibble: 5 x 7
##   o         d           all bicycle  foot car_driver train
##   <chr>     <chr>     <dbl>   <dbl> <dbl>      <dbl> <dbl>
## 1 E02003043 E02003043  1493      66  1296         64     8
## 2 E02003047 E02003043  1300     287   751        148     8
## 3 E02003031 E02003043  1221     305   600        176     7
## 4 E02003037 E02003043  1186      88   908        110     3
## 5 E02003034 E02003043  1177     281   711        100     7

A few observations about od_top5:
- Walking is the most popular mode of transport among the top 5 origin-destination pairs
- Zone E02003043 is a popular destination (Bristol city center, the destination of all the top 5 OD pairs)
- The intrazonal trips, from one part of zone E02003043 to another (first row), constitute the most traveled OD pair in the dataset

Next, calculate the percentage of each desire line that is made by these active modes:

bristol_od$Active = (bristol_od$bicycle + bristol_od$foot) /
  bristol_od$all * 100
od_intra = filter(bristol_od, o == d)
od_inter = filter(bristol_od, o != d)
?od2line
desire_lines = od2line(od_inter, zones_od)
desire_lines_top5 = od2line(od_top5, zones_od)

tmap_mode("view")
# tmaptools::palette_explorer()
tm_basemap("OpenStreetMap.Mapnik") + 
  tm_shape(desire_lines) +
  tm_lines(palette = "plasma", breaks = c(0, 5, 10, 20, 40, 100),
    lwd = "all",
    scale = 9,
    title.lwd = "Number of trips",
    alpha = 0.6,
    col = "Active",
    title = "Active travel (%)"
  ) +
  tm_shape(desire_lines_top5) +
  tm_lines(lwd = 5, col = "black", alpha = 0.7) +
  tm_scale_bar()

12.5 Routes

To avoid long processing times via online routing services, let’s focus on selected od pairs, whose commuters are more likely to change travel modes in reponse to new policies.
- Desire lines along which a high (300+) number of car trips take place that are up to 5 km in distance.
- 5 km Euclidean distance (or around 6-8 km of route distance) can realistically be cycled by many people, especially if they are riding an electric bicycle (‘ebike’)

desire_lines$distance = as.numeric(st_length(desire_lines))
desire_carshort = dplyr::filter(desire_lines, car_driver > 300 & distance < 5000)
route_carshort = line2route(desire_carshort, route_fun = route_osrm)
desire_carshort$geom_car = st_geometry(route_carshort)
tmap_mode("view")
tm_basemap("OpenStreetMap.Mapnik") + tm_shape(desire_carshort$geometry) + tm_lines(col = "blue", lwd = 2)
tm_basemap("OpenStreetMap.Mapnik") + tm_shape(desire_carshort$geom_car) + tm_lines(col = "red", lwd = 2)

12.6 Nodes

Two types of transport nodes (points)
- Nodes not directly on the network such as zone centroids or individual origins and destinations such as houses and workplaces.
- Nodes that are a part of transport networks, representing individual pathways, intersections between pathways (junctions) and points for entering or exiting a transport network such as bus stops and train stations.

Public transport stops are particularly important nodes that can be represented as either type of node:
- A bus stop that is part of a road, or
- A large rail station that is represented by its pedestrian entry point hundreds of meters from railway tracks.

A common barrier preventing people from switching away from cars for commuting to work is that the distance from home to work is too far to walk or cycle. Public transport can reduce this barrier by providing a fast and high-volume option for common routes into cities. From an active travel perspective, public transport ‘legs’ of longer journeys divide trips into three:
- The origin leg, typically from residential areas to public transport stations.
- The public transport leg, which typically goes from the station nearest a trip’s origin to the station nearest its destination.
- The destination leg, from the station of alighting to the destination.

Having said that, let’s switch our gears and identify those od pairs with most public transport travel: i.e., the top three desire lines in terms of rails use.

desire_rail = top_n(desire_lines, n = 3, wt = train)

Now, let’s ‘break-up’ each of these lines into three pieces (origin leg, public transport leg, and destination legs)
- In fact, this task consists of three stages
- Create a matrix (of origins, destinations and the ‘via’ points representing rail stations here)
- Identify nearest neighbors (i.e., nearest rail stations) for the origins and destinations, and
- Convert a sinlge desire line to multilines
- Fortunately, all these manipulations are completed in one easy step by stplanr::line_via()
- For more information, visit Desire Lines Extended vignette.

?line_via

Compare the number of columns before and after you run line_via().

ncol(desire_rail)
desire_rail = line_via(desire_rail, bristol_stations)
ncol(desire_rail)

The initial desire_rail lines now have three additional geometry list columns:
- They represent travel from home to the origin station, from there to the destination, and finally from the destination station to the destination.
- The destination leg is very short (walking distance) but the origin legs may be sufficiently far to justify investment in cycling infrastructure to encourage people to cycle to the stations on the outward leg of peoples’ journey to work in the residential areas surrounding the three origin stations.

12.7 Route networks

For this subsection, We use bristol_ways, a preprocessed dataset via osmdata
- bristol_ways contains point and line data for the case study area
- On this data set, local routing is possible with stplanr::sum_network_routes()

summary(bristol_ways)

It’s useful to convert route networks to mathematical graphs, but manual conversion may drop the geographic attributes of route networks (e.g., xy coordinates). Thus, we use stplanr::SpatialLineNetwork() to create an object that is both route networks and such graphs (i.e., class igraph).

ways_freeway = bristol_ways %>% filter(maxspeed == "70 mph") 

tmap_mode("view")
tm_basemap("Stamen.TonerLite") + 
  tm_shape(bristol_ttwa) +
  tm_borders(col = "red", lwd = .5)  + 
  tm_shape(ways_freeway) + 
  tm_lines(col = "blue", lwd = 1)
ways_sln = stplanr::SpatialLinesNetwork(ways_freeway)
slotNames(ways_sln)
weightfield(ways_sln)
class(ways_sln@g)

ways_sln is an sfNetwork object. You can access individual components by the @ operator.
- the spatial component of the network (named sl)
- the graph component (g)
- the weightfield, the edge variable used for shortest path calculation (by default segment distance)

# before plotting, compute ‘edge betweenness’, the number of shortest paths passing through each edge 
e = igraph::edge_betweenness(ways_sln@g) 
# plot the geometry with the line length defined by the edge betweenness 
plot(ways_sln@sl$geometry, lwd = e / 500)

12.8 Prioritizing new infrastructure

Now, in this last subsection, we need a data set that we prepared in 12.6 Nodes, desire_rail. Note that it contains three additional geometry columns, leg_orig, leg_via, and leg_dest, all of these make a single desire line (i.e., commuting flow) from a home zone to a work zone.

Here are our work flow.
- Set leg_org as the geometry column of desire_reail.
- Compute the travel distance and duration for the leg_org, or the trip from a home zone centroid to the nearest rail station.
- Make sure the crs is 4326.

route_rail = desire_rail %>% 
  st_set_geometry("leg_orig") %>% 
  line2route(route_fun = route_osrm) %>% 
  st_set_crs(4326)
route_cycleway = rbind(route_rail, route_carshort)
route_cycleway$all = c(desire_rail$all, desire_carshort$all) # bring in # of commuters for individual routes  
tmap_mode("plot")
bristol_stations_top = bristol_stations[desire_rail, , op = st_is_within_distance, dist = 500]
tm_shape(bristol_ttwa) +
  tm_borders(col = "darkblue") +
  tm_shape(bristol_ways) +
  tm_lines(col = "highway", lwd = 3, palette = c("lightgreen", "grey", "pink")) +
  tm_scale_bar() +
  tm_shape(route_cycleway) +
  tm_lines(col = "blue", lwd = "all", scale = 20, alpha = 0.6) +
  tm_shape(bristol_stations_top) +
  tm_dots(size = 0.3, col = "red") +
  tm_layout(legend.position = c("LEFT", "TOP"))