The PCT is not only a web tool: it is also a research and open data project that has resulted in many megabytes of valuable data (Lovelace et al. 2017). In this training session we hope you will learn how to download and use these open datasets. This may be useful to anyone interested in data-driven planning for sustainable and active travel futures.
This guide supports workshops on advanced usage and development of the Propensity to Cycle Tool (PCT).
Beginner and intermediate PCT events focus on using the PCT via the web application hosted at www.pct.bike and the data provided by the PCT in QGIS.
The focus here is on analysing cycling potential in the open source statistical programming language R. We use R because the PCT was developed in, and can be extended with, R code. Using open source software with a command-line interface reduces barriers to entry, enabling the development of open access transport models for more citizen-led and participatory transport planning, including integration with the A/B Street city simulation and editing software (Lovelace 2021).
To view a video of our previous advanced training workshop at the Cycle Active City 2021 Conference, see https://www.youtube.com/watch?v=OiLzjrBMQmU.
To see the ‘marked up’ contents of the vignette (with results evaluated) see here.
If you are new to R, you should install R and RStudio before the course. For instructions on that, see the download links at cran.r-project.org and RStudio.com.
R is a powerful statistical programming language for data science and a wide range of other applications and, like any language, takes time to learn. To get started we recommend the following free resources:
If you want to calculate cycle routes from within R, we recommend signing up for a CycleStreets API key. See here to apply and see here for instructions on creating an ‘environment variable’ (recommended for experienced R users only).
If you are not familiar with the PCT, it is also worth reading about it before the course starts.
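As a minimal sketch of the environment-variable approach, the base R functions Sys.setenv() and Sys.getenv() set and read a variable for the current session. The variable name CYCLESTREETS is, as far as we know, the one the cyclestreets package looks for, but treat that as an assumption and check the package documentation. The key value below is a placeholder, not a real key.

```r
# Set the key for the current R session only (placeholder value, not a real key)
Sys.setenv(CYCLESTREETS = "your-api-key-here")

# Read it back; Sys.getenv() returns "" if the variable is not set
key = Sys.getenv("CYCLESTREETS")
key_is_set = nzchar(key)
key_is_set
```

To keep the key across sessions, the same name-value pair can be stored in your .Renviron file instead of being set in a script.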
In addition to computer hardware (a laptop) and software (an up-to-date R set-up and experience using R) pre-requisites, you should have read, or at least have working knowledge of the contents of, the following publications, all of which are freely available online:
To ensure your computer is ready for the course, you should be able to run the following lines of R code on your computer:
install.packages("remotes")
pkgs = c(
  "cyclestreets",
  "mapview",
  "pct",
  "sf",
  "stats19",
  "stplanr",
  "tidyverse",
  "devtools"
)
remotes::install_cran(pkgs)
# remotes::install_github("ITSLeeds/pct")
To test your computer is ready to work with PCT data in R, you can also try running the code hosted at https://raw.githubusercontent.com/ITSLeeds/pct/master/inst/test-setup.R to check everything is working:
source("https://github.com/ITSLeeds/pct/raw/master/inst/test-setup.R")
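If you would rather check your set-up without running remote code, the following sketch lists any required packages that are not yet installed. The helper name missing_pkgs() is hypothetical (it is not part of the pct package) and uses only base R.

```r
# Hypothetical helper: return the names of packages that are not installed
missing_pkgs = function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

pkgs = c("cyclestreets", "mapview", "pct", "sf", "stats19", "stplanr", "tidyverse")
missing_pkgs(pkgs) # character(0) means every package in pkgs is installed
```

Any names returned can be passed straight to install.packages().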
If you have any questions before the workshop, feel free to ask a question on the package’s issue tracker (requires a GitHub login): https://github.com/itsleeds/pct/issues
Preliminary timings:
The guide covers:
In this section you will learn about the open datasets provided by the PCT project and how to use them. While the most common use of the PCT is via the interactive web application hosted at www.pct.bike, there is much value in downloading the data, e.g. to identify existing cycling infrastructure in close proximity to routes with high potential, and to help identify roads in need of interventions from a safety perspective, using data from the constantly evolving and community-driven global geographic database OpenStreetMap (OSM) (Barrington-Leigh and Millard-Ball 2017).
In this session, which assumes you have experience using R, you will learn how to:
In this example we will use data from North Yorkshire, a mixed region containing urban areas such as York and many rural areas. You can use the PCT, which works at the regional level, for North Yorkshire or any other region by clicking on the area you’re interested in on the main map at https://www.pct.bike. If you know the URL of the region you’re interested in, you can navigate straight there, in this case by typing in or clicking on the link https://www.pct.bike/m/?r=north-yorkshire.
From there you will see a map showing the region. Before you download and use PCT data, it is worth exploring it on the PCT web app.
Exercise: explore the current level and distribution of cycling:
You can use the little-known ‘Freeze Lines’ functionality in the PCT’s web app to identify the zone origin and destinations of trips that would use improvements in a particular place. You can do this by selecting the Fast Routes option from the Cycling Flows menu, zooming into the area of interest, and then clicking on the Freeze Lines checkbox to prevent the selected routes from moving when you zoom back out.
Figure 4.1: Areas that may benefit from improved cycle provision on Clifton Bridge, according to the PCT.
On the PCT web app, click on the Region data tab, shown at the top of Figure 4.1, just beneath the ‘north’ in the URL. You should see a web page like that shown in Figure 4.2, which highlights the Region data tab alongside the Map, Region stats, National Data, Manual, and About page links.
Figure 4.2: The Region data tab in the PCT.
Data downloaded in this way can be imported into GIS software such as QGIS, for analysis and visualisation.
However, the PCT was built in R so the best way to understand and modify the results is using R, or a similar language for data analysis.
The subsequent sections demonstrate using R to access, analyse, visualise and model datasets provided by the pct package.
We will get the same PCT datasets as in previous sections but using the R interface.
If you have not already done so, you will need to install the R packages we will use for this section (and the next) by typing and executing the following command in the R console: install.packages(c("pct", "sf", "dplyr", "tmap")).
library(pct)
library(sf) # key package for working with spatial vector data
library(tidyverse) # loads dplyr, ggplot2, tidyr and readr, all used below
library(tmap) # installed alongside mapview
tmap_options(check.and.fix = TRUE) # tmap setting
The pct package has been developed specifically for use with PCT data.
To learn more about this package, see https://itsleeds.github.io/pct/.
region_name = "north-yorkshire"
zones_all = get_pct_zones(region_name)
lines_all = get_pct_lines(region_name)
# note: the next command may take a few seconds
routes_all = get_pct_routes_fast(region_name)
rnet_all = get_pct_rnet(region_name)
plot(zones_all$geometry)
plot(lines_all$geometry, col = "blue", add = TRUE)
plot(routes_all$geometry, col = "green", add = TRUE)
plot(rnet_all$geometry, col = "red", lwd = sqrt(rnet_all$bicycle), add = TRUE)
The PCT provides a school route network layer that can be especially important when planning cycling interventions in residential areas (Goodman et al. 2019). Due to the sensitive nature of school data, we cannot make route-level or OD-level data available. However, the PCT provides travel to school data at zone and route network levels, as shown in Figure 4.3. (Note: to get this data from the PCT website you must select School travel in the Trip purpose menu before clicking on Region data.)
zones_school = get_pct_zones(region = region_name, purpose = "school")
rnet_school = get_pct_rnet(region = region_name, purpose = "school")
As we will see in Section 6, combining school and commute network data can result in a more comprehensive network.
Figure 4.3: Open access data on cycling to school potential from the PCT, at zone (left) and route network (right) levels. These datasets can support planning interventions, especially ‘safe routes to school’ and interventions in residential areas. To see the source code that generates these plots, see the ‘source’ link at the top of the page.
Exercise: Explore the datasets you have downloaded. Use functions such as plot() or qtm() to visualise these datasets, and try out different colour schemes
This section is designed for people with experience with the PCT and cycling uptake estimates who want to learn more about how uptake models work and how to generate new scenarios of change. Reproducible and open R code will be used to demonstrate the concepts so knowledge of R or other programming languages is recommended but not essential, as there will be conceptual exercises covering the factors linked to mode shift. In it you will:
One of the benefits of the PCT is its ability to generate scenarios that model where people might cycle in future. Several cycling uptake scenarios are included on the PCT website. We also have R functions for these scenarios. For example, the PCT’s ‘Government Target’ scenario allows us to calculate the cycling uptake that would be required to correspond to a scenario in which we meet the government’s aim to double cycling levels by 2025, using a 2013 baseline.
The following code chunk uses the R function uptake_pct_govtarget_2020() (from the pct package) to recreate this ‘Government Target’ scenario.
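# Baseline cycling mode share, in percent
lines_all$pcycle = lines_all$bicycle / lines_all$all * 100
# Straight-line (Euclidean) length of each desire line
lines_all$euclidean_distance = as.numeric(sf::st_length(lines_all))
# Scenario share = baseline share + modelled uptake, both expressed in percent
lines_all$pcycle_govtarget = uptake_pct_govtarget_2020(
  distance = lines_all$rf_dist_km,
  gradient = lines_all$rf_avslope_perc
) * 100 + lines_all$pcycle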
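To give a feel for what an uptake function does, here is a minimal sketch of a logistic model in which the modelled uptake falls as distance and gradient increase. The function name uptake_sketch() and all of its coefficients are invented for illustration; they are not the functional form or the coefficients used by the pct package. The toy data frame mimics the column names of the PCT desire lines.

```r
# Hypothetical logistic uptake model: NOT the coefficients used by the pct package
uptake_sketch = function(distance_km, gradient_perc,
                         intercept = -1.5, b_distance = -0.25, b_gradient = -0.3) {
  logit = intercept + b_distance * distance_km + b_gradient * gradient_perc
  plogis(logit) # inverse logit: maps any real number to a proportion in (0, 1)
}

# Toy desire lines: short and flat, medium, long and hilly
toy = data.frame(
  all = c(100, 200, 50),
  bicycle = c(5, 4, 0),
  rf_dist_km = c(1, 5, 15),
  rf_avslope_perc = c(0.5, 2, 4)
)
toy$pcycle = toy$bicycle / toy$all * 100 # baseline share, in percent
toy$pcycle_scenario = toy$pcycle +
  uptake_sketch(toy$rf_dist_km, toy$rf_avslope_perc) * 100 # both terms in percent
toy[c("rf_dist_km", "pcycle", "pcycle_scenario")]
```

The key design point is that the baseline and the modelled increase must be in the same units (here, percent) before they are added.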
Exercise: Generate a ‘Go Dutch’ scenario for North Yorkshire using the function uptake_pct_godutch(): (Hint: the process is very similar to that used to generate the ‘Government Target’ scenario)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.505 6.881 20.750 22.367 36.265 56.052
Figure: Percent cycling currently (left) and under a ‘Go Dutch’ scenario (right) in North Yorkshire.
Exercise: how could the function uptake_pct_godutch() be modified?
Let’s develop a simple model representing the government’s aim that “half of all journeys in towns and cities will be cycled or walked” by 2030. We will assume that this means that half of all journeys made in urban areas, as defined by the Office for National Statistics, will be made by these active modes. We only have commute data in the data we downloaded, so we treat commuting as a proxy for mode share overall.
The first stage is to identify urban areas in North Yorkshire. We use data from the House of Commons Research Briefing on City and Town Classifications to define areas based on their town/city status. The code chunk below shows the benefits of R in terms of being able to get and join data onto the route data we have been using:
# Get data on the urban_rural status of LSOA zones
urban_rural = readr::read_csv("https://researchbriefings.files.parliament.uk/documents/CBP-8322/oa-classification-csv.csv")
ggplot(urban_rural) +
  geom_bar(aes(citytownclassification)) +
  coord_flip()
# Join this with the PCT commute data that we previously downloaded
urban_rural = rename(urban_rural, geo_code = lsoa_code)
zones_all_joined = left_join(zones_all, urban_rural)
routes_all_joined = left_join(routes_all, urban_rural, by = c("geo_code1" = "geo_code"))
tm_shape(zones_all_joined) +
  tm_polygons("citytownclassification")
Figure 5.1: Classification of areas in Great Britain (left) and North Yorkshire (right).
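When joining a lookup table onto zones or routes, it is worth checking that the join neither duplicates nor silently drops rows. The sketch below uses base R and invented toy values. Whether the real classification file contains more than one row per geo_code is an assumption we have not verified here; the check is cheap either way.

```r
# Toy zone table and classification lookup (values invented for illustration)
zones = data.frame(geo_code = c("E01", "E02", "E03"), all = c(120, 80, 60))
lookup = data.frame(
  geo_code = c("E01", "E01", "E02"), # E01 appears twice, E03 is missing
  citytownclassification = c("Large Town", "Large Town", "Village or Smaller")
)

# Keep one row per key so the join cannot duplicate zones
lookup_unique = lookup[!duplicated(lookup$geo_code), ]

# Left join: every zone is kept, unmatched zones get NA
joined = merge(zones, lookup_unique, by = "geo_code", all.x = TRUE)
nrow(joined) == nrow(zones)                # TRUE if no zones were duplicated
sum(is.na(joined$citytownclassification)) # number of unmatched zones
```

The same two checks (row count and number of missing values) apply unchanged to the output of left_join().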
After the classification dataset has been joined, the proportion of trips made by walking and cycling in towns and cities across North Yorkshire can be calculated as follows.
# Select only routes whose origin zone has a `citytownclassification` containing the word "Town" or "City"
routes_towns = routes_all_joined %>%
  filter(grepl(pattern = "Town|City", x = citytownclassification))
round(sum(routes_towns$foot + routes_towns$bicycle) / sum(routes_towns$all) * 100)
## [1] 34
Currently, only around 34% of commute trips in the region’s ‘town’ areas are made by walking and cycling (27% across all zones in North Yorkshire, and a much lower proportion in terms of distance). We explore this in more detail by looking at the relationship between trip distance and mode share for existing commuter journeys, as shown in Figure 5.2 (a).
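The mode share calculation can be checked by hand on a small example. The sketch below uses an invented four-row table with the same column names as the route data; the helper active_share() is hypothetical and simply wraps the expression used above.

```r
# Toy route table (values invented for illustration)
routes = data.frame(
  citytownclassification = c("Large Town", "Other City", "Village or Smaller", "Small Town"),
  all = c(100, 200, 150, 50),
  foot = c(20, 50, 5, 10),
  bicycle = c(5, 20, 2, 3)
)

# TRUE for rows whose classification contains "Town" or "City"
in_town = grepl("Town|City", routes$citytownclassification)

# Percentage of all trips made on foot or by bicycle
active_share = function(d) round(sum(d$foot + d$bicycle) / sum(d$all) * 100)

active_share(routes[in_town, ]) # towns and cities only
active_share(routes)            # all areas
```

Note that the share is a ratio of sums, not a mean of per-route percentages, so routes with many trips carry more weight.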
We will create a scenario representing the outcome of policies that incentivise people to replace car trips with walking and cycling. This focuses on the red boxes in Figure 5.2. In this scenario, we replace 50% of car trips of less than 1 km with walking, and replace 10% of car trips of 1-2 km length with walking. Many of the remaining car trips will be replaced by cycling, with the percentages of trips that switch for each OD determined by the uptake function in the Go Dutch Scenario of the PCT. The results of this scenario are shown in Figure 5.2 (b).
# Reduce the number of transport mode categories
routes_towns_recode = routes_towns %>%
  mutate(
    public_transport = train_tube + bus,
    car = car_driver + car_passenger,
    other = taxi_other + motorbike
  ) %>%
  dplyr::select(-car_driver, -car_passenger, -train_tube, -bus)
# Set distance bands to use in the bar charts
routes_towns_recode$dist_bands = cut(x = routes_towns_recode$rf_dist_km, breaks = c(0, 1, 3, 6, 10, 15, 20, 30, 1000), include.lowest = TRUE)
# Set the colours to use in the bar charts
col_modes = c("#fe5f55", "grey", "#ffd166", "#90be6d", "#457b9d")
# Plot bar chart showing modal share by distance band for existing journeys
base_results = routes_towns_recode %>%
  sf::st_drop_geometry() %>%
  dplyr::select(dist_bands, car, other, public_transport, bicycle, foot) %>%
  tidyr::pivot_longer(cols = matches("car|other|publ|cy|foot"), names_to = "mode") %>%
  mutate(mode = factor(mode, levels = c("car", "other", "public_transport", "bicycle", "foot"), ordered = TRUE)) %>%
  group_by(dist_bands, mode) %>%
  summarise(Trips = sum(value))
g1 = ggplot(base_results) +
  geom_col(aes(dist_bands, Trips, fill = mode)) +
  scale_fill_manual(values = col_modes) + ylab("Trips")
g1
# Create the new scenario:
# First we replace some car journeys with walking, then replace some of the
# remaining car journeys with cycling
routes_towns_recode_go_active = routes_towns_recode %>%
  mutate(
    foot_increase_proportion = case_when(
      # specifies that 50% of car journeys <1km in length will be replaced with walking
      rf_dist_km < 1 ~ 0.5,
      # specifies that 10% of car journeys 1-2km in length will be replaced with walking
      rf_dist_km >= 1 & rf_dist_km < 2 ~ 0.1,
      TRUE ~ 0
    ),
    # Specify the Go Dutch scenario we will use to replace remaining car trips with cycling
    bicycle_increase_proportion = uptake_pct_godutch_2020(distance = rf_dist_km, gradient = rf_avslope_perc),
    # Make the changes specified above
    car_reduction = car * foot_increase_proportion,
    car = car - car_reduction,
    foot = foot + car_reduction,
    car_reduction = car * bicycle_increase_proportion,
    car = car - car_reduction,
    bicycle = bicycle + car_reduction
  )
# Plot bar chart showing how modal share has changed in our new scenario
active_results = routes_towns_recode_go_active %>%
  sf::st_drop_geometry() %>%
  dplyr::select(dist_bands, car, other, public_transport, bicycle, foot) %>%
  tidyr::pivot_longer(cols = matches("car|other|publ|cy|foot"), names_to = "mode") %>%
  mutate(mode = factor(mode, levels = c("car", "other", "public_transport", "bicycle", "foot"), ordered = TRUE)) %>%
  group_by(dist_bands, mode) %>%
  summarise(Trips = sum(value))
g2 = ggplot(active_results) +
  geom_col(aes(dist_bands, Trips, fill = mode)) +
  scale_fill_manual(values = col_modes) + ylab("Trips")
g2
Figure 5.2: Relationship between distance (x axis) and mode share (y axis) in towns and cities in North Yorkshire. (a) left: existing mode shares; (b) right: mode shares under high active travel uptake scenario.
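The two-step replacement used in this scenario can be verified by hand for a single OD pair. In the sketch below, the helper shift_trips() is hypothetical and the cycling uptake of 0.2 is an invented value rather than the output of the Go Dutch function. It shows that the cycling proportion is applied to the car trips remaining after the walking shift, and that total trips are conserved.

```r
# Shift car trips to walking, then shift a share of the remaining car trips to cycling
shift_trips = function(car, foot, bicycle, foot_prop, bicycle_prop) {
  to_foot = car * foot_prop
  car = car - to_foot
  to_bicycle = car * bicycle_prop # applied to the car trips left after step 1
  car = car - to_bicycle
  c(car = car, foot = foot + to_foot, bicycle = bicycle + to_bicycle)
}

# One OD pair under 1 km: 50% of car trips switch to walking;
# the cycling uptake of 0.2 is a hypothetical value
before = c(car = 100, foot = 30, bicycle = 10)
after = shift_trips(100, 30, 10, foot_prop = 0.5, bicycle_prop = 0.2)
after
isTRUE(all.equal(sum(after), sum(before))) # total trips are conserved
```

Because the steps are sequential, the order matters: applying the cycling shift first would move more trips to cycling and fewer to walking.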
Exercise: Instead of a scenario in which all types of car journey (i.e. both car drivers and car passengers) are replaced by walking or cycling, can you create a scenario in which solely journeys by car drivers are replaced by walking or cycling? The scenario we just created applies only to urban areas - can you adapt it so that the same changes in walking and cycling uptake are applied across the whole of North Yorkshire, including both urban and rural areas?
The scenario outlined above may sound ambitious, but it only just meets the government’s aim for walking and cycling to account for 50% of trips in towns and cities, at least when looking exclusively at single-stage commutes in a single region. Furthermore, while the scenario represents a ~200% (3-fold) increase in the total distance travelled by active modes, it only results in a 17% reduction in car km driven in towns. The overall impact on energy use, resource consumption and emissions is much lower for the region overall, including rural areas.
In the context of the government’s aim of fully decarbonising the economy by 2050, the analysis above suggests that more stringent measures focussing on long distance trips, which account for the majority of emissions, may be needed. However, it is still useful to see where there is greatest potential for car trips to be replaced by walking and cycling, as shown in Figure 5.3.
Figure 5.3: Illustration of route network based on car trips that could be replaced by bicycle trips, based on Census data on car trips to work and the Go Dutch uptake function used in the PCT.
The PCT is not limited to commuter data: it also provides school data for each region in England and Wales, which can be downloaded with relative ease.
In the example below, we set the purpose argument of the get_pct_rnet() function to "school".
This allows us to get estimates of cycling potential on the road network for school trips, commuter trips, and school and commuter trips combined.
Note in the figure below that the combined route network provides a more comprehensive (yet still incomplete) overview of cycling potential in the study region.
# get pct rnet data for schools
rnet_school = get_pct_rnet(region = region_name, purpose = "school")
rnet_school = subset(rnet_school, select = -c(`cambridge_slc`)) # subset columns for bind
rnet_all = subset(rnet_all, select = -c(`ebike_slc`,`gendereq_slc`,`govnearmkt_slc`)) # subset columns for bind
rnet_school_commute = rbind(rnet_all,rnet_school) # bind commute and schools rnet data
rnet_school_commute$duplicated_geometries = duplicated(rnet_school_commute$geometry) # find duplicated geometries
rnet_school_commute$geometry_txt = sf::st_as_text(rnet_school_commute$geometry)
rnet_combined = rnet_school_commute %>%
  group_by(geometry_txt) %>% # group segments that share the same geometry
  summarise(across(bicycle:dutch_slc, ~ sum(.x, na.rm = TRUE))) # sum flows across the commute and school networks
Figure 6.1: Comparison of commute, school, and combined commute and school route networks, under the Go Dutch scenario.
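The logic of combining two route networks can be seen on a toy example in which each segment's geometry is represented by its text form. The data values below are invented, and the sketch uses base R's aggregate() rather than the dplyr pipeline above, so it runs without any spatial packages.

```r
# Toy commute and school networks; geometry is represented by its text form
rnet_commute = data.frame(
  geometry_txt = c("LINESTRING (0 0, 1 1)", "LINESTRING (1 1, 2 2)"),
  bicycle = c(10, 5),
  dutch_slc = c(40, 20)
)
rnet_sch = data.frame(
  geometry_txt = c("LINESTRING (1 1, 2 2)", "LINESTRING (2 2, 3 3)"),
  bicycle = c(3, NA),
  dutch_slc = c(12, 8)
)
both = rbind(rnet_commute, rnet_sch)

# Sum each column over segments that share a geometry, ignoring missing values
combined = aggregate(
  both[c("bicycle", "dutch_slc")],
  by = list(geometry_txt = both$geometry_txt),
  FUN = sum, na.rm = TRUE
)
combined
```

The segment present in both networks ends up with the sum of its commute and school flows, while segments present in only one network keep their original values (with missing values counted as zero).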
These links may be useful when working through the exercises: