The propensity to cycle to rail stations

14 July 2016. Slides made with RStudio. Reproducible code: https://github.com/npct/pct-rail-test

Context: data analysis software

Microsoft announced its acquisition of Revolution Analytics in January 2015
Opportunities for development of open source technologies for sustainable planning

The Propensity to Cycle Tool

Motivation

Lack of strategic cycle networks

What is the PCT?

An open source, publicly accessible web-based tool for assessing the geographical distribution of cycling potential, and its likely benefits, nationwide
Best illustrated in a demo: http://pct.bike/

What kind of questions can it help answer?

Where should we build for existing cyclists?
New cyclists in the medium term?
Long-term strategy?
Along which routes to create the strategic joined-up network?
Where should cycling interventions be prioritised around public transport infrastructure?

Phase I (Feb - July 2015)

Build and test a prototype model
Identify 'desire lines' of greatest potential
Make the tool scalable nationally
Create a website that will make the Propensity to Cycle Tool a publicly accessible resource

Phase II (January 2016 - March 2017)

Version 1 - nationwide (V1 launch: June 2016)

Route-allocated hilliness, network layer (complete)
Include health outcomes (HEAT)
National-level results (Anna Goodman)
V1.5 - smaller (LSOA) zones (Jan 2017)
Training

Version 2 - local deployment

Include non-commute trips
Compatibility with Local Authority trip demand models
Micro-level analysis (V 2.2)

Additional work/spin outs

Case studies of use (e.g. Manchester, Kent, Yorkshire)
Method for identifying severance
Case study along HS2 route
'Hackathons' to stimulate the tool's development

Data sources for cycling potential to stations

Warning: We'll be using R code

Using the following packages

pkgs = c(
  "stplanr", # transport data handling
  "sp", # package for spatial data
  "tmap", # package for mapping
  "deldir", # voronoi polygons
  "readr", # fast read/write
  "rgeos", # gis functions
  "dplyr" # data analysis
)
# Attach every package; invisible() suppresses the printed list of loaded packages
invisible(lapply(pkgs, library, character.only = TRUE))

Where are they?

Source: http://www.projectmapping.co.uk/

Downloading official data on stations

Source: data.gov.uk

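# Download and unzip the NaPTAN snapshot into a temporary directory
u = "http://www.dft.gov.uk/NaPTAN/snapshot/NaPTANcsv.zip"
zf = file.path(tempdir(), "NaPTANcsv.zip")
download.file(url = u, destfile = zf, mode = "wb") # binary mode keeps the zip intact on Windows
unzip(zf, exdir = tempdir())
stations = read_csv(file = file.path(tempdir(), "stops.csv"))
# Keep rail station entrances only (StopType "RSE")
rail = filter(stations, StopType == "RSE")
saveRDS(rail, "data/rail.Rds") # assumes a data/ folder already exists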

The NaPTAN data

What does it mean? Details: gov.uk

# A tibble: 6 x 43
      ATCOCode NaptanCode PlateCode CleardownCode                CommonName CommonNameLang ShortCommonName ShortCommonNameLang Landmark
         <chr>      <chr>     <chr>         <chr>                     <chr>          <chr>           <chr>               <chr>    <chr>
1 0100BRP90087    bstgtaw                                  Brunswick Street             en                                             
2 0100BRP90088    bstgtjp                                   Brigstocke Road             en                                             
3 0100BRP90089    bstgtpj                                   Brigstocke Road             en                                             
4 0100BRP90090    bstgwgm                                    Denbigh Street             en                                             
5 0100BRP90091    bstgwta                                    Denbigh Street             en                                             
6 0100CLFDOWN0                                    Clifton Down Rail Station             en                                             
# ... with 34 more variables: LandmarkLang <chr>, Street <chr>, StreetLang <chr>, Crossing <chr>, CrossingLang <chr>, Indicator <chr>,
#   IndicatorLang <chr>, Bearing <chr>, NptgLocalityCode <chr>, LocalityName <chr>, ParentLocalityName <chr>,
#   GrandParentLocalityName <chr>, Town <chr>, TownLang <chr>, Suburb <chr>, SuburbLang <chr>, LocalityCentre <int>, GridType <chr>,
#   Easting <int>, Northing <int>, Longitude <dbl>, Latitude <dbl>, StopType <chr>, BusStopType <chr>, TimingStatus <chr>,
#   DefaultWaitTime <chr>, Notes <chr>, NotesLang <chr>, AdministrativeAreaCode <chr>, CreationDateTime <time>,
#   ModificationDateTime <time>, RevisionNumber <int>, Modification <chr>, Status <chr>

The geographical distribution of stations

rail = readRDS("data/rail.Rds")
coordinates(rail) = ~Longitude+Latitude
plot(rail)

Usage stats

Source: Office of Rail and Road

For more info: see this infographic on usage stats.

Case study data: Cambridge

Data from the PCT: pct.bike/cambridgeshire/

z = readRDS("../pct-data/cambridgeshire/z.Rds")
qtm(z) +
  qtm(rail, bubble.size = 0.3)

Methodology

Splitting a single line into three
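
The slide title does not spell out the three parts. One plausible reading, assumed here, is that a single home-to-work desire line becomes three legs: home to the origin station, station to station by rail, and destination station to work. The sketch below illustrates that reading in base R with hypothetical toy coordinates in metres; the helper names `nearest_station` and `split_in_three` are made up for illustration and are not part of stplanr or the talk's code.

```r
# Hypothetical toy data in projected coordinates (metres); not from the talk
stations = data.frame(
  name = c("A", "B", "C"),
  x = c(0, 5000, 10000),
  y = c(0, 0, 0),
  stringsAsFactors = FALSE
)
home = c(x = 500, y = 1200)
work = c(x = 9500, y = -800)

# Row index of the station closest (in a straight line) to point p
nearest_station = function(p, s) {
  which.min(sqrt((s$x - p["x"])^2 + (s$y - p["y"])^2))
}

# Split one home-to-work line into three legs via the nearest stations
split_in_three = function(home, work, s) {
  i = nearest_station(home, s)
  j = nearest_station(work, s)
  o_stn = c(x = s$x[i], y = s$y[i])
  d_stn = c(x = s$x[j], y = s$y[j])
  from = rbind(home, o_stn, d_stn)
  to = rbind(o_stn, d_stn, work)
  data.frame(
    leg = c("home to station", "station to station", "station to work"),
    from_x = from[, "x"], from_y = from[, "y"],
    to_x = to[, "x"], to_y = to[, "y"],
    length_m = sqrt(rowSums((to - from)^2)),
    row.names = NULL
  )
}

legs = split_in_three(home, work, stations)
print(legs)
```

If home and work share the same nearest station, the middle leg has zero length, which is a simple signal that rail is unlikely to be relevant for that trip.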

Identifying shortest paths

Various ways of doing this
Voronoi polygons are the most visual:

rail_cam = readRDS("data/rail_cam.Rds")
voronoi = deldir(rail_cam$Easting, rail_cam$Northing)
plot(voronoi)

Find closest stations
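
The slides do not show how the closest station was found for each zone. A minimal sketch, assuming straight-line distance on projected coordinates, is to build a centroid-by-station distance matrix and take the minimum of each row. All coordinates below are hypothetical; the result is called `mat` only to mirror the index vector used when the desire lines are built.

```r
# Hypothetical toy coordinates in metres (e.g. eastings and northings)
cents_xy = cbind(x = c(100, 4200, 9000, 5100), y = c(200, 300, -500, 2500))
stations_xy = cbind(x = c(0, 5000, 10000), y = c(0, 0, 0))

# Distance matrix: one row per centroid, one column per station
dx = outer(cents_xy[, "x"], stations_xy[, "x"], "-")
dy = outer(cents_xy[, "y"], stations_xy[, "y"], "-")
d = sqrt(dx^2 + dy^2)

# For each centroid, the column (station) with the smallest distance
mat = apply(d, 1, which.min)

# Distance from each centroid to its matched station
dist_m = d[cbind(seq_len(nrow(d)), mat)]
print(data.frame(centroid = seq_along(mat), station = mat, dist_m = dist_m))
```

This gives the same assignment as asking which Voronoi polygon each centroid falls in, since a Voronoi cell is by definition the area closer to its station than to any other.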

Create travel-to-station 'desire lines'

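# Build one straight 'desire line' from each zone centroid to its matched station
for(i in seq_len(nrow(cents))){
  o = cents[i,]          # origin: zone centroid
  d = rail_cam[mat[i],]  # destination: mat[i] indexes the station matched to centroid i
  od = sbind(SpatialPoints(o), SpatialPoints(d))
  l = SpatialLines(list(Lines(list(Line(coordinates(od))), "X")))
  if(i == 1) L = l else L = sbind(L, l)
}
# Attach the zone-level rail data to the lines, relying on row order
ld = SpatialLinesDataFrame(L, cents@data["rail"], match.ID = FALSE)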
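
As an alternative sketch, the same kind of object can be built without growing `L` inside a loop, by creating every `Lines` object in one `lapply()` call and giving each a unique ID. The coordinates and the `zone` column below are hypothetical stand-ins for `cents` and the matched rows of `rail_cam`; only functions from sp are used.

```r
library(sp)

# Hypothetical toy coordinates (metres): origins and their matched stations
o_xy = cbind(x = c(100, 4200, 9000), y = c(200, 300, -500))
d_xy = cbind(x = c(0, 5000, 10000), y = c(0, 0, 0))

# One Lines object per origin-destination pair, each with a unique ID
lines_list = lapply(seq_len(nrow(o_xy)), function(i) {
  Lines(list(Line(rbind(o_xy[i, ], d_xy[i, ]))), ID = as.character(i))
})
L = SpatialLines(lines_list)

# Attach attributes; match.ID = FALSE relies on row order, not row names
ld = SpatialLinesDataFrame(
  L,
  data.frame(
    zone = c("z1", "z2", "z3"),
    length_m = SpatialLinesLengths(L, longlat = FALSE)
  ),
  match.ID = FALSE
)
print(ld@data)
```

Unique IDs avoid relying on the binding step to rename duplicate "X" IDs, and the straight-line length gives a first filter for trips that are too long to cycle.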

The results

Concept: from a hackathon

Using @robinlovelace tool (stplanr, modified) to review @cyclestreets journeys from their API, with @qgis. pic.twitter.com/t17LebZFJu
— Matt Turner (@MattTurnerSheff) January 31, 2016

Further work

Identify the problem we're trying to solve

Building cycle paths to stations (layer in the PCT)
Identify where there's a shortfall of cycle parking?
Identify target areas in which to incentivise cycling?
Provision of cycles at 'arrival station'

Work that needs to be done:

Find if rail is a viable option for non-rail trips
- Trains may already be at capacity (e.g. Ely)
- Many destinations are not close to rail stations
Find if nearest station is suitable per rail commuter
- People may not go to the nearest station
- TransportAPI or Google's Distance Matrix API can help
Explore travel options upon arrival at 'destination station'
Get data on cycle facilities at stations
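
Because people may not use their nearest station, one possible first step is to keep several candidate stations per origin instead of one. The sketch below, using hypothetical toy coordinates and an arbitrary `k = 2`, ranks stations by straight-line distance and keeps the closest k. It makes no API calls; travel times from a routing service could later replace `dist_m` as the ranking variable.

```r
# Hypothetical toy coordinates in metres; not real station locations
home = c(x = 4200, y = 300)
stations = data.frame(
  name = c("A", "B", "C", "D"),
  x = c(0, 5000, 10000, 4000),
  y = c(0, 0, 0, 3000),
  stringsAsFactors = FALSE
)

# Straight-line distance from home to every station
stations$dist_m = sqrt((stations$x - home["x"])^2 + (stations$y - home["y"])^2)

# Keep the k closest stations as candidates rather than only the nearest
k = 2
candidates = stations[order(stations$dist_m)[seq_len(k)], ]
print(candidates)
```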

Any questions?

Email: r.lovelace at leeds.ac.uk
Twitter: @robinlovelace
Thanks for listening