14 July 2016. Slides made with RStudio. Reproducible code: https://github.com/npct/pct-rail-test

Context: data analysis software

  • Microsoft announced its acquisition of Revolution Analytics in January 2015

  • Opportunities for development of open source technologies for sustainable planning

The Propensity to Cycle Tool

Motivation

Lack of strategic cycle networks

What is the PCT?

  • An open source, publicly accessible web-based tool for assessing the geographical distribution of cycling potential, and the benefits of realising it, nationwide

  • Best illustrated in a demo: http://pct.bike/

What kind of questions can it help answer?

  • Where should we build for existing cyclists?
  • New cyclists in the medium term?
  • Long-term strategy?
  • Along which routes to create the strategic joined-up network?
  • Where should cycling interventions be prioritised around public transport infrastructure?

Phase I (Feb - July 2015)

  • Build and test a prototype model
  • Identify 'desire lines' of greatest potential
  • Make the tool scalable nationally
  • Create a website that will make the Propensity to Cycle Tool a publicly accessible resource

Phase II (January 2016 - March 2017)

Version 1 - nationwide (V1 launch: June 2016)

  • Route-allocated hilliness, network layer (complete)
  • Include health outcomes (HEAT)
  • National-level results (Anna Goodman)
  • V1.5 - smaller (LSOA) zones (Jan 2017)
  • Training

Version 2 - local deployment

  • Include non-commute trips
  • Compatibility with Local Authority trip demand models
  • Micro-level analysis (V 2.2)

Additional work/spin outs

  • Case studies of use (e.g. Manchester, Kent, Yorkshire)
  • Method for identifying severance
  • Case study along HS2 route
  • 'Hackathons' to stimulate the tool's development

Data sources for cycling potential to stations

Warning: We'll be using R code

Using the following packages

pkgs = c(
  "stplanr", # transport data handling
  "sp", # package for spatial data
  "tmap", # package for mapping
  "deldir", # voronoi polygons
  "readr", # fast read/write
  "rgeos", # gis functions
  "dplyr" # data analysis
)
# load all packages; invisible() suppresses the printed list of attached packages
invisible(lapply(pkgs, library, character.only = TRUE))

Where are they?

Source: http://www.projectmapping.co.uk/

Downloading official data on stations

Source: data.gov.uk

u = "http://www.dft.gov.uk/NaPTAN/snapshot/NaPTANcsv.zip"
zf = file.path(tempdir(), "NaPTANcsv.zip")
download.file(url = u, destfile = zf)
unzip(zf, exdir = tempdir())
stations = read_csv(file = file.path(tempdir(), "stops.csv"))
rail = filter(stations, StopType == "RSE")
saveRDS(rail, "data/rail.Rds")

The NaPTAN data

What does it mean? Details: gov.uk

# A tibble: 6 x 43
      ATCOCode NaptanCode PlateCode CleardownCode                CommonName CommonNameLang ShortCommonName ShortCommonNameLang Landmark
         <chr>      <chr>     <chr>         <chr>                     <chr>          <chr>           <chr>               <chr>    <chr>
1 0100BRP90087    bstgtaw                                  Brunswick Street             en                                             
2 0100BRP90088    bstgtjp                                   Brigstocke Road             en                                             
3 0100BRP90089    bstgtpj                                   Brigstocke Road             en                                             
4 0100BRP90090    bstgwgm                                    Denbigh Street             en                                             
5 0100BRP90091    bstgwta                                    Denbigh Street             en                                             
6 0100CLFDOWN0                                    Clifton Down Rail Station             en                                             
# ... with 34 more variables: LandmarkLang <chr>, Street <chr>, StreetLang <chr>, Crossing <chr>, CrossingLang <chr>, Indicator <chr>,
#   IndicatorLang <chr>, Bearing <chr>, NptgLocalityCode <chr>, LocalityName <chr>, ParentLocalityName <chr>,
#   GrandParentLocalityName <chr>, Town <chr>, TownLang <chr>, Suburb <chr>, SuburbLang <chr>, LocalityCentre <int>, GridType <chr>,
#   Easting <int>, Northing <int>, Longitude <dbl>, Latitude <dbl>, StopType <chr>, BusStopType <chr>, TimingStatus <chr>,
#   DefaultWaitTime <chr>, Notes <chr>, NotesLang <chr>, AdministrativeAreaCode <chr>, CreationDateTime <time>,
#   ModificationDateTime <time>, RevisionNumber <int>, Modification <chr>, Status <chr>

The geographical distribution of stations

rail = readRDS("data/rail.Rds")
coordinates(rail) = ~Longitude+Latitude
plot(rail)

Usage stats

Source: Office of Rail and Road

Case study data: Cambridge

Data from the PCT: pct.bike/cambridgeshire/

z = readRDS("../pct-data/cambridgeshire/z.Rds")
qtm(z) +
  qtm(rail, bubble.size = 0.3)
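
The later slides load a `rail_cam` object holding only the stations in the case study area, but the subsetting step is not shown. A minimal sketch of one way to do it, assuming a simple bounding-box filter on the `Easting` and `Northing` columns. The station names, coordinates and bounding box below are made up for illustration, and the original analysis may well have used a spatial subset against the zone polygons instead.

```r
# Toy station table; names and coordinates are illustrative, not real NaPTAN records
stations_toy = data.frame(
  CommonName = c("Station A", "Station B", "Station C", "Station D"),
  Easting    = c(546000, 554000, 520000, 600000),
  Northing   = c(257000, 280000, 250000, 300000),
  stringsAsFactors = FALSE
)

# Hypothetical bounding box of the study area, in metres
bb = c(xmin = 540000, xmax = 560000, ymin = 250000, ymax = 285000)

# Keep only stations whose coordinates fall inside the box
in_box = with(stations_toy,
  Easting >= bb["xmin"] & Easting <= bb["xmax"] &
  Northing >= bb["ymin"] & Northing <= bb["ymax"])
rail_sub = stations_toy[in_box, ]
rail_sub$CommonName
```

A bounding box is cruder than a polygon-based subset: it can keep stations just outside an irregular boundary, which may be acceptable when nearby stations outside the area are still relevant to residents.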

Methodology

Splitting a single line in 3

Identifying shortest paths

  • Various ways of doing this
  • Voronoi polygons are the most visual:

rail_cam = readRDS("data/rail_cam.Rds")
voronoi = deldir(rail_cam$Easting, rail_cam$Northing)
plot(voronoi)

Find closest stations
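
The desire-line code in these slides uses `mat[i]` to pick a station for centroid `i`, but the slides do not show how that index is built. A point's nearest station is the one whose Voronoi cell contains it, so the same assignment can be computed directly from distances. The sketch below does this in base R; the object names (`cents_toy`, `rail_toy`, `nearest`) and all coordinates are hypothetical, and this is not necessarily how the original index was produced.

```r
# Toy zone centroids and stations (coordinates made up for illustration)
cents_toy = data.frame(zone = c("z1", "z2", "z3"),
                       x = c(0, 10, 4), y = c(0, 0, 9),
                       stringsAsFactors = FALSE)
rail_toy  = data.frame(station = c("s1", "s2"),
                       x = c(1, 9), y = c(1, 1),
                       stringsAsFactors = FALSE)

# Distance matrix: one row per centroid, one column per station
dx = outer(cents_toy$x, rail_toy$x, "-")
dy = outer(cents_toy$y, rail_toy$y, "-")
dists = sqrt(dx^2 + dy^2)

# Row index (into rail_toy) of the closest station for each centroid
nearest = apply(dists, 1, which.min)

# Attach the name of the nearest station to each centroid
cents_toy$rail = rail_toy$station[nearest]
```

This assumes projected coordinates (such as the Easting/Northing columns in the NaPTAN data), where Euclidean distance is meaningful; it should not be applied directly to longitude/latitude.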

Create travel-to-station 'desire lines'

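# Build one straight line from each zone centroid to its station.
# `cents` holds the zone centroids; `mat[i]` is taken to be the row of
# the station in rail_cam that is nearest to centroid i.
for(i in seq_len(nrow(cents))){
  o = cents[i,]        # origin: zone centroid
  d = rail_cam[mat[i],] # destination: nearest station
  od = sbind(SpatialPoints(o), SpatialPoints(d))
  l = SpatialLines(list(Lines(list(Line(coordinates(od))), "X")))
  # start the collection with the first line, then append the rest
  if(i == 1) L = l else L = sbind(L, l)
}
# attach the station attribute of each centroid to its line
ld = SpatialLinesDataFrame(L, cents@data["rail"], match.ID = FALSE)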
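
For quick checks it can help to hold the same origin-destination pairs as a plain table, with the straight-line length of each desire line alongside. The following is a minimal base R sketch under that assumption; the data and the names `cents_toy`, `rail_toy`, `nearest` and `lines_toy` are hypothetical and not part of the original analysis.

```r
# Toy inputs (made up): centroids, stations and the nearest-station index
cents_toy = data.frame(zone = c("z1", "z2"),
                       x = c(0, 20000), y = c(0, 0),
                       stringsAsFactors = FALSE)
rail_toy  = data.frame(station = c("s1", "s2"),
                       x = c(3000, 20000), y = c(4000, 1000),
                       stringsAsFactors = FALSE)
nearest   = c(1, 2)  # assumed output of a nearest-station search

# One row per desire line: origin, destination and coordinates of both ends
lines_toy = data.frame(
  zone    = cents_toy$zone,
  station = rail_toy$station[nearest],
  ox = cents_toy$x, oy = cents_toy$y,
  dx = rail_toy$x[nearest], dy = rail_toy$y[nearest],
  stringsAsFactors = FALSE
)

# Straight-line length in km, assuming coordinates are in metres
lines_toy$dist_km = sqrt((lines_toy$dx - lines_toy$ox)^2 +
                         (lines_toy$dy - lines_toy$oy)^2) / 1000
```

Straight-line length understates the distance actually cycled, so it is better suited to screening lines than to estimating journey times.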

The results

Concept: from a hackathon

Further work

Identify the problem we're trying to solve

  • Building cycle paths to stations (layer in the PCT)
  • Identify where there's a short-fall of cycle parking?
  • Identify target areas to incentivise to cycle?
  • Provision of cycles at 'arrival station'

Work that needs to be done:

  • Find if rail is a viable option for non-rail trips
    • Trains may already be at capacity (e.g. Ely)
    • Many destinations are not close to rail stations
  • Find if nearest station is suitable per rail commuter
    • People may not go to the nearest station
    • TransportAPI or Google's Distance Matrix API can help
  • Explore travel options upon arrival at 'destination station'
  • Get data on cycle facilities at stations

Any questions?

  • Email: r.lovelace at leeds.ac.uk
  • Twitter: @robinlovelace
  • Thanks for listening