A cheap, simple framework for distributed MATSim

Pieter Fourie - Future Cities Laboratory
14 March 2016

Singapore scenario

  • 300+ lines, 800+ routes, 2000+ stops, 70k+ links
  • PT must be simulated on the network: it is a dominant mode here, so detailed PT analysis is essential
  • Initial plans have negative scores because of low car availability and stranded agents
  • Large solution space -> long runs, e.g. 2 days on 24 cores
  • Long initialization time: 2 hrs on our 24-core server

Example run of Singapore, 30% replanning

Singapore standard scenario

Reference stopwatch

Problem

  • QSim is the limiting factor in performance
  • QSim performance improves with more cores
  • BUT only up to a point (tightly coupled, simple network decomposition)

Motivation: performance

  • Reduce iterations (QSim is expensive, with limited parallelization)
  • Reduce time per iteration and for initialization (large networks, transit routing)
  • Increase agent plan memory
  • Increase sample size (esp. important for transit: scalability; rescaling is error-prone because many configuration settings have to change)
  • Use cheap computation, e.g. Amazon EC2 or the office network
  • I.e. remove the CPU and RAM limitations of a single SMP machine

Design aims: integration

  • Monolithic replanning modules are hard to maintain in an evolving project
  • Need flexibility to grow with main MATSim project, easily integrate new functionality
  • Produce comparable results to MATSim

Design

PSim operation

  • Builds on previous work with Johannes Illenberger and Kai Nagel
  • Extension to the MATSim model; a meta-model that runs for a number of cycles in series
  • Fires agent events based on link travel times from the preceding QSim iteration
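The idea of firing events from stored travel times can be illustrated with a minimal Python sketch. It is not the PSim implementation: the 15-minute bin size, the `LinkEvent` type, the function names and the travel-time values are all assumptions made for illustration.

```python
from dataclasses import dataclass

BIN_SIZE_S = 900  # assumed 15-minute travel-time bins


@dataclass(frozen=True)
class LinkEvent:
    time_s: float
    kind: str  # "enter" or "leave"
    link_id: str


def travel_time(link_times, link_id, time_s):
    """Look up the travel time stored for the bin that contains time_s."""
    bins = link_times[link_id]
    index = min(int(time_s // BIN_SIZE_S), len(bins) - 1)
    return bins[index]


def pseudo_simulate(route, departure_s, link_times):
    """Replay a route against fixed link travel times (no queues, no interaction)."""
    events = []
    now = departure_s
    for link_id in route:
        events.append(LinkEvent(now, "enter", link_id))
        now += travel_time(link_times, link_id, now)
        events.append(LinkEvent(now, "leave", link_id))
    return events


# Hypothetical travel times (seconds) per bin, as if taken from the preceding QSim iteration.
link_times = {"a": [60.0, 120.0], "b": [30.0, 30.0]}
events = pseudo_simulate(["a", "b"], departure_s=870.0, link_times=link_times)
arrival_s = events[-1].time_s
```

Because each agent is replayed independently of all others, a loop like this can be split across threads or machines without coordination, which is what makes it cheap compared with a queue simulation.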

Design

  • Experiments with the car-only version of PSim showed that it could dramatically reduce simulation times
  • Having it run in series means that QSim waits for plans
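The cost of running in series can be made concrete with a toy timing model. This is a sketch under simplifying assumptions, not a measurement: one cycle is taken to be one QSim iteration plus a fixed number of PSim iterations, communication overhead is ignored, and all durations are hypothetical.

```python
def serial_cycle_time(t_qsim, t_psim, n_psim):
    """QSim and PSim take turns, so their durations add up."""
    return t_qsim + n_psim * t_psim


def overlapped_cycle_time(t_qsim, t_psim, n_psim):
    """QSim and PSim run at the same time, so the slower one sets the pace."""
    return max(t_qsim, n_psim * t_psim)


# Hypothetical durations in seconds.
t_qsim, t_psim, n_psim = 600.0, 40.0, 10

serial = serial_cycle_time(t_qsim, t_psim, n_psim)
overlapped = overlapped_cycle_time(t_qsim, t_psim, n_psim)
saving = serial - overlapped
```

Under these assumptions the overlapped cycle is bounded by the QSim time alone, at the price of PSim working with travel times that are one iteration older.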

PSim performance

Master-slave setup

Master-slave operation

Design elements

  • Transit PSim
  • Plan and person serialization
  • Load balancing
  • Operation modes: serial vs parallel (parallel implies operating on information one iteration older)
  • Full transit performance transmission
  • Stochastic boarding model for PSim
  • Works with most modules
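One possible reading of the load balancing element is that the master hands each slave a share of agents proportional to how fast it processed them last time. The Python sketch below illustrates that idea only; the function name, the throughput figures and the largest-remainder rounding are assumptions, not the scheme used in the framework.

```python
def balance_load(n_agents, throughput):
    """Split n_agents across slaves in proportion to measured throughput."""
    total = sum(throughput.values())
    exact = {slave: n_agents * rate / total for slave, rate in throughput.items()}
    shares = {slave: int(value) for slave, value in exact.items()}
    leftover = n_agents - sum(shares.values())
    # Hand the remaining agents to the slaves with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - shares[s], reverse=True)
    for slave in by_remainder[:leftover]:
        shares[slave] += 1
    return shares


# Hypothetical throughput (agents per second) measured in the previous iteration.
throughput = {"slave-1": 1000.0, "slave-2": 500.0, "slave-3": 250.0}
shares = balance_load(1000, throughput)
```

Recomputing the shares every iteration would let the allocation follow changes in machine load, which matters on shared hardware such as an office network.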

Performance of distributed system

  • Works, in principle
  • Parallel is preferable

Reference

Reference stopwatch

Parallel

Parallel stopwatch

Meta-model performance: Sioux Falls

plot of chunk sf.scorecompare

Meta-model characteristics

  • Increases the number of mutation combinations performed on a plan
  • Co-evolutionary algorithm relies on mutation rate and number of iterations
  • Number of unique mutation combinations performed on plans follows a multinomial distribution
  • Each plan can be said to have a 'genome' of mutations, e.g. I000A001B003A004G010
  • Standard MATSim can extend any given genome by at most one letter per iteration
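A genome string like the example above can be split into individual genes with a few lines of Python. The sketch assumes, from the example alone, that each gene is one uppercase letter followed by a three-digit iteration number; the meaning of the letters is not assumed.

```python
import re

# Assumed gene format: one uppercase letter plus a three-digit iteration number.
GENE = re.compile(r"([A-Z])(\d{3})")


def parse_genome(genome):
    """Return a list of (letter, iteration) pairs, rejecting malformed input."""
    genes = [(m.group(1), int(m.group(2))) for m in GENE.finditer(genome)]
    rebuilt = "".join(f"{letter}{iteration:03d}" for letter, iteration in genes)
    if rebuilt != genome:
        raise ValueError(f"unexpected genome format: {genome!r}")
    return genes


genes = parse_genome("I000A001B003A004G010")
genome_length = len(genes)
last_mutation_iteration = genes[-1][1]
```

Here `genome_length` is the number of mutations the plan has accumulated, which is the quantity that the gene length plots refer to.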

Genomic analysis of plans

  • How does genome length relate to exploration of solution space?

  • Can genomes possibly tell us something about scenario characteristics/complexity?

  • Can genome analysis inform effective replanning strategies, e.g. by varying replanning rates or chaining strategies based on effective combinations ('genes') observed in the genome?
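One way to start looking for effective combinations is to average plan scores over each pair of consecutive letters found in the genomes. The Python sketch below shows the mechanics only: the gene format (letter plus three digits), the genomes and the scores are hypothetical, and a mean over pairs is just one of many possible statistics.

```python
import re
from collections import defaultdict

# Assumed gene format: one uppercase letter plus a three-digit iteration number.
GENE = re.compile(r"([A-Z])\d{3}")


def strategy_letters(genome):
    """Return the sequence of letters in a genome, dropping iteration numbers."""
    return GENE.findall(genome)


def mean_score_by_pair(plans):
    """Average the plan score over every adjacent letter pair a genome contains."""
    totals = defaultdict(lambda: [0.0, 0])
    for genome, score in plans:
        letters = strategy_letters(genome)
        # Count each pair once per plan, even if it occurs several times.
        for pair in set(zip(letters, letters[1:])):
            totals[pair][0] += score
            totals[pair][1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


# Hypothetical (genome, score) records.
plans = [
    ("I000A001B003", 100.0),
    ("I000A002B005A007", 120.0),
    ("I000B001", 80.0),
]
pair_scores = mean_score_by_pair(plans)
```

Pairs with consistently high mean scores would be candidates for chaining, although a real analysis would need to control for genome length and iteration number.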

Gene length: Sioux Falls

MATSim reference run

plot of chunk sf.genelength.len.ref

Q:P = 1:10, parallel

plot of chunk sf.genelength.len.p10

Gene score: Sioux Falls

MATSim reference run

plot of chunk sf.genelength.scor.ref

Q:P = 1:10, parallel

plot of chunk sf.genelength.score.p10

plot of chunk sf.survivor.iter.ref

plot of chunk sf.survivor.iter.p10

plot of chunk sf.survivor.score.ref

plot of chunk sf.survivor.score.p10

Outstanding issues

Design

  • Make PSim a replanning module, so that data structures need not be regenerated for replanning

Performance tests

  • Number of slaves vs PSim iterations vs number of threads per slave vs time

New strategies

  • Diversification, large agent memories, randomized routing

Ensemble simulations

  • Instead of a single master, have multiple masters running different combinations of agent plans
  • Requires new modes of analysis

Singapore standard scenario

Diversification with 20 plans per agent