Gravity models

Transport Planning Lab - Week 7

Rafael Verduzco based on Dr McArthur

University of Glasgow

2025-10-02

Introduction

The four-step transport model

Orgin-destination logic

Source: Oshan (2022). https://doi.org/10.4337/9781789903942.00020

Getting started

Follow the next steps

  1. Create a new RStudio project
  2. Download the data and R file available from Moodle
  3. Copy the files into your Rstudio project directory.
  4. Create a new RMD or R file (ignore the R script included for now)

Read in the data

library(tidyverse)

# Read OD Norway data
od_norway <- read_csv('data/od_norway.csv')
glimpse(od_norway)
Rows: 1,089
Columns: 4
$ origin      <dbl> 1201, 1246, 1221, 1234, 1219, 1238, 1243, 1242, 1227, 1266…
$ destination <dbl> 1201, 1201, 1201, 1201, 1201, 1201, 1201, 1201, 1201, 1201…
$ distance    <dbl> 10.09861, 16.21161, 145.45896, 122.83646, 173.36263, 73.13…
$ flow        <dbl> 120379, 2601, 114, 9, 40, 77, 696, 39, 6, 8, 146, 347, 153…

Description of the data included.

  • The file od_norway.csv contains data for Norwegian municipalities in the county of Hordaland
  • Column ‘origin’ contains the ID of municipality of origin
  • Column ‘destination’ contains the ID of municipality of destination
  • The column ‘distance’ contains the road network distance from each municipality to every other
  • The column ‘flow’ gives the number of observed trips between the OD pair

We want to derive a trip distribution model showing the flows between each pair of municipalities.

The unconstrained model

The unconstrained gravity model

\[\begin{equation*} T_{ij} = O_i D_j f(d_{ij}) \end{equation*}\]

  • We need to define the function \(f(d_{ij})\)

Spatial deterrence or decaying functions

Implement negative exponential

  • Let’s use an exponential function: \(f(d_{ij})=exp(-\sigma d_{ij})\)
  • Assume \(\sigma = 0.05\)

\[\begin{equation*} T_{ij} = O_i D_k exp(-\sigma d_{ij}) \end{equation*}\]

Processing the distances

# Impedance using negative exponential function
impedance <- exp(-0.05 * od_norway$distance)

Compute trips generated and attracted

# Aggregate generated trips O_i
od_norway <- od_norway %>% 
  group_by(origin) %>% 
  mutate(O_i = sum(flow))

# Aggregate attracted trips D_j
od_norway <- od_norway %>% 
  group_by(destination) %>% 
  mutate(D_j = sum(flow)) %>% 
  ungroup()

Predict trip distribution using the unconstrained model

# Fit unconstrained model
od_norway$unc_predicted <- od_norway$O_i * od_norway$D_j * impedance

Is it any good?

  • We want to know whether this prediction is any good
  • We can have a look at the data
  • We can also use a measure of goodness of fit
  • In a linear regression we could use a value for \(R^2\)
  • For this sort of model, it has been suggested (by Stuart Fotheringham) we could use the Standardised Root Mean Square Error (SRMSE)
  • This would allow us to compare a predicted trip distribution matrix to an observed one

Comparing the estimates and the observed

# Trips predicted
sum(od_norway$unc_predicted)
[1] 15827635038
# Total observed flows
sum(od_norway$flow)
[1] 231743

Distribution of the flows

summary(od_norway$unc_predicted)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
0.000e+00 3.720e+02 6.992e+03 1.453e+07 1.304e+05 1.111e+10 
summary(od_norway$flow)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
     0.0      0.0      1.0    212.8      7.0 120379.0 

The Standardised Root Mean Square Error (SRMSE)

\[\begin{equation*} \textrm{SRMSE} = \frac{ \sqrt{\frac{\Sigma_i \Sigma_j (\hat{T}_{ij} - T_{ij})^2}{IJ}} }{ \frac{T}{IJ} } \end{equation*}\]

Writing a function to calculate the SRMSE

# Define FN: standardized residual mean squared error
srmse <- function(observed, fitted) {
  sumsquares <- sum((observed - fitted)^2)
  srmse <- sqrt((sumsquares / length(observed))) /
    (sum(observed) / length(observed))
  return(srmse)
}

What is the SRMSE value for our model?

srmse(od_norway$flow, od_norway$unc_predicted)
[1] 1594282

Doubly constrained gravity model

Doubly constrained gravity model formulation

\[\begin{equation*} \hat{T_{ij}} = A_i O_i B_j D_j exp(-\sigma d_{ij}) \end{equation*}\] \[\begin{align*} A_i &= \frac{1}{B_j D_j exp(-\sigma d_{ij})} \\ B_j &= \frac{1}{A_i O_i exp(-\sigma d_{ij})} \end{align*}\]

Implementing the constraints

  • Note that balancing factors are inter-dependent
  • Balancing factors can be estimated using an iterative routine
  • We write our own function to impose the constraints
  • This is found in the file balancing_dc.R
  • We need to supply the marginal constraints (origin and destination sums)
  • We also supply the matrix we want to be updated

Step 1

Step 2

Updating the estimated trip distribution matrix

# Load balancing factor function
source('balancing_dc.R')

# Estimate balancing factors
od_norway <- balancing_dc(od_norway, impedance) 

# Fit doubly constrained model
od_norway$dc_pred <- 
  od_norway$O_i * od_norway$Ai * od_norway$D_j * od_norway$Bj * impedance

Is this an improvement?

# Predicted
sum(od_norway$dc_pred)
[1] 231743
# Observed
sum(od_norway$flow)
[1] 231743
# SRMSE
srmse(od_norway$flow, od_norway$dc_pred)
[1] 2.974554

What about trying \(\sigma=0.06\)

Re-run the doubly-constrained model using \(\sigma=0.06\)

# Impedance
impedance <- exp(-0.06 * od_norway$distance)
# Estimate balancing factors
od_norway <- balancing_dc(od_norway, impedance) 
# Fit doubly constrained model
od_norway$dc_pred2 <- 
  od_norway$O_i * od_norway$Ai * od_norway$D_j * od_norway$Bj * impedance
# SRMSE
srmse(od_norway$flow, od_norway$dc_pred2)
[1] 2.621599

Selecting a parameter

  • We could keep trying different parameter values to get a better fit
  • It would be better to ask R to do this for us
  • R has various optimisation functions which we could use
  • We will use the aptly named optimise() function
  • We have to write a function which the function can optimise

Wrapping our code in an optimisable function

# Wrap the steps in a function
srmse_beta <- function(beta){
  # Impedance
  impedance <- exp(-beta * od_norway$distance)
  # Estimate balancing factors
  od_data <- balancing_dc(od_norway, impedance) 
  # Fit doubly constrained model
  dc_pred <- 
    od_data$O_i * od_data$Ai * od_data$D_j * od_data$Bj * impedance
  # SRMSE
  srmse(od_data$flow, dc_pred)
}

Optimising the distance decay parameter

optimised_sigma <- optimise(srmse_beta, maximum = FALSE, interval = c(0, 0.4))
optimised_sigma
$minimum
[1] 0.2362804

$objective
[1] 0.7275815

Let’s try the inverse power function

\[\begin{equation*} \hat{T_{ij}} = A_i O_i B_j D_j d_{ij}^{-\sigma} \end{equation*}\] \[\begin{align*} A_i &= \frac{1}{B_j D_j d_{ij}^{-\sigma}} \\ B_j &= \frac{1}{A_i O_i d_{ij}^{-\sigma}} \end{align*}\]

Implementing it in R

Wrap the steps in a function

srmse_power <- function(beta){
  # Impedance
  impedance <- od_norway$distance ^ -beta
  # Estimate balancing factors
  od_norway <- balancing_dc(od_norway, impedance) 
  # Fit doubly constrained model
  dc_pred <- 
    od_norway$O_i * od_norway$Ai * od_norway$D_j * od_norway$Bj * impedance
  # SRMSE
  srmse(od_norway$flow, dc_pred)
}

Optimising

optimised_power <- 
  optimise(srmse_power, maximum = FALSE, interval = c(0, 4))
optimised_power
$minimum
[1] 2.556406

$objective
[1] 0.5603094

Which decaying function is better in this context?

Conclusion

  • These are some of the basics of how gravity models work
  • We have seen how to estimate a trip distribution matrix from modelled generations/attractions
  • We have also seen how to calibrate a gravity model using observed data
  • We have also experimented with different decaying functions
  • There are other frameworks to fit the gravity model, e.g. maximum entropy, maximum likelihood, poisson regression.

Individual activity

There is a plan to construct a tunnel between origin 1201 and destination 1253. This will reduce the distance by 20 kilometres. Assume that the total trips generated and attracted remain unchanged (i.e. using the unconstrained model). Use your most accurate model and calibrated parameters to estimate:

  1. The change in flow in this OD pair.
  2. How other OD pairs migh change changing