A spatial interaction model of museum visits

In this model we will use a small real dataset of residential zones and museums from Leeds to develop a simple spatial interaction (SI) model. The first stage is to load the data.

Loading the test data

There are two input datasets for this model: populations in administrative zones and museums. For simplicity, both can be represented as points.

The zonal population data

x = c("ggplot2", "sp", "rgeos", "mapproj", "rgdal", "maptools")
lapply(x, require, character.only = T)
## Loading required package: sp
## Loading required package: rgeos
## rgeos version: 0.2-17, (SVN revision 392) GEOS runtime version:
## 3.3.8-CAPI-1.7.8 Polygon checking: TRUE
## Loading required package: mapproj
## Loading required package: maps
## Loading required package: rgdal
## rgdal: version: 0.8-9, (SVN revision 470) Geospatial Data Abstraction
## Library extensions to R successfully loaded Loaded GDAL runtime: GDAL
## 1.10.0, released 2013/04/24 but rgdal build and GDAL runtime not in sync:
## ... consider re-installing rgdal!! Path to GDAL shared files:
## /usr/share/gdal/1.10 Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012,
## [PJ_VERSION: 480] Path to PROJ.4 shared files: (autodetected)
## Loading required package: maptools
## Loading required package: grid
## Loading required package: lattice
## Note: the specification for class "im" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Note: the specification for class "owin" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Note: the specification for class "ppp" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Note: the specification for class "psp" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Checking rgeos availability: TRUE
## [[1]]
## [1] TRUE
## 
## [[2]]
## [1] TRUE
## 
## [[3]]
## [1] TRUE
## 
## [[4]]
## [1] TRUE
## 
## [[5]]
## [1] TRUE
## 
## [[6]]
## [1] TRUE
w <- readOGR(dsn = ".", layer = "wards")  # read in the wards of Leeds
## OGR data source with driver: ESRI Shapefile 
## Source: ".", layer: "wards"
## with 68 features and 339 fields
## Feature type: wkbPolygon with 2 dimensions
w@data <- w@data[, 1:3]  # remove unwanted columns
pops <- SpatialPointsDataFrame(coords = coordinates(w), data = w@data, proj4string = CRS("+init=epsg:27700"))  # convert to points
plot(w)
points(pops)  # geographical centroids (population weighted would be better)

plot of chunk unnamed-chunk-1


h <- pops[which(grepl("Harehills", pops$ZONE_LABEL)), ]  # select centre of study
id.near <- which(gDistance(h, w, byid = T) < 3000)  # select areas close to Harehills
## Warning: spgeom1 and spgeom2 have different proj4 strings
hnear <- pops[id.near, ]
plot(w[id.near, ])

plot of chunk unnamed-chunk-1

The Royal Armoury

We use the recently developed osmar package to illustrate the location of the Royal Armoury museum. In fact, only a single coordinate is actually needed for this simple model (knowing the museum floor plan will be useful for other applications) as shown with the creation of Leeds City Museum as a point object:

library(osmar)
## Loading required package: XML
## Loading required package: RCurl
## Loading required package: bitops
## Loading required package: gtools
## Loading required package: geosphere
## Attaching package: 'osmar'
## The following object is masked from 'package:utils':
## 
## find
src <- osmsource_api()
bb <- center_bbox(-1.532258, 53.791866, 60, 60)
# these coordinates were obtained from GeoHack:
# http://tools.wmflabs.org/geohack/geohack.php?pagename=Royal_Armouries_Museum&params=53.791866_N_-1.532258_E_region:GB_scale:2000

arm <- get_osm(bb, source = src)
plot(arm)  # inbuilt osmar object plotting

plot of chunk unnamed-chunk-2

summary(arm)
## osmar object
## 44 nodes, 3 ways, 2 relations 
## 
## osmar$nodes object
## 44 nodes, 5 tags 
## 
## ..$attrs data.frame: 
##     id, version, changeset, lat, lon, user, uid, visible,
##     timestamp 
## ..$tags data.frame: 
##     id, k, v 
##  
## Bounding box:
##       lat    lon
## min 53.79 -1.533
## max 53.79 -1.529
## 
## Key-Value contingency table:
##       Key           Value Freq
## 1 barrier         bollard    1
## 2 tourism          museum    1
## 3    name Royal Armouries    1
## 
## 
## osmar$ways object
## 3 ways, 9 tags, 45 refs 
## 
## ..$attrs data.frame: 
##     id, visible, timestamp, version, changeset, user, uid 
## ..$tags data.frame: 
##     id, k, v 
## ..$refs data.frame: 
##     id, ref 
##  
## Key-Value contingency table:
##       Key            Value Freq
## 1 ncn_ref               67    1
## 2    name Armouries Square    1
## 3 highway          footway    1
## 
## 
## osmar$relations object
## 2 relations, 11 tags, 0 node_refs, 567 way_refs 
## 
## ..$attrs data.frame: 
##     id, visible, timestamp, version, changeset, user, uid 
## ..$tags data.frame: 
##     id, k, v 
## ..$refs data.frame: 
##     id, type, ref, role 
##  
## Key-Value contingency table:
##     Key   Value Freq
## 1 route bicycle    2
## 2  type   route    2
## 3   ref      67    1
arm$ways$tags  # it's the joined nodes of the building we're after
##         id        k                      v
## 1  4326734  ncn_ref                     67
## 2  4326734  highway                footway
## 3  4326734  bicycle                    yes
## 4  4326734     foot                    yes
## 5 28836553     area                    yes
## 6 28836553     name       Armouries Square
## 7 28836553  highway             pedestrian
## 8 78431237 building                    yes
## 9 78431237   source OS OpenData StreetView

arm.ids <- find(arm, way(tags(k == "building")))
arm.ids  # one way
## [1] 78431237
arm.ids <- find_down(arm, way(arm.ids))
arm.ids  # composed of 25 nodes
## $node_ids
##  [1] 9.204e+08 9.204e+08 9.204e+08 9.204e+08 1.414e+09 1.893e+09 9.204e+08
##  [8] 9.204e+08 9.204e+08 9.204e+08 9.204e+08 1.414e+09 1.414e+09 9.204e+08
## [15] 9.204e+08 9.204e+08 9.204e+08 1.414e+09 1.414e+09 9.204e+08 9.204e+08
## [22] 9.204e+08 9.204e+08 9.204e+08 9.204e+08
## 
## $way_ids
## [1] 78431237
## 
## $relation_ids
## NULL
armoury.museum <- subset(arm, ids = arm.ids)
plot(armoury.museum)

plot of chunk unnamed-chunk-2

arm <- armoury.museum
rm(armoury.museum)  # name too long

Now that the Armoury Museum has been saved using raw Open Street Map data, it must by processed before it is included as a point in the SI model.

class(arm)
## [1] "osmar" "list"
args(as_sp)
## function (obj, what = c("points", "lines", "polygons"), crs = osm_crs(), 
##     simplify = TRUE) 
## NULL
arm <- as_sp(arm, "polygons")  # convert to widely used sp data structure
plot(arm)  # now an sp data object

plot of chunk unnamed-chunk-3


# change grid reference
arm <- SpatialPolygonsDataFrame(spTransform(arm, CRSobj = CRS("+init=epsg:27700")), 
    data = arm@data)
writeOGR(arm, dsn = "shps/", layer = "arm-floorplan", driver = "ESRI Shapefile")  # save it
## Error: layer exists, use a new layer name

Leeds City Museum

A quicker process (omitting the floor plan) was used to creat the Leeds City Museum object:

lcmuseum <- SpatialPoints(coords = matrix(c(-1.547323, 53.801617), nrow = 1), 
    proj4string = CRS("+init=epsg:4326"))
lcmuseum <- spTransform(lcmuseum, CRSobj = CRS("+init=epsg:27700"))
museums <- spRbind(SpatialPoints(coordinates(arm), proj4string = CRS("+init=epsg:27700")), 
    lcmuseum)

Implementing the model

The simplest possible implementation

plot(w[id.near, ])
points(hnear)
points(museums, lwd = 3)

plot of chunk unnamed-chunk-5

Spatial interaction model n. 1

The phenomenon of people travelling to museums is similar to that of people travelling from many locations to a few shopping centres. Therefore we use a model commonly used to model flows to retail centres as an analogy. The equation estimates the flow (S) from each residential location (i) to each destination (j) (Wilson, 2000, p. 66):

\[ S_{ij} = A_ie_iP_iW_j^\alpha e^{-\beta c_{ij}} \]

where \[ A_i = frac{1}{\sum_k^\alpha e^{-\beta c_{ij}} \]

In the context of the input data presented above, we will in the first instance assume that \( e_i = 0.1 \) for all areas, that all areas are equally attractive (W = 1) and that \( c_{ij} \) is proportional to the Euclidean distance between the zone centroid and the museum.

Euclidean distances

round(gDistance(hnear, museums, byid = T)/1000, 2)
##           33   34   35   38   39   45   52   54   55   56
## 78431237 3.2 3.46 1.72 5.95 3.28 5.35 2.35 5.82 5.30 1.96
##          3.8 2.45 1.68 6.98 3.30 4.25 3.66 5.41 5.62 0.70
dij <- gDistance(museums, hnear, byid = T)/1000

Implementing the model in R

alpha <- 0.3
beta <- 0.3
W <- A <- rep(1, times = nrow(museums@coords))

for (j in 1:nrow(museums@coords)) {
    A <- 1/(sum(W[j] * exp(-beta * dij[, j])))
}

S <- A * 0.1 * hnear$ST1210001 * W * exp(-beta * dij)
plot(hnear$ST1210001, S[, 1])

plot of chunk unnamed-chunk-7

plot(dij[, 1], S[, 1])

plot of chunk unnamed-chunk-7

simodel <- lm(S[, 1] ~ hnear$ST1210001 + dij)
summary(simodel)
## 
## Call:
## lm(formula = S[, 1] ~ hnear$ST1210001 + dij)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.459 -4.696 -0.713  1.189 10.498 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      92.01583   10.38800    8.86  0.00012 ***
## hnear$ST1210001   0.00770    0.00166    4.63  0.00358 ** 
## dij78431237     -19.89561    3.44963   -5.77  0.00119 ** 
## dij              -1.48276    2.59818   -0.57  0.58893    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.83 on 6 degrees of freedom
## Multiple R-squared:  0.962,  Adjusted R-squared:  0.942 
## F-statistic:   50 on 3 and 6 DF,  p-value: 0.000122

The linear regression model presented above is a 'sanity check' to see if the model responds to distance (from only museum 1 remember) and population. The answer is resounding YES: flow is negatively correlated with distance and positively so with population.

Preliminary plots of the model

S[1:5]
## [1]  69.73  72.61 126.07  51.75  71.21
plot(hnear)
for (j in 1:nrow(museums@coords)) {
    for (i in 1:nrow(hnear@data)) {
        lines(c(coordinates(hnear)[i, 1], coordinates(museums)[j, 1]), c(coordinates(hnear)[i, 
            2], coordinates(museums)[j, 2]), lwd = (S[i + (j - 1) * nrow(hnear@data)]/mean(S))^3)
    }
}

plot of chunk unnamed-chunk-8

From first impressions, the spatial microsimulation model seems to work: flows are positively correlated with the population of the origins and distance to the destination. It is posible to see from the above model, for example that origins that are particularly close to one museum tend to have much greater flow to that museum than to the other more distant one.

There is much to do to further refine this model:

References

Wilson, A. G. (2000). Complex spatial systems: the modelling foundations of urban and regional analysis. Pearson Education.