In this model we will use a small real dataset of residential zones and museums from Leeds to develop a simple spatial interaction (SI) model. The first stage is to load the data.
There are two input datasets for this model: populations in administrative zones and museums. For simplicity, both can be represented as points.
x = c("ggplot2", "sp", "rgeos", "mapproj", "rgdal", "maptools")
lapply(x, require, character.only = T)
## Loading required package: sp
## Loading required package: rgeos
## rgeos version: 0.2-17, (SVN revision 392) GEOS runtime version:
## 3.3.8-CAPI-1.7.8 Polygon checking: TRUE
## Loading required package: mapproj
## Loading required package: maps
## Loading required package: rgdal
## rgdal: version: 0.8-9, (SVN revision 470) Geospatial Data Abstraction
## Library extensions to R successfully loaded Loaded GDAL runtime: GDAL
## 1.10.0, released 2013/04/24 but rgdal build and GDAL runtime not in sync:
## ... consider re-installing rgdal!! Path to GDAL shared files:
## /usr/share/gdal/1.10 Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012,
## [PJ_VERSION: 480] Path to PROJ.4 shared files: (autodetected)
## Loading required package: maptools
## Loading required package: grid
## Loading required package: lattice
## Note: the specification for class "im" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Note: the specification for class "owin" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Note: the specification for class "ppp" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Note: the specification for class "psp" in package 'maptools' seems
## equivalent to one from package 'sp': not turning on duplicate class
## definitions for this class.
## Checking rgeos availability: TRUE
## [[1]]
## [1] TRUE
##
## [[2]]
## [1] TRUE
##
## [[3]]
## [1] TRUE
##
## [[4]]
## [1] TRUE
##
## [[5]]
## [1] TRUE
##
## [[6]]
## [1] TRUE
w <- readOGR(dsn = ".", layer = "wards") # read in the wards of Leeds
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "wards"
## with 68 features and 339 fields
## Feature type: wkbPolygon with 2 dimensions
w@data <- w@data[, 1:3] # remove unwanted columns
pops <- SpatialPointsDataFrame(coords = coordinates(w), data = w@data, proj4string = CRS("+init=epsg:27700")) # convert to points
plot(w)
points(pops) # geographical centroids (population weighted would be better)
h <- pops[which(grepl("Harehills", pops$ZONE_LABEL)), ] # select centre of study
id.near <- which(gDistance(h, w, byid = T) < 3000) # select areas close to Harehills
## Warning: spgeom1 and spgeom2 have different proj4 strings
hnear <- pops[id.near, ]
plot(w[id.near, ])
We use the recently developed osmar package to illustrate the location of the Royal Armoury museum. In fact, only a single coordinate is actually needed for this simple model (knowing the museum floor plan will be useful for other applications) as shown with the creation of Leeds City Museum as a point object:
library(osmar)
## Loading required package: XML
## Loading required package: RCurl
## Loading required package: bitops
## Loading required package: gtools
## Loading required package: geosphere
## Attaching package: 'osmar'
## The following object is masked from 'package:utils':
##
## find
src <- osmsource_api()
bb <- center_bbox(-1.532258, 53.791866, 60, 60)
# these coordinates were obtained from GeoHack:
# http://tools.wmflabs.org/geohack/geohack.php?pagename=Royal_Armouries_Museum¶ms=53.791866_N_-1.532258_E_region:GB_scale:2000
arm <- get_osm(bb, source = src)
plot(arm) # inbuilt osmar object plotting
summary(arm)
## osmar object
## 44 nodes, 3 ways, 2 relations
##
## osmar$nodes object
## 44 nodes, 5 tags
##
## ..$attrs data.frame:
## id, version, changeset, lat, lon, user, uid, visible,
## timestamp
## ..$tags data.frame:
## id, k, v
##
## Bounding box:
## lat lon
## min 53.79 -1.533
## max 53.79 -1.529
##
## Key-Value contingency table:
## Key Value Freq
## 1 barrier bollard 1
## 2 tourism museum 1
## 3 name Royal Armouries 1
##
##
## osmar$ways object
## 3 ways, 9 tags, 45 refs
##
## ..$attrs data.frame:
## id, visible, timestamp, version, changeset, user, uid
## ..$tags data.frame:
## id, k, v
## ..$refs data.frame:
## id, ref
##
## Key-Value contingency table:
## Key Value Freq
## 1 ncn_ref 67 1
## 2 name Armouries Square 1
## 3 highway footway 1
##
##
## osmar$relations object
## 2 relations, 11 tags, 0 node_refs, 567 way_refs
##
## ..$attrs data.frame:
## id, visible, timestamp, version, changeset, user, uid
## ..$tags data.frame:
## id, k, v
## ..$refs data.frame:
## id, type, ref, role
##
## Key-Value contingency table:
## Key Value Freq
## 1 route bicycle 2
## 2 type route 2
## 3 ref 67 1
arm$ways$tags # it's the joined nodes of the building we're after
## id k v
## 1 4326734 ncn_ref 67
## 2 4326734 highway footway
## 3 4326734 bicycle yes
## 4 4326734 foot yes
## 5 28836553 area yes
## 6 28836553 name Armouries Square
## 7 28836553 highway pedestrian
## 8 78431237 building yes
## 9 78431237 source OS OpenData StreetView
arm.ids <- find(arm, way(tags(k == "building")))
arm.ids # one way
## [1] 78431237
arm.ids <- find_down(arm, way(arm.ids))
arm.ids # composed of 25 nodes
## $node_ids
## [1] 9.204e+08 9.204e+08 9.204e+08 9.204e+08 1.414e+09 1.893e+09 9.204e+08
## [8] 9.204e+08 9.204e+08 9.204e+08 9.204e+08 1.414e+09 1.414e+09 9.204e+08
## [15] 9.204e+08 9.204e+08 9.204e+08 1.414e+09 1.414e+09 9.204e+08 9.204e+08
## [22] 9.204e+08 9.204e+08 9.204e+08 9.204e+08
##
## $way_ids
## [1] 78431237
##
## $relation_ids
## NULL
armoury.museum <- subset(arm, ids = arm.ids)
plot(armoury.museum)
arm <- armoury.museum
rm(armoury.museum) # name too long
Now that the Armoury Museum has been saved using raw Open Street Map data, it must by processed before it is included as a point in the SI model.
class(arm)
## [1] "osmar" "list"
args(as_sp)
## function (obj, what = c("points", "lines", "polygons"), crs = osm_crs(),
## simplify = TRUE)
## NULL
arm <- as_sp(arm, "polygons") # convert to widely used sp data structure
plot(arm) # now an sp data object
# change grid reference
arm <- SpatialPolygonsDataFrame(spTransform(arm, CRSobj = CRS("+init=epsg:27700")),
data = arm@data)
writeOGR(arm, dsn = "shps/", layer = "arm-floorplan", driver = "ESRI Shapefile") # save it
## Error: layer exists, use a new layer name
A quicker process (omitting the floor plan) was used to creat the Leeds City Museum object:
lcmuseum <- SpatialPoints(coords = matrix(c(-1.547323, 53.801617), nrow = 1),
proj4string = CRS("+init=epsg:4326"))
lcmuseum <- spTransform(lcmuseum, CRSobj = CRS("+init=epsg:27700"))
museums <- spRbind(SpatialPoints(coordinates(arm), proj4string = CRS("+init=epsg:27700")),
lcmuseum)
plot(w[id.near, ])
points(hnear)
points(museums, lwd = 3)
The phenomenon of people travelling to museums is similar to that of people travelling from many locations to a few shopping centres. Therefore we use a model commonly used to model flows to retail centres as an analogy. The equation estimates the flow (S) from each residential location (i) to each destination (j) (Wilson, 2000, p. 66):
\[ S_{ij} = A_ie_iP_iW_j^\alpha e^{-\beta c_{ij}} \]
where \[ A_i = frac{1}{\sum_k^\alpha e^{-\beta c_{ij}} \]
In the context of the input data presented above, we will in the first instance assume that \( e_i = 0.1 \) for all areas, that all areas are equally attractive (W = 1) and that \( c_{ij} \) is proportional to the Euclidean distance between the zone centroid and the museum.
round(gDistance(hnear, museums, byid = T)/1000, 2)
## 33 34 35 38 39 45 52 54 55 56
## 78431237 3.2 3.46 1.72 5.95 3.28 5.35 2.35 5.82 5.30 1.96
## 3.8 2.45 1.68 6.98 3.30 4.25 3.66 5.41 5.62 0.70
dij <- gDistance(museums, hnear, byid = T)/1000
alpha <- 0.3
beta <- 0.3
W <- A <- rep(1, times = nrow(museums@coords))
for (j in 1:nrow(museums@coords)) {
A <- 1/(sum(W[j] * exp(-beta * dij[, j])))
}
S <- A * 0.1 * hnear$ST1210001 * W * exp(-beta * dij)
plot(hnear$ST1210001, S[, 1])
plot(dij[, 1], S[, 1])
simodel <- lm(S[, 1] ~ hnear$ST1210001 + dij)
summary(simodel)
##
## Call:
## lm(formula = S[, 1] ~ hnear$ST1210001 + dij)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.459 -4.696 -0.713 1.189 10.498
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 92.01583 10.38800 8.86 0.00012 ***
## hnear$ST1210001 0.00770 0.00166 4.63 0.00358 **
## dij78431237 -19.89561 3.44963 -5.77 0.00119 **
## dij -1.48276 2.59818 -0.57 0.58893
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.83 on 6 degrees of freedom
## Multiple R-squared: 0.962, Adjusted R-squared: 0.942
## F-statistic: 50 on 3 and 6 DF, p-value: 0.000122
The linear regression model presented above is a 'sanity check' to see if the model responds to distance (from only museum 1 remember) and population. The answer is resounding YES: flow is negatively correlated with distance and positively so with population.
S[1:5]
## [1] 69.73 72.61 126.07 51.75 71.21
plot(hnear)
for (j in 1:nrow(museums@coords)) {
for (i in 1:nrow(hnear@data)) {
lines(c(coordinates(hnear)[i, 1], coordinates(museums)[j, 1]), c(coordinates(hnear)[i,
2], coordinates(museums)[j, 2]), lwd = (S[i + (j - 1) * nrow(hnear@data)]/mean(S))^3)
}
}
From first impressions, the spatial microsimulation model seems to work: flows are positively correlated with the population of the origins and distance to the destination. It is posible to see from the above model, for example that origins that are particularly close to one museum tend to have much greater flow to that museum than to the other more distant one.
There is much to do to further refine this model:
Wilson, A. G. (2000). Complex spatial systems: the modelling foundations of urban and regional analysis. Pearson Education.