Pre-requisites

For this analysis we use R. Once this is installed and running in an appropriate IDE (e.g. RStudio), you’ll need to install the following packages:

pkgs = c("sp", "stplanr", "tmap")

If these are not yet installed on your computer, they can be installed with the following command:

install.packages(pkgs)

Then you’ll need to load them with the library() command:

lapply(pkgs, library, character.only = TRUE)

Then we’ll download and unzip the data as follows:

u = "https://www.juntadeandalucia.es/institutodeestadisticaycartografia/DERA/ficheros/G07_Sistema_Urbano.zip"
download.file(u, "G07_Sistema_Urbano.zip")
unzip("G07_Sistema_Urbano.zip")

Exploring the data

What have we just downloaded? We can see each of spatial .shp files, in ascending order of size, with the following commands:

f = list.files(path = "G07_SistemaUrbano/", pattern = ".shp$")
s = file.size(paste0("G07_SistemaUrbano/", f))
paste(f, round(s / 1000000), "MB")[order(s)]
##  [1] "su06_ed_singular.shp 0 MB"                     
##  [2] "su02_2_nucleo_pun.shp 0 MB"                    
##  [3] "su03_1_poblamiento_entidad_colectiva.shp 0 MB" 
##  [4] "su07_verde_urb.shp 3 MB"                       
##  [5] "su08_grid_poblacion_250.shp 7 MB"              
##  [6] "su02_1_nucleo_pol.shp 8 MB"                    
##  [7] "su03_2_poblamiento_entidad_singular.shp 9 MB"  
##  [8] "su01_asentamiento.shp 18 MB"                   
##  [9] "su03_3_poblamiento_nucleo_diseminado.shp 24 MB"
## [10] "su04_manzana.shp 49 MB"                        
## [11] "su05_ed_rural.shp 54 MB"

Which are appropriate for estimating travel patterns? For that we need places in space (points or polygons) and an estimate of the population. We can plot this with the following code:

Let’s plot each of the files.

# Warning: may take several minutes to run - skip to next code chunk to save time
shp_list = as.list(f)
names(shp_list) = f
par(mfrow = c(4, 3))
for(i in f){
  shp_list[[i]] = read_shape(paste0("G07_SistemaUrbano/", i))
  plot(shp_list[[i]], main = i, sub = paste(names(shp_list[[i]]), collapse = " "))
}
par(mfrow = c(1, 1))

# Save the data to avoid having read in the shapefiles next time
dir.create("data")
saveRDS(shp_list, "data/shp_list.Rds")

The point dataset is particularly interesting, as these mark centrepoints of populations:

f[grepl(pattern = "pun", f)]
## [1] "su02_2_nucleo_pun.shp"

Load this point dataset separately and analyse it, as follows:

z = read_shape("G07_SistemaUrbano/su02_2_nucleo_pun.shp")
plot(z)

length(z) # number of zones
## [1] 2736
sum(z$POBLACION) / 1000000 # total population (millions)
## [1] 8.124171
mean(z$POBLACION) # mean population
## [1] 2969.361

The results show that we are dealing with a large area: over million people. The average size of a zone centroid is just under 3 thousand, a suitable size for analysing aggregate travel patterns. This size of dataset is manageable on a single computer. However, for the purposes of demonstrating the method, we will use only a subset of the data focussed on Seville, as illustrated below:

sevilla = z[z$MUNICIPIO == "Sevilla",]
buff = rgeos::gBuffer(sevilla, width = 10000)
plot(buff)
points(z)

z = z[buff,]
plot(z)

To estimate the travel flows between the points within the buffer around Sevilla, we’ll use the Radiation Model:

flow = od_radiation(p = z, pop_var = "POBLACION")
plot(flow)

summary(flow$flow)
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
##     0.000     0.000     0.052   723.700    24.330 30440.000        29

The results show we have estimated the flow between all combination of OD pairs in the study region. The next stage is to keep only the ones with high estiamates flows, above the mean in this case, for plotting:

flow = flow[!is.na(flow$flow),]
flow = flow[flow$flow > mean(flow$flow),]
plot(flow, lwd = flow$flow / mean(flow$flow))

flow_wgs = spTransform()
mapview::mapview(flow)

From this point we can proceed to analyse cycling potential in a model that is unconstrained by the current rate of cycling.

Next steps