Plotting Openstreetmap polygons and lines

Illustrates plotting polygons of buildings, parks, waterways, and other structures, with a couple of maps shown at the bottom of this document.

The polygons are obtained from openstreetmap (OSM). While the R package osmar enables ready manipulation of OSM data, it downloads ALL data in a given bounding box (bbox), and bboxes are thus constrained to being very small. This script circumvents that restriction by directly downloading only those mapquest API data that match given filters. This also results in categorically different physical structures (such as buildings and waterways) being stored as different R structures, enabling them to be subsequently plotted in different colours.

Get raw data

The bbox in this example cover a fair portion of central London, U.K. The variable “type” includes just some of the possible OSM categories, as defined in get.osm.structs below. Additional possibilities include way[natural=tree] but trees are very much a work in progress on OSM. Local tweaks of these types will likely be required. For example, lakes are generally not waterways, rather they are tagged as natural=water. Similarly, landuse=grass should capture all parks, yet may not.

get.raw.dat <- function(type = "building") {
    bbox <- "[bbox=-0.21,51.484,-0.06,51.54]"
    # bbox <- '[bbox=-0.15,51.5,-0.1,51.52]' # smaller bit of London
    if (type == "building") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[building=*]", 
            bbox, sep = ""))
    } else if (type == "waterway") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[waterway=*]", 
            bbox, sep = ""))
    } else if (type == "natural") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[natural=water]", 
            bbox, sep = ""))
    } else if (type == "grass") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[landuse=grass]", 
            bbox, sep = ""))
    } else if (type == "park") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[leisure=park]", 
            bbox, sep = ""))
    } else if (type == "amenity") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[amenity=*]", 
            bbox, sep = ""))
    } else if (type == "shop") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[shop=*]", 
            bbox, sep = ""))
    } else if (type == "boundary") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[boundary=*]", 
            bbox, sep = ""))
    } else if (type == "highway") {
        dat <- getURL(paste("http://open.mapquestapi.com/xapi/api/0.6/", "way[highway=*]", 
            bbox, sep = ""))
    }
    return(dat)
}

Extract polygons from raw data

The raw XML data are then converted to SpatialPolygonDataFrames (SPDFs), using the osmar routines find to hierarchically trawl the OSM XML data, and as_sp for the actual conversion. The function get.polys extracts the polygons for one type of structure, and returns one SPDF. This function is called recursively from get.all.polys, which specifies which types of OSM structures are extracted, dowloads and stores the raw (XML) data, and then processes those data by converting them to SPDFs. The conversion generally takes longer than the downloading, with buildings and highways naturally taking the longest. For the maps below, downloading took a few minutes, while conversion to SPDFs took a few tens of minutes.

The SPDFs produced from the conversion are stored locally for plotting with the makemap routine.

Both the extraction (get.all.polys) and the plotting (makemap) rely on the same list of OSM categories, or data types. These are therefore specified in the following, separate routine (get.osm.structs), which is called in both cases. Each category of OSM data is stored both as raw XML data, and as an SPDF. The former are called “dat” with the capitalised first letter of dat.types appended (so, for example, “datB” for building), while the SPDFs follow same format, and are called “poly”. The first few lines of get.osm.structs check if multiple dat.types have the same first letter, in which case the first two letters are appended—but no deeper comparisons are done, and thus no two dat.types can share the same first two letters.

require("osmar")
get.osm.structs <- function() {
    dat.types <- c("building", "amenity", "waterway", "grass", "natural", "park", 
        "highway", "boundary")
    letters <- sapply(dat.types, function(x) toupper(substr(x, 1, 1)))
    matches <- sapply(letters, function(x) which(letters %in% x))
    # returns list of matches for each letter
    indx <- which(sapply(matches, function(x) length(x) > 1))
    # returns which matches have more than one element. There won't be many of
    # these, so a loop is implemented:
    if (length(indx) > 0) {
        for (i in 1:length(indx)) {
            indx.i <- matches[[indx[i]]]  # Only need to extract first element
            letters[indx.i] <- toupper(substr(dat.types[indx.i], 1, 2))
        }
    }
    dat <- data.frame(cbind(dat.types, letters))
    row.names(dat) <- NULL
    names(dat) <- c("dat.types", "letters")
    return(dat)
}

Only complication in get.polys are handlers for specific data types. In particular, grass arises not just as a landuse type, but as k="roof:material" v="grass", necessitating a specific handler.

get.polys <- function(dat, type = "building") {
    datp <- xmlParse(dat)
    dato <- as_osmar(datp)
    if (type == "grass") {
        pids <- find(dato, way(tags(k == "landuse" & v == type)))
    } else if (type == "park") {
        pids <- find(dato, way(tags(v == type)))
    } else {
        pids <- find(dato, way(tags(k == type)))
    }
    pids <- find_down(dato, way(pids))
    sp <- subset(dato, ids = pids)
    if (type == "boundary" | type == "highway") {
        sp <- as_sp(sp, "lines")
    } else {
        sp <- as_sp(sp, "polygons")
    }

    return(sp)
}

Then finally, the function to download and process all data:

get.all.polys <- function() {
    osm.structs <- get.osm.structs()

    require(data.table)
    st0 <- Sys.time()
    cat("Downloading raw data ...\n", rep("-", 50), "\n", sep = "")
    fnames <- NULL
    for (i in 1:dim(osm.structs)[1]) {
        cat("Dowloading ", toString(osm.structs$dat.types[i]), " \t... ", sep = "")
        dat <- get.raw.dat(toString(osm.structs$dat.types[i]))
        st <- timetaken(st0)
        cat(" done; time = ", st, "\n", sep = "")
        fname <- paste("dat", toString(osm.structs$letters[i]), sep = "")
        fnames <- c(fnames, fname)  # used in subsequent processing stage
        assign(fname, dat)
        save(list = c(fname), file = fname)
    }

    cat("\nProcessing raw data ...\n", rep("-", 50), "\n", sep = "")
    st0 <- Sys.time()
    for (i in 1:dim(osm.structs)[1]) {
        cat("Processing ", toString(osm.structs$dat.types[i]), " \t... ", sep = "")
        dat <- get.polys(get(fnames[i]), type = toString(osm.structs$dat.types[i]))
        st <- timetaken(st0)
        cat(" done; time = ", st, "\n", sep = "")
        fname <- paste("poly", toString(osm.structs$letters[i]), sep = "")
        assign(fname, dat)
        save(list = c(fname), file = fname)
    }
}

Remove small polygons

This function removes all polygons below a given percentile from the SPDF, in case only big polygons are desired.

remove.small.polys <- function(sp = sp, limit = 0.75) {
    areas <- unlist(lapply(slot(sp, "polygons"), function(x) slot(x, "area")))
    ai <- sort(areas)[floor(length(areas) * limit)]
    indx <- which(areas >= ai)
    sp <- sp[indx, ]
    return(sp)
}

Get lat–lon limits for plotting

Extracts plotting limits from an SPDF. apsect1 returns xlims that cover the same range as ylims, otherwise the aspect is fitted to the data. Note, however, that stretching the aspect ratio makes building polygons look rather odd.

get.xylims <- function (poly, aspect1=TRUE)
{
  xrange <- range (unlist (lapply (slot (poly, "polygons"),  
                    function (x) min (slot (x, "labpt") [1]))))
  yrange <- range (unlist (lapply (slot (poly, "polygons"),  
                    function (x) min (slot (x, "labpt") [2]))))
  if (aspect1) {
    if (diff (xrange) > diff (yrange)) {
      r <- diff (xrange) / diff (yrange)
      yrange [1] <- yrange [1] - r * diff (yrange) / 2
      yrange [2] <- yrange [2] + r * diff (yrange) / 2
    } else {
      r <- diff (yrange) / diff (xrange)
      xrange [1] <- xrange [1] - r * diff (xrange) / 2
      xrange [2] <- xrange [2] + r * diff (xrange) / 2
    }
  }
  return (list (x=xrange, y=yrange))
}

Plot polygons and lines

plot.polys is the base function for setting up the map and plotting the first SPDF (probably best done with buildings, “polyBU”). Subsequent ones are then overlaid with add.polys and add.lines. Graphics are best produced as .png files, and not as x11 or other screen devices. Width and height need to be adjusted to fit device or desired output size. The values below are for a 16:10 monitor of (1680,1050) rescaled to 16:9 = (1680,945).

Note that the internal plotfun directly extracts the xy-coordinates from the slots of the spdf, which is not really an acceptable S4 method. A more acceptable alternative might be to use fortify from ggplot2, but the method used here is far more computationally efficient.

plot.polys <- function(poly = poly, xylims = xylims, filedump = FALSE, col = "gray40") {
    if (filedump) {
        png(filename = "London_map.png", width = 1680, height = 945, type = "cairo-png", 
            bg = "white")
    } else {
        x11(width = 14)
    }
    par(mar = c(0, 0, 0, 0))
    plot(NULL, NULL, xlim = xylims$x, ylim = xylims$y, xaxt = "n", yaxt = "n", 
        xlab = "", ylab = "", bty = "n")
    usr <- par("usr")
    rect(usr[1], usr[3], usr[2], usr[4], border = NA, col = "gray20")
    plotfun <- function(i) {
        xy <- slot(slot(i, "Polygons")[[1]], "coords")
        polypath(xy, border = NA, col = col)
    }
    junk <- lapply(slot(poly, "polygons"), plotfun)
}

add.polys <- function(poly = poly, col = "gray40") {
    plotfun <- function(i) {
        xy <- slot(slot(i, "Polygons")[[1]], "coords")
        polypath(xy, border = NA, col = col)
    }
    junk <- lapply(slot(poly, "polygons"), plotfun)
}

add.lines <- function(lines = lines, col = "gray60") {
    plotfun <- function(i) {
        xy <- slot(slot(i, "Lines")[[1]], "coords")
        lines(xy, col = col)
    }
    junk <- lapply(slot(lines, "lines"), plotfun)
}

Make the Map

This is the main function that pulls it all together. It needs to be run once with get.data = TRUE, which will download all data from the mapquest API and store them locally. Subsequent runs can then use these local data (get.data = FALSE).

makemap <- function (filedump=TRUE, streets=FALSE)
{
  osm.structs <- get.osm.structs ()

  for (i in 1:dim (osm.structs) [1]) {
    fname <- paste ("poly", toString (osm.structs$letters [i]), sep="")
    load (fname)
  }

  col.grass <- rgb (100, 120, 100, maxColorValue=255)
  col.water <- rgb (77, 77, 92, maxColorValue=255)
  xylims <- get.xylims (polyBU, aspect1=FALSE)

  # Streets are plotted underneath everything, so first polygons are merely default fillers, to be re-plotted later.
  plot.polys (polyN, xylims, filedump=filedump, col=col.water)
  if (!is.na (streets)) add.lines (polyH, col=streets) # highways   
  #add.lines (polyBO, col="white") # boundaries
  # Then amenities, grass and parks:
  add.polys (polyA, col=col.grass)
  add.polys (polyG, col=col.grass)
  add.polys (polyP, col=col.grass)
  # Then buildings:
  add.polys (polyBU, col="gray40")
  # Water has to be plotted last to go over the top of grass and parks   
  add.polys (polyW, col=col.water)
  add.polys (polyN, col=col.water)
  if (filedump) { junk <- dev.off (which=dev.cur()) }
}
makemap(filedump = FALSE, streets = NA)

plot of chunk theplot1

makemap(filedump = FALSE, streets = "black")

plot of chunk theplot2

makemap(filedump = FALSE, streets = "white")

plot of chunk theplot3