Executive Summary

The following code will help you build your own maps in R using base plotting, Lattice plot methods for spatial data, the ggplot2 system, the GoogleVis Chart API and interactive javascript visualizations. Enjoy and have fun with it!

Prerequisites

You may want to download the following files:

  1. First, you need a shapefile of the country you want to build a map for. The shapefile format is a popular geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri. For our examples, we will use two shapefiles for Norway: The first shapefile of Norway is available from the website of the Norwegian Kartverket. You can download the file here. We will use the shapefile called NO_Arealdekke_pol.shp. Don’t forget to read the terms of use.

  2. The second shapefile is available from the website of the Global Administrative Areas at this location. Download the shapefile format for Norway. You can also download shapefiles for other countries there.

  3. You may also need a file containing the X-Y coordinate of spatial locations you want to plot. In our case, we will use a file containing the Norwegian postal code areas. This file is available here. I have also used the coordinates of cities and other areas from this website.

Building the Map of Norway in R

First we load the necessary R packages, download the needed files and set up the directory.

# Installing all the libraries to build maps in R
library(plyr)         # To manipulate data
library(ggplot2)      # To have ggplot2 graphic interface
library(lattice)      # To have Lattice graphic interface
library(rgdal)        # To load "shapefiles" into R and use in conversions of spatial formats 
library(rgeos)        # To use in geometrical operations
library(spatstat)     # To use with Voronoi Tesselation
library(sp)           # Methods for retrieving coordinates from spatial objects.
library(maptools)     # A package for building maps
library(maps)         # A package for building maps
library(RColorBrewer) # To build RColorBrewer color palettes
library(grDevices)    # To build grDevices color palettes
library(reshape2)     # To have more flexibility when transforming data
library(rCharts)      # To create and customize interactive javascript visualizations in R
library(knitr)        # For dynamic report generation in R
library(base64enc)    # Tools for base64 encoding
suppressPackageStartupMessages(library(googleVis)) # To use with Google visualization in R

# Setting your working directory
setwd("/Users/isa/Documents/Courses/Johns Hopkins/RKart_Norge")

Building Maps using R’s Base Plotting System

There are several ways to build maps in R. First let’s build a simple map of Norway showing a few Norwegian cities. We will use R’s base plotting system.

# First, let's load the first shapefile of Norway into R
fylke <- readOGR(dsn="/Users/isa/Documents/Courses/Johns Hopkins/RKart_Norge/33_N5000_shape" ,
                 layer="NO_Arealdekke_pol")
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/isa/Documents/Courses/Johns Hopkins/RKart_Norge/33_N5000_shape", layer: "NO_Arealdekke_pol"
## with 1324 features
## It has 5 fields
fylke=fylke[fylke$OBJTYPE!="Havflate",]
norway=gUnaryUnion(fylke)

# Secondly, we create a simple dataframe with a few Norwegian cities coordinates
name <- c("Postnummer", "LAT", "LON"); Oslo <- c("0252", 59.9084400, 10.7230100); 
Larvik <- c("3252",59.0686800,10.0473000); Bergen <- c("5052",60.3713600,5.3537800);  
Haugesund<- c("5532",59.4094000,5.2995100); Trondheim <- c("7052",63.4240100,10.4409600);
Nordkapp <- c("9751",70.9821400,25.9707600); Tromso <- c("9252",69.6478700,18.9566000)
cities <- as.data.frame(rbind(Oslo, Larvik, Bergen, Haugesund, Trondheim, Tromso, Nordkapp),col.names=name, stringsAsFactors=FALSE) 
cities$V1 <- as.character(cities$V1); cities$V2 <- as.numeric(cities$V2); cities$V3 <- as.numeric(cities$V3)
colnames(cities) <- name;

# We check classes and formats
proj=CRS(proj4string(norway)); proj; class(cities) 
## CRS arguments:
##  +proj=utm +zone=33 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0
## [1] "data.frame"
# We convert the dataframe "cities" to a spatial object
coordinates(cities)<-~LON+LAT
class(cities)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
proj4string(cities)=CRS("+init=epsg:4326")
# The spTransform function provides transformation between datum(s) and projections
cities.proj=spTransform(cities, proj)
# We check that norway and cities.proj are identical before merging them
identical(proj4string(cities.proj),proj4string(norway))
## [1] TRUE
# We are now ready to plot the first map with R'base plotting system.
plot(norway, border="darkgrey")
# We plot the cities coordinates
points(cities.proj, col="red", pch=16)
# We want to add the city names
text(cities.proj, labels=row.names(cities), cex=0.6, font=2, offset=0.5, adj=c(0,2))
# We need a legend
legend("bottomright", legend=c("©Kartverket"),col=black, cex=0.5, bty="n")

Performing a Voronoi Tesselation with Base Plotting in R

Now, let’s build a map showing all postal code areas with a Voronoi Tesselation:

  1. We plot the map of Norway as previously shown with the first shapefile of Norway.

  2. We plot all the 4463 postal codes coordinates just as we plotted a few cities earlier.

  3. We perform a Voronoi Tesselation. See the following wiki link for more details on Voronoi Tesselation. We use the “spatstat” package in R to do this.

# We load the postal codes into R.
postal <- read.delim("postnummer.csv", colClasses="character")
postal <- transform(postal, LON=as.numeric(LON), LAT=as.numeric(LAT))
# We remove all postal codes not in use
postal2=postal[!grepl("ikkje i bruk", postal$POSTSTAD),]
# Some locations have the same postal codes. We transform the data to avoid plotting problems. 
post=ddply(postal2, ~LON+LAT, summarise, POSTNR=paste(POSTNR, collapse=", "))
# We convert to the right spatial format
proj=CRS(proj4string(norway)); coordinates(post)=~LON+LAT
proj4string(post)=CRS("+init=epsg:4326");  post.proj=spTransform(post, proj)
# We create a map of Norway and plot the postal codes
plot(norway, border="darkgrey")
points(post.proj, col="red", pch=".")

We are now ready to perform the Voronoi Tesselation.

# We perform a Voronoi Tesselation using the "spatstat" package in R.
bboks.norway=bbox(norway)
bboks.owin=owin(bboks.norway[1,], bboks.norway[2,])
post.ppp=as.ppp(coordinates(post.proj), W=bboks.owin)
diagram=dirichlet(post.ppp)
diagram.poly=as(diagram, "SpatialPolygons")
proj4string(diagram.poly)=proj
 # We plot the results with base plotting
plot(diagram.poly)

# We remove what would be plotted outside the borders of Norway
diagram.poly.norway=gIntersection(diagram.poly, norway, byid=TRUE)
plot(diagram.poly.norway)
# We plot Norway with its flag colors. Each postal code areas get its own colors
pal <- colorRampPalette(c("blue", "white", "red"))
plot(diagram.poly.norway, col=pal(4463), border="black")
legend("bottomright", legend=c("©Kartverket"),col=black, cex=0.5, bty="n")

Building Maps in R using the ‘sp’ Package (Lattice Plot)

Now let’s say we want to visualize a variable called churn showing the churn rate for each of the 19 Norwegian regions (fylke). We want the results to be shown in a map. We use the R package called ‘sp’ to do this.

# We download and load the second shapefile for Norway from this location:
# http://gadm.org/download
norway2 <- readOGR(dsn="/Users/isa/Documents/Courses/Johns Hopkins/RKart_Norge/NOR_adm" ,
                   layer="NOR_adm2")
# We take a look at the data
norway2_data <- norway2@data
str(norway2_data); head(norway2_data)

# We create a dataframe for the churn rate per regions
d <- c( "Akershus" , "Aust-Agder", "Buskerud", "Finnmark",  "Hedmark" ,  "Hordaland",  "Møre og Romsdal" , "Nord-Trøndelag" ,   "Nordland" ,  "Oppland", "Oslo", "Ãstfold", "Rogaland",  "Sogn og Fjordane",  "Sør-Trøndelag",  "Telemark",  "Troms", "Vest-Agder" , "Vestfold" )
e <- c(1.0, 1.1, 1.5, 1.55, 2.9, 3.12, 3.1, 4.2, 4.3, 4.8, 5.1, 5.3, 5.5, 5.56, 7.9, 8.3, 11, 5.6, 6.1)
name3 <- c("NAME_1", "Churn"); dt2 <- as.data.frame(cbind(d, e), stringsAsFactors=FALSE) 
dt2$e <- as.numeric(dt2$e); colnames(dt2) <- name3; churn <- dt2
# We plot the Norwegian regions using the unionSpatialPolygons function from the 'maptools' package
IDs <- norway2_data$ID_1
# We merge Polygons
norway3_new <- unionSpatialPolygons(norway2, IDs)
# We build the new SpatialPolygonsDataFrame with the churn rate
norway4_new <- SpatialPolygonsDataFrame(norway3_new, churn) 

# Then you can use spplot to visualize the Norwegian regions with their respective churn rate
# We define the color palette
pal2 <- colorRampPalette(c("white", "red"))
# Remove the plot frame
trellis.par.set(axis.line=list(col=NA))
# Plot the regions with Lattice
spplot(norway4_new, "Churn", main="Churn Rate per Norwegian Region (Fylke)", 
       lwd=.4, col="white", col.regions=pal2(19), colorkey = FALSE, bty="n")

Building Maps in R with the ‘ggplot2’ Package

Now, let’s try to visualize the churn rate dataset using the ggplot2 package. We will reuse the churn dataframe from the last example.

# We create the data foundation from the shapefile
norway2@data$id = rownames(norway2@data)
norway2.points = fortify(norway2, region="id")
norway2.df = join(norway2.points, norway2@data, by="id")
# We merge with churn regional data
merged <- merge(norway2.df, churn, by = "NAME_1")
merged <- merged[order(merged$order), ]
# We create the color palette and breaks
pal3 <- colorRampPalette(c("white", "blue"))
breaks <- c(quantile(merged$Churn))
# Plot the map with ggplot2
q <- ggplot(data = merged, aes(x = long, y = lat, group = group))
q <- q + geom_polygon(color = "lightgrey", aes(fill = factor(Churn)))
q <- q + ggtitle("Churn Rate per Norwegian Region (Fylke)")
q <- q + scale_fill_manual( values = pal3(19), breaks= breaks ) 
q <- q + labs(fill="Churn Rate")
q <- q + theme(panel.background = element_rect(fill = "white", colour = "grey"),
       panel.grid.major = element_line(colour = "grey")) 
q

Building Interactive Maps in R using the ‘GoogleVis’ Package

Now let’s say we want to visualize a variable called profit and compare results across Nordic countries. We want the results to be shown in an interactive map. We use the R package called ‘Googlevis’ to do this. The basic idea behind the GoogleVis package is:

  1. The R function creates an HTML page

  2. The HTML page calls the Google Charts API

  3. The results are displayed with an interactive HTML graphic

Please note that, you can choose to change the map settings to a European map using region=“150” in the GvisGeoChart command line. The default setting shows a world map. See GoogleVis Examples on how to get started with the GoogleVis package.

op <- options(gvis.plot.tag='chart')
# We create a dataset called revenue showing profit per Nordic countries.
name2 <- c("Country", "Profit"); a <- c("Norway", "Sweden", "Denmark", "Finland", "Iceland")
b <- c(10, 3, 5, 7, 5)
dt <- as.data.frame(cbind(a, b), stringsAsFactors=FALSE) 
dt$b <- as.numeric(dt$b); colnames(dt) <- name2; revenue <- dt
# We plot the results
nordic <- gvisGeoChart(revenue, locationvar="Country",
                   colorvar="Profit",options=list(width=600, height=400,region="154"))
plot(nordic)
# The results can be seen in your web browser

Building Interactive Maps in R using the ‘rChart’ Package

Finally, we want to visualize interactively a street view of Oslo, the capital of Norway, with the rChart package in R and plot some popups. rChart is an incredibly easy way to create interactive javascript visualizations using R. See rChart for further details on how to get started with rChart.

# We create a map of the center of Oslo with two pops-up
map3 <- Leaflet$new() 
map3$setView(c(59.914, 10.72), zoom = 13) # Longitude and Latitude of Oslo center - 59.9140100;10.7246670
map3$marker(c(59.9140100,10.7246670), bindPopup = "<p> Hi. I am a popup </p>")
map3$marker(c(59.9085200,10.7645700), bindPopup = "<p> Hello Oslo! </p>")
map3$save('Oslo.html', 'inline', cdn=TRUE)
cat('<iframe src="Oslo.html" width=100%, height=600></iframe>')
# To visualiza the results in r just type the following command:
# map3
# To publish on RPubs
map3$publish('Oslo.html', host = 'rpubs')
Claim your page on RPubs at: 
http://rpubs.com/publish/claim/79032/07e113a61b354321ad9866c2b545e63d

Acknowledgements

Thanks a lot to the Norwegian Kartverket for sharing shapefiles with data scientists.

A big big thanks to Karl Ove Hufthammer! His wonderful blog was a great source of inspiration and a main reference when writing this blog. The code for the Voronoi Tesselation is all his, I just copied it.

A big thanks to Erik Bolstad for making the csv files with postal code coordinates available in Norway.

Thanks to the Global Administrative Areas for giving access to shapefiles of the world.

References