Creating Interactive Visualisation Tools With R

Overview


  • Themes
    • Interactive Data Analysis
    • Exploratory Spatial Data Analysis
    • Time Series
    • Maps with Backdrops

More dplyr


rain %>%  group_by(Year,Month) %>% 
  summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>% 
  ts(start=c(1850,1),freq=12) -> rain_ts
rain_ts %>% window(c(1870,1),c(1871,12))
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1870 2666.2 1975.3 1500.5 1024.8 1862.8  789.2 1038.6 1510.5 2045.5 5177.6
## 1871 3148.3 2343.7 1731.7 2654.5  657.6 2040.1 3705.0 1869.9 2083.4 2774.3
##         Nov    Dec
## 1870 1733.2 1902.2
## 1871 2000.1 1902.0
  • New R commands
    • ungroup - undoes group_by: needed for next command
    • transmute - like mutate but drops unreferenced variables: here used to select out a single column as a variable
    • without the ungroup, transmute would also keep any grouping variable that is still active after summarise (here Year)
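
As a minimal sketch of what ungroup and transmute do, the toy data frame below is summarised in the same way as the rain data. The values are made up, and the names toy, monthly, with_groups and no_groups are illustrative only; the dplyr package is assumed to be installed.

```r
library(dplyr)

# Toy stand-in for the rain data: two readings per month (made-up values)
toy <- data.frame(Year     = rep(c(1850, 1851), each = 4),
                  Month    = rep(c(1, 1, 2, 2), times = 2),
                  Rainfall = 1:8)

# Monthly totals; summarise removes the last grouping level (Month),
# so the result is still grouped by Year
monthly <- toy %>% group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall))

with_groups <- monthly %>% transmute(Rainfall)               # still grouped
no_groups   <- monthly %>% ungroup %>% transmute(Rainfall)   # grouping removed

names(with_groups)  # the remaining grouping variable is kept alongside Rainfall
names(no_groups)    # Rainfall only, ready to be passed to ts
```

Comparing the two names() results shows why the ungroup step matters: ts needs a single column of values, not a table that still carries grouping columns.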

A dynamic time series graph


library(dygraphs) # A dynamic graph library
rain_ts %>% dygraph  # Try moving the pointer along the curve

… with a window


rain_ts %>% window(c(1850,1),c(1889,12)) %>% dygraph(width=800,height=300) 

  • dygraph works in a pipeline
  • If dygraph is the last step in the pipeline, the result is the dynamic graphic itself
  • Note also width and height options

An interactive window


rain_ts %>% dygraph(width=800,height=300) %>% dyRangeSelector

  • You can also pipeline dygraphs into functions that add extra controls, like dyRangeSelector.

Also add interactive rolling mean


rain_ts %>% dygraph(width=800,height=300) %>% dyRangeSelector %>% dyRoller(rollPeriod = 600)

  • Another control is dyRoller, which adds a ‘rolling’ mean. The number of months in the window can be edited interactively.
  • Initially use 600 months (50 years)
  • Early values fluctuate more, as less than a full window of data is available.
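
The effect in the last bullet can be sketched in base R. The helper trailing_mean below is hypothetical (it is not part of dygraphs), and it assumes the roller averages each point with the points before it, using a shorter window near the start of the series.

```r
# Hypothetical helper: mean of the current value and up to k - 1 values before it
trailing_mean <- function(x, k) {
  sapply(seq_along(x), function(i) mean(x[max(1, i - k + 1):i]))
}

x <- c(2, 4, 6, 8, 10)
trailing_mean(x, 3)
# The first two results average only one and two values respectively;
# from the third value onwards a full window of three is available
```

With a window of 600 months, the first 599 points of the rainfall series are averaged over fewer values than the rest, which is why the left-hand end of the smoothed curve is noisier.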

Multiple dygraphs


  • You may need to look at several time series simultaneously
    • and link zooming and windowing operations
  • This can be done with the group option in dygraph
  • As an example, create two time series, one for Dublin and one for Belfast
rain %>%  group_by(Year,Month) %>% filter(Station=="Dublin Airport") %>%
    summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
    ts(start=c(1850,1),freq=12) ->  dub_ts
rain %>%  group_by(Year,Month) %>% filter(Station=="Belfast") %>%
    summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
    ts(start=c(1850,1),freq=12) ->  bel_ts
beldub_ts <- cbind(bel_ts,dub_ts)
window(beldub_ts,c(1850,1),c(1850,5))
##          bel_ts dub_ts
## Jan 1850  115.7   75.8
## Feb 1850  120.5   47.8
## Mar 1850   56.8   18.5
## Apr 1850  142.6   97.5
## May 1850   57.9   58.6
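
A small base R sketch of how cbind behaves on time series, using two made-up series (a_ts and b_ts are illustrative names, not part of the lecture data). It shows that the columns are named after the objects and that the series are lined up by date rather than by position.

```r
# Two toy monthly series; the second starts one month later than the first
a_ts <- ts(c(5, 7, 9, 11), start = c(1850, 1), frequency = 12)
b_ts <- ts(c(1, 2, 3),     start = c(1850, 2), frequency = 12)

ab_ts <- cbind(a_ts, b_ts)

colnames(ab_ts)                          # column names come from the object names
window(ab_ts, c(1850, 1), c(1850, 2))    # b_ts has no value for Jan 1850, so it is NA
```

Because the alignment is by date, stations whose records start at different times can still be combined, with gaps shown as missing values.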

The multiple dygraph


beldub_ts %>% dygraph(width=800,height=360) %>% dyRangeSelector

A three-way comparison - with rolling mean


📖 There was no R code for the last example


  • That's because it is a self-test exercise!
  • Try to recreate this yourself
  • Hints
    1. Filter out the Cork data (Station is 'Cork Airport')
    2. Make the three-way time series with cbind
    3. Create a dygraph then add range selector and roller controls using %>%
  • Answer next week

👀 Sneak preview of an alternative view


To get this to work, wait until lecture 4

dub_ts %>% dygraph(width=800,height=130,group="dub_belf",main="Dublin") 

bel_ts %>% dygraph(width=800,height=170,group="dub_belf",main="Belfast") %>% dyRangeSelector

🌍 Interactive Maps

The leaflet package


library(leaflet) 
leaflet(data=stations,height=430,width=390) %>% addTiles %>%
  setView(-8,53.5,6) %>% addCircleMarkers # leaflet works out that the 'Lat' and 'Long' columns hold the locations
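
If the coordinate columns had names that leaflet could not guess, they could be given explicitly with formulas. The sketch below assumes the leaflet package is installed and uses a small stand-in data frame, toy_stations, built from four of the station rows printed later in these notes; it is not the full stations data.

```r
library(leaflet)

# Stand-in for the stations data frame (four rows only)
toy_stations <- data.frame(
  Station = c("Ardara", "Armagh", "Athboy", "Belfast"),
  Lat     = c(54.79, 54.35, 53.60, 54.50),
  Long    = c(-8.29, -6.64, -6.93, -5.99),
  stringsAsFactors = FALSE
)

# ~Long and ~Lat are formulas evaluated against the data passed to leaflet()
m <- leaflet(data = toy_stations, height = 430, width = 390) %>% addTiles() %>%
  setView(-8, 53.5, 6) %>%
  addCircleMarkers(lng = ~Long, lat = ~Lat)

if (interactive()) print(m)  # display the map when working at the console
```

The same formula notation is used later for fillColor and popup, so it is worth getting used to here.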

Change provider tiles - 1


leaflet(data=stations,height=430,width=390) %>% addProviderTiles('CartoDB.Positron')  %>%
  setView(-8,53.5,6) %>% addCircleMarkers

Change provider tiles - 2


leaflet(data=stations,height=430,width=390) %>% addProviderTiles('Stamen.Watercolor')  %>%
  setView(-8,53.5,6) %>% addCircleMarkers

Change provider tiles - 3


leaflet(data=stations,height=430,width=390) %>% addProviderTiles('CartoDB.DarkMatter')  %>%
  setView(-8,53.5,6) %>% addCircleMarkers(fillColor='wheat')

Change provider tiles - 4


leaflet(data=stations,height=430,width=390) %>% addProviderTiles('Stamen.Toner')  %>%
  setView(-8,53.5,6) %>% addCircleMarkers(fillColor='firebrick',stroke=FALSE,fillOpacity = 0.8)

Other tile providers


Using shapefiles


load("maps.RData")
leaflet(data=counties.spdf,height=430,width=600) %>% 
  addTiles %>% addPolygons(fillOpacity=0.4,weight=1,color='black') # Note shapefiles must be lat/long projection

Using shapefiles - choropleth mapping


color_fun <- colorNumeric(c('wheat','firebrick'),counties.spdf$POPDENSITY)
leaflet(data=counties.spdf,height=430,width=600) %>% 
    addProviderTiles('CartoDB.Positron')  %>% addPolygons(fillOpacity=0.4,weight=1,fillColor=~color_fun(POPDENSITY))

Shading by quantiles


color_fun <- colorQuantile('Greens',counties.spdf$POPDENSITY)
leaflet(data=counties.spdf,height=430,width=600) %>% 
    addProviderTiles('CartoDB.Positron')  %>% addPolygons(fillOpacity=0.4,weight=1,fillColor=~color_fun(POPDENSITY))
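
Both colorNumeric and colorQuantile return a function that turns numbers into colours. The sketch below compares them on a made-up vector, dens, with one extreme value; the names dens, num_pal and qnt_pal are illustrative, and the leaflet package is assumed to be installed.

```r
library(leaflet)

# Made-up densities: five modest values and one very large one
dens <- c(12, 25, 40, 80, 150, 1200)

num_pal <- colorNumeric(c('wheat', 'firebrick'), dens)   # colour follows the value itself
qnt_pal <- colorQuantile('Greens', dens, n = 3)          # colour follows the rank (three equal-sized groups)

is.function(num_pal)   # the palette is itself a function
num_pal(dens)          # hex colours; the extreme value stretches the scale
qnt_pal(dens)          # hex colours; two values fall in each of the three groups
```

With skewed data such as population density, the numeric palette spends most of its range on the few extreme values, while the quantile palette gives each shade to roughly the same number of areas.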

Pre-set Palettes


Pop-Ups


leaflet(data=counties.spdf,height=430,width=600) %>% addProviderTiles('CartoDB.Positron')  %>% 
  addPolygons(fillOpacity=0.4,weight=1,fillColor=~color_fun(POPDENSITY),popup=~COUNTYNAME)

Legend


leaflet(data=counties.spdf,height=430,width=600) %>% addProviderTiles('CartoDB.Positron')  %>% 
  addPolygons(fillOpacity=0.4,weight=1,fillColor=~color_fun(POPDENSITY)) %>% 
  addLegend(pal=color_fun,values=~POPDENSITY,title="Percentile of Density",position='bottomleft')

Putting it all together


  • Using some ideas from last week
  • Average Rainfall By Station
  • Link to location
  • Colour circles according to rainfall
  • Add them to a map with a legend

1 - Computing the averages


  • left_join links the stations data frame to rain_summary to get the geographical information

load('rainfall.RData')
rain %>% group_by(Station) %>%  summarise(mrain=mean(Rainfall))  -> rain_summary
rain_summary %>% left_join(stations) -> station_means
## Joining by: "Station"
head(station_means, n=4)
## Source: local data frame [4 x 10]
## 
##   Station     mrain Elevation  Easting Northing   Lat  Long  County
##     (chr)     (dbl)     (int)    (dbl)    (dbl) (dbl) (dbl)   (chr)
## 1  Ardara 140.36753        15 180787.7 394679.0 54.79 -8.29 Donegal
## 2  Armagh  68.32096        62 287831.3 345772.0 54.35 -6.64  Armagh
## 3  Athboy  74.74356        87 270400.0 261700.0 53.60 -6.93   Meath
## 4 Belfast  87.10995       115 329623.4 363141.3 54.50 -5.99  Antrim
## Variables not shown: Abbreviation (chr), Source (chr)
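
A minimal sketch of what left_join does, using two toy data frames. The station "Nowhere" is hypothetical and is included only to show what happens when there is no match; the rainfall values are rounded or made up. Naming the key with by = "Station" avoids relying on the automatic "Joining by" guess.

```r
library(dplyr)

# Toy summary table; "Nowhere" is a hypothetical station with no location record
toy_summary <- data.frame(Station = c("Ardara", "Armagh", "Nowhere"),
                          mrain   = c(140.4, 68.3, 50.0),
                          stringsAsFactors = FALSE)

# Toy location table; Belfast has no row in the summary table
toy_stations <- data.frame(Station = c("Ardara", "Armagh", "Belfast"),
                           Lat     = c(54.79, 54.35, 54.50),
                           Long    = c(-8.29, -6.64, -5.99),
                           stringsAsFactors = FALSE)

toy_means <- toy_summary %>% left_join(toy_stations, by = "Station")
toy_means
# Every row of toy_summary is kept; "Nowhere" gets NA for Lat and Long,
# and Belfast is dropped because it does not appear in toy_summary
```

For the map, a station with missing coordinates could not be placed, so it is worth checking the joined table for NA values before plotting.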

2 - Creating the colour mapping


color_fun <- colorNumeric('Blues',station_means$mrain)
previewColors(color_fun,fivenum(station_means$mrain))

3 - Creating the map


leaflet(data=station_means,height=430,width=600) %>% addProviderTiles('CartoDB.Positron')  %>%
  setView(-8,53.5,6) %>% addCircleMarkers(fillColor=~color_fun(mrain),weight=0,fillOpacity = 0.85) %>%
  addLegend(pal=color_fun,values=~mrain,title="Rainfall",position='bottomleft')

Another self-test:


Try re-creating the previous map, but using median rainfall instead of mean. Note that the median function can be used inside summarise in the same way as mean.

Conclusion

💡 New ideas


  • New R ideas
    • dygraph
    • leaflet
    • functions that return functions
    • Palettes from RColorBrewer
  • Next lecture - More Interactive methods