dygraphs to analyse time series data

The analysis of rainfall is essential for estimating the impacts of climate change on the water cycle and water balance, and for flood mitigation (Met Éireann, 2020). Met Éireann (2020) has outlined the value of historical records for analysing rainfall data, as long term rainfall time series have become crucial for understanding past and emerging changes to the precipitation regime and the hydrological cycle. This analysis focused on the use of dygraphs to examine and investigate rainfall time series data.
The code demonstrated in this analysis illustrated the use of dygraph plots, using the dygraphs package, to chart rainfall time series data in R on a monthly basis at the following weather stations:
- Dublin Airport
- Belfast
- University College Galway
- Cork Airport
Houghton & O’Cinnéide (1976: 34) outlined that Ireland is subject to ‘year-round cyclonic activity owing to its high latitude, and the south coast is also exposed to frequent southerly flow from the ocean’. As a result, Houghton & O’Cinnéide (1976: 35) outlined that a general ‘horse-shoe shaped’ precipitation pattern exists across Ireland, where the heaviest rainfall occurs near the north, west and south coasts and the lightest amounts in the midlands and east. Thus, in addition to identifying long term trends in rainfall levels within Ireland, this study expected to identify a spatial variation in rainfall intensity across the four stations.
The dygraphs package enables interaction with time series data: a chart where the X axis represents time, and the Y axis represents the evolution of one or several variables (Figure 1). This interaction enables the user to zoom to specific time periods, hover over a data point to gather additional information, and view several time series data sets simultaneously (The R Graph Gallery, 2018).
Figure 1: Dygraph depicting the evolution of temperature observations within a time series (RStudio Blog, 2014)
The working data set used in this demonstration was rainfall.RData, a historic R binary data file that runs from 1850 to 2014 by month. This data file was loaded into R:
setwd("E:/GY672")
load("data/rainfall.RData")
For the demonstration, the following packages were loaded into R:
library(dygraphs)  # a dynamic interactive graph library
library(tidyverse) # for data manipulation
library(knitr)     # creates tables from data sets
The first step in visualising the time series data within a dygraph was to sort the data into the desired format. From initial investigation of the data set in the tables below, modification was needed to group the data on a monthly basis for each station.
ArdaraData <- rain[1:5, ]      # first five rows of the Ardara station
DerryData <- rain[1981:1985, ] # first five rows of the Derry station
knitr::kable(ArdaraData, caption = "Ardara Rainfall Dataset")
knitr::kable(DerryData, caption = "Derry Rainfall Dataset")
Table: Ardara Rainfall Dataset

| Year | Month | Rainfall | Station |
|---|---|---|---|
| 1850 | Jan | 169.0 | Ardara |
| 1851 | Jan | 236.4 | Ardara |
| 1852 | Jan | 249.7 | Ardara |
| 1853 | Jan | 209.1 | Ardara |
| 1854 | Jan | 188.5 | Ardara |

Table: Derry Rainfall Dataset

| Year | Month | Rainfall | Station |
|---|---|---|---|
| 1850 | Jan | 96.9 | Derry |
| 1851 | Jan | 136.3 | Derry |
| 1852 | Jan | 143.5 | Derry |
| 1853 | Jan | 119.7 | Derry |
| 1854 | Jan | 107.7 | Derry |
The tidyverse collection of R packages includes dplyr, a grammar of data manipulation. The following dplyr commands were employed:
- filter() which picks cases based on their values
- summarise() which reduces multiple values down to a single summary
- group_by() which allows the user to perform any operation “by group”
- ungroup() which undoes group_by()
- transmute() which computes new columns and drops unreferenced variables

These commands summarised and grouped the rainfall data on a monthly basis with the following code:
rain_ts <- rain %>% group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12) # creates a time series object
rain_ts %>% window(c(1870, 1), c(1871, 12)) # picks out a specified time period for a dygraph
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct
## 1870 2666.2 1975.3 1500.5 1024.8 1862.8 789.2 1038.6 1510.5 2045.5 5177.6
## 1871 3148.3 2343.7 1731.7 2654.5 657.6 2040.1 3705.0 1869.9 2083.4 2774.3
## Nov Dec
## 1870 1733.2 1902.2
## 1871 2000.1 1902.0
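The same pipeline can be tried on a small synthetic data set before running it on the full file. The sketch below is illustrative only: `toy_rain`, its two stations and its rainfall values are hypothetical, and it assumes that Month is stored as a factor ordered Jan to Dec, so that group_by() sorts the groups chronologically rather than alphabetically.

```r
library(dplyr)

# Hypothetical toy data: two stations, two years, twelve months each.
# Month is a factor with levels Jan..Dec so groups sort chronologically.
toy_rain <- expand.grid(
  Month   = factor(month.abb, levels = month.abb),
  Year    = 1850:1851,
  Station = c("A", "B"),
  stringsAsFactors = FALSE
)
toy_rain$Rainfall <- ifelse(toy_rain$Station == "A", 10, 5)

# Total rainfall per month across both stations, converted to a monthly ts
toy_ts <- toy_rain %>%
  group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall)) %>%
  ungroup() %>%
  transmute(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12)

# Pick out January to March 1851
first_quarter <- window(toy_ts, start = c(1851, 1), end = c(1851, 3))
```

Because every month contains one value of 10 and one of 5, each monthly total in `toy_ts` is 15, which makes it easy to check that the grouping behaved as intended.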
With the modified time series data set, visualisation of the monthly time series within a dygraph was possible:
rain_ts %>% dygraph()
However, to add a more sophisticated interaction within the dygraph, the dyRangeSelector() command provided a straightforward interface for panning and zooming to specific time periods:
rain_ts %>% dygraph(width=800,height=300) %>% dyRangeSelector()
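The range selector can also open on a chosen period rather than the full record. The sketch below assumes the `dateWindow` argument of dyRangeSelector() and uses `ldeaths`, a monthly series from R's built-in datasets package, as a stand-in for the rainfall data; the dates are arbitrary.

```r
library(dygraphs)

# ldeaths is a built-in monthly series covering 1974 to 1979
zoomed <- dygraph(ldeaths, width = 800, height = 300) %>%
  dyRangeSelector(dateWindow = c("1975-01-01", "1977-01-01"))

# Print `zoomed` (or leave it as the last expression of an R Markdown
# chunk) to render the chart with the selector preset to 1975-1977
```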
To compare the four stations simultaneously, the data for each station were first extracted into a separate monthly time series, and the four series were then combined:
# group Dublin Airport station rainfall data together
Dublin <- rain %>% filter(Station == "Dublin Airport") %>% group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12)
# group Belfast station rainfall data together
Belfast <- rain %>% filter(Station == "Belfast") %>% group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12)
# group University College Galway station rainfall data together
Galway <- rain %>% filter(Station == "University College Galway") %>% group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12)
# group Cork Airport station rainfall data together
Cork <- rain %>% filter(Station == "Cork Airport") %>% group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12)
# combine the four series into one multivariate time series
bdcg_ts <- cbind(Dublin, Belfast, Galway, Cork)
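As a minimal illustration of what cbind() returns for time series objects, the sketch below combines two hypothetical monthly series (`east` and `west`, with made-up constant values). The result is a multivariate time series with one column per input, named after the objects passed in.

```r
# Two hypothetical monthly series standing in for individual stations
east <- ts(rep(60, 24), start = c(1850, 1), frequency = 12)
west <- ts(rep(100, 24), start = c(1850, 1), frequency = 12)

# cbind() on ts objects aligns them on their shared time index
both <- cbind(east, west)

class(both)     # includes "mts" and "ts"
colnames(both)  # column names are taken from the object names
dim(both)       # 24 months by 2 series
```

These column names are what dygraph() shows as series labels in the legend, which is why the station objects above carry short, readable names.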
After the four series had been combined, the resulting multivariate time series was visualised as an interactive dygraph. A stackedGraph option (for better visibility) and a dyHighlight function (to highlight the relevant time series) can be added to a chart of this kind. The code shown here suppressed the drawing of the grid for the X axis, set a custom fillColor for the dyRangeSelector, added an unzoom button with dyUnzoom(), and added a vertical crosshair with dyCrosshair():
bdcg_ts %>%
dygraph(width=800,height=450, main="Rainfall Time Series") %>%
dyRangeSelector(strokeColor = "", fillColor = "black")%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(width = 400) %>%
dyUnzoom() %>% #adds an unzoom button to refresh the dygraph
dyCrosshair(direction = "vertical") #for specific analysis within the dygraph
To analyse each time series separately but side by side, the charts were linked through the group argument of dygraph(), which synchronises the X axis zoom level of all charts in the same group. Here, a dyRangeSelector() was attached to the last chart (Galway), so that it controlled the whole group:
#use group to link multiple embedded dygraphs
Dublin %>% dygraph(width=800,height=130,group="linked_ts",main="Dublin") %>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
Belfast %>% dygraph(width=800,height=170,group="linked_ts",main="Belfast") %>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
Cork %>% dygraph(width=800,height=170,group="linked_ts",main="Cork")%>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
Galway %>% dygraph(width=800,height=170,group="linked_ts",main="Galway")%>%
dyRangeSelector(strokeColor = "", fillColor = "black")%>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
The result was four individual interactive and linked charts which enabled the analysis of rainfall patterns in the four locations by zooming in (double click to zoom back out) and the acquisition of raw data for each month by hovering.
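Since the four linked charts share almost all of their options, the repeated chain could be wrapped in a small function. The sketch below is one possible arrangement, not the code used in this analysis: `station_dygraph()` is a hypothetical helper (it is not part of dygraphs), and `demo` is a synthetic series standing in for one station.

```r
library(dygraphs)

# Hypothetical helper: applies the options shared by every station chart
station_dygraph <- function(series, title, group, height = 170) {
  dygraph(series, main = title, group = group, width = 800, height = height) %>%
    dyOptions(stackedGraph = TRUE) %>%
    dyAxis("y", label = "Rainfall") %>%
    dyAxis("x", drawGrid = FALSE, label = "Date") %>%
    dyLegend(show = "always", hideOnMouseOut = FALSE) %>%
    dyUnzoom() %>%
    dyCrosshair(direction = "vertical")
}

# Synthetic stand-in for one station's monthly rainfall series
set.seed(1)
demo <- ts(runif(48, min = 20, max = 150), start = c(1850, 1), frequency = 12)

chart <- station_dygraph(demo, "Demo station", group = "linked_ts")
```

Chart-specific additions, such as the dyRangeSelector() on the last chart, can still be piped onto the returned object.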
To smooth the display of a series and investigate long term trends and patterns, a rollPeriod was applied using the dyRoller() function. Each plotted point then represents the average of the number of time stamps specified in the roll period. In this demonstration, a rollPeriod of 300 months was specified, giving a rolling mean over 25 years. The roll period can also be modified in the box at the bottom-left of the graph, so that users are aware of the averaging within the time series and can adjust it to the desired level of smoothing.
Dublin %>% dygraph(width=800,height=130,group="linked_ts_mean",main="Dublin") %>%
dyRoller(rollPeriod = 300)%>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
Belfast %>% dygraph(width=800,height=170,group="linked_ts_mean",main="Belfast") %>%
dyRoller(rollPeriod = 300)%>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
Cork %>% dygraph(width=800,height=170,group="linked_ts_mean",main="Cork")%>%
dyRoller(rollPeriod = 300)%>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
Galway %>% dygraph(width=800,height=170,group="linked_ts_mean",main="Galway")%>%
dyRoller(rollPeriod = 300)%>%
dyRangeSelector(strokeColor = "", fillColor = "black")%>%
dyOptions(stackedGraph = TRUE)%>%
dyAxis("y", label = "Rainfall ")%>%
dyAxis("x", drawGrid = FALSE, label = "Date")%>%
dyLegend(show = "always", hideOnMouseOut = FALSE)%>%
dyUnzoom() %>%
dyCrosshair(direction = "vertical")
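The idea behind the roll period can be sketched outside the widget with a trailing moving average in base R. This is an approximation for illustration only: the series `x` is hypothetical, a 12-month window is used instead of 300 to keep the example short, and the way the widget treats the first few points of a series may differ from stats::filter(), which returns NA there.

```r
# Hypothetical monthly series: the values 1..24 starting in January 1850
x <- ts(1:24, start = c(1850, 1), frequency = 12)

k <- 12  # window length in months (the charts above use 300)

# Trailing moving average: each value is the mean of the current month and
# the k - 1 months before it; the first k - 1 values are NA
roll <- stats::filter(x, rep(1 / k, k), sides = 1)

roll[12]  # mean of months 1 to 12
roll[24]  # mean of months 13 to 24
```

The explicit `stats::` prefix matters when dplyr is loaded, because dplyr masks filter() with its own row-filtering function.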
Patterns and trends identified were as follows:
- With the dyRangeSelector(), comparison between individual decades was enabled, which illustrated a seasonal pattern within each year, with January and February experiencing the highest rainfall values, a dip in values during the summer months, and a rise in September and October.
- The mean rainfall at each station was calculated with the summarise() command, and can be seen in the table below:

mean_rain <- rain %>%
  # %in% tests membership; == would recycle the vector and drop matching rows
  filter(Station %in% c('Dublin Airport', 'Belfast', 'University College Galway', 'Cork Airport')) %>%
  group_by(Station) %>%
  summarise(MeanRainfall = mean(Rainfall))
knitr::kable(mean_rain, caption = "Mean Rainfall")
Table: Mean Rainfall

| Station | MeanRainfall |
|---|---|
| Belfast | 87.51362 |
| Cork Airport | 100.53232 |
| Dublin Airport | 61.95354 |
| University College Galway | 102.27344 |
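When several station names are selected at once, the choice of comparison operator matters. The short base R sketch below, using a hypothetical vector of station labels, shows that `==` compares element by element and recycles the shorter vector, whereas `%in%` tests each element for membership in the set.

```r
# Hypothetical station column and the stations we want to keep
stations <- c("Belfast", "Cork Airport", "Ardara",
              "Belfast", "Cork Airport", "Ardara")
wanted <- c("Cork Airport", "Belfast")

# == recycles `wanted` along `stations`, so a row matches only if it
# happens to line up with the same name at the same position
by_equality <- stations == wanted

# %in% keeps every row whose station appears anywhere in `wanted`
by_membership <- stations %in% wanted

sum(by_equality)    # 2 rows kept
sum(by_membership)  # 4 rows kept
```

With `==`, matching rows can be silently dropped, so any summary computed afterwards is based on only part of each station's record.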
Through the use of interactive dygraphs, the expanded long term rainfall time series illustrated positive (winter) and negative (summer) seasonal trends, along with the general rise in rainfall values over the period 1850-2014 that Met Éireann (2020) outlined. The use of dygraphs to display these trends confirmed the value of historical records, as shorter time periods are not always representative of longer records (Met Éireann, 2020).
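One way to put numbers on a seasonal pattern read off an interactive chart is to average a monthly series by calendar month. The sketch below uses base R only and a hypothetical series with a built-in winter maximum and summer minimum; it does not use the rainfall data from this study.

```r
# Hypothetical monthly values (Jan..Dec), repeated for three years
pattern <- c(120, 110, 90, 70, 60, 55, 50, 65, 90, 110, 115, 125)
x <- ts(rep(pattern, 3), start = c(1850, 1), frequency = 12)

# cycle() gives the position of each observation within the year (1..12),
# so tapply() returns one mean per calendar month
monthly_mean <- tapply(x, cycle(x), mean)
names(monthly_mean) <- month.abb

monthly_mean
```

Applied to a real station series, the same two lines give a monthly climatology that can be set beside the patterns seen in the dygraph.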
Separately analysing historical rainfall levels at different locations with the group argument illustrated the variation in precipitation characteristics across Ireland in response to prevailing airflow patterns, which divide the country into two basic regions: 1) the west, which has heavy precipitation and a strong winter maximum associated with westerly airflow, and 2) a drier regime in the midlands and east, where seasonal contrasts are less pronounced (Houghton & O’Cinnéide, 1976: 33).
This study demonstrated that dygraphs can not only display dense time series but also let the user explore and analyse the data. Users are able to interpret separate portions of a data set (e.g. specific months) as well as the time series in its entirety. Essentially, the more data used for analysis, the higher the functionality of the dygraph (quintagroup, 2021).
dplyr: part of the tidyverse (2020) Overview [online]. Available at: https://dplyr.tidyverse.org/ (accessed 8 January 2021)
GitHub (2021) dygraphs for R [online]. Available at: https://rstudio.github.io/dygraphs/index.html (accessed 8 January 2021)
Houghton, J.G. & O’Cinnéide, M.S. (1976) Airflow Types and Rainfall in Ireland. Association of Pacific Coast Geographers, 38, 33-48.
Met Éireann (2020) Rainfall Climate of Ireland [online]. Available at: https://www.met.ie/climate/what-we-measure/rainfall#:~:text=Most%20of%20the%20eastern%20half,rainfall%20exceeds%202000mm%20per%20year (accessed 8 January 2021)
quintagroup (2021) Dygraphs for Data Visualisation [online]. Available at: https://quintagroup.com/cms/js/dygraphs (accessed 8 January 2021)
RStudio Blog (2014) htmlwidgets: JavaScript data visualization for R [online]. Available at: https://rstudioblog.wordpress.com/2014/12/18/htmlwidgets-javascript-data-visualization-for-r/ (accessed 9 January 2021)
The R Graph Gallery (2018) An introduction to interactive time series with R and dygraphs [online]. Available at: https://www.r-graph-gallery.com/316-possible-inputs-for-the-dygraphs-library.html (accessed 8 January 2021)
Tidyverse.org (2020) Tidyverse packages [online]. Available at: https://www.tidyverse.org/packages/ (accessed 7 January 2021)