Visualising Rainfall Patterns in Irish Cities using Dygraphs
Interactive visualisations with dygraphs in R
When it comes to creating interactive visualisations in R there are several options available, but if you are working with time series data the dygraphs package offers some advantages.
dygraphs, a dynamic graphics library, offers features like zooming, panning and rolling averages that let you explore your data interactively. You can zoom into specific time ranges, pan across your data and hover to see precise values.
Additionally, dygraphs lets you overlay multiple time series and add annotations to highlight key events. This functionality is useful for examining trends and patterns, and it enhances user engagement.
To see some interactive examples of what you can achieve with dygraphs, check out these interactive charts and learn more about time series visualisations in R.
Installing the dygraphs package
If the dygraphs package is not already installed, install it and then load the library. Also install and load the dplyr and RColorBrewer packages as needed:
# Install packages if necessary
if (!require("dygraphs")) install.packages("dygraphs", dependencies = TRUE)
if (!require("dplyr")) install.packages("dplyr", dependencies = TRUE)
if (!require("RColorBrewer")) install.packages("RColorBrewer", dependencies = TRUE)
# Load libraries
library(dygraphs)
library(dplyr)
library(RColorBrewer)
Irish Rainfall Data
Next, import your data from your working directory. In this example, rainfall data is stored in a dataframe named rain in a .RData file. It contains rainfall data for 25 stations around Ireland. It spans 164 years (1850 - 2014) and was made available by Prof. Conor Murphy and Dr. Simon Noone from ICARUS. Using this data we can create a dygraph of total monthly rainfall.
Import the data using load() and view the first and last rows of the dataframe using head() and tail():
load("rainfall.RData")
head(rain)
tail(rain)
List of all stations
Get a list of all unique station names in the Station column of the rain dataframe and print it:
# Get a list of all unique station names
rain %>%
distinct(Station) %>%
pull(Station) -> station_names
# Print station names
print(station_names)
[1] "Ardara" "Derry"
[3] "Malin Head" "Armagh"
[5] "Belfast" "Strokestown"
[7] "Markree Castle" "Drumsna"
[9] "Birr" "Athboy"
[11] "University College Galway" "Cappoquinn"
[13] "Mullingar" "Phoenix Park"
[15] "Dublin Airport" "Shannon Airport"
[17] "Portlaw" "Foulksmills"
[19] "Enniscorthy" "Rathdrum"
[21] "Valentia" "Cork Airport"
[23] "Killarney" "Roches Point"
[25] "Waterford"
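Because filter() matches station names exactly (including case and spacing), it can be worth confirming that the names you plan to extract really appear in this list. The following is a minimal base R sketch: the names are copied by hand from the output above, and `wanted` and `missing_stations` are hypothetical variable names used only for illustration.

```r
# A few of the station names printed above (copied by hand for illustration).
station_names <- c("Belfast", "Cork Airport", "Dublin Airport",
                   "University College Galway", "Valentia")

# Hypothetical vector of the stations we intend to extract.
wanted <- c("Belfast", "Cork Airport", "Dublin Airport",
            "University College Galway")

# setdiff() returns any requested names that are absent from the data.
missing_stations <- setdiff(wanted, station_names)
stopifnot(length(missing_stations) == 0)

# A near miss such as a lower-case "airport" is reported as missing.
setdiff("Cork airport", station_names)
```

With the real data, `station_names` would come from the rain dataframe rather than being typed in.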
Next, we need to convert the data and extract the stations of interest.
Working with time series data
dygraphs requires the data to be in a time series format. This can be a dataframe with properly configured date and time columns, a time series object (ts) or an extensible time series object (xts). Since we are only interested in 4 of the 25 stations in the dataset, those need to be extracted. Typically, extraction and conversion would require 3 steps:
- Extract a specific station using filter() from the dplyr package:
  filter(rain, Station == "Belfast") -> filtered_data
- Use summarise() to calculate the rainfall by Year and Month:
  summarise(filtered_data, Rainfall = sum(Rainfall), .by = c(Year, Month)) -> summarised_data
- Create a time series object (ts) by extracting the rainfall and specifying the start Year, Month and the frequency:
  ts(summarised_data$Rainfall, start = c(1850, 1), frequency = 12) -> belfast
Because each step feeds into the succeeding one, we can use the pipe (%>%) to connect the steps and pass the results along without creating separate variables:
rain %>% filter(Station=="Belfast") %>%
summarise(Rainfall=sum(Rainfall),.by=c(Year,Month)) %>% pull(Rainfall) %>%
ts(start=c(1850, 1), frequency = 12) -> belfast
Create time series objects for the other stations:
# Cork airport
rain %>% filter(Station=="Cork Airport") %>%
summarise(Rainfall=sum(Rainfall),.by=c(Year,Month)) %>% pull(Rainfall) %>%
ts(start=c(1850, 1), frequency = 12) -> cork
# Dublin airport
rain %>% filter(Station=="Dublin Airport") %>%
summarise(Rainfall=sum(Rainfall),.by=c(Year,Month)) %>% pull(Rainfall) %>%
ts(start=c(1850, 1), frequency = 12) -> dublin
# University College Galway
rain %>% filter(Station=="University College Galway") %>%
summarise(Rainfall=sum(Rainfall),.by=c(Year,Month)) %>% pull(Rainfall) %>%
ts(start=c(1850, 1), frequency = 12) -> galway
Merge the time series objects into one and view the first few rows:
all_stations <- cbind(belfast, cork, dublin, galway)
head(all_stations)
     belfast  cork dublin galway
[1,]   115.7 155.3   75.8  108.9
[2,]   156.4 359.5  112.0  163.8
[3,]   157.2 216.2   80.3  174.9
[4,]   107.2 191.3   74.7  152.8
[5,]   116.2 157.0  101.1  133.1
[6,]    16.1   9.9   11.3   33.9
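The four near-identical pipelines above could also be wrapped in a small helper. The sketch below is a base R alternative, not the dplyr code used in this post: `station_ts()`, `rain_demo` and `all_demo` are hypothetical names, the rainfall values are synthetic, and it assumes Month is stored as a number from 1 to 12, which may not match the real dataframe.

```r
# Synthetic stand-in for the rain dataframe (values are made up).
set.seed(1)
rain_demo <- expand.grid(
  Month = 1:12,
  Year = 1850:1852,
  Station = c("Belfast", "Cork Airport"),
  stringsAsFactors = FALSE
)
rain_demo$Rainfall <- round(runif(nrow(rain_demo), 10, 200), 1)

# Hypothetical helper: keep one station, total rainfall by Year and Month,
# put the rows in chronological order, then convert to a monthly ts.
station_ts <- function(data, station, start = c(1850, 1)) {
  one <- data[data$Station == station, ]
  monthly <- aggregate(Rainfall ~ Month + Year, data = one, FUN = sum)
  monthly <- monthly[order(monthly$Year, monthly$Month), ]
  ts(monthly$Rainfall, start = start, frequency = 12)
}

# Names of this vector become the column names of the merged series.
stations <- c(belfast = "Belfast", cork = "Cork Airport")
series <- lapply(stations, station_ts, data = rain_demo)
all_demo <- do.call(cbind, series)

# time() recovers the dates that head() does not show.
head(time(all_demo))
```

Sorting explicitly before calling ts() matters because ts() simply assigns dates in row order starting from `start`; it does not read the Year and Month columns.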
Creating a Dygraph
To create the dygraph, pipe the merged time series to the dygraph() function.
all_stations %>% dygraph()
Labelling
A title can be added using the main argument and the axes can be labelled using the dyAxis() function. Use dyLegend() to configure the width of the legend. The width and height of the dygraph can also be specified:
all_stations %>% dygraph(main = 'Total Monthly Rainfall', width=900, height=320) %>%
dyAxis("x", label = "Year") %>%
dyAxis("y", label = "Rainfall (mm)") %>%
dyLegend(width = 500)
Changing Colours
Colours can be configured using the dySeries() function. For each series you can set a label and a colour:
all_stations %>% dygraph() %>%
dyAxis("x", label = "Year") %>%
dyAxis("y", label = "Rainfall (mm)") %>%
dySeries("belfast", label = "Belfast", color = "blue") %>%
dySeries("cork", label = "Cork", color = "red") %>%
dySeries("dublin", label = "Dublin", color = "green") %>%
dySeries("galway", label = "Galway", color = "brown")
If you are not sure about what colours to use, the RColorBrewer package can help. ColorBrewer is designed to help users choose effective colour schemes. Using brewer.pal(), select a colour palette with four colours, then use dySeries() to apply the colours as before:
clrs <- brewer.pal(4, "Set1")
#print(clrs)
all_stations %>% dygraph() %>%
dyAxis("x", label = "Year") %>%
dyAxis("y", label = "Rainfall (mm)") %>%
dySeries("belfast", label = "Belfast", color = clrs[1]) %>%
dySeries("cork", label = "Cork", color = clrs[2]) %>%
dySeries("dublin", label = "Dublin", color = clrs[3]) %>%
dySeries("galway", label = "Galway", color = clrs[4])
Adding a Dynamic Range Selector
A dynamic range selector allows users to select and zoom into specific portions of the graph. To add a dynamic range selector use dyRangeSelector():
all_stations %>% dygraph() %>%
dyRangeSelector()
Adding an Interactive Rolling Mean
You can also add a rolling mean, which smooths out short-term fluctuations and highlights long-term trends. Adding an interactive rolling mean is done using dyRoller(). In this example a 50-year rolling mean is added:
all_stations %>% dygraph(width = 800, height = 300) %>%
dyRangeSelector() %>%
dyRoller(rollPeriod = 600)
Note that rollPeriod is given in months here, since the data are monthly (600 months = 50 years). Users can also enter their own value in the roller box on the chart; the starting value set above is 50 years.
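To make the arithmetic behind a rolling mean concrete, here is a minimal base R sketch on a synthetic monthly series. It assumes a trailing window (each point averages itself and the preceding observations); how dygraphs treats the first few points of a series may differ, and `x`, `years` and `roll_period` are hypothetical names with made-up values.

```r
# Synthetic monthly series (made-up values) standing in for one station.
set.seed(42)
x <- ts(round(runif(240, 10, 200), 1), start = c(1850, 1), frequency = 12)

# Convert a period in years to a number of monthly observations.
years <- 10
roll_period <- years * frequency(x)   # 10 * 12 = 120

# Trailing rolling mean: each value averages the current observation and
# the previous roll_period - 1 observations. The first roll_period - 1
# results are NA because the window is incomplete.
weights <- rep(1 / roll_period, roll_period)
rolled <- stats::filter(x, weights, sides = 1)

# The first complete window ends at observation 120 (Dec 1859).
all.equal(rolled[roll_period], mean(x[1:roll_period]))
```

Multiplying by frequency(x) keeps the conversion correct if the same code is later applied to a series with a different sampling interval.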
Grouping Dygraphs
An alternative way of looking at multiple stations is to display each dygraph individually and synchronise them using a range selector. To do this, pipe each time series object to a new dygraph, configure the width and height, then use the group parameter to assign the dygraphs to a single group. The range selector is configured on the last dygraph:
belfast %>% dygraph(width=800,height=150,group="ts_grp",main="Belfast")
cork %>% dygraph(width=800,height=150,group="ts_grp",main="Cork")
dublin %>% dygraph(width=800,height=150,group="ts_grp",main="Dublin")
galway %>% dygraph(width=800,height=190,group="ts_grp",main="Galway") %>% dyRangeSelector()
An interactive rolling mean can also be added to the group:
belfast %>% dygraph(width=800,height=150,group="ts_grp",main="Belfast") %>% dyRoller(rollPeriod = 600)
cork %>% dygraph(width=800,height=150,group="ts_grp",main="Cork") %>% dyRoller(rollPeriod = 600)
dublin %>% dygraph(width=800,height=150,group="ts_grp",main="Dublin") %>% dyRoller(rollPeriod = 600)
galway %>% dygraph(width=800,height=190,group="ts_grp",main="Galway") %>% dyRangeSelector() %>% dyRoller(rollPeriod = 600)
Subsetting time series data
Sometimes you may need to focus on a specific portion of the time series. A simple option is the window() function, which extracts a portion of a time series. To use it, specify the start year and month and the end year and month:
belfast %>% window(c(1900, 1), c(1920, 12)) %>% dygraph(width=800,height=120,group="ts_grps",main="Belfast")
cork %>% window(c(1900, 1), c(1920, 12)) %>% dygraph(width=800,height=120,group="ts_grps",main="Cork")
dublin %>% window(c(1900, 1), c(1920, 12)) %>% dygraph(width=800,height=120,group="ts_grps",main="Dublin")
galway %>% window(c(1900, 1), c(1920, 12)) %>% dygraph(width=800,height=160,group="ts_grps",main="Galway") %>% dyRangeSelector()
Rainfall Patterns in Belfast, Cork, Dublin and Galway for 1900 - 1950
Examining total monthly rainfall for 1900-1950 in a multi-series dygraph shows that the long-term patterns in the four cities generally follow the same trend. Before the early 1920s, the highest monthly rainfall typically occurred in Cork or Galway. After the early 1920s there is increased variability, with Belfast, Cork or Galway commonly recording the highest monthly rainfall in a given year.
clrs <- brewer.pal(4, "Set1")
all_stations %>% window(c(1900, 1), c(1950, 12)) %>% dygraph(main = "Total Monthly Rainfall (1900-1950)", width = 900, height = 500) %>%
dyAxis("x", label = "Year") %>%
dyAxis("y", label = "Rainfall (mm)") %>%
dySeries("belfast", label = "Belfast", color = clrs[1]) %>%
dySeries("cork", label = "Cork", color = clrs[2]) %>%
dySeries("dublin", label = "Dublin", color = clrs[3]) %>%
dySeries("galway", label = "Galway", color = clrs[4]) %>%
dyLegend(width = 500) %>%
dyRangeSelector()
Looking at the 10-year rolling mean for the same period, Cork and Galway regularly had the highest average rainfall until the mid-1920s. After that point, Belfast and Galway generally had the highest average rainfall in a given month/year. Up to 1920 all cities had fairly consistent average rainfall, but decadal variability then increased, with notable fluctuations in average rainfall for all cities between 1920 and 1950. Along with the increased variability, monthly rainfall totals for all cities increased across the period.
clrs <- brewer.pal(4, "Set1")
all_stations %>% window(c(1900, 1), c(1950, 12)) %>% dygraph(main = "10-yr Rolling Mean Rainfall (1900-1950)", width = 900, height = 500) %>%
dyAxis("x", label = "Year") %>%
dyAxis("y", label = "Rainfall (mm)") %>%
dySeries("belfast", label = "Belfast", color = clrs[1]) %>%
dySeries("cork", label = "Cork", color = clrs[2]) %>%
dySeries("dublin", label = "Dublin", color = clrs[3]) %>%
dySeries("galway", label = "Galway", color = clrs[4]) %>%
dyRangeSelector() %>%
dyLegend(width = 500) %>%
dyRoller(rollPeriod = 120) # 120 months = 10 years, matching the chart title
Conclusion
dygraphs is an intuitive way of visualising time series data. Its interactive options make it a good choice for exploring and analysing temporal patterns, and it is capable of handling large datasets efficiently. For more information about dygraphs in R, be sure to check out the comprehensive documentation available at rdrr.io.