Introduction

A time series is defined as “a sequence of observations taken sequentially over time” (Box et al., 2015). Analysing a time series helps to evaluate and understand the pattern of change over time. dygraphs is a charting package for R that provides useful interactive tools, such as zooming, for plotting time series data.

In this exercise, dygraphs depicting the time series of monthly rainfall data for the period 1850-2014 will be created: one for the total rainfall across all stations, one for each of four stations (Dublin Airport, Cork Airport, University College Galway, and Belfast), and one combined dygraph of the four stations. The rainfall data has been downloaded from Moodle. A description of the code used is included, as well as an overall analysis of the rainfall pattern.

Creating dygraphs of time series of monthly rainfall

First set the working directory, which is where the files are saved. Then load the necessary files.

setwd('C:/Users/Ledi/Documents/R')
getwd()
## [1] "C:/Users/Ledi/Documents/R"
load("C:/Users/Ledi/Documents/R/maps.RData")
load("C:/Users/Ledi/Documents/R/rainfall.RData")

Check the data. The functions head() and tail() display the first and last six rows of a data frame, respectively. The function dim() shows the number of rows followed by the number of columns.

head(stations)
##       Station Elevation Easting Northing   Lat  Long    County
## 1      Athboy        87  270400   261700 53.60 -6.93     Meath
## 2 Foulksmills        71  284100   118400 52.30 -6.77   Wexford
## 3   Mullingar       112  241780   247765 53.47 -7.37 Westmeath
## 4     Portlaw         8  246600   115200 52.28 -7.31 Waterford
## 5    Rathdrum       131  319700   186000 52.91 -6.22   Wicklow
## 6 Strokestown        49  194500   279100 53.75 -8.10 Roscommon
##   Abbreviation      Source
## 1           AB Met Eireann
## 2            F Met Eireann
## 3            M Met Eireann
## 4            P Met Eireann
## 5           RD Met Eireann
## 6            S Met Eireann
dim(stations)
## [1] 25  9
head(rain)
##   Year Month Rainfall Station
## 1 1850   Jan    169.0  Ardara
## 2 1851   Jan    236.4  Ardara
## 3 1852   Jan    249.7  Ardara
## 4 1853   Jan    209.1  Ardara
## 5 1854   Jan    188.5  Ardara
## 6 1855   Jan     32.3  Ardara
tail(rain)
##       Year Month  Rainfall   Station
## 49495 2009   Dec  99.90000 Waterford
## 49496 2010   Dec  70.20000 Waterford
## 49497 2011   Dec  80.67308 Waterford
## 49498 2012   Dec 113.84615 Waterford
## 49499 2013   Dec 136.15385 Waterford
## 49500 2014   Dec  28.75000 Waterford

There are 25 rows and 9 columns in the stations data frame. Each row describes one station, which means there are 25 stations.
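The function dim() returns the number of rows first and the number of columns second. A quick check on a small made-up data frame can help avoid reading the two numbers the wrong way round. The object toy and its values are illustrative assumptions, not part of the rainfall data.

```r
# Toy data frame: 3 rows (one per hypothetical station), 2 columns
toy <- data.frame(Station   = c("A", "B", "C"),
                  Elevation = c(10, 20, 30))

dim(toy)    # rows first, then columns
nrow(toy)   # number of rows only
ncol(toy)   # number of columns only
```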

Then load the packages which will be used to create the dygraphs. The package dplyr provides a set of verbs which are used to manipulate data. It is often faster than the equivalent base R operations on data frames and makes code easier to read.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(dygraphs)

Create a time series of the total monthly rainfall across all stations for the period 1850-2014, and display the first two years to check it.

rain %>%  group_by(Year,Month) %>% 
  summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>% 
  ts(start=c(1850,1),freq=12) -> rain_ts
rain_ts %>% window(c(1850,1),c(1851,12))
##         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct
## 1850 2836.3 2158.9  964.1 3457.2 1492.1 1362.4 2584.8 1906.4 1763.1 1567.8
## 1851 4875.5 1379.9 1997.9 1368.1 1124.2 2537.1 2258.3 2416.2 1234.5 2834.2
##         Nov    Dec
## 1850 2828.5 2600.1
## 1851 1134.3 1719.2

The %>% operator is known as the ‘pipe’. It takes the object on its left (in this case the rain data frame) and passes it forward as the first argument of the function on its right. It reduces the use of nested parentheses, keeping the code less cluttered and easier to read.
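As a minimal sketch of what the pipe does, the two forms below should give the same result. The vector x is made up for illustration, and dplyr is loaded only because it makes %>% available.

```r
library(dplyr)  # dplyr makes the %>% operator available

x <- c(1, 4, 9)

nested <- sum(sqrt(x))            # nested calls, read from the inside out
piped  <- x %>% sqrt() %>% sum()  # piped calls, read from left to right

nested == piped
```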

In words, the above code reads: take the rain data, group it by year and month, sum the rainfall within each group, ungroup (which undoes group_by; without this step transmute would also keep the remaining grouping variable), keep only the Rainfall column with transmute (which drops every variable not named in the call), and convert the result into a monthly time series starting in January 1850. The result is a time series object named rain_ts, which is used in the subsequent code. The window() call then displays the first two years, 1850 and 1851.
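The same chain can be tried on a small made-up data set to see what each step produces. This is only a sketch: toy_rain, its two stations "A" and "B", and the constant value of 1 mm are assumptions for illustration, not the real rainfall data.

```r
library(dplyr)

# Made-up data: two hypothetical stations, 24 months starting January 1850
toy_rain <- expand.grid(
  Year    = 1850:1851,
  Month   = factor(month.abb, levels = month.abb),
  Station = c("A", "B")
)
toy_rain$Rainfall <- 1  # every station records 1 mm each month (illustrative)

toy_ts <- toy_rain %>%
  group_by(Year, Month) %>%              # one group per calendar month
  summarise(Rainfall = sum(Rainfall)) %>% # total over the stations
  ungroup() %>%                          # drop the remaining grouping by Year
  transmute(Rainfall) %>%                # keep only the Rainfall column
  ts(start = c(1850, 1), frequency = 12) # monthly series from January 1850

class(toy_ts)   # a time series object rather than a plain data frame
start(toy_ts)   # first observation: year 1850, month 1
window(toy_ts, c(1851, 1), c(1851, 12))  # the second year only
```

Because Month is a factor with levels in calendar order, the grouped rows come out in chronological order, which is what ts() needs.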

Then create an interactive dygraph of the total rainfall for the period 1850-2014. The function dyRangeSelector() adds a range selector at the bottom of the graph, which allows you to zoom in on any time frame within the period 1850-2014 to evaluate rainfall patterns and amounts.

rain_ts %>% dygraph(width=800,height=300,main="Irish Rainfall 1850-2014") %>% dyRangeSelector
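The range selector can also be opened on a chosen period through the dateWindow argument of dyRangeSelector(). The sketch below uses a made-up series, toy_ts, and an arbitrary window; both are assumptions for illustration only.

```r
library(dygraphs)

# Made-up monthly series covering 1850-1851 (values are illustrative)
toy_ts <- ts(rep(c(50, 80), 12), start = c(1850, 1), frequency = 12)

g <- dygraph(toy_ts, main = "Toy series")

# Start the view zoomed in on June 1850 to June 1851
g <- dyRangeSelector(g, dateWindow = c("1850-06-01", "1851-06-01"))

g  # printing the object displays the interactive graph
```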

Use the same code to create a dygraph of the time series for each station individually, adding the filter() function to select the station of interest before building the time series for that station. This is done below for each of the four stations.

rain %>%  group_by(Year,Month) %>% filter(Station=="Dublin Airport") %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),freq=12) ->  DUB_ts
DUB_ts %>% dygraph(width=800,height=360,main="Dublin Airport") %>% dyRangeSelector
rain %>%  group_by(Year,Month) %>% filter(Station=="Belfast") %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),freq=12) ->  BELF_ts
BELF_ts %>% dygraph(width=800,height=360,main="Belfast") %>% dyRangeSelector
rain %>%  group_by(Year,Month) %>% filter(Station=="University College Galway") %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),freq=12) ->  NUIG_ts
NUIG_ts %>% dygraph(width=800,height=360,main="University College Galway") %>% dyRangeSelector
rain %>%  group_by(Year,Month) %>% filter(Station=="Cork Airport") %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),freq=12) ->  ORK_ts
ORK_ts %>% dygraph(width=800,height=360,main="Cork Airport") %>% dyRangeSelector

The time series for each station are called: DUB_ts for Dublin Airport, BELF_ts for Belfast, NUIG_ts for University College Galway, and ORK_ts for Cork Airport.
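The four blocks above differ only in the station name, so one way to avoid the repetition would be a small helper function. The sketch below is an assumption rather than part of the original exercise: station_ts() is a hypothetical name, not a dplyr or dygraphs function, and toy_rain is a made-up stand-in for the rain data frame.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr or dygraphs): build a monthly
# time series for one station from a data frame shaped like `rain`
station_ts <- function(data, station_name, start_year = 1850) {
  data %>%
    filter(Station == station_name) %>%
    group_by(Year, Month) %>%
    summarise(Rainfall = sum(Rainfall)) %>%
    ungroup() %>%
    transmute(Rainfall) %>%
    ts(start = c(start_year, 1), frequency = 12)
}

# Made-up stand-in for `rain`: two stations, the 12 months of 1850
toy_rain <- expand.grid(
  Year    = 1850,
  Month   = factor(month.abb, levels = month.abb),
  Station = c("A", "B"),
  stringsAsFactors = FALSE
)
toy_rain$Rainfall <- ifelse(toy_rain$Station == "A", 10, 20)

a_ts <- station_ts(toy_rain, "A")
b_ts <- station_ts(toy_rain, "B")
```

With the real data, a call such as station_ts(rain, "Dublin Airport") would be expected to give the same series as DUB_ts, assuming rain has the columns shown earlier.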

It is also possible to look at all four stations simultaneously. The cbind() function combines the four time series column by column into a single multivariate series. Then create the dygraph depicting rainfall at all four stations simultaneously for the period 1850-2014.

FourStations_ts <- cbind(BELF_ts,DUB_ts,NUIG_ts,ORK_ts)
window(FourStations_ts,c(1850,1),c(1850,12))
##          BELF_ts DUB_ts NUIG_ts ORK_ts
## Jan 1850   115.7   75.8   108.9  155.3
## Feb 1850   120.5   47.8   131.5   92.6
## Mar 1850    56.8   18.5    56.6   56.0
## Apr 1850   142.6   97.5   120.5  207.2
## May 1850    57.9   58.6    69.8   35.3
## Jun 1850    62.0   43.6    74.7   11.4
## Jul 1850    96.3   66.0    89.1  179.0
## Aug 1850   110.4   41.2   136.8   46.5
## Sep 1850    65.8   54.2    85.2   40.7
## Oct 1850    87.6   40.4    90.7   53.8
## Nov 1850   104.4   60.0   131.3  153.2
## Dec 1850    57.6   81.1    90.6  169.4
FourStations_ts %>% dygraph(width=800,height=360,main="Four Stations") %>% dyRangeSelector
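To see what cbind() does with time series, the sketch below combines two made-up monthly series. The objects a_ts and b_ts and their values are illustrative assumptions.

```r
# Two made-up monthly series starting January 1850 (values are illustrative)
a_ts <- ts(1:12, start = c(1850, 1), frequency = 12)
b_ts <- ts(12:1, start = c(1850, 1), frequency = 12)

both_ts <- cbind(a_ts, b_ts)

class(both_ts)      # includes "mts": a multivariate time series
colnames(both_ts)   # column names are taken from the object names
window(both_ts, c(1850, 1), c(1850, 3))  # first three months, both series
```

The column names are what dygraph() shows as series labels, which is why the combined graph is labelled BELF_ts, DUB_ts, NUIG_ts and ORK_ts.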

Discussion

Dublin Airport station is located in the east of Ireland, Cork Airport is in the south of the country, Belfast station is in the north and University College Galway is located in the west, so together the four stations give broad geographic coverage of rainfall across Ireland.

Looking at the stations individually, the lowest rainfall at Belfast station was recorded for September 1894 and the highest for December 1978. At University College Galway the highest amount of rainfall for the period 1850-2014 was recorded for November 2009. This is significant because intense, widespread flooding was reported for the county of Galway in November 2009, believed to have been caused by exceptional rainfall. The lowest amount of rainfall at University College Galway was recorded for April 1938, which is also the driest month recorded at any of the four stations for the period 1850-2014. At Cork Airport, the highest was recorded for December 1899, which is the highest monthly total at any of the four stations for this period, and the lowest was in June 1921.

Generally, the rainfall patterns are consistent across the four stations but differ in amount. There are several peaks in rainfall, with the most significant peaks predominantly occurring in the winter months (December, January, February) and occasionally in October and November. The wettest month for the period 1850-2014 was recorded at Cork Airport for December 1899 (460.5 mm). The lowest rainfall occurs throughout spring and summer, particularly from April to June. A few outliers are present, occurring mainly in February and September. The driest month was recorded at University College Galway for April 1938 (0.4 mm).
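Extremes like these can be read off the dygraphs, but they can also be located in code. The sketch below shows one possible approach on a made-up series. toy_ts and the helper label_at() are hypothetical and are not the data or method used for the figures quoted above.

```r
# Made-up monthly series for 1850-1851 (values are illustrative only)
toy_ts <- ts(c(50, 40, 30,  5, 20, 25, 35, 45, 55, 60, 70, 120,
               65, 45, 35, 15, 10, 30, 40, 50, 60, 75, 80,  90),
             start = c(1850, 1), frequency = 12)

# Hypothetical helper: label the observation at position i as "Mon YYYY"
label_at <- function(x, i) {
  # cycle() gives the month number; time() gives the year as a decimal
  paste(month.abb[cycle(x)[i]], floor(time(x)[i] + 1e-8))
}

wettest <- which.max(toy_ts)  # position of the largest value
driest  <- which.min(toy_ts)  # position of the smallest value

label_at(toy_ts, wettest)  # month and year of the maximum
toy_ts[wettest]            # the maximum value itself
label_at(toy_ts, driest)   # month and year of the minimum
toy_ts[driest]             # the minimum value itself
```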

Bibliography

Box, G.E.P., Jenkins, G.M., Reinsel, G.C. & Ljung, G.M. (2015) Time Series Analysis: Forecasting and Control, 5th edition. John Wiley & Sons, New Jersey.

CRAN Index, https://cran.r-project.org/web/packages/available_packages_by_name.html last accessed on 17th January 2019.