A time series is defined as “a sequence of observations taken sequentially over time” (Box et al., 2015). Analysing a time series helps to evaluate and understand the pattern of change over time. dygraphs is an R package, built on the dygraphs JavaScript charting library, which provides interactive tools, such as zooming, for plotting time series data.
In this exercise, five dygraphs depicting monthly rainfall for the period 1850-2014 will be created: individual time series for four stations (Dublin Airport, Cork Airport, University College Galway, and Belfast), and one combined dygraph of the four stations. The rainfall data were downloaded from Moodle. A description of the code used is included, as well as an overall analysis of rainfall patterns.
First set the working directory, which is where the files are saved. Then load the necessary files.
setwd('C:/Users/Ledi/Documents/R')
getwd()
## [1] "C:/Users/Ledi/Documents/R"
load("C:/Users/Ledi/Documents/R/maps.RData")
load("C:/Users/Ledi/Documents/R/rainfall.RData")
Check the data. The functions head() and tail() display the first and last six rows of a data frame, respectively. The function dim() returns the number of rows followed by the number of columns.
head(stations)
## Station Elevation Easting Northing Lat Long County
## 1 Athboy 87 270400 261700 53.60 -6.93 Meath
## 2 Foulksmills 71 284100 118400 52.30 -6.77 Wexford
## 3 Mullingar 112 241780 247765 53.47 -7.37 Westmeath
## 4 Portlaw 8 246600 115200 52.28 -7.31 Waterford
## 5 Rathdrum 131 319700 186000 52.91 -6.22 Wicklow
## 6 Strokestown 49 194500 279100 53.75 -8.10 Roscommon
## Abbreviation Source
## 1 AB Met Eireann
## 2 F Met Eireann
## 3 M Met Eireann
## 4 P Met Eireann
## 5 RD Met Eireann
## 6 S Met Eireann
dim(stations)
## [1] 25 9
head(rain)
## Year Month Rainfall Station
## 1 1850 Jan 169.0 Ardara
## 2 1851 Jan 236.4 Ardara
## 3 1852 Jan 249.7 Ardara
## 4 1853 Jan 209.1 Ardara
## 5 1854 Jan 188.5 Ardara
## 6 1855 Jan 32.3 Ardara
tail(rain)
## Year Month Rainfall Station
## 49495 2009 Dec 99.90000 Waterford
## 49496 2010 Dec 70.20000 Waterford
## 49497 2011 Dec 80.67308 Waterford
## 49498 2012 Dec 113.84615 Waterford
## 49499 2013 Dec 136.15385 Waterford
## 49500 2014 Dec 28.75000 Waterford
The stations data frame has 25 rows and 9 columns. Each row describes one station, which means there are 25 stations.
Then load the packages which will be used to create the dygraphs. The package dplyr provides a set of verbs for manipulating data frames. It is often faster than the equivalent base R code and makes code easier to read.
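As a quick check of how dim() orders its result, the sketch below builds a small hypothetical data frame (the names and values are invented, not taken from stations) and compares dim() with nrow() and ncol().

```r
# Hypothetical toy data frame; values are invented for illustration only
toy <- data.frame(
  Station   = c("A", "B", "C", "D"),
  Elevation = c(10, 20, 30, 40),
  County    = c("W", "X", "Y", "Z")
)

d <- dim(toy)          # rows first, then columns
n_rows <- nrow(toy)    # same as d[1]
n_cols <- ncol(toy)    # same as d[2]

print(d)
```

Because the toy data frame has a different number of rows and columns, the order of the two values returned by dim() is unambiguous.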
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(dygraphs)
Create a time series of the total monthly rainfall, summed across all stations, for the period 1850-2014, and display the first two years as a check.
rain %>% group_by(Year,Month) %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> rain_ts
rain_ts %>% window(c(1850,1),c(1851,12))
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct
## 1850 2836.3 2158.9 964.1 3457.2 1492.1 1362.4 2584.8 1906.4 1763.1 1567.8
## 1851 4875.5 1379.9 1997.9 1368.1 1124.2 2537.1 2258.3 2416.2 1234.5 2834.2
## Nov Dec
## 1850 2828.5 2600.1
## 1851 1134.3 1719.2
The %>% operator is known as the ‘pipe’. It takes the result of the expression on its left (in this case the rain data frame) and passes it as the first argument to the function on its right. This reduces the nesting of parentheses, keeping the code less cluttered and easier to read.
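The difference between piped and nested calls can be seen in a minimal sketch. The vector x is made up for illustration; both expressions compute the same value, but the piped version reads left to right.

```r
suppressPackageStartupMessages(library(dplyr))  # dplyr re-exports %>% from magrittr

x <- c(4, 9, 16)  # made-up values

# Nested form: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped form: read from left to right
piped <- x %>% sqrt() %>% mean() %>% round(1)

print(c(nested = nested, piped = piped))
```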
The above code translates as follows: take the rain data, group it by year and month, sum the rainfall within each group, ungroup (which undoes the group_by, so that transmute does not carry a grouping column along), keep only the Rainfall column with transmute (which retains the variables it names and drops all others), and convert the result into a monthly time series starting in January 1850. The result is a time series (ts) object, named rain_ts, which is used in subsequent code.
Then create a dygraph with an interactive window of the total rainfall for the period 1850-2014. The function dyRangeSelector() adds a range selector at the bottom of the graph, which allows you to zoom in on any time frame within the period 1850-2014 to evaluate rainfall patterns and amounts.
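One point worth checking is that ts() simply takes values in the order it receives them, so the summarised rows must be in calendar order. The sketch below is an illustration under stated assumptions, not the original method: it uses a hypothetical toy data frame (toy_rain, with invented stations and values), stores Month as text, and converts it to a factor with levels month.abb so that grouping sorts Jan to Dec rather than alphabetically. Whether Month in the real rain data is already such a factor is not shown in the source. The .groups argument assumes dplyr 1.0.0 or later.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical toy data: two stations, two years, twelve months (invented values)
toy_rain <- expand.grid(
  Year    = 1850:1851,
  Month   = month.abb,
  Station = c("A", "B"),
  stringsAsFactors = FALSE
)
# Each station records 10 * month number + (Year - 1850)
toy_rain$Rainfall <- match(toy_rain$Month, month.abb) * 10 + (toy_rain$Year - 1850)

toy_ts <- toy_rain %>%
  mutate(Month = factor(Month, levels = month.abb)) %>%  # calendar order, not A-Z
  group_by(Year, Month) %>%
  summarise(Rainfall = sum(Rainfall), .groups = "drop") %>%
  arrange(Year, Month) %>%                                # make the ordering explicit
  pull(Rainfall) %>%
  ts(start = c(1850, 1), frequency = 12)

# First three months of 1850, summed over both stations
print(window(toy_ts, c(1850, 1), c(1850, 3)))
```

Without the factor conversion, the text months would sort alphabetically (Apr, Aug, Dec, ...) and the values would be attached to the wrong dates.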
rain_ts %>% dygraph(width=800,height=300,main="Irish Rainfall 1850-2014") %>% dyRangeSelector
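The range selector can also open on a preset interval. The sketch below uses a hypothetical series (demo_ts, with invented values standing in for rain_ts) and the dateWindow argument of dyRangeSelector() to start the view on 1851-1852; the axis label and dates are illustrative choices.

```r
library(dygraphs)

# Hypothetical monthly series standing in for rain_ts (invented values)
demo_ts <- ts(
  rep(c(120, 90, 80, 60, 65, 70, 75, 95, 100, 130, 125, 135), times = 5),
  start = c(1850, 1), frequency = 12
)

g <- dygraph(demo_ts, main = "Demo rainfall", ylab = "Rainfall (mm)",
             width = 800, height = 300)

# Show the selector and open the plot zoomed in on 1851-1852
g <- dyRangeSelector(g, dateWindow = c("1851-01-01", "1852-12-31"))

# Typing g at the console, or in an R Markdown chunk, renders the widget
```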
Use the same code to create a dygraph of the time series for each station individually, adding filter() to select the station of interest so that the resulting time series contains only that station. Do this for each of the four stations.
rain %>% group_by(Year,Month) %>% filter(Station=="Dublin Airport") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> DUB_ts
DUB_ts %>% dygraph(width=800,height=360,main="Dublin Airport") %>% dyRangeSelector
rain %>% group_by(Year,Month) %>% filter(Station=="Belfast") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> BELF_ts
BELF_ts %>% dygraph(width=800,height=360,main="Belfast") %>% dyRangeSelector
rain %>% group_by(Year,Month) %>% filter(Station=="University College Galway") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> NUIG_ts
NUIG_ts %>% dygraph(width=800,height=360,main="University College Galway") %>% dyRangeSelector
rain %>% group_by(Year,Month) %>% filter(Station=="Cork Airport") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> ORK_ts
ORK_ts %>% dygraph(width=800,height=360,main="Cork Airport") %>% dyRangeSelector
The time series for each station are called: DUB_ts for Dublin Airport, BELF_ts for Belfast, NUIG_ts for University College Galway, and ORK_ts for Cork Airport.
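Since the four blocks differ only in the station name, the repetition could be wrapped in a function. The helper below, station_ts, is hypothetical and not part of the original code; it assumes one row per station per month (so no summing is needed) and is demonstrated on invented toy data rather than the real rain data frame.

```r
suppressPackageStartupMessages(library(dplyr))

# Hypothetical helper: monthly ts for one station, assuming one row per month
station_ts <- function(data, station_name, start_year = 1850) {
  data %>%
    filter(Station == station_name) %>%
    mutate(Month = factor(Month, levels = month.abb)) %>%  # calendar order
    arrange(Year, Month) %>%
    pull(Rainfall) %>%
    ts(start = c(start_year, 1), frequency = 12)
}

# Invented toy data: two stations, one year, rows deliberately reversed
toy <- data.frame(
  Year     = 1850,
  Month    = rep(month.abb, times = 2),
  Station  = rep(c("A", "B"), each = 12),
  Rainfall = c(1:12, 101:112)
)
toy <- toy[nrow(toy):1, ]

a_ts <- station_ts(toy, "A")
b_ts <- station_ts(toy, "B")
print(a_ts)
```

With the real data, a call such as station_ts(rain, "Dublin Airport") would play the role of the DUB_ts block, provided the one-row-per-month assumption holds.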
It is also possible to look at all four stations simultaneously. The cbind() function combines the four time series into a single multivariate time series, with one column per station. Then create the dygraph depicting rainfall at all four stations for the period 1850-2014.
FourStations_ts <- cbind(BELF_ts,DUB_ts,NUIG_ts,ORK_ts)
window(FourStations_ts,c(1850,1),c(1850,12))
## BELF_ts DUB_ts NUIG_ts ORK_ts
## Jan 1850 115.7 75.8 108.9 155.3
## Feb 1850 120.5 47.8 131.5 92.6
## Mar 1850 56.8 18.5 56.6 56.0
## Apr 1850 142.6 97.5 120.5 207.2
## May 1850 57.9 58.6 69.8 35.3
## Jun 1850 62.0 43.6 74.7 11.4
## Jul 1850 96.3 66.0 89.1 179.0
## Aug 1850 110.4 41.2 136.8 46.5
## Sep 1850 65.8 54.2 85.2 40.7
## Oct 1850 87.6 40.4 90.7 53.8
## Nov 1850 104.4 60.0 131.3 153.2
## Dec 1850 57.6 81.1 90.6 169.4
FourStations_ts %>% dygraph(width=800,height=360,main="Four Stations") %>% dyRangeSelector
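Two details of cbind() on ts objects are worth knowing: named arguments set the column names (which dygraphs uses as series labels), and series covering different periods are aligned by date and padded with NA. The sketch below uses two short, invented series to show both behaviours; the names StationA and StationB are placeholders.

```r
# Two invented monthly series with different starting months
a <- ts(1:12,    start = c(1850, 1), frequency = 12)
b <- ts(101:106, start = c(1850, 7), frequency = 12)

# Named arguments become the column names of the combined series
both <- cbind(StationA = a, StationB = b)

print(both)
```

StationB is NA for January to June 1850 because that series only begins in July.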
Dublin Airport station is located in the east of Ireland, Cork Airport is in the south of the country, Belfast station is in the north and University College Galway is in the west, so together the four stations give broad geographic coverage of rainfall across the island.
Looking at the stations individually, the lowest rainfall at Belfast station was recorded in September 1894 and the highest in December 1978. At University College Galway, the highest rainfall for the period 1850-2014 was recorded in November 2009. This is notable because severe, widespread flooding was reported in County Galway in November 2009, believed to have been caused by exceptional rainfall. The lowest rainfall at University College Galway was recorded in April 1938, which is also the driest month recorded at any of the four stations for the period 1850-2014. At Cork Airport, the highest rainfall was recorded in December 1899, which is the highest for this period at any of the four stations, and the lowest was in June 1921.
Generally, the rainfall patterns are consistent across the four stations but differ in amount. There are several peaks in rainfall, with the most significant peaks occurring predominantly in the winter months (December, January, February) and occasionally in October and November. The wettest month for the period 1850-2014 was recorded at Cork Airport in December 1899 (460.5 mm). The lowest rainfall occurs throughout spring and summer, particularly from April to June. A few outliers are present, occurring mainly in February and September. The driest month was recorded at University College Galway in April 1938 (0.4 mm).
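Extremes like these can be read off the interactive graph, but they can also be located in code. The following is a minimal sketch using base R on a hypothetical series (demo_ts, with invented values); month_label is an illustrative helper, not part of the original analysis.

```r
# Hypothetical two-year monthly series (invented values, in mm)
demo_ts <- ts(
  c(90, 70, 60, 40, 45, 50, 55, 75, 80, 100, 95, 150,
    85, 65, 55,  5, 42, 48, 58, 72, 78,  98, 92, 110),
  start = c(1850, 1), frequency = 12
)

# Illustrative helper: turn a position in the series into "Mon YYYY"
month_label <- function(x, i) {
  month <- month.abb[cycle(x)[i]]        # position within the year, 1 to 12
  year  <- floor(time(x)[i] + 1e-8)      # small offset guards against rounding
  paste(month, year)
}

wettest <- month_label(demo_ts, which.max(demo_ts))
driest  <- month_label(demo_ts, which.min(demo_ts))

print(c(wettest = wettest, driest = driest))
```

Applied to a station series such as ORK_ts or NUIG_ts, the same calls would identify the months discussed above.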
Box, G.E.P., Jenkins, G.M., Reinsel, G.C. & Ljung, G.M. (2015) Time Series Analysis: Forecasting and Control, 5th edition. John Wiley & Sons, New Jersey.
CRAN Index, https://cran.r-project.org/web/packages/available_packages_by_name.html, last accessed on 17th January 2019.