Dygraphs is an R interface to an interactive time series plotting library. In this assignment it is used to visualise long-term data series on a graph, with the ability to zoom the graph to any time period. This allows time series analysis of a range of different data sets.
The data set used in this assignment has been converted to an R binary data file for convenience. There are no missing values and the data run from 1850 to 2014. The data file comes from Conor Murphy and Simon Noone of Maynooth University, who undertook a large-scale digitisation and recovery of rainfall data in Ireland, with the help of students of the university, to extend the rainfall time series back to 1850. This assignment uses rainfall data from four locations: Dublin Airport, Belfast, Cork Airport and University College Galway. The data are downloaded and input into R for a time series analysis.
First, all the libraries needed throughout the assignment are loaded. These libraries are needed for the data handling and the time series graphs. The tidyverse library gives a better layout for the tables that are produced in R.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(dygraphs)
library(reshape2)
library(tidyverse)
## ── Attaching packages ────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.0.0 ✔ readr 1.1.1
## ✔ tibble 1.4.2 ✔ purrr 0.2.5
## ✔ tidyr 0.8.1 ✔ stringr 1.3.1
## ✔ ggplot2 3.0.0 ✔ forcats 0.3.0
## ── Conflicts ───────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
The working directory is set for the session. This is where R will read files from and save files to. dir() shows what is in the chosen directory.
setwd("/Volumes/JORDAN/Chris' Assignment")
dir()
## [1] "Barplot1.jpeg" "Bel_dygraph.png"
## [3] "Cork_dygraph.png" "Dub_Bel_Cork_Gal_Dygraph.png"
## [5] "Dub_Bel_Dygraph.png" "Dub_dygraph.png"
## [7] "Dygraph1.png" "Dygraph3_Range_selector.png"
## [9] "Dygraph4_Interactive_Roller.png" "dyrgraph2.png"
## [11] "Gal_dygraph.png" "Heatmap1.jpeg"
## [13] "Plot1_Filetered.jpeg" "Plot1.jpeg"
## [15] "rainfall.rdata" "Rmarkdown.Rmd"
## [17] "Untitled.html" "Untitled.Rmd"
## [19] "Untitled2.html" "Untitled2.Rmd"
Having checked what is in the directory, the data file can be loaded with the load() function. The head() function then shows the first few lines of the stations table. It gives information about each individual station, for example elevation, eastings, northings, latitude and longitude. The rainfall values themselves are held in the separate rain object used below.
load('rainfall.rdata')
head(stations)
## # A tibble: 6 x 9
## Station Elevation Easting Northing Lat Long County Abbreviation Sour…
## <chr> <int> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
## 1 Athboy 87 270400 261700 53.6 -6.93 Meath AB Met …
## 2 Foulks… 71 284100 118400 52.3 -6.77 Wexfo… F Met …
## 3 Mullin… 112 241780 247765 53.5 -7.37 Westm… M Met …
## 4 Portlaw 8 246600 115200 52.3 -7.31 Water… P Met …
## 5 Rathdr… 131 319700 186000 52.9 -6.22 Wickl… RD Met …
## 6 Stroke… 49 194500 279100 53.8 -8.1 Rosco… S Met …
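Because load() restores objects under the names they were saved with, it can help to check what a data file contains before using it. The following is a minimal sketch of that idea, not the assignment's own code: the two small data frames and the temporary file are made-up stand-ins, and loading into a separate environment is just one way to avoid overwriting anything in the workspace.

```r
# Hypothetical stand-ins for the objects stored in a data file.
stations_demo <- data.frame(Station = c("A", "B"), Elevation = c(10L, 50L))
rain_demo <- data.frame(Station = c("A", "B"), Rainfall = c(80.2, 120.5))

# Save both objects to a temporary .rdata file.
demo_file <- tempfile(fileext = ".rdata")
save(stations_demo, rain_demo, file = demo_file)

# Load into a fresh environment so nothing in the workspace is overwritten.
demo_env <- new.env()
loaded_names <- load(demo_file, envir = demo_env)

print(loaded_names)  # names of the objects that were restored
print(ls(demo_env))  # the same names, listed from the environment
```

load() returns the names of the restored objects, which is a quick way to confirm that both a station table and a rainfall table are present.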
Next, a look at the mean rainfall for each station. The unnecessary information is removed from the output, leaving just the station and its mean rainfall.
rain %>% group_by(Station) %>%
summarise(mrain=mean(Rainfall)) -> rain_summary
head(rain_summary)
## # A tibble: 6 x 2
## Station mrain
## <chr> <dbl>
## 1 Ardara 140.
## 2 Armagh 68.3
## 3 Athboy 74.7
## 4 Belfast 87.1
## 5 Birr 70.8
## 6 Cappoquinn 121.
After looking at the mean rainfall for the first few stations in the data set, the mean rainfall for each month across the whole data set is needed to see how rainfall varies through the year.
rain %>% group_by(Month) %>%
summarise(mrain=mean(Rainfall)) -> rain_months
head(rain_months)
## # A tibble: 6 x 2
## Month mrain
## <fct> <dbl>
## 1 Jan 113.
## 2 Feb 83.2
## 3 Mar 79.5
## 4 Apr 68.7
## 5 May 71.3
## 6 Jun 72.7
The bar chart shows the mean monthly rainfall for the data set. It is as expected, with rainfall being highest in the winter months of December and January. However, October sees a higher mean rainfall than February. The lower mean rainfall in mid-year is expected, and it increases again towards the end of the year.
with(rain_months,barplot(mrain,names=Month,las=3,col='dodgerblue'))
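The bars come out in calendar order because Month is stored as a factor (shown as <fct> in the output above), presumably with its levels running from Jan to Dec. The sketch below illustrates the difference this makes, using a made-up three-row table rather than the real rain_months.

```r
# Made-up monthly means, deliberately stored out of calendar order.
demo <- data.frame(
  Month = c("Mar", "Jan", "Feb"),
  mrain = c(79.5, 113.0, 83.2)
)

# As plain text, sorting is alphabetical.
alphabetical <- sort(demo$Month)

# As a factor with levels taken from month.abb, sorting follows the calendar.
demo$Month <- factor(demo$Month, levels = month.abb)
calendar <- demo[order(demo$Month), ]

print(alphabetical)
print(as.character(calendar$Month))
```

If the months were plain character strings, grouped summaries and plots would be ordered alphabetically, which would make the seasonal pattern harder to read.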
Looking at rainfall on a yearly timescale allows the variability of rainfall to be analysed across the whole data set, from 1850 to 2014. Just after 1850, an extremely wet year followed immediately by a very dry year can be seen. This shows that compound events also happened in the past, even if they are increasing in frequency recently due to climate change. The late 1800s show a standout dry year, while the mid-1800s show a standout wet year. The 1800s show a wide range of variability compared to the centuries that follow, where variability reduces significantly. However, extremely wet and dry years can still be seen in the early 2000s, with possible compound events evident from the graph.
rain %>% group_by(Year) %>%
summarise(total_rain=sum(Rainfall)) -> rain_years
with(rain_years,plot(Year,total_rain,type='l',col='dodgerblue'))
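Standout years can also be picked out numerically instead of by eye. The following is a rough sketch under stated assumptions: years_demo is a made-up table with the same column names as rain_years, and the standardised anomaly is one common way to express how unusual a year is, not a method taken from the assignment.

```r
# Made-up yearly totals; the real values would come from rain_years.
years_demo <- data.frame(
  Year = 1850:1855,
  total_rain = c(1010, 1325, 780, 995, 1040, 960)
)

# Rows holding the largest and smallest yearly totals.
wettest <- years_demo[which.max(years_demo$total_rain), ]
driest <- years_demo[which.min(years_demo$total_rain), ]

# Standardised anomaly: distance from the mean in standard deviations.
years_demo$anomaly <- (years_demo$total_rain - mean(years_demo$total_rain)) /
  sd(years_demo$total_rain)

print(wettest$Year)
print(driest$Year)
print(round(years_demo$anomaly, 2))
```

A wet year followed directly by a dry year then shows up as a large positive anomaly next to a large negative one.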
The following table shows the mean rainfall for each station in each month.
rain %>% group_by(Month,Station) %>%
summarise(mean_rain=mean(Rainfall)) -> rain_season_station
head(rain_season_station)
## # A tibble: 6 x 3
## # Groups: Month [1]
## Month Station mean_rain
## <fct> <chr> <dbl>
## 1 Jan Ardara 175.
## 2 Jan Armagh 74.6
## 3 Jan Athboy 84.9
## 4 Jan Belfast 101.
## 5 Jan Birr 79.9
## 6 Jan Cappoquinn 154.
This heat map is a useful visualisation of the mean monthly rainfall for each of the stations in the data set. Yellow shows high levels of rainfall and red shows lower levels. It shows the high levels of rainfall in the autumn and winter months, with spring and summer being the drier months.
rain_season_station %>% acast(Station~Month) %>% heatmap(Colv=NA)
## Using mean_rain as value column: use value.var to override.
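heatmap() needs a matrix with one row per station and one column per month, which is what the acast() step produces from the long table. As an illustration only, the same reshaping can be sketched in base R with tapply(); long_demo is a made-up table with the same column names as rain_season_station, and the values are for illustration.

```r
# Made-up long-format table: one row per month and station.
long_demo <- data.frame(
  Month = factor(c("Jan", "Jan", "Feb", "Feb"), levels = c("Jan", "Feb")),
  Station = c("Ardara", "Armagh", "Ardara", "Armagh"),
  mean_rain = c(175, 74.6, 130, 55)
)

# Reshape to a Station x Month matrix; each cell holds a single value,
# so mean() simply returns that value.
wide_demo <- tapply(
  long_demo$mean_rain,
  list(long_demo$Station, long_demo$Month),
  mean
)

print(wide_demo)
```

The message printed by acast() is informational: it reports that mean_rain was chosen as the value column and notes that value.var can be set to choose it explicitly.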
The following is the interactive time series plot of the four stations chosen for the time series analysis: Dublin Airport, Belfast, Cork Airport and University College Galway. The four stations can be seen here in one graph, with the range selector at the bottom. It allows the analyst to choose the time period they wish to look at by sliding the range selector along the bottom of the graph.
rain %>% group_by(Year,Month) %>% filter(Station=="Dublin Airport") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> dub_ts
rain %>% group_by(Year,Month) %>% filter(Station=="Belfast") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> bel_ts
rain %>% group_by(Year,Month) %>% filter(Station=="Cork Airport") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> cork_ts
rain %>% group_by(Year,Month) %>% filter(Station=="University College Galway") %>%
summarise(Rainfall=sum(Rainfall)) %>% ungroup %>% transmute(Rainfall) %>%
ts(start=c(1850,1),freq=12) -> gal_ts
beldubcorkgal_ts <- cbind(bel_ts,dub_ts,cork_ts,gal_ts)
window(beldubcorkgal_ts,c(1850,1),c(1850,5))
## bel_ts dub_ts cork_ts gal_ts
## Jan 1850 115.7 75.8 155.3 108.9
## Feb 1850 120.5 47.8 92.6 131.5
## Mar 1850 56.8 18.5 56.0 56.6
## Apr 1850 142.6 97.5 207.2 120.5
## May 1850 57.9 58.6 35.3 69.8
beldubcorkgal_ts %>% dygraph(width=800,height=360) %>% dyRangeSelector
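One point worth noting about the ts() calls is that dates are assigned purely by position: the first value is labelled January 1850, the next February 1850, and so on. The values therefore need to be in chronological order with no missing months, which fits the data set having no missing values. The sketch below shows the idea with made-up numbers for two hypothetical stations.

```r
# Made-up monthly rainfall for two hypothetical stations, Jan 1850 to Dec 1851.
set.seed(1)
a_ts <- ts(round(runif(24, 20, 200), 1), start = c(1850, 1), frequency = 12)
b_ts <- ts(round(runif(24, 20, 200), 1), start = c(1850, 1), frequency = 12)

# cbind() aligns the series on their shared time index.
both_ts <- cbind(a_ts, b_ts)

# window() extracts a date range; here March to May 1851.
spring_1851 <- window(both_ts, start = c(1851, 3), end = c(1851, 5))
print(spring_1851)

# time() recovers the dates that ts() assigned by position.
print(range(time(both_ts)))
```

If a month were missing from the input, every later value would be shifted one month early, so it is worth checking the length of each series before plotting.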
The last four time series graphs show the four stations on separate plots that share the same range selector. The plot with all four stations can be crowded when attempting an analysis of the full time period, so placing the four stations on different plots with a shared range selector makes this easier.
dub_ts %>% dygraph(width=800,height=130,group="dub_belf_cork_gal",main="Dublin")
bel_ts %>% dygraph(width=800,height=130,group="dub_belf_cork_gal",main="Belfast")
cork_ts %>% dygraph(width=800,height=130,group="dub_belf_cork_gal",main="Cork")
gal_ts %>% dygraph(width=800,height=130,group="dub_belf_cork_gal",main="Galway") %>% dyRangeSelector
Over the full range of the data, Dublin shows a lower amount of rainfall on average and is drier throughout the data set. Belfast shows similar rainfall readings to Dublin, being slightly wetter on average. Cork is the wettest station of the four on average: it has higher mean rainfall throughout the time series and also has the largest outlier in the data set. Galway is similar to Dublin and Belfast, while being slightly on the wetter side. On average, the west and south of the island of Ireland prove to be wetter in this data set, with both Cork Airport and University College Galway being wetter than Belfast and Dublin Airport throughout the time series.
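A comparison like this can be backed up with a few summary numbers from a combined series. The following is a minimal sketch with made-up values: demo_ts stands in for a multi-station series such as beldubcorkgal_ts, and the column names east and west are hypothetical.

```r
# Made-up monthly rainfall for two hypothetical stations.
demo_ts <- ts(
  cbind(east = c(60, 55, 70, 65), west = c(110, 95, 120, 105)),
  start = c(1850, 1), frequency = 12
)

# Mean and maximum monthly rainfall per station.
station_means <- colMeans(demo_ts)
station_max <- apply(demo_ts, 2, max)

# Stations ordered from wettest to driest on average.
print(sort(station_means, decreasing = TRUE))
print(station_max)
```

The per-station maximum is a simple way to confirm which station holds the largest single reading.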