Introduction

In this assignment we investigate rainfall in Ireland since the mid-19th century by creating a map of rainfall for the 25 weather stations, colour coding the symbol for each station according to its median rainfall level in January. The post contains an explanation of the data being used, the code used to create the map or dygraph, the embedded map or dygraph, and a brief discussion of the patterns that have been identified.

R Packages

The R packages we'll need for this assignment are listed below. Install them in the usual way, e.g. install.packages("leaflet").

Required R packages: dplyr, dygraphs and leaflet

We first load all the required R packages:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(dygraphs)
## Warning: package 'dygraphs' was built under R version 3.3.2
library(leaflet)
## Warning: package 'leaflet' was built under R version 3.3.2

Set Local Working Directory, Open Data and Inspect Data

setwd("C:/Ireland")
load('rainfall.RData') 
head(stations, n=4)
## # A tibble: 4 × 9
##       Station Elevation Easting Northing   Lat  Long    County
##         <chr>     <int>   <dbl>    <dbl> <dbl> <dbl>     <chr>
## 1      Athboy        87  270400   261700 53.60 -6.93     Meath
## 2 Foulksmills        71  284100   118400 52.30 -6.77   Wexford
## 3   Mullingar       112  241780   247765 53.47 -7.37 Westmeath
## 4     Portlaw         8  246600   115200 52.28 -7.31 Waterford
## # ... with 2 more variables: Abbreviation <chr>, Source <chr>
head(rain, n=4)
## # A tibble: 4 × 4
##    Year  Month Rainfall Station
##   <dbl> <fctr>    <dbl>   <chr>
## 1  1850    Jan    169.0  Ardara
## 2  1851    Jan    236.4  Ardara
## 3  1852    Jan    249.7  Ardara
## 4  1853    Jan    209.1  Ardara

Note that the rainfall data file contains two data frames of interest, 'stations' and 'rain'. In 'stations' the columns of interest to us are Station, Lat, Long and County. In 'rain' the columns of interest are Year, Month, Rainfall and Station. The Station column is common to both data frames, so it will be our basis for joining (or linking) them later in the assignment.

Note that the echo = TRUE parameter was added to the code chunk so that the R code that generated the output is printed. Similarly, n=4 ensures that only four records of each data set are displayed.
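Since Station is the join key, it can be worth checking that every station name in the rainfall records also appears in the station table before joining. The sketch below uses two small stand-in data frames (stations_toy and rain_toy are hypothetical names, the rainfall values are made up, and "Nowhere" is a deliberately unmatched station); it is only an illustration of the idea, not part of the original analysis.

```r
library(dplyr)

# Stand-in for 'stations': two real rows from the head() output above
stations_toy <- tibble(
  Station = c("Athboy", "Portlaw"),
  Lat     = c(53.60, 52.28),
  Long    = c(-6.93, -7.31),
  County  = c("Meath", "Waterford")
)

# Stand-in for 'rain': rainfall values are invented for illustration
rain_toy <- tibble(
  Year     = c(1850, 1850, 1850),
  Month    = factor(c("Jan", "Jan", "Jan")),
  Rainfall = c(80.1, 95.3, 60.2),
  Station  = c("Athboy", "Portlaw", "Nowhere")
)

# Rainfall rows whose Station has no match in the station table
unmatched <- anti_join(rain_toy, stations_toy, by = "Station")

# left_join keeps every rainfall row and adds the station columns;
# unmatched stations get NA for Lat, Long and County
joined <- left_join(rain_toy, stations_toy, by = "Station")
```

If unmatched has zero rows for the real data, every rainfall record will pick up coordinates in the join.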

Group Rain data by year and compute Mean, Standard Deviation and Median Rainfall

rain %>% group_by(Year) %>% 
  summarise(mnrain=mean(Rainfall),sdrain=sd(Rainfall), mdrain=median(Rainfall))  -> rain_by_Year
head(rain_by_Year, n=4)
## # A tibble: 4 × 4
##    Year    mnrain   sdrain mdrain
##   <dbl>     <dbl>    <dbl>  <dbl>
## 1  1850  85.07233 41.83323  74.65
## 2  1851  82.93133 51.63810  71.35
## 3  1852 112.39567 70.65901  92.75
## 4  1853  86.68200 50.92695  76.30

In the code above, the 'rain' data frame is grouped by the Year column. The mean, standard deviation and median of Rainfall are computed and saved in a new data frame, 'rain_by_Year'.

Including Interactive Graphs

We embed here a four-year rolling plot of the yearly rainfall statistics (mean, SD and median):

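# rain_by_Year has one row per year, so build an annual series (frequency 1, not 12)
rain_by_Year %>% ungroup %>% arrange(Year) %>% transmute(mnrain, sdrain, mdrain) %>% 
  ts(start=1850,frequency=1) ->  dub_ts

# rollPeriod is counted in observations, so 4 gives a four-year rolling average
dub_ts %>% dygraph(width=800,height=350, main="4 Year Roll Over interactive plot of Annual Rainfall") %>% dyRangeSelector %>% dyRoller(rollPeriod = 4)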

Note that the time series object is created with the ts() function before plotting. Here mnrain = mean rainfall, sdrain = standard deviation of rainfall and mdrain = median rainfall.
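To make the two ideas concrete, here is a minimal base-R sketch of an annual ts object and a trailing four-year moving average. The numbers are made up, and stats::filter is used here only as a stand-in to show what a rolling average computes; it is not how dygraphs draws its rolled series internally.

```r
# Made-up annual values, for illustration only
annual <- c(80, 90, 100, 110, 120, 130)

# One observation per year, starting in 1850
annual_ts <- ts(annual, start = 1850, frequency = 1)

# Trailing 4-year moving average: each value is the mean of that year
# and the three years before it; the first three values are NA
roll4 <- stats::filter(annual_ts, rep(1 / 4, 4), sides = 1)

roll4
```

The time index of annual_ts runs from 1850 to 1855, one step per year, which is the behaviour wanted for data that has been summarised by year.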

Total Annual Rainfall - All Stations vs. Station: Strokestown

rain %>% group_by(Year) %>% 
  summarise(total_rain=sum(Rainfall)) -> rain_years
with(rain_years,plot(Year,total_rain,type='l',col='dodgerblue', main="Total Yearly Rainfall"))

rain %>% group_by(Year) %>% 
  filter(Station=='Strokestown') %>%
  summarise(total_rain=sum(Rainfall)) -> rain_years_str
with(rain_years_str,plot(Year,total_rain,type='l',col='dodgerblue', main="Total Yearly Rainfall for Strokestown"))

Note that we can similarly plot graphs for each of the 25 Stations.
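One way to do this without repeating the code 25 times is to compute the totals for all stations at once and loop over the station names. The sketch below uses a small made-up data frame (rain_toy, with hypothetical stations "A" and "B") in place of 'rain'; with the real data the same loop would draw one plot per station.

```r
library(dplyr)

# Made-up stand-in for 'rain': two stations, two years, two months each
rain_toy <- tibble(
  Year     = rep(c(1850, 1851), each = 4),
  Month    = factor(rep(c("Jan", "Feb"), times = 4), levels = c("Jan", "Feb")),
  Rainfall = c(100, 80, 60, 40, 110, 90, 70, 50),
  Station  = rep(rep(c("A", "B"), each = 2), times = 2)
)

# Total rainfall per station and year
station_totals <- rain_toy %>%
  group_by(Station, Year) %>%
  summarise(total_rain = sum(Rainfall)) %>%
  ungroup()

# One line plot per station
for (s in unique(station_totals$Station)) {
  one <- filter(station_totals, Station == s)
  plot(one$Year, one$total_rain, type = "l", col = "dodgerblue",
       xlab = "Year", ylab = "total_rain",
       main = paste("Total Yearly Rainfall for", s))
}
```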

Group by Month - Mean, SD, Median Barplots

rain %>% group_by(Month) %>% 
  summarise(mnrain=mean(Rainfall),sdrain=sd(Rainfall), mdrain=median(Rainfall))  -> rain_by_Month
head(rain_by_Month, n=4)
## # A tibble: 4 × 4
##    Month    mnrain   sdrain mdrain
##   <fctr>     <dbl>    <dbl>  <dbl>
## 1    Jan 112.64355 57.56599  104.6
## 2    Feb  83.24975 51.45153   74.4
## 3    Mar  79.53280 44.33714   73.1
## 4    Apr  68.74165 36.39073   62.5
barplot(rain_by_Month$mdrain,names=rain_by_Month$Month,las=3,col='dodgerblue', main="Median Rainfall", sub="(Overall Monthwise)")

barplot(rain_by_Month$mnrain,names=rain_by_Month$Month,las=3,col='dodgerblue', main="Average Rainfall", sub="(Overall Monthwise)")

barplot(rain_by_Month$sdrain,names=rain_by_Month$Month,las=3,col='dodgerblue', main="Rainfall Variability", sub="(Overall Monthwise)")

Note that mean and median rainfall are highest in January and December, closely followed by October and November.
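The ranking can also be read off a sorted table rather than from the barplots. The sketch below rebuilds only the four rows of rain_by_Month printed above (January to April), so it illustrates the sorting step rather than the full twelve-month comparison; rain_by_Month_head is a hypothetical name.

```r
library(dplyr)

# The four rows of rain_by_Month shown in the head() output above
rain_by_Month_head <- tibble(
  Month  = factor(c("Jan", "Feb", "Mar", "Apr"),
                  levels = c("Jan", "Feb", "Mar", "Apr")),
  mnrain = c(112.64355, 83.24975, 79.53280, 68.74165),
  mdrain = c(104.6, 74.4, 73.1, 62.5)
)

# Sort months from wettest to driest by median rainfall
ranked <- rain_by_Month_head %>% arrange(desc(mdrain))
ranked
```

Applied to the full rain_by_Month data frame, the same arrange() call would list all twelve months in order.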

Monthplot and Decomposition of Rainfall

rain %>% group_by(Year,Month) %>% summarise(rf=sum(Rainfall)) -> monthly_total
monthly_total$rf %>% ts(freq=12,start=1850) -> rain_ts

rain_ts %>% window(c(1850,1),c(1869,12)) %>%
  monthplot(col='dodgerblue',col.base='indianred',lwd.base=3, main="MonthPlot")

rain_ts  %>% window(c(1850,1),c(1859,12)) %>% 
  stl(s.window='periodic') %>% plot(main="Time Series Decomposition of Rainfall")

Various Ways Data is Organized

# Group by Station - Mean, SD, Median

rain %>% group_by(Station) %>% 
  summarise(mnrain=mean(Rainfall),sdrain=sd(Rainfall), mdrain=median(Rainfall))  -> rain_by_Station
head(rain_by_Station, n=4)
## # A tibble: 4 × 4
##   Station    mnrain   sdrain mdrain
##     <chr>     <dbl>    <dbl>  <dbl>
## 1  Ardara 140.36753 65.74703  131.5
## 2  Armagh  68.32096 32.82964   65.4
## 3  Athboy  74.74356 35.60322   70.7
## 4 Belfast  87.10995 42.70573   82.1
# Group by Station X Year - Mean, SD, Median

rain %>% group_by(Year,Station) %>% 
  summarise(mnrain=mean(Rainfall),sdrain=sd(Rainfall), mdrain=median(Rainfall)) -> rain_by_YrStation
head(rain_by_YrStation, n=4)
## Source: local data frame [4 x 5]
## Groups: Year [1]
## 
##    Year Station    mnrain   sdrain mdrain
##   <dbl>   <chr>     <dbl>    <dbl>  <dbl>
## 1  1850  Ardara 132.80000 33.71301 136.00
## 2  1850  Armagh  64.38333 22.85881  69.45
## 3  1850  Athboy  70.88333 17.88874  70.65
## 4  1850 Belfast  89.80000 29.50815  91.95
# Group by Station X Month - Mean, SD, Median

rain %>% group_by(Month,Station) %>% 
  summarise(mnrain=mean(Rainfall),sdrain=sd(Rainfall), mdrain=median(Rainfall)) -> rain_by_MthStation
head(rain_by_MthStation, n=4)
## Source: local data frame [4 x 5]
## Groups: Month [1]
## 
##    Month Station    mnrain   sdrain mdrain
##   <fctr>   <chr>     <dbl>    <dbl>  <dbl>
## 1    Jan  Ardara 174.82606 64.59834  171.6
## 2    Jan  Armagh  74.57242 28.91264   75.0
## 3    Jan  Athboy  84.94759 32.84887   87.1
## 4    Jan Belfast 101.20718 40.60078  102.1
# Group by Station X Month - Median for January

rain %>% group_by(Month,Station) %>%  filter(Month=='Jan') %>% 
  summarise(mdrainJan=median(Rainfall)) %>% left_join(stations)  %>% 
  select(Month, Station, mdrainJan, Long,Lat,County) -> rain_Median_Jan
## Joining, by = "Station"
head(rain_Median_Jan, n=4)
## Source: local data frame [4 x 6]
## Groups: Month [1]
## 
##    Month Station mdrainJan  Long   Lat  County
##   <fctr>   <chr>     <dbl> <dbl> <dbl>   <chr>
## 1    Jan  Ardara     171.6 -8.29 54.79 Donegal
## 2    Jan  Armagh      75.0 -6.64 54.35  Armagh
## 3    Jan  Athboy      87.1 -6.93 53.60   Meath
## 4    Jan Belfast     102.1 -5.99 54.50  Antrim

Map of Median Annual rainfall in Ireland for the 25 weather stations

Weather station is represented by a circle

Circle colour coded based on median rainfall in January
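A map of this kind could be built along the following lines. This is a minimal sketch, not necessarily the code behind the original map: jan_toy is a hypothetical stand-in holding the four rows of rain_Median_Jan printed above, and the "Blues" palette is an arbitrary choice.

```r
library(leaflet)

# The four rows of rain_Median_Jan shown in the head() output above
jan_toy <- data.frame(
  Station   = c("Ardara", "Armagh", "Athboy", "Belfast"),
  mdrainJan = c(171.6, 75.0, 87.1, 102.1),
  Long      = c(-8.29, -6.64, -6.93, -5.99),
  Lat       = c(54.79, 54.35, 53.60, 54.50)
)

# Continuous palette mapping median January rainfall to a colour
pal <- colorNumeric(palette = "Blues", domain = jan_toy$mdrainJan)

# One circle per station, coloured by its median January rainfall
circle_map <- leaflet(jan_toy) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~Long, lat = ~Lat,
                   color = ~pal(mdrainJan), fillOpacity = 0.8, radius = 8,
                   popup = ~paste0(Station, ": ", mdrainJan)) %>%
  addLegend(position = "bottomright", pal = pal, values = ~mdrainJan,
            title = "Median January rainfall")

circle_map
```

Passing the full rain_Median_Jan data frame instead of jan_toy would place all 25 stations on the map.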

Map of Median Annual rainfall in Ireland for the 25 weather stations

Weather station is represented by a symbol

Symbol colour coded based on median rainfall in January
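One possible way to draw coloured symbols is leaflet's awesome markers, which accept only a fixed set of named colours, so rainfall has to be binned first. The sketch below is an assumption about how such a map might be produced, not the original code: it needs a version of leaflet that provides awesomeIcons() and addAwesomeMarkers(), jan_toy is a hypothetical stand-in for rain_Median_Jan, and the break points 80 and 120 are arbitrary.

```r
library(leaflet)

# The four rows of rain_Median_Jan shown in the head() output above
jan_toy <- data.frame(
  Station   = c("Ardara", "Armagh", "Athboy", "Belfast"),
  mdrainJan = c(171.6, 75.0, 87.1, 102.1),
  Long      = c(-8.29, -6.64, -6.93, -5.99),
  Lat       = c(54.79, 54.35, 53.60, 54.50)
)

# Bin median January rainfall into three classes named after marker colours;
# the break points are arbitrary and chosen only for illustration
rain_class <- cut(jan_toy$mdrainJan, breaks = c(0, 80, 120, Inf),
                  labels = c("lightblue", "blue", "darkblue"))

# One icon per station, with the marker colour taken from its rainfall class
icons <- awesomeIcons(icon = "tint", library = "glyphicon",
                      markerColor = as.character(rain_class))

symbol_map <- leaflet(jan_toy) %>%
  addTiles() %>%
  addAwesomeMarkers(lng = ~Long, lat = ~Lat, icon = icons, label = ~Station)

symbol_map
```

Binning trades the smooth colour scale of the circle map for a small number of clearly distinguishable classes.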

Final Remarks

The main aim of this project was to complete the given assignment by publishing a blog post (on RPubs) describing the process of creating a map of average annual rainfall in Ireland for the 25 weather stations, colour coding the symbol for each station according to its median rainfall level in January.

The process taught us a good deal about reproducible research. It helped us gain knowledge and develop skills in R Markdown, in integrating maps and graphs into an HTML document, in adding interactive functionality, and in using the leaflet, dygraphs and dplyr packages, along with the overall experience of integrating it all into one document or R script. Above all, we learned to combine R packages and embed their output in a document to publish a powerful, interactive and dynamic presentation or dashboard.