The objective is to create a map of rainfall in Ireland for the 25 weather stations, colour coding the symbol for each station according to its median rainfall level in January.
In the following sections, the titles correspond to the procedures executed. For each one, the grey boxes show the code entered in the R script, followed by the output generated and a short explanation of the code.
Finally, the conclusion offers a brief discussion of the patterns identified.
setwd("D:/Program Files/RStudio")
load('rainfall.RData')
The data file loads two objects: rain and stations.
head(stations)
## Station Elevation Easting Northing Lat Long County
## 1 Athboy 87 270400 261700 53.60 -6.93 Meath
## 2 Foulksmills 71 284100 118400 52.30 -6.77 Wexford
## 3 Mullingar 112 241780 247765 53.47 -7.37 Westmeath
## 4 Portlaw 8 246600 115200 52.28 -7.31 Waterford
## 5 Rathdrum 131 319700 186000 52.91 -6.22 Wicklow
## 6 Strokestown 49 194500 279100 53.75 -8.10 Roscommon
## Abbreviation Source
## 1 AB Met Eireann
## 2 F Met Eireann
## 3 M Met Eireann
## 4 P Met Eireann
## 5 RD Met Eireann
## 6 S Met Eireann
‘stations’ contains information about the 25 weather stations, in particular their location.
head(rain)
## Year Month Rainfall Station
## 1 1850 Jan 169.0 Ardara
## 2 1851 Jan 236.4 Ardara
## 3 1852 Jan 249.7 Ardara
## 4 1853 Jan 209.1 Ardara
## 5 1854 Jan 188.5 Ardara
## 6 1855 Jan 32.3 Ardara
‘rain’ contains monthly rainfall records for each station over a 165-year period (January 1850 to December 2014).
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
rain %>% group_by(Station) %>%
summarise(mrain=median(Rainfall)) -> rain_summary
head(rain_summary)
## # A tibble: 6 x 2
## Station mrain
## <chr> <dbl>
## 1 Ardara 132.
## 2 Armagh 65.4
## 3 Athboy 70.7
## 4 Belfast 82.1
## 5 Birr 66.2
## 6 Cappoquinn 113.
The ‘dplyr’ package provides an extension of the R statistical programming language that allows easier manipulation of data, and a ‘pipelining’-style notation, which in some data processing situations makes instructions much clearer than the classical ‘mathematical function’ notation.
‘summarise’ applies a summary function to each group and creates a new data frame with one row per group. Here, the summary function is the median of ‘Rainfall’, so the result holds one median per station, taken over all months and years.
The ‘head’ function returns the first rows of an object (six by default), here the median rainfall by station produced by ‘group_by’ and ‘summarise’. To return the last rows instead, the ‘tail’ function would be used.
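The grouping and summarising steps can be tried on a small scale. The sketch below uses a hypothetical toy data frame (`toy_rain`, with invented station names and values, not taken from rainfall.RData) to show that `summarise` returns one row per group and that the median is not pulled upwards by a single extreme value.

```r
library(dplyr)

# Hypothetical toy records (invented for illustration only)
toy_rain <- data.frame(
  Station  = c("A", "A", "A", "B", "B", "B"),
  Rainfall = c(10, 30, 20, 5, 7, 100)
)

# One output row per station; the median of B ignores the outlier of 100
toy_summary <- toy_rain %>%
  group_by(Station) %>%
  summarise(mrain = median(Rainfall))

head(toy_summary, n = 1)  # first row of the summary
tail(toy_summary, n = 1)  # last row of the summary
```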
rain %>% group_by(Month) %>%
summarise(mrain=median(Rainfall)) -> rain_months
head(rain_months)
## # A tibble: 6 x 2
## Month mrain
## <fct> <dbl>
## 1 Jan 105.
## 2 Feb 74.4
## 3 Mar 73.1
## 4 Apr 62.5
## 5 May 65.5
## 6 Jun 65.2
This time, ‘group_by’ groups the rainfall data by month instead of by station.
‘summarise’ again creates one row per group, so each row holds the median rainfall for one month across all stations and years.
The ‘head’ function shows the first six of these monthly medians.
rain %>% group_by(Month,Station) %>%
summarise(median_rain=median(Rainfall)) -> Median_rain_month_station
head(Median_rain_month_station)
## # A tibble: 6 x 3
## # Groups: Month [1]
## Month Station median_rain
## <fct> <chr> <dbl>
## 1 Jan Ardara 172.
## 2 Jan Armagh 75
## 3 Jan Athboy 87.1
## 4 Jan Belfast 102.
## 5 Jan Birr 77.5
## 6 Jan Cappoquinn 147.
Here ‘group_by’ groups the rainfall data by month and by station together (previously it was grouped by station only and then by month only).
‘summarise’ then computes the median rainfall for each combination of month and station.
The ‘head’ function shows the first six rows, which are the January medians of the first six stations.
library(reshape2)
Median_rain_month_station %>% acast(Station~Month) %>% heatmap(Colv=NA)
## Using median_rain as value column: use value.var to override.
‘reshape2’ is an R package that makes it easy to transform data between wide and long formats (wide data has a column for each variable, while long-format data has one column naming the variable and another holding its value).
This code takes the ‘Median_rain_month_station’ data frame and passes it through the pipeline: ‘acast’ reshapes it into a matrix with one row per station and one column per month, and ‘heatmap’ draws that matrix as a heat map. ‘Colv=NA’ suppresses the column dendrogram, so the months stay in calendar order.
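To make the long-to-wide step concrete, here is a minimal sketch on a hypothetical long table (`toy_long`, invented values). It names the value column explicitly through `value.var`, which avoids the "Using median_rain as value column" message, and then draws the same kind of heat map.

```r
library(reshape2)

# Hypothetical long-format medians: one row per month/station pair
toy_long <- data.frame(
  Month = factor(rep(c("Jan", "Feb", "Mar"), times = 3),
                 levels = c("Jan", "Feb", "Mar")),
  Station = rep(c("A", "B", "C"), each = 3),
  median_rain = c(100, 70, 60, 150, 90, 80, 60, 50, 55)
)

# Wide matrix: one row per station, one column per month
toy_wide <- acast(toy_long, Station ~ Month, value.var = "median_rain")
toy_wide

# Colv = NA keeps the columns in their original (calendar) order
heatmap(toy_wide, Colv = NA)
```

By default `heatmap` scales each row, so the colours compare months within a station; `scale = "none"` would colour by the raw values instead.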
rain %>% group_by(Month,Station) %>%
filter(Month=='Jan') %>%
summarise(median_rain=median(Rainfall)) -> Median_rain_Jan
Here the data are grouped by month and by station, as before.
As our objective is to map the median rainfall level in January over the weather stations in Ireland, the ‘filter’ function keeps only the January records before the medians are computed.
‘summarise’ then returns the median January rainfall for each station (previously the median was computed for every month and station).
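The same January-only summary can be sketched on hypothetical toy data (`toy_rain`, invented values). In this variant the filter comes first and the grouping uses only the station, which gives one row per station without carrying the constant `Month` column along; this is an alternative arrangement, not the code used above.

```r
library(dplyr)

# Hypothetical records; the February values are deliberately extreme
toy_rain <- data.frame(
  Year     = c(1850, 1851, 1852, 1850, 1851, 1852, 1850, 1850),
  Month    = c("Jan", "Jan", "Jan", "Jan", "Jan", "Jan", "Feb", "Feb"),
  Station  = c("A", "A", "A", "B", "B", "B", "A", "B"),
  Rainfall = c(10, 20, 60, 5, 7, 9, 999, 999)
)

# Keep January only, then take the median for each station
toy_jan <- toy_rain %>%
  filter(Month == "Jan") %>%
  group_by(Station) %>%
  summarise(median_rain = median(Rainfall))

toy_jan
```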
rain %>% group_by(Year,Month) %>% summarise(median_rain=median(Rainfall)) %>%
filter(Month=='Jan') -> monthly_medians
Here the data are grouped by year and by month, so that ‘summarise’ returns the median across stations for each month of each year.
The January medians are then selected using the ‘filter’ function, leaving one value per year.
plot(monthly_medians$median_rain[1:165],type='b',ann=F, axes=F, cex=0.6)
axis(1,at=seq(1,165,by=1),labels=1850:2014); axis(2); title('Median Rainfall for January 1850-2014')
The January medians for the period 1850-2014 are now plotted. ‘monthly_medians’ is the data frame created previously; the ‘$’ operator extracts its ‘median_rain’ column, and ‘[1:165]’ selects its 165 values, one for each year from 1850 to 2014. The default axes are suppressed (‘axes=F’) so that the x-axis can be relabelled with the years.
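An alternative way to label the years, sketched below, is to plot the values against the year itself, so that no manual axis relabelling is needed. The values here are simulated stand-ins (`toy_medians` is hypothetical), not the real January medians.

```r
# Hypothetical stand-in for monthly_medians$median_rain: 165 simulated values
set.seed(1)
years <- 1850:2014
toy_medians <- runif(length(years), min = 20, max = 200)

# Plotting against the year lets R choose readable tick marks by itself
plot(years, toy_medians, type = "b", cex = 0.6,
     xlab = "Year", ylab = "Median rainfall",
     main = "Median Rainfall for January 1850-2014")
```

With the real data, `monthly_medians$Year` could take the place of `years`, assuming the filtered data frame still holds one row per year.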
library(leaflet)
leaflet(data=stations,height=430,width=390) %>% addTiles %>%
setView(-8,53.5,6) %>% addCircleMarkers
## Assuming "Long" and "Lat" are longitude and latitude, respectively
Leaflet is an open-source JavaScript library for interactive maps; the ‘leaflet’ R package provides an interface to it.
The ‘leaflet’ function creates a Leaflet map widget using the ‘htmlwidgets’ framework. This widget can be rendered on HTML pages generated from R Markdown. The data we want to represent in the map are the geographical locations of the weather stations across Ireland (data=stations).
‘setView’ sets the centre and zoom level of the map. Its arguments are, in order, the longitude (-8) and latitude (53.5) of the centre, and the zoom level (6). ‘addCircleMarkers’ then draws one circle per station, taking the positions from the ‘Long’ and ‘Lat’ columns, as the message above indicates.
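Writing the arguments by name makes the role of each number explicit and removes the need for leaflet to guess the coordinate columns. The sketch below assumes a hypothetical two-row table (`toy_stations`, invented coordinates) with the same `Lat` and `Long` column names as the stations data frame.

```r
library(leaflet)

# Hypothetical stations; column names follow the stations data frame
toy_stations <- data.frame(
  Station = c("A", "B"),
  Lat     = c(53.6, 52.3),
  Long    = c(-6.9, -6.8)
)

toy_map <- leaflet(data = toy_stations) %>%
  addTiles() %>%
  # Centre of the view (longitude, latitude) and zoom level
  setView(lng = -8, lat = 53.5, zoom = 6) %>%
  # Formulas name the coordinate columns explicitly
  addCircleMarkers(lng = ~Long, lat = ~Lat)

toy_map  # prints the widget in an interactive session or R Markdown
```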
leaflet(data=stations,height=430,width=390) %>%
addProviderTiles('CartoDB.Positron') %>%
setView(-8,53.5,6) %>% addCircleMarkers
## Assuming "Long" and "Lat" are longitude and latitude, respectively
CartoDB is a cloud computing platform that provides GIS and web mapping tools for display in a web browser. Positron is its light, grey-scale basemap, which is unobtrusive and therefore well suited to data visualization.
library(dplyr)
load('rainfall.RData')
rain %>% group_by(Station) %>% summarise(median_rain=median(Rainfall)) -> rain_summary
rain_summary %>% left_join(stations) -> station_medians
## Joining, by = "Station"
‘left_join’ is used here to join the median rainfall values to the station details, matching rows on the shared ‘Station’ column. The result, ‘station_medians’, holds the ‘Lat’ and ‘Long’ of each station alongside its median rainfall.
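A left join keeps every row of the left-hand table and fills in `NA` where the right-hand table has no match. The sketch below shows this on two hypothetical tables (`toy_summary` and `toy_stations`, invented values) and states the key with `by`, which also silences the "Joining, by" message.

```r
library(dplyr)

# Hypothetical medians for three stations
toy_summary <- data.frame(
  Station     = c("A", "B", "C"),
  median_rain = c(80, 60, 120)
)

# Hypothetical locations; station C is deliberately missing
toy_stations <- data.frame(
  Station = c("A", "B"),
  Lat     = c(53.6, 52.3),
  Long    = c(-6.9, -6.8)
)

# All three rows of toy_summary are kept; C gets NA coordinates
toy_joined <- left_join(toy_summary, toy_stations, by = "Station")
toy_joined
```

Checking the joined table for `NA` coordinates is a quick way to spot station names that are spelled differently in the two tables.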
head(station_medians, n=10)
## # A tibble: 10 x 10
## Station median_rain Elevation Easting Northing Lat Long County
## <chr> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 Ardara 132. 15 180788. 394679. 54.8 -8.29 Doneg~
## 2 Armagh 65.4 62 287831. 345772. 54.4 -6.64 Armagh
## 3 Athboy 70.7 87 270400 261700 53.6 -6.93 Meath
## 4 Belfast 82.1 115 329623. 363141. 54.5 -5.99 Antrim
## 5 Birr 66.2 73 208017. 203400. 53.1 -7.88 Offaly
## 6 Cappoq~ 113. 76 213269. 104800. 52.2 -7.8 Water~
## 7 Cork A~ 90 154 167336. 64387. 51.8 -8.47 Cork
## 8 Derry 80.5 37 226324. 416014. 55.0 -7.58 Derry
## 9 Drumsna 80.2 45 200000 295800 53.9 -8 Leitr~
## 10 Dublin~ 56 71 319767. 240361. 53.4 -6.2 Dublin
## # ... with 2 more variables: Abbreviation <chr>, Source <chr>
The ‘head’ function returns the first rows (here, the first 10) of ‘station_medians’, which combines the ‘Lat’ and ‘Long’ information with the median rainfall data.
color_mapping <- colorNumeric('PuBuGn',station_medians$median_rain)
previewColors(color_mapping,fivenum(station_medians$median_rain))
[Colour preview: five swatches from the ‘color_mapping’ palette, one for each value of fivenum(station_medians$median_rain): 56, 73, 82.1, 91.9 and 131.5]
leaflet(data=station_medians,height=430,width=600) %>%
addProviderTiles('CartoDB.Positron') %>%
setView(-8,53.5,6) %>%
addCircleMarkers(fillColor=~color_mapping(median_rain),weight=0,fillOpacity = 0.75) %>%
addLegend(pal=color_mapping,values=~median_rain,title="Rainfall(median)",position='bottomleft')
## Assuming "Long" and "Lat" are longitude and latitude, respectively
A sequential palette (PuBuGn, a gradient from purple through blue to green) was chosen because median rainfall is ordered data that progresses from low to high. ‘colorNumeric’ builds a function that maps any value within the range of ‘station_medians$median_rain’ to a colour in that palette; ‘station_medians’ is the joined data frame created above.
The ‘previewColors’ function gives a preview of the selected palette, here at the five-number summary (minimum, lower hinge, median, upper hinge, maximum) returned by ‘fivenum’.
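The object returned by `colorNumeric` is itself a function, which can be called directly to see which colour a value receives. The sketch below applies it to the five summary values shown in the preview; the names `toy_values`, `toy_pal` and `toy_cols` are introduced only for this illustration.

```r
library(leaflet)

# The five-number summary reported in the colour preview
toy_values <- c(56, 73, 82.1, 91.9, 131.5)

# Palette function: maps numbers in the domain to hex colour strings
toy_pal <- colorNumeric("PuBuGn", domain = toy_values)

# One colour per value, from the light end to the dark end of the palette
toy_cols <- toy_pal(toy_values)
toy_cols
```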
The ‘leaflet’ function again creates the map widget, this time with the joined data (data=station_medians), so that the median rainfall is available for each marker.
The ‘CartoDB.Positron’ basemap and the ‘setView’ call (centre at longitude -8, latitude 53.5, zoom level 6) are the same as in the previous map.
The function ‘addCircleMarkers’ adds one circle per station to the map widget. ‘fillColor’ passes each station's median rainfall through ‘color_mapping’, ‘weight=0’ removes the circle outline, and ‘fillOpacity=0.75’ makes the fill slightly transparent so that the basemap remains visible.
The ‘addLegend’ function adds a legend to the map, using the same palette and values; its ‘title’ and ‘position’ can be chosen.
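The map above is coloured by ‘station_medians’, which comes from the all-month summary. Since the stated objective is the January median, one possible variant is sketched below: filter to January, summarise by station, join the locations and map the result. All data here are hypothetical (`toy_rain` and `toy_stations` are invented), so this is an outline of the approach rather than the analysis itself.

```r
library(dplyr)
library(leaflet)

# Hypothetical rainfall records and station locations
toy_rain <- data.frame(
  Month    = c("Jan", "Jan", "Jan", "Jul", "Jan", "Jan", "Jan", "Jul"),
  Station  = rep(c("A", "B"), each = 4),
  Rainfall = c(100, 120, 140, 10, 60, 70, 80, 10)
)
toy_stations <- data.frame(
  Station = c("A", "B"),
  Lat     = c(54.8, 53.4),
  Long    = c(-8.3, -6.2)
)

# January median per station, with coordinates attached
jan_medians <- toy_rain %>%
  filter(Month == "Jan") %>%
  group_by(Station) %>%
  summarise(median_rain = median(Rainfall)) %>%
  left_join(toy_stations, by = "Station")

# Palette built on the January medians only
jan_pal <- colorNumeric("PuBuGn", jan_medians$median_rain)

jan_map <- leaflet(data = jan_medians) %>%
  addProviderTiles("CartoDB.Positron") %>%
  setView(lng = -8, lat = 53.5, zoom = 6) %>%
  addCircleMarkers(lng = ~Long, lat = ~Lat,
                   fillColor = ~jan_pal(median_rain),
                   weight = 0, fillOpacity = 0.75) %>%
  addLegend(pal = jan_pal, values = ~median_rain,
            title = "January rainfall (median)", position = "bottomleft")

jan_map
```

With the real data, `rain` and `stations` would replace the two toy tables.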
rain %>% arrange(Year,Month) -> rain2
head(rain2,n=4)
## # A tibble: 4 x 4
## Year Month Rainfall Station
## <dbl> <fct> <dbl> <chr>
## 1 1850 Jan 169 Ardara
## 2 1850 Jan 96.9 Derry
## 3 1850 Jan 108. Malin Head
## 4 1850 Jan 92.5 Armagh
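# Draw a month plot of the rainfall series for one station
local_monthplot <- function(station,raindata) {
  # Keep only the records for the requested station
  raindata %>% filter(Station == station) -> local_rd
  # Monthly time series starting in January 1850 (assumes rows sorted by Year, Month)
  local_rd$Rainfall %>% ts(frequency=12,start=1850) -> rain_ts
  rain_ts %>% monthplot(col='dodgerblue',col.base='indianred',lwd.base=3)
}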
local_monthplot('Derry',rain2)
png('test.png',width=400,height=300)
local_monthplot('Derry',rain2)
dev.off()
## png
## 2
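The reason for sorting with `arrange(Year,Month)` before building the time series is that `ts` assigns dates purely by position: the first value becomes January 1850, the second February 1850, and so on. The sketch below shows this on 36 simulated values (`toy_values` is hypothetical) and draws the corresponding month plot, in which each month's values are plotted as their own small series with a horizontal line at the monthly mean.

```r
# Hypothetical monthly values for three years, already in date order
set.seed(42)
toy_values <- runif(36, min = 20, max = 200)

# Position decides the date: value 1 is Jan 1850, value 13 is Jan 1851
toy_ts <- ts(toy_values, frequency = 12, start = c(1850, 1))

# One sub-series per calendar month
monthplot(toy_ts, col = "dodgerblue", col.base = "indianred", lwd.base = 3)
```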
library(leaflet)
stations %>% mutate(Filename=paste0('mp ',Station,'.png')) -> files
files %>% select(Station,Filename) %>% head
## # A tibble: 6 x 2
## Station Filename
## <chr> <chr>
## 1 Athboy mp Athboy.png
## 2 Foulksmills mp Foulksmills.png
## 3 Mullingar mp Mullingar.png
## 4 Portlaw mp Portlaw.png
## 5 Rathdrum mp Rathdrum.png
## 6 Strokestown mp Strokestown.png
# Write one month plot per station to its own PNG file
for (i in seq_len(nrow(files)))
  with(files, {
    png(Filename[i],width=400,height=300)
    local_monthplot(Station[i],rain2)
    dev.off()} )
Analysing the graphs and maps obtained, April, May and June stand out on the heatmap, and the stations that stand out most in these months are Foulksmills, Roches Point, Portlaw and Ardara. These months should not, however, be read as the wettest: the monthly medians computed above give 62.5, 65.5 and 65.2 for April, May and June against 105 for January, and ‘heatmap’ scales each row by default, so its colours compare the months within each station rather than showing absolute rainfall.
The graph of median January rainfall for 1850-2014 shows a peak of precipitation between 1940 and 1958. Rainfall oscillates from year to year but, in general, a consistent pattern is maintained.
Finally, the colour map shows that three stations recorded the highest median rainfall over the period 1850-2014; these are featured in dark green. Note that the map is coloured by the median over all months, because ‘station_medians’ is built from ‘rain_summary’ rather than from ‘Median_rain_Jan’. The highest values appear in the South (for example Cappoquinn, 113) and at one isolated spot in the Northwest (Ardara, 132), whereas the eastern stations in the table above, such as the Dublin station (56, the lowest median) and Athboy (70.7), are among the driest.