Historical climate data sets are being transcribed from paper into digital form so that they can be analysed. This matters in the context of climate change because long data series help to determine whether a climate event has a precedent in the record. In the Irish case, a rainfall series has been transcribed for twenty-five weather stations and covers the period 1850 to 2014. The data is readily available and can be used to investigate whether current rainfall extremes or droughts are unprecedented. This exercise begins with an initial exploration of the rainfall data, using plots to visualise trends, and then creates a map of the twenty-five weather stations in Ireland, colour coding the symbol for each station according to its median January rainfall. The following code was used to conduct this exercise.
The knitr library is loaded first and the working directory is set to the folder where the rainfall data is saved. The rainfall data, which covers twenty-five stations in Ireland from 1850 to 2014, is then loaded using the load function.
library(knitr)
#set working directory to where rainfall data is stored
setwd("~/College/Msc Climate Change/NCG602A/Chris")
dir()
## [1] "Climate Change assignment.Rmd"
## [2] "Climate_Change_assignment.html"
## [3] "Climate_Change_assignment.Rmd"
## [4] "maps.RData"
## [5] "NCG assignment with all code including pop ups.R"
## [6] "NCG602A Assignment 2.R"
## [7] "NCG602A_Assignment_2.html"
## [8] "Output graphs"
## [9] "Rainfall.RData"
## [10] "STL - A Seasonal-Trend Decomposition Procedure Based on Loess..pdf"
## [11] "test.png"
## [12] "Thumbs.db"
load("Rainfall.RData")
Once the rainfall data is loaded, two data frames are available, named rain and stations. The head function is then used to view the first six rows of rain, which contains rainfall records for each station over the period 1850 to 2014.
head(rain)
## Year Month Rainfall Station
## 1 1850 Jan 169.0 Ardara
## 2 1851 Jan 236.4 Ardara
## 3 1852 Jan 249.7 Ardara
## 4 1853 Jan 209.1 Ardara
## 5 1854 Jan 188.5 Ardara
## 6 1855 Jan 32.3 Ardara
The head function is then used to view the first six rows of the stations data frame, which contains information about the weather stations, in particular their location.
head(stations)
## Station Elevation Easting Northing Lat Long County
## 1 Athboy 87 270400 261700 53.60 -6.93 Meath
## 2 Foulksmills 71 284100 118400 52.30 -6.77 Wexford
## 3 Mullingar 112 241780 247765 53.47 -7.37 Westmeath
## 4 Portlaw 8 246600 115200 52.28 -7.31 Waterford
## 5 Rathdrum 131 319700 186000 52.91 -6.22 Wicklow
## 6 Strokestown 49 194500 279100 53.75 -8.10 Roscommon
## Abbreviation Source
## 1 AB Met Eireann
## 2 F Met Eireann
## 3 M Met Eireann
## 4 P Met Eireann
## 5 RD Met Eireann
## 6 S Met Eireann
The dplyr package is then loaded. This package allows for easier manipulation of the data using a pipeline style of notation. The pipeline operator is ‘%>%’: the value on its left hand side is passed as the first argument to the function on its right hand side. It is useful when several functions are applied in succession.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
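As a small illustration of the pipeline operator, the sketch below compares a nested call with the equivalent piped call. The vector `x` is a made-up example and is not part of the rainfall data.

```r
library(dplyr)  # dplyr makes the %>% operator available

# Hypothetical rainfall values in mm, for illustration only
x <- c(169.0, 236.4, 249.7, 209.1, 188.5, 32.3)

# Nested call: read from the inside out
nested <- round(median(x), 1)

# Piped call: read from left to right
piped <- x %>% median() %>% round(1)

# Both routes give the same value
identical(nested, piped)
```

The piped form becomes more readable as the number of steps grows, which is why it is used for the grouped summaries in this exercise.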
To find the median rainfall for each month, the group_by and summarise functions were used. The group_by function earmarks the variables by which a data frame is grouped, and summarise applies a summary function to each group and returns a new data frame. Here rain is grouped by month and then summarised by median rainfall over the entire period and across all stations; the output is a new data frame named rain_months. The head function is then used to view the first six rows of rain_months. The output shows that it has two columns, month and median rainfall.
rain %>% group_by(Month) %>% summarise(mrain=median(Rainfall)) -> rain_months
head(rain_months)
## # A tibble: 6 x 2
## Month mrain
## <fct> <dbl>
## 1 Jan 105.
## 2 Feb 74.4
## 3 Mar 73.1
## 4 Apr 62.5
## 5 May 65.5
## 6 Jun 65.2
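The grouped summary can be checked on a small example. The sketch below uses a hypothetical data frame, `toy_rain`, that mimics the Month and Rainfall columns of rain, and compares the dplyr result with base R's tapply function.

```r
library(dplyr)

# Hypothetical toy data mimicking two columns of `rain`
toy_rain <- data.frame(
  Month    = factor(c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb"),
                    levels = c("Jan", "Feb")),
  Rainfall = c(100, 120, 80, 60, 90, 75)
)

# dplyr: one row per month holding the median of that month's values
toy_rain %>%
  group_by(Month) %>%
  summarise(mrain = median(Rainfall)) -> toy_months

# Base R cross-check: tapply computes the same grouped medians
base_medians <- tapply(toy_rain$Rainfall, toy_rain$Month, median)

toy_months
base_medians
```

Both approaches return one median per month; the dplyr version returns a data frame, which is convenient for the later plotting and joining steps.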
An initial exploration of the data is then conducted using the new data frame rain_months: the median rainfall for each month is plotted on a bar chart using the barplot function. It is evident from the chart that January has a high median rainfall in comparison to the other months over the period 1850 to 2014.
barplot(rain_months$mrain,names=rain_months$Month,las=3,col='dodgerblue')
The group_by function is used again to create a new data frame, in which rain is grouped by month and station and then summarised by median rainfall. The new data frame is named Median_rain_month_station. The pipeline operator is useful here as several functions are applied in succession. The new data frame can then be explored using the head function, and the output shows that it consists of three columns: month, station and median rainfall.
rain %>% group_by(Month,Station) %>%
summarise(median_rain=median(Rainfall)) -> Median_rain_month_station
head(Median_rain_month_station)
## # A tibble: 6 x 3
## # Groups: Month [1]
## Month Station median_rain
## <fct> <chr> <dbl>
## 1 Jan Ardara 172.
## 2 Jan Armagh 75
## 3 Jan Athboy 87.1
## 4 Jan Belfast 102.
## 5 Jan Birr 77.5
## 6 Jan Cappoquinn 147.
However, the new data frame Median_rain_month_station contains medians for every month of the year over the period 1850 to 2014. The purpose of this exercise is to create a map of median rainfall for January only. Therefore, the filter function is used so that January is the only month included in the analysis. The rain data is first grouped by month and station, the filter then keeps only the January rows, and the remaining data is summarised by median rainfall. The new data frame is called Median_rain_Jan and the head function is used to inspect it. The output shows that the data frame contains three columns: month, which only includes January, station and median rainfall.
rain %>% group_by(Month,Station) %>%
filter(Month=='Jan') %>%
summarise(median_rain=median(Rainfall)) -> Median_rain_Jan
head(Median_rain_Jan)
## # A tibble: 6 x 3
## # Groups: Month [1]
## Month Station median_rain
## <fct> <chr> <dbl>
## 1 Jan Ardara 172.
## 2 Jan Armagh 75
## 3 Jan Athboy 87.1
## 4 Jan Belfast 102.
## 5 Jan Birr 77.5
## 6 Jan Cappoquinn 147.
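Because filter keeps the rows that match its condition, the January medians can be obtained either by filtering before summarising or by filtering the already summarised data frame. The sketch below illustrates this on a hypothetical data frame, `toy_rain`, with made-up station names and values.

```r
library(dplyr)

# Hypothetical toy data mimicking the columns of `rain`
toy_rain <- data.frame(
  Month    = c("Jan", "Jan", "Feb", "Feb", "Jan", "Jan"),
  Station  = c("A", "A", "A", "A", "B", "B"),
  Rainfall = c(100, 140, 60, 80, 50, 70),
  stringsAsFactors = FALSE
)

# Route 1: keep the January rows first, then summarise per station
toy_rain %>%
  filter(Month == "Jan") %>%
  group_by(Month, Station) %>%
  summarise(median_rain = median(Rainfall)) %>%
  ungroup() -> jan_first

# Route 2: summarise every month, then keep the January rows
toy_rain %>%
  group_by(Month, Station) %>%
  summarise(median_rain = median(Rainfall)) %>%
  ungroup() %>%
  filter(Month == "Jan") -> jan_last

jan_first
jan_last
```

Filtering first means that less data has to be summarised, although with a data set of this size the difference is unlikely to be noticeable.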
A new data frame was created using the group_by function, in which rain is grouped by year and month, summarised by median rainfall across stations and then filtered so that only January remains. The new data frame was named monthly_medians. It was then used to produce a plot of median January rainfall for each year from 1850 to 2014. This plot highlights the variation in median rainfall for January: some years have extremely low median rainfall while others have extremely high median rainfall.
rain %>% group_by(Year,Month) %>% summarise(median_rain=median(Rainfall)) %>% filter(Month=='Jan') -> monthly_medians
head(monthly_medians)
## # A tibble: 6 x 3
## # Groups: Year [6]
## Year Month median_rain
## <dbl> <fct> <dbl>
## 1 1850 Jan 108.
## 2 1851 Jan 156.
## 3 1852 Jan 157.
## 4 1853 Jan 142.
## 5 1854 Jan 120.
## 6 1855 Jan 13.1
#plot January medians without default axes, then add year labels and a title
plot(monthly_medians$median_rain[1:165],type='b',ann=FALSE,axes=FALSE,cex=0.6)
axis(1,at=1:165,labels=1850:2014); axis(2)
title('Median Rainfall for January 1850-2014')
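The years with the most extreme January medians can also be picked out numerically rather than read from the plot. The sketch below is a minimal illustration on a hypothetical data frame, `toy_medians`, whose values are the rounded figures shown in the head output above; the same calls could be applied to monthly_medians after ungrouping it.

```r
library(dplyr)

# Illustrative January medians (mm), rounded as printed by head() above
toy_medians <- data.frame(
  Year        = 1850:1855,
  median_rain = c(108, 156, 157, 142, 120, 13.1)
)

# Row with the lowest and row with the highest January median
driest  <- toy_medians %>% slice(which.min(median_rain))
wettest <- toy_medians %>% slice(which.max(median_rain))

driest
wettest
```

In this six-year extract the driest January is 1855 and the wettest is 1852.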
To explore this further, a map of Ireland is created showing the locations of the weather stations. To create the map the leaflet library must be loaded, as it can be used to create interactive maps. The leaflet function creates the map from the stations data, addTiles adds the default OpenStreetMap background tiles and addCircleMarkers marks the geographical location of each station. For the purpose of this exercise the colour orange was chosen for the markers. It is clear from the map that the stations are unevenly distributed, with a large cluster situated in the south east.
library(leaflet)
leaflet(data=stations,height=430,width=390) %>% addTiles %>%
setView(-8,53.5,6) %>% addCircleMarkers(fillColor='darkorange',stroke=FALSE,fillOpacity = 0.99)
## Assuming "Long" and "Lat" are longitude and latitude, respectively
However, the aim of this exercise is to map median rainfall for January at the weather station locations. So far the focus has been on the rain data frame, which has a column called Station stating the station at which each observation was recorded. The other data frame, stations, contains geographical information about the stations, such as elevation, easting and northing. To investigate the relationship between median rainfall for January and location, the co-ordinates of the weather stations are required. The rain data frame does not hold this information, but it does record the station name, while the stations data frame records the co-ordinates of each station. The two data frames therefore need to be joined so that the result contains the information needed for mapping. The left_join function can be used for this. It takes two data frames as arguments, matches their rows on a variable common to both (here Station) and keeps every row of the first data frame, adding the columns of the second. For the purpose of this exercise the data frame Median_rain_Jan is joined with the stations data frame and the new data frame is called station_medians. The head function is then used to inspect the new data frame, and the output shows that the median rainfall for January has been merged with the station data.
Median_rain_Jan %>% left_join(stations) -> station_medians
## Joining, by = "Station"
head(station_medians, n=10)
## # A tibble: 10 x 11
## # Groups: Month [1]
## Month Station median_rain Elevation Easting Northing Lat Long County
## <fct> <chr> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <chr>
## 1 Jan Ardara 172. 15 180788. 394679. 54.8 -8.29 Doneg~
## 2 Jan Armagh 75 62 287831. 345772. 54.4 -6.64 Armagh
## 3 Jan Athboy 87.1 87 270400 261700 53.6 -6.93 Meath
## 4 Jan Belfast 102. 115 329623. 363141. 54.5 -5.99 Antrim
## 5 Jan Birr 77.5 73 208017. 203400. 53.1 -7.88 Offaly
## 6 Jan Cappoq~ 147. 76 213269. 104800. 52.2 -7.8 Water~
## 7 Jan Cork A~ 135. 154 167336. 64387. 51.8 -8.47 Cork
## 8 Jan Derry 97.3 37 226324. 416014. 55.0 -7.58 Derry
## 9 Jan Drumsna 99.1 45 200000 295800 53.9 -8 Leitr~
## 10 Jan Dublin~ 63 71 319767. 240361. 53.4 -6.2 Dublin
## # ... with 2 more variables: Abbreviation <chr>, Source <chr>
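The behaviour of left_join can be seen on a small example. The sketch below uses two hypothetical data frames, `toy_medians` and `toy_stations`, with made-up station names, and names the key column explicitly with the `by` argument rather than relying on automatic matching.

```r
library(dplyr)

# Hypothetical January medians per station (mm)
toy_medians <- data.frame(
  Station     = c("A", "B", "C"),
  median_rain = c(172, 75, 87.1),
  stringsAsFactors = FALSE
)

# Hypothetical station metadata; station "C" is deliberately missing
toy_stations <- data.frame(
  Station = c("A", "B"),
  Lat     = c(54.8, 54.4),
  Long    = c(-8.29, -6.64),
  stringsAsFactors = FALSE
)

# Naming the key with `by` makes the join column explicit
toy_joined <- toy_medians %>% left_join(toy_stations, by = "Station")

# Every row of the left table is kept; unmatched stations get NA coordinates
toy_joined
```

A station that is missing from the metadata would therefore appear with NA co-ordinates, so checking for NA values after a join is a simple way to confirm that every station was matched.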
A colour palette for mapping median rainfall is then created with the colorNumeric function and stored as color_fun. For the purpose of this exercise different shades of red were chosen to represent median rainfall. The lighter the colour, the lower the median rainfall at the station location; the darker the red, the higher the median rainfall. The previewColors function displays the colours assigned to the five-number summary of the January medians.
color_fun <- colorNumeric('Reds',station_medians$median_rain)
previewColors(color_fun,fivenum(station_medians$median_rain))
## [colour preview of color_fun for fivenum(station_medians$median_rain): 63, 92.9, 105.7, 126.4, 177.7]
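To show what a palette function of this kind returns, the sketch below builds one from the five-number summary values listed above and applies it to the same values. The names `toy_values`, `toy_pal` and `toy_cols` are introduced here for illustration only.

```r
library(leaflet)

# Five-number summary of the January medians (mm), as listed above
toy_values <- c(63, 92.9, 105.7, 126.4, 177.7)

# colorNumeric() returns a function that maps numbers in the domain to colours
toy_pal <- colorNumeric("Reds", domain = toy_values)

# Calling the palette function gives one hex colour string per input value
toy_cols <- toy_pal(toy_values)
toy_cols
```

The result is a character vector of hex colour codes, running from a light shade for the lowest value to a dark red for the highest, which is the form that the fillColor argument of addCircleMarkers accepts.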
A map is then created using the leaflet function. The CartoDB.Positron tile was chosen because its grey background allows the coloured circles to stand out, although the background can be changed by using a different provider tile. The previously created color_fun function is used to produce a range of reds that represents the variation in median rainfall at the weather station locations, and a legend positioned in the bottom right allows the circle colours to be interpreted. The map illustrates the range of median January rainfall for the twenty-five stations in Ireland over the period 1850 to 2014. The darker the circle, the higher the median rainfall at the station; the lighter the circle, the lower the median rainfall. The range of circle colours on the map suggests that the south west has the highest median rainfall. The south east has medium to high median rainfall, while the midlands have relatively low median rainfall.
leaflet(data=station_medians,height=430,width=600) %>% addProviderTiles('CartoDB.Positron') %>%
setView(-8,53.5,6) %>% addCircleMarkers(fillColor=~color_fun(median_rain),weight=0,fillOpacity = 0.99) %>%
addLegend(pal=color_fun,values=~median_rain,title="Jan Median Rainfall (mm)",position='bottomright')
## Assuming "Long" and "Lat" are longitude and latitude, respectively
To investigate this further, an interactive map is created using similar code to the previous map but with a popup, so that the station name appears when a circle is selected. The stations with the darkest circles are Killarney, Valentia and Ardara, which suggests that these stations have the highest median rainfall. In contrast, the lightest circles are situated at Dublin Airport, Armagh and Birr, which suggests that these stations have the lowest median rainfall. There appears to be an overall trend that median rainfall for January is slightly greater in the west of Ireland than in the east, with the highest median rainfall in the south west.
leaflet(data=station_medians,height=430,width=600) %>% addProviderTiles('CartoDB.Positron') %>%
setView(-8,53.5,6) %>% addCircleMarkers(fillColor=~color_fun(median_rain),weight=0,fillOpacity = 0.99, popup=~Station) %>%
addLegend(pal=color_fun,values=~median_rain,title="Jan Median Rainfall (mm)",position='bottomright')
## Assuming "Long" and "Lat" are longitude and latitude, respectively
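The popup is not limited to the station name. As a minimal sketch, the code below builds a popup string that also reports the median value, using a hypothetical two-row data frame, `toy_medians`, with illustrative values; the column name `popup_text` is an assumption introduced here.

```r
# Hypothetical two-station extract with the columns used for the popup
toy_medians <- data.frame(
  Station     = c("Ardara", "Armagh"),
  median_rain = c(172.4, 75.0),
  stringsAsFactors = FALSE
)

# Build one popup string per station; leaflet renders popup text as HTML
toy_medians$popup_text <- paste0(
  "<b>", toy_medians$Station, "</b><br>",
  "Jan median: ", round(toy_medians$median_rain, 1), " mm"
)

toy_medians$popup_text
```

A column built in this way could be passed to addCircleMarkers as popup=~popup_text in place of popup=~Station.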
A variety of patterns were observed in this exercise. It was clear from the bar chart that January had high median rainfall in comparison to other months. This was investigated further by creating a graph of median rainfall for January from 1850 to 2014. From this graph it was evident that median rainfall varied from year to year: some years experienced extremely high median rainfall, while others experienced extremely low median rainfall. A map was then produced to highlight the locations of the stations from which the data was gathered. Further exploration was conducted by colour coding the circle for each station according to its median rainfall. The south west appeared to have the highest median rainfall, followed by the south east and north west. The midlands appeared to have the lowest median rainfall. Popups were then added to the circles on the map, so that the station name appeared when a circle was selected. This highlighted that Killarney had the highest median rainfall and Dublin Airport had the lowest. Overall, it appears that the south west has the highest median rainfall values.