Dr. J. Kavanagh
2022-09-12
We are going to use the Rainfall.RData data file for this session.
This data is a collection of rainfall levels captured across Ireland from 1850 to 2014. The data is in two parts: stations and rain. This is a truncated version of a longer lesson plan by Prof. Chris Brunsdon at the National Centre for Geocomputation.
## # A tibble: 6 × 9
## Station Elevation Easting Northing Lat Long County Abbreviation Source
## <chr> <int> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
## 1 Athboy 87 270400 261700 53.6 -6.93 Meath AB Met E…
## 2 Foulksmills 71 284100 118400 52.3 -6.77 Wexford F Met E…
## 3 Mullingar 112 241780 247765 53.5 -7.37 Westme… M Met E…
## 4 Portlaw 8 246600 115200 52.3 -7.31 Waterf… P Met E…
## 5 Rathdrum 131 319700 186000 52.9 -6.22 Wicklow RD Met E…
## 6 Strokestown 49 194500 279100 53.8 -8.1 Roscom… S Met E…
## # A tibble: 6 × 4
## Year Month Rainfall Station
## <dbl> <ord> <dbl> <chr>
## 1 1850 Jan 169 Ardara
## 2 1851 Jan 236. Ardara
## 3 1852 Jan 250. Ardara
## 4 1853 Jan 209. Ardara
## 5 1854 Jan 188. Ardara
## 6 1855 Jan 32.3 Ardara
## # A tibble: 6 × 2
## Station mrain
## <chr> <dbl>
## 1 Ardara 140.
## 2 Armagh 68.3
## 3 Athboy 74.7
## 4 Belfast 87.1
## 5 Birr 70.8
## 6 Cappoquinn 121.
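The per-station means above come from a grouped summary. A minimal sketch of that pattern, assuming a toy tibble `rain_toy` (invented values, not the lesson data) with the same `Station` and `Rainfall` columns as `rain`:

```r
library(dplyr)

# Toy stand-in for the rain tibble; the values are invented for illustration.
rain_toy <- tibble(
  Station  = c("Ardara", "Ardara", "Birr", "Birr"),
  Rainfall = c(100, 200, 50, 70)
)

# One row per station, holding the mean of that station's rainfall readings.
station_means <- rain_toy %>%
  group_by(Station) %>%
  summarise(mrain = mean(Rainfall))

print(station_means)
```

Swapping `Station` for `Month` in `group_by()` gives the equivalent per-month summary.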
This is simpler than the earlier examples using lubridate, but the principle is the same.
## # A tibble: 6 × 2
## Month mrain
## <ord> <dbl>
## 1 Jan 113.
## 2 Feb 83.2
## 3 Mar 79.5
## 4 Apr 68.7
## 5 May 71.3
## 6 Jun 72.7
rain_months %>%
  ggplot(aes(x=Month,y=mrain)) +
  geom_bar(stat='identity') +
  labs(y='Mean Rainfall') +
  theme_economist()
## `summarise()` has grouped output by 'Month'. You can override using the
## `.groups` argument.
## # A tibble: 6 × 3
## # Groups: Month [1]
## Month Station mean_rain
## <ord> <chr> <dbl>
## 1 Jan Ardara 175.
## 2 Jan Armagh 74.6
## 3 Jan Athboy 84.9
## 4 Jan Belfast 101.
## 5 Jan Birr 79.9
## 6 Jan Cappoquinn 154.
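The `summarise()` message above appears because grouping by two variables leaves the result grouped by the first one. A small sketch, on invented toy values rather than the lesson data, of the same kind of summary with `.groups = "drop"`, which returns an ungrouped tibble and suppresses the message:

```r
library(dplyr)

# Toy long-format data; names mirror the lesson's columns, values are invented.
rain_toy <- tibble(
  Month    = factor(c("Jan", "Jan", "Jan", "Feb"),
                    levels = c("Jan", "Feb"), ordered = TRUE),
  Station  = c("Ardara", "Ardara", "Birr", "Birr"),
  Rainfall = c(100, 200, 50, 70)
)

# Mean rainfall for every Month/Station pair, with all grouping dropped afterwards.
month_station_means <- rain_toy %>%
  group_by(Month, Station) %>%
  summarise(mean_rain = mean(Rainfall), .groups = "drop")

print(month_station_means)
```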
Data typically comes in one of two formats: long and wide. At present the rain_season_station data frame is in long format, and we need to convert it to wide format. We can use the reshape2 package for this.
## Using mean_rain as value column: use value.var to override.
## Jan Feb Mar Apr May Jun Jul
## Ardara 174.82606 126.82303 123.02000 98.79333 96.90727 105.24061 123.70485
## Armagh 74.57242 55.97182 56.48879 53.67030 59.23182 62.72939 72.50636
## Athboy 84.94759 62.62133 62.44944 58.97874 62.16260 68.11460 76.28662
## Belfast 101.20718 74.50206 73.10221 65.90492 69.23426 74.48525 87.70003
## Birr 79.92074 57.88501 58.42056 54.07187 60.25831 61.96445 75.10084
## Cappoquinn 153.97159 117.77099 110.02890 94.00365 95.81437 94.86357 104.09372
## Aug Sep Oct Nov Dec
## Ardara 145.24788 152.80727 174.44788 176.45030 186.14182
## Armagh 81.92182 69.02576 80.94242 73.73121 79.05939
## Athboy 88.84710 76.35077 88.14627 80.95767 87.05997
## Belfast 102.44499 87.11622 106.35394 100.31760 102.95073
## Birr 86.75669 72.21722 83.80810 77.87266 81.74334
## Cappoquinn 125.42245 116.04466 146.70658 141.10012 154.87402
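A long-to-wide conversion of this kind can be sketched with `acast()` from reshape2. The data frame `long_toy` below is a hypothetical stand-in with invented values. Naming `value.var` explicitly also avoids the "Using mean_rain as value column" message.

```r
library(reshape2)

# Toy long-format data: one row per Station/Month pair (invented values).
long_toy <- data.frame(
  Month     = factor(c("Jan", "Feb", "Jan", "Feb"), levels = c("Jan", "Feb")),
  Station   = c("Ardara", "Ardara", "Birr", "Birr"),
  mean_rain = c(175, 127, 80, 58)
)

# Stations become rows and months become columns; the result is a numeric matrix.
wide_toy <- acast(long_toy, Station ~ Month, value.var = "mean_rain")

print(wide_toy)
```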
A heatmap is a useful visualisation for exploring how values vary across two dimensions, here station and month. The colour scheme is straightforward: the darkest reds are the highest values and the lightest yellows are the lowest.
## Using mean_rain as value column: use value.var to override.
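The lesson does not show which heatmap function was used, so the following is only a sketch using base R's `heatmap()` on a small invented matrix. The row and column dendrograms are switched off and the scaling is disabled so that colours map directly to the raw values.

```r
# Toy station-by-month matrix (invented values), shaped like the wide table above.
wide_toy <- matrix(
  c(175, 127, 123,
     80,  58,  58),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("Ardara", "Birr"), c("Jan", "Feb", "Mar"))
)

# Rowv = NA and Colv = NA suppress clustering, so rows and columns are not
# reordered by similarity. The yellow-to-red palette is reversed so that
# low values are light yellow and high values are dark red.
hm <- heatmap(
  wide_toy,
  Rowv = NA, Colv = NA,
  scale = "none",
  col = hcl.colors(12, "YlOrRd", rev = TRUE)
)
```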
The rainfall data is clearly a time series. R provides a dedicated object class for such data, called ts, which is designed for use with specific packages such as dygraphs. The majority of financial data is also best explored in the ts format. We will be exploring stock market data at a later stage using live data from Yahoo! Finance.
# This creates a ts object from the rain data, summarised as the total rainfall across all stations for each year and month.
rain %>% group_by(Year,Month) %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),frequency=12) -> rain_ts
## `summarise()` has grouped output by 'Year'. You can override using the
## `.groups` argument.
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct
## 1870 2666.2 1975.3 1500.5 1024.8 1862.8 789.2 1038.6 1510.5 2045.5 5177.6
## 1871 3148.3 2343.7 1731.7 2654.5 657.6 2040.1 3705.0 1869.9 2083.4 2774.3
## Nov Dec
## 1870 1733.2 1902.2
## 1871 2000.1 1902.0
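A slice of a ts object like the one printed above can be extracted with `window()`. A minimal sketch on an invented two-year monthly series (the name `toy_ts` and its values are illustrative only):

```r
# 24 invented monthly values covering January 1850 to December 1851.
monthly_totals <- c(seq(10, 120, by = 10), seq(15, 125, by = 10))

# start = c(1850, 1) means year 1850, period 1 (January);
# frequency = 12 means twelve observations per year.
toy_ts <- ts(monthly_totals, start = c(1850, 1), frequency = 12)

# Keep only January to March 1851.
first_q_1851 <- window(toy_ts, start = c(1851, 1), end = c(1851, 3))

print(first_q_1851)
```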
First, select the station at Birr, Co. Offaly, and reuse the code that created the overall ts object.
rain %>% filter(Station=="Birr") %>% group_by(Year,Month) %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),frequency=12) -> birr_ts
## `summarise()` has grouped output by 'Year'. You can override using the
## `.groups` argument.
Second, follow this up with a comparison station, in this case Shannon Airport.
rain %>% filter(Station=="Shannon Airport") %>% group_by(Year,Month) %>%
  summarise(Rainfall=sum(Rainfall)) %>% ungroup() %>% transmute(Rainfall) %>%
  ts(start=c(1850,1),frequency=12) -> shannon_ts
## `summarise()` has grouped output by 'Year'. You can override using the
## `.groups` argument.
Finally, merge these three ts objects into one new object.
birr_shannon_natl_ts <- cbind(birr_ts,shannon_ts, rain_ts)
# Check your results!
window(birr_shannon_natl_ts,c(1850,1),c(1850,5))
## birr_ts shannon_ts rain_ts
## Jan 1850 93.3 85.2 2836.3
## Feb 1850 41.5 81.3 2158.9
## Mar 1850 10.3 27.1 964.1
## Apr 1850 81.7 87.2 3457.2
## May 1850 72.2 71.0 1492.1
A cross-comparison of rainfall at two stations and the national total across the entire dataset.
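The plotting code is not shown in this handout, so the following is a hedged sketch of how a combined ts object might be passed to `dygraph()` from the dygraphs package. The two series are short invented stand-ins, and the title and range selector are illustrative choices rather than the lesson's own settings.

```r
library(dygraphs)

# Two short invented monthly series starting in January 1850.
birr_toy    <- ts(c(93, 42, 10, 82, 72, 60), start = c(1850, 1), frequency = 12)
shannon_toy <- ts(c(85, 81, 27, 87, 71, 65), start = c(1850, 1), frequency = 12)

# cbind() on ts objects yields a multi-column time series, one column per station.
both_toy <- cbind(birr_toy, shannon_toy)

# dygraph() draws one line per column; dyRangeSelector() adds a zoom bar below.
rain_graph <- dyRangeSelector(
  dygraph(both_toy, main = "Monthly rainfall (toy values)")
)

rain_graph
```

The same pattern extends to the exercises below: build one ts per station or per summary statistic, bind them with `cbind()`, and pass the result to `dygraph()`.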
Create a dygraph of four distinct stations across Ireland showing the mean of rainfall per year. Each station should be within a specific province of Ireland.
Create a dygraph of the mean and median rainfall per year and compare it to the sum of rainfall.
We are going to be using the quantmod package, which is useful for exploring financial data from the stock exchange. Specifically we’re going to be looking up the information from Yahoo! Finance.
Create the start and end dates; in this case I’ve chosen a ten-year span from 2007 to 2017.
Next, get the relevant information for the publicly listed company of your choice. In my case, Apple Inc.
## [1] "AAPL"
This creates a new xts object called AAPL.
## AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted
## 1989-01-03 0.359375 0.361607 0.357143 0.360491 100016000 0.284788
## 1989-01-04 0.363839 0.376116 0.361607 0.375000 239948800 0.296250
## 1989-01-05 0.375000 0.386161 0.368304 0.377232 307328000 0.298013
## 1989-01-06 0.377232 0.388393 0.377232 0.380580 198665600 0.300658
## 1989-01-09 0.383929 0.385045 0.377232 0.383929 79307200 0.303304
## 1989-01-10 0.379464 0.382813 0.370536 0.380580 103320000 0.300658
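The download steps described above can be sketched with `getSymbols()` from quantmod. This is an illustration under assumptions, not the lesson's exact code: the variable names are hypothetical, the dates follow the 2007 to 2017 span mentioned in the text, and `auto.assign = FALSE` is used so the data is returned as a value instead of being assigned to an object named AAPL. The call needs a network connection, so it is wrapped in `tryCatch()` and yields `NULL` if the download fails.

```r
library(quantmod)

# Hypothetical date range matching the ten-year span mentioned in the text.
start_date <- as.Date("2007-01-01")
end_date   <- as.Date("2017-01-01")

# Request daily prices from Yahoo! Finance. With auto.assign = FALSE the xts
# object is returned directly; on any download error we fall back to NULL.
aapl <- tryCatch(
  getSymbols("AAPL", src = "yahoo",
             from = start_date, to = end_date,
             auto.assign = FALSE),
  error = function(e) NULL
)

# Show the first rows only if the download succeeded.
if (!is.null(aapl)) print(head(aapl))
```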