Multidimensional raster data

The data are collected over space and certain time intervals (hourly, daily, monthly, or yearly interval). Each layer is represented “time” of the collected data.

The common storage formats for multidimensional raster data are netCDF (.nc file) , GRIB, and HDF.

ERA5 data

Briefly, ERA5 is a reanalysis data that provides hourly estimates of a large number of atmospheric, land, and oceanic climate variables.

  • Gridded data

  • Global coverage with a spatial resolution of 0.1°× 0.1° (climate), 0.25°× 0.25° (atmosphere), 0.5°× 0.5° (ocean wave).

  • Data are available from 1940 to the present (hourly).

Working with ERA5 data

We need to register the account and log in to retrieve the data. Register here!

There are several options to get the data;

Download and clean the ERA5 data using R.

  1. Download ERA5 data using R

Install package install.package(c("foreach", "ecmwfr"))

library(foreach)
library(ecmwfr)

Set a key to a key chain. In this step, we need to set the API key to download the data. This information can be obtained from the user profile on the ERA5 webpage.

wf_set_key(user ="xxxx", key = "yyyy", service = "cds")

Get a command line request to provide the required details. ph is the folder to store the files.

ph <- "G:/My Drive/R program (spatial)/R-practice/era5data" 
vr <- c("2m_temperature")
yr <- 2018
mn <- sprintf("0%d",1:2)
dy <- c(sprintf("0%d",1:9), 10:31)
tm <- c(paste0("0",0:9,":00"),paste0(10:23,":00"))

Note for the sub-region extraction “area”, the order is (N, W, S, E). See BoundaryBox

In this example, we will retrieve temperature data for Ethiopia (E 32°59’00”--E 47°58’00”/N 14°53’00”--N 3°23’00”).

for (i in 1:length(yr)) {
  for (j in 1:length(mn)) {
    request <- list(
      "dataset_short_name" = "reanalysis-era5-land",
      "product_type" = "reanalysis",
      "variable" = vr[1],
      "year" = yr[i],
      "month" = mn[j],
      "day" = dy,
      "time" = tm,
      "area" = "15/31/3/48",
      "format" = "netcdf",
      "target" = paste0("era5_",yr[i],"_",mn[j],".nc")
    )
    
    file <- wf_request(
      user     = "116588",   # user ID (for authentification)
      request  = request,  # the request
      transfer = TRUE,     # download the file
      path     = ph       # store data in current working directory
    )
    
  }
}

2 Extract data for “Ethiopia”

Import all .nc files from the folder in which we stored the ERA5 data. Make a blank stack (using stack()) to store rasterlayer data and use a brick() command to read it.

library(raster)
library(sp)
library(rgdal)
setwd("G:/My Drive/R program (spatial)/R-practice/era5data")
listfile <- dir(pattern = "*.nc") 

d0 <- stack()

for (i in 1:length(listfile)) {
  
  #read data
  filename <- brick(listfile[i])
  
  d0 <- stack(d0,  filename)
  
  
}

nlayers(d0)
## [1] 1416

The next step is to extract the data (clip) within the boundary of Ethiopia. We need the boundary province of Ethiopia from the previous lesson.

adm1 <- raster::getData("GADM", country="ETH", level=1)
adm1
## class       : SpatialPolygonsDataFrame 
## features    : 11 
## extent      : 33.00154, 47.95823, 3.398823, 14.84548  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs 
## variables   : 10
## names       : GID_0,   NAME_0,   GID_1,      NAME_1,          VARNAME_1, NL_NAME_1,    TYPE_1, ENGTYPE_1, CC_1, HASC_1 
## min values  :   ETH, Ethiopia, ETH.1_1, Addis Abeba,                   ,        NA, Astedader,      City,   01,  ET.AA 
## max values  :   ETH, Ethiopia, ETH.9_1,      Tigray, Tegré|Tigrai|Tigre,        NA,     Kilil,     State,   15,  ET.TI

Extract and average temperature by each boundary province of Ethiopia.

temp = raster::extract(d0, adm1, method="simple", fun=mean, sp=T, na.rm =T)
temp
## class       : SpatialPolygonsDataFrame 
## features    : 11 
## extent      : 33.00154, 47.95823, 3.398823, 14.84548  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs 
## variables   : 1426
## names       : GID_0,   NAME_0,   GID_1,      NAME_1,          VARNAME_1, NL_NAME_1,    TYPE_1, ENGTYPE_1, CC_1, HASC_1,      X2018.01.01, X2018.01.01.01.00.00, X2018.01.01.02.00.00, X2018.01.01.03.00.00, X2018.01.01.04.00.00, ... 
## min values  :   ETH, Ethiopia, ETH.1_1, Addis Abeba,                   ,        NA, Astedader,      City,   01,  ET.AA, 283.954038188686,     282.568128361036,     281.374774583541,     280.823603462202,     281.064874547267, ... 
## max values  :   ETH, Ethiopia, ETH.9_1,      Tigray, Tegré|Tigrai|Tigre,        NA,     Kilil,     State,   15,  ET.TI, 298.855151061923,     298.492073956865,     298.125568366647,     297.487538338062,     296.973847511285, ...

Convert SpatialPolygonDataframe into data frame.

temp_df<- as.data.frame(temp)
ncol(temp_df)
## [1] 1426

Rename the column into date and time.

names(temp_df)[11:1426] <- gsub("\\.","",substr(names(temp_df)[11:1426], 2, 14))

Reshape from long to wide format and keep all variables simultaneously, using the gather () command in the tidyr package.

library(tidyr)
## 
## Attaching package: 'tidyr'
## The following object is masked from 'package:raster':
## 
##     extract
temp_df1 <- gather(temp_df, date, temp, 11:1426, factor_key = T)

Convert the format of date and time.

temp_df1$date1 <- as.POSIXct(as.character(temp_df1$date) , format = "%Y%m%d %H")
temp_df1$date2 <- as.POSIXct(as.character(temp_df1$date) , format = "%Y%m%d")

Average temperature by date and province.

library(dplyr)
avetemp <- temp_df1 %>% 
  group_by(date2, NAME_1) %>% 
  summarise(meantemp = mean(temp, na.rm = T))
## `summarise()` has grouped output by 'date2'. You can override using the
## `.groups` argument.
head(avetemp, 10)
## # A tibble: 10 × 3
## # Groups:   date2 [1]
##    date2               NAME_1                                      meantemp
##    <dttm>              <chr>                                          <dbl>
##  1 2018-01-01 00:00:00 Addis Abeba                                     289.
##  2 2018-01-01 00:00:00 Afar                                            298.
##  3 2018-01-01 00:00:00 Amhara                                          292.
##  4 2018-01-01 00:00:00 Benshangul-Gumaz                                298.
##  5 2018-01-01 00:00:00 Dire Dawa                                       292.
##  6 2018-01-01 00:00:00 Gambela Peoples                                 303.
##  7 2018-01-01 00:00:00 Harari People                                   291.
##  8 2018-01-01 00:00:00 Oromia                                          294.
##  9 2018-01-01 00:00:00 Somali                                          297.
## 10 2018-01-01 00:00:00 Southern Nations, Nationalities and Peoples     296.

Finally!! let’s convert the unit of temperature from Kelvin to Celsius.

avetemp$meantemp <- avetemp$meantemp-273.15
head(avetemp, 10)
## # A tibble: 10 × 3
## # Groups:   date2 [1]
##    date2               NAME_1                                      meantemp
##    <dttm>              <chr>                                          <dbl>
##  1 2018-01-01 00:00:00 Addis Abeba                                     15.5
##  2 2018-01-01 00:00:00 Afar                                            24.6
##  3 2018-01-01 00:00:00 Amhara                                          19.1
##  4 2018-01-01 00:00:00 Benshangul-Gumaz                                25.3
##  5 2018-01-01 00:00:00 Dire Dawa                                       19.0
##  6 2018-01-01 00:00:00 Gambela Peoples                                 30.0
##  7 2018-01-01 00:00:00 Harari People                                   17.6
##  8 2018-01-01 00:00:00 Oromia                                          20.7
##  9 2018-01-01 00:00:00 Somali                                          23.6
## 10 2018-01-01 00:00:00 Southern Nations, Nationalities and Peoples     23.1

Done!!!!🤩


Reference

  1. Li, B., Li, Y., Chen, Y., Zhang, B., & Shi, X. (2020). Recent fall Eurasian cooling linked to North Pacific sea surface temperatures and a strengthening Siberian high. Nature Communications11(1), 5202.

  2. Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., Thépaut, J-N. (2023): ERA5 hourly data on single levels from 1940 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS), DOI: 10.24381/cds.adbb2d47 (Accessed on 22-05-2023)