This Tutorial complements the paper Persiano et al. (2022) as supplementary material. It is a step-by-step guide that will help the user to set up a geostatistically-based regional model (i.e., total negative deviation top-kriging, TNDTK, see e.g. Pugliese et al., 2014, 2016) for interpolating flow-duration curves (FDCs) in gauged and ungauged sites. For further information about TNDTK, please see the papers listed in the References.
IMPORTANT NOTES
The dataset (available here) described in Persiano et al. (2022) consists of a vector layer of the contours of 24,148 elementary catchments in Europe and the associated representation of the streamflow regime in terms of empirical flow–duration curves (FDCs). FDCs are estimated by means of TNDTK, based on the empirical FDCs available for 2484 discharge measurement stations across Europe. Given the number of catchments (both gauged and ungauged) included in the dataset, the procedure is computationally intensive if the dataset is used as a whole, thus its use is not recommended with consumer level PCs (especially with RAM lower than 4GB). Therefore, in this Tutorial we employ a dataset with 27 gauged catchments and 3 target ungauged catchments located in Tyrol (Austria) and South Tyrol (Italy), to be intended as an hands-on guide to catch a glimpse of the TNDTK method. We invite the user to adapt the script in other to replicate the experiment, but we do recommend to start with a smaller subset.
The example of application refers to the following procedure:
The required inputs of the procedure are:
This Tutorial is tested with R-project v4.2.0 (release date 22/04/2022).
All the material necessary to run this Tutorial is deployed on GitHub, available here. The folder contains:
In the present Tutorial, the material is downloaded directly from R, therefore we recommend to follow the steps described below.
# Set the working directory (example)
setwd("C:/Users/uga06241/Desktop/TNDTK_Tutorial")
# Download GitHub material
download.file(url = "https://github.com/SimonePersiano/TNDTK/archive/master.zip",
destfile = "TNDTK-master.zip")
# Unzip the .zip folder
unzip(zipfile = "TNDTK-master.zip")
# Unzip .zip folders within the unzipped master folder
zipfiles <- list.files(path = "TNDTK-main", pattern = "*.zip", full.names = TRUE)
print(zipfiles)
## [1] "TNDTK-main/GaugedCatchments.zip" "TNDTK-main/GaugedStations.zip"
## [3] "TNDTK-main/StreamflowData.zip" "TNDTK-main/UngaugedCatchments.zip"
for (f in zipfiles)
{
unzip(f)
}
Some R packages are required. Please make sure to download them beforehand. You can use the Rstudio dedicated menu (Tools -> Install Packages…) or base function from the console install.packages()
library(rtop)
library(rgeos)
library(rgdal)
library(pracma)
# Running function for computing the TND
source("TNDTK-main/tnd.R")
# Running function for resampling FDCs
source("TNDTK-main/resample_FDC.R")
# Reading polygon shapefile of the gauged catchment boundaries
GaugedCatchments <- readOGR("GaugedCatchments/GaugedCatchments.shp",stringsAsFactors=FALSE)
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\uga06241\Desktop\TNDTK_Tutorial\GaugedCatchments\GaugedCatchments.shp", layer: "GaugedCatchments"
## with 27 features
## It has 8 fields
# Reading point shapefile of the corresponding gauging stations
GaugedStations <- readOGR("GaugedStations/GaugedStations.shp",stringsAsFactors=FALSE)
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\uga06241\Desktop\TNDTK_Tutorial\GaugedStations\GaugedStations.shp", layer: "GaugedStations"
## with 27 features
## It has 5 fields
# Reading polygon shapefile of the ungauged catchment boundaries
UngaugedCatchments <- readOGR("UngaugedCatchments/UngaugedCatchments.shp",stringsAsFactors=FALSE)
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\uga06241\Desktop\TNDTK_Tutorial\UngaugedCatchments\UngaugedCatchments.shp", layer: "UngaugedCatchments"
## with 3 features
## It has 5 fields
# Visualizing attribute table
print(as.data.frame(GaugedCatchments))
## Cod River Station Zone X_3035_cor Y_3035_cor
## 0 201558 Valserbach St. Jodok am Brenner AUT 4435002 2662160
## 1 201566 Gschnitzbach Steinach am Brenner AUT 4432319 2665493
## 2 201574 Sill Puig AUT 4431323 2667697
## 3 201996 Zamserbach Neukaser AUT 4449804 2658610
## 4 202275 Sill Lueg AUT 4434424 2658413
## 5 212043 Isel Hinterbichl AUT 4498804 2658559
## 6 212076 Tauernbach Matreier Tauernhaus AUT 4510435 2670347
## 7 212092 Isel Br?hl AUT 4515162 2654672
## 8 212118 Kalser Bach Sp?ttling AUT 4521182 2660442
## 9 212126 Teischnitzbach Sp?ttling-Taurer AUT 4521347 2660460
## 10 212183 Isel Waier AUT 4513287 2656663
## 11 212217 Debantbach Obernu?dorf AUT 4536874 2643158
## 12 212225 Winkeltalbach Ausservillgraten AUT 4506722 2633416
## 13 31950PG Pflerscherbach Gossensass ITA 4430598 2648254
## 14 32470PG Eisack Sterzing ITA 4430483 2643737
## 15 33550PG Pfitscherbach Ried ITA 4437295 2647886
## 16 36250PG Jaufentalerbach Gasteig ITA 4428773 2641968
## 17 36750PG Mareiterbach Sterzing ITA 4430187 2642209
## 18 43350PG Rienz Welsberg ITA 4482729 2628747
## 19 45750PG Gsieserbach Pichl ITA 4486903 2630603
## 20 48750PG Antholzerbach Salomonsbrunn ITA 4480052 2636593
## 21 51450PG Ahr Steinhaus ITA 4471739 2655459
## 22 54970PG Reinbach Kematen ITA 4470842 2646501
## 23 57150PG Ahr Kematen ITA 4470167 2645292
## 24 61750PG Gader Pedraces ITA 4466397 2612264
## 25 64550PG Gader Montal ITA 4464657 2631017
## 26 71550PG Villn?sserbach St.Peter ITA 4449565 2615557
## Altitude Area_km2
## 0 1121.00 109.71688
## 1 1039.75 112.18438
## 2 1004.00 343.63063
## 3 1785.00 23.99750
## 4 1216.00 21.77938
## 5 1327.14 106.85875
## 6 1502.00 59.54375
## 7 919.00 514.21063
## 8 1486.00 47.76000
## 9 1491.00 13.66313
## 10 930.00 283.99875
## 11 1169.00 57.93062
## 12 1256.00 62.28813
## 13 1063.00 73.77250
## 14 947.00 141.73313
## 15 1365.00 109.67125
## 16 952.00 44.11500
## 17 939.00 207.43187
## 18 1095.04 261.14875
## 19 1196.07 118.10500
## 20 1095.00 83.28750
## 21 1044.00 148.81938
## 22 859.00 117.08500
## 23 848.00 417.49313
## 24 1321.00 121.50250
## 25 814.08 386.54625
## 26 1060.65 45.18688
# Plotting
# Visualizing watershed boundaries
plot(GaugedCatchments)
# Visualizing watershed boundaries
plot(UngaugedCatchments,col=rgb(red = 1, green = 0, blue = 0, alpha = 0.5),add=TRUE)
# Adding station locations to the previous plot
plot(GaugedStations,pch=24,bg="green",cex=1.2,add=TRUE)
# Extraction of the POR-FDC from the daily streamflow series observed at the given gauged sites
# Creating a list with FDCs for the gauged sites
FDC <- list(Q=list(),d=list())
for (i in 1:length(GaugedCatchments$Cod)) # For each gauged site...
{
code <- as.character(GaugedCatchments$Cod[i])
file_i <- list.files(path="StreamflowData",
pattern=code, full.names = TRUE)
dataQ_i <- read.csv(file_i,
header=TRUE,stringsAsFactors=FALSE,
encoding="latin1",sep=",")
# Building POR-FDC
FDC$Q[[code]] <- sort(dataQ_i$Q,decreasing=TRUE)
# Weibull plotting position
FDC$d[[code]] <- 1:length(FDC$Q[[code]])/length(FDC$Q[[code]]+1)
}
# Computing Mean Annual Flow (MAF)
MAF <- c()
for (i in 1:length(GaugedCatchments$Cod))
{
MAF[i] <- mean(FDC$Q[[i]])
}
GaugedCatchments$MAF <- MAF
# Re-computing catchment areas
GaugedCatchments$Area_km2 <- as.numeric(gArea(GaugedCatchments,byid=TRUE)/(1000*1000))
# Function for plotting logarithmic sequences (useful for the plot)
lseq <- function(from,to,length.out)
{
exp(seq(log(from), log(to), length.out = length.out))
}
# Plotting dimensionless FDCs
plot(FDC$d[[1]],FDC$Q[[1]]/mean(FDC$Q[[1]]),
type="l",log="y",
ylim=c(0.001,100),xlim=c(0,1),
xlab="Duration [-]",
ylab="Dimensionless streamflow [-]",
col="black",axes=FALSE)
grid(NULL,NULL, lty = 3, col = "lightgray")
abline(v=seq(0,1,0.1), lty = 3, col = "lightgray")
abline(h=lseq(0.01,100,5), lty = 3, col = "lightgray")
axis(1,at=seq(0,1,0.1),labels = paste(seq(0,1,0.1)))
axis(2,at=lseq(0.01,100,5),labels = paste(lseq(0.01,100,5)))
for (i in 2:length(GaugedCatchments$Cod))
{
lines(FDC$d[[i]],FDC$Q[[i]]/mean(FDC$Q[[i]]),col="blue")
}
Computation of the total negative deviation (TND), by using a Gaussian transformation of the duration axis.
# Years
years <- rep(10,length(GaugedCatchments$Cod))
# Minimum record length for maximum duration to be used in TND computations
min.year <- min(years)
# Lower bound for maximum durations
maxd <- min.year*365/(min.year*365+1)
# Computing TND
TND <- c()
for (i in 1:length(GaugedCatchments$Cod))
{
TND[i] <- fdc.tnd(FDC$Q[[i]]/MAF[i],norm=TRUE,maxd=maxd,logq=FALSE)
}
# Adding a column (field) with the computed TND values
GaugedCatchments$TND <- TND
Application of total negative deviation top-kriging (TNDTK) for computing the POR-FDC at a given (ungauged) target site.
(a) Scaling relationship between MAF and drainage area (for gauged catchments)
# Scaling relationship between MAF and drainage area (for gauged catchments)
plot(GaugedCatchments$Area_km2,GaugedCatchments$MAF,log="xy",
xlab=expression("Drainage area [km"^2*"]"),
ylab=expression("MAF [m"^3*"/s]"))
# Plotting
loglin.mod <- lm(log(GaugedCatchments$MAF)~log(GaugedCatchments$Area_km2))
c1 <- exp(loglin.mod$coefficients[1])
c2 <- loglin.mod$coefficients[2]
curve(c1*x^c2,add=TRUE,col="black")
(b) Top-kriging prediction of TND
# Top-kriging prediction of TND
set.seed(1)
vic <- 6 # Neighbourhood (no. of considered catchments)
# Create an rtop object
rtopObj <- createRtopObject(observations=GaugedCatchments,
predictionLocations=UngaugedCatchments,
formulaString=TND~1, # Variable to krige
params=list(gDist=TRUE, # How to compute distances between areas
rresol=500, # Catchment area discretization
nmax=vic, # No. of neighbours
wlim=1, # Upper limit for the norm of the weights
debug.level=0, # No additional outputs required
partialOverlap=TRUE)) # For inaccurate catchment boundaries
# Build the empirical variogram
rtopObj <- rtopVariogram(rtopObj) # Create variogram for the rtop object
rtopObj <- rtopFitVariogram(rtopObj) # Fit theoretical variogram
## 52 best 0.42 function convergence 200 parameter convergence 6598.284
## 87 best 0.313 function convergence 200 parameter convergence 5357.522
## 120 best 0.271 function convergence 200 parameter convergence 4961.29
## 158 best 0.107 function convergence 200 parameter convergence 3110.485
## 193 best 0.107 function convergence 200 parameter convergence 2107.613
## 222 best 0.106 function convergence 119 parameter convergence 1555.224
## 255 best 0.105 function convergence 99.6 parameter convergence 1241.546
## 288 best 0.105 function convergence 88.2 parameter convergence 907.2956
## 316 best 0.104 function convergence 2.62 parameter convergence 717.1948
## 346 best 0.103 function convergence 3.32 parameter convergence 679.5383
## 384 best 0.103 function convergence 2.56 parameter convergence 475.6586
## 418 best 0.103 function convergence 1.8 parameter convergence 335.4722
## 455 best 0.103 function convergence 1.8 parameter convergence 227.6378
## 487 best 0.103 function convergence 1.03 parameter convergence 142.0758
## 517 best 0.103 function convergence 0.333 parameter convergence 101.2766
## 544 best 0.103 function convergence 0.368 parameter convergence 100.6624
## 572 best 0.103 function convergence 0.128 parameter convergence 93.1503
## 599 best 0.103 function convergence 0.138 parameter convergence 59.40883
## 628 best 0.103 function convergence 0.0903 parameter convergence 51.11815
## 655 best 0.103 function convergence 0.0908 parameter convergence 42.50883
## 680 best 0.103 function convergence 0.0602 parameter convergence 33.27974
## 705 best 0.103 function convergence 0.0332 parameter convergence 31.79337
## 731 best 0.103 function convergence 0.0237 parameter convergence 22.75609
## 757 best 0.103 function convergence 0.0154 parameter convergence 17.8893
## 782 best 0.103 function convergence 0.00914 parameter convergence 15.17109
## 807 best 0.103 function convergence 0.00461 parameter convergence 13.24189
## 834 best 0.103 function convergence 0.00206 parameter convergence 8.13514
## 862 best 0.103 function convergence 0.000994 parameter convergence 7.279419
## 889 best 0.103 function convergence 0.000468 parameter convergence 6.488266
## 915 best 0.103 function convergence 0.000293 parameter convergence 5.200596
## 940 best 0.103 function convergence 0.00026 parameter convergence 3.546659
## 965 best 0.103 function convergence 0.000131 parameter convergence 2.130915
## 990 best 0.103 function convergence 0.000103 parameter convergence 2.411578
## 1015 best 0.103 function convergence 7.88e-05 parameter convergence 1.421412
## 1040 best 0.103 function convergence 2.67e-05 parameter convergence 1.058514
## 1065 best 0.103 function convergence 2.29e-05 parameter convergence 1.180639
## 1091 best 0.103 function convergence 1.15e-05 parameter convergence 0.8366503
rtopObj <- checkVario(rtopObj) # Check variogram fitting
## [1] "cloud is FALSE"
## [1] "Sampling points from 5 areas"
## [1] "Sampled on average 1361.6 points from 5 areas"
## [1] "Sampling points from 6 areas"
## [1] "Sampled on average 745.5 points from 6 areas"
## [1] "Sampling points from 6 areas"
## [1] "Sampled on average 1475.83 points from 6 areas"
## [1] "Sampling points from 6 areas"
## [1] "Sampled on average 1274.83 points from 6 areas"
## [1] "Sampling points from 5 areas"
## [1] "Sampled on average 561.6 points from 5 areas"
# Perform the predicton
rtopObj <- rtopKrige(rtopObj,wret=TRUE)
## [1] "Sampling points from 27 areas"
## [1] "Sampled on average 1053.85 points from 27 areas"
## [1] "Sampling points from 3 areas"
## [1] "Sampled on average 1581 points from 3 areas"
## [1] "Creating prediction semivariance matrix. This can take some time."
TND.pred <- rtopObj$predictions$var1.pred # TND values estimated with TK
weights <- rtopObj$weight # Geostatistical TK weights
# Adding a column (field) with the predicted TND values
UngaugedCatchments$TND.pred <- TND.pred
(c) Top-kriging prediction of MAF
# Top-kriging prediction of MAF
GaugedCatchments$obs <- GaugedCatchments$MAF/(GaugedCatchments$Area_km2^c2)
# Create an rtop object
rtopObj.MAF <- createRtopObject(observations=GaugedCatchments,
predictionLocations=UngaugedCatchments,
formulaString=obs~1, # Variable to krige
params=list(gDist=TRUE, # How to compute distances between areas
rresol=500, # Catchment area discretization
nmax=vic, # No. of neighbours
wlim=1, # Upper limit for the norm of the weights
debug.level=0, # No additional outputs required
partialOverlap=TRUE)) # For inaccurate catchment boundaries
# Build the empirical variogram
rtopObj.MAF <- rtopVariogram(rtopObj.MAF) # Create variogram for the rtop object
rtopObj.MAF <- rtopFitVariogram(rtopObj.MAF) # Fit theoretical variogram
## 55 best 0.215 function convergence 200 parameter convergence 7354.32
## 85 best 0.215 function convergence 200 parameter convergence 4475.475
## 117 best 0.172 function convergence 200 parameter convergence 2982.587
## 147 best 0.166 function convergence 200 parameter convergence 2094.3
## 177 best 0.166 function convergence 200 parameter convergence 1667.127
## 215 best 0.166 function convergence 25.5 parameter convergence 1095.484
## 257 best 0.166 function convergence 25.5 parameter convergence 833.2039
## 294 best 0.166 function convergence 3.37 parameter convergence 835.5796
## 327 best 0.166 function convergence 0 parameter convergence 596.0265
## 364 best 0.166 function convergence 0 parameter convergence 525.2285
## 396 best 0.166 function convergence 0 parameter convergence 497.2264
## 429 best 0.166 function convergence 0.0257 parameter convergence 396.1975
## 463 best 0.166 function convergence 0.0257 parameter convergence 363.745
## 498 best 0.166 function convergence 0.0369 parameter convergence 304.3263
## 533 best 0.166 function convergence 0.0408 parameter convergence 299.6069
## 570 best 0.166 function convergence 0.0637 parameter convergence 225.9832
## 607 best 0.166 function convergence 0.038 parameter convergence 188.4626
## 639 best 0.166 function convergence 0.038 parameter convergence 126.8017
## 668 best 0.166 function convergence 0.0296 parameter convergence 104.3948
## 694 best 0.166 function convergence 0.0277 parameter convergence 79.83891
## 719 best 0.166 function convergence 0.00552 parameter convergence 78.02527
## 744 best 0.166 function convergence 0.00608 parameter convergence 55.89709
## 769 best 0.166 function convergence 0.00634 parameter convergence 38.36427
## 794 best 0.166 function convergence 0.00385 parameter convergence 28.90216
## 819 best 0.166 function convergence 0.00192 parameter convergence 19.99419
## 844 best 0.166 function convergence 0.00127 parameter convergence 20.22688
## 869 best 0.166 function convergence 0.000764 parameter convergence 15.27244
## 894 best 0.166 function convergence 0.000519 parameter convergence 13.76526
## 919 best 0.166 function convergence 0.000316 parameter convergence 10.41494
## 946 best 0.166 function convergence 0.000115 parameter convergence 10.24436
## 973 best 0.166 function convergence 7.89e-05 parameter convergence 7.86707
## 999 best 0.166 function convergence 3.11e-05 parameter convergence 6.232098
## 1025 best 0.166 function convergence 1.44e-05 parameter convergence 5.743356
## 1051 best 0.166 function convergence 8.14e-06 parameter convergence 4.751966
## 1076 best 0.166 function convergence 3.78e-06 parameter convergence 3.623507
## 1105 best 0.166 function convergence 2.48e-06 parameter convergence 3.500545
## 1131 best 0.166 function convergence 1.29e-06 parameter convergence 2.740953
## 1156 best 0.166 function convergence 8.88e-07 parameter convergence 2.4068
## 1183 best 0.166 function convergence 5.42e-07 parameter convergence 2.251982
## 1208 best 0.166 function convergence 4.98e-07 parameter convergence 1.735113
## 1233 best 0.166 function convergence 3.4e-07 parameter convergence 1.434151
## 1258 best 0.166 function convergence 2.08e-07 parameter convergence 1.203088
## 1283 best 0.166 function convergence 1.56e-07 parameter convergence 1.064479
## 1309 best 0.166 function convergence 9.04e-08 parameter convergence 0.7208086
rtopObj.MAF <- checkVario(rtopObj.MAF) # Check variogram fitting
## [1] "cloud is FALSE"
## [1] "Sampling points from 5 areas"
## [1] "Sampled on average 1361.6 points from 5 areas"
## [1] "Sampling points from 6 areas"
## [1] "Sampled on average 745.5 points from 6 areas"
## [1] "Sampling points from 6 areas"
## [1] "Sampled on average 1475.83 points from 6 areas"
## [1] "Sampling points from 6 areas"
## [1] "Sampled on average 1274.83 points from 6 areas"
## [1] "Sampling points from 5 areas"
## [1] "Sampled on average 561.6 points from 5 areas"
# Perform the predicton
rtopObj.MAF <- rtopKrige(rtopObj.MAF)
## [1] "Sampling points from 27 areas"
## [1] "Sampled on average 1053.85 points from 27 areas"
## [1] "Sampling points from 3 areas"
## [1] "Sampled on average 1581 points from 3 areas"
## [1] "Creating prediction semivariance matrix. This can take some time."
#
# MAF values estimated with TK
MAF.pred <- rtopObj.MAF$predictions$var1.pred*(UngaugedCatchments$Are_km2^c2)
UngaugedCatchments$MAF.pred <- MAF.pred
(d) Computation of FDCs for ungauged (target) catchments
# Computation of FDCs for ungauged catchments
# Resampling observed FDCs in view of the estimation for target sites
np <- 100
sam <- pnorm(seq(qnorm(1-maxd),qnorm(maxd),(qnorm(maxd)-qnorm(1-maxd))/(np-1))) # Sampling points
duration_percentage <- round(sam*100,2)
# Empirical FDCs (dimensionless) for gauged catchments
FDC.obs.dimless <- matrix(NA,nrow=np,ncol=length(GaugedCatchments$Cod))
colnames(FDC.obs.dimless) <- GaugedCatchments$Cod
rownames(FDC.obs.dimless) <- paste(duration_percentage)
for(j in 1:length(GaugedCatchments$Cod)) # Resampling...
{
resamp <- resample.FDC(FDC$Q[[j]]/GaugedCatchments$MAF[j],sam=sam,norm=FALSE)
FDC.obs.dimless[,j] <- resamp$y
}
# Final predictions
# Predicted FDCs (dimensionless)
FDC.pred.dimless <- FDC.obs.dimless%*%t(weights) # Estimates
colnames(FDC.pred.dimless) <- UngaugedCatchments$Locatin
rownames(FDC.pred.dimless) <- paste(duration_percentage)
# Predicted FDCs (dimensional)
FDC.pred <- t(t(FDC.pred.dimless)*MAF.pred)
(e) Plotting final results
Blue lines represent observed dimensional FDCs. Red lines represent predicted dimensional FDCs (at ungauged target sites).
# Plotting final results
plot(FDC$d[[1]],FDC$Q[[1]]/mean(FDC$Q[[1]]),
type="l",log="y",
ylim=c(0.01,1000),xlim=c(0,1),
xlab="Duration [-]",
ylab=expression("Dimensional streamflow [m"^3*"/s]"),
col="black",axes=FALSE)
grid(NULL,NULL, lty = 3, col = "lightgray")
abline(v=seq(0,1,0.1), lty = 3, col = "lightgray")
abline(h=lseq(0.01,1000,6), lty = 3, col = "lightgray")
axis(1,at=seq(0,1,0.1),labels = paste(seq(0,1,0.1)))
axis(2,at=lseq(0.01,1000,6),labels = paste(lseq(0.01,1000,6)))
for (i in 2:length(GaugedCatchments$Cod))
{
lines(FDC$d[[i]],FDC$Q[[i]],col="blue",lwd=1.5)
}
for (j in 1:length(UngaugedCatchments$Locatin))
{
lines(as.numeric(rownames(FDC.pred))/100,as.numeric(FDC.pred[,j]),col="red",lwd=3)
}