The observations for the validation were taken from 2015-01-02 01:00:00 to 2023-12-30 23:00:00.
Recall that the predictors include the daily variables
sfcWind, tas, pr,
tasmax, tasmin and psl, and the
monthly variables clt, rsdt and rsds.
The month, the hour, the sun’s elevation and azimuth,
and the daily amount of daylight in seconds are also used as predictors.
| model | mae | cor | ratio_of_sd | KGE | amplitude_mae | maximum_correlation | sign_correlation | acf_mae | extremogram_mae | amount_rainy_hours_mae |
|---|---|---|---|---|---|---|---|---|---|---|
| xgboost | 0.075 | 0.865 | 0.855 | 0.805 | 0.178 | 0.085 | 0.569 | 0.097 | 0.069 | 1.833 |
| cnn | 0.078 | 0.859 | 1.036 | 0.789 | 0.129 | 0.145 | 0.545 | 0.057 | 0.037 | 1.630 |
| naive | 0.095 | 0.827 | 0.829 | 0.757 | 0.252 | 0.150 | 0.566 | 0.120 | 0.073 | 2.227 |
| lstm | 0.076 | 0.853 | 0.922 | 0.775 | 0.143 | 0.126 | 0.565 | 0.071 | 0.030 | 1.774 |
\[ \frac{\sum_{d=1}^D |A_d - \hat{A}_d|}{D} \]
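Where \(A_d\) and \(\hat{A}_d\) are the observed and estimated amplitudes on day \(d\), and \(D\) is the number of days.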
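As a rough illustration of the amplitude error above, the following sketch computes it from two hourly series. It assumes that the daily amplitude is the daily maximum minus the daily minimum and that both series cover the same whole days; neither assumption is stated in this section, and the function names are hypothetical.

```python
def daily_amplitudes(hourly, hours_per_day=24):
    """Split an hourly series into whole days and return max - min per day.

    Assumption: "amplitude" means daily maximum minus daily minimum.
    """
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    days = [hourly[i:i + hours_per_day]
            for i in range(0, len(hourly), hours_per_day)]
    return [max(day) - min(day) for day in days]


def amplitude_mae(observed, estimated, hours_per_day=24):
    """Mean absolute difference between observed and estimated daily amplitudes."""
    a_obs = daily_amplitudes(observed, hours_per_day)
    a_est = daily_amplitudes(estimated, hours_per_day)
    if len(a_obs) != len(a_est):
        raise ValueError("both series must cover the same number of days")
    return sum(abs(a - b) for a, b in zip(a_obs, a_est)) / len(a_obs)
```

For example, with toy "days" of four hours each, `amplitude_mae([0, 1, 2, 1, 0, 3, 0, 0], [0, 1, 1, 0, 1, 2, 1, 1], hours_per_day=4)` compares amplitudes 2 and 3 against 1 and 1.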
\[ \frac{\sum_{d=1}^D \mathbb{1}_{{mh}_d = \hat{mh}_d}}{D} \]
Where
The function \(\mathbb{1}_{mh_d = \hat{mh}_d}\) is an indicator function that equals 1 when the actual peak hour \(mh_d\) matches the estimated peak hour \(\hat{mh}_d\), and 0 otherwise.
The possible values of this indicator range between 0 (the estimated peak hour never matches the observed one) and 1 (the estimated hour always matches the observed one). Higher values are better.
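The peak-hour indicator can be sketched in a few lines. This is only an illustration under stated assumptions: the peak hour of a day is taken to be the position of the first daily maximum (the text does not say how ties are broken), both series cover the same whole days, and the function names are hypothetical.

```python
def peak_hours(hourly, hours_per_day=24):
    """Return, for each whole day, the index (0-based hour) of the daily maximum.

    Assumption: ties are resolved by taking the first hour that reaches the maximum.
    """
    if len(hourly) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    days = [hourly[i:i + hours_per_day]
            for i in range(0, len(hourly), hours_per_day)]
    return [day.index(max(day)) for day in days]


def peak_hour_match_rate(observed, estimated, hours_per_day=24):
    """Fraction of days on which observed and estimated peak hours coincide."""
    mh_obs = peak_hours(observed, hours_per_day)
    mh_est = peak_hours(estimated, hours_per_day)
    if len(mh_obs) != len(mh_est):
        raise ValueError("both series must cover the same number of days")
    return sum(o == e for o, e in zip(mh_obs, mh_est)) / len(mh_obs)
```

Because an exact match on the hour is required, an estimate that peaks one hour early counts the same as one that peaks twelve hours early.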
\[ \frac{\sum_{n=1}^{N-1} \mathbb{1}_{sg(X_{n+1} - X_{n}) = sg(\hat{X}_{n+1} - \hat{X}_{n})}}{N-1} \]
Where
The function \(\mathbb{1}_{sg(X_{n+1} - X_{n}) = sg(\hat{X}_{n+1} - \hat{X}_{n})}\) is an indicator function that equals 1 when the direction of change in the actual series matches the direction of change in the estimated series (i.e., if the actual series increases, the estimated series also increases, or if the actual series decreases, the estimated series also decreases), and equals 0 otherwise.
The possible values of this indicator range between 0 (meaning that whenever the actual series increases, the estimated series decreases, or vice versa) and 1 (meaning that the estimated series always follows the same direction as the actual series). Higher values are preferable.
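A minimal sketch of this sign-agreement measure follows. It assumes that \(sg\) is the usual sign function returning -1, 0 or 1, so two steps in which both series stay flat count as a match; the text does not specify how zero changes are handled, and the function names are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_agreement(observed, estimated):
    """Fraction of consecutive steps where both series change in the same direction."""
    if len(observed) != len(estimated):
        raise ValueError("series must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two time steps are needed")
    steps = len(observed) - 1
    matches = sum(
        sign(observed[n + 1] - observed[n]) == sign(estimated[n + 1] - estimated[n])
        for n in range(steps)
    )
    return matches / steps
```

A series of length N yields N - 1 consecutive differences, which is why the count is divided by N - 1.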
The x-axis shows the standardized daily mean. The axis label reads
"Undownscaled value", but the quantity plotted is the daily mean after
downscaling. It would be better to plot the original undownscaled
value instead.
The purpose of this plot is to illustrate the distribution of P(undownscaled value | we predicted an extreme). This is useful because it reveals how much information we can recover concerning extreme events. If the distribution is skewed to the right, it suggests that we’re predicting extreme values only when extreme values have already occurred. Conversely, if the lower tail of the distribution resembles the reanalysis data, it indicates that we can capture short-duration extremes (e.g., brief periods of heavy rainfall, such as an intense downpour lasting an hour before stopping).
Important: At present we only estimate the upper-tail
extremogram. We have not yet found a way to estimate both tails at
the same time. We are using quant = .97.
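For orientation, an upper-tail sample extremogram can be sketched as below. This is not necessarily the estimator used for the results above: it assumes the common definition, the proportion of threshold exceedances that are followed by another exceedance h steps later, with the threshold set at the empirical `quant` quantile of the series. The quantile interpolation, the lag range, and all function names are assumptions made for the example.

```python
def empirical_quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def upper_tail_extremogram(series, max_lag, quant=0.97):
    """Estimate P(X[t+h] > u | X[t] > u) for h = 1..max_lag, u = quant-quantile."""
    if max_lag < 1:
        raise ValueError("max_lag must be at least 1")
    u = empirical_quantile(series, quant)
    exceed = [x > u for x in series]
    n_exceed = sum(exceed)
    if n_exceed == 0:
        raise ValueError("no values exceed the threshold")
    n = len(series)
    return [
        sum(exceed[t] and exceed[t + h] for t in range(n - h)) / n_exceed
        for h in range(1, max_lag + 1)
    ]


def extremogram_mae(observed, estimated, max_lag, quant=0.97):
    """Mean absolute difference between two upper-tail extremograms over the lags."""
    rho_obs = upper_tail_extremogram(observed, max_lag, quant)
    rho_est = upper_tail_extremogram(estimated, max_lag, quant)
    return sum(abs(a - b) for a, b in zip(rho_obs, rho_est)) / max_lag
```

Under this definition a lower-tail version would replace `x > u` with `x < u` at a low quantile, but that requires a separate pass over the series.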