The observations for the validation were taken from 2015-01-02 01:00:00 to 2023-12-30 23:00:00.

Recall that the predictors include the daily variables sfcWind, tas, pr, tasmax, tasmin and psl, and the monthly variables clt, rsdt and rsds. The month, the hour, the sun’s elevation and azimuth, and the daily amount of daylight in seconds are also used as predictors.

| model | mae | cor | ratio_of_sd | KGE | amplitude_mae | maximum_correlation | sign_correlation | acf_mae | extremogram_mae | amount_rainy_hours_mae |
|---|---|---|---|---|---|---|---|---|---|---|
| xgboost | 0.075 | 0.865 | 0.855 | 0.805 | 0.178 | 0.085 | 0.569 | 0.097 | 0.069 | 1.833 |
| cnn | 0.078 | 0.859 | 1.036 | 0.789 | 0.129 | 0.145 | 0.545 | 0.057 | 0.037 | 1.630 |
| naive | 0.095 | 0.827 | 0.829 | 0.757 | 0.252 | 0.150 | 0.566 | 0.120 | 0.073 | 2.227 |
| lstm | 0.076 | 0.853 | 0.922 | 0.775 | 0.143 | 0.126 | 0.565 | 0.071 | 0.030 | 1.774 |

Amplitude MAE

\[ \frac{\sum_{d=1}^D |A_d - \hat{A}_d|}{D} \]

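Where \(A_d\) is the observed daily amplitude on day \(d\), \(\hat{A}_d\) is the estimated daily amplitude on the same day, and \(D\) is the number of days. Lower values are better.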
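As an illustration, the amplitude MAE could be computed along the following lines. This is only a sketch: it assumes the daily amplitude is the difference between the daily maximum and the daily minimum of the hourly series, which the text does not state, and the function names `daily_amplitudes` and `amplitude_mae` are hypothetical.

```python
def daily_amplitudes(hourly, hours_per_day=24):
    """Split an hourly series into whole days and return max - min per day.

    Assumption: "amplitude" means daily maximum minus daily minimum.
    A trailing incomplete day is ignored.
    """
    last_start = len(hourly) - hours_per_day
    return [
        max(hourly[s:s + hours_per_day]) - min(hourly[s:s + hours_per_day])
        for s in range(0, last_start + 1, hours_per_day)
    ]


def amplitude_mae(observed, estimated, hours_per_day=24):
    """Mean absolute difference between observed and estimated daily amplitudes."""
    a_obs = daily_amplitudes(observed, hours_per_day)
    a_est = daily_amplitudes(estimated, hours_per_day)
    return sum(abs(a - b) for a, b in zip(a_obs, a_est)) / len(a_obs)
```

With toy two-hour "days", `amplitude_mae([0, 2, 1, 4], [0, 1, 1, 1], hours_per_day=2)` compares the amplitudes `[2, 3]` with `[1, 0]`.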

Maximum correlation

\[ \frac{\sum_{d=1}^D \mathbb{1}_{{mh}_d = \hat{mh}_d}}{D} \]

Where

The function \(\mathbb{1}_{mh_d = \hat{mh}_d}\) is an indicator function that equals 1 when the actual peak hour \(mh_d\) matches the estimated peak hour \(\hat{mh}_d\), and 0 otherwise.

This metric ranges from 0 (the estimated peak hour never matches the observed one) to 1 (the estimated peak hour always matches the observed one). Higher values are better.
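A minimal sketch of this peak-hour match rate is given below. It assumes the peak hour of a day is the first hour at which the daily maximum is reached (the text does not say how ties are handled), and the names `peak_hours` and `peak_hour_match_rate` are hypothetical.

```python
def peak_hours(hourly, hours_per_day=24):
    """Return, for each whole day, the index of the hour with the largest value.

    Assumption: ties are resolved by taking the first hour that attains
    the maximum. A trailing incomplete day is ignored.
    """
    peaks = []
    for start in range(0, len(hourly) - hours_per_day + 1, hours_per_day):
        day = list(hourly[start:start + hours_per_day])
        peaks.append(day.index(max(day)))
    return peaks


def peak_hour_match_rate(observed, estimated, hours_per_day=24):
    """Fraction of days on which observed and estimated peak hours coincide."""
    mh_obs = peak_hours(observed, hours_per_day)
    mh_est = peak_hours(estimated, hours_per_day)
    matches = sum(1 for a, b in zip(mh_obs, mh_est) if a == b)
    return matches / len(mh_obs)
```

Because a match requires the exact same hour, small timing shifts count as misses, which may partly explain why values of this metric tend to be low.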

Sign correlation

\[ \frac{\sum_{n=1}^{N-1} \mathbb{1}_{sg(X_{n+1} - X_{n}) = sg(\hat{X}_{n+1} - \hat{X}_{n})}}{N-1} \]

Where

The function \(\mathbb{1}_{sg(X_{n+1} - X_{n}) = sg(\hat{X}_{n+1} - \hat{X}_{n})}\) is an indicator function that equals 1 when the direction of change in the actual series matches the direction of change in the estimated series (i.e., if the actual series increases, the estimated series also increases, or if the actual series decreases, the estimated series also decreases), and equals 0 otherwise.

This metric ranges from 0 (whenever the actual series increases, the estimated series decreases, or vice versa) to 1 (the estimated series always moves in the same direction as the actual series). Higher values are preferable.
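The sign correlation can be sketched as follows. The helper names are hypothetical, and the handling of flat steps is an assumption: here a zero change has sign 0, so it only counts as a match when both series are flat at that step.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_correlation(observed, estimated):
    """Fraction of consecutive steps whose direction of change agrees.

    A series of length N has N - 1 consecutive differences, which is the
    denominator used here.
    """
    n = len(observed)
    matches = sum(
        1
        for i in range(n - 1)
        if sign(observed[i + 1] - observed[i]) == sign(estimated[i + 1] - estimated[i])
    )
    return matches / (n - 1)
```

For example, the series `[1, 2, 3, 2]` and `[0, 1, 0, -1]` move in the same direction on the first and third steps but not on the second, so two of the three steps agree.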

Time series of the first days

How Often Peaks Hit Hourly

Daily amplitude

QQ Plot

Distribution of the undownscaled value on days with estimated extreme values

The x-axis shows the standardized daily mean. The axis label reads "Undownscaled value", but the plotted quantity is actually the daily mean after downscaling. It would be better to plot the original undownscaled value instead.

The purpose of this plot is to illustrate the distribution of P(undownscaled value | we predicted an extreme). This is useful because it reveals how much information we can recover concerning extreme events. If the distribution is skewed to the right, it suggests that we’re predicting extreme values only when extreme values have already occurred. Conversely, if the lower tail of the distribution resembles the reanalysis data, it indicates that we can capture short-duration extremes (e.g., brief periods of heavy rainfall, such as an intense downpour lasting an hour before stopping).
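The sample behind such a plot could be assembled roughly as shown below. This is a sketch under several assumptions that the text does not confirm: a day counts as "predicted extreme" when at least one predicted hourly value exceeds a fixed threshold, the threshold is supplied by the caller, and the function name `coarse_values_on_predicted_extremes` is hypothetical.

```python
def coarse_values_on_predicted_extremes(daily_values, predicted_hourly,
                                        threshold, hours_per_day=24):
    """Collect the daily (undownscaled) values of days with a predicted extreme.

    Assumption: a day is flagged when any predicted hourly value on that
    day is strictly greater than `threshold`.
    """
    selected = []
    for d, value in enumerate(daily_values):
        day = predicted_hourly[d * hours_per_day:(d + 1) * hours_per_day]
        if day and max(day) > threshold:
            selected.append(value)
    return selected
```

The returned values are an empirical sample from P(undownscaled value | we predicted an extreme). A day with a small daily value that still appears in the sample corresponds to the short-duration extremes described above.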

Autocorrelogram

Extremogram

Important: at the moment we only estimate the upper-tail extremogram. We have not yet found a way to estimate both tails at the same time. We are using quant = .97.
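For reference, an empirical upper-tail extremogram can be sketched as the conditional probability that the series exceeds a high threshold at time t + h given that it exceeds it at time t. The code below is an illustration only, not the implementation used for the results above. Its assumptions are the simple order-statistic quantile used as the threshold, the choice of lags 1 to `max_lag`, and the definition of `extremogram_mae` as the mean absolute difference across lags. All function names are hypothetical.

```python
def upper_tail_extremogram(series, max_lag, quant=0.97):
    """Empirical estimate of P(X[t+h] > u | X[t] > u) for h = 1..max_lag.

    Assumption: the threshold u is the order statistic at position
    int(quant * n), without interpolation.
    """
    ordered = sorted(series)
    u = ordered[min(int(quant * len(ordered)), len(ordered) - 1)]
    exceed = [x > u for x in series]
    n = len(series)
    result = []
    for lag in range(1, max_lag + 1):
        # Exceedances that still have a partner `lag` steps ahead.
        base = sum(exceed[:n - lag])
        joint = sum(1 for t in range(n - lag) if exceed[t] and exceed[t + lag])
        result.append(joint / base if base else float("nan"))
    return result


def extremogram_mae(observed, estimated, max_lag, quant=0.97):
    """Mean absolute difference between the two extremograms over all lags."""
    e_obs = upper_tail_extremogram(observed, max_lag, quant)
    e_est = upper_tail_extremogram(estimated, max_lag, quant)
    return sum(abs(a - b) for a, b in zip(e_obs, e_est)) / max_lag
```

Each series uses its own threshold here, so the comparison is about the temporal clustering of extremes rather than their magnitude. Under this formulation, a lower-tail version would use `x < u` with a low quantile instead.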

Model Explanation

XGBoost