date | yiel | year | doy | bio5 | bio11 | bio10 | id | jci | moi | ntg | ogs10 | wki | ws | wsdi | xtg | hi | hd17 | tn90p | tr | tx10p | vwd | prcptot | cdd | cwd | r10mm | r20mm | rx1day | rx5d | sdii | r95tot | r99tot | fd | su | csd | stx32 | mi | fg | fgcalm | fg6bft | mai | mfi | ssd | ssp | snd | wci | utci | periodicity | dom_wdirhk | dphk | rh2hk_dailymean | srhk_dailymean | t2hk_dailymax | t2hk_dailymean | t2hk_dailymin | wsphk_dailymax | wsphk_dailymean |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2001-01-01 | 2001 | 1 | 30.63180 | 19.28192 | 28.04827 | 0 | 27.49678 | 1.690107 | 15.88942 | 0 | 0 | 15.88942 | 0 | 15.88942 | 15 | 1.1105785 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2.695067 | 0 | 0 | 71.76713 | 0e+00 | 315.8694 | 1316.1224 | 0 | 24.95087 | -123.72464 | monthly | 67.5 | 0e+00 | 68.01212 | 315.8694 | 17.15674 | 15.88942 | 14.341217 | 4.613824 | 2.695067 | |
2004-01-01 | 2004 | 1 | 30.52903 | 18.26718 | 28.55370 | 0 | 35.05992 | 1.459625 | 18.11393 | 0 | 0 | 18.11393 | 0 | 18.11393 | 18 | 0.0000000 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.756847 | 0 | 0 | 27.26053 | 0e+00 | 207.0936 | 862.8902 | 0 | 24.79248 | -85.42874 | monthly | 67.5 | 0e+00 | 66.78092 | 207.0936 | 20.62766 | 18.11393 | 15.581024 | 5.635863 | 3.756847 | |
2006-01-01 | 5961.38 | 2006 | 1 | 30.46052 | 19.12600 | 28.23819 | 0 | 28.71821 | 1.648079 | 20.09398 | 0 | 0 | 20.09398 | 0 | 20.09398 | 20 | 0.0000000 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2.746764 | 0 | 0 | 70.00171 | 0e+00 | 202.2794 | 842.8307 | 0 | 25.21861 | -67.40000 | monthly | 67.5 | 0e+00 | 71.20263 | 202.2794 | 22.73886 | 20.09398 | 17.799957 | 4.530318 | 2.746764 |
2019-01-01 | 20492.00 | 2019 | 1 | 30.77942 | 20.42255 | 28.77577 | 0 | 27.54951 | 1.688249 | 14.73851 | 0 | 0 | 14.73851 | 0 | 14.73851 | 14 | 2.2614946 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 8e-07 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.800786 | 0 | 0 | 45.32357 | 8e-07 | 194.4447 | 810.1861 | 0 | 25.83674 | -56.88514 | monthly | 45.0 | 8e-07 | 59.67574 | 194.4447 | 18.49609 | 14.73851 | 9.629028 | 7.122638 | 3.800786 |
2024-01-01 | 21523.00 | 2024 | 1 | 31.36483 | 19.75575 | 29.12367 | 0 | 33.91850 | 1.490297 | 19.13332 | 0 | 0 | 19.13332 | 0 | 19.13332 | 19 | 0.0000000 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.000105 | 0 | 0 | 49.57399 | 0e+00 | 201.9484 | 841.4518 | 0 | 25.89776 | -58.87297 | monthly | 67.5 | 0e+00 | 70.21452 | 201.9484 | 21.24637 | 19.13332 | 17.068298 | 6.053437 | 5.000105 |
2012-01-01 | 12890.86 | 2012 | 1 | 30.62774 | 18.61248 | 28.07238 | 0 | 33.01736 | 1.515438 | 16.75205 | 0 | 0 | 16.75205 | 0 | 16.75205 | 16 | 0.2479477 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.079289 | 0 | 0 | 85.96466 | 0e+00 | 204.3631 | 851.5130 | 0 | 24.86616 | -48.07879 | monthly | 67.5 | 0e+00 | 65.90116 | 204.3631 | 20.31604 | 16.75205 | 14.298065 | 4.742382 | 3.079289 |
2022-01-01 | 22177.00 | 2022 | 1 | 31.14147 | 18.89542 | 28.81193 | 0 | 36.15213 | 1.431435 | 17.31522 | 0 | 0 | 17.31522 | 0 | 17.31522 | 17 | 0.0000000 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.174846 | 0 | 0 | 61.11502 | 0e+00 | 203.2604 | 846.9184 | 0 | 24.94639 | -72.01952 | monthly | 67.5 | 0e+00 | 70.77531 | 203.2604 | 20.43677 | 17.31522 | 14.415924 | 4.915345 | 3.174846 |
2016-01-01 | 16531.50 | 2016 | 1 | 31.42702 | 17.73512 | 28.94781 | 0 | 36.23008 | 1.429464 | 17.59181 | 0 | 0 | 17.59181 | 0 | 17.59181 | 17 | 0.0000000 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.891066 | 0 | 0 | 44.90873 | 0e+00 | 203.7141 | 848.8086 | 0 | 25.30550 | -21.61573 | monthly | 67.5 | 0e+00 | 57.14178 | 203.7141 | 20.94125 | 17.59181 | 14.697449 | 6.208151 | 3.891066 |
2010-01-01 | 9756.29 | 2010 | 1 | 30.52635 | 19.64243 | 28.40109 | 0 | 28.87433 | 1.642858 | 16.58170 | 0 | 0 | 16.58170 | 0 | 16.58170 | 16 | 0.4183025 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4.488943 | 0 | 0 | 90.56621 | 0e+00 | 192.3433 | 801.4303 | 0 | 24.87201 | -50.86973 | monthly | 67.5 | 0e+00 | 79.20745 | 192.3433 | 18.27646 | 16.58170 | 14.640808 | 5.967861 | 4.488943 |
2018-01-01 | 20289.25 | 2018 | 1 | 30.54786 | 18.12408 | 28.45458 | 0 | 34.54116 | 1.473407 | 16.48720 | 0 | 0 | 16.48720 | 0 | 16.48720 | 16 | 0.5128040 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0e+00 | NA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3.972923 | 0 | 0 | 66.34983 | 0e+00 | 203.3076 | 847.1151 | 0 | 25.08855 | -47.56812 | monthly | 67.5 | 0e+00 | 70.02759 | 203.3076 | 19.21570 | 16.48720 | 13.822662 | 5.564889 | 3.972923 |
## date bio5 ... wsphk_dailymax wsphk_dailymean
## 0 2001-01-01 30.631801 ... 4.613824 2.695067
## 1 2001-01-02 NaN ... 4.357427 2.921393
## 2 2001-01-03 NaN ... 4.911554 3.461336
## 3 2001-01-04 NaN ... 5.330330 3.840347
## 4 2001-01-05 NaN ... 5.145489 4.542647
##
## [5 rows x 54 columns]
## <class 'pandas.core.frame.DataFrame'>
## [{'mean': np.float64(24.07278709845588), 'std': np.float64(4.691036119980351), 'kurtosis': np.float64(nan), 'skewness': np.float64(nan), 'ks_stat': np.float64(nan), 'ks_pval': np.float64(nan), 'jb_stat': np.float64(nan), 'jb_pval': np.float64(nan)}]
Concept and Methodology Link
The typical seasons of temperature contracts correspond to when heating demands or cooling demands are high, thus the HDD index usually measures the temperature from October to April whereas the CDD index measures the temperature from April to October. The CAT index substitutes the CDD index for the European cities traded on the CME.
Hence, indices created are of types -
HDD for heating demand heavy months and CDD (substituted by CAT in European market convention)
OCT, NOV, DEC, JAN, FEB, MAR: HDD APR, MAY, JUN, JUL, AUG, SEP, OCT: CAT
HDD OCT-JAN, JAN-APR, OCT-APR CAT APR-JUL, JUL-OCT, APR-OCT
Using Euler–scheme CAR(p) simulator -
Defining the payoff indices (baseline temp - 18 degree celcius) -
\[\begin{equation} HDD(\tau_{1},\tau_{2}) = \int_{\tau_{1}}^{\tau_{2}} \max(c - T_{u}, 0) du, \end{equation}\]
\[\begin{equation} CDD(\tau_{1},\tau_{2}) = \int_{\tau_{1}}^{\tau_{2}} \max(T_{u}-c, 0) du, \end{equation}\]
\[\begin{equation} CAT(\tau_{1},\tau_{2}) = \int_{\tau_{1}}^{\tau_{2}} T_{u} du, \end{equation}\]
with HDD-CDD parity
\[\begin{equation} CDD(\tau_{1},\tau_{2})- HDD(\tau_{1},\tau_{2})= CAT(\tau_{1},\tau_{2}) - c(\tau_{2}-\tau_{1}). \end{equation}\]
OT (Overall Temperature) index is placeholder for month-dependent index selection (months 5,6,7,8,9 - CAT else HDD)
The fitted AR(3) model is
\[
y_t \;=\; \phi_1\,y_{t-1} \;+\; \phi_2\,y_{t-2} \;+\; \phi_3\,y_{t-3}
\;+\; \varepsilon_t,\quad
\varepsilon_t \sim \mathcal{N}(0, \sigma_t^2).
\]
The seasonal mean function uses a Fourier series:
\[
m(t) \;=\; a_0 + a_1\,t + a_2\cos(\omega t) + a_3\sin(\omega t),
\quad \omega = \frac{2\pi}{365.25}.
\]
The time‐varying variance is
\[
\sigma_t^2 \;=\; d_1\cos(\omega t) + d_2\sin(\omega t).
\] ## A) Fitted AutoReg Model
AutoReg Model Results
==============================================================================
Dep. Variable: seasonalized No. Observations: 17898
Model: AutoReg(3) Log Likelihood -29377.318
Method: Conditional MLE S.D. of innovations 1.249
Date: Thu, 19 Jun 2025 AIC 58762.635
Time: 10:56:30 BIC 58793.804
Sample: 01-04-2001 HQIC 58772.888
- 01-01-2050
===================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------
seasonalized.L1 1.0183 0.007 138.167 0.000 1.004 1.033
seasonalized.L2 -0.4146 0.010 -40.925 0.000 -0.434 -0.395
seasonalized.L3 0.1675 0.007 22.725 0.000 0.153 0.182
Roots
=============================================================================
Real Imaginary Modulus Frequency
-----------------------------------------------------------------------------
AR.1 1.3114 -0.0000j 1.3114 -0.0000
AR.2 0.5821 -2.0528j 2.1338 -0.2060
AR.3 0.5821 +2.0528j 2.1338 0.2060
-----------------------------------------------------------------------------
date | seasonalized | residuals_cv | fitted_variance | residuals_tv |
---|---|---|---|---|
2001-01-01 00:00:00 | -2.12601 | nan | 3.25328 | nan |
2001-01-02 00:00:00 | -0.498213 | nan | 3.25678 | nan |
2001-01-03 00:00:00 | 0.435505 | nan | 3.25974 | nan |
2001-01-04 00:00:00 | -0.572836 | -0.866796 | 3.26221 | -0.479911 |
2001-01-05 00:00:00 | 0.22907 | 1.07639 | 3.26422 | 0.595771 |
parameter | estimate |
---|---|
a₀ | -19.7378 |
a₁ | 5.92484e-05 |
a₂ | -5.9183 |
a₃ | 0.0843081 |
d₁ | 5.0438 |
d₂ | -146.533 |
seasonalized.L1 | 1.01827 |
seasonalized.L2 | -0.414631 |
seasonalized.L3 | 0.167486 |
a0 | 1.56136 |
a1 | 1.52839 |
a2 | 0.326036 |
a3 | -0.0423424 |
a4 | -0.108629 |
b1 | 0.130793 |
b2 | 0.0333528 |
b3 | -0.0694925 |
b4 | -0.0160239 |
date year ... strip_hdd_continuous_payoff strip_ot_continuous_payoff
0 2001-01-01 2001 … 0.000000 0.000000 1 2001-01-02 2001 … 0.000000 0.000000 2 2001-01-03 2001 … 10.287056 10.287056 3 2001-01-04 2001 … 0.000000 0.000000 4 2001-01-05 2001 … 12.942390 12.942390
[5 rows x 49 columns]
## \begin{tabular}{lrrrrrrrrr}
## & \multicolumn{6}{c}{Seasonal Component} & \multicolumn{3}{c}{Autoregressive Component} \\
## & a_0 & a_1 & a_2 & a_3 & d_1 & d_2 & seasonalized.L1 & seasonalized.L2 & seasonalized.L3 \\
## station_id & & & & & & & & & \\
## CW0063 & -19.738 & 0.059 & -5.918 & 0.084 & 5.044 & -146.533 & 1.018 & -0.415 & 0.167 \\
## \end{tabular}
The K-S test statistic measures the largest distance between the EDF x) and the theoretical function (x), measured in a vertical direction (Kolmogorov as cited in Stephens 1992).
mean | std | kurtosis | skewness | ks_stat | ks_pval | jb_stat | jb_pval | |
---|---|---|---|---|---|---|---|---|
station_id | ||||||||
CW0063 | -0.01 | 1.00 | 6.32 | -0.88 | 0.07 | 0.00 | 10525.65 | 0.00 |
## <pandas.io.formats.style.Styler object at 0x00000295E6C1BD90>
## []
data <- combined_df
plot_tavg <- create_plot("t2hk_dailymean")
plot_prcp <- create_plot("dphk")
plot_wspd <- create_plot("wsphk_dailymax")
plot_pres <- create_plot("rh2hk_dailymean")
grid.arrange(plot_tavg, plot_prcp, plot_wspd, plot_pres, ncol = 2, top = "Empiric vs Theoric Distribution for Weather Variables")
##
## Call:
## lm(formula = Temp ~ t + sin(Omega * t) + cos(Omega * t), data = df_model)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.3164 -1.0493 0.0734 1.4042 5.5610
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.355e+01 3.152e-02 747.12 <2e-16 ***
## t 5.928e-05 3.050e-06 19.44 <2e-16 ***
## sin(Omega * t) -2.571e+00 2.229e-02 -115.33 <2e-16 ***
## cos(Omega * t) -5.318e+00 2.227e-02 -238.75 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.108 on 17894 degrees of freedom
## (334 observations deleted due to missingness)
## Multiple R-squared: 0.7981, Adjusted R-squared: 0.7981
## F-statistic: 2.358e+04 on 3 and 17894 DF, p-value: < 2.2e-16
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 23.546 | 0.032 | 747.120 | 0 |
t | 0.000 | 0.000 | 19.437 | 0 |
sin(Omega * t) | -2.571 | 0.022 | -115.330 | 0 |
cos(Omega * t) | -5.318 | 0.022 | -238.748 | 0 |
Model | a1 | a2 | a3 | a4 | r.squared | sigma |
---|---|---|---|---|---|---|
\(T_t^{m}\) (OLS) | 23.5463 | 5.93e-05 | -2.57103 | -5.317991 | 0.7981256 | 2.10788 |
Model | A | B | C | Phi |
---|---|---|---|---|
\(T_m\) | 23.5463 | 5.93e-05 | 5.90688 | -2.021124 |
## [{'station_id': 'CW0063', 'theta_ols': np.float64(4.49450916924405), 'kappa_ols': np.float64(2.444947015974006), 'sigma_ols': np.float64(1.3217379144842238), 'theta_mle': np.float64(4.494258052517118), 'kappa_mle': np.float64(2.4425439461097915), 'sigma_mle': np.float64(2.6914355377269192), 'mle_2_mu': np.float64(4.494510995381748), 'mle_2_eta': array([1.]), 'mle_2_sigma_sq': array([5.98216371])}]
Value | |
---|---|
wsphk_dailymax | 4.4945 |
dom_wdirhk | 116.3609 |
θ_11 | θ_12 | |
---|---|---|
wsphk_dailymax 1 | 0.4958 | -0.0007 |
wsphk_dailymax 2 | -3.0215 | 0.3359 |
σ_11 | σ_12 | |
---|---|---|
dom_wdirhk 1 | 1.4605 | -3.5294 |
dom_wdirhk 2 | -3.5294 | 3120.8882 |
Wind Dynamics - Continous time transformation: Benth (2010) employed a Box-Cox transformation to achieve symmetry in wind speed data before fitting an AR(4) process with seasonal mean and volatility. General transformation of wind speed data when defining continuous time dynamics - \[\begin{equation} W(t)= \begin{cases} \big [ \lambda (\Lambda(t) + Y(t)) + 1 \big ]^{1/\lambda} &, \lambda \neq 0, \\ \exp (\Lambda(t) + Y(t)) &, \lambda=0, \end{cases} \end{equation}\]
for a constant and (t) seasonal function as in the previous section. A problem with this transformation is that for we may theoretically obtain negative wind speeds. The research of Benth (2010) argued that 1/ performed best.
Concept and Methodology Link
Using Multisite Rainfall Generation Methodology - generate daily precipitation at multiple stations by following method:
For each station \(s\) and day \(t\), define the wet‑day indicator
\[ I_{s,t} \;=\; \begin{cases} 1, & \text{if station $s$ on day $t$ has precipitation}\;\ge v_{\min},\\ 0, & \text{otherwise}. \end{cases} \]
For each station \(s\), fit a generalized linear model
\[ \Pr(I_{s,t}=1 \mid \mathbf{x}_t) = g^{-1}\!\bigl(\beta_{s,0} + \boldsymbol\beta_s^\top\,\mathbf{x}_t\bigr), \]
where:
This yields fitted marginal probabilities
\[ p_{s,t} \;=\;\Pr(I_{s,t}=1\mid\mathbf{x}_t)\,. \] —
Gaussian latent: transform each marginal \(p_{s,t}\) to a standard normal quantile
\[ z_{s,t} \;=\;\Phi^{-1}(p_{s,t})\,. \]
Estimate the \(S\times S\) residual correlation matrix \(\Sigma\) from the \(\{z_{s,t}\}\).
Simulate
\[ \mathbf Z_t \;\sim\;\mathcal N(\mathbf0,\Sigma), \]
and recover occurrence by thresholding:
\[ \tilde I_{s,t} = \mathbf1\bigl\{Z_{s,t} > \Phi^{-1}(u)\bigr\},\quad u\approx0.5. \]
This preserves both each station’s marginal wet‐day frequency and the inter‐site correlation structure.
Conditional on a wet day (\(\tilde I_{s,t}=1\)), model the positive rainfall depth \(Y_{s,t}\). A common choice is a Gamma‐GLM:
\[ Y_{s,t} \;\big|\;I_{s,t}=1 \;\sim\; \Gamma\bigl(\alpha_s,\;\theta_s(\mathbf{x}_t)\bigr), \]
where one can link the scale via
\[ \log\bigl(\theta_s(\mathbf{x}_t)\bigr) = \gamma_{s,0} + \boldsymbol\gamma_s^\top\,\mathbf{x}_t. \]
This two‐stage “occurrence-amount” approach, coupled via a Gaussian copula, produces spatially coherent daily rainfall series.