Main References
Data Source
Stochastic modelling for mean temperature
In 2002 Peter Alaton published On Modelling and Pricing Weather Derivatives, in which he proposed a stochastic model for the mean temperature and he used it to price HDD and CDD derivatives contracts. Let’s define the mean temperature for the \(t\)-th day as:
\[T_t = \frac{T_t^{max} + T_t^{min}}{2}\]
Let’s consider an Ornstein–Uhlenbeck process of the form: \[dT_t = a(T^{m} - T_t)dt + \sigma_t dW_t\] Since the mean temperature present a strong seasonal component during the year it is not actually mean reverting to a fixed mean \(T^{m}\), instead he substitue it with a time-dependent mean that takes into account the seasonality of the mean temperature. The specification proposed for the seasonal drift is: \[\frac{dT_t^m}{dt} = B + \omega \cdot C \cdot cos(\omega t + \phi)\]
Therefore, the SDE will have the form: \[dT_t = \biggl\{ \frac{dT_t^m}{dt} + a(T_t^{m} - T_t) \biggl\}dt + \sigma_t dW_t\]
The reasoning behind the choose of an OU-process to model the randomness of the mean temperature can be found if we observe the distribution of the differences \(\Delta T_t = T_t - T_{t-1}\) and we compared it with the theoric density of a normal random variable with mean \(\bar{\Delta T_t} = \frac{1}{n} \sum_{t = 1}^{n} \Delta T_t\) and variance \(\frac{1}{n} \sum_{t = 1}^{n} (\Delta T_t - \bar{\Delta T_t})^2\).
mu_ <- mean(neviano_nasa_D$DeltaT)
sigma_ <- sd(neviano_nasa_D$DeltaT)
grid <- seq(min(neviano_nasa_D$DeltaT), max(neviano_nasa_D$DeltaT), length.out = nrow(neviano_nasa_D))
neviano_nasa_D %>%
ggplot()+
geom_density(aes(DeltaT, ..density..))+
geom_line(aes(grid, dnorm(grid, mu_, sigma_)), color = "red")+
theme_bw()+
theme(axis.text.x = element_text(angle = 0, face = "bold", size = 7),
axis.text.y = element_text(face = "bold"),
axis.title = element_text(face = "bold"),
axis.title.x = element_text(face = "bold",size = 10),
axis.title.y = element_text(face = "bold", size = 10),
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(face = "italic"),
plot.caption = element_text(face = "italic"),
panel.grid.major.x = element_line(colour="grey60", linetype="dotted"),
panel.grid.minor.x = element_blank(),
panel.grid.major.y = element_line(colour="grey60", linetype="dotted"),
legend.text = element_text(face = "italic", size = 10),
legend.title = element_text(face = "bold"),
legend.position = "top" ) +
labs(
title = "Empiric vs Theoric distribution",
subtitle = "Empiric (Black), Theoric normal (Red)",
y = "Frequency",
x = "First differences",
caption = "Neviano De Rossi (PA), Years: 2010-2021"
)Seasonal Function
The seasonal function can be estimated fitting the following functional form with OLS: \[T_t^m = a_1 + a_2 t + a_3 \sin(\omega t) + a_4 \cos(\omega t)\]
df_model = NASA_DB$Neviano$data
# Time index
df_model$t <- 1:nrow(df_model)
# Omega
df_model$Omega <- 2*base::pi/365
# Mean Temperature
df_model$Temp <- (df_model$Max_Temperature + df_model$Min_Temperature)/2
# Seasonal model
seasonal_model <- lm(Temp ~ t + sin(Omega*t) + cos(Omega*t), data = df_model)
# Estimated OLS parameters
a1 <- coef(seasonal_model)[1]
a2 <- coef(seasonal_model)[2]
a3 <- coef(seasonal_model)[3]
a4 <- coef(seasonal_model)[4]
dplyr::tibble(
Model = "$T_t^{m}$ (OLS)",
a1 = a1,
a2 = a2,
a3 = a3,
a4 = a4,
r.squared = broom::glance(seasonal_model)$r.squared,
sigma = broom::glance(seasonal_model)$sigma
) %>%
knitr::kable(caption = "Estimated parameters for the seasonal model",
escape = FALSE) %>%
kableExtra::kable_classic() %>%
kable_styling(latex_options = "hold_position")| Model | a1 | a2 | a3 | a4 | r.squared | sigma |
|---|---|---|---|---|---|---|
| \(T_t^{m}\) (OLS) | 12.27585 | 0.000274 | -4.125778 | -9.221489 | 0.8508982 | 2.999615 |
Then, we obtain: \[T_t^m = A + B t + C \sin(\omega t + \phi)\] rearranging the estimated parameters we have:
- \(A = a_1\)
- \(B = a_2\)
- \(C = \sqrt{a_3^2 + a_4^2}\)
- \(\phi = \arctan \bigl( \frac{a_4}{a_3} \bigl) - \pi\)
# Rearrenge the coefficients
A <- a1
B <- a2
C <- sqrt(a3^2 + a4^2)
Phi <- atan(a4/a3) - base::pi
# Fitted seasonal mean
df_model$Temp_m <- A + B*df_model$t + C*sin(df_model$Omega*df_model$t + Phi)
# Function for the seasonal drift
SeasonalDrift <- function(t){
omega <- 2*base::pi/365
B + omega * C * cos(omega * t + Phi)
}
# Function for the seasonal function
SeasonalFunction <- function(t){
omega <- 2*base::pi/365
A + (B * t) + C * sin(omega * t + Phi)
}
dplyr::tibble(
Model = "$T_m$",
A = A,
B = B,
C = C,
Phi = Phi) %>%
knitr::kable(caption = "Coefficients of the regression of Tm", escape = FALSE) %>%
kableExtra::kable_classic() %>%
kable_styling(latex_options = "hold_position")| Model | A | B | C | Phi |
|---|---|---|---|---|
| \(T_m\) | 12.27585 | 0.000274 | 10.10237 | -1.991494 |
ggplot(df_model[1:1200,])+
geom_line(aes(Date, Temp), alpha = 0.7)+
geom_line(aes(Date, Temp_m), color = "red")+
theme_bw()+
theme(axis.text.x = element_text(angle = 0, face = "bold", size = 7),
axis.text.y = element_text(face = "bold"),
axis.title = element_text(face = "bold"),
axis.title.x = element_text(face = "bold",size = 10),
axis.title.y = element_text(face = "bold", size = 10),
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(face = "italic"),
plot.caption = element_text(face = "italic"),
panel.grid.major.x = element_line(colour="grey60", linetype="dotted"),
panel.grid.minor.x = element_blank(),
panel.grid.major.y = element_line(colour="grey60", linetype="dotted"),
legend.text = element_text(face = "italic", size = 10),
legend.title = element_text(face = "bold"),
legend.position = "top" ) +
labs(
title = "Alaton's Model: Seasonal Function",
subtitle = "Empiric mean temperature (Black), Seasonal mean temperature (Red)",
y = "Temperature (°C)",
x = NULL,
caption = "Neviano De Rossi (PA), Years: 2010-2021"
)Calibration of the Parameters of the SDE
In total we have to estimate two parameters \(a\) and \(\sigma^2\).
Mean Reversion Parameter
In order to estimate the mean-reversion parameter, Alaton uses the martingale estimation method suggested in Bibby and Sorensen. An efficient estimator could be obtained searching for the value of \(\hat{a}_n\) such that: \[G_n(\hat{a}_n) = 0\] where: \[G_n(a) = \sum_{i=1}^{n} \frac{\partial_{a} b(T_{i-1}; a) }{\sigma_{i-1}^2} \biggl\{ T_i - E[T_i|T_{i-1}] \biggl\}\]
# Moving Standard Deviation
df_model$Sigma2 <- map_dbl(1:nrow(df_model), ~var(df_model$Temp[1:.x]))
df_model$L1_Sigma2 <- lag(df_model$Sigma2, 1)
# Lag of the Temperature Mean
df_model$L1_Temp_m <- lag(df_model$Temp_m, 1)
# Lag of the Temperature
df_model$L1_Temp <- lag(df_model$Temp, 1)
# Lag of dependent variable
df_model$L1_Y <- (df_model$L1_Temp_m - df_model$L1_Temp)/(df_model$L1_Sigma2)
# Remove first two rows to avoid NAs
df_model <- df_model[-c(1,2),]
# Mean Reversion Parameter
a <- -log(sum(df_model$L1_Y * (df_model$Temp - df_model$Temp_m))/sum(df_model$L1_Y * (df_model$L1_Temp- df_model$L1_Temp_m)))Sigma Parameter
In order to estimate the parameter \(\sigma_t\) the author proposed to use the seasonal variance. However this parameter could be estimated using two approaches. The first one consist in simply the computation of: \[\hat{\sigma}_u^2 = \frac{1}{N_u} \sum_{j = 0}^{N_u -1} (T_{j+1} - T_j)^2\] Considering the observations for the \(u\)-th month. The second estimator is obtained discretiziting the SDE for the mean temperature and it read as follow: \[\hat{\sigma}_u^2 = \frac{1}{N_u-2} \sum_{j = 0}^{N_u -1} (T_{j} - \hat{a} T_{j-1}^m - (1 - \hat{a}) T_{j-1} )^2\]
seasonal_sigma = tibble(Month = 1:12, Nu = 0, Sigma = 0)
for(i in 1:nrow(seasonal_sigma)){
df_month <- dplyr::filter(df_model, Month == i)
Xj <- a * df_month$L1_Temp_m + (1 - a) * df_month$L1_Temp
Xj_star <- df_month$Temp - (df_month$Temp_m - df_month$L1_Temp_m)
seasonal_sigma$Nu[i] <- nrow(df_month)
seasonal_sigma$Sigma[i] <- sum((Xj_star - Xj)^2) / (nrow(df_month))
seasonal_sigma$Sigma[i] <- sqrt(seasonal_sigma$Sigma[i])
}
df_model = left_join(df_model, select(seasonal_sigma, Month, Sigma), by = "Month")
# Create the Seasonal Function for variance
SeasonalSigma <- function(t){
df_model$Sigma[max(t, 1)]
}
SimulateTemperature <- function(X0 = 1, dt = 1, N = 100, seed = 1){
a = 0.1776014
dt = 1
set.seed(seed)
dWt <- rnorm(N, mean = 0, sd = sqrt(dt))
Differences = c(0)
Deterministic = c(0)
Stochastic = c(0)
Variance = c(0)
Path = c(X0)
for(i in 2:N) {
Deterministic[i] <- (SeasonalDrift(i) + a *( SeasonalFunction(i) - Path[i-1]))*dt
Variance[i] <- (1 - exp(-2 * a)) / (2 * a)
Stochastic[i] <- SeasonalSigma(i) * sqrt(Variance[i]) * dWt[i]
Differences[i] <- Deterministic[i] + Stochastic[i]
Path[i] <- Path[i-1] + Differences[i]
}
tibble(t = 1:N,
Path = Path,
Differences = Differences,
Deterministic = Deterministic,
Stochastic = Stochastic,
Variance = Variance)
}
df_model$Sim1 <- SimulateTemperature(X0 = df_model$Temp[1], N = nrow(df_model), seed = 1)$Path
df_model$Sim2 <- SimulateTemperature(X0 = df_model$Temp[1], N = nrow(df_model), seed = 2)$Path
ggplot(df_model[1:1200,])+
geom_line(aes(Date, Sim1), color = "red", alpha = 1)+
geom_line(aes(Date, Temp), alpha = 0.5)+
theme_bw()+
theme(axis.text.x = element_text(angle = 0, face = "bold", size = 7),
axis.text.y = element_text(face = "bold"),
axis.title = element_text(face = "bold"),
axis.title.x = element_text(face = "bold",size = 10),
axis.title.y = element_text(face = "bold", size = 10),
plot.title = element_text(face = "bold"),
plot.subtitle = element_text(face = "italic"),
plot.caption = element_text(face = "italic"),
panel.grid.major.x = element_line(colour="grey60", linetype="dotted"),
panel.grid.minor.x = element_blank(),
panel.grid.major.y = element_line(colour="grey60", linetype="dotted"),
legend.text = element_text(face = "italic", size = 10),
legend.title = element_text(face = "bold"),
legend.position = "top" ) +
labs(
title = "Alaton's Model: Simulated vs Empiric",
subtitle = "Empiric (Black), Simulated (Red)",
y = "Temperature (°C)",
x = NULL,
caption = "Neviano De Rossi (PA), Years: 2010-2021"
)