Risk models and forecasting: Hidden Markov model with skew-normal distribution.

Authors

Saidjon Rustamov

Dinara Khaldarova

1 SKEW-NORMAL DISTRIBUTION

1.1 🧠 What is the Skew-Normal Distribution?

The skew-normal distribution is a generalization of the normal (Gaussian) distribution that allows for asymmetry (skewness). While the standard normal distribution is symmetric around the mean, the skew-normal can lean to the left or right, depending on the shape parameter.

1.2 📈 Probability Density Function (PDF)

The probability density function (PDF) of the skew-normal distribution is:

\[ f(x; \xi, \omega, \alpha) = \frac{2}{\omega} \, \phi\left( \frac{x - \xi}{\omega} \right) \, \Phi\left( \alpha \cdot \frac{x - \xi}{\omega} \right) \]

Where:

  • \(\phi(\cdot)\) : Standard normal PDF
  • \(\Phi(\cdot)\) : Standard normal CDF
  • \(\xi\) : Location parameter (like the mean)
  • \(\omega\) > 0 : Scale parameter (like the standard deviation)
  • \(\alpha\) : Shape parameter (controls skewness)
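
As a quick check of this formula, here is a minimal sketch (arbitrary parameter values) that computes the density directly from \(\phi\) and \(\Phi\) and compares it with sn::dsn:

Code
# Minimal sketch: verify the skew-normal PDF formula against sn::dsn
library(sn)
xi <- 1; omega <- 2; alpha <- 3                      # arbitrary location, scale, shape
x <- seq(-5, 8, length.out = 5)
z <- (x - xi) / omega
manual <- (2 / omega) * dnorm(z) * pnorm(alpha * z)  # 2/omega * phi(z) * Phi(alpha*z)
all.equal(manual, dsn(x, xi = xi, omega = omega, alpha = alpha))  # TRUE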

1.3 🔁 Special Cases

  • If \(\alpha = 0\): the distribution reduces to a normal distribution (standard normal when \(\xi = 0\) and \(\omega = 1\)).
  • If \(\alpha > 0\): the distribution is right-skewed (longer right tail).
  • If \(\alpha < 0\): the distribution is left-skewed (longer left tail).
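
The first special case is easy to confirm numerically (continuing the sketch above, with the sn package loaded):

Code
# With alpha = 0 the skew-normal density collapses to the normal density
x <- seq(-4, 4, length.out = 5)
all.equal(dsn(x, xi = 0, omega = 1, alpha = 0), dnorm(x))  # TRUE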

1.4 🔍 Difference Between Normal and Skew-Normal Distributions

The normal distribution is a special case of the skew-normal distribution. Here’s how they differ:

| Feature | Normal Distribution | Skew-Normal Distribution |
|---|---|---|
| Shape | Symmetrical (bell curve) | Asymmetrical (can lean left or right) |
| Skewness | Exactly 0 | Controlled by the shape parameter \(\alpha\) |
| PDF | Based on the standard normal PDF | Involves both the standard normal PDF and CDF |
| Parameters | Mean \(\mu\), SD \(\sigma\) | Location \(\xi\), scale \(\omega\), shape \(\alpha\) |
| Flexibility | Limited to symmetry | More flexible; can model skewed data |
| Special case | Is the \(\alpha = 0\) case of the skew-normal | Reduces to the normal when \(\alpha = 0\) |

In summary, the skew-normal adds an extra parameter to capture asymmetry, making it more adaptable for real-world data that often deviates from perfect symmetry.
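
For reference, the added asymmetry also shifts the mean away from the location parameter: for the skew-normal,

\[ \mathbb{E}[X] = \xi + \omega \delta \sqrt{2/\pi}, \qquad \delta = \frac{\alpha}{\sqrt{1 + \alpha^2}}, \]

so the mean coincides with \(\xi\) only when \(\alpha = 0\).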

Code
# Set seed for reproducibility
set.seed(123)

# Generate 1000 random values from a normal distribution
normal_data <- rnorm(1000, mean = 0, sd = 1)

# Plot histogram with density instead of frequency
hist(normal_data,
     breaks = 30,
     col = "skyblue",
     border = "white",
     probability = TRUE,
     main = "Normal Distribution with Density Curve",
     xlab = "Value",
     ylab = "Density")

# Add a normal density curve
curve(dnorm(x, mean = mean(normal_data), sd = sd(normal_data)),
      col = "darkblue",
      lwd = 2,
      add = TRUE)

Code
library(sn)
library(tseries)
library(xts)
library(ggplot2)
library(ggthemes)
library(e1071)
library(PerformanceAnalytics)
library(patchwork)
library(scales)
library(quantmod)
library(dplyr)
library(lubridate)
# Set seed for reproducibility
set.seed(123)
# Generate 1000 random values from a skew-normal distribution
# Parameters: location (xi), scale (omega), shape (alpha)
skew_data <- rsn(1000, xi = 0, omega = 1, alpha = 5)

# Plot histogram with density
hist(skew_data,
     breaks = 30,
     col = "lightgreen",
     border = "white",
     probability = TRUE,
     main = "Skew-Normal Distribution with Density Curve",
     xlab = "Value",
     ylab = "Density")

# Add skew-normal density curve
curve(dsn(x, xi = 0, omega = 1, alpha = 5),
      col = "darkgreen",
      lwd = 2,
      add = TRUE)

Code
# Set seed for reproducibility
set.seed(123)

# Generate data from skew-normal distribution with alpha = -5
skew_data_neg <- rsn(n = 1000, xi = 0, omega = 1, alpha = -5)

# Plot histogram with density
hist(skew_data_neg,
     breaks = 30,
     col = "salmon",
     border = "white",
     probability = TRUE,
     main = "Skew-Normal Distribution (α = -5)",
     xlab = "Value",
     ylab = "Density")

# Add skew-normal density curve
curve(dsn(x, xi = 0, omega = 1, alpha = -5),
      col = "darkred",
      lwd = 2,
      add = TRUE)

2 HMM (Hidden Markov Models)

2.1 📦 What is a Hidden Markov Model?

A Hidden Markov Model (HMM) is a statistical model where the system being modeled is assumed to follow a Markov process with unobserved (hidden) states.

Note

In plain terms:

  • There’s a process that moves between hidden states over time.
  • Each hidden state emits an observable output.
  • You see the outputs, but you don’t see the states.

Warning

Finance example: market regime switching, e.g., bull vs. bear markets.

2.2 🧩 HMM Components

A Hidden Markov Model (HMM) is defined by the following key elements:

2.2.1 🔒 Hidden States

Denoted as:

\[ C_1, C_2, \dots, C_T \]

  • A finite set of unobserved states (e.g., calm/volatile, sunny/rainy).
  • The state at time \(t\), denoted \(C_t\), follows a Markov chain:

\[ P(C_t \mid C_{t-1}) = \text{transition probability} \]

This means the current state only depends on the previous state, not the full history.
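
To make the mechanics concrete, here is a minimal sketch (toy transition probabilities, not estimates from data) that simulates such a chain:

Code
# Minimal sketch: simulate a 2-state Markov chain with toy probabilities
set.seed(1)
gamma <- matrix(c(0.95, 0.05,
                  0.10, 0.90), nrow = 2, byrow = TRUE)  # rows sum to 1
T_len  <- 250
states <- integer(T_len)
states[1] <- sample(1:2, 1)                             # random initial state
for (t in 2:T_len) {
  states[t] <- sample(1:2, 1, prob = gamma[states[t - 1], ])
}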

2.2.2 📈 Observations

Denoted as:

\[ X_1, X_2, \dots, X_T \]

  • Each state emits data based on a specific probability distribution.
  • The observation at time \(t\), \(X_t\), depends only on the hidden state at that time:

\[ P(X_t \mid C_t) \]

These are sometimes called the emission probabilities.
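
Continuing the sketch above (toy emission parameters), each simulated state can emit a skew-normal observation:

Code
# Minimal sketch: state-dependent skew-normal emissions (toy parameters)
library(sn)
xi_s    <- c(0.5, -0.5)   # location per state
omega_s <- c(0.8, 2.0)    # scale per state
alpha_s <- c(3, -3)       # shape per state
x <- sapply(states, function(s)
  rsn(1, xi = xi_s[s], omega = omega_s[s], alpha = alpha_s[s]))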

2.2.3 🎲 Initial Probabilities

The model starts with an initial distribution over the hidden states:

\[ \delta = P(C_1) \]

This represents the probability of starting in each possible hidden state.

2.3 🔁 Markov Property

The hidden states follow the first-order Markov assumption:

\[ P(C_t \mid C_{1:t-1}) = P(C_t \mid C_{t-1}) \]

Also, the current observation depends only on the current state:

\[ P(X_t \mid C_{1:T}, X_{1:t-1}) = P(X_t \mid C_t) \]

2.4 ⚙️ Parameters of an HMM

Transition matrix \(\Gamma\):

\[ \Gamma = \begin{bmatrix} P(C_{t+1} = 1 \mid C_t = 1) & \cdots & P(C_{t+1} = m \mid C_t = 1) \\ \vdots & \ddots & \vdots \\ P(C_{t+1} = 1 \mid C_t = m) & \cdots & P(C_{t+1} = m \mid C_t = m) \end{bmatrix} \]


Emission probabilities (depend on the application):

  • Normal: \(\mathcal{N}(\mu_i, \sigma_i^2)\)
  • Poisson: \(\text{Pois}(\lambda_i)\)
  • Skew-normal, etc.

Initial distribution:
\(\delta_i = P(C_1 = i)\)

2.5 📈 What You Can Do With HMMs

| Task | What It Means |
|---|---|
| Likelihood evaluation | Compute \(P(X_1, \dots, X_T)\) |
| State estimation | Infer the most likely hidden states |
| Parameter estimation | Learn transition & emission parameters |
| Forecasting | Predict future observations or states |
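
As a small illustration of the forecasting task (a minimal sketch with assumed toy values), one-step-ahead state prediction propagates the filtered probabilities through the transition matrix:

Code
# P(C_{T+1} = j | X_{1:T}) = sum_i P(C_T = i | X_{1:T}) * gamma[i, j]
filt_T <- c(0.7, 0.3)                       # assumed filtered probabilities at time T
gamma  <- matrix(c(0.9, 0.1,
                   0.2, 0.8), nrow = 2, byrow = TRUE)
pred_next <- c(t(filt_T) %*% gamma)         # predicted state distribution at T+1
pred_next                                   # 0.69 0.31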

2.6 📐 Key Algorithms


2.6.1 🔄 Forward Algorithm

Efficiently computes the likelihood of observed data using recursion:

\[ \alpha_t(j) = \sum_i \alpha_{t-1}(i) \cdot \gamma_{ij} \cdot P(X_t \mid C_t = j) \]

Where:

  • \(\alpha_t(j)\): probability of the partial sequence up to time \(t\) ending in state \(j\)
  • \(\gamma_{ij}\): transition probability from state \(i\) to state \(j\)
  • \(P(X_t \mid C_t = j)\): emission probability at time \(t\)
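
A minimal sketch of this recursion with normal emissions follows (the project code below uses skew-normal emissions and the same per-step rescaling to avoid numerical underflow):

Code
# Minimal sketch: scaled forward recursion returning the log-likelihood
forward_llik <- function(x, gamma, delta, mu, sigma) {
  a <- delta * dnorm(x[1], mu, sigma)   # alpha_1(j) = delta_j * P(x_1 | C_1 = j)
  llik <- log(sum(a))
  a <- a / sum(a)                       # rescale to prevent underflow
  for (t in 2:length(x)) {
    a <- c(t(a) %*% gamma) * dnorm(x[t], mu, sigma)
    llik <- llik + log(sum(a))
    a <- a / sum(a)
  }
  llik
}
# e.g. forward_llik(rnorm(100), gamma, delta = c(0.5, 0.5), mu = c(-1, 1), sigma = c(1, 2))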

2.6.2 🔁 Backward Algorithm

Used for:

  • Smoothing (estimating past states)
  • EM Algorithm for learning parameters
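
For reference, the backward recursion (standard form; not used directly in the project code below) is

\[ \beta_t(i) = \sum_j \gamma_{ij} \, P(X_{t+1} \mid C_{t+1} = j) \, \beta_{t+1}(j), \qquad \beta_T(i) = 1. \]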

2.7 📊 Visualization

  • 📈 Plot observations with colored hidden states
  • 📊 Show state probabilities over time
  • 📉 Compare model fit using criteria like AIC and BIC across different numbers of states

2.8 📌 Advantages of HMMs

✅ Handles hidden structure in time series
✅ Efficient thanks to Forward-Backward recursion
✅ Works well for sequential or temporal data
✅ Easily extended (e.g., with skew-normal emissions, covariates, etc.)

3 Main Part: Project code

3.1 STOXX600: Analyzing data

Code
sx600 = get.hist.quote(instrument = "^stoxx", start = "2015-01-01", end = "2025-04-30",
                       quote = "AdjClose")
time series starts 2015-01-05
time series ends   2025-04-29
Code
sx600.num = as.numeric(sx600)
plot(sx600)

Code
nts = length(sx600)
sx600.sr = 100*diff(sx600.num)/sx600.num[-nts]
sx600.cr = diff(log(sx600))
plot(sx600.sr, type = "l", ylim = c(-10,10))

Code
hist(sx600.sr,
     breaks = 500,               
     xlim = c(-4, 4),
     probability = TRUE,         
     col = "red",
     border = "white",
     main = "Histogram with Fitted Skew-Normal Curve2",
     xlab = "Returns")


f=function(x) 0.7*dsn(x, xi=0.5, omega=0.8, alpha=5) +
  0.2*dsn(x, xi=0.6, omega=0.8, alpha=-4) +
  0.1*dsn(x, xi=-1.85, omega=3, alpha=2)
curve(f,
      col = "blue",
      lwd = 2,
      add = TRUE)

Code
set.seed(42)  # For reproducibility

data <- as.numeric(sx600.sr)
n <- length(data)

# Define your target proportions
weights <- c(0.9, 0.7, 0.4, 0.2)

# Initialize vectors to store estimated parameters
xi <- numeric(4)
omega <- numeric(4)
alpha <- numeric(4)

# Create overlapping subsets
splits <- list()
for (i in 1:4) {
  size_i <- floor(weights[i] * n)
  splits[[i]] <- sample(data, size_i, replace = FALSE)  # sample without replacement, but from full data each time
}

# Estimate skew-normal parameters for each subset
results <- list()
for (i in 1:4) {
  fit <- selm(splits[[i]] ~ 1, family = "SN")
  params <- coef(fit, param.type = "DP")
  # Store estimates
  xi[i] <- params["xi"]
  omega[i] <- params["omega"]
  alpha[i] <- params["alpha"]
  results[[i]] <- c(params,xi,omega, alpha)
  cat(sprintf("Subset %d (%.0f%% of data): xi = %.3f, omega = %.3f, alpha = %.3f\n",
              i, weights[i]*100, params["xi"], params["omega"], params["alpha"]))
}
Subset 1 (90% of data): xi = 0.875, omega = 1.358, alpha = -1.378
Subset 2 (70% of data): xi = 0.858, omega = 1.345, alpha = -1.278
Subset 3 (40% of data): xi = 0.923, omega = 1.447, alpha = -1.557
Subset 4 (20% of data): xi = -0.597, omega = 1.255, alpha = 0.809
Code
forward_skew_loop = function(data, location.vec, scale.vec, shape) {
  results = list()
  
  for (m in 2:4) {
    n = length(data)
    
    # Generate gamma: m x m transition matrix with row sums = 1
    gamma = matrix(runif(m * m), nrow = m)
    gamma = gamma / rowSums(gamma)
    
    # Initial state probabilities (delta): uniform
    delta = rep(1/m, m)
    
    # Initialize matrices
    pred = matrix(NA, nrow = m, ncol = n)
    filtering = matrix(NA, nrow = m, ncol = n)
    
    # Time t = 1
    pred[, 1] = delta
    filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m], omega = scale.vec[1:m], alpha = shape[1:m])
    llik = log(sum(filtering[, 1]))
    filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
    
    # Loop over time
    for (t in 2:n) {
      pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
      filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m], omega = scale.vec[1:m], alpha = shape[1:m])
      llik = llik + log(sum(filtering[, t]))
      filtering[, t] = filtering[, t] / sum(filtering[, t])
    }
    
    results[[m]] = list(
      llik = llik,
      filtering = filtering,
      pred = pred,
      m = m,
      gamma = gamma
    )
    
  }
  return(results)
}


out = forward_skew_loop(data=as.numeric(sx600.sr), 
                        location.vec = xi, 
                        scale.vec = omega, 
                        shape = alpha)

for (m in 2:4) {
  cat("HMM with", m, "components has log-likelihood:", out[[m]][["llik"]], "\n")
}
HMM with 2 components has log-likelihood: -3755.94 
HMM with 3 components has log-likelihood: -3753.523 
HMM with 4 components has log-likelihood: -3729.03 
Code
# Initialize empty lists to store results
transition_m = list()
solved = list()

for (m in 2:4) {
  gamma_m = out[[m]][["gamma"]]
  
  # Compute stationary distribution: delta solves delta %*% gamma = delta with
  # sum(delta) = 1, equivalently t(diag(m) - gamma + ones) %*% delta = rep(1, m)
  transition_m[[m]] = solve(t(diag(m) - gamma_m + matrix(1, m, m)), rep(1, m))
  
  # Multiply with gamma to check if it's stationary
  solved[[m]] = transition_m[[m]] %*% gamma_m
  print(transition_m[[m]])
  print(solved[[m]])
  
}
[1] 0.8430069 0.1569931
          [,1]      [,2]
[1,] 0.8430069 0.1569931
[1] 0.3455869 0.2265738 0.4278392
          [,1]      [,2]      [,3]
[1,] 0.3455869 0.2265738 0.4278392
[1] 0.1680199 0.2624128 0.2600075 0.3095598
          [,1]      [,2]      [,3]      [,4]
[1,] 0.1680199 0.2624128 0.2600075 0.3095598
Code
llik = function(par, m, data) {
  # Extract and normalize transition matrix
  gamma = matrix(par[1:m^2], nrow = m, byrow = TRUE)
  gamma = gamma / apply(gamma, 1, sum)
  
  # Extract parameters
  location.vec = par[(m^2 + 1):(m^2 + m)]
  scale.vec = par[(m^2 + m + 1):(m^2 + 2*m)]
  shape.vec = par[(m^2 + 2*m + 1):(m^2 + 3*m)]
  
  # Compute stationary distribution
  delta = c(solve(t(diag(m) - gamma + matrix(1, m, m)), rep(1, m)))
  
  n = length(data)
  pred = matrix(NA, nrow = m, ncol = n)
  filtering = matrix(NA, nrow = m, ncol = n)
  
  # Time t = 1
  pred[, 1] = delta
  filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m],
                                   omega = scale.vec[1:m], alpha = shape.vec[1:m])
  llik = log(sum(filtering[, 1]))
  filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
  
  # Loop over time
  for (t in 2:n) {
    pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
    filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m],
                                     omega = scale.vec[1:m], alpha = shape.vec[1:m])
    llik = llik + log(sum(filtering[, t]))
    filtering[, t] = filtering[, t] / sum(filtering[, t])
  }
  
  return(llik)
}

llik_vals = numeric(4)
for (i in 2:4) {
  gamma_i = out[[i]][["gamma"]]
  location_init = xi[1:i]
  scale_init = omega[1:i]
  shape_init = alpha[1:i]
  
  par_i = c(t(gamma_i), location_init, scale_init, shape_init)  
  
  llik_vals[i] = llik(par = par_i, m = i, data = as.numeric(sx600.sr))
  cat("Log-likelihood for", i, "components:", llik_vals[i], "\n")
  }
Log-likelihood for 2 components: -3755.939 
Log-likelihood for 3 components: -3753.525 
Log-likelihood for 4 components: -3729.031 
Code
llik_optim_vals = numeric(4)
llik_optim_pars = list()
for (i in 2:4){
  gamma_i = out[[i]][["gamma"]]
  location_init = xi[1:i]
  scale_init = omega[1:i]
  shape_init = alpha[1:i]
  par_i = c(t(gamma_i), location_init, scale_init, shape_init)
  outhmm=optim(par_i,
               fn=llik,
               method="L-BFGS-B",
               # box constraints: transition entries in [0,1]; xi free;
               # omega >= 0.01; alpha in [-10, 10]
               lower=c(rep(0, i^2), rep(-Inf, i), rep(0.01, i), rep(-10, i)),
               upper=c(rep(1, i^2), rep(Inf, i), rep(Inf, i), rep(10, i)),
               control=list(fnscale=-1),  # fnscale = -1 makes optim maximize llik
               data=as.numeric(sx600.sr),m=i)
  llik_optim_vals[i]=outhmm$value
  llik_optim_pars[[i]]=outhmm$par
  cat("MAX llik for", i, "components:", llik_optim_vals[i], "\n")
}
MAX llik for 2 components: -3358.633 
MAX llik for 3 components: -3284.078 
MAX llik for 4 components: -3274.408 
Code
aic_last_vals = numeric()
for (i in 2:4){
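  # free parameters: i*(i-1) transition probabilities + 3*i emission parameters (xi, omega, alpha)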
  aic_last_vals[i] = -2*llik_optim_vals[i]+2*(i*(i-1)+3*i)
  cat("AIC for", i, "components:", aic_last_vals[i], "\n")
}
AIC for 2 components: 6733.266 
AIC for 3 components: 6598.156 
AIC for 4 components: 6596.815 
Code
best_model = which.min(aic_last_vals)
cat("\nBest model is with", best_model, "components (lowest AIC = ", aic_last_vals[best_model], ")\n")

Best model is with 4 components (lowest AIC =  6596.815 )
Code
m=best_model
gamma.mle = matrix(llik_optim_pars[[m]][1:m^2], ncol=m, byrow=TRUE)
gamma.mlev = gamma.mle/apply(gamma.mle, FUN=sum, MAR=1)
gamma.mlev
           [,1]       [,2]        [,3]      [,4]
[1,] 0.93760405 0.05261219 0.009783761 0.0000000
[2,] 0.04427223 0.95572777 0.000000000 0.0000000
[3,] 0.00000000 0.00000000 0.505253711 0.4947463
[4,] 0.08535604 0.00000000 0.442838669 0.4718053
Code
xi = llik_optim_pars[[m]][(m^2 + 1):(m^2 + m)]
cat("\nBest model's location parameter is:", llik_optim_pars[[m]][(m^2 + 1):(m^2 + m)])

Best model's location parameter is: 0.7496635 0.5134708 0.5325302 -0.009874364
Code
omega = llik_optim_pars[[m]][(m^2 + m + 1):(m^2 + 2*m)]
cat("\nBest model's scale parameter is:", llik_optim_pars[[m]][(m^2 + m + 1):(m^2 + 2*m)])

Best model's scale parameter is: 1.301072 0.6749959 2.951049 2.074013
Code
alpha = llik_optim_pars[[m]][(m^2 + 2*m + 1):(m^2 + 3*m)]
cat("\nBest model's shape parameter is:", llik_optim_pars[[m]][(m^2 + 2*m + 1):(m^2 + 3*m)])    

Best model's shape parameter is: -1.133608 -1.252673 -1.891089 1.360593
Code
llik1 = function(par, m, data) {
  results = list()
  # Extract and normalize transition matrix
  gamma = matrix(par[1:m^2], nrow = m, byrow = TRUE)
  gamma = gamma / apply(gamma, 1, sum)
  
  # Extract parameters
  location.vec = par[(m^2 + 1):(m^2 + m)]
  scale.vec = par[(m^2 + m + 1):(m^2 + 2*m)]
  shape.vec = par[(m^2 + 2*m + 1):(m^2 + 3*m)]
  
  # Compute stationary distribution
  delta = c(solve(t(diag(m) - gamma + matrix(1, m, m)), rep(1, m)))
  
  n = length(data)
  pred = matrix(NA, nrow = m, ncol = n)
  filtering = matrix(NA, nrow = m, ncol = n)
  
  # Time t = 1
  pred[, 1] = delta
  filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m],
                                   omega = scale.vec[1:m], alpha = shape.vec[1:m])
  llik = log(sum(filtering[, 1]))
  filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
  
  # Loop over time
  for (t in 2:n) {
    pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
    filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m],
                                     omega = scale.vec[1:m], alpha = shape.vec[1:m])
    llik = llik + log(sum(filtering[, t]))
    filtering[, t] = filtering[, t] / sum(filtering[, t])
  }
  results=list(llik = llik,
               filtering = filtering,
               pred = pred,
               delta = delta)
  return(results)
}

pars = llik_optim_pars[[m]]
forward_results <- llik1(par = llik_optim_pars[[m]], m = m, data = as.numeric(sx600.sr))
filtering <- forward_results$filtering
most_likely_states <- apply(filtering, 2, which.max)

# Plot histograms by state using only training data
par(mfrow = c(2, 2))  # 2x2 grid for 4 states

for (state in 1:4) {
  # Filter: only training period & only current state
  data_state <- as.numeric(sx600.sr)[most_likely_states == state]
  
  if (length(data_state) == 0) {
    plot.new()
    title(main = paste("No data for State", state))
    next
  }
  
  hist_info <- hist(data_state, breaks = 50, plot = FALSE)  # $density is computed even without plotting
  
  data_min <- min(data_state)
  data_max <- max(data_state)
  range_expansion <- (data_max - data_min) * 0.15
  
  x_min <- data_min - range_expansion
  x_max <- data_max + range_expansion
  x_vals <- seq(x_min, x_max, length.out = 500)
  y_vals <- dsn(x_vals, xi = xi[state], omega = omega[state], alpha = alpha[state])
  
  ylim_max <- max(max(hist_info$density), max(y_vals)) * 1.1
  
  hist(data_state, probability = TRUE, breaks = 50,
       main = paste("Histogram of Returns - State", state),
       xlab = "Log-Return", ylab = "Density",
       col = "lightgray", border = "white",
       xlim = c(x_min, x_max), ylim = c(0, ylim_max))
  
  curve(dsn(x, xi = xi[state], omega = omega[state], alpha = alpha[state]),
        from = x_min, to = x_max, add = TRUE, col = "red", lwd = 2)
}

Code
dates = index(sx600[-1])
sx600.ret = xts(sx600.cr, order.by = dates)
dates_full <- index(sx600.ret)
returns_full <- na.omit(as.numeric(sx600.ret[,1]))



plot_data_train <- data.frame(
  Date = dates_full,
  Return = returns_full,
  State = factor(most_likely_states, levels = 1:4)  # Only 4 states
)

# Plot
ggplot(plot_data_train, aes(x = Date, y = Return, color = State)) +
  geom_point(size = 0.8, alpha = 0.8) +
  scale_color_brewer(palette = "Dark2") +
  labs(title = "STOXX600 Daily Returns Colored by Hidden State (2015–2025)",
       y = "Log-Return",
       x = "Date") +
  theme_minimal()

3.2 SP500: Analyzing data

Code
sp500 = get.hist.quote(instrument = "^gspc", start = "2015-01-01", end = "2025-04-30"
                       , quote = "AdjClose")
time series starts 2015-01-02
time series ends   2025-04-29
Code
sp500.num = as.numeric(sp500)
plot(sp500)

Code
nts = length(sp500)
sp500.sr = 100*diff(sp500.num)/sp500.num[-nts]
sp500.cr = diff(log(sp500))
plot(sp500.sr, type = "l", ylim = c(-10,10))

Code
hist(sp500.sr,
     breaks = 500,               
     xlim = c(-4, 4),
     probability = TRUE,         
     col = "red",
     border = "white",
     main = "Histogram with Fitted Skew-Normal Curve2",
     xlab = "Returns")


f=function(x) 0.5*dsn(x, xi=0.3, omega=0.6, alpha=5) +
  0.2*dsn(x, xi=0.4, omega=0.8, alpha=-4) +
  0.1*dsn(x, xi=-1, omega=1, alpha=2) +
  0.2*dsn(x, xi=1, omega=1, alpha=2)
curve(f,
      col = "blue",
      lwd = 2,
      add = TRUE)

Code
set.seed(45)  # For reproducibility

data <- as.numeric(sp500.sr)
n <- length(data)

# Define your target proportions
weights <- c(0.9, 0.7, 0.4, 0.2)

# Initialize vectors to store estimated parameters
xi <- numeric(4)
omega <- numeric(4)
alpha <- numeric(4)

# Create overlapping subsets
splits <- list()
for (i in 1:4) {
  size_i <- floor(weights[i] * n)
  splits[[i]] <- sample(data, size_i, replace = FALSE)  # sample without replacement, but from full data each time
}

# Estimate skew-normal parameters for each subset
results <- list()
for (i in 1:4) {
  fit <- selm(splits[[i]] ~ 1, family = "SN")
  params <- coef(fit, param.type = "DP")
  # Store estimates
  xi[i] <- params["xi"]
  omega[i] <- params["omega"]
  alpha[i] <- params["alpha"]
  results[[i]] <- c(params,xi,omega, alpha)
  cat(sprintf("Subset %d (%.0f%% of data): xi = %.3f, omega = %.3f, alpha = %.3f\n",
              i, weights[i]*100, params["xi"], params["omega"], params["alpha"]))
}
Subset 1 (90% of data): xi = 0.795, omega = 1.370, alpha = -0.955
Subset 2 (70% of data): xi = 0.859, omega = 1.419, alpha = -1.052
Subset 3 (40% of data): xi = -0.649, omega = 1.369, alpha = 0.892
Subset 4 (20% of data): xi = 0.908, omega = 1.411, alpha = -1.186
Code
forward_skew_loop = function(data, location.vec, scale.vec, shape) {
  results = list()
  
  for (m in 2:4) {
    n = length(data)
    
    # Generate gamma: m x m transition matrix with row sums = 1
    gamma = matrix(runif(m * m), nrow = m)
    gamma = gamma / rowSums(gamma)
    
    # Initial state probabilities (delta): uniform
    delta = rep(1/m, m)
    
    # Initialize matrices
    pred = matrix(NA, nrow = m, ncol = n)
    filtering = matrix(NA, nrow = m, ncol = n)
    
    # Time t = 1
    pred[, 1] = delta
    filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m], omega = scale.vec[1:m], alpha = shape[1:m])
    llik = log(sum(filtering[, 1]))
    filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
    
    # Loop over time
    for (t in 2:n) {
      pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
      filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m], omega = scale.vec[1:m], alpha = shape[1:m])
      llik = llik + log(sum(filtering[, t]))
      filtering[, t] = filtering[, t] / sum(filtering[, t])
    }
    
    results[[m]] = list(
      llik = llik,
      filtering = filtering,
      pred = pred,
      m = m,
      gamma = gamma
    )
    
  }
  return(results)
}

out = forward_skew_loop(data=as.numeric(sp500.sr), 
                        location.vec = xi, 
                        scale.vec = omega, 
                        shape = alpha)


for (m in 2:4) {
  cat("HMM with", m, "components has log-likelihood:", out[[m]][["llik"]], "\n")
}
HMM with 2 components has log-likelihood: -4023.631 
HMM with 3 components has log-likelihood: -3987.416 
HMM with 4 components has log-likelihood: -3971.691 
Code
# Initialize empty lists to store results
transition_m = list()
solved = list()

for (m in 2:4) {
  gamma_m = out[[m]][["gamma"]]
  
  # Compute stationary distribution
  transition_m[[m]] = solve(t(diag(m) - gamma_m + matrix(1, m, m)), rep(1, m))
  
  # Multiply with gamma to check if it's stationary
  solved[[m]] = transition_m[[m]] %*% gamma_m
  print(transition_m[[m]])
  print(solved[[m]])
  
}
[1] 0.5477482 0.4522518
          [,1]      [,2]
[1,] 0.5477482 0.4522518
[1] 0.2035905 0.1775676 0.6188419
          [,1]      [,2]      [,3]
[1,] 0.2035905 0.1775676 0.6188419
[1] 0.2317339 0.3612939 0.2448064 0.1621657
          [,1]      [,2]      [,3]      [,4]
[1,] 0.2317339 0.3612939 0.2448064 0.1621657
Code
llik = function(par, m, data) {
  # Extract and normalize transition matrix
  gamma = matrix(par[1:m^2], nrow = m, byrow = TRUE)
  gamma = gamma / apply(gamma, 1, sum)
  
  # Extract parameters
  location.vec = par[(m^2 + 1):(m^2 + m)]
  scale.vec = par[(m^2 + m + 1):(m^2 + 2*m)]
  shape.vec = par[(m^2 + 2*m + 1):(m^2 + 3*m)]
  
  # Compute stationary distribution
  delta = c(solve(t(diag(m) - gamma + matrix(1, m, m)), rep(1, m)))
  
  n = length(data)
  pred = matrix(NA, nrow = m, ncol = n)
  filtering = matrix(NA, nrow = m, ncol = n)
  
  # Time t = 1
  pred[, 1] = delta
  filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m],
                                   omega = scale.vec[1:m], alpha = shape.vec[1:m])
  llik = log(sum(filtering[, 1]))
  filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
  
  # Loop over time
  for (t in 2:n) {
    pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
    filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m],
                                     omega = scale.vec[1:m], alpha = shape.vec[1:m])
    llik = llik + log(sum(filtering[, t]))
    filtering[, t] = filtering[, t] / sum(filtering[, t])
  }
  
  return(llik)
}







llik_vals = numeric(4)
for (i in 2:4) {
  gamma_i = out[[i]][["gamma"]]
  location_init = xi[1:i]
  scale_init = omega[1:i]
  shape_init = alpha[1:i]
  
  par_i = c(t(gamma_i), location_init, scale_init, shape_init)  
  
  llik_vals[i] = llik(par = par_i, m = i, data = as.numeric(sp500.sr))
  cat("Log-likelihood for", i, "components:", llik_vals[i], "\n")
}
Log-likelihood for 2 components: -4023.632 
Log-likelihood for 3 components: -3987.419 
Log-likelihood for 4 components: -3971.684 
Code
llik_optim_vals = numeric(4)
llik_optim_pars = list()
for (i in 2:4){
  gamma_i = out[[i]][["gamma"]]
  location_init = xi[1:i]
  scale_init = omega[1:i]
  shape_init = alpha[1:i]
  par_i = c(t(gamma_i), location_init, scale_init, shape_init)
  outhmm=optim(par_i,
               fn=llik,
               method="L-BFGS-B",
               lower=c(rep(0, i^2), rep(-Inf,i), rep(0.01, i), rep(-10, i)),
               upper=c(rep(1, i^2), rep(Inf, i), rep(Inf, i), rep(10, i)),
               control=list(fnscale=-1),data=as.numeric(sp500.sr),m=i)
  llik_optim_vals[i]=outhmm$value
  llik_optim_pars[[i]]=outhmm$par
  cat("MAX llik for", i, "components:", llik_optim_vals[i], "\n")
}
MAX llik for 2 components: -3456.351 
MAX llik for 3 components: -3330.375 
MAX llik for 4 components: -3318.017 
Code
aic_last_vals = numeric()
for (i in 2:4){
  aic_last_vals[i] = -2*llik_optim_vals[i]+2*(i*(i-1)+3*i)
  cat("AIC for", i, "components:", aic_last_vals[i], "\n")
}
AIC for 2 components: 6928.701 
AIC for 3 components: 6690.75 
AIC for 4 components: 6684.033 
Code
best_model = which.min(aic_last_vals)
cat("\nBest model is with", best_model, "components (lowest AIC = ", aic_last_vals[best_model], ")\n")

Best model is with 4 components (lowest AIC =  6684.033 )
Code
m=best_model
gamma.mle = matrix(llik_optim_pars[[m]][1:m^2], ncol=m, byrow=TRUE)
gamma.mlev = gamma.mle/apply(gamma.mle, FUN=sum, MAR=1)
gamma.mlev
           [,1]       [,2]       [,3]      [,4]
[1,] 0.97373389 0.02626611 0.00000000 0.0000000
[2,] 0.00000000 0.47620583 0.03685692 0.4869373
[3,] 0.00000000 0.00000000 0.90362390 0.0963761
[4,] 0.04014038 0.08207112 0.00000000 0.8777885
Code
xi = llik_optim_pars[[m]][(m^2 + 1):(m^2 + m)]
cat("\nBest model's location parameter is:", llik_optim_pars[[m]][(m^2 + 1):(m^2 + m)])

Best model's location parameter is: 0.3829119 0.2964956 -0.7069928 1.106778
Code
omega = llik_optim_pars[[m]][(m^2 + m + 1):(m^2 + 2*m)]
cat("\nBest model's scale parameter is:", llik_optim_pars[[m]][(m^2 + m + 1):(m^2 + 2*m)])

Best model's scale parameter is: 0.5968282 1.692717 3.722713 1.502212
Code
alpha = llik_optim_pars[[m]][(m^2 + 2*m + 1):(m^2 + 3*m)]
cat("\nBest model's shape parameter is:", llik_optim_pars[[m]][(m^2 + 2*m + 1):(m^2 + 3*m)])    

Best model's shape parameter is: -0.7264074 -1.438691 0.2471854 -1.268492
Code
llik1 = function(par, m, data) {
  results = list()
  # Extract and normalize transition matrix
  gamma = matrix(par[1:m^2], nrow = m, byrow = TRUE)
  gamma = gamma / apply(gamma, 1, sum)
  
  # Extract parameters
  location.vec = par[(m^2 + 1):(m^2 + m)]
  scale.vec = par[(m^2 + m + 1):(m^2 + 2*m)]
  shape.vec = par[(m^2 + 2*m + 1):(m^2 + 3*m)]
  
  # Compute stationary distribution
  delta = c(solve(t(diag(m) - gamma + matrix(1, m, m)), rep(1, m)))
  
  n = length(data)
  pred = matrix(NA, nrow = m, ncol = n)
  filtering = matrix(NA, nrow = m, ncol = n)
  
  # Time t = 1
  pred[, 1] = delta
  filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m],
                                   omega = scale.vec[1:m], alpha = shape.vec[1:m])
  llik = log(sum(filtering[, 1]))
  filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
  
  # Loop over time
  for (t in 2:n) {
    pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
    filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m],
                                     omega = scale.vec[1:m], alpha = shape.vec[1:m])
    llik = llik + log(sum(filtering[, t]))
    filtering[, t] = filtering[, t] / sum(filtering[, t])
  }
  results=list(llik = llik,
               filtering = filtering,
               pred = pred,
               delta = delta)
  return(results)
}

pars = llik_optim_pars[[m]]
forward_results <- llik1(par = llik_optim_pars[[m]], m = m, data = as.numeric(sp500.sr))
filtering <- forward_results$filtering
most_likely_states <- apply(filtering, 2, which.max)

# Plot histograms by state using only training data
par(mfrow = c(2, 2))  # 2x2 grid for 4 states

for (state in 1:4) {
  # Filter: only training period & only current state
  data_state <- as.numeric(sp500.sr)[most_likely_states == state]
  
  if (length(data_state) == 0) {
    plot.new()
    title(main = paste("No data for State", state))
    next
  }
  
  hist_info <- hist(data_state, breaks = 50, plot = FALSE)  # $density is computed even without plotting
  
  data_min <- min(data_state)
  data_max <- max(data_state)
  range_expansion <- (data_max - data_min) * 0.15
  
  x_min <- data_min - range_expansion
  x_max <- data_max + range_expansion
  x_vals <- seq(x_min, x_max, length.out = 500)
  y_vals <- dsn(x_vals, xi = xi[state], omega = omega[state], alpha = alpha[state])
  
  ylim_max <- max(max(hist_info$density), max(y_vals)) * 1.1
  
  hist(data_state, probability = TRUE, breaks = 50,
       main = paste("Histogram of Returns - State", state),
       xlab = "Log-Return", ylab = "Density",
       col = "lightgray", border = "white",
       xlim = c(x_min, x_max), ylim = c(0, ylim_max))
  
  curve(dsn(x, xi = xi[state], omega = omega[state], alpha = alpha[state]),
        from = x_min, to = x_max, add = TRUE, col = "red", lwd = 2)
}

Code
dates = index(sp500[-1])
sp500.ret = xts(sp500.cr, order.by = dates)
dates_full <- index(sp500.ret)
returns_full <- na.omit(as.numeric(sp500.ret[,1]))


plot_data_train <- data.frame(
  Date = dates_full,
  Return = returns_full,
  State = factor(most_likely_states, levels = 1:4)  # Only 4 states
)

# Plot
ggplot(plot_data_train, aes(x = Date, y = Return, color = State)) +
  geom_point(size = 0.8, alpha = 0.8) +
  scale_color_brewer(palette = "Dark2") +
  labs(title = "SP500 Daily Returns Colored by Hidden State (2015–2025)",
       y = "Log-Return",
       x = "Date") +
  theme_minimal()

Code
par(mfrow = c(1, 1))
stat_distr = forward_results$delta
f=function(x) stat_distr[1]*dsn(x, xi=xi[1], omega=omega[1], alpha=alpha[1]) +
  stat_distr[2]*dsn(x, xi=xi[2], omega=omega[2], alpha=alpha[2]) +
  stat_distr[3]*dsn(x, xi=xi[3], omega=omega[3], alpha=alpha[3]) +
  stat_distr[4]*dsn(x, xi=xi[4], omega=omega[4], alpha=alpha[4])
cat("\nBest model's stationary distribution is:", stat_distr)

Best model's stationary distribution is: 0.5360731 0.08184454 0.03129964 0.3507828
Code
hist(sp500.sr,
     breaks = 100,               
     xlim = c(-4, 4),
     probability = TRUE,         
     col = "red",
     border = "white",
     main = "Histogram with Fitted Skew-Normal Curve, stationary distribution, optimized parameters",
     xlab = "Returns")
curve(f,
      col = "blue",
      lwd = 2,
      add = TRUE)

3.3 NIKKEI225: Analyzing data

Code
nikkei = get.hist.quote(instrument = "^N225", start = "2015-01-01", end = "2025-05-30"
                       , quote = "AdjClose")
Warning: ^N225 contains missing values. Some functions will not work if objects
contain missing values in the middle of the series. Consider using na.omit(),
na.approx(), na.fill(), etc to remove or replace them.
time series starts 2015-01-05
Code
nikkei = na.omit(nikkei)
nikkei.num = as.numeric(nikkei)
plot(nikkei)

Code
nts = length(nikkei)
nikkei.sr = 100*diff(nikkei.num)/nikkei.num[-nts]
nikkei.cr = diff(log(nikkei))
plot(nikkei.sr, type = "l", ylim = c(-10,10))

Code
hist(nikkei.sr,
     breaks = 500,               
     xlim = c(-4, 4),
     probability = TRUE,         
     col = "red",
     border = "white",
     main = "Histogram of NIKKEI index",
     xlab = "Returns")

Code
set.seed(36)  # For reproducibility

data <- as.numeric(nikkei.sr)
n <- length(data)

# Define your target proportions
weights <- c(0.9, 0.7, 0.4, 0.2)

# Initialize vectors to store estimated parameters
xi <- numeric(4)
omega <- numeric(4)
alpha <- numeric(4)

# Create overlapping subsets
splits <- list()
for (i in 1:4) {
  size_i <- floor(weights[i] * n)
  splits[[i]] <- sample(data, size_i, replace = FALSE)  # sample without replacement, but from full data each time
}

# Estimate skew-normal parameters for each subset
results <- list()
for (i in 1:4) {
  fit <- selm(splits[[i]] ~ 1, family = "SN")
  params <- coef(fit, param.type = "DP")
  # Store estimates
  xi[i] <- params["xi"]
  omega[i] <- params["omega"]
  alpha[i] <- params["alpha"]
  results[[i]] <- c(params,xi,omega, alpha)
  cat(sprintf("Subset %d (%.0f%% of data): xi = %.3f, omega = %.3f, alpha = %.3f\n",
              i, weights[i]*100, params["xi"], params["omega"], params["alpha"]))
}
Subset 1 (90% of data): xi = 0.851, omega = 1.567, alpha = -0.876
Subset 2 (70% of data): xi = 0.856, omega = 1.510, alpha = -0.945
Subset 3 (40% of data): xi = 1.032, omega = 1.639, alpha = -1.162
Subset 4 (20% of data): xi = -0.887, omega = 1.628, alpha = 0.915
Code
forward_skew_loop = function(data, location.vec, scale.vec, shape) {
  results = list()
  
  for (m in 2:4) {
    n = length(data)
    
    # Generate gamma: m x m transition matrix with row sums = 1
    gamma = matrix(runif(m * m), nrow = m)
    gamma = gamma / rowSums(gamma)
    
    # Initial state probabilities (delta): uniform
    delta = rep(1/m, m)
    
    # Initialize matrices
    pred = matrix(NA, nrow = m, ncol = n)
    filtering = matrix(NA, nrow = m, ncol = n)
    
    # Time t = 1
    pred[, 1] = delta
    filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m], omega = scale.vec[1:m], alpha = shape[1:m])
    llik = log(sum(filtering[, 1]))
    filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
    
    # Loop over time
    for (t in 2:n) {
      pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
      filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m], omega = scale.vec[1:m], alpha = shape[1:m])
      llik = llik + log(sum(filtering[, t]))
      filtering[, t] = filtering[, t] / sum(filtering[, t])
    }
    
    results[[m]] = list(
      llik = llik,
      filtering = filtering,
      pred = pred,
      m = m,
      gamma = gamma
    )
    
  }
  return(results)
}

out = forward_skew_loop(data=as.numeric(nikkei.sr), 
                        location.vec = xi, 
                        scale.vec = omega, 
                        shape = alpha)


for (m in 2:4) {
  cat("HMM with", m, "components has log-likelihood:", out[[m]][["llik"]], "\n")
}
HMM with 2 components has log-likelihood: -4304.308 
HMM with 3 components has log-likelihood: -4300.9 
HMM with 4 components has log-likelihood: -4269.035 
Code
# Initialize empty lists to store results
transition_m = list()
solved = list()

for (m in 2:4) {
  gamma_m = out[[m]][["gamma"]]
  
  # Compute stationary distribution
  transition_m[[m]] = solve(t(diag(m) - gamma_m + matrix(1, m, m)), rep(1, m))
  
  # Multiply with gamma to check if it's stationary
  solved[[m]] = transition_m[[m]] %*% gamma_m
  print(transition_m[[m]])
  print(solved[[m]])
  
}
[1] 0.6661874 0.3338126
          [,1]      [,2]
[1,] 0.6661874 0.3338126
[1] 0.3609897 0.3303818 0.3086285
          [,1]      [,2]      [,3]
[1,] 0.3609897 0.3303818 0.3086285
[1] 0.2717139 0.3431659 0.2318416 0.1532786
          [,1]      [,2]      [,3]      [,4]
[1,] 0.2717139 0.3431659 0.2318416 0.1532786
Code
llik = function(par, m, data) {
  # Extract and normalize transition matrix
  gamma = matrix(par[1:m^2], nrow = m, byrow = TRUE)
  gamma = gamma / apply(gamma, 1, sum)
  
  # Extract parameters
  location.vec = par[(m^2 + 1):(m^2 + m)]
  scale.vec = par[(m^2 + m + 1):(m^2 + 2*m)]
  shape.vec = par[(m^2 + 2*m + 1):(m^2 + 3*m)]
  
  # Compute stationary distribution
  delta = c(solve(t(diag(m) - gamma + matrix(1, m, m)), rep(1, m)))
  
  n = length(data)
  pred = matrix(NA, nrow = m, ncol = n)
  filtering = matrix(NA, nrow = m, ncol = n)
  
  # Time t = 1
  pred[, 1] = delta
  filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m],
                                   omega = scale.vec[1:m], alpha = shape.vec[1:m])
  llik = log(sum(filtering[, 1]))
  filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
  
  # Loop over time
  for (t in 2:n) {
    pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
    filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m],
                                     omega = scale.vec[1:m], alpha = shape.vec[1:m])
    llik = llik + log(sum(filtering[, t]))
    filtering[, t] = filtering[, t] / sum(filtering[, t])
  }
  
  return(llik)
}




llik_vals = numeric(4)
for (i in 2:4) {
  gamma_i = out[[i]][["gamma"]]
  location_init = xi[1:i]
  scale_init = omega[1:i]
  shape_init = alpha[1:i]
  
  par_i = c(t(gamma_i), location_init, scale_init, shape_init)  
  
  llik_vals[i] = llik(par = par_i, m = i, data = as.numeric(nikkei.sr))
  cat("Log-likelihood for", i, "components:", llik_vals[i], "\n")
}
Log-likelihood for 2 components: -4304.277 
Log-likelihood for 3 components: -4300.898 
Log-likelihood for 4 components: -4269.052 
Code
llik_optim_vals = numeric(4)
llik_optim_pars = list()
for (i in 2:4){
  gamma_i = out[[i]][["gamma"]]
  location_init = xi[1:i]
  scale_init = omega[1:i]
  shape_init = alpha[1:i]
  par_i = c(t(gamma_i), location_init, scale_init, shape_init)
  outhmm=optim(par_i,
               fn=llik,
               method="L-BFGS-B",
               lower=c(rep(0.0000000001, i^2), rep(-Inf,i), rep(0.01, i), rep(-10, i)),
               upper=c(rep(1, i^2), rep(Inf, i), rep(Inf, i), rep(10, i)),
               control=list(fnscale=-1),data=as.numeric(nikkei.sr),m=i)
  llik_optim_vals[i]=outhmm$value
  llik_optim_pars[[i]]=outhmm$par
  cat("MAX llik for", i, "components:", llik_optim_vals[i], "\n")
}
MAX llik for 2 components: -3983.133 
MAX llik for 3 components: -3945.825 
MAX llik for 4 components: -3916.983 
Code
aic_last_vals = numeric()
for (i in 2:4){
  aic_last_vals[i] = -2*llik_optim_vals[i]+2*(i*(i-1)+3*i)
  cat("AIC for", i, "components:", aic_last_vals[i], "\n")
}
AIC for 2 components: 7982.266 
AIC for 3 components: 7921.651 
AIC for 4 components: 7881.967 
Code
best_model = which.min(aic_last_vals)
cat("\nBest model is with", best_model, "components (lowest AIC = ", aic_last_vals[best_model], ")\n")

Best model is with 4 components (lowest AIC =  7881.967 )
Code
m=best_model
gamma.mle = matrix(llik_optim_pars[[m]][1:m^2], ncol=m, byrow=TRUE)
gamma.mlev = gamma.mle/apply(gamma.mle, FUN=sum, MAR=1)
gamma.mlev
             [,1]         [,2]         [,3]         [,4]
[1,] 9.274162e-01 1.046761e-10 5.896843e-02 1.361532e-02
[2,] 1.300097e-01 6.558046e-01 2.141857e-01 8.237048e-11
[3,] 8.481752e-11 5.934721e-01 4.065279e-01 8.481752e-11
[4,] 1.024942e-01 8.978791e-11 8.978791e-11 8.975058e-01
Code
xi = llik_optim_pars[[m]][(m^2 + 1):(m^2 + m)]
omega = llik_optim_pars[[m]][(m^2 + m + 1):(m^2 + 2*m)]
alpha = llik_optim_pars[[m]][(m^2 + 2*m + 1):(m^2 + 3*m)]
cat("\nBest model's location parameter is:", llik_optim_pars[[m]][(m^2 + 1):(m^2 + m)],
    "\nBest model's shape parameter is:", llik_optim_pars[[m]][(m^2 + 2*m + 1):(m^2 + 3*m)],
    "\nBest model's scale parameter is:", llik_optim_pars[[m]][(m^2 + m + 1):(m^2 + 2*m)])    

Best model's location parameter is: 0.873272 0.3671296 1.264923 -0.8585419 
Best model's shape parameter is: -1.091009 -1.500902 -1.264446 0.2061429 
Best model's scale parameter is: 1.527224 0.6268233 1.078457 3.169442
Code
llik1 = function(par, m, data) {
  results = list()
  # Extract and normalize transition matrix
  gamma = matrix(par[1:m^2], nrow = m, byrow = TRUE)
  gamma = gamma / apply(gamma, 1, sum)
  
  # Extract parameters
  location.vec = par[(m^2 + 1):(m^2 + m)]
  scale.vec = par[(m^2 + m + 1):(m^2 + 2*m)]
  shape.vec = par[(m^2 + 2*m + 1):(m^2 + 3*m)]
  
  # Compute stationary distribution
  delta = c(solve(t(diag(m) - gamma + matrix(1, m, m)), rep(1, m)))
  
  n = length(data)
  pred = matrix(NA, nrow = m, ncol = n)
  filtering = matrix(NA, nrow = m, ncol = n)
  
  # Time t = 1
  pred[, 1] = delta
  filtering[, 1] = pred[, 1] * dsn(x = data[1], xi = location.vec[1:m],
                                   omega = scale.vec[1:m], alpha = shape.vec[1:m])
  llik = log(sum(filtering[, 1]))
  filtering[, 1] = filtering[, 1] / sum(filtering[, 1])
  
  # Loop over time
  for (t in 2:n) {
    pred[, t] = c(t(filtering[, t - 1]) %*% gamma)
    filtering[, t] = pred[, t] * dsn(x = data[t], xi = location.vec[1:m],
                                     omega = scale.vec[1:m], alpha = shape.vec[1:m])
    llik = llik + log(sum(filtering[, t]))
    filtering[, t] = filtering[, t] / sum(filtering[, t])
  }
  results=list(llik = llik,
               filtering = filtering,
               pred = pred,
               delta = delta)
  return(results)
}

pars = llik_optim_pars[[m]]
forward_results <- llik1(par = llik_optim_pars[[m]], m = m, data = as.numeric(nikkei.sr))
filtering <- forward_results$filtering
most_likely_states <- apply(filtering, 2, which.max)

# Plot histograms by state using only training data
par(mfrow = c(2, 2))  # 2x2 grid for 4 states

for (state in 1:4) {
  # Filter: only training period & only current state
  data_state <- as.numeric(nikkei.sr)[most_likely_states == state]
  
  if (length(data_state) == 0) {
    plot.new()
    title(main = paste("No data for State", state))
    next
  }
  
  hist_info <- hist(data_state, breaks = 50, plot = FALSE)  # $density is computed even without plotting
  
  data_min <- min(data_state)
  data_max <- max(data_state)
  range_expansion <- (data_max - data_min) * 0.15
  
  x_min <- data_min - range_expansion
  x_max <- data_max + range_expansion
  x_vals <- seq(x_min, x_max, length.out = 500)
  y_vals <- dsn(x_vals, xi = xi[state], omega = omega[state], alpha = alpha[state])
  
  ylim_max <- max(max(hist_info$density), max(y_vals)) * 1.1
  
  hist(data_state, probability = TRUE,  breaks = 50,
       main = paste("Histogram of Returns - State", state),
       xlab = "Log-Return", ylab = "Density",
       col = "lightgray", border = "white",
       xlim = c(x_min, x_max), ylim = c(0, ylim_max))
  
  curve(dsn(x, xi = xi[state], omega = omega[state], alpha = alpha[state]),
        from = x_min, to = x_max, add = TRUE, col = "red", lwd = 2)
}

Code
dates = index(nikkei[-1])
nikkei.ret = xts(nikkei.cr, order.by = dates)
dates_full <- index(nikkei.ret)
returns_full <- na.omit(as.numeric(nikkei.ret[,1]))


plot_data_train <- data.frame(
  Date = dates_full,
  Return = returns_full,
  State = factor(most_likely_states, levels = 1:4)  # Only 4 states
)

# Plot
ggplot(plot_data_train, aes(x = Date, y = Return, color = State)) +
  geom_point(size = 0.8, alpha = 0.8) +
  scale_color_brewer(palette = "Dark2") +
  labs(title = "NIKKEI Daily Returns Colored by Hidden State (2015–2025)",
       y = "Log-Return",
       x = "Date") +
  theme_minimal()

4 Fitting the GEV distribution to the monthly maxima and minima

4.1 Generalized Extreme Value (GEV) Distribution — Theory

The GEV distribution is designed to model the extremes (either maxima or minima) of a dataset. Instead of modeling the entire distribution, it focuses only on tail events — e.g., extreme market crashes or spikes.

4.1.1 GEV CDF (Simplified)

\[F(x) = \exp\left\{ - \left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\xi} \right\} \quad \text{for } 1 + \xi \left( \frac{x - \mu}{\sigma} \right) > 0 \]

Where:

  • \(\mu\): Location parameter (shifts the distribution)
  • \(\sigma > 0\): Scale parameter (controls spread)
  • \(\xi\): Shape parameter (determines the tail behavior)
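
A minimal sketch (arbitrary parameter values) implementing this CDF directly in R:

Code
# GEV CDF from the formula above; points outside the support get 0 or 1
# (as xi -> 0 the Gumbel limit exp(-exp(-(x - mu)/sigma)) applies)
pgev_manual <- function(x, mu, sigma, xi) {
  z <- 1 + xi * (x - mu) / sigma
  ifelse(z > 0, exp(-z^(-1/xi)), ifelse(xi > 0, 0, 1))
}
pgev_manual(c(0, 1, 2), mu = 0, sigma = 1, xi = 0.1)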

4.1.2 GEV Types Based on Shape Parameter

| Type | ξ (Shape) | Tail Behavior |
|---|---|---|
| Fréchet | ξ > 0 | Heavy-tailed |
| Gumbel | ξ = 0 | Light-tailed |
| Weibull | ξ < 0 | Bounded upper tail |

4.2 Why Use GEV in Financial Time Series?

Financial markets occasionally exhibit extreme behavior (e.g., crashes, spikes). While the HMM captures regime-switching behavior over time, it does not directly model the most extreme values within each state.

By fitting a GEV distribution to monthly maxima and minima, we gain:

  • Insight into tail risk,
  • Better understanding of risk under each regime,
  • Tools to estimate return levels (e.g., “What’s the largest monthly drop we expect every 10 years?”), as sketched below.
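
A minimal sketch of that last point, using the standard GEV return-level formula; `fit` is assumed to be an object returned by ismev::gev.fit, as in the fits below:

Code
# m-observation return level z_m: exceeded on average once every m observations
# (for monthly maxima, m = 120 corresponds roughly to a 10-year level)
return_level <- function(fit, m) {
  p <- fit$mle                           # c(location mu, scale sigma, shape xi)
  mu <- p[1]; sigma <- p[2]; xi <- p[3]
  y <- -log(1 - 1/m)
  if (abs(xi) > 1e-8) mu + (sigma / xi) * (y^(-xi) - 1)
  else mu - sigma * log(y)
}
# e.g. return_level(gev_max_fit, m = 120)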

4.3 Monthly analysis of STOXX600

Code
sx600 = get.hist.quote(instrument = "^stoxx", start = "2015-01-01", end = "2025-04-30",
                       quote = "AdjClose")
time series starts 2015-01-05
time series ends   2025-04-29
Code
# Ensure column name is standardized
colnames(sx600) <- "AdjClose"

# Convert to a data frame with an explicit date column
sx600_df <- data.frame(
  date = index(sx600),
  price = as.numeric(sx600$AdjClose)  # extract as numeric
)

# Add a "month" column
sx600_df <- sx600_df %>%
  mutate(month = floor_date(date, "month"))

# Compute monthly maxima and minima
monthly_extremes <- sx600_df %>%
  group_by(month) %>%
  summarise(
    monthly_max = max(price, na.rm = TRUE),
    monthly_min = min(price, na.rm = TRUE),
    .groups = "drop"
  )

# View result
head(monthly_extremes)
# A tibble: 6 × 3
  month      monthly_max monthly_min
  <date>           <dbl>       <dbl>
1 2015-01-01        372.        332.
2 2015-02-01        392.        367.
3 2015-03-01        404.        388.
4 2015-04-01        414.        396.
5 2015-05-01        409.        389.
6 2015-06-01        401.        381.
Code
library(ggplot2)
Code
ggplot(monthly_extremes, aes(x = month)) +
  geom_ribbon(aes(ymin = monthly_min, ymax = monthly_max), fill = "skyblue", alpha = 0.3) +
  geom_line(aes(y = monthly_max), color = "blue", linewidth = 1) +
  geom_line(aes(y = monthly_min), color = "red", linewidth = 1) +
  labs(title = "Monthly Price Range of STOXX600",
       x = "Month",
       y = "Adjusted Close Price") +
  theme_minimal()

Code
library(ggplot2)
library(tidyr)

# Reshape data for easy plotting
extreme_long <- monthly_extremes %>%
  pivot_longer(cols = c(monthly_max, monthly_min),
               names_to = "type", values_to = "value")

# Plot histograms
ggplot(extreme_long, aes(x = value, fill = type)) +
  geom_histogram(alpha = 0.6, bins = 15, position = "identity", color = "white") +
  facet_wrap(~ type, scales = "free") +
  scale_fill_manual(values = c("monthly_max" = "blue", "monthly_min" = "red"),
                    labels = c("Monthly Maxima", "Monthly Minima")) +
  labs(title = "Histogram of Monthly Maxima and Minima",
       x = "Adjusted Close Price",
       y = "Frequency",
       fill = "Type") +
  theme_minimal()

Code
#install.packages("ismev")
# Load the required package
library(ismev)
Loading required package: mgcv
Loading required package: nlme

Attaching package: 'nlme'
The following object is masked from 'package:dplyr':

    collapse
This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.
Code
# Fit GEV to monthly maxima
gev_max_fit <- gev.fit(monthly_extremes$monthly_max)
$conv
[1] 0

$nllh
[1] 665.5927

$mle
[1] 394.70275481  42.54372301   0.05627224

$se
[1] 4.59869221 3.55441103 0.09788299
Code
# Fit GEV to negative of monthly minima (to treat minima as maxima)
gev_min_fit <- gev.fit(-monthly_extremes$monthly_min)  # Flip sign for minima
$conv
[1] 0

$nllh
[1] 671.8119

$mle
[1] -415.9909219   59.4110871   -0.4117731

$se
[1] 5.71413360 4.10154189 0.04101183
Code
gev.profxi(gev_max_fit, xlow=-0.15, xup=0.35, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.prof(gev_max_fit, m=60,xlow=530, xup=750, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.diag(gev_max_fit)

Code
gev.profxi(gev_min_fit, xlow=-0.5, xup=-0.3, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.prof(gev_min_fit, m=60,xlow=-310, xup=-280, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.diag(gev_min_fit)

4.4 Monthly analysis of SP500

Code
sp500 = get.hist.quote(instrument = "^gspc", start = "2015-01-01", end = "2025-04-30"
                       , quote = "AdjClose")
time series starts 2015-01-02
time series ends   2025-04-29
Code
# Ensure column name is standardized
colnames(sp500) <- "AdjClose"

# Convert to a data frame with an explicit date column
sp500_df <- data.frame(
  date = index(sp500),
  price = as.numeric(sp500$AdjClose)  # extract as numeric
)

# Add a "month" column
sp500_df <- sp500_df %>%
  mutate(month = floor_date(date, "month"))

# Compute monthly maxima and minima
monthly_extremes <- sp500_df %>%
  group_by(month) %>%
  summarise(
    monthly_max = max(price, na.rm = TRUE),
    monthly_min = min(price, na.rm = TRUE),
    .groups = "drop"
  )

# View result
head(monthly_extremes)
# A tibble: 6 × 3
  month      monthly_max monthly_min
  <date>           <dbl>       <dbl>
1 2015-01-01       2063.       1993.
2 2015-02-01       2115.       2021.
3 2015-03-01       2117.       2040.
4 2015-04-01       2118.       2060.
5 2015-05-01       2131.       2080.
6 2015-06-01       2124.       2058.
Code
ggplot(monthly_extremes, aes(x = month)) +
  geom_ribbon(aes(ymin = monthly_min, ymax = monthly_max), fill = "skyblue", alpha = 0.3) +
  geom_line(aes(y = monthly_max), color = "blue", linewidth = 1) +
  geom_line(aes(y = monthly_min), color = "red", linewidth = 1) +
  labs(title = "Monthly Price Range of SP500",
       x = "Month",
       y = "Adjusted Close Price") +
  theme_minimal()

Code
# Reshape data for easy plotting
extreme_long <- monthly_extremes %>%
  pivot_longer(cols = c(monthly_max, monthly_min),
               names_to = "type", values_to = "value")

# Plot histograms
ggplot(extreme_long, aes(x = value, fill = type)) +
  geom_histogram(alpha = 0.6, bins = 15, position = "identity", color = "white") +
  facet_wrap(~ type, scales = "free") +
  scale_fill_manual(values = c("monthly_max" = "blue", "monthly_min" = "red"),
                    labels = c("Monthly Maxima", "Monthly Minima")) +
  labs(title = "Histogram of Monthly Maxima and Minima",
       x = "Adjusted Close Price",
       y = "Frequency",
       fill = "Type") +
  theme_minimal()

Code
# Fit GEV to monthly maxima
gev_max_fit <- gev.fit(monthly_extremes$monthly_max)
$conv
[1] 0

$nllh
[1] 1045.977

$mle
[1] 2.929553e+03 9.041336e+02 7.576205e-02

$se
[1] 108.3372738  88.0623152   0.1358157
Code
# Fit GEV to negative of monthly minima (to treat minima as maxima)
gev_min_fit <- gev.fit(-monthly_extremes$monthly_min)  # Flip sign for minima
$conv
[1] 0

$nllh
[1] 1026.645

$mle
[1] -3426.8819235  1279.5105503    -0.7963211

$se
[1] 119.37781148 106.58968288   0.05076008
Code
gev.profxi(gev_max_fit, xlow=-0.25, xup=0.5, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.prof(gev_max_fit, m=60,xlow=5900, xup=11100, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.diag(gev_max_fit)

Code
gev.profxi(gev_min_fit, xlow=-1.2, xup=-0.6, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.prof(gev_min_fit, m=60,xlow=-1920, xup=-1800, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.diag(gev_min_fit)

4.5 Monthly analysis of NIKKEI225

Code
nikkei = get.hist.quote(instrument = "^N225", start = "2015-01-01", end = "2025-05-30"
                       , quote = "AdjClose")
Warning: ^N225 contains missing values. Some functions will not work if objects
contain missing values in the middle of the series. Consider using na.omit(),
na.approx(), na.fill(), etc to remove or replace them.
time series starts 2015-01-05
Code
nikkei = na.omit(nikkei)
# Ensure column name is standardized
colnames(nikkei) <- "AdjClose"

# Convert to a data frame with an explicit date column
nikkei_df <- data.frame(
  date = index(nikkei),
  price = as.numeric(nikkei$AdjClose)  # extract as numeric
)

# Add a "month" column
nikkei_df <- nikkei_df %>%
  mutate(month = floor_date(date, "month"))

# Compute monthly maxima and minima
monthly_extremes <- nikkei_df %>%
  group_by(month) %>%
  summarise(
    monthly_max = max(price, na.rm = TRUE),
    monthly_min = min(price, na.rm = TRUE),
    .groups = "drop"
  )

# View result
head(monthly_extremes)
# A tibble: 6 × 3
  month      monthly_max monthly_min
  <date>           <dbl>       <dbl>
1 2015-01-01      17796.      16796.
2 2015-02-01      18798.      17336.
3 2015-03-01      19754.      18665.
4 2015-04-01      20188.      19035.
5 2015-05-01      20563.      19292.
6 2015-06-01      20868.      19991.
Code
ggplot(monthly_extremes, aes(x = month)) +
  geom_ribbon(aes(ymin = monthly_min, ymax = monthly_max), fill = "skyblue", alpha = 0.3) +
  geom_line(aes(y = monthly_max), color = "blue", linewidth = 1) +
  geom_line(aes(y = monthly_min), color = "red", linewidth = 1) +
  labs(title = "Monthly Price Range of NIKKEI225",
       x = "Month",
       y = "Adjusted Close Price") +
  theme_minimal()

Code
# Reshape data for easy plotting
extreme_long <- monthly_extremes %>%
  pivot_longer(cols = c(monthly_max, monthly_min),
               names_to = "type", values_to = "value")

# Plot histograms
ggplot(extreme_long, aes(x = value, fill = type)) +
  geom_histogram(alpha = 0.6, bins = 15, position = "identity", color = "white") +
  facet_wrap(~ type, scales = "free") +
  scale_fill_manual(values = c("monthly_max" = "blue", "monthly_min" = "red"),
                    labels = c("Monthly Maxima", "Monthly Minima")) +
  labs(title = "Histogram of Monthly Maxima and Minima",
       x = "Adjusted Close Price",
       y = "Frequency",
       fill = "Type") +
  theme_minimal()

Code
# Fit GEV to monthly maxima
gev_max_fit <- gev.fit(monthly_extremes$monthly_max)
$conv
[1] 0

$nllh
[1] 1267.901

$mle
[1] 2.256051e+04 4.801743e+03 1.486045e-01

$se
[1] 513.0572819 407.7429910   0.1003568
Code
# Fit GEV to negative of monthly minima (to treat minima as maxima)
gev_min_fit <- gev.fit(-monthly_extremes$monthly_min)  # Flip sign for minima
$conv
[1] 0

$nllh
[1] 1260.507

$mle
[1] -2.632558e+04  6.658697e+03 -5.531426e-01

$se
[1] 682.20722370 474.01405360   0.05680979
Code
gev.profxi(gev_max_fit, xlow=-0.1, xup=0.4, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.prof(gev_max_fit, m=60,xlow=40000, xup=70000, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.diag(gev_max_fit)

Code
gev.profxi(gev_min_fit, xlow=-1, xup=-0.4, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.prof(gev_min_fit, m=60,xlow=-16000, xup=-14000, conf = 0.95)
If routine fails, try changing plotting interval

Code
gev.diag(gev_min_fit)