This analysis compares three Value-at-Risk (VaR) estimation approaches for modeling market risk:
The models are calibrated on data prior to 2021 and evaluated on 2021 returns using a moving window methodology.
Exercise 1.1
Import quotations for any asset of your choice using either provided files or quantmod package from first labs. Then calculate logarithmic returns on that asset and compare Value-at-Risk estimates for 2021 from historical simulation, GARCH models and filtered historical simulation (models should be built on the data prior to 01-01-2021 with moving window method).
Key Questions:
# Load SPX data
spx <- read.csv("/Users/ashutoshverma/Downloads/Projects/Applied Finance/Assignment Risk/HW/spx.csv")
# Preview data
head(spx)# Summary statistics
summary_stats <- data.frame(
Statistic = c("Mean", "Std Dev", "Skewness", "Kurtosis", "Min", "Max"),
Value = c(
mean(returns, na.rm = TRUE),
sd(returns, na.rm = TRUE),
skewness(returns),
kurtosis(returns),
min(returns, na.rm = TRUE),
max(returns, na.rm = TRUE)
)
)
print(summary_stats)## Statistic Value
## 1 Mean 0.0003429247
## 2 Std Dev 0.0119925391
## 3 Skewness -0.4233136050
## 4 Kurtosis 10.8398447947
## 5 Min -0.1276521412
## 6 Max 0.1095719593
# Create visualization panel
par(mfrow = c(2, 2))
# 1. Price series
plot(prices, main = "SPX Closing Prices",
ylab = "Price", col = "steelblue", lwd = 1.5)
# 2. Return series
plot(returns, main = "Log Returns",
ylab = "Return", col = "darkred", lwd = 0.8)
# 3. Return distribution
hist(returns, breaks = 100, prob = TRUE,
main = "Return Distribution", xlab = "Return",
col = "lightblue", border = "white")
curve(dnorm(x, mean = mean(returns), sd = sd(returns)),
add = TRUE, col = "red", lwd = 2)
legend("topright", legend = "Normal Distribution",
col = "red", lty = 1, lwd = 2)
# 4. Q-Q plot
qqnorm(returns, main = "Normal Q-Q Plot", pch = 20, col = alpha("blue", 0.5))
qqline(returns, col = "red", lwd = 2)Key Observations:
# Training set: data before 2021
train_returns <- returns[index(returns) < "2021-01-01"]
# Test set: 2021 data
test_returns <- returns[index(returns) >= "2021-01-01"]Historical Simulation is a non-parametric approach that estimates VaR as the empirical quantile of historical returns.
Advantages: - Simple, no distributional assumptions - Captures actual historical tail events
Disadvantages: - Slow to adapt to volatility changes - All observations weighted equally - Ghost features (old extreme events still influence VaR)
GARCH models capture volatility clustering by allowing variance to depend on past shocks and past variance.
Model specification: \[r_t = \mu + \epsilon_t\] \[\epsilon_t = \sigma_t z_t, \quad z_t \sim N(0,1)\] \[\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2\]
VaR formula: \[\text{VaR}_t = \mu_t + \sigma_t \cdot \Phi^{-1}(\alpha)\]
# Define GARCH specification
spec <- ugarchspec(
variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
mean.model = list(armaOrder = c(0, 0), include.mean = TRUE),
distribution.model = "norm"
)
# Initialize VaR vector
VaR_GARCH <- rep(NA, length(test_returns))
names(VaR_GARCH) <- as.character(index(test_returns))
# Rolling window estimation
for (i in seq_along(test_returns)) {
# Extract rolling window data
window_data <- returns[index(returns) < index(test_returns)[i]]
window_data <- tail(window_data, window_size)
# Fit GARCH model
fit <- tryCatch({
ugarchfit(
spec = spec,
data = window_data,
solver = "hybrid",
silent = TRUE
)
}, error = function(e) NULL)
if (!is.null(fit)) {
# Forecast one-step ahead
forecast <- ugarchforecast(fit, n.ahead = 1)
# Extract conditional mean and volatility
mu_hat <- fitted(forecast)[1]
sigma_hat <- sigma(forecast)[1]
# Calculate VaR
VaR_GARCH[i] <- mu_hat + sigma_hat * qnorm(alpha)
}
}
VaR_GARCH <- xts(VaR_GARCH, order.by = index(test_returns))FHS combines the best of both worlds: - Uses GARCH to model time-varying volatility - Uses empirical distribution of standardized residuals (captures fat tails)
Methodology: 1. Fit GARCH model to obtain standardized residuals: \(z_t = \epsilon_t / \sigma_t\) 2. Calculate empirical quantile of \(z_t\): \(q_z = Q_\alpha(z)\) 3. VaR forecast: \(\text{VaR}_t = \sigma_t \cdot q_z\)
# Initialize VaR vector
VaR_FHS <- rep(NA, length(test_returns))
names(VaR_FHS) <- as.character(index(test_returns))
# Rolling window estimation
for (i in seq_along(test_returns)) {
# Extract rolling window data
window_data <- returns[index(returns) < index(test_returns)[i]]
window_data <- tail(window_data, window_size)
# Fit GARCH model
fit <- tryCatch({
ugarchfit(
spec = spec,
data = window_data,
solver = "hybrid",
silent = TRUE
)
}, error = function(e) NULL)
if (!is.null(fit)) {
# Extract standardized residuals
z <- residuals(fit, standardize = TRUE)
# Empirical quantile of standardized residuals
q_z <- quantile(z, probs = alpha, na.rm = TRUE)
# Forecast volatility (last fitted value)
sigma_t <- as.numeric(tail(sigma(fit), 1))
# Calculate VaR
VaR_FHS[i] <- sigma_t * q_z
}
}
VaR_FHS <- xts(VaR_FHS, order.by = index(test_returns))# Merge all series
plot_data <- merge(
test_returns,
VaR_HS_2021,
VaR_GARCH,
VaR_FHS
)
colnames(plot_data) <- c("Returns", "HS", "GARCH", "FHS")
# Create plot
plot.zoo(
plot_data,
screens = 1,
col = c("black", "red", "blue", "green3"),
lwd = c(1, 2, 2, 2),
main = "VaR Comparison: 2021 Test Period",
ylab = "Log Returns / VaR",
xlab = "Date"
)
# Add horizontal line at zero
abline(h = 0, lty = 2, col = "gray50")
# Add legend
legend(
"bottomleft",
legend = c("Actual Returns", "Historical Simulation", "GARCH(1,1)", "Filtered HS"),
col = c("black", "red", "blue", "green3"),
lwd = c(1, 2, 2, 2),
bty = "n",
cex = 0.9
)# Identify violations (when returns breach VaR threshold)
viol_HS <- as.numeric(test_returns < VaR_HS_2021)
viol_GARCH <- as.numeric(test_returns < VaR_GARCH)
viol_FHS <- as.numeric(test_returns < VaR_FHS)
# Count violations
n_obs <- length(test_returns)
violations_table <- data.frame(
Model = c("Historical Simulation", "GARCH(1,1)", "Filtered HS"),
Violations = c(sum(viol_HS, na.rm = TRUE),
sum(viol_GARCH, na.rm = TRUE),
sum(viol_FHS, na.rm = TRUE)),
Expected = rep(n_obs * alpha, 3),
Rate = c(mean(viol_HS, na.rm = TRUE),
mean(viol_GARCH, na.rm = TRUE),
mean(viol_FHS, na.rm = TRUE)) * 100
)
violations_table$Expected <- round(violations_table$Expected, 1)
violations_table$Rate <- round(violations_table$Rate, 2)
print(violations_table)## Model Violations Expected Rate
## 1 Historical Simulation 0 2.4 0.00
## 2 GARCH(1,1) 7 2.4 2.90
## 3 Filtered HS 3 2.4 1.24
# Visualize violations
viol_df <- data.frame(
Date = index(test_returns),
HS = viol_HS,
GARCH = viol_GARCH,
FHS = viol_FHS
) %>%
pivot_longer(cols = c(HS, GARCH, FHS),
names_to = "Model",
values_to = "Violation")
ggplot(viol_df, aes(x = Date, y = Model, fill = factor(Violation))) +
geom_tile(color = "white") +
scale_fill_manual(values = c("0" = "lightgreen", "1" = "red"),
labels = c("No Violation", "Violation")) +
labs(title = "VaR Violations Timeline",
subtitle = "Red = Return below VaR threshold",
x = "Date", y = "Model", fill = "") +
theme_minimal() +
theme(legend.position = "bottom")The Kupiec Unconditional Coverage Test evaluates whether the observed violation rate matches the expected rate under the specified confidence level.
Null Hypothesis: The violation rate equals the expected rate (\(\alpha\))
Test Statistic: \[LR = -2 \ln\left[\frac{(1-\alpha)^{n-x} \alpha^x}{(1-\hat{p})^{n-x} \hat{p}^x}\right] \sim \chi^2_1\]
where \(\hat{p} = x/n\) is the empirical violation rate.
# Data quality check
data_quality <- data.frame(
Series = c("Test Returns", "VaR HS", "VaR GARCH", "VaR FHS"),
`Total Obs` = c(length(test_returns), length(VaR_HS_2021),
length(VaR_GARCH), length(VaR_FHS)),
`Missing Values` = c(sum(is.na(test_returns)), sum(is.na(VaR_HS_2021)),
sum(is.na(VaR_GARCH)), sum(is.na(VaR_FHS))),
check.names = FALSE
)
print(data_quality)## Series Total Obs Missing Values
## 1 Test Returns 241 0
## 2 VaR HS 241 0
## 3 VaR GARCH 241 0
## 4 VaR FHS 241 0
# Violations summary
violations_summary <- data.frame(
Model = c("Historical Simulation", "GARCH(1,1)", "Filtered HS"),
Violations = c(sum(viol_HS, na.rm = TRUE),
sum(viol_GARCH, na.rm = TRUE),
sum(viol_FHS, na.rm = TRUE)),
`Valid Observations` = c(sum(!is.na(viol_HS)),
sum(!is.na(viol_GARCH)),
sum(!is.na(viol_FHS))),
check.names = FALSE
)
print(violations_summary)## Model Violations Valid Observations
## 1 Historical Simulation 0 241
## 2 GARCH(1,1) 7 241
## 3 Filtered HS 3 241
# Ensure all series are properly aligned and have no NAs
valid_idx_HS <- !is.na(VaR_HS_2021) & !is.na(test_returns)
valid_idx_GARCH <- !is.na(VaR_GARCH) & !is.na(test_returns)
valid_idx_FHS <- !is.na(VaR_FHS) & !is.na(test_returns)
# Perform Kupiec tests with clean data
kupiec_HS <- tryCatch({
VaRTest(alpha,
actual = test_returns[valid_idx_HS],
VaR = VaR_HS_2021[valid_idx_HS],
conf.level = 0.95)
}, error = function(e) NULL)
kupiec_GARCH <- tryCatch({
VaRTest(alpha,
actual = test_returns[valid_idx_GARCH],
VaR = VaR_GARCH[valid_idx_GARCH],
conf.level = 0.95)
}, error = function(e) NULL)
kupiec_FHS <- tryCatch({
VaRTest(alpha,
actual = test_returns[valid_idx_FHS],
VaR = VaR_FHS[valid_idx_FHS],
conf.level = 0.95)
}, error = function(e) NULL)# Manual Kupiec test function (as backup)
manual_kupiec <- function(alpha, violations, n_obs) {
p_hat <- violations / n_obs
if (violations == 0 || violations == n_obs) {
return(list(
violations = violations,
expected = n_obs * alpha,
p_value = NA,
result = "N/A"
))
}
# Likelihood ratio test statistic
LR <- -2 * (
log((1 - alpha)^(n_obs - violations) * alpha^violations) -
log((1 - p_hat)^(n_obs - violations) * p_hat^violations)
)
# P-value from chi-square distribution with 1 df
p_value <- 1 - pchisq(LR, df = 1)
return(list(
violations = violations,
expected = n_obs * alpha,
p_value = p_value,
result = ifelse(p_value > 0.05, "PASS", "FAIL")
))
}
# Create results table
if (!is.null(kupiec_HS) && !is.null(kupiec_GARCH) && !is.null(kupiec_FHS)) {
kupiec_results <- data.frame(
Model = c("Historical Simulation", "GARCH(1,1)", "Filtered HS"),
`Expected Violations` = rep(round(n_obs * alpha, 1), 3),
`Actual Violations` = c(kupiec_HS$actual.exceed,
kupiec_GARCH$actual.exceed,
kupiec_FHS$actual.exceed),
`Violation Rate` = paste0(round(c(kupiec_HS$actual.exceed / n_obs,
kupiec_GARCH$actual.exceed / n_obs,
kupiec_FHS$actual.exceed / n_obs) * 100, 2), "%"),
`P-Value` = round(c(kupiec_HS$uc.LRp,
kupiec_GARCH$uc.LRp,
kupiec_FHS$uc.LRp), 4),
`Test Result` = c(
ifelse(kupiec_HS$uc.LRp > 0.05, "PASS ✓", "FAIL ✗"),
ifelse(kupiec_GARCH$uc.LRp > 0.05, "PASS ✓", "FAIL ✗"),
ifelse(kupiec_FHS$uc.LRp > 0.05, "PASS ✓", "FAIL ✗")
),
check.names = FALSE
)
print(kupiec_results)
} else {
# Count valid observations for each model
n_valid_HS <- sum(!is.na(viol_HS))
n_valid_GARCH <- sum(!is.na(viol_GARCH))
n_valid_FHS <- sum(!is.na(viol_FHS))
# Perform manual tests
test_HS <- manual_kupiec(alpha, sum(viol_HS, na.rm = TRUE), n_valid_HS)
test_GARCH <- manual_kupiec(alpha, sum(viol_GARCH, na.rm = TRUE), n_valid_GARCH)
test_FHS <- manual_kupiec(alpha, sum(viol_FHS, na.rm = TRUE), n_valid_FHS)
kupiec_results <- data.frame(
Model = c("Historical Simulation", "GARCH(1,1)", "Filtered HS"),
`Sample Size` = c(n_valid_HS, n_valid_GARCH, n_valid_FHS),
`Expected Violations` = round(c(test_HS$expected, test_GARCH$expected, test_FHS$expected), 1),
`Actual Violations` = c(test_HS$violations, test_GARCH$violations, test_FHS$violations),
`Violation Rate` = paste0(round(c(test_HS$violations / n_valid_HS,
test_GARCH$violations / n_valid_GARCH,
test_FHS$violations / n_valid_FHS) * 100, 2), "%"),
`P-Value` = round(c(test_HS$p_value, test_GARCH$p_value, test_FHS$p_value), 4),
`Test Result` = c(test_HS$result, test_GARCH$result, test_FHS$result),
check.names = FALSE
)
print(kupiec_results)
}## Model Sample Size Expected Violations Actual Violations
## 1 Historical Simulation 241 2.4 0
## 2 GARCH(1,1) 241 2.4 7
## 3 Filtered HS 241 2.4 3
## Violation Rate P-Value Test Result
## 1 0% NA N/A
## 2 2.9% 0.0157 FAIL
## 3 1.24% 0.7129 PASS
Yes. VaR model performance exhibits time-varying behavior, particularly during 2021:
The 2021 period was characterized by: - Post-COVID economic recovery volatility - Monetary policy uncertainty - Inflation concerns - Geopolitical tensions
These factors created a challenging environment for risk models.
Yes. The models rank as follows (best to worst):
The performance differences stem from fundamental characteristics of financial returns and model design:
FHS provides the most accurate VaR estimates for this dataset, balancing volatility adaptation with realistic tail modeling
Historical Simulation is inadequate for modern risk management in volatile markets
GARCH improves substantially over HS but may underestimate tail risk under non-normal returns
Model choice matters more during periods of market stress than during calm periods
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS 15.6.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Europe/Warsaw
## tzcode source: internal
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] tidyr_1.3.1 dplyr_1.1.4
## [3] gridExtra_2.3 ggplot2_4.0.1
## [5] PerformanceAnalytics_2.0.4 rugarch_1.5-4
## [7] quantmod_0.4.26 TTR_0.24.4
## [9] xts_0.14.1 zoo_1.8-13
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.54
## [3] bslib_0.9.0 ks_1.15.1
## [5] lattice_0.22-6 numDeriv_2016.8-1.1
## [7] quadprog_1.5-8 vctrs_0.6.5
## [9] tools_4.4.1 generics_0.1.4
## [11] curl_5.2.3 tibble_3.2.1
## [13] pkgconfig_2.0.3 Matrix_1.7-0
## [15] KernSmooth_2.23-24 RColorBrewer_1.1-3
## [17] DistributionUtils_0.6-2 S7_0.2.0
## [19] lifecycle_1.0.4 truncnorm_1.0-9
## [21] compiler_4.4.1 farver_2.1.2
## [23] spd_2.0-1 codetools_0.2-20
## [25] htmltools_0.5.8.1 SkewHyperbolic_0.4-2
## [27] sass_0.4.10 Rsolnp_2.0.1
## [29] yaml_2.3.10 pracma_2.4.6
## [31] pillar_1.10.2 nloptr_2.1.1
## [33] jquerylib_0.1.4 GeneralizedHyperbolic_0.8-7
## [35] MASS_7.3-60.2 cachem_1.1.0
## [37] mclust_6.1.1 parallelly_1.44.0
## [39] fracdiff_1.5-3 tidyselect_1.2.1
## [41] digest_0.6.37 mvtnorm_1.3-1
## [43] future_1.49.0 purrr_1.2.0
## [45] listenv_0.9.1 fastmap_1.2.0
## [47] grid_4.4.1 cli_3.6.5
## [49] magrittr_2.0.4 future.apply_1.20.1
## [51] withr_3.0.2 scales_1.4.0
## [53] rmarkdown_2.30 globals_0.18.0
## [55] evaluate_1.0.5 knitr_1.49
## [57] rlang_1.1.6 Rcpp_1.1.1
## [59] glue_1.8.0 rstudioapi_0.17.1
## [61] jsonlite_2.0.0 R6_2.6.1
Report generated on 2026-01-30