Volatility is an important concept in financial markets because it is directly related to risk. Higher volatility generally indicates higher risk, as it means that the asset’s price is more likely to move dramatically in either direction, potentially resulting in greater losses or gains for investors.
Volatility is also important for financial modeling and analysis, as it informs decisions related to risk management, portfolio construction, and asset valuation. Time series models such as ARIMA (for the conditional mean) and GARCH (for the conditional variance) are used to forecast future behavior from historical patterns, and these forecasts can feed into trading strategies and investment decisions.
Historical Volatility: Historical volatility measures the actual volatility of an asset over a specific historical period, usually calculated as the standard deviation of the asset’s returns over that period.
Implied Volatility: Implied volatility is a measure of the expected future volatility of an asset, as implied by the current market price of its options contracts. It is calculated using an options pricing model and represents the market’s consensus on how volatile the asset is likely to be in the future.
Realized Volatility: Realized volatility is a measure of the actual volatility of an asset over a specific recent period, usually calculated from the asset's observed returns over that period. The term overlaps heavily with historical volatility; in practice it often refers to estimates built from higher-frequency (for example, intraday) returns over a short, recent window.
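As a rough illustration of the standard-deviation definition above, the sketch below computes a full-sample and a rolling historical volatility in base R. The returns are simulated rather than downloaded, and the 252-trading-day annualization factor and the 30-day window are assumptions made for the example, not values taken from this tutorial.

```r
# Simulated daily returns stand in for real data (assumption for illustration)
set.seed(42)
returns <- rnorm(500, mean = 0, sd = 0.02)

# Historical volatility: standard deviation of returns over the whole sample
daily_vol <- sd(returns)
# Annualize assuming 252 trading days per year
annual_vol <- daily_vol * sqrt(252)

# Rolling volatility: the same statistic over a moving 30-day window
window <- 30
rolling_vol <- sapply(window:length(returns),
                      function(i) sd(returns[(i - window + 1):i]))

round(c(daily = daily_vol, annualized = annual_vol), 4)
```

With real data, `returns` would be replaced by a return series such as the daily returns computed later in this tutorial.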
#Clear working environment
#rm(list = ls())
#Install packages
#install.packages("quantmod")
library(quantmod)
# getSymbols(c("SYM1", "SYM2", ...)) -> obtain historical quotes
getSymbols(c("AAPL", "TSLA"))
# Column 6 holds the adjusted closing price
s1p <- AAPL[,6]
s1r <- dailyReturn(s1p)
s2p <- TSLA[,6]
s2r <- dailyReturn(s2p)
#install.packages("psych")
library(psych)
# Descriptive statistics for prices (s1p, s2p) and daily returns (s1r, s2r)
describe(s1p)
##    vars    n  mean    sd median trimmed   mad  min    max  range skew kurtosis   se
## X1    1 4129 43.66 49.22  23.12   34.63 25.36 2.37 180.43 178.06 1.42     0.63 0.77
describe(s2p)
##    vars    n  mean    sd median trimmed   mad  min    max  range skew kurtosis   se
## X1    1 3251 62.41 96.26  16.54   41.35 13.19 1.05 409.97 408.92 1.69     1.48 1.69
describe(s1r)
##    vars    n mean   sd median trimmed  mad   min  max range  skew kurtosis se
## X1    1 4129    0 0.02      0       0 0.01 -0.18 0.14  0.32 -0.13      5.8  0
describe(s2r)
##    vars    n mean   sd median trimmed  mad   min  max range skew kurtosis se
## X1    1 3251    0 0.04      0       0 0.03 -0.21 0.24  0.45 0.32     4.97  0
Autoregressive (AR) component: The AR component of an ARIMA model refers to the dependence of the current value of the series on its past values. Order = “p”
Integrated (I) component: The I component of an ARIMA model refers to the degree of differencing needed to make the time series stationary. Stationary time series have a constant mean and variance over time, making them easier to model and analyze. Order = “d”
Moving Average (MA) component: The MA component of an ARIMA model refers to the dependence of the current value of the series on the past errors (the difference between the predicted and actual values) of the series. Order = “q”
A stationary time series is one whose properties do not depend on the time at which the series is observed. It does not matter when you observe it, it should look much the same at any point in time.
In practice, we difference the series repeatedly until it becomes stationary; the number of differences taken is the order of the I component.
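The effect of differencing can be seen on a simulated random walk. This is only a sketch on artificial data: the series `walk` and the helper `lag1` are hypothetical names introduced for the example. The level series has a lag-1 autocorrelation close to one, while its first difference is close to uncorrelated.

```r
set.seed(1)
# A random walk is non-stationary: each value is the previous value plus noise
walk <- cumsum(rnorm(1000))
# The first difference recovers the underlying noise, which is stationary
d_walk <- diff(walk)

# Hypothetical helper: lag-1 sample autocorrelation of a series
lag1 <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

c(level = lag1(walk), differenced = lag1(d_walk))
```

Prices tend to behave like the level series and returns like the differenced series, which is why the price and return correlograms look so different.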
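# Autocorrelation functions: prices in the left column, returns in the right column
par(mfrow = c(2,2))
acf(s1p)
acf(s1r)
acf(s2p)
acf(s2r)
# Partial autocorrelation functions, same layout
par(mfrow = c(2,2))
pacf(s1p)
pacf(s1r)
pacf(s2p)
pacf(s2r)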
One common method for estimating ARIMA model parameters is maximum likelihood estimation (MLE), which involves selecting the parameters that maximize the likelihood function of the observed data.
One approach to model selection is to use information criteria such as Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC).
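\(\mathrm{AIC} = -2\log(L) + 2k\)
where \(L\) is the maximized likelihood and \(k\) is the number of estimated parameters in the model.
\(\mathrm{BIC} = -2\log(L) + \log(n)\,k\)
where \(n\) is the number of observations in the time series.
In general, lower values of AIC and BIC indicate a better trade-off between goodness of fit and model complexity. BIC penalizes additional parameters more heavily than AIC whenever \(\log(n) > 2\).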
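To make the two formulas concrete, the following sketch computes AIC and BIC by hand for an AR(1) model fitted to simulated data, so the results can be compared with R's built-in `AIC()` and `BIC()`. The simulated series `y` is an assumption for illustration. For `arima()` fits, the parameter count includes the innovation variance in addition to the coefficients.

```r
set.seed(123)
# Simulated AR(1) series (illustrative data, not from the tutorial)
y <- arima.sim(model = list(ar = 0.6), n = 500)

fit <- arima(y, order = c(1, 0, 0))

ll <- as.numeric(logLik(fit))   # maximized log-likelihood
k  <- length(coef(fit)) + 1     # AR coefficient, intercept, innovation variance
n  <- fit$nobs                  # number of observations used in the fit

aic_manual <- -2 * ll + 2 * k
bic_manual <- -2 * ll + log(n) * k

c(AIC = aic_manual, BIC = bic_manual)
```

Comparing `aic_manual` with `AIC(fit)` and `bic_manual` with `BIC(fit)` is a quick way to check which parameters are being counted.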
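# ARIMA: fit three candidate ARIMA(p, d, q) models to the AAPL price series
res1 <- arima(s1p, order = c(1,0,0))
res1
res2 <- arima(s1p, order = c(1,1,1))
res2
res3 <- arima(s1p, order = c(2,1,1))
res3
# AUTO.ARIMA: let the forecast package choose the orders using information criteria
# install.packages("forecast")
library(forecast)
res4 <- auto.arima(s1p)
res4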
Interpret the estimated model parameters and their statistical significance.
Evaluate forecasting accuracy with measures such as mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), or mean absolute percentage error (MAPE).
The residuals should be uncorrelated and approximately normally distributed, with no remaining patterns or trends. Apply diagnostics such as the Ljung-Box test, the Augmented Dickey-Fuller test, a QQ plot, a histogram, and the Jarque-Bera test to the residuals.
# We need the tseries library for some test statistics
# install.packages("tseries")
library(tseries)
# Obtain residuals and fitted values, then run some diagnostics
err4 <- residuals(res4)
fit4 <- fitted(res4)
# Calculate model accuracy measures
describe(err4)
mse4 <- mean(err4^2)            # mean squared error
rmse4 <- sqrt(mse4)             # root mean squared error
mad4 <- mean(abs(err4))         # mean absolute error (MAE)
act4 <- fit4 + err4             # actual values = fitted values + residuals
mape4 <- mean(abs(err4/act4))   # mean absolute percentage error, relative to actual values
# Run a few more tests
# Ljung-Box Test -> H1: Errors are autocorrelated
Box.test(err4, lag = 16, type = "Ljung")
# ADF Test -> H1: Error has no unit root (it is stationary)
adf.test(err4)
# Jarque-Bera Test -> H1: Error is not normal
jarque.bera.test(err4)
\[\begin{aligned} r_t &= \text{ARIMA} + \epsilon_t \\ \epsilon_t &= \sigma_t z_t \\ \sigma_t^2 &= \text{GARCH} \\ z_t &\sim \text{Distribution} \end{aligned}\]
GARCH (Generalized Autoregressive Conditional Heteroskedasticity) models are a type of time series model that are commonly used to analyze and model volatility in financial markets. The GARCH model consists of two components: the ARCH component and the GARCH component.
The ARCH component is defined as:
\[ \mathrm{ARCH}(p) : \sigma_t^2 = \omega + \sum_{i=1}^p \alpha_i \epsilon_{t-i}^2 \]
where \(\sigma_t^2\) is the conditional variance at time \(t\), \(\omega\) is a constant, \(\alpha_i\) are the ARCH parameters, and \(\epsilon_{t-i}^2\) are the squared residuals at time \(t-i\).
The GARCH component adds \(q\) lagged conditional variances to the ARCH equation:
\[ \sum_{j=1}^q \beta_j \sigma_{t-j}^2 \]
where \(\beta_j\) are the GARCH parameters, and \(\sigma_{t-j}^2\) are the past conditional variances.
Together, the ARCH and GARCH components form the GARCH(p, q) model, which can be expressed as:
\[ \mathrm{GARCH}(p, q) : \sigma_t^2 = \omega + \sum_{i=1}^p \alpha_i \epsilon_{t-i}^2 + \sum_{j=1}^q \beta_j \sigma_{t-j}^2 \]
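The recursion is easier to follow when simulated directly. The sketch below generates a GARCH(1, 1) process in base R using the variance equation above with \(p = q = 1\). The parameter values are illustrative assumptions chosen so that \(\alpha + \beta < 1\), and the shocks are assumed to be standard normal.

```r
set.seed(2024)
n <- 5000
# Illustrative parameter values (assumptions), with alpha + beta < 1
omega <- 0.00001; alpha <- 0.10; beta <- 0.85

sigma2 <- numeric(n)   # conditional variances
eps    <- numeric(n)   # residuals (shocks)

# Start the recursion at the unconditional variance omega / (1 - alpha - beta)
sigma2[1] <- omega / (1 - alpha - beta)
eps[1]    <- sqrt(sigma2[1]) * rnorm(1)

for (t in 2:n) {
  # GARCH(1, 1): today's variance depends on yesterday's squared shock and variance
  sigma2[t] <- omega + alpha * eps[t - 1]^2 + beta * sigma2[t - 1]
  eps[t]    <- sqrt(sigma2[t]) * rnorm(1)
}

c(sample_var = var(eps), unconditional_var = omega / (1 - alpha - beta))
```

Plotting `eps` shows the volatility clustering the model is designed to capture: quiet periods and turbulent periods tend to persist.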
# install.packages("rugarch")
library(rugarch)
# Specify a standard GARCH(1,1) with a constant mean and Student-t innovations
gspec1 <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                     mean.model = list(armaOrder = c(0, 0)),
                     distribution.model = "std")
# You can use different variance, mean and distribution models
# Fit the specification to the TSLA daily returns
gfit1 <- ugarchfit(gspec1, s2r)
gfit1
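# Extract the standardized residuals from the fitted GARCH model
gerr1 <- as.numeric(residuals(gfit1, standardize = TRUE))
# Run a few more tests
# Ljung-Box Test -> H1: Errors are autocorrelated
Box.test(gerr1, lag = 16, type = "Ljung")
# ADF Test -> H1: Error has no unit root (it is stationary)
adf.test(gerr1)
# Jarque-Bera Test -> H1: Error is not normal
jarque.bera.test(gerr1)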
Exponential GARCH (EGARCH) models: These models allow for asymmetric volatility, where negative and positive shocks have different impacts on volatility.
Threshold GARCH (TGARCH) models: These models let the response of volatility to a shock depend on whether the shock falls above or below a threshold (typically zero), so that negative shocks can raise volatility more than positive shocks of the same size.
Component GARCH (CGARCH) models: These models decompose the volatility of a time series into two components: a long-term component and a short-term component.
GARCH-in-mean (GARCH-M) models: These models allow the conditional variance to enter the conditional mean equation.
Markov-switching GARCH (MS-GARCH) models: These models allow for changes in the volatility structure based on a hidden Markov process.
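The asymmetry idea behind the EGARCH and threshold variants can be sketched with a single variance update. The example below uses the GJR-GARCH(1, 1) form, a closely related threshold-type specification chosen here because it is simple to write down; it is not necessarily the exact model a given package fits. The function `gjr_update` and all parameter values are hypothetical and for illustration only.

```r
# One-step variance update of a GJR-GARCH(1, 1) (illustrative, threshold at zero):
# sigma2_t = omega + (alpha + gamma * I[eps_prev < 0]) * eps_prev^2 + beta * sigma2_prev
gjr_update <- function(eps_prev, sigma2_prev, omega, alpha, gamma, beta) {
  leverage <- gamma * (eps_prev < 0)   # extra weight applies to negative shocks only
  omega + (alpha + leverage) * eps_prev^2 + beta * sigma2_prev
}

# Illustrative parameter values (assumptions)
omega <- 0.00001; alpha <- 0.05; gamma <- 0.10; beta <- 0.85
sigma2_prev <- 0.0004

after_negative <- gjr_update(-0.03, sigma2_prev, omega, alpha, gamma, beta)
after_positive <- gjr_update( 0.03, sigma2_prev, omega, alpha, gamma, beta)

c(negative_shock = after_negative, positive_shock = after_positive)
```

A negative shock of the same size produces the larger next-period variance, which is the asymmetric response these models are meant to capture.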
getSymbols('TSLA')
getSymbols('BTC-USD')
getSymbols('USDTHB=X')