A. Model Introduction

One fundamental assumption for the classic linear regression model is homoskedasticity, meaning the variance of the error term is constant. However, in financial data, we often observe time-varying variance (heteroskedasticity). This is where a GARCH model (Generalized Autoregressive Conditional Heteroskedasticity) comes into play.

The GARCH model describes the variance of the current error term as following an ARMA process, instead of being constant.

Model Formula:

\[ y_t = x_t + \epsilon_t \]

Where: - \(y_t\) is a stock return, - \(x_t\) is a mean-reverting process, - \(\epsilon_t = \sqrt{\delta_t} z_t\), with \(z_t \sim D(0,1)\), where \(D\) is a specified distribution, - \(\delta_t\) is time-varying and follows an ARMA process: \[ \delta_t = \omega + \alpha_1 \epsilon^2_{t-1} + \dots + \alpha_q \epsilon^2_{t-q} + \beta_1 \delta^2_{t-1} + \dots + \beta_p \delta^2_{t-p} \]

B. Data Exploration

B.1 Import Data

library(quantmod)
## Warning: package 'quantmod' was built under R version 4.4.3
## Loading required package: xts
## Warning: package 'xts' was built under R version 4.4.3
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
google <- getSymbols("GOOGL", from="2010-01-01", to="2022-08-03", auto.assign = F)
head(google[, 1:5], 5)
##            GOOGL.Open GOOGL.High GOOGL.Low GOOGL.Close GOOGL.Volume
## 2010-01-04   15.68944   15.75350  15.62162    15.68443     78169752
## 2010-01-05   15.69520   15.71171  15.55405    15.61537    120067812
## 2010-01-06   15.66216   15.66216  15.17417    15.22172    158988852
## 2010-01-07   15.25025   15.26527  14.83108    14.86737    256315428
## 2010-01-08   14.81481   15.09635  14.74249    15.06557    188783028

B.2 Calculate Daily Returns

daily_ret <- (google$GOOGL.Close - stats::lag(google$GOOGL.Close)) / stats::lag(google$GOOGL.Close)

B.3 Convert to Data Frame

daily_ret <- data.frame(index(daily_ret), daily_ret)
colnames(daily_ret) <- c("date", "return")
rownames(daily_ret) <- 1:nrow(daily_ret)

B.4 Plot Daily Return

library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.4.3
p1 <- ggplot(daily_ret, aes(x=date, y=return))
p1 + geom_line(colour="steelblue") + labs(title="Google Stock Daily Return", x="Date", y="Return")
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_line()`).

B.5 Plot Histogram of Returns

p2 <- ggplot(daily_ret)
p2 + geom_histogram(aes(x=return, y=..density..), binwidth = 0.005, color="steelblue", fill="grey", size=1) +
  stat_function(fun = dnorm, args = list(mean = mean(daily_ret$return, na.rm = T), sd = sd(daily_ret$return, na.rm = T)), size=1)
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: Removed 1 row containing non-finite outside the scale range
## (`stat_bin()`).

B.6 Calculate Monthly Volatility

library(PerformanceAnalytics)
## Warning: package 'PerformanceAnalytics' was built under R version 4.4.3
## 
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
## 
##     legend
library(xts)
daily_ret_xts <- xts(daily_ret[,-1], order.by=daily_ret[,1])
realizedvol <- rollapply(daily_ret_xts, width = 20, FUN=sd.annualized)
vol <- data.frame(index(realizedvol), realizedvol)
colnames(vol) <- c("date", "volatility")

B.7 Plot Volatility

p3 <- ggplot(vol, aes(x=date, y=volatility))
p3 + geom_line(color="steelblue") + labs(title="Monthly Volatility", x="Date", y="Volatility")
## Warning: Removed 19 rows containing missing values or values outside the scale range
## (`geom_line()`).

C. Fit a GARCH Model

C.1 Model Specification

library(rugarch)
## Warning: package 'rugarch' was built under R version 4.4.3
## Loading required package: parallel
## 
## Attaching package: 'rugarch'
## The following object is masked from 'package:stats':
## 
##     sigma
garch_spec <- ugarchspec(variance.model=list(model="sGARCH", garchOrder=c(1,1)), mean.model=list(armaOrder=c(0,0)))

C.2 Fit the Model

fit_garch <- ugarchfit(spec = garch_spec, data = vol[-c(1:19),2])
fit_garch
## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,1)
## Mean Model   : ARFIMA(0,0,0)
## Distribution : norm 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error    t value Pr(>|t|)
## mu      0.183618    0.001297 141.605668        0
## omega   0.000276    0.000027  10.226852        0
## alpha1  0.999000    0.033681  29.660723        0
## beta1   0.000000    0.007195   0.000002        1
## 
## Robust Standard Errors:
##         Estimate  Std. Error   t value Pr(>|t|)
## mu      0.183618    0.010063 18.247385 0.000000
## omega   0.000276    0.000121  2.285627 0.022276
## alpha1  0.999000    0.084219 11.861995 0.000000
## beta1   0.000000    0.005298  0.000003 0.999998
## 
## LogLikelihood : 4348.209 
## 
## Information Criteria
## ------------------------------------
##                     
## Akaike       -2.7600
## Bayes        -2.7523
## Shibata      -2.7600
## Hannan-Quinn -2.7572
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                       1862       0
## Lag[2*(p+q)+(p+q)-1][2]      2695       0
## Lag[4*(p+q)+(p+q)-1][5]      4905       0
## d.o.f=0
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                     0.1021  0.7493
## Lag[2*(p+q)+(p+q)-1][5]    0.1275  0.9969
## Lag[4*(p+q)+(p+q)-1][9]    0.2006  0.9999
## d.o.f=2
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale P-Value
## ARCH Lag[3]  0.004844 0.500 2.000  0.9445
## ARCH Lag[5]  0.040714 1.440 1.667  0.9963
## ARCH Lag[7]  0.062792 2.315 1.543  0.9998
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  1.2256
## Individual Statistics:             
## mu     0.2599
## omega  0.4897
## alpha1 0.6476
## beta1  0.4083
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          1.07 1.24 1.6
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value   prob sig
## Sign Bias           1.2672 0.2052    
## Negative Sign Bias  0.3633 0.7164    
## Positive Sign Bias  1.0105 0.3123    
## Joint Effect        2.3665 0.4999    
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20      7769            0
## 2    30      7843            0
## 3    40      9979            0
## 4    50      8988            0
## 
## 
## Elapsed time : 0.31388