1 General directions for this Workshop

You will work in RStudio. Create an R Notebook document (File -> New File -> R Notebook), where you will write everything that is asked in this workshop.

At the beginning of the R Notebook write Workshop 7 - Financial Econometrics II and your name (as we did in the previous workshop).

You have to replicate all the steps explained in this workshop, and ALSO you have to do whatever is asked. Any QUESTION or any STEP you need to do will be written in CAPITAL LETTERS. For ANY QUESTION, you have to RESPOND IN CAPITAL LETTERS right after the question.

It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.

You have to keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file. Pay attention in class to know how to generate an html file from your .Rmd.

2 Introduction to the ARCH model

You have to read my note "Introducción a modelos ARCH" (Introduction to ARCH Models).

An ARCH model is designed to model the daily variance of financial returns. We will start with the ARCH(1) model. Assume that \(Y_t\) is the daily continuously compounded return of a financial instrument; then we can model the series and its variance as follows:

\[ Y_{t}=\beta_{0}+\varepsilon_{t} \] In this case, we are interested in the shock or error of the series, not in the actual mean return of the series. This is why I wrote the daily value of Y as a mean value \(\beta_0\) plus a random shock (error). This error is supposed to behave like a normally distributed variable with mean equal to zero and a specific variance. I will now focus on how to model the variance of this daily error series:

The shock follows a normal distribution with mean 0 and volatility equal to the square root of \(h_t\): \[ \varepsilon_{t}\thicksim N(0,\sqrt{h_{t}}) \] Now we will model the variance \(h_t\) as a process driven by the previous squared shock, similar in structure to an MA process:

\[ h_{t}=\alpha_{0}+\alpha_{1}\varepsilon_{t-1}^{2} \] We see that today's variance is equal to a coefficient \(\alpha_0\) plus \(\alpha_1\) times yesterday's squared shock. The coefficients must satisfy:

\[ \alpha_{0}\geq 0;\quad 0\leq\alpha_{1}<1 \]

This model is called ARCH(1) since we are considering only the lag 1 of the shock as a factor that impacts the return variance of today.

We can see that \(\alpha_1\) acts like a filter with values from 0 to 1. If \(\alpha_1\) is close to 1, then almost all the information in the previous squared shock will pass to today's variance. Hence, the higher the value of \(\alpha_1\), the greater the impact of yesterday's shock on today's return variance.
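A useful consequence of these restrictions (a standard result for a stationary ARCH(1) process): taking expectations of the variance equation, and using \(E(\varepsilon_{t-1}^{2})=E(h_{t-1})=E(h_{t})\), gives the unconditional (long-run) variance

\[ \mathrm{Var}(\varepsilon_{t})=\frac{\alpha_{0}}{1-\alpha_{1}} \]

which is only finite and positive when \(0\leq\alpha_{1}<1\); this is where the restriction on \(\alpha_1\) comes from.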

We will play with a simulation to better understand this ARCH(1) model.

2.1 Simulating ARCH(1)

Now you have to simulate 2 ARCH(1) models using 500 periods (days) with the following specifications:

For model 1 use \(\beta_0=0\), \(\alpha_0=1\), and \(\alpha_1=0.90\). You can do this simulation in R as follows:

alpha0 <- 1
alpha1 <- 0.9
B0 <- 0

I create empty vectors for the 500 variances and shocks:

n <- 500
error <- numeric(n)
h <- numeric(n)

The numeric function creates a numeric vector filled with zeros.

I generate the first random shock following the specifications:

h[1] <- alpha0 + alpha1*0
# Since there is no shock at day 0, I assign 0 to that shock, so only
#   alpha0 enters the formula for the first variance

error[1] <- rnorm(n = 1, mean = 0, sd = sqrt(h[1]))

Now create the Y variable, which will be an ARCH(1) process. Assign the value for Y for the first day:

Y <- numeric(n)
Y[1] <- B0 + error[1]

Now generate the variances and shocks for the rest of the days:

for (i in 2:n){
  h[i] <- alpha0 + alpha1*((error[i-1]^2))
  error[i] <- rnorm(n=1, 0, sqrt(h[i]))
}

We can see the first values of h and error:

head(error)
## [1] -0.2122812 -1.0245506  1.7442818 -3.5002572  3.8034050  0.8138148
head(h)
## [1]  1.000000  1.040557  1.944734  3.738267 12.026621 14.019300

Now I create the Y values from day 2 to day 500:

for (i in 2:n){
  Y[i] <- B0 + error[i]
}
plot(Y, type = "l", col = "blue")

Now, using the same procedure, SIMULATE model 2 (as Y2) as another ARCH(1) process. Use alpha0=0.1 and alpha1=0.1.

Plot Y2 and compare it with Y.
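A minimal sketch of this second simulation, reusing n and B0 from above (the names alpha0_2, alpha1_2, h2, error2, and Y2 are my own):

alpha0_2 <- 0.1
alpha1_2 <- 0.1
h2 <- numeric(n); error2 <- numeric(n); Y2 <- numeric(n)

h2[1] <- alpha0_2          # no shock at day 0, so only alpha0 enters
error2[1] <- rnorm(1, 0, sqrt(h2[1]))
Y2[1] <- B0 + error2[1]

for (i in 2:n){
  h2[i] <- alpha0_2 + alpha1_2 * error2[i-1]^2
  error2[i] <- rnorm(1, 0, sqrt(h2[i]))
  Y2[i] <- B0 + error2[i]
}
plot(Y2, type = "l", col = "red")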

WHAT IS THE DIFFERENCE BETWEEN THE SERIES Y AND Y2?

Looking at the original ARCH equation, EXPLAIN why you think both series are different.

3 Further understanding of volatility

In this exercise we analyze the volatility of financial returns using rolling windows. It has been shown that the volatility of financial instruments changes over time. As we practiced in the previous exercise, ARCH-family models are designed to model this changing volatility, which is the standard deviation of returns (strictly, these models analyze the changing variance, but the volatility is the square root of the variance). In this part, we will use daily returns of a market index and calculate rolling means and rolling standard deviations using a moving/rolling window of 20 business days.

  1. Download daily returns of the IPC (from Jan 2008 to date) from Yahoo Finance.
# Load the quantmod library to use the getSymbols command:
library(quantmod)
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## Loading required package: TTR
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
IPC <- getSymbols("^MXX", from="2008-01-01",auto.assign = FALSE)[,6]
## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
## 
## This message is shown once per session and may be disabled by setting 
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.
# auto.assign=FALSE since I want the dataset to be called IPC, not MXX
# I take column 6 of the dataset, which is the adjusted price

# I calculate cc returns of the IPC index:

ipc_ret <- na.omit(diff(log(IPC)))
  2. Using the rollapply() function from the zoo package, and a moving window of 20 business days (one month), generate a dataset with the rolling mean and rolling volatility of the IPC continuously compounded returns:
library(zoo)
roll_mean <- na.omit(rollapply(ipc_ret, 20, mean))
roll_sd <- na.omit(rollapply(ipc_ret, 20, sd))

ipc_mean_vol <- merge(roll_mean, roll_sd)
colnames(ipc_mean_vol) <- c("roll_mean", "roll_sd")
  3. R will generate a new dataset that contains one time window per row. Now plot both variables, the 20-day rolling mean and the 20-day rolling volatility (the sd will be the red line and the mean the black line):
plot(ipc_mean_vol)

WHAT DO YOU OBSERVE? In which periods do you observe more volatility? Describe, in your own words, whether you see a relationship between volatility and average returns.

  4. Do the same steps 2 and 3, but now reduce the window to 5 days (a sketch is shown below). Report your responses to the same questions and compare the results with the 20-day window results.
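A minimal sketch of the 5-day version, reusing the code from step 2 (the object names with the 5 suffix are my own):

roll_mean5 <- na.omit(rollapply(ipc_ret, 5, mean))
roll_sd5 <- na.omit(rollapply(ipc_ret, 5, sd))
ipc_mean_vol5 <- merge(roll_mean5, roll_sd5)
colnames(ipc_mean_vol5) <- c("roll_mean", "roll_sd")
plot(ipc_mean_vol5)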

4 Modeling the IPC returns with an ARCH model

  1. Download daily data of the IPC from Yahoo Finance, from Jan 1, 2013 to date. Check whether the IPC cc returns are heteroscedastic; in other words, check whether the IPC returns follow an ARCH process.
IPC13 <- getSymbols("^MXX", from="2013-01-01",auto.assign = FALSE)[,6]

# I calculate cc returns:
ipc_ret13 <- na.omit(diff(log(IPC13)))

colnames(ipc_ret13) <- c("r")

Now we check whether the IPC returns are heteroscedastic. You can do this by testing whether the cc IPC returns follow an ARCH process. Do this test and interpret it.

We do this test with the ArchTest function from the FinTS package. Install this package first:

#install.packages("FinTS")
library(FinTS)
## Warning: package 'FinTS' was built under R version 4.0.3
ArchTest(ipc_ret13$r)
## 
##  ARCH LM-test; Null hypothesis: no ARCH effects
## 
## data:  ipc_ret13$r
## Chi-squared = 454.01, df = 12, p-value < 2.2e-16

INTERPRETATION:

Since the p-value of this test is less than 0.05, we can reject the NULL hypothesis, which states that there are NO ARCH effects. In sum, we have statistical evidence that the IPC returns are heteroscedastic (their variance changes over time).
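To see the idea behind this LM test, here is a minimal sketch of its simplest version with only 1 lag (my own code; ArchTest uses 12 lags by default, which is why df = 12 above): regress the squared returns on their own lag and check whether the slope is significant.

r2 <- as.numeric(ipc_ret13$r)^2
arch_lm <- lm(r2[-1] ~ r2[-length(r2)])  # squared returns on their first lag
summary(arch_lm)  # a significant slope is evidence of ARCH effects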

  2. Check if an ARCH(1) model fits the series.

We need to install the fGarch package first.

library(fGarch)
## Warning: package 'fGarch' was built under R version 4.0.4
## Loading required package: timeDate
## Warning: package 'timeDate' was built under R version 4.0.3
## Loading required package: timeSeries
## Warning: package 'timeSeries' was built under R version 4.0.3
## 
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
## 
##     time<-
## Loading required package: fBasics
## Warning: package 'fBasics' was built under R version 4.0.4
## 
## Attaching package: 'fBasics'
## The following object is masked from 'package:TTR':
## 
##     volatility
arch.fit <- garchFit(~garch(1,0), data = ipc_ret13$r, trace = F)
## Warning: Using formula(x) is deprecated when x is a character vector of length > 1.
##   Consider formula(paste(x, collapse = " ")) instead.
summary(arch.fit)
## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 0), data = ipc_ret13$r, trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 0)
## <environment: 0x000000001b3ef4a8>
##  [data = ipc_ret13$r]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1  
## 5.0893e-05  6.7802e-05  3.0348e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     5.089e-05   1.921e-04    0.265    0.791    
## omega  6.780e-05   2.803e-06   24.185  < 2e-16 ***
## alpha1 3.035e-01   3.752e-02    8.088 6.66e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  7135.907    normalized:  3.249502 
## 
## Description:
##  Tue Oct 05 15:19:26 2021 by user: Alberto 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value    
##  Jarque-Bera Test   R    Chi^2  559.8914  0          
##  Shapiro-Wilk Test  R    W      0.9780766 0          
##  Ljung-Box Test     R    Q(10)  26.58328  0.003030058
##  Ljung-Box Test     R    Q(15)  31.22136  0.008204608
##  Ljung-Box Test     R    Q(20)  40.15469  0.0047752  
##  Ljung-Box Test     R^2  Q(10)  240.6274  0          
##  Ljung-Box Test     R^2  Q(15)  382.0976  0          
##  Ljung-Box Test     R^2  Q(20)  437.5122  0          
##  LM Arch Test       R    TR^2   201.9699  0          
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.496273 -6.488493 -6.496276 -6.493430
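Before moving on, we can extract and plot the fitted conditional volatility from this model (a quick sketch; volatility() is the fGarch extractor for the fitted \(\sqrt{h_t}\) series):

plot(volatility(arch.fit), type = "l", col = "blue",
     main = "ARCH(1) conditional volatility of IPC returns")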

We can also run the same ARCH model using the ugarchfit function from the rugarch package.

library(rugarch)
## Warning: package 'rugarch' was built under R version 4.0.3
## Loading required package: parallel
## 
## Attaching package: 'rugarch'
## The following object is masked from 'package:stats':
## 
##     sigma
#ARCH(1) 

Espec1 <- ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(1, 0),
                        submodel = NULL, external.regressors = NULL,
                        variance.targeting = FALSE),
  mean.model = list(armaOrder = c(0, 0), include.mean = TRUE, archm = FALSE,
                    archpow = 0, arfima = FALSE, external.regressors = NULL,
                    archex = FALSE),
  distribution.model = "norm", start.pars = list(), fixed.pars = list())

# Running the ARCH model with the specifications:
arch1 <- ugarchfit(spec = Espec1, data = ipc_ret13$r)
arch1
## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,0)
## Mean Model   : ARFIMA(0,0,0)
## Distribution : norm 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.000051    0.000192  0.26478  0.79118
## omega   0.000068    0.000003 24.17530  0.00000
## alpha1  0.303946    0.037582  8.08763  0.00000
## 
## Robust Standard Errors:
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.000051    0.000195  0.26131 0.793851
## omega   0.000068    0.000005 13.37830 0.000000
## alpha1  0.303946    0.077274  3.93333 0.000084
## 
## LogLikelihood : 7135.912 
## 
## Information Criteria
## ------------------------------------
##                     
## Akaike       -6.4963
## Bayes        -6.4885
## Shibata      -6.4963
## Hannan-Quinn -6.4934
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                         statistic  p-value
## Lag[1]                      8.291 0.003984
## Lag[2*(p+q)+(p+q)-1][2]     9.506 0.002582
## Lag[4*(p+q)+(p+q)-1][5]    14.119 0.000765
## d.o.f=0
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic   p-value
## Lag[1]                      1.848 1.740e-01
## Lag[2*(p+q)+(p+q)-1][2]    17.131 2.525e-05
## Lag[4*(p+q)+(p+q)-1][5]    71.770 0.000e+00
## d.o.f=1
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale   P-Value
## ARCH Lag[2]     30.51 0.500 2.000 3.321e-08
## ARCH Lag[4]     82.62 1.397 1.611 0.000e+00
## ARCH Lag[6]    106.50 2.222 1.500 0.000e+00
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  2.6423
## Individual Statistics:              
## mu     0.08003
## omega  2.09380
## alpha1 1.10020
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          0.846 1.01 1.35
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value   prob sig
## Sign Bias           0.7233 0.4696    
## Negative Sign Bias  0.6675 0.5045    
## Positive Sign Bias  0.8625 0.3885    
## Joint Effect        1.1992 0.7532    
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20     70.10    8.846e-08
## 2    30     78.26    2.075e-06
## 3    40     94.60    1.609e-06
## 4    50    109.01    1.863e-06
## 
## 
## Elapsed time : 0.3794761

Both functions produce essentially the same estimates for the ARCH coefficients.

INTERPRET the OUTPUT of the ARCH(1) model. You have to interpret the omega, alpha1, and mu coefficients. The omega coefficient is equivalent to the \(\alpha_0\) coefficient of our model, alpha1 is our \(\alpha_1\), and the mu coefficient is our \(\beta_0\).
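Before running the next model, recall that a GARCH(1,1) extends the ARCH variance equation by adding yesterday's variance as a second term:

\[ h_{t}=\alpha_{0}+\alpha_{1}\varepsilon_{t-1}^{2}+\beta_{1}h_{t-1} \]

In the output below, omega corresponds to \(\alpha_0\), alpha1 to \(\alpha_1\), and beta1 to \(\beta_1\).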

  3. Check if a GARCH(1,1) fits the series. Run the model:
garch.fit <- garchFit(~garch(1,1), data = ipc_ret13$r, trace = F)
## Warning: Using formula(x) is deprecated when x is a character vector of length > 1.
##   Consider formula(paste(x, collapse = " ")) instead.
summary(garch.fit)
## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 1), data = ipc_ret13$r, trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 1)
## <environment: 0x000000001a30d3d0>
##  [data = ipc_ret13$r]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1       beta1  
## 1.4403e-04  3.0743e-06  1.0723e-01  8.6065e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     1.440e-04   1.708e-04    0.843 0.398952    
## omega  3.074e-06   8.166e-07    3.765 0.000167 ***
## alpha1 1.072e-01   1.536e-02    6.982 2.91e-12 ***
## beta1  8.607e-01   2.026e-02   42.490  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  7265.342    normalized:  3.308443 
## 
## Description:
##  Tue Oct 05 15:19:28 2021 by user: Alberto 
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value     
##  Jarque-Bera Test   R    Chi^2  90.46075  0           
##  Shapiro-Wilk Test  R    W      0.9916303 6.190389e-10
##  Ljung-Box Test     R    Q(10)  21.25434  0.01938818  
##  Ljung-Box Test     R    Q(15)  24.5425   0.0564351   
##  Ljung-Box Test     R    Q(20)  33.36706  0.03073904  
##  Ljung-Box Test     R^2  Q(10)  22.45791  0.01293441  
##  Ljung-Box Test     R^2  Q(15)  25.64162  0.04196263  
##  Ljung-Box Test     R^2  Q(20)  32.38531  0.0393617   
##  LM Arch Test       R    TR^2   24.40085  0.01793165  
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.613244 -6.602871 -6.613250 -6.609453

We can also run the same model using ugarchfit:

Espec2 <- ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(1, 1),
                        submodel = NULL, external.regressors = NULL,
                        variance.targeting = FALSE),
  mean.model = list(armaOrder = c(0, 0), include.mean = TRUE, archm = FALSE,
                    archpow = 0, arfima = FALSE, external.regressors = NULL,
                    archex = FALSE),
  distribution.model = "norm", start.pars = list(), fixed.pars = list())

# Estimating the GARCH(1,1)
garch1 <- ugarchfit(spec = Espec2, data = ipc_ret13$r)
garch1
## 
## *---------------------------------*
## *          GARCH Model Fit        *
## *---------------------------------*
## 
## Conditional Variance Dynamics    
## -----------------------------------
## GARCH Model  : sGARCH(1,1)
## Mean Model   : ARFIMA(0,0,0)
## Distribution : norm 
## 
## Optimal Parameters
## ------------------------------------
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.000144    0.000171  0.84138  0.40014
## omega   0.000003    0.000002  1.64675  0.09961
## alpha1  0.106936    0.014019  7.62808  0.00000
## beta1   0.861639    0.019245 44.77301  0.00000
## 
## Robust Standard Errors:
##         Estimate  Std. Error  t value Pr(>|t|)
## mu      0.000144    0.000154  0.93206 0.351306
## omega   0.000003    0.000009  0.32911 0.742076
## alpha1  0.106936    0.024579  4.35077 0.000014
## beta1   0.861639    0.073470 11.72785 0.000000
## 
## LogLikelihood : 7265.34 
## 
## Information Criteria
## ------------------------------------
##                     
## Akaike       -6.6132
## Bayes        -6.6029
## Shibata      -6.6132
## Hannan-Quinn -6.6095
## 
## Weighted Ljung-Box Test on Standardized Residuals
## ------------------------------------
##                         statistic  p-value
## Lag[1]                      7.803 0.005216
## Lag[2*(p+q)+(p+q)-1][2]     8.456 0.004898
## Lag[4*(p+q)+(p+q)-1][5]    11.716 0.003307
## d.o.f=0
## H0 : No serial correlation
## 
## Weighted Ljung-Box Test on Standardized Squared Residuals
## ------------------------------------
##                         statistic p-value
## Lag[1]                    0.09301 0.76038
## Lag[2*(p+q)+(p+q)-1][5]   3.02490 0.40252
## Lag[4*(p+q)+(p+q)-1][9]   8.93277 0.08391
## d.o.f=2
## 
## Weighted ARCH LM Tests
## ------------------------------------
##             Statistic Shape Scale P-Value
## ARCH Lag[3]     2.738 0.500 2.000 0.09797
## ARCH Lag[5]     4.927 1.440 1.667 0.10686
## ARCH Lag[7]    10.221 2.315 1.543 0.01635
## 
## Nyblom stability test
## ------------------------------------
## Joint Statistic:  8.3883
## Individual Statistics:              
## mu     0.03645
## omega  1.81276
## alpha1 0.21722
## beta1  0.20653
## 
## Asymptotic Critical Values (10% 5% 1%)
## Joint Statistic:          1.07 1.24 1.6
## Individual Statistic:     0.35 0.47 0.75
## 
## Sign Bias Test
## ------------------------------------
##                    t-value   prob sig
## Sign Bias           0.5826 0.5602    
## Negative Sign Bias  1.0974 0.2726    
## Positive Sign Bias  0.7030 0.4821    
## Joint Effect        1.9291 0.5873    
## 
## 
## Adjusted Pearson Goodness-of-Fit Test:
## ------------------------------------
##   group statistic p-value(g-1)
## 1    20     40.92    2.472e-03
## 2    30     55.12    2.399e-03
## 3    40     82.94    5.224e-05
## 4    50     97.26    4.953e-05
## 
## 
## Elapsed time : 0.164115

INTERPRET the model output. What does each coefficient mean?

  4. Do a forecast of the volatility for 5 days into the future, and make a graph to show this prediction. To do this, run the following:
predict(garch.fit, n.ahead=5, plot=TRUE)

##   meanForecast   meanError standardDeviation lowerInterval upperInterval
## 1 0.0001440273 0.008363383       0.008363383   -0.01624790    0.01653596
## 2 0.0001440273 0.008412711       0.008412711   -0.01634458    0.01663264
## 3 0.0001440273 0.008460181       0.008460181   -0.01643762    0.01672568
## 4 0.0001440273 0.008505873       0.008505873   -0.01652718    0.01681523
## 5 0.0001440273 0.008549866       0.008549866   -0.01661340    0.01690146
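The same 5-day-ahead volatility forecast can be obtained from the rugarch fit (a sketch; ugarchforecast and the sigma extractor are part of rugarch):

fcst <- ugarchforecast(garch1, n.ahead = 5)
fcst
sigma(fcst)  # forecast of the daily volatility (standard deviation)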

5 Estimating the market model under heterogeneity of variance

  1. Download the daily returns of CEMEX and the IPC (from Jan 2015 to date) from Yahoo Finance. Save the variable as datacemex. (A sketch of the steps in this section appears after this list.)

  2. Run a simple market model to estimate the beta coefficient of CEMEX. In comments, interpret the beta coefficient, the Jensen alpha coefficient, and their p-values.

  3. The OLS regression method you used in the previous point is adequate when we can assume no autocorrelation of the errors and homogeneity (constancy) of the error variance, as any OLS regression model does. If these assumptions do not hold, the OLS coefficients might still be accurate, but their standard errors will not be reliable. You have to examine the data to check for both autocorrelation and heterogeneity of variance. Do the following:

  4. Check evidence for homogeneity of error variance in the market model. Do the corresponding test and interpret it. (Test whether the residuals of the regression follow an ARCH process.)

  5. Run a market model with ARCH / GARCH effects:

  1. Run an ARCH(5) market model (hint: use the ugarchfit function, including the market returns as an external regressor)

  2. Run a GARCH(1,1) market model:

In your own words, INTERPRET the coefficients of each model.
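A minimal sketch of these steps, under my own choices of ticker ("CEMEXCPO.MX") and object names; adapt it to your data:

library(quantmod)
library(FinTS)
library(rugarch)

# 1. Download prices and compute cc returns of CEMEX and the IPC
#    ("CEMEXCPO.MX" is my assumption for CEMEX's series on Yahoo Finance)
cemex <- getSymbols("CEMEXCPO.MX", from = "2015-01-01", auto.assign = FALSE)[,6]
mxx <- getSymbols("^MXX", from = "2015-01-01", auto.assign = FALSE)[,6]
datacemex <- na.omit(merge(diff(log(cemex)), diff(log(mxx))))
colnames(datacemex) <- c("cemex_r", "ipc_r")

# 2. Simple OLS market model: Jensen alpha = intercept, beta = slope
mm <- lm(cemex_r ~ ipc_r, data = as.data.frame(datacemex))
summary(mm)

# 4. ARCH test on the residuals of the market model
ArchTest(residuals(mm))

# 5a. ARCH(5) market model: the market return enters the mean equation
#     as an external regressor
spec_arch5 <- ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(5, 0)),
  mean.model = list(armaOrder = c(0, 0), include.mean = TRUE,
                    external.regressors = as.matrix(datacemex$ipc_r)),
  distribution.model = "norm")
fit_arch5 <- ugarchfit(spec = spec_arch5, data = datacemex$cemex_r)
fit_arch5

# 5b. GARCH(1,1) market model: same mean equation, garchOrder = c(1, 1)
spec_g11 <- ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
  mean.model = list(armaOrder = c(0, 0), include.mean = TRUE,
                    external.regressors = as.matrix(datacemex$ipc_r)),
  distribution.model = "norm")
fit_g11 <- ugarchfit(spec = spec_g11, data = datacemex$cemex_r)
fit_g11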

6 Quiz 8 and W8 submission

Go to Canvas and answer Quiz 8. You will have 3 attempts. Questions in this quiz are related to concepts from the readings for this workshop.

The grade of this Workshop will be the following:

  • Complete (100%): If you submit an ORIGINAL and COMPLETE HTML file with all the activities, with your notes, and with your OWN RESPONSES to questions
  • Incomplete (75%): If you submit an ORIGINAL HTML file but you did NOT RESPOND to all the questions and/or did not complete all the activities.
  • Very Incomplete (10%-70%): If you completed only part of the workshop, or you completed more but parts of your work are a copy-paste from other workshops.
  • Not submitted (0%)

Remember that you have to submit your .html file through Canvas BEFORE NEXT CLASS.