About
In this worksheet we look at different variance, covariance, volatility, and causality calculations. We finish with a short mathematical proof (no R required).
Setup
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read the instructions below carefully and follow them to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
Note
For clarity, tasks/questions to be completed/answered are highlighted in red color (color visible only in preview mode) and numbered according to their particular placement in the task section. Type your answers outside the red color tags!
Quite often you will need to add your own code chunk. Execute all code chunks sequentially, preview, publish, and submit the link on Sakai following the naming convention. Make sure to add comments to your code where appropriate. Use your own words!
Any sign of plagiarism will result in dismissal of the work!
Task 1: Variance, Covariance, and Volatility
This task follows two examples in the book: R Example 2.5 (p. 58) and R Example 2.6 (p. 66).
# require() tries to load the package and returns FALSE if it is not installed;
# in that case the package is installed from CRAN
# dependencies = TRUE makes sure that the package's dependencies are installed too
if(!require("quantmod",quietly = TRUE))
  install.packages("quantmod",dependencies = TRUE, repos = "https://cloud.r-project.org")
##### 1A) Calculate the correlation and covariance matrix of the adjusted daily log returns for four different stocks of your choice. Explain your observations in terms of potential relationships.
# Once you have obtained the adjusted daily log returns for your stocks, omitting the time index, you will need to combine them to create a matrix. Below is an example. For more details see the Help command in R on cbind, cov, and cor.
# M <- cbind(A,B,C) # create a matrix where each column is an array/vector of numerical values
# cov(M) # compute the covariance matrix
# cor(M, method="pearson") # compute the correlation matrix based on the Pearson method
library(quantmod)
symbols=c('BMO', 'KR', 'MSFT', 'XOM')
getSymbols(symbols, src='yahoo')
[1] "BMO" "KR" "MSFT" "XOM"
bmord = periodReturn(BMO, period = "daily", type ="log")
krrd = periodReturn(KR, period = "daily", type ="log")
msftrd = periodReturn(MSFT, period = "daily", type ="log")
xomrd = periodReturn(XOM, period = "daily", type ="log")
# Combine the four return series column-wise (one column per stock)
M <- cbind(bmord,krrd,msftrd,xomrd)
colnames(M) <- c('BMO', 'KR', 'MSFT', 'XOM')
# compute the covariance matrix
options(scipen=999)
cov(M)
BMO KR MSFT XOM
BMO 0.00026902159 0.00008059575 0.00013456559 0.00013650491
KR 0.00008059575 0.00027936674 0.00008433024 0.00008563205
MSFT 0.00013456559 0.00008433024 0.00029532422 0.00013858133
XOM 0.00013650491 0.00008563205 0.00013858133 0.00023134334
# compute the correlation matrix based on the Pearson method
cor(M)
BMO KR MSFT XOM
BMO 1.0000000 0.2939891 0.4774094 0.5471750
KR 0.2939891 1.0000000 0.2935935 0.3368375
MSFT 0.4774094 0.2935935 1.0000000 0.5301841
XOM 0.5471750 0.3368375 0.5301841 1.0000000
Both the covariance and the correlation matrices of the daily log returns of BMO, KR, MSFT and XOM show that all pairs are positively related: every off-diagonal entry is greater than 0. Correlation also tells us the degree to which the stocks tend to move together. The correlations range from about 0.29 (KR with BMO or MSFT) to about 0.55 (BMO with XOM), which indicates a weak-to-moderate positive relationship between the stocks, with KR the least related to the other three.
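To illustrate how the two matrices relate, here is a minimal sketch on simulated returns (not the Yahoo data above). The column names A, B, C and the shared "market factor" are hypothetical, chosen only for illustration. The sketch checks that the correlation matrix is the covariance matrix rescaled by the standard deviations, which is why correlations can be compared across pairs while raw covariances cannot.

```r
set.seed(123)
n <- 500
common <- rnorm(n, sd = 0.01)          # hypothetical shared market factor

# Three simulated daily return series; A and B load on the common factor
sim <- cbind(
  A = common + rnorm(n, sd = 0.01),
  B = common + rnorm(n, sd = 0.02),
  C = rnorm(n, sd = 0.015)             # independent of the factor
)

S <- cov(sim)                          # covariance matrix
R <- cor(sim, method = "pearson")      # correlation matrix

# Correlation = covariance divided by the product of standard deviations
s <- sqrt(diag(S))
R_manual <- S / outer(s, s)

print(round(R, 3))
print(all.equal(R, R_manual))          # the two constructions agree
```

The base R function `cov2cor(S)` performs the same rescaling in one call.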
##### 1B) Calculate the three types of volatility for a particular stock of your choice. Consider a time window extending one year back from most recent obtainable closing day price. Order the three estimates from low to high volatility and explain how the ordering makes sense.
# For this task make sure you understand well what the variables n,m represent in the book's referenced example.
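# One-year window of MSFT prices
msft = MSFT['2017-12-03/2018-12-03']
# Number of trading days in the window (note: this overwrites the matrix M from 1A)
M = length(msft[, "MSFT.Close"])
ohlc <- msft[, c("MSFT.Open", "MSFT.High", "MSFT.Low", "MSFT.Close")]
# n = M uses the whole window for each estimate; N = 252 annualizes with trading days per year
vClose = volatility(ohlc, n = M, calc = "close", N = 252)
vParkinson = volatility(ohlc, n = M, calc = "parkinson", N = 252)
# Garman-Klass estimator (uses open, high, low and close)
vGK = volatility(ohlc, n = M, calc = "garman", N = 252)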
vClose[M]
[,1]
2018-12-03 0.2599985
vParkinson[M]
[,1]
2018-12-03 0.2222632
vGK[M]
[,1]
2018-12-03 0.2193207
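The estimates of annualized volatility for Microsoft Corporation, ordered from low to high, are: 0.219 for Garman-Klass, 0.222 for Parkinson, and 0.260 for close-to-close. The ordering makes sense because Parkinson assumes continuous trading and considers only the high and low prices, whereas Garman-Klass also considers the open and close values. Hence the overall volatility for the year according to the Garman-Klass model is lower than under Parkinson's model. The close-to-close estimate is the highest; a plausible reason is that it is the only one of the three that reflects overnight moves between one day's close and the next day's open.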
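For reference, the three estimators can be sketched by hand in base R. This is only an illustration under stated assumptions: the OHLC data are simulated (a hypothetical driftless random walk with 78 intraday steps and no overnight gaps), and the formulas are the usual textbook forms of the close-to-close, Parkinson and Garman-Klass estimators, which may differ in small details from what `volatility()` computes internally.

```r
set.seed(42)
n_days <- 252        # length of the estimation window
steps  <- 78         # hypothetical number of intraday price steps
sigma_daily <- 0.015 # assumed true daily volatility
N <- 252             # trading days per year, used to annualize

op <- hi <- lo <- cl <- numeric(n_days)
price <- 100
for (d in seq_len(n_days)) {
  # Intraday log-price path starting at the previous close (no overnight gap)
  path <- price * exp(cumsum(c(0, rnorm(steps, sd = sigma_daily / sqrt(steps)))))
  op[d] <- path[1]
  hi[d] <- max(path)
  lo[d] <- min(path)
  cl[d] <- path[steps + 1]
  price <- cl[d]
}

# Close-to-close: annualized standard deviation of daily log returns
vol_close <- sqrt(N) * sd(diff(log(cl)))

# Parkinson: based only on the daily high-low range
vol_parkinson <- sqrt(N / (4 * n_days * log(2)) * sum(log(hi / lo)^2))

# Garman-Klass: high-low range combined with the open-to-close move
vol_gk <- sqrt(N / n_days *
               sum(0.5 * log(hi / lo)^2 - (2 * log(2) - 1) * log(cl / op)^2))

print(round(c(close = vol_close, parkinson = vol_parkinson, garman_klass = vol_gk), 4))
```

Because the simulated prices have no overnight jumps, any gap between the estimates here comes only from sampling noise and from observing the range at discrete steps, not from the overnight effect discussed above.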
Task 2: Auto-Correlation and Auto-Regression
Follow the examples in the book: R Example 3.2 (p. 74) and R Example 4.1 (p. 115).
##### 2A) Calculate the ACF for a stock of your choice. Consider both the log return and squared log return. Interpret your results in terms of possible existence of autocorrelation.
library(quantmod)
acf(na.omit(xomrd), main = "ACF OF XOM", ylim = c(-0.2, 0.2))

acf(na.omit(xomrd)^2, main = "ACF OF XOM Squared log return", ylim = c(-0.3, 0.5))

In the daily log returns, positive autocorrelation appears at lags 3, 8, 16 and 20, and negative autocorrelation at lags 1, 2, 7, 17 and 18. The ACF of the squared log returns indicates that there is some linear dependence of the variance of Exxon Mobil returns on its past values.
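To make explicit what the ACF plot reports, here is a minimal sketch that computes the sample autocorrelation by hand and compares it with `acf()`. The series is simulated white noise (an assumption for illustration, not the XOM returns), and the approximate 95% band of ±1.96/√n is the usual rule of thumb for judging which lags stand out.

```r
set.seed(7)
x <- rnorm(300)                 # simulated stand-in for a return series
n <- length(x)
xc <- x - mean(x)               # demeaned series
max_lag <- 10

# Sample autocorrelation at lags 0..max_lag
rho <- sapply(0:max_lag, function(k) {
  sum(xc[(k + 1):n] * xc[1:(n - k)]) / sum(xc^2)
})

# Same quantity from the built-in function
builtin <- as.numeric(acf(x, lag.max = max_lag, plot = FALSE)$acf)
print(all.equal(rho, builtin))

# Lags (1..max_lag) whose autocorrelation falls outside the approximate 95% band
band <- 1.96 / sqrt(n)
significant_lags <- which(abs(rho[-1]) > band)
print(significant_lags)
```

Applying the same computation to `x^2` gives the ACF of the squared series, which is the one used to look for dependence in the variance.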
##### 2B) Plot the exchange rate for USD versus another currency of your choice. Interpret your results in terms of behavior.
getFX("USD/NGN")
[1] "USDNGN"
plot(USDNGN)

The graph shows a fluctuating USD/NGN exchange rate: it goes from very low to average, then low again, and starts to rise from September 2018.
##### 2C) Test for the possible existence of an underlying AR(1) – Markov process in your exchange rate currency pair. To this end, plot the ACF and the partial ACF (PACF). Interpret your results. Clearly refer to the lags, and their impacts in determining the order.
acf(USDNGN)

pacf(USDNGN)

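The ACF shows a slow exponential decay over successive lags, which is consistent with the USD/NGN series behaving as an AR(1) process. For an AR(1) process the PACF is expected to cut off after lag 1, so a single significant spike at lag 1 in the PACF plot would confirm the order.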
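The signature of an AR(1) process can be checked on simulated data. The sketch below assumes a hypothetical coefficient phi = 0.8 and uses `arima.sim()` from base R; it is not fitted to the USD/NGN series. The sample ACF should decay roughly like phi^k, while the sample PACF should be close to phi at lag 1 and near zero afterwards.

```r
set.seed(2018)
phi <- 0.8                                   # hypothetical AR(1) coefficient
x <- arima.sim(model = list(ar = phi), n = 5000)

# Sample ACF (dropping lag 0) and PACF for lags 1..5
a <- as.numeric(acf(x, lag.max = 5, plot = FALSE)$acf)[-1]
p <- as.numeric(pacf(x, lag.max = 5, plot = FALSE)$acf)

print(round(rbind(sample_acf  = a,
                  theory_acf  = phi^(1:5),   # geometric decay for an AR(1)
                  sample_pacf = p), 3))
```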
Task 3: Granger Causality Test
To conduct this test the package lmtest will be required, as already done in the code chunk below.
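# require() tries to load the package and returns FALSE if it is not installed;
# in that case the package is installed from CRAN
# dependencies = TRUE makes sure that the package's dependencies are installed too
if(!require("lmtest",quietly = TRUE))
  install.packages("lmtest",dependencies = TRUE, repos = "https://cloud.r-project.org")
# Load the package explicitly in case it has just been installed
library(lmtest)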
##### 3A) Include below the code chunk to solve for 3.5.7 R Lab/p. 106. Write your conclusions.
# More information about the data used in testing for causality can be obtained by typing the name of the data set `ChickEgg` in the R Help menu.
grangertest(egg ~ chicken, order = 3, data = ChickEgg)
Granger causality test
Model 1: egg ~ Lags(egg, 1:3) + Lags(chicken, 1:3)
Model 2: egg ~ Lags(egg, 1:3)
Res.Df Df F Pr(>F)
1 44
2 47 -3 0.5916 0.6238
grangertest(chicken~egg, order = 3, data=ChickEgg)
Granger causality test
Model 1: chicken ~ Lags(chicken, 1:3) + Lags(egg, 1:3)
Model 2: chicken ~ Lags(chicken, 1:3)
Res.Df Df F Pr(>F)
1 44
2 47 -3 5.405 0.002966 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The first Granger causality test examines whether the chicken Granger-causes the egg. The null hypothesis is that the chicken does not Granger-cause the egg. The p-value is 0.6238, which is higher than the 0.05 significance level. Hence, I do not reject the null hypothesis. In conclusion, there is no evidence that the chicken Granger-causes the egg.
The second Granger causality test examines whether the egg Granger-causes the chicken. The null hypothesis is that the egg does not Granger-cause the chicken. The p-value is 0.002966, which is less than the 0.05 significance level. Hence, I reject the null hypothesis. In conclusion, the egg Granger-causes the chicken.
##### 3B) Briefly describe the data in terms of time range and variables. Similar to the linear autoregressive model described in class, write the mathematical regression model solved in each Granger test, including the proper order. Use naming conventions, and notations more reflective of the data set considered for ChickEgg.
The ChickEgg data set contains annual US data on the chicken population (chicken) and egg production (egg) for the period 1930 to 1983.
The Granger test uses a bivariate linear autoregressive model on the two variables (egg and chicken).
The regression model used in the granger test is \(Y_{t}= a_{0} + a_{1}Y_{t-1} + a_{2}Y_{t-2} + a_{3}Y_{t-3} + b_{1}X_{t-1} + b_{2}X_{t-2} + b_{3}X_{t-3} + \varepsilon_{t}\)
For the first test, whether chicken Granger-causes egg, the regression model is: \(Egg_{t}= a_{0} + a_{1}Egg_{t-1} + a_{2}Egg_{t-2} + a_{3}Egg_{t-3} + b_{1}Chicken_{t-1} + b_{2}Chicken_{t-2} + b_{3}Chicken_{t-3} + \varepsilon_{t}\)
The null hypothesis tested is \(H_{0}: b_{1} = b_{2} = b_{3} = 0\)
For the second test, whether egg Granger-causes chicken, the regression model is:
\(Chicken_{t}= a_{0} + a_{1}Chicken_{t-1} + a_{2}Chicken_{t-2} + a_{3}Chicken_{t-3} + b_{1}Egg_{t-1} + b_{2}Egg_{t-2} + b_{3}Egg_{t-3} + \varepsilon_{t}\)
The null hypothesis tested is \(H_{0}: b_{1} = b_{2} = b_{3} = 0\)
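The idea behind testing this null hypothesis can be sketched with two nested `lm()` fits and an F-test. This is an illustration on simulated data, where the hypothetical series x is built to help predict y; it is not a re-implementation of `grangertest()`, only the same restricted-versus-unrestricted comparison done in base R with order 3.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) {
  # y depends on its own past and on the past of x (so x should "Granger-cause" y)
  y[t] <- 0.5 * y[t - 1] + 0.8 * x[t - 1] + rnorm(1, sd = 0.5)
}

p <- 3                      # order of the test (number of lags)
Ly <- embed(y, p + 1)       # columns: y_t, y_{t-1}, ..., y_{t-p}
Lx <- embed(x, p + 1)       # columns: x_t, x_{t-1}, ..., x_{t-p}

y_t    <- Ly[, 1]
y_lags <- Ly[, -1]
x_lags <- Lx[, -1]

# Unrestricted model: lags of y and lags of x; restricted model: lags of y only
unrestricted <- lm(y_t ~ y_lags + x_lags)
restricted   <- lm(y_t ~ y_lags)

# F-test of H0: all coefficients on the lags of x are zero
ftest <- anova(restricted, unrestricted)
p_value <- ftest[["Pr(>F)"]][2]
print(ftest)
```

A small p-value leads to rejecting the null hypothesis that the lags of x add no predictive information for y, which is the same reading used for the ChickEgg results above.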
Task 4: Mathematical Proof
##### 4A) Prove the two results in Eq (2.32)/p. 53. No R-coding is needed here. Clearly show your steps. Hint: Use the definition of \(E(X^n)\) for a log-normally distributed X. Observe also that \(Var(X) = E(X^2)-E^2(X)\) for any random variable X.
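The moments of a log-normally distributed variable \(X\), with \(\ln X \sim N(\mu, \sigma^2)\), are:
\(E(X^n) = \exp(n\mu +\frac{1}{2}n^2 \sigma^2 ) , n>0\)
Let \(R_{t}\) be a simple return series with mean \(\mu_{R}\) and variance \(\sigma_{R}^2\), and suppose \(1+R_{t}\) is log-normally distributed, so that the log return series is \(r_{t} = \ln(R_{t}+1) \sim N(\mu_{r}, \sigma_{r}^2)\)
Solving for \(R_{t}\), we get
\(R_{t}+1 = e^{r_{t}}\)
Applying the moment formula to \(X = 1+R_{t}\) with \(n=1\), \(\mu = \mu_{r}\) and \(\sigma = \sigma_{r}\) gives \(E(1+R_{t}) = e^{\mu_{r}+\sigma_{r}^2/2}\), and therefore the first moment of \(R_{t}\):
\(E(R_{t})= \mu_{R} = e^{\mu_{r}+\sigma_{r}^2/2}-1\)
For the variance, we use \(Var(X) = E(X^2)-E^2(X)\) together with \(Var(R_{t}) = Var(1+R_{t})\), since adding a constant does not change the variance. The moment formula with \(n=2\) gives \(E[(1+R_{t})^2] = e^{2\mu_{r}+2\sigma_{r}^2}\)
So the variance is:
\(Var(R_{t}) = e^{2\mu_{r}+2\sigma_{r}^2}- [e^{\mu_{r}+\sigma_{r}^2/2}]^2 = e^{2\mu_{r}+2\sigma_{r}^2} - e^{2\mu_{r}+\sigma_{r}^2}\)
Finally, factoring out \(e^{2\mu_{r}+\sigma_{r}^2}\):
\(Var(R_{t}) = \sigma_{R}^2 = e^{2\mu_{r}+\sigma_{r}^2} (e^{\sigma_{r}^2}-1)\)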
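Although no R is required for the proof, the two formulas can be sanity-checked numerically. The sketch below is a Monte Carlo check under assumed, purely illustrative parameter values (mu_r = 0.05, sigma_r = 0.2): it draws normal log returns, converts them to simple returns, and compares the sample mean and variance with the closed-form expressions.

```r
set.seed(99)
mu_r    <- 0.05     # assumed mean of the log return
sigma_r <- 0.2      # assumed standard deviation of the log return
n_sim   <- 1e6

r <- rnorm(n_sim, mean = mu_r, sd = sigma_r)  # log returns r_t ~ N(mu_r, sigma_r^2)
R <- exp(r) - 1                               # simple returns, since R_t + 1 = e^{r_t}

# Closed-form first moment and variance of the simple return
mean_formula <- exp(mu_r + sigma_r^2 / 2) - 1
var_formula  <- exp(2 * mu_r + sigma_r^2) * (exp(sigma_r^2) - 1)

print(c(sample_mean = mean(R), formula_mean = mean_formula))
print(c(sample_var  = var(R),  formula_var  = var_formula))
```

The simulated and closed-form values should agree up to sampling error, which shrinks as `n_sim` grows.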
*http://computationalfinance.lsi.upc.edu
