About
In this worksheet we look at different variance, covariance, volatility, and causality calculations. We finish with a short mathematical proof (no R required).
Setup
Remember to always set your working directory to the source file location. Go to ‘Session’, scroll down to ‘Set Working Directory’, and click ‘To Source File Location’. Read the instructions below carefully and follow them to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.
Note
For clarity, tasks/questions to be completed/answered are highlighted in red color (color visible only in preview mode) and numbered according to their particular placement in the task section. Type your answers outside the red color tags!
Quite often you will need to add your own code chunk. Execute all code chunks sequentially, preview, publish, and submit the link on Sakai following the naming convention. Make sure to add comments to your code where appropriate. Use your own words!
Any sign of plagiarism will result in dismissal of the work!
Task 1: Variance, Covariance, and Volatility
This task follows the two examples in the book R Example 2.5/p. 58 and R Example 2.6/p. 66
# require() tries to load the package and returns FALSE if it is not installed
# dependencies = TRUE makes sure that the package's dependencies are installed as well
if(!require("quantmod",quietly = TRUE))
install.packages("quantmod",dependencies = TRUE, repos = "https://cloud.r-project.org")
##### 1A) Calculate the correlation and covariance matrix of the adjusted daily log returns for four different stocks of your choice. Explain your observations in terms of potential relationships.
# Once you have obtained the adjusted daily log returns for your stocks, omitting the time index, you will need to combine them to create a matrix. Below is an example. For more details see the Help command in R on cbind, cov, and cor.
# M <- cbind(A,B,C) # create a matrix where each column is an array/vector of numerical values
# cov(M) # compute the covariance matrix
# cor(M, method="pearson") # compute the correlation matrix based on the Pearson method
#symbols=c('TSLA','BABA','AMZN','FB',src="yahoo")
symbols=c('TSLA','BABA','AMZN','FB')
getSymbols(symbols,src='yahoo')
[1] "TSLA" "BABA" "AMZN" "FB"
tsladr=periodReturn(TSLA,period="daily",type="log")
babadr=periodReturn(BABA,period="daily",type="log")
fbdr=periodReturn(FB,period="daily",type="log")
amzndr=periodReturn(AMZN,period="daily",type="log")
tsla=as.numeric(tsladr)
baba=as.numeric(babadr)
fb=as.numeric(fbdr)
amzn=as.numeric(amzndr)
M<-cbind(tsla,baba,fb,amzn)
number of rows of result is not a multiple of vector length (arg 1)
cov(M)
              tsla          baba            fb          amzn
tsla  1.189734e-03 -7.985977e-06 -1.366790e-05  1.448781e-05
baba -7.985977e-06  4.048232e-04  2.972979e-06 -4.638111e-06
fb   -1.366790e-05  2.972979e-06  5.439723e-04  5.046654e-06
amzn  1.448781e-05 -4.638111e-06  5.046654e-06  6.011452e-04
cor(M, method="pearson")
            tsla         baba           fb         amzn
tsla  1.00000000 -0.011507220 -0.016989808  0.017131207
baba -0.01150722  1.000000000  0.006335352 -0.009401968
fb   -0.01698981  0.006335352  1.000000000  0.008825217
amzn  0.01713121 -0.009401968  0.008825217  1.000000000
##### 1B) Calculate the three types of volatility for a particular stock of your choice. Consider a time window extending one year back from most recent obtainable closing day price. Order the three estimates from low to high volatility and explain how the ordering makes sense.
# For this task make sure you understand well what the variables n,m represent in the book's referenced example.
symbols=c('TSLA')
getSymbols(symbols,src='yahoo',from="2017-12-28",to="2018-11-28")
[1] "TSLA"
# obtain the adjusted closing prices
tsla=TSLA['2017-9/2018-10']
geAdj=TSLA$TSLA.Adjusted["2017-9/2018-10"]
# m is the length of the sample for the volatility estimate
m=length(geAdj)
ohlc<-tsla[,c("TSLA.Open","TSLA.High","TSLA.Low","TSLA.Close")]
vclose<-volatility(ohlc,n=m,calc="close",N=252)
vparkinson<-volatility(ohlc,n=m,calc="parkinson",N=252)
vGK<-volatility(ohlc,n=m,calc="garman",N=252)
vGK[m];vparkinson[m];vclose[m]
[,1]
2018-10-31 0.4311069
[,1]
2018-10-31 0.4413705
[,1]
2018-10-31 0.5883708
The Parkinson estimator measures the fluctuation within one day, based on the high and low prices. The Garman-Klass estimator also measures the fluctuation within one day, but uses the open and close prices in addition to the high and low. The close-to-close estimator measures the fluctuation between the closing prices of two consecutive days. Ordered from low to high, the estimates are Garman-Klass (0.431), Parkinson (0.441), and close-to-close (0.588). The ordering shows that the closing prices of TSLA fluctuate more from day to day than the two intraday measures suggest; a plausible reason is that the range-based estimators ignore overnight price jumps, which the close-to-close estimator captures.
Task 2: Auto-Correlation and Auto-Regression
Follow the example in the book R Example 3.2/p. 74 and R Example 4.1/p. 115
##### 2A) Calculate the ACF for a stock of your choice. Consider both the log return and squared log return. Interpret your results in terms of possible existence of autocorrelation.
getSymbols("TSLA",src = "yahoo")
[1] "TSLA"
adldr=periodReturn(TSLA$TSLA.Adjusted,period='daily',type="log")*100
adldrs=periodReturn(TSLA$TSLA.Adjusted,period="daily",type="log")^2*100
acf(adldr,main="ACF of TSLA",ylim=c(-0.2,0.2))

acf(adldrs,main="ACF of TSLA",ylim=c(-0.3,0.5))

##### 2B) Plot the exchange rate for USD versus another currency of your choice. Interpret your results in terms of behavior.
getFX("USD/CNY", src="yahoo")
[1] "USDCNY"
plot(USDCNY)

Over the sample period, the USD/CNY exchange rate trends upward. In particular, it rose sharply from 7 June to 20 August.
##### 2C) Test for the possible existence of an underlying AR(1) – Markov process in your exchange rate currency pair. To this end, plot the ACF and the partial ACF (PACF). Interpret your results. Clearly refer to the lags, and their impacts in determining the order.
acf(USDCNY)

An autoregressive process of order \(p\) has the form \[X_t=\sum_{k=1}^{p}\phi_kX_{t-k}+W_t\] The ACF plot shows a gradual, roughly exponential decay over successive lags. The linear relationship between the exchange rate and its past values is strongest at lag 1 and fades gradually at higher lags.
plot(pacf(USDCNY))


The PACF plot confirms that the autoregressive process has order 1: the strong linear relationship is the one at lag 1.
Task 3: Granger Causality Test
To conduct this test the package lmtest will be required, as already done in the code chunk below.
# require() tries to load the package and returns FALSE if it is not installed
# dependencies = TRUE makes sure that the package's dependencies are installed as well
if(!require("lmtest",quietly = TRUE))
install.packages("lmtest",dependencies = TRUE, repos = "https://cloud.r-project.org")
##### 3A) Include below the code chunk to solve for 3.5.7 R Lab/p. 106. Write your conclusions.
# More information about the data used in testing for causality can be obtained by typing the name of the data set `ChickEgg` in the R Help menu.
grangertest(egg~chicken,order=3,data=ChickEgg)
Granger causality test
Model 1: egg ~ Lags(egg, 1:3) + Lags(chicken, 1:3)
Model 2: egg ~ Lags(egg, 1:3)
  Res.Df Df      F Pr(>F)
1     44
2     47 -3 0.5916 0.6238
grangertest(chicken~egg,order=3,data=ChickEgg)
Granger causality test
Model 1: chicken ~ Lags(chicken, 1:3) + Lags(egg, 1:3)
Model 2: chicken ~ Lags(chicken, 1:3)
  Res.Df Df     F   Pr(>F)
1     44
2     47 -3 5.405 0.002966 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Conclusion: at order 3, chicken does not Granger-cause egg (p = 0.6238), but egg Granger-causes chicken (p = 0.002966).
##### 3B) Briefly describe the data in terms of time range and variables. Similar to the linear autoregressive model described in class, write the mathematical regression model solved in each Granger test, including the proper order. Use naming conventions, and notations more reflective of the data set considered for ChickEgg.
\(Y_t=a_0+a_1Y_{t-1}+a_2Y_{t-2}+a_3Y_{t-3}+b_1X_{t-1}+b_2X_{t-2}+b_3X_{t-3}+\varepsilon_t\)
In the first Granger causality test, chicken is \(X\) and egg is \(Y\). The p-value (0.6238) is greater than 0.05, so we fail to reject the null hypothesis \(b_1=b_2=b_3=0\); thus chicken does not Granger-cause egg.
In the second Granger causality test, egg is \(X\) and chicken is \(Y\). The p-value (0.002966) is less than 0.05, so we reject the null hypothesis \(b_1=b_2=b_3=0\); thus egg Granger-causes chicken.
Task 4: Mathematical Proof
##### 4A) Prove the two results in Eq (2.32)/p. 53. No R-coding is needed here. Clearly show your steps. Hint: Use the definition of \(E(X^n)\) for X-log normally distributed. Observe also that \(Var(X) = E(X^2)-E^2(X)\) for any random variable X.
The moments of a log-normally distributed variable \(X\), with \(\ln X\sim N(\mu,\sigma^2)\), are \(E(X^n)=e^{n\mu+\frac{1}{2}n^2\sigma^2}\), \(n>0\).
Assume the log return \(r_t=\ln(R_t+1)\) is normally distributed with mean \(\mu_r\) and variance \(\sigma_r^2\). Then \(e^{r_t}=R_t+1\), so \(X=R_t+1\) is log-normally distributed and \(R_t=X-1\).
When \(n=1\): \(E(R_t)=\mu_R=E(X)-1=e^{\mu_r+\sigma_r^2/2}-1\)
When \(n=2\): \(E(X^2)=e^{2\mu_r+2\sigma_r^2}\). Subtracting a constant does not change the variance, so \(Var(R_t)=Var(X)=E(X^2)-E^2(X)=e^{2\mu_r+2\sigma_r^2}-(e^{\mu_r+\sigma_r^2/2})^2=e^{2\mu_r+\sigma_r^2}(e^{\sigma_r^2}-1)\)
*http://computationalfinance.lsi.upc.edu
---
title: "FINC621 Winter 2018-19 Lab Worksheet 03"
author: "Yu Jia"
date: "11/28/2018"
output:
  html_notebook: default
  html_document: default
subtitle: Variance, Covariance, Correlation & Causality (finc621-lab03)
---

### About

In this worksheet we look at different variance, covariance, volatility, and causality calculations. We finish with a short mathematical proof (no R required).  

### Setup

Remember to always set your working directory to the source file location. Go to 'Session', scroll down to 'Set Working Directory', and click 'To Source File Location'. Read the instructions below carefully and follow them to complete the tasks and answer any questions. Submit your work to RPubs as detailed in previous notes.

### Note

For clarity, tasks/questions to be completed/answered are highlighted in red color (color visible only in preview mode) and numbered according to their particular placement in the task section.  Type your answers outside the red color tags!

Quite often you will need to add your own code chunk. Execute all code chunks sequentially, preview, publish, and submit the link on Sakai following the naming convention. Make sure to add comments to your code where appropriate. Use your own words!

**Any sign of plagiarism will result in dismissal of the work!**

--------------

### Task 1: Variance, Covariance, and Volatility

This task follows the two examples in the book `R Example 2.5/p. 58` and `R Example 2.6/p. 66` 

```{r}
# require() tries to load the package and returns FALSE if it is not installed
# dependencies = TRUE makes sure that the package's dependencies are installed as well
if(!require("quantmod",quietly = TRUE)) {
  install.packages("quantmod",dependencies = TRUE, repos = "https://cloud.r-project.org")
  library(quantmod) # load the freshly installed package
}
```


<span style="color:red">
##### 1A) Calculate the correlation and covariance matrix of the adjusted daily log returns for four different stocks of your choice. Explain your observations in terms of potential relationships.
</span>

```{r}
# Once you have obtained the adjusted daily log returns for your stocks, omitting the time index, you will need to combine them to create a matrix. Below is an example.  For more details see the Help command in R on cbind, cov, and cor.
# M <- cbind(A,B,C) # create a matrix where each column is an array/vector of numerical values 
# cov(M) # compute the covariance matrix
# cor(M, method="pearson") # compute the correlation matrix based on the Pearson method
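symbols <- c("TSLA", "BABA", "AMZN", "FB")
getSymbols(symbols, src = "yahoo")
# Daily log returns computed from the adjusted closing prices
tsladr <- periodReturn(Ad(TSLA), period = "daily", type = "log")
babadr <- periodReturn(Ad(BABA), period = "daily", type = "log")
fbdr <- periodReturn(Ad(FB), period = "daily", type = "log")
amzndr <- periodReturn(Ad(AMZN), period = "daily", type = "log")
# The four series start on different dates. Merge them by date and drop
# incomplete rows; cbind() on the raw vectors would recycle the shorter ones
# and misalign the trading days.
R <- na.omit(merge(tsladr, babadr, fbdr, amzndr))
M <- coredata(R) # numeric matrix without the time index
colnames(M) <- c("tsla", "baba", "fb", "amzn")
cov(M)
cor(M, method = "pearson")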

```
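
The relationship between the covariance and correlation matrices can be checked on simulated data. The sketch below is only an illustration: the three return series, their names, and the 0.6 loading are made-up assumptions, not downloaded stock data. It also shows why series of unequal length should be aligned by date rather than passed directly to `cbind`, which recycles the shorter vector.

```r
set.seed(123)
n <- 500
# Hypothetical daily log returns: b depends on a, c3 is independent of both
a  <- rnorm(n, mean = 0, sd = 0.02)
b  <- 0.6 * a + rnorm(n, mean = 0, sd = 0.01)
c3 <- rnorm(n, mean = 0, sd = 0.015)
M <- cbind(a = a, b = b, c = c3)

S <- cov(M)                      # covariance matrix
C <- cor(M, method = "pearson")  # correlation matrix

# Correlation is covariance rescaled by the two standard deviations
s <- sqrt(diag(S))
C_manual <- S / outer(s, s)

# cbind() on vectors of unequal length recycles the shorter one (with a
# warning), so the rows no longer refer to the same observations
short <- a[1:300]
misaligned <- suppressWarnings(cbind(a, short))
```

Here `C_manual` and `cov2cor(S)` should both reproduce `C`, while row 301 of `misaligned` pairs the 301st value of `a` with the first value of `short`.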


<span style="color:red">
##### 1B) Calculate the three types of volatility for a particular stock of your choice. Consider a time window extending one year back from most recent obtainable closing day price. Order the three estimates from low to high volatility and explain how the ordering makes sense.
</span>

```{r}
# For this task make sure you understand well what the variables n,m represent in the book's referenced example.
symbols=c('TSLA')
getSymbols(symbols,src='yahoo',from="2017-12-28",to="2018-11-28")
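# Keep the rows up to the end of October 2018. The download request starts on
# 2017-12-28, so the effective window runs from that date to 2018-10-31.
tsla=TSLA['2017-9/2018-10']
# Adjusted closing prices over the same window
tslaAdj=TSLA$TSLA.Adjusted["2017-9/2018-10"]
# m is the length of the sample for the volatility estimate
m=length(tslaAdj)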
ohlc<-tsla[,c("TSLA.Open","TSLA.High","TSLA.Low","TSLA.Close")]
vclose<-volatility(ohlc,n=m,calc="close",N=252)
vparkinson<-volatility(ohlc,n=m,calc="parkinson",N=252)
vGK<-volatility(ohlc,n=m,calc="garman",N=252)

vGK[m];vparkinson[m];vclose[m]
```
The Parkinson estimator measures the fluctuation within one day, based on the high and low prices.
The Garman-Klass estimator also measures the fluctuation within one day, but uses the open and close prices in addition to the high and low. The close-to-close estimator measures the fluctuation between the closing prices of two consecutive days. Ordered from low to high, the estimates are Garman-Klass, Parkinson, and close-to-close. The ordering shows that the closing prices of TSLA fluctuate more from day to day than the two intraday measures suggest; a plausible reason is that the range-based estimators ignore overnight price jumps, which the close-to-close estimator captures.
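
To make the three estimators concrete, here is a minimal base-R sketch of their textbook formulas applied to simulated OHLC data. Everything about the data is an assumption made for illustration (250 days, 78 intraday steps, a 2% intraday and 1% overnight standard deviation); the formulas are the standard close-to-close, Parkinson, and Garman-Klass forms and should correspond to `calc="close"`, `"parkinson"`, and `"garman"` above, though this is not the package's own code.

```r
set.seed(1)
n_days <- 250   # hypothetical sample length
steps  <- 78    # hypothetical number of intraday price steps per day
N      <- 252   # trading days per year, used to annualize

ohlc <- matrix(NA_real_, n_days, 4,
               dimnames = list(NULL, c("Open", "High", "Low", "Close")))
p <- 100
for (d in seq_len(n_days)) {
  p <- p * exp(rnorm(1, 0, 0.01))  # overnight jump before the open
  path <- p * exp(c(0, cumsum(rnorm(steps, 0, 0.02 / sqrt(steps)))))
  ohlc[d, ] <- c(path[1], max(path), min(path), path[steps + 1])
  p <- path[steps + 1]             # the next day starts from this close
}

hl <- log(ohlc[, "High"] / ohlc[, "Low"])    # daily log range
co <- log(ohlc[, "Close"] / ohlc[, "Open"])  # open-to-close log return

# Close-to-close: standard deviation of consecutive closing log returns
v_close <- sqrt(N) * sd(diff(log(ohlc[, "Close"])))
# Parkinson: based on the high-low range only
v_park <- sqrt(N / (4 * log(2)) * mean(hl^2))
# Garman-Klass: high-low range corrected by the open-to-close move
v_gk <- sqrt(N * mean(0.5 * hl^2 - (2 * log(2) - 1) * co^2))

c(garman_klass = v_gk, parkinson = v_park, close = v_close)
```

Only the close-to-close estimator sees the simulated overnight jump, which is the mechanism suggested above for the ordering observed on TSLA.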



### Task 2: Auto-Correlation and Auto-Regression

Follow the example in the book  `R Example 3.2/p. 74` and `R Example 4.1/p. 115`

<span style="color:red">
##### 2A) Calculate the ACF for a stock of your choice. Consider both the log return and squared log return. Interpret your results in terms of possible existence of autocorrelation.  
</span>

```{r}
getSymbols("TSLA",src = "yahoo")
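# Daily log returns (in percent) and squared daily log returns of the adjusted close
adldr=periodReturn(TSLA$TSLA.Adjusted,period='daily',type="log")*100
adldrs=periodReturn(TSLA$TSLA.Adjusted,period="daily",type="log")^2*100
# ACF of the log returns and of the squared log returns
acf(adldr,main="ACF of TSLA log returns",ylim=c(-0.2,0.2))
acf(adldrs,main="ACF of TSLA squared log returns",ylim=c(-0.3,0.5))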

```
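
As a rough illustration of what the two ACF plots measure, the sketch below writes the sample autocorrelation out by hand and compares it with `acf()`. The return series is simulated from a hypothetical ARCH(1) model (the values of `omega` and `alpha` are assumptions), chosen because such a series has uncorrelated returns but correlated squared returns.

```r
set.seed(7)
n <- 3000
omega <- 0.5; alpha <- 0.4   # hypothetical ARCH(1) parameters
r <- numeric(n)
r[1] <- rnorm(1, sd = sqrt(omega / (1 - alpha)))
for (i in 2:n) {
  # today's variance depends on yesterday's squared return
  r[i] <- rnorm(1, sd = sqrt(omega + alpha * r[i - 1]^2))
}

# Sample autocorrelation at lag k, written out explicitly
sample_acf <- function(x, k) {
  xc <- x - mean(x)
  sum(xc[1:(length(x) - k)] * xc[(1 + k):length(x)]) / sum(xc^2)
}

lags   <- 1:10
acf_r  <- sapply(lags, function(k) sample_acf(r, k))    # returns
acf_r2 <- sapply(lags, function(k) sample_acf(r^2, k))  # squared returns

# Same quantity from acf(); the first element is lag 0, so drop it
acf_builtin <- as.numeric(acf(r, lag.max = 10, plot = FALSE)$acf)[-1]

# Approximate 95% band that acf() draws under the white-noise hypothesis
band <- qnorm(0.975) / sqrt(n)
```

Lags where `abs(acf_r)` or `abs(acf_r2)` exceeds `band` are the ones that would stick out of the dashed lines in the plots.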


<span style="color:red">
##### 2B) Plot the exchange rate for USD versus another currency of your choice. Interpret your results in terms of behavior.
</span>

```{r}

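# getFX() downloads the rates from Oanda and creates the object USDCNY
getFX("USD/CNY")
plot(USDCNY) # plot the exchange rate series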

```

Over the sample period, the USD/CNY exchange rate trends upward. In particular, it rose sharply from 7 June to 20 August.

<span style="color:red">
##### 2C) Test for the possible existence of an underlying AR(1) – Markov process in your exchange rate currency pair. To this end, plot the ACF and the partial ACF (PACF). Interpret your results.  Clearly refer to the lags, and their impacts in determining the order.
</span>

```{r}
acf(USDCNY)
```
An autoregressive process of order $p$ has the form $$X_t=\sum_{k=1}^{p}\phi_kX_{t-k}+W_t$$
The ACF plot shows a gradual, roughly exponential decay over successive lags. The linear relationship between the exchange rate and its past values is strongest at lag 1 and fades gradually at higher lags.

```{r}
pacf(USDCNY)

```

The PACF plot confirms that the autoregressive process has order 1: the strong linear relationship is the one at lag 1.
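
The pattern described above (a decaying ACF together with a PACF that cuts off after lag 1) can be reproduced on a simulated AR(1) series. This is a sketch under assumed values: the coefficient `phi = 0.8` and the sample size are arbitrary choices, not estimates from the USD/CNY data.

```r
set.seed(2018)
phi <- 0.8   # hypothetical AR(1) coefficient
n   <- 5000
x <- arima.sim(model = list(ar = phi), n = n)

# acf() includes lag 0 as its first element; pacf() starts at lag 1
rho  <- as.numeric(acf(x, lag.max = 10, plot = FALSE)$acf)[-1]
prho <- as.numeric(pacf(x, lag.max = 10, plot = FALSE)$acf)

# For an AR(1) process the theoretical ACF at lag k is phi^k,
# while the theoretical PACF is phi at lag 1 and zero afterwards
round(cbind(lag = 1:10, acf = rho, acf_theory = phi^(1:10), pacf = prho), 3)
```

A PACF with one large spike at lag 1 and values near zero afterwards is the signature used above to settle on order 1.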

### Task 3: Granger Causality Test

To conduct this test the package `lmtest` will be required, as already done in the code chunk below.

```{r}
# require() tries to load the package and returns FALSE if it is not installed
# dependencies = TRUE makes sure that the package's dependencies are installed as well
if(!require("lmtest",quietly = TRUE)) {
  install.packages("lmtest",dependencies = TRUE, repos = "https://cloud.r-project.org")
  library(lmtest) # load the freshly installed package
}
```

<span style="color:red">
##### 3A) Include below the code chunk to solve for 3.5.7 R Lab/p. 106.  Write your conclusions.
</span>

```{r}
# More information about the data used in testing for causality can be obtained by typing the name of the data set `ChickEgg` in the R Help menu.

grangertest(egg~chicken,order=3,data=ChickEgg)

grangertest(chicken~egg,order=3,data=ChickEgg)

```

Conclusion: at order 3, chicken does not Granger-cause egg (p = 0.6238), but egg Granger-causes chicken (p = 0.002966).

<span style="color:red">
##### 3B) Briefly describe the data in terms of time range and variables. Similar to the linear autoregressive model described in class, write the mathematical regression model solved in each Granger test, including the proper order. Use naming conventions, and notations more reflective of the data set considered for  `ChickEgg`.
</span>

$Y_t=a_0+a_1Y_{t-1}+a_2Y_{t-2}+a_3Y_{t-3}+b_1X_{t-1}+b_2X_{t-2}+b_3X_{t-3}+\varepsilon_t$

In the first Granger causality test, chicken is $X$ and egg is $Y$. The p-value (0.6238) is greater than 0.05, so we fail to reject the null hypothesis $b_1=b_2=b_3=0$; thus chicken does not Granger-cause egg.

In the second Granger causality test, egg is $X$ and chicken is $Y$. The p-value (0.002966) is less than 0.05, so we reject the null hypothesis $b_1=b_2=b_3=0$; thus egg Granger-causes chicken.
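
The test of $b_1=b_2=b_3=0$ can be sketched in base R as an F-test between two nested regressions. The data below are simulated, not `ChickEgg`: the series `x` and `y`, the coefficients 0.5 and 0.8, and the helper `granger_f` are all hypothetical names and values chosen so that `x` helps predict `y` by construction. The sketch is written in the same spirit as `grangertest`, not as a copy of its implementation.

```r
set.seed(42)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (i in 2:n) {
  # y depends on its own first lag and on the first lag of x
  y[i] <- 0.5 * y[i - 1] + 0.8 * x[i - 1] + rnorm(1)
}

# Hypothetical helper: F-test of "lags of x add nothing to a model of y"
granger_f <- function(y, x, p) {
  Y <- embed(y, p + 1)  # column 1 is y_t, columns 2..(p+1) are lags 1..p
  X <- embed(x, p + 1)
  ylag <- Y[, -1, drop = FALSE]
  xlag <- X[, -1, drop = FALSE]
  full       <- lm(Y[, 1] ~ ylag + xlag)  # unrestricted model
  restricted <- lm(Y[, 1] ~ ylag)         # model under H0: all b_k = 0
  anova(restricted, full)
}

res   <- granger_f(y, x, p = 3)
p_val <- res[["Pr(>F)"]][2]
p_val
```

A small `p_val` leads to rejecting the null hypothesis, which is the same decision rule applied to the two `ChickEgg` tests above.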


### Task 4: Mathematical Proof

<span style="color:red">
##### 4A) Prove the two results in Eq (2.32)/p. 53.  No R-coding is needed here.  Clearly show your steps. Hint: Use the definition of $E(X^n)$ for X-log normally distributed.   Observe also that $Var(X) = E(X^2)-E^2(X)$ for any random variable X.
</span>
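The moments of a log-normally distributed variable $X$, with $\ln X\sim N(\mu,\sigma^2)$, are
$E(X^n)=e^{n\mu+\frac{1}{2}n^2\sigma^2}$, $n>0$.

Assume the log return $r_t=\ln(R_t+1)$ is normally distributed with mean $\mu_r$ and variance $\sigma_r^2$. Then
$e^{r_t}=R_t+1$, so $X=R_t+1$ is log-normally distributed and $R_t=X-1$.

When $n=1$: $E(R_t)=\mu_R=E(X)-1=e^{\mu_r+\sigma_r^2/2}-1$

When $n=2$: $E(X^2)=e^{2\mu_r+2\sigma_r^2}$. Subtracting a constant does not change the variance, so $Var(R_t)=Var(X)=E(X^2)-E^2(X)=e^{2\mu_r+2\sigma_r^2}-(e^{\mu_r+\sigma_r^2/2})^2=e^{2\mu_r+\sigma_r^2}(e^{\sigma_r^2}-1)$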
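
Although no R is required for the proof, the two formulas for the mean and variance of the simple return can be sanity-checked by simulation. The sketch below assumes arbitrary values for the mean and standard deviation of the log return (`mu_r = 0.05`, `sigma_r = 0.2`); it is a numerical check, not part of the proof.

```r
set.seed(1)
mu_r    <- 0.05   # hypothetical mean of the log return
sigma_r <- 0.2    # hypothetical standard deviation of the log return
n <- 1e6

r <- rnorm(n, mean = mu_r, sd = sigma_r)  # normally distributed log returns
R <- exp(r) - 1                           # corresponding simple returns

# The two closed-form results for the simple return
mean_theory <- exp(mu_r + sigma_r^2 / 2) - 1
var_theory  <- exp(2 * mu_r + sigma_r^2) * (exp(sigma_r^2) - 1)

c(simulated = mean(R), theory = mean_theory)
c(simulated = var(R),  theory = var_theory)
```

With a sample this large, the simulated and theoretical values should agree to roughly two or three decimal places.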



*[http://computationalfinance.lsi.upc.edu](http://computationalfinance.lsi.upc.edu)
