Mini Project

Exercise 1

The Quandl R package is a tool for loading financial data directly into R. Quandl, a company acquired by Nasdaq in 2018, offers a global database of alternative, financial and public data, covering capital markets, energy, shipping, healthcare, education, demography, economics and society. It hosts millions of financial and economic datasets from hundreds of publishers (https://www.quandl.com/publishers). The R package is free to use and gives access to all free datasets; users pay only for Quandl’s premium data products. Data can be returned in several time-series formats as well as in datatables.

library(Quandl)

## Loading required package: xts

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Registered S3 method overwritten by 'xts':
##   method     from
##   as.zoo.xts zoo

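# The key is read from an environment variable (the name QUANDL_API_KEY is our choice) so that it is not stored in the report
Quandl.api_key(Sys.getenv("QUANDL_API_KEY"))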

We compute the returns of a few stocks, which we assume to be stationary, and derive some empirical statistics from them. The summary function describes the empirical distribution of the net returns, hist plots that distribution, and a line plot shows the cumulative returns. We then use the function f to compute the skewness, the kurtosis and the 5% VaR of each series, and finally the empirical variance-covariance matrix.

The entire WIKI database is stored as a single table for fast and easy retrieval. A table is a collection of data organised in rows and columns. The Quandl pages for the tickers used here are listed below. Note that these links point to the EOD database, while the code that follows downloads from WIKI.

URL AAPL: https://www.quandl.com/data/EOD/AAPL
URL AMZN: https://www.quandl.com/data/EOD/AMZN
URL GOOG: https://www.quandl.com/data/EOD/GOOG
URL SP500: https://www.quandl.com/data/MULTPL/SP500_REAL_PRICE_MONTH

Provided by QUOTEMEDIA

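# Download the full daily price tables (rows are returned newest-first)
GOOG <- Quandl("WIKI/GOOG")
AAPL <- Quandl("WIKI/AAPL")
AMZN <- Quandl("WIKI/AMZN")

# Column 1 is the date and column 5 the closing price
P_GOOG <- GOOG[, c(1, 5)]
P_AAPL <- AAPL[, 5]
P_AMZN <- AMZN[, 5]

# Closing prices of the 100 most recent trading days, one column per stock
DataFrame_long <- as.data.frame(cbind(P_GOOG[1:100, ], P_AAPL[1:100], P_AMZN[1:100]))
colnames(DataFrame_long) <- c("Date", "Google", "Apple", "Amazon")

# Net returns for the same 100 days: column 2 (open) over column 5 (close), minus 1
googr <- head(as.vector(as.zoo(GOOG[, 2] / GOOG[, 5] - 1)), n = 100)
aaplr <- head(as.vector(as.zoo(AAPL[, 2] / AAPL[, 5] - 1)), n = 100)
amznr <- head(as.vector(as.zoo(AMZN[, 2] / AMZN[, 5] - 1)), n = 100)

# One column of returns per stock
DataSeta <- cbind(googr, aaplr, amznr)
colnames(DataSeta) <- c('goog', 'aapl', 'amzn')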
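The code above works on rows sorted newest-first and divides column 2 by column 5. As a minimal sketch on made-up prices (the data frame px, its column names and its values are hypothetical), this is how open-to-close and close-to-close net returns could be computed once the rows are sorted oldest-first:

```r
# Hypothetical prices, newest first, mimicking the layout of the tables above
px <- data.frame(
  Date  = as.Date(c("2018-03-27", "2018-03-26", "2018-03-23")),
  Open  = c(105, 100, 98),
  Close = c(102, 104, 100)
)

# Sort oldest-first so that returns follow calendar time
px <- px[order(px$Date), ]

# Open-to-close (within-day) net return: Close / Open - 1
intraday <- px$Close / px$Open - 1

# Close-to-close net return between consecutive trading days
close_to_close <- px$Close[-1] / px$Close[-nrow(px)] - 1

intraday
close_to_close
```

Sorting first matters mainly for anything cumulative, such as the cumulative-return plot, because compounding then runs forward in time.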

Empirical estimate of some statistics

summary(DataSeta[,'goog'])

##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -0.0494355 -0.0087379 -0.0008718  0.0001830  0.0039046  0.0576062

summary(DataSeta[,'aapl'])

##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -0.0502975 -0.0065486  0.0005406  0.0002708  0.0068851  0.0351063

summary(DataSeta[,'amzn'])

##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -5.640e-02 -9.104e-03  9.536e-05 -1.071e-04  7.162e-03  5.865e-02

Empirical distribution of returns

hist(DataSeta[,'goog'], probability = TRUE, main = 'Empirical distribution for goog', xlab = 'historical outcomes')

hist(DataSeta[,'aapl'], probability = TRUE, main = 'Empirical distribution for aapl', xlab = 'historical outcomes')

hist(DataSeta[,'amzn'], probability = TRUE, main = 'Empirical distribution for amzn', xlab = 'historical outcomes')

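# Cumulative gross returns of each stock, starting from 1
path <- cbind(cumprod(DataSeta[,'goog']+1), cumprod(DataSeta[,'aapl']+1), cumprod(DataSeta[,'amzn']+1))
steps <- 1:(nrow(DataSeta)+1)
# ylim covers all three paths so that no line is clipped
plot(steps, c(1,path[,3]), type='l', ylim=range(1,path), xlab='observation', ylab='cumulative return')
lines(steps, c(1,path[,1]), col='red')
lines(steps, c(1,path[,2]), col='blue')
legend('topleft', legend=c('amzn','goog','aapl'), col=c('black','red','blue'), lty=1)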

library(moments)
skewness(DataSeta[,'goog'])

## [1] 0.9859792

skewness(DataSeta[,'aapl'])

## [1] -0.2447013

skewness(DataSeta[,'amzn'])

## [1] 0.5514585
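As far as we know, skewness and kurtosis from the moments package are the plain moment ratios: the third and fourth central moments scaled by powers of the second. Kurtosis is therefore not reported in excess of 3. A minimal base-R sketch of those formulas, with helper names of our own and a made-up sample:

```r
# k-th central moment of a sample (divides by n, not n - 1)
central_moment <- function(x, k) mean((x - mean(x))^k)

# Moment-based skewness and kurtosis
skew_manual <- function(x) central_moment(x, 3) / central_moment(x, 2)^(3 / 2)
kurt_manual <- function(x) central_moment(x, 4) / central_moment(x, 2)^2

# Hypothetical sample, symmetric around zero
x <- c(-2, -1, 0, 1, 2)
skew_manual(x)  # 0 for a symmetric sample
kurt_manual(x)
```

A positive skewness, as for goog and amzn above, indicates a longer right tail; a negative one, as for aapl, a longer left tail.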

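# Skewness, kurtosis and 5% historical VaR for each column of a matrix of returns
f<-function(DataSet2){
  row.names(DataSet2)<- NULL
  sk<-NULL
  kur<-NULL
  VaR_cinque<-NULL
  for (i in 1:ncol(DataSet2)) {
    #--------------------------------- skewness and kurtosis ------
    sk[i]<-skewness(DataSet2[,i])
    kur[i]<-kurtosis(DataSet2[,i])
    #--------------------------------- 5% VaR ------------
    # Position of the 5% quantile among the sorted returns (may be fractional);
    # the column needs at least 20 observations, so that posizione >= 1
    confidenza <- length(DataSet2[,i])*0.05
    posizione<-trunc(confidenza)
    # Linear interpolation between the two neighbouring sorted returns
    sorted_returns<-sort(DataSet2[,i])
    VaR_cinque[i]<- (1- (confidenza-posizione))*sorted_returns[posizione] + (confidenza-posizione)*sorted_returns[posizione + 1]
  }
  # One row per stock, one column per statistic
  risp<-cbind(sk,kur,VaR_cinque)
  row.names(risp)<-colnames(DataSet2)
  colnames(risp)<-c('skew','Kurtosis','VaR5%')
  return(risp)
}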

f(DataSeta)

##            skew Kurtosis       VaR5%
## goog  0.9859792 6.425474 -0.01842940
## aapl -0.2447013 5.086854 -0.02009614
## amzn  0.5514585 5.532772 -0.02279847
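The VaR step of f interpolates linearly between two neighbouring sorted returns. To our understanding this is the same rule that base R's quantile applies with type = 4. The sketch below checks this on simulated numbers; the helper var_interp is our own name and the data are not market data:

```r
# Interpolated order statistic, written as in the loop of f
var_interp <- function(x, p = 0.05) {
  h <- length(x) * p      # fractional position in the sorted sample
  j <- trunc(h)           # integer part of that position
  s <- sort(x)
  (1 - (h - j)) * s[j] + (h - j) * s[j + 1]
}

set.seed(1)
x <- rnorm(137)           # simulated returns

v1 <- var_interp(x)
v2 <- unname(quantile(x, probs = 0.05, type = 4))
all.equal(v1, v2)
```

With fewer than 20 observations the integer position is 0 and the hand-written version does not return a usable value, whereas quantile still does.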

cov(DataSeta)

##              goog         aapl         amzn
## goog 0.0002277898 0.0001352608 0.0001823020
## aapl 0.0001352608 0.0001660842 0.0001465878
## amzn 0.0001823020 0.0001465878 0.0002700008
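Covariances are hard to compare across pairs because they depend on the scale of each series. A short sketch, using the matrix printed above and the base-R function cov2cor, of how the same information could be read as standard deviations and correlations:

```r
# Covariance matrix copied from the output above
S <- matrix(
  c(0.0002277898, 0.0001352608, 0.0001823020,
    0.0001352608, 0.0001660842, 0.0001465878,
    0.0001823020, 0.0001465878, 0.0002700008),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("goog", "aapl", "amzn"), c("goog", "aapl", "amzn"))
)

# Standard deviations are the square roots of the diagonal
sqrt(diag(S))

# Correlations: each covariance divided by the two standard deviations
R <- cov2cor(S)
R
```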

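Our native dataset is in the wide format, because Quandl provides the data in the most intuitive layout, with one variable per column. Note that, despite their names, DataFrame_long holds the wide table and FrameData_wide, produced by gather, holds the long one.

library(tidyr)
# Stack the three price columns into Stock/P pairs, keeping Date as the identifier
FrameData_wide<-gather(DataFrame_long,  Stock , P, -Date)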

head(DataFrame_long)

##         Date  Google   Apple  Amazon
## 1 2018-03-27 1005.10 168.340 1497.05
## 2 2018-03-26 1053.21 172.770 1555.86
## 3 2018-03-23 1021.57 164.940 1495.56
## 4 2018-03-22 1049.08 168.845 1544.10
## 5 2018-03-21 1090.88 171.270 1581.86
## 6 2018-03-20 1097.71 175.240 1586.51

head(FrameData_wide)

##         Date  Stock       P
## 1 2018-03-27 Google 1005.10
## 2 2018-03-26 Google 1053.21
## 3 2018-03-23 Google 1021.57
## 4 2018-03-22 Google 1049.08
## 5 2018-03-21 Google 1090.88
## 6 2018-03-20 Google 1097.71
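In recent tidyr releases gather is superseded by pivot_longer, with pivot_wider as its inverse. A minimal sketch of the round trip, using the first three rows printed above (the object names wide, long and wide_again are ours):

```r
library(tidyr)

# First three rows of the wide table printed above
wide <- data.frame(
  Date   = as.Date(c("2018-03-27", "2018-03-26", "2018-03-23")),
  Google = c(1005.10, 1053.21, 1021.57),
  Apple  = c(168.340, 172.770, 164.940),
  Amazon = c(1497.05, 1555.86, 1495.56)
)

# Wide -> long: one row per (Date, Stock) pair
long <- pivot_longer(wide, cols = -Date, names_to = "Stock", values_to = "P")

# Long -> wide: back to one column per stock
wide_again <- pivot_wider(long, names_from = Stock, values_from = P)

dim(long)        # 9 rows, 3 columns
dim(wide_again)  # 3 rows, 4 columns
```

Unlike gather, pivot_longer orders the long table by row of the input, so the first rows list all three stocks for the first date.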

Exercise 2

Our package is “pdfetch”: https://cran.r-project.org/web/packages/pdfetch/pdfetch.pdf
It downloads economic and financial time series from public sources, including the St Louis Fed’s FRED system, Yahoo Finance (used in our examples), the US Bureau of Labor Statistics, the US Energy Information Administration, the World Bank, Eurostat, the European Central Bank, the Bank of England, the UK’s Office for National Statistics, Deutsche Bundesbank, and INSEE.

Yahoo

library(pdfetch)

We repeat the analysis of Exercise 1 on three stocks downloaded from Yahoo Finance (URL: https://finance.yahoo.com/). We compute their returns, assumed to be stationary, describe and plot their empirical distribution with summary and hist, and plot the cumulative returns. We then use the function f to compute the skewness, the kurtosis and the 5% VaR, and finally the empirical variance-covariance matrix.

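# Daily price series for Coca-Cola, PepsiCo and McDonald's
CocaCola <- pdfetch_YAHOO("KO")
Pepsi <- pdfetch_YAHOO("PEP")
McDonalds <- pdfetch_YAHOO("MCD")

# Dates of the observations, as character strings
dat <- as.character(index(CocaCola))

# Net returns: column 4 over column 1, minus 1
# (this should be close over open with pdfetch's default field order)
CocaColar <- as.vector(as.zoo(CocaCola[,4]/CocaCola[,1] - 1))
Pepsir <- as.vector(as.zoo(Pepsi[,4]/Pepsi[,1] - 1))
McDonaldsr <- as.vector(as.zoo(McDonalds[,4]/McDonalds[,1] - 1))

# One column of returns per stock
DataSet <- cbind(CocaColar, Pepsir, McDonaldsr)
colnames(DataSet) <- c('KO','PEP','MCD')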

Empirical estimate of some statistics

summary(DataSet[,'KO'])

##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -0.0897625 -0.0044300  0.0004371  0.0002603  0.0051807  0.0887872

summary(DataSet[,'PEP'])

##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -0.0664993 -0.0046502  0.0004299  0.0003679  0.0052910  0.0653786

summary(DataSet[,'MCD'])

##       Min.    1st Qu.     Median       Mean    3rd Qu.       Max. 
## -0.0673566 -0.0049762  0.0003516  0.0002202  0.0054938  0.0731615

Empirical distribution of returns

hist(DataSet[,'KO'], probability = TRUE, main = 'Empirical distribution for KO', xlab = 'historical outcomes')

hist(DataSet[,'PEP'], probability = TRUE, main = 'Empirical distribution for PEP', xlab = 'historical outcomes')

hist(DataSet[,'MCD'], probability = TRUE, main = 'Empirical distribution for MCD', xlab = 'historical outcomes')

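# Cumulative gross returns of each stock, starting from 1
path <- cbind(cumprod(DataSet[,'KO']+1), cumprod(DataSet[,'PEP']+1), cumprod(DataSet[,'MCD']+1))
steps <- 1:(nrow(DataSet)+1)
# ylim covers all three paths so that no line is clipped
plot(steps, c(1,path[,2]), type='l', ylim=range(1,path), xlab='observation', ylab='cumulative return')
lines(steps, c(1,path[,1]), col='red')
lines(steps, c(1,path[,3]), col='blue')
legend('topleft', legend=c('PEP','KO','MCD'), col=c('black','red','blue'), lty=1)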
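Each return in this exercise is measured within a single day (column 4 over column 1), so compounding them with cumprod leaves out any price move between one close and the next open. A small sketch with made-up prices (all numbers are hypothetical) comparing the compounded within-day path with a buy-and-hold path:

```r
# Hypothetical open and close prices for four consecutive days, oldest first
open_px  <- c(100, 103, 101, 106)
close_px <- c(102, 102, 105, 107)

# Within-day net returns: close over open, minus 1
r_intraday <- close_px / open_px - 1

# Compounded within-day returns
path_intraday <- cumprod(1 + r_intraday)

# Buy-and-hold path: each close relative to the first open
path_hold <- close_px / open_px[1]

cbind(path_intraday, path_hold)
```

The two paths agree on the first day and then drift apart whenever a day opens at a price different from the previous close.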

library(moments)
skewness(DataSet[,'KO'])

## [1] 0.07093544

skewness(DataSet[,'PEP'])

## [1] -0.03655163

skewness(DataSet[,'MCD'])

## [1] 0.107624

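# Skewness, kurtosis and 5% historical VaR for each column of a matrix of returns
f<-function(DataSet2){
  row.names(DataSet2)<- NULL
  sk<-NULL
  kur<-NULL
  VaR_cinque<-NULL
  for (i in 1:ncol(DataSet2)) {
    #--------------------------------- skewness and kurtosis ------
    sk[i]<-skewness(DataSet2[,i])
    kur[i]<-kurtosis(DataSet2[,i])
    #--------------------------------- 5% VaR ------------
    # Position of the 5% quantile among the sorted returns (may be fractional);
    # the column needs at least 20 observations, so that posizione >= 1
    confidenza <- length(DataSet2[,i])*0.05
    posizione<-trunc(confidenza)
    # Linear interpolation between the two neighbouring sorted returns
    sorted_returns<-sort(DataSet2[,i])
    VaR_cinque[i]<- (1- (confidenza-posizione))*sorted_returns[posizione] + (confidenza-posizione)*sorted_returns[posizione + 1]
  }
  # One row per stock, one column per statistic
  risp<-cbind(sk,kur,VaR_cinque)
  row.names(risp)<-colnames(DataSet2)
  colnames(risp)<-c('skew','Kurtosis','VaR5%')
  return(risp)
}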
f(DataSet)

##            skew  Kurtosis       VaR5%
## KO   0.07093544 12.801002 -0.01466827
## PEP -0.03655163  7.261805 -0.01388039
## MCD  0.10762401  8.687005 -0.01565758

cov(DataSet)

##               KO          PEP          MCD
## KO  9.415967e-05 5.888960e-05 4.504165e-05
## PEP 5.888960e-05 8.637010e-05 4.146353e-05
## MCD 4.504165e-05 4.146353e-05 1.024384e-04

As before, the data are provided in the wide format. A wide dataset is the more intuitive and more common layout for time series: each row is a date and each column is a variable observed on that date.

Mini Project Programming

Milan Federico, Agnelli Lorenzo

13/11/2019
