Installation of the Libraries
Libraries
For financial analysis of stock-exchange data, the
following packages are widely used:
- quantmod
- PerformanceAnalytics
- tidyquant
#install.packages("PerformanceAnalytics", repos = "http://cran.us.r-project.org")
#install.packages("dplyr", repos = "http://cran.us.r-project.org")
#install.packages("tidyquant", repos = "http://cran.us.r-project.org")
#install.packages("quantmod", repos = "http://cran.us.r-project.org")
#install.packages("tseries", repos = "http://cran.us.r-project.org")
#install.packages("tidyverse", repos = "http://cran.us.r-project.org")
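library(PerformanceAnalytics) # returns and risk charts (CalculateReturns, chart.Histogram, chart.Correlation)
library(quantmod) # data download (getSymbols, Ad)
library(tidyquant) # tidyverse-style wrappers for financial analysis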
# other libraries
library(ggplot2)
library(dplyr)
library(tseries)
library(tidyverse)
library(plotly)
library(hrbrthemes)
library(xts)
library(knitr)
library(kableExtra)
library(car)
library(mathjaxr)
library(zoo)
rm(list=ls())
Data download
To download the stock-exchange data, we use the “getSymbols” command
from the quantmod package. The “Ad” command extracts just the adjusted
daily closing prices.
symbol_name <- c("AAPL", "GOOG", "AMZN", "F", "T", "TQQQ")
#symbol_name <- "AAPL"
for (i in seq_along(symbol_name)) {
  # download one symbol and keep only its adjusted closing prices
  prac <- Ad(getSymbols(symbol_name[i], from = "2020-01-01", to = "2022-12-31", auto.assign = FALSE))
  if (i == 1) {
    price <- prac
  } else {
    price <- merge(price, prac) # join the new column by date
  }
}
rm(prac) # prac is only a temporary variable
colnames(price) <- symbol_name # name the columns after the tickers
Analysis of one asset
Above, we downloaded the adjusted closing prices. Now take the first
asset from “symbol_name” and plot it. We clearly see that the stochastic
properties of the asset differ significantly between the period before
2020-07-01 and the period after 2022-07-01, marked by the vertical lines.
title <- paste(symbol_name[1], "Share")
dolna_hranica <- "2020-07-01" # lower boundary (end of the 1st period)
horna_hranica <- "2022-07-01" # upper boundary (start of the 2nd period)
#data(sample_matrix)
#sample.xts <- as.xts(price[,1])
events <- xts(c(" "," "),as.Date(c(dolna_hranica, horna_hranica)))
plot(price[,1], col="red", main=title)

addEventLines(events, srt=90, pos=2,col="blue")

The differences in the stochastic properties of the stock prices
between the two periods can also be shown with some basic
descriptive statistics.
subT_D <- price["2020-01-01/2020-07-01",1] # 1st period
subT_H <- price["2022-07-01/2022-12-28",1] # 2nd period
# Get summary statistics using summary() function
summary_data <- summary(data.frame(cbind(subT_D,subT_H)))
summary_data <- summary_data[1:6,] # keep the six statistics, drop the NA counts
# Label the columns by period
colnames(summary_data) <- c("1st period","2nd period")
summary_data
   1st period        2nd period
 Min.   :55.00    Min.   :125.8
 1st Qu.:69.33    1st Qu.:142.7
 Median :75.70    Median :149.0
 Mean   :74.21    Mean   :149.9
 3rd Qu.:78.59    3rd Qu.:155.3
 Max.   :90.09    Max.   :174.0
#summary_table <- kable(summary_data)
# Customize the table using kableExtra package
#summary_table %>%
# kable_styling(full_width = FALSE) %>%
# add_header_above(c("Summary Statistics" = 3))
Conversion of the level data time series to the returns
Empirical experience leads to the conclusion that the underlying
probability distribution of the prices changes over time,
\(f_t(P) \neq f_{t \pm i}(P)\), which means that the underlying time
series is not stationary. That is why we cannot apply the standard
tools of probability theory directly to the prices. Economists solve
the problem by working with the asset price returns, i.e.
\[r_t = \frac{\Delta P_t}{P_{t-1}},\]
where \(\Delta P_t = P_t - P_{t-1}\). Osborne proposed an alternative
formulation of the returns as
\[r_t = \ln(P_t) - \ln(P_{t-1}).\]
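The two return definitions can be checked on a small made-up price series. The following is a minimal base-R sketch; the price vector “P” is invented for illustration and is not taken from the downloaded data. It also shows that the two definitions are linked by \(r^{\log}_t = \ln(1 + r^{\text{discrete}}_t)\).

```r
# Hypothetical prices for four consecutive days (illustrative values only)
P <- c(100, 102, 101, 105)

# Discrete returns: (P_t - P_{t-1}) / P_{t-1}
r_discrete <- diff(P) / head(P, -1)

# Log returns: ln(P_t) - ln(P_{t-1})
r_log <- diff(log(P))

# The two definitions are linked by r_log = ln(1 + r_discrete)
print(all.equal(r_log, log(1 + r_discrete)))

# Side-by-side comparison; for small daily moves the two are close
print(round(cbind(r_discrete, r_log), 5))
```

For daily data the two series are usually very close, which is why either definition is commonly used in practice.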
First, look at the stochastic properties of the returns,
i.e. compare their means and other summary statistics in the two
periods under study.
return_l <- CalculateReturns(price, method="log")
return_p <- CalculateReturns(price, method="discrete")
for(i in seq_len(ncol(return_l))){ # impute missing values with the column median
  return_l[,i][is.na(return_l[,i])] <- median(return_l[,i],na.rm = TRUE)
}
for(i in seq_len(ncol(return_p))){ # impute missing values with the column median
  return_p[,i][is.na(return_p[,i])] <- median(return_p[,i],na.rm = TRUE)
}
subT_Dr_p <- return_p["2020-01-01/2020-07-01",1] # split return_p into two subsets
subT_Hr_p <- return_p["2022-07-01/2022-12-28",1]
# plot just the first share
title <- paste(symbol_name[1], "Share" )
events <- xts(c(" "," "),as.Date(c(dolna_hranica, horna_hranica)))
plot(return_p[,1], col="red", main=title)

addEventLines(events, srt=90, pos=2,col="blue")

# Get summary statistics using summary() function
summary_data <- summary(data.frame(cbind(subT_Dr_p,subT_Hr_p)))
summary_data <- summary_data[1:6,]
# Convert summary statistics to a table using kable()
colnames(summary_data) <- c("1st period","2nd period")
summary_data
    1st period           2nd period
 Min.   :-0.12865    Min.   :-0.05868
 1st Qu.:-0.01154    1st Qu.:-0.01458
 Median : 0.00148    Median :-0.00195
 Mean   : 0.00212    Mean   :-0.00038
 3rd Qu.: 0.01876    3rd Qu.: 0.01297
 Max.   : 0.11981    Max.   : 0.08897
#summary_table <- kable(summary_data)
# Customize the table using kableExtra package
#summary_table %>%
# kable_styling(full_width = FALSE) %>%
# add_header_above(c("Summary Statistics" = 3))
The returns in the two periods seem to be similarly distributed. At
the least, let us test the equality of the mean returns,
\[H_0: M_{\text{1st period}} = M_{\text{2nd period}}\]
against
\[H_1: M_{\text{1st period}} \neq M_{\text{2nd period}},\]
and also the equality of the variances,
\[H_0: \sigma^2_{\text{1st period}} = \sigma^2_{\text{2nd period}}\]
against
\[H_1: \sigma^2_{\text{1st period}} \neq \sigma^2_{\text{2nd period}}.\]
# Wilcoxon rank-sum test for a location shift between the two periods
resultM <- wilcox.test(as.vector(subT_Dr_p), as.vector(subT_Hr_p))
# F test for the equality of the two variances
resultSigma <- var.test(as.vector(subT_Dr_p), as.vector(subT_Hr_p))
resultM
Wilcoxon rank sum test with continuity correction
data: as.vector(subT_Dr_p) and as.vector(subT_Hr_p)
W = 8575, p-value = 0.2239
alternative hypothesis: true location shift is not equal to 0
resultSigma
F test to compare two variances
data: as.vector(subT_Dr_p) and as.vector(subT_Hr_p)
F = 2.1779, num df = 125, denom df = 124, p-value = 1.881e-05
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
1.530087 3.099320
sample estimates:
ratio of variances
2.177942
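The p-value of the Wilcoxon test is 0.2238738, while the p-value of
the F test of equal variances is \(1.88054 \times 10^{-5}\). At the 5%
significance level, we therefore do not reject the hypothesis of an
equal location of the returns, but we do reject the hypothesis of
equal variances.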
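The hypothesis above is stated in terms of means, whereas the Wilcoxon rank-sum test addresses a location shift. One possible complement is Welch's two-sample t-test, which compares means without assuming equal variances. The sketch below runs it next to the F test on simulated data; the vectors “x1” and “x2” are hypothetical, drawn from normal distributions with sample sizes chosen to mimic the degrees of freedom reported above, and are not the AAPL returns.

```r
set.seed(1)  # make the simulated example reproducible

# Simulated daily returns (NOT the AAPL data): equal means, different volatility
x1 <- rnorm(126, mean = 0, sd = 0.03)
x2 <- rnorm(125, mean = 0, sd = 0.02)

# Welch two-sample t-test: equality of means without assuming equal variances
welch <- t.test(x1, x2, var.equal = FALSE)

# F test: equality of variances (relies on normally distributed samples)
ftest <- var.test(x1, x2)

print(welch$p.value)
print(ftest$p.value)
```

Because the F test is sensitive to departures from normality, its result on real return data should be read with some caution.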
Normally distributed returns?
Many theories (Markowitz, Black-Scholes, VaR) assume normally
distributed data. In reality, the assumption is often violated
and the real distribution has heavier tails than the normal
distribution. See the result of the following Jarque-Bera normality
test, applied here to the first price series “price[,1]”:
test_result <- jarque.bera.test(price[,1])
print(test_result)
Jarque Bera Test
data: price[, 1]
X-squared = 56.844, df = 2, p-value = 4.534e-13
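The Jarque-Bera statistic combines the sample skewness \(S\) and kurtosis \(K\) as \(JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)\) and is compared with a chi-squared distribution with 2 degrees of freedom. The following is a minimal base-R sketch of that computation; the helper name “jb_stat” is hypothetical, and the sample is simulated heavy-tailed data rather than the series tested above.

```r
# Hypothetical helper: Jarque-Bera statistic and p-value from sample moments
jb_stat <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  d <- x - mean(x)
  m2 <- mean(d^2)             # second central moment
  S  <- mean(d^3) / m2^(3/2)  # sample skewness (0 for a symmetric distribution)
  K  <- mean(d^4) / m2^2      # sample kurtosis (3 for a normal distribution)
  JB <- n / 6 * (S^2 + (K - 3)^2 / 4)
  list(statistic = JB,
       p.value = pchisq(JB, df = 2, lower.tail = FALSE))
}

set.seed(1)
heavy <- rt(500, df = 3)  # simulated heavy-tailed sample (Student t, 3 df)
print(jb_stat(heavy))
```

A large statistic (small p-value) indicates that skewness or kurtosis, or both, deviate from the values expected under normality.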
Histograms
dev.new() # open a new plot window
par(mfrow=c(2,3)) # arrange plots in a grid of 2 rows and 3 columns
for(i in seq_len(ncol(return_p))){ # iterate through each symbol
  etf <- return_p[,i] # temporary variable; the column name is the ticker
  chart.Histogram( etf, main=paste(" Return Distribution"),
    breaks=15, methods=c("add.normal"), # add the normal curve
    colorset=c("steelblue", "darkgreen", "navy") # colors for each (middle color not used)
  )
}

chart.Histogram( return_p[,1], main= paste(" Return Distribution of ",colnames(return_p[,1])),
breaks=15, methods=c("add.normal", "add.risk"), # Add the normal curve, and the VaR levels
colorset=c("steelblue", "darkgreen", "navy") # colors for each (middle color not used)
)

Correlation structure of the portfolio assets
When facing risk, effective diversification of the investment is
needed; in this way, we can reduce the unsystematic risk. The
following figure shows the correlations in our hypothetical
portfolio.
#chart.Correlation(return_p[2:ncol(return_p)])
chart.Correlation(return_p)

Estimation of the variance-covariance matrix of the portfolio returns
The variance-covariance matrix is a central topic of portfolio
theory. It contains the return variances on the main diagonal and the
covariances off the diagonal. In the correlation matrix, the diagonal
terms are 1’s and the off-diagonal terms are the corresponding
correlations.
var_covar_p <- cov(return_p)
var_covar_p
AAPL GOOG AMZN F T TQQQ
AAPL 0.0005405816 0.0003660638 0.0003793896 0.0003066270 0.0001693079 0.0011384589
GOOG 0.0003660638 0.0004681645 0.0003637674 0.0002879402 0.0001535278 0.0010263473
AMZN 0.0003793896 0.0003637674 0.0006053833 0.0002239328 0.0001049222 0.0010747271
F 0.0003066270 0.0002879402 0.0002239328 0.0009734113 0.0002524982 0.0008633685
T 0.0001693079 0.0001535278 0.0001049222 0.0002524982 0.0003164058 0.0004406637
TQQQ 0.0011384589 0.0010263473 0.0010747271 0.0008633685 0.0004406637 0.0030440217
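The link between the variance-covariance matrix and the correlation matrix can be checked numerically. The sketch below uses simulated returns for three hypothetical assets (“A”, “B”, “C”), not the portfolio above, and also illustrates, as an assumption about how the matrix would typically be used, the variance of an equally weighted portfolio computed as \(w^\top \Sigma w\).

```r
set.seed(1)
# Simulated returns for three hypothetical assets (illustrative only)
R <- matrix(rnorm(300, sd = 0.02), ncol = 3,
            dimnames = list(NULL, c("A", "B", "C")))

Sigma <- cov(R)          # variance-covariance matrix
Rho   <- cov2cor(Sigma)  # rescaled to the correlation matrix

# Variances sit on the diagonal of Sigma; Rho matches cor(R)
print(all.equal(diag(Sigma), apply(R, 2, var)))
print(all.equal(Rho, cor(R)))

# Variance of an equally weighted portfolio: w' Sigma w
w <- rep(1 / 3, 3)
port_var <- as.numeric(t(w) %*% Sigma %*% w)
print(all.equal(port_var, as.numeric(var(R %*% w))))
```

The last check confirms that the quadratic form gives the same number as the sample variance of the portfolio return series itself.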
