Obtain daily price data for any 3 stocks of your choosing, from the last trading day of 2020 to the last trading day of 2021.
library(quantmod)
getSymbols('AMZN', from = "2020-12-31", to = "2021-12-31")
## [1] "AMZN"
getSymbols('DIS', from = "2020-12-31", to = "2021-12-31")
## [1] "DIS"
getSymbols('IBM', from = "2020-12-31", to = "2021-12-31")
## [1] "IBM"
# Keep only the adjusted closing price of each stock and give it a short name
AMZN <- AMZN[, 'AMZN.Adjusted']
names(AMZN) <- 'P.AMZN'
DIS <- DIS[, 'DIS.Adjusted']
names(DIS) <- 'P.DIS'
IBM <- IBM[, 'IBM.Adjusted']
names(IBM) <- 'P.IBM'
# Combine the three price series into one xts object aligned by date
DailyData <- merge(AMZN, DIS, IBM)
# Daily log returns, log(P_t / P_{t-1}); the first row is NA because it has no previous price
DailyData$R.AMZN <- diff(log(DailyData$P.AMZN))
DailyData$R.DIS <- diff(log(DailyData$P.DIS))
DailyData$R.IBM <- diff(log(DailyData$P.IBM))
# Drop the first row, which has no return
DailyData <- na.omit(DailyData)
# Cumulative log returns are the running sums of the daily log returns
DailyData$cumR.AMZN <- cumsum(DailyData$R.AMZN)
DailyData$cumR.DIS <- cumsum(DailyData$R.DIS)
DailyData$cumR.IBM <- cumsum(DailyData$R.IBM)
Cumulative.Returns <- cbind(DailyData$cumR.AMZN, DailyData$cumR.DIS, DailyData$cumR.IBM)
plot(Cumulative.Returns, type = 'l', xlab = 'Date', ylab = "Cumulative log return", main = "Cumulative Returns of AMZN, DIS, and IBM 2021", legend.loc = "topleft")
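The plotted series are cumulative log returns, that is, running sums of daily log returns. The sketch below shows how such a value maps back to a simple holding-period return through exp(x) - 1. It uses a made-up price vector rather than the downloaded data, so the name `prices` and its values are purely illustrative assumptions.

```r
# Hypothetical closing prices (illustrative values only, not market data)
prices <- c(100, 102, 101, 105)

# Daily log returns: log(P_t / P_{t-1})
log_ret <- diff(log(prices))

# Cumulative log return is the running sum of the daily log returns
cum_log_ret <- cumsum(log_ret)

# Convert to a simple holding-period return: exp(cumulative log return) - 1
cum_simple_ret <- exp(cum_log_ret) - 1

# The last value should agree with the direct calculation P_end / P_start - 1
direct_ret <- prices[length(prices)] / prices[1] - 1
all.equal(cum_simple_ret[length(cum_simple_ret)], direct_ret)
```

For small daily moves the two measures are close, but over a full year the gap between a cumulative log return and the corresponding simple return can be noticeable.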
In class, we discussed four statistical properties: mean, variance, skewness, and kurtosis. These depend on the values a random variable takes and on the underlying probability distribution. However, when dealing with real-world data, we don’t know what probability distribution the data follow, so we need to “estimate” these properties from the observed values.
For the following exercise, you will need to install an additional package called PerformanceAnalytics. Do that first and load the package using the library function. Then, for each of the stocks that you used:
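# Sample mean of the daily log returns
meanR.AMZN <- mean(DailyData$R.AMZN)
meanR.DIS <- mean(DailyData$R.DIS)
meanR.IBM <- mean(DailyData$R.IBM)
# Sample standard deviation of the daily log returns
sdR.AMZN <- sd(DailyData$R.AMZN)
sdR.DIS <- sd(DailyData$R.DIS)
sdR.IBM <- sd(DailyData$R.IBM)
# Variance is the square of the standard deviation
varR.AMZN <- sdR.AMZN^2
varR.DIS <- sdR.DIS^2
varR.IBM <- sdR.IBM^2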
library(PerformanceAnalytics)
skewR.AMZN <- skewness(DailyData$R.AMZN)
skewR.DIS <- skewness(DailyData$R.DIS)
skewR.IBM <- skewness(DailyData$R.IBM)
kurtR.AMZN <- kurtosis(DailyData$R.AMZN)
kurtR.DIS <- kurtosis(DailyData$R.DIS)
kurtR.IBM <- kurtosis(DailyData$R.IBM)
# Collect the estimates in one table. Naming the columns inside data.frame()
# avoids creating variables that shadow functions such as mean(), var() and sd().
Statistics <- data.frame(
  mean = c(meanR.AMZN, meanR.DIS, meanR.IBM),
  var = c(varR.AMZN, varR.DIS, varR.IBM),
  sd = c(sdR.AMZN, sdR.DIS, sdR.IBM),
  skewness = c(skewR.AMZN, skewR.DIS, skewR.IBM),
  kurtosis = c(kurtR.AMZN, kurtR.DIS, kurtR.IBM),
  row.names = c('AMZN', 'DIS', 'IBM')
)
Statistics
## mean var sd skewness kurtosis
## AMZN 0.0001393821 0.0002319437 0.01522970 -0.4103536 2.640314
## DIS -0.0005979435 0.0002451000 0.01565567 0.1640817 3.123931
## IBM 0.0006211192 0.0002158771 0.01469276 -2.6262563 18.109773
The skewness and excess kurtosis of a normal distribution are both equal to 0, so sample values close to 0 are consistent with normality, although they do not prove it. The skewness of DIS daily returns (0.16) is relatively close to zero, while the skewness of AMZN (-0.41) and especially IBM (-2.63) is further from zero. All three stocks have relatively large excess kurtosis values (2.64, 3.12, and 18.11), especially IBM, which points to heavier tails than a normal distribution would have. Based on these statistics, one cannot conclude that the daily returns are normally distributed.
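To see what "close to 0" looks like in practice, the sketch below computes moment-based skewness and excess kurtosis in base R for simulated data. The helper names `sample_skewness` and `sample_excess_kurtosis` are hypothetical, and they are not claimed to reproduce the PerformanceAnalytics output digit for digit; the Student t sample is only an assumed stand-in for heavy-tailed returns.

```r
# Moment-based sample skewness: third central moment over sd^3
sample_skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}

# Moment-based excess kurtosis: fourth central moment over variance^2, minus 3
sample_excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3
}

set.seed(1)
normal_draws <- rnorm(100000)        # simulated normal data
heavy_draws  <- rt(100000, df = 5)   # simulated heavy-tailed data (Student t, 5 df)

# Both statistics should be near 0 for the normal sample
c(skew = sample_skewness(normal_draws),
  exkurt = sample_excess_kurtosis(normal_draws))

# Excess kurtosis should be clearly positive for the heavy-tailed sample
c(skew = sample_skewness(heavy_draws),
  exkurt = sample_excess_kurtosis(heavy_draws))
```

With only about 250 daily observations per stock, the estimates are much noisier than in this large simulated sample, which is one reason to treat the comparison with 0 as suggestive rather than conclusive.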
# Correlation matrix of the daily log returns, selected by column name
Correlation <- cor(DailyData[, c('R.AMZN', 'R.DIS', 'R.IBM')])
Correlation <- round(Correlation, digits = 2)
Correlation
## R.AMZN R.DIS R.IBM
## R.AMZN 1.00 0.17 0.00
## R.DIS 0.17 1.00 0.18
## R.IBM 0.00 0.18 1.00
We simulated a random walk model in class. Consider an extension of that model to account for drift: \[ Y_{t+1} = Y_t + \alpha + \epsilon_t \]
where \(Y_0\) is fixed, \(\alpha\) is the drift term, and \(\epsilon_t \sim N(0,1)\) as before. A positive \(\alpha\) causes the time series to trend upward over time, although this might not be immediately obvious. Consider \(\alpha = 0.05\) and a starting point of \(Y_0 = 0\). Simulate this process for 1,000 time steps. If you start with the code from class, you only need to tweak it very slightly to generate this new process. Create a plot with:
Provide appropriate title, legend, labels.
Note: Don’t forget to use set.seed and absolutely no for-loops allowed! Paste your code in a code chunk below.
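set.seed(10)        # fix the seed so the simulation is reproducible
N <- 1000           # number of time steps
t <- 1:N
e <- rnorm(N,0,1)   # shocks epsilon_t ~ N(0,1)
a <- 0.05           # drift term alpha
# Vectorized simulation with Y_0 = 0: Y_t is the running sum of (alpha + epsilon)
y1 <- cumsum(e+a)
# Same shocks without the drift term, for comparison
y2 <- cumsum(e)
plot(t,y1, xlab='Time', ylab = "Y", ylim = c(-30,70), type = 'l', col = 'blue', main = "Random Walk with and without Drift (Alpha = 0.05)")
lines(t,y2, lty = 1, col = 'red')
# Reference line at the starting value Y_0 = 0
abline(h=0)
legend("topleft", legend = c("with Alpha", "without Alpha"), col = c("blue", "red"), lty=1)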
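Because both paths are built from the same shocks, the gap between them isolates the drift. The sketch below, which assumes the same seed and parameters as the simulation but uses its own variable names (`y_drift`, `y_nodrift`, `gap` are illustrative), checks that this gap is the deterministic trend \(\alpha t\).

```r
set.seed(10)
N <- 1000
a <- 0.05
e <- rnorm(N, 0, 1)

y_drift   <- cumsum(e + a)   # random walk with drift
y_nodrift <- cumsum(e)       # plain random walk, same shocks

# With identical shocks, the gap between the paths is the linear trend a * t
gap <- y_drift - y_nodrift
all.equal(gap, a * seq_len(N))

# Total contribution of the drift after N steps
a * N
```

The drift adds only 0.05 per step against shocks with standard deviation 1, which is why the trend is hard to see over short stretches and only becomes visible over many steps.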
Practice with expected values & variances. Suppose \(X_1, X_2, \cdots, X_T\) are independent random variables with mean 0.5 and standard deviation 4. Then, calculate the expected value and standard deviation of their sum:
You don’t need to show any calculation but do explain your reasoning below.
Finally, if \(X_1, X_2, \cdots, X_T\) all have a normal distribution, then what distribution does their sum have?
Expectation is linear, so the expected value of a sum equals the sum of the expected values, whether or not the variables are independent: \(E(X_1+X_2+\cdots+X_T) = E(X_1)+E(X_2)+\cdots+E(X_T)\). Therefore, the expected value of their sum is \(T\mu\), or \(0.5T\).
Variances add in the same way only when the covariance terms are zero, which independence guarantees. Since the variables are independent, \(Var(X_1+X_2+\cdots+X_T) = Var(X_1)+Var(X_2)+\cdots+Var(X_T)\), so the variance of their sum is \(T\sigma^2\), or \(16T\). Therefore, the standard deviation of their sum is \(\sqrt{T}\sigma\), or \(4\sqrt{T}\) in this case.
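As a worked instance of these formulas, take a hypothetical count of 100 variables (the exercise itself leaves the count general, so the value 100 is an assumption for illustration), with mean 0.5 and standard deviation 4 each:

```latex
\begin{align*}
E\left(\sum_{i=1}^{100} X_i\right) &= 100 \times 0.5 = 50, \\
Var\left(\sum_{i=1}^{100} X_i\right) &= 100 \times 4^2 = 1600, \\
SD\left(\sum_{i=1}^{100} X_i\right) &= \sqrt{1600} = 40.
\end{align*}
```

The mean grows in proportion to the number of terms, while the standard deviation grows only with its square root.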
A sum of independent normal random variables is itself normally distributed. Therefore, if \(X_1, X_2, \cdots, X_T\) are all normally distributed, their sum is normal with mean \(0.5T\) and variance \(16T\), i.e. \(N(0.5T, 16T)\).
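These results can be checked numerically. The Monte Carlo sketch below assumes normally distributed variables and a hypothetical count of 100 terms per sum; the names `T_steps`, `n_sims`, `draws`, and `sums` are illustrative and do not come from the exercise.

```r
set.seed(42)
T_steps <- 100     # hypothetical number of variables in each sum
n_sims  <- 20000   # number of simulated sums
mu      <- 0.5     # mean of each X_i
sigma   <- 4       # standard deviation of each X_i

# Each row holds one realization of X_1, ..., X_T; rowSums() gives the sum
draws <- matrix(rnorm(n_sims * T_steps, mean = mu, sd = sigma), nrow = n_sims)
sums  <- rowSums(draws)

# Compare the theoretical values with the simulated ones
c(theory_mean = T_steps * mu, sim_mean = mean(sums))
c(theory_sd = sqrt(T_steps) * sigma, sim_sd = sd(sums))
```

A histogram or normal Q-Q plot of `sums` (for example `qqnorm(sums)`) gives a quick visual check that the simulated sums look normal.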