Assignment 1

Question 1

Obtain daily price data for any 3 stocks of your choosing for the last trading day of 2020 to the last trading day of 2021.

Keep only the adjusted closing price column for each
Merge the three to obtain just one object containing the dates and the three prices (we didn’t do this in class)
Calculate the continuously compounded daily returns for each stock - you should have three more columns now
The first row should have NA values for the returns - remove this
Use the result of the previous step to calculate the cumulative return starting from the first trading day in 2021 - you should have three more columns now
Plot the three cumulative returns on the same chart
Make sure it is clear - you may use different colors and/or line types for each graph
Include a legend (we didn’t do this in class either)
Provide a title and label the axes properly (if it’s obvious from looking at it that the x-axis is date, then don’t worry about the labeling).

You can paste your code here:

library(quantmod)

## Loading required package: xts

## Loading required package: zoo

## 
## Attaching package: 'zoo'

## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric

## Loading required package: TTR

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

getSymbols('AMZN', from = "2020-12-31", to = "2021-12-31")

## 'getSymbols' currently uses auto.assign=TRUE by default, but will
## use auto.assign=FALSE in 0.5-0. You will still be able to use
## 'loadSymbols' to automatically load data. getOption("getSymbols.env")
## and getOption("getSymbols.auto.assign") will still be checked for
## alternate defaults.
## 
## This message is shown once per session and may be disabled by setting 
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for details.

## [1] "AMZN"

getSymbols('DIS', from = "2020-12-31", to = "2021-12-31")

## [1] "DIS"

getSymbols('IBM', from = "2020-12-31", to = "2021-12-31")

## [1] "IBM"

AMZN <- AMZN[, 'AMZN.Adjusted']
names(AMZN) <- 'P.AMZN'
DIS <- DIS[, 'DIS.Adjusted']
names(DIS) <- 'P.DIS'
IBM <- IBM[, 'IBM.Adjusted']
names(IBM) <- 'P.IBM'
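# Merge the three price series into one xts object aligned on date
DailyData <- merge(AMZN, DIS, IBM)
# Continuously compounded daily returns: log(P_t / P_{t-1})
DailyData$R.AMZN <- diff(log(DailyData$P.AMZN))
DailyData$R.DIS <- diff(log(DailyData$P.DIS))
DailyData$R.IBM <- diff(log(DailyData$P.IBM))
# Drop the first row, whose returns are NA
DailyData <- na.omit(DailyData)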
DailyData$cumR.AMZN <- cumsum(DailyData$R.AMZN)
DailyData$cumR.DIS <- cumsum(DailyData$R.DIS)
DailyData$cumR.IBM <- cumsum(DailyData$R.IBM)
Cumulative.Returns <- cbind(DailyData$cumR.AMZN, DailyData$cumR.DIS, DailyData$cumR.IBM)
plot(Cumulative.Returns, type = 'l', xlab = 'Date', ylab = "Return", main = "Cumulative Returns of AMZN, DIS, and IBM 2021", legend.loc = "topleft")
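
As a quick sanity check on the return calculations above, the following sketch uses made-up prices (not market data) and base R only. It illustrates that the running sum of continuously compounded returns equals the log of each price relative to the starting price, and that exponentiating the cumulative log return recovers the simple cumulative return. The price vector and all variable names are hypothetical.

```r
# Hypothetical prices for illustration only (not real market data)
prices <- c(100, 102, 101, 105, 110)

# Continuously compounded (log) daily returns; diff() drops the first
# observation, which has no previous price
log_ret <- diff(log(prices))

# Cumulative log return from the first price onward
cum_log_ret <- cumsum(log_ret)

# Summing log returns telescopes to log(P_t / P_0)
direct <- log(prices[-1] / prices[1])

# Convert to simple cumulative returns, e.g. for reporting in percent
cum_simple_ret <- exp(cum_log_ret) - 1

print(all.equal(cum_log_ret, direct))
print(round(cum_simple_ret, 4))
```

The plot in the answer shows cumulative log returns; applying `exp(x) - 1` to those columns would give the equivalent simple returns if that scale is preferred.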

Question 2

In class, we discussed four statistical properties: mean, variance, skewness, and kurtosis. These depend on the values a random variable takes and the underlying probability distribution. However, when dealing with real world data, we don’t know what probability distribution the data follows so we need to “estimate” these values from the observed values.

For the following exercise, you will need to install an additional package called PerformanceAnalytics. Do that first and load the package using the library function. Then, for each of the stocks that you used:

Calculate the mean daily return
Calculate the variance and standard deviation of daily returns
Calculate the skewness of daily returns
Calculate the kurtosis of daily returns
Compare the skewness and kurtosis to that of a normal distribution and comment on whether you think the daily stock returns are likely to follow a normal distribution or not.
- For the first two, R has built-in functions. For the last two, the PerformanceAnalytics package has functions available you can use.
Finally use the cor function to calculate the correlation matrix (which is just the correlation between all possible pairs of returns).
- To use this function, if your data is stored in an object called d, and the returns are in columns 4:6, you will need to use: cor( d[ , c(4:6) ] ). Store the result in a new variable.
- That is, you will need to provide all columns to the cor function at the same time
- Use the round function to output the correlation matrix with 2 decimal places.

Create a code chunk below and paste your code there.

# Mean daily return for each stock
meanR.AMZN <- mean(DailyData$R.AMZN)
meanR.DIS <- mean(DailyData$R.DIS)
meanR.IBM <- mean(DailyData$R.IBM)
# Standard deviation of daily returns
sdR.AMZN <- sd(DailyData$R.AMZN)
sdR.DIS <- sd(DailyData$R.DIS)
sdR.IBM <- sd(DailyData$R.IBM)
# Variance is the square of the standard deviation
varR.AMZN <- sdR.AMZN^2
varR.DIS <- sdR.DIS^2
varR.IBM <- sdR.IBM^2
library(PerformanceAnalytics)

## 
## Attaching package: 'PerformanceAnalytics'

## The following object is masked from 'package:graphics':
## 
##     legend

skewR.AMZN <- skewness(DailyData$R.AMZN)
skewR.DIS <- skewness(DailyData$R.DIS)
skewR.IBM <- skewness(DailyData$R.IBM)
kurtR.AMZN <- kurtosis(DailyData$R.AMZN)
kurtR.DIS <- kurtosis(DailyData$R.DIS)
kurtR.IBM <- kurtosis(DailyData$R.IBM)
# Collect the estimates in one table. Naming the columns inside data.frame()
# avoids creating variables that shadow mean(), var(), sd(), and so on.
Statistics <- data.frame(
  mean = c(meanR.AMZN, meanR.DIS, meanR.IBM),
  var = c(varR.AMZN, varR.DIS, varR.IBM),
  sd = c(sdR.AMZN, sdR.DIS, sdR.IBM),
  skewness = c(skewR.AMZN, skewR.DIS, skewR.IBM),
  kurtosis = c(kurtR.AMZN, kurtR.DIS, kurtR.IBM),
  row.names = c('AMZN', 'DIS', 'IBM')
)
Statistics

##               mean          var         sd   skewness  kurtosis
## AMZN  0.0001393821 0.0002319437 0.01522970 -0.4103536  2.640314
## DIS  -0.0005979435 0.0002451000 0.01565567  0.1640817  3.123931
## IBM   0.0006211192 0.0002158771 0.01469276 -2.6262563 18.109773

The skewness and excess kurtosis of a normal distribution are both equal to 0 (its raw kurtosis is 3); the kurtosis values above are interpreted as excess kurtosis. Values close to 0 are consistent with normality but do not prove it. DIS has skewness relatively close to zero (0.16), AMZN is moderately negatively skewed (-0.41), and IBM is strongly negatively skewed (-2.63). All three stocks have large excess kurtosis (about 2.6 to 18.1), especially IBM, which indicates fatter tails than a normal distribution. Based on these statistics, the daily returns are unlikely to be normally distributed.
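
To make the normal benchmark concrete, here is a minimal sketch in base R that computes moment-based skewness and excess kurtosis for simulated data. The helper functions `sample_skewness` and `sample_excess_kurtosis` are hypothetical names written for this illustration; they are simple moment estimators and may differ slightly from what PerformanceAnalytics computes, depending on the method it uses. The Student t distribution with 5 degrees of freedom is an assumed stand-in for a fat-tailed series, not a model of the stock returns above.

```r
# Moment-based estimators (illustrative; a package may apply different
# defaults or small-sample corrections)
sample_skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}
sample_excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3   # subtract 3 so a normal scores about 0
}

set.seed(1)
n <- 100000
normal_draws <- rnorm(n)        # benchmark: skewness 0, excess kurtosis 0
heavy_draws <- rt(n, df = 5)    # Student t with 5 df: fatter tails

stats <- rbind(
  normal = c(skewness = sample_skewness(normal_draws),
             excess_kurtosis = sample_excess_kurtosis(normal_draws)),
  t5 = c(skewness = sample_skewness(heavy_draws),
         excess_kurtosis = sample_excess_kurtosis(heavy_draws))
)
print(round(stats, 2))
```

The normal sample should score near zero on both measures, while the fat-tailed sample shows clearly positive excess kurtosis, which is the pattern seen in the three return series.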

Correlation <- cor( DailyData[ , c(4:6) ] )
Correlation <- round(Correlation, digits = 2)
Correlation

##        R.AMZN R.DIS R.IBM
## R.AMZN   1.00  0.17  0.00
## R.DIS    0.17  1.00  0.18
## R.IBM    0.00  0.18  1.00

Question 3

We simulated a random walk model in class. Consider an extension of that model to account for drift: \[ Y_{t+1} = Y_t + \alpha + \epsilon_t \]

where \(Y_0\) is fixed, \(\alpha\) is the drift term, and \(\epsilon_t \sim N(0,1)\) as before. The \(\alpha\) term essentially causes the time series to have an upward trend over time, although this might not be immediately obvious. Consider an \(\alpha = 0.05\) and a starting point of \(Y_0 = 0\). Simulate this process for 1,000 time steps. If you start with the code from class, you only need to tweak it very slightly to generate this new process. Create a plot with:

Random walk with no drift, i.e. \(\alpha = 0\)
Random walk with the drift, i.e. \(\alpha = 0.05\)

Provide appropriate title, legend, labels.

Note: Don’t forget to use set.seed and absolutely no for-loops allowed! Paste your code in a code chunk below.

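set.seed(10)                 # make the simulation reproducible
N <- 1000                    # number of time steps
t <- 1:N
e <- rnorm(N,0,1)            # shocks shared by both paths
a <- 0.05                    # drift term alpha
y1 <- cumsum(e+a)            # random walk with drift (Y_0 = 0)
y2 <- cumsum(e)              # random walk without drift
# Set the y-axis from the data so neither path is clipped
plot(t, y1, xlab = 'Time', ylab = "Y", ylim = range(y1, y2), type = 'l', col = 'blue', main = "Random Walk with and without Drift")
lines(t, y2, lty = 1, col = 'red')
abline(h = 0)                # reference line at the starting value
legend("topleft", legend = c("With drift (alpha = 0.05)", "Without drift (alpha = 0)"), col = c("blue", "red"), lty = 1)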
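
Because both paths are built from the same shocks, the gap between them is purely deterministic: at step \(t\) it equals \(\alpha t\). The sketch below checks this numerically without any loops. It repeats the simulation setup so that it runs on its own; the names `steps`, `with_drift`, `no_drift`, and `gap` are introduced only for this illustration.

```r
set.seed(10)
N <- 1000
steps <- 1:N
e <- rnorm(N, 0, 1)   # one set of shocks used for both paths
a <- 0.05             # drift term alpha

with_drift <- cumsum(e + a)
no_drift <- cumsum(e)

# The shocks cancel in the difference, leaving only the trend alpha * t
gap <- with_drift - no_drift

print(all.equal(gap, a * steps))
print(gap[N])
```

By the last step the drift has added \(0.05 \times 1000 = 50\) to the level of the series, which is why the upward trend becomes visible over a long horizon even though each single step is dominated by noise with standard deviation 1.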

Question 4

Practice with expected values & variances. Suppose \(X_1, X_2, \cdots, X_T\) are independent random variables with mean 0.5 and standard deviation 4. Then, calculate the expected values and standard deviation of their sum:

\(E(X_1 + X_2 + \cdots + X_T) = ?\)
\(\text{sd}(X_1 + X_2 + \cdots + X_T) = ?\)

You don’t need to show any calculation but do explain your reasoning below.

Finally, if \(X_1, X_2, \cdots, X_T\) all have a normal distribution, then what distribution does their sum have?

By linearity of expectation, \(E(X_1+X_2+\cdots+X_T)\) is equal to \(E(X_1)+E(X_2)+\cdots+E(X_T)\). This holds whether or not the random variables are independent. Therefore, the expected value of their sum is equal to \(T\mu\), or \(0.5T\).

Variances, in contrast, add only because the random variables are independent, so every covariance term is zero. Since \(Var(X_1+X_2+\cdots+X_T)\) is equal to \(Var(X_1)+Var(X_2)+\cdots+Var(X_T)\), the variance of the sum equals \(T\sigma^2\), or \(16T\). Therefore, the standard deviation of the sum equals \(\sqrt{T}\sigma\), or \(4\sqrt{T}\) in this case.

A sum of independent normal random variables is itself normally distributed. Therefore, if \(X_1, X_2, \cdots, X_T\) are all normally distributed, their sum is normally distributed with mean \(0.5T\) and variance \(16T\), i.e. \(N(0.5T, 16T)\).
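
The formulas for the mean and standard deviation of the sum can be checked with a small simulation. This is a sketch under stated assumptions: \(T = 25\) is a hypothetical choice, and the draws are taken from a normal distribution only for convenience, since the mean and standard deviation results hold for any independent variables with mean 0.5 and standard deviation 4. With these values the theory gives a mean of \(0.5 \times 25 = 12.5\) and a standard deviation of \(4\sqrt{25} = 20\).

```r
set.seed(42)
n_terms <- 25    # hypothetical value of T
mu <- 0.5
sigma <- 4
reps <- 20000    # number of simulated sums

# Each row holds one realisation of X_1, ..., X_T (no loops needed)
draws <- matrix(rnorm(reps * n_terms, mean = mu, sd = sigma), nrow = reps)
sums <- rowSums(draws)

theory <- c(mean = n_terms * mu, sd = sqrt(n_terms) * sigma)
simulated <- c(mean = mean(sums), sd = sd(sums))
print(rbind(theory, simulated))
```

The simulated values should land close to the theoretical ones, with the remaining difference due to sampling noise.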

Maanav Patel

2/21/2022
