1 Non stationary variables - The Random Walk model for stock prices

The random walk hypothesis in finance (Fama, 1965) states that the natural logarithm of stock prices behaves like a random walk with a drift. A random walk is a series (or variable) whose changes cannot be predicted from its past values. Imagine that \(Y_t\) is the log price of a stock today (t). Tomorrow's value (\(Y_{t+1}\)) will be equal to today's value (\(Y_t\)) plus a constant value (\(φ_0\)) plus a random shock. This shock is a purely random value that follows a normal distribution with mean=0 and a specific standard deviation \(σ_ε\). The process is assumed to be the same for all future periods. In mathematical terms, the random walk model is the following:

\[ Y_t = φ_0 + Y_{t−1} + ε_t \]

The \(ε_t\) is a random shock for each day, which is the log price movement caused by all news (external and internal to the stock) that influences the price. \(φ_0\) is referred to as the drift of the series. If \(|φ_0|\) > 0 we say that the series is a random walk with a drift. If \(φ_0\) is positive, then the variable will have a positive trend over time; if it is negative, the series will have a negative trend.

If we want to simulate a random walk, we need the values of the following parameters/variables:

\(Y_0\), the first value of the series
\(φ_0\), the drift of the series
\(σ_ε\), the standard deviation (volatility) of the random shock
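To make the role of these three inputs concrete, here is a minimal sketch of a random walk generator. The function name simulate_rw and the parameter values are hypothetical choices for illustration only; they are not estimated from the S&P500 and are not part of the workshop code.

```r
# Hypothetical helper: simulate n values of a random walk with drift.
simulate_rw <- function(y0, phi0, sigma, n) {
  # n - 1 independent normal shocks with mean 0 and standard deviation sigma
  shocks <- rnorm(n - 1, mean = 0, sd = sigma)
  # Each step adds the drift plus a shock, so Y_t = Y_0 + cumulative sum of steps
  c(y0, y0 + cumsum(phi0 + shocks))
}

set.seed(123)  # arbitrary seed, only for reproducibility
rw_example <- simulate_rw(y0 = 7, phi0 = 0.0005, sigma = 0.01, n = 1000)
head(rw_example)
```

Setting sigma = 0 in this sketch leaves only the deterministic trend, which is a quick way to see what the drift alone contributes.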

1.1 Q Monte Carlo simulation for the random walk model

Let’s run a Monte Carlo simulation for a random walk of the S&P 500. We will use real values of the S&P500 to estimate the previous 3 parameters.

1.1.1 Loading/installing R packages

The following R packages need to be installed for this workshop:

quantmod
dplyr
ggplot2

Go to the bottom-right pane of RStudio, select the Packages tab, click Install, and install these R packages. The quantmod package also loads the xts and zoo packages, which provide the time-series objects used below.


Once you install these packages, load them in memory:

library(quantmod)
library(ggplot2)
library(dplyr)

1.1.2 Downloading data for the S&P500

Download the S&P500 historical daily data from Yahoo Finance from 2009 onwards. I set the last date to Feb 28, 2022, but you can change this date:

getSymbols("^GSPC", from="2009-01-01", 
           to = "2022-02-28")

## [1] "^GSPC"

Now we generate the log of the S&P500 index using the adjusted closing price, and create a variable N for the number of days in the dataset:

lnsp<-log(Ad(GSPC))
# I assign a name for the index:
names(lnsp)<-c("lnsp")
N<-nrow(lnsp)

Now we will simulate 2 random walk series estimating the 3 parameters from this log series of the S&P500:

a random walk with a drift (name it rw1), and
a random walk with no drift (name it rw2).

1.1.3 Estimating the parameters of the random walk model

We have to consider the mathematical definition of a random walk and estimate its parameters (initial value, phi0, volatility of the random shock) from the real daily S&P500 data.

Now, we create a variable for a random walk with a drift trying to model the log of the S&P500.

Reviewing the random walk equation again:

\[ Y_t = φ_0 + Y_{t−1} + ε_t \]

The \(ε_t\) is the random shock of each day, which represents the overall average perception of all market participants after learning the news of the day (internal and external news announced to the market).

Remember that \(\varepsilon_{t}\) behaves like a random normal distributed variable with mean=0 and with a specific standard deviation \(\sigma_{\varepsilon}\).

For the simulation of the random walk, you need to estimate the values of

\(y_{0}\), the first value of the series, which is the log S&P500 index of the first day
\(\phi_{0}\)
\(\sigma_{\varepsilon}\)

You have to estimate \(\phi_{0}\) using the last and the first real values of the series following the equation of the random walk. Here you can see possible values of a random walk over time:

\[ Y_{0} = \text{Initial value} \]

\[ Y_{1} = \phi_{0} + Y_{0} + \varepsilon_{1} \]

\[ Y_{2} = \phi_{0} + Y_{1} + \varepsilon_{2} \]

Substituting \(Y_{1}\) with its corresponding equation:

\[ Y_{2} = \phi_{0} + \phi_{0} + Y_{0} + \varepsilon_{1} + \varepsilon_{2} \]

Re-arranging the terms:

\[ Y_{2} = 2*\phi_{0} + Y_{0} + \varepsilon_{1} + \varepsilon_{2} \]

If you continue doing the same until the last value N, you get:

\[ Y_{N} = N*\phi_{0} + Y_{0} + \sum_{t=1}^{N}\varepsilon_{t} \]

This mathematical result is kind of intuitive. The value of a random walk at time N will be equal to its initial value plus N times phi0 plus the sum of ALL random shocks from 1 to N.

Since the mean of the shocks is assumed to be zero, the expected value of the sum of the shocks will also be zero. Then:

\[ E[Y_{N}] = N*\phi_{0} + Y_{0} \]

From this equation we see that \(\phi_{0}\) can be estimated as:

\[ \phi_{0} = \frac{(Y_{N} - Y_{0})}{N} \]

Then, \(\phi_{0}\) = (last value - first value) / # of days.

I use scalars to calculate these coefficients for the simulation. In R, a scalar is simply a numeric vector of length one that stores a single number.

I calculate \(\phi_{0}\) following this formula:

phi0<- (as.numeric(lnsp$lnsp[N])-as.numeric(lnsp$lnsp[1])) / N
cat("The value for phi0 is ",phi0)

## The value for phi0 is  0.000467758

Remember that N is the total # of observations, so lnsp[N] holds the last daily value of the log of the S&P500.

Now we need to estimate sigma, which is the standard deviation of the shocks. We can start by estimating its variance. The variance of a random walk grows with the number of periods, so it can only be determined for a specific number of periods.

Then, let’s consider the equation of the random walk series for the last value (\(Y_N\)), and then estimate its variance from there:

\[ Y_{N} = N*\phi_{0} + Y_{0} + \sum_{t=1}^{N}\varepsilon_{t} \]

Using this equation, we calculate the variance of \(Y_N\), assuming the shocks are independent of each other:

\[ Var(Y_{N}) = Var(N*\phi_{0}) + Var(Y_{0}) + \sum_{t=1}^{N}Var(\varepsilon_{t}) \]

The variance of a constant is zero, so the first two terms are equal to zero.

Now analyze the variance of the shock:

Since it is supposed that the volatility (standard deviation) of the shocks is about the same over time, then:

\[ Var(\varepsilon_{1}) = Var(\varepsilon_{2}) = \dots = Var(\varepsilon_{N}) = \sigma_{\varepsilon}^2 \]

Then the sum of the variances of all shocks is the variance of a single shock times N, and this sum is the variance of \(Y_N\).

Then we can write the variance of \(Y_N\) as:

\[ Var(Y_{N}) = N * Var(\varepsilon)= N*\sigma_{\varepsilon}^2 \]

To get the standard deviation of \(Y_N\) we take the square root of the variance of \(Y_N\):

\[ SD(Y_{N}) = \sqrt{N}*SD(\varepsilon) \]

We use the sigma character for standard deviations:

\[ \sigma_{Y} = \sqrt{N}*\sigma_{\varepsilon} \]

Finally we express the volatility of the shock (\(\sigma_{\varepsilon}\)) in terms of the volatility of \(Y_N\) (\(\sigma_{Y}\)):

\[ \sigma_{\varepsilon} = \frac{\sigma_{Y}}{\sqrt{N}} \]
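The square-root-of-N relationship can be checked numerically. The sketch below simulates many independent random walks without drift and compares the standard deviation of their final values with \(\sqrt{N}*\sigma_{\varepsilon}\). All names and parameter values here (n_steps, sigma_eps, n_paths) are assumptions chosen for illustration, not values from the workshop.

```r
set.seed(42)        # arbitrary seed, only for reproducibility
n_steps   <- 250    # N: number of periods in each path
sigma_eps <- 0.01   # assumed volatility of the shock
n_paths   <- 5000   # number of simulated paths

# Each column holds the shocks of one path. With zero drift and Y_0 = 0,
# the final value Y_N is just the sum of the shocks in that column.
shocks <- matrix(rnorm(n_steps * n_paths, mean = 0, sd = sigma_eps),
                 nrow = n_steps)
y_end <- colSums(shocks)

sd_simulated <- sd(y_end)                  # SD of Y_N across paths
sd_theory    <- sqrt(n_steps) * sigma_eps  # sqrt(N) * sigma_eps
cat("Simulated SD of Y_N:", sd_simulated, "\n")
cat("Theoretical SD of Y_N:", sd_theory, "\n")
```

Note that this check looks at the dispersion of \(Y_N\) across many simulated paths, which is what \(Var(Y_N)\) refers to in the derivation.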

Then we can estimate sigma as: sigma = StDev(lnsp) / sqrt(N). Let’s do it:

sigma<-sd(lnsp$lnsp) / sqrt(N)
cat("The volatility of the log is = ",sd(lnsp$lnsp),"\n")

## The volatility of the log is =  0.4365088

cat("The volatility for the shock is = ",sigma)

## The volatility for the shock is =  0.00758601
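As a cross-check, a common alternative (not the approach used in this workshop) is to estimate the volatility of the shock from the daily changes of the series. Since \(Y_t - Y_{t-1} = \phi_0 + \varepsilon_t\), the standard deviation of the first differences equals \(\sigma_{\varepsilon}\). The sketch below compares both estimators on a simulated random walk whose true parameters are known; the values of n_obs, true_phi0 and true_sigma are assumptions for illustration.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
n_obs      <- 3000
true_phi0  <- 0.0005 # assumed drift
true_sigma <- 0.01   # assumed volatility of the shock

# Simulated random walk with drift, starting around 7
y <- 7 + cumsum(true_phi0 + rnorm(n_obs, mean = 0, sd = true_sigma))

sigma_levels <- sd(y) / sqrt(n_obs)  # estimator used in the text
sigma_diffs  <- sd(diff(y))          # SD of the period-to-period changes

cat("True sigma:", true_sigma, "\n")
cat("Estimate from levels:", sigma_levels, "\n")
cat("Estimate from first differences:", sigma_diffs, "\n")
```

With the real data, the analogous calculation would be something like sd(as.numeric(diff(lnsp$lnsp)), na.rm = TRUE), since the first difference of the series is missing for the first day.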

1.1.4 Simulating the random walk with drift

Now you are ready to start the simulation of random walk using rw1: \[ rw1_{t} = \phi_{0} + rw1_{t-1} + \varepsilon_{t} \]

The \(\phi_{0}\) coefficient is the drift of the random walk.

We will create a new column in the lnsp R dataset for the random walk with the name rw1.

lnsp$rw1 = 0

I assigned zero to all values before I do the simulation.

I start assigning the first value of the random walk to be equal to the first value of the log of the S&P500:

lnsp$rw1[1]<-lnsp$lnsp[1]

Now assign random values from day 2 to the last day following the random walk. For each day, we create the random shock using the function rnorm. We create this shock with standard deviation equal to the volatility of the shock we calculated above (the sigma). We indicate that the mean =0:

shock <- rnorm(n=N,mean=0,sd=sigma)
lnsp$shock<-shock

We can see the shock over time:

plot(shock, type="l", col="blue")

We can also see whether the shock behaves like a normal distribution by doing its histogram:

hist(lnsp$shock)

As expected, the shock behaves like a normally distributed variable.

Now we are ready to start the simulation of the random walk and fill the values for rw1. Remember the formula for the random walk process:

\[ rw1_{t} = \phi_{0} + rw1_{t-1} + \varepsilon_{t} \]

We start the random walk with the first value of the log of the S&P500. Then, from day 2 we do the simulation according to the previous formula and using the random shock just created:

# I create separate vectors:
rw1<-single(length=N)

# I assign the first value of the random-walk to be equal to real log value of the S&P
rw1[1]<-as.numeric(lnsp$lnsp[1])
# Now from day 2 I generate the values of the random walk following the formula:
for (i in 2:N){
  rw1[i] <- phi0 + rw1[i-1] + shock[i] 
}
lnsp$rw1<-rw1
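The same recursion can also be written without a loop, because each value of the random walk is the initial value plus the cumulative sum of the drift and the shocks. The sketch below checks that both versions give the same series. The names ending in _demo and their values are assumptions for illustration; they are not the estimates obtained from the S&P500.

```r
set.seed(7)           # arbitrary seed, only for reproducibility
n          <- 500
y0_demo    <- 6.8     # assumed initial value
phi0_demo  <- 0.0005  # assumed drift
sigma_demo <- 0.01    # assumed volatility of the shock
shock_demo <- rnorm(n, mean = 0, sd = sigma_demo)

# Loop version, following the random walk formula from period 2 onwards
rw_loop <- numeric(n)
rw_loop[1] <- y0_demo
for (i in 2:n) {
  rw_loop[i] <- phi0_demo + rw_loop[i - 1] + shock_demo[i]
}

# Vectorized version: initial value plus cumulative sum of (drift + shock)
rw_vec <- c(y0_demo, y0_demo + cumsum(phi0_demo + shock_demo[-1]))

all.equal(rw_loop, rw_vec)
```

The loop is closer to the formula, which is why it is convenient for teaching; the cumsum version is just a shorter way to write the same calculation.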

I plot the simulated random walk and the real log of the S&P500:

plot.xts(lnsp$lnsp, col="blue")

lines(lnsp$rw1, col="black")

WE SEE THAT THE RANDOM WALK WITH THE DRIFT BEHAVES SIMILARLY TO THE S&P500 IN TERMS OF UPWARD TENDENCY, VARIABILITY (MOST OF THE TIME), AND THE ENDING VALUE OF BOTH SERIES. HOWEVER, THERE ARE MORE PRONOUNCED DECLINES IN THE REAL S&P500 COMPARED WITH THE SIMULATED RANDOM WALK, AND THERE ARE A FEW PERIODS IN WHICH THE RANDOM WALK IS NOT A GOOD MODEL FOR THE REAL S&P500.

1.1.5 Simulating a random walk with no drift

Now we can do another simulation, this time without the drift. In this case, the \(\phi_{0}\) coefficient must be zero.

Use another variable for this (rw2 in the plan above; it is called rw1_v2 in the code below). You can follow the logic we used for rw1, but now \(\phi_{0}\) will be equal to zero, so we do not include it in the equation:

rw1_v2<-single(length=N)
rw1_v2[1]<-as.numeric(lnsp$lnsp[1])
for (i in 2:N){
  rw1_v2[i] <- rw1_v2[i-1] + shock[i] 
}

ts.plot(lnsp$lnsp, col="blue")
# I plot both lines to compare 
lines(rw1_v2, col="green")

WHAT DO YOU OBSERVE with this plot? EXPLAIN WITH YOUR WORDS.

I SEE THAT THE REAL LOG OF THE S&P500, THE BLUE LINE, HAS A CLEAR GROWING TREND, AS A RANDOM WALK WITH A POSITIVE DRIFT WOULD, WHILE THE RANDOM WALK WITH NO DRIFT (\(\phi_0\)=0), THE GREEN LINE, HAS NO CLEAR GROWING OR DECLINING TREND. THE VARIABILITY (VOLATILITY) OF BOTH SERIES SEEMS SIMILAR.

Now run a simple regression to check whether the rw1 is statistically related to the log of the S&P500. Use rw1 as explanatory variable. Show the regression results as comments.

regmodel<-lm(lnsp$lnsp~lnsp$rw1)
# I see statistics on my regression model
s_regmodel <- summary(regmodel)
s_regmodel

## 
## Call:
## lm(formula = lnsp$lnsp ~ lnsp$rw1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.37152 -0.07869  0.00537  0.09045  0.27793 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.691753   0.047014  -35.98   <2e-16 ***
## lnsp$rw1     1.251513   0.006328  197.76   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1219 on 3309 degrees of freedom
## Multiple R-squared:  0.922,  Adjusted R-squared:  0.922 
## F-statistic: 3.911e+04 on 1 and 3309 DF,  p-value: < 2.2e-16

DOES THE REGRESSION RESULT MAKE SENSE? EXPLAIN WHY YES OR WHY NOT?

THE BETA1 COEFFICIENT IS POSITIVE WITH A VALUE OF 1.2515126, AND IT IS VERY SIGNIFICANTLY GREATER THAN ZERO (t-value= 197.7590952). THIS MEANS THAT THERE IS STRONG EVIDENCE THAT THE REAL LOG MARKET INDEX IS POSITIVELY RELATED TO THE SIMULATED RANDOM WALK: FOR EACH MOVEMENT OF 1 IN THE RANDOM WALK, THE EXPECTED MOVEMENT IN THE LOG MARKET INDEX IS ABOUT 1.2515126.

THIS IS QUITE SURPRISING IF WE REMEMBER THAT THE RANDOM WALK WAS GENERATED USING RANDOM NUMBERS. HOWEVER, IF WE REMEMBER THAT WE USED THE REAL LOG MARKET INDEX TO ESTIMATE THE INITIAL VALUE, THE DRIFT, AND THE VOLATILITY OF THE SHOCK, THEN IT MAKES SENSE THAT THE REGRESSION SHOWS STRONG EVIDENCE OF POSITIVE RELATIONSHIP.

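THIS RESULT IS CONSISTENT WITH THE RANDOM WALK HYPOTHESIS OF FAMA, WHICH STATES THAT THE LOG OF STOCK PRICES MOVES LIKE A RANDOM WALK WITH A DRIFT. HOWEVER, IT IS NOT A FORMAL TEST: BOTH SERIES ARE NON-STATIONARY AND SHARE A SIMILAR TREND, AND REGRESSIONS BETWEEN SUCH SERIES CAN SHOW VERY HIGH t-VALUES EVEN WHEN THE SERIES ARE UNRELATED (SPURIOUS REGRESSION).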
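One caveat when reading the regression above is that regressions between non-stationary series can look significant even when the series are unrelated (the spurious regression problem). The sketch below illustrates this with pairs of independent random walks without drift. The number of observations, the number of simulations and the variable names are assumptions for illustration.

```r
set.seed(2022)   # arbitrary seed, only for reproducibility
n_obs  <- 500    # observations per series
n_sims <- 200    # number of simulated pairs

significant <- logical(n_sims)
for (s in 1:n_sims) {
  # Two random walks generated independently of each other
  x <- cumsum(rnorm(n_obs))
  y <- cumsum(rnorm(n_obs))
  # p-value of the slope coefficient in the regression of y on x
  p_value <- summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
  significant[s] <- p_value < 0.05
}

share_significant <- mean(significant)
cat("Share of regressions with a significant slope:", share_significant, "\n")
```

If the usual t-test were reliable here, only about 5% of these regressions would show a significant slope, since the series are independent by construction.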

DOES THE LOG OF THE S&P500 LOOK LIKE A RANDOM WALK? WHY YES OR WHY NOT?

I WILL PLOT BOTH SERIES AGAIN:

plot.xts(lnsp$lnsp, col="blue")

lines(lnsp$rw1, col="black")

YES, THE REAL LOG OF THE S&P500 LOOKS LIKE A RANDOM WALK WITH POSITIVE DRIFT. WE CAN SEE THAT BOTH SERIES MOVE VERY SIMILARLY. HOWEVER, WE CAN SEE THAT THE REAL LOG INDEX HAS MORE PRONOUNCED DECLINES COMPARED TO THE SIMULATED RANDOM WALK. THIS INDICATES THAT THE REAL LOG PRICES OF A STOCK OR INDEX ARE SIMILAR TO A RANDOM WALK, BUT IN RECESSION/CRISIS PERIODS, THE VOLATILITY OF THE REAL LOG PRICE IS HIGHER THAN IN THE WORST VOLATILITY PERIOD OF THE SIMULATED RANDOM WALK.

DO YOU THINK THAT WE CAN USE THIS TYPE OF SIMULATION TO PREDICT STOCK PRICES OR INDEXES? WHY YES OR WHY NOT?

IT SEEMS THAT WE CAN USE A RANDOM WALK MODEL TO PREDICT A LOG INDEX OR A STOCK LOG PRICE SINCE WE SEE THAT OUR SIMULATION IS SOMEWHAT SIMILAR TO THE REAL LOG INDEX. HOWEVER, WE HAVE TO REMEMBER THAT WE USED THE REAL VALUES OF THE S&P500 INDEX TO CALCULATE THE INITIAL VALUE, THE DRIFT AND THE VOLATILITY OF THE SHOCK. UNFORTUNATELY, WE HAVE NO INFORMATION ABOUT THE FUTURE, SO WE CANNOT EASILY ESTIMATE A NEW DRIFT FOR THE FUTURE, AND WE DO NOT HAVE FUTURE DATA TO ESTIMATE THE VOLATILITY OF THE SHOCK. IN SUM, I DO NOT BELIEVE THAT THE RANDOM WALK MODEL WILL BE A GOOD MODEL TO FORECAST A STOCK INDEX, UNLESS WE ARE VERY LUCKY IN GUESSING WHERE THE INDEX WILL BE IN THE FUTURE.

ts.plot(lnsp$lnsp, col="blue")
lines(rw1_v2, col="green")
lines(rw1, col="black")

2 SIMULATING A RANDOM WALK AND AN AR(1) PROCESS

2.1 Create the dataset with simulation

Create a dataset of 1000 observations

obs = 1:1000

Create a Gaussian white noise with variance 0.09. Call this variable e1. The Gaussian white noise is a normal variable with mean=0 and a specific variance. We will use Gaussian white noises to model financial returns.

e1=rnorm(1000,0,sqrt(0.09))
N<-length(obs)
head(e1)

## [1] -0.08316722  0.42535166 -0.10202423 -0.48603217  0.62537529  0.25338123

tail(e1)

## [1]  0.05830977 -0.14033679 -0.26894940  0.47577353 -0.02903452 -0.35715982

hist(e1)

2.2 Simulating an AR(1) with \(\phi_0\)=1 and \(\phi_1\)=0.7

Following the formula for a simple or first-order autoregressive model, AR(1):

\[ Y_t = φ_0 + φ_1 Y_{t−1} + ε_t \]

Use the Gaussian white noise created (e1) as the shock
Declare a variable for the coefficient φ1 (phi1) and set it to 0.7
Declare a variable for the coefficient φ0 (phi0) and set it to 1
Using simple simulation, generate the variable y1 as an AR(1) model using the above terms
The variable y1 is an AR(1) process or model. Graph this variable over time

You can run the following code for the previous steps:

e1=rnorm(1000,0,sqrt(0.09))

y1<-single(length=N)
phi1 <- 0.7
phi0 <- 1

# Now from period 2 I generate the values of the AR(1) series following the formula:
for (i in 2:N){
  y1[i] <- phi0 + phi1*y1[i-1] + e1[i-1] 
}

ts.plot(y1, col="darkblue")

a). WHAT DO YOU SEE? LOOKING AT THE GRAPH, DOES THE MEAN OF THE SERIES CONVERGE TO A VALUE? IF YES, WHICH VALUE?

I CAN SEE THAT THE VALUES OF y1 CONVERGE TO A VALUE OF ABOUT 3.3. I CAN SEE THAT AFTER THE FIRST FEW PERIODS, THE MEAN OF y1 LOOKS ABOUT THE SAME FOR ANY TIME PERIOD. I CAN ALSO SEE THAT THE STANDARD DEVIATION OF THE SERIES LOOKS ABOUT THE SAME OVER TIME. IN OTHER WORDS, THE SERIES LOOKS LIKE A STATIONARY SERIES.

b). WHAT IS THE EXPECTED VALUE OF y1 ACCORDING TO THE AR(1) MODEL? PROVIDE THE FORMULA AND CALCULATE THE EXPECTED VALUE.

ACCORDING TO THE READING “INTRODUCTION TO TIME SERIES FINANCIAL MODELS”, THE EXPECTED VALUE OF STATIONARY SERIES IS THE FOLLOWING:

\(E\left[Y_{t}\right]=\frac{\phi_{0}}{(1-\phi_{1})}=\mu_{y}\)

THEN, WE CAN CALCULATE THE EXPECTED VALUE OF y1 USING THE \(\phi_0\) and \(\phi_1\) AS FOLLOWS:

\(E\left[Y_{t}\right]=\frac{1}{(1-0.7)}=3.3333333\)

ey1 <- phi0 / (1-phi1)
cat("The expected value of y1 is ",ey1)

## The expected value of y1 is  3.333333
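The theoretical mean can be compared with the sample mean of a long simulated series. The sketch below also compares the sample standard deviation with the standard result for a stationary AR(1), \(\sigma_{Y} = \sigma_{\varepsilon}/\sqrt{1-\phi_1^2}\), which is not derived in this workshop and is included here as an assumption. The names ending in _demo and the number of observations are choices for illustration.

```r
set.seed(10)          # arbitrary seed, only for reproducibility
n_obs      <- 20000   # a long series, so sample moments are close to theory
phi0_demo  <- 1
phi1_demo  <- 0.7
sigma_demo <- 0.3     # sqrt(0.09), the volatility of the white noise

mean_theory <- phi0_demo / (1 - phi1_demo)
sd_theory   <- sigma_demo / sqrt(1 - phi1_demo^2)

# Simulate the AR(1), starting at its theoretical mean to avoid a warm-up period
e <- rnorm(n_obs, mean = 0, sd = sigma_demo)
y <- numeric(n_obs)
y[1] <- mean_theory
for (i in 2:n_obs) {
  y[i] <- phi0_demo + phi1_demo * y[i - 1] + e[i]
}

cat("Theoretical mean:", mean_theory, " Sample mean:", mean(y), "\n")
cat("Theoretical SD:", sd_theory, " Sample SD:", sd(y), "\n")
```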

c). IS THE EXPECTED VALUE OF y1 SIMILAR TO THE MEAN YOU SAW IN THE FIRST GRAPH?

YES, I HAD GUESSED A MEAN VALUE OF 3.3, AND THE THEORETICAL EXPECTED VALUE IS 3.3333.

d). IS THE VOLATILITY (STANDARD DEVIATION) SIMILAR IN ALL TIME PERIODS?

YES, THE VOLATILITY OF THE SERIES SEEMS TO BE SIMILAR FOR ALL PERIODS, EXCEPT FOR THE FIRST FEW DAYS.

2.3 Simulating an AR(1) with \(\phi_0\)=0 and \(\phi_1\)=0.7

Now generate another series y2 with the same parameters as y1, but change the constant phi0 from 1 to zero.

You can do this in R as follows:

e2=rnorm(1000,0,sqrt(0.09))

y2<-single(length=N)
phi1 <- 0.7
phi0 <- 0

# Now from period 2 I generate the values of the AR(1) series following the formula:
for (i in 2:N){
  y2[i] <- phi0 + phi1*y2[i-1] + e2[i-1] 
}

ts.plot(y2, col="darkred")

Graph y2 over time.

(b) Is this time series also an AR(1)?

YES, IT LOOKS LIKE AN AR(1) MODEL SINCE THE MEAN OF THE SERIES LOOKS ABOUT THE SAME OVER TIME. IN OTHER WORDS, IT LOOKS LIKE A STATIONARY SERIES. ALSO, IT IS AN AR(1) MODEL BECAUSE I FOLLOWED THE AR(1) EQUATION TO SIMULATE THE SERIES.

(c) WHAT IS THE EXPECTED VALUE OF y2 ACCORDING TO THE AR(1) MODEL? PROVIDE THE FORMULA AND CALCULATE THE EXPECTED VALUE.

THE MEAN OF THE SERIES LOOKS ABOUT THE SAME OVER TIME, AND IT LOOKS TO BE CLOSE TO ZERO. HERE I CALCULATE THE EXACT THEORETICAL EXPECTED VALUE:

\(E\left[Y_{t}\right]=\frac{0}{(1-0.7)}=0\)

ey2 <- phi0 / (1-phi1)
cat("The expected value of y2 is ",ey2)

## The expected value of y2 is  0

2.4 Simulating a Random walk with \(\phi_0\)=0

Now generate another series y3, but now modify the parameters to simulate a Random Walk process with phi0=0.

THIS IS A RANDOM WALK WITHOUT DRIFT

e3=rnorm(1000,0,sqrt(0.09))
y3<-single(length=N)
phi1 <- 1
phi0 <- 0

# Now from period 2 I generate the values of the random walk following the formula:
for (i in 2:N){
  y3[i] <- phi0 + phi1*y3[i-1] + e3[i-1] 
}

ts.plot(y3, col="yellow")

Graph y3 over time. Observe its behaviour.

(b) IS THE MEAN OF THE SERIES CONSTANT OVER TIME?

NO. IF I SEE PERIODS OF 200, THE MEAN OF THE SERIES CHANGES.

(c) IS THE VOLATILITY OF THE SERIES CONSTANT OVER TIME? EXPLAIN

IT IS HARD TO TELL WHETHER VOLATILITY IS ABOUT THE SAME OR NOT. AT LEAST IN MY SIMULATION I SEE THAT AT THE END OF THE SERIES THE MOVEMENTS ARE BIGGER.
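A single simulated path makes it hard to judge the volatility of a random walk by eye. One way to see how it evolves is to simulate many paths and measure the dispersion across paths at two different points in time. The sketch below does this for a random walk without drift and shock variance 0.09; the number of paths and the chosen time points (100 and 900) are assumptions for illustration.

```r
set.seed(99)      # arbitrary seed, only for reproducibility
n_steps <- 1000
n_paths <- 2000
sigma_e <- sqrt(0.09)

# Each column is one random walk without drift: the cumulative sum of its shocks
shocks <- matrix(rnorm(n_steps * n_paths, mean = 0, sd = sigma_e),
                 nrow = n_steps)
paths <- apply(shocks, 2, cumsum)

# Standard deviation across paths at period 100 and at period 900
sd_t100 <- sd(paths[100, ])
sd_t900 <- sd(paths[900, ])
cat("SD at t=100:", sd_t100, " theory:", sqrt(100) * sigma_e, "\n")
cat("SD at t=900:", sd_t900, " theory:", sqrt(900) * sigma_e, "\n")
```

Under the random walk model the standard deviation at period t is \(\sqrt{t}*\sigma_{\varepsilon}\), so it keeps growing with time instead of settling at a constant value.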

2.5 Simulating an AR(1) with \(\phi_0\)=0 and \(\phi_1\)=0.99

Now generate another series y4 as an AR(1) process with \(\phi_0\)=0 and \(\phi_1\)=0.99

N=1000
e4=rnorm(N,0,sqrt(0.01))

y4<-single(length=N)
phi1 <- 0.99
phi0 <- 0

# Now from period 2 I generate the values of the AR(1) series following the formula:
for (i in 2:N){
  y4[i] <- phi0 + phi1*y4[i-1] + e4[i-1] 
}

ts.plot(y4, col="orange")

Graph y4 over time.

(b) IS THIS SERIES AN AR(1) ? EXPLAIN WHY YES OR WHY NOT

IN THEORY, IT IS A STATIONARY AR(1) SINCE phi1<1. HOWEVER, THE PLOT DOES NOT LOOK LIKE A CLEARLY STATIONARY SERIES. IF I TAKE PERIODS OF 100, THE MEAN CHANGES FROM PERIOD TO PERIOD. IF I TAKE ONLY 2 PERIODS OF 500, THE MEAN OF BOTH PERIODS MIGHT BE THE SAME. WHAT MIGHT HAPPEN IS THAT WHEN \(\phi_1\) IS VERY CLOSE TO 1, BUT LESS THAN 1, WE NEED TO WAIT FOR MORE PERIODS TO SEE THE STATIONARITY OF THE SERIES.

(c) DOES THE MEAN CONVERGE TO A SPECIFIC VALUE? IF YES, TO WHICH ONE?

AS I MENTIONED BEFORE, IN THE LONG RUN, YES, IT SEEMS TO CONVERGE TO A VALUE, BUT WHEN LOOKING AT SHORT PERIODS, THE MEANS OF THE PERIODS ARE DIFFERENT.
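One way to quantify how long the series takes to return to its mean is the half-life of a shock. In an AR(1), the expected deviation from the mean after h periods is \(\phi_1^h\) times the current deviation, so the deviation halves when \(\phi_1^h = 0.5\), that is, \(h = \ln(0.5)/\ln(\phi_1)\). The function name half_life below is a hypothetical helper for illustration.

```r
# Hypothetical helper: number of periods needed for the expected deviation
# from the mean to shrink to half, for an AR(1) with 0 < phi1 < 1
half_life <- function(phi1) {
  log(0.5) / log(phi1)
}

hl_07  <- half_life(0.7)
hl_099 <- half_life(0.99)
cat("Half-life with phi1 = 0.7: ", round(hl_07, 2), "periods\n")
cat("Half-life with phi1 = 0.99:", round(hl_099, 2), "periods\n")
```

With \(\phi_1\)=0.7 the half-life is about 2 periods, while with \(\phi_1\)=0.99 it is about 69 periods, which helps explain why short windows of a series like y4 can show different means.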

2.6 Checking for weak stationarity

A time series is weakly stationary if:

its expected value is constant over time (about the same over time)
its variance is constant over time
the covariance (or correlation) between \(Y_t\) and \(Y_{t+h}\) depends only on the lag h, not on the time t
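The third condition can be explored with the sample autocorrelation function. The sketch below simulates a stationary AR(1) with \(\phi_1\)=0.7 using arima.sim, computes the autocorrelations at lags 1 to 3 separately for each half of the sample, and compares them with the standard theoretical value \(\phi_1^h\), which is stated here as an assumption since it is not derived in this workshop. The helper name acf_lags and the sample size are choices for illustration.

```r
set.seed(5)          # arbitrary seed, only for reproducibility
n_obs     <- 5000
phi1_demo <- 0.7

# Stationary AR(1) with zero mean, simulated with the stats package
y <- as.numeric(arima.sim(model = list(ar = phi1_demo), n = n_obs))

# Hypothetical helper: sample autocorrelations at lags 1..max_lag
acf_lags <- function(series, max_lag = 3) {
  as.numeric(acf(series, lag.max = max_lag, plot = FALSE)$acf)[-1]
}

acf_first_half  <- acf_lags(y[1:(n_obs / 2)])
acf_second_half <- acf_lags(y[(n_obs / 2 + 1):n_obs])
acf_theory      <- phi1_demo^(1:3)

print(round(rbind(acf_first_half, acf_second_half, acf_theory), 3))
```

For a weakly stationary series, the two halves should give similar autocorrelations at each lag, because the correlation depends on the lag and not on where in the sample it is measured.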

2.6.1 Checking for weak stationarity:

You have to install the packages zoo and dplyr
Use the function rollapply with a window of 12 time periods to compute rolling means for the series y1, y2, y3, and y4.
Graph the mean of each series to see how the moving means move for each window

library(zoo)

#I use the rollapply function to get the mean of each series:
y1_mean<-rollapply(y1,12,mean)
y2_mean<-rollapply(y2,12,mean)
y3_mean<-rollapply(y3,12,mean)
y4_mean<-rollapply(y4,12,mean)

# I create a sequence from 1 to 989 (1000-12+1)
x <- seq(1,989,1)

# I merge the rolling means in one dataset:
# as_tibble() replaces tbl_df(), which is deprecated in recent versions of dplyr
all_means <- as_tibble(data.frame(x, y1_mean, y2_mean, y3_mean, y4_mean))
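To see what a rolling mean calculates, here is a minimal base R sketch that does not use the zoo package. It relies on embed(), which places each window of 12 consecutive values in one row of a matrix, so the row means are the rolling means. The series y_demo is simulated white noise used only for illustration.

```r
set.seed(3)        # arbitrary seed, only for reproducibility
y_demo <- rnorm(1000)
window <- 12

# Each row of embed(y_demo, window) holds 12 consecutive values of the series
roll_mean_base <- rowMeans(embed(y_demo, window))

length(roll_mean_base)  # 1000 - 12 + 1 = 989 windows
# The first rolling mean is the mean of the first 12 observations
all.equal(roll_mean_base[1], mean(y_demo[1:12]))
```

This is why the sequence x used for plotting goes from 1 to 989: a series of 1000 observations has 989 complete windows of length 12.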

Now we plot the 12-day rolling means:

# install.packages("ggplot2")
library(ggplot2)

p <- ggplot(all_means, aes(x = x))

p+ geom_line(aes(y=y1_mean), colour="darkblue")+
  geom_line(aes(y=y2_mean), color="darkred")+
  geom_line(aes(y=y3_mean), color="yellow")+
  geom_line(aes(y=y4_mean), color="orange")

Using the rollapply function, compute a series for the moving standard deviation of y1, y2, y3, y4 using 12-day windows.
Graph the moving standard deviation of each series to see how the standard deviation moves over time for each variable

y1_sd<-rollapply(y1,12,sd)
y2_sd<-rollapply(y2,12,sd)
y3_sd<-rollapply(y3,12,sd)
y4_sd<-rollapply(y4,12,sd)


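# as_tibble() replaces tbl_df(), which is deprecated in recent versions of dplyr
all_sd <- as_tibble(data.frame(x, y1_sd, y2_sd, y3_sd, y4_sd))
pl <- ggplot(data = all_sd, aes(x = x))

# Same colors as in the previous plots: y3 in yellow and y4 in orange
pl+ geom_line(aes(y=y1_sd), color="darkblue")+
  geom_line(aes(y=y2_sd), color="darkred")+
  geom_line(aes(y=y3_sd), color="yellow")+
  geom_line(aes(y=y4_sd), color="orange")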

I WILL PLOT EACH MOVING STANDARD DEVIATION SEPARATELY TO BETTER APPRECIATE THE STANDARD DEVIATION OF EACH VARIABLE

ts.plot(y1_sd)

ts.plot(y2_sd)

ts.plot(y3_sd)

ts.plot(y4_sd)

WHICH OF THE SERIES (y1, y2, y3, y4) IS (ARE) WEAKLY STATIONARY AND WHICH IS (ARE) NOT? BRIEFLY EXPLAIN

SERIES y1 AND y2 CLEARLY LOOK LIKE STATIONARY SERIES SINCE:

A) LOOKING AT THE MOVING MEANS OF THE SERIES, y1, AND y2 HAVE A CONSTANT MEAN.

B) LOOKING AT THE MOVING STANDARD DEVIATIONS THE VOLATILITY SEEMS ABOUT THE SAME OVER TIME

SERIES y3 DOES NOT LOOK STATIONARY SINCE ITS MOVING MEAN DOES NOT SEEM TO BE CONSTANT.

IT IS HARD TO SAY SOMETHING ABOUT y4. WITH SOME SIMULATIONS IT SEEMS TO HAVE A CONSTANT MEAN, BUT WITH OTHER SIMULATIONS IT SEEMS TO HAVE DIFFERENT MEANS FOR DIFFERENT TIME PERIODS. HOWEVER, WE KNOW THAT y4 IS A STATIONARY SERIES SINCE ITS \(\phi_1\)=0.99.

Workshop 2 SOLUTION, Financial Econometrics II

Alberto Dorantes, Ph.D.

Mar 2, 2022