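The random walk hypothesis in finance (Fama, 1965) states that the natural logarithm of stock prices behaves like a random walk with drift. A random walk is a series (or variable) whose changes cannot be predicted from its past values: each value equals the previous value plus a constant drift and a random shock.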
$$Y_t = \phi_0 + Y_{t-1} + \varepsilon_t$$
If we want to simulate a random walk, we need the values of the following parameters/variables:
$Y_0$, the first value of the series; $\phi_0$, the drift of the series; and $\sigma_\varepsilon$, the standard deviation (volatility) of the random shock.
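Substituting the random walk equation into itself shows what each of these three inputs controls. A short derivation, assuming the shocks are independent with mean zero and variance $\sigma_\varepsilon^2$:

```latex
\begin{aligned}
Y_t &= \phi_0 + Y_{t-1} + \varepsilon_t \\
    &= Y_0 + t\,\phi_0 + \sum_{i=1}^{t} \varepsilon_i \\
E[Y_t] &= Y_0 + t\,\phi_0 \\
\mathrm{Var}(Y_t) &= t\,\sigma_\varepsilon^2
\end{aligned}
```

Under these assumptions the starting value sets the level, the drift sets the slope of the expected path, and the shock volatility determines how quickly the uncertainty around that path grows.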
## [1] "^GSPC"
Now, we create a variable for a random walk with a drift trying to model the log of the S&P500.
Reviewing the random walk equation again:
$$Y_t = \phi_0 + Y_{t-1} + \varepsilon_t$$
## The value for phi0 is 0.0004607493
Then we can estimate sigma as: sigma = StDev(lnsp) / sqrt(N).
## The volatility of the log is = 0.3889621
## The volatility for the shock is = 0.00702918
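The code that produced these numbers is not shown. One way to obtain estimates of this kind is sketched below on a synthetic stand-in series. It assumes that `phi0` is the average daily change of the log series (an assumption, since the original calculation is not shown) and that `sigma` follows the formula above. The names `lnsp_demo`, `phi0_hat` and `sigma_hat` and all parameter values are hypothetical.

```r
set.seed(123)                       # arbitrary seed, for reproducibility
N <- 3062                           # assumed number of daily observations

# Synthetic stand-in for the log of the index (NOT the real S&P500 data)
lnsp_demo <- 7 + cumsum(c(0, rnorm(N - 1, mean = 0.0005, sd = 0.007)))

# Assumed drift estimate: average daily change of the log series
phi0_hat <- (lnsp_demo[N] - lnsp_demo[1]) / (N - 1)

# Shock volatility as in the text: sigma = StDev(lnsp) / sqrt(N)
sigma_hat <- sd(lnsp_demo) / sqrt(N)

cat("The value for phi0 is", phi0_hat, "\n")
cat("The volatility for the shock is =", sigma_hat, "\n")
```

With the real data, `lnsp_demo` would be replaced by the numeric values of `lnsp$lnsp`.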
Now you are ready to start the simulation of the random walk using rw1:
$$rw1_t = \phi_0 + rw1_{t-1} + \varepsilon_t$$
The shock over time:
Does the shock behave like a normally distributed variable?
As expected, the shock behaves similarly to a normally distributed variable.
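The chunk that created `shock` and its plots is not shown. A minimal sketch of how such a check could look, assuming the shock is drawn with `rnorm()` using the volatility reported above and N = 3062 observations (an assumption inferred from the 3060 residual degrees of freedom in the regression output):

```r
set.seed(123)                        # arbitrary seed, for reproducibility
N <- 3062                            # assumed number of observations
sigma <- 0.00702918                  # shock volatility reported above
shock <- rnorm(N, mean = 0, sd = sigma)

ts.plot(shock, col = "darkgray")     # the shock over time
hist(shock, breaks = 50)             # roughly bell-shaped if close to normal
qqnorm(shock); qqline(shock)         # points near the line suggest normality
```

A histogram and a normal Q-Q plot are only visual checks; they do not prove normality.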
We start the random walk with the first value of the log of the S&P500. Then, from day 2 we do the simulation according to the previous formula and using the random shock just created:
# Create an empty vector for the random walk:
rw1 <- numeric(N)
# The first value of the random walk equals the first real log value of the S&P500:
rw1[1] <- as.numeric(lnsp$lnsp[1])
# From day 2 onward, generate the values of the random walk following the formula:
for (i in 2:N){
  rw1[i] <- phi0 + rw1[i-1] + shock[i]
}
lnsp$rw1 <- rw1
# Second version of the random walk, this time without the drift phi0:
rw1_v2 <- numeric(N)
rw1_v2[1] <- as.numeric(lnsp$lnsp[1])
for (i in 2:N){
  rw1_v2[i] <- rw1_v2[i-1] + shock[i]
}
ts.plot(lnsp$lnsp, col="blue")
# I plot both lines to compare
lines(rw1_v2, col="green")
WHAT DO YOU OBSERVE IN THIS PLOT? EXPLAIN IN YOUR OWN WORDS.
IN THE FIRST PLOT, THE TWO SERIES LOOK VERY SIMILAR. IN THE SECOND PLOT, WHERE THE RANDOM WALK HAS NO DRIFT, THE DIFFERENCE BETWEEN THE TWO SERIES IS EVIDENT.
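Because each step of the loop only adds the drift and one shock to the previous value, the same path can also be written as a cumulative sum. The sketch below uses made-up values for `y0`, `phi0` and `shock` (not the S&P500 data) and checks that the loop and the vectorized form agree:

```r
set.seed(1)                          # arbitrary seed
N <- 500                             # illustrative length
phi0 <- 0.0005                       # illustrative drift
y0 <- 7                              # illustrative starting value
shock <- rnorm(N, 0, 0.007)

# Loop version, as in the text
rw_loop <- numeric(N)
rw_loop[1] <- y0
for (i in 2:N) rw_loop[i] <- phi0 + rw_loop[i - 1] + shock[i]

# Vectorized version: starting value plus the running sum of (drift + shock)
rw_vec <- y0 + c(0, cumsum(phi0 + shock[-1]))

max(abs(rw_loop - rw_vec))           # any difference is floating-point rounding
```

Setting `phi0 <- 0` in this sketch gives the driftless version.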
regmodel<-lm(lnsp$lnsp ~ lnsp$rw1)
# To print the model in the Rmd we have to do the following:
s_regmodel <- summary(regmodel)
s_regmodel
##
## Call:
## lm(formula = lnsp$lnsp ~ lnsp$rw1)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.26507 -0.07703 -0.02062  0.06304  0.32482
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.935329   0.029303   66.05   <2e-16 ***
## lnsp$rw1    0.712414   0.003722  191.43   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.108 on 3060 degrees of freedom
## Multiple R-squared:  0.9229, Adjusted R-squared:  0.9229
## F-statistic: 3.665e+04 on 1 and 3060 DF,  p-value: < 2.2e-16
DOES THE REGRESSION RESULT MAKE SENSE? EXPLAIN WHY YES OR WHY NOT?
THE DEPENDENT VARIABLE IS THE LOG OF THE S&P500 AND THE INDEPENDENT VARIABLE IS THE SIMULATED RANDOM WALK. THE SLOPE (0.71) IS POSITIVE AND STATISTICALLY SIGNIFICANT, SO THE TWO SERIES APPEAR POSITIVELY AND LINEARLY RELATED. HOWEVER, BOTH SERIES ARE NON-STATIONARY AND TREND UPWARD, SO THIS RELATIONSHIP MAY BE SPURIOUS.
DOES THE LOG OF THE S&P500 LOOK LIKE A RANDOM WALK? WHY YES OR WHY NOT?
DO YOU THINK THAT WE CAN USE THIS TYPE OF SIMULATION TO PREDICT STOCK PRICES OR INDEXES? WHY YES OR WHY NOT?
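One caveat when reading the regression above: two series that both drift upward can show a high R-squared even when they are generated independently. The sketch below uses illustrative parameter values that are not estimated from the data. It regresses two independent simulated random walks in levels and then in first differences:

```r
set.seed(42)                         # arbitrary seed
N <- 3000                            # illustrative length

# Two INDEPENDENT random walks with drift (illustrative parameters)
rw_a <- cumsum(0.0005 + rnorm(N, 0, 0.007))
rw_b <- cumsum(0.0005 + rnorm(N, 0, 0.007))

# Regression in levels: both series trend upward
r2_levels <- summary(lm(rw_a ~ rw_b))$r.squared

# Regression in first differences: the trend is removed
r2_diffs <- summary(lm(diff(rw_a) ~ diff(rw_b)))$r.squared

cat("R-squared in levels:     ", r2_levels, "\n")
cat("R-squared in differences:", r2_diffs, "\n")
```

Comparing the two R-squared values gives a rough sense of how much of the fit in levels comes from the shared trend rather than from a genuine relationship.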
# Assumption: N is set to 1000 here to match the 1000 shocks and the 989 rolling windows used later
N <- 1000
e1 <- rnorm(N, 0, sqrt(0.09))
y1 <- numeric(N)
phi1 <- 0.7
phi0 <- 1
# From period 2 onward, generate the values of the AR(1) series following the formula:
for (i in 2:N){
  y1[i] <- phi0 + phi1*y1[i-1] + e1[i-1]
}
ts.plot(y1, col="darkblue")
WHAT DO YOU SEE? LOOKING AT THE GRAPH, DOES THE MEAN OF THE SERIES CONVERGE TO A VALUE? IF YES, WHICH VALUE?
IT IS A STATIONARY SERIES. LOOKING AT THE GRAPH, THE MEAN SHOULD BE APPROXIMATELY 3.5.
WHAT IS THE EXPECTED VALUE OF y1 ACCORDING TO THE AR(1) MODEL? PROVIDE THE FORMULA AND CALCULATE THE EXPECTED VALUE.
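For a stationary AR(1), taking expectations on both sides of $Y_t = \phi_0 + \phi_1 Y_{t-1} + \varepsilon_t$ and setting $E[Y_t] = E[Y_{t-1}]$ gives $E[Y_t] = \phi_0/(1-\phi_1)$. The sketch below evaluates the formula with the parameters used for y1 and compares it with the mean of a fresh simulation. The seed and the name `y_demo` are arbitrary choices, so the sample mean will not match the original run exactly.

```r
set.seed(2021)                        # arbitrary seed
N <- 1000
phi0 <- 1
phi1 <- 0.7

# Expected value of a stationary AR(1): phi0 / (1 - phi1) = 1 / 0.3, about 3.33
expected_value <- phi0 / (1 - phi1)

# Fresh simulation with the same recursion as y1
e <- rnorm(N, 0, sqrt(0.09))
y_demo <- numeric(N)
for (i in 2:N) y_demo[i] <- phi0 + phi1 * y_demo[i - 1] + e[i - 1]

cat("Expected value:", expected_value, "\n")
cat("Sample mean:   ", mean(y_demo), "\n")
```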
IS THE EXPECTED VALUE OF y1 SIMILAR TO THE MEAN YOU SAW IN THE FIRST GRAPH?
YES, THE VALUE IS ALMOST THE SAME.
IS THE VOLATILITY (STANDARD DEVIATION) SIMILAR IN ALL TIME PERIODS?
YES, THE VOLATILITY IS ALMOST THE SAME.
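A roughly constant volatility is what the AR(1) model implies when $|\phi_1| < 1$. Taking the variance of both sides of the AR(1) equation, assuming the shock is independent of past values and using the shock variance 0.09 from the simulation:

```latex
\begin{aligned}
\mathrm{Var}(Y_t) &= \phi_1^2\,\mathrm{Var}(Y_{t-1}) + \sigma_\varepsilon^2 \\
\mathrm{Var}(Y_t) &= \frac{\sigma_\varepsilon^2}{1-\phi_1^2}
                   = \frac{0.09}{1-0.7^2}
                   = \frac{0.09}{0.51} \approx 0.176 \\
\mathrm{SD}(Y_t) &\approx 0.42
\end{aligned}
```

The second line uses stationarity, $\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t-1})$. The result does not depend on $t$ or on $\phi_0$.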
e2 <- rnorm(1000, 0, sqrt(0.09))
y2 <- numeric(N)
phi1 <- 0.7
phi0 <- 0
# From period 2 onward, generate the values of the AR(1) series following the formula:
for (i in 2:N){
  y2[i] <- phi0 + phi1*y2[i-1] + e2[i-1]
}
ts.plot(y2, col="darkred")
WHAT IS THE EXPECTED VALUE OF y2 ACCORDING TO THE AR(1) MODEL? PROVIDE THE FORMULA AND CALCULATE THE EXPECTED VALUE.
$$E[Y_t] = \frac{\phi_0}{1-\phi_1}$$
## [1] 0
LET'S NOT FORGET THAT THE MEAN MUST EQUAL 0: BECAUSE $\phi_0 = 0$, THE SERIES FLUCTUATES AROUND 0 AND SHOWS NO GROWTH.
Graph y3 over time.
e3 <- rnorm(1000, 0, sqrt(0.09))
y3 <- numeric(N)
phi1_3 <- 1
phi0_3 <- 0
# From period 2 onward, generate the values of the random walk (phi1 = 1) following the formula:
for (i in 2:N){
  y3[i] <- phi0_3 + phi1_3*y3[i-1] + e3[i-1]
}
ts.plot(y3, col="yellow")
DOES THE MEAN CONVERGE TO A SPECIFIC VALUE? IF YES, TO WHICH ONE?
NO, THE MEAN IS NOT CONSTANT BECAUSE THE SERIES IS NON-STATIONARY.
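With $\phi_1 = 1$ and $\phi_0 = 0$ the series is a random walk without drift. It wanders without settling because its variance grows with time: $\mathrm{Var}(Y_t) = t\,\sigma_\varepsilon^2$. A sketch that illustrates this by simulating many paths and measuring the variance across paths at two dates. The number of paths, the path length and the seed are arbitrary choices.

```r
set.seed(7)                           # arbitrary seed
n_paths <- 2000                       # illustrative number of simulated paths
n_steps <- 200                        # illustrative path length

# Each column holds the shocks of one path, with variance 0.09 as in the text
shocks <- matrix(rnorm(n_steps * n_paths, 0, sqrt(0.09)), nrow = n_steps)

# Cumulative sums down each column give random walks without drift
paths <- apply(shocks, 2, cumsum)

var_t50 <- var(paths[50, ])           # theory: 50 * 0.09 = 4.5
var_t200 <- var(paths[200, ])         # theory: 200 * 0.09 = 18

cat("Variance across paths at t = 50: ", var_t50, "\n")
cat("Variance across paths at t = 200:", var_t200, "\n")
```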
e4 <- rnorm(1000, 0, sqrt(0.09))
y4 <- numeric(N)
phi1_4 <- 0.99
phi0_4 <- 0
# From period 2 onward, generate the values of the AR(1) series following the formula:
for (i in 2:N){
  y4[i] <- phi0_4 + phi1_4*y4[i-1] + e4[i-1]
}
ts.plot(y4, col="orange")
A time series is weakly stationary if:
i. Its expected value is constant over time (about the same over time).
ii. Its variance is constant over time.
iii. The covariance (or correlation) between $Y_t$ and $Y_{t+h}$ depends only on the lag h, not on t.
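Condition iii can be illustrated with the autocorrelation function. For a stationary AR(1), the theoretical autocorrelation at lag h is $\phi_1^h$, whatever the value of t. A sketch using `ARMAacf()` from the stats package that ships with R, with $\phi_1 = 0.7$ as in y1 and y2:

```r
phi1 <- 0.7
lags <- 0:5

# Theoretical autocorrelations of an AR(1) at lags 0 to 5
theoretical <- ARMAacf(ar = phi1, lag.max = 5)

# The same values computed directly as phi1^h
by_hand <- phi1^lags

print(round(cbind(lag = lags, theoretical, by_hand), 4))
```

For a simulated series, `acf(y1)` would show the sample counterpart, which should decay at roughly the same rate.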
#I use the rollapply function to get the mean of each series:
y1_mean<-rollapply(y1,12,mean)
y2_mean<-rollapply(y2,12,mean)
y3_mean<-rollapply(y3,12,mean)
y4_mean<-rollapply(y4,12,mean)
x <- seq(1,989,1)
all_means <- tibble::as_tibble(data.frame(x, y1_mean, y2_mean, y3_mean, y4_mean))
# install.packages("ggplot2")
library(ggplot2)
p <- ggplot(all_means, aes(x = x))
p+ geom_line(aes(y=y1_mean), colour="darkblue")+
geom_line(aes(y=y2_mean), color="darkred")+
geom_line(aes(y=y3_mean), color="yellow")+
  geom_line(aes(y=y4_mean), color="orange")
# I use the rollapply function to get the standard deviation of each series:
y1_sd<-rollapply(y1,12,sd)
y2_sd<-rollapply(y2,12,sd)
y3_sd<-rollapply(y3,12,sd)
y4_sd<-rollapply(y4,12,sd)
all_sd <- tibble::as_tibble(data.frame(x, y1_sd, y2_sd, y3_sd, y4_sd))
pl <- ggplot(data = all_sd, aes(x = x))
# Same colors as in the previous plots: y3 in yellow, y4 in orange
pl+ geom_line(aes(y=y1_sd), color="darkblue")+
  geom_line(aes(y=y2_sd), color="darkred")+
  geom_line(aes(y=y3_sd), color="yellow")+
  geom_line(aes(y=y4_sd), color="orange")
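`rollapply()` comes from the zoo package. As a cross-check that needs no extra packages, the same 12-period rolling statistics can be sketched in base R with `embed()`, which lays out every window as one row of a matrix. The series `y_demo` below is a made-up AR(1) path, not one of the series above.

```r
set.seed(99)                           # arbitrary seed
N <- 1000
width <- 12                            # same window width as in the text

# Made-up AR(1) series with phi1 = 0.7 and shock variance 0.09
e <- rnorm(N, 0, sqrt(0.09))
y_demo <- numeric(N)
for (i in 2:N) y_demo[i] <- 0.7 * y_demo[i - 1] + e[i - 1]

# Each row of 'windows' holds one block of 12 consecutive observations
windows <- embed(y_demo, width)

roll_mean <- rowMeans(windows)         # rolling mean, one value per window
roll_sd <- apply(windows, 1, sd)       # rolling standard deviation

length(roll_mean)                      # N - width + 1 windows
```

With N = 1000 and a width of 12 there are 989 windows, which is why `x` runs from 1 to 989 in the plots above.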