Financial Markets Paper

companylist <- read.csv("~/Desktop/York Econ/ECON5520/companylist.csv")
df <- companylist[,1:4]
df <- df %>% dplyr::filter(!grepl('ETF', Name)) %>% mutate(Quantile = ntile(MarketCap, n=5))
df.small <- df %>% dplyr::filter(Quantile=='1')
df.mid <- df %>% dplyr::filter(Quantile=='3')
df.large <- df %>% dplyr::filter(Quantile=='5')
set.seed(10)
df1 <- sample_n(df.small, 10)
df2 <- sample_n(df.mid, 10)
set.seed(99)
df3 <- sample_n(df.large, 10)
kable(df1)

Symbol	Name	LastSale	MarketCap	Quantile
SLVO	Credit Suisse AG	6.7100	22487847	1
STAF	Staffing 360 Solutions, Inc.	1.6500	13437877	1
NAOV	NanoVibronix, Inc.	4.2400	18140988	1
FTEK	Fuel Tech, Inc.	1.3500	32630290	1
DCAR	DropCar, Inc.	0.5600	5438969	1
CTIB	CTI Industries Corporation	2.8200	10464879	1
ATAI	ATA Inc.	1.1400	12040866	1
KTOV	Kitov Pharma Ltd.	1.1400	11957951	1
CLIR	ClearSign Combustion Corporation	1.0100	26964234	1
CDMOP	Avid Bioservices, Inc.	25.8345	18084150	1

kable(df2)

Symbol	Name	LastSale	MarketCap	Quantile
OXLC	Oxford Lane Capital Corp.	10.06	383638372	3
LNDC	Landec Corporation	12.33	359669158	3
ALCO	Alico, Inc.	30.00	223884090	3
ALBO	Albireo Pharma, Inc.	30.79	368554083	3
RMTI	Rockwell Medical, Inc.	4.97	283178950	3
FRTA	Forterra, Inc.	4.81	308828955	3
PRTK	Paratek Pharmaceuticals, Inc.	6.63	214046911	3
AVCO	Avalon GloboCare Corp.	3.57	260755481	3
WOOD	iShares S&P Global Timber & Forestry Index Fund	64.09	296095800	3
PETS	PetMed Express, Inc.	22.28	460614960	3

kable(df3)

Symbol	Name	LastSale	MarketCap	Quantile
COLM	Columbia Sportswear Company	103.40	7052344990	5
SSB	South State Corporation	70.87	2506675727	5
AGNC	AGNC Investment Corp.	17.58	9428841237	5
CSCO	Cisco Systems, Inc.	51.30	225824000000	5
MLNX	Mellanox Technologies, Ltd.	108.98	5909469271	5
CHTR	Charter Communications, Inc.	343.75	77465371500	5
OKTA	Okta, Inc.	80.11	8847238169	5
VRNT	Verint Systems Inc.	52.34	3416318056	5
SAVE	Spirit Airlines, Inc.	56.21	3840771797	5
IBOC	International Bancshares Corporation	41.15	2700160331	5

small.tickers <- c("SLVO", "STAF", "NAOV", "FTEK", "DCAR", "CTIB", "ATAI", "KTOV", "CLIR", "CDMOP")
mid.tickers <- c("OXLC", "LNDC", "ALCO", "ALBO", "RMTI", "FRTA", "PRTK", "AVCO", "WOOD", "PETS")
large.tickers <- c("COLM", "SSB", "AGNC", "CSCO", "MLNX", "CHTR", "OKTA", "VRNT", "SAVE", "IBOC")

getSymbols(small.tickers, src= "yahoo", from="2017-12-29", to="2019-01-01")

##  [1] "SLVO"  "STAF"  "NAOV"  "FTEK"  "DCAR"  "CTIB"  "ATAI"  "KTOV"  "CLIR"  "CDMOP"

small.returns <- do.call(merge, lapply(small.tickers, function(x) dailyReturn(Ad(get(x)))))
l.small.returns <- do.call(merge, lapply(small.tickers, function(x) log((1+dailyReturn(Ad(get(x)))))))
names(small.returns) <- small.tickers
names(l.small.returns) <- small.tickers

#Mid adj P
getSymbols(mid.tickers, src= "yahoo", from="2017-12-29", to="2019-01-01")

##  [1] "OXLC" "LNDC" "ALCO" "ALBO" "RMTI" "FRTA" "PRTK" "AVCO" "WOOD" "PETS"

mid.returns <- do.call(merge, lapply(mid.tickers, function(x) dailyReturn(Ad(get(x)))))
l.mid.returns <- do.call(merge, lapply(mid.tickers, function(x) log((1+dailyReturn(Ad(get(x)))))))
names(mid.returns) <- mid.tickers
names(l.mid.returns) <- mid.tickers

#Large adj P
getSymbols(large.tickers, src= "yahoo", from="2017-12-29", to="2019-01-01")

##  [1] "COLM" "SSB"  "AGNC" "CSCO" "MLNX" "CHTR" "OKTA" "VRNT" "SAVE" "IBOC"

large.returns <- do.call(merge, lapply(large.tickers, function(x) dailyReturn(Ad(get(x)))))
l.large.returns <- do.call(merge, lapply(large.tickers, function(x) log((1+dailyReturn(Ad(get(x)))))))
names(large.returns) <- large.tickers
names(l.large.returns) <- large.tickers

Introduction

For completeness, the above section includes the process in which the 30 stocks are chosen from the NASDAQ stock listings, removing ETF’s. Each of the tables include the 10 stocks from small, medium, and large market caps; quantile 1, 3 and 5. Moreover, the process is random and a seed is set in order to replicate the sampling. Furthermore, the getSymbols command from the package ‘quantmod’ extracts 252 adjusted prices from Yahoo! Finance from the last trading day in December 2017 until the last trading day in December 2018. Finally, daily returns are calculated and complied in a data frame using the ‘do.call’ command, and lastly, ticker names are applied to each data frame.

Q1.

small.returns <- small.returns[2:252,]
mid.returns <- mid.returns[2:252,]
large.returns <- large.returns[2:252,]
returns <- cbind(small.returns, mid.returns, large.returns)
l.returns <- cbind(l.small.returns, l.mid.returns, l.large.returns)
l.returns <- l.returns[2:252,]

The above code is nothing but the matrix of 251 returns and log returns.

getSymbols("SPY", src= "yahoo", from="2017-12-29", to="2019-01-01")

## [1] "SPY"

SPY.adj <- do.call(merge, lapply("SPY", function(x) Ad(get(x))))
SPY.returns <- Delt(SPY.adj)
l.SPY.returns <- log((1+Delt(SPY.adj)))
SPY.returns <- SPY.returns[2:252,]
l.SPY.returns <- l.SPY.returns[2:252,]

Furthermore, the S&P 500 is a proxy for the market portfolio and the same process is applied to generate daily returns from adjusted prices using the same time period.

jb.test.5520 <- function (x) 
{
  if ((NCOL(x) > 1) || is.data.frame(x)) 
    stop("x is not a vector or univariate time series")
  if (any(is.na(x))) 
    stop("NAs in x")
  DNAME <- deparse(substitute(x))
  n <- length(x)
  m1 <- sum(x)/(n-1)
  m2 <- sum((x - m1)^2)/(n-1)
  m3 <- sum((x - m1)^3)/(n-1)
  m4 <- sum((x - m1)^4)/(n-1)
  b1 <- (m3/m2^(3/2))^2
  b2 <- (m4/m2^2)
  STATISTIC <- n * b1/6 + n * (b2 - 3)^2/24
  PVAL <- 1 - pchisq(STATISTIC, df = 2)
  PARAMETER <- 2
  METHOD <- "Jarque Bera Test"
  names(STATISTIC) <- "X-squared"
  names(PARAMETER) <- "df"
  structure(list(statistic = STATISTIC, parameter = PARAMETER, 
                 p.value = PVAL, method = METHOD, data.name = DNAME), 
            class = "htest")
}

To provide clarity for the above code I will explain the process. The function above is taken from the package in R called ‘tseries’. However, it is important to note that each of the moments is divided by \(n\), whereas the formula in the handouts is divided by \(n-1\), as this is a consistent estimator. I have taken the code for the original function and divided the moment estimators by \(n-1\), thereby, producing a consistent estimator.

output <- lapply(1:30, function(x) jb.test.5520(returns[,x]))
l.output <- lapply(1:30, function(x) jb.test.5520(l.returns[,x]))
jb.test.output <- do.call(rbind, c(output,l.output)) %>% as.data.frame()
l.return.jb.test.table <- jb.test.output[31:60,] %>% 
  as.data.frame() %>% 
  select(-parameter, -method, -data.name) %>% 
  rename("P-value from Log Returns" = p.value,
         "Test Statistic from Log Returns" = statistic
         )
return.jb.test.table <- jb.test.output[1:30,] %>% 
  as.data.frame() %>% 
  select(-parameter, -method, -data.name) %>% 
  rename("P-value from Net Returns" = p.value,
         "Test Statistic from Net Returns" = statistic
  )
jb.test.table.all.stocks <- methods::cbind2(return.jb.test.table, l.return.jb.test.table)
returns.tickers <- c(small.tickers, mid.tickers, large.tickers) %>% as.data.frame()
q1.table <- cbind(returns.tickers, jb.test.table.all.stocks) %>% 
  as.data.frame()
names(q1.table)[names(q1.table) == '.'] <- 'Tickers'
#SPY JB test
spy.output <- jb.test.5520(SPY.returns)
spy.l.output <- jb.test.5520(l.SPY.returns)
SPY.list <- "SPY"
spy.table <- do.call(cbind, c(spy.output, spy.l.output)) %>% 
  as.data.frame() %>% 
  cbind(SPY.list)
spy.table <- spy.table[,c(1,3,6,8,11)] %>% as.data.frame()
spy.table <- spy.table %>% rename(Tickers = SPY.list,
                                  "P-value from Net Returns" = p.value,
                                  "Test Statistic from Net Returns" = statistic,
                                  "P-value from Log Returns" = p.value.1,
                                  "Test Statistic from Log Returns" = statistic.1)
Q1.table <- rbind(q1.table, spy.table) %>% 
  as.data.frame()
Q1.table$`P-value from Net Returns` <- as.numeric(Q1.table$"P-value from Net Returns")
Q1.table$`P-value from Log Returns` <- as.numeric(Q1.table$"P-value from Log Returns") 
Q1.table$`Test Statistic from Net Returns` <- as.numeric(Q1.table$`Test Statistic from Net Returns`) %>% 
  round(2)
Q1.table$`Test Statistic from Log Returns` <- as.numeric(Q1.table$`Test Statistic from Log Returns`) %>% 
  round(2)
row.names(Q1.table) <- NULL
knitr::kable(Q1.table, caption = "JB Test for Normality for Log and Net Stock Returns ", digits = 3) %>% 
  landscape()

JB Test for Normality for Log and Net Stock Returns
Tickers	Test Statistic from Net Returns	P-value from Net Returns	Test Statistic from Log Returns	P-value from Log Returns
SLVO	44.53	0.000	47.61	0.000
STAF	419548.68	0.000	117283.34	0.000
NAOV	194.88	0.000	232.91	0.000
FTEK	6955.90	0.000	2929.71	0.000
DCAR	585.67	0.000	219.15	0.000
CTIB	149.78	0.000	349.74	0.000
ATAI	243261.97	0.000	71592.76	0.000
KTOV	6972.96	0.000	2510.12	0.000
CLIR	99.38	0.000	69.62	0.000
CDMOP	1723.56	0.000	1694.91	0.000
OXLC	455.85	0.000	435.76	0.000
LNDC	18.15	0.000	19.32	0.000
ALCO	47.60	0.000	42.08	0.000
ALBO	32.75	0.000	23.04	0.000
RMTI	224.82	0.000	148.58	0.000
FRTA	83.40	0.000	146.21	0.000
PRTK	121.40	0.000	133.01	0.000
AVCO	111825.72	0.000	11146.39	0.000
WOOD	18.37	0.000	21.65	0.000
PETS	139.75	0.000	195.30	0.000
COLM	90.15	0.000	114.13	0.000
SSB	131.55	0.000	164.22	0.000
AGNC	17.11	0.000	19.14	0.000
CSCO	13.52	0.001	14.43	0.001
MLNX	1002.60	0.000	718.91	0.000
CHTR	182.53	0.000	283.95	0.000
OKTA	171.85	0.000	149.07	0.000
VRNT	1322.67	0.000	937.98	0.000
SAVE	460.60	0.000	306.69	0.000
IBOC	50.74	0.000	60.84	0.000
SPY	114.38	0.000	113.56	0.000

The above table displays the results from the Jarque-Bera Test for 30 stocks and the S&P 500, which from here on now, will be referred to as the market portfolio. The Jarque–Bera test is a goodness-of-fit test which examines whether the distribution of the returns data has skewness and kurtosis matching that of a normal distribution. The test is based on the following properties: the normal distribution has an expected skewness of 0 and an expected excess kurtosis of 0, equivalently, kurtosis of 3.

Consequently, the null hypothesis is, skewness and excess kurtosis are jointly equal to 0. The alternative test is that at least one of the parameters is not 0. The JB statistic asymptotically is \(\chi^{2}\) distributed with 2 degrees of freedom, and therefore, the test statistic is tested against the value 5.99 at the 5% significance level.

All stocks have a p-value close to zero and test statistics much greater than 0, therefore, rejecting the null hypothesis at all confidence levels that each is normally distributed. Moreover, the same conclusion is drawn from the log returns. This is unsurprising, as for small returns, the log return is equal to the net returns. However, in the random walk model, we assume log returns are log-normally distributed. This assumption shall be analyzed in-depth in question 4 of this project. Another surprising result is that the small market cap stocks have test statistics that are relatively small compared to medium and large market cap stocks. Intuitively, small cap stocks are more volatile, and thus, would have more returns in the tails of the probability distribution of daily returns. As a result, excess kurtosis would not be equal to 0 in this case. Finally, the market portfolio is also, on average, not normally distributed.

Q2.

a)

V = cov(returns)
mu = colMeans(returns)
mu.prime = t(mu)
V.inv = solve(V)
one=rep(1, 30)
one.prime = t(one)

A=mu.prime%*%V.inv%*%one
B=mu.prime%*%V.inv%*%mu
C=one.prime%*%V.inv%*%one
D=B%*%C-A^2

#MVP
min.return = A/C %>% as.numeric()
min.sd= sqrt(1/C) %>% as.numeric()

mvp <- cbind(min.return, min.sd)
EZp <- runif(n = 99, min=min.return, max = 0.005) %>% as.list()
st_p = lapply(EZp, function(EZp)((C*EZp^2-2*A*EZp+B)/D)^(1/2))
EZp<- as.numeric(EZp)
st_p <- as.numeric(st_p)
df.2a <- cbind(EZp,st_p) %>% rbind(mvp) %>% data.frame()
ggplot(df.2a) +
  geom_smooth(aes(y=EZp, x=st_p)) +
  labs(x="Standard Deviation", y= "Expected Return", title="Efficent Frontier") +
  geom_point(aes(x=min.sd, y=min.return), colour="Red") +
  ylim(-0.001, 0.007)

The efficient frontier is generated through the well known financial formulas. In the first section of the code, I generate the vector \(\mu\), which is the column means of returns for each stock, and ‘cov’ is the covariance-variance matrix of returns. \(EZ_{p}\) is a vector of 99 arbitrary portfolio returns with the minimum value being the minimum variance portfolio, the maximum is arbitrarily chosen as 0.0005. Furthermore, the locus of the efficient portfolio is derived using,

\[\begin{align} \sigma_{p}^{2} = \frac{EZ_{p}^{2}C-2~EZ_{p}A+B}{D}. \end{align}\]

identical(min.sd, min(df.2a$st_p))

## [1] TRUE

Where \(EZ_{p}\) is the vector of portfolio returns. Using the ‘lapply’ command, I can input each element from \(EZ_{p}\) into (1) and output the risk for the arbitrary portfolio return. Moreover, taking the square root of this is nothing but the standard deviation, or volatility. Plotting this against \(\sigma_{p}\) will generate all efficient portfolios, as the minimum value of \(EZ_{p}\) is \(\frac{A}{C}\), which is the minimum variance portfolio. The shape of the efficient frontier does not contain as much curvature as in the class examples which is most likely due to the nature of the returns as the average returns of the 30 stocks contained many negative values. Furthermore, the MVP is the point on the graph, a logical test is done if the MVP has the lowest variance in the data matrix of arbitrary returns and volatility generated through the equation above. The test is true, the MVP standard deviation and the minimum standard deviation of the matrix are the same. Moreover, the minimum return is negative and close to 0, and less than the risk-free rate of return.

b)

rf <- 0.02/252
rf.vector <- rep(rf, 100)
df.2b <- cbind(df.2a, rf.vector) %>% as.data.frame() %>% mutate(sharpe.ratio = (EZp-rf)/st_p)
max(df.2b$sharpe.ratio)

## [1] 0.3797931

After including the risk-free rate of return, which now adds a dimension of risk trade-off in generating a portfolio that includes risky and a risk-free asset. As mentioned previously, the MVP is less than the risk-free rate of return which should produce a Sharpe ratio which is above the lower portion of the efficient frontier. Interestingly, this would suggest that investing in a combination of the risk-free asset and the risky portfolios is better at all levels of the efficient frontier. The risk-free rate of return is ‘rf’, and after generating a vector with dimensions 100 by 1, combining this with \(EZ_{p}\) and \(\sigma_{p}\) will produce a data matrix, in which the Sharpe ratio is calculated. The Sharpe ratio is calculated using \(\frac{EZ_{p} - \mu_{f}}{\sigma_{p}}\), where \(\mu_{f}\) is the risk-free rate of return. Finally, because the slope of the capital markets line is nothing but the maximum Sharpe ratio, the command ‘max’ finds the maximum value of the vector of Sharpe ratios.

ggplot(df.2b) +
  geom_smooth(aes(y=EZp, x=st_p)) +
  geom_abline(intercept=rf, slope =max(df.2b$sharpe.ratio)) +
  labs(x="Standard Deviation", y= "Expected Return", title = "Efficent Frontier and CML") + 
  geom_point(aes(x=min.sd, y=min.return), colour="Red") +
  xlim(0, 0.02) +
  ylim(-0.0015, 0.007)

Plotting the line with intercept equal to \(\mu_{f}\) and slope equal to the maximum Sharpe ratio, the capital markets line is plotted. The tangency portfolio is near the end of the frontier. Moreover, in the CAPM theory, this portfolio is the market portfolio, as everyone would like to hold this portfolio.

Q3.

All Stocks

rf.mat <- matrix(rf, nrow = 251, ncol = 30)
rf.vector <- rep(rf, 251)
ER_i <- returns - rf.mat
ER_m <- SPY.returns - rf.vector 
df.all <- cbind(ER_i, ER_m) %>% data.frame() 
capm.test <- lm(ER_i ~ ER_m, df.all)
B.a.all <- coefficients(capm.test) %>% t()
B.all <- B.a.all[,2]

To test the capital asset pricing model (CAPM) I will utilize 2 methods. The first is the Wald Test and the second is the Fama-MacBeth cross-sectional regression approach. To start, excess returns are calculated for each element in the returns matrix and also for the market portfolio. More specifically, this is the Sharpe and Lintner version of CAPM. Furthermore, CAPM is estimated using OLS instead of maximum likelihood (ML) or generalized method of moments (GMM). An argument for doing so is point estimates are of importance here, and for this project, OLS should allow all tests to be applied to the returns data. An assumption of single-factor models is the idiosyncratic error, which is the error term in the regression equation, has an expected value of 0 (due to having a small weighted portfolio) and no contemporaneous error covariation, (\(cov(\epsilon_{it}, \epsilon_{jt})=0\)). However, the last assumption is not true, and empirically, this assumption does not hold. Consequently, the standard errors are now wrong but the point estimators should be the same.

#Wald Test
a.all <- B.a.all[,1]
res <- resid(capm.test)
sigma <- (1/251)*(t(res)%*%res) 
sigma.inv <- solve(sigma)
a.all.prime <- t(a.all)
varER_m <- var(ER_m)
mu.ER_m <- mean(ER_m)
j0 <- 251*(1+(mu.ER_m^2/varER_m))^(-1)*(a.all.prime%*%sigma.inv%*%a.all)
j1 <- ((251-30-1)/(251*30))*j0
j1.all<- j1 %>% as.numeric()

The first test is the Wald test. Formally, the null hypothesis is that the vector of \(\hat{\alpha_{i}}\), where \(i\) is for each stock, is equal to 0. Whereas the alternative is if not equal to 0, thereby, providing evidence against CAPM. \(\hat{\Sigma}\) is the estimated covariance-variance matrix of the OLS model, which is used to calculate \(J_{0}\), the Wald test statistic. However, because the sample is finite, \(J_{1}\) must be calculated, where instead, this is an F-test, not a Wald test. Formally, the formula for the \(J_{1}\) F-test statistic is,

\[\begin{align} J_{1} = \frac{(T-n-1)}{T~n}~[ 1 + \frac{\hat{\mu_m^2}}{\hat{\sigma^2_m}} ]^{-1}~\hat{\alpha'} ~\hat{\Sigma}^{-1}~ \hat{\alpha}. \end{align}\]

Moreover, \(J_{1}\) follows an F-distribution, with \(n\) degrees of freedom in the numerator and \((T-n-1)\) degrees of freedom in the denominator, where \(n\) equals to the number of stocks in the portfolio. In this case, \(n=30\) because this is the F-test for a portfolio of all 30 stocks. Lastly, \(T\) is equal to 251, therefore, a finite sample. The reason to use \(J_{1}\) over \(J_{0}\) is due to finite samples rejecting the null hypothesis too often while using \(J_{0}\), as it is for large samples only. The results for each test are then saved to compile into a table.

kable(B.all)

	x
SLVO	0.1753967
STAF	1.6021898
NAOV	0.3621772
FTEK	0.5285670
DCAR	0.8664930
CTIB	0.2748523
ATAI	0.9105274
KTOV	0.9242521
CLIR	0.0323883
CDMOP	0.0358507
OXLC	0.4229842
LNDC	0.5930880
ALCO	0.1987354
ALBO	1.1597327
RMTI	0.5336933
FRTA	0.9905798
PRTK	1.3011712
AVCO	0.6507998
WOOD	0.7422248
PETS	0.6283831
COLM	0.7434855
SSB	0.7391666
AGNC	0.3036397
CSCO	1.3058718
MLNX	0.9496949
CHTR	0.9328803
OKTA	1.8785889
VRNT	1.0708232
SAVE	0.8806607
IBOC	0.8878494

The \(\hat{\beta}_{1,i}\)’s are displayed above.

#Fama-MacBeth
ER_i.prime <- t(ER_i)
df.b.test <- cbind(ER_i.prime, B.all) %>% data.frame()
dim(df.b.test)

## [1]  30 252

capm.test.2 <- lapply(1:251, function(x)lm(df.b.test[,x] ~  B.all, df.b.test))
B.a.all.2 <- sapply(capm.test.2, coef) %>% as.data.frame()
summaries <- lapply(capm.test.2, summary)
all.stocks.adj.r <- sapply(summaries, function(x)  c(adj_r_sq = x$adj.r.squared)) %>% as.numeric()

The second CAPM test is the cross-sectional regression approach, where the process involves using \(\hat{\beta}\) in estimating parameters \(\gamma_{0}\) and \(\gamma_{1}\). This method involves 3 steps. First, the vector of \(\hat{\gamma_{0}}\) is the estimated intercepts of the regression of excess portfolio returns onto \(\hat{\beta_{1,i}}\) from the initial regression. A formal statistical test is then performed, where the null hypothesis is the vector of all 251 \(\hat{\gamma_{0}}\) are equal to 0, and the alternative is not equal to 0. If the null is rejected, this provides support against CAPM. This test is similar to the Wald test, as the objective is to find if the stocks are jointly mis-priced.

Second, the vector of 251 \(\hat{\gamma_{1}}\) is needed in order to test its relationship with \(\hat{\beta_{1,i}}\). Another statistical test is performed where the null hypothesis, \(\hat{\gamma_{1}}\) is equal to the average excess expected return of the market portfolio. Conversely, the alternative, \(\hat{\gamma_{1}}\) is not equal to the average excess expected return of the market portfolio Mathematically, this can be shown as,

\(\begin{aligned} Z_{i} &= \alpha_{0} + \beta Z_{m} +\epsilon\\ E[Z_{i}] &= \hat{\alpha_{0}} + \hat{\beta}~E[Z_{m}]\\ &= \gamma_{0} + \gamma_{1}~\hat{\beta} +\zeta \\ &= \hat{\gamma_{0}} + \hat{\gamma_{1}}~\hat{\beta} \\ &= \hat{\beta}~E[Z_{m}]. \end{aligned}\)

Intuitively, if the null hypothesis is true for steps one and two, then expected excess stock returns for \(i\) should only be a function of \(\beta_{i}\) and expected excess return of the market portfolio. In other words, the deviations of stock \(i\)’s returns are perfectly captured in the movements of market portfolio returns. Therefore, providing evidence in support of CAPM. The last step for this approach is to build the probability distribution of 251 measures of adjusted \(R^{2}\), \(\bar{R^2}_{i}\). Ideally, to support the previous claim, the lower bound of \(\bar{R^2}_{i}\) is expected to arbitrarily large (>40%) to support \(\beta_i\) capturing movements of returns in stock \(i\) with the market portfolio. The interpretation of \(\bar{R^2}_{i}\) is not the same as \(R^{2}\), but the magnitude is of importance here. All three tests are performed for a portfolio of small cap, medium cap, large cap, and a portfolio of all the stocks used. Lastly, perhaps the rejection of CAPM arises not from the theory, but from an improper proxy for the market portfolio, this is of important consideration when making conclusions after performing the tests.

A short description of the code is provided in this paragraph. First, the matrix of excess expected returns is transposed to achieve a dimension of 30x251 in preparation to column bind the estimated values. Afterwards, ‘lapply’ is used to loop each column of daily returns for all 30 stocks at once, therefore, 251 regressions with 251 \(\hat{\gamma_i}\), where \(i=0,1\). Furthermore, 251 \(\bar{R^2}\)’s are extracted from each regression from the loop ‘lapply’.

B.all.2 <- B.a.all.2[1,] %>% as.numeric()
a.all.2 <- B.a.all.2[2,]%>% as.numeric()

gamma.0.hat <- mean(a.all.2)
gamma.1.hat <- mean(B.all.2)

From the lecture notes, the expected value of \(\hat{\gamma_i}\) is nothing but the mean of the vector of fitted values from the looped regression.

n <- length(a.all.2)
sd.gamma.0 <- sqrt(sum((a.all.2 - mean(a.all.2))^2)/(n^2-n))
sd.gamma.1 <- sqrt(sum((B.all.2 - mean((B.all.2)))^2)/(n^2-n))

Again, taken from the lecture notes, this is the sample standard deviation of each estimator.

t.gamma.0.all = gamma.0.hat/sd.gamma.0

This is the first test statistic for the Fama-MacBeth approach. Moreover, this test statistic is normally distributed with \(T-1\) degrees of freedom, which is nothing but the time period, or the number of regressions estimated, which is \(251-1=250\) degrees of freedom. Simply, the critical value here is 1.96 at the 5% confidence level for all size effect tests. A more in depth analysis along with a discussion about the size effect will follow with all test statistics organized into a table.

avg.excess.return.mkt = mean(ER_m)
t.gamma.1.all = (gamma.1.hat - avg.excess.return.mkt)/sd.gamma.1

Equivalently, the same distribution holds for this test as well, where the critical value is 1.96 at the 5% confidence level.

mean.adj.all <- mean(all.stocks.adj.r)
sd.adj.all <- sqrt(sum((all.stocks.adj.r - mean(all.stocks.adj.r))^2)/(n-1))
upper.CI.all <- mean.adj.all +(1.96*(sd.adj.all)/sqrt(n))
lower.CI.all <- mean.adj.all -(1.96*(sd.adj.all)/sqrt(n))

Lastly, this is the lower and upper bound of the 95% confidence interval of the distribution of \(\bar{R^2}_{i}\).

Small Cap

rf.mat <- matrix(rf, nrow = 251, ncol = 10)
rf.vector <- rep(rf, 251)
ER_i <-  small.returns - rf.mat
ER_m <- SPY.returns - rf.vector
df.all <- cbind(ER_i, ER_m) %>% data.frame() 
capm.test <- lm(ER_i ~ ER_m, df.all)
B.a.all <- coefficients(capm.test) %>% t()
B.all <- B.a.all[,2]
ER_i.prime <- t(ER_i)
df.b.test <- cbind(ER_i.prime, B.all) %>% data.frame()
n <- ncol(df.b.test)-1
capm.test.2 <- lapply(1:n, function(x)lm(df.b.test[,x] ~  B.all, df.b.test))
B.a.all.2 <- sapply(capm.test.2, coef) %>% as.data.frame()
summaries <- lapply(capm.test.2, summary)
all.stocks.adj.r <- sapply(summaries, function(x)  c(adj_r_sq = x$adj.r.squared)) %>% as.numeric()

This code and procedure is analogous to the previous case with a portfolio of all stocks. The only important difference is the dimension of the returns matrix, which is 251x10. Therefore, \(\beta\) and \(\alpha\) will be of a different length for the following tests. Another important note is this changes the value of \(n\) while performing the F-test as the critical value for the 95% confidence level is now evaluated at 10 degrees of freedom for the numerator and 241 degrees for the denominator.

kable(B.all)

	x
SLVO	0.1753967
STAF	1.6021898
NAOV	0.3621772
FTEK	0.5285670
DCAR	0.8664930
CTIB	0.2748523
ATAI	0.9105274
KTOV	0.9242521
CLIR	0.0323883
CDMOP	0.0358507

#Wald Test
a.all <- B.a.all[,1]
res <- resid(capm.test)
sigma <- (1/251)*(t(res)%*%res) 
sigma.inv <- solve(sigma)
a.all.prime <- t(a.all)
varER_m <- var(ER_m)
mu.ER_m <- mean(ER_m)
j0 <- 251*(1+(mu.ER_m^2/varER_m))^(-1)*(a.all.prime%*%sigma.inv%*%a.all)
j1 <- ((251-10-1)/(251*10))*j0
j1.small <- j1 %>% as.numeric()
#1

B.all.2 <- B.a.all.2[1,] %>% as.numeric()
a.all.2 <- B.a.all.2[2,]%>% as.numeric() 

#2
gamma.0.hat <- mean(a.all.2)
gamma.1.hat <- mean(B.all.2)

#3
n <- length(a.all.2)
sd.gamma.0 <- sqrt(sum((a.all.2 - mean(a.all.2))^2)/(n^2-n))
sd.gamma.1 <- sqrt(sum((B.all.2 - mean((B.all.2)))^2)/(n^2-n))
t.gamma.0.small = gamma.0.hat/sd.gamma.0
avg.excess.return.mkt = mean(ER_m)
t.gamma.1.small = (gamma.1.hat - avg.excess.return.mkt)/sd.gamma.1
mean.adj.all <- mean(all.stocks.adj.r)
sd.adj.all <- sqrt(sum((all.stocks.adj.r - mean(all.stocks.adj.r))^2)/(n-1))
upper.CI.small <- mean.adj.all +(1.96*(sd.adj.all)/sqrt(n))
lower.CI.small <- mean.adj.all -(1.96*(sd.adj.all)/sqrt(n))

Again, all the steps are identical for all stocks. Moreover, the procedure for medium cap and large cap is identical, therefore, I have chosen not to included it, apart from the \(\beta\) estimates.

Mid Cap

kable(B.all)

	x
OXLC	0.4229842
LNDC	0.5930880
ALCO	0.1987354
ALBO	1.1597327
RMTI	0.5336933
FRTA	0.9905798
PRTK	1.3011712
AVCO	0.6507998
WOOD	0.7422248
PETS	0.6283831

Large Cap

kable(B.all)

	x
COLM	0.7434855
SSB	0.7391666
AGNC	0.3036397
CSCO	1.3058718
MLNX	0.9496949
CHTR	0.9328803
OKTA	1.8785889
VRNT	1.0708232
SAVE	0.8806607
IBOC	0.8878494

small.capm.test <- c(j1.small, qf(.95,df1=10, df2=(251-10-1)), t.gamma.0.small, t.gamma.1.small, lower.CI.small, upper.CI.small) %>% t()
mid.capm.test <- c(j1.mid, qf(.95,df1=10, df2=(251-10-1)), t.gamma.0.mid, t.gamma.1.mid, lower.CI.mid, upper.CI.mid) %>% t()
large.capm.test <- c(j1.large, qf(.95,df1=10, df2=(251-10-1)) , t.gamma.0.large, t.gamma.1.large, lower.CI.large,  upper.CI.large) %>% t()
all.capm.test <- c(j1.all, qf(.95,df1=30, df2=(251-30-1)), t.gamma.0.all, t.gamma.1.all, lower.CI.all, upper.CI.all) %>% t()
capm.test.table <- rbind(small.capm.test, mid.capm.test, large.capm.test, all.capm.test)
row.names(capm.test.table) <- c("Small Cap", "Mid Cap", "Large Cap", "All Stocks")
colnames(capm.test.table) <- c("J1 F-Test Statistic","F-Distribution Critical Value", "Gamma 0 Test Statistic", "Gamma 1 Test Statistic", "Lower C.I", "Upper C.I")
knitr::kable(capm.test.table, caption = "CAPM Test Results for Size Effects", digits = 3) %>% landscape()

CAPM Test Results for Size Effects
	J1 F-Test Statistic	F-Distribution Critical Value	Gamma 0 Test Statistic	Gamma 1 Test Statistic	Lower C.I	Upper C.I
Small Cap	1.131	1.870	0.434	-0.798	0.006	0.047
Mid Cap	1.104	1.870	-1.680	1.077	-0.028	0.012
Large Cap	1.287	1.870	1.974	-1.797	0.075	0.134
All Stocks	1.417	1.511	0.780	-0.964	0.007	0.023

In the analytical portion for this question, I aim to find if there exists a size effect in testing the Sharpe and Lintner CAPM. The first test is the F-test. Interestingly, the F-test statistic for all portfolios of different market cap size is less than the critical value. Therefore, failing to reject the null hypothesis that the vector of \(\hat{\alpha}\) for each portfolio is statistically different from 0, which provides support for the Sharpe and Linter CAPM. There is an increase in F-test statistic in the large cap stock portfolio, and a further increase in the portfolio which includes all the stocks, however, both are still less than their respective critical values.

The second test for the Sharpe and Lintner CAPM is the Fama-MacBeth method. First, there is a noticeable size effect when testing if \(\hat{\gamma}_{0}\) is equal to 0. For small cap stocks, the test statistic is less than 1.96, thus failing to reject CAPM. Interestingly, however, the test statistic is less than 1.96, which fails to reject the null hypothesis at the 95% confidence level that \(\hat{\gamma}_{0}\) is equal to 0. Furthermore, \(\hat{\gamma}_{1}\) is also less than |1.96|, therefore, failing to reject the null hypothesis that \(\hat{\gamma}_{1}\) is equal to the average excess market return. Both tests provide support for CAPM, which is surprising because CAPM is thought to fail in small cap stocks. Another interesting result is the \(\hat{\gamma}_{0}\) test statistic for the medium cap stocks is larger in absolute value, again, the opposite of what is expected. However, this result is statistically insignificant. Furthermore, the test statistic for \(\hat{\gamma}_{1}\) is also statistically significant at the 95% confidence. Again, supporting evidence for CAPM for medium market cap stocks.

The most interesting result is the test statistic for \(\hat{\gamma}_{0}\), is marginally larger than 1.96. Which, suggests that at the 95% confidence level, \(\hat{\gamma}_{0}\) is statistically different from 0, which provides enough evidence to reject CAPM for large cap stocks. This is counter-intuitive to my hypothesis as theory and empirical work suggest CAPM fails in the opposite direction. Furthermore, the \(\hat{\gamma}_{1}\) test statistic is also larger in absolute value than the mid-market cap stock portfolio, but still statistically insignificant. Moreover, this conclusion is consistent, in some degree, to the F-test for \(J_{1}\), as the test statistic is the highest for the large market cap stock portfolio relative to the others. Lastly, a portfolio that includes all stocks is also considered. The F-test returns a value that is significant, however, much closer to the critical value than for previous sizes. Moreover, CAPM holds for the portfolio of all stocks as well. As noted previously, rejecting CAPM may not have to do with the theory, but with the proxy for the market portfolio. Another conclusion that could be drawn is the S&P 500 does not accurately proxy the market portfolio.

Lastly, the lower bound of the 95% confidence interval of \(\bar{R^2}\) for small cap stocks are reported to be 0.006. While this does not have a nice interpretation as \({R^2}\), it can conclude the models fit. This suggests that the excess market returns might be a poor sole factor in this model, and other factors will be needed. A result unsurprising due to the work of multi-factor models providing better goodness-of-fit. Furthermore, the large cap stocks interestingly have the highest lower bound for \(\bar{R^2}\), at 0.075.

Q4.

l.returns.w.spy <- cbind(l.returns, l.SPY.returns)
rev.l.returns <- apply(l.returns.w.spy, 2, rev)
lag.l.returns <- lapply(1:31, function(x)Lag(rev.l.returns[,x], k=0:10))
df.4.1 = do.call(cbind, lag.l.returns)
df.4 <- df.4.1[11:251,]
n=nrow(df.4)
dfm = as.matrix(df.4)

The last test is related to market efficiency. A market is efficient if past prices reflect all available information for the particular asset. Random walk is a theoretical framework that has been derived from this. Mathematically, this can be shown,

\[\begin{align} log(p_{t}) = \mu + log(p_{t-1}) + \epsilon_{t}. \end{align}\]

Equation (3) states that asset price at time \(t\) is a function of log asset prices of \(t-1\), a drift term \(\mu\) and the error, \(\epsilon_{t}\). Rearranging equation (3),

\(\begin{aligned} log (p_{t}-log(p) &= \mu + \epsilon_{t} \\ log(p_{t}/p_{t-1}) &= \mu + \epsilon_{t} \\ log(1+R) &= \mu + \epsilon_{t} \\ E[log(1+R)] &= \mu. \end{aligned}\)

Intuitively, this states that expected log returns can be estimated by just taking the average of the returns. The main assumption of this model is that log prices are log-normal distributed with expected value 0 and constant variance. Moreover, to avoid limited liability, we can rewrite the prices in log returns. The assumption of normality is, however, ambitious. Random walk is broken up into three different models, each iteration has more relaxed assumptions. The focus in this question is the third iteration, random walk 3. Most importantly, this model does not make any assumptions in the nature of the errors (ie. independence and constant) and the type of distribution. Therefore, a benchmark used to test for market efficiency. If random walk holds, then asset prices are only dependent on their mean value and nothing more. If not, then asset prices in time \(t\) are dependent on their mean and past returns of the asset.

To create the lagged matrix, I use the ‘Lag’ function in R, where I generate \(k\) lags for each asset, including the market portfolio, where \(k=0,...,10\). Furthermore, I use the ‘lapply’ loop to combine all the lagged matrices for efficiency. The next step will highlight the procedure needed to test for RW3.

SLVO

dfm1 <- dfm[,1:11] %>% as.data.frame() 
rw3.1 <- lm(dfm1[,1] ~ . -Lag.0, data=dfm1)
rho.a.all <- coefficients(rw3.1) %>% t()
rho.all <- rho.a.all[,2:11] %>% as.data.frame()

First, to test random walk one must estimate the Pearson coefficient of autocorrelation, which is defined as such,

\[\begin{align} \rho(k) = \frac{cov(r_{t},r_{t-k})}{var(r_{t})^{1/2}var(r_{t+k})^{1/2}}. \end{align}\]

Assuming stationarity, \({var(r_{t})}^{1/2}var(r_{t+k})^{1/2} = var(r_{t})^{1/2}var(r_{t})^{1/2} = var(r_{t})\). Therefore, equation (4) simplifies as such,

\(\begin{aligned} \rho(k) &= \frac{cov(r_{t},r_{t-k})}{var(r_{t})^{1/2}var(r_{t+k})^{1/2}} \\ &= \frac{cov(r_{t},r_{t-k})}{var(r_{t})} \\ &= \beta. \end{aligned}\)

Assuming stationarity, the autocorrelation coefficients are nothing but the estimates of a linear regression, which is done by regressing the return of each stock at lag 0 onto all 10 lags for each stock, therefore, 31 regressions will be estimated using the lagged data matrix. Moreover, because of \(T=241\), there is no need to use the bias-corrected estimator for the autocorrelation coefficient.

q <- c(2:11)
k <- c(1:10)
vrq<-function(q,k)
{
  1+2*(sum(1-(k/q))*rho.all[k,])
}
VRq <- mapply(vrq,q,k)

The RW3 tests initially starts with estimating 10 variance-ratio statistics Lo and MacKinlay proposed, due to estimating 10 lags for each stock because of daily returns. Furthermore, this test allows for general forms of heteroskedasticity and allows for returns to be not normally distributed, thus, restricting the assumptions for the random walk test. Mathematically, the above code is,

\[\begin{align} \bar{VR(q)} = 1 + \sum_{k=1}^{q-1}2~(1-\frac{k}{q})~\rho(k), \end{align}\]

where \(q=2,...,11\) and \(k=1,...,10\). There will be 10 variance-ratio statistics (\(VR(q)\)) estimated for SLVO, however, the number of variance-ratio statistics should not vary between each stock. Furthermore, \(VR(q)\) is asymptotically normally distributed with mean 1 and variance \(\frac{\hat{\theta(q)}}{nq}\), where \(nq=T=241\) for all stocks.

r.bar <- mean(dfm1[,1])
r.bar.vec <- rep(r.bar, 241)
delta.hat.1 <- 241*sum(((dfm1[2:241,1]-r.bar.vec[2:241])^2)*
                         ((dfm1[1:240,1]-r.bar.vec[1:240])^2))/
  (sum(((dfm1[2:241,1]-r.bar.vec[2:241])^2)^2))
delta.hat.2 <- 241*sum(((dfm1[3:241,1]-r.bar.vec[3:241])^2)*((dfm1[1:239,1]-r.bar.vec[1:239])^2))/(sum(((dfm1[3:241,1]-r.bar.vec[3:241])^2)^2))
delta.hat.3 <- 241*sum(((dfm1[4:241,1]-r.bar.vec[4:241])^2)*((dfm1[1:238,1]-r.bar.vec[1:238])^2))/(sum(((dfm1[4:241,1]-r.bar.vec[4:241])^2)^2))
delta.hat.4 <- 241*sum(((dfm1[5:241,1]-r.bar.vec[5:241])^2)*((dfm1[1:237,1]-r.bar.vec[1:237])^2))/(sum(((dfm1[5:241,1]-r.bar.vec[5:241])^2)^2))           
delta.hat.5 <- 241*sum(((dfm1[6:241,1]-r.bar.vec[6:241])^2)*((dfm1[1:236,1]-r.bar.vec[1:236])^2))/(sum(((dfm1[6:241,1]-r.bar.vec[6:241])^2)^2))           
delta.hat.6 <- 241*sum(((dfm1[7:241,1]-r.bar.vec[7:241])^2)*((dfm1[1:235,1]-r.bar.vec[1:235])^2))/(sum(((dfm1[7:241,1]-r.bar.vec[7:241])^2)^2))           
delta.hat.7 <- 241*sum(((dfm1[8:241,1]-r.bar.vec[8:241])^2)*((dfm1[1:234,1]-r.bar.vec[1:234])^2))/(sum(((dfm1[8:241,1]-r.bar.vec[8:241])^2)^2))           
delta.hat.8 <- 241*sum(((dfm1[9:241,1]-r.bar.vec[9:241])^2)*((dfm1[1:233,1]-r.bar.vec[1:233])^2))/(sum(((dfm1[9:241,1]-r.bar.vec[9:241])^2)^2))           
delta.hat.9 <- 241*sum(((dfm1[10:241,1]-r.bar.vec[10:241])^2)*((dfm1[1:232,1]-r.bar.vec[1:232])^2))/(sum(((dfm1[10:241,1]-r.bar.vec[10:241])^2)^2))           
delta.hat.10 <- 241*sum(((dfm1[11:241,1]-r.bar.vec[11:241])^2)*((dfm1[1:231,1]-r.bar.vec[1:231])^2))/(sum(((dfm1[11:241,1]-r.bar.vec[11:241])^2)^2))           
delta.hat <- c(delta.hat.1, delta.hat.2, delta.hat.3, delta.hat.4, delta.hat.5, delta.hat.6, delta.hat.7, delta.hat.8, delta.hat.9, delta.hat.10) %>% 
  as.vector()

Second, I find \(\hat{\delta(k)}\). This process is more tedious compared to running an ‘mapply’ loop, which allows me to run a loop with multiple inputs. Unfortunately, this method produces ‘NA’ values, however, a manual calculation provides the correct estimations. This produces 10 \(\hat{\delta(k)}\) estimates for each stock.

\[\begin{align} \hat \delta(k) = \frac{nq \sum_{j=k+1}^{nq} ((r_q - \bar r)^2)~(r_{q-k~}-~\bar r)}{[\sum_{q=1}^{T}((r_q - \bar r)^2)]^2} \end{align}\]

q <- c(2:11)
k <- c(1:10)
theta <- function(q,k)
{
  sum((2*(1-(k/q)))^2)*delta.hat[k]
}
theta.hat <- mapply(theta,q,k)

Third, \(\hat\theta(q)\) is estimated through a similar function to the \(\bar{VR(q)}\) calculation both in formula and coding method using and ‘mapply’ loop. Finally, the test statistic may be calculated.

\[\begin{align} \hat \theta(q)= \sum_{k=1}^{q-1}~[2~(1 - \frac{k}{q})]^2 ~\delta(k) \end{align}\]

psi.star <- lapply(1:10, function(k)((241)^(1/2)*VRq[k]-1)/(sqrt(theta.hat[k]))) %>% 
  unlist() 
psi.star.1 <- psi.star

Using equation (5), (6) and (7) \(\psi^{*}(q)\) is calculated for each of the 10 lags for each stock. Furthermore, each vector of \(\psi^{*}(q)\) is stored in a different value in order to construct a table for a neater presentation.

\[\begin{align} \psi^{*}(q) = \frac{\sqrt(nq)~(\bar VR(q) -~1)}{\sqrt (\hat \theta(q))} \end{align}\]

STAF

dfm1 <- dfm[,12:22] %>% as.data.frame()
rw3.1 <- lm(dfm1[,1] ~ . -Lag.0, data=dfm1)
rho.a.all <- coefficients(rw3.1) %>% t()
rho.all <- rho.a.all[,2:11] %>% as.data.frame()
q <- c(2:11)
k <- c(1:10)
vrq<-function(q,k)
{
  1+2*(sum(1-(k/q))*rho.all[k,])
}
VRq <- mapply(vrq,q,k)
r.bar <- mean(dfm1[,1])
r.bar.vec <- rep(r.bar, 241)
delta.hat.1 <- 241*sum(((dfm1[2:241,1]-r.bar.vec[2:241])^2)*((dfm1[1:240,1]-r.bar.vec[1:240])^2))/(sum(((dfm1[2:241,1]-r.bar.vec[2:241])^2)^2))
delta.hat.2 <- 241*sum(((dfm1[3:241,1]-r.bar.vec[3:241])^2)*((dfm1[1:239,1]-r.bar.vec[1:239])^2))/(sum(((dfm1[3:241,1]-r.bar.vec[3:241])^2)^2))
delta.hat.3 <- 241*sum(((dfm1[4:241,1]-r.bar.vec[4:241])^2)*((dfm1[1:238,1]-r.bar.vec[1:238])^2))/(sum(((dfm1[4:241,1]-r.bar.vec[4:241])^2)^2))
delta.hat.4 <- 241*sum(((dfm1[5:241,1]-r.bar.vec[5:241])^2)*((dfm1[1:237,1]-r.bar.vec[1:237])^2))/(sum(((dfm1[5:241,1]-r.bar.vec[5:241])^2)^2))           
delta.hat.5 <- 241*sum(((dfm1[6:241,1]-r.bar.vec[6:241])^2)*((dfm1[1:236,1]-r.bar.vec[1:236])^2))/(sum(((dfm1[6:241,1]-r.bar.vec[6:241])^2)^2))           
delta.hat.6 <- 241*sum(((dfm1[7:241,1]-r.bar.vec[7:241])^2)*((dfm1[1:235,1]-r.bar.vec[1:235])^2))/(sum(((dfm1[7:241,1]-r.bar.vec[7:241])^2)^2))           
delta.hat.7 <- 241*sum(((dfm1[8:241,1]-r.bar.vec[8:241])^2)*((dfm1[1:234,1]-r.bar.vec[1:234])^2))/(sum(((dfm1[8:241,1]-r.bar.vec[8:241])^2)^2))           
delta.hat.8 <- 241*sum(((dfm1[9:241,1]-r.bar.vec[9:241])^2)*((dfm1[1:233,1]-r.bar.vec[1:233])^2))/(sum(((dfm1[9:241,1]-r.bar.vec[9:241])^2)^2))           
delta.hat.9 <- 241*sum(((dfm1[10:241,1]-r.bar.vec[10:241])^2)*((dfm1[1:232,1]-r.bar.vec[1:232])^2))/(sum(((dfm1[10:241,1]-r.bar.vec[10:241])^2)^2))           
delta.hat.10 <- 241*sum(((dfm1[11:241,1]-r.bar.vec[11:241])^2)*((dfm1[1:231,1]-r.bar.vec[1:231])^2))/(sum(((dfm1[11:241,1]-r.bar.vec[11:241])^2)^2))           
delta.hat <- c(delta.hat.1, delta.hat.2, delta.hat.3, delta.hat.4, delta.hat.5, delta.hat.6, delta.hat.7, delta.hat.8, delta.hat.9, delta.hat.10) %>% 
  as.vector()
q <- c(2:11)
k <- c(1:10)
theta <- function(q,k)
{
  sum((2*(1-(k/q)))^2)*delta.hat[k]
}
theta.hat <- mapply(theta,q,k)
psi.star <- lapply(1:10, function(k)((241)^(1/2)*VRq[k]-1)/(sqrt(theta.hat[k]))) %>% 
  unlist() 
psi.star.2 <- psi.star

The above code and process is exactly the same as for stock SLVO, for completion, I have included this code chuck to display how it looks for all remaining 29 stocks. The only difference is the specification of columns in the data matrix which contains the lagged matrix for all 31 stocks.

rw3.table <- cbind(psi.star.1, psi.star.2, psi.star.3, psi.star.4, psi.star.5, psi.star.6, psi.star.7, psi.star.8, psi.star.9, psi.star.10, psi.star.11, psi.star.12, psi.star.13, psi.star.14, psi.star.15, psi.star.16, psi.star.17, psi.star.18, psi.star.19, psi.star.20, psi.star.21, psi.star.22, psi.star.23, psi.star.24, psi.star.25, psi.star.26, psi.star.27, psi.star.28, psi.star.29, psi.star.30, psi.star.31) %>% t() %>% as.data.frame()
colnames(rw3.table) <- seq(1:10)
rownames(rw3.table) <- c(small.tickers, mid.tickers, large.tickers, "SPY")
knitr::kable(rw3.table, digits = 3, caption = "Random Walk 3 Test Statistics of 10 Lags for Daily Returns of 30 Stocks and the Market Portolfio")

Random Walk 3 Test Statistics of 10 Lags for Daily Returns of 30 Stocks and the Market Portolfio
	1	2	3	4	5	6	7	8	9	10
SLVO	1.098	2.876	4.541	5.544	6.012	6.958	6.895	9.407	12.677	13.243
STAF	1.932	6.685	22.567	21.218	51.106	26.789	20.777	74.341	91.134	70.504
NAOV	1.128	3.333	4.342	4.235	6.433	5.225	7.649	6.842	8.639	9.493
FTEK	4.440	2.737	7.232	5.905	16.953	15.581	20.517	16.991	31.444	27.786
DCAR	1.312	2.595	5.060	5.414	8.543	7.202	12.103	12.447	16.570	13.234
CTIB	1.960	3.488	5.498	7.395	8.950	10.024	12.239	11.473	15.068	14.445
ATAI	4.335	14.066	13.500	55.337	50.427	29.805	36.444	36.485	23.582	88.160
KTOV	1.934	6.837	11.687	6.067	12.949	7.267	16.583	25.506	35.746	20.795
CLIR	1.249	2.402	3.205	4.464	5.600	6.033	6.767	8.172	7.332	8.386
CDMOP	0.411	2.568	7.113	8.348	6.961	7.533	10.804	17.454	24.902	23.478
OXLC	1.322	2.434	3.599	5.315	7.892	8.300	10.220	13.305	13.791	18.199
LNDC	2.088	2.441	3.912	4.754	6.647	6.800	7.765	9.135	10.441	11.244
ALCO	1.364	2.903	4.085	4.088	5.645	7.893	9.398	9.619	9.132	10.496
ALBO	1.502	2.539	3.964	4.775	5.702	5.910	6.739	9.925	10.525	10.263
RMTI	1.311	2.711	4.146	6.926	6.828	8.496	9.579	13.527	15.206	17.487
FRTA	1.986	3.709	5.114	6.090	7.143	9.191	9.388	15.490	13.032	17.146
PRTK	2.323	2.515	4.528	4.472	7.223	7.789	12.313	10.608	13.748	13.704
AVCO	1.740	3.073	6.264	19.984	13.613	16.837	21.540	9.283	16.885	27.677
WOOD	1.552	2.476	3.258	4.005	5.244	6.329	6.832	8.752	9.856	10.089
PETS	1.818	3.344	4.483	6.007	7.480	7.637	9.650	12.721	14.288	8.987
COLM	1.493	2.894	4.532	5.283	7.209	8.056	9.294	12.697	11.254	13.426
SSB	1.731	3.732	3.705	5.374	8.072	8.089	10.539	10.141	10.326	16.054
AGNC	1.344	2.076	3.325	4.070	5.380	5.636	7.490	8.066	8.201	9.569
CSCO	1.346	2.375	2.953	4.205	5.108	5.840	7.715	6.853	8.901	9.720
MLNX	1.338	2.635	5.871	7.930	9.325	12.387	17.691	12.535	13.140	19.500
CHTR	2.104	3.319	5.422	7.314	8.429	11.046	10.744	15.139	15.152	17.926
OKTA	2.623	2.758	4.950	5.928	5.948	7.044	10.877	7.820	14.878	11.014
VRNT	1.353	4.105	7.405	7.905	9.608	14.595	13.134	20.303	23.663	22.081
SAVE	1.926	4.049	5.944	6.710	8.481	8.503	7.237	12.107	16.799	17.131
IBOC	1.634	3.099	3.727	5.120	6.312	8.120	8.781	8.367	8.310	10.778
SPY	1.593	2.414	3.261	3.899	5.995	6.321	9.178	8.694	9.366	8.924

For the analysis section, I have included a table for the test statistics of all 31 stocks. More specifically, 30 stocks and the market portfolio proxy. The distribution of \(\psi^{*}(q)\) is normal with mean 0 and variance 1, therefore, the critical value is 1.96 at the 95% confidence interval. A test statistic further away from 0 implies a variance ratio that is less likely to be 1, thus, rejecting random walk and the efficient market hypothesis. Interestingly, there are no lags greater than 1 that fail to reject the null hypothesis. Therefore, failing to reject the null hypothesis that all \(k\), where \(k=2,...,10\), autocorrelation coefficients are jointly equal to 0 at the 95% confidence level. This is evidence for the efficient market hypothesis because of the relationship of \(E[r_{t}]=\mu\), where \(r_{t}\) is the log-returns of each stock, holds. More specifically, 23 out of the 31 stocks have first lags which are equal to 0, thus the efficient market hypothesis holds for these, and rejected entirely for 8 stocks. Moreover, there are only 2 large cap stocks for which the market efficiency hypothesis is rejected at the 95% confidence level, and 3 stocks are rejected for each small and medium cap stocks. Small cap stocks, on average, have very large test statistics for longer lags compared to large cap stocks. This is consistent with the theory that random walk fails more often in smaller cap stocks due to the more volatile nature of these returns.