Efficient Financial Markets

Isai Guizar

Disclaimer: This document is intended for educational purposes only. It does not constitute financial advice. It is part of my Risk Financial Management course at Tec de Monterrey.

1 Intro

1.1 The Efficient Market Hypothesis

The description of the efficient market hypothesis (EMH) here follows Mishkin and Eakins (2024). The hypothesis states that prices in financial markets fully reflect all available information. To understand it, the authors define the arithmetic rate of return from holding a stock from time \(t\) to \(t+1\) as:

\[ R = \frac{P_{t+1}-P_t}{P_t} \] that is, the rate of capital gains (assuming no other cash payments).

At the start of the period, however, \(P_{t+1}\) is unknown. Investors nonetheless form an expectation of the price, \(P^e_{t+1}\), so the expected return is:

\[ R^e = \frac{P^e_{t+1}-P_t}{P_t} \]

The EMH views these expectations as the optimal forecast of the return, \(R^{of}\), that is, the best guess of the future using all available information: \(R^e = R^{of}\).
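As a quick illustration of the two return definitions above, the following minimal sketch computes a realized and an expected return. All prices are hypothetical numbers chosen for the example, and the function name `arithmetic_return` is an assumption, not something from the course files.

```python
def arithmetic_return(p_now: float, p_next: float) -> float:
    """Rate of capital gain from holding the stock from t to t+1 (no other cash payments)."""
    return (p_next - p_now) / p_now

# Hypothetical numbers, chosen only for illustration
p_t = 100.0          # price today
p_expected = 105.0   # price investors expect at t+1
p_realized = 103.0   # price actually observed at t+1

expected_return = arithmetic_return(p_t, p_expected)   # R^e
realized_return = arithmetic_return(p_t, p_realized)   # R
forecast_error = realized_return - expected_return     # the part that could not be anticipated

print(f"R^e = {expected_return:.2%}, R = {realized_return:.2%}, error = {forecast_error:.2%}")
```

If expectations are optimal forecasts, the forecast error should not be predictable from information available at time \(t\).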

The supply-and-demand analysis of a financial market teaches us that the expected return on a stock will converge to the equilibrium return, \(R^*\), that equates the quantity demanded to the quantity supplied. Then, if the market is in equilibrium, \(R^e = R^*\).

Therefore, in an efficient market:

\[ R^{of}=R^* \]

That is, current prices in a financial market will be set so that the optimal forecast of a stock’s return using all available information equals the security’s equilibrium return. If the two differ, the current price adjusts until they coincide:

\[ \text{If } R^{of} > R^* \ \Rightarrow \ P_t \uparrow \ \Rightarrow \ R^{of} \downarrow \]

\[ \text{If } R^{of} < R^* \ \Rightarrow \ P_t \downarrow \ \Rightarrow \ R^{of} \uparrow \] until \(R^{of}=R^*\).

More simply: in an efficient market a stock’s price fully reflects all available information.
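The adjustment mechanism can be made concrete with a small numerical sketch. It assumes the expected future price stays fixed while today's price moves, and every number is hypothetical; the helper names are illustrative only.

```python
def optimal_forecast_return(p_now: float, p_expected: float) -> float:
    """R^of implied by today's price and the optimal forecast of tomorrow's price."""
    return (p_expected - p_now) / p_now

def equilibrium_price(p_expected: float, r_star: float) -> float:
    """Price today at which the forecast return equals the equilibrium return r_star."""
    return p_expected / (1.0 + r_star)

p_expected = 110.0   # hypothetical optimal forecast of the price at t+1
r_star = 0.05        # hypothetical equilibrium return
p_t = 100.0          # hypothetical current price

r_of = optimal_forecast_return(p_t, p_expected)       # above r_star: the stock looks cheap
p_eq = equilibrium_price(p_expected, r_star)          # buyers bid the price up to this level
r_of_eq = optimal_forecast_return(p_eq, p_expected)   # forecast return after the adjustment

print(f"Before: R^of = {r_of:.2%} at P_t = {p_t:.2f}")
print(f"After:  R^of = {r_of_eq:.2%} at P_t = {p_eq:.2f}")
```

Because \(R^{of}\) falls as \(P_t\) rises, the unexploited profit opportunity disappears once the price reaches the equilibrium level.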

A stronger version of the EMH adds a further condition: not only are expectations optimal forecasts using all available information, but prices also reflect the true fundamental (intrinsic) value of the securities. Thus, in an efficient market, all prices are always correct and reflect market fundamentals.

In favor of the EMH: Random-Walk

A random walk describes the movements of a variable whose future changes cannot be predicted (are random) because, given today’s value, the variable is just as likely to fall as to rise. Under the EMH, all price changes are due to information that cannot be anticipated, so they must be uncorrelated over time. Stock prices should therefore approximately follow a random walk.

More formally, \(y_t\) follows a random walk if: \[ y_t = y_{t-1} + \epsilon_t \] where:

\(y_t:\) is the (log) price at time \(t\), so that the return \(r_t = y_t - y_{t-1} = \epsilon_t\) is unpredictable

\(\epsilon_t:\) is the random error term at time \(t\), independent and identically distributed (i.i.d.) with mean zero, constant variance, and no autocorrelation.

Against the EMH: Mean reversion and trend

Mean reversion means that stocks with low returns today tend to have high returns in the future, and vice versa; hence stocks that have done poorly in the past are more likely to do well in the future, which implies a predictable positive change in the future price. A trend occurs when a movement in one direction tends to be followed by another in the same direction. Either pattern suggests that stock prices do not follow a random walk.

More formally, using a simple first-order autoregressive process for the return \(r_t\):

\[ r_t = \rho r_{t-1} + \epsilon_t \] If \(0<\rho<1\), then trend

If \(-1<\rho<0\), then mean reversion

If \(\rho = 0\), returns are white noise, which is the random-walk case.

As we have seen earlier, when \(\rho \ne 0\), the time aggregation problem arises, which must be accounted for when measuring financial risk.
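The difference between unpredictable returns and the two autoregressive cases can be checked on simulated data. The sketch below is an illustration only: the coefficients (0.3 and -0.3), the shock volatility, and the helper names are assumptions made for the example.

```python
import random

def simulate_ar1(rho: float, n: int, seed: int) -> list:
    """Simulate n observations of x_t = rho * x_{t-1} + e_t with Gaussian shocks."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 0.01))
    return x

def lag1_autocorr(x: list) -> float:
    """Sample autocorrelation between x_t and x_{t-1}."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

n = 5000
white_noise = simulate_ar1(0.0, n, seed=1)    # rho = 0: no predictability
trend = simulate_ar1(0.3, n, seed=2)          # rho > 0: moves tend to continue
reversion = simulate_ar1(-0.3, n, seed=3)     # rho < 0: moves tend to reverse

for name, series in [("white noise", white_noise), ("trend", trend), ("mean reversion", reversion)]:
    print(f"{name:>15}: lag-1 autocorrelation = {lag1_autocorr(series):.3f}")
```

The estimated lag-1 autocorrelation should be close to the coefficient used in each simulation, which is what an autocorrelation test on real returns tries to detect.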

Note:

Dissatisfaction with the EMH's ability to explain events like Black Monday in 1987, when stock markets around the world plummeted and the Dow Jones Industrial Average (DJIA) suffered its largest one-day percentage drop in history (22.6%), gave rise to the field of behavioral finance.

Behavioral finance is a field of study that combines insights from psychology, sociology, anthropology and other social sciences with financial theory to understand how human behavior affects financial decision-making and markets. It challenges the traditional view of financial markets as perfectly rational and efficient, acknowledging that investors often act irrationally due to biases and emotions.


2 Testing the EMH

Recall: Autocorrelation measures the relationship between a variable and its own lagged values. Under the Efficient Market Hypothesis, stock returns should not be predictable from their past, that is, they should not be autocorrelated.

Time series that show no autocorrelation are called white noise. We can check whether a process is white noise by testing its autocorrelations \((\rho_k)\) at lags \(k = 1, \dots, \tau\).

Formally, we employ the Ljung and Box (1978) test to verify the null hypothesis:

\[ H_0: \ \rho_1 = \rho_2 = \dots = \rho_\tau = 0 \]

for \(\tau \ge 1\). If the null is not rejected, the evidence is consistent with the EMH.
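For reference, the Ljung-Box statistic is \(Q = n(n+2)\sum_{k=1}^{\tau} \hat{\rho}_k^2/(n-k)\), where \(n\) is the sample size and \(\hat{\rho}_k\) is the sample autocorrelation at lag \(k\); under the null, \(Q\) is approximately chi-squared with \(\tau\) degrees of freedom. The following is a minimal hand-rolled sketch of that formula, meant only to show what the library routine used later computes. The function names and the toy series are assumptions for the example.

```python
def sample_autocorr(x: list, k: int) -> float:
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def ljung_box_q(x: list, max_lag: int) -> float:
    """Ljung-Box Q statistic using lags 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        sample_autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Toy series that alternates +1, -1, ...: strongly negatively autocorrelated
alternating = [(-1.0) ** t for t in range(100)]
q = ljung_box_q(alternating, max_lag=1)

# The 5% critical value of a chi-squared distribution with 1 degree of freedom is about 3.84
print(f"Q = {q:.2f}, reject H0 at 5%: {q > 3.84}")
```

A large \(Q\) relative to the chi-squared critical value leads to rejecting the null of no autocorrelation.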

3 Application

We will test the Efficient Market Hypothesis (EMH) for individual stocks and then generalize the procedure to a portfolio of N stocks. The ultimate goal is to identify stocks whose return patterns reject the EMH, enabling us to build a portfolio composed of assets with potentially predictable returns.

Data will be obtained from Yahoo Finance. Make sure you have installed the library (!pip install yfinance) before importing it; you only have to do this once. I have also written some useful functions in a file named “iguizarFuncs.py”; please make sure this file is loaded.

3.1 One stock

Import the libraries:

# !pip install yfinance
import yfinance as yf
import pandas   as pd
import numpy    as np 
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox
import iguizarFuncs as ig

Download the data for one stock (AAPL):

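# auto_adjust=True (the current yfinance default) returns prices adjusted for splits and dividends
Data = yf.download('AAPL', start='2023-09-30', end='2024-09-30', auto_adjust=True, progress=False)['Close']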

# Reset index to make the date a regular column 
Data = Data.reset_index()

# Clean the 'Date' column
Data['Date'] = pd.to_datetime(Data['Date']).dt.date

Plot the price data:

plt.figure(figsize=(7, 5))
# Label the line so that plt.legend() has an entry to display
plt.plot(Data['Date'], Data['AAPL'], label='AAPL')
plt.title("Time Series of the stock's prices")
plt.xlabel("Date", fontsize = 10)
plt.ylabel("Price", fontsize=10)
plt.xticks(rotation=45)
plt.tight_layout()
plt.legend()
plt.show()

Calculate the log returns:

Data['Log_Returns'] = np.log(Data['AAPL'] / Data['AAPL'].shift(1))
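Section 1.1 defines the arithmetic return, while the code above uses log returns. The short sketch below, with hypothetical prices, shows how the two are related and why log returns are convenient: they add up across periods. Note also that the first row of `Log_Returns` is missing (NaN) because `shift(1)` has no previous price for it.

```python
import math

# Hypothetical prices, chosen only for illustration
p_prev, p_now = 100.0, 102.0

arithmetic = (p_now - p_prev) / p_prev   # (P_t - P_{t-1}) / P_{t-1}
log_ret = math.log(p_now / p_prev)       # ln(P_t / P_{t-1}) = ln(1 + arithmetic return)

# Log returns are additive over time: daily log returns sum to the whole-period log return
prices = [100.0, 102.0, 99.0, 101.0]
daily = [math.log(b / a) for a, b in zip(prices, prices[1:])]
total = math.log(prices[-1] / prices[0])

print(f"arithmetic = {arithmetic:.4%}, log = {log_ret:.4%}")
print(f"sum of daily log returns = {sum(daily):.6f}, whole-period log return = {total:.6f}")
```

For small daily moves the two measures are nearly identical, so the choice has little effect on the autocorrelation tests that follow.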

Plot the returns:

plt.figure(figsize=(7, 5))

plt.plot(Data['Date'], Data['Log_Returns'], label='AAPL log returns')
plt.axhline(0, color='black', linestyle='--', linewidth=0.5)  
plt.title("Time series of the stock's returns")
plt.xlabel("Date")
plt.ylabel("Returns")
plt.xticks(rotation=45)
plt.tight_layout()
plt.legend()
plt.show()

The Ljung-Box test

We can now apply the Ljung-Box test. The analyst decides the number of lags; in this case we choose 5. The null hypothesis is then:

\[ \rho_1 = \rho_2 = \rho_3 = \rho_4 = \rho_5=0 \]

indicates that the returns of the past 5 days have no linear influence on current returns. If the null hypothesis were rejected, we would have statistical evidence that returns from at least one of the past 5 days are significantly correlated with today’s return, implying a deviation from the EMH.

# Run Ljung-Box test on log returns with 5 lags
ljung_box_results = acorr_ljungbox(Data['Log_Returns'].dropna(), lags=[5], return_df=True)

# Display the test results
print(ljung_box_results)
    lb_stat  lb_pvalue
5  3.659679   0.599376

Since the p-value (0.599) is not below any conventional significance level (p-value ≥ 0.10), we fail to reject the null hypothesis. For this particular stock and sample period, the test does not detect significant autocorrelation, so returns appear serially uncorrelated, a result consistent with the Efficient Market Hypothesis (EMH).

3.2 Multiple stocks

To generalize this approach to multiple stocks, we will use the file “toUScompanies.csv” that lists the top US companies by market cap. Please make sure you have loaded this file.

  1. Specify the period and tickers of interest.
companies = pd.read_csv('toUScompanies.csv')
tickers   = companies['Symbol'].tolist()

start_date = '2023-09-30'
end_date   = '2024-09-30'

The function ‘get_prices’ is useful to extract daily prices for multiple tickers over a specified period:

nStocks = ig.get_prices(tickers, start_date, end_date)
nStocks.head(5)
Ticker Date AAPL ABBV ABNB ABT ADBE ADI ADP AMAT AMD ... UBER UNH UNP UPS V VRTX VZ WFC WMT XOM
0 2023-10-02 172.485809 138.997025 136.559998 92.424889 521.130005 170.950928 231.582275 137.814804 103.269997 ... 45.680000 502.766541 196.067673 144.168121 228.643341 347.829987 28.193148 38.061131 52.395565 109.820274
1 2023-10-03 171.145615 138.228180 127.730003 92.482803 507.029999 167.131012 232.598114 135.058731 100.080002 ... 44.509998 497.783508 197.170700 142.931702 226.211273 345.149994 28.388437 37.157883 52.065022 110.010231
2 2023-10-04 172.396454 138.471970 127.410004 92.347656 518.419983 169.352798 236.042297 137.607376 104.070000 ... 44.939999 498.907135 195.709686 142.903809 228.593918 352.970001 27.997854 37.446159 52.690102 105.897789
3 2023-10-05 173.637360 138.246933 124.989998 92.878654 516.440002 167.705978 235.442490 137.587631 102.910004 ... 44.610001 504.388428 194.132523 142.587738 230.828247 355.140015 28.246408 37.763252 52.061752 103.513908
4 2023-10-06 176.198624 138.987640 126.360001 93.535172 526.679993 169.528229 238.364227 138.585327 107.239998 ... 45.779999 512.771606 195.893509 143.415131 232.370529 360.619995 27.969883 38.138004 51.187946 101.785339

5 rows × 111 columns

The function ‘get_returns’, in turn, helps us obtain returns for multiple tickers over a specified period:

nStocksRet = ig.get_returns(tickers, start_date, end_date)
nStocksRet.head(5)
Ticker Date AAPL ABBV ABNB ABT ADBE ADI ADP AMAT AMD ... UBER UNH UNP UPS V VRTX VZ WFC WMT XOM
0 2023-10-03 -0.007800 -0.005547 -0.066845 0.000627 -0.027429 -0.022599 0.004377 -0.020201 -0.031377 ... -0.025947 -0.009961 0.005610 -0.008613 -0.010694 -0.007735 0.006903 -0.024017 -0.006329 0.001728
1 2023-10-04 0.007282 0.001762 -0.002508 -0.001463 0.022216 0.013206 0.014699 0.018695 0.039094 ... 0.009614 0.002255 -0.007437 -0.000195 0.010478 0.022404 -0.013854 0.007728 0.011934 -0.038099
2 2023-10-05 0.007172 -0.001626 -0.019177 0.005734 -0.003827 -0.009772 -0.002544 -0.000143 -0.011209 ... -0.007370 0.010927 -0.008091 -0.002214 0.009727 0.006129 0.008838 0.008432 -0.011997 -0.022768
3 2023-10-06 0.014643 0.005344 0.010901 0.007044 0.019634 0.010807 0.012333 0.007225 0.041214 ... 0.025889 0.016484 0.009030 0.005786 0.006659 0.015313 -0.009838 0.009875 -0.016927 -0.016840
4 2023-10-09 0.008416 0.005852 0.011097 -0.001239 0.004943 -0.003743 0.015306 -0.000998 -0.002521 ... -0.007234 0.003234 0.009047 0.000454 -0.002556 -0.015144 0.019262 0.000252 -0.003651 0.034393

5 rows × 111 columns
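The source of ‘get_returns’ is not shown here, but the printed values are consistent with log returns of the prices above (for AAPL, ln(171.145615 / 172.485809) ≈ -0.007800). The following is a hypothetical sketch of such a helper, not the actual code in “iguizarFuncs.py”; the function name `log_returns_from_prices` and its arguments are assumptions.

```python
import numpy as np
import pandas as pd

def log_returns_from_prices(prices: pd.DataFrame, date_col: str = "Date") -> pd.DataFrame:
    """Hypothetical helper: log returns for every price column of a wide price table."""
    price_cols = [c for c in prices.columns if c != date_col]
    rets = np.log(prices[price_cols] / prices[price_cols].shift(1))
    out = pd.concat([prices[[date_col]], rets], axis=1)
    # The first row has no previous price, so it is dropped
    # (dropna also removes any later date with a missing price)
    return out.dropna().reset_index(drop=True)

# Three AAPL prices copied from the table printed above
demo = pd.DataFrame({
    "Date": pd.to_datetime(["2023-10-02", "2023-10-03", "2023-10-04"]),
    "AAPL": [172.485809, 171.145615, 172.396454],
})
demo_returns = log_returns_from_prices(demo)
print(demo_returns)
```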

  2. Use the function ‘ljung_box_test’ to run the Ljung-Box test for each of the stocks, and store the results in a data frame named lb_results. Note that the function allows us to choose the number of lags; we will continue using 5:
lags = 5
lb_results = ig.ljung_box_test(tickers, start_date, end_date, lags)

# Display the results
lb_results
Ticker Ljung-Box Statistic P-value
0 AAPL 3.659644 0.599381
1 ABBV 0.964368 0.965388
2 ABNB 5.066243 0.407849
3 ABT 3.803183 0.578086
4 ADBE 1.699569 0.888954
... ... ... ...
105 VRTX 8.923362 0.112160
106 VZ 9.525372 0.089856
107 WFC 6.704654 0.243548
108 WMT 1.462580 0.917347
109 XOM 5.994274 0.306776

110 rows × 3 columns

Filter for those stocks that contradict the EMH. In this example, we keep the stocks with a p-value < 0.20, that is, those for which the null hypothesis is rejected at a relatively low confidence level of 80%.

no_stocks = lb_results[lb_results['P-value'] < 0.20]


print(f"{no_stocks.shape[0]} stocks reject the EMH")
28 stocks reject the EMH
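When many stocks are tested at once, some rejections are expected by chance alone. As a rough back-of-the-envelope check, the sketch below computes how many of the 110 tests would reject at the 20% level if the null were true for every stock. It assumes the tests are independent, which is unlikely to hold exactly because stock returns share common market movements, so the numbers are only indicative.

```python
from math import comb

n_tests = 110        # stocks tested (rows in lb_results)
alpha = 0.20         # significance level used in the filter
n_rejections = 28    # stocks with p-value < alpha

# Expected number of rejections if the null were true for every stock
expected_false_rejections = n_tests * alpha

def binom_tail(n: int, p: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Chance of seeing at least this many rejections under the null (independence assumed)
prob_at_least_observed = binom_tail(n_tests, alpha, n_rejections)

print(f"Expected rejections under the null: {expected_false_rejections:.0f}")
print(f"P(at least {n_rejections} rejections): {prob_at_least_observed:.3f}")
```

This suggests that part of the 28 rejections could be false positives, so the filtered list is better read as a set of candidates for further study than as proof of predictability.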

We can now export the list of tickers that contradict the EMH to an Excel file for future reference:

no_stocks.to_excel('no_stocks.xlsx', index=False)

Or select from the original list of companies:

select_companies = companies[companies['Symbol'].isin(no_stocks['Ticker'])]
select_companies
Name Symbol Country Sector Market Cap
5 Amazon.com Inc AMZN US Consumer Discretionary 2.030020e+12
6 Meta Platforms Inc META US Communication Services 1.499610e+12
7 Berkshire Hathaway Inc BRK-B US Financials 9.813140e+11
8 Berkshire Hathaway Inc BRK-A US Financials 9.813140e+11
10 Broadcom Inc AVGO US Information Technology 8.257580e+11
14 Visa Inc V US Financials 5.738230e+11
17 Oracle Corp ORCL US Information Technology 4.846800e+11
18 Mastercard Inc MA US Financials 4.750540e+11
19 Procter & Gamble Co PG US Consumer Staples 3.916550e+11
24 Bank of America Corp BAC US Financials 3.254660e+11
36 Thermo Fisher Scientific Inc TMO US Health Care 2.102390e+11
37 McDonald's Corp MCD US Consumer Discretionary 2.099230e+11
43 Morgan Stanley MS US Financials 1.907780e+11
44 Texas Instruments Inc TXN US Information Technology 1.903750e+11
45 General Electric Co GE US Industrials 1.891850e+11
47 Qualcomm Inc QCOM US Information Technology 1.873800e+11
52 Verizon Communications Inc VZ US Communication Services 1.741730e+11
57 Comcast Corp CMCSA US Communication Services 1.638650e+11
71 Charles Schwab Corp SCHW US Financials 1.302400e+11
76 Boston Scientific Corp BSX US Health Care 1.238550e+11
77 Vertex Pharmaceuticals Inc VRTX US Health Care 1.224950e+11
81 Palo Alto Networks Inc PANW US Information Technology 1.171250e+11
83 United Parcel Service Inc UPS US Industrials 1.148200e+11
84 Analog Devices Inc ADI US Information Technology 1.146350e+11
94 Regeneron Pharmaceuticals Inc REGN US Health Care 1.018900e+11
99 Intel Corp INTC US Information Technology 9.595344e+10
101 Elevance Health Inc ELV US Health Care 9.529940e+10
105 KLA Corp KLAC US Information Technology 9.278362e+10

At the 20% significance level, the statistical evidence rejects the EMH for these stocks over the sample period.

References

Ljung, G. M., and G. E. P. Box. 1978. “On a Measure of Lack of Fit in Time Series Models.” Biometrika 65 (2): 297–303. https://doi.org/10.1093/biomet/65.2.297.
Mishkin, Frederic S, and Stanley G Eakins. 2024. Financial Markets and Institutions. Pearson.