Efficient Financial Markets

Isai Guizar

Disclaimer: This document is intended for educational purposes only. It does not constitute financial advice. It is part of my Risk Financial Management course at Tec de Monterrey.

1 Intro

1.1 The Efficient Market Hypothesis

The description of the efficient market hypothesis (EMH) here follows Mishkin and Eakins (2024). It states that prices in financial markets fully reflect all available information. To understand the hypothesis, the authors define the arithmetic rate of return from holding a stock from time \(t\) to \(t+1\) as:

\[ R = \frac{P_{t+1}-P_t}{P_t} \] that is, the rate of capital gains (assuming no other cash payments).
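As a quick numerical illustration of this definition, the sketch below computes the return for two made-up prices. The function name and the numbers are assumptions chosen for the example, not market data.

```python
def arithmetic_return(p_t: float, p_next: float) -> float:
    """Rate of capital gain from holding the stock from t to t+1."""
    return (p_next - p_t) / p_t


# Hypothetical prices, chosen only to illustrate the formula
p_t, p_next = 100.0, 105.0
R = arithmetic_return(p_t, p_next)
print(f"R = {R:.2%}")
```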

At the start of the period, however, \(P_{t+1}\) is unknown. Investors nevertheless hold an expectation of that price, \(P^e_{t+1}\), so the expected return is:

\[ R^e = \frac{P^e_{t+1}-P_t}{P_t} \]

The EMH views these expectations as the optimal forecast of the return, \(R^{of}\), that is, the best guess of the future using all available information: \(R^e = R^{of}\).

The supply-and-demand analysis of a financial market teaches us that the expected return on a stock will converge to the equilibrium return, \(R^*\), that equates the quantity demanded to the quantity supplied. Then, if the market is in equilibrium, \(R^e = R^*\).

Therefore, in an efficient market:

\[ R^{of}=R^* \]

That is, current prices in a financial market will be set so that the optimal forecast of a security’s return using all available information equals the security’s equilibrium return.

\[ \text{If } R^{of} > R^* \Rightarrow P_t \uparrow \ \rightarrow \ R^{of} \downarrow \]

\[ \text{If } R^{of} < R^* \Rightarrow P_t \downarrow \ \rightarrow \ R^{of} \uparrow \]

until \(R^{of}=R^*\).
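The adjustment mechanism can be sketched numerically. Assume, purely for illustration, an expected next-period price of 110 and an equilibrium return of 10%. The price at which the forecast return equals \(R^*\) is then \(P^e_{t+1}/(1+R^*)\). The function names and all numbers below are hypothetical.

```python
def optimal_forecast_return(p_t: float, p_expected: float) -> float:
    """R^of implied by today's price and the expected price at t+1."""
    return (p_expected - p_t) / p_t


def equilibrium_price(p_expected: float, r_star: float) -> float:
    """Price today at which the forecast return equals R*."""
    return p_expected / (1.0 + r_star)


# Illustrative numbers only (assumptions, not data)
p_expected, r_star = 110.0, 0.10

for p_t in (95.0, 100.0, 105.0):
    r_of = optimal_forecast_return(p_t, p_expected)
    if r_of > r_star + 1e-12:
        pressure = "price rises"      # forecast return too high: buyers bid P_t up
    elif r_of < r_star - 1e-12:
        pressure = "price falls"      # forecast return too low: sellers push P_t down
    else:
        pressure = "equilibrium"
    print(f"P_t = {p_t:6.2f}  R^of = {r_of:6.2%}  -> {pressure}")

print("Equilibrium price:", round(equilibrium_price(p_expected, r_star), 2))
```

A price below the equilibrium level implies a forecast return above \(R^*\), so buying pressure raises the price; the opposite holds for a price above it.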

More simply: in an efficient market a stock’s price fully reflects all available information.

A stronger version of the EMH adds a further condition: not only are expectations optimal forecasts using all available information, but prices also reflect the true fundamental (intrinsic) value of the securities. Thus, in this view, all prices are always correct and reflect market fundamentals.

In favor of the EMH: Random-Walk

A random walk describes the movements of a variable whose future changes cannot be predicted (are random) because, given today’s value, the variable is just as likely to fall as to rise. Under the EMH, all price changes are due to information that cannot be anticipated, so price changes must be uncorrelated over time. Stock prices should then approximately follow a random walk.

More formally, \(y_t\) follows a random walk if:

\[ y_t = y_{t-1} + \epsilon_t \]

where:

\(y_t:\) is the (log) price at time \(t\), so that the change \(y_t - y_{t-1}\) is the return

\(\epsilon_t:\) is the random error term at time \(t\), independent and identically distributed (i.i.d.) with mean zero, constant variance, and no autocorrelation.
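A short simulation shows the defining property of this process: the level \(y_t\) is highly persistent, while its changes \(y_t - y_{t-1} = \epsilon_t\) are serially uncorrelated. The seed, sample size, and shock volatility are arbitrary assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the sketch is reproducible
n = 5_000

eps = rng.normal(loc=0.0, scale=0.01, size=n)   # i.i.d. shocks with mean zero
y = np.cumsum(eps)                              # y_t = y_{t-1} + eps_t, starting from zero


def lag1_autocorr(x: np.ndarray) -> float:
    """Sample autocorrelation at lag 1."""
    xc = x - x.mean()
    return float(np.sum(xc[1:] * xc[:-1]) / np.sum(xc ** 2))


changes = np.diff(y)   # recovers the shocks eps_t for t >= 2
print("lag-1 autocorrelation of the level  :", round(lag1_autocorr(y), 3))
print("lag-1 autocorrelation of the changes:", round(lag1_autocorr(changes), 3))
```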

Against the EMH: Mean reversion and trend

Stocks with low returns today tend to have high returns in the future, and vice versa. Under mean reversion, stocks that have done poorly in the past are therefore more likely to do well in the future, which implies a predictable positive change in the future price. A trend, in contrast, occurs when a movement in one direction tends to be followed by another in the same direction. Either pattern suggests that stock prices do not follow a random walk.

More formally, letting \(y_t\) denote the return at time \(t\) and using a simple first-order autoregressive process:

\[ y_t = \rho y_{t-1} + \epsilon_t \]

If \(0<\rho<1\), then trend

If \(-1<\rho<0\), then mean reversion
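The two cases can be seen in simulated data. The sketch below generates a first-order autoregression for a positive, a zero, and a negative \(\rho\), and reports the sample lag-1 autocorrelation, which should land close to the chosen \(\rho\). The function names, seed, and sample size are assumptions made for the example.

```python
import numpy as np


def simulate_ar1(rho: float, n: int, seed: int) -> np.ndarray:
    """Simulate y_t = rho * y_{t-1} + eps_t with standard normal shocks."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    y = np.empty(n)
    y[0] = eps[0]
    for t in range(1, n):
        y[t] = rho * y[t - 1] + eps[t]
    return y


def lag1_autocorr(x: np.ndarray) -> float:
    """Sample autocorrelation at lag 1."""
    xc = x - x.mean()
    return float(np.sum(xc[1:] * xc[:-1]) / np.sum(xc ** 2))


for rho in (0.5, 0.0, -0.5):   # trend, no dependence, mean reversion
    y = simulate_ar1(rho, n=5_000, seed=0)
    print(f"rho = {rho:+.1f}  sample lag-1 autocorrelation = {lag1_autocorr(y):+.3f}")
```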

As we have seen earlier, when \(\rho \ne 0\), the time aggregation problem arises, which must be accounted for when measuring financial risk.
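One way to see why \(\rho \ne 0\) matters, assuming the problem referred to here is the breakdown of the square-root-of-time rule for scaling volatility: for a stationary first-order autoregression, the variance of a two-period return is \(2\sigma_y^2(1+\rho)\) rather than \(2\sigma_y^2\). The simulation below is a rough check of that ratio; the function name, seed, and sample size are assumptions.

```python
import numpy as np


def two_period_variance_ratio(rho: float, n: int = 100_000, seed: int = 1) -> float:
    """Var(y_t + y_{t+1}) / (2 * Var(y_t)) for a simulated AR(1)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=n)
    y = np.empty(n)
    y[0] = eps[0]
    for t in range(1, n):
        y[t] = rho * y[t - 1] + eps[t]
    two_period = y[:-1] + y[1:]   # overlapping two-period returns
    return float(two_period.var() / (2.0 * y.var()))


for rho in (-0.3, 0.0, 0.3):
    ratio = two_period_variance_ratio(rho)
    print(f"rho = {rho:+.1f}  simulated ratio = {ratio:.3f}  theory 1 + rho = {1 + rho:.3f}")
```

With a positive \(\rho\) the square-root-of-time rule understates multi-period risk, and with a negative \(\rho\) it overstates it.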

Note:

Dissatisfaction with the EMH as an explanation for events like Black Monday in 1987 gave rise to the field of behavioral finance. On that day stock markets around the world plummeted, and the Dow Jones Industrial Average (DJIA) experienced its largest one-day percentage drop in history, 22.6%.

Behavioral finance is a field of study that combines insights from psychology, sociology, anthropology and other social sciences with financial theory to understand how human behavior affects financial decision-making and markets. It challenges the traditional view of financial markets as perfectly rational and efficient, acknowledging that investors often act irrationally due to biases and emotions.

2 Testing the EMH

Recall: Autocorrelation measures the relationship between a variable and its own lagged values. Under the Efficient Market Hypothesis, stock returns should not be predictable, that is, they should not be autocorrelated.

Time series that show no autocorrelation are called white noise. We can evaluate whether a process is white noise by examining its autocorrelations \((\rho)\).

Formally, we employ the Ljung and Box (1978) test to verify the null hypothesis:

\[ \rho_1 = \rho_2 = \dots = \rho_\tau = 0 \]

for \(\tau \ge 1\). If the null is not rejected, the evidence is consistent with the EMH.
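To make the test concrete, the Ljung-Box statistic is \(Q = n(n+2)\sum_{k=1}^{\tau} \hat\rho_k^2/(n-k)\), which is asymptotically chi-square with \(\tau\) degrees of freedom under the null. The sketch below computes it from scratch using only NumPy and the standard library. The helper names `chi2_sf` and `ljung_box` are hypothetical, and in practice the `acorr_ljungbox` function from statsmodels does this work.

```python
import math
import numpy as np


def chi2_sf(q: float, df: int) -> float:
    """Upper-tail probability of a chi-square with integer df (stdlib only)."""
    y = q / 2.0
    # Start from df = 1 or df = 2, then raise df by 2 per step using
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    if df % 2 == 1:
        a, p = 0.5, math.erfc(math.sqrt(y))
    else:
        a, p = 1.0, math.exp(-y)
    while a < df / 2.0:
        p += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return p


def ljung_box(x, lags: int):
    """Return (Q, p-value) using the first `lags` sample autocorrelations."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        r_k = np.sum(xc[k:] * xc[:-k]) / denom   # sample autocorrelation at lag k
        q += r_k ** 2 / (n - k)
    q *= n * (n + 2)
    return float(q), chi2_sf(float(q), lags)


# Simulated white noise (arbitrary seed): the null is true by construction
rng = np.random.default_rng(7)
white_noise = rng.normal(size=500)
q_stat, p_value = ljung_box(white_noise, lags=5)
print(f"Q = {q_stat:.3f}, p-value = {p_value:.3f}")
```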

3 Application

We will test the Efficient Market Hypothesis (EMH) for individual stocks and then generalize the procedure to a portfolio of N stocks. The ultimate goal is to identify stocks whose return patterns reject the EMH, enabling us to build a portfolio composed of assets with potentially predictable returns.

Data will be obtained from Yahoo Finance. Make sure you have installed the library (!pip install yfinance) before importing it; you only have to do this once. I have also written some useful functions in a file named “iguizarFuncs.py”; please make sure this file is loaded.

3.1 One stock

Import the libraries:

# !pip install yfinance
import yfinance as yf
import pandas   as pd
import numpy    as np 
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox
import iguizarFuncs as ig

Download the data for one stock (AAPL):

Data = yf.download('AAPL', start='2023-09-30', end='2024-09-30', progress=False)['Close']

# Reset index to make the date a regular column 
Data = Data.reset_index()

# Clean the 'Date' column
Data['Date'] = pd.to_datetime(Data['Date']).dt.date


Plot the price data:

plt.figure(figsize=(7, 5))
# A label is required so that plt.legend() has an entry to show
plt.plot(Data['Date'], Data['AAPL'], label='AAPL')
plt.title("Time series of the stock's price")
plt.xlabel("Date", fontsize=10)
plt.ylabel("Price", fontsize=10)
plt.xticks(rotation=45)
plt.tight_layout()
plt.legend()
plt.show()

Calculate the log returns:

Data['Log_Returns'] = np.log(Data['AAPL'] / Data['AAPL'].shift(1))
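# A small sketch, using hypothetical prices rather than the downloaded data,
# of two properties of log returns that the line above relies on.
```python
import numpy as np

prices = np.array([100.0, 102.0, 99.0, 103.0])   # hypothetical prices

log_ret = np.log(prices[1:] / prices[:-1])       # daily log returns
arith_ret = prices[1:] / prices[:-1] - 1.0       # daily arithmetic returns

# Log returns add up across days to the whole-period log return
total_log = log_ret.sum()
print(np.isclose(total_log, np.log(prices[-1] / prices[0])))

# The two definitions are linked by log_ret = log(1 + arith_ret)
print(np.allclose(log_ret, np.log1p(arith_ret)))
```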

Plot the returns:

plt.figure(figsize=(7, 5))

# A label is required so that plt.legend() has an entry to show
plt.plot(Data['Date'], Data['Log_Returns'], label='AAPL log returns')
plt.axhline(0, color='black', linestyle='--', linewidth=0.5)
plt.title("Time series of the stock's returns")
plt.xlabel("Date")
plt.ylabel("Returns")
plt.xticks(rotation=45)
plt.tight_layout()
plt.legend()
plt.show()

The Ljung-Box test

We can now apply the Ljung-Box test. The analyst decides the number of lags; in this case we choose 5. The null hypothesis is then:

\[ \rho_1 = \rho_2 = \rho_3 = \rho_4 = \rho_5=0 \]

indicates that the returns of the past 5 days are uncorrelated with current returns. If the null hypothesis were rejected, we would have statistical evidence that returns from at least one of the past 5 days are significantly correlated with today’s return, implying a deviation from the EMH.

# Run Ljung-Box test on log returns with 5 lags
ljung_box_results = acorr_ljungbox(Data['Log_Returns'].dropna(), lags=[5], return_df=True)

# Display the test results
print(ljung_box_results)

    lb_stat  lb_pvalue
5  3.659679   0.599376

Since the p-value (0.599) is not below any conventional significance level, we fail to reject the null hypothesis. For this particular stock, the test does not detect significant autocorrelation in the first 5 lags (p-value ≥ 0.10), so returns do not appear predictable from their own recent past, a result consistent with the Efficient Market Hypothesis (EMH).

3.2 Multiple stocks

To generalize this approach to multiple stocks, we will use the file “toUScompanies.csv” that lists the top US companies by market cap. Please make sure you have loaded this file.

Specify the period and tickers of interest.

companies = pd.read_csv('toUScompanies.csv')
tickers   = companies['Symbol'].tolist()

start_date = '2023-09-30'
end_date   = '2024-09-30'

The function ‘get_prices’ is useful to extract daily prices for multiple tickers over a specified period:

nStocks = ig.get_prices(tickers, start_date, end_date)
nStocks.head(5)

Ticker	Date	AAPL	ABBV	ABNB	ABT	ADBE	ADI	ADP	AMAT	AMD	...	UBER	UNH	UNP	UPS	V	VRTX	VZ	WFC	WMT	XOM
0	2023-10-02	172.485809	138.997025	136.559998	92.424889	521.130005	170.950928	231.582275	137.814804	103.269997	...	45.680000	502.766541	196.067673	144.168121	228.643341	347.829987	28.193148	38.061131	52.395565	109.820274
1	2023-10-03	171.145615	138.228180	127.730003	92.482803	507.029999	167.131012	232.598114	135.058731	100.080002	...	44.509998	497.783508	197.170700	142.931702	226.211273	345.149994	28.388437	37.157883	52.065022	110.010231
2	2023-10-04	172.396454	138.471970	127.410004	92.347656	518.419983	169.352798	236.042297	137.607376	104.070000	...	44.939999	498.907135	195.709686	142.903809	228.593918	352.970001	27.997854	37.446159	52.690102	105.897789
3	2023-10-05	173.637360	138.246933	124.989998	92.878654	516.440002	167.705978	235.442490	137.587631	102.910004	...	44.610001	504.388428	194.132523	142.587738	230.828247	355.140015	28.246408	37.763252	52.061752	103.513908
4	2023-10-06	176.198624	138.987640	126.360001	93.535172	526.679993	169.528229	238.364227	138.585327	107.239998	...	45.779999	512.771606	195.893509	143.415131	232.370529	360.619995	27.969883	38.138004	51.187946	101.785339

5 rows × 111 columns

The function ‘get_returns’, in turn, helps us obtain returns for multiple tickers over a specified period:

nStocksRet = ig.get_returns(tickers, start_date, end_date)
nStocksRet.head(5)

Ticker	Date	AAPL	ABBV	ABNB	ABT	ADBE	ADI	ADP	AMAT	AMD	...	UBER	UNH	UNP	UPS	V	VRTX	VZ	WFC	WMT	XOM
0	2023-10-03	-0.007800	-0.005547	-0.066845	0.000627	-0.027429	-0.022599	0.004377	-0.020201	-0.031377	...	-0.025947	-0.009961	0.005610	-0.008613	-0.010694	-0.007735	0.006903	-0.024017	-0.006329	0.001728
1	2023-10-04	0.007282	0.001762	-0.002508	-0.001463	0.022216	0.013206	0.014699	0.018695	0.039094	...	0.009614	0.002255	-0.007437	-0.000195	0.010478	0.022404	-0.013854	0.007728	0.011934	-0.038099
2	2023-10-05	0.007172	-0.001626	-0.019177	0.005734	-0.003827	-0.009772	-0.002544	-0.000143	-0.011209	...	-0.007370	0.010927	-0.008091	-0.002214	0.009727	0.006129	0.008838	0.008432	-0.011997	-0.022768
3	2023-10-06	0.014643	0.005344	0.010901	0.007044	0.019634	0.010807	0.012333	0.007225	0.041214	...	0.025889	0.016484	0.009030	0.005786	0.006659	0.015313	-0.009838	0.009875	-0.016927	-0.016840
4	2023-10-09	0.008416	0.005852	0.011097	-0.001239	0.004943	-0.003743	0.015306	-0.000998	-0.002521	...	-0.007234	0.003234	0.009047	0.000454	-0.002556	-0.015144	0.019262	0.000252	-0.003651	0.034393

5 rows × 111 columns
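The source of “iguizarFuncs.py” is not shown here, so the following is only a hypothetical stand-in for what a helper like ‘get_returns’ might do once prices are available. It assumes the returns are log returns, which is consistent with Section 3.1, and that the price table has a Date column plus one column per ticker. The function name `log_returns_from_prices` is an assumption.

```python
import numpy as np
import pandas as pd


def log_returns_from_prices(prices: pd.DataFrame, date_col: str = "Date") -> pd.DataFrame:
    """Hypothetical helper: daily log returns for every ticker column."""
    px = prices.set_index(date_col)
    # The first row has no previous price, so it is dropped
    rets = np.log(px / px.shift(1)).dropna(how="all")
    return rets.reset_index()


# First rows of the AAPL and ABBV price columns displayed earlier
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2023-10-02", "2023-10-03", "2023-10-04"]),
    "AAPL": [172.485809, 171.145615, 172.396454],
    "ABBV": [138.997025, 138.228180, 138.471970],
})
print(log_returns_from_prices(prices).round(6))
```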

Use the function ‘ljung_box_test’ to test the EMH using the Ljung-Box test for each of the stocks, and store the results in a data frame named lb_results. Note that the function allows us to choose the number of lags; we will continue using 5:

lags = 5
lb_results = ig.ljung_box_test(tickers, start_date, end_date, lags)

# Display the results
lb_results

	Ticker	Ljung-Box Statistic	P-value
0	AAPL	3.659644	0.599381
1	ABBV	0.964368	0.965388
2	ABNB	5.066243	0.407849
3	ABT	3.803183	0.578086
4	ADBE	1.699569	0.888954
...	...	...	...
105	VRTX	8.923362	0.112160
106	VZ	9.525372	0.089856
107	WFC	6.704654	0.243548
108	WMT	1.462580	0.917347
109	XOM	5.994274	0.306776

110 rows × 3 columns

Filter for the stocks that contradict the EMH. In this example we use a p-value < 0.20, that is, we keep the stocks that reject the null hypothesis at a relatively low confidence level of 80%.

no_stocks = lb_results[lb_results['P-value'] < 0.20]


print(f"{no_stocks.shape[0]} stocks reject the EMH")

28 stocks reject the EMH

We can now export the list of tickers that contradict the EMH to an Excel file for future reference:

no_stocks.to_excel('no_stocks.xlsx', index=False)

Or select from the original list of companies:

select_companies = companies[companies['Symbol'].isin(no_stocks['Ticker'])]
select_companies

	Name	Symbol	Country	Sector	Market Cap
5	Amazon.com Inc	AMZN	US	Consumer Discretionary	2.030020e+12
6	Meta Platforms Inc	META	US	Communication Services	1.499610e+12
7	Berkshire Hathaway Inc	BRK-B	US	Financials	9.813140e+11
8	Berkshire Hathaway Inc	BRK-A	US	Financials	9.813140e+11
10	Broadcom Inc	AVGO	US	Information Technology	8.257580e+11
14	Visa Inc	V	US	Financials	5.738230e+11
17	Oracle Corp	ORCL	US	Information Technology	4.846800e+11
18	Mastercard Inc	MA	US	Financials	4.750540e+11
19	Procter & Gamble Co	PG	US	Consumer Staples	3.916550e+11
24	Bank of America Corp	BAC	US	Financials	3.254660e+11
36	Thermo Fisher Scientific Inc	TMO	US	Health Care	2.102390e+11
37	McDonald's Corp	MCD	US	Consumer Discretionary	2.099230e+11
43	Morgan Stanley	MS	US	Financials	1.907780e+11
44	Texas Instruments Inc	TXN	US	Information Technology	1.903750e+11
45	General Electric Co	GE	US	Industrials	1.891850e+11
47	Qualcomm Inc	QCOM	US	Information Technology	1.873800e+11
52	Verizon Communications Inc	VZ	US	Communication Services	1.741730e+11
57	Comcast Corp	CMCSA	US	Communication Services	1.638650e+11
71	Charles Schwab Corp	SCHW	US	Financials	1.302400e+11
76	Boston Scientific Corp	BSX	US	Health Care	1.238550e+11
77	Vertex Pharmaceuticals Inc	VRTX	US	Health Care	1.224950e+11
81	Palo Alto Networks Inc	PANW	US	Information Technology	1.171250e+11
83	United Parcel Service Inc	UPS	US	Industrials	1.148200e+11
84	Analog Devices Inc	ADI	US	Information Technology	1.146350e+11
94	Regeneron Pharmaceuticals Inc	REGN	US	Health Care	1.018900e+11
99	Intel Corp	INTC	US	Information Technology	9.595344e+10
101	Elevance Health Inc	ELV	US	Health Care	9.529940e+10
105	KLA Corp	KLAC	US	Information Technology	9.278362e+10

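At the 20% significance level, the Ljung-Box test rejects the null of no autocorrelation for these 28 stocks, which is statistical evidence against the EMH for them. Note that with 110 tests at this level, about 22 rejections would be expected by chance even if every null hypothesis were true, so the list is best read as a set of candidates for further analysis.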

References

Ljung, G. M., and G. E. P. Box. 1978. “On a Measure of Lack of Fit in Time Series Models.” Biometrika 65 (2): 297–303. https://doi.org/10.1093/biomet/65.2.297.

Mishkin, Frederic S, and Stanley G Eakins. 2024. Financial Markets and Institutions. Pearson.