Market behavior and applied market anomalies

FZ2024 Financial Modeling and Programming

Author: Sergio Castellanos-Gamboa, PhD

Affiliation: Tecnológico de Monterrey

Published: November 18, 2025

0.1 Before you begin: important instructions for all Workshops

Welcome to our workshop series! Please read these instructions carefully before starting any activity. Following these guidelines will make your work smoother and ensure that your submissions are graded without issues.

0.1.1 Working environment

We will use Google Colab for all workshops. Colab runs Python in the cloud — you don’t need to install anything locally.

Access Colab at: https://colab.research.google.com/
Sign in with your institutional Google account for access to all features.
Always save a copy of the notebook to your Google Drive:
- Go to File → Save a copy in Drive.

0.1.2 Loading data

You may work with datasets provided by the instructor or public datasets online. You will receive instructions each time to load the data with Python code. However, it is a good idea to store files, like data or your own notes, in a dedicated Google Drive folder:

Create a folder in your Google Drive named fz2024_workshops (or similar).
Upload your datasets there.

0.1.3 Output and submission format

After completing the workshop, export your notebook as PDF:
- In Colab: File → Print → Save as PDF.
Submit the PDF file through Canvas, as well as the .ipynb.
Include all outputs, tables, and graphs in your PDF — make sure you run all cells before exporting.
Name your PDF file using the following format: Lastname_Firstname_WorkshopX.pdf.

0.1.4 Deadlines

All assignments must be uploaded to Canvas before the stated deadline. Late submissions are not accepted. Once you have read and understood these instructions, you are ready to begin the workshop!

1 Overview

This workshop is the second of three workshops that together explore how to build a portfolio grounded in Rational Agent Theory, Behavioural Finance, and the Market Anomalies literature. The process unfolds in three analytical stages:

Filter 1 – Market Efficiency
Test whether each stock’s historical information helps forecast its price. This step identifies inefficient markets, where past returns contain predictive signals.
Filter 2 – Market Anomalies
Examine systematic patterns—such as momentum or trend-following behaviour—that contradict the Efficient Market Hypothesis. Here, the strategy buys assets with upward trends and sells those trending downward.
Filter 3 – Portfolio Allocation
Optimize the portfolio composition using only the assets that pass the previous filters.

Unlike the traditional Markowitz framework, which focuses solely on optimizing asset weights, this three-part approach first conducts stock selection through theoretical filters. Each filter reflects assumptions derived from the rational agent and behavioural perspectives, allowing you to connect empirical testing with economic theory before moving into optimization.

In this workshop (Filter 2), we use technical trading rules and momentum as market anomalies to further filter the stock universe before portfolio construction.

2 Setup

2.1 Installing the `ta` library in Google Colab

In Google Colab, some libraries are not installed by default. If you get an error like ModuleNotFoundError: No module named 'ta', run the following cell once at the top of your notebook:

# Run this cell ONLY if you get "No module named 'ta'"
# !pip install ta  # uncomment this line (remove the leading "#") and run it once

We will use the same Python ecosystem as in all previous workshops.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import ta  # technical analysis library

# Optional: nicer plots
plt.style.use("seaborn-v0_8")

2.2 Technical overview

Library	Main Role in this Workshop
`pandas`	Store price data in `DataFrame` form, compute returns, loop over tickers.
`NumPy`	Basic numerical operations (log, exponentials, powers).
`matplotlib`	Visualise price series, bands, and trading signals.
`ta`	Compute Bollinger Bands, MACD, and other technical indicators.

3 Load price data

In the previous workshop, we started with 100 stocks and applied the Efficient Market Hypothesis (EMH) variance filter to identify the stocks for which the EMH fails (that is, whose markets appear inefficient). For this workshop, we assume that the filter selected 56 stocks, contained in the following dataset:

file = "anomalies_local.csv"
data = pd.read_csv(file, index_col=0)

data.head()

	TSLA.Close	TSM.Close	JNJ.Close	UNH.Close	JPM.Close	TCEHY.Close	TCTZF.Close	XOM.Close	BAC.Close	PG.Close	...	ACN.Close	CSCO.Close	LRLCF.Close	CICHF.Close	MCD.Close	NKE.Close	INTC.Close	C.PJ.Close	TMUS.Close	TXN.Close
date
01/02/2020	86.052002	60.040001	145.970001	292.500000	141.089996	49.880001	49.880001	70.900002	35.639999	123.410004	...	210.149994	48.419998	293.450012	0.87	200.789993	102.199997	60.840000	28.570000	78.589996	129.570007
01/03/2020	88.601997	58.060001	144.279999	289.540009	138.339996	49.029999	48.930000	70.330002	34.900002	122.580002	...	209.800003	47.630001	297.130005	0.84	200.080002	101.919998	60.099998	28.719999	78.169998	127.849998
01/06/2020	90.307999	57.389999	144.100006	291.549988	138.229996	48.770000	48.700001	70.870003	34.849998	122.750000	...	208.429993	47.799999	293.000000	0.84	202.330002	101.830002	59.930000	28.719999	78.620003	126.959999
01/07/2020	93.811996	58.320000	144.979996	289.790009	135.880005	49.779999	49.770000	70.290001	34.619999	121.989998	...	203.929993	47.490002	288.549988	0.88	202.630005	101.779999	58.930000	28.629999	78.919998	129.410004
01/08/2020	98.428001	58.750000	144.960007	295.899994	136.940002	49.650002	49.650002	69.230003	34.970001	122.510002	...	204.330002	47.520000	287.500000	0.88	205.910004	101.550003	58.970001	28.709999	79.419998	129.759995

5 rows × 56 columns

Each column is a stock (e.g. TSLA.Close, XOM.Close), and each row is a trading day.
The index is the date (stored as text), and the cell values are closing prices.
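Because the dates are stored as text, plots and date-based slicing treat them as plain labels. If a proper datetime index is preferred, one option is to parse it after loading. The following is a minimal sketch on a toy CSV; the tickers AAA and BBB and their values are made up for illustration, and the month/day/year format is assumed from the rows shown above.

```python
import io
import pandas as pd

# Toy CSV mimicking the layout of the workshop file (illustrative values only)
csv_text = """date,AAA.Close,BBB.Close
01/02/2020,10.0,20.0
01/03/2020,10.5,19.5
01/06/2020,10.2,19.8
"""

toy = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Convert month/day/year strings into timestamps
toy.index = pd.to_datetime(toy.index, format="%m/%d/%Y")

print(toy.index)
print(toy.loc["2020-01-03":"2020-01-06"])  # date-based slicing now works
```

The rest of the workshop works with either index type, so this step is optional.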

The last observations of the dataset are:

data.tail()

	TSLA.Close	TSM.Close	JNJ.Close	UNH.Close	JPM.Close	TCEHY.Close	TCTZF.Close	XOM.Close	BAC.Close	PG.Close	...	ACN.Close	CSCO.Close	LRLCF.Close	CICHF.Close	MCD.Close	NKE.Close	INTC.Close	C.PJ.Close	TMUS.Close	TXN.Close
date
05/20/2022	663.900024	90.779999	176.979996	485.730011	117.339996	44.250000	43.900002	91.860001	33.860001	141.789993	...	276.649994	42.939999	321.929993	0.71	233.910004	108.000000	41.650002	25.530001	126.040001	169.809998
05/23/2022	674.900024	91.500000	179.440002	492.079987	124.599998	43.639999	43.700001	93.889999	35.869999	145.050003	...	283.390015	43.349998	331.079987	0.71	238.000000	108.629997	42.000000	25.320000	129.889999	169.929993
05/24/2022	628.159973	88.720001	181.399994	497.559998	126.360001	42.040001	41.849998	94.400002	35.650002	147.630005	...	279.309998	43.770000	331.369995	0.68	244.520004	107.290001	41.669998	25.799999	129.220001	167.860001
05/25/2022	658.799988	90.410004	179.619995	498.089996	127.239998	42.570000	42.500000	96.300003	35.840000	145.210007	...	279.640015	44.000000	325.579987	0.72	244.009995	108.199997	42.200001	26.090000	131.440002	170.009995
05/26/2022	707.729980	91.000000	179.460007	502.230011	129.440002	44.029999	44.040001	96.639999	36.669998	146.479996	...	291.549988	44.990002	332.010010	0.73	248.089996	112.940002	43.480000	26.180000	132.740005	174.130005

5 rows × 56 columns

Final goal: from this set of 56 “inefficient” stocks, we will apply two market anomalies and keep only the stocks whose anomaly-based strategies produce sufficiently high returns.

4 Market Anomalies: technical analysis & momentum

4.1 Efficient Market Hypothesis (EMH) vs anomalies

Under a strict form of the EMH, past prices and past returns should contain no useful information to predict future returns. Prices should follow a (conditional) martingale, and any predictable pattern should be arbitraged away. A conditional martingale is just a process where, given everything you know right now, your best prediction for tomorrow is simply today’s value.

In formal terms:

\mathbb{E}[X_{t+1} \mid \mathcal{F}_t] = X_t

where \mathcal{F}_t represents all information available up to time t.

4.1.1 Intuition (Martingale = “Fair Game”)

Imagine a fair coin-flip game:

Heads → you win $1
Tails → you lose $1

Let X_t be your wealth after t flips.
Even if you are currently up or down, your expected wealth after the next flip is:

\mathbb{E}[X_{t+1} \mid X_t] = \tfrac{1}{2}(X_t + 1) + \tfrac{1}{2}(X_t - 1) = X_t.

Nothing you know about past flips helps you beat the game.
No strategy can tilt the expected outcome.
It is conditionally fair: today’s value is the best forecast for tomorrow.

4.1.2 Why finance cares

If asset prices follow a conditional martingale, then:

Past prices do not help predict future prices,
No trading strategy can systematically make profits from historical data,
Markets behave as if they were informationally efficient.

This is the idea behind testing the Efficient Market Hypothesis (EMH): if returns behave like a martingale difference sequence, then past information has no predictive power.

# Simulate a simple martingale: fair coin random walk

np.random.seed(42)
n_steps = 200

# +1 for heads, -1 for tails, each with prob 0.5

steps = np.random.choice([-1, 1], size=n_steps)

# Wealth process X_t = cumulative sum of wins/losses

X = np.cumsum(steps)

plt.figure(figsize=(8, 4))
plt.plot(X)
plt.axhline(0, linestyle="--")  # starting wealth = 0
plt.xlabel("Time step")
plt.ylabel("Wealth $X_t$")
plt.title("Simulated Martingale: Fair Coin Random Walk")
plt.show()
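The fair-game property can also be checked numerically. The sketch below is an illustrative check, not part of the original workshop code: it simulates many coin-flip paths and compares the average next-step change in wealth for paths that are currently winning against paths that are currently losing. The number of paths, the evaluation step, and the seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 20_000, 50

# Each row is one independent path of +1 / -1 coin flips
steps = rng.choice([-1, 1], size=(n_paths, n_steps))
X = np.cumsum(steps, axis=1)

# Wealth after 30 flips and the change produced by flip 31
x_t = X[:, 29]
incr = X[:, 30] - X[:, 29]

mean_if_up = incr[x_t > 0].mean()    # paths currently ahead
mean_if_down = incr[x_t < 0].mean()  # paths currently behind

print(f"Average next change when ahead:  {mean_if_up:+.4f}")
print(f"Average next change when behind: {mean_if_down:+.4f}")
```

Both averages should be close to zero up to sampling noise: knowing whether you are ahead or behind does not help predict the next change.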

However, many empirical studies document anomalies, i.e. return patterns that seem predictable using public information:

Trend-following / technical analysis: rules based on moving averages, bands, oscillators.
Short-run momentum: recent winners continue to outperform recent losers (3–12 months).
Long-run reversal: extreme past winners tend to underperform in the long run (3–5 years).

In this workshop, we implement:

A Bollinger Band trading strategy (technical analysis anomaly).
A momentum filter based on past performance.

4.2 Common Equity & Behavioral Market Anomalies

The following table summarizes twelve well-documented anomalies in empirical finance.
Students may use this table as a starting point for research and for identifying how each anomaly can be translated into a trading rule.

Anomaly	Economic / Behavioral Interpretation	How It Can Be Used for Trading
1. Short-Run Momentum (used in this workshop)	Investors underreact to news, so trends persist for months.	Buy recent winners, short losers over 3–12 months.
2. Technical Analysis Signals (used in this workshop)	Prices show patterns inconsistent with EMH; traders overreact/underreact.	Use BB, MACD, and moving averages to produce buy/sell signals.
3. Long-Run Reversal	Investors overreact in the long run; extreme past performance reverses.	Buy long-term losers, short long-term winners (3–5 years).
4. Size Effect (Small-Firm Premium)	Small firms are riskier or neglected, leading to excess returns.	Tilt portfolio toward small-cap stocks.
5. Value Effect (HML)	Investors overreact to growth narratives; cheap stocks outperform.	Buy high book-to-market or low P/E stocks.
6. Low-Volatility Anomaly	High-risk stocks underperform; investors overpay for “lottery” stocks.	Buy low-volatility stocks; short high-volatility stocks.
7. Post-Earnings Announcement Drift (PEAD)	Investors underreact to earnings surprises.	Buy stocks with positive earnings surprises; short the negative ones.
8. January Effect	Tax-loss selling and behavioral resets create seasonal price pressure.	Overweight small caps in January.
9. Holiday Effect / Weekend Effect	Investor mood, volume, and liquidity cause predictable calendar patterns.	Go long before holidays; avoid holding over weekends.
10. Disposition Effect	Retail investors sell winners too early and hold losers too long.	Look for continuation in prices of assets retail investors are selling.
11. Overreaction Effect	Prices overshoot on news and gradually revert.	Short extreme jumps, buy after extreme negative shocks.
12. Analyst Forecast Bias / Underreaction	Analysts adjust forecasts slowly due to anchoring.	Trade revision momentum (buy upgrades, short downgrades).

5 Analysis with Candlestick Charts

A candlestick chart lets you see how the price of an instrument has changed over a specific period. Each “candle” represents the price movement in a time interval, which can be daily, weekly, etc.

Green and red candles, together with body size and shadows, offer information about the direction and strength of price action. Start by looking for candles with large bodies or long shadows, as these indicate significant activity and can help anticipate price movements.

5.1 How to Interpret the Basic Elements of a Candle

Color of the candle:
- Green: The price closed higher than it opened. Indicates an upward movement.
- Red: The price closed lower than it opened. Indicates a downward movement.
Body of the candle: The size of the body matters regardless of its color.
- Large body: Shows a significant price change from open to close, indicating strong buying or selling pressure.
- Small body: Indicates indecision or little movement; it is common in quiet markets or at potential turning points.
Wicks (or Shadows): The lines that extend above and below the body.
- Upper shadow: Shows the maximum price reached during the candle period.
- Lower shadow: Shows the minimum price reached during the candle period.
Length of the shadows:
- Long shadows: Indicate volatility and possible trend changes. If the upper shadow is long, buyers pushed the price up, but sellers brought it back down.
- Short shadows: Show stability in price movement during the period.

5.1.1 Basic Pattern Examples

Some simple patterns you can look for:

Large Green Candle: Strong buying signal. The price rose significantly from open to close.
Large Red Candle: Selling signal. The price fell significantly from open to close.
Doji (very small body with long shadows): Indicates market indecision; it can signal a trend reversal.

5.2 Basic Candlestick Technical Analysis Concepts

Candlestick charts show price variations over time and are a key tool to analyze the behavior of financial instruments. The following are some of the most common patterns you can identify and how to interpret them:

5.2.1 Doji

Shape: Very small body with long shadows above and below.
Meaning: Indicates indecision in the market. If it appears after a strong trend (either bullish or bearish), it may signal a possible trend reversal.

5.2.2 Hammer

Shape: Small body with a long lower shadow and almost no upper shadow.
Meaning: Bullish signal suggesting that although there was selling pressure, buyers stepped in and pushed the price up. It is more reliable after a series of red candles (downtrend).

5.2.3 Shooting Star

Shape: Small body with a long upper shadow and almost no lower shadow.
Meaning: Bearish signal. It suggests that buyers pushed the price up, but sellers took control. This pattern is more reliable after a series of green candles (uptrend).

5.2.4 Bullish Engulfing

Shape: A large green candle that completely engulfs the previous red candle.
Meaning: Strong signal that the trend may turn bullish, as buyers have taken control.

5.2.5 Bearish Engulfing

Shape: A large red candle that completely engulfs the previous green candle.
Meaning: Indicates a possible bearish reversal, as sellers have dominated.
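The pattern descriptions above can be turned into simple rules on open, high, low, and close prices. The sketch below is one possible encoding, not a standard definition: the function names and all thresholds (`doji_frac`, `body_frac`, `shadow_mult`, `tiny_frac`) are hypothetical choices made for illustration, and the workshop dataset itself only contains closing prices.

```python
def classify_candle(open_, high, low, close,
                    doji_frac=0.1, body_frac=0.3, shadow_mult=2.0, tiny_frac=0.1):
    """Label a single candle. All thresholds are illustrative assumptions."""
    full_range = high - low
    if full_range <= 0:
        return "flat"
    body = abs(close - open_)
    upper = high - max(open_, close)   # upper shadow
    lower = min(open_, close) - low    # lower shadow

    # Doji: very small body with shadows on both sides
    if body <= doji_frac * full_range and min(upper, lower) > tiny_frac * full_range:
        return "doji"
    if body <= body_frac * full_range:
        # Hammer: long lower shadow, almost no upper shadow
        if lower >= shadow_mult * body and upper <= tiny_frac * full_range:
            return "hammer"
        # Shooting star: long upper shadow, almost no lower shadow
        if upper >= shadow_mult * body and lower <= tiny_frac * full_range:
            return "shooting star"
    if close > open_:
        return "green"
    if close < open_:
        return "red"
    return "neutral"


def is_bullish_engulfing(prev_open, prev_close, open_, close):
    """Green body that covers the whole body of the previous red candle."""
    return (prev_close < prev_open and close > open_
            and open_ <= prev_close and close >= prev_open)


print(classify_candle(100, 105, 95, 100.2))
print(classify_candle(100, 101.2, 95, 101))
print(classify_candle(101, 106, 99.9, 100))
print(is_bullish_engulfing(102, 100, 99.5, 103))
```

A bearish engulfing check would mirror `is_bullish_engulfing` with the inequalities reversed.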

5.3 Bollinger Bands (BB): Trend and Volatility Tool

In addition to candlestick patterns, you can apply Bollinger Bands, a tool for gauging an asset’s volatility and identifying overbought or oversold levels. The bands are built around a moving average of the price and show how far the price has moved away from that average.

If the price approaches the upper band, the asset may be overbought (price could fall soon). It may be wise to avoid additional long positions, as the price is likely to stabilize or fall.
If the price approaches the lower band, the asset may be oversold (price could rise in the short term). It may signal a possible recovery and a buying opportunity.

For a window of size n:

Middle band:
\text{MB}_t = \text{SMA}_n(P_t) = \frac{1}{n} \sum_{i=0}^{n-1} P_{t-i}
Standard deviation of price over the same window: \sigma_n(P_t).
Upper band:
\text{UB}_t = \text{MB}_t + k \cdot \sigma_n(P_t)
Lower band:
\text{LB}_t = \text{MB}_t - k \cdot \sigma_n(P_t)

Typical choices: n = 20 days, k = 2.
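To see the band formulas at work, the following sketch computes them directly with pandas rolling windows on a simulated price series. The price path is synthetic, and the use of the population standard deviation (`ddof=0`) is an assumption; libraries differ on this convention, so values may not match another implementation exactly.

```python
import numpy as np
import pandas as pd

# Synthetic price path (illustrative only)
rng = np.random.default_rng(1)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.02, 60))), name="price")

n, k = 20, 2

mb = price.rolling(n).mean()          # middle band: n-day simple moving average
sd = price.rolling(n).std(ddof=0)     # rolling standard deviation (assumed ddof=0)
ub = mb + k * sd                      # upper band
lb = mb - k * sd                      # lower band

bands = pd.DataFrame({"price": price, "lband": lb, "mband": mb, "hband": ub})
print(bands.iloc[n - 2 : n + 3])
```

The first n - 1 rows of the bands are NaN because the rolling window is not yet full.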

5.4 MACD (Moving Average Convergence Divergence)

MACD is a trend-following indicator based on exponential moving averages (EMAs).

Fast EMA (e.g. 12-day): \text{EMA}_{\text{fast}, t}
Slow EMA (e.g. 26-day): \text{EMA}_{\text{slow}, t}
MACD line:
\text{MACD}_t = \text{EMA}_{\text{fast}, t} - \text{EMA}_{\text{slow}, t}
Signal line: EMA of MACD (e.g. 9 days):
\text{Signal}_t = \text{EMA}_{9}(\text{MACD}_t)

Typical trading rule:

If \text{MACD}_t crosses above the signal line → bullish (buy).
If \text{MACD}_t crosses below the signal line → bearish (sell).

In this workshop, Bollinger Bands will drive the trading signal, while MACD is used mainly for illustration and intuition.
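The MACD definitions above can be reproduced with pandas exponential moving averages. This is a minimal sketch on a synthetic price path that rises and then falls; the smoothing convention (`span` with `adjust=False`) is an assumption, so early values may differ from those produced by a technical-analysis library.

```python
import numpy as np
import pandas as pd

# Synthetic price: 60 days rising, then 60 days falling (illustrative only)
price = pd.Series(np.concatenate([np.linspace(100, 130, 60),
                                  np.linspace(130, 100, 60)]))

ema_fast = price.ewm(span=12, adjust=False).mean()   # fast EMA
ema_slow = price.ewm(span=26, adjust=False).mean()   # slow EMA

macd = ema_fast - ema_slow                            # MACD line
signal_line = macd.ewm(span=9, adjust=False).mean()   # signal line
hist = macd - signal_line                             # histogram

print(pd.DataFrame({"macd": macd, "signal": signal_line, "hist": hist}).tail())
```

During the rising segment the fast EMA stays above the slow EMA, so MACD is positive; after a sustained decline the sign flips.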

5.4.1 Daily returns and annualised strategy performance

Let P_t be the closing price on day t. The continuously compounded (log) return from day t to t+1 is:

r_{t+1} = \ln\left(\frac{P_{t+1}}{P_t}\right).

If we take a position s_t \in \{-1,0,+1\} on day t (short, neutral, long), then the daily strategy return is:

r_{t+1}^{\text{strat}} = s_t \cdot r_{t+1}.

To compute an annualised return from average daily log returns:

Compute the average daily strategy return \bar r = \frac{1}{T} \sum_{t=1}^{T} r_t^{\text{strat}}.
Annualise using 252 trading days: r_{\text{ann}} = \exp\big(252 \cdot \bar r\big) - 1.

This is consistent with continuously compounded returns: daily log returns add up over the year, and exponentiating the sum gives the gross annual return, from which we subtract 1 to obtain the net return.
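A small worked example may help fix the notation. The prices and positions below are made up; the sketch applies the three formulas in order (log return, strategy return, annualisation) and adds a sanity check that a constant daily log return of ln(1.10)/252 annualises to exactly 10%.

```python
import numpy as np

# Illustrative closing prices for days 0..4 and positions s_t for days 0..3
prices = np.array([100.0, 101.0, 99.0, 102.0, 103.0])
positions = np.array([1, 0, -1, 1])   # long, neutral, short, long

log_ret = np.log(prices[1:] / prices[:-1])   # r_{t+1} = ln(P_{t+1} / P_t)
strat_ret = positions * log_ret              # r^strat_{t+1} = s_t * r_{t+1}

mean_daily = strat_ret.mean()
ann_ret = np.exp(252 * mean_daily) - 1

# Sanity check: a daily log return of ln(1.10)/252 gives 10% per year
check = np.exp(252 * (np.log(1.10) / 252)) - 1

print("Daily log returns:     ", np.round(log_ret, 5))
print("Daily strategy returns:", np.round(strat_ret, 5))
print("Annualised return:     ", round(ann_ret, 4))
print("Sanity check:          ", round(check, 4))
```

Note that the short position on day 2 turns a positive price move into a negative strategy return.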

5.5 Example: Bollinger Band strategy for one stock (TSLA)

For exposition, we start with the first stock in the dataset: TSLA.Close.

ticker = "TSLA.Close"
tsla = data[ticker]

tsla.head()

date
01/02/2020    86.052002
01/03/2020    88.601997
01/06/2020    90.307999
01/07/2020    93.811996
01/08/2020    98.428001
Name: TSLA.Close, dtype: float64

5.5.1 Compute Bollinger Bands with `ta`

We use a 20-day window and 2 standard deviations, as in the reference code.

bb = ta.volatility.BollingerBands(close=tsla, window=20, window_dev=2)

# Upper and lower bands
hb = bb.bollinger_hband()
lb = bb.bollinger_lband()

hb.head(), lb.head()

(date
 01/02/2020   NaN
 01/03/2020   NaN
 01/06/2020   NaN
 01/07/2020   NaN
 01/08/2020   NaN
 Name: hband, dtype: float64,
 date
 01/02/2020   NaN
 01/03/2020   NaN
 01/06/2020   NaN
 01/07/2020   NaN
 01/08/2020   NaN
 Name: lband, dtype: float64)

5.5.2 Build a DataFrame with price and bands

band_df = tsla.copy()
band_df = pd.concat([band_df, hb, lb], axis=1)
band_df.columns = ["price", "hband", "lband"]

band_df.head()

	price	hband	lband
date
01/02/2020	86.052002	NaN	NaN
01/03/2020	88.601997	NaN	NaN
01/06/2020	90.307999	NaN	NaN
01/07/2020	93.811996	NaN	NaN
01/08/2020	98.428001	NaN	NaN

5.5.3 Plot price and bands

band_df.plot(figsize=(10, 4), title=f"{ticker}: Price and Bollinger Bands")
plt.xlabel("Date")
plt.ylabel("Price")
plt.show()

5.6 Trading rule and signal construction

We now define the trading signal based on the bands:

If price is above or equal to the upper band: signal = -1 (sell/short).
If price is below or equal to the lower band: signal = +1 (buy/long).
Otherwise: signal = 0 (do nothing).

We implement this using a simple for loop over all days.

signal = []

for j in range(0, band_df.shape[0]):
    price_j = band_df["price"].iloc[j]
    h_j = band_df["hband"].iloc[j]
    l_j = band_df["lband"].iloc[j]

    if price_j >= h_j:
        signal.append(-1)
    elif price_j <= l_j:
        signal.append(1)
    else:
        signal.append(0)

signal[:10]

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Wrap the signal in a DataFrame and plot:

signal_df = pd.DataFrame(signal, index=band_df.index, columns=["signal"])
signal_df.plot(figsize=(10, 3), title=f"{ticker}: Trading Signal (BB rule)")
plt.xlabel("Date")
plt.show()

Note that at the beginning we see many zeros because the bands themselves are NaN until the rolling window has at least 20 observations.
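The same rule can be written without an explicit loop. The sketch below uses `np.select` on a small hand-made table (the numbers are illustrative, not workshop data) and shows why rows with NaN bands end up with a signal of 0: any comparison against NaN evaluates to False, so neither condition fires.

```python
import numpy as np
import pandas as pd

# Tiny illustrative table; the first row mimics the NaN warm-up period
band_demo = pd.DataFrame({
    "price": [10.0, 12.0, 8.0, 10.0, 11.0],
    "hband": [np.nan, 11.5, 11.0, 11.0, 11.0],
    "lband": [np.nan, 9.0, 8.5, 9.0, 9.0],
})

conditions = [
    band_demo["price"] >= band_demo["hband"],   # at or above upper band -> short
    band_demo["price"] <= band_demo["lband"],   # at or below lower band -> long
]
signal_vec = np.select(conditions, [-1, 1], default=0)

print(signal_vec)
```

On long price histories this vectorised form is typically much faster than a Python loop and gives the same signals.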

5.7 Strategy returns for a single stock

We now compute the daily strategy returns using continuously compounded (log) returns.

For a given day index j:

Position today: s_j (taken from signal_df["signal"].iloc[j]).
Price today: P_j.
Price tomorrow: P_{j+1}.
Daily log return:
r_{j+1} = \ln\left(\frac{P_{j+1}}{P_j}\right).
Strategy return: r_{j+1}^{\text{strat}} = s_j \cdot r_{j+1}.

We cannot compute r_{j+1} for the last row, because there is no P_{j+1}.
So we only loop up to band_df.shape[0] - 1.

# Indices where there is a non-zero signal
sig_idx = []
for i in range(signal_df.shape[0] - 1):  # up to second-to-last row
    if signal_df["signal"].iloc[i] != 0:
        sig_idx.append(i)

sig_idx[:10]

[19, 20, 21, 22, 45, 50, 52, 69, 70, 71]

Compute the log strategy return for each signal day:

ret_i = []

for j in sig_idx:
    p_today = band_df["price"].iloc[j]
    p_next = band_df["price"].iloc[j + 1]
    log_ret = np.log(p_next / p_today)
    strat_ret = log_ret * signal_df["signal"].iloc[j]
    ret_i.append(strat_ret)

ret_i[:5]

[np.float64(-0.015115861789670724),
 np.float64(-0.18144503336057394),
 np.float64(-0.12861872099095273),
 np.float64(0.18845037443736087),
 np.float64(0.059586901859173265)]

Wrap in a DataFrame:

ret = pd.DataFrame(ret_i, columns=["ret"])
ret.describe()

	ret
count	91.000000
mean	-0.008734
std	0.056662
min	-0.181445
25%	-0.031896
50%	-0.009899
75%	0.016849
max	0.188450

Annualise using 252 trading days and log-return logic:

mean_daily_log = ret["ret"].mean()
ann_ret_tsla = np.exp(252 * mean_daily_log) - 1

ann_ret_tsla

np.float64(-0.8892911665273225)

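Interpretation: a large positive annualised return would mean that the Bollinger Band strategy performed well on this stock (ignoring transaction costs and other frictions). Here the value is about -0.89, so the rule would have lost roughly 89% per year on TSLA. Note that the mean is taken only over the 91 days with an active signal, so the annualised figure assumes the strategy earns that average on all 252 trading days of the year.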
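The choice of which days enter the average matters. The sketch below, on made-up prices and signals, compares the convention used in this workshop (average over signal days only) with an alternative that averages over all days, counting days without a position as zero return. The alternative is shown only for comparison; it is not the method used in the workshop.

```python
import numpy as np
import pandas as pd

# Illustrative prices and signals (not workshop data)
price = pd.Series([100.0, 102.0, 101.0, 103.0, 102.0, 104.0])
signal = pd.Series([0, 1, 0, -1, 0, 0])

# Next-day log return aligned with the day the position is taken
next_log_ret = np.log(price.shift(-1) / price)
strat_ret = (signal * next_log_ret).dropna()        # last day has no next price

active = strat_ret[signal.loc[strat_ret.index] != 0]

mean_active = active.mean()      # workshop convention: signal days only
mean_all = strat_ret.mean()      # alternative: idle days count as zero return

ann_active = np.exp(252 * mean_active) - 1
ann_all = np.exp(252 * mean_all) - 1

print(f"Signal days only: {ann_active:.4%}")
print(f"All days:         {ann_all:.4%}")
```

The all-days mean equals the signal-days mean scaled by the fraction of days with an active position, so its annualised value is always closer to zero.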

5.8 MACD illustration for the same stock

To connect with another classical technical indicator, we compute and plot MACD.

macd_ind = ta.trend.MACD(close=tsla, window_slow=26, window_fast=12, window_sign=9)

macd_line = macd_ind.macd()
signal_line = macd_ind.macd_signal()
hist = macd_ind.macd_diff()

macd_df = pd.concat([macd_line, signal_line, hist], axis=1)
macd_df.columns = ["macd", "signal", "hist"]

macd_df.tail()

	macd	signal	hist
date
05/20/2022	-70.019921	-56.831104	-13.188817
05/23/2022	-71.635804	-59.792044	-11.843760
05/24/2022	-75.813998	-62.996435	-12.817564
05/25/2022	-75.779317	-65.553011	-10.226306
05/26/2022	-70.985316	-66.639472	-4.345844

Plot MACD and signal:

macd_df[["macd", "signal"]].plot(figsize=(10, 3), title=f"{ticker}: MACD and Signal")
plt.axhline(0, color="black", linewidth=1)
plt.xlabel("Date")
plt.show()

A common (but not the only) rule is: when MACD crosses above the signal line, it is bullish; when it crosses below, it is bearish.
In this workshop, the active trading rule is defined by Bollinger Bands, and MACD is used for intuition and further exploration.
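For further exploration, crossovers between the two lines can be located programmatically. The sketch below uses two short hand-made series in place of real MACD output; the names `macd_demo` and `signal_demo` are hypothetical. A crossover is detected as a change in the indicator "MACD is above the signal line" from one day to the next.

```python
import pandas as pd

# Hand-made series standing in for MACD and its signal line
macd_demo = pd.Series([-1.0, -0.5, 0.2, 0.6, 0.1, -0.3])
signal_demo = pd.Series([-0.8, -0.6, 0.0, 0.3, 0.2, 0.0])

above = (macd_demo > signal_demo).astype(int)   # 1 when MACD is above the signal
change = above.diff()                           # +1 = cross up, -1 = cross down

cross_up = change[change == 1].index.tolist()     # bullish crossovers
cross_down = change[change == -1].index.tolist()  # bearish crossovers

print("Bullish crossovers at positions:", cross_up)
print("Bearish crossovers at positions:", cross_down)
```

The first observation never counts as a crossover because there is no previous day to compare against.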

6 Bollinger Band strategy for all stocks

We now repeat the TSLA procedure for every stock in the data DataFrame, storing the annualised strategy return for each ticker.

The loop does the following for each column (stock):

Compute BB (20, 2).
Build band_df with price and bands.
Compute the signal series.
Extract non-zero signal indices.
Compute daily log strategy returns for those days.
Compute the annualised return from log returns.

ret_all = []

for ticker in data.columns:
    price_series = data[ticker]

    # 1. Bollinger Bands
    bb = ta.volatility.BollingerBands(close=price_series, window=20, window_dev=2)
    hb = bb.bollinger_hband()
    lb = bb.bollinger_lband()

    # 2. Price + bands
    band_df = pd.concat([price_series, hb, lb], axis=1)
    band_df.columns = ["price", "hband", "lband"]

    # 3. Signals
    signal = []
    # Up to second-to-last index (we always need j+1)
    for j in range(0, band_df.shape[0] - 1):
        price_j = band_df["price"].iloc[j]
        h_j = band_df["hband"].iloc[j]
        l_j = band_df["lband"].iloc[j]

        if price_j >= h_j:
            signal.append(-1)
        elif price_j <= l_j:
            signal.append(1)
        else:
            signal.append(0)

    signal_df = pd.DataFrame(signal, index=band_df.index[:-1], columns=["signal"])

    # 4. Indices with active signals
    sig_idx = []
    for i in range(signal_df.shape[0]):
        if signal_df["signal"].iloc[i] != 0:
            sig_idx.append(i)

    # Edge case: if no signals, the strategy never trades
    if len(sig_idx) == 0:
        ret_all.append(0.0)
        continue

    # 5. Daily log strategy returns
    ret_i = []
    for j in sig_idx:
        p_today = band_df["price"].iloc[j]
        p_next = band_df["price"].iloc[j + 1]
        log_ret = np.log(p_next / p_today)
        strat_ret = log_ret * signal_df["signal"].iloc[j]
        ret_i.append(strat_ret)

    ret = pd.DataFrame(ret_i, columns=["ret"])

    # 6. Annualised return from log returns
    mean_daily_log = ret["ret"].mean()
    ann_ret = np.exp(252 * mean_daily_log) - 1

    ret_all.append(ann_ret)

# Build DataFrame with annualised strategy returns per ticker
ret_df = pd.DataFrame(ret_all, index=data.columns, columns=["ann_ret"]).sort_values(by="ann_ret", ascending=True)

ret_df.head()

	ann_ret
TSLA.Close	-0.889291
WFC.PL.Close	-0.630579
BML.PH.Close	-0.613088
BML.PL.Close	-0.468792
BAC.PE.Close	-0.389758

Look at the best performers:

ret_df.tail()

	ann_ret
TCTZF.Close	13.617779
BABAF.Close	15.152149
LRLCF.Close	52.640184
CICHF.Close	54.735516
RHHBF.Close	437.311217

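We now have, for each stock, the annualised return of the Bollinger Band strategy using log returns. Because the mean is computed over signal days only and then scaled to 252 trading days, extreme values such as 437.31 for RHHBF.Close should be read as an extrapolation, not as a realised annual return.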
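The per-ticker loop can also be packaged as a function and applied column by column. The sketch below is an alternative formulation under stated assumptions: bands are computed with pandas (population standard deviation, `ddof=0`) instead of the `ta` library, the function name `bb_annualised_return` is hypothetical, and the three price columns are simulated, so the numbers are not comparable to the workshop results.

```python
import numpy as np
import pandas as pd

def bb_annualised_return(price, window=20, n_std=2, periods=252):
    """Annualised BB strategy return, averaging over signal days only."""
    mid = price.rolling(window).mean()
    sd = price.rolling(window).std(ddof=0)      # assumed convention
    upper, lower = mid + n_std * sd, mid - n_std * sd

    # -1 at or above the upper band, +1 at or below the lower band, else 0
    signal = pd.Series(
        np.select([price >= upper, price <= lower], [-1, 1], default=0),
        index=price.index,
    )
    next_ret = np.log(price.shift(-1) / price)  # log return earned by day-t position
    active = (signal * next_ret)[signal != 0].dropna()

    if active.empty:                            # the strategy never trades
        return 0.0
    return float(np.exp(periods * active.mean()) - 1)

# Simulated prices for three made-up tickers
rng = np.random.default_rng(7)
toy_prices = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.02, size=(300, 3)), axis=0)),
    columns=["AAA.Close", "BBB.Close", "CCC.Close"],
)

toy_ret = toy_prices.apply(bb_annualised_return).sort_values()
print(toy_ret)
```

Wrapping the logic in a function makes it easier to test on a single series and to change the window or band width later.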

6.1 Filter by Bollinger strategy performance

We apply a simple rule: keep only the stocks whose annualised BB strategy return is higher than 10%.

threshold = 0.10
ret_filt = ret_df[ret_df["ann_ret"] > threshold]

ret_filt.head()

	ann_ret
COST.Close	0.101253
CSCO.Close	0.113825
PG.Close	0.185918
CVX.Close	0.194800
JPM.PD.Close	0.216114

Number of stocks passing the BB filter:

len(ret_filt)

Now we keep only the price series of those stocks:

data_bb = data.loc[:, ret_filt.index]
data_bb.head()

	COST.Close	CSCO.Close	PG.Close	CVX.Close	JPM.PD.Close	XOM.Close	RYDAF.Close	HD.Close	NKE.Close	WFC.PQ.Close	...	NVSEF.Close	TMUS.Close	INTC.Close	PEP.Close	TCEHY.Close	TCTZF.Close	BABAF.Close	LRLCF.Close	CICHF.Close	RHHBF.Close
date
01/02/2020	291.489990	48.419998	123.410004	121.430000	27.639999	70.900002	30.000000	219.660004	102.199997	27.629999	...	94.800003	78.589996	60.840000	135.820007	49.880001	49.880001	27.25	293.450012	0.87	324.950012
01/03/2020	291.730011	47.630001	122.580002	121.010002	27.660000	70.330002	30.180000	218.929993	101.919998	27.780001	...	93.250000	78.169998	60.099998	135.630005	49.029999	48.930000	26.91	297.130005	0.84	319.750000
01/06/2020	291.809998	47.799999	122.750000	120.599998	27.580000	70.870003	30.469999	219.960007	101.830002	27.690001	...	94.349998	78.620003	59.930000	136.149994	48.770000	48.700001	26.91	293.000000	0.84	319.750000
01/07/2020	291.350006	47.490002	121.989998	119.059998	27.500000	70.290001	30.549999	218.520004	101.779999	27.500000	...	95.000000	78.919998	58.930000	134.009995	49.779999	49.770000	26.91	288.549988	0.88	322.049988
01/08/2020	294.690002	47.520000	122.510002	117.699997	27.549999	69.230003	30.139999	221.789993	101.550003	27.559999	...	94.720001	79.419998	58.970001	134.699997	49.650002	49.650002	26.91	287.500000	0.88	322.049988

5 rows × 44 columns

At this point, data_bb contains only the stocks for which the simple Bollinger Band strategy would have delivered more than 10% annualised return over the sample (ignoring transaction costs and constraints).

7 Momentum anomaly filter

The second anomaly we use is momentum: stocks that performed well over a recent window (winners) tend to continue performing well in the short run.

7.1 Momentum definition

Let \text{MOM}_t^{(k)} denote the k-day cumulative log return up to date t.
Using daily log returns, it is defined as:

\text{MOM}_t^{(k)} = \sum_{i=1}^{k} r_{t+1-i},

where r_{t} are daily log returns.

If \text{MOM}_t^{(k)} is positive and large, the stock has been a recent winner; if it is negative, it has been a recent loser.

In practice, many empirical studies use 3–12 months as the momentum window.
Here, we use a 6‑month proxy: 126 trading days.
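Because log returns add up, the momentum sum telescopes into a single price ratio: the k-day sum of daily log returns equals ln(P_t / P_{t-k}). The sketch below verifies this on a simulated price series (illustrative data only) and converts the result into a simple percentage return.

```python
import numpy as np
import pandas as pd

# Simulated price series (illustrative only)
rng = np.random.default_rng(3)
price = pd.Series(50 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))

k = 126
log_ret = np.log(price / price.shift(1))

mom_sum = log_ret.rolling(k).sum()            # definition: sum of k daily log returns
mom_ratio = np.log(price / price.shift(k))    # equivalent: log of the k-day price ratio

simple_ret = np.exp(mom_sum) - 1              # same momentum as a simple return

print(pd.DataFrame({"mom_sum": mom_sum, "mom_ratio": mom_ratio,
                    "simple_ret": simple_ret}).tail())
```

The first k rows are NaN in both versions, since k + 1 prices are needed to form a k-day return.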

7.2 Compute daily log returns and 6‑month momentum

We compute log returns for each stock and then the 126‑day rolling sum.

# 1) Daily log returns for each stock
log_ret_all = np.log(data / data.shift(1))

# 2) 126-day rolling momentum (approx. 6 months)
window_mom = 126
mom_126 = log_ret_all.rolling(window=window_mom).sum()

mom_126.tail()

	TSLA.Close	TSM.Close	JNJ.Close	UNH.Close	JPM.Close	TCEHY.Close	TCTZF.Close	XOM.Close	BAC.Close	PG.Close	...	ACN.Close	CSCO.Close	LRLCF.Close	CICHF.Close	MCD.Close	NKE.Close	INTC.Close	C.PJ.Close	TMUS.Close	TXN.Close
date
05/20/2022	-0.501638	-0.306911	0.085974	0.077584	-0.328981	-0.343710	-0.359155	0.367495	-0.313339	-0.036902	...	-0.292856	-0.222305	-0.392301	0.028573	-0.080348	-0.461577	-0.176301	-0.073604	0.074596	-0.130442
05/23/2022	-0.521637	-0.306037	0.096766	0.111867	-0.255799	-0.370330	-0.368956	0.436675	-0.235611	-0.012129	...	-0.266111	-0.205691	-0.376600	0.088293	-0.056960	-0.476152	-0.164707	-0.080407	0.116379	-0.138433
05/24/2022	-0.610678	-0.340826	0.127407	0.128548	-0.262863	-0.381878	-0.392559	0.428341	-0.260961	-0.001151	...	-0.262907	-0.221085	-0.352174	-0.014599	-0.035988	-0.484898	-0.178836	-0.059438	0.126635	-0.136966
05/25/2022	-0.520821	-0.294406	0.111366	0.107931	-0.279554	-0.347519	-0.347005	0.422272	-0.281665	-0.028714	...	-0.254544	-0.228583	-0.357568	0.042560	-0.046600	-0.464385	-0.151439	-0.045335	0.129626	-0.124966
05/26/2022	-0.455444	-0.282531	0.113280	0.109455	-0.254537	-0.309307	-0.309246	0.420268	-0.261504	-0.014773	...	-0.217454	-0.210663	-0.332065	0.056353	-0.035712	-0.420812	-0.134910	-0.044086	0.147992	-0.105992

5 rows × 56 columns

We focus on the last available date to decide which stocks are “momentum winners” at the end of the sample.

last_mom = mom_126.iloc[-1, :]  # last row (final date)
last_mom.head()

TSLA.Close   -0.455444
TSM.Close    -0.282531
JNJ.Close     0.113280
UNH.Close     0.109455
JPM.Close    -0.254537
Name: 05/26/2022, dtype: float64

7.3 Momentum filter: keep winners

As a simple rule, we keep stocks with positive 6‑month momentum (they have gone up on net over the last 126 trading days).

mom_filt = last_mom[last_mom > 0]

mom_filt.sort_values(ascending=False).head()

XOM.Close      0.420268
CVX.Close      0.410034
RYDAF.Close    0.319491
SHEL.Close     0.306713
ABBV.Close     0.238166
Name: 05/26/2022, dtype: float64

Number of momentum winners:

len(mom_filt)

8 Combined filter: Bollinger strategy + momentum

Finally, we combine both filters:

Bollinger Band strategy filter: annualised return > 10%.
Momentum filter: positive 6‑month momentum at the end of the sample.

We keep only the stocks that satisfy both criteria.

# Intersection of tickers passing the BB filter and momentum filter
bb_tickers = set(ret_filt.index)
mom_tickers = set(mom_filt.index)

final_tickers = sorted(bb_tickers.intersection(mom_tickers))
len(final_tickers), final_tickers[:10]

(12,
 ['ABBV.Close',
  'AZNCF.Close',
  'CICHF.Close',
  'CVX.Close',
  'JNJ.Close',
  'KO.Close',
  'NVSEF.Close',
  'PEP.Close',
  'RYDAF.Close',
  'TMUS.Close'])
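An equivalent way to combine the two filters is with boolean masks. The sketch below uses four made-up tickers and values; it assumes both series share the same index, so the element-wise `&` lines up ticker by ticker.

```python
import pandas as pd

# Made-up annualised BB returns and 6-month momentum for four tickers
bb_demo = pd.Series({"AAA.Close": 0.25, "BBB.Close": 0.05,
                     "CCC.Close": 0.40, "DDD.Close": 0.12})
mom_demo = pd.Series({"AAA.Close": 0.10, "BBB.Close": 0.30,
                      "CCC.Close": -0.20, "DDD.Close": 0.02})

passes_bb = bb_demo > 0.10     # BB strategy filter
passes_mom = mom_demo > 0      # momentum filter
both = passes_bb & passes_mom  # True only when both filters pass

final_demo = sorted(both[both].index)
print(final_demo)
```

BBB fails the Bollinger filter and CCC fails the momentum filter, so only AAA and DDD survive.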

Create a final price panel with these doubly-filtered stocks:

data_final = data.loc[:, final_tickers]
data_final.head()

	ABBV.Close	AZNCF.Close	CICHF.Close	CVX.Close	JNJ.Close	KO.Close	NVSEF.Close	PEP.Close	RYDAF.Close	TMUS.Close	UNH.Close	XOM.Close
date
01/02/2020	89.550003	100.000000	0.87	121.430000	145.970001	54.990002	94.800003	135.820007	30.000000	78.589996	292.500000	70.900002
01/03/2020	88.699997	101.500000	0.84	121.010002	144.279999	54.689999	93.250000	135.630005	30.180000	78.169998	289.540009	70.330002
01/06/2020	89.400002	99.699997	0.84	120.599998	144.100006	54.669998	94.349998	136.149994	30.469999	78.620003	291.549988	70.870003
01/07/2020	88.889999	99.699997	0.88	119.059998	144.979996	54.250000	95.000000	134.009995	30.549999	78.919998	289.790009	70.290001
01/08/2020	89.519997	99.699997	0.88	117.699997	144.960007	54.349998	94.720001	134.699997	30.139999	79.419998	295.899994	69.230003

This data_final DataFrame is the output of Filter 2 (market anomalies).
It contains only the stocks that:

Come from the EMH‑failed universe (Filter 1, from the previous workshop),
Have a profitable BB trading strategy (annualised return > 10%),
Have positive recent momentum.

In the next workshop, you will use data_final as the input universe for portfolio allocation (Filter 3), where you will construct and evaluate alternative portfolios based on these theoretically and empirically filtered assets.
