Market behavior and applied market anomalies

FZ2024 Financial Modeling and Programming

Author: Sergio Castellanos-Gamboa, PhD

Affiliation: Tecnológico de Monterrey

Published: November 18, 2025

0.1 Before you begin: important instructions for all Workshops

Welcome to our workshop series! Please read these instructions carefully before starting any activity. Following these guidelines will make your work smoother and ensure that your submissions are graded without issues.

0.1.1 Working environment

We will use Google Colab for all workshops. Colab runs Python in the cloud — you don’t need to install anything locally.

  • Access Colab at: https://colab.research.google.com/
  • Sign in with your institutional Google account for access to all features.
  • Always save a copy of the notebook to your Google Drive:
    • Go to File → Save a copy in Drive.

0.1.2 Loading data

You may work with datasets provided by the instructor or public datasets online. You will receive instructions each time to load the data with Python code. However, it is a good idea to store files, like data or your own notes, in a dedicated Google Drive folder:

  1. Create a folder in your Google Drive named fz2024_workshops (or similar).
  2. Upload your datasets there.

0.1.3 Output and submission format

  • After completing the workshop, export your notebook as PDF:
    • In Colab: File → Print → Save as PDF.
  • Submit the PDF file through Canvas, as well as the .ipynb.
  • Include all outputs, tables, and graphs in your PDF — make sure you run all cells before exporting.
  • Name your PDF file using the following format: Lastname_Firstname_WorkshopX.pdf.

0.1.4 Deadlines

All assignments must be uploaded to Canvas before the stated deadline. Late submissions are not accepted. Once you have read and understood these instructions, you are ready to begin the workshop!

1 Overview

This workshop is the second of three workshops that together explore how to build a portfolio grounded in Rational Agent Theory, Behavioural Finance, and the Market Anomalies literature. The process unfolds in three analytical stages:

  1. Filter 1 – Market Efficiency
    Test whether each stock’s historical information helps forecast its price. This step identifies inefficient markets, where past returns contain predictive signals.

  2. Filter 2 – Market Anomalies
    Examine systematic patterns—such as momentum or trend-following behaviour—that contradict the Efficient Market Hypothesis. Here, the strategy buys assets with upward trends and sells those trending downward.

  3. Filter 3 – Portfolio Allocation
    Optimize the portfolio composition using only the assets that pass the previous filters.

Unlike the traditional Markowitz framework, which focuses solely on optimizing asset weights, this three-part approach first conducts stock selection through theoretical filters. Each filter reflects assumptions derived from the rational agent and behavioural perspectives, allowing you to connect empirical testing with economic theory before moving into optimization.

In this workshop (Filter 2), we use technical trading rules and momentum as market anomalies to further filter the stock universe before portfolio construction.

2 Setup

2.1 Installing the ta library in Google Colab

In Google Colab, some libraries are not installed by default. If you get an error like ModuleNotFoundError: No module named 'ta', run the following cell once at the top of your notebook:

# Run this cell ONLY if you get "No module named 'ta'"
# !pip install ta  # remove the leading "#" and run this cell once if needed

We will use the same Python ecosystem as in all previous workshops.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import ta  # technical analysis library

# Optional: nicer plots
plt.style.use("seaborn-v0_8")

2.2 Technical overview

Library | Main Role in this Workshop
pandas | Store price data in DataFrame form, compute returns, loop over tickers.
NumPy | Basic numerical operations (log, exponentials, powers).
matplotlib | Visualise price series, bands, and trading signals.
ta | Compute Bollinger Bands, MACD, and other technical indicators.

3 Load price data

In the previous workshop, we started with 100 stocks and applied the Efficient Market Hypothesis (EMH) variance filter to obtain stocks for which the EMH fails (markets are inefficient). For this workshop, we will assume that the filter selected 56 stocks, contained in the following dataset:

file = "anomalies_local.csv"
data = pd.read_csv(file, index_col=0)

data.head()
TSLA.Close TSM.Close JNJ.Close UNH.Close JPM.Close TCEHY.Close TCTZF.Close XOM.Close BAC.Close PG.Close ... ACN.Close CSCO.Close LRLCF.Close CICHF.Close MCD.Close NKE.Close INTC.Close C.PJ.Close TMUS.Close TXN.Close
date
01/02/2020 86.052002 60.040001 145.970001 292.500000 141.089996 49.880001 49.880001 70.900002 35.639999 123.410004 ... 210.149994 48.419998 293.450012 0.87 200.789993 102.199997 60.840000 28.570000 78.589996 129.570007
01/03/2020 88.601997 58.060001 144.279999 289.540009 138.339996 49.029999 48.930000 70.330002 34.900002 122.580002 ... 209.800003 47.630001 297.130005 0.84 200.080002 101.919998 60.099998 28.719999 78.169998 127.849998
01/06/2020 90.307999 57.389999 144.100006 291.549988 138.229996 48.770000 48.700001 70.870003 34.849998 122.750000 ... 208.429993 47.799999 293.000000 0.84 202.330002 101.830002 59.930000 28.719999 78.620003 126.959999
01/07/2020 93.811996 58.320000 144.979996 289.790009 135.880005 49.779999 49.770000 70.290001 34.619999 121.989998 ... 203.929993 47.490002 288.549988 0.88 202.630005 101.779999 58.930000 28.629999 78.919998 129.410004
01/08/2020 98.428001 58.750000 144.960007 295.899994 136.940002 49.650002 49.650002 69.230003 34.970001 122.510002 ... 204.330002 47.520000 287.500000 0.88 205.910004 101.550003 58.970001 28.709999 79.419998 129.759995

5 rows × 56 columns

Each column is a stock (e.g. TSLA.Close, XOM.Close), and each row is a trading day.
The index is the date (stored as text), and the cell values are closing prices.
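
Because the index is stored as text, date-based operations such as slicing by date range are not available until it is converted. The following is a minimal sketch, assuming the dates follow the MM/DD/YYYY pattern seen above; the small `demo` frame is a hypothetical stand-in for `data`.

```python
import pandas as pd

# Hypothetical stand-in for `data`: text dates as the index, as in the CSV.
demo = pd.DataFrame(
    {"TSLA.Close": [86.052002, 88.601997, 90.307999]},
    index=pd.Index(["01/02/2020", "01/03/2020", "01/06/2020"], name="date"),
)

# Convert the text index to timestamps (assumed format: MM/DD/YYYY).
demo.index = pd.to_datetime(demo.index, format="%m/%d/%Y")

# With a DatetimeIndex, label-based date slicing works as expected.
first_days = demo.loc["2020-01-02":"2020-01-03"]
print(first_days)
```

The conversion is optional here: the code in this workshop selects rows by position (`iloc`), so it runs with either index type.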

The last observations of the dataset:

data.tail()
TSLA.Close TSM.Close JNJ.Close UNH.Close JPM.Close TCEHY.Close TCTZF.Close XOM.Close BAC.Close PG.Close ... ACN.Close CSCO.Close LRLCF.Close CICHF.Close MCD.Close NKE.Close INTC.Close C.PJ.Close TMUS.Close TXN.Close
date
05/20/2022 663.900024 90.779999 176.979996 485.730011 117.339996 44.250000 43.900002 91.860001 33.860001 141.789993 ... 276.649994 42.939999 321.929993 0.71 233.910004 108.000000 41.650002 25.530001 126.040001 169.809998
05/23/2022 674.900024 91.500000 179.440002 492.079987 124.599998 43.639999 43.700001 93.889999 35.869999 145.050003 ... 283.390015 43.349998 331.079987 0.71 238.000000 108.629997 42.000000 25.320000 129.889999 169.929993
05/24/2022 628.159973 88.720001 181.399994 497.559998 126.360001 42.040001 41.849998 94.400002 35.650002 147.630005 ... 279.309998 43.770000 331.369995 0.68 244.520004 107.290001 41.669998 25.799999 129.220001 167.860001
05/25/2022 658.799988 90.410004 179.619995 498.089996 127.239998 42.570000 42.500000 96.300003 35.840000 145.210007 ... 279.640015 44.000000 325.579987 0.72 244.009995 108.199997 42.200001 26.090000 131.440002 170.009995
05/26/2022 707.729980 91.000000 179.460007 502.230011 129.440002 44.029999 44.040001 96.639999 36.669998 146.479996 ... 291.549988 44.990002 332.010010 0.73 248.089996 112.940002 43.480000 26.180000 132.740005 174.130005

5 rows × 56 columns

Final goal: from this set of 56 “inefficient” stocks, we will apply two market anomalies and keep only the stocks whose anomaly-based strategies produce sufficiently high returns.

4 Market Anomalies: technical analysis & momentum

4.1 Efficient Market Hypothesis (EMH) vs anomalies

Under a strict form of the EMH, past prices and past returns should contain no useful information to predict future returns. Prices should follow a (conditional) martingale, and any predictable pattern should be arbitraged away. A conditional martingale is just a process where, given everything you know right now, your best prediction for tomorrow is simply today’s value.

In formal terms:

\mathbb{E}[X_{t+1} \mid \mathcal{F}_t] = X_t

where \mathcal{F}_t represents all information available up to time t.

4.1.1 Intuition (Martingale = “Fair Game”)

Imagine a fair coin-flip game:

  • Heads → you win $1
  • Tails → you lose $1

Let X_t be your wealth after t flips.
Even if you are currently up or down, your expected wealth after the next flip is:

\mathbb{E}[X_{t+1} \mid X_t] = \tfrac{1}{2}(X_t + 1) + \tfrac{1}{2}(X_t - 1) = X_t.

  • Nothing you know about past flips helps you beat the game.
  • No strategy can tilt the expected outcome.
  • It is conditionally fair: today’s value is the best forecast for tomorrow.

4.1.2 Why finance cares

If asset prices follow a conditional martingale, then:

  1. Past prices do not help predict future prices,
  2. No trading strategy can systematically make profits from historical data,
  3. Markets behave as if they were informationally efficient.

This is the idea behind testing the Efficient Market Hypothesis (EMH): if returns behave like a martingale difference sequence, then past information has no predictive power.

# Simulate a simple martingale: fair coin random walk

np.random.seed(42)
n_steps = 200

# +1 for heads, -1 for tails, each with prob 0.5

steps = np.random.choice([-1, 1], size=n_steps)

# Wealth process X_t = cumulative sum of wins/losses

X = np.cumsum(steps)

plt.figure(figsize=(8, 4))
plt.plot(X)
plt.axhline(0, linestyle="--")  # starting wealth = 0
plt.xlabel("Time step")
plt.ylabel("Wealth $X_t$")
plt.title("Simulated Martingale: Fair Coin Random Walk")
plt.show()
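
The simulation shows one path, but the martingale property is a statement about averages across paths. As a small check, the sketch below enumerates every possible sequence of four fair coin flips and computes the average wealth after flip 4 among the paths that share the same wealth after flip 3. The choice of four flips is arbitrary and only keeps the enumeration small.

```python
from itertools import product

import numpy as np

n_flips = 4
# All 2**4 = 16 equally likely sequences of -1/+1 outcomes.
paths = np.array(list(product([-1, 1], repeat=n_flips)))
wealth = paths.cumsum(axis=1)

x_now = wealth[:, 2]   # wealth after 3 flips
x_next = wealth[:, 3]  # wealth after 4 flips

# Average next-period wealth within each group of paths sharing today's wealth.
cond_exp = {int(v): float(x_next[x_now == v].mean()) for v in np.unique(x_now)}
print(cond_exp)  # {-3: -3.0, -1: -1.0, 1: 1.0, 3: 3.0}
```

In every group the conditional average equals the current wealth, which is exactly the condition \mathbb{E}[X_{t+1} \mid X_t] = X_t.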

However, many empirical studies document anomalies, i.e. return patterns that seem predictable using public information:

  • Trend-following / technical analysis: rules based on moving averages, bands, oscillators.
  • Short-run momentum: recent winners continue to outperform recent losers (3–12 months).
  • Long-run reversal: extreme past winners tend to underperform in the long run (2–5 years).

In this workshop, we implement:

  1. A Bollinger Band trading strategy (technical analysis anomaly).
  2. A momentum filter based on past performance.

4.2 Common Equity & Behavioral Market Anomalies

The following table summarizes twelve well-documented anomalies in empirical finance.
Students may use this table as a starting point for research and for identifying how each anomaly can be translated into a trading rule.

Anomaly | Economic / Behavioral Interpretation | How It Can Be Used for Trading
1. Short-Run Momentum (used in this workshop) | Investors underreact to news, so trends persist for months. | Buy recent winners, short losers over 3–12 months.
2. Technical Analysis Signals (used in this workshop) | Prices show patterns inconsistent with EMH; traders overreact/underreact. | Use BB, MACD, and moving averages to produce buy/sell signals.
3. Long-Run Reversal | Investors overreact in the long run; extreme past performance reverses. | Buy long-term losers, short long-term winners (3–5 years).
4. Size Effect (Small-Firm Premium) | Small firms are riskier or neglected, leading to excess returns. | Tilt portfolio toward small-cap stocks.
5. Value Effect (HML) | Investors overreact to growth narratives; cheap stocks outperform. | Buy high book-to-market or low P/E stocks.
6. Low-Volatility Anomaly | High-risk stocks underperform; investors overpay for “lottery” stocks. | Buy low-volatility stocks; short high-volatility stocks.
7. Post-Earnings Announcement Drift (PEAD) | Investors underreact to earnings surprises. | Buy stocks with positive earnings surprises; short the negative ones.
8. January Effect | Tax-loss selling and behavioral resets create seasonal price pressure. | Overweight small caps in January.
9. Holiday Effect / Weekend Effect | Investor mood, volume, and liquidity cause predictable calendar patterns. | Go long before holidays; avoid holding over weekends.
10. Disposition Effect | Retail investors sell winners too early and hold losers too long. | Look for continuation in prices of assets retail investors are selling.
11. Overreaction Effect | Prices overshoot on news and gradually revert. | Short extreme jumps, buy after extreme negative shocks.
12. Analyst Forecast Bias / Underreaction | Analysts adjust forecasts slowly due to anchoring. | Trade revision momentum (buy upgrades, short downgrades).

5 Analysis with Candlestick Charts

A candlestick chart lets you see how the price of an instrument has changed over a specific period. Each “candle” represents the price movement in a time interval, which can be daily, weekly, etc.

Green and red candles, together with body size and shadows, offer information about the direction and strength of price action. Start by looking for candles with large bodies or long shadows, as these indicate significant activity and can help anticipate price movements.

5.1 How to Interpret the Basic Elements of a Candle

  • Color of the candle:
    • Green: The price closed higher than it opened. Indicates an upward movement.
    • Red: The price closed lower than it opened. Indicates a downward movement.
  • Body of the candle:
    • Regardless of color, the size of the body indicates the strength of the move:
    • Large body: Shows a significant price change from open to close, indicating strong buying or selling pressure.
    • Small body: Indicates indecision or little movement; it is common in quiet markets or in potential turning points.
  • Wicks (or Shadows): The lines that extend above and below the body.
    • Upper shadow: Shows the maximum price reached during the candle period.
    • Lower shadow: Shows the minimum price reached during the candle period.
  • Length of the shadows:
    • Long shadows: Indicate volatility and possible trend changes. If the upper shadow is long, buyers pushed the price up, but sellers brought it back down.
    • Short shadows: Show stability in price movement during the period.
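
The elements above can be computed directly from open, high, low, and close prices. The sketch below uses three hypothetical candles (the workshop dataset contains closing prices only), and treating a candle with close equal to open as green is an assumption made for simplicity.

```python
import numpy as np
import pandas as pd

# Hypothetical OHLC values for three candles (not taken from the workshop data).
ohlc = pd.DataFrame({
    "open":  [10.0, 12.0, 11.0],
    "high":  [12.5, 12.2, 11.6],
    "low":   [9.8, 10.5, 10.4],
    "close": [12.0, 11.0, 11.05],
})

body_top = ohlc[["open", "close"]].max(axis=1)
body_bottom = ohlc[["open", "close"]].min(axis=1)

# Green if the close is at or above the open (assumed tie rule), else red.
ohlc["color"] = np.where(ohlc["close"] >= ohlc["open"], "green", "red")
ohlc["body"] = body_top - body_bottom
ohlc["upper_shadow"] = ohlc["high"] - body_top
ohlc["lower_shadow"] = body_bottom - ohlc["low"]
print(ohlc)
```

The first candle has a large green body (2.0) with short shadows, while the third has a very small body (0.05) with shadows more than ten times as long.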

5.1.1 Basic Pattern Examples

Some simple patterns you can look for:

  • Large Green Candle: Strong buying pressure. The price rose significantly from open to close.
  • Large Red Candle: Selling signal. The price fell significantly from open to close.
  • Doji (very small body with long shadows): Indicates market indecision; it can signal a trend reversal.

5.2 Basic Candlestick Technical Analysis Concepts

Candlestick charts show price variations over time and are a key tool to analyze the behavior of financial instruments. The following are some of the most common patterns you can identify and how to interpret them:

5.2.1 Doji

  • Shape: Very small body with long shadows above and below.
  • Meaning: Indicates indecision in the market. If it appears after a strong trend (either bullish or bearish), it may signal a possible trend reversal.

5.2.2 Hammer

  • Shape: Small body with a long lower shadow and almost no upper shadow.
  • Meaning: Bullish signal suggesting that although there was selling pressure, buyers stepped in and pushed the price up. It is more reliable after a series of red candles (downtrend).

5.2.3 Shooting Star

  • Shape: Small body with a long upper shadow and almost no lower shadow.
  • Meaning: Bearish signal. It suggests that buyers pushed the price up, but sellers took control. This pattern is more reliable after a series of green candles (uptrend).

5.2.4 Bullish Engulfing

  • Shape: A large green candle that completely engulfs the previous red candle.
  • Meaning: Strong signal that the trend may turn bullish, as buyers have taken control.

5.2.5 Bearish Engulfing

  • Shape: A large red candle that completely engulfs the previous green candle.
  • Meaning: Indicates a possible bearish reversal, as sellers have dominated.
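
These verbal definitions can be turned into simple boolean rules. The sketch below is one possible translation, not a standard: the thresholds (body at most 10% of the range for a doji, a shadow at least twice the body for "long", at most half the body for "almost no shadow") are assumptions, and the five candles are hypothetical values chosen so that each rule fires at least once.

```python
import pandas as pd

# Hypothetical OHLC candles (not taken from the workshop data).
ohlc = pd.DataFrame({
    "open":  [10.0, 9.0, 10.5, 10.0, 10.3],
    "high":  [10.2, 10.6, 11.0, 10.35, 11.0],
    "low":   [9.0, 8.9, 10.0, 9.2, 9.95],
    "close": [9.2, 10.5, 10.52, 10.3, 10.0],
})
o, h, l, c = ohlc["open"], ohlc["high"], ohlc["low"], ohlc["close"]

body = (c - o).abs()
candle_range = h - l
upper = h - pd.concat([o, c], axis=1).max(axis=1)   # upper shadow
lower = pd.concat([o, c], axis=1).min(axis=1) - l   # lower shadow
prev_o, prev_c = o.shift(1), c.shift(1)             # previous candle

# Assumed thresholds: 0.10 * range (doji), 2 * body (long), 0.5 * body (short).
patterns = pd.DataFrame({
    "doji": body <= 0.10 * candle_range,
    "hammer": (lower >= 2 * body) & (upper <= 0.5 * body),
    "shooting_star": (upper >= 2 * body) & (lower <= 0.5 * body),
    # Current body covers the previous body, with opposite colors.
    "bullish_engulfing": (prev_c < prev_o) & (c > o) & (o <= prev_c) & (c >= prev_o),
    "bearish_engulfing": (prev_c > prev_o) & (c < o) & (o >= prev_c) & (c <= prev_o),
})
print(patterns)
```

A candle can match more than one rule: the last candle here is both a shooting star and a bearish engulfing of the previous green candle. The rules also ignore the preceding trend, which the descriptions above treat as important for reliability.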

5.3 Bollinger Bands (BB): Trend and Volatility Tool

In addition to candlestick patterns, you can apply Bollinger Bands, a tool for identifying overbought or oversold levels. The bands are built around a moving average of the price and widen or narrow with the asset’s volatility, showing how far the price has moved away from its average.

  • If the price approaches the upper band, the asset may be overbought (price could fall soon). It may be wise to avoid additional long positions, as the price is likely to stabilize or fall.

  • If the price approaches the lower band, the asset may be oversold (price could rise in the short term). It may signal a possible recovery and a buying opportunity.

For a window of size n:

  • Middle band:
    \text{MB}_t = \text{SMA}_n(P_t) = \frac{1}{n} \sum_{i=0}^{n-1} P_{t-i}
  • Standard deviation of price over the same window: \sigma_n(P_t).
  • Upper band:
    \text{UB}_t = \text{MB}_t + k \cdot \sigma_n(P_t)
  • Lower band:
    \text{LB}_t = \text{MB}_t - k \cdot \sigma_n(P_t)

Typical choices: n = 20 days, k = 2.
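
The band formulas map directly onto pandas rolling windows. The sketch below is a minimal illustration on a hypothetical simulated price path. It assumes the population standard deviation (`ddof=0`), which to our understanding matches the `ta` default; compare against the library output if exact agreement matters.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical price path (random walk around 100), for illustration only.
prices = pd.Series(100 + rng.normal(0, 1, 60).cumsum())

n, k = 20, 2
mb = prices.rolling(window=n).mean()           # middle band: SMA_n
# Assumption: population standard deviation (ddof=0); pandas defaults to ddof=1.
sigma = prices.rolling(window=n).std(ddof=0)
ub = mb + k * sigma                            # upper band
lb = mb - k * sigma                            # lower band

bands = pd.DataFrame({"price": prices, "lband": lb, "mband": mb, "hband": ub})
print(bands.iloc[n - 2 : n + 2])
```

The first n - 1 rows are NaN because the window is not yet full, which is why band-based signals cannot appear before the 20th observation.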

5.4 MACD (Moving Average Convergence Divergence)

MACD is a trend-following indicator based on exponential moving averages (EMAs).

  • Fast EMA (e.g. 12-day): \text{EMA}_{\text{fast}, t}
  • Slow EMA (e.g. 26-day): \text{EMA}_{\text{slow}, t}
  • MACD line:
    \text{MACD}_t = \text{EMA}_{\text{fast}, t} - \text{EMA}_{\text{slow}, t}
  • Signal line: EMA of MACD (e.g. 9 days):
    \text{Signal}_t = \text{EMA}_{9}(\text{MACD}_t)

Typical trading rule:

  • If \text{MACD}_t crosses above the signal line → bullish (buy).

  • If \text{MACD}_t crosses below the signal line → bearish (sell).

In this workshop, Bollinger Bands drive the trading signal, while MACD is used mainly for illustration and intuition.
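
To make the crossover rule concrete, the sketch below builds the MACD line and signal line with pandas and flags the days on which they cross. The V-shaped price path is hypothetical, and the recursive EMA form (`adjust=False`) is an assumption about how the averages are initialised.

```python
import numpy as np
import pandas as pd

# Hypothetical V-shaped price path: falls for 60 days, then rises.
t = np.arange(120)
prices = pd.Series(70 + 0.5 * np.abs(t - 60))

# adjust=False gives the recursive EMA; this setting is an assumption here.
ema_fast = prices.ewm(span=12, adjust=False).mean()
ema_slow = prices.ewm(span=26, adjust=False).mean()
macd = ema_fast - ema_slow
signal_line = macd.ewm(span=9, adjust=False).mean()
hist = macd - signal_line

# A crossover is a sign change of (MACD - signal) between consecutive days.
cross_up = (hist > 0) & (hist.shift(1) <= 0)    # bullish
cross_down = (hist < 0) & (hist.shift(1) >= 0)  # bearish

print("Bullish crossovers at:", list(cross_up[cross_up].index))
print("Bearish crossovers at:", list(cross_down[cross_down].index))
```

On this path the bearish crossover appears as soon as the decline starts, and the bullish crossover appears right after the price turns, which illustrates why MACD is described as trend-following.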

5.4.1 Daily returns and annualised strategy performance

Let P_t be the closing price on day t. The continuously compounded (log) return from day t to t+1 is:

r_{t+1} = \ln\left(\frac{P_{t+1}}{P_t}\right).

If we take a position s_t \in \{-1,0,+1\} on day t (short, neutral, long), then the daily strategy return is:

r_{t+1}^{\text{strat}} = s_t \cdot r_{t+1}.

To compute an annualised return from average daily log returns:

  1. Compute the average daily strategy return \bar r = \frac{1}{T} \sum_{t=1}^{T} r_t^{\text{strat}}.
  2. Annualise using 252 trading days: r_{\text{ann}} = \exp\big(252 \cdot \bar r\big) - 1.

This is consistent with continuously compounded returns: daily log returns add up over the year, and exponentiating the sum gives the gross return; subtracting 1 gives the net annual return.
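
A short numerical check of both steps follows. The three prices and the value 0.001 are hypothetical; the value -0.008734 is the mean daily strategy return reported for TSLA later in this workshop. The helper name `annualise` is introduced here for illustration only.

```python
import numpy as np

# Three hypothetical daily prices: log returns add up across days.
prices = np.array([100.0, 102.0, 99.0])
log_returns = np.log(prices[1:] / prices[:-1])
total_log_return = log_returns.sum()  # equals ln(99 / 100)

def annualise(mean_daily_log_return, trading_days=252):
    """Net annual return implied by an average daily log return."""
    return np.exp(trading_days * mean_daily_log_return) - 1

print(annualise(0.001))      # about 0.2866, roughly 28.7% per year
print(annualise(-0.008734))  # about -0.8893
```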


5.5 Example: Bollinger Band strategy for one stock (TSLA)

For exposition, we start with the first stock in the dataset: TSLA.Close.

ticker = "TSLA.Close"
tsla = data[ticker]

tsla.head()
date
01/02/2020    86.052002
01/03/2020    88.601997
01/06/2020    90.307999
01/07/2020    93.811996
01/08/2020    98.428001
Name: TSLA.Close, dtype: float64

5.5.1 Compute Bollinger Bands with ta

We use a 20-day window and 2 standard deviations, as in the reference code.

bb = ta.volatility.BollingerBands(close=tsla, window=20, window_dev=2)

# Upper and lower bands
hb = bb.bollinger_hband()
lb = bb.bollinger_lband()

hb.head(), lb.head()
(date
 01/02/2020   NaN
 01/03/2020   NaN
 01/06/2020   NaN
 01/07/2020   NaN
 01/08/2020   NaN
 Name: hband, dtype: float64,
 date
 01/02/2020   NaN
 01/03/2020   NaN
 01/06/2020   NaN
 01/07/2020   NaN
 01/08/2020   NaN
 Name: lband, dtype: float64)

5.5.2 Build a DataFrame with price and bands

band_df = tsla.copy()
band_df = pd.concat([band_df, hb, lb], axis=1)
band_df.columns = ["price", "hband", "lband"]

band_df.head()
price hband lband
date
01/02/2020 86.052002 NaN NaN
01/03/2020 88.601997 NaN NaN
01/06/2020 90.307999 NaN NaN
01/07/2020 93.811996 NaN NaN
01/08/2020 98.428001 NaN NaN

5.5.3 Plot price and bands

band_df.plot(figsize=(10, 4), title=f"{ticker}: Price and Bollinger Bands")
plt.xlabel("Date")
plt.ylabel("Price")
plt.show()


5.6 Trading rule and signal construction

We now define the trading signal based on the bands:

  • If price is above or equal to the upper band: signal = -1 (sell/short).
  • If price is below or equal to the lower band: signal = +1 (buy/long).
  • Otherwise: signal = 0 (do nothing).

We implement this using a simple for loop over all days.

signal = []

for j in range(0, band_df.shape[0]):
    price_j = band_df["price"].iloc[j]
    h_j = band_df["hband"].iloc[j]
    l_j = band_df["lband"].iloc[j]

    if price_j >= h_j:
        signal.append(-1)
    elif price_j <= l_j:
        signal.append(1)
    else:
        signal.append(0)

signal[:10]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Wrap the signal in a DataFrame and plot:

signal_df = pd.DataFrame(signal, index=band_df.index, columns=["signal"])
signal_df.plot(figsize=(10, 3), title=f"{ticker}: Trading Signal (BB rule)")
plt.xlabel("Date")
plt.show()

Note that at the beginning we see many zeros because the bands themselves are NaN until the rolling window has at least 20 observations.
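
The same rule can be written without an explicit loop. The sketch below uses `np.select` on a small hypothetical `band_df`, where the first row mimics the NaN warm-up period. Comparisons against NaN evaluate to False, so those rows fall through to the default signal of 0, just as in the loop.

```python
import numpy as np
import pandas as pd

# Hypothetical price/band values; the first row mimics the NaN warm-up period.
band_df = pd.DataFrame({
    "price": [100.0, 105.0, 95.0, 100.0, 110.0],
    "hband": [np.nan, 104.0, 104.0, 104.0, 110.0],
    "lband": [np.nan, 96.0, 96.0, 96.0, 98.0],
})

conditions = [
    band_df["price"] >= band_df["hband"],  # at or above upper band -> short
    band_df["price"] <= band_df["lband"],  # at or below lower band -> long
]
band_df["signal"] = np.select(conditions, [-1, 1], default=0)
print(band_df["signal"].tolist())  # [0, -1, 1, 0, -1]
```

The loop version is easier to read step by step; the vectorised version is shorter and faster on long series. Both produce the same signals.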


5.7 Strategy returns for a single stock

We now compute the daily strategy returns using continuously compounded (log) returns.

For a given day index j:

  • Position today: s_j (taken from signal_df["signal"].iloc[j]).
  • Price today: P_j.
  • Price tomorrow: P_{j+1}.
  • Daily log return:
    r_{j+1} = \ln\left(\frac{P_{j+1}}{P_j}\right).
  • Strategy return:
    r_{j+1}^{\text{strat}} = s_j \cdot r_{j+1}.

We cannot compute r_{j+1} for the last row, because there is no P_{j+1}.
So we only loop up to band_df.shape[0] - 1.

# Indices where there is a non-zero signal
sig_idx = []
for i in range(signal_df.shape[0] - 1):  # up to second-to-last row
    if signal_df["signal"].iloc[i] != 0:
        sig_idx.append(i)

sig_idx[:10]
[19, 20, 21, 22, 45, 50, 52, 69, 70, 71]

Compute the log strategy return for each signal day:

ret_i = []

for j in sig_idx:
    p_today = band_df["price"].iloc[j]
    p_next = band_df["price"].iloc[j + 1]
    log_ret = np.log(p_next / p_today)
    strat_ret = log_ret * signal_df["signal"].iloc[j]
    ret_i.append(strat_ret)

ret_i[:5]
[np.float64(-0.015115861789670724),
 np.float64(-0.18144503336057394),
 np.float64(-0.12861872099095273),
 np.float64(0.18845037443736087),
 np.float64(0.059586901859173265)]

Wrap in a DataFrame:

ret = pd.DataFrame(ret_i, columns=["ret"])
ret.describe()
ret
count 91.000000
mean -0.008734
std 0.056662
min -0.181445
25% -0.031896
50% -0.009899
75% 0.016849
max 0.188450

Annualise using 252 trading days and log-return logic:

mean_daily_log = ret["ret"].mean()
ann_ret_tsla = np.exp(252 * mean_daily_log) - 1

ann_ret_tsla
np.float64(-0.8892911665273225)

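Interpretation: a large positive annualised return would mean that the Bollinger Band strategy performed well on this stock (ignoring transaction costs and other frictions). Here the value is about -89%, so the rule would have lost money on TSLA over this sample. Note that the mean is taken over the 91 days with an active signal only, so the figure describes a year of continuously active trading rather than a calendar year.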
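
The return calculation can also be vectorised with `shift(-1)`, which aligns the next day's price with today's signal. The sketch below uses hypothetical prices and signals. It also shows, as a comparison only, an alternative annualisation that averages over all days and counts flat days as zero; that alternative is an assumption for illustration and is not the rule used in this workshop.

```python
import numpy as np
import pandas as pd

# Hypothetical prices and signals for six days.
price = pd.Series([100.0, 98.0, 101.0, 103.0, 102.0, 104.0])
signal = pd.Series([0, 1, 0, -1, 0, 1])

# Log return from day j to day j+1, aligned with the signal on day j.
next_log_ret = np.log(price.shift(-1) / price)
strat_ret = (signal * next_log_ret).iloc[:-1]  # the last day has no next price

active = strat_ret[signal.iloc[:-1] != 0]      # days with an open position
ann_active = np.exp(252 * active.mean()) - 1   # convention used in this workshop

# Alternative (assumption, for comparison): average over all days, flat days = 0.
ann_calendar = np.exp(252 * strat_ret.mean()) - 1

print(len(active), ann_active, ann_calendar)
```

When the average active-day return is positive, the active-days figure is larger than the all-days figure, because it implicitly assumes that the strategy earns its average signal-day return on each of the 252 trading days.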


5.8 MACD illustration for the same stock

To connect with another classical technical indicator, we compute and plot MACD.

macd_ind = ta.trend.MACD(close=tsla, window_slow=26, window_fast=12, window_sign=9)

macd_line = macd_ind.macd()
signal_line = macd_ind.macd_signal()
hist = macd_ind.macd_diff()

macd_df = pd.concat([macd_line, signal_line, hist], axis=1)
macd_df.columns = ["macd", "signal", "hist"]

macd_df.tail()
macd signal hist
date
05/20/2022 -70.019921 -56.831104 -13.188817
05/23/2022 -71.635804 -59.792044 -11.843760
05/24/2022 -75.813998 -62.996435 -12.817564
05/25/2022 -75.779317 -65.553011 -10.226306
05/26/2022 -70.985316 -66.639472 -4.345844

Plot MACD and signal:

macd_df[["macd", "signal"]].plot(figsize=(10, 3), title=f"{ticker}: MACD and Signal")
plt.axhline(0, color="black", linewidth=1)
plt.xlabel("Date")
plt.show()

A common (but not the only) rule is: when MACD crosses above the signal line, it is bullish; when it crosses below, it is bearish.
In this workshop, the active trading rule is defined by Bollinger Bands, and MACD is used for intuition and further exploration.


6 Bollinger Band strategy for all stocks

We now repeat the TSLA procedure for every stock in the data DataFrame, storing the annualised strategy return for each ticker.

The loop does the following for each column (stock):

  1. Compute BB (20, 2).
  2. Build band_df with price and bands.
  3. Compute the signal series.
  4. Extract non-zero signal indices.
  5. Compute daily log strategy returns for those days.
  6. Compute the annualised return from log returns.

ret_all = []

for ticker in data.columns:
    price_series = data[ticker]

    # 1. Bollinger Bands
    bb = ta.volatility.BollingerBands(close=price_series, window=20, window_dev=2)
    hb = bb.bollinger_hband()
    lb = bb.bollinger_lband()

    # 2. Price + bands
    band_df = pd.concat([price_series, hb, lb], axis=1)
    band_df.columns = ["price", "hband", "lband"]

    # 3. Signals
    signal = []
    # Up to second-to-last index (we always need j+1)
    for j in range(0, band_df.shape[0] - 1):
        price_j = band_df["price"].iloc[j]
        h_j = band_df["hband"].iloc[j]
        l_j = band_df["lband"].iloc[j]

        if price_j >= h_j:
            signal.append(-1)
        elif price_j <= l_j:
            signal.append(1)
        else:
            signal.append(0)

    signal_df = pd.DataFrame(signal, index=band_df.index[:-1], columns=["signal"])

    # 4. Indices with active signals
    sig_idx = []
    for i in range(signal_df.shape[0]):
        if signal_df["signal"].iloc[i] != 0:
            sig_idx.append(i)

    # Edge case: if no signals, the strategy never trades
    if len(sig_idx) == 0:
        ret_all.append(0.0)
        continue

    # 5. Daily log strategy returns
    ret_i = []
    for j in sig_idx:
        p_today = band_df["price"].iloc[j]
        p_next = band_df["price"].iloc[j + 1]
        log_ret = np.log(p_next / p_today)
        strat_ret = log_ret * signal_df["signal"].iloc[j]
        ret_i.append(strat_ret)

    ret = pd.DataFrame(ret_i, columns=["ret"])

    # 6. Annualised return from log returns
    mean_daily_log = ret["ret"].mean()
    ann_ret = np.exp(252 * mean_daily_log) - 1

    ret_all.append(ann_ret)

# Build DataFrame with annualised strategy returns per ticker
ret_df = pd.DataFrame(ret_all, index=data.columns, columns=["ann_ret"]).sort_values(by="ann_ret", ascending=True)

ret_df.head()
ann_ret
TSLA.Close -0.889291
WFC.PL.Close -0.630579
BML.PH.Close -0.613088
BML.PL.Close -0.468792
BAC.PE.Close -0.389758

Because ret_df is sorted in ascending order, head() shows the worst performers. Now look at the best performers with tail():

ret_df.tail()
ann_ret
TCTZF.Close 13.617779
BABAF.Close 15.152149
LRLCF.Close 52.640184
CICHF.Close 54.735516
RHHBF.Close 437.311217

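We now have, for each stock, the annualised return of the Bollinger Band strategy using log returns. Extreme values such as 437.31 (that is, 43,731% per year) partly reflect the annualisation convention: the mean is taken over signal days only and is then compounded over 252 days.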
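
The six steps can also be packaged into one function and applied column by column. The sketch below is one possible refactoring under stated assumptions: the bands are computed with pandas rolling windows and the population standard deviation (`ddof=0`) instead of calling `ta`, the function name `bb_annualised_return` is hypothetical, and the two-column `demo` panel is simulated data standing in for `data`.

```python
import numpy as np
import pandas as pd

def bb_annualised_return(price, window=20, k=2, trading_days=252):
    """Annualised mean log return of the BB rule over days with an active signal."""
    mid = price.rolling(window).mean()
    sd = price.rolling(window).std(ddof=0)  # assumption: population std
    upper, lower = mid + k * sd, mid - k * sd

    # -1 at or above the upper band, +1 at or below the lower band, else 0.
    signal = pd.Series(
        np.select([price >= upper, price <= lower], [-1, 1], default=0),
        index=price.index,
    )
    next_log_ret = np.log(price.shift(-1) / price)
    strat = (signal * next_log_ret).iloc[:-1]  # last day has no next price
    active = strat[signal.iloc[:-1] != 0]
    if active.empty:
        return 0.0  # the strategy never trades
    return float(np.exp(trading_days * active.mean()) - 1)

# Hypothetical two-stock panel standing in for `data`.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "AAA.Close": 50 * np.exp(rng.normal(0, 0.02, 300).cumsum()),
    "FLAT.Close": np.full(300, 25.0),
})
ret_demo = demo.apply(bb_annualised_return).to_frame("ann_ret").sort_values("ann_ret")
print(ret_demo)
```

Keeping the logic in a function makes it easier to change the window or the band width in one place and rerun the filter for all tickers.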


6.1 Filter by Bollinger strategy performance

We apply a simple rule: keep only the stocks whose annualised BB strategy return is higher than 10%.

threshold = 0.10
ret_filt = ret_df[ret_df["ann_ret"] > threshold]

ret_filt.head()
ann_ret
COST.Close 0.101253
CSCO.Close 0.113825
PG.Close 0.185918
CVX.Close 0.194800
JPM.PD.Close 0.216114

Number of stocks passing the BB filter:

len(ret_filt)
44

Now we keep only the price series of those stocks:

data_bb = data.loc[:, ret_filt.index]
data_bb.head()
COST.Close CSCO.Close PG.Close CVX.Close JPM.PD.Close XOM.Close RYDAF.Close HD.Close NKE.Close WFC.PQ.Close ... NVSEF.Close TMUS.Close INTC.Close PEP.Close TCEHY.Close TCTZF.Close BABAF.Close LRLCF.Close CICHF.Close RHHBF.Close
date
01/02/2020 291.489990 48.419998 123.410004 121.430000 27.639999 70.900002 30.000000 219.660004 102.199997 27.629999 ... 94.800003 78.589996 60.840000 135.820007 49.880001 49.880001 27.25 293.450012 0.87 324.950012
01/03/2020 291.730011 47.630001 122.580002 121.010002 27.660000 70.330002 30.180000 218.929993 101.919998 27.780001 ... 93.250000 78.169998 60.099998 135.630005 49.029999 48.930000 26.91 297.130005 0.84 319.750000
01/06/2020 291.809998 47.799999 122.750000 120.599998 27.580000 70.870003 30.469999 219.960007 101.830002 27.690001 ... 94.349998 78.620003 59.930000 136.149994 48.770000 48.700001 26.91 293.000000 0.84 319.750000
01/07/2020 291.350006 47.490002 121.989998 119.059998 27.500000 70.290001 30.549999 218.520004 101.779999 27.500000 ... 95.000000 78.919998 58.930000 134.009995 49.779999 49.770000 26.91 288.549988 0.88 322.049988
01/08/2020 294.690002 47.520000 122.510002 117.699997 27.549999 69.230003 30.139999 221.789993 101.550003 27.559999 ... 94.720001 79.419998 58.970001 134.699997 49.650002 49.650002 26.91 287.500000 0.88 322.049988

5 rows × 44 columns

At this point, data_bb contains only the stocks for which the simple Bollinger Band strategy would have delivered more than 10% annualised return over the sample (ignoring transaction costs and constraints).


7 Momentum anomaly filter

The second anomaly we use is momentum: stocks that performed well over a recent window (winners) tend to continue performing well in the short run.

7.1 Momentum definition

Let \text{MOM}_t^{(k)} be the k-day cumulative return up to date t.
A simple definition using log returns is:

\text{MOM}_t^{(k)} = \sum_{i=1}^{k} r_{t+1-i},

where r_{t} are daily log returns.

If \text{MOM}_t^{(k)} is positive and large, the stock has been a recent winner; if it is negative, it has been a recent loser.

In practice, many empirical studies use 3–12 months as the momentum window.
Here, we use a 6‑month proxy: 126 trading days.
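
Because log returns add up, the rolling sum in the definition telescopes to the log of the k-day price ratio, \ln(P_t / P_{t-k}). The sketch below checks this on hypothetical prices, using k = 3 only to keep the example small.

```python
import numpy as np
import pandas as pd

# Hypothetical prices; k = 3 keeps the example small (the workshop uses k = 126).
price = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 104.0])
k = 3

log_ret = np.log(price / price.shift(1))
mom = log_ret.rolling(window=k).sum()

# The rolling sum telescopes to the log of the k-day price ratio.
direct = np.log(price / price.shift(k))

# A log momentum converts to a simple return via exp(MOM) - 1.
simple_ret = np.exp(mom) - 1
print(pd.DataFrame({"mom": mom, "direct": direct, "simple_ret": simple_ret}))
```

The same conversion helps when reading the momentum values in this workshop: a log momentum of 0.420268, the value reported for XOM at the end of the sample, corresponds to a simple return of about 52%.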

7.2 Compute daily log returns and 6-month momentum

We compute log returns for each stock and then the 126‑day rolling sum.

# 1) Daily log returns for each stock
log_ret_all = np.log(data / data.shift(1))

# 2) 126-day rolling momentum (approx. 6 months)
window_mom = 126
mom_126 = log_ret_all.rolling(window=window_mom).sum()

mom_126.tail()
TSLA.Close TSM.Close JNJ.Close UNH.Close JPM.Close TCEHY.Close TCTZF.Close XOM.Close BAC.Close PG.Close ... ACN.Close CSCO.Close LRLCF.Close CICHF.Close MCD.Close NKE.Close INTC.Close C.PJ.Close TMUS.Close TXN.Close
date
05/20/2022 -0.501638 -0.306911 0.085974 0.077584 -0.328981 -0.343710 -0.359155 0.367495 -0.313339 -0.036902 ... -0.292856 -0.222305 -0.392301 0.028573 -0.080348 -0.461577 -0.176301 -0.073604 0.074596 -0.130442
05/23/2022 -0.521637 -0.306037 0.096766 0.111867 -0.255799 -0.370330 -0.368956 0.436675 -0.235611 -0.012129 ... -0.266111 -0.205691 -0.376600 0.088293 -0.056960 -0.476152 -0.164707 -0.080407 0.116379 -0.138433
05/24/2022 -0.610678 -0.340826 0.127407 0.128548 -0.262863 -0.381878 -0.392559 0.428341 -0.260961 -0.001151 ... -0.262907 -0.221085 -0.352174 -0.014599 -0.035988 -0.484898 -0.178836 -0.059438 0.126635 -0.136966
05/25/2022 -0.520821 -0.294406 0.111366 0.107931 -0.279554 -0.347519 -0.347005 0.422272 -0.281665 -0.028714 ... -0.254544 -0.228583 -0.357568 0.042560 -0.046600 -0.464385 -0.151439 -0.045335 0.129626 -0.124966
05/26/2022 -0.455444 -0.282531 0.113280 0.109455 -0.254537 -0.309307 -0.309246 0.420268 -0.261504 -0.014773 ... -0.217454 -0.210663 -0.332065 0.056353 -0.035712 -0.420812 -0.134910 -0.044086 0.147992 -0.105992

5 rows × 56 columns

We focus on the last available date to decide which stocks are “momentum winners” at the end of the sample.

last_mom = mom_126.iloc[-1, :]  # last row (final date)
last_mom.head()
TSLA.Close   -0.455444
TSM.Close    -0.282531
JNJ.Close     0.113280
UNH.Close     0.109455
JPM.Close    -0.254537
Name: 05/26/2022, dtype: float64

7.3 Momentum filter: keep winners

As a simple rule, we keep stocks with positive 6‑month momentum (they have gone up on net over the last 126 days).

mom_filt = last_mom[last_mom > 0]

mom_filt.sort_values(ascending=False).head()
XOM.Close      0.420268
CVX.Close      0.410034
RYDAF.Close    0.319491
SHEL.Close     0.306713
ABBV.Close     0.238166
Name: 05/26/2022, dtype: float64

Number of momentum winners:

len(mom_filt)
14

8 Combined filter: Bollinger strategy + momentum

Finally, we combine both filters:

  1. Bollinger Band strategy filter: annualised return > 10%.
  2. Momentum filter: positive 6‑month momentum at the end of the sample.

We keep only the stocks that satisfy both criteria.

# Intersection of tickers passing the BB filter and momentum filter
bb_tickers = set(ret_filt.index)
mom_tickers = set(mom_filt.index)

final_tickers = sorted(bb_tickers.intersection(mom_tickers))
len(final_tickers), final_tickers[:10]
(12,
 ['ABBV.Close',
  'AZNCF.Close',
  'CICHF.Close',
  'CVX.Close',
  'JNJ.Close',
  'KO.Close',
  'NVSEF.Close',
  'PEP.Close',
  'RYDAF.Close',
  'TMUS.Close'])
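
It can be useful to see both criteria side by side for the surviving tickers. The sketch below is an optional variant that uses the pandas `Index.intersection` method instead of Python sets. The tickers and numbers are hypothetical placeholders standing in for `ret_filt` and `mom_filt`.

```python
import pandas as pd

# Hypothetical filter outputs standing in for `ret_filt` and `mom_filt`.
ret_filt = pd.DataFrame(
    {"ann_ret": [0.19, 0.11, 0.25]},
    index=["BBB.Close", "CCC.Close", "AAA.Close"],
)
mom_filt = pd.Series({"AAA.Close": 0.42, "BBB.Close": 0.41, "DDD.Close": 0.11})

# Tickers that pass both filters, in alphabetical order.
final_tickers = sorted(ret_filt.index.intersection(mom_filt.index))

# One row per surviving ticker, with the value of each criterion.
summary = pd.DataFrame({
    "ann_ret": ret_filt.loc[final_tickers, "ann_ret"],
    "mom_126": mom_filt.loc[final_tickers],
})
print(summary)
```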

Create a final price panel with these doubly-filtered stocks:

data_final = data.loc[:, final_tickers]
data_final.head()
ABBV.Close AZNCF.Close CICHF.Close CVX.Close JNJ.Close KO.Close NVSEF.Close PEP.Close RYDAF.Close TMUS.Close UNH.Close XOM.Close
date
01/02/2020 89.550003 100.000000 0.87 121.430000 145.970001 54.990002 94.800003 135.820007 30.000000 78.589996 292.500000 70.900002
01/03/2020 88.699997 101.500000 0.84 121.010002 144.279999 54.689999 93.250000 135.630005 30.180000 78.169998 289.540009 70.330002
01/06/2020 89.400002 99.699997 0.84 120.599998 144.100006 54.669998 94.349998 136.149994 30.469999 78.620003 291.549988 70.870003
01/07/2020 88.889999 99.699997 0.88 119.059998 144.979996 54.250000 95.000000 134.009995 30.549999 78.919998 289.790009 70.290001
01/08/2020 89.519997 99.699997 0.88 117.699997 144.960007 54.349998 94.720001 134.699997 30.139999 79.419998 295.899994 69.230003

This data_final DataFrame is the output of Filter 2 (market anomalies).
It contains only the stocks that:

  • Come from the EMH‑failed universe (Filter 1, from the previous workshop),
  • Have a profitable BB trading strategy (annualised return > 10%),
  • Have positive recent momentum.

In the next workshop, you will use data_final as the input universe for portfolio allocation (Filter 3), where you will construct and evaluate alternative portfolios based on these theoretically and empirically filtered assets.