Lecture 6

Stat Arb/Mean Reversion

Terry Leitch

Copyright © 2018 T Leitch & J Liew

Recent developments

The “Sixth” Factor

User-identified sentiment

Sentiments of AAPL over time…


Correlation of $AAPL’s Tweets sentiment vs other factors


IPOs and Tweet Sentiments


Fun with quant strategies open to all!

https://websim.worldquant.com/simulate

Also, includes…

Attempts at defining Stat Arb

“…refers to highly technical short-term mean-reversion strategies involving large numbers of securities (hundreds to thousands, depending on the amount of risk capital), very short holding periods (measured in days to seconds), and substantial computational, trading, and IT infrastructure” – Andrew Lo
http://www.alphasimplex.com/

“In the context of hedge funds, a style of management that employs complex statistical models that try to capture small abnormalities in a security’s intraday return.” – Campbell R. Harvey
http://www.duke.edu/~charvey/

An attempt to profit from pricing inefficiencies that are identified through the use of mathematical models. Statistical arbitrage attempts to profit from the likelihood that prices will trend toward a historical norm. Unlike pure arbitrage, statistical arbitrage is not riskless. – InvestorWords

Quantitative investment approach characterized by low use of discretion, high use of technology, research, data, and statistics/financial models.

Where does Statistical Arbitrage fit in the context of the hedge fund industry?

Hedge Fund Industry: $2.9 trillion (2014)



Different Types of Stat Arb Strategies

Sharpe Ratio vs Capacity

Speed vs Complexity

Roughly speaking…

Top Ten Lists: Delusions and Characteristics for Success

  1. I’m different from all the others who have tried
  2. Teach me, I can learn the secret-sauce
  3. It’s easy to make money
  4. I’ll get better by reading that next paper/book
  5. I’m a damn-good programmer
  6. I think I have a clue, I know how markets behave, I can do this…
  7. More complexity the better
  8. There exists a perfect money-making model
  9. My ideas are good enough to make money
  10. My idea is unique

Top 10 Successful Characteristics for Stat Arb

  1. Actively try to be different
  2. Teach others and you will master the ability to create your own secret-sauce
  3. It’s hard to make money so don’t get arrogant, be thankful you have a shot
  4. Assume what you read is already arb’ed away, but keep reading to help generate the next idea/ adjustment/ extension
  5. Constantly improve your skill set
  6. Assume you don’t have a clue and always be a constant student of the market, you can do it!
  7. Keep it simple, really understand what you’re capturing
  8. The perfect money-making machine is dynamic
  9. Constantly refine your ideas
  10. Keep generating new ideas

Mean-Reversion/Counter-Trend Strategies

Z-Score

Z-Score: \(\frac{x_{i}-\mu}{\sigma}\)
\(x_{i}\) : ith observation
\(\mu\) : mean
\(\sigma\) : standard deviation

Maps data to common scale

Winsorize outliers by pushing them back to -3 or 3, respectively.
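The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function names and the sample data are hypothetical, and the population standard deviation is assumed.

```python
import numpy as np

def zscore(x):
    """Standardize observations: (x_i - mean) / standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def winsorize(z, limit=3.0):
    """Push any score beyond +/-limit back to the limit."""
    return np.clip(z, -limit, limit)

# Hypothetical sample: 19 ordinary observations and one extreme outlier.
data = np.array([1.0] * 19 + [100.0])
z = zscore(data)        # the outlier scores above 3
z_w = winsorize(z)      # the outlier is pulled back to exactly 3
```

After winsorizing, a single extreme observation can no longer dominate a signal that is combined with others on the same scale.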

Why do you want to do this?

What is Co-Integration?

Background info on Co-Integration

Co-Integration Procedure

Are KO and PEP Co-Integrated?

Ex. KO and PEP

“100 Best Pairs” project.

How to Group Stocks?

Grouping set of observations into subsets

Algorithm

Indices Examples

Stock Sectors Example

Liquid ETFs

Mean-Reversion Framework


Avellaneda and Lee (2008) Framework

S-Score Calculation


Matching JPM vs XLF…


Flip the S-score…

(only for this example, error in doc)


Practical Implications

Signals to Positions

How to map signals to positions

Signals in their most primitive form are “buy,” “sell,” or “do nothing”; in integers: 1, -1, 0.
Additionally, signals may lie in the continuum [-3, 3], obtained from some form of z-score methodology.

One way is to trade a fixed quantity, e.g. 100 shares (or contracts, etc.). Other ways size by risk, market cap, optimization, etc.

For example, all trades are represented by 100 shares to buy/sell, so signal-to-position is a simple linear function:

    ex. (-1 signal) * (100 shares) = -100  [trade this quantity]
          

Alternative derivation of positions can be achieved through a monotonic transformation of the signal space [-3,3] to positions space such as:
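The slide's own transformation is not reproduced here. As one hypothetical example, the sketch below shows the fixed-size linear mapping from the text next to a tanh-based monotonic map. The choice of tanh, the `scale` parameter, and both function names are assumptions made only for illustration.

```python
import math

def linear_position(signal, shares_per_unit=100):
    """Fixed-size mapping from the text: position = signal * shares."""
    return signal * shares_per_unit

def tanh_position(signal, max_shares=100, scale=1.5):
    """Hypothetical monotonic map from a signal in [-3, 3] to shares.

    tanh is an assumption chosen for illustration: it is increasing,
    passes through zero, and saturates near +/-max_shares.
    """
    clipped = max(-3.0, min(3.0, signal))
    return round(max_shares * math.tanh(clipped / scale))

print(linear_position(-1))   # -100, the example in the text
print(tanh_position(3.0))    # 96, close to the 100-share cap
```

Any increasing function through zero would serve. A saturating shape simply keeps extreme signals from producing outsized positions.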

Forecasting Regressions

How to Employ: Principal Components Analysis (PCA)

PCA is a mathematical procedure that transforms data into orthogonal components. The top components are typically employed to determine a time-series of residuals and thus measure deviations that are “too far away” and are therefore predicted to revert.

Nice way to understand the structure of your data. Typically, for stock data the first principal component resembles a long-only “pseudo-market” portfolio. The second component looks like a long/short portfolio. Note that PCA assumes that the structure is constant over the period examined. Financial data comes with time-stamps, so periods of non-traditional behavior become problematic.

Matlab – svd(); C++ - pca.h
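A Python counterpart can be sketched with NumPy's `np.linalg.svd`. The returns below are synthetic (one common market factor plus noise, an assumption made purely for illustration), and keeping only the top component (`k = 1`) is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns (an assumption for illustration): 500 days,
# 8 stocks, each driven by a common market factor plus idiosyncratic noise.
n_days, n_stocks = 500, 8
market = rng.normal(0.0, 0.01, size=(n_days, 1))
betas = rng.uniform(0.8, 1.2, size=(1, n_stocks))
returns = market * betas + rng.normal(0.0, 0.005, size=(n_days, n_stocks))

# Standardize each stock, then decompose with SVD.
standardized = (returns - returns.mean(axis=0)) / returns.std(axis=0)
u, s, vt = np.linalg.svd(standardized, full_matrices=False)

explained = s**2 / np.sum(s**2)   # share of variance per component
first_pc = vt[0]                  # loadings of the first component

# Keep the top k components as "factors"; what is left is the residual.
k = 1
factors = u[:, :k] * s[:k]        # factor time series
fitted = factors @ vt[:k]         # part of returns explained by the factors
residuals = standardized - fitted
```

On data like this the loadings in `first_pc` all share one sign, which is the long-only “pseudo-market” portfolio described above. The `residuals` array is the input a mean-reversion signal would then be built on.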

Independent Component Analysis (ICA)

Example: Intra-Day Mean-Reversion Model

Outline

Data from 6/30/2008 to 4/9/2009: One minute closing prices per market



Proposal

Implementation Issues

Example: Ultra-high frequency “2xSPY ≈ SSO”

Introduction

Brief Description of ETFs

“ProShares Ultra S&P500 (the Fund) seeks daily investment results that correspond to twice (200%) the daily performance of the S&P 500 Index.”

Example of Raw Quotes Data


Create Dynamic Hedge Ratio (DHR) for 1,000 Shares of SPY

DHR = (1,000 * lag_SPY / 2) / lag_SSO
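The formula reads as: take the dollar value of 1,000 SPY shares at the lagged price, halve it because SSO targets twice the daily move, and divide by the lagged SSO price to get a share count. A minimal sketch follows. The function name and the example prices are hypothetical.

```python
def dynamic_hedge_ratio(lag_spy, lag_sso, spy_shares=1_000, leverage=2.0):
    """SSO shares that offset `spy_shares` of SPY at the lagged prices."""
    return (spy_shares * lag_spy / leverage) / lag_sso

# Hypothetical lagged prices, chosen only to make the arithmetic easy.
dhr = dynamic_hedge_ratio(lag_spy=100.0, lag_sso=50.0)
print(dhr)  # 1000.0 SSO shares against 1,000 SPY shares
```

Because both prices are lagged, the ratio is recomputed on each new observation, which is what makes it dynamic.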

SPY/SSO Arb Algorithm

How many times did we do the arb?


How long does it last?

Histogram of the Quote Size

Statistics Bid/Ask Size


Conclusion

Secret-Sauce Steps

  1. Gather Data
    • What do you want to trade? Stocks, bonds, futures, currencies, options, etc. Gathering and cleaning the data can be very time-consuming, but it is the first step.
  2. Look at Data
    • Always look at your data using equity lines. Understand your data: fix gaps, forward-fill NaNs, and use other tools such as summary stats or PCA. Do you need to transform the data, for example to trade equal risk per position? Does it co-move? Can you see periods of high correlation in the tails? Entropy?
  3. Determine Model Type: MR or MO
    • What kind of alpha do you want to extract? Is it trend/momentum (MO) or mean-reversion (MR)? What inputs are going to be used to predict or build signals? Just prices, trades, bid/ask, size, volume, open interest, time-of-day, trade-time, block-orders, transaction order, etc.
  4. If MR: Define Residuals
    • How do you define the “residual”? Is it based on a benchmark that is equally weighted, equal-risk weighted, or MV, or on regressions, PCA, or something else? Is the thesis, more precisely, that the cumulative residuals or some function of the residuals are mean-reverting?
  5. Generate Signal
    • Typically, the core model is determined by some measure of “near” versus “far” away. Cumulated residuals are a good starting point, but improvements can be made by thinking about how distance is measured/defined. Z-Score, S-Score, and Winsorizing help with signal construction and combination.
  6. Signals into Positions
    • Ultimately you need to convert information from signals to actual positions, as only positions are traded. Be mindful of the underlying liquidity. Once again, look at your signals and positions over time; this tells you whether your model is fast or slow in terms of turnover.
  7. Positions to historical PnL net of costs
    • Try to get to Sharpe Ratio as fast as possible. Comparative statistics are easy to do if your code runs quickly. Examine PnL sensitivities to at most one or two parameters at a time. If you look at too many parameters, you’ll lose intuition about the behavior of your model.
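Steps 4 through 7 can be strung together in a short end-to-end sketch. Everything specific here is an assumption for illustration: the synthetic random returns, the equally weighted benchmark, the 60-day window, the position sizing, and the 1 basis point cost.

```python
import numpy as np

rng = np.random.default_rng(1)

# Steps 1-2. Data: synthetic daily returns for 5 assets (illustrative only).
n_days, n_assets = 750, 5
returns = rng.normal(0.0, 0.01, size=(n_days, n_assets))

# Step 4. Residuals: each asset's return minus the equally weighted benchmark.
benchmark = returns.mean(axis=1, keepdims=True)
residuals = returns - benchmark
cum_resid = residuals.cumsum(axis=0)

# Step 5. Signal: rolling z-score of the cumulative residual,
# winsorized to [-3, 3]. Only past data enters the mean and std.
window = 60
signal = np.zeros_like(cum_resid)
for t in range(window, n_days):
    hist = cum_resid[t - window:t]
    z = (cum_resid[t] - hist.mean(axis=0)) / hist.std(axis=0)
    signal[t] = np.clip(z, -3.0, 3.0)

# Step 6. Positions: mean reversion sells what is far above its norm and
# buys what is far below. Units are capital per asset (hypothetical sizing).
positions = -signal / 3.0

# Step 7. PnL net of costs: yesterday's position earns today's return;
# costs are charged on turnover at an assumed 1 basis point.
cost_per_unit = 0.0001
held = np.vstack([np.zeros((1, n_assets)), positions[:-1]])
turnover = np.abs(np.diff(positions, axis=0, prepend=np.zeros((1, n_assets))))
pnl = (held * returns).sum(axis=1) - cost_per_unit * turnover.sum(axis=1)

sharpe = np.sqrt(252) * pnl.mean() / pnl.std()
print(f"Annualized Sharpe on synthetic data: {sharpe:.2f}")
```

The synthetic returns are independent noise, so no edge should be expected from this run. The point is the shape of the loop from residual to Sharpe Ratio, which makes it quick to vary one parameter such as `window` and compare.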