Quantitative Trading Project

Strategy

Strategy Description

In this report we discuss pairs trading as a strategy to exploit the spread between the prices of Exxon Mobil (XOM) and ConocoPhillips (COP).

Pairs trading is a trading strategy that involves buying and selling two similar assets at the same time to profit from their price difference. The idea is to identify two assets that are typically highly correlated, meaning that they usually move up or down in value together. However, there are times when one asset may be overpriced relative to the other, leading to a temporary price difference.

In pairs trading, we would buy the underpriced asset while simultaneously selling the overpriced asset, expecting that the prices will eventually converge. We expect to profit from the price difference between the two assets when the prices converge.
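As a hypothetical illustration of the mechanics described above, the sketch below computes the profit of a one-unit long/short pair. The prices are made up for the example and are not XOM or COP data, and the function name `pair_pnl` is an assumption of this sketch.

```python
def pair_pnl(long_entry, long_exit, short_entry, short_exit, units=1):
    """Profit of a pair trade: long one asset, short the other, same unit count."""
    long_leg = (long_exit - long_entry) * units     # gains if the long leg rises
    short_leg = (short_entry - short_exit) * units  # gains if the short leg falls
    return long_leg + short_leg


# Hypothetical prices: asset A looks cheap at 95, asset B looks rich at 105.
# Both later trade at 100, so the 10-point gap has closed.
converged = pair_pnl(long_entry=95.0, long_exit=100.0,
                     short_entry=105.0, short_exit=100.0)
print(converged)  # 10.0

# If both assets rise by the same amount, the two legs cancel out.
parallel_move = pair_pnl(long_entry=95.0, long_exit=105.0,
                         short_entry=105.0, short_exit=115.0)
print(parallel_move)  # 0.0
```

The second case shows why the position depends on the gap between the two prices rather than on the direction of the market.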

Rationale and Research

To stay close to commodities while venturing into equities, I decided to look at oil and gas companies. A research paper by Carlos Salas Najera at the University of London (CFA UK Data Science and ML Working Group), published in December 2019, discussed Exxon Mobil and ConocoPhillips as a trading pair. Najera performed and discussed several advanced statistical tests, along with backtesting and a comparison against Fama-French risk factor models to test for alpha generation. The paper suggests this pair is the most promising of all those tested, including Chevron. Those advanced analyses are out of scope for this course, so I decided to download the data and perform some visualizations.

The following graph shows a positive relationship between the two equities, which have a correlation of 0.97. I looked at the spread (XOM - COP) between the two equities: the mean spread is 1.79, the maximum is 18, and the minimum is -22.

I also looked at the ratio (XOM/COP), which puts the data on a more compact scale. The ratio ranges from 0.79 to 1.46 with a mean of 1.05 (very close to 1). The following graph shows the ratio with its moving averages.

This graph shows the z-score for the ratio with upper and lower bounds that can be used as trading indicators.
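A minimal sketch of how such a z-score could be computed is given below. The report does not say whether the mean and standard deviation are taken over the full sample or over a rolling window, so the full-sample version here is an assumption, as are the price values (hypothetical, not the report's data). The bounds of 1.25 and -1.25 are the ones used later in the z-score approach.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a series: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("series is constant; z-score is undefined")
    return [(v - mu) / sigma for v in values]


# Hypothetical closing prices, NOT the report's data.
xom = [80.0, 82.0, 85.0, 83.0, 90.0, 88.0]
cop = [78.0, 79.0, 80.0, 81.0, 79.0, 84.0]

ratio = [x / c for x, c in zip(xom, cop)]
z = zscores(ratio)

# Indices where the z-score sits on or outside the bounds.
upper, lower = 1.25, -1.25
flagged = [i for i, v in enumerate(z) if v >= upper or v <= lower]
print(flagged)
```

A rolling-window variant would replace `mu` and `sigma` with values computed over the most recent observations only, which avoids using future data when generating signals.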

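The ratio looks mean reverting. We test this with an Augmented Dickey-Fuller (ADF) unit root test and obtain a p-value of 0.0003, which would indicate that the ratio is stationary. That p-value does not appear in the output reproduced below, however: in the trend specification shown, the test statistic is -2.2751, which is less negative than even the 10% critical value of -3.12, so this particular regression does not reject the unit-root null on its own.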

## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression trend 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.132472 -0.007993 -0.000105  0.008056  0.176846 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.198e-03  2.659e-03   1.578 0.114630    
## z.lag.1     -4.439e-03  1.951e-03  -2.275 0.022996 *  
## tt           4.624e-07  6.779e-07   0.682 0.495266    
## z.diff.lag  -7.181e-02  2.161e-02  -3.323 0.000904 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01611 on 2127 degrees of freedom
## Multiple R-squared:  0.009989,   Adjusted R-squared:  0.008593 
## F-statistic: 7.154 on 3 and 2127 DF,  p-value: 8.868e-05
## 
## 
## Value of test-statistic is: -2.2751 3.6505 5.2236 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2  6.09  4.68  4.03
## phi3  8.27  6.25  5.34
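One way to read the output above is to compare the tau3 statistic (the first value on the "Value of test-statistic" line, which matches the t value of `z.lag.1`) with the tau3 critical values. The sketch below does only that comparison, using the numbers printed above; the helper name `rejects_unit_root` is an assumption of this sketch.

```python
def rejects_unit_root(test_stat, critical_values):
    """Per significance level, is the unit-root null rejected?

    The ADF tau test is left-tailed: the null is rejected only when the
    statistic is below (more negative than) the critical value.
    """
    return {level: test_stat < cv for level, cv in critical_values.items()}


# Values copied from the test output above.
tau3_stat = -2.2751
tau3_critical = {"1pct": -3.96, "5pct": -3.41, "10pct": -3.12}

print(rejects_unit_root(tau3_stat, tau3_critical))
# {'1pct': False, '5pct': False, '10pct': False}
```

Under this reading, the trend specification with one lagged difference does not reject a unit root at any of the listed levels, so a different specification or lag choice would be needed to support a stationarity claim.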

Our strategy works when the pair is correlated and market conditions are relatively stable with normal volatility. In that setting, the pair occasionally diverges beyond normal levels, we trade on the difference, and the prices converge back within a reasonable time frame.

Unexpected events can cause volatility spikes. The strategy can prove risky during higher volatility/uncertainty periods as prices may keep diverging beyond our risk appetite.

We assume there is a fundamental reason for the high correlation between our equities; if this changes for any reason, our strategy may not work.

Model Implementation

Our OHLC data spans from June 2017 to April 2023. We try two approaches:

In the ratio approach, we buy when the 50-day SMA of the ratio is above the 100-day SMA and sell when it is below. The results are as follows:

In the z-score approach, we open a trade when the z-score reaches the upper bound of 1.25 or the lower bound of -1.25, and we close it when the z-score comes back to 1 or -1, respectively. Entering at 1.25 rather than at 1 is a conservative choice to avoid false signals around the 1 mark.
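The entry and exit bounds can be sketched as a small state machine. The direction of each trade is an assumption of this sketch: a position of -1 (short the ratio, i.e. short XOM and long COP) is opened at the upper bound, and +1 at the lower bound. The function name `zscore_positions` and the z-score values are hypothetical.

```python
def zscore_positions(z_values, entry=1.25, exit_level=1.0):
    """Position on the ratio per step: -1 short, +1 long, 0 flat."""
    position = 0
    positions = []
    for z in z_values:
        if position == 0:
            if z >= entry:
                position = -1  # ratio looks rich: short the ratio (assumption)
            elif z <= -entry:
                position = 1   # ratio looks cheap: long the ratio (assumption)
        elif position == -1 and z <= exit_level:
            position = 0       # z-score came back to 1: close the short
        elif position == 1 and z >= -exit_level:
            position = 0       # z-score came back to -1: close the long
        positions.append(position)
    return positions


# Hypothetical z-scores, not the report's data.
z = [0.2, 1.3, 1.1, 0.9, -0.5, -1.4, -1.2, -0.8]
print(zscore_positions(z))  # [0, -1, -1, 0, 0, 1, 1, 0]
```

The gap between the entry level and the exit level is what keeps a z-score hovering around 1 from opening and closing trades repeatedly.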

The results are as follows:

When the 50-day SMA is below the 100-day SMA the signal is negative, and it turns positive when the reversal occurs. In both cases, when a trade is first placed, we short one unit of the overpriced equity and buy one unit of the underpriced equity. When the reversal occurs, we sell our long unit of the now-overpriced equity and short one more unit of it, and we buy two units of the now-underpriced equity: one to cover the existing short and one to go long.
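The signal and the resulting trade sizes can be sketched as follows. The window lengths of 2 and 3 are hypothetical stand-ins for 50 and 100 so that the example stays short, the ratio values are made up, and keeping the previous signal when the two averages are exactly equal is an assumption of this sketch.

```python
def sma(values, window):
    """Simple moving average; None until `window` observations exist."""
    return [
        sum(values[i - window + 1:i + 1]) / window if i >= window - 1 else None
        for i in range(len(values))
    ]


def crossover_signal(ratio, fast, slow):
    """+1 when the fast SMA is above the slow SMA, -1 when below.

    The signal is 0 until both averages exist; on an exact tie the
    previous signal is kept (an assumption).
    """
    signal, prev = [], 0
    for f, s in zip(sma(ratio, fast), sma(ratio, slow)):
        if f is not None and s is not None and f != s:
            prev = 1 if f > s else -1
        signal.append(prev)
    return signal


def units_traded(signal):
    """Units traded per leg at each step: 1 to open, 2 to flip."""
    trades, prev = [], 0
    for s in signal:
        trades.append(abs(s - prev))
        prev = s
    return trades


# Hypothetical ratio values, not the report's data.
ratio = [1.00, 1.02, 1.04, 1.03, 0.99, 0.97, 1.00, 1.05]
signal = crossover_signal(ratio, fast=2, slow=3)
print(signal)                # [0, 0, 1, 1, -1, -1, -1, 1]
print(units_traded(signal))  # [0, 0, 1, 0, 2, 0, 0, 2]
```

The 2 in the last list corresponds to a reversal: one unit closes the existing position in a leg and one unit opens the opposite position.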

Returns are calculated on an Open-Close, Close-Open, or Close-Close basis, as appropriate to the nature of the position and the sign of the signal.

We define the training period as the data up to January 2020. We train the model on this period by running simulations over a range of moving-average windows to find the optimal combination, and we plot the results in a 3D graph.
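A minimal sketch of such a search over moving-average windows is shown below. Everything in it is illustrative: the training series is a synthetic sine wave, the candidate windows are arbitrary, and the toy `backtest` scores a rule by summing signed daily changes in the ratio, which is a simplification and not the Open-Close, Close-Open and Close-Close return calculation used in the report.

```python
import math
from itertools import product


def sma(values, window):
    """Simple moving average; None until `window` observations exist."""
    return [
        sum(values[i - window + 1:i + 1]) / window if i >= window - 1 else None
        for i in range(len(values))
    ]


def backtest(ratio, fast, slow):
    """Toy score: sum of signed daily ratio changes (a proxy, not real P&L)."""
    fast_ma, slow_ma = sma(ratio, fast), sma(ratio, slow)
    total = 0.0
    for t in range(1, len(ratio)):
        f, s = fast_ma[t - 1], slow_ma[t - 1]  # use yesterday's averages only
        if f is None or s is None:
            continue
        position = 1 if f > s else -1
        total += position * (ratio[t] / ratio[t - 1] - 1.0)
    return total


def grid_search(train_ratio, fast_windows, slow_windows):
    """Score every (fast, slow) pair with fast < slow; return best and all."""
    results = {
        (fast, slow): backtest(train_ratio, fast, slow)
        for fast, slow in product(fast_windows, slow_windows)
        if fast < slow
    }
    best = max(results, key=results.get)
    return best, results


# Synthetic training series, NOT the report's data.
train_ratio = [1.05 + 0.1 * math.sin(i / 15) for i in range(400)]
best, results = grid_search(train_ratio, [5, 10, 20], [20, 40, 80])
print(best, results[best])
```

The `results` dictionary maps each (fast, slow) pair to a score, which is the kind of surface a 3D plot would display. Because the search only sees the training period, the chosen windows carry no guarantee for later data.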

We look at the maximum drawdowns of our two equities to build intuition about our position. In our strategy, at any given time we are long one equity and short the other. In an extreme drawdown, as long as the drawdowns on both are correlated (which is to be expected), our maximum drawdown should not be nearly as extreme as that of either equity. It is when the pair diverges and the drawdown hits only one equity that our strategy could result in severe losses.

The drawdowns appear to be correlated. Our maximum tolerance is the drawdown on either equity, since a buy-and-hold strategy would result in the same drawdown.
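For reference, maximum drawdown is commonly measured as the largest peak-to-trough decline relative to the running peak. The sketch below uses that definition, which is assumed here rather than taken from the report, on a hypothetical price series.

```python
def max_drawdown(prices):
    """Largest peak-to-trough decline as a fraction of the peak (0.25 = 25%)."""
    if not prices:
        raise ValueError("prices must not be empty")
    peak = prices[0]
    worst = 0.0
    for p in prices:
        peak = max(peak, p)                    # highest price seen so far
        worst = max(worst, (peak - p) / peak)  # decline from that peak
    return worst


# Hypothetical prices, not the report's data.
prices = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
print(max_drawdown(prices))  # 40 / 120, about 0.333
```

Applying the same function to each equity and to the strategy's own equity curve gives the comparison discussed above.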

We now test the ratio strategy on the test data and plot the results.

Lessons Learned

The post-COVID test data (after January 2020) is very volatile. The parameters suggested by the training-data optimization lag too much and result in poor performance. I display the following results, which use shorter averages so that the signals catch up faster and react to volatility more quickly.

I would love to optimize this approach using exponential moving averages instead and see the difference, especially post-COVID. The most important lesson learned, however, is that the future is different from the past. The world moves on, and even though there is nothing new under the sun, relying on the past to predict the future is fraught and must be done with extreme caution.