Pairs Trading: Performance of a Relative-Value Arbitrage Rule
當期價差= 𝑠𝑡𝑜𝑐𝑘𝐴 漲跌 −∆× 𝑠𝑡𝑜𝑐𝑘𝐵漲跌
當期價差 >過去價差平均 + 過去價差標準差
A 價格被高估 融券賣出 A、融資買進 B
當期價差 過去價差平均 - 過去價差表準差
B 價格被高估➔融券賣出 B、融資買進 A
當 Pairs Opening 時,持有直到兩者價差有發生相反情況時,再進行平倉。
Mean Reverting (均數回歸)
- 利率
- 連動性高的股票股價比例
針對資料先畫圖顯示分佈,在挑選可能的模型 Vasicek 可能有負數,CIR一律是正值 GAM (兩個常態分佈 機率組合)
模型使用的資料
- 共整合因子(用價差)
- 財務隨機模型(用比例)
使用比率(ratio)的動機
實務上Bollinger Bands計算方式
- 帶狀中心線=20天移動平均線(20MA) - 帶狀上限=帶狀中心線+兩個標準差
- 帶狀下限=帶狀中心線-兩個標準差
| 期間 | 標準差 |
|---|---|
| 10 | 1.9 |
| 20 | 2.0 |
| 50 | 2.1 |
最好的移動平均應該能夠讓修正走勢獲得支撐,尤其是趨勢反轉之後第一個修正。
Bollinger band 之所以適用,主要來自價格波動率,不是來自於精確挑選移動平均長度。
價格觸擊上限或下限,分別代表價格強勢或疲軟,仍需其他技術指標或成交量進一步確認。以確定為趨勢再發動或反轉。
由Bollinger bands 定義出發,本策略由過去歷史資料(如20天、30 天、60天),估出具Exponential Vasicek(mean-reverting)的隨機過程參數
由過去資料所估計之隨機過程,Mote Carlo模擬未來一期(如1天、5分,3分)之所有可能路徑(如10000條路徑)。
由模擬之所有可能路徑中,計算樣本標準差,與樣本平均數。接著依Bollinger Band指標概念,算出帶狀中心線、帶狀上限與帶狀下限(如上下加減1個標準差或0.8個標準差)。
Vasicek模型指出短期利率會遵循隨機微分方程式
It assumes that the instantaneous spot rate (or ”short rate”) follows an Ornstein-Uhlenbeck process with constant coefficients under the statistical or objective measure used for historical estimation \({ dx_t = \alpha (\theta - x_t) dt + \sigma * dW_t }\) -\({ \theta }\): 回歸均值 -\({ \alpha }\): 回歸均職的速度 -\({ \sigma }\): 波動度,數值越大隨機性越明顯
透過 Ito Lemma積分,計算每個時間點的數值
\({ x_t = \theta (1 - e^{-\alpha (t - s)} )+ x_s * e ^ {-\alpha (t - s)} + \sigma e^{-\alpha t} \int_s^t(e^{-\alpha \mu} \, dW_u) }\)
任一時間點資料,可透過回歸模型估算下一個時間點的資料,加上一個隨機因子 \({ x(t_i) = c + b * x(t_{i-1}) + \delta * \epsilon (t_i) \ }\)
透過回歸模型取得
- 截距 \({c = \theta (1 - e^{-\alpha * dt }) }\)
- 斜率 \({b = e^{-\alpha * dt } }\)
- 殘差 \({\delta = \sigma \sqrt{ \frac{ (1- e^{2\alpha * dt} ) } {2 \alpha} } }\)
價格不會有負值,可以透過指數或平方,讓數值強制為正值,計算Vasicek參數時透過取 log / sqrt修正數值
ML_VasicekParams <- function(ratios, dt = 1/365, transform="log" ) {
if(transform == "log") {
ratios_new <- log(ratios)
} else {
ratios_new <- sqrt(ratios)
}
N <- length(ratios_new)
x <- as.numeric(ratios_new[1: N-1][,1])
y <- as.numeric(ratios_new[2: N][,1])
data <- data.frame(x0 = x, x1 = y)
#x1 = c + b * x0
parm <- lm(x1~x0, data)
#delta standard deviation of residuals
delta <- sd( residuals(parm) )
c <- as.numeric(parm$coefficients[1])
b <- as.numeric(parm$coefficients[2])
alpha <- log(b) / (-1 * dt)
theta <- c/ (1- exp(-alpha * dt))
sigma <- delta / sqrt(( (1 - exp(-2 * alpha * dt)) / (2 * alpha) ) )
vasicekParam <- list( alpha = alpha, theta = theta, sigma = sigma)
}
由過去資料所估計之隨機過程,Mote Carlo模擬未來一期(如1天、5分,3分)之所有可能路徑(如10000條路徑)。
VasicekSimulator <- function(r0, parms, dt = 1/365, steps, paths) {
alpha <- parms[[1]];
theta <- parms[[2]];
sigma <- parms[[3]];
randNum <- matrix(rnorm(steps*paths),nrow = steps, ncol = paths)
rMat <- matrix(0,nrow = steps + 1, ncol = paths)
rMat[1,] <- r0
lapply(2:(steps+1), function(i) {
preRate <- rMat[i-1,]
dr <- alpha*(theta - preRate)*dt + sigma*randNum[i-1,]*sqrt(dt)
rMat[i,] <<- preRate + dr
})
return (rMat)
}
使用quantmod
require(quantmod)
## Loading required package: quantmod
## Loading required package: xts
## Loading required package: zoo
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Loading required package: TTR
## Version 0.4-0 included new data defaults. See ?getSymbols.
require(lubridate)
## Loading required package: lubridate
讀取 Apple, Google 股價
symbols <- c("AAPL", "GOOG")
getSymbols(symbols)
## As of 0.4-0, 'getSymbols' uses env=parent.frame() and
## auto.assign=TRUE by default.
##
## This behavior will be phased out in 0.5-0 when the call will
## default to use auto.assign=FALSE. getOption("getSymbols.env") and
## getOptions("getSymbols.auto.assign") are now checked for alternate defaults
##
## This message is shown once per session and may be disabled by setting
## options("getSymbols.warning4.0"=FALSE). See ?getSymbols for more details.
## [1] "AAPL" "GOOG"
檢視資料
summary(GOOG)
## Index GOOG.Open GOOG.High GOOG.Low
## Min. :2014-03-27 Min. :494.7 Min. :496.0 Min. :487.6
## 1st Qu.:2014-08-26 1st Qu.:533.8 1st Qu.:538.3 1st Qu.:529.7
## Median :2015-01-27 Median :555.1 Median :558.1 Median :548.5
## Mean :2015-01-26 Mean :571.9 Mean :576.6 Mean :566.3
## 3rd Qu.:2015-06-28 3rd Qu.:587.4 3rd Qu.:589.6 3rd Qu.:582.2
## Max. :2015-11-25 Max. :757.5 Max. :762.7 Max. :751.8
## GOOG.Close GOOG.Volume GOOG.Adjusted
## Min. :492.6 Min. : 7900 Min. :492.6
## 1st Qu.:534.1 1st Qu.: 1403900 1st Qu.:534.1
## Median :555.0 Median : 1714350 Median :555.0
## Mean :571.6 Mean : 1988608 Mean :571.6
## 3rd Qu.:586.7 3rd Qu.: 2220025 3rd Qu.:586.7
## Max. :756.6 Max. :11164900 Max. :756.6
summary(AAPL)
## Index AAPL.Open AAPL.High AAPL.Low
## Min. :2007-01-03 Min. : 79.39 Min. : 82.0 Min. : 78.2
## 1st Qu.:2009-03-25 1st Qu.:124.50 1st Qu.:126.1 1st Qu.:122.6
## Median :2011-06-14 Median :201.16 Median :202.9 Median :198.7
## Mean :2011-06-15 Mean :280.01 Mean :282.8 Mean :276.7
## 3rd Qu.:2013-09-05 3rd Qu.:427.57 3rd Qu.:431.6 3rd Qu.:423.3
## Max. :2015-11-25 Max. :702.41 Max. :705.1 Max. :699.6
## AAPL.Close AAPL.Volume AAPL.Adjusted
## Min. : 78.2 Min. : 14479600 Min. : 10.40
## 1st Qu.:124.5 1st Qu.: 75455275 1st Qu.: 22.79
## Median :200.6 Median :115645600 Median : 46.92
## Mean :279.9 Mean :144703609 Mean : 53.65
## 3rd Qu.:427.6 3rd Qu.:188165950 3rd Qu.: 76.36
## Max. :702.1 Max. :843242400 Max. :131.38
head(GOOG)
## GOOG.Open GOOG.High GOOG.Low GOOG.Close GOOG.Volume
## 2014-03-27 568.0026 568.0026 552.9225 558.4626 13100
## 2014-03-28 561.2025 566.4326 558.6725 559.9925 41200
## 2014-03-31 566.8926 567.0026 556.9325 556.9725 10800
## 2014-04-01 558.7126 568.4526 558.7126 567.1626 7900
## 2014-04-02 599.9927 604.8328 562.1926 567.0026 147100
## 2014-04-03 569.8526 587.2827 564.1326 569.7426 5099200
## GOOG.Adjusted
## 2014-03-27 558.4626
## 2014-03-28 559.9925
## 2014-03-31 556.9725
## 2014-04-01 567.1626
## 2014-04-02 567.0026
## 2014-04-03 569.7426
head(AAPL)
## AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume
## 2007-01-03 86.29 86.58 81.90 83.80 309579900
## 2007-01-04 84.05 85.95 83.82 85.66 211815100
## 2007-01-05 85.77 86.20 84.40 85.05 208685400
## 2007-01-08 85.96 86.53 85.28 85.47 199276700
## 2007-01-09 86.45 92.98 85.15 92.57 837324600
## 2007-01-10 94.75 97.80 93.45 97.00 738220000
## AAPL.Adjusted
## 2007-01-03 11.14677
## 2007-01-04 11.39418
## 2007-01-05 11.31304
## 2007-01-08 11.36891
## 2007-01-09 12.31332
## 2007-01-10 12.90259
par(mfrow=c(2,2))
plot(index(GOOG), GOOG$GOOG.Adjusted,type="l",col="red")
plot(index(AAPL), AAPL$AAPL.Adjusted,type="l",col="blue")
ratios <- GOOG$GOOG.Adjusted / AAPL$AAPL.Adjusted
plot(index(ratios), ratios, type="l",col="green")
依指定日期區間取得修正股票收盤價
getData <- function(beg, end, data){
range <-new_interval(beg, end)
tData <- data[,6][index(data) %within% range]
return (tData)
}
依指定日期區間取得配對交易的股價比例
getRatio <- function(beg, end, pair1 = GOOG, pair2 = AAPL){
tPair2 <- getData(beg, end, pair2)
tPair1 <- getData(beg, end, pair1)
ratios <- tPair1 / tPair2
return (ratios)
}
依照輸入日期,取得歷史區間(60天)的股價比例
getHistoryRatio <- function(start, period = days(60), pair1 = GOOG, pair2 = AAPL){
#print(start)
end <- start - days(1)
#print(end)
beg <- start - period
#print(beg)
ratios <- getRatio(beg, end)
return (ratios)
}
依照模擬股價值取得均值與上下限值,預設兩個標準差 (96信心比例)
getBand <- function(simu, cfvalue = 2) {
simuR <- exp(simu)
meanRatio <- apply(simuR, 1, mean)
sdRatio <- apply(simuR, 1, sd)
highRatio <-meanRatio + cfvalue * sdRatio
lowRatio <-meanRatio - cfvalue * sdRatio
#print(paste("meanRatio",meanRatio))
#print(paste("sdRatio",sdRatio))
band <- c(meanRatio[2], highRatio[2], lowRatio[2])
#print(band)
return (band)
}
回測每個交易日
1 .使用過去60天歷史資料
2. 透過Vasicek取的隨機參數
3. 透過蒙地卡羅透過隨機亂數模擬1000條曲線估算每個交易日的均值與上下限值
4. 彙整配對交易每日的股價與比例
rollBand <- function(startT, endT, steps= 1, pair1 = GOOG, pair2 = AAPL) {
dates <- seq(startT, endT, by = "day")
dates <- dates[as.Date(dates) %in% as.Date(index(pair1))]
dates <- dates[as.Date(dates) %in% as.Date(index(pair2))]
#print(dates)
#print(class(dates))
data <- data.frame(matrix(NA, nrow=length(dates), ncol=3))
for(i in 1:length(dates)) {
#print(start)
ratios <- getHistoryRatio(dates[i])
# print(ratios)
results <- ML_VasicekParams(ratios, dt = 1/365)
# print(results)
r0 <- log(ratios[length(ratios)])
#print(paste("r0",r0) )
simuR <- VasicekSimulator(r0, parms = results, steps = 1, paths = 1000)
band <- getBand(simu = simuR)
data[i,] <- band
}
pairRatio <- getRatio(startT , endT, pair1, pair2)
data <- cbind.data.frame(pairRatio, data)
data <- cbind(data, dates)
tPair1 <- getData(startT, endT, pair1)
#data.frame(tPari1[,1])
#print(tPair1[,1])
data <- cbind.data.frame(data, tPair1[,1])
tPair2 <- getData(startT, endT, pair2)
#print(tPair1)
data <- cbind.data.frame(data, tPair2[,1])
#print(dim(data))
names(data) <- c("ratio", "mid", "upper", "lower", "date", "GOOG","AAPL")
return (data)
}
配對交易在股價比例超過上限現值會出現套利空間,透過模擬估算結果,將繪製可能的進場時機
aa <- rollBand(ymd("2014-08-01"), ymd("2015-11-26"))
tail(aa)
## ratio mid upper lower date GOOG AAPL
## 2015-11-18 6.309148 6.373780 6.588635 6.158926 2015-11-18 740.00 117.29
## 2015-11-19 6.216619 6.304252 6.519886 6.088618 2015-11-19 738.41 118.78
## 2015-11-20 6.341995 6.206973 6.425263 5.988683 2015-11-20 756.60 119.30
## 2015-11-23 6.420212 6.320626 6.544013 6.097239 2015-11-23 755.98 117.75
## 2015-11-24 6.294415 6.402461 6.631260 6.173661 2015-11-24 748.28 118.88
## 2015-11-25 6.338643 6.275871 6.490417 6.061326 2015-11-25 748.15 118.03
par(mfrow=c(1,1))
plot(aa$date, aa$ratio,type="l",col="black")
lines(aa$date, aa$mid,type="l",col="blue")
lines(aa$date, aa$upper,type="l",col="green")
lines(aa$date, aa$lower,type="l",col="red")
byin1 <- which(aa$ratio > aa$upper)
byin2 <- which(aa$ratio < aa$lower)
which(abs(aa$ratio - aa$mid) < 0.01)
## [1] 4 12 25 33 62 72 91 94 139 151 164 165 183 191 193 202 205
## [18] 211 215 224 227 228 229 233 277 310 313 325 327 328
points(aa[byin1,]$date, aa[byin1,]$ratio, color="green")
## Warning in plot.xy(xy.coords(x, y), type = type, ...): "color" 不是一個繪圖
## 參數
points(aa[byin2,]$date, aa[byin2,]$ratio, color="red")
## Warning in plot.xy(xy.coords(x, y), type = type, ...): "color" 不是一個繪圖
## 參數
出場條件
設定布林通道標準差
賺賠比
勝率
連續虧損的天數
最大虧損
報酬率對交易回測不是那麼重要,勝率比較重要,可以提高交易頻率增加獲利
交易策略的頻率是否重要?看資訊取得頻率