S&P 500 Index Volatility Short and Medium Term

This project analyses the S&P 500 and VIX indices, with a specific focus on comparing short- and medium-term volatility measures. The analysis is not yet concluded and may be developed further, as research is still in progress.

The CBOE Volatility Index (VIX) is constructed using the implied volatilities of a wide range of S&P 500 Index options. Using the prices of these options, the VIX methodology estimates how volatile the S&P 500 Index is expected to be between now and the options' expiration date. This volatility is calculated from both calls and puts and is widely used as a measure of market risk.

CBOE Volatility Index is an indicator of the 30-day volatility expectations.

CBOE Short Term Volatility (VXST) provides expectations of 9-day volatility.

CBOE Medium Term Volatility (VXMT) is a measure of the expected volatility of the S&P500 Index over a 6-month horizon.
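
The three indices are quoted as annualised volatility in percentage points, so their levels can be compared by scaling each one to its own horizon. The sketch below is illustrative only: the helper name expected_move, the index levels and the 182-day stand-in for six months are assumptions, and the square-root-of-time scaling is a common rule of thumb rather than part of the CBOE methodology.

```r
# Hypothetical helper: convert an annualised volatility level (in % points)
# into an approximate one-standard-deviation move over `days` calendar days.
expected_move <- function(level, days) {
  (level / 100) * sqrt(days / 365)
}

# Illustrative index levels (made up for the example, not taken from the data)
lvl     <- c(VXST = 15, VIX = 16, VXMT = 20)
horizon <- c(VXST = 9,  VIX = 30, VXMT = 182)   # approximate horizons in days

moves <- expected_move(lvl, horizon)
round(moves * 100, 2)   # approximate expected move, in percent, per horizon
```

A longer horizon leaves more time for the index to move, so similar annualised levels translate into very different expected moves.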

Data

The historical data for the S&P 500 and the volatility indices are downloaded from Quandl and go back to 2011.

A series of packages are needed for downloading and handling the data, plotting and backtesting.

# Load libraries
library(Quandl)
library(ggplot2)
library(quantmod)
library(dplyr)
library(PerformanceAnalytics)
library(gridExtra)

The data is downloaded and processed into a format suitable for the analysis. The volatility indices and the S&P 500 are time series, so each is converted to the extensible time series class (xts) and then merged. Two new columns are also computed:

Diff is the difference between the volatility terms, Medium minus Short
Neg indicates whether Diff is positive (0) or negative (1).

# Download data and transform into time series
vix_m=Quandl('CBOE/VXMT')
vix_m=as.xts(vix_m$Close,order.by = vix_m$Date)
vix_s=Quandl('CBOE/VXST')
vix_s=as.xts(vix_s$Close,order.by = vix_s$Date)
spx<-Quandl("CHRIS/CME_SP1")
spx<-as.xts(spx$Last, order.by = spx$Date)

# Merge data, find diff and neg factor
data=merge.xts(spx,vix_s,join = 'inner')
data=merge.xts(data,vix_m,join = 'inner')
data$diffvix=data$vix_m-data$vix_s
data$neg=ifelse(data$diffvix<0,1,0)
colnames(data)=c('SP500','Short','Medium','Diff','Neg')

# Check spx for NA values and substitute with the nearest day
#spx[is.na(spx)]   #show records with NA values
data$SP500['2017-03-17']<-2375
data$SP500['2017-06-16']<-2440

head(data)

##             SP500 Short Medium Diff Neg
## 2011-01-03 1265.3 16.04  23.40 7.36   0
## 2011-01-04 1265.3 16.06  23.19 7.13   0
## 2011-01-05 1271.8 15.57  22.78 7.21   0
## 2011-01-06 1270.2 15.71  22.87 7.16   0
## 2011-01-07 1267.5 15.01  22.92 7.91   0
## 2011-01-10 1265.5 15.81  22.93 7.12   0
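
The two missing S&P 500 values are filled in by hand above. A more general alternative is to carry the last available observation forward. The following is a minimal base R sketch on a made-up price vector; locf is a hypothetical helper name, and for xts objects zoo::na.locf serves the same purpose.

```r
# Hypothetical helper: last observation carried forward for a numeric vector.
locf <- function(x) {
  seen <- cumsum(!is.na(x))   # how many non-NA values have been seen so far
  seen[seen == 0] <- NA       # leading NAs have nothing to carry forward
  x[!is.na(x)][seen]
}

prices <- c(2370, NA, 2381, 2385, NA, NA, 2440)   # illustrative values only
filled <- locf(prices)
filled
```

Unlike the manual fix, this always uses the previous day's value, which may or may not be the closest available quote.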

Analysis

After processing the data, we can look at the historical charts. The first chart shows the S&P 500 Index in two different colors:

black when volatility Medium is higher than Short
blue when volatility Medium is lower than Short

The second chart shows the two volatility terms, Short (red line) and Medium (blue line). Short is lower than Medium most of the time, and only on a few occasions does Short cross above Medium. These occasions seem to identify situations where the S&P 500 Index reaches temporary bottoms.

# Plot index and vix short / medium
vix_plot <- ggplot(data=data,aes(x=index(data)))+
                geom_line(data = data,aes(y=Short),color='red')+
                geom_line(data = data,aes(y=Medium),color='blue')+
                ggtitle('Volatility (VIX) Short and Medium Terms')+
                ylab('VIX')+xlab('Date')
indice_plot=ggplot(data=data,aes(x=index(data)))+
                geom_line(data = data,aes(y=SP500,colour=Neg))+
                theme(legend.position = 'none')+ylab('Index')+xlab('Date')+
                ggtitle('S&P 500 Index')
grid.arrange(indice_plot,vix_plot,nrow = 2)

Looking at the summary of the two volatility measures, we can see that Short has a lower mean and a lower median but higher variability: it ranges from 7.10 to 68.00, against 13.75 to 41.36 for Medium.

cbind(summary(data$Short),summary(data$Medium))

##      Index                  Short             Index             
##  "Min.   :2011-01-03  " "Min.   : 7.10  " "Min.   :2011-01-03  "
##  "1st Qu.:2012-08-25  " "1st Qu.:12.03  " "1st Qu.:2012-08-25  "
##  "Median :2014-04-17  " "Median :14.55  " "Median :2014-04-17  "
##  "Mean   :2014-04-16  " "Mean   :16.21  " "Mean   :2014-04-16  "
##  "3rd Qu.:2015-12-03  " "3rd Qu.:18.11  " "3rd Qu.:2015-12-03  "
##  "Max.   :2017-07-25  " "Max.   :68.00  " "Max.   :2017-07-25  "
##      Medium       
##  "Min.   :13.75  "
##  "1st Qu.:16.97  "
##  "Median :18.99  "
##  "Mean   :20.15  "
##  "3rd Qu.:21.88  "
##  "Max.   :41.36  "
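
The cbind of two summary() calls repeats the Index column and returns character strings. As a sketch of a more compact comparison, the statistics can be computed directly. The two vectors below are made-up stand-ins for data$Short and data$Medium, and the standard deviation is added as one possible measure of variability.

```r
# Illustrative stand-ins for data$Short and data$Medium (not real data)
short  <- c(12.1, 14.5, 11.8, 25.3, 16.0, 13.2)
medium <- c(18.9, 19.4, 18.2, 22.7, 20.1, 19.0)

# One column per series, one row per statistic
stats_table <- sapply(
  list(Short = short, Medium = medium),
  function(x) c(mean = mean(x), median = median(x), sd = sd(x),
                min = min(x), max = max(x))
)
round(stats_table, 2)
```

With the real series, the same call would take as.numeric(data$Short) and as.numeric(data$Medium) as inputs.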

This difference is confirmed by the histograms of the two series.

# Mean of vix short and medium
short_mean <- mean(data$Short)
medium_mean <- mean(data$Medium)
min_vix <- min(data$Short,data$Medium)
max_vix <- max(data$Short,data$Medium)

# Plot histogram of the short / medium vix
par(mfrow=c(2,1))
hist(data$Short,breaks = 100,col='blue',main = 'Volatility Short Term',
     xlab = 'Short Volatility',xlim = c(min_vix,40))
abline(v=short_mean,col='black',lwd=2)
hist(data$Medium,breaks = 100, col='red',main = 'Volatility Medium Term',
     xlab = 'Medium Volatility',xlim = c(min_vix,40))
abline(v=medium_mean,col='black',lwd=2)

The data set is then populated with new columns / variables. Return is the daily return of the S&P 500 Index.

# Calculate S&P 500 Index return daily
data$Return <- Delt(data$SP500)  # 1 day return
data$Return['2011-01-03'] <- 0

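Price10 is the S&P 500 price d trading days (rows) after the current date, and Return10 is the return over those d days. The column names keep the suffix 10 even though d is set to 30 in the code. These columns can be used to simulate the performance of a long investment (buy the S&P 500 and hold for d days).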

# Loop to calculate future price and returns (with d days window)
d <- 30
limit <- length(data$SP500)-d
for (i in c(1:limit)){
    data$Price10[i] <- data$SP500[i+d]
    data$Return10[i] <- data$Price10[i]/data$SP500[i]-1
}    
head(data)

##             SP500 Short Medium Diff Neg       Return Price10   Return10
## 2011-01-03 1265.3 16.04  23.40 7.36   0  0.000000000  1326.3 0.04820991
## 2011-01-04 1265.3 16.06  23.19 7.13   0  0.000000000  1333.0 0.05350510
## 2011-01-05 1271.8 15.57  22.78 7.21   0  0.005137122  1337.8 0.05189495
## 2011-01-06 1270.2 15.71  22.87 7.16   0 -0.001258059  1342.4 0.05684144
## 2011-01-07 1267.5 15.01  22.92 7.91   0 -0.002125650  1314.4 0.03700197
## 2011-01-10 1265.5 15.81  22.93 7.12   0 -0.001577909  1305.5 0.03160806
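
The row-by-row loop can also be written as a single vector operation. The sketch below uses base R on a made-up price vector; forward_return is a hypothetical helper name, and the last d entries are left as NA because no future price exists for them.

```r
# Hypothetical helper: return from holding for d rows (trading days).
forward_return <- function(price, d) {
  n <- length(price)
  stopifnot(d >= 1, d < n)
  future <- c(price[(d + 1):n], rep(NA_real_, d))   # price d rows ahead
  future / price - 1
}

price <- c(100, 102, 101, 105, 110)   # illustrative prices only
ret   <- forward_return(price, d = 2)
ret
```

The first entry compares 101 with 100, the second 105 with 102, and so on, which mirrors what the loop stores in Return10.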

After calculating the returns, we can analyse them by dividing the data into two groups, based on whether volatility Short is higher or lower than Medium. Using the summary function we can compare the returns of the two groups. The minimum, median and mean returns (with a d-day holding period) are all higher when volatility Short is higher than Medium.

# Subset data based on Short higher or lower than Medium
data_neg <- subset(data,Diff<0)
data_pos <- subset(data,Diff>0)

# Summary of d-day returns (column is named Return10, but d = 30)
neg_pos <- data.frame(summary(data_neg$Return10)[,2],summary(data_pos$Return10)[,2])
colnames(neg_pos) <- c('Short>Medium','Short<Medium')
neg_pos

##          Short>Medium        Short<Medium
## 1 Min.   :-0.106956   Min.   :-0.168455  
## 2 1st Qu.: 0.008781   1st Qu.:-0.008355  
## 3 Median : 0.043498   Median : 0.015338  
## 4 Mean   : 0.038820   Mean   : 0.010642  
## 5 3rd Qu.: 0.071797   3rd Qu.: 0.036136  
## 6 Max.   : 0.152904   Max.   : 0.115051

It is also possible to run a t-test to verify whether the difference between the two groups is statistically significant. The test indicates that it is: the p-value is very small and the confidence interval does not include zero.

# Test difference in Return10 between Short>Medium and Short<Medium
t.test(data_neg$Return10,data_pos$Return10)

## 
##  Welch Two Sample t-test
## 
## data:  data_neg$Return10 and data_pos$Return10
## t = 7.1821, df = 176.01, p-value = 1.861e-11
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.02043487 0.03592049
## sample estimates:
##  mean of x  mean of y 
## 0.03881957 0.01064189
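
The individual pieces of the test result can be extracted programmatically rather than read off the printout. The sketch below runs the same kind of Welch test on simulated data; the group sizes, means and standard deviations are assumptions chosen only to make the example self-contained.

```r
set.seed(1)
# Simulated stand-ins for data_neg$Return10 and data_pos$Return10
group_a <- rnorm(150,  mean = 0.04, sd = 0.05)
group_b <- rnorm(1500, mean = 0.01, sd = 0.04)

# t.test() uses var.equal = FALSE by default, i.e. the Welch test
res <- t.test(group_a, group_b)

res$p.value                      # p-value of the two-sided test
res$conf.int                     # 95% confidence interval for the mean difference
excludes_zero <- res$conf.int[1] > 0 || res$conf.int[2] < 0
excludes_zero
```

The excludes_zero flag expresses in code the check described above, that the confidence interval does not contain zero.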

The returns of the two groups created above can be compared using histograms. The following figure shows the difference in the mean returns (after d days), marked by the vertical blue lines.

# Plot and compare returns for short>medium and short<medium
min_axis <- min(data$Return10)
max_axis <- max(data$Return10)
p1 <- ggplot(data = data_neg, aes(Return10))+
        geom_histogram(fill='red',col='black',binwidth = 0.005)+
        ggtitle('VIX Short > Medium')+
        geom_vline(xintercept = mean(data_neg$Return10),col='blue',lwd=1.4)+
        xlim(min_axis,max_axis)
p2 <- ggplot(data = data_pos, aes(Return10))+
        geom_histogram(fill='green',col='black',binwidth = 0.005)+
        ggtitle('VIX Short < Medium')+
        geom_vline(xintercept = mean(data_pos$Return10),col='blue',lwd=1.4)+
        xlim(min_axis,max_axis)

grid.arrange(p1,p2,ncol=1)

We can also compare the two subsets calculating the number of times we obtained positive versus negative returns. This means we can see how many times we would have obtained gains or losses if we were to buy the S&P500 and hold for d days.

# Compare number of losses / gains for short higher or lower than medium
pl <- data.frame(c('n profit','n loss','% profit','% return'))
neg_pl <- c(length(which(data_neg$Return10>0)),
            length(which(data_neg$Return10<0)),
            round(length(which(data_neg$Return10>0))/nrow(data_neg)*100),
            round(mean(data_neg$Return10)*100,2))
pos_pl <- c(length(which(data_pos$Return10>0)),
            length(which(data_pos$Return10<0)),
            round(length(which(data_pos$Return10>0))/nrow(data_pos)*100),
            round(mean(data_pos$Return10)*100,2))
pl <- data.frame(pl,neg_pl)
pl <- data.frame(pl,pos_pl)
colnames(pl) <- c('Profit & Loss','Short>Medium','Short<Medium')
pl

##   Profit & Loss Short>Medium Short<Medium
## 1      n profit       124.00      1014.00
## 2        n loss        29.00       475.00
## 3      % profit        79.00        68.00
## 4      % return         3.88         1.06
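
The same four figures can be produced by one small function applied to each group, which avoids repeating the length(which(...)) expressions. This is a sketch on made-up returns; hit_rate is a hypothetical helper name. Unlike the code above, which divides by nrow(), it drops NA returns before computing the percentage of profitable periods.

```r
# Hypothetical helper: profit/loss counts and rates for a vector of returns.
hit_rate <- function(r) {
  r <- r[!is.na(r)]                          # ignore rows without a future price
  c(n_profit   = sum(r > 0),
    n_loss     = sum(r < 0),
    pct_profit = round(mean(r > 0) * 100),
    pct_return = round(mean(r) * 100, 2))
}

returns_example <- c(0.03, -0.01, 0.05, NA, 0.02, -0.04)   # illustrative only
out <- hit_rate(returns_example)
out
```

Applied to the two subsets, sapply(list(neg = ..., pos = ...), hit_rate) would return both columns of the table at once.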

To analyse the interaction between volatility Short and volatility Medium in more detail, we create a variable Trend that summarises it. The hypothesis is that the most important moments are when Short crosses above or below Medium, so Trend distinguishes the crossing days from the days when the ordering is unchanged.

In particular we can have 4 cases:

Short is lower than Medium (3)
Short crosses over Medium (2)
Short is higher than Medium (0)
Short crosses below Medium (1)

# Calculate new column 'trend' to summarise the trend of diff(medium-short)
# Trend can have 4 values:
# 3 if diff was positive today and yesterday 
# 0 if diff was negative today and yesterday
# 2 if diff changed from positive to negative
# 1 if diff changed from negative to positive
ftrend <- function(i) {
    trend <- numeric(1)
    if (data$Diff[i]<0) {
        if (data$Diff[i-1]<0) {trend <- 0}
        else {trend <- 2}
        }
    else if (data$Diff[i]>0) {
        if (data$Diff[i-1]<0) {trend <- 1}
        else {trend <- 3}
    }
    return(trend)
}
days <- nrow(data)
data$Trend <- numeric(days)
for (i in c(2:days)){
    data$Trend[i] <- ftrend(i)
}
head(data)

##             SP500 Short Medium Diff Neg       Return Price10   Return10
## 2011-01-03 1265.3 16.04  23.40 7.36   0  0.000000000  1326.3 0.04820991
## 2011-01-04 1265.3 16.06  23.19 7.13   0  0.000000000  1333.0 0.05350510
## 2011-01-05 1271.8 15.57  22.78 7.21   0  0.005137122  1337.8 0.05189495
## 2011-01-06 1270.2 15.71  22.87 7.16   0 -0.001258059  1342.4 0.05684144
## 2011-01-07 1267.5 15.01  22.92 7.91   0 -0.002125650  1314.4 0.03700197
## 2011-01-10 1265.5 15.81  22.93 7.12   0 -0.001577909  1305.5 0.03160806
##            Trend
## 2011-01-03     0
## 2011-01-04     3
## 2011-01-05     3
## 2011-01-06     3
## 2011-01-07     3
## 2011-01-10     3
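
The four Trend codes can also be computed without a loop by comparing each value of Diff with the previous one. The sketch below is base R on a made-up vector; trend_code is a hypothetical helper name. It follows the same conventions as ftrend: the first row is 0 because it has no previous day, and a Diff of exactly zero today also yields 0.

```r
# Hypothetical helper: vectorised version of the Trend coding.
# 3 = positive today and yesterday, 0 = negative today and yesterday,
# 2 = positive -> negative (Short crosses over Medium),
# 1 = negative -> positive (Short crosses below Medium).
trend_code <- function(diff) {
  today     <- diff[-1]
  yesterday <- diff[-length(diff)]
  code <- ifelse(today < 0,
                 ifelse(yesterday < 0, 0, 2),
                 ifelse(today > 0,
                        ifelse(yesterday < 0, 1, 3),
                        0))
  c(0, code)   # first observation has no previous day
}

diff_example <- c(7, 5, -1, -2, 3, 4)   # illustrative Medium - Short values
trend_code(diff_example)
```

The sequence moves from "positive both days" to a cross over, a day with Short above Medium, a cross below, and back to "positive both days".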

The following table summarises the returns for the 4 groups based on Trend. The highest mean return after d days (4.11%) is registered when volatility Short stays above volatility Medium, while the highest percentage of gains (88%) is registered on the days when Short crosses Medium, in either direction. Both the mean return and the percentage of gains are lowest when Short stays below Medium.

# Compare the 4 groups with regards to returns 
pospos <- data[data$Trend==3]
posneg <- data[data$Trend==2]
negpos <- data[data$Trend==1]
negneg <- data[data$Trend==0]
pl2 <- data.frame(c('%return','%gains'))
negneg_pl <- c(round(mean(negneg$Return10)*100,2),
               round(length(which(negneg$Return10>0))/nrow(negneg)*100))
negpos_pl <- c(round(mean(negpos$Return10)*100,2),
               round(length(which(negpos$Return10>0))/nrow(negpos)*100))
posneg_pl <- c(round(mean(posneg$Return10)*100,2),
               round(length(which(posneg$Return10>0))/nrow(posneg)*100))
pospos_pl <- c(round(mean(pospos$Return10)*100,2),
               round(length(which(pospos$Return10>0))/nrow(pospos)*100))
pl2 <- data.frame(pl2,negneg_pl)
pl2 <- data.frame(pl2,negpos_pl)
pl2 <- data.frame(pl2,posneg_pl)
pl2 <- data.frame(pl2,pospos_pl)
colnames(pl2) <- c('P&L','NegNeg','NegPos','PosNeg','PosPos')
pl2

##       P&L NegNeg NegPos PosNeg PosPos
## 1 %return   4.11   3.31   3.39   0.99
## 2  %gains  76.00  88.00  88.00  67.00
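
The per-group figures can be obtained in one step by grouping the returns by the Trend code. The sketch below uses base R tapply on made-up vectors; the values of ret and trend are illustrative and not taken from the data set.

```r
# Illustrative forward returns and Trend codes (not real data)
ret   <- c(0.02, -0.01, 0.03, 0.01, -0.02, 0.04)
trend <- c(3,     3,    0,    0,     1,    2)

mean_return <- tapply(ret, trend, mean) * 100       # mean return in percent
pct_gains   <- tapply(ret > 0, trend, mean) * 100   # share of positive returns

group_table <- rbind(`%return` = round(mean_return, 2),
                     `%gains`  = round(pct_gains))
group_table
```

The columns are named after the Trend codes (0, 1, 2, 3), which correspond to NegNeg, NegPos, PosNeg and PosPos in the table above.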

Backtesting

In this section the analysis focuses on testing a trading strategy that uses the variable Trend as a trading signal. When volatility Short crosses below volatility Medium (Trend equal to 1) we have a buy signal. Using Return10 we can see the performance if we were to hold the long position for d days.

# calculate returns
# dplyr is attached after quantmod and masks lag(), so call stats::lag explicitly
signal <- stats::lag(ifelse(data$Trend==1,1,0), k = 1)
signal[is.na(signal)] <- 0
returns <- data$Return10*signal
#portfolio <- exp(cumsum(returns))
#plot(portfolio)
charts.PerformanceSummary(returns)
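
The signal logic can be traced by hand on a few rows. The sketch below is base R on made-up values: the signal is shifted by one row so that the position is opened the day after the cross, and the forward return of that entry day is taken as the trade result. The vectors trend and fwd_ret are illustrative assumptions.

```r
# Illustrative Trend codes and d-day forward returns (not real data)
trend   <- c(3, 2, 0, 1, 3, 3)
fwd_ret <- c(0.03, 0.01, -0.02, 0.04, 0.05, NA)

raw_signal <- as.numeric(trend == 1)          # 1 on the day Short crosses below Medium
signal     <- c(0, head(raw_signal, -1))      # act one row later, as lag() does

trade_ret <- fwd_ret * signal
trade_ret[is.na(trade_ret)] <- 0              # rows without a future price

n_trades <- sum(signal)
trade_ret[signal == 1]                        # result of each trade
```

Here the cross occurs on the fourth row, so the single trade is entered on the fifth row and earns that row's forward return.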

Vix Short Medium

Flavio Angeli

4 July 2017
