Equity Backtesting

Simple SMI Backtest



Project Description

The Stochastic Momentum Index (SMI), as the term is used in this project, is the stochastic oscillator: a momentum indicator that relates a stock's daily close to its high-low range over a set number of periods. (Strictly, the Stochastic Momentum Index is a later refinement of the stochastic oscillator; the statistics computed here are those of the oscillator.) The SMI follows the speed, or momentum, of a stock's price. As a rule, momentum changes direction before price does. Because of that, bullish and bearish divergences in the SMI can be used to anticipate reversals.

The range of the SMI is from 0 to 100. Traditional settings use 80 as the overbought threshold and 20 as the oversold threshold. It is important to note that overbought and oversold readings are not necessarily indicative of a bear or bull trend. For example, securities can become overbought and remain overbought during a strong uptrend. Similarly, securities can become oversold and remain oversold during a strong downtrend.

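The default setting for the SMI is 14 periods. These periods can be days, weeks, months, or an intraday timeframe. There are 3 statistics generated by quantmod for the SMI: FastK, the raw oscillator value; FastD, a moving average of FastK; and SlowD, a moving average of FastD.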

The developer of the SMI, George C. Lane, held that the FastD should be the only signal used to trigger a buy or sell condition, since the FastK tends to be choppy and generates a higher number of signals. The SlowD, being smoothed once more, generates fewer signals still and might be a better predictor of a reversal because it smooths out false signals and indicates sustained buy or sell pressure on the stock.
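To make the relationship between the three statistics concrete, here is a minimal base R sketch on made-up prices with a 3-period look-back. It assumes simple moving averages for the smoothing, which I believe matches the default in stoch() but is worth confirming. The helper roll() and the toy prices are illustrative only and are not part of the project.

```r
# Illustrative helper: apply f to each trailing window of n values (NA until full)
roll <- function(x, n, f) {
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) {
    if (i >= n) out[i] <- f(x[(i - n + 1):i])
  }
  out
}

# Hypothetical toy prices, not taken from the backtest environment
high  <- c(10, 11, 12, 13, 12, 11, 12, 14)
low   <- c( 8,  9, 10, 11, 10,  9, 10, 12)
close <- c( 9, 10, 11, 12, 10, 10, 12, 13)

n  <- 3                            # look-back length (the project uses 14)
hh <- roll(high, n, max)           # highest high in each window
ll <- roll(low,  n, min)           # lowest low in each window

fastK <- (close - ll) / (hh - ll)  # where the close sits in the high-low range
fastD <- roll(fastK, 3, mean)      # 3-period average of fastK
slowD <- roll(fastD, 3, mean)      # 3-period average of fastD

fastK
# NA NA 0.75 0.75 0.00 0.25 1.00 0.80
```

Each round of smoothing costs two more leading NA values, which is one reason the smoothed signals produce fewer triggers on short histories.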


This project will be a simple test for oversold stocks using the SMI, a threshold of 20, and a period of 14 days with a 3 day smoothing. A 3 month return on the stock will be calculated along with an optimal return which identifies the highest return possible within the 3 month period.

Since there are 3 specific statistics in the SMI, we will compare the returns across the FastK, FastD, and SlowD signals to see if there are meaningful differences in the simple test.

The source data will be the backtest environment created in a previous project, Building the Backtest Environment. Recall that the environment contains historical pricing data from 2007-2016. We also have a symbols table that contains all of the stocks that were successfully loaded in the environment.

As before, we will be using the quantmod package to retrieve data from the backtest environment and perform the SMI generation.


Libraries Required

library(quantmod)   # Quantitative financial strategies
library(dplyr)      # Data manipulation
library(lubridate)  # Date and time processing
library(knitr)      # Dynamic report generation

Working directory

setwd("U:/Equity Backtesting")

Prepare the Environment

We will require the two files previously created in order to perform the SMI backtest. To access the environment we created we simply need to load the environment file. The symbols table is in the form of an .rds.

Load Backtest Environment

load("Environments/bt_env_2007_2016.Rdata")

Read Backtest Symbols Table

bt_env_symbols <- readRDS("Symbols/bt_env_symbols.rds")

We can check to make sure that the data is loaded correctly and contains expected values. This time we can look for symbols that start with “GO”.

Verify Backtest Environment

ls(bt_env_2007_2016, pattern = "^GO")
##  [1] "GOGL"  "GOGO"  "GOL"   "GOLD"  "GOLF"  "GOOD"  "GOODM" "GOODO"
##  [9] "GOODP" "GOOG"  "GOOGL" "GOVNI"

Verify Data in Backtest Environment

kable(head(bt_env_2007_2016$GOOG, 10))
| GOOG.Open | GOOG.High | GOOG.Low | GOOG.Close | GOOG.Volume | GOOG.Adjusted |
|---|---|---|---|---|---|
| 464.724 | 475.355 | 459.847 | 466.3098 | 15470700 | 232.9220 |
| 467.716 | 482.625 | 467.068 | 481.9369 | 15834200 | 240.7277 |
| 481.179 | 486.165 | 476.801 | 485.8561 | 13795600 | 242.6853 |
| 486.355 | 488.529 | 480.880 | 482.2560 | 9544400 | 240.8871 |
| 484.121 | 486.913 | 479.882 | 484.1707 | 10803000 | 241.8435 |
| 483.104 | 492.199 | 480.720 | 488.1199 | 11981700 | 243.8161 |
| 495.839 | 500.376 | 494.821 | 498.3518 | 14470400 | 248.9270 |
| 500.616 | 503.617 | 498.631 | 503.6173 | 8980800 | 251.5571 |
| 506.160 | 511.595 | 501.922 | 502.8993 | 15194500 | 251.1984 |
| 502.012 | 506.380 | 493.026 | 495.9185 | 13448300 | 247.7115 |

Verify Backtest Symbols

kable(head(bt_env_symbols, 10))
| Symbol | Sector | Industry | Name | Exchange |
|---|---|---|---|---|
| AAAP | Health Care | Major Pharmaceuticals | Advanced Accelerator Applications S.A. | NASDAQ |
| AAL | Transportation | Air Freight/Delivery Services | American Airlines Group, Inc. | NASDAQ |
| AAME | Finance | Life Insurance | Atlantic American Corporation | NASDAQ |
| AAOI | Technology | Semiconductors | Applied Optoelectronics, Inc. | NASDAQ |
| AAON | Capital Goods | Industrial Machinery/Components | AAON, Inc. | NASDAQ |
| AAPC | Consumer Services | Services-Misc. Amusement & Recreation | Atlantic Alliance Partnership Corp. | NASDAQ |
| AAPL | Technology | Computer Manufacturing | Apple Inc. | NASDAQ |
| AAWW | Transportation | Transportation Services | Atlas Air Worldwide Holdings | NASDAQ |
| ABAC | Consumer Non-Durables | Farming/Seeds/Milling | Aoxin Tianli Group, Inc. | NASDAQ |
| ABAX | Capital Goods | Industrial Machinery/Components | ABAXIS, Inc. | NASDAQ |
## [1] "Number of Backtest Symbols: 4667"

Process for SMI

Since this is a backtest project, we want to define the criteria for the data we are going to be retrieving. When using quantmod interactively, the symbol and its historical data are retrieved with the getSymbols() function, the symbol is charted, and the addSMI() function is called to add the SMI indicator to the chart. In this scenario, we instead want the SMI to be calculated and stored for further use.

To accomplish this we will use the stoch() function call to perform the required calculations. The default parameters are a period length of 14 days and a smoothing period of 3 days. We will be using the defaults for the test.

The next item to define is the value of the SMI that we are interested in. Since an SMI value of 20 is considered to be the signal for oversold conditions, we will be looking for SMI values that are equal to or less than 20. The output of stoch() is expressed as a fraction between 0 and 1, so the values need to be multiplied by 100 to put them on the 0 to 100 SMI scale.
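The rescaling and threshold test can be sketched in a few lines of base R. The input values below are hypothetical and simply stand in for one column of stoch() output.

```r
# Hypothetical fastK values on the 0-1 scale, standing in for stoch() output
fastK <- c(0.8312, 0.4127, 0.1984, 0.2049, NA, 0.0521)

smi_value   <- round(fastK * 100, 1)               # rescale to 0-100
is_oversold <- !is.na(smi_value) & smi_value <= 20 # NA never counts as a trigger

smi_value[is_oversold]
# 19.8  5.2
```

Note that 0.2049 rounds to 20.5 and is therefore excluded, so the order of rounding and filtering can matter for values close to the threshold.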

All of the processing described will be performed on the 3 different signals of the SMI: FastK, FastD, and SlowD.


The first step in the processing is to retrieve the desired symbol data from the backtest environment. This is accomplished using the following:

get(symbol_name, source_environment)

The second step is to approximate, or substitute, any NA values in the time series. This is needed when a time series has a missing set of values for a given date. If the data is not adjusted, the SMI call will fail and we lose that stock as a potential test object.
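As a rough illustration of what the approximation does, the base R sketch below fills interior gaps by linear interpolation using approx(). It interpolates by position on a made-up vector. As far as I know, na.approx() interpolates against the series index (the dates) by default, so its results on real data can differ slightly around weekends and holidays.

```r
# Hypothetical closing prices with missing values
close <- c(10, NA, 14, 15, NA, NA, 21)

idx   <- seq_along(close)
known <- !is.na(close)

# Linear interpolation between the nearest known neighbours
filled <- approx(x = idx[known], y = close[known], xout = idx)$y

filled
# 10 12 14 15 17 19 21
```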

Next, we invoke stoch(). The function requires the High, Low, and Close (HLC) columns to produce the statistics we are looking for, so the data we retrieve from the environment will be subset to contain only those columns.

We do need an error check for the SMI call for the situations in which there isn’t enough data to perform the calculation. So we will use the function try() to validate the call and test whether the call inherits an error.
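The try() and inherits() pattern can be seen in isolation in the sketch below. The function rolling_calc() is a hypothetical stand-in for an indicator call that fails on short series; it is not part of quantmod or TTR.

```r
# Hypothetical stand-in for a calculation that needs at least n observations
rolling_calc <- function(x, n = 14) {
  if (length(x) < n) stop("not enough periods")
  mean(tail(x, n))
}

series  <- list(long = 1:30, short = 1:5)
results <- list()

for (nm in names(series)) {
  res <- try(rolling_calc(series[[nm]]), silent = TRUE)
  if (inherits(res, "try-error")) next   # skip series that cannot be processed
  results[[nm]] <- res
}

names(results)
# "long"
```

The failed series is dropped and the loop carries on, which is the behaviour we want when one symbol out of several thousand has too little history.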

If the call is successful, the results are placed in a data frame and then processed. For each of the 3 signals, the processing selects the signal column, attaches the date, rescales the value to the 0 to 100 range, keeps only complete rows at or below the threshold of 20, and tags each row with its symbol, year, and month.

The data is then grouped by year and month and filtered to keep only the first occurrence of the SMI threshold being met in each month. This lets us test the SMI as a trigger for initiating a buy condition for the stock in a given month. The resulting data frame is appended to a master data frame, which will contain the first monthly occurrence of the threshold being met for all of the stocks.
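Keeping the first trigger in each month can also be expressed without dplyr. The sketch below uses base R on a made-up frame for a single symbol; the values are illustrative only.

```r
# Hypothetical oversold readings for one symbol
triggers <- data.frame(
  Date  = as.Date(c("2007-01-23", "2007-01-24", "2007-02-21",
                    "2007-02-22", "2007-03-01")),
  fastK = c(4.1, 9.8, 5.3, 12.0, 17.6)
)

# Sort by date, then keep the first row seen for each year-month
triggers       <- triggers[order(triggers$Date), ]
year_month     <- format(triggers$Date, "%Y-%m")
first_in_month <- triggers[!duplicated(year_month), ]

first_in_month$fastK
# 4.1  5.3 17.6
```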


Create Symbol Array

bt_symbols <- bt_env_symbols$Symbol

SMI Gather Loop

j <- 1

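for(i in seq_along(bt_symbols)) {
        
        # Retrieve the symbol's price series from the backtest environment
        sym_sum  <- get(bt_symbols[i], envir = bt_env_2007_2016)
        
        # Interpolate any missing values in the series
        sym_sum  <- na.approx(sym_sum)
        
        # Keep only the High, Low, and Close columns required by stoch()
        sym_sum  <- sym_sum[,2:4]
        
        # Trap failures, such as a series too short for the 14-period window
        get_try  <- try(stoch(sym_sum, nFastK=14, nFastD=3, nSlowD=3), silent = TRUE)
        
        # Process the symbol only when the calculation succeeded
        if(!inherits(get_try, "try-error")) {
         
              # Reuse the result from try() instead of calling stoch() twice
              smi_sum <- data.frame(get_try)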
              
              smi_fastK  <- smi_sum %>%
                            select(fastK) %>%
                            mutate(Date = rownames(.)) %>%
                            transform(fastK = round(fastK * 100, 1)) %>%
                            filter(complete.cases(.),
                                   fastK <= 20) %>%
                            mutate(Symbol = bt_symbols[i],
                                   Year   = year(Date),
                                   Mnth   = month(Date)) %>%
                            group_by(Year, Mnth) %>%
                            slice(1L) %>%
                            ungroup() %>%
                            select(Symbol, Date, fastK)
  
              if(j == 1) { fastK_frame <- smi_fastK } else
                         { fastK_frame <- bind_rows(fastK_frame, smi_fastK) }
              

              smi_fastD  <- smi_sum %>%
                            select(fastD) %>%
                            mutate(Date = rownames(.)) %>%
                            transform(fastD = round(fastD * 100, 1)) %>%
                            filter(complete.cases(.),
                                   fastD <= 20) %>%
                            mutate(Symbol = bt_symbols[i],
                                   Year   = year(Date),
                                   Mnth   = month(Date)) %>%
                            group_by(Year, Mnth) %>%
                            slice(1L) %>%
                            ungroup() %>%
                            select(Symbol, Date, fastD)
  
              if(j == 1) { fastD_frame <- smi_fastD } else
                         { fastD_frame <- bind_rows(fastD_frame, smi_fastD) }
              

              smi_slowD  <- smi_sum %>%
                            select(slowD) %>%
                            mutate(Date = rownames(.)) %>%
                            transform(slowD = round(slowD * 100, 1)) %>%
                            filter(complete.cases(.),
                                   slowD <= 20) %>%
                            mutate(Symbol = bt_symbols[i],
                                   Year   = year(Date),
                                   Mnth   = month(Date)) %>%
                            group_by(Year, Mnth) %>%
                            slice(1L) %>%
                            ungroup() %>%
                            select(Symbol, Date, slowD)
  
              if(j == 1) { slowD_frame <- smi_slowD } else
                         { slowD_frame <- bind_rows(slowD_frame, smi_slowD) }
              
              
              j <- j + 1
              
              }
}
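The loop above grows each master frame one symbol at a time and relies on the counter j to detect the first successful symbol. One alternative, sketched here in base R with made-up symbols, is to collect the per-symbol frames in a list and bind them once at the end. This is only a suggestion and not how the project was run.

```r
symbols <- c("AAA", "BBB", "CCC")            # hypothetical symbols
pieces  <- vector("list", length(symbols))

for (i in seq_along(symbols)) {
  if (symbols[i] == "BBB") next              # stand-in for a failed calculation
  pieces[[i]] <- data.frame(Symbol = symbols[i],
                            fastK  = 10 + i,
                            stringsAsFactors = FALSE)
}

# Drop the empty slots left by skipped symbols, then bind once
pieces <- Filter(Negate(is.null), pieces)
master <- do.call(rbind, pieces)

master$Symbol
# "AAA" "CCC"
```

Binding once avoids repeatedly copying an ever larger data frame, which can matter with several thousand symbols.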

Process for Returns

The goal for the backtest is to determine if an SMI trigger, indicating a buy signal, is a potential predictor for positive returns based on the expectation of a bounceback in stock price after an oversold condition. For this example, we will look at a 3 month return from the date of the trigger.

Using the backtest environment again, we will retrieve the adjusted closing prices beginning with the date of the trigger and ending 3 months after. If the trigger date is such that the calculated end date falls after 12/30/2016, the last trading day of 2016 and the last date in our backtest environment, the recorded returns will be NA.

During the 3 month testing window, it can be expected that the stock price will have volatility and that the starting and ending prices for the window may not reflect the low and high stock prices observed. To determine the potential for returns, we will also record the highest stock price during the window. It is possible that a 3 month return may show a negative value but there is an opportunity for a positive return within the window.
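The difference between the two return measures is easiest to see on a small example. The prices below are made up; the variable names mirror the fields used later in the project.

```r
# Hypothetical adjusted closes over one test window, trigger date first
adj_close <- c(50.00, 48.50, 52.75, 55.00, 47.00)

trigger_cl <- adj_close[1]                   # price on the trigger date
period_cl  <- adj_close[length(adj_close)]   # price at the end of the window
period_hi  <- max(adj_close)                 # best price seen in the window

return_3mt <- round((period_cl - trigger_cl) / trigger_cl, 3)
return_hi  <- round((period_hi - trigger_cl) / trigger_cl, 3)

c(return_3mt = return_3mt, return_hi = return_hi)
# return_3mt  return_hi
#      -0.06       0.10
```

Here the window closes 6% below the entry price even though a 10% gain was available along the way. Because the window includes the trigger date itself, the period-high return can never be negative.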

There are 3 unique data frames that need to be tested, one for each of the 3 signals previously defined. In order to create an efficient testing environment, I have created multiple functions that will be called using the desired data frames. These functions will be described as they appear below.


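The signal data frames will be updated with the following fields: Trigger_Cl, the adjusted close on the trigger date; Period_Cl, the adjusted close at the end of the window; Return_3mt, the return from Trigger_Cl to Period_Cl; Period_Hi, the highest adjusted close within the window; Return_Hi, the return from Trigger_Cl to Period_Hi; and Volume_Min and Volume_Avg, the minimum and average daily volume over the window.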


The first function, Prepare_Frame(), will take the signal data frame to be processed as input and add/initialize the needed return fields.

Prepare Frame Function

Prepare_Frame <- function(temp_frame) {
        
        temp_frame <- temp_frame %>%
                      mutate(Trigger_Cl = 0,
                             Period_Cl  = 0,
                             Return_3mt = 0,
                             Period_Hi  = 0,
                             Return_Hi  = 0,
                             Volume_Min = 0,
                             Volume_Avg = 0)
        
        return(temp_frame)
}

The following calls Prepare_Frame() and places the results in the target signal data frame.

Add Fields to SMI Frame

fastK_frame <- Prepare_Frame(fastK_frame)
fastD_frame <- Prepare_Frame(fastD_frame)
slowD_frame <- Prepare_Frame(slowD_frame)
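| Symbol | Date | fastK | Trigger_Cl | Period_Cl | Return_3mt | Period_Hi | Return_Hi | Volume_Min | Volume_Avg |
|---|---|---|---|---|---|---|---|---|---|
| AAAP | 2016-10-28 | 12.6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAAP | 2016-11-28 | 15.9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAAP | 2016-12-01 | 14.7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-01-23 | 4.1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-02-21 | 5.3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

| Symbol | Date | fastD | Trigger_Cl | Period_Cl | Return_3mt | Period_Hi | Return_Hi | Volume_Min | Volume_Avg |
|---|---|---|---|---|---|---|---|---|---|
| AAAP | 2016-11-30 | 13.4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAAP | 2016-12-01 | 13.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-01-25 | 11.3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-02-23 | 7.7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-03-01 | 16.2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

| Symbol | Date | slowD | Trigger_Cl | Period_Cl | Return_3mt | Period_Hi | Return_Hi | Volume_Min | Volume_Avg |
|---|---|---|---|---|---|---|---|---|---|
| AAAP | 2016-12-01 | 18.1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-01-29 | 13.2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-02-26 | 12.6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-03-01 | 11.5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| AAL | 2007-04-16 | 18.5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |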

Generating the returns requires a processing loop to retrieve the symbol and date of the SMI trigger. A target end date is generated for the 3 month window. Each month, on average, contains 4.33 weeks. So a 3 month testing window will consist of 13 weeks.

If the calculated end date falls outside the range of the backtest environment, we simply set a value of NA to the desired fields. If the date range is within the backtest environment range, we can process the record and retrieve/calculate the statistics. We will be using the adjusted close field for all of the data points.
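The window arithmetic and the range check can be tried out with plain Date values. This sketch uses base R date addition (13 weeks is 91 days) instead of lubridate, and the trigger dates are made up; the cutoff matches the one used in Generate_Returns().

```r
cutoff   <- as.Date("2016-12-31")                  # end of the backtest range
beg_date <- as.Date(c("2016-03-01", "2016-10-28")) # hypothetical trigger dates

end_date <- beg_date + 13 * 7                      # 13 weeks = 91 days
in_range <- end_date <= cutoff                     # FALSE means record NA

data.frame(beg_date, end_date, in_range)
#     beg_date   end_date in_range
# 1 2016-03-01 2016-05-31     TRUE
# 2 2016-10-28 2017-01-27    FALSE
```

Any trigger in roughly the last quarter of 2016 therefore drops out of the return analysis.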

The second function, Generate_Returns(), will take the signal data frame to be processed, perform the data retrieval, perform the required statistical processing, and update the data frame accordingly.

Generate Returns Function

Generate_Returns <- function(temp_frame) {
        
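        for(i in seq_len(nrow(temp_frame))) {
        
        Symbol   <- temp_frame$Symbol[i]
        
        # Test window: the trigger date through 13 weeks later
        Beg_Date <- as.Date(temp_frame$Date[i])
        End_Date <- Beg_Date + dweeks(13)
        
        # Windows that run past the end of the backtest data get NA results
        if(End_Date > as.Date("2016-12-31")) {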
                
                temp_frame$Trigger_Cl[i]     <- NA
                temp_frame$Period_Cl[i]      <- NA
                temp_frame$Return_3mt[i]     <- NA
                temp_frame$Period_Hi[i]      <- NA
                temp_frame$Return_Hi[i]      <- NA
                temp_frame$Volume_Min[i]     <- NA
                temp_frame$Volume_Avg[i]     <- NA
                
        } 
        else {
                
                sym_sum    <- get(Symbol, envir = bt_env_2007_2016)
                
                date_range <- paste(as.character(Beg_Date),"/",as.character(End_Date), sep = "")
                
                sym_sum    <- data.frame(sym_sum[date_range])
                
                colnames(sym_sum) <- c("Open", "High", "Low", "Close", "Volume", "AdjClose")
                
                temp_frame$Trigger_Cl[i] <- as.numeric(sym_sum$AdjClose[1])
                temp_frame$Period_Cl[i]  <- as.numeric(sym_sum$AdjClose[nrow(sym_sum)])
                
                temp_frame$Return_3mt[i] <- round((temp_frame$Period_Cl[i]  - temp_frame$Trigger_Cl[i])
                                                 / temp_frame$Trigger_Cl[i], 3)
                
                temp_frame$Period_Hi[i]  <- max(sym_sum$AdjClose)
                
                temp_frame$Return_Hi[i]  <- round((temp_frame$Period_Hi[i]  - temp_frame$Trigger_Cl[i])
                                                 / temp_frame$Trigger_Cl[i], 3)
                
                temp_frame$Volume_Min[i] <- min(sym_sum$Volume)
                temp_frame$Volume_Avg[i] <- round(mean(sym_sum$Volume),0)

                }
        }
        
        return(temp_frame)
        
}

The following calls Generate_Returns() and places the results in the target signal data frame.

Note: There are many thousands of rows to process, so this section will take some time to complete depending on the machine performing the work (CPU, memory, and so on).

Generate Returns

fastK_frame <- Generate_Returns(fastK_frame)
fastD_frame <- Generate_Returns(fastD_frame)
slowD_frame <- Generate_Returns(slowD_frame)

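Once the data has been retrieved and the desired information stored in the data frame, we can filter to remove records that do not meet our acceptance criteria. A record is rejected if it has any missing values, a trigger price below 5 or of 1000 or more, a minimum daily volume of 50,000 or less, or an average daily volume below 100,000.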

The purpose of the acceptance criteria is to eliminate irregularly traded stocks which might otherwise skew the results.
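Before filtering, it can be useful to see how many records each rule would remove. The sketch below applies the same thresholds that Clean_Frame() uses to a small made-up frame; the symbols and values are hypothetical.

```r
frame <- data.frame(
  Symbol     = c("AAA", "BBB", "CCC", "DDD"),
  Trigger_Cl = c(53.27, 3.10, 1200.00, 25.00),
  Volume_Min = c(1014900, 80000, 60000, 20000),
  Volume_Avg = c(2147523, 150000, 120000, 90000),
  stringsAsFactors = FALSE
)

# One logical column per rejection rule
rules <- with(frame, cbind(
  price_low  = Trigger_Cl < 5,
  price_high = Trigger_Cl >= 1000,
  thin_day   = Volume_Min <= 50000,
  thin_avg   = Volume_Avg < 100000
))

rejected_by <- colSums(rules)               # records failing each rule
kept        <- frame[rowSums(rules) == 0, ] # records passing every rule

kept$Symbol
# "AAA"
```

A record can fail more than one rule (DDD fails both volume rules here), so the per-rule counts can add up to more than the number of rejected records.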


The third function, Clean_Frame(), will take the signal data frame to be processed and perform the cleaning and filtering described above.

Clean Frame Function

Clean_Frame <- function(temp_frame) {
        
        temp_final <- temp_frame %>%
                      filter(complete.cases(.),
                      Trigger_Cl >= 5,
                      Trigger_Cl < 1000,
                      Volume_Min > 50000,
                      Volume_Avg >= 100000)
        
        return(temp_final)
}

The following calls Clean_Frame() with the target data frame and loads the result into the final version of the data frame which will be used for return analysis.

Clean Frames

fastK_final <- Clean_Frame(fastK_frame)
fastD_final <- Clean_Frame(fastD_frame)
slowD_final <- Clean_Frame(slowD_frame)
| Symbol | Date | fastK | Trigger_Cl | Period_Cl | Return_3mt | Period_Hi | Return_Hi | Volume_Min | Volume_Avg |
|---|---|---|---|---|---|---|---|---|---|
| AAL | 2007-01-23 | 4.1 | 53.27 | 42.77 | -0.197 | 59.31 | 0.113 | 1014900 | 2147523 |
| AAL | 2007-02-21 | 5.3 | 55.73 | 34.95 | -0.373 | 55.73 | 0.000 | 1102500 | 2651518 |
| AAL | 2007-03-01 | 17.6 | 52.08 | 35.65 | -0.315 | 52.08 | 0.000 | 1036200 | 2682342 |
| AAL | 2007-04-02 | 17.1 | 45.05 | 30.70 | -0.319 | 47.87 | 0.063 | 1036200 | 3054338 |
| AAL | 2007-05-01 | 12.1 | 37.06 | 31.01 | -0.163 | 37.06 | 0.000 | 1036200 | 3085808 |

| Symbol | Date | fastD | Trigger_Cl | Period_Cl | Return_3mt | Period_Hi | Return_Hi | Volume_Min | Volume_Avg |
|---|---|---|---|---|---|---|---|---|---|
| AAL | 2007-01-25 | 11.3 | 53.86 | 39.33 | -0.270 | 59.31 | 0.101 | 1014900 | 2188900 |
| AAL | 2007-02-23 | 7.7 | 54.25 | 34.25 | -0.369 | 54.25 | 0.000 | 1036200 | 2664329 |
| AAL | 2007-03-01 | 16.2 | 52.08 | 35.65 | -0.315 | 52.08 | 0.000 | 1036200 | 2682342 |
| AAL | 2007-04-12 | 16.7 | 44.12 | 34.93 | -0.208 | 46.98 | 0.065 | 1036200 | 3145822 |
| AAL | 2007-05-01 | 5.0 | 37.06 | 31.01 | -0.163 | 37.06 | 0.000 | 1036200 | 3085808 |

| Symbol | Date | slowD | Trigger_Cl | Period_Cl | Return_3mt | Period_Hi | Return_Hi | Volume_Min | Volume_Avg |
|---|---|---|---|---|---|---|---|---|---|
| AAL | 2007-01-29 | 13.2 | 54.43 | 36.94 | -0.321 | 59.31 | 0.090 | 1014900 | 2232588 |
| AAL | 2007-02-26 | 12.6 | 53.34 | 34.25 | -0.358 | 53.34 | 0.000 | 1036200 | 2686431 |
| AAL | 2007-03-01 | 11.5 | 52.08 | 35.65 | -0.315 | 52.08 | 0.000 | 1036200 | 2682342 |
| AAL | 2007-04-16 | 18.5 | 45.34 | 34.96 | -0.229 | 46.98 | 0.036 | 1036200 | 3134914 |
| AAL | 2007-05-01 | 4.7 | 37.06 | 31.01 | -0.163 | 37.06 | 0.000 | 1036200 | 3085808 |
## [1] "Total Remaining Records - FastK: 154700"
## [1] "Total Remaining Records - FastD: 128932"
## [1] "Total Remaining Records - SlowD: 114830"

As we would expect, the number of records remaining in the signal data frames decreases as we move from FastK to FastD and to SlowD.


Review Results

The first thing we will review is the overall 3 month returns, based on the adjusted closing prices at the beginning and end of the test window.

The fourth function, Three_Month_Stats(), will generate the summary statistics for each of the final signal data frames and print the results.

3 Month Return Function

Three_Month_Stats <- function(temp_frame) {
        
        pos_ret_3mt <- nrow(temp_frame %>%
                            filter(Return_3mt > 0))

        neg_ret_3mt <- nrow(temp_frame %>%
                            filter(Return_3mt <= 0))

        pos_pct_3mt <- round(pos_ret_3mt / nrow(temp_frame), 3)

        avg_ret_3mt <- round(mean(temp_frame$Return_3mt), 3)

        avg_pos_ret <- round(mean(temp_frame$Return_3mt[temp_frame$Return_3mt > 0]), 3)
        avg_neg_ret <- round(mean(temp_frame$Return_3mt[temp_frame$Return_3mt <= 0]), 3)
        
        
        print(paste("  Total Positive Returns:", pos_ret_3mt))
        print(paste("  Total Negative Returns:", neg_ret_3mt))
        print(paste("Percent Positive Returns:", pos_pct_3mt))
        print(paste("          Average Return:", avg_ret_3mt))
        print(paste(" Average Positive Return:", avg_pos_ret))
        print(paste(" Average Negative Return:", avg_neg_ret))

}

FastK 3 Month Returns

Three_Month_Stats(fastK_final)
## [1] "  Total Positive Returns: 85445"
## [1] "  Total Negative Returns: 69255"
## [1] "Percent Positive Returns: 0.552"
## [1] "          Average Return: 0.023"
## [1] " Average Positive Return: 0.159"
## [1] " Average Negative Return: -0.146"

The FastK signal resulted in an average return of 2.3% with 55.2% of observations resulting in a positive return at the end of the testing window.

FastD 3 Month Returns

Three_Month_Stats(fastD_final)
## [1] "  Total Positive Returns: 71852"
## [1] "  Total Negative Returns: 57080"
## [1] "Percent Positive Returns: 0.557"
## [1] "          Average Return: 0.025"
## [1] " Average Positive Return: 0.163"
## [1] " Average Negative Return: -0.148"

The FastD signal improved on the results by generating an average return of 2.5% and positive return percentage of 55.7%.

SlowD 3 Month Returns

Three_Month_Stats(slowD_final)
## [1] "  Total Positive Returns: 64162"
## [1] "  Total Negative Returns: 50668"
## [1] "Percent Positive Returns: 0.559"
## [1] "          Average Return: 0.027"
## [1] " Average Positive Return: 0.166"
## [1] " Average Negative Return: -0.149"

The more conservative SlowD sees an average return of 2.7% and a positive return percentage of 55.9%.

The statistics seem to indicate that the more conservative triggers eliminate many of the false buy signals and allow the stock price to approach lower entry points before signaling a buy, resulting in a slightly higher return rate.


Now we can look at whether there was a potential for higher returns by looking at the period high stock price observed within the testing period instead of using the testing window closing price.

The fifth function, Three_Month_Period(), will generate the summary statistics using the high stock price in the testing period to determine returns.

3 Month Period Returns

Three_Month_Period <- function(temp_frame) {
        
        pos_ret_per <- nrow(temp_frame %>%
                            filter(Return_Hi > 0))

        neg_ret_per <- nrow(temp_frame %>%
                            filter(Return_Hi <= 0))

        pos_pct_per <- round(pos_ret_per / nrow(temp_frame), 3)

        avg_ret_per <- round(mean(temp_frame$Return_Hi[temp_frame$Return_Hi > 0]), 3) 
        
        
        print(paste("  Total Positive Returns:", pos_ret_per))
        print(paste("  Total Negative Returns:", neg_ret_per))
        print(paste("Percent Positive Returns:", pos_pct_per))
        print(paste(" Average Positive Return:", avg_ret_per))

}

FastK 3 Month Period Returns

Three_Month_Period(fastK_final)
## [1] "  Total Positive Returns: 144174"
## [1] "  Total Negative Returns: 10526"
## [1] "Percent Positive Returns: 0.932"
## [1] " Average Positive Return: 0.159"

The FastK signal yielded a potential average return for positive stocks of 15.9%. This is not an improvement on the same statistic observed when using the ending price to calculate returns. There is an improvement in the percentage of positive returns, 93.2%, however.

FastD 3 Month Period Returns

Three_Month_Period(fastD_final)
## [1] "  Total Positive Returns: 120318"
## [1] "  Total Negative Returns: 8614"
## [1] "Percent Positive Returns: 0.933"
## [1] " Average Positive Return: 0.164"

The FastD signal yielded a slightly higher potential average return for positive stocks of 16.4%, but not enough to claim a difference. There is an improvement in the percentage of positive returns, 93.3%.

SlowD 3 Month Period Returns

Three_Month_Period(slowD_final)
## [1] "  Total Positive Returns: 107405"
## [1] "  Total Negative Returns: 7425"
## [1] "Percent Positive Returns: 0.935"
## [1] " Average Positive Return: 0.166"

The SlowD signal yielded the same potential average return for positive stocks, 16.6%, as it did when using the ending price. There is an improvement in the percentage of positive returns, 93.5%.


Of course, using the average return potentially creates a false expectation of returns because large outliers can skew the statistical result. So we can look at the positive returns broken out by percentages of occurrence. This helps identify a potential threshold for returns within the 3 month window that could be used as an exit point for the stock. That is, if we employed a strategy that sought a 5% return, or bounceback, from the entry point of the stock, triggered by the SMI, what percentage of investments would yield the desired return?
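The question of how often a target return is reached can also be answered directly, without reading it off a quantile table. The sketch below uses made-up period-high returns and reports the share both among positive observations and among all observations, since the two differ.

```r
# Hypothetical period-high returns, including one record with no gain
return_hi <- c(0.000, 0.012, 0.044, 0.065, 0.088, 0.114, 0.183, 0.338)
targets   <- c(0.05, 0.10)

positive <- return_hi[return_hi > 0]

# Share of records reaching each target return
share_of_positive <- sapply(targets, function(t) mean(positive >= t))
share_of_all      <- sapply(targets, function(t) mean(return_hi >= t))

round(rbind(share_of_positive, share_of_all), 3)
#                    [,1]  [,2]
# share_of_positive 0.714 0.429
# share_of_all      0.625 0.375
```

The columns correspond to the 5% and 10% targets. The share among all records is always the lower of the two, because records with no gain count against it.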

FastK Quantiles for Positive Period Returns

quantile(fastK_final$Return_Hi[fastK_final$Return_Hi > 0], probs = seq(0, 1, length.out = 11), type = 5)
##    0%   10%   20%   30%   40%   50%   60%   70%   80%   90%  100% 
## 0.001 0.023 0.044 0.065 0.088 0.114 0.144 0.183 0.238 0.338 5.325

Around 75% of the observations with a positive period return had a return of 5% or better. Around 55% had a return of 10% or better.

FastD Quantiles for Positive Returns

quantile(fastD_final$Return_Hi[fastD_final$Return_Hi > 0], probs = seq(0, 1, length.out = 11), type = 5)
##     0%    10%    20%    30%    40%    50%    60%    70%    80%    90% 
## 0.0010 0.0240 0.0450 0.0670 0.0910 0.1170 0.1490 0.1881 0.2450 0.3470 
##   100% 
## 5.8320

Around 75% of the observations with a positive period return had a return of 5% or better. Around 55% had a return of 10% or better.

SlowD Quantiles for Positive Returns

quantile(slowD_final$Return_Hi[slowD_final$Return_Hi > 0], probs = seq(0, 1, length.out = 11), type = 5)
##    0%   10%   20%   30%   40%   50%   60%   70%   80%   90%  100% 
## 0.001 0.024 0.046 0.069 0.093 0.119 0.151 0.191 0.248 0.353 6.000

Around 75% of the observations with a positive period return had a return of 5% or better. Around 55% had a return of 10% or better.

Overall, there is very little difference between the three signals when looking at the percentage of positive returns. The SlowD signal simply shows a slightly higher upside return at each decile from 20% upward.


Finally, let’s look at the average positive returns by SMI using two levels: an SMI value greater than 10, and an SMI value of 10 or less.

FastK SMI Breakdown

round(mean(fastK_final$Return_Hi[fastK_final$Return_Hi > 0 &
                                 fastK_final$fastK > 10]), 3)
## [1] 0.154
round(mean(fastK_final$Return_Hi[fastK_final$Return_Hi > 0 &
                                 fastK_final$fastK <= 10]), 3)
## [1] 0.165

FastD SMI Breakdown

round(mean(fastD_final$Return_Hi[fastD_final$Return_Hi > 0 &
                                 fastD_final$fastD > 10]), 3)
## [1] 0.16
round(mean(fastD_final$Return_Hi[fastD_final$Return_Hi > 0 &
                                 fastD_final$fastD <= 10]), 3)
## [1] 0.181

SlowD SMI Breakdown

round(mean(slowD_final$Return_Hi[slowD_final$Return_Hi > 0 &
                                 slowD_final$slowD > 10]), 3)
## [1] 0.164
round(mean(slowD_final$Return_Hi[slowD_final$Return_Hi > 0 &
                                 slowD_final$slowD <= 10]), 3)
## [1] 0.194

The highest average positive return, 19.4%, is found using the SlowD signal and an SMI trigger of 10 or less.
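The paired calls above can be collapsed into a single grouped summary. The sketch below does this with tapply() on a small made-up frame; the column names follow the final signal frames, but the values are hypothetical.

```r
# Hypothetical slice of a final signal frame
final <- data.frame(
  slowD     = c(4.7, 11.5, 12.6, 8.0, 18.5, 9.9),
  Return_Hi = c(0.20, 0.10, 0.00, 0.30, 0.06, 0.10)
)

# Keep positive period returns, then label each record by SMI level
positive <- final[final$Return_Hi > 0, ]
level    <- ifelse(positive$slowD <= 10, "<= 10", "> 10")

avg_by_level <- round(tapply(positive$Return_Hi, level, mean), 3)

avg_by_level
# <= 10  > 10
#  0.20  0.08
```

The same approach extends to more than two levels by replacing ifelse() with cut().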


Conclusion

The Stochastic Momentum Index, developed in the 1950s, provides an effective signal for identifying oversold stocks with the potential for a bounceback in price. Applying smoothing to the FastK signal increases the success rate and the potential returns, as the risk of false signals from the FastK alone decreases. The SMI has the ability to act as both a trigger for oversold conditions and a momentum indicator for identifying an appropriate level to trigger a buy.

As is the case with most of the signals for evaluating securities, there are many adjustments one can make to find the appropriate signal generator based on the desired type of strategy. The period can be altered to lengthen the signal evaluation, the smoothing period can be altered to provide a faster or slower sensitivity to the FastK, and the SMI trigger value can be adjusted to evaluate different types of conditions and their related returns.

As always, further analysis can be done to look at specific companies, sectors, or industries to determine whether any of those categories have tendencies towards better or worse results when using the SMI.




sessionInfo()
## R version 3.4.0 (2017-04-21)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 15063)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.1252 
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] knitr_1.16      lubridate_1.6.0 dplyr_0.5.0     quantmod_0.4-9 
## [5] TTR_0.23-1      xts_0.9-7       zoo_1.8-0      
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.11     magrittr_1.5     lattice_0.20-35  R6_2.2.1        
##  [5] rlang_0.1.1      stringr_1.2.0    highr_0.6        tools_3.4.0     
##  [9] grid_3.4.0       DBI_0.6-1        htmltools_0.3.6  lazyeval_0.2.0  
## [13] yaml_2.1.14      rprojroot_1.2    digest_0.6.12    assertthat_0.2.0
## [17] tibble_1.3.3     evaluate_0.10    rmarkdown_1.5    stringi_1.1.5   
## [21] compiler_3.4.0   backports_1.1.0