July 13, 2016

Using the tsoutliers package

  • Used to detect outliers in time series data
  • Detecting outliers is important because they impact model selection and future forecasts of the data
  • The package automates outlier detection in time series data, following the Chen & Liu method

Outlier Types

  • 5 types of outliers can be considered.
  • By default the tso function looks for additive outliers, level shifts, and temporary changes. You can also choose to look for innovational outliers and seasonal level shifts.
  • An additive outlier is a surprisingly large or small value occurring for a single observation. Subsequent observations are unaffected by an additive outlier. Consecutive additive outliers are typically referred to as additive outlier patches.
  • In a level shift outlier, all observations appearing after the outlier move to a new level. In contrast to additive outliers, a level shift outlier affects many observations and has a permanent effect.
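
As a minimal sketch of the default search, the call below runs tso on a simulated series with one injected spike. The simulated data, the seed, and the spike position are assumptions made for illustration. Only the tso function, its types argument, and the outlier type codes come from the package.

```r
library(tsoutliers)

# Hypothetical data: 60 points of white noise with one injected spike.
set.seed(1)
y <- ts(rnorm(60))
y[30] <- y[30] + 8  # artificial additive outlier at observation 30

# "AO", "LS", and "TC" are the default types; adding "IO" and/or "SLS"
# to this vector widens the search to the other two types.
fit <- tso(y, types = c("AO", "LS", "TC"))

fit$outliers    # table of detected outliers (type, position, estimated effect)
head(fit$yadj)  # start of the series with the estimated outlier effects removed
```

The yadj component is the adjusted series, which is what the later comparison of raw and adjusted data relies on.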

Outlier Types

  • Temporary change outliers are similar to level shift outliers, but the effect of the outlier diminishes exponentially over the subsequent observations. Eventually, the series returns to its normal level.
  • An innovational outlier has an initial impact whose effects linger over the observations that follow. Its influence may increase as time proceeds.
  • A seasonal level shift appears as a surprisingly large or small value occurring repeatedly at regular (seasonal) intervals.
  • By default, tso searches for additive outliers, level shifts, and temporary changes
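
The three default outlier types differ only in how their effect evolves after the outlier date. The sketch below builds each effect pattern for a unit-sized outlier in base R. The series length, the outlier position, and the decay rate of 0.7 are assumed values chosen for illustration. Innovational outliers and seasonal level shifts are left out because their shape depends on the fitted model and the seasonal period.

```r
n     <- 12   # hypothetical series length
t0    <- 4    # hypothetical position of the outlier
delta <- 0.7  # assumed decay rate for the temporary change

time_index <- seq_len(n)

# Additive outlier: a one-off spike at t0, nothing afterwards.
additive <- as.numeric(time_index == t0)

# Level shift: a permanent step from t0 onwards.
level_shift <- as.numeric(time_index >= t0)

# Temporary change: full effect at t0, then exponential decay toward zero.
temp_change <- ifelse(time_index >= t0, delta^(time_index - t0), 0)

round(cbind(time_index, additive, level_shift, temp_change), 3)
```

Reading down the columns shows the difference: the additive effect is gone after one observation, the level shift never goes away, and the temporary change shrinks by a factor of delta at each step.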

Tasking Priority 4 July 7

The graph here shows temporary change outliers at Day 4 and Day 8

RCS for Tasking Priority 4 July 7

The graph here shows an additive outlier at Day 8

Correlate Adjusted plots

## [1] -0.4265296
## [1] 0.07809466
  • Before the adjustment the correlation was about -0.43, meaning Tasking and RCS were moderately negatively correlated
  • After the adjustment the correlation is about 0.08, meaning there is little to no linear correlation once the outlier effects are removed
  • This suggests that the original correlation between the 2 fields was driven mainly by the outliers
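
To illustrate how a single shared outlier can create a correlation that disappears after adjustment, here is a small base R sketch. All data are simulated and are not the Tasking or RCS values behind the numbers above. The variable names, the seed, and the size of the injected outliers are assumptions. In a real analysis the adjusted series would likely come from the yadj component returned by tso rather than from known clean data.

```r
set.seed(42)
n <- 30

# Two unrelated hypothetical series.
tasking_clean <- rnorm(n)
rcs_clean     <- rnorm(n)

# Inject outliers of opposite sign on the same day (day 8).
tasking_raw <- tasking_clean
rcs_raw     <- rcs_clean
tasking_raw[8] <- tasking_raw[8] + 8
rcs_raw[8]     <- rcs_raw[8] - 8

# Correlation with the outliers present vs. with their effects removed.
r_raw <- cor(tasking_raw, rcs_raw)
r_adj <- cor(tasking_clean, rcs_clean)

c(raw = r_raw, adjusted = r_adj)
```

The raw correlation is pulled negative by the one opposing pair of values, while the adjusted correlation reflects only the underlying, unrelated series.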

Conclusion

  • With this package we can find outliers in the data
  • We can then compare the outliers found in one field with the outliers found in other fields
  • For example, on Day 8 there were outliers in both tasking priority and RCS, which could mean the two are related
  • We can also correlate the outlier-adjusted series and compare the result with the raw correlation to see whether the outliers are the cause of the correlation
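
One possible way to line up outliers across fields is to intersect their positions. The sketch below hand-builds two small tables from the days reported on the earlier slides (temporary changes at Day 4 and Day 8 for tasking priority, an additive outlier at Day 8 for RCS). The table names are hypothetical, and the two columns are assumed to mirror a subset of the outliers table that tso returns.

```r
# Hypothetical outlier tables built from the days reported on the slides.
tasking_outliers <- data.frame(type = c("TC", "TC"), ind = c(4, 8))
rcs_outliers     <- data.frame(type = "AO", ind = 8)

# Days on which both fields show an outlier.
shared_days <- intersect(tasking_outliers$ind, rcs_outliers$ind)
shared_days
```

A shared day is only a starting point for investigation. It does not by itself show that one field drives the other.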

More documentation on the tsoutliers package can be found here: https://cran.r-project.org/web/packages/tsoutliers/tsoutliers.pdf