I’m pleased to announce that version 0.1.1 of TSstudio is now available on CRAN and GitHub.

The TSstudio package provides a set of tools for time series analysis, supporting “ts”, “mts”, “zoo”, and “xts” objects. It includes interactive visualization tools, based on the plotly package, for descriptive time series analysis. In addition to the visualization tools, the package provides a set of utility functions for preprocessing time series objects.

Detailed examples of the package’s key functions are available here.

All the visualization tools in the package support multiple time series object types (“ts”, “mts”, “zoo”, “xts” and model output from the forecast package) without the need for any data transformation. They include the following plots:

  1. Visualization of time series objects (ts_plot)
  2. Seasonality plots (ts_seasonal)
  3. Heatmap, surface and polar plots (ts_heatmap, ts_surface, ts_polar)
  4. Lags and correlation plots (ts_lags, ts_acf, ts_pacf)
  5. Decompose plot (ts_decompose)
  6. Residual plot (check_res)
  7. Forecast performance (test_forecast)

Besides the visualization functions, there is a set of utility functions:

  1. ts_split - for splitting a time series object into training and testing partitions
  2. ts_reshape - for transforming time series objects into a year/cycle unit data frame format
  3. xts_to_ts and zoo_to_ts - functions for transforming xts or zoo objects to ts format
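
As a quick illustration of the first two utilities, here is a minimal sketch with the USgas series. It assumes that ts_split() takes a sample.out argument (the number of observations to hold out) and returns a list with train and test elements, and that ts_reshape() can be called with its default arguments; check ?ts_split and ?ts_reshape for the version you have installed.

```r
library(TSstudio)
data(USgas)

# Hold out the last 12 months as the testing partition
# (sample.out is assumed to be the number of observations left out of training)
split_usgas <- ts_split(USgas, sample.out = 12)
train <- split_usgas$train
test  <- split_usgas$test

# The two partitions together should cover the full series
length(train) + length(test) == length(USgas)

# Reshape the series into a year/cycle data frame (default arguments assumed)
usgas_df <- ts_reshape(USgas)
head(usgas_df)
```

Holding out the most recent observations, rather than a random sample, keeps the time order of the series intact.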

Note: most of the current functions support only monthly and quarterly frequencies. The plan is to expand support to higher frequencies, such as daily and hourly, in the next release.

The main emphasis of the package is simplifying the process, by getting maximum results with minimum effort (and code…). Here are some examples of the key functions of the package with the USgas data set (the US natural gas consumption between 2000 and 2017):

Installation

install.packages("TSstudio")

Visualization of time series objects

The ts_plot() function is designed to plot any univariate or multivariate “ts”, “zoo” and “xts” object:

library(TSstudio)

data(USgas)

ts_plot(USgas)

In addition, this function has control parameters that let the user add features to the plot, such as titles, color, line type and width, as well as a slider. For example, let’s add titles and a slider to the previous plot:

ts_plot(USgas,
        title = "US Natural Gas Consumption 2000 - 2017",
        Xtitle = "Source: U.S. Bureau of Transportation Statistics",
        Ytitle = "Billion Cubic Feet",
        slider = TRUE
        )
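
The styling options mentioned above (color, line type and width) can be set in the same call. The sketch below assumes the argument names line.mode, width, dash and color, which is how I recall them from the package documentation; please confirm with ?ts_plot, as the names may differ between versions.

```r
library(TSstudio)
data(USgas)

# Argument names below are assumptions; verify them with ?ts_plot
p <- ts_plot(USgas,
             title = "US Natural Gas Consumption",
             Ytitle = "Billion Cubic Feet",
             line.mode = "lines+markers", # draw a marker at each observation
             width = 3,                   # line width
             dash = "dash",               # dashed instead of solid line
             color = "green"              # line color
             )
p
```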

Seasonal analysis

The package provides a variety of tools for seasonality analysis, including cycle box plots and seasonal heatmap, surface and polar plots.

The main function, ts_seasonal(), has four modes:

  1. “normal” for splitting the series by full cycle units (e.g. years)
  2. “cycle” for splitting the series by cycle units (e.g. months or quarters)
  3. “box” for a box plot by cycle units (e.g. months or quarters)
  4. “all” for combining all the options above together

The “all” option is my favorite mode for this plot, as each individual plot provides only partial information about the seasonality and the trend of the series. Combining all three plots lets you connect the dots and see the full picture. For example, the basic time series plot of the USgas dataset above shows a clear seasonal pattern (which makes sense). The seasonal plots, however, reveal additional insights:

  1. Overall, the variation in consumption during the winter period (e.g. Nov, Dec and Jan) is larger than during the summer period (e.g. Jun, Jul and Aug), as can be seen in the box plot
  2. This variation is related to the general upward trend in consumption over time, which can be seen in the cycle plot (the second plot below). Note also that the gap and the order between consecutive months stay the same in most cases.
  3. The first plot shows steady growth over time across all the months of the year, and the lines are almost perfectly ordered by year (the early 2000s are at the bottom and the most recent years are at the top of the plot)

ts_seasonal(USgas, type = "all")
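
If the combined view is too crowded, each mode can also be drawn on its own. This sketch uses only the type values listed above:

```r
library(TSstudio)
data(USgas)

# One line per year, to compare full cycles
ts_seasonal(USgas, type = "normal")

# One line per month, to follow each cycle unit over the years
ts_seasonal(USgas, type = "cycle")

# Box plot per month, to compare the spread across cycle units
ts_seasonal(USgas, type = "box")
```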

The heatmap is another way to look at seasonality. In the output of the ts_heatmap() function below you can notice:

  1. Overall, the colors across all months of the year are shifting toward the yellow end of the scale, meaning consumption is increasing. This may be related to several factors, such as population growth, increased use of natural gas for energy production, and weather effects
  2. January 2014 seems to be an outlier, which could be explained by the fact that the winter of 2014 was one of the coldest in North America.

ts_heatmap(USgas)
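
To check a reading like the January one against the numbers rather than the colors, the January observations can be pulled out with base R. This is only a sketch using cycle() and time() from the stats package; the data frame and column names are my own and not part of TSstudio.

```r
library(TSstudio)
data(USgas)

# Keep only the January observations (cycle position 1 in a monthly series)
is_jan <- as.numeric(cycle(USgas)) == 1

jan <- data.frame(
  # The small offset guards against floating-point error in time()
  year = floor(as.numeric(time(USgas)) + 0.001)[is_jan],
  consumption = as.numeric(USgas)[is_jan]
)

# Sort from highest to lowest consumption to see which years stand out
jan[order(jan$consumption, decreasing = TRUE), ]
```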

Correlation analysis

Another way to look at the correlation of a time series with its own past, besides the acf and pacf functions (which are also available in plotly format as ts_acf() and ts_pacf()), is to use lag plots. The ts_lags() function provides a set of plots of the current series against its previous lags. Setting the number of lags to 15, it is fairly easy to see that lag 12 is highly correlated with the current series:

ts_lags(USgas, lag.max = 15)
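
To put numbers on what the lag plots show, the sample autocorrelations can be read off with the base R acf() function. This is a sketch that relies only on the stats package; the lag_corr data frame is a name I made up for the example.

```r
library(TSstudio)
data(USgas)

# Sample autocorrelations up to lag 15, without drawing the base R plot
usgas_acf <- acf(USgas, lag.max = 15, plot = FALSE)

# The first element of $acf is lag 0, so lag k sits at position k + 1
lag_corr <- data.frame(
  lag = 0:15,
  acf = as.numeric(usgas_acf$acf)
)

# Lags ordered from the strongest to the weakest correlation
lag_corr[order(abs(lag_corr$acf), decreasing = TRUE), ]
```

Lag 0 always comes first with a correlation of 1, so the rows of interest start from the second one.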

Road map

While the main focus in the current and previous releases was on descriptive analysis tools, the next release of the package will focus on tools for predictive analysis such as cross validation and training methods for time series forecasting, automation of forecasting and model selection.