Horizon Plots

Line graphs are a natural fit for time series data. But when many of them have to be packed into a cramped dashboard, they become less useful. For example, here is the simulated daily sales volume of 10 outlets of a retail chain over one year.

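suppressMessages(library(lattice))
set.seed(1)  # make the simulation reproducible
n = 10  # number of retail stores
t = 365  # one year of daily sales
# Random-walk sales figures, one column per store. Note that cumsum() runs over
# the whole vector, so each column continues from where the previous one ended.
dat = ts(matrix(cumsum(rnorm(t * n)), ncol = n))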

Stack all ten series in a single column for comparison, and the graphs become so flattened that the trends are no longer visible:

xyplot(dat, scales = list(y = "same"), strip = FALSE, layout = c(1, 10))

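Horizon plots are designed for exactly this situation. To see how they work, start with a simple oscillating curve drawn as an ordinary line graph, with dashed reference lines at ±100 and ±200: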

x = 1:300
y = x * sin(0.1 * x)
plot(x, y, type = "l")
abline(h = 0, col = "gray")  # baseline
abline(h = c(-200, -100, 100, 200), col = "lightgray", lty = 2)  # band boundaries

Let us see how this simple graph is represented on a horizon plot.

suppressMessages(library(latticeExtra))
horizonplot(ts(y))
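
The call above leaves the baseline and the band width to the defaults. According to the latticeExtra documentation, `origin` defaults to the first non-missing value of each series, and `horizonscale` is chosen per panel to cover the range of the data. A minimal sketch, assuming you want the color bands to coincide exactly with the dashed reference lines at ±100 and ±200, sets both explicitly (the variable `p` is just an illustrative name):

```r
library(latticeExtra)

x = 1:300
y = x * sin(0.1 * x)

# origin = 0 puts the red/blue boundary at zero;
# horizonscale = 100 makes each color band 100 units tall
p = horizonplot(ts(y), origin = 0, horizonscale = 100, colorkey = TRUE)
print(p)
```

Since `y` stays within roughly ±300 here, three bands of 100 units on each side are enough to cover the whole curve.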

Two things have happened here:

(1) Both positive and negative values are plotted above the x axis, and color tells them apart: positives are blue, negatives are red. Think of it as origami: fold the paper horizontally along the x axis, so that the positive and negative parts are both packed into the upper half of the paper.

(2) The folded graph is then cut along the dotted lines, and the upper slices are slid down to the baseline, where they overlap the lowest one. To distinguish high values from low ones, color saturation is used: the darkest blue represents the largest positive values, medium blue moderate values, and pale blue the smallest positive values. The same holds for red on the negative side. There are three shades each of blue and red, corresponding to the three bands the cuts produce.
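
The two steps can be sketched in a few lines of base R. This is only an illustration of the idea, not how latticeExtra implements it; `horizon_bands` is a hypothetical helper, and the baseline of 0 and band width of 100 are assumptions chosen to match the dashed lines in the example.

```r
# Hypothetical helper (not part of latticeExtra): for each value, report the
# color, the topmost band it reaches, and the height drawn inside that band.
horizon_bands = function(y, origin = 0, scale = 100, nbands = 3) {
  d = abs(y - origin)                          # step 1: fold negatives upward
  band = pmin(floor(d / scale) + 1, nbands)    # step 2: slice index, 1 = palest shade
  height = pmin(d - (band - 1) * scale, scale) # height left after sliding the slice down
  data.frame(y = y,
             color = ifelse(y >= origin, "blue", "red"),
             band = band,
             height = height)
}

res = horizon_bands(c(50, 150, 250, -120))
print(res)
```

For instance, 150 lands in the second blue band at a height of 50, and -120 in the second red band at a height of 20. The bands below the topmost one are completely filled and sit behind it.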

Armed with this tool, we can redraw the retail outlet sales data as horizon plots.

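# dat is already a ts object, so it can be passed directly.
# horizonscale = 10 fixes every color band at 10 units in all panels,
# which keeps the shades comparable across outlets.
horizonplot(dat, horizonscale = 10, colorkey = TRUE, layout = c(1, n))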

It is immediately clear that outlets 1, 9 and 10 have performed consistently above the baseline (which stands in for the sales target here), while outlets 2, 5 and 6 are the laggards.
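
A reading like this can be cross-checked numerically. The sketch below assumes the baseline of each outlet is its first observation, which is the documented default `origin` of `horizonplot`, and computes the share of days each outlet spent above it. The names `m` and `above` are illustrative.

```r
set.seed(1)
n = 10   # number of retail stores
t = 365  # one year of daily sales
m = matrix(cumsum(rnorm(t * n)), ncol = n)  # same simulated data, as a plain matrix

# Share of days on which each outlet is above its own first-day value
above = sapply(1:n, function(j) mean(m[, j] > m[1, j]))
names(above) = paste("Outlet", 1:n)
print(round(above, 2))
```

Outlets with a share close to 1 should appear mostly blue in the horizon plot, and those close to 0 mostly red.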