Forecasting Strategies

Purpose

Bussiness rely on forecasts of sales to plan production, justify marketing decisions, and guide research. A very efficient method of forecasing one variable is to find a related variable that leads it by one or more time intervals. The closer the relationship and the longer the lead time, the better this strategy becomes. The trick is to find a suitable lead variable. An Australain example is the Building Approvals time series. This provides valuable information on the likely demand over the next few months for all sectors of the building industry. A variation on the strategy of seeking a leading variable is to find a variable that is associated with the variable we need to forecast and easier to predict. In many applications, we cannot rely on finding a suitable leading variable and have to try other methods. A second approach, common in marketing, is to use information about the sales of similar products in the past. The indfluential Bass diffusion model is based on this principle. A third strategy is to make extrapolations based on present trends continuing and to iplemet adaptive estimates of these trends. The statistical technicalities of forecasting are covered here and the puporse of this session is to introduce the general strategies that are available.

Building approvals and building activity time series

The data ApprovActiv.dat are the total dwellings approved per month, averaged over the past three months (chain volume measured in millions of Australian dollars at the reference year 2004-05 price), lablled “Activity”, from March 1996 until Sept 2006. We start by reading the data into R and then construct time series objects and plot the two series on the same graph using ts. plot.

www7 <- "ApprovActiv.dat"

build.dat <- read.table(www7, header = TRUE); attach(build.dat)

app.ts <- ts(Approvals, start = c(1996, 1), freq=4)

act.ts <- ts(Activity, start = c(1996, 1), freq=4)

ts.plot(app.ts, act.ts, lty = c(1,3))

Cross-correlation between building approvals and activity

The ts.union function binds tie series with a common frequency, padding with ‘NA’’s to the union of their time coverages. If ts.union is used within the acf command R retunrs the correlograms for the two variables and the cross-correlograms in a single figure.

acf(ts.union(app.ts, act.ts))

In the figure above, the acfs for x and y are in the upper left and lower right frames, respectively, and the ccfs are in the lower left and upper right frames. The time unit for lag is one year, so a correlation at a lag of one wuarter appers at 0.25. If the variables are independent, we would expect 5% of saple correlations to lie outside the dashed lines. Several of the cross-correlatons at negative lags do pass these lines, indicating that the approvals time series is leading the activity. Numerical values can be printed using the print() function. It may be that common trends and seasonal effects are precisely what we are looking for, but the population ccf is defined for stationary random processes and it is usual to remove the trend and seasonal effects before investigating cross-correlations. Here we remove the trend using decompose, which uses a centred moving average of the four quarters. However, since it has missing values object I am not able to get the table.

#app.ran <- decompose (app.ts)$random
#app.ran.ts <- window (app.ran, start = c(1996, 3) )
#act.ran <- decompose (act.ts)$random
#act.ran.ts <- window (act.ran, start = c(1996, 3) )

#acf (ts.union(app.ran.ts, act.ran.ts))
#ccf (app.ran.ts, act.ran.ts)

#print(acf(ts.union(app.ran.ts, act.ran.ts)))

Gas supply

Gas suppliers typically have to place orders from gas from offshore fields 24 hours ahead. Variation about the average use of gas, for the time of year, depends on temperature and, to some extent, humidity and wind speed.

Bass model

Interpretation of the Bass model

I won’t explain the mathematics behind the bass model but would go for the interpretation of the bass model. One interpretation of the bass model is that the time from product launch until purchase is assumed to have a probability distribution that can be parametrised in terms of p and q. A plot of sales per time unit against time is obtained by multiplying the probabilty density by the number of people, m, who eventually buy the product. Let f(t), F(t) and h(t) be the density, cumulative distribution function (cdf), and hazard, respectively, of the distribution of time until purchase. Won’t go into defining the hazard as well as I have to appeal to mathematical functions. But will go with an example.

T79 <- 1:10 
Tdelt <- (1:100) /10
Sales <- c(840, 1470, 2110, 4000, 7590, 10950, 10530, 9470, 7790, 5890)
Cusales <- cumsum(Sales)

Bass.nls <- nls(Sales ~ M * ( ((P+Q)^2/P) *exp(-(P+Q) * T79)) / (1+(Q/P) * exp(-(P+Q)*T79))^2 , start = list(M = 60630, P=0.03, Q=0.38))

summary(Bass.nls)

## 
## Formula: Sales ~ M * (((P + Q)^2/P) * exp(-(P + Q) * T79))/(1 + (Q/P) * 
##     exp(-(P + Q) * T79))^2
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)    
## M 6.798e+04  3.128e+03   21.74 1.10e-07 ***
## P 6.594e-03  1.430e-03    4.61  0.00245 ** 
## Q 6.381e-01  4.140e-02   15.41 1.17e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 727.2 on 7 degrees of freedom
## 
## Number of iterations to convergence: 8 
## Achieved convergence tolerance: 7.323e-06

The final estimates for m, p, q, rounded to two significant places, are 68000, 0.0066, and 0.64 respectively. The starting values for P and Q are p and q for typical product. We assume the sales figures are prone to error and estimate the total sales, m, setting the starting value for M to the recorded total sales. The data and fitted curve can be plotted using the code below.

Bcoef <- coef(Bass.nls)

m <- Bcoef[1]
p <- Bcoef[2]
q <- Bcoef[3]

ngete <- exp(-(p+q) *Tdelt)
Bpdf <- m * ((p+q)^2/p) * ngete / (1+(q/p) *ngete)^2
plot(Tdelt, Bpdf, xlab = "Year from 1979", 
                  ylab = "Sales per year", type = 'l')

points(T79, Sales)

Bcdf <- m * (1 - ngete)/(1+(q/p) * ngete)

plot(Tdelt, Bcdf, xlab = "Year from 1979", 
                  ylab = "Cumulative sales", type="l")

points(T79, Cusales)

Exponential smoothing & the Holt-Winters method

Exponential smoothing

Complaints to a motoring organisation

The number of letters of complaint received each month by a motoring organisation over the four years 1996 to 1999, At the beggining of the year 2000, the organisation wishes to estimate the current level of complaints and ivestigate any trend in the level of complaints. We should first plot the data, and even thought there are only fpour years of data we should check for any marled systematic trend or seasonal affects/

www8 <- "motororg.dat"
motor.dat <- read.table(www8, header = T); attach(motor.dat)

blocklenth <- 12
comp.ts <- ts(complaints, start = c(1996, 1), freq=blocklenth)

plot(comp.ts, xlab = "Time / months", ylab = "Complaints")

There is no evidence of a systematic trend or seasonal effect, so it seems reasonable to use exponential smoothing for this time series. Exponential smoothing is a spexial case of the Holt-Winters algorithm. We implement in R using the HoltWinters function with the additional parameters set to 0. If we do not specify a value of beta, R will find the value that minimises the one-step-ahead prediction error.

# I follow the book and the code does not work with gamma = 0 so I suppose to put it as FALSE 
comp.hw1 <- HoltWinters(complaints, beta = 0, gamma = FALSE);
comp.hw1

## Holt-Winters exponential smoothing with trend and without seasonal component.
## 
## Call:
## HoltWinters(x = complaints, beta = 0, gamma = FALSE)
## 
## Smoothing parameters:
##  alpha: 0.9999444
##  beta : 0
##  gamma: FALSE
## 
## Coefficients:
##   [,1]
## a   23
## b    7

comp.hw1$SSE

## [1] 5899.138

plot(comp.hw1)

The estimated value of the mean number of letters of complaint per month at the end of 1999 is 17.7. The value of alpah that gives a minimum SS1PE, of 5899, IS 0.99. We now compare these results with those obtained if we specify a value of beta of 0.2

comp.hw2 <- HoltWinters(complaints, beta = 0.2, gamma = FALSE);
comp.hw2

## Holt-Winters exponential smoothing with trend and without seasonal component.
## 
## Call:
## HoltWinters(x = complaints, beta = 0.2, gamma = FALSE)
## 
## Smoothing parameters:
##  alpha: 0.63402
##  beta : 0.2
##  gamma: FALSE
## 
## Coefficients:
##        [,1]
## a 20.679273
## b  0.791666

plot(comp.hw2)

comp.hw2$SSE

## [1] 3903.202

The estimated value of the mean number of letters of complaint per month at the end of 1999 is now 20, and the SS1PE has decreased to 3903. From the last plot, it seems that there was a decrease in the number of complaints at the start of the period and a slight rise towards the end, although this has not yet affected the exponentially weighted moving average.

Holt-Winters method

In R, the function HoltWinters can be used to estimate smoothing perameters for the Holt-Winters model by minimising the one-step-ahead prediction errors (SS1PE)

In order to make use of the Holt-Winters method, I will use the wine data, which is montly sales of Australian wine by category, in thousands of litres, from January 1980 until 1995. The categories are fortified white, dry white, sweet white, red, rose, and sparkling. Here we present results for the model with multiplicative seasonals only. The Holt-Winters componetns and fitted values are shown.

www9 <- "wine.dat"

wine.dat <- read.table(www9, header = TRUE); attach(wine.dat)
sweets.ts <- ts(sweetw, start = c(1980, 1), freq=12)

plot(sweets.ts, xlab ="Time(months", ylab = "sales (1000 litres)")

sweetw.hw <- HoltWinters(sweets.ts, seasonal = "mult")

sweetw.hw ; sweetw.hw$coef ; sweetw.hw$SSE

## Holt-Winters exponential smoothing with trend and multiplicative seasonal component.
## 
## Call:
## HoltWinters(x = sweets.ts, seasonal = "mult")
## 
## Smoothing parameters:
##  alpha: 0.4086698
##  beta : 0
##  gamma: 0.4929402
## 
## Coefficients:
##            [,1]
## a   285.6890314
## b     1.3509615
## s1    0.9498541
## s2    0.9767623
## s3    1.0275900
## s4    1.1991924
## s5    1.5463100
## s6    0.6730235
## s7    0.8925981
## s8    0.7557814
## s9    0.8227500
## s10   0.7241711
## s11   0.7434861
## s12   0.9472648

##           a           b          s1          s2          s3          s4 
## 285.6890314   1.3509615   0.9498541   0.9767623   1.0275900   1.1991924 
##          s5          s6          s7          s8          s9         s10 
##   1.5463100   0.6730235   0.8925981   0.7557814   0.8227500   0.7241711 
##         s11         s12 
##   0.7434861   0.9472648

## [1] 477693.9

In the figure above, there is a dramatic increase in sales in the second half of the 1980’s followed by a reduction to a level well above the starting values. the optimum values for the smoothing parameters, based on minimisiong the one-step- ahead prediction erros, are 0.4107 , 0.0001516, and 0.4695 for alpga, beta and y respectively. It follows that the level and seasonal variation adapt rapidly whereas the trend is slow to do so. The coefficietns are the estimated values, of the level, slope, and multiplicative seasonals from January to December available at the latest time point (t=n=187), and these are the values that will be used for predictions. Finally, we have calculated the mean square one-step-ahead prediction error, which equals 50, and have compared it with the standard deviation of the original time series which is 121. The decrease is substantioa, but a more testing comparison would be with the mean one-step-ahead prediction error if we forecast the next month’s sales as equal to this month’s sales.

The seasonal variation looks as though it would be better modelled as multiplicative, and comparison of the SS1PE for the fitted models confirms this. The Holt-Winters components and fitted values are show in the bellow figures.

sqrt(sweetw.hw$SSE/length(sweetw))

## [1] 50.54219

sd(sweetw)

## [1] 121.3908

plot(sweetw.hw$fitted)

plot(sweetw.hw)

Four-year-ahead forecasts for the air passenger data

The passenger data is used here. if the seassonal effect for the air passenger data appears to increase with the trend, which it is, this suggests that a “multiplicative” seasonal component be used in the Holt-Winters procedure. The Holt-Winters procedure fit is impressive - see below the figure. The predict function in R can be used with the fitted model to make forecasts into the future.

data("AirPassengers")
AP <- AirPassengers

ap.hw <- HoltWinters(AP, seasonal = "mult")

plot(ap.hw)

ap.predict <- predict(ap.hw, n.head = 4*12)

ts.plot(AP, ap.predict, lty = 1:2)

The estimates of the model parameters are obtained bellow. It should be noted that the extrapolated forecasts are based entirely on the trends in the period during which the model was fitted and would be a sensible prediction assuming these trends continue. Whilst the extrapolation in the first figure above looks visually appropriate, unforeseen events could lead to completely different future sales than those shown here.

ap.hw$alpha

##     alpha 
## 0.2755925

ap.hw$beta

##       beta 
## 0.03269295

ap.hw$gamma

##     gamma 
## 0.8707292

Time Series - Forecasting Strategies

Gabriel Burcea

09/04/2018