Forecasting Fortified Wine sales using linear regression. Step 1 is to clear the workspace and load the libraries.
rm(list = ls())
setwd("C:\\Users\\Rich\\SkyDrive\\Documents\\Futurelearn\\")
require(forecast)
require(imputeTS)
The next step is to load the data.
The next step creates a training period which ends in December 1993.
AllStart <- c(1980,1)
AllEnd <- c(1994,12)
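# frequency = 12 marks the data as monthly; without it c(1994,12) is not read as December 1994
trimmed.ts <- ts(trimmed, start=AllStart, end=AllEnd, frequency=12)
n.valid <- 12
# nrow() counts months; length() on a multi-column series would count every cell
n.train <- nrow(trimmed.ts) - n.valid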
TrainStart <- c(1980,1)
TrainEnd <- c(1993,12)
Fortified.ts <- ts(trimmed$Fortified, start=AllStart, end=AllEnd, frequency =12)
TrainFortified.ts <- window(Fortified.ts, start=TrainStart, end=TrainEnd)
# Create the validation time periods
# This time I created variables for the time series partitions
valstart <- c(1994,1)
valend <- c(1994,12)
ValFortified.ts <- window(Fortified.ts, start=valstart, end=valend)
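As a quick sanity check on the partitions, the sketch below rebuilds the same windows on a made-up monthly series and confirms that the training and validation periods together cover every observation. The values in `demo.ts` are synthetic placeholders, not the wine data, and the `demo.*` names are only for illustration.

```r
# Synthetic stand-in for the Fortified series: 15 years of monthly values.
# The numbers are placeholders, not the real sales data.
set.seed(1)
demo.ts <- ts(rnorm(180, mean = 3000, sd = 500),
              start = c(1980, 1), end = c(1994, 12), frequency = 12)

# Same partition boundaries as above
demo.train <- window(demo.ts, start = c(1980, 1), end = c(1993, 12))
demo.valid <- window(demo.ts, start = c(1994, 1), end = c(1994, 12))

length(demo.train)  # 14 years x 12 months of training data
length(demo.valid)  # the 12 months held back for validation

# The two partitions should account for every observation
stopifnot(length(demo.train) + length(demo.valid) == length(demo.ts))
```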
The next step is to fit a regression model to the sales of fortified wine. We have to choose which predictors to include.
Let’s take a quick look at the data first.
plot(Fortified.ts)
So, eyeballing the data, we can see strong seasonality and a trend as well. The model below therefore includes both as predictors.
Fortified.Fit <- tslm(TrainFortified.ts ~ trend + season)
summary(Fortified.Fit)
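To make the two predictors concrete: as I understand `tslm`, `trend` is a simple time index (1, 2, 3, ...) and `season` is a factor with one level per month. The sketch below builds those two columns by hand on a synthetic series and fits the equivalent model with base R's `lm`. The series itself and the names `demo.ts`, `t.index` and `month` are assumptions for illustration only.

```r
# Synthetic monthly series with a downward trend and a yearly pattern
set.seed(42)
n <- 168
demo.ts <- ts(4000 - 10 * (1:n) + 500 * sin(2 * pi * (1:n) / 12) + rnorm(n, sd = 50),
              start = c(1980, 1), frequency = 12)

t.index <- seq_along(demo.ts)       # what 'trend' stands for
month   <- factor(cycle(demo.ts))   # what 'season' stands for: 12 levels

# Linear trend plus one dummy per month (the first month is the baseline)
demo.fit <- lm(as.numeric(demo.ts) ~ t.index + month)

coef(demo.fit)["t.index"]   # estimated change per month
length(coef(demo.fit))      # intercept + trend + 11 month dummies
```

Because one month acts as the baseline, each seasonal coefficient reads as the average difference between that month and the baseline month, after allowing for the trend.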
Next, let’s forecast 12 periods ahead from the fitted model, which covers the validation period.
Fortified.Fit.Forecast <- forecast(Fortified.Fit, h=12)
Now let’s plot the forecast and overlay the actuals from the validation period, so we can see the difference.
plot(Fortified.Fit.Forecast)
lines(ValFortified.ts)
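The plot shows the gap visually; a few error measures put a number on it. The sketch below is a small hand-written helper (`forecast_errors` is a hypothetical name, not part of any package) that computes the mean error, RMSE and MAPE from a vector of actuals and a vector of forecasts. The example numbers are made up purely to show the call.

```r
# Hypothetical helper: error measures for a forecast against actuals
forecast_errors <- function(actual, predicted) {
  actual    <- as.numeric(actual)
  predicted <- as.numeric(predicted)
  err <- actual - predicted
  c(ME   = mean(err),                      # positive means under-forecasting
    RMSE = sqrt(mean(err^2)),              # penalises large misses
    MAPE = mean(abs(err / actual)) * 100)  # average percentage error
}

# Made-up numbers, only to demonstrate the function
actual    <- c(100, 200, 300)
predicted <- c(110, 190, 300)
res <- forecast_errors(actual, predicted)
res
```

With the objects created above, the call would be `forecast_errors(ValFortified.ts, Fortified.Fit.Forecast$mean)`. The forecast package's own `accuracy()` function reports similar measures when given the forecast object and the actuals.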
And finally, a two-step forecast beyond the end of the data. Note that this model is refit on the full series, so it uses both the training and validation periods.
Fortified.Fit.Full <- tslm(Fortified.ts ~ trend + season)
FullForecast <- forecast(Fortified.Fit.Full, h = 2)
plot(FullForecast)
This under-forecast, and even though I was way over on time on this one, I decided to go back and apply a different linear model, this time adding a quadratic trend term.
Fortified.Fit.Full <- tslm(Fortified.ts ~ trend + I(trend^2) + season)
FullForecast <- forecast(Fortified.Fit.Full, h = 2)
plot(FullForecast)
FullForecast
Still not great, so how about transforming the data? Setting lambda = 0 applies a log transformation before the model is fit.
Fortified.Fit.Full <- tslm(Fortified.ts ~ trend + I(trend^2) + season, lambda =0)
FullForecast <- forecast(Fortified.Fit.Full, h = 2)
dev.off()
FullForecast
plot(FullForecast)
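A note on `lambda = 0`: in the Box-Cox family of transformations, lambda = 0 is the natural log, so the model above is fit to log(sales) and the forecasts are converted back to the original scale. The sketch below writes out the standard Box-Cox formula and its inverse by hand to show that round trip. The function names `box_cox` and `inv_box_cox` are my own, not from the forecast package, and the values in `y` are made up.

```r
# Standard Box-Cox transformation; lambda = 0 is defined as the natural log
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Inverse transformation, used to get back to the original scale
inv_box_cox <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}

y <- c(1500, 2500, 4000)       # made-up sales figures
z <- box_cox(y, lambda = 0)

all.equal(z, log(y))                       # lambda = 0 matches log()
all.equal(inv_box_cox(z, lambda = 0), y)   # back-transform recovers y
```

Modelling on the log scale means the trend and seasonal effects act multiplicatively on the original scale, which tends to suit series whose seasonal swings shrink or grow with the level.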