What you will learn
Time series data
Forecasting is estimating how the sequence of observations will continue into the future.
Creating time series objects in R
A time series can be thought of as a vector or matrix of numbers along with some information about what times those numbers were recorded. This information is stored in a ts
object in R. In most exercises, you will use time series that are part of existing packages. However, if you want to work with your own data, you need to know how to create a ts
object in R.
Let’s look at an example, usnim_2002, shown below, containing net interest margins for US banks for the year 2002 (source: FFIEC).
> usnim_2002
usnim
1 2002-01-01 4.08
2 2002-04-01 4.10
3 2002-07-01 4.06
4 2002-10-01 4.04
> # ts(data, start, frequency, ...)
> usnim_ts = ts(usnim_2002[, 2], start = c(2002, 1), frequency = 4)
The function ts() takes in three arguments:
- data, set to everything in usnim_2002 except for the date column; it isn’t needed since the ts object will store time information separately.
- start, set to the form c(year, period) to indicate the time of the first observation. Here, January corresponds with period 1; likewise, a start date in April would refer to 2, July to 3, and October to 4. Thus, period corresponds to the quarter of the year.
- frequency, set to 4 because the data are quarterly.

In this exercise, you will read in some time series data from an xlsx file using read_excel(), a function from the readxl package, and store the data as a ts object. Both the xlsx file and package have been loaded into your workspace.
# Read the data from Excel into R
mydata <- read_excel("exercise1.xlsx")
# Look at the first few lines of mydata
head(mydata)
Sales AdBudget GDP
1 Mar-81 1020.2 659.2 251.8
2 Jun-81 889.2 589.0 290.9
3 Sep-81 795.0 512.5 290.8
4 Dec-81 1003.9 614.1 292.4
5 Mar-82 1057.7 647.2 279.1
6 Jun-82 944.4 602.0 254.0
# Create a ts object called myts
myts <- ts(mydata[, 2:4], start = c(1981, 1), frequency = 4)
You have now created a time series object from external data.
Time series plots
The first step in any data analysis task is to plot the data. Graphs enable you to visualize many features of the data, including patterns, unusual observations, changes over time, and relationships between variables. Just as the type of data determines which forecasting method to use, it also determines which graphs are appropriate.
You can use the autoplot()
function to produce a time plot of the data with or without facets, or panels that display different subsets of data:
> autoplot(usnim_2002, facets = FALSE)
The above method is one of many taught in this course that accept boolean arguments. Both T and TRUE mean “true”, and F and FALSE mean “false”; however, T and F can be overwritten in your code. Therefore, you should rely only on TRUE and FALSE to set your indicators for the remainder of the course.
You will use two more functions in this exercise, which.max()
and frequency()
. which.max()
can be used to identify the smallest index of the maximum value:
> x <- c(4, 5, 5)
> which.max(x)
[1] 2
To find the number of observations per unit time, use frequency()
. Recall the usnim_2002
data from the previous exercise:
> frequency(usnim_2002)
[1] 4
Because this course involves the use of the forecast
and ggplot2
packages, they have been loaded into your workspace for you, as well as myts
from the previous exercise and the following three series (available in the package forecast
):
- gold, containing gold prices in US dollars
- woolyrnq, containing information on the production of woollen yarn in Australia
- gas, containing Australian gas production
# Plot the data with facetting
autoplot(myts, facets = TRUE)
# Plot the data without facetting
autoplot(myts, facets = FALSE)
library(pacman)
p_load(forecast, ggplot2)
# Plot the three series
ggplot2::autoplot(gold)
autoplot(woolyrnq)
autoplot(gas)
# Find the outlier in the gold series
goldoutlier <- which.max(gold)
# Look at the seasonal frequencies of the three series
frequency(gold)
## [1] 1
frequency(woolyrnq)
## [1] 4
frequency(gas)
## [1] 12
You can now produce time series plots as well as identify seasonal frequencies.
Seasonal plots
Along with time plots, there are other useful ways of plotting data to emphasize seasonal patterns and show changes in these patterns over time.
You can use the ggseasonplot() function the same way you do with autoplot(). To produce a seasonal plot in polar coordinates, add the polar argument and set it to TRUE.
One way of splitting a time series is by using the window() function, which extracts a subset from the object x observed between the times start and end:
> window(x, start = NULL, end = NULL)
In this exercise, you will load the fpp2 package and use two of its datasets:
- a10, which contains monthly sales volumes for anti-diabetic drugs in Australia. In the plots, can you see which month has the highest sales volume each year? What is unusual about the results in March and April 2008?
- ausbeer, which contains quarterly beer production for Australia. What is happening to the beer production in Quarter 4?

These examples will help you to visualize these plots and understand how they can be useful.
# Load the fpp2 package
p_load(fpp2)
# Create plots of the a10 data
autoplot(a10)
ggseasonplot(a10)
# Produce a polar coordinate season plot for the a10 data
ggseasonplot(a10, polar = TRUE)
# Restrict the ausbeer data to start in 1992
beer <- window(ausbeer, start = 1992)
# Make plots of the beer data
autoplot(beer)
ggsubseriesplot(beer)
You can now produce three different types of plots for studying seasonality.
Trends, seasonality, and cyclicity
Seasonal or cyclic?
Differences between seasonal and cyclic patterns:
The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.
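To see the contrast, compare a strongly seasonal series with a cyclic one (a minimal sketch; a10 comes from the fpp2 package used throughout this course, and lynx ships with base R):
# Seasonal: monthly drug sales, with peaks at predictable times each year
autoplot(a10)
# Cyclic: annual lynx trappings, with cycles of roughly 10 years and unpredictable timing
autoplot(lynx)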
Autocorrelation of non-seasonal time series
Another way to look at time series data is to plot each observation against another observation that occurred some time previously by using gglagplot()
. For example, you could plot \(y_t\) against \(y_{t−1}\). This is called a lag plot because you are plotting the time series against lags of itself.
The correlations associated with the lag plots form what is called the autocorrelation function (ACF). The ggAcf()
function produces ACF plots.
In this exercise, you will work with the pre-loaded oil
data (available in the package fpp2
), which contains the annual oil production in Saudi Arabia from 1965-2013 (measured in millions of tons).
# Create an autoplot of the oil data
autoplot(fpp2::oil)
# Create a lag plot of the oil data
gglagplot(oil)
# Create an ACF plot of the oil data
ggAcf(oil)
You can now produce plots for lags and autocorrelations.
Autocorrelation of seasonal and cyclic time series
When data are either seasonal or cyclic, the ACF will peak around the seasonal lags or at the average cycle length.
You will investigate this phenomenon by plotting the annual sunspot series (which follows the solar cycle of approximately 10-11 years) in sunspot.year
and the daily traffic to the Hyndsight blog (which follows a 7-day weekly pattern) in hyndsight
. Both objects have been loaded into your workspace.
# Plots of annual sunspot numbers
autoplot(sunspot.year)
ggAcf(sunspot.year)
# Save the lag corresponding to maximum autocorrelation
maxlag_sunspot <- 1
# Plot the traffic on the Hyndsight blog
autoplot(hyndsight)
ggAcf(hyndsight)
# Save the lag corresponding to maximum autocorrelation
maxlag_hyndsight <- 7
You can now visualize seasonality in the ACF plots.
White noise
Ljung-Box Test
The Ljung-Box test considers the first h autocorrelation values together.
A significant test (small p-value) indicates the data are probably not white noise.
White noise summary
Stock prices and white noise
As you learned in the video, white noise is a term that describes purely random data. You can conduct a Ljung-Box test using the function below to confirm the randomness of a series; a p-value greater than 0.05 suggests that the data are not significantly different from white noise.
> Box.test(pigs, lag = 24, fitdf = 0, type = "Ljung")
There is a well-known result in economics called the “Efficient Market Hypothesis” that states that asset prices reflect all available information. A consequence of this is that the daily changes in stock prices should behave like white noise (ignoring dividends, interest rates and transaction costs). The consequence for forecasters is that the best forecast of the future price is the current price.
You can test this hypothesis by looking at the goog
series, which contains the closing stock price for Google over 1000 trading days ending on February 13, 2017. This data has been loaded into your workspace.
# Plot the original series
autoplot(goog)
# Plot the differenced series
autoplot(diff(goog))
# ACF of the differenced series
ggAcf(diff(goog))
# Ljung-Box test of the differenced series
Box.test(diff(goog), lag = 10, type = "Ljung")
##
## Box-Ljung test
##
## data: diff(goog)
## X-squared = 13.123, df = 10, p-value = 0.2169
You have seen that the Ljung-Box test was not significant, so daily changes in the Google stock price look like white noise.
Forecasts and potential futures
Naive forecasting methods
As you learned in the video, a forecast is the mean or median of simulated futures of a time series.
The very simplest forecasting method is to use the most recent observation; this is called a naive forecast and is implemented in the namesake naive() function. This is the best that can be done for many time series, including most stock price data, and even when it is not a good forecasting method, it provides a useful benchmark for other forecasting methods.
For seasonal data, a related idea is to use the corresponding season from the last year of data. For example, if you want to forecast the sales volume for next March, you would use the sales volume from the previous March. This is implemented in the snaive()
function, meaning, seasonal naive.
For both forecasting methods, you can set the second argument h
, which specifies the number of values you want to forecast; as shown in the code below, they have different default values. The resulting output is an object of class forecast
. This is the core class of objects in the forecast
package, and there are many functions for dealing with them including summary()
and autoplot()
.
naive(y, h = 10)
snaive(y, h = 2 * frequency(x))
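In equation form (standard definitions from the forecasting literature, with m the seasonal period and k the integer part of \((h-1)/m\)):
\[\hat{y}_{T+h|T} = y_T \quad \text{(naive)}\]
\[\hat{y}_{T+h|T} = y_{T+h-m(k+1)} \quad \text{(seasonal naive)}\]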
You will try these two functions on the goog
series and ausbeer
series, respectively. These are available to use in your workspace.
# Use naive() to forecast the goog series
fcgoog <- forecast::naive(goog, h = 20)
# Plot and summarize the forecasts
ggplot2::autoplot(fcgoog)
summary(fcgoog)
##
## Forecast method: Naive method
##
## Model Information:
## Call: forecast::naive(y = goog, h = 20)
##
## Residual sd: 8.7343
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.4212612 8.734286 5.829407 0.06253998 0.9741428 1 0.03871446
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1001 813.67 802.4765 824.8634 796.5511 830.7889
## 1002 813.67 797.8401 829.4999 789.4602 837.8797
## 1003 813.67 794.2824 833.0576 784.0192 843.3208
## 1004 813.67 791.2831 836.0569 779.4322 847.9078
## 1005 813.67 788.6407 838.6993 775.3910 851.9490
## 1006 813.67 786.2518 841.0882 771.7374 855.6025
## 1007 813.67 784.0549 843.2850 768.3777 858.9623
## 1008 813.67 782.0102 845.3298 765.2505 862.0895
## 1009 813.67 780.0897 847.2503 762.3133 865.0266
## 1010 813.67 778.2732 849.0667 759.5353 867.8047
## 1011 813.67 776.5456 850.7944 756.8931 870.4469
## 1012 813.67 774.8948 852.4452 754.3684 872.9715
## 1013 813.67 773.3115 854.0285 751.9470 875.3930
## 1014 813.67 771.7880 855.5520 749.6170 877.7230
## 1015 813.67 770.3180 857.0220 747.3688 879.9711
## 1016 813.67 768.8962 858.4437 745.1944 882.1455
## 1017 813.67 767.5183 859.8217 743.0870 884.2530
## 1018 813.67 766.1802 861.1597 741.0407 886.2993
## 1019 813.67 764.8789 862.4610 739.0505 888.2895
## 1020 813.67 763.6114 863.7286 737.1120 890.2280
# Use snaive() to forecast the ausbeer series
fcbeer <- forecast::snaive(ausbeer, h = 16)
# Plot and summarize the forecasts
ggplot2::autoplot(fcbeer)
summary(fcbeer)
##
## Forecast method: Seasonal naive method
##
## Model Information:
## Call: forecast::snaive(y = ausbeer, h = 16)
##
## Residual sd: 19.3259
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 3.098131 19.32591 15.50935 0.838741 3.69567 1 0.01093868
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2010 Q3 419 394.2329 443.7671 381.1219 456.8781
## 2010 Q4 488 463.2329 512.7671 450.1219 525.8781
## 2011 Q1 414 389.2329 438.7671 376.1219 451.8781
## 2011 Q2 374 349.2329 398.7671 336.1219 411.8781
## 2011 Q3 419 383.9740 454.0260 365.4323 472.5677
## 2011 Q4 488 452.9740 523.0260 434.4323 541.5677
## 2012 Q1 414 378.9740 449.0260 360.4323 467.5677
## 2012 Q2 374 338.9740 409.0260 320.4323 427.5677
## 2012 Q3 419 376.1020 461.8980 353.3932 484.6068
## 2012 Q4 488 445.1020 530.8980 422.3932 553.6068
## 2013 Q1 414 371.1020 456.8980 348.3932 479.6068
## 2013 Q2 374 331.1020 416.8980 308.3932 439.6068
## 2013 Q3 419 369.4657 468.5343 343.2438 494.7562
## 2013 Q4 488 438.4657 537.5343 412.2438 563.7562
## 2014 Q1 414 364.4657 463.5343 338.2438 489.7562
## 2014 Q2 374 324.4657 423.5343 298.2438 449.7562
You have naively forecasted non-seasonal and seasonal time series!
Fitted values and residuals
A fitted value is the forecast of an observation using all previous observations.
A residual is the difference between an observation and its fitted value.
Residuals should look like white noise.
Essential assumptions: the residuals are uncorrelated and have mean zero.
Useful properties (for computing prediction intervals): the residuals have constant variance and are normally distributed.
We can test these assumptions using the checkresiduals()
function.
Checking time series residuals
When applying a forecasting method, it is important to always check that the residuals are well-behaved (i.e., no outliers or patterns) and resemble white noise. The prediction intervals are computed assuming that the residuals are also normally distributed. You can use the checkresiduals()
function to verify these characteristics; it will give the results of a Ljung-Box test.
You haven’t used the pipe function (%>%
) so far, but this is a good opportunity to introduce it. When there are many nested functions, pipe functions make the code much easier to read. To be consistent, always follow a function with parentheses to differentiate it from other objects, even if it has no arguments. An example is below:
> function(foo) # These two
> foo %>% function() # are the same!
> foo %>% function # Inconsistent
In this exercise, you will test the above functions on the forecasts equivalent to what you produced in the previous exercise (fcgoog
obtained after applying naive()
to goog
, and fcbeer
obtained after applying snaive()
to ausbeer
).
# Check the residuals from the naive forecasts applied to the goog series
goog %>% naive() %>% checkresiduals()
##
## Ljung-Box test
##
## data: Residuals from Naive method
## Q* = 13.123, df = 10, p-value = 0.2169
##
## Model df: 0. Total lags used: 10
# Do they look like white noise (TRUE or FALSE)
googwn <- TRUE
# Check the residuals from the seasonal naive forecasts applied to the ausbeer series
ausbeer %>% snaive() %>% checkresiduals()
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 60.535, df = 8, p-value = 3.661e-10
##
## Model df: 0. Total lags used: 8
# Do they look like white noise (TRUE or FALSE)
beerwn <- FALSE
You have now checked the residuals of naive forecasts for both non-seasonal and seasonal time series.
Training and test sets
Forecast errors
Forecast “error” = the difference between an observed value and its forecast in the test set.
Forecast errors \(\neq\) residuals: residuals are computed on the training set, forecast errors on the test set.
Compute accuracy using forecast errors on test data.
MAE and MSE cannot be used to compare different series because they are scale-dependent.
MAPE and MASE can be used to compare different series; however, MAPE assumes a natural zero point for the data and no negative values.
The accuracy() command
accuracy(fc, test)
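For reference, the main accuracy measures are defined as follows (standard definitions, where \(e_t\) is a forecast error on the test set); MASE scales the MAE by the MAE of naive forecasts on the training data, so values below 1 beat the naive benchmark:
\[\text{MAE} = \text{mean}(|e_t|), \quad \text{MSE} = \text{mean}(e_t^2), \quad \text{RMSE} = \sqrt{\text{MSE}}, \quad \text{MAPE} = 100 \, \text{mean}\!\left(\left|\tfrac{e_t}{y_t}\right|\right)\]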
Evaluating forecast accuracy of non-seasonal methods
In data science, a training set is a data set that is used to discover possible relationships. A test set is a data set that is used to verify the strength of these potential relationships. When you separate a data set into these parts, you generally allocate more of the data for training, and less for testing.
One function that can be used to create training and test sets is subset.ts()
, which returns a subset of a time series where the optional start and end arguments are specified using index values.
> # x is a numerical vector or time series
> # To subset observations from 101 to 500
> train <- subset(x, start = 101, end = 500, ...)
> # To subset the first 500 observations
> train <- subset(x, end = 500, ...)
As you saw in the video, another function, accuracy()
, computes various forecast accuracy statistics given the forecasts and the corresponding actual observations. It is smart enough to find the relevant observations if you give it more than the ones you are forecasting.
> # f is an object of class "forecast"
> # x is a numerical vector or time series
> accuracy(f, x, ...)
The accuracy measures provided include root mean squared error (RMSE) which is the square root of the mean squared error (MSE). Minimizing RMSE, which corresponds with increasing accuracy, is the same as minimizing MSE.
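A quick numerical illustration with made-up forecast errors (hypothetical values, only to show the RMSE/MSE relationship):
e <- c(2, -1, 3)   # hypothetical forecast errors
mean(e^2)          # MSE: 4.666667
sqrt(mean(e^2))    # RMSE: 2.160247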
The pre-loaded time series gold
comprises daily gold prices for 1108 days. Here, you’ll use the first 1000 days as a training set, and compute forecasts for the remaining 108 days. These will be compared to the actual values for these days using the simple forecasting functions naive()
, which you used earlier in this chapter, and meanf()
, which gives forecasts equal to the mean of all observations. You’ll have to specify the keyword h
(which specifies the number of values you want to forecast) for both.
# Create the training data as train
train <- subset(gold, end = 1000)
# Compute naive forecasts and save to naive_fc
naive_fc <- naive(train, h = 108)
# Compute mean forecasts and save to mean_fc
mean_fc <- meanf(train, h = 108)
# Use accuracy() to compute RMSE statistics
accuracy(naive_fc, gold)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1079897 6.358087 3.20366 0.0201449 0.8050646 1.014334
## Test set -6.5383495 15.842361 13.63835 -1.7462269 3.4287888 4.318139
## ACF1 Theil's U
## Training set -0.3086638 NA
## Test set 0.9793153 5.335899
accuracy(mean_fc, gold)
## ME RMSE MAE MPE MAPE MASE
## Training set -4.239671e-15 59.17809 53.63397 -2.390227 14.230224 16.981449
## Test set 1.319363e+01 19.55255 15.66875 3.138577 3.783133 4.960998
## ACF1 Theil's U
## Training set 0.9907254 NA
## Test set 0.9793153 6.123788
# Assign one of the two forecasts as bestforecasts
bestforecasts <- naive_fc
You have compared two types of forecasts based on their RMSE.
Evaluating forecast accuracy of seasonal methods
As you learned in the first chapter, the window()
function specifies the start
and end
of a time series using the relevant times rather than the index values. Either of those two arguments can be formatted as a vector like c(year, period)
which you have also previously used as an argument for ts()
. Again, period refers to quarter here.
Here, you will use the Melbourne quarterly visitor numbers (vn[, "Melbourne"]
) to create three different training sets, omitting the last 1, 2 and 3 years, respectively. Inspect the pre-loaded vn
data in your console before beginning the exercise; this will help you determine the correct value to use for the keyword h
(which specifies the number of values you want to forecast) in your forecasting methods.
Then for each training set, forecast the next year of data, and finally compare the mean absolute percentage error (MAPE) of the forecasts using accuracy(). Why do you think the MAPE values vary so much?
# Create three training series omitting the last 1, 2, and 3 years
train1 <- window(vn[, "Melbourne"], end = c(2014, 4))
train2 <- window(vn[, "Melbourne"], end = c(2013, 4))
train3 <- window(vn[, "Melbourne"], end = c(2012, 4))
# Produce forecasts using snaive()
fc1 <- snaive(train1, h = 4)
fc2 <- snaive(train2, h = 4)
fc3 <- snaive(train3, h = 4)
# Use accuracy() to compare the MAPE of each series
accuracy(fc1, vn[, "Melbourne"])["Test set", "MAPE"]
## [1] 5.477917
accuracy(fc2, vn[, "Melbourne"])["Test set", "MAPE"]
## [1] 12.50071
accuracy(fc3, vn[, "Melbourne"])["Test set", "MAPE"]
## [1] 7.950202
Seasonal accuracy is just as easy to compare.
Time series cross-validation
tsCV function
MSE using time series cross-validation
Using tsCV() for time series cross-validation
The tsCV()
function computes time series cross-validation errors. It requires you to specify the time series, the forecast method, and the forecast horizon. Here is the example used in the video:
> e = tsCV(oil, forecastfunction = naive, h = 1)
Here, you will use tsCV()
to compute and plot the MSE values for up to 8 steps ahead, along with the naive()
method applied to the goog
data. The exercise uses ggplot2
graphics which you may not be familiar with, but we have provided enough of the code so you can work out the rest.
Be sure to reference the slides on tsCV()
in the lecture. The goog
data has been loaded into your workspace.
# Compute cross-validated errors for up to 8 steps ahead
e <- tsCV(goog, forecastfunction = naive, h = 8)
# Compute the MSE values and remove missing values
mse <- colMeans(e^2, na.rm = TRUE)
# Plot the MSE values against the forecast horizon
data.frame(h = 1:8, MSE = mse) %>%
ggplot(aes(x = h, y = MSE)) + geom_point()
Visualizations are a great way to get a deeper understanding of your model.
Exponentially weighted forecasts
Simple exponential smoothing
Forecasting notation: \(\hat{y}_{t+h|t}\) = point forecast of \(y_{t+h}\) given data \(y_1, \dots, y_t\)
The smoothing parameter \(\alpha\) controls the weights on past observations: a higher \(\alpha\) places more weight on recent data, with weights decaying faster over time, while a lower \(\alpha\) places less weight on recent data, with weights decaying more slowly.
The ses() function reports two estimated parameters: the initial level \(\ell_0\) (labelled l in the output) and the smoothing parameter alpha.
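Written out, the SES forecast is a weighted average of past observations with geometrically decaying weights (standard formulation, included here for reference):
\[\hat{y}_{t+1|t} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots, \qquad 0 \le \alpha \le 1\]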
# Use ses() to forecast the next 10 years of winning times
fc <- ses(marathon, h = 10)
# Use summary() to see the model parameters
summary(fc)
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = marathon, h = 10)
##
## Smoothing parameters:
## alpha = 0.3457
##
## Initial states:
## l = 167.1741
##
## sigma: 5.519
##
## AIC AICc BIC
## 988.4474 988.6543 996.8099
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.8874349 5.472771 3.826294 -0.7097395 2.637644 0.8925685
## ACF1
## Training set -0.01211236
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2017 130.3563 123.2835 137.4292 119.5394 141.1733
## 2018 130.3563 122.8727 137.8399 118.9111 141.8015
## 2019 130.3563 122.4833 138.2293 118.3156 142.3970
## 2020 130.3563 122.1123 138.6003 117.7482 142.9644
## 2021 130.3563 121.7573 138.9553 117.2053 143.5074
## 2022 130.3563 121.4164 139.2963 116.6839 144.0288
## 2023 130.3563 121.0880 139.6247 116.1816 144.5310
## 2024 130.3563 120.7708 139.9418 115.6966 145.0161
## 2025 130.3563 120.4639 140.2488 115.2271 145.4856
## 2026 130.3563 120.1661 140.5466 114.7717 145.9409
# Use autoplot() to plot the forecasts
autoplot(fc)
# Add the one-step forecasts for the training data to the plot
autoplot(fc) + autolayer(fitted(fc))
You just forecasted using simple exponential smoothing.
SES vs naive
In this exercise, you will apply your knowledge of training and test sets, the subset()
function, and the accuracy()
function, all of which you learned in Chapter 2, to compare SES and naive forecasts for the marathon
data.
You did something very similar to compare the naive and mean forecasts in an earlier exercise “Evaluating forecast accuracy of non-seasonal methods”.
Let’s review the process:
- Create a training set using the subset() function.
- Compute forecasts with h equal to the number of values you want to forecast, which is also the length of the test set.
- Call the accuracy() function with the forecast as the first argument and the original data (or test set) as the second.

The marathon data is loaded into your workspace.
# Create a training set using subset()
train <- subset(marathon, end = length(marathon) - 20)
# Compute SES and naive forecasts, save to fcses and fcnaive
fcses <- ses(train, h = 20)
fcnaive <- naive(train, h = 20)
# Calculate forecast accuracy measures
accuracy(fcses, marathon)
## ME RMSE MAE MPE MAPE MASE
## Training set -1.0851741 5.863790 4.155948 -0.8603998 2.827993 0.8990906
## Test set 0.4574579 2.493971 1.894237 0.3171919 1.463862 0.4097960
## ACF1 Theil's U
## Training set -0.01595953 NA
## Test set -0.12556096 0.6870735
accuracy(fcnaive, marathon)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.4638047 6.904742 4.622391 -0.4086317 3.123559 1.0000000
## Test set 0.2266667 2.462113 1.846667 0.1388780 1.429608 0.3995047
## ACF1 Theil's U
## Training set -0.3589323 NA
## Test set -0.1255610 0.6799062
# Save the best forecasts as fcbest
fcbest <- fcnaive
More complex models aren’t always better!
Exponential smoothing methods with trend
Holt’s method in R
holt()
Damped trend method
holt(damped = TRUE)
Holt’s trend methods
Holt’s local trend method is implemented in the holt()
function:
> holt(y, h = 10, ...)
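For reference, Holt's linear trend method extends SES with a level and a trend equation (standard formulation; \(\ell_t\) is the level, \(b_t\) the slope, and \(\beta^*\) the trend smoothing parameter):
\[\hat{y}_{t+h|t} = \ell_t + h b_t\]
\[\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1})\]
\[b_t = \beta^*(\ell_t - \ell_{t-1}) + (1-\beta^*) b_{t-1}\]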
Here, you will apply it to the austa
series, which contains annual counts of international visitors to Australia from 1980-2015 (in millions). The data has been pre-loaded into your workspace.
# Produce 10 year forecasts of austa using holt()
fcholt <- holt(austa, h = 10)
# Look at fitted model using summary()
summary(fcholt)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = austa, h = 10)
##
## Smoothing parameters:
## alpha = 0.9999
## beta = 0.0085
##
## Initial states:
## l = 0.656
## b = 0.1706
##
## sigma: 0.1952
##
## AIC AICc BIC
## 17.14959 19.14959 25.06719
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.00372838 0.1840662 0.1611085 -1.222083 5.990319 0.7907078
## ACF1
## Training set 0.2457733
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 7.030683 6.780483 7.280882 6.648036 7.413330
## 2017 7.202446 6.847114 7.557778 6.659013 7.745879
## 2018 7.374209 6.937169 7.811249 6.705814 8.042604
## 2019 7.545972 7.039179 8.052765 6.770899 8.321045
## 2020 7.717736 7.148723 8.286748 6.847506 8.587965
## 2021 7.889499 7.263543 8.515455 6.932181 8.846816
## 2022 8.061262 7.382302 8.740222 7.022882 9.099642
## 2023 8.233025 7.504136 8.961915 7.118285 9.347766
## 2024 8.404788 7.628444 9.181133 7.217472 9.592105
## 2025 8.576552 7.754792 9.398311 7.319779 9.833324
# Plot the forecasts
autoplot(fcholt)
# Check that the residuals look like white noise
checkresiduals(fcholt)
##
## Ljung-Box test
##
## data: Residuals from Holt's method
## Q* = 4.8886, df = 3, p-value = 0.1801
##
## Model df: 4. Total lags used: 7
Do you think Holt’s trend model gives a good forecast?
Exponential smoothing methods with trend and seasonality
Taxonomy of exponential smoothing methods
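The slide content is not reproduced here, but the standard taxonomy crosses a trend component with a seasonal component; the named combinations are (reconstructed from the textbook taxonomy, not from the original slide):

| Trend \ Seasonal | N (none) | A (additive) | M (multiplicative) |
|---|---|---|---|
| N (none) | simple exponential smoothing | | |
| A (additive) | Holt's linear method | additive Holt-Winters | multiplicative Holt-Winters |
| Ad (additive damped) | damped trend method | | Holt-Winters damped method |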
Holt-Winters with monthly data
In the video, you learned that the hw()
function produces forecasts using the Holt-Winters method specific to whatever you set equal to the seasonal
argument:
fc1 <- hw(aust, seasonal = "additive")
fc2 <- hw(aust, seasonal = "multiplicative")
Here, you will apply hw()
to a10
, the monthly sales of anti-diabetic drugs in Australia from 1991 to 2008. The data are available in your workspace.
# Plot the data
autoplot(a10)
# Produce 3 year forecasts
fc <- hw(a10, seasonal = "multiplicative", h = 36)
# Check if residuals look like white noise
checkresiduals(fc)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 75.764, df = 8, p-value = 3.467e-13
##
## Model df: 16. Total lags used: 24
whitenoise <- FALSE
# Plot forecasts
autoplot(fc)
The forecasts might still provide useful information even with residuals that fail the white noise test.
Holt-Winters method with daily data
The Holt-Winters method can also be used for daily data, where the seasonal pattern is of length 7 and the appropriate unit of time for h is days.
Here, you will compare an additive Holt-Winters method and a seasonal naive()
method for the hyndsight
data, which contains the daily pageviews on the Hyndsight blog for one year starting April 30, 2014. The data are available in your workspace.
# Create training data with subset()
train <- subset(hyndsight, end = length(hyndsight) - 28)
# Holt-Winters additive forecasts as fchw
fchw <- hw(train, seasonal = "additive", h = 28)
# Seasonal naive forecasts as fcsn
fcsn <- snaive(train, h = 28)
# Find better forecasts with accuracy()
accuracy(fchw, hyndsight)
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -3.976241 228.2440 165.0244 -2.407211 13.9955 0.7492131 0.1900853
## Test set -3.999460 201.7656 152.9584 -3.218292 10.5558 0.6944332 0.3013328
## Theil's U
## Training set NA
## Test set 0.4868701
accuracy(fcsn, hyndsight)
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 10.50 310.3282 220.2636 -2.1239387 18.01077 1.0000000 0.4255730
## Test set 0.25 202.7610 160.4643 -0.6888732 10.25880 0.7285101 0.3089795
## Theil's U
## Training set NA
## Test set 0.450266
# Plot the better forecasts
autoplot(fchw)
Could we do better than both of these models?
State space models for exponential smoothing
ETS models
ets()
Automatic forecasting with exponential smoothing
The ets() function, whose name stands for errors, trend, and seasonality (ETS), provides a completely automatic way of producing forecasts for a wide range of time series.
You will now test it on two series, austa
and hyndsight
, that you have previously looked at in this chapter. Both have been pre-loaded into your workspace.
# Fit ETS model to austa in fitaus
fitaus <- ets(austa)
# Check residuals
checkresiduals(fitaus)
##
## Ljung-Box test
##
## data: Residuals from ETS(A,A,N)
## Q* = 4.8886, df = 3, p-value = 0.1801
##
## Model df: 4. Total lags used: 7
# Plot forecasts
autoplot(forecast(fitaus))
# Repeat for hyndsight data in fiths
fiths <- ets(hyndsight)
checkresiduals(fiths)
##
## Ljung-Box test
##
## data: Residuals from ETS(A,N,A)
## Q* = 68.616, df = 5, p-value = 1.988e-13
##
## Model df: 9. Total lags used: 14
autoplot(forecast(fiths))
# Which model(s) fails test? (TRUE or FALSE)
fitausfail <- FALSE
fithsfail <- TRUE
The automatic forecasting is great for getting quick results.
ETS vs seasonal naive
Here, you will compare ETS forecasts against seasonal naive forecasting for 20 years of cement
, which contains quarterly cement production using time series cross-validation for 4 steps ahead. Because this takes a while to run, a shortened version of the cement
series will be available in your workspace.
The second argument for tsCV()
must return a forecast object, so you need a function to fit a model and return forecasts. Recall:
> args(tsCV)
function (y, forecastfunction, h = 1, ...)
In this exercise you will use an existing forecasting function as well as one that has been created for you. Remember, sometimes simple methods work better than more sophisticated methods!
# Function to return ETS forecasts
fets <- function(y, h) {
forecast(ets(y), h = h)
}
# Apply tsCV() for both methods
e1 <- tsCV(cement, fets, h = 4)
e2 <- tsCV(cement, snaive, h = 4)
# Compute MSE of resulting errors (watch out for missing values)
mean(e1^2, na.rm = TRUE)
## [1] 0.0329896
mean(e2^2, na.rm = TRUE)
## [1] 0.03046892
# Copy the best forecast MSE
bestmse <- mean(e2^2, na.rm = TRUE)
Complex isn’t always better.
When does ETS fail?
ETS models do not work well for all series.
Here, you will observe why it does not work well for the annual Canadian lynx population available in your workspace as lynx
.
# Plot the lynx series
autoplot(lynx)
# Use ets() to model the lynx series
fit <- ets(lynx)
# Use summary() to look at model and parameters
summary(fit)
## ETS(M,N,N)
##
## Call:
## ets(y = lynx)
##
## Smoothing parameters:
## alpha = 0.9999
##
## Initial states:
## l = 2372.8047
##
## sigma: 0.9594
##
## AIC AICc BIC
## 2058.138 2058.356 2066.346
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 8.975647 1198.452 842.0649 -52.12968 101.3686 1.013488 0.3677583
# Plot 20-year forecasts of the lynx series
fit %>% forecast(h = 20) %>% autoplot()
It’s important to realize that ETS doesn’t work for all cases.
Transformations for variance stabilization
Variance stabilization
BoxCox.lambda()
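For reference, the Box-Cox transformation is defined as (standard definition, which is what BoxCox() computes for positive data):
\[w_t = \begin{cases} \log(y_t) & \text{if } \lambda = 0 \\ \left(y_t^{\lambda} - 1\right)/\lambda & \text{otherwise} \end{cases}\]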
Box-Cox transformations for time series
Here, you will use a Box-Cox transformation to stabilize the variance of the pre-loaded a10
series, which contains monthly anti-diabetic drug sales in Australia from 1991-2008.
In this exercise, you will need to experiment to see the effect of the lambda
\((\lambda)\) argument on the transformation. Notice that small changes in \(\lambda\) make little difference to the resulting series. You want to find a value of \(\lambda\) that makes the seasonal fluctuations of roughly the same size across the series.
Recall from the video that the recommended range for lambda
values is \(−1 \leq \lambda \leq 1\).
# Plot the series
autoplot(a10)
# Try four values of lambda in Box-Cox transformations
a10 %>% BoxCox(lambda = 0.0) %>% autoplot()
a10 %>% BoxCox(lambda = 0.1) %>% autoplot()
a10 %>% BoxCox(lambda = 0.2) %>% autoplot()
a10 %>% BoxCox(lambda = 0.3) %>% autoplot()
# Compare with BoxCox.lambda()
BoxCox.lambda(a10)
## [1] 0.1313326
It seems like a lambda of 0.13 would work well.
Non-seasonal differencing for stationarity
Differencing is a way of making a time series stationary; this means that you remove any systematic patterns such as trend and seasonality from the data. A white noise series is considered a special case of a stationary time series.
With non-seasonal data, you use lag-1 differences to model changes between observations rather than the observations directly. You have done this before by using the diff()
function.
In this exercise, you will use the pre-loaded wmurders
data, which contains the annual female murder rate in the US from 1950-2004.
# Plot the US female murder rate
autoplot(wmurders)
# Plot the differenced murder rate
autoplot(diff(wmurders))
# Plot the ACF of the differenced murder rate
ggAcf(diff(wmurders))
It seems like the data look like white noise after differencing.
Seasonal differencing for stationarity
With seasonal data, differences are often taken between observations in the same season of consecutive years, rather than in consecutive periods. For example, with quarterly data, one would take the difference between Q1 in one year and Q1 in the previous year. This is called seasonal differencing.
Sometimes you need to apply both seasonal differences and lag-1 differences to the same series, thus, calculating the differences in the differences.
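In symbols, with seasonal period \(m\) (standard notation):
\[\nabla_m y_t = y_t - y_{t-m}, \qquad \nabla \nabla_m y_t = (y_t - y_{t-m}) - (y_{t-1} - y_{t-m-1})\]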
In this exercise, you will use differencing and transformations simultaneously to make a time series look stationary. The data set here is h02
, which contains 17 years of monthly corticosteroid drug sales in Australia. It has been loaded into your workspace.
# Plot the data
autoplot(h02)
# Take logs and seasonal differences of h02
difflogh02 <- diff(log(h02), lag = 12)
# Plot difflogh02
autoplot(difflogh02)
# Take another difference and plot
ddifflogh02 <- diff(difflogh02)
autoplot(ddifflogh02)
# Plot ACF of ddifflogh02
ggAcf(ddifflogh02)
The data don’t look like white noise after the transformation, but you could develop an ARIMA model for them.
ARIMA models
How does auto.arima() work?
Hyndman-Khandakar algorithm (the search that auto.arima() performs; summarized from the textbook description):
- Select the number of differences d using repeated KPSS unit root tests.
- Select p and q by minimizing the AICc over a stepwise search of candidate models.
- Estimate the remaining parameters by maximum likelihood.
Automatic ARIMA models for non-seasonal time series
In the video, you learned that the auto.arima()
function will select an appropriate autoregressive integrated moving average (ARIMA) model given a time series, just like the ets()
function does for ETS models. The summary()
function can provide some additional insights:
> # p = 2, d = 1, q = 2
> summary(fit)
Series: usnetelec
ARIMA(2,1,2) with drift
...
In this exercise, you will automatically choose an ARIMA model for the pre-loaded austa
series, which contains the annual number of international visitors to Australia from 1980-2015. You will then check the residuals (recall that a p-value greater than 0.05 indicates that the data resembles white noise) and produce some forecasts. Other than the modelling function, this is identical to what you did with ETS forecasting.
# Fit an automatic ARIMA model to the austa series
fit <- auto.arima(austa)
# Check that the residuals look like white noise
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
##
## Model df: 2. Total lags used: 7
residualsok <- TRUE
# Summarize the model
summary(fit)
## Series: austa
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## 0.3006 0.1735
## s.e. 0.1647 0.0390
##
## sigma^2 estimated as 0.03376: log likelihood=10.62
## AIC=-15.24 AICc=-14.46 BIC=-10.57
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0008313383 0.1759116 0.1520309 -1.069983 5.513269 0.7461559
## ACF1
## Training set -0.000571993
# Find the AICc value and the number of differences used
AICc <- -14.46
d <- 1
# Plot forecasts of fit
fit %>% forecast(h = 10) %>% autoplot()
It looks like the ARIMA model created a pretty good forecast for you.
Forecasting with ARIMA models
The automatic method in the previous exercise chose an ARIMA(0,1,1) with drift model for the austa
data, that is, \(y_t=c+y_{t−1}+ \theta e_{t−1}+e_t\). You will now experiment with various other ARIMA models for the data to see what difference it makes to the forecasts.
The Arima()
function can be used to select a specific ARIMA model. Its first argument, order
, is set to a vector that specifies the values of \(p\), \(d\) and \(q\). The second argument, include.constant
is a boolean that determines if the constant \(c\), or drift, should be included. Below is an example of a pipe function that would plot forecasts of usnetelec
from an ARIMA(2,1,2) model with drift:
> usnetelec %>%
Arima(order = c(2,1,2), include.constant = TRUE) %>%
forecast() %>%
autoplot()
In the examples here, watch for how the different models affect the forecasts and the prediction intervals. The austa
data is ready for you to use in your workspace.
# Plot forecasts from an ARIMA(0,1,1) model with no drift
austa %>% Arima(order = c(0,1,1), include.constant = FALSE) %>% forecast() %>% autoplot()
# Plot forecasts from an ARIMA(2,1,3) model with drift
austa %>% Arima(order = c(2,1,3), include.constant = TRUE) %>% forecast() %>% autoplot()
# Plot forecasts from an ARIMA(0,0,1) model with a constant
austa %>% Arima(order = c(0,0,1), include.constant = TRUE) %>% forecast() %>% autoplot()
# Plot forecasts from an ARIMA(0,2,1) model with no constant
austa %>% Arima(order = c(0,2,1), include.constant = FALSE) %>% forecast() %>% autoplot()
The model specification makes a big impact on the forecast!
Comparing auto.arima() and ets() on non-seasonal data
The AICc statistic is useful for selecting between models in the same class. For example, you can use it to select an ETS model or to select an ARIMA model. However, you cannot use it to compare ETS and ARIMA models because they are in different model classes.
Instead, you can use time series cross-validation to compare an ARIMA model and an ETS model on the austa
data. Because tsCV()
requires functions that return forecast objects, you will set up some simple functions that fit the models and return the forecasts. The arguments of tsCV()
are a time series, forecast function, and forecast horizon h
. Examine this code snippet from the second chapter:
e <- matrix(NA_real_, nrow = 1000, ncol = 8)
for (h in 1:8)
e[, h] <- tsCV(goog, naive, h = h)
...
Furthermore, recall that pipe operators in R take the value of whatever is on the left and pass it as an argument to whatever is on the right, step by step, from left to right. Here’s an example based on code you saw in an earlier chapter:
# Plot 20-year forecasts of the lynx series modeled by ets()
lynx %>% ets() %>% forecast(h = 20) %>% autoplot()
In this exercise, you will compare the MSE of two forecast functions applied to austa
, and plot forecasts of the function that computes the best forecasts. Once again, austa
has been loaded into your workspace.
# Set up forecast functions for ETS and ARIMA models
fets <- function(x, h) {
forecast(ets(x), h = h)
}
farima <- function(x, h) {
forecast(auto.arima(x), h = h)
}
# Compute CV errors for ETS on austa as e1
e1 <- tsCV(austa, fets, h = 1)
# Compute CV errors for ARIMA on austa as e2
e2 <- tsCV(austa, farima, h = 1)
# Find MSE of each model class
mean(e1^2, na.rm = TRUE)
## [1] 0.05623684
mean(e2^2, na.rm = TRUE)
## [1] 0.04336277
# Plot 10-year forecasts using the best model class
austa %>% farima(h = 10) %>% autoplot()
Now you know how to compare across model classes.
Seasonal ARIMA models
In the ARIMA(p, d, q) notation:
- p = number of autoregressive terms
- d = number of differences
- q = number of lagged forecast errors (moving average terms)
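As a concrete instance (standard textbook form, shown here for orientation), an ARIMA(1,1,1) model applied to the differenced series \(y'_t = y_t - y_{t-1}\) is:
\[y'_t = c + \phi_1 y'_{t-1} + \theta_1 e_{t-1} + e_t\]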
Automatic ARIMA models for seasonal time series
As you learned in the video, the auto.arima()
function also works with seasonal data. Note that setting lambda = 0
in the auto.arima()
function - applying a log transformation - means that the model will be fitted to the transformed data, and that the forecasts will be back-transformed onto the original scale.
After applying summary()
to this kind of fitted model, you may see something like the output below which corresponds with \((p,d,q)(P,D,Q)[m]:\)
ARIMA(0,1,4)(0,1,1)[12]
In this exercise, you will use these functions to model and forecast the pre-loaded h02
data, which contains monthly sales of corticosteroid drugs in Australia.
# Check that the logged h02 data have stable variance
h02 %>% log() %>% autoplot()
# Fit a seasonal ARIMA model to h02 with lambda = 0
fit <- auto.arima(h02, lambda = 0)
# Summarize the fitted model
summary(fit)
## Series: h02
## ARIMA(2,1,1)(0,1,2)[12]
## Box Cox transformation: lambda= 0
##
## Coefficients:
## ar1 ar2 ma1 sma1 sma2
## -1.1358 -0.5753 0.3683 -0.5318 -0.1817
## s.e. 0.1608 0.0965 0.1884 0.0838 0.0881
##
## sigma^2 estimated as 0.004278: log likelihood=248.25
## AIC=-484.51 AICc=-484.05 BIC=-465
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.003931805 0.0501571 0.03629816 -0.5323365 4.611253 0.5987988
## ACF1
## Training set -0.003740267
# Record the amount of lag-1 differencing and seasonal differencing used
d <- 1
D <- 1
# Plot 2-year forecasts
fit %>% forecast(h = 24) %>% autoplot()
auto.arima()
is flexible enough to even work with seasonal time series!
Exploring auto.arima() options
The auto.arima()
function needs to estimate a lot of different models, and various short-cuts are used to try to make the function as fast as possible. This can cause a model to be returned which does not actually have the smallest AICc value. To make auto.arima()
work harder to find a good model, add the optional argument stepwise = FALSE
to look at a much larger collection of models.
Here, you will try finding an ARIMA model for the pre-loaded euretail
data, which contains quarterly retail trade in the Euro area from 1996-2011. Inspect it in the console before beginning this exercise.
# Find an ARIMA model for euretail
fit1 <- auto.arima(euretail)
# Don't use a stepwise search
fit2 <- auto.arima(euretail, stepwise = FALSE)
# AICc of better model
AICc <- 68.39
# Compute 2-year forecasts from better model
fit2 %>% forecast(h = 8) %>% autoplot()
auto.arima()
has a wealth of options that are worth exploring.
Comparing auto.arima() and ets() on seasonal data
What happens when you want to create training and test sets for data that is more frequent than yearly? If needed, you can use a vector in the form c(year, period)
for the start
and/or end
keywords in the window()
function. You must also ensure that you’re using the appropriate values of h
in forecasting functions. Recall that h
should be equal to the length of the data that makes up your test set.
For example, if your data spans 15 years, your training set consists of the first 10 years, and you intend to forecast the last 5 years of data, you would use h = 12 * 5
not h = 5
because your test set would include 60 monthly observations. If instead your training set consists of the first 9.5 years and you want to forecast the last 5.5 years, you would use h = 66 to account for the extra 6 months.
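A minimal sketch of that arithmetic (hypothetical monthly series):
h1 <- 12 * 5       # forecast the last 5 years: 60 monthly observations
h2 <- 12 * 5 + 6   # forecast the last 5.5 years: 66 monthly observations
c(h1, h2)
## [1] 60 66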
In the final exercise for this chapter, you will compare seasonal ARIMA and ETS models applied to the quarterly cement production data qcement
. Because the series is very long, you can afford to use a training and test set rather than time series cross-validation. This is much faster.
The qcement
data is available to use in your workspace.
# Use 20 years of the qcement data beginning in 1988
train <- window(qcement, start = 1988, end = c(2007, 4))
# Fit an ARIMA and an ETS model to the training data
fit1 <- auto.arima(train)
fit2 <- ets(train)
# Check that both models have white noise residuals
checkresiduals(fit1)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,1)(2,1,1)[4] with drift
## Q* = 3.3058, df = 3, p-value = 0.3468
##
## Model df: 6. Total lags used: 9
checkresiduals(fit2)
##
## Ljung-Box test
##
## data: Residuals from ETS(M,N,M)
## Q* = 6.3457, df = 3, p-value = 0.09595
##
## Model df: 6. Total lags used: 9
# Produce forecasts for each model
fc1 <- forecast(fit1, h = 25)
fc2 <- forecast(fit2, h = 25)
# Use accuracy() to find best model based on RMSE
accuracy(fc1, qcement)
## ME RMSE MAE MPE MAPE MASE
## Training set -0.006205705 0.1001195 0.07988903 -0.6704455 4.372443 0.5458078
## Test set -0.158835253 0.1996098 0.16882205 -7.3332836 7.719241 1.1534049
## ACF1 Theil's U
## Training set -0.01133907 NA
## Test set 0.29170452 0.7282225
accuracy(fc2, qcement)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.01406512 0.1022079 0.07958478 0.4938163 4.371823 0.5437292
## Test set -0.13495515 0.1838791 0.15395141 -6.2508975 6.986077 1.0518075
## ACF1 Theil's U
## Training set -0.03346295 NA
## Test set 0.53438371 0.680556
bettermodel <- fit2
Looks like the ETS model did better here.
Dynamic regression
Forecasting sales allowing for advertising expenditure
Welcome to the last chapter of the course!
The auto.arima()
function will fit a dynamic regression model with ARIMA errors. The only change to how you used it previously is that you will now use the xreg
argument containing a matrix of regression variables. Here are some code snippets from the video:
> fit <- auto.arima(uschange[, "Consumption"],
xreg = uschange[, "Income"])
> # rep(x, times)
> fcast <- forecast(fit, xreg = rep(0.8, 8))
You can see that the data is set to the Consumption
column of uschange
, and the regression variable is the Income
column. Furthermore, the rep()
function in this case would replicate the value 0.8 exactly eight times for the matrix argument xreg
.
In this exercise, you will model sales data regressed against advertising expenditure, with an ARMA error to account for any serial correlation in the regression errors. The data are available in your workspace as advert
and comprise 24 months of sales and advertising expenditure for an automotive parts company. The plot shows sales vs advertising expenditure.
Think through everything you have learned so far in this course, inspect the advert
data in your console, and read each instruction carefully to tackle this challenging exercise.
# Time plot of both variables
autoplot(advert, facets = TRUE)
# Fit ARIMA model
fit <- auto.arima(advert[, "sales"], xreg = advert[, "advert"], stationary = TRUE)
# Check model. Increase in sales for each unit increase in advertising
salesincrease <- coefficients(fit)[3]
# Forecast fit as fc
fc <- forecast(fit, xreg = rep(10, 6))
# Plot fc with x and y labels
autoplot(fc) + xlab("Month") + ylab("Sales")
The dynamic regression allows you to include other outside information into your forecast.
Forecasting electricity demand
You can also model daily electricity demand as a function of temperature. As you may have seen on your electric bill, more electricity is used on hot days due to air conditioning and on cold days due to heating.
In this exercise, you will fit a quadratic regression model with an ARMA error. One year of daily data are stored as elec
including total daily demand, an indicator variable for workdays (a workday is represented with 1, and a non-workday is represented with 0), and daily maximum temperatures. Because there is weekly seasonality, the frequency
has been set to 7.
Let’s take a look at the first three rows:
> elec[1:3, ]
Demand Temperature Workday
[1,] 168.2798 20.2 0
[2,] 183.0822 21.9 1
[3,] 184.5135 25.1 1
elec has been pre-loaded into your workspace.
# Time plots of demand and temperatures
autoplot(elec[, c("Demand", "Temperature")], facets = TRUE)
# Matrix of regressors
xreg <- cbind(MaxTemp = elec[, "Temperature"],
MaxTempSq = elec[, "Temperature"]^2,
Workday = elec[, "Workday"])
# Fit model
fit <- auto.arima(elec[, "Demand"], xreg = xreg)
# Forecast fit one day ahead
forecast(fit, xreg = cbind(20, 20^2, 1))
## Warning message: xreg contains different column names from the xreg used
## in training. Please check that the regressors are in the same order.
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 53.14286       204.7168 196.1512 213.2823 191.6169 217.8166
Now you’ve seen how multiple independent variables can be included using matrices.
Dynamic harmonic regression
Forecasting weekly data
With weekly data, it is difficult to handle seasonality using ETS or ARIMA models as the seasonal length is too large (approximately 52). Instead, you can use harmonic regression which uses sines and cosines to model the seasonality.
The fourier()
function makes it easy to generate the required harmonics. The higher the order \((K)\), the more “wiggly” the seasonal pattern is allowed to be. With \(K=1\), it is a simple sine curve. You can select the value of \(K\) by minimizing the AICc value. As you saw in the video, fourier()
takes in a required time series, required number of Fourier terms to generate, and optional number of rows it needs to forecast:
> # fourier(x, K, h = NULL)
> fit <- auto.arima(cafe, xreg = fourier(cafe, K = 6),
seasonal = FALSE, lambda = 0)
> fit %>%
forecast(xreg = fourier(cafe, K = 6, h = 24)) %>%
autoplot() + ylim(1.6, 5.1)
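The harmonics themselves are just paired sine and cosine columns (standard Fourier-term definition, with \(m\) the seasonal period and \(k = 1, \dots, K\)):
\[\sin\!\left(\frac{2\pi k t}{m}\right), \qquad \cos\!\left(\frac{2\pi k t}{m}\right)\]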
The pre-loaded gasoline
data comprises weekly data on US finished motor gasoline products. In this exercise, you will fit a harmonic regression to this data set and forecast the next 3 years.
# Set up harmonic regressors of order 13
harmonics <- fourier(gasoline, K = 13)
# Fit regression model with ARIMA errors
fit <- auto.arima(gasoline, xreg = harmonics, seasonal = FALSE)
# Forecasts next 3 years
newharmonics <- fourier(gasoline, K = 13, h = 156)
fc <- forecast(fit, xreg = newharmonics)
# Plot forecasts fc
autoplot(fc)
The point predictions look to be a bit low.
Harmonic regression for multiple seasonality
Harmonic regressions are also useful when time series have multiple seasonal patterns. For example, taylor
contains half-hourly electricity demand in England and Wales over a few months in the year 2000. The seasonal periods are 48 (daily seasonality) and 7 x 48 = 336 (weekly seasonality). There is not enough data to consider annual seasonality.
auto.arima()
would take a long time to fit a long time series such as this one, so instead you will fit a standard regression model with Fourier terms using the tslm()
function. This is very similar to lm()
but is designed to handle time series. With multiple seasonality, you need to specify the order \(K\) for each of the seasonal periods.
# The formula argument is a symbolic description
# of the model to be fitted
> args(tslm)
function (formula, ...)
tslm()
is a newly introduced function, so you should be able to follow the pre-written code for the most part. The taylor
data are loaded into your workspace.
# Fit a harmonic regression using order 10 for each type of seasonality
fit <- tslm(taylor ~ fourier(taylor, K = c(10, 10)))
# Forecast 20 working days ahead
fc <- forecast(fit, newdata = data.frame(fourier(taylor, K = c(10, 10), h = 20 * 48)))
# Plot the forecasts
autoplot(fc)
# Check the residuals of fit
checkresiduals(fit)
##
## Breusch-Godfrey test for serial correlation of order up to 672
##
## data: Residuals from Linear regression model
## LM test = 3938.9, df = 672, p-value < 2.2e-16
The residuals from the fitted model fail the tests badly, yet the forecasts are quite good.
Forecasting call bookings
Another time series with multiple seasonal periods is calls
, which contains 20 consecutive days of 5-minute call volume data for a large North American bank. There are 169 5-minute periods in a working day, and so the weekly seasonal frequency is 5 x 169 = 845. The weekly seasonality is relatively weak, so here you will just model daily seasonality. calls
is pre-loaded into your workspace.
The residuals in this case still fail the white noise tests, but their autocorrelations are tiny, even though they are significant. This is because the series is so long. It is often unrealistic to have residuals that pass the tests for such long series. The effect of the remaining correlations on the forecasts will be negligible.
# Plot the calls data
autoplot(calls)
# Set up the xreg matrix
xreg <- fourier(calls, K = c(10, 0))
# Fit a dynamic regression model
fit <- auto.arima(calls, xreg = xreg, seasonal = FALSE, stationary = TRUE)
# Check the residuals
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from Regression with ARIMA(5,0,1) errors
## Q* = 6846.8, df = 1663, p-value < 2.2e-16
##
## Model df: 27. Total lags used: 1690
# Plot forecast for 10 working days ahead
fc <- forecast(fit, xreg = fourier(calls, c(10, 0), h = 1690))
autoplot(fc)
Now you’ve gotten a lot of experience using complex forecasting techniques.
TBATS models
TBATS models for electricity demand
As you saw in the video, a TBATS model is a special kind of time series model. It can be very slow to estimate, especially with multiple seasonal time series, so in this exercise you will try it on a simpler series to save time. Let’s break down elements of a TBATS model in TBATS(1, {0,0}, -, {<51.18,14>})
, one of the graph titles from the video:
| Component | Meaning |
|---|---|
| 1 | Box-Cox transformation parameter |
| {0,0} | ARMA error |
| - | Damping parameter |
| <51.18,14> | Seasonal period, Fourier terms |
The gas
data contains Australian monthly gas production. A plot of the data shows the variance has changed a lot over time, so it needs a transformation. The seasonality has also changed shape over time, and there is a strong trend. This makes it an ideal series to test the tbats()
function which is designed to handle these features.
gas
is available to use in your workspace.
# Plot the gas data
autoplot(gas)
# Fit a TBATS model to the gas data
fit <- tbats(gas)
# Forecast the series for the next 5 years
fc <- forecast(fit, h = 12 * 5)
# Plot the forecasts
autoplot(fc)
# Record the Box-Cox parameter and the order of the Fourier terms
lambda <- 0.082
K <- 5
Just remember that completely automated solutions don’t work every time.
Your future in forecasting!