Question No. 3.1:

For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.

Analysis

The exercise asks us to transform each time series with a Box-Cox transformation. A Box-Cox transformation is a power transformation, controlled by a single parameter lambda, that is used to stabilise the variance of a series and bring it closer to a normal distribution. A series with stable variance is usually easier to model and forecast.
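For reference, the usual one-parameter form of the transformation is written out below. The symbol names are a choice made for this sketch: y_t is the original observation, w_t the transformed one, and lambda the transformation parameter.

```latex
w_t =
\begin{cases}
\log(y_t) & \text{if } \lambda = 0, \\[4pt]
\dfrac{y_t^{\lambda} - 1}{\lambda} & \text{otherwise.}
\end{cases}
```

With lambda = 1 the series is only shifted down by one, so its shape is unchanged; with lambda = 0 the transformation is the natural log.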

Applying a Box-Cox transformation to a time series helps when the size of the variation increases or decreases with the level of the series.
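To make the idea concrete, here is a minimal sketch in plain Python (the analysis in this report was not done this way, and the function name `box_cox` and the toy data are assumptions made for illustration). It applies the transformation to a hypothetical series whose swings grow with its level and compares the size of the swings before and after.

```python
import math


def box_cox(series, lam):
    """Apply the Box-Cox transformation to a strictly positive series."""
    if any(y <= 0 for y in series):
        raise ValueError("Box-Cox needs strictly positive values")
    if lam == 0:
        return [math.log(y) for y in series]
    return [(y ** lam - 1) / lam for y in series]


# Hypothetical series (not one of the datasets in this report):
# each level swings by +/-20%, so the swing grows with the level.
levels = [10, 100, 1000]
raw = [level * factor for level in levels for factor in (0.8, 1.2)]
logged = box_cox(raw, 0)

# Size of the swing at each level, before and after the transformation.
raw_swings = [raw[i + 1] - raw[i] for i in range(0, len(raw), 2)]
log_swings = [logged[i + 1] - logged[i] for i in range(0, len(logged), 2)]

print(raw_swings)  # swings grow with the level
print(log_swings)  # swings are all about log(1.5) after the transformation
```

When the swings are already the same size at every level, there is nothing for the transformation to even out, which is the situation described for several of the series below.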

usnetelec

usnetelec {expsmooth} R Documentation Annual US net electricity generation Description Annual US net electricity generation (billion kwh) for 1949-2003

Usage data(usnetelec)

## [1] 1

This series is collected on an annual basis, so there is no seasonality to this time series.

The BoxCox lambda is 0.517

## [1] 0.5167714

As stated, there is no seasonality in this time series; there is an upward trend with roughly constant variation. Thus, there was no real need for the Box-Cox transformation. The two plots below show little visible difference between the original series and the Box-Cox transformed series.
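The report does not say how the lambda values were obtained. One widely used automatic choice is Guerrero's method, which picks the lambda that makes the ratio sd / mean^(1 - lambda) as constant as possible across blocks of the series. The sketch below is an illustration of that idea in plain Python under stated assumptions: the function names, the grid search, and the toy data are all hypothetical, and it is not the code that produced the values in this report.

```python
import statistics


def guerrero_cv(series, lam, block_size):
    """Coefficient of variation of sd / mean**(1 - lam) across consecutive blocks."""
    ratios = []
    for start in range(0, len(series) - block_size + 1, block_size):
        block = series[start:start + block_size]
        ratios.append(statistics.stdev(block) / statistics.mean(block) ** (1 - lam))
    return statistics.stdev(ratios) / statistics.mean(ratios)


def choose_lambda(series, block_size, lower=-1.0, upper=2.0, steps=300):
    """Grid search (a simplification) for the lambda with the smallest coefficient of variation."""
    grid = [lower + (upper - lower) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda lam: guerrero_cv(series, lam, block_size))


# Hypothetical data: the same three levels with a fixed swing and with a proportional swing.
constant = [level + d for level in (10, 20, 40) for d in (-1, 1, -1, 1)]
proportional = [level * f for level in (10, 20, 40) for f in (0.9, 1.1, 0.9, 1.1)]

print(choose_lambda(constant, 4))      # spread does not depend on level
print(choose_lambda(proportional, 4))  # spread grows in proportion to level
```

Under this criterion, a result near 1 suggests that no transformation is needed, while a result near 0 points to a log transform.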

usgdp

Quarterly US GDP Description Quarterly US GDP. 1947:1 - 2006.1.

Usage data(usgdp) Format time series

Source Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D., (2008) Forecasting with exponential smoothing: the state space approach, Springer

References http://www.exponentialsmoothing.net

## [1] 4

This time series is collected on a quarterly basis, and the polar plot shows no seasonality.

The BoxCox lambda is 0.366

## [1] 0.366352

As with the previous time series, there is no seasonality and the variance is roughly constant. Thus, there was no need for the Box-Cox transformation.

mcopper

Monthly copper prices Description Monthly copper prices. Copper, grade A, electrolytic wire bars/cathodes,LME,cash (pounds/ton) Source: UNCTAD (http://stats.unctad.org/Handbook).

Usage data(mcopper) Format time series

## [1] 12

The frequency of this time series is monthly.

The BoxCox lambda is 0.192

## [1] 0.1919047

In the mcopper time series, there is some seasonality, but the variance does not increase with the level of the series. A Box-Cox transformation was not needed.

enplanements

Monthly US domestic enplanements Description Domestic Revenue Enplanements (millions): 1996-2000. SOURCE: Department of Transportation, Bureau of Transportation Statistics, Air Carrier Traffic Statistic Monthly.

Usage data(enplanements) Format time series

## [1] 12

## [1] -0.2269461

Here the Box-Cox transformation (lambda = -0.227) does visibly change the series. The enplanements data show a seasonal jump in the summer months, and the transformation alters the size of these seasonal swings.

Question No. 3.2:

Why is a Box-Cox transformation unhelpful for the cangas data?

cangas

Monthly Canadian gas production Description Monthly Canadian gas production, billions of cubic metres, January 1960 - February 2005

Usage data(cangas) Format time series

## [1] 12
##         Jan    Feb    Mar    Apr    May    Jun
## 1960 1.4306 1.3059 1.4022 1.1699 1.1161 1.0113
##          Jan     Feb Mar Apr May Jun Jul Aug     Sep     Oct     Nov     Dec
## 2004                                         16.9067 17.8268 17.8322 19.4526
## 2005 19.5284 16.9441

## [1] 0.5767759

Answer

The cangas time series shows a long-term increasing trend from 1960 to 2005, without much seasonality but with cycles. It is also broadly homoscedastic: the variation in the data neither increases nor decreases consistently with the level of the series.

Box-Cox transformations work best for heteroscedastic data, where the variation increases or decreases with the level of the series. That is not the case here, so the transformation does little to stabilise the variance.
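One rough way to check whether the variation tracks the level is to cut the series into consecutive blocks and compare each block's mean with its standard deviation. The sketch below does this in plain Python on two hypothetical series; the function name `level_and_spread` and the data are assumptions for illustration, not the cangas values.

```python
import statistics


def level_and_spread(series, block_size):
    """Return (mean, standard deviation) for each consecutive block of the series."""
    summary = []
    for start in range(0, len(series) - block_size + 1, block_size):
        block = series[start:start + block_size]
        summary.append((statistics.mean(block), statistics.stdev(block)))
    return summary


# Hypothetical data with three rising levels and blocks of four observations.
wiggle = [1, -1, 1, -1]
steady = [10 * (i // 4 + 1) + wiggle[i % 4] for i in range(12)]               # fixed +/-1 swing
scaling = [10 * (i // 4 + 1) * (1 + 0.1 * wiggle[i % 4]) for i in range(12)]  # +/-10% swing

for name, data in (("steady", steady), ("scaling", scaling)):
    for mean, spread in level_and_spread(data, 4):
        print(f"{name}: level={mean:.1f} spread={spread:.2f}")
```

If the spread stays flat, or moves up and down without following the level, a single power transformation has no consistent pattern to correct.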

Question No. 3.3:

What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

Setting the polar argument to TRUE in ggseasonplot further demonstrates the seasonal nature of liquor sales, with a spike in December sales.

The next two plots, from gglagplot and ggAcf, show how observations relate to earlier (lagged) observations of the same series. The strong relationship at lag 12, which is also visible in the ggAcf plot, supports the finding that sales follow a 12-month seasonal pattern with a spike in the twelfth month.
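The quantity behind an ACF plot is the sample autocorrelation at each lag. As a rough illustration, the plain Python sketch below computes it for a hypothetical monthly series with a December spike; the function name and the data are assumptions and are not the retail series used here.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((y - mean) ** 2 for y in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Hypothetical monthly sales: flat at 100 with a spike to 150 every December, four years long.
sales = [150 if month == 12 else 100 for _year in range(4) for month in range(1, 13)]

for lag in (1, 6, 12):
    print(lag, round(autocorrelation(sales, lag), 3))
```

For a series like this, the autocorrelation is largest at lag 12 because each December lines up with the previous December.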

## [1] -0.04159144

Answer

Given that the chosen lambda (-0.042) is very close to 0, the best Box-Cox transformation is effectively a simple log transform. In the plots below, the log-transformed series looks essentially the same as the Box-Cox transformed series.
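The link between a lambda near 0 and the log transform can be checked numerically. The plain Python sketch below is an illustration only: the function names and the range of values are assumptions, and only the lambda of -0.04159144 comes from the output above. It shows that the Box-Cox value approaches the log as lambda shrinks, and that with the reported lambda the transformed values move almost in step with the logged values.

```python
import math


def box_cox_value(y, lam):
    """Box-Cox transform of a single positive value."""
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam


def pearson(xs, ys):
    """Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


reported_lambda = -0.04159144      # value printed in the output above
values = list(range(50, 401, 50))  # hypothetical sales levels, not the real data

logged = [math.log(v) for v in values]
transformed = [box_cox_value(v, reported_lambda) for v in values]

print(box_cox_value(100, 1e-6), math.log(100))  # nearly identical for a tiny lambda
print(pearson(logged, transformed))             # close to 1: same shape, different scale
```

The two transformed series differ in scale, which is why plots of them look alike even though the numbers on the vertical axis are not identical.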

Question No. 3.8:

For your retail time series (from Exercise 3 in Section 2.10):

b. Check that your data have been split appropriately by producing the following plot.

c. Calculate forecasts using snaive applied to myts.train.

d. Compare the accuracy of your forecasts against the actual values stored in myts.test.

##                     ME      RMSE       MAE      MPE      MAPE     MASE
## Training set  4.455255  8.699864  5.818619  6.15400  9.948117 1.000000
## Test set     19.170833 22.956217 19.520833 11.59039 11.813322 3.354891
##                   ACF1 Theil's U
## Training set 0.7261600        NA
## Test set     0.5801161 0.7479721
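To show what the numbers in this table measure, here is a minimal plain Python sketch of a seasonal naive forecast and a few of the accuracy measures. The function names and the quarterly toy data are assumptions made for illustration; this is not the code that produced the table. MASE is computed here by scaling the MAE by the in-sample MAE of seasonal naive forecasts, which is consistent with the training-set MASE of exactly 1 shown above.

```python
import math


def snaive_forecast(train, period, horizon):
    """Repeat the last observed seasonal cycle over the forecast horizon."""
    last_cycle = train[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


def snaive_scale(train, period):
    """In-sample mean absolute error of seasonal naive forecasts."""
    diffs = [abs(train[t] - train[t - period]) for t in range(period, len(train))]
    return sum(diffs) / len(diffs)


def accuracy_measures(actual, forecast, scale):
    """ME, RMSE, MAE, MAPE and MASE for a set of forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": mae,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "MASE": mae / scale,
    }


# Hypothetical quarterly data (period 4), not the retail series.
train = [10, 20, 30, 40, 12, 22, 32, 42]
test = [15, 25, 35, 45]

forecast = snaive_forecast(train, period=4, horizon=len(test))
print(accuracy_measures(test, forecast, snaive_scale(train, period=4)))
```

The same relationship can be read off the table above: the test-set MAE divided by the training-set MAE, 19.520833 / 5.818619, gives the test-set MASE of about 3.35, so the test-period errors are more than three times the typical in-sample seasonal naive error.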

e. Check the residuals.

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 783.91, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

Answer

The ACF plot shows significant correlations between lagged values of the residuals. Many lags exceed the upper significance bound, which means those autocorrelations are significant, and the Ljung-Box test agrees (p-value < 2.2e-16). According to the authors, “If there are correlations between residuals, then there is information left in the residuals which should be used in computing forecasts.”

The histogram shows a roughly normal distribution of the residuals, although there is a slight right tail.
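The Ljung-Box statistic reported above combines the residual autocorrelations at lags 1 to h into a single number, Q* = T(T + 2) times the sum of r_k^2 / (T - k). The plain Python sketch below illustrates that formula under stated assumptions: the function names and the toy residuals are hypothetical, and the p-value is left out because it needs a chi-squared distribution that the standard library does not provide.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((y - mean) ** 2 for y in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q* over lags 1..max_lag."""
    n = len(residuals)
    total = sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )
    return n * (n + 2) * total


# Hypothetical residuals that drift steadily upward, so neighbouring values are alike.
residuals = [1, 2, 3, 4, 5, 6, 7, 8]
print(ljung_box_q(residuals, max_lag=2))
```

Under the null hypothesis of no autocorrelation, Q* is compared with a chi-squared distribution. With 24 degrees of freedom the 5% critical value is roughly 36.4, so the reported Q* of 783.91 is far beyond it, which matches the conclusion drawn from the ACF plot.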

