3.1: For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance.

# Install and load necessary packages 
if(!require(fpp2)) install.packages("fpp2")
library(fpp2)

# Load the datasets
data("usnetelec")
data("usgdp")
data("mcopper")
data("enplanements")

# Find and print optimal Box-Cox lambda for each series
lambda_usnetelec <- BoxCox.lambda(usnetelec)
lambda_usgdp <- BoxCox.lambda(usgdp)
lambda_mcopper <- BoxCox.lambda(mcopper)
lambda_enplanements <- BoxCox.lambda(enplanements)

cat("Optimal Box-Cox Lambda:\n")
## Optimal Box-Cox Lambda:
cat("usnetelec: ", lambda_usnetelec, "\n")
## usnetelec:  0.5167714
cat("usgdp: ", lambda_usgdp, "\n")
## usgdp:  0.366352
cat("mcopper: ", lambda_mcopper, "\n")
## mcopper:  0.1919047
cat("enplanements: ", lambda_enplanements, "\n")
## enplanements:  -0.2269461
# Apply the Box-Cox transformation
transformed_usnetelec <- BoxCox(usnetelec, lambda_usnetelec)
transformed_usgdp <- BoxCox(usgdp, lambda_usgdp)
transformed_mcopper <- BoxCox(mcopper, lambda_mcopper)
transformed_enplanements <- BoxCox(enplanements, lambda_enplanements)

# Plot the original and transformed series
par(mfrow=c(2,2))
plot(usnetelec, main="Original usnetelec")
plot(transformed_usnetelec, main="Transformed usnetelec")

plot(usgdp, main="Original usgdp")
plot(transformed_usgdp, main="Transformed usgdp")

plot(mcopper, main="Original mcopper")
plot(transformed_mcopper, main="Transformed mcopper")

plot(enplanements, main="Original enplanements")
plot(transformed_enplanements, main="Transformed enplanements")

ANALYSIS: The code above uses the fpp2 package to apply Box-Cox transformations to four time series: usnetelec, usgdp, mcopper, and enplanements. The goal of the transformation is to stabilize the variance of each series, which is crucial for the many time series modeling techniques that assume constant variance. The code first computes the optimal Box-Cox lambda for each series with BoxCox.lambda(), then applies the transformation with BoxCox() using that lambda, and finally plots the original and transformed series side by side so the variance stabilization can be assessed visually.

From the graphs, several patterns emerge. The usnetelec and usgdp series grow steadily over time, with fluctuations that increase with the level; after transformation (lambda ≈ 0.52 and 0.37 respectively), both exhibit more stable variance and more nearly linear growth. The mcopper series shows large swings with sharp increases and decreases; the transformation (lambda ≈ 0.19) reduces the variability somewhat, though irregular patterns persist, reflecting the inherent volatility of copper prices. The enplanements series has a clear seasonal pattern whose amplitude grows in later years; the transformation (lambda ≈ -0.23) makes the seasonal fluctuations more consistent across the period.

Overall, the Box-Cox transformation effectively reduces heteroscedasticity in these datasets, preparing them for more reliable time series modeling and analysis.
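For reference, the classic Box-Cox formula that BoxCox() essentially implements for positive data can be written in a few lines; a sketch (box_cox is our own illustrative name, and forecast::BoxCox should be used in practice):

# Classic Box-Cox: log(y) when lambda = 0, otherwise (y^lambda - 1) / lambda
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# For a positive series this closely matches BoxCox(usnetelec, lambda_usnetelec)
head(box_cox(usnetelec, lambda_usnetelec))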

3.2: Why is a Box-Cox transformation unhelpful for the cangas data?

The Box-Cox transformation is unhelpful for the cangas data because the variability of the series does not increase (or decrease) consistently with its level. Monthly Canadian gas production shows small fluctuations early in the series, the largest fluctuations in the middle of the sample, and smaller fluctuations again towards the end. A Box-Cox transformation applies a single monotonic power adjustment, so it can only shrink or expand variation uniformly as the level changes; no choice of lambda can fix variance that first grows and then shrinks. Whatever lambda is selected, the mid-series variation remains disproportionately large relative to the rest. Problems like this call for other tools, such as modeling the changing seasonal pattern directly (for example with an STL decomposition), rather than a power transformation.
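We can confirm this visually; a short sketch (assuming fpp2 is loaded, so cangas, BoxCox.lambda() and BoxCox() are available). Neither panel shows stable variance:

# Compare cangas with its Box-Cox transform; the mid-series bulge survives
lambda_cangas <- BoxCox.lambda(cangas)
par(mfrow = c(1, 2))
plot(cangas, main = "Original cangas")
plot(BoxCox(cangas, lambda_cangas),
     main = paste0("Transformed cangas (lambda = ", round(lambda_cangas, 2), ")"))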

3.3: What Box-Cox transformation would you select for your retail data (from Exercise 3 in Section 2.10)?

Step 1: Identify the Series Characteristics

The Box-Cox transformation is used to stabilize variance and make the data more suitable for modeling. For our retail dataset, we first need to visualize the time series.

library(ggplot2)
library(forecast)
library(readxl)

# Load the retail dataset
data <- read_excel("retail.xlsx", skip=1)

# Select a specific time series
myts <- ts(data[["A3349873A"]], frequency=12, start=c(1982,4))

# Plot the time series
autoplot(myts) + ggtitle("Retail Data Time Series")

From the plot, we observe a clear upward trend with seasonal patterns and increasing variance.

Step 2: Apply Box-Cox Transformation

The Box-Cox transformation requires a positive series, so we verify that all observations are positive.

# Check for positive values
all(myts > 0)
## [1] TRUE

If any negative values are found, we add a constant to shift the series before applying the transformation.
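For completeness, a minimal sketch of that shift (hypothetical here, since all retail values are positive; myts_shifted is our own name):

# Shift the series so every value is strictly positive before transforming
shift <- if (min(myts) <= 0) abs(min(myts)) + 1 else 0
myts_shifted <- myts + shift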

Next, we find the optimal lambda for the Box-Cox transformation.

# Find the optimal lambda
lambda <- BoxCox.lambda(myts)
lambda
## [1] 0.1276369

Step 3: Apply the Transformation

Using the optimal lambda, we transform the series.

# Apply the Box-Cox transformation
myts_bc <- BoxCox(myts, lambda)

# Plot the transformed series
autoplot(myts_bc) + ggtitle("Box-Cox Transformed Retail Data")

Step 4: Analyze the Results

We now compare the variance and patterns before and after transformation using ACF plots.

# Compare ACF before and after
ggAcf(myts) + ggtitle("ACF of Original Series")

ggAcf(myts_bc) + ggtitle("ACF of Transformed Series")

CONCLUSION: For the retail data from Exercise 3 in Section 2.10, the Box-Cox transformation is selected based on the variance stabilization observed in the time series. BoxCox.lambda(myts) returns lambda ≈ 0.13, which is close to 0 and therefore close to a log transformation. This transformation tames the increasing variance and growing seasonal fluctuations in the data: after applying BoxCox(myts, lambda), the seasonal swings are roughly constant in size, improving the series' suitability for modeling and its interpretability. Thus, a transformation near the log (lambda ≈ 0.13, or simply lambda = 0 for ease of interpretation) is an appropriate choice for this retail dataset.
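Because the estimated lambda is close to zero, a plain log transform (lambda = 0) behaves almost identically; a quick visual check, as a sketch:

# Compare with the pure log transform; the plot is nearly indistinguishable
# from the lambda = 0.13 version above
autoplot(BoxCox(myts, 0)) + ggtitle("Log-Transformed Retail Data (lambda = 0)")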

3.4: For each of the following series, make a graph of the data. If transforming seems appropriate, do so and describe the effect. dole, usdeaths, bricksq.

The Box-Cox transformation is used to stabilize variance and make the data more suitable for modeling. For each of the dole, usdeaths, and bricksq series, we first visualize the data and then check whether a transformation appears warranted.

# Plotting the given series
autoplot(dole) + ggtitle("Dole Series")

autoplot(usdeaths) + ggtitle("US Deaths Series")

autoplot(bricksq) + ggtitle("Bricksq Series")

# Check if transformation is needed
BoxCox.lambda(dole)
## [1] 0.3290922
BoxCox.lambda(usdeaths)
## [1] -0.03363775
BoxCox.lambda(bricksq)
## [1] 0.2548929
# Apply Box-Cox transformations if necessary
dole_bc <- BoxCox(dole, BoxCox.lambda(dole))
usdeaths_bc <- BoxCox(usdeaths, BoxCox.lambda(usdeaths))
bricksq_bc <- BoxCox(bricksq, BoxCox.lambda(bricksq))

# Plot transformed series
autoplot(dole_bc) + ggtitle("Transformed Dole Series")

autoplot(usdeaths_bc) + ggtitle("Transformed US Deaths Series")

autoplot(bricksq_bc) + ggtitle("Transformed Bricksq Series")

# Compare ACF plots before and after
ggAcf(dole) + ggtitle("ACF of Original Dole")

ggAcf(dole_bc) + ggtitle("ACF of Transformed Dole")

ggAcf(usdeaths) + ggtitle("ACF of Original US Deaths")

ggAcf(usdeaths_bc) + ggtitle("ACF of Transformed US Deaths")

ggAcf(bricksq) + ggtitle("ACF of Original Bricksq")

ggAcf(bricksq_bc) + ggtitle("ACF of Transformed Bricksq")

Analysis: For the dole series (monthly unemployment benefit recipients in Australia), the level changes enormously over the sample and the size of the fluctuations grows with the level, so a transformation (lambda ≈ 0.33) is helpful: it makes the variation far more uniform across the series. For the usdeaths series (monthly accidental deaths in the US), the seasonal fluctuations are already roughly constant in size, so the transformation (lambda ≈ -0.03, essentially a log) changes the shape of the series very little and is not really necessary. The bricksq series shows a seasonal pattern whose variability increases somewhat with the level of production, and the transformation (lambda ≈ 0.25) reduces this modestly. Overall, a Box-Cox transformation is most useful where variation scales with the level of the series (dole, and to a lesser extent bricksq) and adds little where the variance is already stable (usdeaths).

3.5: Calculate the residuals from a seasonal naïve forecast applied to the quarterly Australian beer production data from 1992.

In this analysis, we’ll use a seasonal naïve forecasting model on the quarterly Australian beer production data starting from 1992. We’ll evaluate the model’s performance by calculating the residuals and checking if they resemble white noise and follow a normal distribution. The goal here is to assess how well the model captures the seasonal pattern and whether the remaining error terms display randomness, indicating a good model fit.

Step 1: Data Preparation

We’ll first extract the beer production data from 1992 onwards.

# Load necessary library (fpp2, attached earlier, provides ausbeer and snaive())
library(fpp2)
# Subset the data starting from 1992
beer <- window(ausbeer, start=1992)

Step 2: Seasonal Naïve Forecast

We’ll apply the seasonal naïve forecasting model, which assumes that the value of each season will be the same as the corresponding season in the previous year.

# Apply seasonal naïve forecast
fc <- snaive(beer)

# Visualize the forecast
autoplot(fc) +
  ggtitle("Seasonal Naïve Forecast of Australian Beer Production (1992 Onward)")

Step 3: Calculate and Visualize Residuals

Residuals represent the differences between observed and forecasted values. We’ll compute and plot them to look for patterns.

# Calculate residuals
res <- residuals(fc)

# Plot the residuals
autoplot(res) +
  ggtitle("Residuals of Seasonal Naïve Forecast")

Step 4: Residual Diagnostics

To test whether the residuals resemble white noise, we’ll use diagnostic tools to check for randomness and normality.

# Check residuals for white noise and normality

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 32.269, df = 8, p-value = 8.336e-05
## 
## Model df: 0.   Total lags used: 8

Interpretation

  • White Noise: If the residuals are white noise, the ACF plot should show no significant correlations at any lag, and the p-value from the Ljung-Box test should be greater than 0.05.
  • Normal Distribution: The histogram and Q-Q plot should indicate a bell-shaped curve.

Here the Ljung-Box test gives Q* = 32.27 with a p-value ≈ 8.3e-05, well below 0.05, so we reject the hypothesis that the residuals are white noise: they retain significant autocorrelation. The seasonal naïve model captures the broad seasonal pattern but leaves systematic structure in the errors, so a more sophisticated model would likely improve forecast accuracy.
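The same test can also be run directly on the residuals with Box.test(); a sketch (the first year of seasonal naïve residuals is NA and must be dropped):

# Manual Ljung-Box test, matching the checkresiduals(fc) output above
Box.test(na.omit(res), lag = 8, type = "Ljung-Box")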

3.6: Repeat the exercise for the WWWusage and bricksq data. Use whichever of naive() or snaive() is more appropriate in each case.

library(forecast)

# WWWusage shows a trend but no seasonality, so the naive method is appropriate
www_forecast <- naive(WWWusage)

# Plot the forecast for WWWusage
autoplot(www_forecast) + ggtitle("Naive Forecast for WWWusage")

# bricksq is strongly seasonal (quarterly), so the seasonal naive method is used
# (note: re-wrapping bricksq in ts() would discard its quarterly frequency)
bricksq_forecast <- snaive(bricksq)

# Plot the forecast for bricksq
autoplot(bricksq_forecast) + ggtitle("Seasonal Naive Forecast for bricksq")

For the WWWusage dataset, the naive() method is appropriate because the data represents internet usage over time, which does not exhibit clear seasonality but rather a trend. The naive method assumes that the most recent observation is the best predictor of future values, making it suitable for time series with persistent trends. For the bricksq dataset, the snaive() (seasonal naive) method is more appropriate as it accounts for recurring seasonal patterns. Since bricksq represents brick production, which is likely to be influenced by seasonal demand, the seasonal naive approach predicts future values based on the corresponding period from the previous cycle, effectively capturing seasonal fluctuations.
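Since Exercise 3.5 also involved residual diagnostics, repeating them here completes the exercise; a sketch using the same checkresiduals() tool:

# Residual diagnostics for both models (time plot, ACF, histogram, Ljung-Box)
checkresiduals(www_forecast)
checkresiduals(bricksq_forecast)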

3.7: Are the following statements true or false? Explain your answer.

  1. Good forecast methods should have normally distributed residuals. False. While normality in residuals is often desirable for inference and hypothesis testing, it is not a strict requirement for good forecasting. The key assumptions for a good forecasting model are that residuals should be uncorrelated and have constant variance (homoscedasticity). If residuals exhibit autocorrelation, the model may be missing important patterns in the data.

  2. A model with small residuals will give good forecasts. False. A model with small residuals on the training data does not necessarily translate to good forecasts. Overfitting can lead to a model that fits historical data well but performs poorly on unseen data. The true measure of forecast performance is how well the model generalizes to new observations, often evaluated using test set errors.

  3. The best measure of forecast accuracy is MAPE. False. While Mean Absolute Percentage Error (MAPE) is a widely used metric, it has limitations, such as instability when actual values are near zero (a numerical illustration appears after this list). The best measure of forecast accuracy depends on the context and nature of the data. For example, Mean Squared Error (MSE) or Root Mean Squared Error (RMSE) might be preferred when large errors need to be penalized more, while Mean Absolute Error (MAE) is useful when interpreting errors in absolute terms.

  4. If your model doesn’t forecast well, you should make it more complicated. False. Increasing model complexity does not always lead to better forecasts. More complex models can overfit the training data and fail to generalize. Instead, one should consider alternative approaches such as feature selection, different model structures, or regularization to improve forecasting performance.

  5. Always choose the model with the best forecast accuracy as measured on the test set. False. While forecast accuracy is important, other factors such as interpretability, computational efficiency, and robustness should be considered. A slightly less accurate but simpler and more interpretable model may be preferable in practice, especially when implementing forecasting solutions in real-world applications.
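As flagged in point 3, MAPE can be dominated by observations near zero. A minimal numerical sketch with made-up values:

# Two observations: one large, one near zero; modest absolute errors on both
actual    <- c(100, 0.5)
predicted <- c(102, 1.5)
mean(abs((actual - predicted) / actual)) * 100  # about 101%, driven by the 0.5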

3.8: For your retail time series (from Exercise 3 in Section 2.10):

(a) Splitting the Data

myts.train <- window(myts, end=c(2010,12))
myts.test <- window(myts, start=2011)

Explanation: We split the time series myts into two subsets: myts.train, which includes data up until December 2010, and myts.test, which includes data from January 2011 onward. This allows us to train our model on historical data and test its forecasting performance on unseen data.

(b) Visualizing the Training and Test Sets

library(ggplot2)
library(forecast)

autoplot(myts) +
  autolayer(myts.train, series="Training", color="blue") +
  autolayer(myts.test, series="Test", color="red") +
  ggtitle("Retail Time Series Split into Training and Test Sets") +
  ylab("Sales") +
  xlab("Year")

Explanation: This visualization helps confirm that the data has been correctly split. autoplot(myts) plots the entire series, while autolayer(myts.train, series="Training") and autolayer(myts.test, series="Test") overlay the training and test data in different colors. The plot should clearly show that myts.train stops at the end of 2010 and myts.test begins in 2011.

(c) Forecasting Using Seasonal Naïve Method

fc <- snaive(myts.train)
autoplot(fc) +
  ggtitle("Seasonal Naïve Forecasts for Retail Data") +
  ylab("Sales") +
  xlab("Year")

Explanation: The snaive() function applies the seasonal naïve forecasting method, which assumes that future values will be equal to the values from the same season in the previous year. This is a strong baseline model for seasonal data, particularly retail sales, where patterns often repeat annually.
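We can confirm the mechanics directly: with monthly data, snaive() simply repeats the last observed year of the training set. A quick sanity check (sketch):

# The first 12 forecasts equal the final 12 training observations
all.equal(as.numeric(fc$mean[1:12]), as.numeric(tail(myts.train, 12)))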

(d) Evaluating Forecast Accuracy

accuracy(fc, myts.test)
##                     ME     RMSE      MAE       MPE      MAPE     MASE      ACF1
## Training set  7.772973 20.24576 15.95676  4.702754  8.109777 1.000000 0.7385090
## Test set     55.300000 71.44309 55.78333 14.900996 15.082019 3.495907 0.5315239
##              Theil's U
## Training set        NA
## Test set      1.297866

Explanation: This function calculates forecast accuracy metrics, such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and others. These metrics quantify how well the model's predictions align with the actual values in myts.test. Here the test-set errors (RMSE ≈ 71.4, MAPE ≈ 15.1%) are far larger than the training-set errors, indicating that the seasonal naïve method extrapolates this series poorly.

(e) Checking Residuals

checkresiduals(fc)

## 
##  Ljung-Box test
## 
## data:  Residuals from Seasonal naive method
## Q* = 624.45, df = 24, p-value < 2.2e-16
## 
## Model df: 0.   Total lags used: 24

Explanation: The checkresiduals() function performs diagnostic checks on the residuals of the forecast. It produces:

  • A time series plot of the residuals, to check for remaining patterns.
  • An autocorrelation function (ACF) plot, to detect correlation in the residuals.
  • A histogram with an overlaid normal density, to assess the normality of the residuals.

For a good forecasting model, residuals should be uncorrelated (i.e., white noise) and have constant variance. Here the Ljung-Box test gives p < 2.2e-16, so the residuals are strongly autocorrelated: the seasonal naïve model is missing key information in this series.

(f) Sensitivity to the Training/Test Split

myts.train.alt <- window(myts, end=c(2008,12))
myts.test.alt <- window(myts, start=2009)

fc.alt <- snaive(myts.train.alt)
accuracy(fc.alt, myts.test.alt)
##                     ME     RMSE      MAE       MPE     MAPE     MASE      ACF1
## Training set  8.866667 19.60898 15.41100 5.2561804 8.116672 1.000000 0.7165701
## Test set     -0.237500 25.09626 18.47083 0.1008644 6.074517 1.198548 0.4798145
##              Theil's U
## Training set        NA
## Test set     0.7159931

Explanation: This checks whether a different train-test split changes the accuracy results. The effect here is dramatic: with the 2011+ test set the MAPE was about 15.1%, whereas with the 2009+ test set it drops to about 6.1%. Error measures from a single split are therefore quite sensitive to where the split occurs, which suggests the series' behavior changes over time. Comparing results across multiple splits gives a more robust view of the model's performance.
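A more systematic way to assess this sensitivity is rolling-origin evaluation with forecast::tsCV(), which recomputes forecast errors at every possible split point; a sketch:

# Time series cross-validation: errors for horizons 1-12 at each rolling origin
e <- tsCV(myts, snaive, h = 12)
sqrt(mean(e^2, na.rm = TRUE))  # overall RMSE across all origins and horizons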

3.9: visnights contains quarterly visitor nights (in millions) from 1998 to 2016 for twenty regions of Australia.

(a) Creating Training Sets

train1 <- window(visnights[, "QLDMetro"], end=c(2015, 4))
train2 <- window(visnights[, "QLDMetro"], end=c(2014, 4))
train3 <- window(visnights[, "QLDMetro"], end=c(2013, 4))

Explanation: We create three training sets for visnights[, "QLDMetro"], each omitting a different amount of recent data:

  • train1 excludes the last year (2016).
  • train2 excludes the last two years (2015–2016).
  • train3 excludes the last three years (2014–2016).

These subsets allow us to compare how forecast accuracy changes when the model is trained on different amounts of historical data.

(b) Forecasting Using Seasonal Naïve Method

fc1 <- snaive(train1, h=4)  # Forecast the four quarters (one year) after each training set
fc2 <- snaive(train2, h=4)
fc3 <- snaive(train3, h=4)

autoplot(train1) + autolayer(fc1, series="Forecast for 2016") +
  ggtitle("Seasonal Naïve Forecast (Training Ends 2015 Q4)")

autoplot(train2) + autolayer(fc2, series="Forecast for 2015") +
  ggtitle("Seasonal Naïve Forecast (Training Ends 2014 Q4)")

autoplot(train3) + autolayer(fc3, series="Forecast for 2014") +
  ggtitle("Seasonal Naïve Forecast (Training Ends 2013 Q4)")

Explanation: We apply the seasonal naïve method to each training set and generate forecasts (fc1, fc2, fc3) for the four quarters immediately following each training period. Since visitor nights follow a strong seasonal pattern, snaive() is a reasonable baseline. The plots show how each one-year-ahead forecast compares with the actual data for its target year.

(c) Evaluating Forecast Accuracy

test1 <- window(visnights[, "QLDMetro"], start=2016)
test2 <- window(visnights[, "QLDMetro"], start=2015, end=c(2015, 4))
test3 <- window(visnights[, "QLDMetro"], start=2014, end=c(2014, 4))

acc1 <- accuracy(fc1, test1)
acc2 <- accuracy(fc2, test2)
acc3 <- accuracy(fc3, test3)

data.frame(Model=c("1 Year Omitted", "2 Years Omitted", "3 Years Omitted"),
           MAPE=c(acc1["Test set", "MAPE"], acc2["Test set", "MAPE"], acc3["Test set", "MAPE"]))
##             Model     MAPE
## 1  1 Year Omitted 6.159821
## 2 2 Years Omitted 3.057463
## 3 3 Years Omitted 8.476977

Explanation: We extract the test set corresponding to each training period and compute forecast accuracy with accuracy(), focusing on the Mean Absolute Percentage Error (MAPE). The results (6.2% for 2016, 3.1% for 2015, 8.5% for 2014) do not degrade monotonically as training data is removed; instead, accuracy depends mainly on how closely each target year resembled the year before it. This illustrates that test-set accuracy from a single split reflects the idiosyncrasies of that particular period as much as the quality of the forecasting method.
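The same comparison can be written compactly as a loop over training end-years; a sketch (assuming fpp2 is loaded so visnights, snaive() and accuracy() are available):

# Test-set MAPE for training sets ending in 2015 Q4, 2014 Q4 and 2013 Q4
sapply(2015:2013, function(y) {
  train <- window(visnights[, "QLDMetro"], end = c(y, 4))
  test  <- window(visnights[, "QLDMetro"], start = y + 1, end = c(y + 1, 4))
  accuracy(snaive(train, h = 4), test)["Test set", "MAPE"]
})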