Module 4 Discussion

Author

Teddy Kelly

Box-Cox Method

The Box-Cox method is a transformation applied to a time series to stabilize its variance over time and to make skewed data more nearly normally distributed. The method relies on a single parameter \(\lambda\) that determines which transformation is applied, and \(\lambda\) is chosen as the value for which the transformed series most closely resembles a normally distributed series with constant variance.
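
For reference, the classical Box-Cox transformation of a positive series \(y_t\) is defined piecewise in \(\lambda\) (the fpp3 ecosystem uses a slightly modified version that also accommodates negative values, but the classical form conveys the idea):

\[ w_t = \begin{cases} \ln(y_t) & \text{if } \lambda = 0 \\ \left(y_t^{\lambda}-1\right)/\lambda & \text{otherwise} \end{cases} \]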

Advantages:

  • One advantage of applying a Box-Cox transformation is that it can stabilize the variance of a time series, which simplifies the patterns a model has to capture and makes forecasting easier for many different methods.

  • It also unifies multiple types of transformations in a single framework. For example, if the parameter \(\lambda=0\), then the transformation reduces to a log transformation, and if \(\lambda\neq0\), then a power transformation is used (see the quick check after the next list).

Disadvantages:

  • If the value of \(\lambda\) is not equal to zero, a log transformation will not be used and some other power transformation will be applied instead. As a result, the transformed series is much less interpretable, whereas keeping the original data as-is or applying a log transformation is far easier to interpret.

  • Also, the classical Box-Cox transformation requires that all of the values in the time series be positive: zeros or negative values make the log (and some power transformations) undefined.
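
A minimal sketch of both points, assuming fabletools' box_cox() follows the standard definition (fpp3 loads fabletools, which provides box_cox() and inv_box_cox()):

library(fpp3)

y <- c(1, 10, 100)
box_cox(y, lambda = 0)            # reduces to the natural log...
log(y)                            # ...identical output

box_cox(c(-1, 0, 2), lambda = 0)  # NaN / -Inf: log undefined for nonpositive values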

Basic Method

I have decided to use the aus_accommodation time series from the fpp3 package; specifically, I will be analyzing the quarterly Takings data for Queensland, Australia.

Below, I have loaded in the time series, split the series into training and testing sets, and I have plotted the Takings values for the training series.

rm(list=ls())
library(fpp3)
library(kableExtra)

aus_accommodation <- aus_accommodation |>
  filter(State == "Queensland") |>   # keep only Queensland
  select(-Occupancy, -CPI)

# Train through 2012 Q3; test from 2012 Q4 onward
train <- aus_accommodation |> filter_index(~ "2012 Q3")
test  <- aus_accommodation |> filter_index("2012 Q4" ~ .)

# Graphing the Training Data
train |>
  autoplot(Takings) +
  labs(title = "Queensland Accommodation Takings from 1998-2012",
       subtitle = "Quartely Data",
       x = "Time (Quarters)",
       y = "Takings (Millions of Australian Dollars)")

  • Looking at the graph of the training data above, the seasonal variation appears to grow with the level of the series, so a Box-Cox transformation could be applied to stabilize this variance and produce more reliable forecasts.
  • However, before applying a Box-Cox transformation, I will first produce forecasts of the untransformed Takings data in Queensland using a linear regression.

I have fit a time series linear regression model with a trend dummy and quarterly dummy variables to the training data. Below is the estimating equation for the model:

\[ Y_t=\beta_0+\beta_1t+\beta_2Q_{2,t}+\beta_3Q_{3,t}+\beta_4Q_{4,t}+\varepsilon_t \]

where \(t\) is a linear time trend and the \(Q_{i,t}\) are quarterly dummy variables (indicators for Quarters 2-4, with Quarter 1 serving as the baseline).

fit_regression <- train |>
  model(TSLM(Takings ~ trend() + season()))


# Displaying the regression table output
fit_regression |> report()
Series: Takings 
Model: TSLM 

Residuals:
    Min      1Q  Median      3Q     Max 
-53.905 -15.335  -2.229  12.155  56.937 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)   204.7723     7.9148  25.872  < 2e-16 ***
trend()         6.0358     0.1779  33.930  < 2e-16 ***
season()year2 -15.7828     8.4909  -1.859   0.0685 .  
season()year3  62.9459     8.4965   7.408 8.95e-10 ***
season()year4  56.8013     8.6411   6.573 2.03e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.25 on 54 degrees of freedom
Multiple R-squared: 0.9602, Adjusted R-squared: 0.9573
F-statistic:   326 on 4 and 54 DF, p-value: < 2.22e-16

Coefficient Interpretations:

  • The adjusted R-squared value is very high at about 0.96, suggesting that roughly 96% of the variation in quarterly Takings is explained by the trend and seasonal terms.

  • Quarter 1 is captured by the intercept term and therefore serves as the baseline against which the other quarters are compared.

  • According to the trend term, takings increase on average by about 6 million Australian dollars each quarter.

  • The quarterly dummies suggest that Takings in Quarter 2 are lower on average than in Quarter 1, and that Takings in Quarters 3 and 4 are both higher than in Quarter 1, with Quarter 3 the highest. Because the model is fit on the original scale, the magnitudes of the quarterly dummy coefficients are straightforward to interpret.

  • For Quarter 3, the coefficient of \(\beta_3=62.9\) means that Takings in Quarter 3 are, on average, about 63 million Australian dollars higher than in Quarter 1. This coefficient is also statistically significant, meaning a value this large would be very unlikely if there really were no difference in the level of Takings between Quarter 1 and Quarter 3.

  • For Quarter 2, the coefficient of \(\beta_2=-15.78\) means that Takings in Quarter 2 are, on average, about 16 million Australian dollars lower than in Quarter 1, although this coefficient is not statistically significant at the 5% level (p ≈ 0.07).

  • In fact, the sub-series plot below confirms this, with Quarter 3 having the highest average Takings and Quarter 2 the lowest.

# Plotting the subseries to compare means of each quarter
train |> gg_subseries(Takings) +
  labs(title = "Queensland Accommodation Takings Sub-series Plot",
       x = "Time (Quarters)",
       y = "Takings (Millions of Australian Dollars)")

Generating Forecasts

Below, I have generated forecasts from the linear regression model and plotted them against the testing set.

regression_fc <- fit_regression |>
  forecast(h = nrow(test))

regression_fc |> autoplot(train) +
  autolayer(test, Takings) +
  labs(title = "Queensland Accommodation Takings Forecasts",
       subtitle = "Original Values",
       x = "Time (Quarters)",
       y = "Takings (Millions of Australian Dollars)")

  • It appears that the linear regression model's forecasts systematically overpredict the observed Takings values throughout the testing set.

  • I have confirmed this with the bias-focused accuracy metrics (mean error and mean percentage error) below.

Accuracy Metrics

metrics <- accuracy(regression_fc, aus_accommodation) |> select(ME, MPE)
metrics <- metrics |> mutate(Model = "Linear Regression", .before = ME)
metrics |> kable(digits = 2)
Model                  ME     MPE
Linear Regression   -49.56   -8.96
  • We can see that both the mean error and the mean percentage error are negative, which confirms what we see in the graph: the linear regression model consistently overpredicts the Takings values in the testing set (a quick check of the sign convention appears after this list).

  • A possible solution to generate more accurate forecasts is to stabilize the variance by performing a Box-Cox transformation.
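
As a sanity check on the sign convention (a minimal sketch; it assumes fable defines the forecast error as actual minus forecast, with point forecasts stored in the .mean column):

# Recompute the mean error by hand from the point forecasts
fc_means <- regression_fc |>
  as_tibble() |>
  select(Quarter, .mean)

errors <- test |>
  as_tibble() |>
  left_join(fc_means, by = "Quarter") |>
  mutate(error = Takings - .mean)

mean(errors$error)  # negative, matching the ME of about -49.6 above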

Applying a Box-Cox Transformation

# Selecting lambda with the Guerrero feature.
# Note: lambda is estimated on the full series here; estimating it on the
# training set alone would avoid using information from the test period.
lambda <- aus_accommodation |>
  features(Takings, features = guerrero) |>
  pull(lambda_guerrero)

train |> autoplot(box_cox(Takings, lambda)) +
  labs(title = "Queensland Accommodation Takings from 1998-2012 with Box-Cox Transformation",
       subtitle = "lambda = -0.25",
       x = "Time (Quarters)",
       y = "Takings")

  • We can see that the variance is now stabilized by performing a Box-Cox transformation.

  • Although the variance may have been stabilized, the transformed series levels off toward the end compared to the un-transformed data. This could hurt the accuracy of the linear regression model, since the trend is no longer as close to linear as it was before.

  • Also, the units on the y-axis have now changed, which makes the level of Takings at any given point harder to interpret.

  • Now, I will train the linear model using the transformed Takings values and analyze the coefficients before generating forecasts. The estimating equation is the same as before, with the transformed series as the dependent variable (shown below).
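
Writing \(w_t\) for the Box-Cox transformed Takings, the model becomes:

\[ w_t=\beta_0+\beta_1t+\beta_2Q_{2,t}+\beta_3Q_{3,t}+\beta_4Q_{4,t}+\varepsilon_t, \qquad w_t=\text{box\_cox}(Y_t,\lambda) \]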

# make train and test sets for the transformed values
aus_accommodation <- aus_accommodation |> mutate(takings_transformed = box_cox(Takings, lambda))

train <- train |> mutate(takings_transformed = box_cox(Takings, lambda))
test <- test |> mutate(takings_transformed = box_cox(Takings, lambda))

# Fitting the model on the transformed takings values
fit_boxcox <- train |>
  model(TSLM(takings_transformed ~ trend() + season()))

fit_boxcox |> report()
Series: takings_transformed 
Model: TSLM 

Residuals:
      Min        1Q    Median        3Q       Max 
-0.027293 -0.012821 -0.001513  0.014158  0.028017 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)    2.9945508  0.0053334 561.471  < 2e-16 ***
trend()        0.0035733  0.0001199  29.810  < 2e-16 ***
season()year2 -0.0099832  0.0057216  -1.745   0.0867 .  
season()year3  0.0336928  0.0057254   5.885 2.61e-07 ***
season()year4  0.0326340  0.0058228   5.604 7.28e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.01567 on 54 degrees of freedom
Multiple R-squared: 0.9485, Adjusted R-squared: 0.9447
F-statistic: 248.9 on 4 and 54 DF, p-value: < 2.22e-16

Interpretations

  • It appears that the signs of the coefficients are the same, indicating that Takings are higher in Quarters 3 and 4 than in Quarter 1 and lower on average in Quarter 2 than in Quarter 1.

  • However, there is no clear way to interpret the magnitude of the coefficients because of the transformation. Specifically, \(\lambda=-0.25\), which corresponds to a negative power transformation (roughly an inverse fourth root), making the coefficients very difficult to interpret directly. One partial workaround, sketched below, is to back-transform fitted values to the original scale.
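
A minimal sketch of that workaround, assuming fabletools' inv_box_cox() inverts box_cox():

# Back-transform fitted values from the transformed scale to millions of dollars
fit_boxcox |>
  augment() |>
  mutate(fitted_dollars = inv_box_cox(.fitted, lambda)) |>
  select(Quarter, takings_transformed, .fitted, fitted_dollars)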

Forecasting with the New Linear Model

boxcox_fc <- fit_boxcox |>
  forecast(h = nrow(test))

boxcox_fc |> autoplot(train) +
  autolayer(test, takings_transformed) +
  labs(title = "Queensland Accommodation Takings Forecasts",
       subtitle = "Box-Cox Transformed Values",
       x = "Time (Quarters)",
       y = "Transformed Takings")

  • It appears that the forecasts of the transformed Takings consistently overpredict the actual transformed values, and relative to the scale of the series the overprediction looks even worse than it did for the un-transformed data.

  • This suggests that applying a Box-Cox transformation was not necessarily the best way to deal with the unstable variance in the original time series. Not only does it reduce the interpretability of the model, but it also leads to worse predictions.

Updated Accuracy Metrics

metrics2 <- accuracy(boxcox_fc, aus_accommodation) |> select(ME, MPE)

metrics2 <- metrics2 |> mutate(Model = "Linear Regression Box-Cox", .before = ME)

metrics2 |> kable(digits = 2)
Model                          ME     MPE
Linear Regression Box-Cox   -0.05   -1.44
  • The accuracy metrics confirm that the linear regression model is systematically overpredicting the actual values in the testing set. Although the magnitudes of the mean error and mean percentage error are smaller than before, this largely reflects the compressed scale of the transformed data and does not indicate more accurate predictions.

  • Comparing the graphs of the two forecasts indicates that the linear regression model fitted to the original data produces more accurate forecasts than the model fitted to the Box-Cox transformed data. Furthermore, the first method provides much more interpretable results. A fairer numeric comparison would back-transform the Box-Cox forecasts to the original scale before computing accuracy; a sketch of this follows.
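
For completeness, a minimal sketch of that comparison. It relies on fable's support for transformations written inside the model formula, which automatically back-transforms the forecasts (with a bias adjustment) to the original scale:

# Specify the transformation inside the formula; forecasts come back in dollars
fit_auto <- train |>
  model(TSLM(box_cox(Takings, lambda) ~ trend() + season()))

auto_fc <- fit_auto |>
  forecast(h = nrow(test))

# Accuracy is now on the original scale, directly comparable to the first model
accuracy(auto_fc, aus_accommodation) |> select(ME, MPE)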