Author: Farhod Ibragimov

library(fpp3)
  1. Produce forecasts for the following series using whichever of NAIVE(y)SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case:
aus_population <- global_economy |> 
  filter(Code == 'AUS') |> 
  select(Population)
aus_population
## # A tsibble: 58 x 2 [1Y]
##    Population  Year
##         <dbl> <dbl>
##  1   10276477  1960
##  2   10483000  1961
##  3   10742000  1962
##  4   10950000  1963
##  5   11167000  1964
##  6   11388000  1965
##  7   11651000  1966
##  8   11799000  1967
##  9   12009000  1968
## 10   12263000  1969
## # ℹ 48 more rows
aus_population |> 
  autoplot(Population)+
  scale_y_continuous(labels = scales::comma)+
  labs(title = 'Australian population time series')

aus_population_train <- aus_population |> 
  filter(Year < 2000)

aus_population_fit <- aus_population_train |> 
  model(RW_drift = RW(Population ~ drift()))

aus_population_fc <- aus_population_fit |> forecast(h = 20)

aus_population_fc |> 
  autoplot(aus_population_train, level = NULL) +
  autolayer(
    filter_index(aus_population, '2000' ~.),
    colour = 'black'
  )+
  scale_y_continuous(labels = scales::comma)+
  guides(colour = guide_legend(title = 'Forecast'))+
  labs(title = 'Forecast for Australian population from year of 2000')
## Plot variable not specified, automatically selected `.vars = Population`

This is yearly time series of Australian population. Since it has yearly period without seasonality and steady rising trend over years the appropriate benchmark (and probably the best) forecast model is Random Walk with Drift RW(y ~ drift())

aus_bricks <- aus_production |> 
  select(Bricks)
aus_bricks
## # A tsibble: 218 x 2 [1Q]
##    Bricks Quarter
##     <dbl>   <qtr>
##  1    189 1956 Q1
##  2    204 1956 Q2
##  3    208 1956 Q3
##  4    197 1956 Q4
##  5    187 1957 Q1
##  6    214 1957 Q2
##  7    227 1957 Q3
##  8    222 1957 Q4
##  9    199 1958 Q1
## 10    229 1958 Q2
## # ℹ 208 more rows
aus_bricks |> 
  ACF() |> 
  autoplot()
## Response variable not specified, automatically selected `var = Bricks`

From ACF plot we can see that this time series has seasonality pattern, and because autocorrelation coefficients are positive we can say that there is a presence of the trend pattern

aus_bricks |> 
  autoplot(Bricks)
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

As we see the time series has increasing trend till 1980’s, and declining after.
There is a huge drop of production around 1982-1983, possibly related to conditions of economy.

aus_bricks_train <- aus_bricks |> 
  filter(year(Quarter) <2000)

aus_bricks_fit <- aus_bricks_train |> 
  model(Snaive = SNAIVE(Bricks ~ lag('year')),
        RWdrift = RW(Bricks~drift()))

aus_bricks_fc <- aus_bricks_fit |> 
  forecast(h = 30)

aus_bricks_fc |> 
  autoplot(aus_bricks_train, level = NULL)

After applying Seasonal Naive and Random walk with the drift models, and considering the presence of seasonal pattern I can say that Seasonal Naive is better benchmark method for this series.

lambs <- aus_livestock |> 
  filter(Animal == 'Lambs', 
         State == 'New South Wales', 
         !is.na(Count))
  
lambs
## # A tsibble: 558 x 4 [1M]
## # Key:       Animal, State [1]
##       Month Animal State            Count
##       <mth> <fct>  <fct>            <dbl>
##  1 1972 Jul Lambs  New South Wales 587600
##  2 1972 Aug Lambs  New South Wales 553700
##  3 1972 Sep Lambs  New South Wales 494900
##  4 1972 Oct Lambs  New South Wales 533500
##  5 1972 Nov Lambs  New South Wales 574300
##  6 1972 Dec Lambs  New South Wales 517500
##  7 1973 Jan Lambs  New South Wales 562600
##  8 1973 Feb Lambs  New South Wales 426900
##  9 1973 Mar Lambs  New South Wales 496300
## 10 1973 Apr Lambs  New South Wales 496000
## # ℹ 548 more rows
lambs |> 
  autoplot(Count)

lambs |> 
  ACF() |> 
  autoplot()
## Response variable not specified, automatically selected `var = Count`

In this plot i wanted to see if seasonality is present in this series. There are some spikes at lags 1, 12, 23 – showing spikes every 11 months. That’s very unusual and not enough evidence to make a conclusion that seasonality present in this series.
Also the trend of the series can overlap seasonality pattern to be detected in ACF. I think after decomposition it may show the seasonal pattern

lamb_train <- lambs |> 
  filter_index('1973 Jan' ~ '2005 Jan')

lamb_fit <-  lamb_train |> 
  model(
    Mean = MEAN(Count),
    Naive = NAIVE(Count),
    Seas_naive = SNAIVE(Count),
    Rand_walk = RW(Count ~ drift())
  )

lamb_fc <- lamb_fit |> 
  forecast(h = 24)

lamb_fc |> 
  autoplot(lamb_train, level = NULL)

In ACF plot i wanted to see if seasonality is present in this series. There are some spikes at lags 1, 12, 23 – showing spikes every 11 months. That’s very unusual and not enough evidence to make a conclusion that seasonality present in this series.
Also the trend of the series can overlap seasonality pattern to be detected in ACF. I think after decomposition it may show the seasonal pattern.
Seasonal Naive benchmark method is most appropriate for this series.

?hh_budget
## starting httpd help server ... done
wealth <- hh_budget |>
  filter(Country == 'Australia') |> 
  select(Wealth) 
 
wealth
## # A tsibble: 22 x 2 [1Y]
##    Wealth  Year
##     <dbl> <dbl>
##  1   315.  1995
##  2   315.  1996
##  3   323.  1997
##  4   339.  1998
##  5   354.  1999
##  6   350.  2000
##  7   348.  2001
##  8   349.  2002
##  9   360.  2003
## 10   379.  2004
## # ℹ 12 more rows
wealth |> autoplot(Wealth)

wealth_fit <- wealth |> 
  model(RW(Wealth~drift()))

wealth_fit |> 
  forecast(h = 9) |> 
  autoplot(wealth, level= NULL)

The series apper to have overall increasing trend and I think the appropriate benchmark model for this seies is Random Walk with the Drift.

food_takeaway <- aus_retail |> 
  filter(Industry == 'Takeaway food services') |> 
  summarise(Turnover = sum(Turnover, na.rm = TRUE))
food_takeaway
## # A tsibble: 441 x 2 [1M]
##       Month Turnover
##       <mth>    <dbl>
##  1 1982 Apr     194.
##  2 1982 May     194.
##  3 1982 Jun     186.
##  4 1982 Jul     190.
##  5 1982 Aug     190.
##  6 1982 Sep     195.
##  7 1982 Oct     209 
##  8 1982 Nov     212.
##  9 1982 Dec     238.
## 10 1983 Jan     225.
## # ℹ 431 more rows
food_takeaway |> 
  autoplot(Turnover)

The plot of this series shows overall rising trend, and presence of seasonality.
There is also the variance increases with the level of the trend over the time, indicating Heteroscedasticity in this data.
To stabilize the variance I will apply Box-Cox and log transformations.

lambda_food <- food_takeaway |> 
  features(Turnover, features = guerrero) |> 
  pull(lambda_guerrero)

food_takeaway <- food_takeaway |> 
  mutate(Turnover_log = log(Turnover)) |> 
  mutate(Turnover_bxcx = box_cox(Turnover, lambda_food))
lambda_food
## [1] 0.04226665
food_takeaway |> 
  select(Month, Turnover, Turnover_log, Turnover_bxcx) |> 
  pivot_longer(-Month,
               names_to = 'Transformation',
               values_to = 'Value') |> 
  autoplot(Value)+
  facet_wrap(~Transformation,
             nrow = 3,
             scales = 'free_y')

#food_takeaway |> 
#  ACF(Turnover_log) |> 
#  autoplot()

As we see Box-Cox (lambda = 0.042) and Log transformation are almost identical, I will use Log transformation to apply a model.

ft_dcmp <- food_takeaway |> 
  model(stlf = decomposition_model(
    STL(log(Turnover) ~ trend(window = 7), robust = TRUE),
    NAIVE(season_adjust))
  ) 


ft_dcmp |> 
  forecast(h = 36) |> 
  autoplot(food_takeaway, level = NULL)

The series appear to have a strong seasonal pattern, therefore Seasonal Naive benchmark method applied to variance stabilized data is appropriate.

  1. Use the Facebook stock price (data set gafa_stock) to do the following:
fb_stock <- gafa_stock |> 
  filter(Symbol == 'FB') |> 
  mutate(day = row_number()) |> 
  update_tsibble(index = day, regular = TRUE)
fb_stock
## # A tsibble: 1,258 x 9 [1]
## # Key:       Symbol [1]
##    Symbol Date        Open  High   Low Close Adj_Close   Volume   day
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>     <dbl>    <dbl> <int>
##  1 FB     2014-01-02  54.8  55.2  54.2  54.7      54.7 43195500     1
##  2 FB     2014-01-03  55.0  55.7  54.5  54.6      54.6 38246200     2
##  3 FB     2014-01-06  54.4  57.3  54.0  57.2      57.2 68852600     3
##  4 FB     2014-01-07  57.7  58.5  57.2  57.9      57.9 77207400     4
##  5 FB     2014-01-08  57.6  58.4  57.2  58.2      58.2 56682400     5
##  6 FB     2014-01-09  58.7  59.0  56.7  57.2      57.2 92253300     6
##  7 FB     2014-01-10  57.1  58.3  57.1  57.9      57.9 42449500     7
##  8 FB     2014-01-13  57.9  58.2  55.4  55.9      55.9 63010900     8
##  9 FB     2014-01-14  56.5  57.8  56.1  57.7      57.7 37503600     9
## 10 FB     2014-01-15  58.0  58.6  57.3  57.6      57.6 33663400    10
## # ℹ 1,248 more rows
fb_stock |> 
  autoplot(Close)+
  labs(title = "Facebook closing stock prices")

fb_stock_fit <- fb_stock |> 
  model(Drift = RW(Close~drift()))

fb_stock_fc <- fb_stock_fit |> 
  forecast(h= 150) # forecasted for 90days since index = day

fb_stock_fc |> 
  autoplot(fb_stock, level = NULL) +
  labs(title = "Facebook closing stock prices 150 days forecast",
       subtitle = 'Drift forecast model in blue color')

line_data <- tibble(
  Date  = c(first(fb_stock$day), last(fb_stock$day)),
  Close = c(first(fb_stock$Close), last(fb_stock$Close))
)

fb_stock_fc |> 
  autoplot(fb_stock, level = NULL) +
  geom_line(data = line_data, aes(x = Date, y = Close),
            colour = "blue", linetype = 2) +
  labs(title = "Facebook closing stock prices 150 days forecast",
       subtitle = 'Drift forecast model in blue color line\nIdentical dash line drawn between the first and last actual observations ')

In this plot dotted line is between first and last observations of the data, and we can see that RW Drift forecast’s solid blue line is extending that line.

fb_stock_before_2016 <- fb_stock |> 
  filter(year(Date)<=2015)

fb_stock_fit_all <- fb_stock_before_2016 |> 
  model(
    Mean = MEAN(Close),
    Naive = NAIVE(Close),
    RWDrift = RW(Close~drift())
  )
  
fb_stock_2016_jan_feb <- fb_stock |> 
  filter(yearmonth(Date) %in% yearmonth(c('2016 Jan', '2016 Feb')))

fb_stock_fc_all <- fb_stock_fit_all |> 
  forecast(new_data = fb_stock_2016_jan_feb)

fb_stock_fc_all |> 
  autoplot(fb_stock_before_2016, level = NULL) +
  autolayer(fb_stock_2016_jan_feb, Close, color = 'black')+
  guides(colour = guide_legend(title = 'Forecast'))

From this plot I can say that the MEAN model is inappropriate, because stock prices are not fluctuating around it over long period.
Random Walk with the drift model assumes that previous growth will project into future forecast, but usually stock prices are not guaranteed to have a constant growth.

The best and reasonable for short-term forecast is Naive model. The reason is that usually today’s price is the best to approximate tomorrow’s price, and Naive model is best option for stock prices benchmark forecast model.

  1. Apply a seasonal naïve method to the quarterly Australian beer production data from 1992. Check if the residuals look like white noise, and plot the forecasts. The following code will help.
# Extract data of interest
recent_production <- aus_production |>
  filter(year(Quarter) >= 1992) |> 
  select(Beer)


recent_production |> 
  autoplot(Beer)+
  labs(title = 'Australian beer production time series')

There is a slightly declining overall trend, and strong evidence of seasonal pattern in this series.

# Define and estimate a model
fit <- recent_production |> model(SNAIVE(Beer))
# Look at the residuals
fit |> gg_tsresiduals()

Residual plot shows that residuals fluctuate around zero with no constant variance. There are couple of residuals with significant negative values.

Most of the lags on ACF plot are below confidence bounds, however there is noticeable spike at seasonal lag 4 on and suggests that Naive model didn’t fully captured seasonal pattern.

The histogram of residuals looks fairly symmetric and centered around zero. There is slight left-skewness.

Overall, these plots does not suggest that residuals looks like white noise. While the model captures much of the seasonal pattern, there is still some remaining seasonal pattern in the data.

aug <- augment(fit)
aug
## # A tsibble: 74 x 6 [1Q]
## # Key:       .model [1]
##    .model       Quarter  Beer .fitted .resid .innov
##    <chr>          <qtr> <dbl>   <dbl>  <dbl>  <dbl>
##  1 SNAIVE(Beer) 1992 Q1   443      NA     NA     NA
##  2 SNAIVE(Beer) 1992 Q2   410      NA     NA     NA
##  3 SNAIVE(Beer) 1992 Q3   420      NA     NA     NA
##  4 SNAIVE(Beer) 1992 Q4   532      NA     NA     NA
##  5 SNAIVE(Beer) 1993 Q1   433     443    -10    -10
##  6 SNAIVE(Beer) 1993 Q2   421     410     11     11
##  7 SNAIVE(Beer) 1993 Q3   410     420    -10    -10
##  8 SNAIVE(Beer) 1993 Q4   512     532    -20    -20
##  9 SNAIVE(Beer) 1994 Q1   449     433     16     16
## 10 SNAIVE(Beer) 1994 Q2   381     421    -40    -40
## # ℹ 64 more rows
aug |> 
  features(.innov, ljung_box, lag = 8) |> 
  mutate(
    lb_stat = round(lb_stat, 2),
    lb_pvalue = format(lb_pvalue, scientific = FALSE),
    innov_mean = mean(aug$.innov, na.rm = TRUE)
  )
## # A tibble: 1 × 4
##   .model       lb_stat lb_pvalue     innov_mean
##   <chr>          <dbl> <chr>              <dbl>
## 1 SNAIVE(Beer)    32.3 0.00008335611      -1.57

Ljung-Box test reveals large Q* = 32.27 and extremely low p-value = 0.000083.
The p-value is less than 0.05 therefore we reject hypothesis that residuals are white noise.
Ljung-Box test shows significant remaining autocorrelation in residuals (as we saw at lag 4 on CAF plot), so we can suggest that SNAIVE model does not perfectly capture underlying time series structure.

# Look a some forecasts
fit |> forecast(h=20) |> autoplot(recent_production)

I think because the series appear to have declining trend over the years it affects the residuals of SNAIVE model:

  1. Repeat the previous exercise using the Australian Exports series from global_economy and the Bricks series from aus_production. Use whichever of NAIVE() or SNAIVE() is more appropriate in each case.

4.1Australian Exports series from global_economy

?global_economy
aus_export <- global_economy |> 
  filter(Code == 'AUS') |> 
  select(Exports)

aus_export
## # A tsibble: 58 x 2 [1Y]
##    Exports  Year
##      <dbl> <dbl>
##  1    13.0  1960
##  2    12.4  1961
##  3    13.9  1962
##  4    13.0  1963
##  5    14.9  1964
##  6    13.2  1965
##  7    12.9  1966
##  8    12.9  1967
##  9    12.3  1968
## 10    12.0  1969
## # ℹ 48 more rows
aus_export |> 
  autoplot(Exports) +
  labs(title = 'Australian exports',
       y = 'Exports of goods and services (% of GDP)'
       )

aus_export_fit <- aus_export |> 
  model(RW(Exports ~ drift()))
aus_export_fit |> 
  forecast(h = 5) |> 
  autoplot(aus_export)

aus_export_fit |> gg_tsresiduals()

The residual series plot shows residuals are fluctuating around zero, and it does not appear to have constant variance.
ACF plot shows that most of lags are below confidence boundaries, except at lag 1 showing slightly higher autocorrelation.
Residuals distribution plot appears to be approximately normal, with mild left-skewness.
Overall it appears that residual looks like a white noise. Let’s see if Ljung-Box test confirms it.

aus_export_aug <- augment(aus_export_fit)
#aus_export_aug
aus_export_aug |> 
  features(.innov, ljung_box, lag = 10) |> 
  mutate(
    lb_stat = round(lb_stat, 2),
    lb_pvalue = format(lb_pvalue, scientific = FALSE),
    mean = mean(aus_export_aug$.innov, na.rm = TRUE)
  )
## # A tibble: 1 × 4
##   .model                lb_stat lb_pvalue       mean
##   <chr>                   <dbl> <chr>          <dbl>
## 1 RW(Exports ~ drift())    16.4 0.08963678 -8.41e-16

This test produces a larger p-value = 0.09, which means we fail to reject the null hypothesis of no autocorrelation in the residuals. This suggests that residuals looks like white noise and that the model has captured the underlying structure of the time series reasonably well.

4.2  Bricks series from aus_production

?aus_production

bricks <- aus_production |> 
  select(Bricks) |> 
  filter(year(Quarter) >=1992)

bricks
## # A tsibble: 74 x 2 [1Q]
##    Bricks Quarter
##     <dbl>   <qtr>
##  1    383 1992 Q1
##  2    404 1992 Q2
##  3    446 1992 Q3
##  4    420 1992 Q4
##  5    394 1993 Q1
##  6    462 1993 Q2
##  7    475 1993 Q3
##  8    443 1993 Q4
##  9    421 1994 Q1
## 10    475 1994 Q2
## # ℹ 64 more rows
bricks |> 
  autoplot(Bricks)+
  labs(title = 'Australian bricks production',
       y = 'Bricks (millions)')
## Warning: Removed 20 rows containing missing values or values outside the scale range
## (`geom_line()`).

bricks_fit <- bricks |> 
  model(SNAIVE(Bricks))

bricks_fit |> gg_tsresiduals()
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_line()`).
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 24 rows containing non-finite outside the scale range
## (`stat_bin()`).
## Warning: Removed 24 rows containing missing values or values outside the scale range
## (`geom_rug()`).

The residual plot strongly suggests that residuals do not have zero mean, there is no stable variation with couple of extreme negative residuals around 1997 and 2003 possibly resulting from recessions in economy. From this I already can say residuals are not like a white noise.
AFC plot has a significantly high autocorrelation at lag 1, with several other lags also above confidence bounds. If residuals were white noise, most spikes would lie inside the confidence limits.
The distribution on histogram plot does not appear to be normal, it looks bi-modal with the strong left skew.
Overall these residual diagnostic plots strongly suggest that residuals are not white noise.

bricks_aug <-  augment(bricks_fit)
bricks_aug
## # A tsibble: 74 x 6 [1Q]
## # Key:       .model [1]
##    .model         Quarter Bricks .fitted .resid .innov
##    <chr>            <qtr>  <dbl>   <dbl>  <dbl>  <dbl>
##  1 SNAIVE(Bricks) 1992 Q1    383      NA     NA     NA
##  2 SNAIVE(Bricks) 1992 Q2    404      NA     NA     NA
##  3 SNAIVE(Bricks) 1992 Q3    446      NA     NA     NA
##  4 SNAIVE(Bricks) 1992 Q4    420      NA     NA     NA
##  5 SNAIVE(Bricks) 1993 Q1    394     383     11     11
##  6 SNAIVE(Bricks) 1993 Q2    462     404     58     58
##  7 SNAIVE(Bricks) 1993 Q3    475     446     29     29
##  8 SNAIVE(Bricks) 1993 Q4    443     420     23     23
##  9 SNAIVE(Bricks) 1994 Q1    421     394     27     27
## 10 SNAIVE(Bricks) 1994 Q2    475     462     13     13
## # ℹ 64 more rows
bricks_aug |> 
  features(.innov, ljung_box, lag = 8) |> 
  mutate(
    lb_stat = round(lb_stat, 2),
    lb_pvalue = format(lb_pvalue, scientific = FALSE),
    mean = mean(bricks_aug$.innov, na.rm = TRUE)
  )
## # A tibble: 1 × 4
##   .model         lb_stat lb_pvalue           mean
##   <chr>            <dbl> <chr>              <dbl>
## 1 SNAIVE(Bricks)    62.2 0.0000000001735133 -0.76

As we see p-value from Ljung-Box test is extremely low and almost equals to zero, and confirms that residuals are not like a white noise.
More advanced approaches and models likely perform better than SNAIVE model in this series.

  1. For your retail time series (from Exercise 7 in Section 2.10):

For this exercise I would like to choose “Supermarket and grocery stores” time series.

supermarket <- aus_retail |> 
  filter(Industry == 'Supermarket and grocery stores') |> 
  summarise(Turnover = sum(Turnover))

supermarket
## # A tsibble: 441 x 2 [1M]
##       Month Turnover
##       <mth>    <dbl>
##  1 1982 Apr     919.
##  2 1982 May     906.
##  3 1982 Jun     918.
##  4 1982 Jul     957.
##  5 1982 Aug     909.
##  6 1982 Sep     940.
##  7 1982 Oct     984.
##  8 1982 Nov    1017.
##  9 1982 Dec    1174.
## 10 1983 Jan     945.
## # ℹ 431 more rows

This series have 441 monthly observations of overall Australian supermarket and grocery stores sales turnovers in a time range of April 1982 to November 2018.

supermarket |> 
  autoplot(Turnover)+
  labs(title = 'Australian Supermarket and grocery stores sales',
       y = 'Turnover ($ million AUD)')

supermarket_train <- supermarket |> 
  filter(year(Month) < 2011)

supermarket |> 
  autoplot(Turnover)+
  autolayer(supermarket_train, Turnover, color = 'red')+
  labs(title = 'Time series splitted into train and test sets',
       subtitle = 'Train set before 2011 in red color',
        y = 'Turnover ($ million AUD)' )

The variance of the data proportionally increases with the level of the trend over time, therefore it is appropriate to apply a Log transformation.
There also rising trend with strong evidence of seasonality in this series.
I will fit following models to see how residuals and accuracy behave in these models:

supermarket_fit_decomposed <- supermarket_train |> 
  model(
    stl_sadj = decomposition_model(
      STL(log(Turnover)~ trend(window = 15)+
            season(window = 'periodic'), robust = TRUE
          ),
      NAIVE(season_adjust))
  )
supermarket_fit <- supermarket_train |> 
  model(Snaive_log = SNAIVE(log(Turnover)),
        Snaive = SNAIVE(Turnover),
        STL_LOG_Naive= decomposition_model(
                      STL(log(Turnover)~ trend(window = 15)+
                      season(window = 'periodic'), robust = TRUE
                              ),
                      NAIVE(season_adjust)))
supermarket_fit |>
  select(Snaive) |> 
  gg_tsresiduals()+
  labs(title = 'Residual diagnostic plots for SNAIVE model on original data')

ACF plot reveals that all lags have significantly high autocorrelations, strongly suggesting residuals are not a white noise.
Also residual plot shows that almost all residuals are above zero, which means residuals do not have zero mean. The variance of residuals do not appear to be stable and increasing over time, because of the strong rising trend in the series.
Histogram plot confirms majority of residuals are above zero and there is strong right skeweness.

Overall, these residuals diagnostics strongly suggest that residuals from SNAIVE model on original data are not white noise.

supermarket_fit |> 
  select(Snaive_log) |> 
  gg_tsresiduals()+
  labs(title = 'Residual diagnostic plots for SNAIVE model on log-transformed data')

Residuals variance slightly stabilized compared to previous diagnostics, but still almost all of the residuals are above zero and do not have zero mean.
Log transformation reduced amount of lags with significantly high autocorrelations compared to residuals from model using original data. But there are still several lags well above confidence boundaries confirming that residuals are not white noise.
Distribution of residuals appears to be close to be normal, however it’s centered far right from zero.

Overall log transformation mildly improved SNAIVE model to capture structure of the series, but residual are still not a white noise.

supermarket_fit |> 
  select(STL_LOG_Naive) |> 
  gg_tsresiduals()+
  labs(title = 'Residual diagnostic plots for decomposed Log transformed data')

Residuals appear to have zero mean and more stable variance compared to previous models. However there is a significant spike around 1983.
ACF plot is much improved, but still has several lags with significant autocorrelations.
Histogram of residuals distribution appears to be close to normal centered close to zero.

Overall, NAIVE model fitted with decomposed component showed much improved residual diagnostic results compared to previous models. However, it did not fully captured structure of the series and residual are not a white noise.

supermarket_fc <- supermarket_fit |> 
  forecast(new_data = anti_join(supermarket, 
                                supermarket_train))
## Joining with `by = join_by(Month, Turnover)`
supermarket_fc |>
   autoplot(supermarket) +
  facet_wrap(~.model, nrow = 3, scales = 'free_y')

As we see from this plots, prediction intervals of the model fitted on original almost entirely missed actual test data.
Prediction intervals of the model trained on Log transformed data captures actual values well below 95% probability bound, but with most of actual values are above 80% probability bound.
As we see in the last plot STL_LOG_Naive 80% prediction interval contains majority of the actual values.
Overall, the STL_LOG_Naive model appears to provide the most reasonable forecasts. The actual data points are largely contained within its prediction intervals, and the spread of those intervals looks consistent with the variability of the series. The simple Snaive model on the original data seems too restrictive for a series with strong growth and multiplicative seasonality.

supermarket_fc |> 
  accuracy(supermarket, list(skill = skill_score(CRPS)))
## # A tibble: 3 × 3
##   .model        .type skill
##   <chr>         <chr> <dbl>
## 1 STL_LOG_Naive Test  0.273
## 2 Snaive        Test  0    
## 3 Snaive_log    Test  0.212

As we see from comparisons using skill scores SNAIVE method fitted on log transformed train set is 21.2% better than simple SNAIVE model fitted on original scale data.
NAIVE model applied to STL decomposed component and Log transformed data shows best 27.27% improvement in forecast interval accuracy compared to SNAIVE fitted on original data.
This is consistent with what we observed visually. The original SNAIVE model does not handle the increasing variance and evolving trend very well. Log transformation improves stability, and combining STL decomposition with log transformation gives the most accurate and flexible forecasts intervals.

Overall, the results suggest that using Log or Box-Cox transformations for multiplicative structure (variance proportional change along trend level) and separating trend from seasonality improves forecasting performance for this series.

supermarket_fc |> accuracy(supermarket) |> 
  select(.model, RMSE, MAE, MAPE, MASE)
## # A tibble: 3 × 5
##   .model         RMSE   MAE  MAPE  MASE
##   <chr>         <dbl> <dbl> <dbl> <dbl>
## 1 STL_LOG_Naive 1353. 1195.  14.2  5.82
## 2 Snaive        1564. 1380.  16.5  6.72
## 3 Snaive_log    1467. 1294.  15.4  6.30

From point accuracy measurements we can see that SNAIVE method with original scale data has largest errors.
The SNAIVE with log-transformed data model improves all error measures compared to original SNAIVE. RMSE and MAE are lower, and MAPE drops from about 16.47% to 15.44%.
The STL_LOG_Naive model performs best. It has the lowest RMSE (1353.280), lowest MAE (1194.703), lowest MAPE (14.23%).

So based on these results, STL_LOG_Naive is the most accurate model among the three. The improvement suggest that variance stabilization using log transformation, and separating trend and seasonality with STL decomposition helps to capture the underlying structure of the series better.