# Set CRAN mirror
options(repos = c(CRAN = "https://cloud.r-project.org/"))
install.packages("fpp3")
## 
## The downloaded binary packages are in
##  /var/folders/67/32gwlz7s3z7dm451ptq50z940000gn/T//RtmpbPrcVg/downloaded_packages
install.packages("seasonal")
## 
## The downloaded binary packages are in
##  /var/folders/67/32gwlz7s3z7dm451ptq50z940000gn/T//RtmpbPrcVg/downloaded_packages
library(seasonal)
library(fpp3)
## Registered S3 method overwritten by 'tsibble':
##   method               from 
##   as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.2 ──
## ✔ tibble      3.3.0     ✔ tsibble     1.1.6
## ✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
## ✔ tidyr       1.3.1     ✔ feasts      0.4.2
## ✔ lubridate   1.9.4     ✔ fable       0.4.1
## ✔ ggplot2     4.0.0
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date()    masks base::date()
## ✖ dplyr::filter()      masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval()  masks lubridate::interval()
## ✖ dplyr::lag()         masks stats::lag()
## ✖ tsibble::setdiff()   masks base::setdiff()
## ✖ tsibble::union()     masks base::union()
## ✖ tibble::view()       masks seasonal::view()
library(dplyr)
library(tsibble)

data(global_economy)
head(global_economy)
## # A tsibble: 6 x 9 [1Y]
## # Key:       Country [1]
##   Country     Code   Year         GDP Growth   CPI Imports Exports Population
##   <fct>       <fct> <dbl>       <dbl>  <dbl> <dbl>   <dbl>   <dbl>      <dbl>
## 1 Afghanistan AFG    1960  537777811.     NA    NA    7.02    4.13    8996351
## 2 Afghanistan AFG    1961  548888896.     NA    NA    8.10    4.45    9166764
## 3 Afghanistan AFG    1962  546666678.     NA    NA    9.35    4.88    9345868
## 4 Afghanistan AFG    1963  751111191.     NA    NA   16.9     9.17    9533954
## 5 Afghanistan AFG    1964  800000044.     NA    NA   18.1     8.89    9731361
## 6 Afghanistan AFG    1965 1006666638.     NA    NA   21.4    11.3     9938414
glimpse(global_economy)
## Rows: 15,150
## Columns: 9
## Key: Country [263]
## $ Country    <fct> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",…
## $ Code       <fct> AFG, AFG, AFG, AFG, AFG, AFG, AFG, AFG, AFG, AFG, AFG, AFG,…
## $ Year       <dbl> 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969,…
## $ GDP        <dbl> 537777811, 548888896, 546666678, 751111191, 800000044, 1006…
## $ Growth     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ CPI        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ Imports    <dbl> 7.024793, 8.097166, 9.349593, 16.863910, 18.055555, 21.4128…
## $ Exports    <dbl> 4.132233, 4.453443, 4.878051, 9.171601, 8.888893, 11.258279…
## $ Population <dbl> 8996351, 9166764, 9345868, 9533954, 9731361, 9938414, 10152…

3.1 GDP per capita for each country over time.

library(fpp3)
library(dplyr)
library(tsibble)
# Simple range
cat("Time period:", min(global_economy$Year), "to", max(global_economy$Year), "\n")
## Time period: 1960 to 2017
# Calculate GDP_per_capita column
global_economy <- global_economy %>%
  mutate(GDP_per_capita = GDP / Population)

# Find country with highest GDP per capita
highest_gdp <- global_economy %>%
  filter(!is.na(GDP_per_capita)) %>%
  slice_max(GDP_per_capita, n = 1) %>%
  select(Country, Year, GDP, Population, GDP_per_capita)

print(highest_gdp)
## # A tsibble: 1 x 5 [1Y]
## # Key:       Country [1]
##   Country  Year         GDP Population GDP_per_capita
##   <fct>   <dbl>       <dbl>      <dbl>          <dbl>
## 1 Monaco   2014 7060236168.      38132        185153.
print(paste("The country with the highest GDP per capita is", 
            highest_gdp$Country, 
            "in", 
            highest_gdp$Year, 
            "with $", 
            round(highest_gdp$GDP_per_capita, 0), 
            "per person."))
## [1] "The country with the highest GDP per capita is Monaco in 2014 with $ 185153 per person."
# Plot GDP per capita for all countries over time
global_economy %>%
  filter(!is.na(GDP_per_capita)) %>%
  ggplot(aes(x = Year, y = GDP_per_capita, color = Country)) +
  geom_line() +
  labs(title = "GDP per Capita Over Time by Country",
       x = "Year", 
       y = "GDP per Capita (USD)") +
  theme_minimal() +
  theme(legend.position = "none")

Over the period covered by the dataset, 1960 to 2017, GDP per capita generally increased. This pattern held across countries, and it is most visible for the nations with the highest GDP per capita (the sketch below isolates the top five). The jagged, sawtooth appearance of many series reflects rapid increases followed by declines, so growth was erratic rather than smooth. Despite this volatility, the overall trend in GDP per capita was upward for most countries.
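As a quick check of the claim about the leading economies, the following sketch (not part of the original assignment output, and assuming the GDP_per_capita column created above) isolates the five countries with the highest GDP per capita in the most recent year and plots only those series.

# Identify the five countries with the highest GDP per capita in the most
# recent year of data (a hedged sketch; the country set depends on missing values)
top5 <- global_economy %>%
  as_tibble() %>%
  filter(Year == max(Year), !is.na(GDP_per_capita)) %>%
  slice_max(GDP_per_capita, n = 5) %>%
  pull(Country)

# Plot only the top-5 series so the legend stays readable
global_economy %>%
  filter(Country %in% top5) %>%
  autoplot(GDP_per_capita) +
  labs(title = "GDP per Capita Over Time: Top 5 Countries (latest year)",
       y = "GDP per Capita (USD)") +
  theme_minimal()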

3.2 Analyzing: global_economy, aus_livestock, vic_elec, and aus_production

# Load all required datasets
data(global_economy)
data(aus_livestock)
data(vic_elec)
data(aus_production)

# 1. United States GDP from global_economy
us_gdp <- global_economy %>% 
  filter(Country == "United States")

us_gdp %>%
  ggplot(aes(x = Year, y = GDP)) +
  geom_line(color = "blue", size = 1) +
  labs(title = "United States GDP Over Time", 
       y = "GDP (Billions USD)") +
  theme_minimal()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

# Log transformation for US GDP (to show exponential growth better)
us_gdp %>%
  ggplot(aes(x = Year, y = log(GDP))) +
  geom_line(color = "red", size = 1) +
  labs(title = "United States GDP (Log Scale)", 
       y = "Log(GDP)") +
  theme_minimal()

cat("US GDP shows exponential growth. Log transformation linearizes the trend.\n")
## US GDP shows exponential growth. Log transformation linearizes the trend.
# 2. Victorian Bulls/Bullocks/Steers from aus_livestock
vic_bulls <- aus_livestock %>%
  filter(State == "Victoria", Animal == "Bulls, bullocks and steers")

vic_bulls %>%
  ggplot(aes(x = Month, y = Count)) +
  geom_line(color = "brown", size = 1) +
  labs(title = "Victorian Bulls, Bullocks and Steers Slaughter", 
       y = "Count") +
  theme_minimal()

# Square root transformation for livestock (reduces variability)
vic_bulls %>%
  ggplot(aes(x = Month, y = sqrt(Count))) +
  geom_line(color = "darkgreen", size = 1) +
  labs(title = "Victorian Livestock (Square Root Scale)", 
       y = "sqrt(Count)") +
  theme_minimal()

cat("Livestock data shows seasonality. Square root transformation reduces variance.\n")
## Livestock data shows seasonality. Square root transformation reduces variance.
# 3. Victorian Electricity Demand from vic_elec
vic_elec %>%
  ggplot(aes(x = Time, y = Demand)) +
  geom_line(color = "orange", size = 0.5) +
  labs(title = "Victorian Electricity Demand", 
       y = "Demand (MW)") +
  theme_minimal()

# No transformation needed - already well-scaled
cat("Electricity demand shows clear daily and seasonal patterns. No transformation needed.\n")
## Electricity demand shows clear daily and seasonal patterns. No transformation needed.
# 4. Gas production from aus_production
aus_production %>%
  ggplot(aes(x = Quarter, y = Gas)) +
  geom_line(color = "purple", size = 1) +
  labs(title = "Australian Gas Production", 
       y = "Gas Production") +
  theme_minimal()

# Log transformation for gas (handles growth trend)
aus_production %>%
  ggplot(aes(x = Quarter, y = log(Gas))) +
  geom_line(color = "darkblue", size = 1) +
  labs(title = "Australian Gas Production (Log Scale)", 
       y = "Log(Gas)") +
  theme_minimal()

#### Summary of transformations:

  • US GDP: log transformation, to linearise the exponential growth trend.
  • Victorian bulls, bullocks and steers slaughter: square root transformation, to reduce the variability around the seasonal pattern.
  • Victorian electricity demand: no transformation; the series is already well scaled.
  • Australian gas production: log transformation, to handle the growth trend and the increasing seasonal variation.
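As a cross-check (a hedged sketch, not part of the original output), the Guerrero feature used later in this assignment can also be applied here to see whether the estimated Box-Cox lambdas agree with the transformations chosen above.

# Estimate an optimal Box-Cox lambda for each series as a sanity check
# (lambda near 0 suggests a log, near 0.5 a square root, near 1 no transform)
us_gdp %>% features(GDP, guerrero)
vic_bulls %>% features(Count, guerrero)
aus_production %>% features(Gas, guerrero)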

3.3 A Case Study of Determining the Suitability of a Box-Cox Transformation

This study is conducted by plotting the canadian_gas data and checking whether a Box-Cox transformation is helpful. There are several factors to consider before applying a Box-Cox transformation. The primary requirement is that the variance increases with the level of the series (a quick check for this is sketched below). Other factors that make a strong case for a Box-Cox transformation are multiplicative seasonality (peaks that get larger over time), exponential growth, positive skewness, and improved residual properties after transformation.
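A minimal sketch of that primary check (assuming yearly aggregation is an adequate proxy for level and spread, and that canadian_gas is available because fpp3 was loaded above): if the within-year spread of canadian_gas grows steadily with the yearly level, a Box-Cox transformation is a strong candidate; if the relationship is not monotonic, a single lambda will struggle.

library(fpp3)

# Compare within-year spread to yearly level for canadian_gas
canadian_gas %>%
  index_by(Year = year(Month)) %>%
  summarise(level = mean(Volume), spread = sd(Volume)) %>%
  ggplot(aes(x = level, y = spread)) +
  geom_point() +
  labs(title = "Canadian Gas: Within-Year Spread vs Yearly Level",
       x = "Yearly mean volume", y = "Yearly standard deviation") +
  theme_minimal()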

Analyzing the canadian_gas data

# Load and visualize canadian_gas data
library(fpp3)
data(canadian_gas)

# 1. Original data visualization
canadian_gas %>%
  autoplot() +
  labs(title = "Canadian Monthly Gas Production",
       y = "Gas Production") +
  theme_minimal()
## Plot variable not specified, automatically selected `.vars = Volume`

# 2. Check the data characteristics
canadian_gas %>%
  gg_tsdisplay(plot_type = "partial") +
  labs(title = "Canadian Gas - Time Series Display")
## Warning: `gg_tsdisplay()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsdisplay()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Plot variable not specified, automatically selected `y = Volume`

# 3. Try Box-Cox transformation
lambda <- canadian_gas %>%
  features(Volume, guerrero) %>%
  pull(lambda_guerrero)

cat("Optimal Box-Cox lambda:", round(lambda, 3), "\n")
## Optimal Box-Cox lambda: 0.577
# 4. Apply Box-Cox transformation
canadian_gas %>%
  mutate(bc_transformed = box_cox(Volume, lambda)) %>%
  autoplot(bc_transformed) +
  labs(title = paste("Canadian Gas - Box-Cox Transformed (L =", round(lambda, 3), ")"),
       y = "Box-Cox Transformed Gas Production") +
  theme_minimal()

# 5. Compare original vs transformed
canadian_gas %>%
  mutate(bc_transformed = box_cox(Volume, lambda)) %>%
  pivot_longer(c(Volume, bc_transformed), names_to = "series") %>%
  autoplot(value) +
  facet_wrap(~ series, scales = "free_y") +
  labs(title = "Canadian Gas: Original vs Box-Cox Transformed") +
  theme_minimal()

The ACF plot shows that Canadian gas production is highly seasonal and very persistent. Each month's production is strongly correlated with the same month in previous years: if January was high this year, January was likely high last year and the year before as well.

In the early years (1960s-1970s), the regular seasonal fluctuations are relatively small in amplitude; the ups and downs stay within a fairly narrow band around the trend line. Moving through the late 1970s and 1980s, the timing of the seasonal pattern stays the same, but the magnitude of the seasonal swings grows dramatically.

By the 1990s and 2000s, overall production is higher still, but the amplitude of the seasonal swings has shrunk again. In other words, the variance first increases with the level and then decreases, rather than changing monotonically with it.

Applying the Box-Cox transformation did not noticeably change the shape of the series. Because a single lambda can only correct variance that changes monotonically with the level, the Box-Cox transformation is unhelpful here despite the clearly changing seasonal variation; the sketch below makes the rise and fall of the seasonal amplitude explicit.
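As a hedged sketch (not part of the original answer), extracting the seasonal component with an STL decomposition shows the changing seasonal amplitude directly.

# STL seasonal component of canadian_gas: the seasonal swings grow through
# the 1980s and then shrink again, a non-monotonic pattern that a single
# Box-Cox lambda cannot stabilise
canadian_gas %>%
  model(STL(Volume)) %>%
  components() %>%
  autoplot(season_year) +
  labs(title = "Canadian Gas: STL Seasonal Component",
       y = "Seasonal component") +
  theme_minimal()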

3.4 Monthly Australian retail data is provided in aus_retail. Select one of the time series as follows (but choose your own seed value):

# Load and prepare data
set.seed(08096789)  # Use your chosen seed
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1)) |>
  fill_gaps()  # Fill any gaps first

# 1. Calculate optimal lambda using Guerrero method
lambda <- myseries |>
  features(Turnover, guerrero) |>
  pull(lambda_guerrero)

cat("Optimal Box-Cox L:", round(lambda, 3), "\n")
## Optimal Box-Cox L: 0.371
# 2. Interpret the lambda value
if (abs(lambda - 1) < 0.1) {
  cat("Lambda = 1.0 → NO transformation needed\n")
} else if (abs(lambda - 0.5) < 0.1) {
  cat("Lambda = 0.5 → SQUARE ROOT transformation\n")
} else if (abs(lambda - 0) < 0.1) {
  cat("Lambda = 0.0 → LOG transformation\n")
} else if (lambda > 0.8) {
  cat("Lambda > 0.8 → Minimal transformation needed\n")
} else {
  cat("Lambda =", round(lambda, 3), "→ Use Box-Cox transformation\n")
}
## Lambda = 0.371 → Use Box-Cox transformation
# 3. Visual comparison - Original vs Transformed
comparison_plot <- myseries |>
  mutate(
    Original = Turnover,
    Transformed = box_cox(Turnover, lambda)
  ) |>
  pivot_longer(c(Original, Transformed), 
               names_to = "Type", values_to = "Value") |>
  ggplot(aes(x = Month, y = Value)) +
  geom_line() +
  facet_wrap(~ Type, scales = "free_y", ncol = 1) +
  labs(title = paste("Original vs Box-Cox Transformed (lambda =", round(lambda, 3), ")")) +
  theme_minimal()

print(comparison_plot)

# 4. Decision summary
cat("\nDECISION:\n")
## 
## DECISION:
if (abs(lambda - 1) > 0.2) {
  cat(" USE Box-Cox transformation - lambda is significantly different from 1\n")
  cat("This will help stabilize variance in the retail data\n")
} else {
  cat("SKIP Box-Cox transformation - lambda is close to 1\n")
  cat("Original data already has stable variance\n")
}
##  USE Box-Cox transformation - lambda is significantly different from 1
## This will help stabilize variance in the retail data

Since the estimated lambda is 0.371, which is well away from both 0 and 1, I would apply the Box-Cox transformation with this lambda; it sits roughly between a log and a square root transformation, as illustrated in the sketch below.
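To illustrate what that lambda means, here is a minimal sketch using box_cox() (loaded with fpp3): lambda = 0 corresponds to a log, lambda = 1 leaves the shape of the series unchanged apart from a shift, and the fitted value of about 0.371 sits between a log and a square root.

# Compare the fitted lambda against the familiar special cases
y <- myseries$Turnover
head(cbind(
  log_scale = box_cox(y, 0),      # lambda = 0: log transformation
  fitted    = box_cox(y, lambda), # lambda estimated by Guerrero (about 0.371)
  sqrt_like = box_cox(y, 0.5),    # lambda = 0.5: close to a square root
  unchanged = box_cox(y, 1)       # lambda = 1: shape unchanged (shifted by 1)
))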

3.5 My approach to this question is to plot each series (tobacco from aus_production, economy class passengers between Melbourne and Sydney from ansett, and pedestrian counts at Southern Cross Station from pedestrian), estimate the optimal Box-Cox lambda with the Guerrero method, and compare the original and transformed series.

# Load required data
library(fpp3)
library(dplyr)
library(tsibble)
data(aus_production)
data(ansett) 
data(pedestrian)

#### Data analysis of the Tobacco series from aus_production

tobacco <- aus_production %>%
  select(Quarter, Tobacco) %>%
  filter(!is.na(Tobacco))

# Plot original data
p1 <- tobacco %>%
  autoplot(Tobacco) +
  labs(title = "Original: Tobacco Production",
       y = "Tobacco") +
  theme_minimal()

# Find optimal lambda
lambda_tobacco <- tobacco %>%
  features(Tobacco, guerrero) %>%
  pull(lambda_guerrero)

cat("TOBACCO - Optimal lambda:", round(lambda_tobacco, 3), "\n")
## TOBACCO - Optimal lambda: 0.926
# Apply transformation
tobacco_transformed <- tobacco %>%
  mutate(Tobacco_BC = box_cox(Tobacco, lambda_tobacco))

# Plot transformed
p2 <- tobacco_transformed %>%
  autoplot(Tobacco_BC) +
  labs(title = paste("Box-Cox Transformed: Tobacco (L =", round(lambda_tobacco, 3), ")"),
       y = "Transformed Tobacco") +
  theme_minimal()

# Compare side by side
comparison_tobacco <- tobacco_transformed %>%
  pivot_longer(c(Tobacco, Tobacco_BC), names_to = "Type", values_to = "Value") %>%
  ggplot(aes(x = Quarter, y = Value)) +
  geom_line() +
  facet_wrap(~ Type, scales = "free_y", ncol = 1) +
  labs(title = "Tobacco: Original vs Box-Cox Transformed") +
  theme_minimal()

print(comparison_tobacco)

#### Data analysis of economy class passengers between Melbourne and Sydney from ansett

economy_melb_syd <- ansett %>%
  filter(Airports == "MEL-SYD", Class == "Economy") %>%
  select(Week, Passengers)

# Plot original
p3 <- economy_melb_syd %>%
  autoplot(Passengers) +
  labs(title = "Original: Economy Class MEL-SYD Passengers",
       y = "Passengers") +
  theme_minimal()

# Find lambda
lambda_passengers <- economy_melb_syd %>%
  features(Passengers, guerrero) %>%
  pull(lambda_guerrero)

cat("PASSENGERS - Optimal lambda:", round(lambda_passengers, 3), "\n")
## PASSENGERS - Optimal lambda: 2
# Transform and compare
passengers_transformed <- economy_melb_syd %>%
  mutate(Passengers_BC = box_cox(Passengers, lambda_passengers))

comparison_passengers <- passengers_transformed %>%
  pivot_longer(c(Passengers, Passengers_BC), names_to = "Type", values_to = "Value") %>%
  ggplot(aes(x = Week, y = Value)) +
  geom_line() +
  facet_wrap(~ Type, scales = "free_y", ncol = 1) +
  labs(title = "Economy Passengers: Original vs Box-Cox Transformed") +
  theme_minimal()

print(comparison_passengers)

#### Data analysis of pedestrian counts at Southern Cross Station

southern_cross <- pedestrian %>%
  filter(Sensor == "Southern Cross Station") %>%
  index_by(Date = as_date(Date_Time)) %>%
  summarise(Daily_Count = sum(Count, na.rm = TRUE)) %>%
  filter(Daily_Count > 0)

# Plot original
p5 <- southern_cross %>%
  autoplot(Daily_Count) +
  labs(title = "Original: Southern Cross Station Pedestrian Counts",
       y = "Daily Count") +
  theme_minimal()

# Find lambda
lambda_pedestrian <- southern_cross %>%
  features(Daily_Count, guerrero) %>%
  pull(lambda_guerrero)

cat("PEDESTRIAN - Optimal lambda:", round(lambda_pedestrian, 3), "\n")
## PEDESTRIAN - Optimal lambda: 0.273
# Transform and compare
pedestrian_transformed <- southern_cross %>%
  mutate(Count_BC = box_cox(Daily_Count, lambda_pedestrian))

comparison_pedestrian <- pedestrian_transformed %>%
  pivot_longer(c(Daily_Count, Count_BC), names_to = "Type", values_to = "Value") %>%
  ggplot(aes(x = Date, y = Value)) +
  geom_line() +
  facet_wrap(~ Type, scales = "free_y", ncol = 1) +
  labs(title = "Pedestrian Counts: Original vs Box-Cox Transformed") +
  theme_minimal()

print(comparison_pedestrian)

## 3.7 Gas Production Analysis - Last Five Years

# Load required packages and data
library(fpp3)
data(aus_production)

# Extract last 5 years of gas data
gas <- tail(aus_production, 5*4) |> select(Gas)

# 1. Plot the time series
p1 <- gas |>
  autoplot(Gas) +
  labs(title = "Australian Gas Production - Last 5 Years",
       y = "Gas Production",
       x = "Quarter") +
  theme_minimal()
print(p1)

# 2. Classical decomposition with multiplicative type
gas_decomp <- gas |>
  model(classical_decomposition(Gas, type = "multiplicative")) |>
  components()

# Plot decomposition
p2 <- gas_decomp |>
  autoplot() +
  labs(title = "Classical Multiplicative Decomposition - Gas Production") +
  theme_minimal()
print(p2)
## Ignoring unknown labels:
## • colour : ".model"
## • fill : ".model"
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

print(gas_decomp)
## # A dable: 20 x 7 [1Q]
## # Key:     .model [1]
## # :        Gas = trend * seasonal * random
##    .model                      Quarter   Gas trend seasonal random season_adjust
##    <chr>                         <qtr> <dbl> <dbl>    <dbl>  <dbl>         <dbl>
##  1 "classical_decomposition(G… 2005 Q3   221   NA     1.13  NA              196.
##  2 "classical_decomposition(G… 2005 Q4   180   NA     0.925 NA              195.
##  3 "classical_decomposition(G… 2006 Q1   171  200.    0.875  0.974          195.
##  4 "classical_decomposition(G… 2006 Q2   224  204.    1.07   1.02           209.
##  5 "classical_decomposition(G… 2006 Q3   233  207     1.13   1.000          207.
##  6 "classical_decomposition(G… 2006 Q4   192  210.    0.925  0.987          208.
##  7 "classical_decomposition(G… 2007 Q1   187  213     0.875  1.00           214.
##  8 "classical_decomposition(G… 2007 Q2   234  216.    1.07   1.01           218.
##  9 "classical_decomposition(G… 2007 Q3   245  219.    1.13   0.996          218.
## 10 "classical_decomposition(G… 2007 Q4   205  219.    0.925  1.01           222.
## 11 "classical_decomposition(G… 2008 Q1   194  219.    0.875  1.01           222.
## 12 "classical_decomposition(G… 2008 Q2   229  219     1.07   0.974          213.
## 13 "classical_decomposition(G… 2008 Q3   249  219     1.13   1.01           221.
## 14 "classical_decomposition(G… 2008 Q4   203  220.    0.925  0.996          219.
## 15 "classical_decomposition(G… 2009 Q1   196  222.    0.875  1.01           224.
## 16 "classical_decomposition(G… 2009 Q2   238  223.    1.07   0.993          222.
## 17 "classical_decomposition(G… 2009 Q3   252  225.    1.13   0.994          224.
## 18 "classical_decomposition(G… 2009 Q4   210  226     0.925  1.00           227.
## 19 "classical_decomposition(G… 2010 Q1   205   NA     0.875 NA              234.
## 20 "classical_decomposition(G… 2010 Q2   236   NA     1.07  NA              220.
# 3. Seasonally adjusted data
gas_adjusted <- gas_decomp |>
  select(Quarter, Gas, season_adjust)

p3 <- gas_adjusted |>
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas, color = "Original")) +
  geom_line(aes(y = season_adjust, color = "Seasonally Adjusted")) +
  labs(title = "Original vs Seasonally Adjusted Gas Production",
       y = "Gas Production",
       color = "Series") +
  theme_minimal()
print(p3)

# 4. Add outlier in middle and recompute
gas_outlier_mid <- gas |>
  mutate(Gas_outlier = ifelse(row_number() == 10, Gas + 300, Gas))

gas_decomp_outlier_mid <- gas_outlier_mid |>
  model(classical_decomposition(Gas_outlier, type = "multiplicative")) |>
  components()

p4 <- gas_decomp_outlier_mid |>
  autoplot() +
  labs(title = "Decomposition with Outlier in Middle") +
  theme_minimal()
print(p4)
## Ignoring unknown labels:
## • colour : ".model"
## • fill : ".model"
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

# Compare seasonally adjusted with outlier
comparison_mid <- gas_decomp_outlier_mid |>
  select(Quarter, Gas_outlier, season_adjust) |>
  ggplot(aes(x = Quarter)) +
  geom_line(aes(y = Gas_outlier, color = "With Outlier")) +
  geom_line(aes(y = season_adjust, color = "Seasonally Adjusted")) +
  labs(title = "Effect of Middle Outlier on Seasonally Adjusted Data",
       y = "Gas Production") +
  theme_minimal()
print(comparison_mid)

# 5. Add outlier near end and compare
gas_outlier_end <- gas |>
  mutate(Gas_outlier = ifelse(row_number() == 18, Gas + 300, Gas))

gas_decomp_outlier_end <- gas_outlier_end |>
  model(classical_decomposition(Gas_outlier, type = "multiplicative")) |>
  components()

p5 <- gas_decomp_outlier_end |>
  autoplot() +
  labs(title = "Decomposition with Outlier Near End") +
  theme_minimal()
print(p5)
## Ignoring unknown labels:
## • colour : ".model"
## • fill : ".model"
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_line()`).

## 3.7 Responses

There are consistent lows in Q1 and a strong seasonal pattern overall, suggesting a predictable cycle every year. The trend panel shows a slow increase, then a sharp increase, a plateau, a decrease, and a further plateau. The plots suggest seasonality because the data are tied to the annual calendar and repeat predictably every four quarters.

3.8 Decompose the aus_retail series using X-11. Does it reveal any outliers, or unusual features that you had not noticed previously?

X-11 is an advanced seasonal adjustment method developed by the U.S. Census Bureau that provides more accurate decomposition of time series data than classical methods.

Key advantages of X-11:

  • Uses sophisticated iterative moving averages for more stable trend and seasonal estimates
  • Automatically detects and handles outliers that could distort seasonal patterns
  • Accounts for calendar effects such as trading days, holidays, and varying month lengths

# X-11 decomposition of the aus_retail series

x11_decomp <- myseries |>
  model(x11 = X_13ARIMA_SEATS(Turnover ~ x11())) |>
  components()

# Plot the decomposition
x11_decomp |> autoplot()
## Ignoring unknown labels:
## • colour : "State/Industry/.model"
## • fill : "State/Industry/.model"

# Look for outliers and unusual features
x11_decomp |>
  ggplot(aes(x = Month)) +
  geom_line(aes(y = irregular), color = "red") +
  labs(title = "X-11 Irregular Component - Shows Outliers",
       y = "Irregular Component") +
  theme_minimal()

# Examine the irregular component more closely
x11_decomp |>
  select(Month, irregular) |>
  # The multiplicative irregular component is centred on 1, so measure
  # deviations from 1 rather than from 0
  filter(abs(irregular - 1) > 2 * sd(irregular, na.rm = TRUE))

These results show outliers that were not apparent previously. Some of the most notable spikes in the irregular component are described below; the sketch after the list can be used to pin down the exact months of the largest deviations.

  • Late 1980s: Large spike above 1.15 (significant positive outlier)
  • Mid-1990s: Several spikes around 1.10-1.12
  • Early 2000s: Massive spike above 1.20 (the largest outlier in the series)
  • Mid-2000s: Notable spike around 1.08
  • Various periods: Multiple smaller spikes and dips throughout
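As a follow-up sketch (not in the original), the exact months behind the largest of these spikes can be listed by measuring how far each irregular value sits from 1, the centre of a multiplicative irregular component.

# Five largest deviations of the X-11 irregular component from 1
x11_decomp |>
  as_tibble() |>
  mutate(deviation = abs(irregular - 1)) |>
  slice_max(deviation, n = 5) |>
  select(Month, irregular, deviation)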

3.9a STL Decomposition Analysis: Australian Civilian Labour Force Dynamics (1978-1995)

The STL decomposition shows that growth in the Australian civilian labour force from 1978 to 1995 was dominated by a strong upward trend, rising from approximately 6,500 thousand to about 9,000 thousand persons over the 17-year period. The seasonal component fluctuates only slightly relative to the size of the trend, so seasonal employment patterns have minimal impact on the overall labour force. The remainder component, shown on a much more zoomed-in scale, stays close to zero for most of the period except for a pronounced negative spike around 1991-1992, which likely corresponds to the early 1990s recession, when labour force participation dropped sharply. The seasonal pattern repeats consistently each year, with the labour force typically peaking in March and in October-December and dipping in January and July-August. The differing panel scales highlight that trend growth was the primary driver of labour force changes, with the 1991-1992 recession being the most significant deviation from the normal pattern.
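The series behind Figures 3.19 and 3.20 is not loaded in this document, so as a hedged illustration of how such an STL decomposition is produced with feasts, the retail series selected in 3.4 (myseries) is used here as a stand-in.

# STL decomposition in the style of Figures 3.19-3.20, applied to the
# stand-in retail series (myseries) selected earlier
myseries |>
  model(STL(Turnover ~ trend(window = 7) + season(window = "periodic"),
            robust = TRUE)) |>
  components() |>
  autoplot()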

3.9b Does the graph indicate the recession of 1990/1991?

The recession of 1990/1991 is not clearly visible in the value panel of Figure 3.19, which shows an upward trend through that period. However, the remainder component of the STL decomposition shows a sharp negative spike around the 1991/1992 period, which is indicative of the recession occurring at that time.