Hierarchical Time Series Forecasting

I. Nesting vs Grouping (Crossed Structures)

1. Why This Matters

Many datasets in econometrics, panel data analysis, and hierarchical modeling involve structured observations:
individuals within groups, or observations classified along several dimensions.

Understanding whether these structures are nested or crossed determines:

How we model random effects / fixed effects
How we cluster standard errors
How we interpret within- and between-group variation

2. Nested Data Structures

Definition

A nested structure means smaller units are fully contained within larger units.
Each lower-level unit belongs to exactly one higher-level unit.

\[ \text{Level 3: Country} \;\supset\; \text{Level 2: Region} \;\supset\; \text{Level 1: City} \]

Key Properties

Feature	Explanation
Belonging	Many-to-one: each lower-level unit has exactly one parent
Dependence	Observations within the same group are correlated
Model implication	Use hierarchical or multilevel models with random intercepts/slopes
Typical notation	`a/b/c` (e.g., country/region/city)

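Economic Examples

Example	Interpretation
Students within Schools	Each student belongs to one school
Cities within Regions within Countries	Strict geographic hierarchy
Years within Firms	Firm-year observations nested within individual firms (panel data)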

Model Form

\[ Y_{ij} = \beta_0 + u_j + \beta_1 X_{ij} + e_{ij} \] where \(u_j\) = group-level (nested) random effect.
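
One consequence of the random-intercept form is worth spelling out. As a sketch, assume (this is not stated above) that \(u_j\) and \(e_{ij}\) are mutually independent with variances \(\sigma_u^2\) and \(\sigma_e^2\). Two different observations \(i \neq i'\) in the same group \(j\) then share \(u_j\), so they are correlated, and the share of variance due to the group is the intraclass correlation \(\rho\):

\[ \operatorname{Cov}(Y_{ij}, Y_{i'j} \mid X) = \sigma_u^2, \qquad \rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2} \]

This is why standard errors are usually clustered at the group level when the data are nested.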

3. Grouped or Crossed Data Structures

Definition

A grouped (crossed) structure means units are classified along multiple independent dimensions.
Each observation can belong to several groups simultaneously.

Example: \( \text{City} \times \text{Brand} \), where sales are observed for each brand in each city.

Key Properties

Feature	Explanation
Belonging	Many-to-many relationship
Independence	Observations share multiple group memberships
Model implication	Use crossed random effects (or two-way fixed effects)
Typical notation	`a * b` (e.g., city * brand)

Economic Examples

Example	Interpretation
Sales by City × Brand	Each brand is sold in many cities, and each city sells many brands
Individuals across Firms × Industries	Each individual is classified by both firm and industry

Model Form

\[ Y_{ij} = \beta_0 + u_i^{(city)} + v_j^{(brand)} + e_{ij} \] where \(u_i\), \(v_j\) = crossed random effects.

4. Mixed (Nested + Crossed) Structures

Sometimes, you have both:

\[ (\text{country/region/city}) * (\text{brand/product}) \]

Interpretation:

Cities are nested within regions and countries.
Products are nested within brands.
The two hierarchies cross each other → mixed structure.

Example:

Country	Region	City	Brand	Product	Sales
USA	California	Los Angeles	Nike	Air Max	300
USA	California	Los Angeles	Apple	iPhone	500
UK	England	London	Nike	Air Max	220

Model Form:

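\[ Y_{jkmbp} = \beta_0 + u_{\text{country}_j} + u_{\text{region}_{k(j)}} + u_{\text{city}_{m(k,j)}} + v_{\text{brand}_b} + v_{\text{product}_{p(b)}} + e_{jkmbp} \]

Here \(k(j)\) denotes region \(k\) within country \(j\), \(m(k,j)\) city \(m\) within that region, and \(p(b)\) product \(p\) within brand \(b\).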

5. Summary Table

Concept	Relationship	Example	Model Type	Typical Notation
Nesting	One unit belongs exclusively to another	Students within schools	Multilevel / hierarchical	`a/b/c`
Grouping (Crossed)	Units classified by multiple factors	City × Brand sales	Crossed random / two-way FE	`a * b`
Mixed	Nested hierarchies crossed with another	(Country/Region/City) × (Brand/Product)	Mixed-effects	`(a/b) * (c/d)`
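
Whether two classification variables are nested or crossed can be checked directly from the data. The following is a minimal base R sketch: `is_nested()` is a hypothetical helper (not from any package), and the toy data frame is illustrative only, loosely based on the city and brand example above.

```r
# Hypothetical helper: TRUE when every level of `inner` occurs under exactly
# one level of `outer`, i.e. `inner` is nested within `outer`.
is_nested <- function(inner, outer) {
  pairs <- unique(data.frame(inner = inner, outer = outer))
  !any(duplicated(pairs$inner))
}

# Toy data (illustrative values only)
sales <- data.frame(
  city   = c("Los Angeles", "Los Angeles", "London", "London"),
  region = c("California", "California", "England", "England"),
  brand  = c("Nike", "Apple", "Nike", "Apple")
)

city_in_region <- is_nested(sales$city, sales$region)  # each city has one region
brand_in_city  <- is_nested(sales$brand, sales$city)   # brands appear in several cities

cat("city nested in region:", city_in_region, "\n")
cat("brand nested in city: ", brand_in_city, "\n")
```

A `FALSE` result only says the pair is not nested; whether it is fully crossed depends on how many of the possible combinations are actually observed.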

6. Key Takeaways

Nesting = contained within (hierarchy).
→ many-to-one belonging (tree structure).
Grouping = classified by (intersection).
→ many-to-many relationship (grid structure).
Mixed structures combine both.
→ e.g., “Cities within Countries × Brands across Markets.”

Econometric translation:

Data Type	Model Framework
Time within Firm	Fixed or random effects (nested)
Individuals across Firms × Industries	Two-way fixed effects (crossed)
Schools within Districts × Teachers across Classes	Mixed / Crossed multilevel

Nesting forms hierarchies; grouping forms grids.
Econometric models must reflect which structure your data actually follow.

II. Introduction to Hierarchical Time Series

Hierarchical time series are collections of related time series organized in a hierarchical structure. These structures naturally arise when data can be disaggregated by different categorical variables or attributes.

Key Concepts

Hierarchy: A nested structure where series at higher levels are the sum of series at lower levels. For example:

- Total tourism → State level → Purpose of travel

Grouped Time Series: A generalization of hierarchies where series can be disaggregated in multiple ways that don’t nest perfectly. Series can be grouped by different attributes simultaneously.

Example: In the tourism data:

Hierarchy: State / Region / Purpose (State contains Regions)
Grouped: State × Purpose AND Region × Purpose (cross-classification)

- You can view by State + Purpose: “NSW Business”, “NSW Holiday”…

- You can view by Region + Purpose: “Sydney Business”, “Melbourne Business”…

- These groupings overlap but don’t nest cleanly

Why Use Hierarchical Forecasting?

Coherence: Ensures forecasts are mathematically consistent (lower-level forecasts sum to upper-level forecasts)
Information sharing: Allows information to flow between levels
Flexibility: Can forecast at any level while maintaining consistency
Improved accuracy: Often improves forecast accuracy through reconciliation
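
Coherence can be made concrete with a summing matrix \(S\) that maps the bottom-level series to every series in the structure. The base R sketch below uses a hypothetical toy hierarchy (two states, two purposes each) with made-up numbers; the series names and the helper `is_coherent()` are assumptions for illustration.

```r
# Toy hierarchy (hypothetical names): Total -> states A, B -> two purposes each
bottom <- c("A_Bus", "A_Hol", "B_Bus", "B_Hol")
S <- rbind(
  c(1, 1, 1, 1),  # Total
  c(1, 1, 0, 0),  # State A
  c(0, 0, 1, 1),  # State B
  diag(4)         # each bottom-level series maps to itself
)
dimnames(S) <- list(c("Total", "A", "B", bottom), bottom)

# Coherent values are built by aggregating the bottom level: y = S b
b <- c(A_Bus = 40, A_Hol = 60, B_Bus = 30, B_Hol = 70)
y <- drop(S %*% b)

# Check whether a full vector of values respects the aggregation constraints
is_coherent <- function(y, S) {
  b <- y[colnames(S)]
  isTRUE(all.equal(unname(y), unname(drop(S %*% b))))
}

# Forecasts produced independently at each level typically break the constraints
y_hat <- y
y_hat["Total"] <- 210

print(y)
cat("y coherent:    ", is_coherent(y, S), "\n")
cat("y_hat coherent:", is_coherent(y_hat, S), "\n")
```

Reconciliation methods differ in how they turn an incoherent vector such as `y_hat` into one that passes this check.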

Reconciliation Methods

Bottom-Up

Forecast only the bottom-level series
Sum these forecasts to get higher-level forecasts
Pros: Simple, no information loss at bottom level
Cons: Ignores potentially useful information at aggregate levels

Top-Down

Forecast only the top-level series
Disaggregate to the lower levels using historical or forecast proportions
Pros: Can be more stable for volatile bottom-level series
Cons: Loses disaggregate-level information
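
As a rough illustration of top-down disaggregation, the base R sketch below splits a total forecast using the average of historical proportions, which is only one of several possible choices. All numbers and names (`history`, `total_fc`) are made up for the example.

```r
# Hypothetical history: rows are quarters, columns are bottom-level series
history <- cbind(A = c(40, 44, 36), B = c(60, 56, 64))
total_history <- rowSums(history)

# Average historical proportions: mean over time of each series' share of the total
p <- colMeans(history / total_history)

# Disaggregate a (hypothetical) forecast of the total
total_fc <- 120
td_fc <- p * total_fc

print(p)
print(td_fc)
```

Because the proportions sum to one, the disaggregated forecasts add back up to the total forecast by construction.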

Middle-Out

Forecast at a chosen middle level
Sum upward (bottom-up) for the levels above and disaggregate by proportions (top-down) for the levels below
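
A minimal base R sketch of the middle-out idea, assuming a hypothetical three-level structure (total, state, state by purpose) with forecasts made at the state level. The forecasts and the within-state shares are invented for illustration.

```r
# Middle-level forecasts (hypothetical): one per state
state_fc <- c(A = 100, B = 80)

# Hypothetical within-state shares of each purpose; each row sums to 1
shares <- rbind(
  A = c(Business = 0.3, Holiday = 0.7),
  B = c(Business = 0.5, Holiday = 0.5)
)

# Upward: aggregate the middle level to get the total (bottom-up step)
total_fc <- sum(state_fc)

# Downward: split each state's forecast by its shares (top-down step);
# row i of `shares` is multiplied by state_fc[i]
bottom_fc <- shares * state_fc

print(total_fc)
print(bottom_fc)
```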

Optimal Reconciliation (MinT)

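Uses base forecasts from all levels of the structure
Finds the linear combination of base forecasts that minimizes the total variance (the trace of the covariance matrix) of the reconciled forecast errors
Variants differ in how the base forecast error covariance is approximated: OLS, WLS, and MinT with a sample or shrinkage covariance estimate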
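
The linear algebra behind these methods can be sketched in a few lines of base R. With summing matrix \(S\), base forecasts \(\hat{y}\), and a weight matrix \(W\) standing in for the base forecast error covariance, the reconciled forecasts are \(\tilde{y} = S (S' W^{-1} S)^{-1} S' W^{-1} \hat{y}\). The function `reconcile_linear()` and the numbers below are hypothetical; with the default \(W = I\) the sketch reduces to OLS reconciliation, and it is not the fabletools implementation.

```r
# Smallest possible hierarchy: Total = A + B
S <- rbind(Total = c(1, 1), A = c(1, 0), B = c(0, 1))
colnames(S) <- c("A", "B")

# Hypothetical helper: linear reconciliation with weight matrix W
# (W = identity gives OLS; a diagonal W gives a WLS variant)
reconcile_linear <- function(y_hat, S, W = diag(nrow(S))) {
  W_inv <- solve(W)
  G <- solve(t(S) %*% W_inv %*% S) %*% t(S) %*% W_inv
  setNames(drop(S %*% G %*% y_hat), rownames(S))
}

# Incoherent base forecasts: 40 + 60 is not 105
y_hat <- c(Total = 105, A = 40, B = 60)
y_tilde <- reconcile_linear(y_hat, S)

print(round(y_tilde, 3))
cat("coherent:", isTRUE(all.equal(y_tilde[["Total"]],
                                  y_tilde[["A"]] + y_tilde[["B"]])), "\n")
```

In this toy case OLS spreads the discrepancy of 5 evenly: the total is lowered by 5/3 and each bottom-level forecast is raised by 5/3. Unlike bottom-up or top-down, every base forecast contributes to the result.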

Example: Australian Tourism Data

Data Structure

library(fpp3)

library(dplyr)

remove(list=ls())

# Load tourism data
df <- tsibble::tourism

Examine the hierarchy structure (using `length`, `unique` and `cat` commands)

Time periods: 80 quarters

States: 8 states

Regions: 76 regions

Purpose categories: 4

Expected series (State × Purpose): 32
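
The code that produced the counts above is not shown. One way to obtain them with `length`, `unique`, and `cat` is sketched below in base R. `describe_structure()` is a hypothetical helper, and the toy data frame only mimics the column names of the tourism data so that the block runs on its own.

```r
# Hypothetical helper: count distinct values of each key column and report them.
# Assumes columns named Quarter, State, Region and Purpose.
describe_structure <- function(data) {
  n <- c(
    quarters = length(unique(data$Quarter)),
    states   = length(unique(data$State)),
    regions  = length(unique(data$Region)),
    purposes = length(unique(data$Purpose))
  )
  cat("Time periods:", n[["quarters"]], "quarters\n")
  cat("States:", n[["states"]], "states\n")
  cat("Regions:", n[["regions"]], "regions\n")
  cat("Purpose categories:", n[["purposes"]], "\n")
  cat("Expected series (State x Purpose):", n[["states"]] * n[["purposes"]], "\n")
  invisible(n)
}

# Toy stand-in for the tourism data (illustrative values only)
toy <- expand.grid(
  Quarter = c("1998 Q1", "1998 Q2"),
  Region  = c("Sydney", "Hunter", "Melbourne"),
  Purpose = c("Business", "Holiday"),
  stringsAsFactors = FALSE
)
toy$State <- ifelse(toy$Region == "Melbourne", "Victoria", "New South Wales")

counts <- describe_structure(toy)
```

Applied to the tourism data as `describe_structure(df)`, the same calls should report the 80 quarters, 8 states, 76 regions, and 4 purposes listed above.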

Understanding the Hierarchy

The tourism data has a natural hierarchy:

- Top level: Total Australian tourism

- Level 1: Disaggregated by State (8 states/territories)

- Level 2 (bottom): Each state disaggregated by Purpose (4 categories: Business, Holiday, Visiting, Other), giving 8 × 4 = 32 series

Creating the Hierarchical Structure

# Create hierarchical time series using aggregate_key
tourism_hier <- tourism %>%
  aggregate_key(State / Purpose, Trips = sum(Trips))

# View the structure
tourism_hier

# A tsibble: 3,280 x 4 [1Q]
# Key:       State, Purpose [41]
   Quarter State        Purpose       Trips
     <qtr> <chr*>       <chr*>        <dbl>
 1 1998 Q1 <aggregated> <aggregated> 23182.
 2 1998 Q2 <aggregated> <aggregated> 20323.
 3 1998 Q3 <aggregated> <aggregated> 19827.
 4 1998 Q4 <aggregated> <aggregated> 20830.
 5 1999 Q1 <aggregated> <aggregated> 22087.
 6 1999 Q2 <aggregated> <aggregated> 21458.
 7 1999 Q3 <aggregated> <aggregated> 19914.
 8 1999 Q4 <aggregated> <aggregated> 20028.
 9 2000 Q1 <aggregated> <aggregated> 22339.
10 2000 Q2 <aggregated> <aggregated> 19941.
# ℹ 3,270 more rows

# The aggregate_key function creates:
# - Aggregated series at all levels
# - Special <aggregated> labels for higher levels

tourism_hier %>%
  as_tibble() %>%
  distinct(State, Purpose) %>%
  arrange(State, Purpose) %>%
  print(n = 50)

# A tibble: 41 × 2
   State              Purpose     
   <chr*>             <chr*>      
 1 ACT                Business    
 2 ACT                Holiday     
 3 ACT                Other       
 4 ACT                Visiting    
 5 ACT                <aggregated>
 6 New South Wales    Business    
 7 New South Wales    Holiday     
 8 New South Wales    Other       
 9 New South Wales    Visiting    
10 New South Wales    <aggregated>
11 Northern Territory Business    
12 Northern Territory Holiday     
13 Northern Territory Other       
14 Northern Territory Visiting    
15 Northern Territory <aggregated>
16 Queensland         Business    
17 Queensland         Holiday     
18 Queensland         Other       
19 Queensland         Visiting    
20 Queensland         <aggregated>
21 South Australia    Business    
22 South Australia    Holiday     
23 South Australia    Other       
24 South Australia    Visiting    
25 South Australia    <aggregated>
26 Tasmania           Business    
27 Tasmania           Holiday     
28 Tasmania           Other       
29 Tasmania           Visiting    
30 Tasmania           <aggregated>
31 Victoria           Business    
32 Victoria           Holiday     
33 Victoria           Other       
34 Victoria           Visiting    
35 Victoria           <aggregated>
36 Western Australia  Business    
37 Western Australia  Holiday     
38 Western Australia  Other       
39 Western Australia  Visiting    
40 Western Australia  <aggregated>
41 <aggregated>       <aggregated>

41 * 80 # 41 series * 80 quarters

[1] 3280

Creating a Grouped Time Series Structure

# Create GROUPED time series: State * Purpose (cross-classification)
tourism_grouped <- tourism %>%
  aggregate_key(State * Purpose, Trips = sum(Trips))

tourism_grouped

# A tsibble: 3,600 x 4 [1Q]
# Key:       State, Purpose [45]
   Quarter State        Purpose       Trips
     <qtr> <chr*>       <chr*>        <dbl>
 1 1998 Q1 <aggregated> <aggregated> 23182.
 2 1998 Q2 <aggregated> <aggregated> 20323.
 3 1998 Q3 <aggregated> <aggregated> 19827.
 4 1998 Q4 <aggregated> <aggregated> 20830.
 5 1999 Q1 <aggregated> <aggregated> 22087.
 6 1999 Q2 <aggregated> <aggregated> 21458.
 7 1999 Q3 <aggregated> <aggregated> 19914.
 8 1999 Q4 <aggregated> <aggregated> 20028.
 9 2000 Q1 <aggregated> <aggregated> 22339.
10 2000 Q2 <aggregated> <aggregated> 19941.
# ℹ 3,590 more rows

# This creates all combinations:
# - Total (all aggregated)
# - By State only (Purpose aggregated)
# - By Purpose only (State aggregated)  
# - By State AND Purpose (bottom level, no aggregation)

# Compare row counts (number of series × 80 quarters)
cat("Hierarchical rows:", NROW(tourism_hier), "\n")

Hierarchical rows: 3280

cat("Grouped rows:", NROW(tourism_grouped), "\n")

Grouped rows: 3600

Difference: the grouped structure adds 4 purpose-only series (Purpose totals aggregated over State), giving 45 series instead of 41.

4 purpose-only series × 80 quarters = 320 additional rows (3,600 - 3,280).

Total

├── By State (aggregated over Purpose)

├── By Purpose (aggregated over State)

└── By State × Purpose (bottom level)
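
The series and row counts in both structures follow from simple arithmetic, reproduced here as a small base R check. The variable names are illustrative; the inputs are the counts reported earlier.

```r
n_state   <- 8   # states and territories
n_purpose <- 4   # purpose categories
n_quarter <- 80  # time periods

# State / Purpose: total + states + state-by-purpose series
n_hier <- 1 + n_state + n_state * n_purpose

# State * Purpose: the same series plus the purpose-only totals
n_grouped <- n_hier + n_purpose

counts <- c(
  hier_series    = n_hier,
  grouped_series = n_grouped,
  hier_rows      = n_hier * n_quarter,
  grouped_rows   = n_grouped * n_quarter
)
print(counts)
```

These match the key counts (41 and 45) and row counts (3,280 and 3,600) printed for `tourism_hier` and `tourism_grouped`.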

Fitting Base Models

# Fit ETS models to ALL levels of the hierarchy
tourism_fit <- tourism_hier %>%
  model(base = ETS(Trips))

# This fits a separate model to every series; forecasts from these models are generally not coherent
tourism_fit

# A mable: 41 x 3
# Key:     State, Purpose [41]
   State           Purpose              base
   <chr*>          <chr*>            <model>
 1 ACT             Business     <ETS(M,N,M)>
 2 ACT             Holiday      <ETS(M,N,A)>
 3 ACT             Other        <ETS(M,N,N)>
 4 ACT             Visiting     <ETS(M,N,N)>
 5 ACT             <aggregated> <ETS(M,A,N)>
 6 New South Wales Business     <ETS(M,N,A)>
 7 New South Wales Holiday      <ETS(M,N,A)>
 8 New South Wales Other        <ETS(A,N,N)>
 9 New South Wales Visiting     <ETS(A,N,A)>
10 New South Wales <aggregated> <ETS(A,N,A)>
# ℹ 31 more rows

Reconciliation with MinT

# Reconcile forecasts to ensure coherence
reconciled <- tourism_fit %>%
  reconcile(
    bu = bottom_up(base),           # Bottom-up reconciliation
    td = top_down(base),             # Top-down reconciliation  
    mint = min_trace(base, method = "mint_shrink")  # MinT with shrinkage
  )

reconciled

# A mable: 41 x 6
# Key:     State, Purpose [41]
   State          Purpose            base bu           td           mint        
   <chr*>         <chr*>          <model> <model>      <model>      <model>     
 1 ACT          … Business   <ETS(M,N,M)> <ETS(M,N,M)> <ETS(M,N,M)> <ETS(M,N,M)>
 2 ACT          … Holiday    <ETS(M,N,A)> <ETS(M,N,A)> <ETS(M,N,A)> <ETS(M,N,A)>
 3 ACT          … Other      <ETS(M,N,N)> <ETS(M,N,N)> <ETS(M,N,N)> <ETS(M,N,N)>
 4 ACT          … Visiting   <ETS(M,N,N)> <ETS(M,N,N)> <ETS(M,N,N)> <ETS(M,N,N)>
 5 ACT          … <aggregat… <ETS(M,A,N)> <ETS(M,A,N)> <ETS(M,A,N)> <ETS(M,A,N)>
 6 New South Wal… Business   <ETS(M,N,A)> <ETS(M,N,A)> <ETS(M,N,A)> <ETS(M,N,A)>
 7 New South Wal… Holiday    <ETS(M,N,A)> <ETS(M,N,A)> <ETS(M,N,A)> <ETS(M,N,A)>
 8 New South Wal… Other      <ETS(A,N,N)> <ETS(A,N,N)> <ETS(A,N,N)> <ETS(A,N,N)>
 9 New South Wal… Visiting   <ETS(A,N,A)> <ETS(A,N,A)> <ETS(A,N,A)> <ETS(A,N,A)>
10 New South Wal… <aggregat… <ETS(A,N,A)> <ETS(A,N,A)> <ETS(A,N,A)> <ETS(A,N,A)>
# ℹ 31 more rows

Generating Forecasts

# Generate 2-year ahead forecasts
tourism_fc <- reconciled %>%
  forecast(h = "2 years")

tourism_fc

# A fable: 1,312 x 6 [1Q]
# Key:     State, Purpose, .model [164]
   State  Purpose  .model Quarter
   <chr*> <chr*>   <chr>    <qtr>
 1 ACT    Business base   2018 Q1
 2 ACT    Business base   2018 Q2
 3 ACT    Business base   2018 Q3
 4 ACT    Business base   2018 Q4
 5 ACT    Business base   2019 Q1
 6 ACT    Business base   2019 Q2
 7 ACT    Business base   2019 Q3
 8 ACT    Business base   2019 Q4
 9 ACT    Business bu     2018 Q1
10 ACT    Business bu     2018 Q2
# ℹ 1,302 more rows
# ℹ 2 more variables: Trips <dist>, .mean <dbl>

Visualizing Reconciled Forecasts

# Plot forecasts for Queensland by purpose
tourism_fc %>%
  filter(State == "Queensland") %>% # try removing this filter to plot every state
  autoplot(tourism_hier, level = 95) +
  labs(
    title = "Reconciled Forecasts: Queensland Tourism by Purpose",
    y = "Trips ('000)",
    x = "Quarter"
  ) +
  facet_wrap(~ Purpose, scales = "free_y", ncol = 2) +
  theme_minimal()

Comparing Reconciliation Methods

# Compare different reconciliation methods for total Queensland
tourism_fc %>%
  filter(State == "Queensland", is_aggregated(Purpose)) %>% # try removing this filter to compare other series
  autoplot(tourism_hier, level = NULL) +
  labs(
    title = "Comparison of Reconciliation Methods: Total Queensland Tourism",
    y = "Trips ('000)",
    x = "Quarter"
  ) +
  theme_minimal()

Key Functions Reference

Creating Hierarchies

aggregate_key(): Creates hierarchical or grouped structures with aggregations
Syntax: aggregate_key(var1 / var2 / ..., measure = sum(measure)) for nested keys; use var1 * var2 for crossed (grouped) keys

Reconciliation Methods

bottom_up(): Bottom-up reconciliation
top_down(): Top-down reconciliation
middle_out(): Middle-out reconciliation
min_trace(): Optimal reconciliation (MinT)
- Methods: "ols", "wls_struct", "wls_var", "mint_cov", "mint_shrink"

Workflow

Create hierarchical structure with aggregate_key()
Fit models at all levels with model()
Reconcile forecasts with reconcile()
Generate forecasts with forecast()
Evaluate and visualize

Best Practices

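Reconcile whenever coherence matters: Base forecasts at different levels are rarely coherent on their own
Consider MinT for accuracy: It often, though not always, gives the most accurate reconciled forecasts, so compare it with bottom-up and top-down on held-out data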
Check residuals: Ensure base models are well-specified
Consider computational cost: MinT with many bottom-level series can be slow
Visualize multiple levels: Check forecasts make sense at all levels

References

Hyndman, R.J., & Athanasopoulos, G. (2021). Forecasting: principles and practice (3rd ed.). OTexts: Melbourne, Australia. OTexts.com/fpp3
Wickramasuriya, S. L., Athanasopoulos, G., & Hyndman, R. J. (2019). Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization. Journal of the American Statistical Association, 114(526), 804-819.
