Judgmental Forecasting

Author

Ethan Wright

Question 1:

The Delphi method is an iterative, anonymous process that collects forecasts from a panel of experts. Responses are aggregated and fed back to the panel over several survey rounds so that the group converges on a consensus forecast.

Pros of Classic:

Anonymity removes dominance. For example, if Professor Sharma gave his opinion on a topic, we would likely adjust our forecasts toward his; with identities removed, forecasts are less biased. Delphi also iterates, so experts can revise their forecasts when new points of view fill gaps in their initial opinions. Lastly, each opinion is weighted equally when the final round is aggregated, which produces a group forecast rather than a power-structured one.
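The equal-weight aggregation described above can be sketched in R. The panel estimates are made-up numbers for illustration only, and `summarise_round` is a hypothetical helper; using the median and interquartile range as the feedback statistics is an assumption, not something the Delphi method prescribes.

```r
# Hypothetical anonymous estimates from five experts over two rounds
# (illustrative numbers, not real data).
round1 <- c(A = 100, B = 140, C = 120, D = 200, E = 110)
round2 <- c(A = 110, B = 130, C = 120, D = 150, E = 115)

# Summarise one round: identities are dropped and every expert counts equally.
summarise_round <- function(estimates) {
  x <- unname(estimates)
  c(
    median = median(x),
    mean   = mean(x),
    iqr    = IQR(x)
  )
}

# One row of feedback per round; this is what the facilitator would report back.
feedback <- rbind(
  round1 = summarise_round(round1),
  round2 = summarise_round(round2)
)
print(feedback)

# A shrinking spread between rounds suggests the panel is converging.
converging <- feedback["round2", "iqr"] < feedback["round1", "iqr"]
```

In this toy panel the median stays put while the spread narrows, which is the pattern a facilitator would hope to see before stopping.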

Cons of Classic:

As with a market research focus group or survey, the quality of the data collection and summary depends heavily on the moderator. If a round is summarized poorly, the feedback given to the panel in later rounds can be misleading and skew the final forecast. Panel fatigue can also weaken the integrity of the final forecast: experts' time is valuable, and they can lose interest in a long process.

Variants:

Estimate-Talk-Estimate:

Reduces anonymity, but lets panelists interact and discuss the nuances behind their estimates.

Policy Delphi:

Doesn’t produce a forecast, but it identifies policy arguments so that policymakers can better defend/adjust their positions.

Real-Time Delphi:

A faster process that limits panel fatigue.

Mini-Delphi:

Also called estimate-discuss-estimate, this condensed version collects two estimates separated by a discussion. The second estimate reflects a better consensus than the first, while the process stays short for the panel.

Real-World:

Using the case from the .qmd file, Australia’s Tourism Forecasting Committee publishes forecasts twice a year. The panel included stakeholders in Australia’s tourism industry, such as hotel leaders. The panel forecasts actually showed more optimism than the statistical ones. This is likely because the stakeholders were hopeful that their industry would grow, introducing bias into the forecast.

Question 2:

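library(tibble)      # tibble()
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), column_spec()

judgmental <- tibble(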
  Method = c(
    "Forecasting by Analogy",
    "Scenario Forecasting",
    "Sales Force Composite",
    "Executive Opinion",
    "Customer Intentions Survey"
  ),
  Description = c(
    "Finds forecasts using comparable/similar historical products.",
    "Long-run planning under uncertainty, producing many estimates.",
    "Uses sales teams to forecast demand over territories and aggregates.",
    "Uses senior executives to guide forecasts.",
    "Surveys potential customers directly about purchase intent, timing, and price sensitivity."
  ),
  Pros = c(
    "Uses historical comps to guide forecasts",
    "Multiple possible scenarios during uncertainty",
    "Uses people closest to the customers/products",
    "Fast and experienced opinions across business functions",
    "Opinions come from the source of revenue (customers)"
  ),
  Cons = c(
    "Can bias toward well-performing products",
    "No single point forecast. More scenarios can stretch the data or make scenarios inaccurately represent the estimate",
    "Commissions can guide expectations",
    "Hierarchy effect and bias toward performance",
    "Intentions can be misguided"
  ),
  `Performed Well` = c(
    "COVID-19 forecast using the 1918 Spanish Flu",
    "Doctor Strange forecasted 14 mil scenarios",
    "Sales reps forecasting long-standing products with strong customer relationships",
    "Long-term strategy planning due to experience through regimes",
    "Works well when using moderator with no inclination toward either side of the relationship"
  ),
  `Performed Badly` = c(
    "Post-COVID rebound prediction due to differing markets and global supply-chain",
    "Supply chain planning pre-COVID",
    "New product launches, since sales reps don't understand the customer relationship with new products",
    "Where the most senior leader speaks first; introduces bias",
    "Where close relationships guide discussion. People want and give optimistic answers"
  )
)

judgmental |>
  kable(
    col.names = c("Method", "Description", "Pros", "Cons", "Performed Well", "Performed Badly"),
    align = c("l", "l", "l", "l", "l", "l")
  ) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width = TRUE,
    font_size = 12
  ) |>
  column_spec(1, bold = TRUE, width = "13%") |>
  column_spec(2, width = "17%") |>
  column_spec(3, width = "17%") |>
  column_spec(4, width = "17%") |>
  column_spec(5, width = "18%") |>
  column_spec(6, width = "18%")
| Method | Description | Pros | Cons | Performed Well | Performed Badly |
| --- | --- | --- | --- | --- | --- |
| Forecasting by Analogy | Finds forecasts using comparable/similar historical products. | Uses historical comps to guide forecasts | Can bias toward well-performing products | COVID-19 forecast using the 1918 Spanish Flu | Post-COVID rebound prediction due to differing markets and global supply-chain |
| Scenario Forecasting | Long-run planning under uncertainty, producing many estimates. | Multiple possible scenarios during uncertainty | No single point forecast. More scenarios can stretch the data or make scenarios inaccurately represent the estimate | Doctor Strange forecasted 14 mil scenarios | Supply chain planning pre-COVID |
| Sales Force Composite | Uses sales teams to forecast demand over territories and aggregates. | Uses people closest to the customers/products | Commissions can guide expectations | Sales reps forecasting long-standing products with strong customer relationships | New product launches, since sales reps don't understand the customer relationship with new products |
| Executive Opinion | Uses senior executives to guide forecasts. | Fast and experienced opinions across business functions | Hierarchy effect and bias toward performance | Long-term strategy planning due to experience through regimes | Where the most senior leader speaks first; introduces bias |
| Customer Intentions Survey | Surveys potential customers directly about purchase intent, timing, and price sensitivity. | Opinions come from the source of revenue (customers) | Intentions can be misguided | Works well when using moderator with no inclination toward either side of the relationship | Where close relationships guide discussion. People want and give optimistic answers |

Question 3:

1. STL

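Its main limitation is that it only produces additive decompositions, so the observations must first be log-transformed to obtain a multiplicative decomposition. It is a great choice when the data contain outliers, which are common, because a robust fit keeps them from distorting the trend and seasonal estimates.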
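The log-transform step can be sketched with base R's `stl()` and the built-in `AirPassengers` series. The choice of series, `s.window = 13`, and the variable names are assumptions made for illustration.

```r
# AirPassengers ships with base R; its seasonal swings grow with the level,
# so a multiplicative decomposition is the natural fit.
y <- AirPassengers

# STL is additive, so decompose log(y) instead of y.
# robust = TRUE down-weights outliers in the loess fits.
stl_fit <- stl(log(y), s.window = 13, robust = TRUE)

# Additive components on the log scale.
parts <- stl_fit$time.series  # columns: seasonal, trend, remainder

# Back-transform: exp() turns the additive sum into a product of components.
seasonal_mult  <- exp(parts[, "seasonal"])
trend_mult     <- exp(parts[, "trend"])
remainder_mult <- exp(parts[, "remainder"])

# The product of the back-transformed components recovers the original series.
reconstructed <- trend_mult * seasonal_mult * remainder_mult
max_error <- max(abs(reconstructed - y))
```

The extra step is small: one `log()` before the fit and one `exp()` per component afterwards.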

2. X-11

Works well for industry statistics and for series with irregular patterns (stocks). It handles changing seasonality well.

3. SEATS

It’s a step up from the standard ARIMA, but it’s hard to interpret and, because it only works with quarterly and monthly data, it can’t handle daily data.

4. Classical Decomposition

It holds the seasonal component fixed, repeating the same pattern every year, so it can’t model seasonality that changes over time, which is common in practice.
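The fixed seasonal pattern can be seen directly with base R's `decompose()`. This is a minimal sketch that assumes the built-in `AirPassengers` series as the example data; the variable names are illustrative.

```r
# Classical multiplicative decomposition of a monthly series from base R.
classical_fit <- decompose(AirPassengers, type = "multiplicative")

# One seasonal index per month, estimated once for the whole series.
seasonal_indices <- classical_fit$figure

# Reshape the seasonal component: rows are months, columns are years.
seasonal_by_year <- matrix(classical_fit$seasonal, nrow = 12)

# Each month's seasonal value is exactly the same in every year,
# so any drift in the true seasonal pattern cannot be captured.
identical_across_years <- all(
  apply(seasonal_by_year, 1, function(month) diff(range(month)) == 0)
)
```

Running the same check on the seasonal component of an STL fit with a finite `s.window` would generally return `FALSE`, since STL lets the seasonal pattern evolve.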