a) If data is available only for the last month, how does this affect the choice of model-based vs. data-driven methods?

If there’s only a month’s worth of data to work with, it’s typically smarter to use a model-based forecasting method, since “Model-based methods are especially advantageous when the series at hand is very short” (Shmueli 69).

b) The clinic has access to the admissions data of a nearby hospital. Under what conditions will including the hospital information be potentially useful for forecasting the clinic’s daily visits?

If the nearby hospital operates similarly to the clinic being analyzed, for example with similar hours, including its admissions data could be useful. For instance, if Maine Medical Center had little admissions data of its own but Mercy had more, it wouldn’t be too much of a stretch to use Mercy’s data to get a rough estimate of the numbers.

c) Thus far, the clinic administrator takes a heuristic approach, using the visit numbers from the same day of the previous week as a forecast. What is the advantage of this approach? What is the disadvantage?

The advantage is that you get a baseline to work with using a simple, naïve approach. Chances are Monday admissions stay within a certain range of numbers most of the time. The disadvantage is that you’re not using external information, such as holidays or school vacations, that can have a strong impact on the numbers.
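As a rough illustration of the administrator’s heuristic, here is a minimal Python sketch. The function name `seasonal_naive_forecast`, the `season_length` parameter, and the visit counts are all made up for the example; they are not from the clinic’s data or the textbook.

```python
def seasonal_naive_forecast(visits, season_length=7):
    """Forecast the next day's visits as the count from one season earlier.

    With daily data and season_length=7 this is the value from the
    same weekday of the previous week.
    """
    if len(visits) < season_length:
        raise ValueError("need at least one full season of history")
    return visits[-season_length]


# Hypothetical daily visit counts for two weeks, Monday through Sunday.
daily_visits = [52, 47, 45, 44, 50, 30, 25,
                55, 46, 43, 45, 49, 31, 24]

# The next day is a Monday, so the forecast is last Monday's count.
next_monday = seasonal_naive_forecast(daily_visits)
print(next_monday)  # 55
```

The sketch also shows the disadvantage: nothing in it knows whether next Monday is a holiday.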

d) What level of automation appears to be required for this task? Explain.

Data-driven methods are typically better suited to automation, so a model-based forecast would be harder to automate. The current heuristic, however, makes few assumptions and uses only the visit count from the same day of the previous week. That makes it easy to automate, because nobody has to keep checking whether the assumptions are still being met.

e) Describe two approaches for improving the current heuristic (naïve) forecasting approach using ensembles.

Simply averaging the forecasts produced by multiple forecasting methods is one way to improve the current approach, and it often results in more accurate predictions. They could also use a weighted average of those forecasts, which may improve forecast accuracy further.
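The two ensemble ideas can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the three forecasts and the weights are hypothetical, and in practice the weights would have to come from something like each method’s past accuracy.

```python
def simple_average(forecasts):
    """Combine forecasts by giving every method equal weight."""
    return sum(forecasts) / len(forecasts)


def weighted_average(forecasts, weights):
    """Combine forecasts, giving larger weights to more trusted methods."""
    if len(forecasts) != len(weights):
        raise ValueError("forecasts and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(f * w for f, w in zip(forecasts, weights))
    return weighted_sum / total_weight


# Hypothetical forecasts for the same day from three different methods.
forecasts = [50.0, 54.0, 58.0]
# Hypothetical weights: the third method is trusted twice as much.
weights = [1.0, 1.0, 2.0]

print(simple_average(forecasts))             # 54.0
print(weighted_average(forecasts, weights))  # 55.0
```

Dividing by the sum of the weights means the weights do not need to add up to one.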

Question Two

a-b) For each of the four methods, describe whether it is model-based, data-driven, or a combination, and whether it is based on extrapolation, causal modeling, correlation modeling, or a combination.

Persistence Method aka Naïve Predictor: Data-driven, extrapolation
Physical Approach: Model-based, causal modeling
Statistical Approach: Data-driven, extrapolation
Hybrid Approach: Combination of model-based and data-driven, combination of modeling approaches

c) Describe the advantages and disadvantages of the hybrid approach

The advantage of a hybrid approach is that it combines methods and can give a more accurate forecast. Because the hybrid approach is a combination, the disadvantages are increased cost and the need for analysts with expertise in several different methods.