Assignment

Due Monday, February 19th, 2018 at 11:59PM: Problems 1 and 2 from Chapter 4 of Shmueli

Medical Clinic Forecasting Daily Patient Visits

A large medical clinic would like to forecast daily patient visits for purposes of staffing.

(a) If data is available only for the last month, how does this affect the choice of model-based vs. data-driven methods?

Because data is available only for the prior month, the clinic administrator would be better served by a model-based method. With such a short time series we cannot rely on learning patterns from the data; instead, we assume a model structure and use the data only to estimate its parameters.

Data-driven methods learn patterns directly from the data, so they require a longer time series that captures the highs and lows of past behavior.

(b) The clinic has access to the admissions data of a nearby hospital. Under what conditions will including the hospital information be potentially useful for forecasting the clinic’s daily visits?

First, we must evaluate the hospital data and determine whether it is sufficient to act upon. Given the uncertainty, we would weight the hospital data equally with the clinic data. We also need to ensure that the hospital data is available at the time of prediction.

Once those conditions are met, the hospital data may be used to identify whether the two data sets are correlated. For example, there may be a positive correlation, meaning that an increase in visits to the hospital corresponds to an increase in visits to the clinic. Perhaps when hospitals are busy or full, patients are more likely to visit the medical clinic instead. The opposite may also hold true: an influx of hospital visits may coincide with a decline in clinic visits.
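The correlation check described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names `pearson` and `lagged_correlation` and the sample counts are hypothetical. A lag of at least one day reflects the condition that the hospital figures must already be known when the clinic forecast is made.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(hospital, clinic, lag):
    """Correlate hospital admissions on day t - lag with clinic visits on day t."""
    if len(hospital) != len(clinic):
        raise ValueError("series must cover the same days")
    if lag < 0 or lag >= len(clinic) - 1:
        raise ValueError("lag must leave at least 2 overlapping points")
    if lag == 0:
        return pearson(hospital, clinic)
    return pearson(hospital[:-lag], clinic[lag:])


# Hypothetical daily counts, for illustration only.
hospital_admissions = [50, 55, 60, 52, 48, 70, 65, 58]
clinic_visits = [20, 26, 28, 31, 27, 25, 36, 33]
print(lagged_correlation(hospital_admissions, clinic_visits, lag=1))
```

A value near +1 or -1 at some lag would suggest the hospital series carries usable information; a value near 0 would suggest it does not, at least not linearly.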

(c) Thus far, the clinic administrator takes a heuristic approach, using the visit numbers from the same day of the previous week as a forecast. What is the advantage of this approach? What is the disadvantage?

Using a heuristic approach to forecast patient visits for the upcoming week is a relatively quick and easy way to determine the number of staff needed on call at the clinic. It also captures the day-of-week pattern. However, this approach doesn’t account for any longer seasonality or for special events. By simply looking at the previous week, the clinic administrator may underestimate patient visits for … Take, for example, St. Patrick’s Day. If the clinic administrator used the previous week’s visits to forecast the upcoming week, she might grossly underestimate the number of staff needed on call that day. The reason is that she is looking at the Saturday prior; had she been able to compare against past St. Patrick’s Days, she could have accounted for the 1,000% uptick in hospital visits. (Note: the 1,000% lift in hospital visits on St. Patrick’s Day is assumed. However, based on experience, it is rather likely.)
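The administrator's heuristic is what is usually called a seasonal naive forecast with a season length of 7 days. A minimal sketch follows; the function name `seasonal_naive` and the visit counts are hypothetical, chosen only to show the mechanics.

```python
def seasonal_naive(history, horizon, season_length=7):
    """Forecast each future day with the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Day h ahead reuses the matching position in the last observed season.
    return [last_season[h % season_length] for h in range(horizon)]


# Hypothetical two weeks of daily visit counts, Monday through Sunday.
visits = [120, 115, 110, 118, 130, 90, 70,
          125, 117, 108, 121, 135, 95, 72]
next_week = seasonal_naive(visits, horizon=7)
print(next_week)
```

Because each forecast simply repeats last week's value, an unusual day such as a holiday is invisible to the method unless it also occurred exactly one week earlier.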

(d) What level of automation appears to be required for this task? Explain.

Given the ongoing forecasting required to staff the clinic week after week, automating the forecasts would greatly benefit the clinic. Automation is well suited to forecasting patient visits throughout the year with little or no in-house expertise. However, because we are using a model-based method, we would need to check continuously whether the model is forecasting accurately enough to set appropriate staff levels.
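One way to make that continuous check concrete is to track the mean absolute error (MAE) over the most recent days and flag the model when it drifts past a tolerance. The sketch below is an assumption about how such a check might look; the names `needs_review`, `window`, and `tolerance`, and the default values, are hypothetical and would have to be set by the clinic.

```python
def mean_absolute_error(actuals, forecasts):
    """Average absolute gap between actual and forecast values."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def needs_review(actuals, forecasts, window=7, tolerance=10.0):
    """Return True when the MAE over the last `window` days exceeds `tolerance`."""
    recent_mae = mean_absolute_error(actuals[-window:], forecasts[-window:])
    return recent_mae > tolerance
```

An automated pipeline could run a check like this after each day's actual visit count arrives and alert a person only when it returns True.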

(e) Describe two approaches for improving the current heuristic (naive) forecasting approach using ensembles.

The first approach could use a weighted average that combines the hospital data with the clinic data. By continuously evaluating the effectiveness of this model, the weights may be adjusted to further enhance its forecasting performance.

The second approach would be to utilize additional data sources when forecasting patient visits. Perhaps the clinic administrator begins to incorporate seasonal temperatures and evaluates staff levels against them. This may reveal that in unexpectedly warm weather people rush outside to enjoy the sun, hurt themselves in their haste, and require a visit to the clinic. The additional source would need to be weighted appropriately; at the outset, the weight may be based primarily on the staff’s experience or good domain knowledge.
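The weighted average in the first approach can be sketched as follows. This is an illustration under stated assumptions: it combines two already-computed forecasts, and it sets the weights in inverse proportion to each forecast's recent error, which is one common choice rather than the only one. The function names and all numbers are hypothetical.

```python
def inverse_error_weights(errors):
    """Weights proportional to 1 / error, normalized to sum to 1."""
    if not errors or any(e <= 0 for e in errors):
        raise ValueError("errors must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_ensemble(forecasts, weights):
    """Combine several equal-length forecasts into one weighted average."""
    if not forecasts or len(forecasts) != len(weights):
        raise ValueError("need one weight per forecast")
    if len({len(f) for f in forecasts}) != 1:
        raise ValueError("all forecasts must cover the same horizon")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    horizon = len(forecasts[0])
    return [
        sum(w * f[i] for w, f in zip(weights, forecasts)) / total
        for i in range(horizon)
    ]


# Hypothetical forecasts for the next two days.
naive_forecast = [100, 120]     # same day of the previous week
hospital_forecast = [110, 100]  # derived from hospital admissions
weights = inverse_error_weights([8.0, 12.0])  # hypothetical recent MAEs
print(weighted_ensemble([naive_forecast, hospital_forecast], weights))
```

Recomputing the weights as new errors come in corresponds to the continuous evaluation mentioned above.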

Methods for Wind Power Forecasting

The ability to scale up renewable energy, and in particular wind power and speed, is dependent on the ability to forecast its short-term availability.

(a) For each of the four types of methods, describe whether it is model-based, data-driven, or a combination

Persistence Method: Data-driven, because it uses the speed at time t to predict future speed.

Physical Approach: Model-based, because it uses parameterizations based on the atmosphere.

Statistical Approach: Data-driven. It is based on patterns in the measurement data rather than a predefined mathematical model, and it uses the difference between recent wind speeds and their predicted values to fine-tune the parameters.

Hybrid Approach: Combination, as it combines both physical and statistical approaches.
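Of the four methods above, the persistence method is simple enough to show in full. A minimal sketch, assuming the forecast for every future step is just the most recent observed speed; the function name and the readings are hypothetical.

```python
def persistence_forecast(speeds, horizon):
    """Forecast every future step as the last observed wind speed."""
    if not speeds:
        raise ValueError("need at least one observation")
    if horizon < 0:
        raise ValueError("horizon must be non-negative")
    return [speeds[-1]] * horizon


# Hypothetical wind speed readings in m/s.
recent_speeds = [5.2, 6.1, 5.8]
print(persistence_forecast(recent_speeds, horizon=3))
```

No parameters are estimated and no physical model is involved, which is why the method is treated here as purely data-driven.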

(b) For each of the four types of methods, describe whether it is based on extrapolation, causal modeling, correlation modeling or a combination.

Persistence Method: Extrapolation, using its own historical values to predict future wind speeds.

Physical Approach: Causal modeling, as it utilizes atmospheric conditions to forecast wind speed.

Statistical Approach: Extrapolation, as this approach calculates the difference between the immediate past wind speeds and the predicted speeds.

Hybrid Approach: Combination, as it combines the physical and statistical approaches or short-term and medium-term models.

(c) Describe the advantages and disadvantages of the hybrid approach.

The advantage of the Hybrid Approach is that combining approaches can prove more accurate and effective than using a single approach. Its disadvantage is cost: not only the time involved in producing and combining multiple approaches, but also the added expense of an analyst with experience and expertise in the varying methods. The Hybrid Approach also requires upfront alignment on the method and rules by which the forecasts are combined.

Week 4 : Beginning February 12th, 2018

Pete Wiernusz

2/18/2018