Due Monday, February 19th, 2018 at 11:59PM: Problems 1 and 2 from Chapter 4 of Shmueli
A large medical clinic would like to forecast daily patient visits for purposes of staffing.
Since data is available only for the prior month, the clinic administrator would be better served by a model-based method. With such a short time series we cannot rely on learning patterns from the data; instead, we assume a model structure and estimate its parameters from the limited data available.
Data-driven methods learn patterns from the data itself, so they require a longer time series that captures the highs and lows of what has occurred in the past.
First, we must evaluate the data and determine whether it is sufficient to act upon. Given the uncertainty, we would weight the hospital data equally with the clinic data. We also need to ensure that the data provided by the hospital is available at the time of prediction.
After that, the data from the nearby hospital may be used to identify whether there is a correlation between the two data sets. For example, there may be a positive correlation, meaning that an increase in visits to the hospital corresponds to an increase in visits to the clinic. Perhaps when hospitals are busy or full, patients are more likely to visit the medical clinic instead. The opposite may also hold true: with an influx of hospital visits, the number of visits to the clinic declines.
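As a rough illustration of how that relationship could be checked, the sketch below computes the Pearson correlation between two daily visit series. The function name and the visit counts are made up for illustration; they are not real clinic or hospital data.

```python
from math import sqrt
from statistics import mean


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily visit counts for one week (illustrative numbers only).
clinic_visits = [120, 135, 128, 140, 150, 90, 85]
hospital_visits = [300, 320, 310, 335, 350, 260, 250]

r = pearson_correlation(clinic_visits, hospital_visits)
print(f"correlation between clinic and hospital visits: {r:.2f}")
```

A value near +1 would suggest the two series rise and fall together, while a value near -1 would suggest that busy hospital days coincide with quiet clinic days. With only a month of data, either result should be treated cautiously.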
A heuristic approach to forecasting patient visits for the upcoming week can be a relatively quick and easy way of identifying the number of staff needed on call at the clinic. However, this approach doesn’t account for any seasonality. By simply looking at the previous week, the clinic administrator may underestimate patient visits for … Take, for example, St. Patrick’s Day. If the clinic administrator used the previous week’s visits to forecast the upcoming week, she may grossly underestimate the number of staff needed on call that day. The reason is that she is looking at the Saturday prior; had she been able to compare against St. Patrick’s Days in the past, she could account for the 1,000% uptick in hospital visits. (Note: the 1,000% lift in hospital visits on St. Patrick’s Day is assumed. However, based on experience, it is rather likely.)
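The heuristic described above can be sketched as a "same day last week" forecast. The function name and the visit counts below are hypothetical, used only to show the mechanics.

```python
def previous_week_forecast(daily_visits, horizon=7, period=7):
    """Forecast each future day with the value observed one week earlier."""
    if len(daily_visits) < period:
        raise ValueError("need at least one full week of history")
    last_week = daily_visits[-period:]
    # Repeat last week's pattern for as many days as requested.
    return [last_week[i % period] for i in range(horizon)]


# Hypothetical two weeks of daily visits, Monday through Sunday (made-up numbers).
history = [118, 130, 125, 138, 149, 88, 80,
           120, 135, 128, 140, 150, 90, 85]

forecast = previous_week_forecast(history)
print(forecast)  # [120, 135, 128, 140, 150, 90, 85]
```

Because the forecast simply copies the most recent week, a one-off event such as a holiday in the coming week never appears in it, which is the weakness noted above.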
Given that forecasting is needed continuously to staff the clinic week after week, automating the forecasting would greatly benefit the clinic. Automation is well suited to forecasting patient visits throughout the year with little (or no) in-house expertise. However, because we are using a model-based method, we would need to check regularly whether our model is accurately forecasting the appropriate staff levels.
The first approach could use a weighted average to leverage the hospital data and compare it against the clinic data. By continuously evaluating the effectiveness of this model, adjustments to the weights may be made to further enhance its forecasting performance.
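One possible reading of this weighted-average idea, sketched under assumptions: blend a clinic-based forecast with a hospital-based forecast, then re-tune the weight by checking which value would have minimized the error on recent days. The function names, the grid search, and all numbers below are illustrative choices, not something the problem specifies.

```python
def weighted_forecast(clinic_based, hospital_based, weight):
    """Blend two forecasts; `weight` is the share given to the hospital-based one."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return [(1 - weight) * c + weight * h
            for c, h in zip(clinic_based, hospital_based)]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def best_weight(actual, clinic_based, hospital_based, steps=10):
    """Grid-search the weight (0.0, 0.1, ..., 1.0) with the lowest error on past days."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: mean_absolute_error(
            actual, weighted_forecast(clinic_based, hospital_based, w)
        ),
    )


# Hypothetical past days: actual visits and the two competing forecasts.
actual = [100, 110, 120]
clinic_based = [90, 100, 110]
hospital_based = [110, 120, 130]

print(best_weight(actual, clinic_based, hospital_based))  # 0.5
```

Starting from equal weights and re-running the search as new weeks of data arrive is one simple way to carry out the continuous evaluation described above.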
The second approach would be to incorporate additional data sources that relate to patient visits. Perhaps the clinic administrator begins to incorporate seasonal temperatures and evaluates staff levels against them. This may reveal that in unexpectedly warm weather, people are inclined to rush outside and enjoy the sun, then in their haste end up hurting themselves and requiring a visit to the clinic. The additional source would need to be weighted appropriately; at the outset, the weight may be based primarily on the staff’s experience or good domain knowledge.
The ability to scale up renewable energy, and in particular wind power and speed, depends on the ability to forecast its short-term availability.
Persistence Method: Data-driven, because it uses the speed at time t to predict future speed.
Physical Approach: Model-based, because it uses parameterizations based on the atmosphere.
Statistical Approach: Data-driven, because it is based on patterns in the measurement data rather than a predefined mathematical model. This approach uses the difference between recent past wind speeds and the predicted speeds to fine-tune the parameters.
Hybrid Approach: Combination, as it combines both physical and statistical approaches.
Persistence Method: Extrapolation, using its own historical values to predict future wind speeds.
Physical Approach: Causal modeling, as it utilizes atmospheric conditions to forecast wind speed.
Statistical Approach: Extrapolation, because this approach calculates the difference between the immediate past wind speeds and the predicted speeds.
Hybrid Approach: Combination, as it combines the physical and statistical approaches, or short-term and medium-term models.
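To make the persistence method concrete, here is a minimal sketch in which every future step is predicted to equal the most recent observation. The function name and the wind speeds are hypothetical.

```python
def persistence_forecast(wind_speeds, horizon=1):
    """Predict that every future step equals the most recent observed speed."""
    if not wind_speeds:
        raise ValueError("need at least one observation")
    return [wind_speeds[-1]] * horizon


# Hypothetical hourly wind speeds in m/s (made-up numbers).
speeds = [5.2, 5.8, 6.1, 5.9]

print(persistence_forecast(speeds, horizon=3))  # [5.9, 5.9, 5.9]
```

The forecast uses nothing but the series' own last value, which is why the method is classified as data-driven extrapolation rather than causal modeling.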
Combining approaches, as the Hybrid Approach does, can be advantageous because it can prove more accurate and effective than a single approach. The disadvantages of the Hybrid Approach are its costs: not only the time involved in producing and combining multiple approaches, but also the cost of an analyst with experience and expertise in the varying methods. The Hybrid Approach also requires upfront alignment on the method and rules by which the forecasts are combined.