This is the final blueprint of the analysis, which includes the predicted filled quantity from each modeling technique. The output is useful for comparing the different forecasting methods and evaluating their results.
1. Results
1.1. UK Volume Data - Ensemble Forecast (Predicted vs Actuals)
The ensemble forecast is far off for the HSBA stock, which is the stock with the highest variance. This method tends to predict lower values than the actuals.
1.2. UK Volume Data - Nested Forecast (Predicted vs Actuals)
The nested forecasting approach includes confidence intervals, which are a good indication of the spread of the prediction range. Based on the results, the actual values fall well within these intervals.
This modeling technique is also quite far off for the HSBA and INRG stocks, and it too generally appears to predict lower values than the actuals.
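The interval check described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the analysis: the function name `interval_coverage` and all numbers are hypothetical placeholders, and the real interval bounds would come from the nested forecast output.

```python
def interval_coverage(actuals, lower, upper):
    """Return the share of actuals inside [lower, upper] and the per-point flags."""
    flags = [lo <= a <= hi for a, lo, hi in zip(actuals, lower, upper)]
    return sum(flags) / len(flags), flags


# Hypothetical placeholder values, NOT results from this analysis.
actuals = [120.0, 80.0, 0.0, 45.0]
lower = [90.0, 85.0, 0.0, 30.0]
upper = [150.0, 110.0, 5.0, 60.0]

coverage, inside = interval_coverage(actuals, lower, upper)
print(f"coverage: {coverage:.0%}")  # share of points inside their interval
print(inside)                       # which points fall inside
```

A coverage share close to the nominal confidence level suggests the intervals are reasonably calibrated, although with a sample this small the share itself is only a rough indication.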
1.3. UK Volume Data - Ensemble & Nested Forecasts (Predicted vs Actuals)
Combining the modeling approaches and deriving the mean and standard deviation allows for easier interpretation of the results. Additionally, this method highlights the differences between the two approaches.
As noted above, high variance is observed in the HSBA stock. Over the last two months, HSBA realized a quantity of 8.7k on 1st of July and almost 30k on 2nd of August. A two-month sample is not enough to produce a realistic prediction for a series with such high variance.
Differences between the predictions across several stocks could be attributed to many factors. For instance, irregular increases in orders are a major driver of high variance, and each method apparently handles them differently. The lack of engineered features as external regressors, a consequence of the small sample, also plays a role.
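As a rough sketch of how the two forecasts could be combined per ticker, the snippet below computes the mean and standard deviation of the ensemble and nested predictions. The function name `combine_forecasts`, the placeholder tickers and all values are assumptions made for illustration. The use of the sample standard deviation (`statistics.stdev`) is also an assumption, since the analysis does not state which variant was used.

```python
from statistics import mean, stdev


def combine_forecasts(ensemble, nested):
    """Combine two per-ticker forecasts into a mean and a sample standard deviation."""
    combined = {}
    # Only tickers present in both forecasts can be combined.
    for ticker in sorted(ensemble.keys() & nested.keys()):
        values = [ensemble[ticker], nested[ticker]]
        combined[ticker] = {"mean": mean(values), "std": stdev(values)}
    return combined


# Hypothetical placeholder tickers and values, NOT results from this analysis.
ensemble = {"AAA": 100.0, "BBB": 50.0}
nested = {"AAA": 140.0, "BBB": 50.0}

combined = combine_forecasts(ensemble, nested)
for ticker, stats in combined.items():
    print(ticker, round(stats["mean"], 1), round(stats["std"], 1))
```

A large standard deviation relative to the mean flags tickers where the two methods disagree, which is where the differences between the approaches become visible.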
1.4. EU Volume Data - Predicted vs Actuals (Nested)
In this group the values are so low that there is not much to be said. A bigger data sample could be used to produce more meaningful results.
Notably, the model correctly predicted that two of the stocks, JDW and INRG, would have zero filled quantity on 1st of September.
2. Conclusions
Data scientists can attempt to identify patterns in the temporal waves and map them mathematically to equations and models that “predict” the foreseeable future to a certain extent.
This project involved complex time series signatures that are by nature difficult to forecast. Additionally, the small sample introduced further difficulties.
Generally, all trading activities, regardless of their domain, are connected to an immeasurable number of factors contributing to their dynamics. In finance especially, the local and global economic background contributes directly to how equities and stocks are traded.
Despite the small sample, combining the UK group forecasts provides good results. The stock tickers where forecasts deviate significantly from the actuals are CCL, HSBA and INRG. This deviation is attributed to the high variance within these groups and could be reduced if a bigger sample were available.
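One way to make "deviate significantly" concrete is a signed percentage error per ticker, flagged against a threshold. The sketch below is an illustration under stated assumptions and not the procedure used in the analysis: the function names, the 30% threshold, the placeholder tickers and all values are hypothetical.

```python
def percentage_errors(actuals, predictions):
    """Signed percentage error per ticker; negative means under-prediction."""
    errors = {}
    for ticker, actual in actuals.items():
        if actual == 0:
            # Percentage error is undefined when the actual is zero.
            errors[ticker] = None
            continue
        errors[ticker] = 100 * (predictions[ticker] - actual) / actual
    return errors


def flag_deviations(errors, threshold_pct=30.0):
    """Return tickers whose absolute percentage error exceeds the threshold."""
    return sorted(
        ticker
        for ticker, err in errors.items()
        if err is not None and abs(err) > threshold_pct
    )


# Hypothetical placeholder tickers and values, NOT results from this analysis.
actuals = {"AAA": 200.0, "BBB": 100.0, "CCC": 0.0}
predictions = {"AAA": 120.0, "BBB": 95.0, "CCC": 0.0}

errors = percentage_errors(actuals, predictions)
flagged = flag_deviations(errors)
print(errors)
print(flagged)
```

The sign of the error also shows whether a method tends to predict below the actuals, as observed for both approaches in the UK group. Tickers with zero actual quantity need separate handling, since a percentage error cannot be computed for them.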