---
title: "Assignment 2"
author: "John Ross"
date: "February 11, 2018"
output: html_document
---
A) Why was the data partitioned?
The data was partitioned so that the model's accuracy could be tested on observations it was not fitted to. The model is fitted on the training set, and its forecasts are then compared with the actual values in the validation set. This comparison shows how well the model generalizes beyond the data it was built on and gives an estimate of its prediction accuracy.
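As a rough illustration of the split described above, here is a minimal Python sketch. The function name `partition` and the placeholder series are assumptions made for this example; they are not taken from the assignment's own R workflow.

```python
def partition(series, n_valid):
    """Split a time-ordered list into (training, validation) parts.

    The last n_valid observations form the validation period;
    everything before them forms the training period.
    """
    if not 0 < n_valid < len(series):
        raise ValueError("n_valid must be between 1 and len(series) - 1")
    return series[:-n_valid], series[-n_valid:]


# Placeholder values standing in for 36 monthly observations (not real sales data).
monthly = list(range(1, 37))
train, valid = partition(monthly, n_valid=12)
```

The split is done by position rather than at random because the order of observations matters in a time series: the validation period must come after the training period.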
B) Why did the analyst choose a 12-month validation period?
The goal was to forecast the next 12 months of sales. A validation period of the same length as the forecast horizon lets the analyst evaluate accuracy over a full 12 months ahead. A shorter validation period would not show how the model performs at the longer horizons, and it would not cover every month of the year.
C) What is the naive forecast for the validation period?
## Jan Feb Mar Apr May Jun Jul
## 2001 7615.03 9849.69 14558.40 11587.33 9332.56 13082.09 16732.78
## Aug Sep Oct Nov Dec
## 2001 19888.61 23933.38 25391.35 36024.80 80721.71
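The forecasts above differ from month to month, which suggests a seasonal naive forecast: each month of the validation period is forecast with the value of the same month in the last training year. That reading is an assumption, since the code that produced the output is not shown. A minimal Python sketch of the idea, using made-up quarterly numbers and a hypothetical function name `seasonal_naive`:

```python
def seasonal_naive(train, horizon, period=12):
    """Forecast each future step with the value observed one season earlier.

    Uses the last full seasonal cycle of the training data and repeats it
    for as many steps as the horizon requires.
    """
    if len(train) < period:
        raise ValueError("need at least one full season of training data")
    last_cycle = train[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


# Toy quarterly series covering two years (illustrative values only).
toy_train = [10, 20, 30, 40, 12, 22, 32, 42]
toy_forecast = seasonal_naive(toy_train, horizon=4, period=4)
```

With monthly data the period would be 12, so a 12-month horizon simply reuses the final 12 training values in order.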
D) Compute the RMSE and MAPE for naive forecasts.
## ME RMSE MAE MPE MAPE MASE
## Training set 3401.361 6467.818 3744.801 22.39270 25.64127 1.000000
## Test set 7828.278 9542.346 7828.278 27.27926 27.27926 2.090439
## ACF1 Theil's U
## Training set 0.4140974 NA
## Test set 0.2264895 0.7373759
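For reference, RMSE and MAPE can be computed directly from the forecast errors, taking each error as the actual value minus the forecast. The following Python sketch uses toy numbers, not the sales data, and the function names are chosen for this example only.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error: square root of the average squared error."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent of the actual values."""
    pct_errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(pct_errors) / len(pct_errors)


# Toy values for illustration only.
toy_actual = [100, 200, 400]
toy_forecast = [90, 220, 400]
toy_rmse = rmse(toy_actual, toy_forecast)
toy_mape = mape(toy_actual, toy_forecast)
```

RMSE is expressed in the units of the series, while MAPE is a percentage and is undefined when an actual value is zero.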
E) Plot a histogram of the forecast errors that result from the naive forecasts
Also draw a time plot of the naive forecasts and the actual sales numbers in the validation period.
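A histogram of forecast errors amounts to counting how many errors fall into each of several equal-width bins. The Python sketch below shows only that counting step, with made-up errors and a hypothetical helper `histogram_counts`. It is not the plotting code used for this report.

```python
def histogram_counts(errors, n_bins):
    """Count how many errors fall into each of n_bins equal-width bins.

    Bins span the range from the smallest to the largest error; the
    largest error is placed in the last bin.
    """
    lo, hi = min(errors), max(errors)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for e in errors:
        if width == 0:
            idx = 0  # all errors identical: everything goes in the first bin
        else:
            idx = min(int((e - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Toy forecast errors (actual minus forecast), for illustration only.
toy_errors = [-5, -1, 0, 2, 3, 10]
toy_counts = histogram_counts(toy_errors, n_bins=3)
```

A histogram centered well away from zero indicates that the forecasts are systematically too high or too low.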
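F) What must she do to use the forecasting model for generating forecasts for year 2002?
The analyst should combine the training and validation periods into a single series, refit the forecasting model she found satisfactory on that full series, and then use the refitted model to forecast 2002 sales. Refitting matters because the validation period contains the most recent observations, which the model fitted only on the training period never saw.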
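The recombining step can be sketched in Python as follows. The model here is a deliberately simple stand-in that repeats the last observation, because the report does not say which model the analyst selected. All names and numbers are assumptions for illustration.

```python
def last_value_forecast(series, horizon):
    """Stand-in model: repeat the final observation for every future step."""
    return [series[-1]] * horizon


def forecast_after_recombining(train, valid, model, horizon):
    """Join training and validation data, then forecast from the full series."""
    full_series = list(train) + list(valid)
    return model(full_series, horizon)


# Toy values for illustration only.
toy_train = [5, 6, 7]
toy_valid = [8, 9]
next_values = forecast_after_recombining(
    toy_train, toy_valid, last_value_forecast, horizon=3
)
```

The forecasts start from the last validation value rather than the last training value, which is the point of recombining the two periods before forecasting.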
Partition the data into training and validation periods
Partitioning the data makes it possible to develop models on the training period and to measure their accuracy on the validation period.
Examine time plots of the series and of model forecasts only for the training period
Examining time plots of the series and of the model forecasts for the training period alone does not provide enough information to judge how the model will forecast future months. Examining the plots for both the training and validation periods gives a more reliable picture of forecasting performance.
Look at MAPE and RMSE values for the training period
MAPE and RMSE values for the training period measure how well the model fits data it has already seen, so they would not be used to judge how well it predicts future sales.
Look at MAPE and RMSE values for the validation period
The MAPE and RMSE values for the validation period, however, show how well the model fitted on the training period predicts data it has not seen, and so they are an indicator of its predictive accuracy.
Compute naive forecasts
This step shows how far the most recent observations alone can carry a forecast. Naive forecasts are a good baseline against which to compare other models.