---
title: "Assignment 2"
author: "John Ross"
date: "February 11, 2018"
output: html_document
---

1. Souvenir Sales

A) Why was the data partitioned?

The data was partitioned so that the model's accuracy could be tested on data it was not fit to. The model is built on the training set, and its forecasts are then compared against the validation set. This comparison shows how well the model fits and gives an estimate of its prediction accuracy.
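As a rough illustration of the partitioning step, the sketch below splits a time-ordered series into a training period and a validation period made up of the last observations. The function name `partition` and the numbers are hypothetical and are not the souvenir sales data; the output in this report comes from R, and Python is used here only to show the idea.

```python
def partition(series, n_valid):
    """Split a time-ordered list into (training, validation).

    The validation period is always the LAST n_valid observations,
    so the model never sees data from the period it is tested on.
    """
    if not 0 < n_valid < len(series):
        raise ValueError("n_valid must be between 1 and len(series) - 1")
    return series[:-n_valid], series[-n_valid:]


# Made-up monthly series with 36 observations (illustrative only).
sales = [100.0 + 5.0 * t for t in range(36)]

train, valid = partition(sales, n_valid=12)
print(len(train), len(valid))
```

The split is by time rather than at random, because a forecasting model has to be judged on periods that come after the data it was fit to.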

B) Why did the analyst choose a 12-month validation period?

The forecast horizon is the next 12 months of sales. To assess how accurate the model is over that horizon, the validation period should match it, so a 12-month validation period was chosen. A shorter validation period would not show how the forecasts perform across a full year.

C) What is the naive forecast for the validation period?

##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 2001  7615.03  9849.69 14558.40 11587.33  9332.56 13082.09 16732.78
##           Aug      Sep      Oct      Nov      Dec
## 2001 19888.61 23933.38 25391.35 36024.80 80721.71
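The forecast above has a different value for each month, which is what a seasonal naive forecast produces: each month of the validation year takes the value observed in the same month one year earlier. A plain naive forecast would instead repeat the last training value for every month. That reading is an inference from the output and is not stated in the assignment. A minimal Python sketch of both, with made-up quarterly numbers and hypothetical function names:

```python
def naive(train, horizon):
    """Plain naive forecast: repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon, period=12):
    """Seasonal naive forecast: reuse the value from one season earlier."""
    if len(train) < period:
        raise ValueError("need at least one full season of training data")
    last_season = train[-period:]
    return [last_season[h % period] for h in range(horizon)]


# Made-up quarterly data (period = 4), two years of history.
train = [10, 20, 30, 40, 12, 22, 32, 42]

plain = naive(train, horizon=4)
seasonal = seasonal_naive(train, horizon=4, period=4)
print(plain)
print(seasonal)
```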

D) Compute the RMSE and MAPE for naive forecasts.

##                    ME     RMSE      MAE      MPE     MAPE     MASE
## Training set 3401.361 6467.818 3744.801 22.39270 25.64127 1.000000
## Test set     7828.278 9542.346 7828.278 27.27926 27.27926 2.090439
##                   ACF1 Theil's U
## Training set 0.4140974        NA
## Test set     0.2264895 0.7373759
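In the table above, the "Test set" row is the validation period, so the naive forecasts have a validation RMSE of 9542.346 and a MAPE of 27.27926 percent. The test-set ME equals the MAE (and the MPE equals the MAPE), which means every validation error has the same sign; with errors taken as actual minus forecast, the naive forecast was below actual sales in every month. The sketch below shows how the two measures are computed, assuming that error convention. The function names and the three data points are made up for illustration and are not the sales figures.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error, with error = actual - forecast."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    pct = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(pct) / len(pct)


# Made-up values (illustrative only).
actual = [100.0, 200.0, 400.0]
forecast = [90.0, 220.0, 360.0]

print(round(rmse(actual, forecast), 4))
print(round(mape(actual, forecast), 2))
```

RMSE is in the units of the series and penalizes large misses more heavily, while MAPE is scale-free, which is why both are usually reported together.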

E) Plot a histogram of the forecast errors that result from the naive forecasts

Also plot a time plot of the naive forecasts and the actual sales numbers in the validation period

F) What must she do to use the forecasting model for generating forecasts for year 2002?

The analyst should recombine the training and validation periods into a single series, refit the forecasting model that she found satisfactory on the full data set, and then use the refitted model to forecast the 12 months of 2002.
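The recombining step can be sketched as follows. A seasonal naive forecast stands in for whichever model the analyst actually selects, and the quarterly numbers and function name are hypothetical. The point is only that the final forecast is produced from the training and validation periods together, so the most recent observations are used.

```python
def seasonal_naive(history, horizon, period):
    """Stand-in model: reuse the value from one season earlier."""
    last_season = history[-period:]
    return [last_season[h % period] for h in range(horizon)]


# Made-up quarterly data (period = 4), illustrative only.
train = [10, 20, 30, 40]
valid = [12, 22, 32, 42]

# During model evaluation, only the training period is used.
evaluation_forecast = seasonal_naive(train, horizon=4, period=4)

# For the real forecast, recombine both periods and forecast the next year.
full = train + valid
final_forecast = seasonal_naive(full, horizon=4, period=4)

print(evaluation_forecast)
print(final_forecast)
```

Forecasting from the training period alone would ignore the latest year of data, which is usually the most informative for the year ahead.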

2. Forecasting Shampoo Sales

Partition the data into training and validation periods

Partitioning the data is needed both to develop the models, which are fit on the training period, and to determine their accuracy, which is measured on the validation period.

Examine time plots of the series and of model forecasts only for the training period

Examining time plots of the series and of model forecasts for the training period alone does not provide enough information to judge how a model will forecast future months. The plots should cover both the training and the validation periods, so that the forecasts can be compared with actual values the model was not fit to.

Look at MAPE and RMSE values for the training period

The MAPE and RMSE values for the training period measure how well the model fits the data it was built on, so they would not be used to evaluate how well it predicts future sales.

Look at MAPE and RMSE values for the validation period

The MAPE and RMSE values for the validation period, however, show how well a model fit on the training period forecasts data it has not seen, and so they are the relevant indicator of model accuracy.

Compute naive forecasts

This step shows how far the most recent data alone carries the forecast. Naive forecasts are a good baseline against which to compare other models.
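Using the naive forecast as a baseline can be sketched as below: a candidate model is only worth keeping if its validation error beats the naive one. The candidate here is a simple drift forecast chosen purely for illustration, and the data and function names are hypothetical and are not the shampoo sales series.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error, with error = actual - forecast."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def naive_forecast(train, horizon):
    """Baseline: repeat the last training value."""
    return [train[-1]] * horizon


def drift_forecast(train, horizon):
    """Illustrative candidate: extend the line through the first and last training values."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * h for h in range(1, horizon + 1)]


# Made-up trending series (illustrative only).
train = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
valid = [22.0, 24.0, 26.0]

baseline_rmse = rmse(valid, naive_forecast(train, len(valid)))
candidate_rmse = rmse(valid, drift_forecast(train, len(valid)))

# The candidate is preferred only if it beats the baseline on the validation period.
print(candidate_rmse < baseline_rmse)
```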