I work at a company that sells agricultural equipment. One of our products has had the highest sales since it was launched (about 9 years ago). I want to predict its sales for the next year so that we can plan how much stock we need to fulfil the demand.

First, let's load the libraries.

Data Preparation

Next, we read the data. (I have removed the product name from the data, so it only shows Date and Quantity :) )

Data Preprocessing & Exploration

To build a time series, we first need a continuous, ordered sequence of dates (the tanggal column, Indonesian for "date"), and then we fill the empty QTY values with the average value. We are going to use the pad and na.fill commands.
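Since the original code chunk is not shown here, below is a minimal base R sketch of the same idea: add the missing days, then fill the gaps with the mean. The three-row `sales` data frame is made up for illustration; only the column names `tanggal` and `QTY` come from the text. In the real workflow, `padr::pad()` and `zoo::na.fill()` would do these two steps.

```r
# Hypothetical daily sales with two missing days (2019-01-03 and 2019-01-04).
sales <- data.frame(
  tanggal = as.Date(c("2019-01-01", "2019-01-02", "2019-01-05")),
  QTY     = c(10, 20, 30)
)

# Full daily calendar between the first and last date (the job pad does).
calendar <- data.frame(
  tanggal = seq(min(sales$tanggal), max(sales$tanggal), by = "day")
)

# Left join: days without sales show up as rows with QTY = NA.
padded <- merge(calendar, sales, by = "tanggal", all.x = TRUE)

# Replace the missing quantities with the mean of the observed ones
# (the job na.fill does when given the average as the fill value).
padded$QTY[is.na(padded$QTY)] <- mean(padded$QTY, na.rm = TRUE)

print(padded)
```

After this step every calendar day has exactly one row, which is what `ts()` expects.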

Timeseries

Now that we have a proper time series, let's explore the data with the HoltWinters method.

The data are daily sales, so I create the time series with a frequency of 7*4 (28 days, roughly one month) to get a better grasp of the trend.
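As a rough sketch of this step (the `qty` values are randomly generated stand-ins, not the real sales), creating the series with base R's `ts()` could look like this:

```r
set.seed(1)

# Hypothetical daily quantities covering six 28-day cycles.
qty <- round(runif(28 * 6, min = 5, max = 25))

# frequency = 7 * 4 tells R that one seasonal cycle spans 28 observations.
sales_ts <- ts(qty, frequency = 7 * 4)

frequency(sales_ts)
length(sales_ts)
```

With this frequency, one "period" on the time axis corresponds to four weeks of daily data.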

Visualize

It shows a slight increase in sales in late 2019. Let's look at it in more detail using decompose() with the multiplicative type.

With that information, let's separate the data into train and test sets.
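To make the multiplicative decomposition concrete, here is a small sketch on synthetic data (the trend, seasonal shape and noise below are my own assumptions, not the real sales). In the multiplicative form, the observed series is treated as trend × seasonal × random.

```r
set.seed(1)

# Synthetic daily series: rising trend times a 28-day seasonal factor times noise.
day <- 1:(28 * 6)
qty <- (10 + 0.05 * day) *
  (1 + 0.3 * sin(2 * pi * day / 28)) *
  runif(length(day), min = 0.9, max = 1.1)

sales_ts <- ts(qty, frequency = 28)

# type = "multiplicative": observed = trend * seasonal * random.
dec <- decompose(sales_ts, type = "multiplicative")

# One seasonal factor per position in the 28-day cycle, averaging to 1.
round(dec$figure, 3)
```

Calling `plot(dec)` draws the observed, trend, seasonal and random panels in one figure.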
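A possible way to do the split with `window()` is sketched below. The 24-day test size is my assumption, taken from the 24-day forecast horizon mentioned later in this post, and the series itself is synthetic.

```r
set.seed(1)

# Synthetic stand-in for the daily sales series.
sales_ts <- ts(round(runif(28 * 6, min = 5, max = 25)), frequency = 28)

h  <- 24                  # assumed test length in days
n  <- length(sales_ts)
tt <- time(sales_ts)      # time index of every observation

# Everything except the last h days is used for training.
train <- window(sales_ts, end = tt[n - h])

# The last h days are held out for testing.
test <- window(sales_ts, start = tt[n - h + 1])

c(train = length(train), test = length(test))
```

Using `window()` keeps both pieces as `ts` objects, so the test set lines up with the forecast on the time axis.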

Model Fitting with HoltWinters method

Forecast Model

## [1] 325.7463
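The single number above looks like the total of the daily forecasts over the horizon. Under that assumption, here is a sketch of fitting and forecasting with base R (synthetic data, default additive seasonality; the original post may have used other settings):

```r
set.seed(7)

# Synthetic daily series with a mild trend and a 28-day seasonal cycle.
day <- 1:(28 * 8)
qty <- 12 + 0.02 * day + 3 * sin(2 * pi * day / 28) + rnorm(length(day))

h     <- 24                                  # assumed forecast horizon in days
train <- ts(qty[1:(length(qty) - h)], frequency = 28)

# Fit Holt-Winters; alpha, beta and gamma are estimated from the training data.
fit <- HoltWinters(train)

# Forecast h days ahead, then add up the daily values to get total demand.
fc    <- predict(fit, n.ahead = h)
total <- sum(fc)

total
```

Summing the daily forecasts gives one stock figure for the whole period, which is the number the business actually needs.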

I compared this output with the actual sales number, and it is about the same, which is good. But let's look at the visualisation: can the model predict it nicely?

Visualize the model

Accuracy Test

##                ME     RMSE      MAE      MPE     MAPE ACF1 Theil's U
## Test set 3.484933 4.112942 3.484933 18.80892 18.80892    0 0.8861136
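To show what these columns mean, here is a small hand calculation with made-up numbers (not the real sales). It assumes the usual convention where the error is actual minus forecast. Note that in the table above ME equals MAE and MPE equals MAPE, which happens when every error has the same sign.

```r
# Made-up actual and predicted values for four days.
actual    <- c(20, 25, 10, 16)
predicted <- c(18, 20,  8, 20)

err <- actual - predicted          # positive = model under-forecasts

metrics <- c(
  ME   = mean(err),                      # mean error (bias)
  RMSE = sqrt(mean(err^2)),              # root mean squared error
  MAE  = mean(abs(err)),                 # mean absolute error
  MPE  = mean(100 * err / actual),       # mean percentage error
  MAPE = mean(100 * abs(err) / actual)   # mean absolute percentage error
)

round(metrics, 2)
```

In this toy example one error is negative, so ME (1.25) is smaller than MAE (3.25), unlike in the real table.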

After the calculation steps above, the forecast total for the next 24 days is:

## [1] 325.7463

with a MAPE of 18.80%. I am quite confident in the result, especially since I have cross-checked it against the actual sales.

Multi Seasonal Time Series

One of my mentors at school told me that we can capture a better pattern with a Multi-Seasonal Time Series (MSTS). It is a method that can learn from not-quite-tidy time series data by letting us specify multiple frequencies when reading the data, so let's see the comparison.

For this I am going to use weekly, monthly and yearly frequencies.

I am going to use only the data from 2011 to 2016, because 2017 to 2018 show some unusual sales activity that could disrupt the pattern my model learns.

The plot above shows a clearer trend pattern, which means we can capture the data pattern better than with a standard time series.
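Here is a sketch of what that could look like with the forecast package. The exact periods are my assumption (7 for a week, 7*4 for a month to match the earlier frequency, 365 for a year), and the data are synthetic.

```r
library(forecast)

set.seed(42)

# Three years of synthetic daily sales with weekly, monthly and yearly cycles.
n   <- 365 * 3
day <- seq_len(n)
qty <- 15 +
  3 * sin(2 * pi * day / 7) +
  2 * sin(2 * pi * day / 28) +
  4 * sin(2 * pi * day / 365) +
  rnorm(n)

# msts() stores several seasonal periods on the same series.
sales_msts <- msts(qty, seasonal.periods = c(7, 7 * 4, 365))

# mstl() splits the series into a trend, one seasonal part per period, and a remainder.
components <- mstl(sales_msts)

colnames(components)
```

A seasonal period is only kept if the series covers more than two full cycles of it, which is why the example uses three years of data.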

Cross-Validation

Modeling 1

Forecast & Evaluation

Visual 1

Modeling 2

Let's try a log model.
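The original modeling code is not shown, so the following is only one possible way to do it: `stlm()` from the forecast package with `lambda = 0`, which fits the model on the log scale and transforms the forecast back automatically. The data, the seasonal periods and the 24-day horizon are assumptions for illustration.

```r
library(forecast)

set.seed(42)

# Synthetic, strictly positive daily sales (a log transform needs values > 0).
n   <- 365 * 3
day <- seq_len(n)
qty <- exp(2.7 +
  0.15 * sin(2 * pi * day / 7) +
  0.20 * sin(2 * pi * day / 365) +
  rnorm(n, sd = 0.05))

h     <- 24   # assumed forecast horizon in days
train <- msts(qty[1:(n - h)], seasonal.periods = c(7, 7 * 4, 365))

# lambda = 0 is a Box-Cox transform equal to log(); forecasts come back on the original scale.
fit_log <- stlm(train, lambda = 0)
fc_log  <- forecast(fit_log, h = h)

# Total forecast demand over the horizon.
sum(fc_log$mean)
```

Because the back-transform is an exponential, the forecast can never go below zero, which suits sales quantities.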

Forecast & Evaluation

Visual 2

## [1] 330.3879

Conclusion

By using a Multi-Seasonal Time Series, we are able to capture the data pattern pretty well (judging by the decomposed trend plot). Although the MAPE is still quite high (around 45%), it is acceptable from a business point of view. I also checked the forecast value of 330.38, and it closely matches the current sales condition.