## Reading in the Daily Demand Forecasting Orders Data Set
## Source: https://archive.ics.uci.edu/ml/datasets/Daily+Demand+Forecasting+Orders
## Abstract: The dataset was collected over 60 days; it is a real database of a Brazilian logistics company.

## Added a column called DayTotal to capture the running day count [DayTotal = (Week-1)*7 + Day]
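## A minimal sketch of the DayTotal formula on toy values. The vectors `week` and
## `day` are hypothetical stand-ins for illustration, not columns read from the CSV.
week <- c(1, 1, 2, 3)
day  <- c(1, 5, 1, 4)
day_total <- (week - 1) * 7 + day
day_total   ## gives 1, 5, 8, 18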
library(readr)
bank <- read_csv("C:/Users/bryce_anderson/Desktop/Boston College/Predictive Analytics and Forecasting/Week 2 (PA&F)/Daily_Demand_Forecasting_OrdersFinal.csv")
## R Packages Utilized
library(ggplot2)
library(forecast)
## Plotting the Target variable over time, with a fitted linear trend line
attach(bank)
plot(Target~DayTotal, data=bank, xlab="Total Number of Days", ylab="Target (Total Orders)", main="Target (Total Orders) over Number of Days", col="DodgerBlue")
abline(lm(Target~DayTotal, data=bank), col="orange")

## Setting the data as a ts object to be compatible with autoplot
bankTS <- ts(bank)

## Plotting the full set of observed values (every column of the data set)
autoplot(bankTS, main="Observed values for Daily Demand", ylab="Order Quantity", xlab="Days")

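## Using the forecast() function to forecast the next 30 values of Target with a 95% prediction interval
## (given a ts object and no model, forecast() selects an exponential smoothing (ETS) model automatically)
forecast(bankTS[,"Target"], h=30, level=95)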
##    Point Forecast    Lo 95    Hi 95
## 61        300.871 123.7379 478.0041
## 62        300.871 123.7379 478.0041
## 63        300.871 123.7379 478.0041
## 64        300.871 123.7379 478.0041
## 65        300.871 123.7379 478.0041
## 66        300.871 123.7379 478.0041
## 67        300.871 123.7379 478.0041
## 68        300.871 123.7379 478.0041
## 69        300.871 123.7379 478.0041
## 70        300.871 123.7379 478.0041
## 71        300.871 123.7379 478.0041
## 72        300.871 123.7379 478.0041
## 73        300.871 123.7379 478.0041
## 74        300.871 123.7379 478.0041
## 75        300.871 123.7379 478.0041
## 76        300.871 123.7379 478.0041
## 77        300.871 123.7379 478.0041
## 78        300.871 123.7379 478.0041
## 79        300.871 123.7379 478.0041
## 80        300.871 123.7379 478.0041
## 81        300.871 123.7379 478.0041
## 82        300.871 123.7379 478.0041
## 83        300.871 123.7379 478.0041
## 84        300.871 123.7379 478.0041
## 85        300.871 123.7379 478.0041
## 86        300.871 123.7379 478.0041
## 87        300.871 123.7379 478.0041
## 88        300.871 123.7379 478.0041
## 89        300.871 123.7379 478.0041
## 90        300.871 123.7379 478.0041
## Plotting the forecasted Target variable
bankF <- forecast(bankTS[,"Target"], h=30, level=95)
autoplot(bankF, main="Forecasted Target values over Number of Days", ylab="Order Quantity", xlab="Days")

## The point forecast is the same at every horizon (300.871), and the 95% interval is wide
## (roughly 124 to 478 orders) because of the variability of the Target values
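## A quick arithmetic check of how wide that range is, using the values printed in
## the forecast table above. The variable names here are illustrative only.
point_fc <- 300.871
lo_95    <- 123.7379
hi_95    <- 478.0041
half_width <- (hi_95 - lo_95) / 2   ## about 177.13 orders on either side of the point forecast
half_width / point_fc               ## about 0.59, i.e. roughly +/-59% of the point forecast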

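## Create a time series of the observed Target variable
## (note: this reuses the name bankF, replacing the forecast object created above)
## Start at 0 with a frequency of 5, so each unit of time is one 5-day week
bankF <- ts(bank$Target, start=0, frequency=5)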
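
## To illustrate what frequency=5 does to the time index, here is a small sketch
## on a toy series (the object `toy` is hypothetical and unrelated to the order data).
toy <- ts(1:10, start=0, frequency=5)
time(toy)    ## runs from 0 to 1.8 in steps of 0.2, so whole numbers mark the start of a week
cycle(toy)   ## position within each week: 1 to 5, then 1 to 5 again
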
## Multiplicative Decomposition
## Appropriate when the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series. Most common with economic time series.
bankF %>% decompose(type="multiplicative") %>%
  autoplot() + xlab("Week") +
  ggtitle("Classical Multiplicative Decomposition of Target (Total Orders) by Days")

## Additive Decomposition
## Most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series.
bankF %>% decompose(type="additive") %>%
  autoplot() + xlab("Week") +
  ggtitle("Classical Additive Decomposition of Target (Total Orders) by Days")

Classical Decomposition Method Analysis

The results from both methods are fairly similar in this case. In interpreting the output, note that a frequency of 5 groups the observations into blocks of five order days, roughly one working week each. This frequency was chosen because it generated more visually descriptive results in terms of trends in the data. The largest difference between the two methods can be seen in the remainder panel, shown last. The grey bars in this chart show the relative scales of the components. Based on the slightly smaller range of its remainder, I believe the multiplicative method fits the data set slightly better, although the multiplicative remainder is a ratio around 1 while the additive remainder is in order units, so the two ranges are not directly comparable. Overall, the remainder terms are a weak basis for choosing between the methods here because the outputs are so similar. The most pronounced movements can be seen during Weeks 6, 7, and 8. In future research, these weeks should be analyzed more closely to reveal possible seasonal patterns in the data.
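
## One way to put the two remainders on a common footing is to express both in
## order units and compare their root-mean-square size. This is a sketch only:
## the helper name `remainder_rmse` is hypothetical, and it relies on base R's decompose().
remainder_rmse <- function(x) {
  add  <- decompose(x, type="additive")
  mult <- decompose(x, type="multiplicative")
  ## The additive remainder is already in the units of x
  res_add  <- add$random
  ## The multiplicative remainder is a ratio, so rebuild it as observed minus trend * seasonal
  res_mult <- x - mult$trend * mult$seasonal
  c(additive       = sqrt(mean(res_add^2,  na.rm=TRUE)),
    multiplicative = sqrt(mean(res_mult^2, na.rm=TRUE)))
}
## Usage, assuming bankF is the frequency-5 Target series built earlier:
## remainder_rmse(bankF)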