Part A – ATM Forecast

Files: ATM624Data.xlsx

In part A, I want you to forecast how much cash is taken out of 4 different ATM machines for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that it is straightforward. I am being somewhat ambiguous on purpose to give this a little more of a business feel. Explain and demonstrate your process, techniques used and not used, and your actual forecast. I am giving you data via an Excel file; please provide your written report on your findings, visuals, discussion and your R code via an RPubs link along with the actual .rmd file. Also please submit the forecast in an Excel-readable file.

Import and Clean Data

The first step is to import the data from Excel. The first time I imported the file the dates were not read correctly; by opening the file in Excel, formatting the column as a date, and re-saving it, the dates imported correctly on the second attempt.
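A sketch of the import step is shown below (the readxl package and the atm_raw object name are my own choices; the report does not show its import code).

```r
# Read the ATM data from the re-saved workbook; the columns of interest are
# the date, the ATM identifier, and Cash.
library(readxl)

atm_raw <- read_excel("ATM624Data.xlsx")
str(atm_raw)   # confirm the dates imported as dates and Cash as numeric
```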

The next step is to examine the data before converting it to a time series, to check for missing values or other problems, and, since we have 4 ATMs, to move each ATM into its own column (a sketch of this reshaping follows the list). Doing this revealed a few problems that needed to be dealt with:

  1. There are NA’s in the ATM field that need to be removed
  2. ATM3 has almost no values and is mostly zeros
  3. There is an outlier in ATM4’s data that is far above the norm
  4. There are 3 NA’s in the Cash field for ATM1 and 2 for ATM2
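A sketch of the reshaping and inspection step (assuming tidyr/dplyr and columns named ATM and Cash; atm_wide is an illustrative name):

```r
# Spread each ATM into its own column so the series can be inspected and
# summarised side by side.
library(dplyr)
library(tidyr)

atm_wide <- atm_raw %>%
  pivot_wider(names_from = ATM, values_from = Cash)

summary(select(atm_wide, ATM1, ATM2, ATM3, ATM4))
```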

##       ATM1             ATM2             ATM3              ATM4          
##  Min.   :  1.00   Min.   :  0.00   Min.   : 0.0000   Min.   :    1.563  
##  1st Qu.: 73.00   1st Qu.: 25.50   1st Qu.: 0.0000   1st Qu.:  124.334  
##  Median : 91.00   Median : 67.00   Median : 0.0000   Median :  403.839  
##  Mean   : 83.89   Mean   : 62.58   Mean   : 0.7206   Mean   :  474.043  
##  3rd Qu.:108.00   3rd Qu.: 93.00   3rd Qu.: 0.0000   3rd Qu.:  704.507  
##  Max.   :180.00   Max.   :147.00   Max.   :96.0000   Max.   :10919.762  
##  NA's   :3        NA's   :2

The NA’s in the ATM field turned out to be the dates that will be forecast, with no data in the other columns, so those rows were removed using the complete.cases function.

The NA’s in ATM1 and ATM2’s data were imputed using the median since the number of missing values was so small.

ATM4’s outlier was also replaced with the median; since it was so far above the norm and the only abnormal point in the entire ATM4 series, it seemed likely to be an error.
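A sketch of these cleaning steps, starting again from the raw data (object and column names follow the earlier sketches and are assumptions rather than the report's actual code):

```r
library(dplyr)
library(tidyr)

atm_clean <- atm_raw %>%
  filter(complete.cases(ATM)) %>%                      # drop the May 2010 rows with no ATM label
  pivot_wider(names_from = ATM, values_from = Cash) %>%
  mutate(
    ATM1 = replace(ATM1, is.na(ATM1), median(ATM1, na.rm = TRUE)),  # 3 missing values
    ATM2 = replace(ATM2, is.na(ATM2), median(ATM2, na.rm = TRUE)),  # 2 missing values
    ATM4 = replace(ATM4, ATM4 == max(ATM4), median(ATM4))           # single extreme outlier
  )

summary(select(atm_clean, ATM1, ATM2, ATM3, ATM4))
```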

Cleaned up data

##       ATM1             ATM2            ATM3              ATM4         
##  Min.   :  1.00   Min.   :  0.0   Min.   : 0.0000   Min.   :   1.563  
##  1st Qu.: 73.00   1st Qu.: 26.0   1st Qu.: 0.0000   1st Qu.: 124.334  
##  Median : 91.00   Median : 67.0   Median : 0.0000   Median : 403.839  
##  Mean   : 83.95   Mean   : 62.6   Mean   : 0.7206   Mean   : 445.233  
##  3rd Qu.:108.00   3rd Qu.: 93.0   3rd Qu.: 0.0000   3rd Qu.: 704.192  
##  Max.   :180.00   Max.   :147.0   Max.   :96.0000   Max.   :1712.075

ATMs 1 and 2 appear to have some weekly seasonality. ATM3, with only 3 non-zero data points, does not have enough data for a proper forecast, so a simple mean will be used. ATM4 appears to be white noise, but there may still be some seasonality that we will investigate further. There is no apparent trend in any of the series.

Convert to Time Series

Time series objects were created for each ATM since the plots of the data suggested weekly and possibly monthly seasonality (although I was tempted to simply sum the data by month, since the instructions are not clear about whether daily forecasts are required or a monthly forecast would be adequate).
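A sketch of the conversion, assuming the cleaned data from above and a weekly seasonal period of 7 (the atmX_ts names are illustrative):

```r
library(forecast)
library(ggplot2)

atm1_ts <- ts(atm_clean$ATM1, frequency = 7)
atm2_ts <- ts(atm_clean$ATM2, frequency = 7)
atm3_ts <- ts(atm_clean$ATM3, frequency = 7)
atm4_ts <- ts(atm_clean$ATM4, frequency = 7)

autoplot(atm1_ts)   # repeat for the other ATMs to inspect seasonality and trend
```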

ATM1

For each of ATM1, ATM2 and ATM4, I will fit at least a Holt-Winters additive model with damped trend, since the seasonal variations are roughly constant through the series, and then ETS and ARIMA models to see whether they perform better than the Holt-Winters model.

Before running any models I will check the ACF and PACF plots along with the ndiffs, nsdiffs, and BoxCox.lambda functions to see what differencing they recommend and what Box-Cox transformation might be appropriate.
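Something along these lines produces the checks whose output is shown below (atm1_ts comes from the conversion sketch):

```r
ggtsdisplay(atm1_ts)     # time plot with ACF and PACF
ndiffs(atm1_ts)          # recommended order of first differencing
nsdiffs(atm1_ts)         # recommended order of seasonal differencing
BoxCox.lambda(atm1_ts)   # suggested Box-Cox lambda
```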

## [1] 0
## [1] 1
## [1] 0.2446101

For ATM1 no first order differencing is recommended, only a first order seasonal difference and a Box-Cox transformation with \(\lambda\) = 0.2446101. Let’s plot the data again after these transformations to see what impact they have.
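A sketch of the transformation and the diagnostic plot (lambda1 is an illustrative name for the value reported above):

```r
lambda1 <- BoxCox.lambda(atm1_ts)

atm1_ts %>%
  BoxCox(lambda = lambda1) %>%   # stabilise the variance
  diff(lag = 7) %>%              # first order seasonal difference
  ggtsdisplay()
```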

The plot above shows that most of the seasonality has been eliminated and we are left with an almost stationary time series, although there are still spikes in the ACF plot at lag 7 and in the PACF plot at lags 7, 14, and 21. Adding a first order difference could make the series more stationary still.

Holt-Winters
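A sketch of the damped additive Holt-Winters fit for ATM1, forecasting the 31 days of May 2010 (applying the Box-Cox lambda here is my assumption about how the transformation was carried through):

```r
atm1_hw <- hw(atm1_ts, seasonal = "additive", damped = TRUE,
              h = 31, lambda = lambda1)

accuracy(atm1_hw)        # training-set error measures
checkresiduals(atm1_hw)  # residual plots and Ljung-Box test
autoplot(atm1_hw)        # forecast plot
```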

|              | ME       | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|--------------|----------|----------|----------|-----------|----------|-----------|-----------|
| Training set | 2.100449 | 25.21299 | 16.10613 | -92.92647 | 111.5487 | 0.9096064 | 0.1149874 |

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' additive method
## Q* = 19.421, df = 3, p-value = 0.0002237
## 
## Model df: 12.   Total lags used: 15

The residuals plot does not look too bad, but the Ljung-Box test has an extremely small p-value, indicating that there is still some autocorrelation in the residuals, as we saw in the plot of the transformed data. The forecast plot also looks reasonable, although the prediction intervals extend well beyond anything seen historically in the data.

ETS
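A sketch of the ETS fit, letting ets() pick the error/trend/seasonal components automatically (the model reported below is ETS(A,N,A)):

```r
atm1_ets <- ets(atm1_ts, lambda = lambda1)

accuracy(atm1_ets)
checkresiduals(atm1_ets)
autoplot(forecast(atm1_ets, h = 31))
```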

|              | ME       | RMSE     | MAE     | MPE       | MAPE     | MASE      | ACF1      |
|--------------|----------|----------|---------|-----------|----------|-----------|-----------|
| Training set | 2.337381 | 25.04745 | 16.1103 | -91.70147 | 110.6442 | 0.9098418 | 0.1213482 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 19.696, df = 5, p-value = 0.001425
## 
## Model df: 9.   Total lags used: 14

The ETS model gave almost exactly the same results, with only a slightly better RMSE and a slightly larger Ljung-Box p-value.

ARIMA
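A sketch of the ARIMA fit using automatic model selection; the model reported below is ARIMA(0,0,1)(0,1,2)[7]:

```r
atm1_arima <- auto.arima(atm1_ts, lambda = lambda1)

accuracy(atm1_arima)
checkresiduals(atm1_arima)
autoplot(forecast(atm1_arima, h = 31))
```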

|              | ME         | RMSE    | MAE      | MPE       | MAPE     | MASE      | ACF1       |
|--------------|------------|---------|----------|-----------|----------|-----------|------------|
| Training set | -0.0736394 | 23.3332 | 14.60281 | -102.7359 | 117.6027 | 0.8247052 | -0.0081341 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(0,1,2)[7]
## Q* = 16.789, df = 11, p-value = 0.1143
## 
## Model df: 3.   Total lags used: 14

The ARIMA model resulted in the best fit, with the lowest RMSE and a Ljung-Box p-value large enough that we cannot reject the null hypothesis that the residuals are consistent with white noise. The forecast plot also looks like a more reasonable estimate of what we can expect based on the historical data.

| Model                  | ME         | RMSE     | MAE      | MPE        | MAPE     | MASE      | ACF1       |
|------------------------|------------|----------|----------|------------|----------|-----------|------------|
| Holt-Winters           | 2.1004493  | 25.21299 | 16.10613 | -92.92647  | 111.5487 | 0.9096064 | 0.1149874  |
| ETS                    | 2.3373807  | 25.04745 | 16.11030 | -91.70147  | 110.6442 | 0.9098418 | 0.1213482  |
| ARIMA(0,0,1)(0,1,2)[7] | -0.0736394 | 23.33320 | 14.60281 | -102.73591 | 117.6027 | 0.8247052 | -0.0081341 |

ATM2

Following the same procedure for ATM2.

## [1] 1
## [1] 1
## [1] 0.7278107

For ATM2 both first order differencing and seasonal differencing are recommended, along with a Box-Cox transformation with \(\lambda\) = 0.7278107. Let’s plot the data again after these transformations to see what impact they have.

Once again we can see clear seasonality in the ACF and PACF plots before transformation (plot above) and after (plot below). Note that the first order difference was not applied because it resulted in a PACF with many spikes outside the critical values.

Holt-Winters

|              | ME        | RMSE     | MAE      | MPE  | MAPE | MASE      | ACF1      |
|--------------|-----------|----------|----------|------|------|-----------|-----------|
| Training set | 0.6284115 | 25.40858 | 17.92711 | -Inf | Inf  | 0.8614639 | 0.0199519 |

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' additive method
## Q* = 34.137, df = 3, p-value = 1.853e-07
## 
## Model df: 12.   Total lags used: 15

ETS

|              | ME        | RMSE     | MAE      | MPE  | MAPE | MASE      | ACF1     |
|--------------|-----------|----------|----------|------|------|-----------|----------|
| Training set | 0.5166014 | 25.36411 | 17.84184 | -Inf | Inf  | 0.8573662 | 0.019386 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 33.355, df = 5, p-value = 3.199e-06
## 
## Model df: 9.   Total lags used: 14

ARIMA

|              | ME      | RMSE     | MAE      | MPE  | MAPE | MASE      | ACF1      |
|--------------|---------|----------|----------|------|------|-----------|-----------|
| Training set | 1.44981 | 24.25163 | 17.12271 | -Inf | Inf  | 0.8228094 | 0.0058678 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,0,3)(0,1,1)[7] with drift
## Q* = 8.859, df = 6, p-value = 0.1817
## 
## Model df: 8.   Total lags used: 14

ARIMA with first order differencing

Since the ndiffs function recommended first order differencing but the auto.arima function did not include any non-seasonal differencing in the model, let’s add it manually to see if the model improves.
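A sketch of the manual fit (atm2_ts follows the conversion sketch; lambda2 is an illustrative name):

```r
lambda2 <- BoxCox.lambda(atm2_ts)

# same (p, q) and seasonal structure auto.arima() chose, with d = 1 forced
atm2_arima_d1 <- Arima(atm2_ts, order = c(3, 1, 3), seasonal = c(0, 1, 1),
                       lambda = lambda2)

accuracy(atm2_arima_d1)
checkresiduals(atm2_arima_d1)
```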

|              | ME       | RMSE    | MAE      | MPE | MAPE | MASE      | ACF1       |
|--------------|----------|---------|----------|-----|------|-----------|------------|
| Training set | 2.332289 | 28.9457 | 20.52568 | NaN | Inf  | 0.7141976 | -0.0447085 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,1,3)(0,1,1)[7]
## Q* = 24.274, df = 7, p-value = 0.001019
## 
## Model df: 7.   Total lags used: 14

| Model                  | ME        | RMSE     | MAE      | MPE  | MAPE | MASE      | ACF1       |
|------------------------|-----------|----------|----------|------|------|-----------|------------|
| Holt-Winters           | 0.6284115 | 25.40858 | 17.92711 | -Inf | Inf  | 0.8614639 | 0.0199519  |
| ETS                    | 0.5166014 | 25.36411 | 17.84184 | -Inf | Inf  | 0.8573662 | 0.0193860  |
| ARIMA(3,0,3)(0,1,1)[7] | 1.4498098 | 24.25163 | 17.12271 | -Inf | Inf  | 0.8228094 | 0.0058678  |
| ARIMA(3,1,3)(0,1,1)[7] | 2.3322890 | 28.94570 | 20.52568 | NaN  | Inf  | 0.7141976 | -0.0447085 |

The model chosen by auto.arima gave the best RMSE and residual diagnostics, so it will be used for the ATM2 predictions.

ATM3

As mentioned earlier, there simply is not enough data to make a good forecast for ATM3, so a simple mean will be used until more data can be collected.
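A minimal sketch of the mean forecast (atm3_ts follows the conversion sketch; whether the mean covers only the three non-zero withdrawals or the whole mostly-zero series is not stated, so the sketch uses the non-zero values):

```r
atm3_fc <- meanf(atm3_ts[atm3_ts > 0], h = 31)  # mean of the three observed withdrawals
atm3_fc$mean                                    # flat point forecast for May 2010
```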

ATM4

Once again we follow the same procedure for ATM4 as was used earlier for ATMs 1 and 2.

## [1] 0
## [1] 0
## [1] 0.4525697

Here no first order or seasonal differencing was recommended, but a Box-Cox transformation with \(\lambda\) = 0.4525697 was. Looking at the plots, however, it is not clear that the Box-Cox transformation improved the stationarity of the data; seasonal spikes are still apparent.

Holt-Winters

|              | ME       | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|--------------|----------|----------|----------|-----------|----------|-----------|-----------|
| Training set | 72.46451 | 340.8815 | 261.8729 | -369.0055 | 413.0543 | 0.7559486 | 0.1006656 |

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' additive method
## Q* = 21.115, df = 3, p-value = 9.965e-05
## 
## Model df: 12.   Total lags used: 15

ETS

|              | ME       | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|--------------|----------|----------|----------|-----------|----------|-----------|-----------|
| Training set | 67.23932 | 338.2241 | 259.7873 | -376.9491 | 420.7907 | 0.7499281 | 0.0973636 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 19.488, df = 5, p-value = 0.001559
## 
## Model df: 9.   Total lags used: 14

ARIMA

|              | ME      | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1     |
|--------------|---------|----------|----------|-----------|----------|-----------|----------|
| Training set | 84.4142 | 351.8346 | 274.3411 | -370.6976 | 415.1683 | 0.7919408 | 0.019962 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(2,0,0)[7] with non-zero mean
## Q* = 15.872, df = 10, p-value = 0.1034
## 
## Model df: 4.   Total lags used: 14

On ATM4 the ARIMA model performs poorly compared with either the Holt-Winters or the ETS model. But since auto.arima did not use any seasonal differencing and some seasonality is apparent in the plots, various ARIMA models with a first order seasonal difference were tested until the best performance was obtained with the model below.
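A sketch of the best-performing manual specification (lambda4 and the object names are illustrative; the report arrived at ARIMA(0,0,1)(14,1,0)[7] by trial and error):

```r
lambda4 <- BoxCox.lambda(atm4_ts)

atm4_arima_sd <- Arima(atm4_ts, order = c(0, 0, 1), seasonal = c(14, 1, 0),
                       lambda = lambda4)

accuracy(atm4_arima_sd)
checkresiduals(atm4_arima_sd)
```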

|              | ME       | RMSE    | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|--------------|----------|---------|----------|-----------|----------|-----------|-----------|
| Training set | 58.30159 | 334.008 | 251.7076 | -339.9261 | 382.9995 | 0.7266044 | 0.0143578 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(14,1,0)[7]
## Q* = 16.453, df = 3, p-value = 0.0009157
## 
## Model df: 15.   Total lags used: 18

| Model                   | ME       | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|-------------------------|----------|----------|----------|-----------|----------|-----------|-----------|
| Holt-Winters            | 72.46451 | 340.8815 | 261.8729 | -369.0055 | 413.0543 | 0.7559486 | 0.1006656 |
| ETS                     | 67.23932 | 338.2241 | 259.7873 | -376.9491 | 420.7907 | 0.7499281 | 0.0973636 |
| ARIMA(0,0,1)(2,0,0)[7]  | 84.41420 | 351.8346 | 274.3411 | -370.6976 | 415.1683 | 0.7919408 | 0.0199620 |
| ARIMA(0,0,1)(14,1,0)[7] | 58.30159 | 334.0080 | 251.7076 | -339.9261 | 382.9995 | 0.7266044 | 0.0143578 |

Although the manually specified ARIMA model eventually reached similar performance, the much less complex ETS model is preferred for ATM4 since the improvement is minimal.

Part B – Forecasting Power

Files: ResidentialCustomerForecastLoad-624.xlsx

Part B consists of a simple dataset of residential power usage from January 1998 until December 2013. Your assignment is to model these data and produce a monthly forecast for 2014. The data is given in a single file. The variable ‘KWH’ is power consumption in kilowatt hours; the rest is straightforward. Add this to your existing files above.

##   CaseSequence     YYYY-MMM              KWH          
##  Min.   :733.0   Length:192         Min.   :  770523  
##  1st Qu.:780.8   Class :character   1st Qu.: 5429912  
##  Median :828.5   Mode  :character   Median : 6283324  
##  Mean   :828.5                      Mean   : 6502475  
##  3rd Qu.:876.2                      3rd Qu.: 7620524  
##  Max.   :924.0                      Max.   :10655730  
##                                     NA's   :1

Here again, as in the ATM data, we can clearly see an outlier that is most likely a data error, so that point was imputed with the mean of the other observations for the same month. The same was done for the one NA data point.

| CaseSequence | YYYY-MMM    | KWH    | date       | month |
|--------------|-------------|--------|------------|-------|
| 861          | 2008-Sep-01 | NA     | 2008-09-01 | 9     |
| 883          | 2010-Jul-01 | 770523 | 2010-07-01 | 7     |
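A sketch of this cleanup (package choices, object names, and the 1e6 cut-off used to isolate the low outlier are my own; the report only describes the month-mean imputation):

```r
library(readxl)
library(dplyr)

power <- read_excel("ResidentialCustomerForecastLoad-624.xlsx") %>%
  mutate(month = substr(`YYYY-MMM`, 6, 8),
         # treat both the missing value and the suspiciously low reading as NA
         KWH   = ifelse(is.na(KWH) | KWH < 1e6, NA, KWH)) %>%
  group_by(month) %>%
  mutate(KWH = ifelse(is.na(KWH), mean(KWH, na.rm = TRUE), KWH)) %>%
  ungroup()

power_ts <- ts(power$KWH, start = c(1998, 1), frequency = 12)
```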

## [1] 1
## [1] 1
## [1] -0.2018638

For this series both a first order difference and a seasonal difference are recommended, along with a Box-Cox transformation with \(\lambda\) = -0.2018638.

Holt-Winters

|              | ME       | RMSE     | MAE      | MPE      | MAPE     | MASE      | ACF1      |
|--------------|----------|----------|----------|----------|----------|-----------|-----------|
| Training set | 78844.59 | 620901.7 | 455088.7 | 0.477213 | 6.802764 | 0.7435267 | 0.2858823 |

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' additive method
## Q* = 44.62, df = 7, p-value = 1.621e-07
## 
## Model df: 17.   Total lags used: 24

|              | ME       | RMSE   | MAE      | MPE        | MAPE     | MASE      | ACF1      |
|--------------|----------|--------|----------|------------|----------|-----------|-----------|
| Training set | 42476.76 | 617283 | 459499.3 | -0.1199566 | 6.936035 | 0.7507327 | 0.2469939 |

## 
##  Ljung-Box test
## 
## data:  Residuals from Damped Holt-Winters' multiplicative method
## Q* = 45.599, df = 7, p-value = 1.046e-07
## 
## Model df: 17.   Total lags used: 24

Here the Holt-Winters multiplicative model resulted in just slightly better performance than the additive model with the Box-Cox transformation.

ETS

|              | ME       | RMSE     | MAE      | MPE        | MAPE     | MASE      | ACF1      |
|--------------|----------|----------|----------|------------|----------|-----------|-----------|
| Training set | 32423.43 | 613027.6 | 453744.7 | -0.2772083 | 6.836306 | 0.7413308 | 0.2817121 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,A,A)
## Q* = 45.157, df = 8, p-value = 3.436e-07
## 
## Model df: 16.   Total lags used: 24

ARIMA

|              | ME    | RMSE     | MAE      | MPE       | MAPE     | MASE      | ACF1      |
|--------------|-------|----------|----------|-----------|----------|-----------|-----------|
| Training set | 48157 | 626456.3 | 477342.9 | 0.1542706 | 7.247545 | 0.7798858 | 0.0998182 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(2,1,0)[12] with drift
## Q* = 29.88, df = 20, p-value = 0.07183
## 
## Model df: 4.   Total lags used: 24

|              | ME       | RMSE     | MAE      | MPE       | MAPE   | MASE      | ACF1       |
|--------------|----------|----------|----------|-----------|--------|-----------|------------|
| Training set | 45040.95 | 594600.3 | 432060.5 | 0.1024936 | 6.5269 | 0.7059031 | -0.0003831 |

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,1,1)(2,1,2)[12]
## Q* = 20.892, df = 17, p-value = 0.2312
## 
## Model df: 7.   Total lags used: 24

| Model                        | ME       | RMSE     | MAE      | MPE        | MAPE     | MASE      | ACF1       |
|------------------------------|----------|----------|----------|------------|----------|-----------|------------|
| Holt-Winters Additive        | 78844.59 | 620901.7 | 455088.7 | 0.4772130  | 6.802764 | 0.7435267 | 0.2858823  |
| Holt-Winters Multiplicative  | 42476.76 | 617283.0 | 459499.3 | -0.1199566 | 6.936035 | 0.7507327 | 0.2469939  |
| ETS                          | 32423.43 | 613027.6 | 453744.7 | -0.2772083 | 6.836306 | 0.7413308 | 0.2817121  |
| ARIMA(0,0,1)(2,1,0)[12]      | 48157.00 | 626456.3 | 477342.9 | 0.1542706  | 7.247545 | 0.7798858 | 0.0998182  |
| ARIMA(2,1,1)(2,1,2)[12]      | 45040.95 | 594600.3 | 432060.5 | 0.1024936  | 6.526900 | 0.7059031 | -0.0003831 |

Some trial and error resulted in an ARIMA(2,1,1)(2,1,2)[12] model (without any Box-Cox transformation) that is not overly complex and has noticeably better performance, so it is the preferred model for the 2014 forecast.
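A sketch of the final model and the 2014 forecast (power_ts follows the cleanup sketch; the output file name is illustrative):

```r
power_arima <- Arima(power_ts, order = c(2, 1, 1), seasonal = c(2, 1, 2))

power_fc <- forecast(power_arima, h = 12)   # monthly forecast for 2014
autoplot(power_fc)

# export the point forecasts to an Excel-readable file
out <- data.frame(month = format(seq(as.Date("2014-01-01"), by = "month",
                                     length.out = 12), "%Y-%b"),
                  KWH = as.numeric(power_fc$mean))
write.csv(out, "PowerForecast2014.csv", row.names = FALSE)
```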