Discussion3

These data are from the Federal Reserve Bank of St. Louis: the Consumer Price Index for All Urban Consumers: Electricity in U.S. City Average (not seasonally adjusted). The series is monthly and runs from January 1970 to May 2020. The index is scaled so that 1982-1984 = 100.

Setup

library(forecast)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

library(ggplot2)
library(seasonal)

Data

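# electricityData must already be loaded (a data frame of monthly observations)
elecTS = ts(electricityData, frequency = 12, start = c(1970,1))
# Drop the first column (presumably the date), keeping only the index values
elecTS = elecTS[,-1]
# Work on the log scale
logElec = log(elecTS)
# The series is a price index (1982-1984 = 100), not a price per kWh
autoplot(logElec, ylab = "Log CPI: Electricity (1982-1984 = 100)", main = "US City Average Electricity CPI (log scale)")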

ggseasonplot(logElec)

decomp = seas(elecTS,x11="")
autoplot(decomp)

Split data

trainData = window(logElec, end = c(2009,12))
testData = window(logElec, start = c(2010,01))
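
As a quick sanity check on the split, the forecast horizon of 125 used in the models should equal the number of months in the test window (January 2010 to May 2020). The sketch below confirms that in base R using a synthetic stand-in series; `demoSeries`, `demoTrain`, and `demoTest` are hypothetical names, and the simulated values are not the electricity data.

```r
# Synthetic stand-in for logElec: monthly values, Jan 1970 to May 2020.
# The values are simulated and only the time index matters here.
set.seed(1)
n <- (2020 - 1970) * 12 + 5   # 605 monthly observations
demoSeries <- ts(3 + cumsum(rnorm(n, mean = 0.003, sd = 0.01)),
                 frequency = 12, start = c(1970, 1))

# Same split points as in the report
demoTrain <- window(demoSeries, end = c(2009, 12))
demoTest  <- window(demoSeries, start = c(2010, 1))

# Derive the horizon from the test set instead of hard-coding it
h <- length(demoTest)
print(c(train = length(demoTrain), test = h))
```

Setting the horizon with `h = length(testData)` would keep the forecasts aligned with the test set if the split date ever changes.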

Model 1: ETS - ZZZ

ETSmod = ets(trainData)
# h = 125 months covers the test period, January 2010 to May 2020
ETSfc = forecast(ETSmod, h = 125)
autoplot(ETSfc)

Model 2: Holt Winters

HWmod = HoltWinters(trainData)
HWfc = forecast(HWmod, h = 125)
autoplot(HWfc)

Model 3: Simple exponential smoothing with multiplicative errors

SESmod = ets(trainData, model = "MNN")
SESfc = forecast(SESmod, h=125)
autoplot(SESfc)

Comparison

checkresiduals(ETSfc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,Ad,A)
## Q* = 519.16, df = 7, p-value < 2.2e-16
## 
## Model df: 17.   Total lags used: 24

checkresiduals(HWfc)

## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.

checkresiduals(SESfc)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(M,N,N)
## Q* = 1236.5, df = 22, p-value < 2.2e-16
## 
## Model df: 2.   Total lags used: 24
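
To illustrate what the Ljung-Box test is picking up, here is a minimal base R sketch that runs `Box.test` on two simulated series: one with strong positive autocorrelation and one that is white noise. Both series are hypothetical stand-ins for model residuals, and the AR coefficient of 0.6 is an assumption chosen only to be in the neighbourhood of the ACF1 values reported in this analysis.

```r
set.seed(42)

# Hypothetical "residuals": an AR(1) process with coefficient 0.6,
# and a white-noise series of the same length for contrast.
arResid    <- arima.sim(model = list(ar = 0.6), n = 480)
whiteResid <- rnorm(480)

# Ljung-Box test over the first 24 lags (two seasonal cycles of monthly data)
lbAR    <- Box.test(arResid,    lag = 24, type = "Ljung-Box")
lbWhite <- Box.test(whiteResid, lag = 24, type = "Ljung-Box")

# A small p-value means the residuals are not consistent with white noise
print(lbAR)
print(lbWhite)
```

A large Q statistic with a tiny p-value, as in the ETS output above, says the model is leaving predictable structure in the residuals.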

print("ETS")

## [1] "ETS"

accuracy(ETSfc)

##                       ME       RMSE         MAE        MPE      MAPE      MASE
## Training set 0.001984658 0.01230379 0.009429267 0.04587238 0.2090927 0.1868717
##                  ACF1
## Training set 0.570778

print("HW")

## [1] "HW"

accuracy(HWfc)

##                         ME       RMSE         MAE          MPE      MAPE
## Training set -0.0002519377 0.01218034 0.009148561 -0.004446316 0.2023543
##                   MASE      ACF1
## Training set 0.1813086 0.5993047

print("SES")

## [1] "SES"

accuracy(SESfc)

##                       ME       RMSE       MAE        MPE      MAPE      MASE
## Training set 0.003735428 0.02179474 0.0137353 0.08634322 0.2950853 0.2722098
##                   ACF1
## Training set 0.3273252
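
The accuracy tables above cover the training set only. In the forecast package, passing the held-out data as a second argument, for example `accuracy(ETSfc, testData)`, adds a test-set row to the table. The sketch below shows the same idea in base R, computing training and test RMSE by hand. It uses the built-in `AirPassengers` data as a stand-in, so the numbers say nothing about the electricity series, and all `demo*` names are hypothetical.

```r
# Stand-in series: log of the built-in AirPassengers data (monthly, 1949-1960)
demoLog   <- log(AirPassengers)
demoTrain <- window(demoLog, end = c(1958, 12))
demoTest  <- window(demoLog, start = c(1959, 1))

# Fit Holt-Winters on the training window and forecast over the test window
demoHW   <- HoltWinters(demoTrain)
demoPred <- predict(demoHW, n.ahead = length(demoTest))

# Training RMSE from the one-step-ahead residuals
trainRMSE <- sqrt(mean(residuals(demoHW)^2))

# Test RMSE from the multi-step forecast errors on the held-out data
testRMSE <- sqrt(mean((as.numeric(demoTest) - as.numeric(demoPred))^2))

print(c(train = trainRMSE, test = testRMSE))
```

Training error measures one-step-ahead fit, while test error measures multi-step forecasts on unseen data, so the two can rank models differently.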

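The biggest issue here is residual autocorrelation. The Ljung-Box tests reject uncorrelated residuals for both ETS fits (p-value < 2.2e-16), and the lag-1 residual autocorrelation is substantial for all three models (ACF1 of 0.57, 0.60, and 0.33). I took the log of the original data in the hope that this would not be an issue, but it did not work: a log transform stabilizes variance and does not by itself remove serial correlation. None of these models shows much bias, as the Mean Error values near zero indicate. Note that all of these measures are for the training set. The ETS ZZZ model and Holt-Winters perform very similarly in terms of RMSE (0.0123 and 0.0122), with SES performing worse (0.0218).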

This is likely because the simple exponential smoothing model (ETS MNN) has no seasonal component. Interestingly, the ZZZ model selected ETS(M,Ad,A), which has an additive seasonal component, even though the X11 decomposition showed the seasonal component varying over time. It is also worth noting again that the ZZZ model, with its additive damped trend, fit about as well as Holt-Winters. The forecast plots show the ZZZ model forecasting a mostly flat line, while the Holt-Winters forecast includes an upward trend. The data as a whole trend upward, so this is not the result I expected, especially when we only consider the training set.
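
The flat ZZZ forecast is consistent with how a damped trend works. Ignoring the seasonal term, the h-step forecast is the last level plus (phi + phi^2 + ... + phi^h) times the last slope, so the trend contribution levels off instead of growing without bound. The sketch below compares that with an undamped linear trend. The level, slope, and phi values are hypothetical and are not the fitted parameters from `ETSmod`.

```r
# Hypothetical final states, for illustration only
level <- 5.0    # last estimated level (log scale)
slope <- 0.01   # last estimated slope per month
phi   <- 0.9    # damping parameter, 0 < phi < 1

h <- 1:125      # same horizon as the forecasts in this report

# Damped trend: level + (phi + phi^2 + ... + phi^h) * slope
dampedFc <- level + cumsum(phi^h) * slope

# Undamped (Holt-style) trend: level + h * slope
linearFc <- level + h * slope

# The damped forecast converges to this limit as h grows
dampedLimit <- level + slope * phi / (1 - phi)

print(c(damped = dampedFc[125], limit = dampedLimit, linear = linearFc[125]))
```

With these values the damped forecast has essentially reached its limit well before the end of the horizon, while the linear forecast keeps climbing, which matches the qualitative difference between the two forecast plots.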

Justin Lynch

7/14/2020