if (!require("fma")) install.packages("fma")
if (!require("fpp")) install.packages("fpp")
library(fma)
library(fpp)
The figure below shows the ACFs for 36 random numbers, 360 random numbers, and 1,000 random numbers.
Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.
Explain the differences among these figures. Do they all indicate the data are white noise?
The ARIMA(0, 0, 0) model with all zero components represents a white noise model with independent and identically distributed data. The `arima.sim()` function, which uses `rand.gen = rnorm` by default, can be used to simulate data from a variety of time series models. This means that `arima.sim(model = list(order = c(0, 0, 0)), n = 1)` and `rnorm(1)` are two equivalent ways of simulating white noise.
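As a quick illustration (a sketch; whether the draws match exactly depends on `arima.sim()` consuming no burn-in values when \(p=q=0\), which is assumed here rather than verified), the two approaches can be compared under a common seed:

set.seed(123)
x1 <- arima.sim(model = list(order = c(0, 0, 0)), n = 5)  # white noise via arima.sim()
set.seed(123)
x2 <- rnorm(5)                                            # white noise via rnorm()
all.equal(as.numeric(x1), x2)                             # expected TRUE if no burn-in draws are consumed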
Box.test(arima.sim(model = list(order = c(0, 0, 0)), n = 36), type="Ljung")
##
## Box-Ljung test
##
## data: arima.sim(model = list(order = c(0, 0, 0)), n = 36)
## X-squared = 1.9546, df = 1, p-value = 0.1621
Box.test(arima.sim(model = list(order = c(0, 0, 0)), n = 360), type="Ljung")
##
## Box-Ljung test
##
## data: arima.sim(model = list(order = c(0, 0, 0)), n = 360)
## X-squared = 1.5324, df = 1, p-value = 0.2158
Box.test(arima.sim(model = list(order = c(0, 0, 0)), n = 1000), type="Ljung")
##
## Box-Ljung test
##
## data: arima.sim(model = list(order = c(0, 0, 0)), n = 1000)
## X-squared = 0.11712, df = 1, p-value = 0.7322
The differences between the three figures are the number of lags displayed and the distances of the boundary limits (dashed blue lines) from zero, both of which are related to the sample size. All three ACF plots indicate the data are white noise since at least 95% of the spikes fall inside the boundaries. To further demonstrate this, Box-Ljung tests are performed on white noise data of each sample size to show that the autocorrelation in white noise (regardless of sample size) is not significantly different from zero. Although both `arima.sim(model = list(order = c(0, 0, 0)), n = 1)` and `rnorm(1)` can be used to simulate a white noise model, the former was used to facilitate understanding of the problem.
Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
The boundary limits (dashed blue lines) drawn on the ACF plots depend on the assumed sampling distribution of the autocorrelations of white noise (approximately normal) and on the confidence level used, which is 95% by default. With \(\alpha = 1 - 0.95 = 0.05\), the limits are computed from the normal critical value and the sample size, as shown below.
alpha <- 1 - 0.95
for (n in c(36, 360, 1000)) {
  plotLags <- floor(10 * log10(n / 1))                     # default number of lags plotted for one series
  lower <- qnorm(alpha / 2, lower.tail=TRUE) / sqrt(n)     # lower boundary limit
  upper <- qnorm(alpha / 2, lower.tail=FALSE) / sqrt(n)    # upper boundary limit
  width <- upper - lower
  print(data.frame(n, plotLags, lower, upper, width))
}
## n plotLags lower upper width
## 1 36 15 -0.3266607 0.3266607 0.6533213
## n plotLags lower upper width
## 1 360 25 -0.1032992 0.1032992 0.2065983
## n plotLags lower upper width
## 1 1000 30 -0.0619795 0.0619795 0.123959
The number of lags displayed and the distances of the boundary limits (dashed blue lines) from zero both depend on the sample size. The number of lags displayed on the plot is \(\min { \left[ \left\lfloor 10\log _{ 10 }{ \left( { n }/{ m } \right) } \right\rfloor ,n-1 \right] }\), where \(n\) is the number of observations and \(m\) the number of series. The distance of the boundary limits from zero is \(\pm { z_{{\alpha}/{2}} }/{ \sqrt { n } }\), where \(n\) is the number of observations and \(z_{{\alpha}/{2}}\) is the critical value of the standard normal distribution for \(\alpha=1-0.95=0.05\), the value implied by the plot's default 95% confidence level. The resulting effects on the plots are a number of lags that increases with the base-10 log of the sample size and boundary limits that shrink in proportion to the inverse square root of the sample size.
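As a visual cross-check (a sketch with an arbitrary seed and sample size), the limits drawn by `acf()` can be overlaid with the \(\pm z_{\alpha/2}/\sqrt{n}\) formula:

set.seed(42)
n <- 360
acf(rnorm(n), main="ACF of simulated white noise (n = 360)")
abline(h = c(-1, 1) * qnorm(0.975) / sqrt(n), col="red", lty=3)  # should coincide with the dashed limits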
A classic example of a non-stationary series is the daily closing IBM stock prices (data set `ibmclose`). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows the series is non-stationary and should be differenced.
In general, a stationary time series has no predictable patterns in the long term. A time series with trend, seasonality, or heteroskedasticity is not stationary. Transformations such as logarithms can help stabilize the variance of a time series, and differencing can help stabilize its mean by removing changes in level (trend and seasonality). A time series with aperiodic cyclic behavior can still be stationary because the timing of the cycles is not predictable in the long term. The `tsdisplay()` function plots a time series along with its ACF and either its PACF, histogram, scatterplot, or spectrum in the lower right corner. The ACF plot shows the autocorrelation between each observation and its lagged values. The PACF plot shows the partial autocorrelations, that is, the correlation between the current observation and each lagged observation after removing the effect of the intermediate lags. These plots can be used together to spot nonstationary characteristics that indicate a need for differencing, summarized in the table below.
Plot | Characteristic | Description | Stationary Series | Nonstationary Series | Removal |
---|---|---|---|---|---|
Time | Trend | Long Upward or Downward Periods | Absent | Present | Difference |
Time | Seasonality | Annual, Monthly, etc. Patterns | Absent | Present | Difference |
Time | Heteroskedasticity | Changing Variance | Absent | Present | Transform |
Time | Cyclic Behavior | Aperiodic/Unpredictable | Irrelevant | Irrelevant | Irrelevant |
ACF | Slow Decay | Spikes Die Out Only Slowly | Absent | Present | Difference |
PACF | Large Spike at Lag 1 | First Partial Autocorrelation Near One | Absent | Present | Difference |
tsdisplay(ibmclose, plot.type = "partial")
Using these plots together shows characteristics that indicate a need for differencing. The time plot displays long periods of apparent upward and downward movement in the data. Seasonality and changing variance are not readily apparent. There is some cyclic behavior, but its unpredictability makes it irrelevant to the question of stationarity. The ACF plot shows spikes that decay to zero only slowly, and the PACF plot shows a single large spike at lag 1 with no significant spikes beyond it. Together these indicate that although successive observations are highly autocorrelated, there are no significant partial autocorrelations beyond the immediately preceding observation, the classic signature of a series that should be differenced.
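The visual assessment can also be cross-checked numerically. The `ndiffs()` function in the forecast package (loaded as a dependency above) estimates the number of differences required using a unit root test; the expectation noted in the comment is an assumption rather than verified output.

ndiffs(ibmclose)  # unit-root-based estimate of d; expected to suggest one difference for this series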
Consider the number of women murdered each year (per 100,000 standard population) in the United States (data set `wmurders`).
By studying appropriate graphs of the series in R, find an appropriate \(\textrm{ARIMA}(p,d,q)\) model for these data.
A time series with trend, seasonality, or heteroskedasticity is not stationary. Transformations such as logarithms can help stabilize the variance of a time series, and differencing can help stabilize its mean by removing changes in level (trend and seasonality). Trend and seasonality in the time plot indicate a need for differencing. The `tsdisplay()` function plots a time series along with its ACF and either its PACF, histogram, scatterplot, or spectrum in the lower right corner. The ACF and PACF plots are helpful in determining the value of \(p\) or \(q\) when \(p\) and \(q\) are not both positive.
Significant Spikes | \(\textrm{ARIMA}(p,d,0)\) | \(\textrm{ARIMA}(0,d,q)\) |
---|---|---|
ACF | Exponential Decay or Sinusoidal | At, but not beyond, Lag \(q\) |
PACF | At, but not beyond, Lag \(p\) | Exponential Decay or Sinusoidal |
plot(wmurders)
tsdisplay(diff(wmurders), plot.type = "partial")
The time plot displays long periods of apparent upward and downward movement in the data, which supports the need for differencing. After applying first-order differencing, the `tsdisplay()` function shows a time plot with no apparent trend. The ACF and PACF plots each show one significant spike. Although these lone significant spikes are in the first few lags, one out of twenty lags (5%) appearing significant is within probabilistic expectations at a 95% confidence level, so they are ignored. The appropriate model is an \(\textrm{ARIMA}(0,1,0)\) model.
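As a hedged cross-check on this choice, a unit root test and a portmanteau test on the differenced series can be run; the expectations in the comments are assumptions rather than verified output.

ndiffs(wmurders)                                # unit-root-based estimate of d; expected to agree with d = 1
Box.test(diff(wmurders), lag=10, type="Ljung")  # differenced series expected to be close to white noise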
Should you include a constant in the model? Explain.
The higher the value of \(d\), the more rapidly the prediction intervals increase in size. The constant \(c\) has important effects on the long-term forecasts obtained from ARIMA models. Below are the effects of combining different values of \(d\) and \(c\).
Long-term Forecast | \(c=0\) | \(c\neq0\) |
---|---|---|
\(d=0\) | Goes to Zero | Goes to the Mean of the Data |
\(d=1\) | Goes to a Non-zero Constant | Follows a Straight Line |
\(d=2\) | Follows a Straight Line | Follows a Quadratic Trend |
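A minimal sketch of the \(d=1\) row of this table, contrasting long-term forecasts with and without a constant (the object names are illustrative; the second fit is the model adopted below):

fit_nodrift <- Arima(wmurders, order=c(0,1,0), include.constant=FALSE)  # random walk: c = 0
fit_drift   <- Arima(wmurders, order=c(0,1,0), include.constant=TRUE)   # random walk with drift: c != 0
par(mfrow=c(1,2))
plot(forecast(fit_nodrift, h=10), main="c = 0: forecasts go to a constant")
plot(forecast(fit_drift,   h=10), main="c != 0: forecasts follow a straight line")
par(mfrow=c(1,1))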
plot(wmurders)
The \(\textrm{ARIMA}(0,1,0)\) model chosen has \(d=1\). The `wmurders` data have trends and a non-zero mean. An \(\textrm{ARIMA}(0,1,0)\) model with \(c=0\) represents a random walk, and an \(\textrm{ARIMA}(0,1,0)\) model with \(c\neq0\) represents a random walk with drift. Adding drift to these data seems like the most reasonable choice, so a constant should be included in the model.
Write this model in terms of the backshift operator.
The backward shift operator \(B\) is a useful notational device when working with time series lags. A seasonal difference is represented by \(\nabla_m=1-B^m\), where \(m\) is the seasonal period. Ordinary differencing is represented by \(\nabla^d=(1-B)^d\), where \(d\) is the differencing order. A seasonal difference of order \(d\) can be written \(\nabla^d_m=(1-B^m)^d\).
plot(diff(wmurders))
For the non-seasonal `wmurders` data, \(m=1\) and \(d=1\), so \(\nabla^d_m y_t=\nabla^1_1 y_t=(1-B^1)^1 y_t=(1-B) y_t=y_t-y_{t-1}\). Including the drift constant and the error term, the \(\textrm{ARIMA}(0,1,0)\) model with drift is written \((1-B)y_t = c + e_t\).
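The differencing identity can be checked numerically against the `diff()` function:

# (1 - B) y_t = y_t - y_{t-1} should equal the first difference returned by diff()
all.equal(as.numeric(diff(wmurders)),
          as.numeric(wmurders[-1] - wmurders[-length(wmurders)]))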
Fit the model using R and examine the residuals. Is the model satisfactory?
The `arima()` and `Arima()` functions are both useful for specifying an ARIMA model. The difference is that unlike `arima()`, the `Arima()` function allows for a constant \(c\) when \(d\neq0\), allows the fitted model to be applied to new data, and works well with the `forecast()` function. Setting `include.constant=TRUE` in `Arima()` sets `include.mean=TRUE` if \(d=0\) and `include.drift=TRUE` if \(d=1\). Setting `include.constant=FALSE` sets both `include.mean` and `include.drift` to `FALSE`.
(fit <- Arima(wmurders, order=c(0,1,0), include.constant=TRUE))
## Series: wmurders
## ARIMA(0,1,0) with drift
##
## Coefficients:
## drift
## 0.0030
## s.e. 0.0291
##
## sigma^2 estimated as 0.04649: log likelihood=6.73
## AIC=-9.47 AICc=-9.23 BIC=-5.49
Acf(residuals(fit))
Box.test(residuals(fit), lag=24, fitdf=4, type="Ljung")
##
## Box-Ljung test
##
## data: residuals(fit)
## X-squared = 23.329, df = 20, p-value = 0.2729
The \(\textrm{ARIMA}(0,1,0)\) model is satisfactory. The ACF plot of the residuals shows one of the twenty lags (5%) as significant, which is within probabilistic expectations at a 95% confidence level. The Box-Ljung test also shows that the residuals are not significantly different from white noise.
Forecast three times ahead. Check your forecasts by hand to make sure you know how they have been calculated.
Forecasts are calculated by first expanding the autoregressive (AR) and moving average (MA) parts of equation \(8.6.e.1\) with the appropriate values of \(p\) and \(q\), then applying the backshift operator with the appropriate value of \(d\). The equation is then rearranged so that it is written in terms of \(y_t\). This establishes the equation needed for the forecast at horizon \(k\), where every \(t\) is replaced by \(T+k\), written \(T+k|T\) for any period beyond the observed data. Errors with a time index beyond \(T\) are replaced by zero, their expected value, while errors at observed times are replaced by the corresponding residuals. \[\left( 1-\sum _{ i=1 }^{ p }{ \phi _{ i }B^{ i } } \right) \left( 1-B \right)^d y_{ t }=c+\left( 1+\sum _{ i=1 }^{ q }{ \theta _{ i }B^{ i } } \right) e_{ t } \tag{8.6.e.1}\]
(fcast <- forecast(fit, h=3))
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.592345 2.316033 2.868658 2.169762 3.014929
## 2006 2.595308 2.204543 2.986073 1.997684 3.192931
## 2007 2.598270 2.119683 3.076858 1.866334 3.330207
c <- fcast$model$coef[1] # drift
yt <- wmurders[55] # last obs
et <- 0 # expected white noise
ytPlus1 <- c + yt + et
ytPlus2 <- c + ytPlus1
ytPlus3 <- c + ytPlus2
data.frame(ytPlus1, ytPlus2, ytPlus3, row.names="forecast")
## ytPlus1 ytPlus2 ytPlus3
## forecast 2.592345 2.595308 2.59827
Given the \(\textrm{ARIMA}(0,1,0)\) model where \(p=q=0\) and \(d=1\), the AR and MA sums in equation \(8.6.e.1\) are empty, so: \[\left( 1-B \right) y_{ t }=c+e_{ t }\Rightarrow y_{ t }-y_{ t-1 }=c+e_{ t }\Rightarrow y_{ t }=c+y_{ t-1 }+e_{ t }\] Replacing each \(t\) with \(T+k\), written \(T+k|T\) for periods beyond the observed data, and replacing the future errors with their expected value of zero, gives: \[y_{ T+1|T }=c+y_{ T },\quad\quad y_{ T+2|T }=c+y_{ T+1|T },\quad\quad y_{ T+3|T }=c+y_{ T+2|T }\] The three-step-ahead forecasts match the manual calculations when the same value of \(c\) is used and the future errors are set to their expected value of zero.
Create a plot of the series with forecasts and prediction intervals for the next three periods shown.
The time plot is a line graph that plots each observed value against the time of the observation, with a single line connecting the observations across the entire period. When plotting forecast results, it displays the forecasts as well as prediction intervals.
fcast <- forecast(fit, h=3)
plot(fcast)
The time plot of the ARIMA model shows forecasts that resemble those of a random walk with drift. This is because, as stated in 8.6.b above, an \(\textrm{ARIMA}(0,1,0)\) model with \(c\neq0\) represents a random walk with drift.
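To illustrate this equivalence (a sketch; the two estimation routines differ slightly, so the point forecasts are expected to be essentially, not exactly, identical), the ARIMA forecasts can be set next to the `rwf()` random-walk-with-drift forecasts:

rwf_fcast <- rwf(wmurders, h=3, drift=TRUE)                  # random walk with drift forecasts
cbind(arima_drift = fcast$mean, rw_drift = rwf_fcast$mean)   # point forecasts side by side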
Does `auto.arima()` give the same model you have chosen? If not, which model do you think is better?
The `auto.arima()` function chooses an ARIMA model automatically. It uses a variation of the Hyndman-Khandakar algorithm, which combines unit root tests, minimization of the AICc, and MLE to obtain an ARIMA model. The function takes some shortcuts in order to speed up the computation and will not always yield the best model. Setting `stepwise` and `approximation` to `FALSE` prevents the function from taking shortcuts.
auto.arima(wmurders, stepwise=FALSE, approximation=FALSE)
## Series: wmurders
## ARIMA(0,2,3)
##
## Coefficients:
## ma1 ma2 ma3
## -1.0154 0.4324 -0.3217
## s.e. 0.1282 0.2278 0.1737
##
## sigma^2 estimated as 0.04475: log likelihood=7.77
## AIC=-7.54 AICc=-6.7 BIC=0.35
fit
## Series: wmurders
## ARIMA(0,1,0) with drift
##
## Coefficients:
## drift
## 0.0030
## s.e. 0.0291
##
## sigma^2 estimated as 0.04649: log likelihood=6.73
## AIC=-9.47 AICc=-9.23 BIC=-5.49
The `auto.arima()` function suggests an \(\textrm{ARIMA}(0,2,3)\) model. Compared to the manually chosen \(\textrm{ARIMA}(0,1,0)\) model, the `auto.arima()` model has a lower \(\sigma^2\) and a higher log likelihood, but a higher AIC, AICc, and BIC. These information criteria cannot be compared directly, however, because the two models use different orders of differencing and are therefore fitted to different data. `auto.arima()` chooses \(d\) with unit root tests and then minimizes the AICc only among models with that order of differencing, which is why it can return a model whose AICc is higher than that of a model with a different \(d\). A rough like-for-like comparison is sketched below.
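Because the information criteria are not directly comparable here, one alternative is to compare in-sample (one-step) accuracy measures, which are on the scale of the original data; a sketch with illustrative object names:

fit_auto   <- auto.arima(wmurders, stepwise=FALSE, approximation=FALSE)  # ARIMA(0,2,3) from above
fit_manual <- Arima(wmurders, order=c(0,1,0), include.constant=TRUE)     # manually chosen model
rbind(auto.arima = accuracy(fit_auto)["Training set", ],
      manual     = accuracy(fit_manual)["Training set", ])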
Consider the total net generation of electricity (in billion kilowatt hours) by the U.S. electric industry (monthly for the period 1985-1996); data set `usmelec`. In general there are two peaks per year: in mid-summer and mid-winter.
Examine the 12-month moving average of this series to see what kind of trend is involved.
A moving average smoother is helpful in examining what kind of trend is involved in a series. Moving average models should not be confused with moving average smoothing: a moving average model is used for forecasting future values, while moving average smoothing is used for estimating the trend-cycle of past values. The `ma()` function computes a simple moving average smoother of a given time series.
plot(usmelec, col="grey")
lines(ma(usmelec, order=12), col="red")
The 12-month moving average smoother fits these data well, as can be seen from the smoothness of the moving average line, which removes the seasonal fluctuations. The smoothed line shows that the data have an apparent upward trend.
Do the data need transforming? If so, find a suitable transformation.
Heteroskedasticity refers to the circumstance in which the variability of a variable is unequal across the range of values of a second variable. Box-Cox transformations can help to stabilize the variance of a time series. Some typical Box-Cox transformations are: \[-2\Rightarrow \frac { 1 }{ y_t^{ 2 } }, \quad -1\Rightarrow \frac { 1 }{ y_t }, \quad -0.5\Rightarrow \frac { 1 }{ \sqrt { y_t } }, \quad 0\Rightarrow \log { \left( y_t \right) }, \quad 0.5\Rightarrow \sqrt { y_t }, \quad 1\Rightarrow y_t, \quad 2\Rightarrow y_t^{ 2 }\]
(lambda <- BoxCox.lambda(usmelec))
## [1] -0.4772402
usmelecBC <- BoxCox(usmelec, round(lambda, 1))
par(mfrow=c(2, 1))
plot(usmelec)
plot(usmelecBC)
The Box-Cox procedure suggests \(\lambda=-0.4772402\). Rounding to \(\lambda=-0.5\) gives a more interpretable transformation of the form \({ 1 }/{ \sqrt { y_t } }\). This Box-Cox transformation stabilizes the variance and makes the series look relatively homoskedastic.
Are the data stationary? If not, find an appropriate differencing which yields stationary data.
In general, a stationary time series has no predictable patterns in the long term. A time series with trend, seasonality, or heteroskedasticity is not stationary. Transformations can help stabilize the variance of a time series, and differencing can help stabilize its mean by removing changes in level (trend and seasonality). A time series with aperiodic cyclic behavior can still be stationary because the timing of the cycles is not predictable in the long term. Applying the `diff()` function to a nonstationary series returns a suitably lagged and iterated differenced time series.
par(mfrow=c(2, 2))
diff1 <- diff(usmelecBC)
plot(diff1, ylab=expression(nabla^1*1/sqrt(y[t])))
lines(decompose(diff1)$trend, col="blue")
diff2 <- diff(usmelecBC, differences=2)
plot(diff2, ylab=expression(nabla^2*1/sqrt(y[t])))
lines(decompose(diff2)$trend, col="blue")
diff3 <- diff(usmelecBC, lag=12)
plot(diff3, ylab=expression(nabla[12]*1/sqrt(y[t])))
lines(decompose(diff3)$seasonal, col="red")
diff4 <- diff(diff(usmelecBC, lag=12))
plot(diff4, ylab=expression(nabla[12]*nabla^1*1/sqrt(y[t])))
The top-left plot shows first-order differencing of the transformed data to remove trend. Occasionally a differenced series will not appear stationary and it may be necessary to difference a second time to obtain a stationary series. In practice, it is almost never necessary to go beyond second-order differences. The top-right plot shows second-order differencing of the transformed data to further remove trend. Since both the first-order and second-order differencing result in a series with mean near zero, only first-order differencing is necessary. The bottom-left plot shows seasonal differencing of the transformed data to remove seasonality. The bottom-right plot shows both first-order and seasonal differencing applied to the transformed data. Applying both removes the trend and seasonality, which yields stationary data.
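These conclusions can be cross-checked with unit-root-based estimates of the required differencing orders (a sketch; the expectations in the comments are assumptions rather than verified output):

nsdiffs(usmelecBC)                 # seasonal differences suggested for the transformed series
ndiffs(diff(usmelecBC, lag=12))    # non-seasonal differences suggested after seasonal differencing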
Identify a couple of ARIMA models that might be useful in describing the time series. Which of your models is the best according to their AIC values?
The `tsdisplay()` function plots a time series along with its ACF and either its PACF, histogram, scatterplot, or spectrum in the lower right corner. The ACF and PACF plots are helpful in determining the value of \(p\) or \(q\) when \(p\) and \(q\) are not both positive.
Significant Spikes | \(\textrm{ARIMA}(p,d,0)\) | \(\textrm{ARIMA}(0,d,q)\) |
---|---|---|
ACF | Exponential Decay or Sinusoidal | At, but not beyond, Lag \(q\) |
PACF | At, but not beyond, Lag \(p\) | Exponential Decay or Sinusoidal |
tsdisplay(diff4, plot.type = "partial")
fit1 <- Arima(usmelec, order=c(0,1,3), seasonal=c(0,1,3), lambda=lambda)
fit1$aic
## [1] -4256.02
fit2 <- Arima(usmelec, order=c(2,1,0), seasonal=c(2,1,0), lambda=lambda)
fit2$aic
## [1] -4155.429
fit3 <- Arima(usmelec, order=c(2,1,3), seasonal=c(2,1,3), lambda=lambda)
fit3$aic
## [1] -4250.112
The significant spikes in the ACF through lag 3 suggest a non-seasonal MA(3) component, and the significant spike at lag 12 in the ACF suggests a seasonal MA component; one possible model is therefore an \(\textrm{ARIMA}(0,1,3)(0,1,3)_{12}\) model. The significant spikes in the PACF through lag 2 suggest a non-seasonal AR(2) component, and the significant spike at lag 12 in the PACF suggests a seasonal AR component; another possible model is therefore an \(\textrm{ARIMA}(2,1,0)(2,1,0)_{12}\) model. Combining both yields an \(\textrm{ARIMA}(2,1,3)(2,1,3)_{12}\) model. After fitting each model, the \(\textrm{ARIMA}(0,1,3)(0,1,3)_{12}\) model has the lowest AIC.
Estimate the parameters of your best model and do diagnostic testing on the residuals. Do the residuals resemble white noise? If not, try to find another ARIMA model which fits better.
White noise residuals are independent and identically distributed, so their autocorrelations are not significantly different from zero. ACF plots, PACF plots, and the Box-Ljung test are helpful in assessing whether residuals follow a white noise pattern.
tsdisplay(residuals(fit1))
Box.test(residuals(fit1), lag=24, fitdf=4, type="Ljung")
##
## Box-Ljung test
##
## data: residuals(fit1)
## X-squared = 31.49, df = 20, p-value = 0.04905
tsdisplay(residuals(fit3))
Box.test(residuals(fit3), lag=24, fitdf=4, type="Ljung")
##
## Box-Ljung test
##
## data: residuals(fit3)
## X-squared = 29.553, df = 20, p-value = 0.07742
The residuals of the \(\textrm{ARIMA}(0,1,3)(0,1,3)_{12}\) model appear to display the characteristics of white noise in the ACF plot, with only one of the twenty lags (5%) being significant, which is within probabilistic expectations at a 95% confidence level. The Box-Ljung test contradicts this conclusion, however, by showing that the autocorrelations of the residuals for the \(\textrm{ARIMA}(0,1,3)(0,1,3)_{12}\) model are significantly different from zero. The runner-up model with the second-lowest AIC is therefore tested. The ACF plot of the residuals of the \(\textrm{ARIMA}(2,1,3)(2,1,3)_{12}\) model shows more than one in twenty lags (over 5%) as significant, which would suggest the residuals are not white noise. The Box-Ljung test contradicts this conclusion as well by showing that the autocorrelations of the residuals for the \(\textrm{ARIMA}(2,1,3)(2,1,3)_{12}\) model are not significantly different from zero. The \(\textrm{ARIMA}(2,1,3)(2,1,3)_{12}\) model passes the required checks and is therefore suitable for forecasting. This example shows the importance of using visual diagnostics in conjunction with portmanteau tests.
Forecast the next 15 years of generation of electricity by the U.S. electric industry. Get the latest figures from here to check on the accuracy of your forecasts.
For these monthly data, a 15-year prediction consists of \((12)(15)=180\) months. When forecasting, a series can be transformed with a calculated value of \(\lambda\), modeled, and then forecast with that \(\lambda\) specified in the `forecast()` function. The `forecast()` function performs much better, however, when the calculated \(\lambda\) is built into the model with the `Arima()` function and the fitted model is then forecast.
lambda <- BoxCox.lambda(usmelec)
fit <- Arima(usmelec, order=c(2,1,3), seasonal=c(2,1,3), lambda=lambda)
fcast <- forecast(fit, h=12*15)
plot(fcast, ylab="kwh in Billions", main="15-year Prediction of Electricity Generation")
# DOWNLOAD ACTUAL DATA
actual <- read.csv(paste0("https://raw.githubusercontent.com/jzuniga123/",
"SPS/master/DATA%20624/electricity-overview.csv"), stringsAsFactors=FALSE)
# CLEAN DOWNLOADED DATA
actual <- head(actual, -1) # remove last row with descriptive text
actual_0 <- as.integer(unlist(strsplit(as.character(head(actual["Month"], 1)), "-")))
actual_N <- as.integer(unlist(strsplit(as.character(tail(actual["Month"], 1)), "-")))
usmelec_test <- ts(actual[2], start=actual_0, end=actual_N, frequency=12)
# PLOT PREDICTIONS AGAINST ACTUAL
buffer <- 5
usmelec_N <- tail(time(usmelec), 1)
fcast_0 <- c(floor(usmelec_N), as.numeric(substr(usmelec_N, 5, 7)) * 12 + 1)
periods <- length(ts(actual[2], start=fcast_0, end=actual_N, frequency=12))
plot(forecast(fit, h=periods+buffer), include=buffer, main="Prediction Validation")
lines(window(usmelec_test, start=fcast_0), col="red")
legend("topleft", lty=1, col=c(1,4,2), c("Training Data", "Forecast Values", "Actual Data"))
The forecasts in the first plot appear to project the trend, seasonality, and heteroskedastic attributes of the time series accurately. The prediction intervals widen considerably after the tenth year of the prediction, but the point estimates seem reasonable. For reproducibility, the downloaded validation dataset has been placed on GitHub. It requires some cleaning, but when finally plotted against the predictions, the forecast shows an astonishing level of accuracy: the point estimates of the forecasts and the actual data follow each other closely.
How many years of forecasts do you think are sufficiently accurate to be usable?
The `accuracy()` function is helpful for obtaining summary measures of forecast accuracy: Mean Error (ME), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE), and the autocorrelation of errors at lag 1 (ACF1).
accuracy(forecast(fit3, h=periods), window(usmelec_test, start=c(2010, 11)))
## ME RMSE MAE MPE MAPE MASE
## Training set -0.3673897 7.079241 5.137940 -0.1572162 1.975946 0.5643384
## Test set -4.6585379 10.249873 8.952395 -1.4997715 2.653519 0.9833086
## ACF1 Theil's U
## Training set -0.03110326 NA
## Test set 0.43041838 0.3281338
These predictions are highly accurate by every measure. The predictability of the trend, seasonality, and heteroskedastic attributes of the time series makes for a more accurate forecast. With such long and stable patterns in the data, predictions up to a decade ahead would not be unreasonable. After a decade the prediction intervals widen considerably, so any decisions based on later predictions would be unwise.
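One way to make this judgment more concrete (a sketch that reuses the validation objects created above; all names follow that code) is to compute the mean absolute percentage error separately for each forecast year:

h_all   <- length(window(usmelec_test, start=fcast_0))          # months of actual data available
fc_mean <- forecast(fit, h=h_all)$mean                          # point forecasts over that span
actuals <- as.numeric(window(usmelec_test, start=fcast_0))
ape     <- 100 * abs(actuals - as.numeric(fc_mean)) / actuals   # absolute percentage errors
round(tapply(ape, (seq_along(ape) - 1) %/% 12 + 1, mean), 2)    # MAPE by forecast year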