Introduction
For this simple model development I will construct a time series of the log change in Real Personal Consumption Expenditures, \(y_t = \Delta \log(c_t) = \log(c_t) - \log(c_{t-1})\), where \(c_t\) is the original quarterly Real Personal Consumption Expenditures series.
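To make the transformation concrete, here is a minimal sketch of the log change computed two ways. The vector `c_t` holds a handful of illustrative levels (an assumption for the example, not the downloaded data), and the comparison with simple percentage growth is only there to show why the log change is read as an approximate growth rate.

```r
# Illustrative quarterly levels (hypothetical values, not the real series)
c_t <- c(1199, 1219, 1223, 1224, 1230)

# Log change: y_t = log(c_t) - log(c_{t-1})
y_t <- diff(log(c_t))

# The same quantity written out explicitly
y_manual <- log(c_t[-1]) - log(c_t[-length(c_t)])
stopifnot(isTRUE(all.equal(y_t, y_manual)))

# For small changes the log change is close to the simple growth rate
pct_growth <- c_t[-1] / c_t[-length(c_t)] - 1
print(round(cbind(log_change = 100 * y_t, pct_growth = 100 * pct_growth), 3))
```

Note that differencing drops the first observation, so a series of length n yields n - 1 log changes.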
Data
I will import the time series for the quarterly Real Personal Consumption Expenditures from the Quandl website.
Before loading the data I need to take care of a few housekeeping issues. I will load the packages that are used during this model development.
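library(forecast)  # forecasting helpers for time series models
library(Quandl)    # data download from Quandl
library(ggplot2)   # static plots
library(dygraphs)  # interactive time series plots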
Next I will load the data.
# Register the API key, then download the FRED series PCECC96 as a zoo object
Quandl.api_key('Ltw-PAye5rkz6MwzLNx-')
rPCECC96 <- Quandl("FRED/PCECC96", type="zoo")
A quick inspection shows that the Real Personal Consumption Expenditures data is quarterly (frequency 4) and covers 1947 Q1 to 2015 Q4, for a total of 276 observations.
str(rPCECC96)
## 'zooreg' series from 1947 Q1 to 2015 Q4
## Data: num [1:276] 1199 1219 1223 1224 1230 ...
## Index: Class 'yearqtr' num [1:276] 1947 1947 1948 1948 1948 ...
## Frequency: 4
A plot of the original time series data will allow us to quickly determine if any transformations of the data are necessary.
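As a rough sketch of that inspection, the block below plots a level series next to its log change. Because the download needs an API key and a network connection, it uses a simulated stand-in (`level_sim`, an assumption for illustration) with steady growth; with the real data the same two calls would be applied to `rPCECC96`. The series `diff_log_rPCECC96` used in the model fits that follow is presumably `diff(log(rPCECC96))`, matching the definition of \(y_t\) in the introduction.

```r
set.seed(1)
n <- 276

# Hypothetical stand-in for the level series: about 0.8% average quarterly growth
growth <- rnorm(n - 1, mean = 0.008, sd = 0.008)
level_sim <- ts(1199 * exp(cumsum(c(0, growth))),
                start = c(1947, 1), frequency = 4)

# Log change of the simulated series
diff_log_sim <- diff(log(level_sim))

# A trending level on top, a roughly stable log change below
op <- par(mfrow = c(2, 1))
plot(level_sim, main = "Simulated level series", ylab = "Level")
plot(diff_log_sim, main = "Log change", ylab = "diff(log(level))")
par(op)
```

A level series that trends upward at a roughly constant rate is not stationary, while its log change fluctuates around a stable mean, which is what the ARMA-type models below assume.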
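# Log change of the series, y_t from the introduction
# (assumed definition; the original assignment is not shown in this section)
diff_log_rPCECC96 <- diff(log(rPCECC96))
# Fit AR(p) models for p = 2, 1, 3; order = c(p, d, q)
ar2.model <- arima(diff_log_rPCECC96, order = c(2,0,0))
ar1.model <- arima(diff_log_rPCECC96, order = c(1,0,0))
ar3.model <- arima(diff_log_rPCECC96, order = c(3,0,0))
# Residual diagnostics for the AR(2) model
tsdiag(ar2.model, gof.lag = 12)
# Residual diagnostics for the AR(1) model
tsdiag(ar1.model, gof.lag = 12)
# Residual diagnostics for the AR(3) model
tsdiag(ar3.model, gof.lag = 12)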
Based on the diagnostic plots above, the AR(2) model appears to be the best fitting AR model. The AR(1) residuals show some remaining autocorrelation and, more importantly, the p-values of the Ljung-Box statistics are all small, which rejects the hypothesis that the residuals are uncorrelated and indicates that the model leaves some pattern unexplained.
The AR(3) diagnostics look similar to those of the AR(2), so the third lag appears to add little.
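The Ljung-Box p-values read off the diagnostic plots can also be computed directly with `Box.test()`. The sketch below does this on a simulated AR(2) series (`y_sim`, a hypothetical stand-in for the log-change data). It sets `fitdf` to the number of estimated AR coefficients, which is a common adjustment; as far as I know `tsdiag()` does not apply it, so the numbers will not match the plots exactly.

```r
set.seed(42)

# Hypothetical stand-in for the log-change series: a simulated AR(2) process
y_sim <- arima.sim(model = list(ar = c(0.2, 0.3)), n = 275)

fit_ar1 <- arima(y_sim, order = c(1, 0, 0))
fit_ar2 <- arima(y_sim, order = c(2, 0, 0))

# Ljung-Box test on the residuals at lag 12
# H0: no autocorrelation in the residuals up to the given lag
lb_ar1 <- Box.test(residuals(fit_ar1), lag = 12, type = "Ljung-Box", fitdf = 1)
lb_ar2 <- Box.test(residuals(fit_ar2), lag = 12, type = "Ljung-Box", fitdf = 2)

print(c(ar1 = lb_ar1$p.value, ar2 = lb_ar2$p.value))
```

A small p-value is evidence against uncorrelated residuals, so an adequate model should produce large p-values across the lags checked.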
I will look at the Akaike Information Criterion (AIC) of the three models to decide which one to prefer. A lower AIC indicates a better trade-off between fit and number of parameters.
aicCompare <- c(ar1=ar1.model$aic, ar2=ar2.model$aic, ar3=ar3.model$aic)
aicCompare
## ar1 ar2 ar3
## -1858.642 -1886.158 -1884.233
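The Akaike Information Criterion favors the AR(2) model among the three models in this basic autoregressive (AR) analysis. Its AIC of -1886.158 is about 27.5 below that of the AR(1) but only about 1.9 below that of the AR(3), so the preference over the AR(3) is modest.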
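One way to make such a comparison easier to read is to tabulate each model's AIC next to its distance from the best model. The sketch below does this for AR(1) to AR(3) on a simulated series (`y_sim` is a hypothetical stand-in, so the values will differ from those reported above). The BIC column is an addition of mine for comparison, since it penalizes extra parameters more heavily.

```r
set.seed(123)

# Hypothetical stand-in for the log-change series
y_sim <- arima.sim(model = list(ar = c(0.2, 0.3)), n = 275)

# Fit AR(1), AR(2) and AR(3)
fits <- lapply(1:3, function(p) arima(y_sim, order = c(p, 0, 0)))
names(fits) <- paste0("ar", 1:3)

# Collect information criteria; lower is better for both
ic <- data.frame(
  aic = sapply(fits, AIC),
  bic = sapply(fits, BIC)
)

# Distance of each model from the lowest AIC (0 marks the preferred model)
ic$delta_aic <- ic$aic - min(ic$aic)
print(round(ic, 3))
```

A common rule of thumb treats AIC differences below about 2 as weak evidence, which is why a near tie between two orders is usually resolved in favor of the smaller model.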
Moving Average (MA) Model Exploration
The ACF and PACF plots are just as ambiguous about the moving-average order, so I will again take the order-2 model, MA(2), as my primary model of interest and compare its results to the MA(1) and MA(3) models.
# Fit MA(q) models for q = 2, 1, 3; order = c(p, d, q)
ma2.model <- arima(diff_log_rPCECC96, order = c(0,0,2))
ma1.model <- arima(diff_log_rPCECC96, order = c(0,0,1))
ma3.model <- arima(diff_log_rPCECC96, order = c(0,0,3))
# Residual diagnostics for the MA(2) model
tsdiag(ma2.model, gof.lag = 12)
# Residual diagnostics for the MA(1) model
tsdiag(ma1.model, gof.lag = 12)
# Residual diagnostics for the MA(3) model
tsdiag(ma3.model, gof.lag = 12)
Based on the diagnostic plots above, the MA(2) model appears to be the best fitting MA model. The MA(1) residuals show some remaining autocorrelation and, more importantly, the p-values of the Ljung-Box statistics are all small, which rejects the hypothesis that the residuals are uncorrelated and indicates that the model leaves some pattern unexplained.
The MA(3) diagnostics look similar to those of the MA(2), so the third lag appears to add little.
I will again look at the Akaike Information Criterion of the three models to decide which one to prefer.
aicCompare_ma <- c(ma1=ma1.model$aic, ma2=ma2.model$aic, ma3=ma3.model$aic)
aicCompare_ma
## ma1 ma2 ma3
## -1857.783 -1889.769 -1889.209
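The Akaike Information Criterion favors the MA(2) model among the three models in this basic moving-average (MA) analysis. Its AIC of -1889.769 is about 32 below that of the MA(1) but only about 0.6 below that of the MA(3), so the MA(2) and MA(3) are close.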
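For reference, the textbook signatures used to read an order off ACF and PACF plots can be reproduced with `ARMAacf()`: the ACF of an MA(q) process is zero beyond lag q, and the PACF of an AR(p) process is zero beyond lag p. The coefficients below are hypothetical and chosen only for illustration. Sample ACF/PACF plots from real data are noisier, which is where the ambiguity mentioned earlier comes from.

```r
# Hypothetical coefficients, chosen only for illustration
ma_coef <- c(0.4, 0.3)
ar_coef <- c(0.2, 0.3)

# Theoretical ACF of an MA(2): lags 0 to 6, zero beyond lag 2
acf_ma2 <- ARMAacf(ma = ma_coef, lag.max = 6)

# Theoretical PACF of an AR(2): lags 1 to 6, zero beyond lag 2
pacf_ar2 <- ARMAacf(ar = ar_coef, lag.max = 6, pacf = TRUE)

print(round(acf_ma2, 3))
print(round(pacf_ar2, 3))
```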
Conclusion
This project called for the analysis of quarterly Real Personal Consumption Expenditures time series data. The first step was to determine which order of autoregressive model fits the log change of this series best. I concluded that an AR(2) model fits better than an AR(1) or AR(3). The next step was to determine which order of moving-average model, MA(q), fits best. I concluded that an MA(2) model fits better than an MA(1) or MA(3) model. Across the two families, the MA(2) has the slightly lower AIC (-1889.769 versus -1886.158 for the AR(2)).
However, for this time series a mixed ARMA(1,2) model might fit better than either pure model. I have not fitted it here, so further investigation would be needed to determine the viability of this idea.
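A first step in that investigation could look like the sketch below, which fits an ARMA(1,2) alongside the AR(2) and MA(2) and compares AIC values and residual diagnostics. It runs on a simulated series (`y_sim`, a hypothetical stand-in generated from an ARMA(1,2) with made-up coefficients), so it says nothing about which model wins on the real data. With the real data, `y_sim` would be replaced by `diff_log_rPCECC96`.

```r
set.seed(2016)

# Hypothetical stand-in: a simulated ARMA(1,2) process
y_sim <- arima.sim(model = list(ar = 0.5, ma = c(0.2, 0.3)), n = 275)

# Candidate orders as c(p, d, q)
candidates <- list(
  ar2    = c(2, 0, 0),
  ma2    = c(0, 0, 2),
  arma12 = c(1, 0, 2)
)
fits <- lapply(candidates, function(ord) arima(y_sim, order = ord))

# Compare AIC across the candidates; lower is better
aic_cmp <- sapply(fits, function(f) f$aic)
print(round(aic_cmp, 3))

# Check the ARMA(1,2) residuals for leftover autocorrelation
# fitdf = 3 accounts for the one AR and two MA coefficients
lb <- Box.test(residuals(fits$arma12), lag = 12, type = "Ljung-Box", fitdf = 3)
print(lb$p.value)
```

The ARMA(1,2) would only be worth adopting if it lowers the AIC by a meaningful margin and its residuals pass the Ljung-Box check, since it uses one more parameter than either pure model.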