The first step is to import the monthly time series for Consumer Price Index for All Urban Consumers, FRED/CPILFENS, using Quandl packages. We will denote FRED/CPILFENS as \(CPI_{t}\).
cpi <- Quandl("FRED/CPILFENS", type="ts")
str(cpi)
Time-Series [1:707] from 1957 to 2016: 28.5 28.5 28.7 28.8 28.8 28.9 28.9 29 29.1 29.2 ...
The next step is to construct a 12 month (y-o-y) inflation rates, \(inflation_{t}\), defined as follow:
\(inflation_{t} = \frac{CPI_{t} - CPI_{t-12}}{CPI_{t-12}}*100\)
The calculation will be implemented in R using the following code:
inflation <- (diff(cpi,12)/lag(cpi,-12))*100
plot(inflation, main="y-o-y Inflation Rate", ylab="% Changes in y-o-y CPI")
We then split the sample into two parts (up to 2014 and after 2014) in order to be able to show out model accuracy after forecast.
# split sample into two parts - estimation sample and prediction sample
inflation_all <- inflation
inflation1 <- window(inflation_all, end=c(2014,12))
inflation2 <- window(inflation_all, start=c(2015,1))
# first part used to identify and estimate the model
inflation <- inflation1
The first step in model identification is to plot the ACF and PACF of \(inflation_{t}\):
However, from the ACF and PACF above, a very slowly decaying ACF suggests nonstationarity and presence of deterministic or stochastic trend in the time series. Before proceed further into forecasting, we need to check for the existence of unit-root and the stationarity of our dataset.
To check for the unit-root, we will run ADF test for \(inflation_{t}\) as follow:
inflation.urdf <- ur.df(inflation, type = "trend", selectlags = "BIC")
summary(inflation.urdf)
##
## ###############################################
## # Augmented Dickey-Fuller Test Unit Root Test #
## ###############################################
##
## Test regression trend
##
##
## Call:
## lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.26363 -0.11118 -0.01523 0.12409 1.82953
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.0424792 0.0259896 1.634 0.1026
## z.lag.1 -0.0072426 0.0036857 -1.965 0.0498 *
## tt -0.0000481 0.0000486 -0.990 0.3226
## z.diff.lag 0.3506552 0.0359593 9.751 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.2419 on 678 degrees of freedom
## Multiple R-squared: 0.1262, Adjusted R-squared: 0.1223
## F-statistic: 32.64 on 3 and 678 DF, p-value: < 2.2e-16
##
##
## Value of test-statistic is: -1.965 1.3851 2.0639
##
## Critical values for test statistics:
## 1pct 5pct 10pct
## tau3 -3.96 -3.41 -3.12
## phi2 6.09 4.68 4.03
## phi3 8.27 6.25 5.34
From the result of ADF-test, using \(\tau_{3}\) statistics, we failed reject \(H_{0}: \gamma = 0\). This indicate that \(inflation_{t}\) is not trend stationary. Furthermore, using \(\phi_{3}\) statistics, we also failed to reject \(H_{0}: \gamma = \beta = 0\).
Further testing using alternative specification of non-zero mean (drift, Model B as in the notes) and zero mean (Model A as in the notes) are conducted using the following code:
inflation.urdf <- ur.df(inflation, type = "drift", selectlags = "BIC")
summary(inflation.urdf)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression drift
Call:
lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-1.26920 -0.11666 -0.01028 0.12278 1.82743
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.022548 0.016431 1.372 0.1704
z.lag.1 -0.006333 0.003569 -1.774 0.0765 .
z.diff.lag 0.351152 0.035955 9.766 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2419 on 679 degrees of freedom
Multiple R-squared: 0.1249, Adjusted R-squared: 0.1224
F-statistic: 48.47 on 2 and 679 DF, p-value: < 2.2e-16
Value of test-statistic is: -1.7743 1.5878
Critical values for test statistics:
1pct 5pct 10pct
tau2 -3.43 -2.86 -2.57
phi1 6.43 4.59 3.78
inflation.urdf <- ur.df(inflation, selectlags = "BIC")
summary(inflation.urdf)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression none
Call:
lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-1.30086 -0.10505 0.00215 0.12933 1.81181
Coefficients:
Estimate Std. Error t value Pr(>|t|)
z.lag.1 -0.002287 0.002013 -1.136 0.256
z.diff.lag 0.348722 0.035935 9.704 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.242 on 680 degrees of freedom
Multiple R-squared: 0.1226, Adjusted R-squared: 0.12
F-statistic: 47.5 on 2 and 680 DF, p-value: < 2.2e-16
Value of test-statistic is: -1.1361
Critical values for test statistics:
1pct 5pct 10pct
tau1 -2.58 -1.95 -1.62
From both non-zero mean and zero mean model for ADF-test result, we failed to reject \(H_{0}\), this indicate that \(inflation_{t}\) is not trend stationary and following a random walk that warrant for differencing before we can use the time series for forecasting purpose.
To confirm this preliminary finding, we can run additional KPSS test for stationary as follow:
test1 <- kpss.test(inflation, null = "Trend")
test1
KPSS Test for Trend Stationarity
data: inflation
KPSS Trend = 1.3385, Truncation lag parameter = 6, p-value = 0.01
From the result of KPSS test we reject \(H_{0}: inflation_{t}\) is stationary, we conclude that \(inflation_{t}\) is not trend stationary and following a random walk. We have to take first difference for \(inflation_{t}\) and conduct another unit-root and stationarity test.
In this part we transform \(inflation_{t}\) to \(\Delta inflation_{t}\) by differencing and conduct the following test:
dinflation <- diff(inflation)
dinflation.urdf <- ur.df(dinflation, type = "trend", selectlags = "BIC")
summary(dinflation.urdf)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression trend
Call:
lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-1.36961 -0.10403 -0.00324 0.11885 1.77956
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.868e-03 1.833e-02 0.429 0.668
z.lag.1 -5.358e-01 4.313e-02 -12.422 < 2e-16 ***
tt -2.526e-05 4.649e-05 -0.543 0.587
z.diff.lag -1.794e-01 3.774e-02 -4.752 2.46e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2384 on 677 degrees of freedom
Multiple R-squared: 0.3489, Adjusted R-squared: 0.3461
F-statistic: 120.9 on 3 and 677 DF, p-value: < 2.2e-16
Value of test-statistic is: -12.4218 51.4406 77.1603
Critical values for test statistics:
1pct 5pct 10pct
tau3 -3.96 -3.41 -3.12
phi2 6.09 4.68 4.03
phi3 8.27 6.25 5.34
test2 <- kpss.test(dinflation)
test2
KPSS Test for Level Stationarity
data: dinflation
KPSS Level = 0.078593, Truncation lag parameter = 6, p-value = 0.1
From ADF test we conclude that \(\Delta inflation_{t}\) is difference stationary. However KPSS test only indicate a borderline rejection of \(H_{0}: \Delta inflation_{t}\) is stationary. To confirm our finding, we will run an additional Phillips-Perron Test as follow:
dinflation.urpp <- ur.pp(dinflation)
summary(dinflation.urpp)
##################################
# Phillips-Perron Unit Root Test #
##################################
Test regression with intercept
Call:
lm(formula = y ~ y.l1)
Residuals:
Min 1Q Median 3Q Max
-1.33009 -0.11129 -0.00318 0.12262 1.79180
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.001533 0.009277 -0.165 0.869
y.l1 0.347707 0.035959 9.669 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.2423 on 680 degrees of freedom
Multiple R-squared: 0.1209, Adjusted R-squared: 0.1196
F-statistic: 93.5 on 1 and 680 DF, p-value: < 2.2e-16
Value of test-statistic, type: Z-alpha is: -516.3753
aux. Z statistics
Z-tau-mu -0.1716
The null hypothesis of Phillips-Perron test is that the \(\Delta inflation_{t}\) contains a unit root. From the magnitude of the test statistics of -516.3753, compared to the critical value for the test, we found that \(\Delta inflation_{t}\) is difference stationary.
The ACF and PACF of \(\Delta inflation_{t}\) can be seen below:
par(mfcol=c(1,2))
acf(as.data.frame(dinflation), lag=24, main=expression(Delta("y-o-y Inflation")))
pacf(as.data.frame(dinflation), lag=24, main=expression(Delta("y-o-y Inflation")))
From the ACF and PACF, the most likely specification for \(\Delta inflation_{t}\) is ARIMA(1,0,1)(2,0,1)[12]. The next step is to decide whether to use zero mean or non-zero mean specification. To make the determination, we can compare the ADF test from both specification as follow:
dinflation.urdf.nonzeromean <- ur.df(dinflation, type = "drift", selectlags = "BIC")
dinflation.urdf.zeromean <- ur.df(dinflation, selectlags = "BIC")
We then compare the \(\tau_{3}\) statistics for non-zero and zero mean and found the following:
dinflation.urdf.nonzeromean@teststat[1]
[1] -12.41715
dinflation.urdf.zeromean@teststat[1]
[1] -12.42601
We found that zero mean specification is more preferable compared to non-zero mean.Hence, we will estimate ARIMA(1,0,1)(2,0,1)[12] with zero mean as our preferred specification and the base of our forecast. This can also be confirmed using auto.arima function as follow:
m1 <- auto.arima(dinflation, ic="bic", stepwise = FALSE)
m1
Series: dinflation
ARIMA(1,0,1)(2,0,1)[12] with zero mean
Coefficients:
ar1 ma1 sar1 sar2 sma1
0.9412 -0.6809 0.0218 -0.0023 -0.8226
s.e. 0.0218 0.0514 0.0522 0.0479 0.0357
sigma^2 estimated as 0.03576: log likelihood=161.93
AIC=-311.86 AICc=-311.74 BIC=-284.7
Forecasting with ARIMA(1,0,1)(2,0,1)[12] using \(\Delta inflation_{t}\) transformation:
m1.fcast <- forecast(m1, h=13)
plot(m1.fcast, xlim=c(2012,2016))
lines(diff(inflation_all))