Introduction

This research project examines the Canadian inflation rate, specifically headline inflation, which the Bank of Canada defines as the year-over-year change in the headline Consumer Price Index (CPI). Headline CPI is an all-items price index, including major components such as food, services, and energy. It is important to note that the Bank of Canada also uses other measures of inflation, such as CPI-core, CPI-trim, and CPI-median, to complement headline inflation. However, headline inflation is still the inflation-rate target for the Bank of Canada, for now at least. This suggests that some items in the CPI may capture Canadian consumer behaviour inaccurately because price sensitivities change over time, and that more refined measures are needed to account for this.

This leads to my next point: why this topic is of interest. The Canadian inflation rate is incredibly important because, as explained above, the Bank of Canada uses it as the benchmark for conducting monetary policy. The Bank of Canada's target range is 1-3%, with 2% as the target rate. Inflation is the trade-off that accompanies economic growth, and growth and inflation must be balanced to prevent an economy from growing too fast and overheating. One of the Bank of Canada's main tools for controlling inflation is the policy rate, the interest rate at which it lends to and credits the banks in Canada, which in turn is passed on to consumers through borrowing and lending rates. This works through movements in the prime rate, which affects loans and mortgages and thereby changes consumer behaviour.

Data Description and Collection

The data for this project comes from Statistics Canada, Table: 18-10-0004-01. The data is monthly and spans from January 1999 to June 2018. It is important to note that the data is seasonally unadjusted.

Data Analysis

CPI.Data <- read.csv("~/Desktop/York Econ/ECON6210/CPI Data.csv")
df <- ts(CPI.Data, start=c(1999, 1), end=c(2018, 6), frequency=12)
cpi = df[, "CPI"]

To start the data analysis, I imported the CPI data, converted it to a monthly time series object, and then extracted the CPI variable from the data frame.
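For reproducibility, the code in this report relies on several packages; the following is a minimal sketch of the library calls implied by the functions used below (forecast, ggplot2, seasonal, vars, and knitr).

library(forecast)  # autoplot, ses, holt, hw, ets, auto.arima, tbats, nnetar, tsCV, accuracy
library(ggplot2)   # ggplot layers used for the MSE charts
library(seasonal)  # seas() for the X11 decomposition
library(vars)      # VAR, VARselect, causality, fevd, irf
library(knitr)     # kable() for the accuracy tables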

Inflation= 100*diff(log(cpi), lag = 12)

This generates the year-over-year log change of CPI, which closely approximates headline inflation, the variable in question throughout this paper.
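In symbols, with \(CPI_t\) denoting the monthly index, the series computed above is

\[
\pi_t = 100\left(\ln CPI_t - \ln CPI_{t-12}\right) \approx 100\,\frac{CPI_t - CPI_{t-12}}{CPI_{t-12}},
\]

so the 12-month log difference closely approximates the year-over-year percentage change.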

autoplot(Inflation) +
  ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate")

summary(Inflation)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.9545  1.2652  1.9549  1.8952  2.4016  4.5779

This is the growth rate of CPI, that is, inflation, which means the trend has been removed and the cyclical behaviour of the data is now more visible. The financial crisis is now evident: it left Canada in a brief period of deflation, i.e. negative inflation. The mean of inflation is close to 1.9%, consistent with the central bank targeting the midpoint of the 1-3% range. From a visual inspection, the data seem to have some seasonality; however, the plots that follow highlight seasonality more directly.

ggseasonplot(Inflation)

ggseasonplot(Inflation, polar=TRUE)

Some minor seasonality is visible in the plots, but because it is so small it is difficult to draw firm conclusions from the seasonal plots alone.

checkresiduals(cpi)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.

checkresiduals(Inflation)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.

The error distribution for inflation is better behaved than that of CPI above, but high autocorrelation is still present. Moreover, the distribution of the residuals is not exactly normal: much better than the CPI residual distribution, but not ideal. The financial crisis of 2008-09 shows up in the left tail of the distribution, and the centre of the distribution is where the fit appears poorest. The autocorrelation chart gives a better view of the minor seasonality in the data: there is a statistically significant negative correlation at a lag of 12 months and a positive correlation at 36 months, and the autocorrelation follows a wave-like pattern across the lags. Another observation is the high degree of autocorrelation in inflation even after transforming and differencing the data.
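To isolate the correlogram being described here, the ACF can also be plotted on its own; a small optional sketch using the forecast package's ggAcf():

ggAcf(Inflation, lag.max = 48) +
  ggtitle("ACF of Canadian Inflation")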

y = Inflation
df.start <- window(df, start=c(2000, 1), frequency = 12)
df1 <- cbind(df.start, y)
head(df1)
##          df.start.Time df.start.CPI        y
## Jan 2000           219         93.5 2.162246
## Feb 2000           217         94.1 2.692677
## Mar 2000           225         94.8 2.998083
## Apr 2000           211         94.5 2.139119
## May 2000           227         94.9 2.345523
## Jun 2000           223         95.5 2.760260

This was done to clean the data: because inflation is a year-over-year change, its first twelve months are unavailable, so the data frame is windowed to start in January 2000 to align the series.

CPI.NEW <- read.csv("~/Desktop/York Econ/ECON6210/CPI NEW.csv")
df.c <- ts(CPI.NEW, start=c(1999, 1), end=c(2018, 10), frequency=12)
cpi1 = df.c[, "CPI"]
y1= 100*diff(log(cpi1), lag = 12)
y1n = window(y1, start=c(2018, 6))
df1 <- cbind(df.start, y, y1n)

For an additional accuracy check, I add the observations for July, August, September, and October 2018 from the same data source as a new variable, to compare with my forecasts in the next section. These observations are not included in the original data used to conduct this analysis; however, it will be interesting to compare against them.

Forecasting

Time Series Decomposition

decompose(y, type= "additive")%>% autoplot() + ggtitle("Classical Decomposition of Canadian Inflation Rate")

seas(y, x11="") %>% autoplot() + ggtitle("X11 Decomposition of Canadian Inflation Rate")

mstl(y) %>% autoplot() + ggtitle("STL Decomposition of Canadian Inflation Rate")

mstl(y) %>% summary()
##       Data             Trend          Seasonal12        
##  Min.   :-0.9545   Min.   :0.3399   Min.   :-0.0604426  
##  1st Qu.: 1.2652   1st Qu.:1.4718   1st Qu.:-0.0242701  
##  Median : 1.9549   Median :1.8864   Median :-0.0020470  
##  Mean   : 1.8952   Mean   :1.8960   Mean   : 0.0005964  
##  3rd Qu.: 2.4016   3rd Qu.:2.3116   3rd Qu.: 0.0283187  
##  Max.   : 4.5779   Max.   :3.2338   Max.   : 0.0666693  
##    Remainder        
##  Min.   :-1.305289  
##  1st Qu.:-0.273163  
##  Median :-0.002234  
##  Mean   :-0.001350  
##  3rd Qu.: 0.273722  
##  Max.   : 1.388157

Seeing as I am using Canadian inflation data, I thought it would be interesting to include an X11 decomposition, as Statistics Canada uses X11 for seasonal adjustment. The advantage of X11 over the classical decomposition method is that X11 allows the seasonal component to vary over time, whereas the classical method assumes constant seasonality. It is odd, however, that the remainders for the X11 method are all negative. Furthermore, the automated STL decomposition gives a seasonal component that differs from the previous methods: it increases over time, while the X11 method shows seasonality decreasing over time. What can be drawn from this analysis is that seasonality is relatively small, with a mean of roughly 0% and a contribution of at most about |0.06| percentage points, although it does vary over time.
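For reference, the classical decomposition called above splits the series additively as

\[
y_t = T_t + S_t + R_t,
\]

with trend-cycle \(T_t\), seasonal component \(S_t\), and remainder \(R_t\); STL estimates the same additive components but lets \(S_t\) change slowly over time, which X11 also allows.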

train <- window(y, start=c(2000,1), end = c(2016, 12))
test <- window(y, start=c(2017, 1))
both <- window(y, start=c(2000, 1))
h=length(test)

Chapter 3 Forecast Methods

fit1 <- meanf(train, h=h)
fit2 <- naive(train, h=h)
fit3 <- snaive(train, h=h)
autoplot(y) +  
  autolayer(fit1, series="Mean", PI=FALSE) + 
  autolayer(fit2, series="Naive", PI=FALSE) +
  autolayer(fit3, series="Seasonal naive", PI=FALSE) +
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate Simple Forecasts") +
  guides(colour=guide_legend(title="Forecast"))

Looking at the graph, the seasonal naive model initially seems to fit the data better; however, it does not capture the increase near the end of the sample. This makes sense: in the years after the 2015 oil price shock, Canada experienced low inflation, and an eventual increase was foreseeable as policy responded, a response the seasonal naive forecast cannot capture. Throughout 2017 economic growth was steady while inflation appeared to lag, so a sharp increase was foreseeable, and the Bank of Canada responded by raising the policy rate for the first time in seven years, by 25 basis points. This can explain why the mean appears to be the best forecast among these simple methods: inflation can be expected to return to the 2% mark shortly, which makes the long-run mean a reasonable prediction.
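As a reminder of what these benchmarks do, their point forecasts for horizon \(h\) (with \(m = 12\) for monthly data) are

\[
\hat{y}^{\text{mean}}_{T+h|T} = \bar{y}, \qquad
\hat{y}^{\text{naive}}_{T+h|T} = y_T, \qquad
\hat{y}^{\text{snaive}}_{T+h|T} = y_{T+h-m(k+1)},
\]

where \(k\) is the integer part of \((h-1)/m\).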

Chapter 4 Forecast Methods

tps <- tslm(train ~ trend + season)
summary(tps) 
## 
## Call:
## tslm(formula = train ~ trend + season)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.7892 -0.3981  0.0487  0.4147  2.2554 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.559112   0.226860  11.281  < 2e-16 ***
## trend       -0.006597   0.001006  -6.559 4.95e-10 ***
## season2      0.014072   0.289651   0.049    0.961    
## season3      0.031946   0.289657   0.110    0.912    
## season4      0.025028   0.289665   0.086    0.931    
## season5      0.041800   0.289678   0.144    0.885    
## season6      0.049404   0.289693   0.171    0.865    
## season7      0.034230   0.289712   0.118    0.906    
## season8      0.019070   0.289735   0.066    0.948    
## season9      0.011351   0.289761   0.039    0.969    
## season10     0.025351   0.289791   0.087    0.930    
## season11     0.015403   0.289824   0.053    0.958    
## season12     0.006563   0.289861   0.023    0.982    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8445 on 191 degrees of freedom
## Multiple R-squared:  0.1846, Adjusted R-squared:  0.1334 
## F-statistic: 3.604 on 12 and 191 DF,  p-value: 7.255e-05
fit4 = forecast(tps,h=h)
p4 = autoplot(fit4)
autoplot(train) + 
  autolayer(fit4) +
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate Linear Regression Forecast")

A similar story can be told here: the linear trend with seasonal dummy variables predicts a downward path for inflation. As with the seasonal naive forecast, the test period is a time of lagging inflation that was expected to rise in future periods. Variables such as unemployment and output affect inflation over time, so a trend with seasonal dummies alone will not give a robust forecast. Moreover, none of the seasonal dummies is statistically significant, suggesting that seasonality is not very strong. Finally, \(R^{2}\) is equal to 0.1846, suggesting a weak model.
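The fitted model is a linear trend with monthly dummy variables, with January as the base month:

\[
y_t = \beta_0 + \beta_1 t + \sum_{i=2}^{12} \gamma_i d_{i,t} + \varepsilon_t,
\]

where \(d_{i,t} = 1\) if month \(t\) falls in season \(i\) and 0 otherwise.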

Chapter 7 Forecast Methods

fit5 <- ses(train, h = h)
summary(fit5)
## 
## Forecast method: Simple exponential smoothing
## 
## Model Information:
## Simple exponential smoothing 
## 
## Call:
##  ses(y = train, h = h) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
## 
##   Initial states:
##     l = 2.1625 
## 
##   sigma:  0.4535
## 
##      AIC     AICc      BIC 
## 766.2419 766.3619 776.1962 
## 
## Error measures:
##                        ME      RMSE       MAE        MPE     MAPE
## Training set -0.003293113 0.4512553 0.3518542 0.04074016 30.10265
##                   MASE       ACF1
## Training set 0.3276743 0.09157289
## 
## Forecasts:
##          Point Forecast       Lo 80    Hi 80       Lo 95    Hi 95
## Jan 2017       1.490777  0.90961377 2.071939  0.60196479 2.379588
## Feb 2017       1.490777  0.66892940 2.312624  0.23386980 2.747683
## Mar 2017       1.490777  0.48424025 2.497313 -0.04858787 3.030141
## Apr 2017       1.490777  0.32853824 2.653015 -0.28671354 3.268267
## May 2017       1.490777  0.19136113 2.790192 -0.49650784 3.478061
## Jun 2017       1.490777  0.06734302 2.914210 -0.68617716 3.667730
## Jul 2017       1.490777 -0.04670372 3.028257 -0.86059660 3.842150
## Aug 2017       1.490777 -0.15285606 3.134409 -1.02294259 4.004496
## Sep 2017       1.490777 -0.25255667 3.234110 -1.17542152 4.156975
## Oct 2017       1.490777 -0.34685597 3.328409 -1.31963985 4.301193
## Nov 2017       1.490777 -0.43654694 3.418100 -1.45681036 4.438363
## Dec 2017       1.490777 -0.52224566 3.503799 -1.58787523 4.569428
## Jan 2018       1.490777 -0.60444205 3.585995 -1.71358376 4.695137
## Feb 2018       1.490777 -0.68353335 3.665086 -1.83454347 4.816097
## Mar 2018       1.490777 -0.75984694 3.741400 -1.95125504 4.932808
## Apr 2018       1.490777 -0.83365642 3.815209 -2.06413691 5.045690
## May 2018       1.490777 -0.90519324 3.886746 -2.17354302 5.155096
## Jun 2018       1.490777 -0.97465521 3.956208 -2.27977594 5.261329
p5 = autoplot(fit5)+
  xlab("Year") + ylab("Inflation (%)")
fit6 <- holt(train,  h=h) 
summary(fit6)
## 
## Forecast method: Holt's method
## 
## Model Information:
## Holt's method 
## 
## Call:
##  holt(y = train, h = h) 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
##     beta  = 1e-04 
## 
##   Initial states:
##     l = 2.2581 
##     b = -0.0031 
## 
##   sigma:  0.4558
## 
##      AIC     AICc      BIC 
## 770.2937 770.5967 786.8843 
## 
## Error measures:
##                         ME      RMSE       MAE       MPE     MAPE
## Training set -0.0006658185 0.4513126 0.3523472 0.2331963 30.10305
##                   MASE      ACF1
## Training set 0.3281334 0.0903218
## 
## Forecasts:
##          Point Forecast       Lo 80    Hi 80       Lo 95    Hi 95
## Jan 2017       1.487670  0.90353490 2.071806  0.59431224 2.381029
## Feb 2017       1.484565  0.65847211 2.310657  0.22116518 2.747964
## Mar 2017       1.481459  0.46967238 2.493245 -0.06593495 3.028852
## Apr 2017       1.478353  0.30999383 2.646712 -0.30849808 3.265204
## May 2017       1.475247  0.16892298 2.781571 -0.52260320 3.473097
## Jun 2017       1.472141  0.04106795 2.903214 -0.71649647 3.660779
## Jul 2017       1.469035 -0.07677420 3.014845 -0.89507635 3.833147
## Aug 2017       1.465930 -0.18669028 3.118549 -1.06153435 3.993393
## Sep 2017       1.462824 -0.29012972 3.215777 -1.21808719 4.143735
## Oct 2017       1.459718 -0.38814782 3.307583 -1.36634881 4.285784
## Nov 2017       1.456612 -0.48154121 3.394765 -1.50753754 4.420762
## Dec 2017       1.453506 -0.57092874 3.477941 -1.64259984 4.549612
## Jan 2018       1.450400 -0.65680255 3.557603 -1.77228837 4.673089
## Feb 2018       1.447294 -0.73956165 3.634151 -1.89721335 4.791802
## Mar 2018       1.444189 -0.81953484 3.707912 -2.01787765 4.906255
## Apr 2018       1.441083 -0.89699691 3.779162 -2.13470154 5.016867
## May 2018       1.437977 -0.97218029 3.848134 -2.24804047 5.123994
## Jun 2018       1.434871 -1.04528366 3.915026 -2.35819829 5.227941
p6 = autoplot(fit6)+
  xlab("Year") + ylab("Inflation (%)")
fit7 <- holt(train, damped=TRUE, h=h) 
summary(fit7)
## 
## Forecast method: Damped Holt's method
## 
## Model Information:
## Damped Holt's method 
## 
## Call:
##  holt(y = train, h = h, damped = TRUE) 
## 
##   Smoothing parameters:
##     alpha = 0.9977 
##     beta  = 0.0479 
##     phi   = 0.8021 
## 
##   Initial states:
##     l = 2.2186 
##     b = 0.3023 
## 
##   sigma:  0.4591
## 
##      AIC     AICc      BIC 
## 774.2032 774.6296 794.1119 
## 
## Error measures:
##                        ME      RMSE       MAE       MPE     MAPE      MASE
## Training set -0.007903691 0.4534297 0.3549855 0.4294028 30.17254 0.3305904
##                    ACF1
## Training set 0.06536501
## 
## Forecasts:
##          Point Forecast       Lo 80    Hi 80      Lo 95    Hi 95
## Jan 2017       1.496473  0.90812480 2.084822  0.5966719 2.396275
## Feb 2017       1.501616  0.65442020 2.348812  0.2059417 2.797291
## Mar 2017       1.505742  0.45134900 2.560134 -0.1068128 3.118296
## Apr 2017       1.509051  0.27446549 2.743636 -0.3790844 3.397186
## May 2017       1.511705  0.11464912 2.908761 -0.6249075 3.648317
## Jun 2017       1.513834 -0.03266004 3.060328 -0.8513245 3.878992
## Jul 2017       1.515542 -0.17014108 3.201224 -1.0624876 4.093571
## Aug 2017       1.516911 -0.29954515 3.333368 -1.2611191 4.294942
## Sep 2017       1.518010 -0.42210754 3.458128 -1.4491437 4.485164
## Oct 2017       1.518891 -0.53875028 3.576533 -1.6279999 4.665783
## Nov 2017       1.519598 -0.65019143 3.689388 -1.7988087 4.838005
## Dec 2017       1.520165 -0.75700857 3.797339 -1.9624715 5.002802
## Jan 2018       1.520620 -0.85967792 3.900918 -2.1197315 5.160972
## Feb 2018       1.520985 -0.95859967 4.000569 -2.2712124 5.313182
## Mar 2018       1.521277 -1.05411520 4.096670 -2.4174457 5.460001
## Apr 2018       1.521512 -1.14651914 4.189543 -2.5588896 5.601914
## May 2018       1.521700 -1.23606821 4.279469 -2.6959428 5.739344
## Jun 2018       1.521851 -1.32298787 4.366691 -2.8289548 5.872658
p7 = autoplot(fit7)+
  xlab("Year") + ylab("Inflation (%)")
fit8 <- hw(train, seasonal="additive", h=h) 
summary(fit8)
## 
## Forecast method: Holt-Winters' additive method
## 
## Model Information:
## Holt-Winters' additive method 
## 
## Call:
##  hw(y = train, h = h, seasonal = "additive") 
## 
##   Smoothing parameters:
##     alpha = 0.9988 
##     beta  = 0.0026 
##     gamma = 1e-04 
## 
##   Initial states:
##     l = 2.4518 
##     b = -0.0343 
##     s = 0.0463 0.0797 0.1104 -0.0834 -0.0821 -0.0817
##            -0.1033 0.0415 -0.0018 0.0446 0.0042 0.0257
## 
##   sigma:  0.4778
## 
##      AIC     AICc      BIC 
## 800.9101 804.2004 857.3181 
## 
## Error measures:
##                      ME     RMSE       MAE       MPE     MAPE      MASE
## Training set 0.02287128 0.458691 0.3630727 0.3902289 29.51818 0.3381218
##                    ACF1
## Training set 0.08119146
## 
## Forecasts:
##          Point Forecast         Lo 80    Hi 80       Lo 95    Hi 95
## Jan 2017      1.4476956  0.8353557707 2.060035  0.51120265 2.384189
## Feb 2017      1.4041505  0.5375478243 2.270753  0.07879609 2.729505
## Mar 2017      1.4223827  0.3598352794 2.484930 -0.20264333 3.047409
## Apr 2017      1.3540404  0.1256326365 2.582448 -0.52464716 3.232728
## May 2017      1.3752222  0.0001071896 2.750337 -0.72783473 3.478279
## Jun 2017      1.2086932 -0.2995770445 2.716963 -1.09800708 3.515393
## Jul 2017      1.2078671 -0.4233300371 2.839064 -1.28683366 3.702568
## Aug 2017      1.1854197 -0.5606391541 2.931479 -1.48494690 3.855786
## Sep 2017      1.1621400 -0.6922161061 3.016496 -1.67385294 3.998133
## Oct 2017      1.3335320 -0.6236481205 3.290712 -1.65971672 4.326781
## Nov 2017      1.2812158 -0.7741380683 3.336570 -1.86217669 4.424608
## Dec 2017      1.2257821 -0.9237335965 3.375298 -2.06161849 4.513183
## Jan 2018      1.1830678 -1.0571240818 3.423260 -2.24301001 4.609146
## Feb 2018      1.1395226 -1.1882314288 3.467277 -2.42046999 4.699515
## Mar 2018      1.1577548 -1.2548053697 3.570315 -2.53193760 4.847447
## Apr 2018      1.0894126 -1.4054798021 3.584305 -2.72619604 4.905021
## May 2018      1.1105944 -1.4643944961 3.685583 -2.82751126 5.048700
## Jun 2018      0.9440654 -1.7089878062 3.597119 -3.11342932 5.001560
p8 = autoplot(fit8)+
  xlab("Year") + ylab("Inflation (%)")
y.ets <- ets(train, model="ZZZ") 
summary(y.ets)
## ETS(A,N,N) 
## 
## Call:
##  ets(y = train, model = "ZZZ") 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
## 
##   Initial states:
##     l = 2.1629 
## 
##   sigma:  0.4535
## 
##      AIC     AICc      BIC 
## 766.2419 766.3619 776.1962 
## 
## Training set error measures:
##                        ME      RMSE       MAE        MPE     MAPE
## Training set -0.003295242 0.4512553 0.3518564 0.04064082 30.10275
##                   MASE       ACF1
## Training set 0.3276763 0.09156739
fit9 <- forecast(y.ets, h=h)
summary(fit9)
## 
## Forecast method: ETS(A,N,N)
## 
## Model Information:
## ETS(A,N,N) 
## 
## Call:
##  ets(y = train, model = "ZZZ") 
## 
##   Smoothing parameters:
##     alpha = 0.9999 
## 
##   Initial states:
##     l = 2.1629 
## 
##   sigma:  0.4535
## 
##      AIC     AICc      BIC 
## 766.2419 766.3619 776.1962 
## 
## Error measures:
##                        ME      RMSE       MAE        MPE     MAPE
## Training set -0.003295242 0.4512553 0.3518564 0.04064082 30.10275
##                   MASE       ACF1
## Training set 0.3276763 0.09156739
## 
## Forecasts:
##          Point Forecast       Lo 80    Hi 80       Lo 95    Hi 95
## Jan 2017       1.490776  0.90961373 2.071939  0.60196474 2.379588
## Feb 2017       1.490776  0.66892939 2.312624  0.23386982 2.747683
## Mar 2017       1.490776  0.48424028 2.497313 -0.04858781 3.030141
## Apr 2017       1.490776  0.32853829 2.653015 -0.28671345 3.268266
## May 2017       1.490776  0.19136119 2.790192 -0.49650773 3.478061
## Jun 2017       1.490776  0.06734310 2.914210 -0.68617703 3.667730
## Jul 2017       1.490776 -0.04670363 3.028257 -0.86059644 3.842149
## Aug 2017       1.490776 -0.15285595 3.134409 -1.02294242 4.004495
## Sep 2017       1.490776 -0.25255655 3.234110 -1.17542133 4.156974
## Oct 2017       1.490776 -0.34685584 3.328409 -1.31963965 4.301193
## Nov 2017       1.490776 -0.43654681 3.418100 -1.45681014 4.438363
## Dec 2017       1.490776 -0.52224551 3.503799 -1.58787500 4.569428
## Jan 2018       1.490776 -0.60444189 3.585995 -1.71358351 4.695137
## Feb 2018       1.490776 -0.68353319 3.665086 -1.83454321 4.816096
## Mar 2018       1.490776 -0.75984677 3.741400 -1.95125476 4.932808
## Apr 2018       1.490776 -0.83365625 3.815209 -2.06413662 5.045690
## May 2018       1.490776 -0.90519305 3.886746 -2.17354272 5.155096
## Jun 2018       1.490776 -0.97465502 3.956208 -2.27977564 5.261329
p9 = autoplot(fit9)+
  xlab("Year") + ylab("Inflation (%)")
y.mstl <- mstl(train)
summary(y.mstl)
##       Data             Trend          Seasonal12        
##  Min.   :-0.9545   Min.   :0.3399   Min.   :-0.0570702  
##  1st Qu.: 1.2564   1st Qu.:1.3861   1st Qu.:-0.0178578  
##  Median : 1.9659   Median :1.9217   Median :-0.0021222  
##  Mean   : 1.9058   Mean   :1.9033   Mean   : 0.0002786  
##  3rd Qu.: 2.4184   3rd Qu.:2.3325   3rd Qu.: 0.0167604  
##  Max.   : 4.5779   Max.   :3.2338   Max.   : 0.0539490  
##    Remainder        
##  Min.   :-1.305284  
##  1st Qu.:-0.264756  
##  Median :-0.008230  
##  Mean   : 0.002228  
##  3rd Qu.: 0.281874  
##  Max.   : 1.388147
fit10 <- forecast(y.mstl, method="rwdrift",h=h)
summary(fit10)
## 
## Forecast method: STL +  Random walk with drift
## 
## Model Information:
## Call: rwf(y = x, h = h, drift = TRUE, level = level) 
## 
## Drift: -0.0034  (se 0.0318)
## Residual sd: 0.4524 
## 
## Error measures:
##                         ME      RMSE       MAE        MPE     MAPE
## Training set -9.050832e-17 0.4513318 0.3536598 -0.0242346 29.95944
##                   MASE       ACF1
## Training set 0.3293557 0.09257149
## 
## Forecasts:
##          Point Forecast       Lo 80    Hi 80       Lo 95    Hi 95
## Jan 2017       1.490362  0.90910042 2.071623  0.60139930 2.379324
## Feb 2017       1.465884  0.63983677 2.291931  0.20255376 2.729214
## Mar 2017       1.491126  0.47452981 2.507723 -0.06362384 3.045876
## Apr 2017       1.485740  0.30624565 2.665235 -0.31814104 3.289621
## May 2017       1.469575  0.14459536 2.794555 -0.55680652 3.495957
## Jun 2017       1.496772  0.03849935 2.955045 -0.73346369 3.727008
## Jul 2017       1.461005 -0.12145327 3.043463 -0.95915593 3.881165
## Aug 2017       1.440234 -0.25929867 3.139767 -1.15897695 4.039445
## Sep 2017       1.438212 -0.37266448 3.249089 -1.33128471 4.207709
## Oct 2017       1.496874 -0.42061426 3.414363 -1.43567138 4.429420
## Nov 2017       1.435579 -0.58453899 3.455696 -1.65392462 4.525082
## Dec 2017       1.449708 -0.66963448 3.569050 -1.79154659 4.690963
## Jan 2018       1.449261 -0.76635926 3.664882 -1.93923795 4.837761
## Feb 2018       1.424784 -0.88453690 3.734104 -2.10701737 4.956585
## Mar 2018       1.450026 -0.95071844 3.850770 -2.22159572 5.121648
## Apr 2018       1.444640 -1.04550273 3.934782 -2.36370457 5.252984
## May 2018       1.428475 -1.14925130 4.006201 -2.51381702 5.370766
## Jun 2018       1.455672 -1.20800181 4.119346 -2.61806549 5.529409
p10 = autoplot(fit10)+
  xlab("Year") + ylab("Inflation (%)")
gridExtra::grid.arrange(p5, p6, p7, p8, p9, p10, nrow=3)

You will notice that the multiplicative Holt-Winters forecast is missing; it could not be estimated because the series contains negative values, a consequence of the 2008-09 financial crisis when Canada experienced negative inflation. Looking at the summaries of the forecasts on the test data, they are largely similar. In all of them the smoothing parameter \(\alpha\) is essentially 1, and in the Holt methods \(\beta\) is essentially 0. This makes sense: the data points stay close to the mean, so the last recorded observation is likely the best prediction, which makes these forecasts almost identical to a simple naive model. Visually, it is hard to tell the smoothing forecasts apart, aside from the Holt-Winters additive method and the STL with random walk drift, and they also look very close to the linear regression forecast. Finally, the automatic ETS selection chooses a model identical to the simple exponential smoothing forecast, which is essentially the naive forecast.
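To see why \(\alpha \approx 1\) collapses these methods to a naive forecast, recall that the simple exponential smoothing forecast is a weighted average of past observations,

\[
\hat{y}_{T+h|T} = \alpha y_T + \alpha(1-\alpha)y_{T-1} + \alpha(1-\alpha)^2 y_{T-2} + \cdots,
\]

so as \(\alpha \to 1\) all of the weight falls on the last observation and the forecast reduces to \(y_T\).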

Chapter 8 Forecast Method

fit11 = forecast(auto.arima(train, approximation=FALSE, stepwise= FALSE), h= h)
summary(fit11)
## 
## Forecast method: ARIMA(0,1,0)(2,0,2)[12]
## 
## Model Information:
## Series: train 
## ARIMA(0,1,0)(2,0,2)[12] 
## 
## Coefficients:
##         sar1     sar2     sma1    sma2
##       0.5377  -0.0313  -1.4974  0.5617
## s.e.  0.5960   0.1005   0.5856  0.5308
## 
## sigma^2 estimated as 0.1054:  log likelihood=-67.6
## AIC=145.21   AICc=145.51   BIC=161.77
## 
## Error measures:
##                     ME      RMSE       MAE      MPE     MAPE      MASE
## Training set -0.026629 0.3207243 0.2506993 -2.51351 18.19693 0.2334708
##                    ACF1
## Training set 0.08491602
## 
## Forecasts:
##          Point Forecast     Lo 80    Hi 80      Lo 95    Hi 95
## Jan 2017       1.397248 0.9810056 1.813489  0.7606605 2.033835
## Feb 2017       1.704867 1.1162364 2.293498  0.8046340 2.605101
## Mar 2017       1.552434 0.8315209 2.273347  0.4498926 2.654975
## Apr 2017       1.494002 0.6615690 2.326435  0.2209056 2.767098
## May 2017       1.529943 0.5992582 2.460627  0.1065835 2.953302
## Jun 2017       1.295580 0.2760694 2.315092 -0.2636272 2.854788
## Jul 2017       1.524808 0.4236126 2.626004 -0.1593252 3.208942
## Aug 2017       1.725298 0.5480715 2.902524 -0.0751143 3.525709
## Sep 2017       1.772526 0.5238904 3.021161 -0.1370972 3.682149
## Oct 2017       1.487160 0.1709837 2.803336 -0.5257579 3.500077
## Nov 2017       1.857576 0.4771595 3.237992 -0.2535886 3.968740
## Dec 2017       1.705435 0.2636380 3.147231 -0.4996031 3.910472
## Jan 2018       1.701436 0.2595507 3.143322 -0.5037375 3.906611
## Feb 2018       1.726238 0.2842600 3.168216 -0.4790771 3.931553
## Mar 2018       1.737459 0.2953891 3.179530 -0.4679968 3.942916
## Apr 2018       1.720729 0.2785668 3.162892 -0.4848680 3.926327
## May 2018       1.719983 0.2777285 3.162238 -0.4857551 3.925722
## Jun 2018       1.727854 0.2855072 3.170201 -0.4780253 3.933734
p11 = autoplot(fit11)
autoplot(train) + 
  autolayer(fit11) +
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate ARIMA Forecast")

For this section, my expectation was that the ARIMA model would provide a more interesting forecast, and I was satisfied: it appears to follow the test data much better, initially rising, then dropping quickly, and then climbing back towards the 2% mark. More specifically, the selected model is an ARIMA\((0,1,0)(2,0,2)_{12}\). This is in line with how inflation would be expected to behave: the policy rate was increased in July 2017, and with prices adjusting with a lag, a rise in measured inflation over the near term follows. The forecast therefore fits the economic theory and provides a detailed story to go along with the data.
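In backshift notation, and using R's sign convention for the MA terms, the selected model can be written as

\[
(1 - \Phi_1 B^{12} - \Phi_2 B^{24})(1 - B)\,y_t = (1 + \Theta_1 B^{12} + \Theta_2 B^{24})\,\varepsilon_t,
\]

with \(\Phi_1 = 0.54\), \(\Phi_2 = -0.03\), \(\Theta_1 = -1.50\), and \(\Theta_2 = 0.56\) as reported in the coefficient table above.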

Chapter 11 Forecasting Methods

fit12 <- forecast(tbats(train, biasadj=TRUE), h=h)
summary(fit12)
## 
## Forecast method: BATS(1, {1,0}, 0.824, -)
## 
## Model Information:
## BATS(1, {1,0}, 0.824, -)
## 
## Call: tbats(y = train, biasadj = TRUE)
## 
## Parameters
##   Alpha: 2.141912
##   Beta: -0.3362489
##   Damping Parameter: 0.823637
##   AR coefficients: -0.843975
## 
## Seed States:
##           [,1]
## [1,] 2.7109138
## [2,] 0.3163395
## [3,] 0.0000000
## 
## Sigma: 0.4334248
## AIC: 757.7934
## 
## Error measures:
##                      ME      RMSE       MAE       MPE     MAPE      MASE
## Training set -0.0434583 0.4334248 0.3340335 -3.686932 28.39169 0.3110782
##                    ACF1
## Training set 0.07799686
## 
## Forecasts:
##          Point Forecast      Lo 80    Hi 80      Lo 95    Hi 95
## Jan 2017       1.557948 1.00249222 2.113405  0.7084515 2.407445
## Feb 2017       1.472966 0.67914478 2.266787  0.2589213 2.687010
## Mar 2017       1.521365 0.61828955 2.424441  0.1402302 2.902500
## Apr 2017       1.461307 0.45610616 2.466508 -0.0760150 2.998629
## May 2017       1.496172 0.43298536 2.559359 -0.1298318 3.122176
## Jun 2017       1.453715 0.33256658 2.574863 -0.2609335 3.168363
## Jul 2017       1.478814 0.32089336 2.636735 -0.2920730 3.249702
## Aug 2017       1.448791 0.25327584 2.644305 -0.3795914 3.277173
## Sep 2017       1.466848 0.24523176 2.688465 -0.4014531 3.335150
## Oct 2017       1.445611 0.19711949 2.694102 -0.4637919 3.355014
## Nov 2017       1.458595 0.18992808 2.727263 -0.4816638 3.398854
## Dec 2017       1.443568 0.15412984 2.733007 -0.5284577 3.415594
## Jan 2018       1.452900 0.14679938 2.759000 -0.5446085 3.450408
## Feb 2018       1.442264 0.11907047 2.765458 -0.5813861 3.465915
## Mar 2018       1.448967 0.11130046 2.786634 -0.5968176 3.494752
## Apr 2018       1.441438 0.08899536 2.793881 -0.6269445 3.509821
## May 2018       1.446251 0.08075514 2.811746 -0.6420945 3.534596
## Jun 2018       1.440919 0.06216436 2.819673 -0.6677042 3.549542
p12 = autoplot(fit12)  + xlab("Year") + ylab("Inflation (%)")
fit13<- forecast(nnetar(train), h=h)
summary(fit13)
## 
## Forecast method: NNAR(13,1,7)[12]
## 
## Model Information:
## 
## Average of 20 networks, each of which is
## a 13-7-1 network with 106 weights
## options were - linear output units 
## 
## Error measures:
##                         ME       RMSE        MAE        MPE     MAPE
## Training set -0.0005663962 0.09186461 0.06811977 -0.8882035 5.680796
##                    MASE       ACF1
## Training set 0.06343848 0.04349449
## 
## Forecasts:
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 2017 1.1585643 1.4609477 1.7125559 1.1730815 1.2388545 1.2834648 1.2028379
## 2018 1.4055597 1.2757277 0.8849764 1.1628257 1.1714780 1.0307786          
##            Aug       Sep       Oct       Nov       Dec
## 2017 1.5744337 1.2792276 1.0311942 1.3896258 1.1018710
## 2018
p13= autoplot(fit13)  + xlab("Year") + ylab("Inflation (%)")
gridExtra::grid.arrange(p12, p13, nrow=2)

This final section covers the TBATS and neural network autoregression (NNAR) forecasting methods. The TBATS forecast looks similar to a naive forecast, which is consistent with the findings above, where most of the exponential smoothing forecasts closely resemble a naive forecast. The neural network forecast, however, is somewhat similar to the ARIMA forecast in that it captures changes in the fit. Interestingly, its forecast over the test period closely resembles the 13 months before it, which gives some insight into how this method generates forecasts. A more quantitative accuracy comparison follows; for now, the ARIMA forecast appears to have the tightest prediction intervals.
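One caveat when comparing interval widths: nnetar() forecasts are produced without prediction intervals by default. A sketch of how simulated intervals could be added, assuming a version of the forecast package that supports the PI argument:

fit13.pi <- forecast(nnetar(train), h = h, PI = TRUE)  # simulate future sample paths to build intervals
autoplot(fit13.pi) + xlab("Year") + ylab("Inflation (%)")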

Accuracy Test

a1 = accuracy(fit1, test)
a2 = accuracy(fit2, test)
a3 = accuracy(fit3, test)
a4 = accuracy(fit4, test)
a5 = accuracy(fit5, test)
a6 = accuracy(fit6, test)
a7 = accuracy(fit7, test)
a8 = accuracy(fit8, test)
a9 = accuracy(fit9, test)
a10 = accuracy(fit10, test)
a11 = accuracy(fit11, test)
a12 = accuracy(fit12, test)
a13 = accuracy(fit13, test)
a.table<-rbind(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13)
row.names(a.table)<-c('Mean training','Mean test', 'Naive training', 'Naive test', 'S. Naive training', 'S. Naive test', 'linear trend training','linear trend test','ES training','ES test', 'Holt linear training', 'Holt linear test', 'Holt damped training','Holt damped test', 'Holt Winters training', 'Holt Winters test', 'ETS training', 'ETS test', 'STL training','STL test', 'ARIMA training', 'ARIMA test', 'TBATS training','TBATS test', 'ANN training', 'ANN test')
a.table<-as.data.frame(a.table)
a.table<-a.table[order(a.table$MASE),]
kable(a.table, caption="Forecast Accuracy",digits = 4)
Forecast Accuracy
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
ANN training -0.0006 0.0919 0.0681 -0.8882 5.6808 0.0634 0.0435 NA
ARIMA training -0.0266 0.3207 0.2507 -2.5135 18.1969 0.2335 0.0849 NA
ARIMA test 0.1426 0.3763 0.3168 3.6562 17.5997 0.2951 0.6677 1.0805
TBATS training -0.0435 0.4334 0.3340 -3.6869 28.3917 0.3111 0.0780 NA
ES training -0.0033 0.4513 0.3519 0.0407 30.1027 0.3277 0.0916 NA
ETS training -0.0033 0.4513 0.3519 0.0406 30.1028 0.3277 0.0916 NA
Holt linear training -0.0007 0.4513 0.3523 0.2332 30.1031 0.3281 0.0903 NA
Naive training -0.0033 0.4524 0.3536 0.0418 30.2503 0.3293 0.0915 NA
STL training 0.0000 0.4513 0.3537 -0.0242 29.9594 0.3294 0.0926 NA
Holt damped training -0.0079 0.4534 0.3550 0.4294 30.1725 0.3306 0.0654 NA
Holt Winters training 0.0229 0.4587 0.3631 0.3902 29.5182 0.3381 0.0812 NA
Mean test -0.1310 0.4334 0.3759 -14.2246 25.2291 0.3501 0.6858 1.7314
Holt damped test 0.2595 0.4868 0.4077 9.1955 22.1158 0.3796 0.6825 1.4132
Naive test 0.2840 0.5013 0.4185 10.6489 22.4651 0.3897 0.6858 1.4383
ES test 0.2840 0.5013 0.4185 10.6508 22.4655 0.3898 0.6858 1.4383
ETS test 0.2840 0.5013 0.4185 10.6508 22.4655 0.3898 0.6858 1.4383
TBATS test 0.3094 0.5205 0.4293 12.1213 22.7575 0.3998 0.6914 1.4907
Holt linear test 0.3136 0.5257 0.4356 12.3000 23.1197 0.4057 0.6957 1.4929
STL test 0.3152 0.5288 0.4397 12.3359 23.3961 0.4095 0.6782 1.5106
S. Naive test 0.3165 0.5468 0.4475 13.2064 23.8969 0.4168 0.3257 1.6642
Holt Winters test 0.5397 0.7184 0.5754 25.4858 28.8095 0.5359 0.6865 1.9695
ANN test 0.5227 0.7318 0.5976 24.2143 30.4134 0.5565 0.6009 1.9386
linear trend training 0.0000 0.8171 0.5982 -24.9776 63.4347 0.5570 0.8468 NA
linear trend test 0.5998 0.7409 0.6294 29.2468 32.1284 0.5861 0.7067 2.0378
Mean training 0.0000 0.9049 0.7104 -30.0876 72.0150 0.6616 0.8749 NA
S. Naive training -0.0789 1.4220 1.0738 -38.4952 108.1389 1.0000 0.8416 NA

Ordering the forecast accuracy of each model by MASE, the ARIMA model is the most accurate on the test set and is therefore the preferred model. On the training set it is second only to the neural network, whose much weaker test performance points to overfitting, so overall the ARIMA model offers the best fit.
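For reference, the MASE used for this ranking scales each forecast error by the in-sample mean absolute error of the one-step seasonal naive forecast, which is why the seasonal naive training MASE equals exactly 1:

\[
\text{MASE} = \frac{\operatorname{mean}\left(|e_j|\right)}{\frac{1}{T-m}\sum_{t=m+1}^{T}\left|y_t - y_{t-m}\right|}, \qquad m = 12.
\]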

Cross Validation

f4 <- function(y, h){forecast(tslm(y ~ trend + season), h=h)}
f9 <- function(y, h){forecast(ets(y, model="ZZZ"), h=h)}
f10 <- function(y, h){forecast(mstl(y), method="rwdrift", h=h)}
f11 <- function(y,h){forecast(auto.arima(y), h=h)}
f12 <- function(y,h){forecast(tbats(y), h=h)}
f13 <- function(y,h){forecast(nnetar(y), h=h)}
e1 = tsCV(y, meanf, h=h, window = 60)
e2 = tsCV(y, naive, h=h, window = 60)
e3 = tsCV(y, snaive, h=h, window = 60)
e4 <- tsCV(y, f4, h=h, window = 60) 
e5 <- tsCV(y, ses, h=h, window = 60)
e6 <- tsCV(y, holt, h=h, window=60)
e7 <- tsCV(y, holt, damped = TRUE, h=h, window=60)
e8 <- tsCV(y, hw, seasonal="additive", h=h, window=60)
e9 <- tsCV(y, f9, h=h, window=60)
e10 <- tsCV(y, f10, h=h, window=60)
e11 = tsCV(y, f11, h=h, window = 60)
e12 = tsCV(y, f12, h=h, window = 60)
e13 = tsCV(y, f13, h=h, window = 60)
mse1 <- colMeans(e1^2, na.rm = T)
mse2 <- colMeans(e2^2, na.rm = T)
mse3 <- colMeans(e3^2, na.rm = T)
mse4 <- colMeans(e4^2, na.rm = T)
mse5 <- colMeans(e5^2, na.rm = T)
mse6 <- colMeans(e6^2, na.rm = T)
mse7 <- colMeans(e7^2, na.rm = T)
mse8 <- colMeans(e8^2, na.rm = T)
mse9 <- colMeans(e9^2, na.rm = T)
mse10 <- colMeans(e10^2, na.rm = T)
mse11 <- colMeans(e11^2, na.rm = T)
mse12 <- colMeans(e12^2, na.rm = T)
mse13 <- colMeans(e13^2, na.rm = T)
mse.table <- rbind(mse1, mse2, mse3, mse4, mse5, mse6, mse7, mse8, mse9, mse10, mse11, mse12, mse13)
row.names(mse.table) <- c('Mean', 'Naive', 'S. Naive', 'linear trend','ES', 'Holt linear', 'Holt damped', 'Holt Winters', 'ETS', 'STL', 'ARIMA', 'TBATS', 'ANN')
mse.table <- as.data.frame(mse.table)
kable(mse.table, caption="Optimal Forecasting Length",digits = 4)
Optimal Forecasting Length
h=1 h=2 h=3 h=4 h=5 h=6 h=7 h=8 h=9 h=10 h=11 h=12 h=13 h=14 h=15 h=16 h=17 h=18
Mean 0.6508 0.6740 0.6949 0.7130 0.7278 0.7346 0.7380 0.7409 0.7432 0.7412 0.7430 0.7429 0.7422 0.7431 0.7459 0.7499 0.7559 0.7619
Naive 0.1655 0.3455 0.5019 0.6583 0.7937 0.9130 1.0121 1.1102 1.1943 1.2971 1.4320 1.5808 1.5749 1.5667 1.5715 1.5674 1.6147 1.6968
S. Naive 1.5307 1.5376 1.5345 1.5298 1.5361 1.5418 1.5475 1.5568 1.5636 1.5617 1.5716 1.5808 1.7899 1.7878 1.7852 1.7846 1.7928 1.8043
linear trend 0.8704 0.9362 0.9925 1.0418 1.0853 1.1295 1.1793 1.2287 1.2724 1.3026 1.3335 1.3578 1.5099 1.5080 1.5031 1.4900 1.4680 1.4322
ES 0.1668 0.3452 0.5013 0.6587 0.7947 0.9126 1.0104 1.1073 1.1924 1.2961 1.4289 1.5751 1.5707 1.5645 1.5698 1.5646 1.6117 1.6945
Holt linear 0.1700 0.3596 0.5401 0.7279 0.9049 1.0790 1.2455 1.4204 1.5925 1.7400 1.9478 2.1972 2.2589 2.3311 2.3852 2.4563 2.5737 2.7472
Holt damped 0.1694 0.3491 0.5039 0.6585 0.7971 0.9209 1.0262 1.1279 1.2237 1.3319 1.4760 1.6246 1.6142 1.6082 1.6088 1.6067 1.6626 1.7679
Holt Winters 0.2510 0.4608 0.6492 0.8618 1.0467 1.2051 1.3815 1.5693 1.7716 1.9657 2.2254 2.4483 2.5770 2.7609 2.9130 3.0725 3.2749 3.5573
ETS 0.1669 0.3454 0.5013 0.6591 0.7953 0.9133 1.0109 1.1080 1.1931 1.2967 1.4294 1.5756 1.5714 1.5651 1.5703 1.5649 1.6119 1.6951
STL 0.2395 0.4652 0.6615 0.8540 1.0180 1.1677 1.3201 1.4566 1.5680 1.6892 1.8373 1.9999 2.0529 2.1067 2.1686 2.2157 2.3202 2.4634
ARIMA 0.1300 0.2944 0.4372 0.5507 0.6269 0.6913 0.7542 0.7953 0.8430 0.8880 0.9378 1.0042 1.0457 1.0617 1.0550 1.0529 1.0656 1.0805
TBATS 0.1683 0.3488 0.5035 0.6575 0.8065 0.9302 1.0103 1.0967 1.1564 1.2197 1.3111 1.4763 1.4987 1.5221 1.5369 1.5269 1.5412 1.5871
ANN 0.2151 0.4637 0.6315 0.8627 0.9573 1.0321 1.1191 1.1543 1.2721 1.2014 1.0775 1.1243 1.1074 1.1937 1.2311 1.2069 1.2791 1.2019

The above code carries out a time series cross-validation, which allows the forecasting properties of each method to be examined more closely. This matters because, although the ARIMA model performed very well on the chosen test set, over a longer forecast horizon a simple method such as the naive or the mean forecast will dominate. That by itself is not a very interesting result, since accurately forecasting fluctuations in inflation is what matters for monetary policy and the economy as a whole. The goal of the cross-validation is therefore to identify the most accurate model, and the horizon over which it remains most accurate, within the scope of this paper.
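The errors returned by tsCV() are arranged by forecast origin and horizon, so the column means computed above give the cross-validated mean squared error at each horizon:

\[
\text{MSE}(h) = \frac{1}{N_h}\sum_{t} e_{t,h}^{2},
\]

where \(e_{t,h}\) is the error of the \(h\)-step forecast made from origin \(t\) and \(N_h\) is the number of available errors at that horizon.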

The table above lists the mean squared error of each model at each forecast horizon. It is quite messy and hard to read, so a graphical approach better suits the analysis.

m1= data.frame(h = 1:18, MSE = mse1) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE Mean")
m2= data.frame(h = 1:18, MSE = mse2) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point()+ ggtitle("MSE Naive")
m3= data.frame(h = 1:18, MSE = mse3) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE S. Naive")
m4 = data.frame(h = 1:18, MSE = mse4) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE linear trend")
m5= data.frame(h = 1:18, MSE = mse5) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE ES")
m6= data.frame(h = 1:18, MSE = mse6) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point()+ ggtitle("MSE Holt linear")
m7= data.frame(h = 1:18, MSE = mse7) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE Holt damped")
m8 = data.frame(h = 1:18, MSE = mse8) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE Holt Winters") 
m9= data.frame(h = 1:18, MSE = mse9) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE ETS")
m10= data.frame(h = 1:18, MSE = mse10) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point()+ ggtitle("MSE STL")
m11= data.frame(h = 1:18, MSE = mse11) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE ARIMA") 
m12= data.frame(h = 1:18, MSE = mse12) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE TBATS")
m13= data.frame(h = 1:18, MSE = mse13) %>%
  ggplot(aes(x = h, y = MSE)) + geom_point() + ggtitle("MSE ANN")
gridExtra::grid.arrange(m1, m2, m3, m4, nrow=2)

gridExtra::grid.arrange(m5 ,m6, m7, m8, nrow=2)

gridExtra::grid.arrange(m9, m10, m11, m12, m13, nrow=3)

The charts above show the MSE of each model, with the x-axis giving the forecast horizon, and they support the conclusions drawn from the accuracy measures. The ARIMA model has the lowest range of MSE values and a concave shape, meaning its accuracy deteriorates at a decreasing rate as the horizon lengthens. The mean forecast is the second most accurate and has a similarly concave shape, although it rises roughly linearly after 15 months. The naive forecast, which serves as something of a base case for many of the methods used here, follows closely behind these two. Finally, the plot below concludes the analysis of the optimal forecast horizon by overlaying the cross-validated MSE of the mean, naive, ARIMA, and TBATS forecasts at each monthly horizon.

require(reshape2)
table <- data.frame(h= 1:18, Mean = mse1, Naive = mse2, TBATS = mse12, ARIMA = mse11)
table <- melt(table, id.vars = 'h', variable.name = 'series')
ggplot(table, aes(h, value)) + geom_line(aes(colour=series)) + ylab('MSE') + 
  xlab('Forecast Length') +  guides(colour=guide_legend(title="")) +
  ggtitle("Optimal Forecast Length based on Minimizing MSE")

This last plot confirms the earlier suspicions. The mean forecast's MSE is essentially flat over the horizon, which is not surprising: the Bank of Canada targets 2% inflation, so inflation is expected to return to 2% in the long run, and over longer horizons the simple forecast methods become more accurate than the ARIMA model, as noted above. The mean and ARIMA lines intersect between 6 and 7 months, which marks the limit of the ARIMA model as the optimal short-run forecasting method for the Canadian inflation rate. I therefore choose 6 months as the optimal forecast length, since at longer horizons the forecast is effectively no better than the mean forecast. The next section presents a vector autoregression, a commonly used method for forecasting inflation, as a model that should compete with the ARIMA forecast.
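As a quick numerical check of the crossover point, the first horizon at which the mean forecast's cross-validated MSE falls below the ARIMA model's can be read directly from the vectors computed above; given the table, this should return 7:

min(which(mse1 < mse11))  # first horizon where the mean forecast beats the ARIMA model on MSE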

Vector Autoregression

Stage.II.Data <- read.csv("~/Desktop/York Econ/ECON6210/Stage II Data.var1.csv")
df3= ts(Stage.II.Data, start= 1999, frequency = 12)
l <- log(df3[,3:6])
df3 <- cbind(df3,l)
vardata1 = df3[,c(9,11,10,2)]

The first variable chosen for the model is the industrial product price index (IPPI) of Canada (Table: 18-10-0029-01). Its utility comes from being a price index of what producers receive for their final products, thereby capturing all production inputs; it therefore provides a useful control for the real economy at the producer level. It also helps capture contract negotiations, which should matter for the model's dynamics with respect to unemployment, and it controls for selling-price movements in the manufacturing industry and for commodity price changes, both of which are important to the Canadian economy. The second ordered variable is the gross domestic product (GDP) of Canada at basic prices (Table: 36-10-0434-01). Like the IPPI, it measures Canada's economic performance, but GDP is a benchmark economic variable and improves the forecasting ability of the model. The third variable is the variable of interest; however, it enters as the CPI level rather than its rate of change. Building an adequate model with the rate of change of CPI proved difficult: it was a challenge to obtain randomly distributed errors and, in the extreme case, roots less than 1 indicating a stable system. I therefore use the original seasonally unadjusted CPI for Canada. These three variables are then log-transformed, which reduces their variability, rescales the data, makes the impulse response functions much neater, and helps with stationarity. The last variable is the Canadian unemployment rate for individuals aged 15 and over in the labour market (Table: 14-10-0017-01), also seasonally unadjusted. The variables are ordered in this way deliberately, which will be justified through the forecast error variance decomposition later in this section. Finally, the data are monthly and start in January 1999.

The goal of this model is to provide a basic test of the theory developed in the aggregate supply and demand framework of monetary policy, in which inflation is plotted on the y-axis and real output on the x-axis. The dynamics are as follows: starting from equilibrium, a temporary increase in aggregate demand creates a wealth effect and reduces unemployment as more people trade leisure for work. However, inflationary pressure on prices and wages builds because of the positive output gap (economic expansion). Prices increase, and so does the cost of labour inputs; as input costs rise, unemployment must rise again, causing an economic contraction that takes the economy back to the initial equilibrium. The empirical model used here is basic; however, it will be interesting to see whether it generates theoretically consistent results and how its forecasts perform.

VARselect(vardata1, lag.max = 24, type = "both", season= 12)
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      4      2      1      4 
## 
## $criteria
##                    1             2             3             4
## AIC(n) -4.543962e+01 -4.559743e+01 -4.556580e+01 -4.562223e+01
## HQ(n)  -4.500147e+01 -4.505618e+01 -4.492146e+01 -4.487480e+01
## SC(n)  -4.435580e+01 -4.425858e+01 -4.397194e+01 -4.377336e+01
## FPE(n)  1.846884e-20  1.579261e-20  1.633029e-20  1.547393e-20
##                    5             6             7             8
## AIC(n) -4.556361e+01 -4.553240e+01 -4.549406e+01 -4.541505e+01
## HQ(n)  -4.471309e+01 -4.457877e+01 -4.443735e+01 -4.425524e+01
## SC(n)  -4.345972e+01 -4.317348e+01 -4.288013e+01 -4.254610e+01
## FPE(n)  1.646403e-20  1.706031e-20  1.782380e-20  1.941844e-20
##                    9            10            11            12
## AIC(n) -4.535596e+01 -4.526777e+01 -4.524320e+01 -4.518060e+01
## HQ(n)  -4.409306e+01 -4.390177e+01 -4.377411e+01 -4.360842e+01
## SC(n)  -4.223199e+01 -4.188879e+01 -4.160920e+01 -4.129158e+01
## FPE(n)  2.076680e-20  2.289921e-20  2.373380e-20  2.559926e-20
##                   13            14            15            16
## AIC(n) -4.521846e+01 -4.515706e+01 -4.512135e+01 -4.511326e+01
## HQ(n)  -4.354318e+01 -4.337868e+01 -4.323988e+01 -4.312870e+01
## SC(n)  -4.107442e+01 -4.075800e+01 -4.046728e+01 -4.020417e+01
## FPE(n)  2.502273e-20  2.707101e-20  2.861254e-20  2.949505e-20
##                   17            18            19            20
## AIC(n) -4.510677e+01 -4.506350e+01 -4.502361e+01 -4.495240e+01
## HQ(n)  -4.301911e+01 -4.287275e+01 -4.272976e+01 -4.255546e+01
## SC(n)  -3.994266e+01 -3.964438e+01 -3.934946e+01 -3.902324e+01
## FPE(n)  3.044284e-20  3.269866e-20  3.512158e-20  3.906656e-20
##                   21            22            23            24
## AIC(n) -4.500451e+01 -4.497957e+01 -4.501860e+01 -4.499084e+01
## HQ(n)  -4.250448e+01 -4.237644e+01 -4.231238e+01 -4.218153e+01
## SC(n)  -3.882033e+01 -3.854037e+01 -3.832439e+01 -3.804161e+01
## FPE(n)  3.856622e-20  4.130084e-20  4.168477e-20  4.521116e-20
VARselect(vardata1, lag.max = 24, type = "const", season = 12)
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      4      2      1      4 
## 
## $criteria
##                    1             2             3             4
## AIC(n) -4.542345e+01 -4.558291e+01 -4.556109e+01 -4.560983e+01
## HQ(n)  -4.501107e+01 -4.506744e+01 -4.494252e+01 -4.488817e+01
## SC(n)  -4.440337e+01 -4.430782e+01 -4.403098e+01 -4.382471e+01
## FPE(n)  1.876560e-20  1.601773e-20  1.639886e-20  1.565588e-20
##                    5             6             7             8
## AIC(n) -4.555309e+01 -4.552523e+01 -4.548974e+01 -4.541180e+01
## HQ(n)  -4.472834e+01 -4.459739e+01 -4.445880e+01 -4.427776e+01
## SC(n)  -4.351295e+01 -4.323007e+01 -4.293956e+01 -4.260660e+01
## FPE(n)  1.662263e-20  1.716255e-20  1.787468e-20  1.944686e-20
##                    9            10            11            12
## AIC(n) -4.535012e+01 -4.525913e+01 -4.524510e+01 -4.518361e+01
## HQ(n)  -4.411299e+01 -4.391890e+01 -4.380178e+01 -4.363720e+01
## SC(n)  -4.228991e+01 -4.194390e+01 -4.167485e+01 -4.135835e+01
## FPE(n)  2.084361e-20  2.303934e-20  2.361845e-20  2.543449e-20
##                   13            14            15            16
## AIC(n) -4.522652e+01 -4.516023e+01 -4.512735e+01 -4.511271e+01
## HQ(n)  -4.357701e+01 -4.340763e+01 -4.327165e+01 -4.315392e+01
## SC(n)  -4.114624e+01 -4.082493e+01 -4.053703e+01 -4.026737e+01
## FPE(n)  2.472356e-20  2.686321e-20  2.829527e-20  2.934000e-20
##                   17            18            19            20
## AIC(n) -4.511036e+01 -4.506101e+01 -4.502138e+01 -4.494909e+01
## HQ(n)  -4.304848e+01 -4.289604e+01 -4.275331e+01 -4.257792e+01
## SC(n)  -4.001001e+01 -3.970564e+01 -3.941099e+01 -3.908368e+01
## FPE(n)  3.013516e-20  3.253980e-20  3.491139e-20  3.883834e-20
##                   21            22            23            24
## AIC(n) -4.499525e+01 -4.497327e+01 -4.499923e+01 -4.497231e+01
## HQ(n)  -4.252099e+01 -4.239591e+01 -4.231878e+01 -4.218877e+01
## SC(n)  -3.887483e+01 -3.859782e+01 -3.836877e+01 -3.808683e+01
## FPE(n)  3.853004e-20  4.109414e-20  4.197058e-20  4.542228e-20
VARselect(vardata1, lag.max = 24, type = "both")
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##     13     13      1     13 
## 
## $criteria
##                    1             2             3             4
## AIC(n) -4.332266e+01 -4.356479e+01 -4.348239e+01 -4.356245e+01
## HQ(n)  -4.316802e+01 -4.330705e+01 -4.312156e+01 -4.309852e+01
## SC(n)  -4.294013e+01 -4.292724e+01 -4.258982e+01 -4.241487e+01
## FPE(n)  1.531918e-19  1.202758e-19  1.306724e-19  1.207263e-19
##                    5             6             7             8
## AIC(n) -4.356269e+01 -4.358158e+01 -4.354964e+01 -4.365470e+01
## HQ(n)  -4.299567e+01 -4.291146e+01 -4.277643e+01 -4.277840e+01
## SC(n)  -4.216009e+01 -4.192396e+01 -4.163701e+01 -4.148705e+01
## FPE(n)  1.208661e-19  1.188448e-19  1.230411e-19  1.111723e-19
##                    9            10            11            12
## AIC(n) -4.360438e+01 -4.354093e+01 -4.360272e+01 -4.420725e+01
## HQ(n)  -4.262499e+01 -4.245844e+01 -4.241714e+01 -4.291858e+01
## SC(n)  -4.118171e+01 -4.086324e+01 -4.067002e+01 -4.101953e+01
## FPE(n)  1.174512e-19  1.258658e-19  1.191554e-19  6.564776e-20
##                   13            14            15            16
## AIC(n) -4.486685e+01 -4.479322e+01 -4.476130e+01 -4.470450e+01
## HQ(n)  -4.347508e+01 -4.329835e+01 -4.316334e+01 -4.300344e+01
## SC(n)  -4.142411e+01 -4.109546e+01 -4.080852e+01 -4.049671e+01
## FPE(n)  3.428347e-20  3.733645e-20  3.907345e-20  4.200769e-20
##                   17            18            19            20
## AIC(n) -4.470769e+01 -4.462078e+01 -4.454695e+01 -4.446104e+01
## HQ(n)  -4.290354e+01 -4.271353e+01 -4.253661e+01 -4.234761e+01
## SC(n)  -4.024488e+01 -3.990295e+01 -3.957410e+01 -3.923317e+01
## FPE(n)  4.262821e-20  4.745299e-20  5.227637e-20  5.845948e-20
##                   21            22            23            24
## AIC(n) -4.447801e+01 -4.447445e+01 -4.450305e+01 -4.452634e+01
## HQ(n)  -4.226149e+01 -4.215483e+01 -4.208034e+01 -4.200053e+01
## SC(n)  -3.899513e+01 -3.873656e+01 -3.851014e+01 -3.827840e+01
## FPE(n)  5.916897e-20  6.133955e-20  6.180785e-20  6.286699e-20
VARselect(vardata1, lag.max = 24, type = "const")
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##     13     13      2     13 
## 
## $criteria
##                    1             2             3             4
## AIC(n) -4.324262e+01 -4.349769e+01 -4.341597e+01 -4.350035e+01
## HQ(n)  -4.311375e+01 -4.326573e+01 -4.308092e+01 -4.306220e+01
## SC(n)  -4.292385e+01 -4.292390e+01 -4.258716e+01 -4.241653e+01
## FPE(n)  1.659530e-19  1.286125e-19  1.396234e-19  1.284264e-19
##                    5             6             7             8
## AIC(n) -4.350964e+01 -4.355263e+01 -4.351989e+01 -4.363633e+01
## HQ(n)  -4.296839e+01 -4.290829e+01 -4.277246e+01 -4.278580e+01
## SC(n)  -4.217079e+01 -4.195877e+01 -4.167101e+01 -4.153243e+01
## FPE(n)  1.274003e-19  1.222652e-19  1.266593e-19  1.131214e-19
##                    9            10            11            12
## AIC(n) -4.358674e+01 -4.353196e+01 -4.360259e+01 -4.419234e+01
## HQ(n)  -4.263312e+01 -4.247524e+01 -4.244278e+01 -4.292944e+01
## SC(n)  -4.122783e+01 -4.091803e+01 -4.073364e+01 -4.106837e+01
## FPE(n)  1.193914e-19  1.268037e-19  1.189473e-19  6.648483e-20
##                   13            14            15            16
## AIC(n) -4.487526e+01 -4.479750e+01 -4.476837e+01 -4.470764e+01
## HQ(n)  -4.350926e+01 -4.332841e+01 -4.319618e+01 -4.303236e+01
## SC(n)  -4.149627e+01 -4.116350e+01 -4.087935e+01 -4.056360e+01
## FPE(n)  3.390682e-20  3.706241e-20  3.865984e-20  4.170426e-20
##                   17            18            19            20
## AIC(n) -4.471196e+01 -4.461761e+01 -4.454380e+01 -4.445785e+01
## HQ(n)  -4.293358e+01 -4.273614e+01 -4.255923e+01 -4.237019e+01
## SC(n)  -4.031290e+01 -3.996354e+01 -3.963471e+01 -3.929374e+01
## FPE(n)  4.224827e-20  4.735084e-20  5.212734e-20  5.825143e-20
##                   21            22            23            24
## AIC(n) -4.447272e+01 -4.447550e+01 -4.449093e+01 -4.451415e+01
## HQ(n)  -4.228197e+01 -4.218165e+01 -4.209399e+01 -4.201411e+01
## SC(n)  -3.905360e+01 -3.880136e+01 -3.856177e+01 -3.832997e+01
## FPE(n)  5.903415e-20  6.075948e-20  6.197545e-20  6.297520e-20
var1 <- VAR(vardata1, p=4, type = "both", season = 12)

The model selected using the VARselect command is a VAR with 4 lags that includes both a constant and a trend, along with seasonal dummies. This was expected, as log-CPI, log-GDP, and log-IPPI have a noticeable trend and the data are seasonally unadjusted.
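Written out, the estimated system is a four-lag VAR with a constant, a linear trend, and monthly seasonal dummy terms,

\[
y_t = c + \delta t + \sum_{s=1}^{11}\gamma_s d_{s,t} + A_1 y_{t-1} + A_2 y_{t-2} + A_3 y_{t-3} + A_4 y_{t-4} + u_t,
\]

where \(y_t\) stacks the four series in the order described above (log IPPI, log GDP, log CPI, unemployment rate) and the \(A_j\) are \(4 \times 4\) coefficient matrices.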

roots(var1)
##  [1] 0.9798524 0.9378444 0.9378444 0.7514438 0.7514438 0.6281805 0.6281805
##  [8] 0.6030783 0.6030783 0.5474862 0.5474862 0.5337042 0.5337042 0.4930711
## [15] 0.2590086 0.2590086
serial.test(var1)
## 
##  Portmanteau Test (asymptotic)
## 
## data:  Residuals of VAR object var1
## Chi-squared = 210.63, df = 192, p-value = 0.1697
acf(residuals(var1), type="partial", lag.max=24)

The roots are all less than 1, and the serial test returns a p-value greater than 0.05, so we fail to reject the null hypothesis that the errors are uncorrelated. Both results suggest an adequate model, so the analysis continues. It is worth noting, however, that the p-value of 0.1697 is not very large, which suggests some room for improvement, although it is large enough to proceed. Moreover, the PACF plots show some remaining cross-correlation between the unemployment rate and the other variables, and the same can be said of the industrial product price index. This may explain why the p-value was only 0.1697; still, it falls in the region where we fail to reject the null, so it is safe to continue.

causality(var1, cause= c("l.gdp", "df3.ur", "l.ippi" ))
## $Granger
## 
##  Granger causality H0: l.ippi l.gdp df3.ur do not Granger-cause
##  l.cpi
## 
## data:  VAR object var1
## F-Test = 3.6321, df1 = 12, df2 = 804, p-value = 2.531e-05
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: l.ippi l.gdp df3.ur and
##  l.cpi
## 
## data:  VAR object var1
## Chi-squared = 32.087, df = 3, p-value = 5.018e-07
causality(var1, cause= c("l.cpi", "l.gdp", 'df3.ur' ))
## $Granger
## 
##  Granger causality H0: l.gdp l.cpi df3.ur do not Granger-cause
##  l.ippi
## 
## data:  VAR object var1
## F-Test = 1.4944, df1 = 12, df2 = 804, p-value = 0.1205
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: l.gdp l.cpi df3.ur and
##  l.ippi
## 
## data:  VAR object var1
## Chi-squared = 34.601, df = 3, p-value = 1.479e-07
causality(var1, cause= c("l.ippi", "l.cpi", 'l.gdp' ))
## $Granger
## 
##  Granger causality H0: l.ippi l.gdp l.cpi do not Granger-cause
##  df3.ur
## 
## data:  VAR object var1
## F-Test = 3.1073, df1 = 12, df2 = 804, p-value = 0.0002523
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: l.ippi l.gdp l.cpi and
##  df3.ur
## 
## data:  VAR object var1
## Chi-squared = 9.8693, df = 3, p-value = 0.01971
causality(var1, cause= c("df3.ur", "l.ippi", 'l.cpi' ))
## $Granger
## 
##  Granger causality H0: l.ippi l.cpi df3.ur do not Granger-cause
##  l.gdp
## 
## data:  VAR object var1
## F-Test = 2.7038, df1 = 12, df2 = 804, p-value = 0.001382
## 
## 
## $Instant
## 
##  H0: No instantaneous causality between: l.ippi l.cpi df3.ur and
##  l.gdp
## 
## data:  VAR object var1
## Chi-squared = 11.932, df = 3, p-value = 0.007619

The above code is a comprehensive set of causality tests within the model. The first test is perhaps the most integral to this VAR model, as the prediction is that GDP, IPPI, and the unemployment rate have a causal effect on inflation. The null hypothesis is that log-IPPI, log-GDP, and the unemployment rate do not Granger-cause log-CPI; a p-value of virtually 0 rejects this null, supporting the predicted relationship.

The second test fails to reject its null hypothesis (p-value of 0.1205), suggesting the other variables in the model do not Granger-cause the producer price index. Lastly, I test for Granger causality towards the unemployment rate and GDP, respectively. Both tests return a p-value less than 0.05, rejecting the null hypothesis of no causality towards each of these two variables.
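Since the Wald and F statistics rely on asymptotic distributions, one optional robustness check is the bootstrapped version of the Granger test offered by the vars package; a minimal sketch for the first, and most important, test:

# Bootstrapped Granger causality test for the main hypothesis of interest
causality(var1, cause = c("l.gdp", "df3.ur", "l.ippi"), boot = TRUE, boot.runs = 500)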

fevdvar1 = fevd(var1, n.ahead = 48)
par(mfrow=c(2,2))
plot(fevdvar1, plot.type = "single")

The ordering of the variables is best explained by the forecast error variance decomposition. The variable whose forecast errors are explained in the highest proportion by the other variables in the model is ordered last, which in this case is the unemployment rate. Conversely, the variable whose errors are least explained by the others is ordered first, and so on. Given its decomposition, the unemployment rate is likely to produce well-behaved impulse response functions, while shocks from GDP and IPPI are less likely to do so.
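The decomposition shares behind this ordering can also be read off numerically rather than from the plots; a minimal sketch, inspecting the shares for log-CPI at a few horizons (variable names as in the model above):

# Share of the l.cpi forecast error variance explained by each variable
# at horizons of 12, 24, and 48 months (each row sums to 1)
round(fevdvar1$l.cpi[c(12, 24, 48), ], 3)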

var1.irf <- irf(var1, n.ahead = 48, boot = TRUE,  runs=500, seed=3, cumulative=FALSE)
par(mfrow=c(2,2))
plot(var1.irf, plot.type = "single")

The impulse response functions are plotted above, with each chart representing an exogenous shock from one variable of the model to the others. The first exogenous shock is from l.ippi. The effect on GDP is initially positive and statistically significant for the first several months, then negative and marginally significant for some periods between 15 and 25 months. The positive portion is consistent with the theory: producers charge more for their goods when demand increases, which reflects good economic times and economic expansion. The effect on CPI is also positive and significant, again consistent with the theory, as rising producer prices translate into rising costs for the consumer and therefore inflation; the effect converges to 0 after several months. Lastly, the effect on unemployment is smaller in magnitude: it is negative and statistically significant for a short period, then turns positive and statistically significant for a short time.

The second shock is a positive exogenous shock to GDP. It has a positive effect on CPI that is significant roughly between 8 and 17 months. The effect on unemployment is much stronger: a large, negative, statistically significant response over the first 15 months, which is consistent with the theory. The dynamics between the unemployment rate and GDP illustrate the theory well, with a smooth response that converges towards 0 in the long run.
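To look at this GDP-to-unemployment response in isolation, the irf command can be restricted to a single impulse and response; a minimal sketch, where gdp.to.ur is an illustrative name:

# Isolate the response of the unemployment rate to a GDP shock
gdp.to.ur <- irf(var1, impulse = "l.gdp", response = "df3.ur",
                 n.ahead = 48, boot = TRUE, runs = 500, seed = 3)
plot(gdp.to.ur)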

Unfortunately, the impulse response functions for a positive exogenous shock to CPI do not give very interesting results: all of them are statistically insignificant, with wide error bands. The point estimate for GDP is negative, which is not what the theory states. The response of the unemployment rate is negative at the start, then spikes upwards and converges to 0, although again neither effect is statistically significant. This pattern could suggest that an increase in CPI has a positive effect on output (negative for the unemployment rate) in the short term, but a negative (positive) effect shortly after as costs rise for producers and consumers. Producers then reduce inputs to production such as labour and capital, which would explain the decline in IPPI and GDP and, finally, the increase in unemployment.

Lastly, the impulse response functions for an exogenous shock to unemployment return mixed results. The producer price index shows a large negative drop, as expected, but the response is statistically insignificant; the confidence bands even allow for a positive effect on the producer price index from an increase in the unemployment rate. The effects on GDP and CPI are also inconsistent with the theory: there is a slight drop in both, followed by a sharp increase and a persistent positive effect, which is contrary to what the theory suggests but consistent with the impulse response functions seen thus far.
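Because orthogonalized impulse responses depend on the Cholesky ordering, one way to gauge how sensitive these mixed unemployment results are is to re-estimate the model under a different ordering; a minimal sketch, where vardata1.alt and var1.alt are illustrative names and the column names are assumed from the model output above:

# Re-order the variables (unemployment first) and re-estimate the VAR
vardata1.alt <- vardata1[, c("df3.ur", "l.cpi", "l.gdp", "l.ippi")]
var1.alt <- VAR(vardata1.alt, p = 4, type = "both", season = 12)
# Compare the unemployment-shock responses under the alternative ordering
plot(irf(var1.alt, impulse = "df3.ur", n.ahead = 48, boot = TRUE, runs = 500, seed = 3),
     plot.type = "single")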

Something to note is that, unlike the Federal Reserve, the Bank of Canada does not have a dual mandate targeting both inflation and employment. This could affect the strength of the impulse response functions, although the dynamics should stay the same. In the next section, I test the model by forecasting from the training data over the testing length, as in the previous sections.

testvar <- window(vardata1, start=2017)
trainvar <- window(vardata1,start=c(1999, 1), end = c(2016,12))
h1=dim(testvar)[1]
var2 = VAR(trainvar, p=4, type= "both", season = 12)
var.fc2 = forecast(var2, h = h1)
accuracy(var.fc2$forecast$l.cpi, testvar[,3])
##                         ME        RMSE         MAE           MPE
## Training set -1.675823e-17 0.002659144 0.002097989 -3.337039e-05
## Test set     -1.624709e-03 0.004295395 0.003611683 -3.336902e-02
##                    MAPE      MASE        ACF1 Theil's U
## Training set 0.04454430 0.1084849 -0.01059989        NA
## Test set     0.07410444 0.1867564  0.69301743  1.388346

Before forecasting out-of-sample, I produce benchmark forecasts using the training and test data. To keep this analysis consistent, I report the accuracy measures separately from the previous forecasts, as this is a forecast of log-transformed CPI rather than inflation.

The MASE for the test set is 0.18676, which is much lower than that of the other forecasting methods. Moreover, the test-set accuracy measures are lower than the ARIMA training-set measures, suggesting a strong model. However, it is important to remember that differenced data such as inflation have different forecasting properties than highly persistent data such as CPI or GDP. To demonstrate this, I compare the forecast accuracy of the ARIMA method on the log-transformed CPI.

logcpi = log(cpi)
testcpi = window(logcpi, start=2017)
traincpi = window(logcpi, start= c(1999, 1), end = c(2016,12))
cpi.arima = forecast(auto.arima(traincpi, approximation=FALSE, stepwise= FALSE), h= h)
accuracy(cpi.arima, testcpi)
##                        ME        RMSE         MAE          MPE       MAPE
## Training set 2.858887e-05 0.003353440 0.002604100 0.0008291841 0.05536960
## Test set     1.568165e-03 0.003791617 0.003348356 0.0320908785 0.06862728
##                   MASE       ACF1 Theil's U
## Training set 0.1346553 0.00224416        NA
## Test set     0.1731400 0.69390309   1.17832

After performing the accuracy test on the same data set using an ARIMA forecasting method, the MASE for the test set is 0.1731. This is very close to, and slightly lower than, the VAR model's 0.18676. With the two measures so close, the VAR model should provide a forecast similar to that of the ARIMA method previously discussed.
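For ease of comparison, the two test-set rows can be stacked side by side; a minimal sketch using the objects created above:

# Test-set accuracy of the VAR and ARIMA forecasts of log-CPI, side by side
rbind(VAR   = accuracy(var.fc2$forecast$l.cpi, testvar[, 3])["Test set", ],
      ARIMA = accuracy(cpi.arima, testcpi)["Test set", ])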

var.fc = predict(var1, n.ahead= 6)
var.fc$fcst$l.cpi
##          fcst    lower    upper          CI
## [1,] 4.896633 4.891084 4.902181 0.005548546
## [2,] 4.896745 4.888578 4.904912 0.008166991
## [3,] 4.898833 4.889436 4.908231 0.009397613
## [4,] 4.898259 4.887860 4.908657 0.010398262
## [5,] 4.897440 4.886193 4.908687 0.011246863
## [6,] 4.894905 4.883082 4.906729 0.011823633
# Extract the point forecasts of log-CPI from the VAR forecast
l.cpi.fc = var.fc$fcst$l.cpi[,1]
# Append the forecasts to the historical log-CPI series
lcpi = as.matrix(df3[,"l.cpi"])
lcpi = c(lcpi, l.cpi.fc)
lcpi.incl.fc = ts(lcpi, start = c(1999, 01), frequency = 12)
# Convert to year-over-year inflation (%) and keep only the forecast period
y.var.fc = diff(lcpi.incl.fc, lag = 12)
y.var.fc = y.var.fc*100
y.var.fc = window(y.var.fc, start = c(2018, 07))

To keep this forecast consistent with the rest of the paper, I difference the forecast of log-CPI at a lag of 12 months, so that it can be plotted over time and compared with the variable of interest, year-over-year inflation. The forecast horizon is 6 months, as it will be compared to the ARIMA model, which was shown above to have similar forecast accuracy.

autoplot(y) + autolayer(y.var.fc, series = "Forecast") +
  autolayer(y1n, series = "Recent Data") +
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate VAR Forecast") + 
  guides(colour=guide_legend(title=""))

As in the previous forecasting section, I have included the recent inflation data. Visually, the forecast is similar to the ARIMA forecast, which is unsurprising given the similar accuracy measures and confirms the results of the VAR accuracy comparison. Moreover, it seems to track the recent inflation data better than the ARIMA model does. The next section provides a more in-depth discussion of the forecasting methods used thus far; both the ARIMA and VAR models appear to produce forecasts consistent with the economic theory.

Conclusion

y.n = naive(y, h=6)
summary(y.n)
## 
## Forecast method: Naive method
## 
## Model Information:
## Call: naive(y = y, h = 6) 
## 
## Residual sd: 0.4437 
## 
## Error measures:
##                      ME      RMSE       MAE       MPE    MAPE      MASE
## Training set 0.00118604 0.4427258 0.3454054 0.1170405 29.0103 0.3388692
##                    ACF1
## Training set 0.08724437
## 
## Forecasts:
##          Point Forecast    Lo 80    Hi 80     Lo 95    Hi 95
## Jul 2018       2.424361 1.856985 2.991737 1.5566345 3.292088
## Aug 2018       2.424361 1.621970 3.226752 1.1972104 3.651512
## Sep 2018       2.424361 1.441637 3.407085 0.9214146 3.927308
## Oct 2018       2.424361 1.289609 3.559113 0.6889079 4.159814
## Nov 2018       2.424361 1.155670 3.693052 0.4840655 4.364657
## Dec 2018       2.424361 1.034580 3.814143 0.2988737 4.549849
tail(y1n)
##           Jun      Jul      Aug      Sep      Oct
## 2018 2.424361 2.946945 2.795800 2.192905 2.415212
checkresiduals(y.n)

## 
##  Ljung-Box test
## 
## data:  Residuals from Naive method
## Q* = 85.829, df = 24, p-value = 6.986e-09
## 
## Model df: 0.   Total lags used: 24

Initially, this data set posed some trouble, as I had to adjust the testing period. I found that using 90% of my data for training and the last 10% for testing gave me the best model; to account for this, I made sure to forecast no further than the length of my test data (\(h=18\)). With less than 90%, the ARIMA model was not favoured: the naive and exponential smoothing methods ranked better in terms of accuracy but failed the Ljung-Box test, with a p-value close to 0, thereby rejecting the null hypothesis that the model does not exhibit lack of fit. This is also reflected in the cross-validation section, where a forecast horizon of 6 months is most efficient. That said, the forecasts from the naive model are very close to the actual values for July, August, and October, but not for September. Moreover, given the recent 25 basis point increase to the policy rate on October 24, 2018, I would not expect the naive method to be the best model.
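The horizon comparison referred to above can be sketched with time-series cross-validation; a minimal example using the naive benchmark on the same inflation series y (the cross-validation section itself may use a different forecast function, and e is an illustrative name):

# Cross-validated errors for horizons 1 to 6, then RMSE by horizon
e <- tsCV(y, naive, h = 6)
sqrt(colMeans(e^2, na.rm = TRUE))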

y.f = forecast(auto.arima(y, approximation=FALSE, stepwise= FALSE ), h=6)
summary(y.f)
## 
## Forecast method: ARIMA(0,1,1)(2,0,1)[12]
## 
## Model Information:
## Series: y 
## ARIMA(0,1,1)(2,0,1)[12] 
## 
## Coefficients:
##          ma1     sar1     sar2     sma1
##       0.0966  -0.0487  -0.0720  -0.8873
## s.e.  0.0788   0.0830   0.0794   0.0776
## 
## sigma^2 estimated as 0.1026:  log likelihood=-70.44
## AIC=150.88   AICc=151.15   BIC=167.87
## 
## Error measures:
##                       ME      RMSE       MAE       MPE     MAPE      MASE
## Training set -0.01999494 0.3167181 0.2460129 -1.557083 17.76297 0.2413576
##                      ACF1
## Training set -0.009549693
## 
## Forecasts:
##          Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## Jul 2018       2.505995 2.0949444 2.917045 1.8773475 3.134642
## Aug 2018       2.478117 1.8680865 3.088147 1.5451561 3.411077
## Sep 2018       2.371214 1.6127214 3.129707 1.2111996 3.531229
## Oct 2018       2.259006 1.3766872 3.141324 0.9096160 3.608395
## Nov 2018       1.940914 0.9501261 2.931702 0.4256347 3.456193
## Dec 2018       2.021972 0.9334710 3.110473 0.3572533 3.686691
y1n
##           Jun      Jul      Aug      Sep      Oct
## 2018 2.424361 2.946945 2.795800 2.192905 2.415212
checkresiduals(y.f)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1)(2,0,1)[12]
## Q* = 17.27, df = 20, p-value = 0.6353
## 
## Model df: 4.   Total lags used: 24
autoplot(y) +
  autolayer(y.f) + 
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate ARIMA Forecast with Predicition Intervals")

The ARIMA model, which now has an order of \((0,1,1)(2,0,1)_{12}\), captures the increase in July 2018, followed by a slight decrease in August and a larger decrease in September. It is worth noting that the Bank of Canada increased the policy rate by 25 basis points in July 2017, before the increase in October 2018, which is also consistent with the forecast. The decline after September is consistent with what I would normally expect, even with the recent rate hike. Furthermore, I expect inflation to normalize at the target rate of 2% roughly by December. The forecasts afterwards drop below the 2% rate, suggesting perhaps a slight economic slowdown, possibly due to the relatively frequent rate hikes of late, whereas previously rate changes seldom occurred. Finally, the Ljung-Box test gives a p-value of 0.6353, which is greater than 0.05, failing to reject the null hypothesis that the model does not exhibit lack of fit. I can therefore conclude that this is an appropriate model for forecasting inflation.

y.f
##          Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
## Jul 2018       2.505995 2.0949444 2.917045 1.8773475 3.134642
## Aug 2018       2.478117 1.8680865 3.088147 1.5451561 3.411077
## Sep 2018       2.371214 1.6127214 3.129707 1.2111996 3.531229
## Oct 2018       2.259006 1.3766872 3.141324 0.9096160 3.608395
## Nov 2018       1.940914 0.9501261 2.931702 0.4256347 3.456193
## Dec 2018       2.021972 0.9334710 3.110473 0.3572533 3.686691
y1n
##           Jun      Jul      Aug      Sep      Oct
## 2018 2.424361 2.946945 2.795800 2.192905 2.415212
y.var.fc
##           Jul      Aug      Sep      Oct      Nov      Dec
## 2018 2.602624 2.537151 2.516368 2.382486 1.995486 2.123593
autoplot(y) +
  autolayer(y1n, series = "Recent Data") +
  autolayer(y.f, series = "ARIMA", PI=FALSE) + 
  autolayer(y.var.fc, series = "VAR") +
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate Forecasts") + 
  guides(colour=guide_legend(title=""))

A comparison of the ARIMA and VAR(4) models reveals similar forecasts that are roughly equal in accuracy, so logically both should be considered. A discernible difference is that the VAR model estimates larger values for inflation in the coming months, while the ARIMA estimates are smaller in magnitude; however, this difference dissipates, and the two forecasts follow each other closely. The recent inflation data show values greater than the model estimates: July and August were closer to the 3% mark, followed by a large drop in September and an increase in October. The VAR model's prediction for October is very close to the actual inflation reading for that month, suggesting it could be the more accurate model. Both models arrive at an estimate close to 2% for November, with a slight uptick in December, possibly due to the holiday season. To build a more robust forecast, an average of the two can be taken, which I treat as the most accurate forecast within the scope of this paper.

y.fdf = as.data.frame(y.f)
arima.fc = y.fdf[,1]
avg.fc = (y.var.fc + arima.fc)/2
avg.fc
##           Jul      Aug      Sep      Oct      Nov      Dec
## 2018 2.554309 2.507634 2.443791 2.320746 1.968200 2.072783

To conclude this paper, because the data end in June 2018, I will make a policy rate decision as if I were conducting this analysis in July 2018. I would decide to raise the policy rate by 25 basis points, because the inflation rate is hovering around the upper bound of the target range. The Bank of Canada came to the same conclusion in July 2018 and increased the policy rate by 25 basis points, to 1.50%.

autoplot(y) +
  autolayer(y1n, series = "Recent Data") +
  autolayer(avg.fc, series = "Forecast") + 
  xlab("Year") + ylab("Inflation (%)") +
  ggtitle("Canadian Inflation Rate  Forecast") + 
  guides(colour=guide_legend(title=""))

References

Romer, David. Advanced Macroeconomics. 4th ed., McGraw-Hill, 2012.

Hyndman, Rob J, and George Athanasopoulos. Forecasting: Principles and Practice. 2nd ed., Monash University, Australia, 2018.

Mishkin, Frederic S., and Apostolos Serletis. The Economics of Money, Banking, and Financial Markets. Pearson, 2016.

Meyler, Aidan, Geoff Kenny, and Terry Quinn. “Forecasting Irish Inflation Using ARIMA Models.” Central Bank and Financial Services Authority of Ireland Technical Paper Series, no. 3/RT/98, Dec. 1998, pp. 1-48.

Iqbal, Muhammad, and Amjad Naveed. “Forecasting Inflation: Autoregressive Integrated Moving Average Model.” European Scientific Journal, vol. 12, no. 1, Jan. 2016, pp. 83-92.

Canada. Bank of Canada. Governing Council of the Bank of Canada. Monetary Policy Report July 2017. [Ottawa]: Bank of Canada, 2017. Publications. Web. 26 Oct. 2018.

Canada. Bank of Canada. Governing Council of the Bank of Canada. Monetary Policy Report October 2018. [Ottawa]: Bank of Canada, 2018. Publications. Web. 26 Oct. 2018.