This paper aims to analyse two economics time series about Belgium, and these series are GDPC and INDIVC, where the main goal is analysing the series univariately and jointly and then coming up with a good model can be used for interpretation and forecasting. The notation GDPC stands for the Gross domestic product per capita at market prices with euro unit as published on EUROSTAT website quarterly from 1995Q1 till 2018Q2. The notation INDIVC stands for the actual individual consumption with euro unit as published on EUROSTAT website quarterly from 1995Q1 till 2018Q2, therefore the number of observations for these series are 94 obs.
#Read Data and define the variables.
library(readxl)
Belgium_data <- read_excel("C:/Users/Hatem/Desktop/Master Degree/Second Year/Time Series/Assignment/GDP and Cons per capita-belgium.xls", sheet = "Belgium")
Belgium <- data.frame(Belgium_data)
GDPC <- ts(Belgium$GDPC, frequency = 4, start = c(1995,1))# GDP per capita
INDIVC <- ts(Belgium$AIC, frequency = 4, start = c(1995,1))# Individual Consumption
Table.1 shows that the quarterly average of GDP per capita during period of study approximately is 7563 euros with standard deviation 1393 euros, whereas the quarterly average of individual consumption approximately is 4994 euros with 955 euros standard deviation. From the plots of GDPC and INDIVC in level which appear in Figure.1, it is shown that there are increasing deterministic trend and seasonal pattern. Figure.2 presents the seasonal plots for GDPC and INDIVC.
kable(describe(GDPC), caption = "Descriptive Statistics of GDP")
| vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 1 | 94 | 7562.766 | 1393.378 | 7850 | 7567.105 | 1704.99 | 5300 | 10200 | 4900 | -0.1076051 | -1.25883 | 143.716 |
kable(describe(INDIVC), caption = "Descriptive Statistics of Indiviual consumpption")
| vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 1 | 94 | 4993.617 | 954.7627 | 5000 | 4986.842 | 1334.34 | 3400 | 6700 | 3300 | 0.0020217 | -1.38225 | 98.47625 |
dGDPC <- diff(GDPC)
dINDIVC <- diff(INDIVC)
sdGDPC <- diff(diff(GDPC, lag = 4))
sdINDIVC <- diff(diff(INDIVC, lag = 4))
To analyse the time series and build a forecasting model for, using statistical methodology like (ARMA), the series must be stationary. Stationarity can be checked from line plot, or by unit root tests. Many statistical tests can be used to test that if a time series has a unit root or not. Augmented Dickey-Fuller test (ADF) was used here. Figure.1 shows the series plots in level and first difference, which reveals that the series are non-stationary in level, but they became stationary after first difference, the results of ADF-test below emphasize that, because the test doesn’t reject the null hypotheses (Ho: Series are non-stationary) at 0.05 when the time series in level, but the null hypotheses are rejected when the time series in the first difference, therefore the time series of GDPC and INDIVC are stationary after one difference, which means each one of them is integrated from the first order. Moreover, Figure.1 shows the de-seasonal time series which look stationary as well, and ADF-test in Table.2 emphasize that again.
par(mfrow = c(3,2))
plot(GDPC, ylab = "GDP per capita", type = "o", main = "Time series in level")
plot(INDIVC, ylab = "Individual consumption", type = "o", main = "Time series in level")
plot(dGDPC, ylab = "GDP per capita", type = "o", main = "Time series after first difference")
plot(dINDIVC, ylab = "Individual consumption", type = "o", main = "Time series after first difference")
plot(sdGDPC, ylab = "GDP per capita", type = "o", main = "Time series after seasonal and first differences")
plot(sdINDIVC, ylab = "Individual consumption", type = "o", main = "Time series after seasonal and first differences")
Figure 1: GDPC and INDIVC in level, first difference and difference of de-seasonality
par(mfrow = c(2,2))
seasonplot(GDPC)
seasonplot(INDIVC)
seasonplot(diff(GDPC))
seasonplot(diff(INDIVC))
Figure 2: Seasonal plots of GDPC and INDIVC in level and first differences.
#Stationary Test.
CADFtest(GDPC)
##
## ADF test
##
## data: GDPC
## ADF(1) = -2.6118, p-value = 0.2762
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -0.2712786
CADFtest(INDIVC)
##
## ADF test
##
## data: INDIVC
## ADF(1) = -3.2601, p-value = 0.07956
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -0.3890632
# Stationary test after first difference.
CADFtest(dGDPC)
##
## ADF test
##
## data: dGDPC
## ADF(1) = -8.0392, p-value = 3.243e-09
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -1.704165
CADFtest(dINDIVC)
##
## ADF test
##
## data: dINDIVC
## ADF(1) = -8.9753, p-value = 9.156e-11
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -1.865636
#stationary after de-seasonality
CADFtest(sdGDPC)
##
## ADF test
##
## data: sdGDPC
## ADF(1) = -5.9849, p-value = 1.317e-05
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -0.9186413
CADFtest(sdINDIVC)
##
## ADF test
##
## data: sdINDIVC
## ADF(1) = -9.0231, p-value = 1.048e-10
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -1.603567
To specify an ARMA or SARMA model for a univariate stationary time series, autocorrelation and partial autocorrelation functions plots were used. Figures.3 shows the Acf and Pacf for stationary de-seasonal GDPC and INDIVC series, initially for GDPC series one can suggest SARIMA(0,1,0)(0,1,1)[4] because the Acf has significant correlation at lag=4 while Pacf does not have a nice structure. For INDIV series one can suggest SARIMA (1,1,0)(2,1,0)[4] as an initial model, because Pacf has significant correlations at lag=1,4 and 8, or SARIMA(0,1,1)(0,1,1)[4] as an initial model, because acf has significant correlations at lag=1 and 4.
#Models Specification using correlogram
par(mfrow = c(2,2))
#Acf and Parial acf for GDP.Capita after first diff. and seasonal diff.
acf(diff(diff((GDPC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal GDP per Capita")
pacf(diff(diff((GDPC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal GDP per Capita")
#Acf and Parial acf for Indiv.C after first diff. and seasonal diff.
acf(diff(diff((INDIVC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal Individual consumption")
pacf(diff(diff((INDIVC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal Individual consumption")
Figure 3: Ac and Pac functions for stationary de-seasonal GDPC and INDIV.
Based on what is suggested before, the initial models were estimated beside group of different models were estimated by trial. The R outputs below show goodness of fit criteria (RMSE, MAE, AIC, BIC), to help with selecting the best model, statistically the model that has smallest values of those criteria is considered the best. Moreover Box-test significance and number of significant coefficients are considered for more help at model selection. With respect to GDP per capita, the criteria point out that SARIMA(1,1,1)(0,1,1)[4] is the best model, with all its coefficients are statistically significant at 0.05 level.
GDPC.model1 <- Arima(GDPC, order = c(0,1,0), seasonal = c(0,1,1), method = "ML")
summary(GDPC.model1)
## Series: GDPC
## ARIMA(0,1,0)(0,1,1)[4]
##
## Coefficients:
## sma1
## -0.4645
## s.e. 0.0871
##
## sigma^2 estimated as 7754: log likelihood=-524.8
## AIC=1053.6 AICc=1053.74 BIC=1058.58
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 2.937524 85.20072 62.59226 0.03726042 0.8289887 0.291881
## ACF1
## Training set -0.07458765
tsdiag(GDPC.model1)
Figure 4:Residuals Diagnostics
Box.test(GDPC.model1$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: GDPC.model1$residuals
## X-squared = 17.851, df = 10, p-value = 0.05752
GDPC.model2 <- Arima(GDPC, order = c(0,1,0), seasonal = c(1,1,1), method = "ML")
summary(GDPC.model2)
## Series: GDPC
## ARIMA(0,1,0)(1,1,1)[4]
##
## Coefficients:
## sar1 sma1
## -0.0942 -0.3977
## s.e. 0.1951 0.1730
##
## sigma^2 estimated as 7819: log likelihood=-524.69
## AIC=1055.37 AICc=1055.65 BIC=1062.84
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 2.941644 85.06881 62.5358 0.03729303 0.8277721 0.2916177
## ACF1
## Training set -0.06151828
tsdiag(GDPC.model2)
Figure 5:Residuals Diagnostics
Box.test(GDPC.model2$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: GDPC.model2$residuals
## X-squared = 18.063, df = 10, p-value = 0.0539
GDPC.model3 <- Arima(GDPC, order = c(0,1,1), seasonal = c(1,1,1), method = "ML")
summary(GDPC.model3)
## Series: GDPC
## ARIMA(0,1,1)(1,1,1)[4]
##
## Coefficients:
## ma1 sar1 sma1
## -0.0584 -0.0641 -0.4359
## s.e. 0.1016 0.1954 0.1754
##
## sigma^2 estimated as 7875: log likelihood=-524.52
## AIC=1057.04 AICc=1057.52 BIC=1067
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 3.290439 84.88136 62.90117 0.04310862 0.8329311 0.2933215
## ACF1
## Training set -0.005495464
tsdiag(GDPC.model3)
Figure 6:Residuals Diagnostics
Box.test(GDPC.model3$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: GDPC.model3$residuals
## X-squared = 17.875, df = 10, p-value = 0.0571
GDPC.model4 <- Arima(GDPC, order = c(1,1,1), seasonal = c(0,1,1), method = "ML")
summary(GDPC.model4)
## Series: GDPC
## ARIMA(1,1,1)(0,1,1)[4]
##
## Coefficients:
## ar1 ma1 sma1
## -0.9956 0.9352 -0.6831
## s.e. 0.0102 0.0747 0.0893
##
## sigma^2 estimated as 6851: log likelihood=-519.39
## AIC=1046.78 AICc=1047.26 BIC=1056.74
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 4.448005 79.16761 58.9668 0.06662214 0.7859778 0.2749747
## ACF1
## Training set 0.04169637
tsdiag(GDPC.model4)
Figure 7:Residuals Diagnostics
Box.test(GDPC.model4$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: GDPC.model4$residuals
## X-squared = 7.5577, df = 10, p-value = 0.672
With respect to individual consumption series, the criteria point out that SARIMA(0,1,1)(0,1,1)[4] is the best model, with its both coefficients are statistically significant at 0.05 level. Accordingly, the results of model estimation are shown below.
INDIVC.model1 <- Arima(INDIVC, order = c(1,1,0), seasonal = c(2,1,0), method = "ML")
summary(INDIVC.model1)
## Series: INDIVC
## ARIMA(1,1,0)(2,1,0)[4]
##
## Coefficients:
## ar1 sar1 sar2
## -0.4057 -0.4701 -0.3448
## s.e. 0.1038 0.1079 0.1056
##
## sigma^2 estimated as 3115: log likelihood=-483.55
## AIC=975.09 AICc=975.57 BIC=985.05
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.2087445 53.38094 38.03756 -0.01403091 0.7752754 0.2976853
## ACF1
## Training set 0.01196252
tsdiag(INDIVC.model1)
Figure 8:Residuals Diagnostics
Box.test(INDIVC.model1$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: INDIVC.model1$residuals
## X-squared = 7.3313, df = 10, p-value = 0.6938
INDIVC.model2 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(2,1,0), method = "ML")
summary(INDIVC.model2)
## Series: INDIVC
## ARIMA(0,1,1)(2,1,0)[4]
##
## Coefficients:
## ma1 sar1 sar2
## -0.3947 -0.4566 -0.3308
## s.e. 0.0968 0.1107 0.1053
##
## sigma^2 estimated as 3138: log likelihood=-483.85
## AIC=975.69 AICc=976.17 BIC=985.65
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1548785 53.58194 38.35637 -0.005964718 0.7853136 0.3001803
## ACF1
## Training set -0.003050254
tsdiag(INDIVC.model2)
Figure 9:Residuals Diagnostics
Box.test(INDIVC.model2$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: INDIVC.model2$residuals
## X-squared = 8.1558, df = 10, p-value = 0.6136
INDIVC.model3 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(0,1,1), method = "ML")
summary(INDIVC.model3)
## Series: INDIVC
## ARIMA(0,1,1)(0,1,1)[4]
##
## Coefficients:
## ma1 sma1
## -0.3887 -0.6233
## s.e. 0.0946 0.0934
##
## sigma^2 estimated as 3030: log likelihood=-483.07
## AIC=972.14 AICc=972.43 BIC=979.61
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.6352748 52.95694 38.72873 0.007073504 0.7922808 0.3030944
## ACF1
## Training set -0.001996647
tsdiag(INDIVC.model3)
Figure 10:Residuals Diagnostics
Box.test(INDIVC.model3$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: INDIVC.model3$residuals
## X-squared = 7.5114, df = 10, p-value = 0.6764
INDIVC.model4 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(2,1,1), method = "ML")
summary(INDIVC.model4)
## Series: INDIVC
## ARIMA(0,1,1)(2,1,1)[4]
##
## Coefficients:
## ma1 sar1 sar2 sma1
## -0.4189 0.0687 -0.1589 -0.6021
## s.e. 0.0972 0.1937 0.1398 0.1676
##
## sigma^2 estimated as 3022: log likelihood=-482.03
## AIC=974.05 AICc=974.77 BIC=986.49
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.6596787 52.27499 38.1716 0.009033021 0.7796661 0.2987342
## ACF1
## Training set -0.007305399
tsdiag(INDIVC.model4)
Figure 11:Residuals Diagnostics
Box.test(INDIVC.model4$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: INDIVC.model4$residuals
## X-squared = 6.0151, df = 10, p-value = 0.814
To check the validity of the models, residuals correlograms were plotted, and the Ljung-Box statistic (Q-statistic) was calculated to test the white noise of residuals. Figures 7 and 10 show the standardised residuals, autocorrelation functions of residuals and p-values of Ljung-Box statistic at different lags, for GDP per capita model and individual consumptions model respectively. For GDP per capita model, the Ljung-Box statistic was highly insignificant at different lags, and the Acf shows insignificant correlation except at lag =7, which is not that high, so it is ignored. For individual consumption model, the Ljung-Box statistic was highly insignificant at different lags, and the residuals Acf does not show any significant correlation.
Using the suggested models, the time series were fitted, and as mentioned before the models have the lowest values of criteria (RMSE, AIC, MAE). Figure 12 presents comparison between actual time series and fitted one, in addition to forecasted values for 10 future time points (10 quarter = 2.5 years). The figure reveals that there are highly matching between actual and fitted series, which means the model could be good enough for forecasting. Tables below present, the forecasted values of our times series using the chosen models.
| Time | Point | Lo95 | Hi95 |
|---|---|---|---|
| 2018Q3 | 9575.757 | 9413.534 | 9737.980 |
| 2018Q4 | 10389.451 | 10166.850 | 10612.052 |
| 2019Q1 | 9627.420 | 9352.005 | 9902.836 |
| 2019Q2 | 10220.162 | 9905.358 | 10534.967 |
| 2019Q3 | 9774.710 | 9394.309 | 10155.112 |
| 2019Q4 | 10609.520 | 10179.363 | 11039.678 |
| 2020Q1 | 9826.466 | 9346.243 | 10306.690 |
| 2020Q2 | 10440.139 | 9919.594 | 10960.685 |
| 2020Q3 | 9973.849 | 9389.800 | 10557.898 |
| 2020Q4 | 10829.405 | 10194.397 | 11464.414 |
| Time | Point | Lo95 | Hi95 |
|---|---|---|---|
| 2018Q3 | 6501.271 | 6393.383 | 6609.160 |
| 2018Q4 | 6820.712 | 6694.263 | 6947.161 |
| 2019Q1 | 6456.144 | 6313.530 | 6598.757 |
| 2019Q2 | 6726.139 | 6569.015 | 6883.263 |
| 2019Q3 | 6632.888 | 6443.019 | 6822.758 |
| 2019Q4 | 6952.329 | 6741.868 | 7162.791 |
| 2020Q1 | 6587.761 | 6358.550 | 6816.972 |
| 2020Q2 | 6857.756 | 6611.217 | 7104.295 |
| 2020Q3 | 6764.506 | 6485.117 | 7043.894 |
| 2020Q4 | 7083.947 | 6781.572 | 7386.322 |
#Actual, fitted and Forecasting plot
win.graph(width=15,height=12,pointsize=8)
par(mfrow = c(2,1))
plot(forecast(GDPC.model4, h = 10), ylab = "GDP per capita")
lines(GDPC.model4$fitted, col = "red")
legend(1995, 11000, legend = c("Atual", "Fitted", "Forecasting"),
col = c("black", "red", "blue"), lty=1, cex=0.8)
forecast(GDPC.model4, h = 10)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2018 Q3 9575.757 9469.685 9681.829 9413.534 9737.980
## 2018 Q4 10389.451 10243.900 10535.002 10166.850 10612.052
## 2019 Q1 9627.420 9447.336 9807.505 9352.005 9902.836
## 2019 Q2 10220.162 10014.323 10426.002 9905.358 10534.967
## 2019 Q3 9774.710 9525.979 10023.441 9394.309 10155.112
## 2019 Q4 10609.520 10328.255 10890.785 10179.363 11039.678
## 2020 Q1 9826.466 9512.465 10140.468 9346.243 10306.690
## 2020 Q2 10440.139 10099.773 10780.506 9919.594 10960.685
## 2020 Q3 9973.849 9591.960 10355.738 9389.800 10557.898
## 2020 Q4 10829.405 10414.196 11244.615 10194.397 11464.414
plot(forecast(INDIVC.model3, h = 10), ylab = "Individual Consumption")
lines(INDIVC.model3$fitted, col = "red")
legend(1995, 7000, legend = c("Atual", "Fitted", "Forecasting"),
col = c("black", "red", "blue"), lty=1, cex=0.8)
Figure 12: Actual, fitted, and forecasting times series of GDPC and INDIVC
forecast(INDIVC.model3, h = 10)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2018 Q3 6501.271 6430.727 6571.816 6393.383 6609.160
## 2018 Q4 6820.712 6738.032 6903.392 6694.263 6947.161
## 2019 Q1 6456.144 6362.894 6549.394 6313.530 6598.757
## 2019 Q2 6726.139 6623.401 6828.877 6569.015 6883.263
## 2019 Q3 6632.888 6508.740 6757.037 6443.019 6822.758
## 2019 Q4 6952.329 6814.716 7089.943 6741.868 7162.791
## 2020 Q1 6587.761 6437.888 6737.634 6358.550 6816.972
## 2020 Q2 6857.756 6696.553 7018.960 6611.217 7104.295
## 2020 Q3 6764.506 6581.823 6947.188 6485.117 7043.894
## 2020 Q4 7083.947 6886.234 7281.659 6781.572 7386.322
multivariate times series analysis is used to develop a model describes the relationship between GDP per capita and individual consumption, based on Economic aspect, GDP per capita should affect in individual consumption positively, which means that the increasing in GDP per capita should leads to increasing in individual consumption. Based on that Economic aspect, different multivariate models will be estimated, to see which one suits data better.
Engle-Granger approach is used to test the cointegration between GDPC and INDIVC, because the model is simple with two variables. The cointegration will be tested for time series in level, and de-seasonal time series. The results of stationary test for residuals of \(INDVI_t = \beta_0 + \beta_1GDPC_t+\epsilon_t\) shows that ADF-stat is significant for the residuals of regression of de-seasonal series, which means the (Ho) is rejected, therefore the residual is stationary, and the de-seasonal series are cointegrated.
#Co-integration test
#level Series
mod1 <- lm(INDIVC~GDPC)
CADFtest(mod1$residuals)
##
## ADF test
##
## data: mod1$residuals
## ADF(1) = -1.9853, p-value = 0.6013
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -0.1504641
#De-seasonal series
xv <- diff(GDPC, lag = 4)
yv <- diff(INDIVC, lag = 4)
mod2 <- lm(yv ~ xv)
CADFtest(mod2$residuals)
##
## ADF test
##
## data: mod2$residuals
## ADF(1) = -3.8923, p-value = 0.01636
## alternative hypothesis: true delta is less than 0
## sample estimates:
## delta
## -0.448944
#Residuals Plots
par(mfrow = c(2,1))
plot.ts(mod1$residuals, main = "Residuals of regress Individual consumption on GDP per capita in level")
plot.ts(mod2$residuals, main = "Residuals of regress de-seasonal Individual consumption on de-seasonal GDP per capita")
Based on the above results, Error-correction model could be appropriate to describe the relationship between GDP per capita and individual consumption.
outputs below show the estimation results of ECM for de-seasonal series. The model is pretty good, but its residual is not white noise series (P-value of Box-test = 0.006) and the autocorrelation plot for residuals. Figure.7 shows significant correlation at lag 4 and 5. The short run equilibrium value is significant and equal 0.21, meaning that system corrects its previous period disequilibrium at a speed of 21% between variables Individual consumption and GDP per capita. ECTt-1 is one period lag error correction term or residual. The long run equilibrium Parameter is negative 0.61 and it is significant at 5% level meaning that system corrects its previous period disequilibrium at a speed of 61% quarterly. It implies that the model identified the sizable speed of adjustment by 61% of disequilibrium correction quarterly for reaching long run equilibrium steady state position.
#Error correction model.
ect.1 <- mod2$residuals[-90]
ECM <- lm(diff(yv)~ diff(xv)+ect.1)
summary(ECM)
##
## Call:
## lm(formula = diff(yv) ~ diff(xv) + ect.1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -120.245 -24.500 7.277 32.496 121.167
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.14469 5.63838 -0.203 0.839601
## diff(xv) 0.20706 0.05804 3.568 0.000592 ***
## ect.1 -0.61252 0.09621 -6.366 9.12e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 53.19 on 86 degrees of freedom
## Multiple R-squared: 0.4065, Adjusted R-squared: 0.3927
## F-statistic: 29.45 on 2 and 86 DF, p-value: 1.813e-10
Box.test(ECM$residuals, lag = 10)
##
## Box-Pierce test
##
## data: ECM$residuals
## X-squared = 24.472, df = 10, p-value = 0.006441
par(mfrow = c(1,1))
plot.ts(ECM$residuals)
acf(ECM$residuals)
AIC <- AIC(ECM)
RMSE <- sqrt(mean(ECM$residuals^2))
MAE <- mean(abs(ECM$residuals))
c("AIC" = AIC,"RMSE" = RMSE, "MAE" = MAE)
## AIC RMSE MAE
## 964.85988 52.28339 41.93912
output below shows the regression model of individual consumption on GDP per capita with SARMA terms results GDPC-SARIMA(0,1,1)(0,1,1)[4]. Which the same SARIMA model for INDIVC in previous added to it GDPC as independent variable in the equation. The criteria of goodness of fit of the model show very good model compare to ECM where Box-test is insignificant and autocorrelation plot of residual shows no significant correlation, which means that the residual is a white noise series. The regression coefficient of GDP per capita is 0.33, which means that the increasing by one unit in GDP per capita will increase individual consumption by 0.33 units simultaneously.
#X-SARIMA MODEL
multi.1 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(0,1,1), method = "ML", xreg = GDPC)
summary(multi.1)
## Series: INDIVC
## Regression with ARIMA(0,1,1)(0,1,1)[4] errors
##
## Coefficients:
## ma1 sma1 GDPC
## -0.4645 -0.8398 0.3272
## s.e. 0.0914 0.1090 0.0552
##
## sigma^2 estimated as 2135: log likelihood=-468.52
## AIC=945.03 AICc=945.51 BIC=954.99
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1570502 44.20107 35.4749 -0.01635739 0.7271296 0.2776296
## ACF1
## Training set -0.01657933
acf(multi.1$residuals, main = "Residuals OF X-SARIMA Model")
tsdiag(multi.1)
Box.test(multi.1$residuals, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: multi.1$residuals
## X-squared = 5.7445, df = 10, p-value = 0.8363