Introduction

This paper aims to analyse two economics time series about Belgium, and these series are GDPC and INDIVC, where the main goal is analysing the series univariately and jointly and then coming up with a good model can be used for interpretation and forecasting. The notation GDPC stands for the Gross domestic product per capita at market prices with euro unit as published on EUROSTAT website quarterly from 1995Q1 till 2018Q2. The notation INDIVC stands for the actual individual consumption with euro unit as published on EUROSTAT website quarterly from 1995Q1 till 2018Q2, therefore the number of observations for these series are 94 obs.

#Read Data and define the variables.
library(readxl)
Belgium_data <- read_excel("C:/Users/Hatem/Desktop/Master Degree/Second Year/Time Series/Assignment/GDP and Cons per capita-belgium.xls", sheet = "Belgium")
Belgium <- data.frame(Belgium_data)
GDPC <- ts(Belgium$GDPC, frequency = 4, start = c(1995,1))# GDP per capita
INDIVC <- ts(Belgium$AIC, frequency = 4, start = c(1995,1))# Individual Consumption

Data Description

Table.1 shows that the quarterly average of GDP per capita during period of study approximately is 7563 euros with standard deviation 1393 euros, whereas the quarterly average of individual consumption approximately is 4994 euros with 955 euros standard deviation. From the plots of GDPC and INDIVC in level which appear in Figure.1, it is shown that there are increasing deterministic trend and seasonal pattern. Figure.2 presents the seasonal plots for GDPC and INDIVC.

kable(describe(GDPC), caption = "Descriptive Statistics of GDP")

Descriptive Statistics of GDP
	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	94	7562.766	1393.378	7850	7567.105	1704.99	5300	10200	4900	-0.1076051	-1.25883	143.716

kable(describe(INDIVC), caption = "Descriptive Statistics of Indiviual consumpption")

Descriptive Statistics of Indiviual consumpption
	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
X1	1	94	4993.617	954.7627	5000	4986.842	1334.34	3400	6700	3300	0.0020217	-1.38225	98.47625

dGDPC <- diff(GDPC)
dINDIVC <- diff(INDIVC)
sdGDPC <- diff(diff(GDPC, lag = 4))
sdINDIVC <- diff(diff(INDIVC, lag = 4))

Stationary

To analyse the time series and build a forecasting model for, using statistical methodology like (ARMA), the series must be stationary. Stationarity can be checked from line plot, or by unit root tests. Many statistical tests can be used to test that if a time series has a unit root or not. Augmented Dickey-Fuller test (ADF) was used here. Figure.1 shows the series plots in level and first difference, which reveals that the series are non-stationary in level, but they became stationary after first difference, the results of ADF-test below emphasize that, because the test doesn’t reject the null hypotheses (Ho: Series are non-stationary) at 0.05 when the time series in level, but the null hypotheses are rejected when the time series in the first difference, therefore the time series of GDPC and INDIVC are stationary after one difference, which means each one of them is integrated from the first order. Moreover, Figure.1 shows the de-seasonal time series which look stationary as well, and ADF-test in Table.2 emphasize that again.

par(mfrow = c(3,2))
plot(GDPC, ylab = "GDP per capita", type = "o", main = "Time series in level")
plot(INDIVC, ylab = "Individual consumption", type = "o", main = "Time series in level")
plot(dGDPC, ylab = "GDP per capita", type = "o", main = "Time series after first difference")
plot(dINDIVC, ylab = "Individual consumption", type = "o", main = "Time series after first difference")
plot(sdGDPC, ylab = "GDP per capita", type = "o", main = "Time series after seasonal and first differences")
plot(sdINDIVC, ylab = "Individual consumption", type = "o", main = "Time series after seasonal and first differences")

Figure 1: GDPC and INDIVC in level, first difference and difference of de-seasonality

par(mfrow = c(2,2))
seasonplot(GDPC)
seasonplot(INDIVC)
seasonplot(diff(GDPC))
seasonplot(diff(INDIVC))

Figure 2: Seasonal plots of GDPC and INDIVC in level and first differences.

#Stationary Test.
CADFtest(GDPC)

## 
##  ADF test
## 
## data:  GDPC
## ADF(1) = -2.6118, p-value = 0.2762
## alternative hypothesis: true delta is less than 0
## sample estimates:
##      delta 
## -0.2712786

CADFtest(INDIVC)

## 
##  ADF test
## 
## data:  INDIVC
## ADF(1) = -3.2601, p-value = 0.07956
## alternative hypothesis: true delta is less than 0
## sample estimates:
##      delta 
## -0.3890632

# Stationary test after first difference.
CADFtest(dGDPC)

## 
##  ADF test
## 
## data:  dGDPC
## ADF(1) = -8.0392, p-value = 3.243e-09
## alternative hypothesis: true delta is less than 0
## sample estimates:
##     delta 
## -1.704165

CADFtest(dINDIVC)

## 
##  ADF test
## 
## data:  dINDIVC
## ADF(1) = -8.9753, p-value = 9.156e-11
## alternative hypothesis: true delta is less than 0
## sample estimates:
##     delta 
## -1.865636

#stationary after de-seasonality
CADFtest(sdGDPC)

## 
##  ADF test
## 
## data:  sdGDPC
## ADF(1) = -5.9849, p-value = 1.317e-05
## alternative hypothesis: true delta is less than 0
## sample estimates:
##      delta 
## -0.9186413

CADFtest(sdINDIVC)

## 
##  ADF test
## 
## data:  sdINDIVC
## ADF(1) = -9.0231, p-value = 1.048e-10
## alternative hypothesis: true delta is less than 0
## sample estimates:
##     delta 
## -1.603567

Univariate Analysis

Specification

To specify an ARMA or SARMA model for a univariate stationary time series, autocorrelation and partial autocorrelation functions plots were used. Figures.3 shows the Acf and Pacf for stationary de-seasonal GDPC and INDIVC series, initially for GDPC series one can suggest SARIMA(0,1,0)(0,1,1)[4] because the Acf has significant correlation at lag=4 while Pacf does not have a nice structure. For INDIV series one can suggest SARIMA (1,1,0)(2,1,0)[4] as an initial model, because Pacf has significant correlations at lag=1,4 and 8, or SARIMA(0,1,1)(0,1,1)[4] as an initial model, because acf has significant correlations at lag=1 and 4.

#Models Specification using correlogram
par(mfrow = c(2,2))
#Acf and Parial acf for GDP.Capita after first diff. and seasonal diff.
acf(diff(diff((GDPC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal GDP per Capita")
pacf(diff(diff((GDPC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal GDP per Capita")
#Acf and Parial acf for Indiv.C after first diff. and seasonal diff.
acf(diff(diff((INDIVC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal Individual consumption")
pacf(diff(diff((INDIVC), lag = 4)), lag.max = 30, main = "First difference of de-seasonal Individual consumption")

Figure 3: Ac and Pac functions for stationary de-seasonal GDPC and INDIV.

Estimation and Selection

Based on what is suggested before, the initial models were estimated beside group of different models were estimated by trial. The R outputs below show goodness of fit criteria (RMSE, MAE, AIC, BIC), to help with selecting the best model, statistically the model that has smallest values of those criteria is considered the best. Moreover Box-test significance and number of significant coefficients are considered for more help at model selection. With respect to GDP per capita, the criteria point out that SARIMA(1,1,1)(0,1,1)[4] is the best model, with all its coefficients are statistically significant at 0.05 level.

GDP Per Capita

GDP per capita ~ ARIMA(0,1,0)(0,1,1)[4]

GDPC.model1 <- Arima(GDPC, order = c(0,1,0), seasonal = c(0,1,1), method = "ML")
summary(GDPC.model1)

## Series: GDPC 
## ARIMA(0,1,0)(0,1,1)[4] 
## 
## Coefficients:
##          sma1
##       -0.4645
## s.e.   0.0871
## 
## sigma^2 estimated as 7754:  log likelihood=-524.8
## AIC=1053.6   AICc=1053.74   BIC=1058.58
## 
## Training set error measures:
##                    ME     RMSE      MAE        MPE      MAPE     MASE
## Training set 2.937524 85.20072 62.59226 0.03726042 0.8289887 0.291881
##                     ACF1
## Training set -0.07458765

tsdiag(GDPC.model1)

Figure 4:Residuals Diagnostics

Box.test(GDPC.model1$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  GDPC.model1$residuals
## X-squared = 17.851, df = 10, p-value = 0.05752

GDP per capita ~ ARIMA(0,1,0)(1,1,1)[4]

GDPC.model2 <- Arima(GDPC, order = c(0,1,0), seasonal = c(1,1,1), method = "ML")
summary(GDPC.model2)

## Series: GDPC 
## ARIMA(0,1,0)(1,1,1)[4] 
## 
## Coefficients:
##          sar1     sma1
##       -0.0942  -0.3977
## s.e.   0.1951   0.1730
## 
## sigma^2 estimated as 7819:  log likelihood=-524.69
## AIC=1055.37   AICc=1055.65   BIC=1062.84
## 
## Training set error measures:
##                    ME     RMSE     MAE        MPE      MAPE      MASE
## Training set 2.941644 85.06881 62.5358 0.03729303 0.8277721 0.2916177
##                     ACF1
## Training set -0.06151828

tsdiag(GDPC.model2)

Figure 5:Residuals Diagnostics

Box.test(GDPC.model2$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  GDPC.model2$residuals
## X-squared = 18.063, df = 10, p-value = 0.0539

GDP per capita ~ ARIMA(0,1,1)(1,1,1)[4]

GDPC.model3 <- Arima(GDPC, order = c(0,1,1), seasonal = c(1,1,1), method = "ML")
summary(GDPC.model3)

## Series: GDPC 
## ARIMA(0,1,1)(1,1,1)[4] 
## 
## Coefficients:
##           ma1     sar1     sma1
##       -0.0584  -0.0641  -0.4359
## s.e.   0.1016   0.1954   0.1754
## 
## sigma^2 estimated as 7875:  log likelihood=-524.52
## AIC=1057.04   AICc=1057.52   BIC=1067
## 
## Training set error measures:
##                    ME     RMSE      MAE        MPE      MAPE      MASE
## Training set 3.290439 84.88136 62.90117 0.04310862 0.8329311 0.2933215
##                      ACF1
## Training set -0.005495464

tsdiag(GDPC.model3)

Figure 6:Residuals Diagnostics

Box.test(GDPC.model3$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  GDPC.model3$residuals
## X-squared = 17.875, df = 10, p-value = 0.0571

GDP per capita ~ ARIMA(1,1,1)(0,1,1)[4]

GDPC.model4 <- Arima(GDPC, order = c(1,1,1), seasonal = c(0,1,1), method = "ML")
summary(GDPC.model4)

## Series: GDPC 
## ARIMA(1,1,1)(0,1,1)[4] 
## 
## Coefficients:
##           ar1     ma1     sma1
##       -0.9956  0.9352  -0.6831
## s.e.   0.0102  0.0747   0.0893
## 
## sigma^2 estimated as 6851:  log likelihood=-519.39
## AIC=1046.78   AICc=1047.26   BIC=1056.74
## 
## Training set error measures:
##                    ME     RMSE     MAE        MPE      MAPE      MASE
## Training set 4.448005 79.16761 58.9668 0.06662214 0.7859778 0.2749747
##                    ACF1
## Training set 0.04169637

tsdiag(GDPC.model4)

Figure 7:Residuals Diagnostics

Box.test(GDPC.model4$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  GDPC.model4$residuals
## X-squared = 7.5577, df = 10, p-value = 0.672

With respect to individual consumption series, the criteria point out that SARIMA(0,1,1)(0,1,1)[4] is the best model, with its both coefficients are statistically significant at 0.05 level. Accordingly, the results of model estimation are shown below.

Individual Consumption

Consumption ~ ARIMA(1,1,0)(2,1,0)[4]

INDIVC.model1 <- Arima(INDIVC, order = c(1,1,0), seasonal = c(2,1,0), method = "ML")
summary(INDIVC.model1)

## Series: INDIVC 
## ARIMA(1,1,0)(2,1,0)[4] 
## 
## Coefficients:
##           ar1     sar1     sar2
##       -0.4057  -0.4701  -0.3448
## s.e.   0.1038   0.1079   0.1056
## 
## sigma^2 estimated as 3115:  log likelihood=-483.55
## AIC=975.09   AICc=975.57   BIC=985.05
## 
## Training set error measures:
##                      ME     RMSE      MAE         MPE      MAPE      MASE
## Training set -0.2087445 53.38094 38.03756 -0.01403091 0.7752754 0.2976853
##                    ACF1
## Training set 0.01196252

tsdiag(INDIVC.model1)

Figure 8:Residuals Diagnostics

Box.test(INDIVC.model1$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  INDIVC.model1$residuals
## X-squared = 7.3313, df = 10, p-value = 0.6938

Consumption ~ ARIMA(0,1,1)(2,1,0)[4]

INDIVC.model2 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(2,1,0), method = "ML")
summary(INDIVC.model2)

## Series: INDIVC 
## ARIMA(0,1,1)(2,1,0)[4] 
## 
## Coefficients:
##           ma1     sar1     sar2
##       -0.3947  -0.4566  -0.3308
## s.e.   0.0968   0.1107   0.1053
## 
## sigma^2 estimated as 3138:  log likelihood=-483.85
## AIC=975.69   AICc=976.17   BIC=985.65
## 
## Training set error measures:
##                     ME     RMSE      MAE          MPE      MAPE      MASE
## Training set 0.1548785 53.58194 38.35637 -0.005964718 0.7853136 0.3001803
##                      ACF1
## Training set -0.003050254

tsdiag(INDIVC.model2)

Figure 9:Residuals Diagnostics

Box.test(INDIVC.model2$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  INDIVC.model2$residuals
## X-squared = 8.1558, df = 10, p-value = 0.6136

Consumption ~ ARIMA(0,1,1)(0,1,1)[4]

INDIVC.model3 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(0,1,1), method = "ML")
summary(INDIVC.model3)

## Series: INDIVC 
## ARIMA(0,1,1)(0,1,1)[4] 
## 
## Coefficients:
##           ma1     sma1
##       -0.3887  -0.6233
## s.e.   0.0946   0.0934
## 
## sigma^2 estimated as 3030:  log likelihood=-483.07
## AIC=972.14   AICc=972.43   BIC=979.61
## 
## Training set error measures:
##                     ME     RMSE      MAE         MPE      MAPE      MASE
## Training set 0.6352748 52.95694 38.72873 0.007073504 0.7922808 0.3030944
##                      ACF1
## Training set -0.001996647

tsdiag(INDIVC.model3)

Figure 10:Residuals Diagnostics

Box.test(INDIVC.model3$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  INDIVC.model3$residuals
## X-squared = 7.5114, df = 10, p-value = 0.6764

Consumption ~ ARIMA(0,1,1)(2,1,1)[4]

INDIVC.model4 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(2,1,1), method = "ML")
summary(INDIVC.model4)

## Series: INDIVC 
## ARIMA(0,1,1)(2,1,1)[4] 
## 
## Coefficients:
##           ma1    sar1     sar2     sma1
##       -0.4189  0.0687  -0.1589  -0.6021
## s.e.   0.0972  0.1937   0.1398   0.1676
## 
## sigma^2 estimated as 3022:  log likelihood=-482.03
## AIC=974.05   AICc=974.77   BIC=986.49
## 
## Training set error measures:
##                     ME     RMSE     MAE         MPE      MAPE      MASE
## Training set 0.6596787 52.27499 38.1716 0.009033021 0.7796661 0.2987342
##                      ACF1
## Training set -0.007305399

tsdiag(INDIVC.model4)

Figure 11:Residuals Diagnostics

Box.test(INDIVC.model4$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  INDIVC.model4$residuals
## X-squared = 6.0151, df = 10, p-value = 0.814

Validation

To check the validity of the models, residuals correlograms were plotted, and the Ljung-Box statistic (Q-statistic) was calculated to test the white noise of residuals. Figures 7 and 10 show the standardised residuals, autocorrelation functions of residuals and p-values of Ljung-Box statistic at different lags, for GDP per capita model and individual consumptions model respectively. For GDP per capita model, the Ljung-Box statistic was highly insignificant at different lags, and the Acf shows insignificant correlation except at lag =7, which is not that high, so it is ignored. For individual consumption model, the Ljung-Box statistic was highly insignificant at different lags, and the residuals Acf does not show any significant correlation.

Forecasting

Using the suggested models, the time series were fitted, and as mentioned before the models have the lowest values of criteria (RMSE, AIC, MAE). Figure 12 presents comparison between actual time series and fitted one, in addition to forecasted values for 10 future time points (10 quarter = 2.5 years). The figure reveals that there are highly matching between actual and fitted series, which means the model could be good enough for forecasting. Tables below present, the forecasted values of our times series using the chosen models.

GDP

Time	Point	Lo95	Hi95
2018Q3	9575.757	9413.534	9737.980
2018Q4	10389.451	10166.850	10612.052
2019Q1	9627.420	9352.005	9902.836
2019Q2	10220.162	9905.358	10534.967
2019Q3	9774.710	9394.309	10155.112
2019Q4	10609.520	10179.363	11039.678
2020Q1	9826.466	9346.243	10306.690
2020Q2	10440.139	9919.594	10960.685
2020Q3	9973.849	9389.800	10557.898
2020Q4	10829.405	10194.397	11464.414

Consumption

Time	Point	Lo95	Hi95
2018Q3	6501.271	6393.383	6609.160
2018Q4	6820.712	6694.263	6947.161
2019Q1	6456.144	6313.530	6598.757
2019Q2	6726.139	6569.015	6883.263
2019Q3	6632.888	6443.019	6822.758
2019Q4	6952.329	6741.868	7162.791
2020Q1	6587.761	6358.550	6816.972
2020Q2	6857.756	6611.217	7104.295
2020Q3	6764.506	6485.117	7043.894
2020Q4	7083.947	6781.572	7386.322

#Actual, fitted and Forecasting plot
win.graph(width=15,height=12,pointsize=8)
par(mfrow = c(2,1))
plot(forecast(GDPC.model4, h = 10), ylab = "GDP per capita")
lines(GDPC.model4$fitted, col = "red")
legend(1995, 11000, legend = c("Atual", "Fitted", "Forecasting"),
       col = c("black", "red", "blue"), lty=1, cex=0.8)
forecast(GDPC.model4, h = 10)

##         Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 2018 Q3       9575.757  9469.685  9681.829  9413.534  9737.980
## 2018 Q4      10389.451 10243.900 10535.002 10166.850 10612.052
## 2019 Q1       9627.420  9447.336  9807.505  9352.005  9902.836
## 2019 Q2      10220.162 10014.323 10426.002  9905.358 10534.967
## 2019 Q3       9774.710  9525.979 10023.441  9394.309 10155.112
## 2019 Q4      10609.520 10328.255 10890.785 10179.363 11039.678
## 2020 Q1       9826.466  9512.465 10140.468  9346.243 10306.690
## 2020 Q2      10440.139 10099.773 10780.506  9919.594 10960.685
## 2020 Q3       9973.849  9591.960 10355.738  9389.800 10557.898
## 2020 Q4      10829.405 10414.196 11244.615 10194.397 11464.414

plot(forecast(INDIVC.model3, h = 10), ylab = "Individual Consumption")
lines(INDIVC.model3$fitted, col = "red")
legend(1995, 7000, legend = c("Atual", "Fitted", "Forecasting"),
       col = c("black", "red", "blue"), lty=1, cex=0.8)

Figure 12: Actual, fitted, and forecasting times series of GDPC and INDIVC

forecast(INDIVC.model3, h = 10)

##         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2018 Q3       6501.271 6430.727 6571.816 6393.383 6609.160
## 2018 Q4       6820.712 6738.032 6903.392 6694.263 6947.161
## 2019 Q1       6456.144 6362.894 6549.394 6313.530 6598.757
## 2019 Q2       6726.139 6623.401 6828.877 6569.015 6883.263
## 2019 Q3       6632.888 6508.740 6757.037 6443.019 6822.758
## 2019 Q4       6952.329 6814.716 7089.943 6741.868 7162.791
## 2020 Q1       6587.761 6437.888 6737.634 6358.550 6816.972
## 2020 Q2       6857.756 6696.553 7018.960 6611.217 7104.295
## 2020 Q3       6764.506 6581.823 6947.188 6485.117 7043.894
## 2020 Q4       7083.947 6886.234 7281.659 6781.572 7386.322

Multivariate Analysis

multivariate times series analysis is used to develop a model describes the relationship between GDP per capita and individual consumption, based on Economic aspect, GDP per capita should affect in individual consumption positively, which means that the increasing in GDP per capita should leads to increasing in individual consumption. Based on that Economic aspect, different multivariate models will be estimated, to see which one suits data better.

Co-integration test

Engle-Granger approach is used to test the cointegration between GDPC and INDIVC, because the model is simple with two variables. The cointegration will be tested for time series in level, and de-seasonal time series. The results of stationary test for residuals of \(INDVI_t = \beta_0 + \beta_1GDPC_t+\epsilon_t\) shows that ADF-stat is significant for the residuals of regression of de-seasonal series, which means the (Ho) is rejected, therefore the residual is stationary, and the de-seasonal series are cointegrated.

#Co-integration test
#level Series
mod1 <- lm(INDIVC~GDPC)
CADFtest(mod1$residuals)

## 
##  ADF test
## 
## data:  mod1$residuals
## ADF(1) = -1.9853, p-value = 0.6013
## alternative hypothesis: true delta is less than 0
## sample estimates:
##      delta 
## -0.1504641

#De-seasonal series
xv <- diff(GDPC, lag = 4)
yv <- diff(INDIVC, lag = 4)
mod2 <- lm(yv ~ xv)
CADFtest(mod2$residuals)

## 
##  ADF test
## 
## data:  mod2$residuals
## ADF(1) = -3.8923, p-value = 0.01636
## alternative hypothesis: true delta is less than 0
## sample estimates:
##     delta 
## -0.448944

#Residuals Plots
par(mfrow = c(2,1))
plot.ts(mod1$residuals, main = "Residuals of regress Individual consumption on GDP per capita in level")
plot.ts(mod2$residuals, main = "Residuals of regress de-seasonal Individual consumption on de-seasonal GDP per capita")

Based on the above results, Error-correction model could be appropriate to describe the relationship between GDP per capita and individual consumption.

ECM Model

outputs below show the estimation results of ECM for de-seasonal series. The model is pretty good, but its residual is not white noise series (P-value of Box-test = 0.006) and the autocorrelation plot for residuals. Figure.7 shows significant correlation at lag 4 and 5. The short run equilibrium value is significant and equal 0.21, meaning that system corrects its previous period disequilibrium at a speed of 21% between variables Individual consumption and GDP per capita. ECTt-1 is one period lag error correction term or residual. The long run equilibrium Parameter is negative 0.61 and it is significant at 5% level meaning that system corrects its previous period disequilibrium at a speed of 61% quarterly. It implies that the model identified the sizable speed of adjustment by 61% of disequilibrium correction quarterly for reaching long run equilibrium steady state position.

#Error correction model.
ect.1 <- mod2$residuals[-90]
ECM <- lm(diff(yv)~ diff(xv)+ect.1)
summary(ECM)

## 
## Call:
## lm(formula = diff(yv) ~ diff(xv) + ect.1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -120.245  -24.500    7.277   32.496  121.167 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -1.14469    5.63838  -0.203 0.839601    
## diff(xv)     0.20706    0.05804   3.568 0.000592 ***
## ect.1       -0.61252    0.09621  -6.366 9.12e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 53.19 on 86 degrees of freedom
## Multiple R-squared:  0.4065, Adjusted R-squared:  0.3927 
## F-statistic: 29.45 on 2 and 86 DF,  p-value: 1.813e-10

Box.test(ECM$residuals, lag = 10)

## 
##  Box-Pierce test
## 
## data:  ECM$residuals
## X-squared = 24.472, df = 10, p-value = 0.006441

par(mfrow = c(1,1))
plot.ts(ECM$residuals)

acf(ECM$residuals)

AIC <- AIC(ECM)
RMSE <- sqrt(mean(ECM$residuals^2))
MAE <- mean(abs(ECM$residuals))
c("AIC" = AIC,"RMSE" = RMSE, "MAE" = MAE)

##       AIC      RMSE       MAE 
## 964.85988  52.28339  41.93912

X-SARIMA Model

output below shows the regression model of individual consumption on GDP per capita with SARMA terms results GDPC-SARIMA(0,1,1)(0,1,1)[4]. Which the same SARIMA model for INDIVC in previous added to it GDPC as independent variable in the equation. The criteria of goodness of fit of the model show very good model compare to ECM where Box-test is insignificant and autocorrelation plot of residual shows no significant correlation, which means that the residual is a white noise series. The regression coefficient of GDP per capita is 0.33, which means that the increasing by one unit in GDP per capita will increase individual consumption by 0.33 units simultaneously.

#X-SARIMA MODEL
multi.1 <- Arima(INDIVC, order = c(0,1,1), seasonal = c(0,1,1), method = "ML", xreg = GDPC)
summary(multi.1)

## Series: INDIVC 
## Regression with ARIMA(0,1,1)(0,1,1)[4] errors 
## 
## Coefficients:
##           ma1     sma1    GDPC
##       -0.4645  -0.8398  0.3272
## s.e.   0.0914   0.1090  0.0552
## 
## sigma^2 estimated as 2135:  log likelihood=-468.52
## AIC=945.03   AICc=945.51   BIC=954.99
## 
## Training set error measures:
##                      ME     RMSE     MAE         MPE      MAPE      MASE
## Training set -0.1570502 44.20107 35.4749 -0.01635739 0.7271296 0.2776296
##                     ACF1
## Training set -0.01657933

acf(multi.1$residuals, main = "Residuals OF X-SARIMA Model")

tsdiag(multi.1)

Box.test(multi.1$residuals, lag = 10, type = "Ljung-Box")

## 
##  Box-Ljung test
## 
## data:  multi.1$residuals
## X-squared = 5.7445, df = 10, p-value = 0.8363

Time series Analysis

Hatem Jamil

2018 M10 30