Reading data

setwd("~/Desktop/MAR")
price <- read.csv("WMP.csv")

t <- price$Date
p <- price$EU.Price.

# Monthly ts object used throughout the analysis; its construction was
# missing from the original script (the series starts in May 2009)
time_series <- ts(p, start = c(2009, 5), frequency = 12)
head(price)
##      Date EU.Price.
## 1 2009/05       193
## 2 2009/06       194
## 3 2009/07       194
## 4 2009/08       198
## 5 2009/09       214
## 6 2009/10       239
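
The chunks below call functions from several add-on packages that this script never loads. A minimal setup sketch; the package-to-function mapping is an assumption based on the function names used later:

library(forecast)    # Acf(), tsCV(), naive()
library(MARX)        # selection.lag(), marx.t(), forecast.marx()
library(fBasics)     # bdsTest()
library(bivariate)   # kbvpdf()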

# Plots

The dataset starts in May 2009; let's plot the data to see its general trend.

The ACF and PACF are plotted as well.

# Raw series with a lowess smoother to highlight the general trend
plot(time_series, ylab = "EU WMP price")
lines(lowess(time(time_series), time_series), col = "orange")

# ACF and PACF over all 132 monthly lags
par(mfrow = c(1, 2))
Acf(time_series, lag.max = 132)
Acf(time_series, type = "partial", lag.max = 132)

From 05/2009 to 04/2020 there are exactly 132 months (11 years). I set the number of lags equal to the number of months, so we can observe the correlation of the series with itself at every available lag; the same setting is used for the PACF.
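
A quick sanity check on the sample length:

length(p)                  # 132 monthly observations
range(time(time_series))   # 2009.333 (May 2009) to 2020.25 (Apr 2020)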

Decomposition

According to Wold's decomposition theorem, a time series can usually be represented as the sum of a deterministic part and a stochastic part: standard techniques are used to model the deterministic component, after which the remaining component is analysed and modelled. This project assumes that dairy commodity prices contain a fundamental component, taken here to be a nonlinear deterministic trend, which I try to capture in the following analysis. The graphs below show the different components of the series; in the next part I find a way to capture the trend and remove it.

de <-decompose(time_series,"additive")
plot(de)

plot(de$x,ylab="observation")
lines(de$trend,col="green")


The first task in this part is to disentangle the fundamental component from the bubble component of the dairy commodity prices. I use polynomial regression to capture the trend and remove it: linear, quadratic, cubic and quartic polynomial regressions are fitted.

# Centred monthly time index and its powers
mon <- time(time_series) - mean(time(time_series))
mon2 <- mon^2
mon3 <- mon^3
mon4 <- mon^4

reg2 <- lm(time_series ~ mon)                        # linear
reg3 <- lm(time_series ~ mon + mon2)                 # quadratic
reg1 <- lm(time_series ~ mon + mon2 + mon3)          # cubic
reg4 <- lm(time_series ~ mon + mon2 + mon3 + mon4)   # quartic

plot(p, type = "l")
lines(fitted(reg1), col = "red")
lines(fitted(reg2), col = "cyan2")
lines(fitted(reg3), col = "green")
lines(fitted(reg4), col = "gold")
legend("topright",
       c("Cubic polynomial regression", "Linear trend",
         "Quadratic polynomial regression", "Quartic polynomial regression"),
       col = c("red", "cyan2", "green", "gold"),
       lwd = 2.5, lty = 1, bty = "n")

From the graph above, the cubic and quartic polynomial regressions track the series best. Comparing the significance of the parameters via R's summary output, the p-values show that all terms of the cubic regression are significant, while the quartic fit leaves its quadratic and quartic terms insignificant; I therefore use the cubic polynomial regression to capture the trend.

summary(reg1)
## 
## Call:
## lm(formula = time_series ~ mon + mon2 + mon3)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -68.009 -19.599  -1.834  19.408  74.205 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 287.5724     4.5920  62.625  < 2e-16 ***
## mon         -15.4335     2.4106  -6.402 2.65e-09 ***
## mon2         -0.8540     0.3395  -2.516   0.0131 *  
## mon3          0.8899     0.1217   7.309 2.57e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 35.17 on 128 degrees of freedom
## Multiple R-squared:  0.3202, Adjusted R-squared:  0.3043 
## F-statistic:  20.1 on 3 and 128 DF,  p-value: 9.746e-11
summary(reg4)
## 
## Call:
## lm(formula = time_series ~ mon + mon2 + mon3 + mon4)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -65.644 -18.021  -2.771  16.281  78.922 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 280.90067    5.67797  49.472  < 2e-16 ***
## mon         -15.43348    2.38436  -6.473 1.90e-09 ***
## mon2          1.35228    1.17571   1.150   0.2522    
## mon3          0.88986    0.12042   7.389 1.74e-11 ***
## mon4         -0.08511    0.04347  -1.958   0.0524 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 34.79 on 127 degrees of freedom
## Multiple R-squared:  0.3402, Adjusted R-squared:  0.3194 
## F-statistic: 16.37 on 4 and 127 DF,  p-value: 7.741e-11
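
An equivalent check is a nested-model F test comparing the quadratic, cubic and quartic fits; a short sketch, not part of the original analysis:

# reg3 (quadratic) vs reg1 (cubic) vs reg4 (quartic): the cubic term adds
# significant explanatory power, the quartic term does not
anova(reg3, reg1, reg4)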

Defining the dairy commodity price \(P_t\) as a mixed causal and noncausal model: \[ P_t = fundamental_t + Y_t ,\] \[Y_t = P_t - fundamental_t ,\] \[\Phi(L)\Psi(L^{-1})Y_t=\epsilon_t ,\] \[(1-\alpha_1 L)(1-\alpha_2 L^{-1})Y_t=\epsilon_t ,\] where \(\Phi(L)=(1-\phi_1L- \dots -\phi_rL^{r})\) and \(\Psi(L^{-1})=(1-\psi_1L^{-1}- \dots -\psi_sL^{-s})\). The fundamental represents the deterministic part, which is to be removed, and the fitted 3rd-degree polynomial serves this purpose. With \(t\) denoting the centred time index (mon), the estimated fundamental and bubble components are: \[fundamental_t = 287.5724 - 15.4335t - 0.8540t^2 + 0.8899t^3 \ ;\] \[Y_t = P_t - 287.5724 + 15.4335t + 0.8540t^2 - 0.8899t^3 \ .\]
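
Equivalently, the bubble component can be computed directly from the estimated coefficients; a small sketch (it reproduces reg1$residuals up to numerical precision):

b <- coef(reg1)
fundamental <- b[1] + b[2] * mon + b[3] * mon2 + b[4] * mon3
Y <- as.numeric(time_series) - fundamental   # bubble component Y_t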

Because the price of whole milk powder fluctuated considerably between 2009 and 2020, the 3rd-degree polynomial separates the fundamental from the bubble component only in broad terms, with a residual standard error of 35.17 on 128 degrees of freedom, which is acceptable for this purpose.

After detrending, the residual plot and the corresponding ACF and PACF look as follows:

plot(reg1$residuals,type="l",col="blue",lwd=3)

par(mfrow=c(1,2))
Acf(reg1$residuals,lag.max = 132)
Acf(reg1$residuals,type = "partial",lag.max = 132)


# Models Estimated

In this part we estimate and diagnose the models. First we set the autoregressive order p, then use it to fit different AR and MAR models and compare their performance in predicting bubble bursts. Following Hecq, we assume the errors follow a Student's t distribution because of the sudden changes in the series; we then use the BDS test (Brock, Dechert and Scheinkman) to test the iid-ness of the residuals, whose null hypothesis is iid residuals.

## Order p setting

In this part the value of p is set based on information criteria: the Bayesian information criterion (BIC), the Akaike information criterion (AIC) and the Hannan-Quinn information criterion (HQ) are considered. The results of the three criteria are reported below, together with the order at which each attains its minimum. Hecq has shown in simulations that BIC is preferable, so we follow BIC and set p = 2.

x1 <- reg1$fitted.values   # estimated trend, used as exogenous regressor
y1 <- reg1$residuals       # detrended series (bubble component)

selection.lag(y1, x1, 8)   # criteria for pseudo-causal AR orders p = 0..8
## Order Selection Criteria Pseudo Causal Model: 
##   
## BAYESIAN INFORMATION CRITERION 
##         p = 0    p = 1    p = 2    p = 3    p = 4    p = 5    p = 6    p = 7
## [1,] 10.00149 7.779264 7.404106 7.415615 7.458618 7.500369 7.529357 7.564225
##         p = 8
## [1,] 7.525035
##   
## Minimum value attained at p = 2  
##   
## AKAIKE INFORMATION CRITERION 
##         p = 0    p = 1    p = 2   p = 3    p = 4    p = 5    p = 6    p = 7
## [1,] 9.957812 7.713419 7.315874 7.30477 7.324929 7.343603 7.349276 7.360586
##         p = 8
## [1,] 7.297593
##   
## Minimum value attained at p = 8  
##   
## HANNAN-QUINN INFORMATION CRITERION 
##         p = 0    p = 1    p = 2    p = 3    p = 4    p = 5    p = 6    p = 7
## [1,] 9.975561 7.740175 7.351726 7.349809 7.379248 7.407295 7.422437 7.443314
##         p = 8
## [1,] 7.389985
##   
## Minimum value attained at p = 3  
## 
## $bic
##         p = 0    p = 1    p = 2    p = 3    p = 4    p = 5    p = 6    p = 7
## [1,] 10.00149 7.779264 7.404106 7.415615 7.458618 7.500369 7.529357 7.564225
##         p = 8
## [1,] 7.525035
## 
## $aic
##         p = 0    p = 1    p = 2   p = 3    p = 4    p = 5    p = 6    p = 7
## [1,] 9.957812 7.713419 7.315874 7.30477 7.324929 7.343603 7.349276 7.360586
##         p = 8
## [1,] 7.297593
## 
## $hq
##         p = 0    p = 1    p = 2    p = 3    p = 4    p = 5    p = 6    p = 7
## [1,] 9.975561 7.740175 7.351726 7.349809 7.379248 7.407295 7.422437 7.443314
##         p = 8
## [1,] 7.389985

Having set p = 2, we estimate the MAR(2,0), MAR(0,2) and MAR(1,1) models in the following parts to find the best fit for our sample.

MAR(2,0) Model

We first analyse the MAR(2,0) model, which is the purely causal AR(2) model (two lags, no leads, matching the marx.t call below): \[y_t = \phi_1 y_{t-1}+\phi_2 y_{t-2}+\epsilon_t ,\] where \(\epsilon_t\) follows an iid Student's t distribution with location 0 and scale parameter \(\lambda\), \(\epsilon_t \sim t(0, \lambda)\).

The estimated parameters are displayed below:

df <- data.frame("$\\phi_1$" = 1.5002308, "$\\phi_2$" = -0.5675223,
                 check.names = FALSE)
knitr::kable(df, row.names = FALSE)

| \(\phi_1\) | \(\phi_2\) |
|-----------:|-----------:|
|  1.5002308 | -0.5675223 |

The residual plot of the MAR(2,0) model:

# lag = 2, lead = 0: purely causal AR(2)
mar20 <- marx.t(y1, x1, 2, 0)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
plot(mar20$residuals, ylab = "residuals", type = "l", main = "MAR(2,0) residuals plot")

BDS Test:

# Values transcribed from bdsTest(mar20$residuals, 12); `pval` avoids
# overwriting the price vector p defined earlier
m <- 2:12
statistics <- c(0.818, 1.1651, 1.4883, 0.9349, -4.2622, -4.0646,
                -3.2212, -2.622, -2.1789, -1.8407, -1.5759)
pval <- c(0.4133, 0.244, 0.1367, 0.3498, 2.024e-05, 4.811e-05,
          0.001277, 0.008741, 0.02934, 0.06566, 0.115)

df <- data.frame(m, statistics, p.value = pval)
knitr::kable(df)
|  m | statistic |   p-value |
|---:|----------:|----------:|
|  2 |    0.8180 | 0.4133000 |
|  3 |    1.1651 | 0.2440000 |
|  4 |    1.4883 | 0.1367000 |
|  5 |    0.9349 | 0.3498000 |
|  6 |   -4.2622 | 0.0000202 |
|  7 |   -4.0646 | 0.0000481 |
|  8 |   -3.2212 | 0.0012770 |
|  9 |   -2.6220 | 0.0087410 |
| 10 |   -2.1789 | 0.0293400 |
| 11 |   -1.8407 | 0.0656600 |
| 12 |   -1.5759 | 0.1150000 |
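
The tabulated values can also be pulled straight from the test object instead of being transcribed by hand; a sketch (the slot layout of fBasics' fHTEST object is an assumption):

bds <- bdsTest(mar20$residuals, 12)
round(bds@test$statistic, 4)   # BDS statistics
round(bds@test$p.value, 4)     # corresponding p-values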

Most p-values for the MAR(2,0) model are above the 0.05 significance level, but several (m = 6 to 10) are clearly significant, so we still reject the null hypothesis of iid residuals.

MAR(0,2) Model

Next I analyse the MAR(0,2) model, the purely noncausal AR(2) (no lags, two leads): \[y_t = \psi_1 y_{t+1}+\psi_2 y_{t+2}+\epsilon_t ,\] where \(\epsilon_t\) follows an iid Student's t distribution with location 0 and scale parameter \(\lambda\), \(\epsilon_t \sim t(0, \lambda)\). The residual plot of the estimated MAR(0,2) model is shown below.

# lag = 0, lead = 2: purely noncausal AR(2)
mar02 <- marx.t(y1, x1, 0, 2)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced

## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
plot(mar02$residuals, type = "l", ylab = "residuals", main = "MAR(0,2) residuals plot", col = "yellow2")

bdsTest(mar02$residuals,12)
## 
## Title:
##  BDS Test
## 
## Test Results:
##   PARAMETER:
##     Max Embedding Dimension: 12
##     eps[1]: 4.488
##     eps[2]: 8.976
##     eps[3]: 13.464
##     eps[4]: 17.951
##   STATISTIC:
##     eps[1] m=2: -0.0326
##     eps[1] m=3: -0.5532
##     eps[1] m=4: -0.127
##     eps[1] m=5: -0.0922
##     eps[1] m=6: 1.6426
##     eps[1] m=7: 2.6807
##     eps[1] m=8: 2.1888
##     eps[1] m=9: 3.4398
##     eps[1] m=10: -1.13
##     eps[1] m=11: -0.9393
##     eps[1] m=12: -0.7907
##     eps[2] m=2: 0.4693
##     eps[2] m=3: -0.169
##     eps[2] m=4: 0.2399
##     eps[2] m=5: 0.1824
##     eps[2] m=6: 0.512
##     eps[2] m=7: 0.702
##     eps[2] m=8: 0.7899
##     eps[2] m=9: 1.339
##     eps[2] m=10: 1.798
##     eps[2] m=11: 2.4158
##     eps[2] m=12: 3.2123
##     eps[3] m=2: 0.8748
##     eps[3] m=3: 0.2023
##     eps[3] m=4: 0.393
##     eps[3] m=5: 0.248
##     eps[3] m=6: 0.3449
##     eps[3] m=7: 0.4848
##     eps[3] m=8: 0.6715
##     eps[3] m=9: 1.0157
##     eps[3] m=10: 1.3135
##     eps[3] m=11: 1.5618
##     eps[3] m=12: 1.7386
##     eps[4] m=2: 1.0529
##     eps[4] m=3: 0.357
##     eps[4] m=4: 0.185
##     eps[4] m=5: -0.1022
##     eps[4] m=6: -0.0846
##     eps[4] m=7: -0.0311
##     eps[4] m=8: 0.2932
##     eps[4] m=9: 0.5805
##     eps[4] m=10: 0.7954
##     eps[4] m=11: 0.8904
##     eps[4] m=12: 0.9042
##   P VALUE:
##     eps[1] m=2: 0.974 
##     eps[1] m=3: 0.5801 
##     eps[1] m=4: 0.8989 
##     eps[1] m=5: 0.9266 
##     eps[1] m=6: 0.1005 
##     eps[1] m=7: 0.007347 
##     eps[1] m=8: 0.02861 
##     eps[1] m=9: 0.0005822 
##     eps[1] m=10: 0.2585 
##     eps[1] m=11: 0.3476 
##     eps[1] m=12: 0.4291 
##     eps[2] m=2: 0.6389 
##     eps[2] m=3: 0.8658 
##     eps[2] m=4: 0.8104 
##     eps[2] m=5: 0.8553 
##     eps[2] m=6: 0.6086 
##     eps[2] m=7: 0.4827 
##     eps[2] m=8: 0.4296 
##     eps[2] m=9: 0.1806 
##     eps[2] m=10: 0.07217 
##     eps[2] m=11: 0.0157 
##     eps[2] m=12: 0.001317 
##     eps[3] m=2: 0.3817 
##     eps[3] m=3: 0.8397 
##     eps[3] m=4: 0.6943 
##     eps[3] m=5: 0.8041 
##     eps[3] m=6: 0.7301 
##     eps[3] m=7: 0.6278 
##     eps[3] m=8: 0.5019 
##     eps[3] m=9: 0.3098 
##     eps[3] m=10: 0.189 
##     eps[3] m=11: 0.1183 
##     eps[3] m=12: 0.08211 
##     eps[4] m=2: 0.2924 
##     eps[4] m=3: 0.7211 
##     eps[4] m=4: 0.8532 
##     eps[4] m=5: 0.9186 
##     eps[4] m=6: 0.9326 
##     eps[4] m=7: 0.9752 
##     eps[4] m=8: 0.7694 
##     eps[4] m=9: 0.5616 
##     eps[4] m=10: 0.4264 
##     eps[4] m=11: 0.3732 
##     eps[4] m=12: 0.3659 
## 
## Description:
##  Mon May 24 09:58:15 2021 by user:

For the MAR(0,2) model the p-values are likewise mostly above 0.05, with a handful of clearly significant values, so we again reject the null hypothesis of iid residuals.

MAR(1,1) Model

Now let's focus on the mixed causal-noncausal model. Having set p = 2, the estimated mixed model is MAR(1,1): \[(1-\phi_1 L)(1-\psi_1L^{-1})y_t =\epsilon_t .\] The residual plot of the estimated MAR(1,1) model is:

# lag = 1, lead = 1: mixed causal-noncausal MAR(1,1)
mar11 <- marx.t(y1, x1, 1, 1)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced

plot(mar11$residuals,type="l",ylab="residuals",main="MAR(1,1) Residuals plot",col="green3")

bdsTest(mar11$residuals,12)
## 
## Title:
##  BDS Test
## 
## Test Results:
##   PARAMETER:
##     Max Embedding Dimension: 12
##     eps[1]: 4.607
##     eps[2]: 9.213
##     eps[3]: 13.82
##     eps[4]: 18.427
##   STATISTIC:
##     eps[1] m=2: 0.5621
##     eps[1] m=3: 0.1782
##     eps[1] m=4: -0.0541
##     eps[1] m=5: -0.689
##     eps[1] m=6: -1.4531
##     eps[1] m=7: -1.0002
##     eps[1] m=8: -1.7427
##     eps[1] m=9: -1.3982
##     eps[1] m=10: -1.1445
##     eps[1] m=11: -0.9517
##     eps[1] m=12: -0.8015
##     eps[2] m=2: 0.9539
##     eps[2] m=3: 0.6102
##     eps[2] m=4: 1.178
##     eps[2] m=5: 1.5311
##     eps[2] m=6: 2.2204
##     eps[2] m=7: 2.8897
##     eps[2] m=8: 3.4708
##     eps[2] m=9: 3.9696
##     eps[2] m=10: 4.6495
##     eps[2] m=11: 4.9381
##     eps[2] m=12: 5.6557
##     eps[3] m=2: 1.5691
##     eps[3] m=3: 0.9836
##     eps[3] m=4: 1.096
##     eps[3] m=5: 1.1456
##     eps[3] m=6: 1.5478
##     eps[3] m=7: 1.9405
##     eps[3] m=8: 2.2678
##     eps[3] m=9: 2.4402
##     eps[3] m=10: 2.6658
##     eps[3] m=11: 2.8404
##     eps[3] m=12: 3.0159
##     eps[4] m=2: 1.8518
##     eps[4] m=3: 1.143
##     eps[4] m=4: 0.9934
##     eps[4] m=5: 0.8474
##     eps[4] m=6: 1.0009
##     eps[4] m=7: 1.207
##     eps[4] m=8: 1.5402
##     eps[4] m=9: 1.7486
##     eps[4] m=10: 1.8612
##     eps[4] m=11: 1.865
##     eps[4] m=12: 1.8817
##   P VALUE:
##     eps[1] m=2: 0.5741 
##     eps[1] m=3: 0.8586 
##     eps[1] m=4: 0.9568 
##     eps[1] m=5: 0.4908 
##     eps[1] m=6: 0.1462 
##     eps[1] m=7: 0.3172 
##     eps[1] m=8: 0.08139 
##     eps[1] m=9: 0.1621 
##     eps[1] m=10: 0.2524 
##     eps[1] m=11: 0.3412 
##     eps[1] m=12: 0.4228 
##     eps[2] m=2: 0.3401 
##     eps[2] m=3: 0.5417 
##     eps[2] m=4: 0.2388 
##     eps[2] m=5: 0.1258 
##     eps[2] m=6: 0.02639 
##     eps[2] m=7: 0.003856 
##     eps[2] m=8: 0.0005189 
##     eps[2] m=9: 7.2e-05 
##     eps[2] m=10: 3.328e-06 
##     eps[2] m=11: 7.889e-07 
##     eps[2] m=12: 1.552e-08 
##     eps[3] m=2: 0.1166 
##     eps[3] m=3: 0.3253 
##     eps[3] m=4: 0.2731 
##     eps[3] m=5: 0.252 
##     eps[3] m=6: 0.1217 
##     eps[3] m=7: 0.05232 
##     eps[3] m=8: 0.02334 
##     eps[3] m=9: 0.01468 
##     eps[3] m=10: 0.00768 
##     eps[3] m=11: 0.004506 
##     eps[3] m=12: 0.002562 
##     eps[4] m=2: 0.06405 
##     eps[4] m=3: 0.253 
##     eps[4] m=4: 0.3205 
##     eps[4] m=5: 0.3968 
##     eps[4] m=6: 0.3169 
##     eps[4] m=7: 0.2274 
##     eps[4] m=8: 0.1235 
##     eps[4] m=9: 0.08036 
##     eps[4] m=10: 0.06271 
##     eps[4] m=11: 0.06218 
##     eps[4] m=12: 0.05988 
## 
## Description:
##  Mon May 24 09:58:16 2021 by user:

At the smallest embedding distance (eps[1]) all p-values in the BDS output above exceed 0.05, so for the MAR(1,1) model we fail to reject the null hypothesis of iid residuals.

plot(mar20$residuals, ylab = "residuals", type = "l", ylim = c(-50, 50))
lines(mar02$residuals, col = "yellow2")
lines(mar11$residuals, col = "green3")

legend("topright", c("MAR(2,0)", "MAR(0,2)", "MAR(1,1)"),
       col = c("black", "yellow2", "green3"), lwd = 2.5, lty = 1, bty = "n")

Lag plots of the residuals of the three models:

# Scatterplots of residuals(t) against residuals(t - h) for h = 1..m,
# one panel per lag
lag.plot(mar20$residuals, 12, do.lines = FALSE, main = "Lag plot for MAR(2,0)")

lag.plot(mar11$residuals, 9, do.lines = FALSE, main = "Lag plot for MAR(1,1)")

lag.plot(mar02$residuals, 9, do.lines = FALSE, main = "Lag plot for MAR(0,2)")

The following values are the cross-validated MSEs (naive forecasts at horizons h = 1 and h = 2) of the MAR(2,0), MAR(0,2) and MAR(1,1) residuals; MAR(0,2) attains the lowest values at both horizons:

# Wrap each residual series as a ts and compute rolling-origin forecast errors
ts1 <- ts(mar20$residuals, start = 1, frequency = 130)
ts2 <- ts(mar02$residuals, start = 1, frequency = 130)
ts3 <- ts(mar11$residuals, start = 1, frequency = 130)
e1 <- tsCV(ts1, forecastfunction = naive, h = 2)
e2 <- tsCV(ts2, forecastfunction = naive, h = 2)
e3 <- tsCV(ts3, forecastfunction = naive, h = 2)
mse1 <- colMeans(e1^2, na.rm = TRUE)
mse2 <- colMeans(e2^2, na.rm = TRUE)
mse3 <- colMeans(e3^2, na.rm = TRUE)
mse1; mse2; mse3
##      h=1      h=2 
## 145.6210 186.9647
##      h=1      h=2 
## 138.5147 180.7669
##      h=1      h=2 
## 141.2022 191.6916

Forecasting

In this part I examine how the MAR(1,1) model performs in predicting the dairy commodity price. The mixed causal-noncausal model introduced above can be written as \[y_t= \frac{1}{\Phi(L)}\frac{1}{\Psi(L^{-1})}\epsilon_t ,\] where the causal component \(v_t = \Psi(L^{-1})y_t\) satisfies \(\epsilon_t=\Phi(L)v_t\) and depends on past and present errors, while the noncausal component \(u_t=\Phi(L)y_t\) satisfies \(\epsilon_t=\Psi(L^{-1})u_t\) and depends on present and future errors. We therefore use the noncausal component \(u_t\) to predict the future values \(y_{T+1}, y_{T+2},\dots,y_{T+H}\), where H denotes the horizon. Following Hecq and Voisin, the prediction used here is based on a closed-form estimator of the predictive density of the unobserved component \(u\); we estimate the forecast density of the noncausal component \(u_t\) at different horizons:

\[\prod (u_{T+1},\dots,u_{T+H}\mid\hat{u}_T)\] \[=\{ \hat{g}(\hat{u}_T - \hat{\psi}u_{T+1})\, \hat{g}(u_{T+1} - \hat{\psi}u_{T+2})\cdots \hat{g}(u_{T+H-1} - \hat{\psi}u_{T+H})\} \{\sum_{t=1}^{T-1} \hat{g}(u_{T+H} - \hat{\psi}\hat{u}_t)\} \{ \sum_{t=1}^{T-1} \hat{g}(\hat{u}_{T} - \hat{\psi}\hat{u}_t) \}^{-1}\]

where \(\hat{g}\) is the Student's t density: \[g(\epsilon_t; \sigma, v)= \frac{\Gamma(\frac{v+1}{2})}{\Gamma(\frac{v}{2})\sqrt{\pi v}\,\sigma}\Big(1+\frac{1}{v}\big(\frac{\epsilon_t}{\sigma}\big)^2\Big)^{-\frac{v+1}{2}} ,\] with \(\sigma\) the scale parameter, \(v\) the degrees of freedom (\(v > 0\)), and \(\Gamma(\cdot)\) the gamma function.
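
For instance, at horizon H = 1 the expression above reduces to a single ratio (a worked special case): \[\prod(u_{T+1}\mid \hat{u}_T)= \hat{g}(\hat{u}_T - \hat{\psi}u_{T+1})\, \frac{\sum_{t=1}^{T-1}\hat{g}(u_{T+1}-\hat{\psi}\hat{u}_t)}{\sum_{t=1}^{T-1}\hat{g}(\hat{u}_T-\hat{\psi}\hat{u}_t)} \ .\]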

Then I adopt the sampling importance resampling (SIR) method, as introduced by Hecq and Voisin, to simulate future noncausal components \(u_{T+1}^{s}, u_{T+2}^s,\dots, u_{T+H}^s\), and use them to compute the future values \(y_{T+1}^s, y_{T+2}^s,\dots, y_{T+H}^s\); 10,000 simulations are used in the forecasts below.
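
A minimal sketch of these two steps for one horizon: building the noncausal component from the fitted causal coefficient, then one SIR draw. The helper and its argument names are hypothetical, and the coefficient, scale and df field names are assumptions about the marx.t output:

# noncausal component u_t = y_t - phi_1 * y_{t-1}
phi1 <- mar11$coef.c                    # causal coefficient (assumed field name)
n <- length(y1)
u <- y1[-1] - phi1 * y1[-n]

# hypothetical SIR helper: draw candidates for u_{T+1}, weight, resample
sir_draws <- function(u_T, u_hist, psi, scale, df, n = 10000) {
  cand <- sample(u_hist, n, replace = TRUE)           # proposal: empirical marginal of u
  w <- dt((u_T - psi * cand) / scale, df) / scale     # weight by t density of u_T - psi * u_{T+1}
  sample(cand, n, replace = TRUE, prob = w / sum(w))  # resample proportionally to the weights
}

draws <- sir_draws(u[length(u)], u, psi = mar11$coef.nc,
                   scale = mar11$scale, df = mar11$df)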

Next, we use the MAR(1,1) model to forecast the values of \(y_{T+1}, y_{T+2},\dots, y_{T+H}\).

Let's see how the estimated MAR(1,1) model forecasts the density of the detrended process y at horizon H = 1. The model can be written as \(y_t = \psi_1 y_{t+1} +\sum_{j=0}^{\infty}\phi_1^j (\beta_1 x_{1,t-j}+\epsilon_{t-j})\), with \([\psi_1,\phi_1] = [0.9601, 0.4941]\) and \(\beta = 4\), \(x_{1,t} \stackrel{iid}{\sim} t(1,0)\), where \(X_t = [x_{1,t},\dots,x_{q,t}]' \in \mathbb{R}^q\) collects the exogenous variables for \(y_t\) and \(\beta\) is their parameter vector, computed by the MARX package during forecasting; \(\epsilon_t \stackrel{iid}{\sim} t(4,0)\), where \(t(v,\sigma)\) denotes the Student's t distribution. The predictive density plot of \(y_{T+1}\) is shown below:

fn1 <- forecast.marx(y1,p_C=1,p_NC = 1,h=2)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
fn2 <- forecast.marx(y1,p_C=1,p_NC = 1,h=3)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
fn3 <- forecast.marx(y1,p_C=1,p_NC = 1,h=4)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
fn4 <- forecast.marx(y1,p_C=1,p_NC = 1,h=5)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
fn5 <- forecast.marx(y1,p_C=1,p_NC = 1,h=6)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
fn6 <- forecast.marx(y1,p_C=1,p_NC = 1,h=7)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
fn7 <- forecast.marx(y1,p_C=1,p_NC = 1,h=8)
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
plot(density(fn1),main="",xlab="y(T+1)")

par(mfrow=c(2,3))
plot(density(fn1),xlim=c(-70,-35),main="",xlab="y(T+1)",lwd=3,col="lightseagreen")
plot(density(fn2),xlim=c(-70,-35),main="",xlab="y(T+2)",lwd=2,col="lightskyblue")
plot(density(fn3),xlim=c(-70,-30),main="",xlab="y(T+3)",lwd=2,col="lightskyblue1")
plot(density(fn4),xlim=c(-70,-30),main="",xlab="y(T+4)",lwd=3,col="lightskyblue2")
plot(density(fn5),xlim=c(-70,-30),main="",xlab="y(T+5)",lwd=3,col="lightskyblue3")
plot(density(fn6),xlim=c(-70,-30),main="",xlab="y(T+6)",lwd=3,col="lightskyblue4")

#plot(density(fn7),xlim=c(-70,-30),main="",xlab="y(T+7)",lwd=3,col="lightslateblue")

3D plotting

Finally, a bivariate kernel density estimate over the forecast densities at the first two horizons is plotted as a 3D surface.

d1 <-  density(forecast.marx(mar11$residuals,p_C=1,p_NC = 1,h=2))
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
d2 <- density(forecast.marx(mar11$residuals,p_C=1,p_NC = 1,h=3))
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in sqrt(df1 * pi * sig1^2): NaNs produced
## Warning in log(1 + (E/sig1)^2/df1): NaNs produced
## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length

## Warning in y.for[j] <- object$coefficients[1]/(1 - sum(coef.noncaus)) + : number
## of items to replace is not a multiple of replacement length
# Bivariate kernel density over the two forecast densities
f <- kbvpdf(d1$y, d2$y, 0.7, 7)
plot(f, TRUE, xlab = "y(T+1)", ylab = "y(T+2)")