Consider the weekly price for Brent Crude Oil, FRED/WCOILBRENTEU. Follow the Box-Jenkins methodology to build a time series model for the log-change in the price.
The data are collected from the FRED (Federal Reserve Economic Data) website. The crude oil price used in the analysis is weekly data from May 1987 to February 2016.
oilprice <- read.csv(url("http://research.stlouisfed.org/fred2/data/WCOILBRENTEU.csv"))
summary(oilprice)
## DATE VALUE
## 1987-05-15: 1 Min. : 9.44
## 1987-05-22: 1 1st Qu.: 18.37
## 1987-05-29: 1 Median : 27.16
## 1987-06-05: 1 Mean : 44.85
## 1987-06-12: 1 3rd Qu.: 67.78
## 1987-06-19: 1 Max. :141.07
## (Other) :1496
str(oilprice)
## 'data.frame': 1502 obs. of 2 variables:
## $ DATE : Factor w/ 1502 levels "1987-05-15","1987-05-22",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ VALUE: num 18.6 18.5 18.6 18.7 18.8 ...
head(oilprice)
## DATE VALUE
## 1 1987-05-15 18.58
## 2 1987-05-22 18.54
## 3 1987-05-29 18.60
## 4 1987-06-05 18.70
## 5 1987-06-12 18.75
## 6 1987-06-19 19.01
tail(oilprice)
## DATE VALUE
## 1497 2016-01-15 29.10
## 1498 2016-01-22 27.76
## 1499 2016-01-29 31.75
## 1500 2016-02-05 32.18
## 1501 2016-02-12 30.41
## 1502 2016-02-19 32.29
plot(oilprice,xlab = "Years 1987 to 2016", ylab = "Dollars per Barrel Not Seasonally Adjusted", main="Crude Oil Prices: Brent - Europe, Weekly")
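# NOTE: this is the first difference of the price level, and the model output below is based on it.
# A true log-change would be diff(log(oilprice[,2])).
dlnoilprice <- diff(oilprice[,2])
plot(dlnoilprice, type = "l", xlab="Week index, 1987 to 2016", ylab="", main="First Difference in Oil Price, Weekly")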
Observation: The differenced series fluctuates around zero, but its spread widens noticeably for a brief period in the early 1990s and again from the start of the new millennium onward.
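For reference, a log-change series is normally formed by differencing the logarithm of the price rather than the price itself. The sketch below contrasts the two on the first six prices printed by `head(oilprice)`; the names `price6`, `level_change` and `log_change` are illustrative and do not appear in the analysis.

```r
# First six weekly Brent prices, copied from the head(oilprice) output
price6 <- c(18.58, 18.54, 18.60, 18.70, 18.75, 19.01)

# Level change: difference of the prices, in dollars per barrel
level_change <- diff(price6)

# Log-change: difference of the log prices, roughly the weekly
# percentage change expressed as a fraction
log_change <- diff(log(price6))

round(level_change, 2)
round(log_change, 4)
```

The log-change is scale-free, so a one-dollar move at a price near 20 counts for more than the same move at a price near 100, which is not the case for the level change.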
acf(dlnoilprice, type="correlation", lag.max=200, xlab="Weekly Lag", ylab="Correlations", main="Sample ACF")
acf(dlnoilprice, type="partial", lag.max=200, xlab="Weekly Lag", ylab="Correlations", main="Sample PACF")
For an autoregressive process the ACF decays exponentially toward zero, and for a moving-average process the PACF does. In the sample data we do not see a sharp cut-off in either plot; however, the pattern of oscillation appears to change after about 3 lags, so we consider models up to order 3.
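The identification rule can be illustrated on simulated data where the true order is known. The sketch below is only an illustration under assumed settings: the coefficient 0.6, the sample size and the seed are arbitrary, and the band `1.96 / sqrt(n)` is the usual large-sample approximation for white noise.

```r
set.seed(1)
n <- 1500

# Simulated AR(1) and MA(1) series; the coefficient 0.6 is arbitrary
ar_sim <- arima.sim(model = list(ar = 0.6), n = n)
ma_sim <- arima.sim(model = list(ma = 0.6), n = n)

# Approximate 95% band for a sample autocorrelation under white noise
band <- 1.96 / sqrt(n)

# Sample ACF and PACF values without drawing the plots
ar_acf  <- acf(ar_sim, lag.max = 10, plot = FALSE)$acf[-1]  # drop lag 0
ar_pacf <- pacf(ar_sim, lag.max = 10, plot = FALSE)$acf
ma_acf  <- acf(ma_sim, lag.max = 10, plot = FALSE)$acf[-1]  # drop lag 0
ma_pacf <- pacf(ma_sim, lag.max = 10, plot = FALSE)$acf

# Lags at which each statistic falls outside the band
which(abs(ar_acf)  > band)
which(abs(ar_pacf) > band)
which(abs(ma_acf)  > band)
which(abs(ma_pacf) > band)
```

In theory the AR(1) series should show many significant ACF lags but a PACF that cuts off after lag 1, and the MA(1) series the reverse; in a finite sample a few stray lags may still cross the band.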
ar1 <- arima(dlnoilprice, order = c(1,0,0))
ar1
##
## Call:
## arima(x = dlnoilprice, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## 0.2257 0.0092
## s.e. 0.0251 0.0652
##
## sigma^2 estimated as 3.827: log likelihood = -3137.06, aic = 6280.13
tsdiag(ar1, gof.lag=50)
ar2 <- arima(dlnoilprice, order = c(2,0,0))
ar2
##
## Call:
## arima(x = dlnoilprice, order = c(2, 0, 0))
##
## Coefficients:
## ar1 ar2 intercept
## 0.2206 0.0227 0.0092
## s.e. 0.0258 0.0258 0.0667
##
## sigma^2 estimated as 3.825: log likelihood = -3136.68, aic = 6281.36
tsdiag(ar2, gof.lag=50)
ar3 <- arima(dlnoilprice, order = c(3,0,0))
ar3
##
## Call:
## arima(x = dlnoilprice, order = c(3, 0, 0))
##
## Coefficients:
## ar1 ar2 ar3 intercept
## 0.2196 0.0130 0.0436 0.0092
## s.e. 0.0258 0.0264 0.0258 0.0697
##
## sigma^2 estimated as 3.818: log likelihood = -3135.25, aic = 6280.5
tsdiag(ar3, gof.lag=50)
Conclusion: Among the three AR models, AR(1) has the lowest AIC (6280.13), narrowly ahead of AR(3) (6280.5). AIC alone does not establish adequacy, so it is also necessary to inspect the standardized residual plot, the residual ACF, and the p-values of the Ljung-Box statistic. For AR(3) the residual autocorrelations are closest to zero and the Ljung-Box p-values are the highest, which suggests that AR(3) is the best of the three models.
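The AIC values can be checked by hand from the printed log-likelihoods, since `arima()` reports AIC as minus twice the log-likelihood plus twice the number of estimated parameters. The sketch below assumes the parameter count is the p AR coefficients plus the intercept plus the innovation variance; the vectors `loglik` and `k` are illustrative names.

```r
# Log-likelihoods as printed in the AR(1), AR(2) and AR(3) summaries
loglik <- c(AR1 = -3137.06, AR2 = -3136.68, AR3 = -3135.25)

# Estimated parameters: p AR coefficients + intercept + innovation variance
k <- c(AR1 = 3, AR2 = 4, AR3 = 5)

# AIC = -2 * log-likelihood + 2 * (number of estimated parameters)
aic <- -2 * loglik + 2 * k
aic

# Model with the smallest AIC
names(which.min(aic))
```

The results agree with the printed AIC values up to the rounding of the log-likelihoods, and the gap between AR(1) and AR(3) is well under one unit.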
ma1 <- arima(dlnoilprice, order=c(0,0,1))
ma1
##
## Call:
## arima(x = dlnoilprice, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## 0.2131 0.0092
## s.e. 0.0244 0.0614
##
## sigma^2 estimated as 3.841: log likelihood = -3139.78, aic = 6285.56
tsdiag(ma1, gof.lag=50)
ma2 <- arima(dlnoilprice, order=c(0,0,2))
ma2
##
## Call:
## arima(x = dlnoilprice, order = c(0, 0, 2))
##
## Coefficients:
## ma1 ma2 intercept
## 0.2180 0.0508 0.0092
## s.e. 0.0256 0.0261 0.0641
##
## sigma^2 estimated as 3.831: log likelihood = -3137.88, aic = 6283.77
tsdiag(ma2, gof.lag=50)
ma3 <- arima(dlnoilprice, order=c(0,0,3))
ma3
##
## Call:
## arima(x = dlnoilprice, order = c(0, 0, 3))
##
## Coefficients:
## ma1 ma2 ma3 intercept
## 0.2213 0.0562 0.0519 0.0092
## s.e. 0.0259 0.0263 0.0248 0.0670
##
## sigma^2 estimated as 3.82: log likelihood = -3135.72, aic = 6281.44
tsdiag(ma3, gof.lag=50)
Conclusion: Among the three MA models, MA(3) has the lowest AIC (6281.44). For adequacy it is also necessary to inspect the standardized residual plot, the residual ACF, and the p-values of the Ljung-Box statistic. For MA(3) the residual autocorrelations are closest to zero and the Ljung-Box p-values are the highest, which confirms that MA(3) is the best of the three models.
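The Ljung-Box p-values that `tsdiag()` draws can also be obtained as numbers with `Box.test()`. The sketch below runs on a simulated MA(1) series rather than the oil data; the coefficient 0.2, the lag of 20 and the seed are arbitrary choices, and `fitdf = 1` is an assumption that one degree of freedom should be removed for the single fitted MA coefficient.

```r
set.seed(42)

# Simulated MA(1) series; the coefficient 0.2 is arbitrary
y <- arima.sim(model = list(ma = 0.2), n = 1500)

# Fit an MA(1) model and extract its residuals
fit <- arima(y, order = c(0, 0, 1))
res <- residuals(fit)

# Ljung-Box test on the residuals at lag 20; fitdf removes one
# degree of freedom for the estimated MA coefficient
lb <- Box.test(res, lag = 20, type = "Ljung-Box", fitdf = 1)
lb$statistic
lb$p.value
```

A large p-value means the test does not reject the hypothesis that the residuals are uncorrelated up to the chosen lag, which is what an adequate model should produce.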
ar1ma1 <- arima(dlnoilprice, order = c(1,0,1))
ar1ma1
##
## Call:
## arima(x = dlnoilprice, order = c(1, 0, 1))
##
## Coefficients:
## ar1 ma1 intercept
## 0.3824 -0.1667 0.0089
## s.e. 0.1383 0.1495 0.0681
##
## sigma^2 estimated as 3.824: log likelihood = -3136.43, aic = 6280.86
tsdiag(ar1ma1, gof.lag=50)
ar1ma2 <- arima(dlnoilprice, order = c(1,0,2))
ar1ma2
##
## Call:
## arima(x = dlnoilprice, order = c(1, 0, 2))
##
## Coefficients:
## ar1 ma1 ma2 intercept
## 0.8564 -0.6406 -0.1222 0.0096
## s.e. 0.0527 0.0593 0.0304 0.0830
##
## sigma^2 estimated as 3.801: log likelihood = -3131.91, aic = 6273.82
tsdiag(ar1ma2, gof.lag=50)
ar2ma1 <- arima(dlnoilprice, order = c(2,0,1))
ar2ma1
##
## Call:
## arima(x = dlnoilprice, order = c(2, 0, 1))
##
## Coefficients:
## ar1 ar2 ma1 intercept
## 1.0182 -0.1352 -0.8077 0.0087
## s.e. 0.0729 0.0338 0.0671 0.0826
##
## sigma^2 estimated as 3.804: log likelihood = -3132.52, aic = 6275.05
tsdiag(ar2ma1, gof.lag=50)
ar2ma2 <- arima(dlnoilprice, order = c(2,0,2))
ar2ma2
##
## Call:
## arima(x = dlnoilprice, order = c(2, 0, 2))
##
## Coefficients:
## ar1 ar2 ma1 ma2 intercept
## 0.6771 0.1420 -0.4633 -0.2378 0.0093
## s.e. 0.3349 0.2574 0.3299 0.2068 0.0830
##
## sigma^2 estimated as 3.8: log likelihood = -3131.76, aic = 6275.52
tsdiag(ar2ma2, gof.lag=50)
aic.arma = c(ar1ma1$aic,ar1ma2$aic,ar2ma1$aic,ar2ma2$aic)
aic.arma
## [1] 6280.857 6273.820 6275.049 6275.520
Conclusion: Among the four ARMA(p,q) models, ARMA(1,2) has the lowest AIC (6273.82). For adequacy, inspect the standardized residual plots, the residual ACF plots, and the p-values of the Ljung-Box statistic. The ARMA(1,2) model also has the highest p-values and the smallest residual autocorrelations, so we conclude that ARMA(1,2) is the best of the four models.
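The four ARMA fits can be collected in a loop so that the AIC comparison is produced in one table. The sketch below uses a simulated ARMA(1,1) series instead of the oil data; the coefficients, seed, `method = "ML"` and the helper name `fit_aic` are all assumptions made for the illustration.

```r
set.seed(7)

# Simulated ARMA(1,1) series; the coefficients are arbitrary
y <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 1000)

# Candidate orders: p and q each in {1, 2}, as in the four fits above
grid <- expand.grid(p = 1:2, q = 1:2)

# Hypothetical helper: fit ARMA(p, q) and return its AIC,
# or NA if the optimizer fails for that order
fit_aic <- function(p, q) {
  tryCatch(arima(y, order = c(p, 0, q), method = "ML")$aic,
           error = function(e) NA_real_)
}

# Fit each candidate and record its AIC
grid$aic <- mapply(fit_aic, grid$p, grid$q)

# Candidates sorted from lowest to highest AIC
grid[order(grid$aic), ]
```

On real data the same table is only a starting point, because the residual diagnostics still have to be checked for the model with the lowest AIC.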