Problem 2

Consider the weekly price for Brent Crude Oil, FRED/WCOILBRENTEU. Follow the Box-Jenkins methodology to build a time series model for the log-change in the price.

Data

The data are collected from the FRED (Federal Reserve Economic Data) website. The crude oil price used in the analysis is weekly, from May 1987 to February 2016.

oilprice <- read.csv(url("http://research.stlouisfed.org/fred2/data/WCOILBRENTEU.csv"))
summary(oilprice)
##          DATE          VALUE       
##  1987-05-15:   1   Min.   :  9.44  
##  1987-05-22:   1   1st Qu.: 18.37  
##  1987-05-29:   1   Median : 27.16  
##  1987-06-05:   1   Mean   : 44.85  
##  1987-06-12:   1   3rd Qu.: 67.78  
##  1987-06-19:   1   Max.   :141.07  
##  (Other)   :1496
str(oilprice)
## 'data.frame':    1502 obs. of  2 variables:
##  $ DATE : Factor w/ 1502 levels "1987-05-15","1987-05-22",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ VALUE: num  18.6 18.5 18.6 18.7 18.8 ...
head(oilprice)
##         DATE VALUE
## 1 1987-05-15 18.58
## 2 1987-05-22 18.54
## 3 1987-05-29 18.60
## 4 1987-06-05 18.70
## 5 1987-06-12 18.75
## 6 1987-06-19 19.01
tail(oilprice)
##            DATE VALUE
## 1497 2016-01-15 29.10
## 1498 2016-01-22 27.76
## 1499 2016-01-29 31.75
## 1500 2016-02-05 32.18
## 1501 2016-02-12 30.41
## 1502 2016-02-19 32.29
plot(oilprice,xlab = "Years 1987 to 2016", ylab = "Dollars per Barrel Not Seasonally Adjusted", main="Crude Oil Prices: Brent - Europe, Weekly")

(2a.) Examine the plot of the original data and of the log-change in the price.

dlnoilprice <- diff(oilprice[,2])
plot(dlnoilprice, type = "l", xlab="Years to 2016", ylab="", main="Log-Difference in Oil Price, Weekly")
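As written, `diff(oilprice[,2])` differences the price levels, whereas a log-change applies `log()` before differencing. The following is a minimal sketch of the distinction, using only the first six prices shown by `head(oilprice)`; the names `price`, `level_change`, and `log_change` are illustrative and do not appear in the analysis.

```r
# First six weekly prices, copied from head(oilprice) above.
price <- c(18.58, 18.54, 18.60, 18.70, 18.75, 19.01)

# First difference of the levels: change in dollars per barrel.
level_change <- diff(price)

# Log-change: difference of log prices, approximately the weekly
# percentage change expressed as a fraction.
log_change <- diff(log(price))

print(cbind(level_change, log_change))
```

The two series have the same length and sign pattern but very different scales, which matters when reading the estimated variance of a fitted model.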

Observation: The series fluctuates around zero, but its spread widens noticeably for a brief period in the early 1990s and again from the 2000s onward.

(2b.) Plot the ACF and the PACF for the log-change in the price.

acf(dlnoilprice, type="correlation", lag.max=200, xlab="Weekly Lag", ylab="Correlations", main="Sample ACF")

acf(dlnoilprice, type="partial", lag.max=200, xlab="Weekly Lag", ylab="Partial Correlations", main="Sample PACF")

For an autoregressive process the ACF decays exponentially toward zero, and for a moving-average process the PACF does. In the sample data we do not see a clear cut-off in either plot; however, the pattern of oscillation changes after about 3 lags, so we consider models of order up to 3.
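The textbook patterns used for identification can be checked with the theoretical ACF and PACF from `ARMAacf()` in the `stats` package. This is a sketch only; the coefficient values `phi = 0.5` and `theta = 0.5` are arbitrary illustrative choices, not estimates from the oil price data.

```r
# Illustrative coefficients (assumed, not estimated from the data).
phi   <- 0.5   # AR(1) coefficient
theta <- 0.5   # MA(1) coefficient

# AR(1): the ACF decays geometrically, the PACF cuts off after lag 1.
acf_ar1  <- ARMAacf(ar = phi, lag.max = 5)
pacf_ar1 <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

# MA(1): the ACF cuts off after lag 1, the PACF decays.
acf_ma1  <- ARMAacf(ma = theta, lag.max = 5)
pacf_ma1 <- ARMAacf(ma = theta, lag.max = 5, pacf = TRUE)

print(round(acf_ar1, 4))
print(round(pacf_ar1, 4))
print(round(acf_ma1, 4))
print(round(pacf_ma1, 4))
```

When neither sample plot shows a sharp cut-off, as here, a mixed ARMA specification is a natural candidate alongside pure AR and MA models.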

(2c.) Diagnose the residuals for the estimated model(s). If there are several competing specifications, use the AIC, BIC, and Q statistics to compare their properties.

Constructing an AR(p) model

AR(1)

ar1 <- arima(dlnoilprice, order = c(1,0,0))
ar1
## 
## Call:
## arima(x = dlnoilprice, order = c(1, 0, 0))
## 
## Coefficients:
##          ar1  intercept
##       0.2257     0.0092
## s.e.  0.0251     0.0652
## 
## sigma^2 estimated as 3.827:  log likelihood = -3137.06,  aic = 6280.13
tsdiag(ar1, gof.lag=50)

AR(2)

ar2 <- arima(dlnoilprice, order = c(2,0,0))
ar2
## 
## Call:
## arima(x = dlnoilprice, order = c(2, 0, 0))
## 
## Coefficients:
##          ar1     ar2  intercept
##       0.2206  0.0227     0.0092
## s.e.  0.0258  0.0258     0.0667
## 
## sigma^2 estimated as 3.825:  log likelihood = -3136.68,  aic = 6281.36
tsdiag(ar2, gof.lag=50)

AR(3)

ar3 <- arima(dlnoilprice, order = c(3,0,0))
ar3
## 
## Call:
## arima(x = dlnoilprice, order = c(3, 0, 0))
## 
## Coefficients:
##          ar1     ar2     ar3  intercept
##       0.2196  0.0130  0.0436     0.0092
## s.e.  0.0258  0.0264  0.0258     0.0697
## 
## sigma^2 estimated as 3.818:  log likelihood = -3135.25,  aic = 6280.5
tsdiag(ar3, gof.lag=50)

Conclusion: The lowest AIC among the three AR models is 6280.13, which belongs to the AR(1) model. For adequacy, it is also necessary to inspect the standardized residual plot, the residual ACF, and the p-values of the Ljung-Box statistic. For AR(3), the residual ACF is closest to zero and the Ljung-Box p-values are the highest, which tells us that AR(3) is the best of the three models in terms of residual diagnostics.
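The Ljung-Box Q statistic that `tsdiag()` plots can also be computed directly with `Box.test()`. The sketch below uses a simulated AR(1) series as a stand-in so that it runs without downloading the data; the seed, the coefficient 0.2, the series length, and the choice of 20 lags are all assumptions made for illustration.

```r
set.seed(123)

# Simulated stand-in series (assumption: AR(1) with coefficient 0.2).
x <- arima.sim(model = list(ar = 0.2), n = 1500)

# Fit an AR(1) in the same way as the models above.
fit <- arima(x, order = c(1, 0, 0))

# Ljung-Box test on the residuals; fitdf removes one degree of
# freedom for the estimated AR coefficient.
lb <- Box.test(residuals(fit), lag = 20, type = "Ljung-Box", fitdf = 1)
print(lb)
```

A large p-value means no evidence of remaining autocorrelation in the residuals up to the chosen lag; applying the same call to `ar1`, `ar2`, and `ar3` would give a numeric counterpart to the diagnostic plots.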

Constructing an MA(q) model

MA(1)

ma1 <- arima(dlnoilprice, order=c(0,0,1))
ma1
## 
## Call:
## arima(x = dlnoilprice, order = c(0, 0, 1))
## 
## Coefficients:
##          ma1  intercept
##       0.2131     0.0092
## s.e.  0.0244     0.0614
## 
## sigma^2 estimated as 3.841:  log likelihood = -3139.78,  aic = 6285.56
tsdiag(ma1, gof.lag=50)

MA(2)

ma2 <- arima(dlnoilprice, order=c(0,0,2))
ma2
## 
## Call:
## arima(x = dlnoilprice, order = c(0, 0, 2))
## 
## Coefficients:
##          ma1     ma2  intercept
##       0.2180  0.0508     0.0092
## s.e.  0.0256  0.0261     0.0641
## 
## sigma^2 estimated as 3.831:  log likelihood = -3137.88,  aic = 6283.77
tsdiag(ma2, gof.lag=50)

MA(3)

ma3 <- arima(dlnoilprice, order=c(0,0,3))
ma3
## 
## Call:
## arima(x = dlnoilprice, order = c(0, 0, 3))
## 
## Coefficients:
##          ma1     ma2     ma3  intercept
##       0.2213  0.0562  0.0519     0.0092
## s.e.  0.0259  0.0263  0.0248     0.0670
## 
## sigma^2 estimated as 3.82:  log likelihood = -3135.72,  aic = 6281.44
tsdiag(ma3, gof.lag=50)

Conclusion: The lowest AIC among the three MA models is 6281.44, which belongs to the MA(3) model. For adequacy, it is also necessary to inspect the standardized residual plot, the residual ACF, and the p-values of the Ljung-Box statistic. For MA(3), the residual ACF is closest to zero and the Ljung-Box p-values are the highest, which tells us that MA(3) is the best of the three models.

Constructing an ARMA(p,q) model

AR(1)MA(1)

ar1ma1 <- arima(dlnoilprice, order = c(1,0,1))
ar1ma1
## 
## Call:
## arima(x = dlnoilprice, order = c(1, 0, 1))
## 
## Coefficients:
##          ar1      ma1  intercept
##       0.3824  -0.1667     0.0089
## s.e.  0.1383   0.1495     0.0681
## 
## sigma^2 estimated as 3.824:  log likelihood = -3136.43,  aic = 6280.86
tsdiag(ar1ma1, gof.lag=50)

AR(1)MA(2)

ar1ma2 <- arima(dlnoilprice, order = c(1,0,2))
ar1ma2
## 
## Call:
## arima(x = dlnoilprice, order = c(1, 0, 2))
## 
## Coefficients:
##          ar1      ma1      ma2  intercept
##       0.8564  -0.6406  -0.1222     0.0096
## s.e.  0.0527   0.0593   0.0304     0.0830
## 
## sigma^2 estimated as 3.801:  log likelihood = -3131.91,  aic = 6273.82
tsdiag(ar1ma2, gof.lag=50)

AR(2)MA(1)

ar2ma1 <- arima(dlnoilprice, order = c(2,0,1))
ar2ma1
## 
## Call:
## arima(x = dlnoilprice, order = c(2, 0, 1))
## 
## Coefficients:
##          ar1      ar2      ma1  intercept
##       1.0182  -0.1352  -0.8077     0.0087
## s.e.  0.0729   0.0338   0.0671     0.0826
## 
## sigma^2 estimated as 3.804:  log likelihood = -3132.52,  aic = 6275.05
tsdiag(ar2ma1, gof.lag=50)

AR(2)MA(2)

ar2ma2 <- arima(dlnoilprice, order = c(2,0,2))
ar2ma2
## 
## Call:
## arima(x = dlnoilprice, order = c(2, 0, 2))
## 
## Coefficients:
##          ar1     ar2      ma1      ma2  intercept
##       0.6771  0.1420  -0.4633  -0.2378     0.0093
## s.e.  0.3349  0.2574   0.3299   0.2068     0.0830
## 
## sigma^2 estimated as 3.8:  log likelihood = -3131.76,  aic = 6275.52
tsdiag(ar2ma2, gof.lag=50)

aic.arma = c(ar1ma1$aic,ar1ma2$aic,ar2ma1$aic,ar2ma2$aic)
aic.arma
## [1] 6280.857 6273.820 6275.049 6275.520

Conclusion: Among the four ARMA(p,q) models, the lowest AIC is 6273.82, which belongs to the ARMA(1,2) model. For adequacy, inspect the standardized residual plots, the residual ACF, and the p-values of the Ljung-Box statistic. The ARMA(1,2) model also has the highest p-values and the smallest residual ACF values, so we conclude that ARMA(1,2) is the best of the four models.
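Because only the AIC is printed by `arima()`, the BIC can be reconstructed from the reported log-likelihoods for a side-by-side comparison. The sketch below copies the log-likelihoods from the printouts above and assumes n = 1501 observations (1502 weekly prices minus one lost to differencing) and a parameter count equal to the number of printed coefficients plus one for the innovation variance; the vector names are illustrative.

```r
# Log-likelihoods copied from the arima() output above.
loglik <- c("AR(1)" = -3137.06, "AR(2)" = -3136.68, "AR(3)" = -3135.25,
            "MA(1)" = -3139.78, "MA(2)" = -3137.88, "MA(3)" = -3135.72,
            "ARMA(1,1)" = -3136.43, "ARMA(1,2)" = -3131.91,
            "ARMA(2,1)" = -3132.52, "ARMA(2,2)" = -3131.76)

# Printed coefficients per model (AR and MA terms plus the intercept).
n_coef <- c(2, 3, 4, 2, 3, 4, 3, 4, 4, 5)

k <- n_coef + 1   # assumption: add one for sigma^2
n <- 1501         # assumption: 1502 prices minus one for differencing

aic <- -2 * loglik + 2 * k
bic <- -2 * loglik + log(n) * k

print(round(cbind(AIC = aic, BIC = bic), 2))
print(names(which.min(aic)))
print(names(which.min(bic)))
```

The AIC column reproduces the printed values up to rounding. The BIC penalizes each extra parameter more heavily, so its ranking need not agree with the AIC ranking and is worth reporting next to it.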