We begin by plotting the series and looking at the ACF and PACF plots. The ACF seems to decay very slowly, a sign of non-stationarity:
load("/Users/arminda/Desktop/Arminda/TS/TS1-Session/TS-New/milk.RData")
library("astsa")
plot(globtemp.ts)
acf2(globtemp.ts)
We take first-order differences and confirm, by a series of tests, that one first-order difference is enough to make the series stationary:
library("tseries")
adf.test(diff(globtemp.ts))
##
## Augmented Dickey-Fuller Test
##
## data: diff(globtemp.ts)
## Dickey-Fuller = -8.3444, Lag order = 4, p-value = 0.01
## alternative hypothesis: stationary
kpss.test(diff(globtemp.ts))
##
## KPSS Test for Level Stationarity
##
## data: diff(globtemp.ts)
## KPSS Level = 0.10493, Truncation lag parameter = 4, p-value = 0.1
sd(globtemp.ts)
## [1] 0.2566048
sd(diff(globtemp.ts))
## [1] 0.1499603
sd(diff(diff(globtemp.ts)))
## [1] 0.2381071
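Note that the standard deviation is smallest for the first differences and increases again when we difference twice, a classic symptom of over-differencing. As an additional cross-check (a sketch, not part of the original output), the forecast package provides ndiffs(), which estimates the required number of differences from the same kind of unit-root tests:
library("forecast")
# Automated estimate of the differencing order; both calls should point to d = 1
ndiffs(globtemp.ts, test = "adf")
ndiffs(globtemp.ts, test = "kpss")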
The ACF and PACF of the first differences:
library("forecast")
acf2(diff(globtemp.ts))
The ACF and PACF of the first differences look much more plausible for an ARMA model. If the first differences were an AR or an MA process, we would expect either the ACF or the PACF to cut off after a finite lag. Since this doesn’t seem to happen, we will propose an ARMA model; the simplest candidate is ARMA(1,1), so let’s start with that. As we have taken first-order differences on the original series to make it stationary, we have to fit an ARIMA(1,1,1) model to the original series:
model.1=Arima(globtemp.ts,order=c(1,1,1),include.constant=TRUE)
summary(model.1)
## Series: globtemp.ts
## ARIMA(1,1,1) with drift
##
## Coefficients:
## ar1 ma1 drift
## 0.2695 -0.8180 0.0061
## s.e. 0.1122 0.0624 0.0030
##
## sigma^2 = 0.01722: log likelihood = 77.05
## AIC=-146.11 AICc=-145.77 BIC=-134.83
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -0.002162733 0.1291073 0.09925874 Inf Inf 0.8277124 0.04097333
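As a rough significance check on these estimates (a sketch relying on the asymptotic normal approximation and the variance matrix stored in the fitted object), each coefficient can be divided by its standard error:
# Approximate z-statistics: estimate / standard error
coef(model.1) / sqrt(diag(model.1$var.coef))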
The AR coefficient is estimated to be \(0.27\) (with a standard error of \(0.11\)) and the MA coefficient to be \(-0.818\) (with a standard error of \(0.06\)). The mean of the differenced sequence, which corresponds to the drift of the original sequence, is estimated at \(0.006\), with a standard error of \(0.003\). Some diagnostics and diagnostic plots follow:
jarque.bera.test(model.1$residuals)
##
## Jarque Bera Test
##
## data: model.1$residuals
## X-squared = 1.496, df = 2, p-value = 0.4733
tsdiag(model.1)
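To complement the tsdiag plots with an explicit number (a sketch; the choice of lag is illustrative), a Ljung-Box test can be run on the residuals, with fitdf accounting for the two estimated ARMA coefficients:
# Ljung-Box test of residual autocorrelation up to lag 10
Box.test(model.1$residuals, lag = 10, type = "Ljung-Box", fitdf = 2)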
The Jarque-Bera test returns a fairly reasonable p-value of \(0.4733\), but we could attempt to reduce the almost-significant correlations of the residuals at lags \(2\) and \(4\) by introducing some more MA terms. In fact, doing so reduces the AIC of the fitted model. After trying MA degrees of 1 through 5, we find that an ARIMA(1,1,4) model has the lowest AIC; its AIC is now lower than that of model.1, and the model also passes the residual independence test.
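A small search loop of the kind used for that comparison might look like this (a sketch; it simply refits ARIMA(1,1,q) for q = 1,…,5 and prints the AICc of each candidate):
# Refit ARIMA(1,1,q) for q = 1..5 and print the AICc of each fit
for (q in 1:5) {
  fit <- Arima(globtemp.ts, order = c(1, 1, q), include.constant = TRUE)
  cat("q =", q, " AICc =", round(fit$aicc, 2), "\n")
}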
model.2=Arima(globtemp.ts,order=c(1,1,4),include.constant=TRUE)
summary(model.2)
## Series: globtemp.ts
## ARIMA(1,1,4) with drift
##
## Coefficients:
## ar1 ma1 ma2 ma3 ma4 drift
## -0.7875 0.3034 -0.7214 -0.2547 0.1837 0.0060
## s.e. 0.1879 0.2056 0.1119 0.0985 0.1081 0.0033
##
## sigma^2 = 0.01633: log likelihood = 81.76
## AIC=-149.52 AICc=-148.56 BIC=-129.78
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -0.001313962 0.1241457 0.09612207 Inf Inf 0.801556 0.008604604
tsdiag(model.2)
cov2cor(model.2$var.coef)
## ar1 ma1 ma2 ma3 ma4 drift
## ar1 1.00000000 -0.895080684 0.53796027 0.3983651 0.47167214 -0.013879452
## ma1 -0.89508068 1.000000000 -0.32403569 -0.6151942 -0.57346262 0.009982013
## ma2 0.53796027 -0.324035694 1.00000000 0.1080222 -0.23603126 -0.016105031
## ma3 0.39836506 -0.615194225 0.10802216 1.0000000 0.49905622 -0.012889498
## ma4 0.47167214 -0.573462623 -0.23603126 0.4990562 1.00000000 -0.019272251
## drift -0.01387945 0.009982013 -0.01610503 -0.0128895 -0.01927225 1.000000000
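A small helper can flag the problematic entries automatically (a sketch; the \(0.8\) threshold is the rule of thumb used here):
# Largest absolute off-diagonal correlation between coefficient estimates,
# and the positions of any entries exceeding 0.8 in absolute value
cc <- cov2cor(model.2$var.coef)
max(abs(cc[upper.tri(cc)]))
which(abs(cc) > 0.8 & upper.tri(cc), arr.ind = TRUE)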
But when we check the correlations between the estimated coefficients, we find something alarming: the correlation between the ar1 and ma1 estimates is high (greater than \(0.8\) in absolute value). So we discard this model and try increasing the order of the autoregressive part instead.
model.3=Arima(globtemp.ts,order=c(2,1,1),include.constant=TRUE)
model.3$aicc
## [1] -148.1998
tsdiag(model.3)
jarque.bera.test(model.3$residuals)
##
## Jarque Bera Test
##
## data: model.3$residuals
## X-squared = 0.54661, df = 2, p-value = 0.7609
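It is also worth repeating the coefficient-correlation check on this model (a quick sketch; the output is not shown in the original analysis):
# Coefficient correlations for model.3, analogous to the check made on model.2
cov2cor(model.3$var.coef)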
Predictions with model.3 follow. An upward drift is evident in the predictions, corresponding to the mean of \(0.006\) that we noticed in the differenced data, but the forecast uncertainty is quite large (you can see this in the width of the prediction intervals).
plot(forecast(model.3, h=6))
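To see those prediction intervals numerically rather than only in the plot, the forecast object can also be printed directly (point forecasts together with 80% and 95% intervals by default):
# Numerical point forecasts with 80% and 95% prediction intervals
forecast(model.3, h=6)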
Observation: In all of the ARIMA models fitted above, a constant term is included via include.constant=TRUE. When the total order of differencing in a model is equal to \(1\), a constant term should be included if the series has a non-zero average trend, and in the plot of the original series we do observe an upward trend (drift). If the fitted model has two or more orders of differencing, it normally does not include a constant term. A model with no differencing normally includes a constant term (which allows for a non-zero mean).
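As an illustration of this point (a sketch, not part of the original analysis), the same ARIMA(2,1,1) model can be refit without the constant and compared:
# Refit model.3 without the drift term; without the constant the long-run
# forecasts no longer trend upwards
model.3.nodrift <- Arima(globtemp.ts, order = c(2, 1, 1), include.constant = FALSE)
c(with.drift = model.3$aicc, no.drift = model.3.nodrift$aicc)
plot(forecast(model.3.nodrift, h = 6))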