Time Series Data and Models

Investigate the nature of time series data and learn the basics of ARMA models that can explain the behavior of such data. we will learn the basic R commands needed to help set up raw time series data to a form that can be analyzed using ARMA models.

Data Play

Before we use a data set for the first time, we should use the help system to see the details of the data. For example, use help(AirPassengers) or ?AirPassengers to see the details of the series.

# View a detailed description of AirPassengers
#help(AirPassengers)

# Plot AirPassengers
plot(AirPassengers)

# Plot the DJIA daily closings
plot(djia$Close)

# Plot the Southern Oscillation Index
plot(soi)

Differencing

When a time series is trend stationary, it will have stationary behavior around a trend. A simple example is \(Y_t=α+β_t+X_t\) where \(X_t\) is stationary.

A different type of model for trend is random walk, which has the form \(X_t=X_{t−1}+W_t\), where \(W_t\) is white noise. It is called a random walk because at time t the process is where it was at time \(t−1\) plus a completely random movement. For a random walk with drift, a constant is added to the model and will cause the random walk to drift in the direction (positve or negative) of the drift.

In both cases, simple differencing can remove the trend and coerce the data to stationarity. Differencing looks at the difference between the value of a time series at a certain point in time and its preceding value. That is, \(X_t−X_{t−1}\) is computed.

To check that it works, we will difference each generated time series and plot the detrended series. If a time series is in x, then diff(x) will have the detrended series obtained by differencing the data. To plot the detrended series, simply use plot(diff(x)).

# Plot globtemp and detrended globtemp
par(mfrow = c(2,1))
plot(globtemp) 
plot(diff(globtemp)) 

# Plot cmort and detrended cmort
par(mfrow = c(2,1))
plot(cmort)
plot(diff(cmort))

Dealing with Trend and Heteroscedasticity

Here, we will coerce nonstationary data to stationarity by calculating the return or growth rate as follows.

Often time series are generated as \(X_t=(1+p_t)X_{t−1}\) meaning that the value of the time series observed at time t equals the value observed at time \(t−1\) and a small percent change pt at time \(t\).

A simple deterministic example is putting money into a bank with a fixed interest \(p\). In this case, \(X_t\) is the value of the account at time period t with an initial deposit of \(X_0\).

Typically, pt is referred to as the return or growth rate of a time series, and this process is often stable.

For reasons that are outside the scope of this course, it can be shown that the growth rate pt can be approximated by \(Y_t=logX_t−logX_t−1≈p_t\).

# astsa and xts are preloaded 

# Plot GNP series (gnp) and its growth rate
par(mfrow = c(2,1))
plot(gnp)
plot(diff(log(gnp)))

# Plot DJIA closings (djia$Close) and its returns
par(mfrow = c(2,1))
plot(djia$Close)
plot(diff(log(djia$Close)))

Simulating ARMA Models

Any stationary time series can be written as a linear combination of white noise. In addition, any ARMA model has this form, so it is a good choice for modeling stationary time series.

R provides a simple function called arima.sim() to generate data from an ARMA model. For example, the syntax for generating 100 observations from an MA(1) with parameter .9 is arima.sim(model = list(order = c(0, 0, 1), ma = .9 ), n = 100). We can also use order = c(0, 0, 0) to generate white noise.

Now, we will generate data from various ARMA models. For each command, generate 200 observations and plot the result.

# Generate and plot white noise
WN <- arima.sim(model = list(order = c(0, 0, 0)), n = 200)
plot(WN)

# Generate and plot an MA(1) with parameter .9 
MA <- arima.sim(model = list(order = c(0, 0, 1), ma = 0.9), n = 200)
plot(MA)

# Generate and plot an AR(2) with parameters 1.5 and -.75
AR <- arima.sim(model = list(order = c(2, 0, 0), ar = c(1.5, -0.75)), n = 200)
plot(AR)

Fitting ARMA models

Discover the wonderful world of ARMA models and how to fit these models to time series data. We will learn how to identify a model, how to choose the correct model, and how to verify a model once we fit it to data. We will learn how to use R time series commands from the stats and astsa packages.

Fitting an AR(1) Model

# Generate 100 observations from the AR(1) model
x <- arima.sim(model = list(order = c(1, 0, 0), ar = .9), n = 100) 

# Plot the generated data 
plot(x)

# Plot the sample P/ACF pair
acf2(x)

##         ACF  PACF
##  [1,]  0.85  0.85
##  [2,]  0.68 -0.16
##  [3,]  0.56  0.08
##  [4,]  0.44 -0.09
##  [5,]  0.35  0.06
##  [6,]  0.29  0.02
##  [7,]  0.26  0.07
##  [8,]  0.21 -0.09
##  [9,]  0.22  0.21
## [10,]  0.23 -0.04
## [11,]  0.23  0.03
## [12,]  0.24  0.06
## [13,]  0.26  0.07
## [14,]  0.24 -0.14
## [15,]  0.17 -0.10
## [16,]  0.12  0.04
## [17,]  0.09  0.02
## [18,]  0.04 -0.08
## [19,]  0.00 -0.03
## [20,] -0.04 -0.07
# Fit an AR(1) to the data and examine the t-table
sarima(x, p = 1, d = 0, q = 0)
## initial  value 0.605460 
## iter   2 value -0.043560
## iter   3 value -0.043570
## iter   4 value -0.043574
## iter   5 value -0.043581
## iter   6 value -0.043587
## iter   7 value -0.043589
## iter   8 value -0.043589
## iter   9 value -0.043590
## iter  10 value -0.043590
## iter  11 value -0.043590
## iter  12 value -0.043590
## iter  13 value -0.043590
## iter  13 value -0.043590
## iter  13 value -0.043590
## final  value -0.043590 
## converged
## initial  value -0.040515 
## iter   2 value -0.040576
## iter   3 value -0.040711
## iter   4 value -0.040725
## iter   5 value -0.040758
## iter   6 value -0.040759
## iter   6 value -0.040759
## final  value -0.040759 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1    xmean
##       0.8458  -0.5511
## s.e.  0.0509   0.5879
## 
## sigma^2 estimated as 0.9102:  log likelihood = -137.82,  aic = 281.64
## 
## $degrees_of_freedom
## [1] 98
## 
## $ttable
##       Estimate     SE t.value p.value
## ar1     0.8458 0.0509 16.6338  0.0000
## xmean  -0.5511 0.5879 -0.9374  0.3509
## 
## $AIC
## [1] 0.9459142
## 
## $AICc
## [1] 0.9684142
## 
## $BIC
## [1] -0.001982403

Fitting an AR(2) Model

# Plot x
x <- arima.sim(model = list(order = c(2, 0, 0), ar = c(1.5, -.75)), n = 200)
plot(x)

# Plot the sample P/ACF of x
acf2(x)

##         ACF  PACF
##  [1,]  0.87  0.87
##  [2,]  0.57 -0.70
##  [3,]  0.24 -0.02
##  [4,] -0.05  0.01
##  [5,] -0.25 -0.06
##  [6,] -0.34  0.03
##  [7,] -0.34 -0.06
##  [8,] -0.28 -0.10
##  [9,] -0.21 -0.03
## [10,] -0.17 -0.18
## [11,] -0.13  0.10
## [12,] -0.10  0.00
## [13,] -0.06  0.01
## [14,]  0.02  0.16
## [15,]  0.12  0.08
## [16,]  0.23 -0.08
## [17,]  0.29  0.01
## [18,]  0.29 -0.02
## [19,]  0.22  0.03
## [20,]  0.11 -0.05
## [21,] -0.01  0.02
## [22,] -0.11  0.02
## [23,] -0.17 -0.03
## [24,] -0.18  0.01
## [25,] -0.17 -0.14
# Fit an AR(2) to the data and examine the t-table
sarima(x, p = 2, d = 0, q = 0)
## initial  value 1.140552 
## iter   2 value 1.018658
## iter   3 value 0.573575
## iter   4 value 0.336931
## iter   5 value 0.176712
## iter   6 value 0.069879
## iter   7 value 0.066441
## iter   8 value 0.066184
## iter   9 value 0.066177
## iter  10 value 0.066160
## iter  11 value 0.066158
## iter  12 value 0.066158
## iter  13 value 0.066158
## iter  13 value 0.066158
## iter  13 value 0.066158
## final  value 0.066158 
## converged
## initial  value 0.074755 
## iter   2 value 0.074744
## iter   3 value 0.074737
## iter   4 value 0.074734
## iter   5 value 0.074734
## iter   6 value 0.074733
## iter   7 value 0.074733
## iter   7 value 0.074733
## iter   7 value 0.074733
## final  value 0.074733 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1      ar2   xmean
##       1.4973  -0.7237  0.3246
## s.e.  0.0486   0.0486  0.3338
## 
## sigma^2 estimated as 1.145:  log likelihood = -298.73,  aic = 605.47
## 
## $degrees_of_freedom
## [1] 197
## 
## $ttable
##       Estimate     SE  t.value p.value
## ar1     1.4973 0.0486  30.7940   0.000
## ar2    -0.7237 0.0486 -14.9007   0.000
## xmean   0.3246 0.3338   0.9725   0.332
## 
## $AIC
## [1] 1.165025
## 
## $AICc
## [1] 1.176051
## 
## $BIC
## [1] 0.2144999

Fitting an MA(1) Model

# Plot x
x <- arima.sim(model = list(order = c(0, 0, 1), ma = -.8), n = 100)
plot(x)

# Plot the sample P/ACF of x
acf2(x)

##         ACF  PACF
##  [1,] -0.46 -0.46
##  [2,] -0.03 -0.32
##  [3,] -0.16 -0.47
##  [4,]  0.37  0.01
##  [5,] -0.32 -0.27
##  [6,]  0.18 -0.06
##  [7,] -0.08 -0.02
##  [8,]  0.09 -0.04
##  [9,] -0.13  0.03
## [10,]  0.07 -0.08
## [11,] -0.09 -0.16
## [12,]  0.21  0.10
## [13,] -0.22 -0.16
## [14,]  0.16  0.08
## [15,] -0.16 -0.09
## [16,]  0.18 -0.02
## [17,] -0.18  0.01
## [18,]  0.25  0.15
## [19,] -0.27  0.03
## [20,]  0.17  0.10
# Fit an MA(1) to the data and examine the t-table
sarima(x, p = 0, d = 0, q = 1)
## initial  value 0.184996 
## iter   2 value 0.003431
## iter   3 value -0.072282
## iter   4 value -0.078450
## iter   5 value -0.084453
## iter   6 value -0.087446
## iter   7 value -0.087909
## iter   8 value -0.089048
## iter   9 value -0.089552
## iter  10 value -0.089603
## iter  11 value -0.089604
## iter  12 value -0.089604
## iter  13 value -0.089604
## iter  14 value -0.089604
## iter  14 value -0.089604
## iter  14 value -0.089604
## final  value -0.089604 
## converged
## initial  value -0.083653 
## iter   2 value -0.083695
## iter   3 value -0.083788
## iter   4 value -0.083814
## iter   5 value -0.083814
## iter   5 value -0.083814
## iter   5 value -0.083814
## final  value -0.083814 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##           ma1   xmean
##       -0.8285  0.0094
## s.e.   0.0538  0.0167
## 
## sigma^2 estimated as 0.8359:  log likelihood = -133.51,  aic = 273.02
## 
## $degrees_of_freedom
## [1] 98
## 
## $ttable
##       Estimate     SE  t.value p.value
## ma1    -0.8285 0.0538 -15.3923   0.000
## xmean   0.0094 0.0167   0.5656   0.573
## 
## $AIC
## [1] 0.8607767
## 
## $AICc
## [1] 0.8832767
## 
## $BIC
## [1] -0.08711994

Fitting an ARMA model

x <- arima.sim(model = list(order = c(2, 0, 1), ar = c(1, -.9), ma = .8), n = 250)

# Plot x
plot(x)

# Plot the sample P/ACF of x
acf2(x)

##         ACF  PACF
##  [1,]  0.56  0.56
##  [2,] -0.32 -0.92
##  [3,] -0.82  0.38
##  [4,] -0.55 -0.29
##  [5,]  0.16  0.21
##  [6,]  0.65 -0.08
##  [7,]  0.53  0.08
##  [8,] -0.03 -0.01
##  [9,] -0.49  0.02
## [10,] -0.49 -0.08
## [11,] -0.08  0.03
## [12,]  0.35  0.04
## [13,]  0.44 -0.01
## [14,]  0.17  0.10
## [15,] -0.19  0.08
## [16,] -0.34  0.00
## [17,] -0.19 -0.09
## [18,]  0.06 -0.09
## [19,]  0.19 -0.06
## [20,]  0.14  0.06
## [21,]  0.03  0.05
## [22,] -0.05 -0.01
## [23,] -0.08 -0.03
## [24,] -0.08 -0.05
## [25,] -0.06  0.06
## [26,]  0.01  0.04
# Fit an ARMA(2,1) to the data and examine the t-table
sarima(x, p = 2, d = 0, q = 1)
## initial  value 1.334226 
## iter   2 value 0.549828
## iter   3 value 0.337621
## iter   4 value 0.021881
## iter   5 value 0.007836
## iter   6 value 0.003739
## iter   7 value 0.000661
## iter   8 value 0.000611
## iter   9 value 0.000611
## iter  10 value 0.000611
## iter  11 value 0.000611
## iter  11 value 0.000611
## final  value 0.000611 
## converged
## initial  value 0.002748 
## iter   2 value 0.001963
## iter   3 value 0.001782
## iter   4 value 0.001537
## iter   5 value 0.001493
## iter   6 value 0.001490
## iter   7 value 0.001488
## iter   8 value 0.001488
## iter   9 value 0.001488
## iter  10 value 0.001488
## iter  10 value 0.001488
## final  value 0.001488 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1      ar2     ma1    xmean
##       0.9823  -0.8654  0.8360  -0.1779
## s.e.  0.0315   0.0309  0.0393   0.1303
## 
## sigma^2 estimated as 0.979:  log likelihood = -355.11,  aic = 720.21
## 
## $degrees_of_freedom
## [1] 246
## 
## $ttable
##       Estimate     SE  t.value p.value
## ar1     0.9823 0.0315  31.1592  0.0000
## ar2    -0.8654 0.0309 -27.9836  0.0000
## ma1     0.8360 0.0393  21.2563  0.0000
## xmean  -0.1779 0.1303  -1.3652  0.1734
## 
## $AIC
## [1] 1.01073
## 
## $AICc
## [1] 1.019713
## 
## $BIC
## [1] 0.06707288

Model Choice - I

dl_varve <- diff(log(varve))

# Fit an MA(1) to dl_varve.   
sarima(dl_varve, p = 0, d = 0, q = 1)
## initial  value -0.551780 
## iter   2 value -0.671633
## iter   3 value -0.706234
## iter   4 value -0.707586
## iter   5 value -0.718543
## iter   6 value -0.719692
## iter   7 value -0.721967
## iter   8 value -0.722970
## iter   9 value -0.723231
## iter  10 value -0.723247
## iter  11 value -0.723248
## iter  12 value -0.723248
## iter  12 value -0.723248
## iter  12 value -0.723248
## final  value -0.723248 
## converged
## initial  value -0.722762 
## iter   2 value -0.722764
## iter   3 value -0.722764
## iter   4 value -0.722765
## iter   4 value -0.722765
## iter   4 value -0.722765
## final  value -0.722765 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##           ma1    xmean
##       -0.7710  -0.0013
## s.e.   0.0341   0.0044
## 
## sigma^2 estimated as 0.2353:  log likelihood = -440.68,  aic = 887.36
## 
## $degrees_of_freedom
## [1] 631
## 
## $ttable
##       Estimate     SE  t.value p.value
## ma1    -0.7710 0.0341 -22.6002  0.0000
## xmean  -0.0013 0.0044  -0.2818  0.7782
## 
## $AIC
## [1] -0.4406366
## 
## $AICc
## [1] -0.4374168
## 
## $BIC
## [1] -1.426575
# Fit an MA(2) to dl_varve. Improvement?
sarima(dl_varve, p = 0, d = 0, q =2)
## initial  value -0.551780 
## iter   2 value -0.679736
## iter   3 value -0.728605
## iter   4 value -0.734640
## iter   5 value -0.735449
## iter   6 value -0.735979
## iter   7 value -0.736015
## iter   8 value -0.736059
## iter   9 value -0.736060
## iter  10 value -0.736060
## iter  11 value -0.736061
## iter  12 value -0.736061
## iter  12 value -0.736061
## iter  12 value -0.736061
## final  value -0.736061 
## converged
## initial  value -0.735372 
## iter   2 value -0.735378
## iter   3 value -0.735379
## iter   4 value -0.735379
## iter   4 value -0.735379
## iter   4 value -0.735379
## final  value -0.735379 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##           ma1      ma2    xmean
##       -0.6710  -0.1595  -0.0013
## s.e.   0.0375   0.0392   0.0033
## 
## sigma^2 estimated as 0.2294:  log likelihood = -432.69,  aic = 873.39
## 
## $degrees_of_freedom
## [1] 630
## 
## $ttable
##       Estimate     SE  t.value p.value
## ma1    -0.6710 0.0375 -17.9057  0.0000
## ma2    -0.1595 0.0392  -4.0667  0.0001
## xmean  -0.0013 0.0033  -0.4007  0.6888
## 
## $AIC
## [1] -0.4629629
## 
## $AICc
## [1] -0.4597027
## 
## $BIC
## [1] -1.441871
# Fit an ARMA(1,1) to dl_varve. Improvement?
sarima(dl_varve, p = 1, d = 0, q = 1)
## initial  value -0.550994 
## iter   2 value -0.648962
## iter   3 value -0.676965
## iter   4 value -0.699167
## iter   5 value -0.724554
## iter   6 value -0.726719
## iter   7 value -0.729066
## iter   8 value -0.731976
## iter   9 value -0.734235
## iter  10 value -0.735969
## iter  11 value -0.736410
## iter  12 value -0.737045
## iter  13 value -0.737600
## iter  14 value -0.737641
## iter  15 value -0.737643
## iter  16 value -0.737643
## iter  17 value -0.737643
## iter  18 value -0.737643
## iter  18 value -0.737643
## iter  18 value -0.737643
## final  value -0.737643 
## converged
## initial  value -0.737522 
## iter   2 value -0.737527
## iter   3 value -0.737528
## iter   4 value -0.737529
## iter   5 value -0.737530
## iter   5 value -0.737530
## iter   5 value -0.737530
## final  value -0.737530 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1      ma1    xmean
##       0.2341  -0.8871  -0.0013
## s.e.  0.0518   0.0292   0.0028
## 
## sigma^2 estimated as 0.2284:  log likelihood = -431.33,  aic = 870.66
## 
## $degrees_of_freedom
## [1] 630
## 
## $ttable
##       Estimate     SE  t.value p.value
## ar1     0.2341 0.0518   4.5184  0.0000
## ma1    -0.8871 0.0292 -30.4107  0.0000
## xmean  -0.0013 0.0028  -0.4618  0.6444
## 
## $AIC
## [1] -0.467376
## 
## $AICc
## [1] -0.4641159
## 
## $BIC
## [1] -1.446284

ARMA get in

The data in oil are crude oil, WTI spot price FOB (in dollars per barrel), weekly data from 2000 to 2008. Let us use our skills to fit an ARMA model to the returns. The weekly crude oil prices (oil) are plotted for us. Throughout the exercise, work with the returns, which we will calculate.

As before, the astsa package is preloaded for we. The data are preloaded as oil and plotted on the right.

# Calculate approximate oil returns
oil_returns <- diff(log(oil))

# Plot oil_returns. Notice the outliers.
plot(oil_returns)

# Plot the P/ACF pair for oil_returns
acf2(oil_returns)

##          ACF  PACF
##   [1,]  0.13  0.13
##   [2,] -0.07 -0.09
##   [3,]  0.13  0.16
##   [4,] -0.01 -0.06
##   [5,]  0.02  0.05
##   [6,] -0.03 -0.08
##   [7,] -0.03  0.00
##   [8,]  0.13  0.12
##   [9,]  0.08  0.05
##  [10,]  0.02  0.03
##  [11,]  0.01 -0.02
##  [12,]  0.00  0.00
##  [13,] -0.02 -0.03
##  [14,]  0.06  0.09
##  [15,] -0.05 -0.07
##  [16,] -0.09 -0.06
##  [17,]  0.03  0.01
##  [18,]  0.05  0.04
##  [19,] -0.05 -0.05
##  [20,] -0.07 -0.05
##  [21,]  0.04  0.05
##  [22,]  0.09  0.06
##  [23,] -0.05 -0.06
##  [24,] -0.08 -0.05
##  [25,] -0.07 -0.08
##  [26,]  0.00  0.02
##  [27,] -0.11 -0.11
##  [28,] -0.07  0.01
##  [29,]  0.02  0.00
##  [30,] -0.02 -0.01
##  [31,] -0.03 -0.05
##  [32,] -0.05 -0.04
##  [33,] -0.03  0.02
##  [34,]  0.00  0.02
##  [35,] -0.09 -0.08
##  [36,] -0.01  0.02
##  [37,] -0.04 -0.04
##  [38,] -0.01  0.04
##  [39,]  0.02 -0.01
##  [40,] -0.01 -0.01
##  [41,] -0.06 -0.05
##  [42,]  0.01  0.03
##  [43,]  0.00 -0.03
##  [44,] -0.01  0.00
##  [45,]  0.04  0.08
##  [46,]  0.01  0.00
##  [47,]  0.05  0.05
##  [48,]  0.07  0.01
##  [49,] -0.01  0.04
##  [50,] -0.03 -0.08
##  [51,]  0.01  0.01
##  [52,] -0.04 -0.07
##  [53,] -0.04  0.00
##  [54,] -0.03 -0.06
##  [55,]  0.00  0.00
##  [56,] -0.01 -0.06
##  [57,] -0.10 -0.11
##  [58,] -0.01  0.01
##  [59,] -0.05 -0.09
##  [60,] -0.04 -0.01
##  [61,] -0.03 -0.04
##  [62,]  0.01  0.04
##  [63,]  0.01 -0.01
##  [64,] -0.01  0.00
##  [65,] -0.04 -0.04
##  [66,]  0.02  0.03
##  [67,]  0.00  0.00
##  [68,] -0.01  0.00
##  [69,] -0.03 -0.04
##  [70,] -0.02 -0.02
##  [71,] -0.05 -0.04
##  [72,] -0.01  0.00
##  [73,] -0.01 -0.01
##  [74,] -0.02  0.00
##  [75,]  0.01  0.02
##  [76,]  0.02 -0.01
##  [77,]  0.04  0.04
##  [78,] -0.01 -0.02
##  [79,]  0.03  0.08
##  [80,]  0.02 -0.03
##  [81,] -0.04 -0.03
##  [82,] -0.01 -0.03
##  [83,]  0.02  0.03
##  [84,]  0.03 -0.03
##  [85,]  0.01 -0.02
##  [86,]  0.03  0.03
##  [87,]  0.08  0.04
##  [88,] -0.04 -0.09
##  [89,] -0.02 -0.01
##  [90,]  0.01 -0.02
##  [91,] -0.04 -0.03
##  [92,]  0.05  0.03
##  [93,]  0.07  0.05
##  [94,] -0.04 -0.11
##  [95,]  0.02  0.02
##  [96,]  0.05 -0.01
##  [97,]  0.01  0.02
##  [98,]  0.00 -0.03
##  [99,]  0.01  0.06
## [100,]  0.04  0.01
## [101,]  0.01 -0.05
## [102,] -0.03  0.02
## [103,] -0.04 -0.03
## [104,] -0.01  0.01
## [105,]  0.02  0.00
## [106,]  0.01  0.04
## [107,]  0.01 -0.01
## [108,]  0.06  0.07
## [109,]  0.08  0.04
## [110,]  0.04  0.04
## [111,]  0.02  0.00
## [112,]  0.01  0.05
## [113,]  0.03 -0.01
## [114,]  0.02  0.00
## [115,] -0.02 -0.04
## [116,] -0.04 -0.03
## [117,] -0.01 -0.03
## [118,]  0.04  0.02
## [119,]  0.05  0.04
## [120,] -0.02 -0.01
## [121,] -0.02 -0.04
## [122,]  0.03 -0.01
## [123,]  0.01  0.03
## [124,] -0.04 -0.03
## [125,] -0.08 -0.07
## [126,]  0.02  0.00
## [127,]  0.00 -0.02
## [128,] -0.04 -0.04
## [129,]  0.01  0.01
## [130,]  0.02  0.01
## [131,]  0.01 -0.01
## [132,]  0.02  0.02
## [133,]  0.00  0.05
## [134,] -0.01  0.02
## [135,]  0.00  0.01
## [136,] -0.03  0.02
## [137,] -0.06 -0.02
## [138,]  0.01  0.04
## [139,] -0.02  0.01
## [140,] -0.02 -0.03
## [141,]  0.02 -0.02
## [142,] -0.01  0.02
## [143,] -0.03 -0.01
## [144,]  0.00  0.02
## [145,]  0.00 -0.01
## [146,] -0.04  0.02
## [147,] -0.01 -0.02
## [148,] -0.02 -0.03
## [149,] -0.04 -0.02
## [150,] -0.04 -0.01
## [151,]  0.01  0.02
## [152,]  0.01 -0.01
## [153,]  0.04  0.04
## [154,]  0.03  0.03
## [155,]  0.01 -0.04
## [156,]  0.05  0.03
## [157,]  0.01  0.00
## [158,] -0.06 -0.05
## [159,]  0.02  0.02
## [160,]  0.05  0.03
## [161,] -0.02  0.00
## [162,]  0.05  0.03
## [163,]  0.00  0.01
## [164,] -0.01  0.00
## [165,]  0.00  0.00
## [166,] -0.01  0.02
## [167,] -0.02  0.01
## [168,] -0.01 -0.01
## [169,]  0.00  0.03
## [170,] -0.03 -0.03
## [171,] -0.01  0.00
## [172,] -0.02 -0.04
## [173,] -0.02  0.00
## [174,]  0.04  0.02
## [175,] -0.01 -0.03
## [176,] -0.03 -0.01
## [177,]  0.02  0.01
## [178,]  0.01  0.02
## [179,] -0.01 -0.01
## [180,] -0.01  0.00
## [181,] -0.04 -0.04
## [182,]  0.07  0.08
## [183,] -0.01 -0.05
## [184,] -0.04  0.02
## [185,]  0.05 -0.01
## [186,] -0.02 -0.02
## [187,] -0.01  0.03
## [188,]  0.01  0.00
## [189,] -0.05 -0.03
## [190,] -0.04 -0.04
## [191,] -0.01  0.01
## [192,]  0.01 -0.01
## [193,]  0.04  0.07
## [194,] -0.01 -0.01
## [195,]  0.00  0.02
## [196,]  0.06 -0.01
## [197,] -0.06 -0.03
## [198,] -0.02  0.01
## [199,]  0.02  0.00
## [200,]  0.00  0.00
## [201,]  0.00 -0.02
## [202,]  0.00 -0.01
## [203,] -0.04 -0.05
## [204,]  0.00 -0.01
## [205,]  0.04  0.02
## [206,]  0.04  0.04
## [207,]  0.04  0.05
## [208,]  0.01 -0.01
# Assuming both P/ACF are tailing, fit a model to oil_returns
sarima(oil_returns, p = 1, d = 0, q = 1)
## initial  value -3.057594 
## iter   2 value -3.061420
## iter   3 value -3.067360
## iter   4 value -3.067479
## iter   5 value -3.071834
## iter   6 value -3.074359
## iter   7 value -3.074843
## iter   8 value -3.076656
## iter   9 value -3.080467
## iter  10 value -3.081546
## iter  11 value -3.081603
## iter  12 value -3.081615
## iter  13 value -3.081642
## iter  14 value -3.081643
## iter  14 value -3.081643
## iter  14 value -3.081643
## final  value -3.081643 
## converged
## initial  value -3.082345 
## iter   2 value -3.082345
## iter   3 value -3.082346
## iter   4 value -3.082346
## iter   5 value -3.082346
## iter   5 value -3.082346
## iter   5 value -3.082346
## final  value -3.082346 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = xmean, include.mean = FALSE, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##           ar1     ma1   xmean
##       -0.5264  0.7146  0.0018
## s.e.   0.0871  0.0683  0.0022
## 
## sigma^2 estimated as 0.002102:  log likelihood = 904.89,  aic = -1801.79
## 
## $degrees_of_freedom
## [1] 541
## 
## $ttable
##       Estimate     SE t.value p.value
## ar1    -0.5264 0.0871 -6.0422  0.0000
## ma1     0.7146 0.0683 10.4699  0.0000
## xmean   0.0018 0.0022  0.7981  0.4252
## 
## $AIC
## [1] -5.153838
## 
## $AICc
## [1] -5.150025
## 
## $BIC
## [1] -6.130131

ARIMA Models

Now that we know how to fit ARMA models to stationary time series, we will learn about integrated ARMA (ARIMA) models for nonstationary time series. we will fit the models to real data using R time series commands from the stats and astsa packages.

ARIMA - Plug and Play

x <- arima.sim(model = list(order = c(1, 1, 0), ar = .9), n = 200)
# Plot x
plot(x)

# Plot the P/ACF pair of x
acf2(x)

##        ACF  PACF
##  [1,] 0.99  0.99
##  [2,] 0.98 -0.14
##  [3,] 0.97 -0.11
##  [4,] 0.95 -0.10
##  [5,] 0.94 -0.06
##  [6,] 0.92 -0.06
##  [7,] 0.90 -0.07
##  [8,] 0.88 -0.04
##  [9,] 0.86 -0.01
## [10,] 0.84 -0.03
## [11,] 0.81 -0.02
## [12,] 0.79  0.00
## [13,] 0.77 -0.02
## [14,] 0.74 -0.01
## [15,] 0.72  0.00
## [16,] 0.70  0.00
## [17,] 0.67 -0.01
## [18,] 0.65  0.01
## [19,] 0.63  0.01
## [20,] 0.61 -0.01
## [21,] 0.58 -0.03
## [22,] 0.56 -0.03
## [23,] 0.54 -0.03
## [24,] 0.51 -0.05
## [25,] 0.49 -0.04
# Plot the differenced data
plot(diff(x))

# Plot the P/ACF pair of the differenced data
acf2(diff(x))

##         ACF  PACF
##  [1,]  0.91  0.91
##  [2,]  0.85  0.10
##  [3,]  0.79  0.03
##  [4,]  0.72 -0.08
##  [5,]  0.67  0.01
##  [6,]  0.61 -0.01
##  [7,]  0.56 -0.03
##  [8,]  0.48 -0.20
##  [9,]  0.40 -0.04
## [10,]  0.35  0.05
## [11,]  0.29 -0.02
## [12,]  0.25  0.04
## [13,]  0.21  0.01
## [14,]  0.15 -0.13
## [15,]  0.10  0.01
## [16,]  0.07  0.04
## [17,]  0.04  0.02
## [18,]  0.01 -0.01
## [19,] -0.01 -0.03
## [20,] -0.01  0.06
## [21,] -0.02  0.04
## [22,] -0.03  0.01
## [23,] -0.03 -0.04
## [24,] -0.02  0.04
## [25,] -0.01  0.03

Global Warming

# Plot the sample P/ACF pair of the differenced data 
acf2(diff(globtemp))

##         ACF  PACF
##  [1,] -0.24 -0.24
##  [2,] -0.19 -0.26
##  [3,] -0.08 -0.23
##  [4,]  0.20  0.06
##  [5,] -0.15 -0.16
##  [6,] -0.03 -0.09
##  [7,]  0.03 -0.05
##  [8,]  0.14  0.07
##  [9,] -0.16 -0.09
## [10,]  0.11  0.11
## [11,] -0.05 -0.03
## [12,]  0.00 -0.02
## [13,] -0.13 -0.10
## [14,]  0.14  0.02
## [15,] -0.01  0.00
## [16,] -0.08 -0.09
## [17,]  0.00  0.00
## [18,]  0.19  0.11
## [19,] -0.07  0.04
## [20,]  0.02  0.13
## [21,] -0.02  0.09
## [22,]  0.08  0.08
# Fit an ARIMA(1,1,1) model to globtemp
sarima(globtemp, p = 1, d = 1, q = 1)
## initial  value -2.218917 
## iter   2 value -2.253118
## iter   3 value -2.263750
## iter   4 value -2.272144
## iter   5 value -2.282786
## iter   6 value -2.296777
## iter   7 value -2.297062
## iter   8 value -2.297253
## iter   9 value -2.297389
## iter  10 value -2.297405
## iter  11 value -2.297413
## iter  12 value -2.297413
## iter  13 value -2.297414
## iter  13 value -2.297414
## iter  13 value -2.297414
## final  value -2.297414 
## converged
## initial  value -2.305504 
## iter   2 value -2.305800
## iter   3 value -2.305821
## iter   4 value -2.306655
## iter   5 value -2.306875
## iter   6 value -2.306950
## iter   7 value -2.306955
## iter   8 value -2.306955
## iter   8 value -2.306955
## final  value -2.306955 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##          ar1      ma1  constant
##       0.3549  -0.7663    0.0072
## s.e.  0.1314   0.0874    0.0032
## 
## sigma^2 estimated as 0.009885:  log likelihood = 119.88,  aic = -231.76
## 
## $degrees_of_freedom
## [1] 132
## 
## $ttable
##          Estimate     SE t.value p.value
## ar1        0.3549 0.1314  2.7008  0.0078
## ma1       -0.7663 0.0874 -8.7701  0.0000
## constant   0.0072 0.0032  2.2738  0.0246
## 
## $AIC
## [1] -3.572642
## 
## $AICc
## [1] -3.555691
## 
## $BIC
## [1] -4.508392
# Fit an ARIMA(0,1,2) model to globtemp. Which model is better?

sarima(globtemp, p = 0, d = 1, q = 2)
## initial  value -2.220513 
## iter   2 value -2.294887
## iter   3 value -2.307682
## iter   4 value -2.309170
## iter   5 value -2.310360
## iter   6 value -2.311251
## iter   7 value -2.311636
## iter   8 value -2.311648
## iter   9 value -2.311649
## iter   9 value -2.311649
## iter   9 value -2.311649
## final  value -2.311649 
## converged
## initial  value -2.310187 
## iter   2 value -2.310197
## iter   3 value -2.310199
## iter   4 value -2.310201
## iter   5 value -2.310202
## iter   5 value -2.310202
## iter   5 value -2.310202
## final  value -2.310202 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##           ma1      ma2  constant
##       -0.3984  -0.2173    0.0072
## s.e.   0.0808   0.0768    0.0033
## 
## sigma^2 estimated as 0.00982:  log likelihood = 120.32,  aic = -232.64
## 
## $degrees_of_freedom
## [1] 132
## 
## $ttable
##          Estimate     SE t.value p.value
## ma1       -0.3984 0.0808 -4.9313  0.0000
## ma2       -0.2173 0.0768 -2.8303  0.0054
## constant   0.0072 0.0033  2.1463  0.0337
## 
## $AIC
## [1] -3.579224
## 
## $AICc
## [1] -3.562273
## 
## $BIC
## [1] -4.514974

Forecasting Simulated ARIMA

x <- arima.sim(model = list(order = c(1, 1, 0), ar = .9), n = 200) 
# Plot P/ACF pair of differenced data 
acf2(diff(x))

##         ACF  PACF
##  [1,]  0.79  0.79
##  [2,]  0.59 -0.07
##  [3,]  0.47  0.07
##  [4,]  0.38 -0.01
##  [5,]  0.28 -0.03
##  [6,]  0.21  0.00
##  [7,]  0.19  0.06
##  [8,]  0.16  0.00
##  [9,]  0.16  0.08
## [10,]  0.16 -0.01
## [11,]  0.16  0.04
## [12,]  0.18  0.06
## [13,]  0.20  0.05
## [14,]  0.19 -0.01
## [15,]  0.17 -0.03
## [16,]  0.15  0.01
## [17,]  0.15  0.03
## [18,]  0.07 -0.17
## [19,]  0.01  0.01
## [20,]  0.01  0.04
## [21,] -0.01 -0.04
## [22,]  0.00  0.07
## [23,]  0.02  0.03
## [24,]  0.02 -0.05
## [25,]  0.02  0.00
# Fit model - check t-table and diagnostics
sarima(x, p = 1, d = 1, q = 0)
## initial  value 0.477041 
## iter   2 value -0.006433
## iter   3 value -0.006433
## iter   4 value -0.006433
## iter   5 value -0.006433
## iter   6 value -0.006433
## iter   7 value -0.006433
## iter   8 value -0.006433
## iter   9 value -0.006433
## iter  10 value -0.006433
## iter  11 value -0.006433
## iter  12 value -0.006433
## iter  13 value -0.006433
## iter  13 value -0.006433
## iter  13 value -0.006433
## final  value -0.006433 
## converged
## initial  value -0.006155 
## iter   2 value -0.006166
## iter   3 value -0.006176
## iter   4 value -0.006177
## iter   5 value -0.006184
## iter   6 value -0.006185
## iter   7 value -0.006185
## iter   8 value -0.006185
## iter   8 value -0.006185
## iter   8 value -0.006185
## final  value -0.006185 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##          ar1  constant
##       0.7839   -0.4317
## s.e.  0.0432    0.3187
## 
## sigma^2 estimated as 0.983:  log likelihood = -282.55,  aic = 571.1
## 
## $degrees_of_freedom
## [1] 198
## 
## $ttable
##          Estimate     SE t.value p.value
## ar1        0.7839 0.0432 18.1646  0.0000
## constant  -0.4317 0.3187 -1.3545  0.1771
## 
## $AIC
## [1] 1.002766
## 
## $AICc
## [1] 1.013322
## 
## $BIC
## [1] 0.03563439
# Forecast the data 20 time periods ahead
sarima.for(x, n.ahead = 20, p = 1, d = 1, q = 0) 

## $pred
## Time Series:
## Start = 202 
## End = 221 
## Frequency = 1 
##  [1] -90.36814 -90.48686 -90.67322 -90.91260 -91.19354 -91.50707 -91.84613
##  [8] -92.20522 -92.57999 -92.96707 -93.36380 -93.76808 -94.17828 -94.59313
## [15] -95.01163 -95.43297 -95.85656 -96.28189 -96.70861 -97.13640
## 
## $se
## Time Series:
## Start = 202 
## End = 221 
## Frequency = 1 
##  [1]  0.9914692  2.0276075  3.1249955  4.2331183  5.3245717  6.3847129
##  [7]  7.4062079  8.3860230  9.3236722 10.2201592 11.0773241 11.8974370
## [13] 12.6829448 13.4363160 14.1599475 14.8561119 15.5269300 16.1743603
## [19] 16.8001981 17.4060813
#lines(y)  

Forecasting Global Temperatures

Now you can try forecasting real data.

Here, you will forecast the annual global temperature deviations globtemp to 2050. Recall that in previous exercises, you fit an ARIMA(0,1,2) model to the data. You will refit the model to confirm it, and then forecast the series 35 years into the future.

# Fit an ARIMA(0,1,2) to globtemp and check the fit
sarima(globtemp, p = 0, d = 1, q = 2)
## initial  value -2.220513 
## iter   2 value -2.294887
## iter   3 value -2.307682
## iter   4 value -2.309170
## iter   5 value -2.310360
## iter   6 value -2.311251
## iter   7 value -2.311636
## iter   8 value -2.311648
## iter   9 value -2.311649
## iter   9 value -2.311649
## iter   9 value -2.311649
## final  value -2.311649 
## converged
## initial  value -2.310187 
## iter   2 value -2.310197
## iter   3 value -2.310199
## iter   4 value -2.310201
## iter   5 value -2.310202
## iter   5 value -2.310202
## iter   5 value -2.310202
## final  value -2.310202 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##           ma1      ma2  constant
##       -0.3984  -0.2173    0.0072
## s.e.   0.0808   0.0768    0.0033
## 
## sigma^2 estimated as 0.00982:  log likelihood = 120.32,  aic = -232.64
## 
## $degrees_of_freedom
## [1] 132
## 
## $ttable
##          Estimate     SE t.value p.value
## ma1       -0.3984 0.0808 -4.9313  0.0000
## ma2       -0.2173 0.0768 -2.8303  0.0054
## constant   0.0072 0.0033  2.1463  0.0337
## 
## $AIC
## [1] -3.579224
## 
## $AICc
## [1] -3.562273
## 
## $BIC
## [1] -4.514974
# Forecast data 35 years into the future
sarima.for(globtemp, n.ahead = 35, p = 0, d = 1, q = 2) 

## $pred
## Time Series:
## Start = 2016 
## End = 2050 
## Frequency = 1 
##  [1] 0.7995567 0.7745381 0.7816919 0.7888457 0.7959996 0.8031534 0.8103072
##  [8] 0.8174611 0.8246149 0.8317688 0.8389226 0.8460764 0.8532303 0.8603841
## [15] 0.8675379 0.8746918 0.8818456 0.8889995 0.8961533 0.9033071 0.9104610
## [22] 0.9176148 0.9247687 0.9319225 0.9390763 0.9462302 0.9533840 0.9605378
## [29] 0.9676917 0.9748455 0.9819994 0.9891532 0.9963070 1.0034609 1.0106147
## 
## $se
## Time Series:
## Start = 2016 
## End = 2050 
## Frequency = 1 
##  [1] 0.09909556 0.11564576 0.12175580 0.12757353 0.13313729 0.13847769
##  [7] 0.14361964 0.14858376 0.15338730 0.15804492 0.16256915 0.16697084
## [13] 0.17125943 0.17544322 0.17952954 0.18352490 0.18743511 0.19126540
## [19] 0.19502047 0.19870459 0.20232164 0.20587515 0.20936836 0.21280424
## [25] 0.21618551 0.21951471 0.22279416 0.22602604 0.22921235 0.23235497
## [31] 0.23545565 0.23851603 0.24153763 0.24452190 0.24747019

Seasonal ARIMA

We will learn how to fit and forecast seasonal time series data using seasonal ARIMA models. This is accomplished using what you learned in the previous chapters and by learning how to extend the R time series commands available in the stats and astsa packages.

Data Analysis - Unemployment

# Plot unemp 
plot(unemp)

# Difference your data and plot it
d_unemp <- diff(unemp)
plot(d_unemp)

# Seasonally difference d_unemp and plot it
dd_unemp <- diff(d_unemp, lag = 12)
plot(dd_unemp)

# Plot P/ACF pair of fully differenced data to lag 60
acf2(dd_unemp)

##         ACF  PACF
##  [1,]  0.21  0.21
##  [2,]  0.33  0.29
##  [3,]  0.15  0.05
##  [4,]  0.17  0.05
##  [5,]  0.10  0.01
##  [6,]  0.06 -0.02
##  [7,] -0.06 -0.12
##  [8,] -0.02 -0.03
##  [9,] -0.09 -0.05
## [10,] -0.17 -0.15
## [11,] -0.08  0.02
## [12,] -0.48 -0.43
## [13,] -0.18 -0.02
## [14,] -0.16  0.15
## [15,] -0.11  0.03
## [16,] -0.15 -0.04
## [17,] -0.09 -0.01
## [18,] -0.09  0.00
## [19,]  0.03  0.01
## [20,] -0.01  0.01
## [21,]  0.02 -0.01
## [22,] -0.02 -0.16
## [23,]  0.01  0.01
## [24,] -0.02 -0.27
## [25,]  0.09  0.05
## [26,] -0.05 -0.01
## [27,] -0.01 -0.05
## [28,]  0.03  0.05
## [29,]  0.08  0.09
## [30,]  0.01 -0.04
## [31,]  0.03  0.02
## [32,] -0.05 -0.07
## [33,]  0.01 -0.01
## [34,]  0.02 -0.08
## [35,] -0.06 -0.08
## [36,] -0.02 -0.23
## [37,] -0.12 -0.08
## [38,]  0.01  0.06
## [39,] -0.03 -0.07
## [40,] -0.03 -0.01
## [41,] -0.10  0.03
## [42,] -0.02 -0.03
## [43,] -0.13 -0.11
## [44,]  0.00 -0.04
## [45,] -0.06  0.01
## [46,]  0.01  0.00
## [47,]  0.02 -0.03
## [48,]  0.11 -0.04
# Plot P/ACF pair of fully differenced data to lag 60
dd_unemp <- diff(diff(unemp), lag = 12)


# Fit an appropriate model

acf2(dd_unemp, max.lag = 60)

##         ACF  PACF
##  [1,]  0.21  0.21
##  [2,]  0.33  0.29
##  [3,]  0.15  0.05
##  [4,]  0.17  0.05
##  [5,]  0.10  0.01
##  [6,]  0.06 -0.02
##  [7,] -0.06 -0.12
##  [8,] -0.02 -0.03
##  [9,] -0.09 -0.05
## [10,] -0.17 -0.15
## [11,] -0.08  0.02
## [12,] -0.48 -0.43
## [13,] -0.18 -0.02
## [14,] -0.16  0.15
## [15,] -0.11  0.03
## [16,] -0.15 -0.04
## [17,] -0.09 -0.01
## [18,] -0.09  0.00
## [19,]  0.03  0.01
## [20,] -0.01  0.01
## [21,]  0.02 -0.01
## [22,] -0.02 -0.16
## [23,]  0.01  0.01
## [24,] -0.02 -0.27
## [25,]  0.09  0.05
## [26,] -0.05 -0.01
## [27,] -0.01 -0.05
## [28,]  0.03  0.05
## [29,]  0.08  0.09
## [30,]  0.01 -0.04
## [31,]  0.03  0.02
## [32,] -0.05 -0.07
## [33,]  0.01 -0.01
## [34,]  0.02 -0.08
## [35,] -0.06 -0.08
## [36,] -0.02 -0.23
## [37,] -0.12 -0.08
## [38,]  0.01  0.06
## [39,] -0.03 -0.07
## [40,] -0.03 -0.01
## [41,] -0.10  0.03
## [42,] -0.02 -0.03
## [43,] -0.13 -0.11
## [44,]  0.00 -0.04
## [45,] -0.06  0.01
## [46,]  0.01  0.00
## [47,]  0.02 -0.03
## [48,]  0.11 -0.04
## [49,]  0.13  0.02
## [50,]  0.10  0.03
## [51,]  0.07 -0.05
## [52,]  0.10  0.02
## [53,]  0.12  0.02
## [54,]  0.06 -0.08
## [55,]  0.14  0.00
## [56,]  0.05 -0.03
## [57,]  0.04 -0.07
## [58,]  0.04  0.05
## [59,]  0.07  0.04
## [60,] -0.03 -0.04
# Fit an appropriate model
sarima(unemp, p = 2, d = 1, q = 0, P = 0, D = 1, Q = 1, S = 12)
## initial  value 3.340809 
## iter   2 value 3.105512
## iter   3 value 3.086631
## iter   4 value 3.079778
## iter   5 value 3.069447
## iter   6 value 3.067659
## iter   7 value 3.067426
## iter   8 value 3.067418
## iter   8 value 3.067418
## final  value 3.067418 
## converged
## initial  value 3.065481 
## iter   2 value 3.065478
## iter   3 value 3.065477
## iter   3 value 3.065477
## iter   3 value 3.065477
## final  value 3.065477 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), include.mean = !no.constant, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1     ar2     sma1
##       0.1351  0.2464  -0.6953
## s.e.  0.0513  0.0515   0.0381
## 
## sigma^2 estimated as 449.6:  log likelihood = -1609.91,  aic = 3227.81
## 
## $degrees_of_freedom
## [1] 356
## 
## $ttable
##      Estimate     SE  t.value p.value
## ar1    0.1351 0.0513   2.6326  0.0088
## ar2    0.2464 0.0515   4.7795  0.0000
## sma1  -0.6953 0.0381 -18.2362  0.0000
## 
## $AIC
## [1] 7.12457
## 
## $AICc
## [1] 7.130239
## 
## $BIC
## [1] 6.156174

Data Analysis - Commodity Prices

# Plot differenced chicken
plot(diff(chicken))

# Plot P/ACF pair of differenced data to lag 60
acf2(diff(chicken), max.lag=60)

##         ACF  PACF
##  [1,]  0.72  0.72
##  [2,]  0.39 -0.29
##  [3,]  0.09 -0.14
##  [4,] -0.07  0.03
##  [5,] -0.16 -0.10
##  [6,] -0.20 -0.06
##  [7,] -0.27 -0.19
##  [8,] -0.23  0.12
##  [9,] -0.11  0.10
## [10,]  0.09  0.16
## [11,]  0.26  0.09
## [12,]  0.33  0.00
## [13,]  0.20 -0.22
## [14,]  0.07  0.03
## [15,] -0.03  0.03
## [16,] -0.10 -0.11
## [17,] -0.19 -0.09
## [18,] -0.25  0.01
## [19,] -0.29 -0.03
## [20,] -0.20  0.07
## [21,] -0.08 -0.04
## [22,]  0.08  0.06
## [23,]  0.16 -0.05
## [24,]  0.18  0.02
## [25,]  0.08 -0.14
## [26,] -0.06 -0.19
## [27,] -0.21 -0.13
## [28,] -0.31 -0.06
## [29,] -0.40 -0.08
## [30,] -0.40 -0.05
## [31,] -0.33  0.01
## [32,] -0.18  0.03
## [33,]  0.02  0.10
## [34,]  0.20  0.02
## [35,]  0.30 -0.01
## [36,]  0.35  0.09
## [37,]  0.26 -0.12
## [38,]  0.13  0.01
## [39,] -0.02 -0.01
## [40,] -0.14 -0.05
## [41,] -0.23  0.02
## [42,] -0.21  0.12
## [43,] -0.18 -0.05
## [44,] -0.11 -0.13
## [45,] -0.03 -0.07
## [46,]  0.08  0.01
## [47,]  0.21  0.14
## [48,]  0.33  0.05
## [49,]  0.26 -0.20
## [50,]  0.12 -0.01
## [51,] -0.01  0.07
## [52,] -0.11 -0.04
## [53,] -0.13  0.02
## [54,] -0.09  0.00
## [55,] -0.09 -0.08
## [56,] -0.06  0.03
## [57,]  0.03  0.04
## [58,]  0.17  0.00
## [59,]  0.29  0.01
## [60,]  0.32  0.03
# Fit ARIMA(2,1,0) to chicken - not so good
sarima(chicken, p = 2, d = 1, q = 0)
## initial  value 0.001863 
## iter   2 value -0.156034
## iter   3 value -0.359181
## iter   4 value -0.424164
## iter   5 value -0.430212
## iter   6 value -0.432744
## iter   7 value -0.432747
## iter   8 value -0.432749
## iter   9 value -0.432749
## iter  10 value -0.432751
## iter  11 value -0.432752
## iter  12 value -0.432752
## iter  13 value -0.432752
## iter  13 value -0.432752
## iter  13 value -0.432752
## final  value -0.432752 
## converged
## initial  value -0.420883 
## iter   2 value -0.420934
## iter   3 value -0.420936
## iter   4 value -0.420937
## iter   5 value -0.420937
## iter   6 value -0.420937
## iter   6 value -0.420937
## iter   6 value -0.420937
## final  value -0.420937 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##          ar1      ar2  constant
##       0.9494  -0.3069    0.2632
## s.e.  0.0717   0.0718    0.1362
## 
## sigma^2 estimated as 0.4286:  log likelihood = -178.64,  aic = 365.28
## 
## $degrees_of_freedom
## [1] 176
## 
## $ttable
##          Estimate     SE t.value p.value
## ar1        0.9494 0.0717 13.2339  0.0000
## ar2       -0.3069 0.0718 -4.2723  0.0000
## constant   0.2632 0.1362  1.9328  0.0549
## 
## $AIC
## [1] 0.1861622
## 
## $AICc
## [1] 0.1985432
## 
## $BIC
## [1] -0.7606218
# Fit SARIMA(2,1,0,1,0,0,12) to chicken - that works
sarima(chicken, p = 2, d = 1, q = 0, P = 1, D = 0, Q = 0, S = 12)
## initial  value 0.015039 
## iter   2 value -0.226398
## iter   3 value -0.412955
## iter   4 value -0.460882
## iter   5 value -0.470787
## iter   6 value -0.471082
## iter   7 value -0.471088
## iter   8 value -0.471090
## iter   9 value -0.471092
## iter  10 value -0.471095
## iter  11 value -0.471095
## iter  12 value -0.471096
## iter  13 value -0.471096
## iter  14 value -0.471096
## iter  15 value -0.471097
## iter  16 value -0.471097
## iter  16 value -0.471097
## iter  16 value -0.471097
## final  value -0.471097 
## converged
## initial  value -0.473585 
## iter   2 value -0.473664
## iter   3 value -0.473721
## iter   4 value -0.473823
## iter   5 value -0.473871
## iter   6 value -0.473885
## iter   7 value -0.473886
## iter   8 value -0.473886
## iter   8 value -0.473886
## iter   8 value -0.473886
## final  value -0.473886 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##          ar1      ar2    sar1  constant
##       0.9154  -0.2494  0.3237    0.2353
## s.e.  0.0733   0.0739  0.0715    0.1973
## 
## sigma^2 estimated as 0.3828:  log likelihood = -169.16,  aic = 348.33
## 
## $degrees_of_freedom
## [1] 175
## 
## $ttable
##          Estimate     SE t.value p.value
## ar1        0.9154 0.0733 12.4955  0.0000
## ar2       -0.2494 0.0739 -3.3728  0.0009
## sar1       0.3237 0.0715  4.5238  0.0000
## constant   0.2353 0.1973  1.1923  0.2347
## 
## $AIC
## [1] 0.0842377
## 
## $AICc
## [1] 0.09726452
## 
## $BIC
## [1] -0.8448077

Data Analysis - Birth Rate

# Plot P/ACF to lag 60 of differenced data
d_birth <- diff(birth)
acf2(d_birth, max.lag=60)

##         ACF  PACF
##  [1,] -0.32 -0.32
##  [2,]  0.16  0.06
##  [3,] -0.08 -0.01
##  [4,] -0.19 -0.25
##  [5,]  0.09 -0.03
##  [6,] -0.28 -0.26
##  [7,]  0.06 -0.17
##  [8,] -0.19 -0.29
##  [9,] -0.05 -0.35
## [10,]  0.17 -0.16
## [11,] -0.26 -0.59
## [12,]  0.82  0.57
## [13,] -0.28  0.13
## [14,]  0.17  0.11
## [15,] -0.07  0.13
## [16,] -0.18  0.09
## [17,]  0.08  0.00
## [18,] -0.28  0.00
## [19,]  0.07  0.05
## [20,] -0.18  0.04
## [21,] -0.05 -0.07
## [22,]  0.16 -0.10
## [23,] -0.24 -0.20
## [24,]  0.78  0.19
## [25,] -0.27  0.01
## [26,]  0.19  0.05
## [27,] -0.08  0.07
## [28,] -0.17  0.07
## [29,]  0.07 -0.02
## [30,] -0.29 -0.06
## [31,]  0.07 -0.02
## [32,] -0.15  0.09
## [33,] -0.04  0.03
## [34,]  0.14 -0.06
## [35,] -0.24 -0.16
## [36,]  0.75  0.03
## [37,] -0.23  0.08
## [38,]  0.16 -0.10
## [39,] -0.08 -0.03
## [40,] -0.15  0.07
## [41,]  0.05 -0.04
## [42,] -0.25  0.06
## [43,]  0.06  0.04
## [44,] -0.18 -0.07
## [45,] -0.03 -0.06
## [46,]  0.15  0.02
## [47,] -0.22 -0.04
## [48,]  0.72  0.10
## [49,] -0.24  0.01
## [50,]  0.16  0.00
## [51,] -0.08 -0.03
## [52,] -0.13  0.04
## [53,]  0.05  0.03
## [54,] -0.26  0.00
## [55,]  0.05 -0.01
## [56,] -0.17  0.01
## [57,] -0.02  0.03
## [58,]  0.15  0.04
## [59,] -0.23 -0.09
## [60,]  0.70  0.04
# Plot P/ACF to lag 60 of seasonal differenced data
dd_birth <- diff(d_birth, lag = 12)
acf2(dd_birth, max.lag=60)

##         ACF  PACF
##  [1,] -0.30 -0.30
##  [2,] -0.09 -0.20
##  [3,] -0.09 -0.21
##  [4,]  0.00 -0.14
##  [5,]  0.07 -0.03
##  [6,]  0.03  0.02
##  [7,] -0.07 -0.06
##  [8,] -0.04 -0.08
##  [9,]  0.11  0.06
## [10,]  0.04  0.08
## [11,]  0.13  0.23
## [12,] -0.43 -0.32
## [13,]  0.14 -0.06
## [14,] -0.01 -0.13
## [15,]  0.03 -0.13
## [16,]  0.01 -0.11
## [17,]  0.02  0.02
## [18,]  0.00  0.06
## [19,]  0.03  0.04
## [20,] -0.07 -0.10
## [21,] -0.01  0.02
## [22,]  0.00  0.00
## [23,]  0.06  0.17
## [24,] -0.01 -0.13
## [25,] -0.12 -0.14
## [26,]  0.17  0.07
## [27,] -0.04 -0.04
## [28,]  0.03 -0.02
## [29,] -0.05  0.02
## [30,] -0.09 -0.06
## [31,] -0.01 -0.07
## [32,]  0.19  0.05
## [33,] -0.03  0.07
## [34,] -0.09 -0.06
## [35,] -0.02  0.05
## [36,] -0.04 -0.16
## [37,]  0.17 -0.01
## [38,] -0.14 -0.04
## [39,]  0.03 -0.01
## [40,] -0.05 -0.03
## [41,]  0.03 -0.01
## [42,]  0.10  0.01
## [43,]  0.00  0.00
## [44,] -0.10  0.03
## [45,] -0.03 -0.02
## [46,]  0.06 -0.07
## [47,]  0.02  0.05
## [48,]  0.01 -0.11
## [49,] -0.01  0.05
## [50,]  0.06  0.06
## [51,] -0.08 -0.03
## [52,]  0.03 -0.03
## [53,]  0.01  0.04
## [54,] -0.02  0.02
## [55,] -0.01 -0.04
## [56,]  0.00 -0.01
## [57,] -0.07 -0.13
## [58,]  0.17  0.07
## [59,] -0.04  0.07
## [60,] -0.01 -0.05
# Fit SARIMA(0,1,1)x(0,1,1)_12. What happens?
sarima(birth, p = 0, d = 1, q = 1, P = 0, D = 1, Q = 1, S = 12)
## initial  value 2.219164 
## iter   2 value 2.013310
## iter   3 value 1.988107
## iter   4 value 1.980026
## iter   5 value 1.967594
## iter   6 value 1.965384
## iter   7 value 1.965049
## iter   8 value 1.964993
## iter   9 value 1.964992
## iter   9 value 1.964992
## iter   9 value 1.964992
## final  value 1.964992 
## converged
## initial  value 1.951264 
## iter   2 value 1.945867
## iter   3 value 1.945729
## iter   4 value 1.945723
## iter   5 value 1.945723
## iter   5 value 1.945723
## iter   5 value 1.945723
## final  value 1.945723 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), include.mean = !no.constant, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##           ma1     sma1
##       -0.4734  -0.7861
## s.e.   0.0598   0.0451
## 
## sigma^2 estimated as 47.4:  log likelihood = -1211.28,  aic = 2428.56
## 
## $degrees_of_freedom
## [1] 358
## 
## $ttable
##      Estimate     SE  t.value p.value
## ma1   -0.4734 0.0598  -7.9097       0
## sma1  -0.7861 0.0451 -17.4227       0
## 
## $AIC
## [1] 4.869388
## 
## $AICc
## [1] 4.874924
## 
## $BIC
## [1] 3.890415
# Add AR term and conclude

sarima(birth, p = 1, d = 1, q = 1, P = 0, D = 1, Q = 1, S = 12)
## initial  value 2.218186 
## iter   2 value 2.032584
## iter   3 value 1.982464
## iter   4 value 1.975643
## iter   5 value 1.971721
## iter   6 value 1.967284
## iter   7 value 1.963840
## iter   8 value 1.961106
## iter   9 value 1.960849
## iter  10 value 1.960692
## iter  11 value 1.960683
## iter  12 value 1.960675
## iter  13 value 1.960672
## iter  13 value 1.960672
## iter  13 value 1.960672
## final  value 1.960672 
## converged
## initial  value 1.940459 
## iter   2 value 1.934425
## iter   3 value 1.932752
## iter   4 value 1.931750
## iter   5 value 1.931074
## iter   6 value 1.930882
## iter   7 value 1.930860
## iter   8 value 1.930859
## iter   8 value 1.930859
## final  value 1.930859 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), include.mean = !no.constant, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1      ma1     sma1
##       0.3038  -0.7006  -0.8000
## s.e.  0.0865   0.0604   0.0441
## 
## sigma^2 estimated as 45.91:  log likelihood = -1205.93,  aic = 2419.85
## 
## $degrees_of_freedom
## [1] 357
## 
## $ttable
##      Estimate     SE  t.value p.value
## ar1    0.3038 0.0865   3.5104   5e-04
## ma1   -0.7006 0.0604 -11.5984   0e+00
## sma1  -0.8000 0.0441 -18.1302   0e+00
## 
## $AIC
## [1] 4.842869
## 
## $AICc
## [1] 4.848523
## 
## $BIC
## [1] 3.87441

Forecasting Monthly Unemployment

# Fit your previous model to unemp and check the diagnostics
sarima(unemp, p = 2, d = 1, q = 0, P = 0, D = 1, Q = 1, S = 12)
## initial  value 3.340809 
## iter   2 value 3.105512
## iter   3 value 3.086631
## iter   4 value 3.079778
## iter   5 value 3.069447
## iter   6 value 3.067659
## iter   7 value 3.067426
## iter   8 value 3.067418
## iter   8 value 3.067418
## final  value 3.067418 
## converged
## initial  value 3.065481 
## iter   2 value 3.065478
## iter   3 value 3.065477
## iter   3 value 3.065477
## iter   3 value 3.065477
## final  value 3.065477 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), include.mean = !no.constant, optim.control = list(trace = trc, 
##     REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1     ar2     sma1
##       0.1351  0.2464  -0.6953
## s.e.  0.0513  0.0515   0.0381
## 
## sigma^2 estimated as 449.6:  log likelihood = -1609.91,  aic = 3227.81
## 
## $degrees_of_freedom
## [1] 356
## 
## $ttable
##      Estimate     SE  t.value p.value
## ar1    0.1351 0.0513   2.6326  0.0088
## ar2    0.2464 0.0515   4.7795  0.0000
## sma1  -0.6953 0.0381 -18.2362  0.0000
## 
## $AIC
## [1] 7.12457
## 
## $AICc
## [1] 7.130239
## 
## $BIC
## [1] 6.156174
# Forecast the data 3 years into the future

sarima.for(unemp,n.ahead = 36, p = 2, d = 1, q = 0, P = 0, D = 1, Q = 1, S = 12)

## $pred
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 1979 676.4664 685.1172 653.2388 585.6939 553.8813 664.4072 647.0657
## 1980 683.3045 687.7649 654.8658 586.1507 553.9285 664.1108 646.6220
## 1981 682.6406 687.0977 654.1968 585.4806 553.2579 663.4398 645.9508
##           Aug      Sep      Oct      Nov      Dec
## 1979 611.0828 594.6414 569.3997 587.5801 581.1833
## 1980 610.5345 594.0427 568.7684 586.9320 580.5249
## 1981 609.8632 593.3713 568.0970 586.2606 579.8535
## 
## $se
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 1979  21.20465  32.07710  43.70167  53.66329  62.85364  71.12881  78.73590
## 1980 116.99599 124.17344 131.51281 138.60466 145.49706 152.12863 158.52302
## 1981 194.25167 201.10648 208.17066 215.11503 221.96039 228.64285 235.16874
##            Aug       Sep       Oct       Nov       Dec
## 1979  85.75096  92.28663  98.41329 104.19488 109.67935
## 1980 164.68623 170.63839 176.39520 181.97333 187.38718
## 1981 241.53258 247.74268 253.80549 259.72970 265.52323

How Hard is it to Forecast Commodity Prices?

# Fit the chicken model again and check diagnostics
sarima(chicken, p = 2, d = 1, q = 0, P = 1, D = 0, Q = 0, S = 12)
## initial  value 0.015039 
## iter   2 value -0.226398
## iter   3 value -0.412955
## iter   4 value -0.460882
## iter   5 value -0.470787
## iter   6 value -0.471082
## iter   7 value -0.471088
## iter   8 value -0.471090
## iter   9 value -0.471092
## iter  10 value -0.471095
## iter  11 value -0.471095
## iter  12 value -0.471096
## iter  13 value -0.471096
## iter  14 value -0.471096
## iter  15 value -0.471097
## iter  16 value -0.471097
## iter  16 value -0.471097
## iter  16 value -0.471097
## final  value -0.471097 
## converged
## initial  value -0.473585 
## iter   2 value -0.473664
## iter   3 value -0.473721
## iter   4 value -0.473823
## iter   5 value -0.473871
## iter   6 value -0.473885
## iter   7 value -0.473886
## iter   8 value -0.473886
## iter   8 value -0.473886
## iter   8 value -0.473886
## final  value -0.473886 
## converged

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, optim.control = list(trace = trc, REPORT = 1, 
##     reltol = tol))
## 
## Coefficients:
##          ar1      ar2    sar1  constant
##       0.9154  -0.2494  0.3237    0.2353
## s.e.  0.0733   0.0739  0.0715    0.1973
## 
## sigma^2 estimated as 0.3828:  log likelihood = -169.16,  aic = 348.33
## 
## $degrees_of_freedom
## [1] 175
## 
## $ttable
##          Estimate     SE t.value p.value
## ar1        0.9154 0.0733 12.4955  0.0000
## ar2       -0.2494 0.0739 -3.3728  0.0009
## sar1       0.3237 0.0715  4.5238  0.0000
## constant   0.2353 0.1973  1.1923  0.2347
## 
## $AIC
## [1] 0.0842377
## 
## $AICc
## [1] 0.09726452
## 
## $BIC
## [1] -0.8448077
# Forecast the chicken data 5 years into the future
sarima.for(chicken,n.ahead = 60,  p = 2, d = 1, q = 0, P = 1, D = 0, Q = 0, S = 12)

## $pred
##           Jan      Feb      Mar      Apr      May      Jun      Jul
## 2016                                                               
## 2017 110.5358 110.5612 110.5480 110.7055 111.0047 111.1189 111.1552
## 2018 111.8108 111.9782 112.1330 112.3431 112.5991 112.7952 112.9661
## 2019 114.1331 114.3464 114.5556 114.7827 115.0247 115.2473 115.4617
## 2020 116.7942 117.0224 117.2492 117.4819 117.7193 117.9505 118.1790
## 2021 119.5651 119.7980 120.0306 120.2650 120.5010 120.7350 120.9681
##           Aug      Sep      Oct      Nov      Dec
## 2016 111.0907 110.8740 110.6853 110.5045 110.5527
## 2017 111.1948 111.2838 111.3819 111.4825 111.6572
## 2018 113.1380 113.3260 113.5168 113.7085 113.9242
## 2019 115.6765 115.8965 116.1174 116.3386 116.5675
## 2020 118.4077 118.6380 118.8686 119.0993 119.3326
## 2021                                             
## 
## $se
##             Jan        Feb        Mar        Apr        May        Jun
## 2016                                                                  
## 2017  3.7414959  4.1793190  4.5747009  4.9373266  5.2742129  5.5903499
## 2018  8.2010253  8.5605811  8.9054714  9.2372195  9.5572539  9.8667955
## 2019 12.0038164 12.2921541 12.5738417 12.8492868 13.1188976 13.3830477
## 2020 15.1557253 15.3959082 15.6323906 15.8653300 16.0948844 16.3212022
## 2021 17.8397890 18.0473081 18.2524651 18.4553364 18.6559977 18.8545213
##             Jul        Aug        Sep        Oct        Nov        Dec
## 2016             0.6187194  1.3368594  2.0462419  2.6867986  3.2486625
## 2017  5.8893133  6.2367345  6.6253573  7.0309771  7.4344077  7.8255932
## 2018 10.1668604 10.4736807 10.7857727 11.0980056 11.4063211 11.7085266
## 2019 13.6420693 13.9002670 14.1573839 14.4122197 14.6638269 14.9117124
## 2020 16.5444204 16.7657634 16.9852163 17.2025022 17.4174076 17.6298379
## 2021 19.0509752