Visit my website for more like this! I would love to hear your feedback (seriously).

library(astsa, quietly=TRUE, warn.conflicts=FALSE)
require(knitr)
## Loading required package: knitr
library(ggplot2)

Data Sources:

Heavily borrowed from:

Reading Time Series

#Read in the data and skip the first three lines.
kings<-scan('http://robjhyndman.com/tsdldata/misc/kings.dat', skip=3)
kings
##  [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43 69
## [24] 59 48 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

This dataset records the age at death of 42 successive kings of England.

Convert the data into a time series

It is best to convert the data into an R time series object after you have successfully loaded the data.

kings <- ts(kings)
kings
## Time Series:
## Start = 1 
## End = 42 
## Frequency = 1 
##  [1] 60 43 67 50 56 42 50 65 68 43 65 34 47 34 49 41 13 35 53 56 16 43 69
## [24] 59 48 59 86 55 68 51 33 49 67 77 81 67 71 81 68 70 77 56

However, it is common to come across time series collected at regular intervals shorter than the one-year spacing of the kings dataset, for example monthly, weekly, or quarterly. In these cases we can specify the number of observations per year using the frequency parameter of the ts() function. For monthly data, we set frequency = 12. We can also specify the first year the data were collected and the first interval within that year using the start parameter. For example, the third quarter of 1909 would be `start = c(1909, 3)`.
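For example, a hypothetical quarterly series beginning in the third quarter of 1909 could be created like this (the values below are made up purely for illustration):

# Hypothetical quarterly values, for illustration only
x <- c(4.2, 4.8, 5.1, 4.5, 4.9, 5.6, 5.8, 5.0)
xTS <- ts(x, frequency = 4, start = c(1909, 3))  # quarterly data starting in 1909 Q3
xTS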

Next we load a dataset of the number of births per month in New York City, from January 1946 to December 1959.

births <- scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")

births <- ts(births, frequency = 12, start = c(1946, 1))
births
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov
## 1946 26.66 23.60 26.93 24.74 25.81 24.36 24.48 23.90 23.18 23.23 21.67
## 1947 21.44 21.09 23.71 21.67 21.75 20.76 23.48 23.82 23.11 23.11 21.76
## 1948 21.94 20.04 23.59 21.67 22.22 22.12 23.95 23.50 22.24 23.14 21.06
## 1949 21.55 20.00 22.42 20.61 21.76 22.87 24.10 23.75 23.26 22.91 21.52
## 1950 22.60 20.89 24.68 23.67 25.32 23.58 24.67 24.45 24.12 24.25 22.08
## 1951 23.29 23.05 25.08 24.04 24.43 24.67 26.45 25.62 25.01 25.11 22.96
## 1952 23.80 22.27 24.77 22.65 23.99 24.74 26.28 25.82 25.21 25.20 23.16
## 1953 24.36 22.64 25.57 24.06 25.43 24.64 27.01 26.61 26.27 26.46 25.25
## 1954 24.66 23.30 26.98 26.20 27.21 26.12 26.71 26.88 26.15 26.38 24.71
## 1955 24.99 24.24 26.72 23.48 24.77 26.22 28.36 28.60 27.91 27.78 25.69
## 1956 26.22 24.22 27.91 26.98 28.53 27.14 28.98 28.17 28.06 29.14 26.29
## 1957 26.59 24.85 27.54 26.90 28.88 27.39 28.07 28.14 29.05 28.48 26.63
## 1958 27.13 24.92 28.96 26.59 27.93 28.01 29.23 28.76 28.41 27.95 25.91
## 1959 26.08 25.29 27.66 25.95 26.40 25.57 28.86 30.00 29.26 29.01 26.99
##        Dec
## 1946 21.87
## 1947 22.07
## 1948 21.57
## 1949 22.02
## 1950 22.99
## 1951 23.98
## 1952 24.71
## 1953 25.18
## 1954 25.69
## 1955 26.88
## 1956 26.99
## 1957 27.73
## 1958 26.62
## 1959 27.90

Plotting Time Series

The next step in any time series analysis is to plot the data.

plot.ts(kings)


At this point we might guess that this time series can be described using an additive model, since the random fluctuations in the data are roughly constant in size over time.

Let’s look at the births data.

plot.ts(births)


We can see from this time series that there is certainly some seasonal variation in the number of births per month; there is a peak every summer, and a trough every winter. Again, it seems like this series could be described using an additive model, as the seasonal fluctuations are roughly constant in size over time and do not seem to depend on the level of the time series, and the random fluctuations also seem constant over time.

How about a time series of monthly sales for a souvenir shop in a beach resort town?

gift <- scan("http://robjhyndman.com/tsdldata/data/fancy.dat")
gift<- ts(gift, frequency=12, start=c(1987,1))

plot.ts(gift)


In this case, an additive model is not appropriate, since the size of the seasonal and random fluctuations changes over time and with the level of the time series. It is then appropriate to transform the time series so that we can model the data with a classic additive model.

logGift <- log(gift)
plot.ts(logGift)


Decomposing Time Series

Decomposing a time series means separating it into its constituent components, which are typically a trend component and a random component, plus a seasonal component if the data are seasonal.

Decomposing non-Seasonal Data

Recall that non-seasonal time series consist of a trend component and a random component. Decomposing the time series involves trying to separate the series into these individual components.

One way to do this is using some smoothing method, such as a simple moving average.

The SMA() function in the TTR package can be used to smooth time series data with a simple moving average. The order (span) of the moving average is given by the argument n. To calculate a moving average of order 5, we set n = 5.
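If you prefer not to load an extra package, a trailing moving average of the same kind can be computed with base R's filter() function; the following is only a sketch of the equivalence, not code from the original tutorial.

# Trailing (right-aligned) moving average of order 5 with base R,
# roughly equivalent to TTR::SMA(kings, n = 5)
kingsMA5 <- stats::filter(kings, rep(1/5, 5), sides = 1)
plot.ts(kingsMA5)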

Let’s try to see a clearer picture of the Kings dataset trend component by applying an order 3 moving average.

library(TTR)
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
kingsSMA3 <- SMA(kings, n=3)
plot.ts(kingsSMA3)


It seems like there are still some random fluctuations in the data, so we might want to try a larger smoothing window.

library(TTR)

kingsSMA8 <- SMA(kings, n=8)
plot.ts(kingsSMA8)


This is better; we can see that the age at death of the English kings declined from ~55 years to ~40 years for a brief period, followed by a rapid increase over the next 20 kings to ages in the 70s.

Decomposing Seasonal Data

A seasonal time series, in addition to the trend and random components, also has a seasonal component. Decomposing a seasonal time series means separating the time series into these three components.

In R we can use the decompose() function to estimate the three components of the time series.

To estimate the trend, seasonal, and random components of the New York births dataset we can …

birthsComp <- decompose(births)
plot(birthsComp)


Seasonally Adjusting

If you have a seasonal time series, you can seasonally adjust the series by estimating the seasonal component, and subtracting it from the original time series.

birthsSeasonAdj <- births - birthsComp$seasonal
plot(birthsSeasonAdj)


We can now see that the time series simply consists of the trend and random components.

Forecasts using Exponential Smoothing

Exponential smoothing is a common method for making short-term forecasts in time series data.

Simple Exponential Smoothing

If you have a time series with a constant level and no seasonality, you can use simple exponential smoothing for short-term forecasts.

This method provides a way of estimating the level at the current time point. Smoothing is controlled by the parameter alpha, which governs the estimate of the level at the current time point. The value of alpha lies between 0 and 1. Values of alpha close to 0 mean that little weight is placed on the most recent observations when making forecasts of future values.
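To make the role of alpha concrete, here is a minimal hand-rolled sketch of the simple exponential smoothing recursion (the function and variable names are ours, not from the original post):

# Simple exponential smoothing by hand (illustrative sketch):
#   level[t] = alpha * x[t] + (1 - alpha) * level[t - 1]
ses_by_hand <- function(x, alpha) {
  level <- numeric(length(x))
  level[1] <- x[1]                          # initialise at the first observation
  for (t in 2:length(x)) {
    level[t] <- alpha * x[t] + (1 - alpha) * level[t - 1]
  }
  level                                     # level[t] is the forecast of x[t + 1]
}
# ses_by_hand(c(23, 26, 24, 27, 25), alpha = 0.2)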

Example 1

Total annual rainfall in inches for London, from 1813 - 1912.

rain <- ts(scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat",skip=1), start=c(1813))
plot.ts(rain)


You can see from the plot that there is a roughly constant level and variance over time, so we can describe this series with an additive model and make forecasts using simple exponential smoothing.

In R we can use the HoltWinters() function. For simple exponential smoothing, we need to set the parameters beta = FALSE and gamma = FALSE.

rainF <- HoltWinters(rain, beta=FALSE, gamma = FALSE)
rainF
## Holt-Winters exponential smoothing without trend and without seasonal component.
## 
## Call:
## HoltWinters(x = rain, beta = FALSE, gamma = FALSE)
## 
## Smoothing parameters:
##  alpha: 0.02412
##  beta : FALSE
##  gamma: FALSE
## 
## Coefficients:
##    [,1]
## a 24.68

The output tells us that the estimated value of the alpha parameter is about 0.024, which is very close to zero. This means that the forecasts are based on both recent and less recent observations, although somewhat more weight is placed on recent observations. By default, HoltWinters() makes in-sample forecasts for the time period covered by the original series. We can get the fitted values of these forecasts and plot them simply by calling plot() on the variable storing the model we fit above.

plot(rainF)


To test the accuracy, we can calculate the SSE for the in-sample forecast errors.

rainF$SSE
## [1] 1829

If we want to make predictions into the future (past 1912) we can utilize the forecast.HoltWinters() function from the R package forecast. We specify how many predictions into the future to make with the h parameter. For example, to make predictions for the next 8 years we would type:

library(forecast)
## Loading required package: timeDate
## This is forecast 5.4 
## 
## 
## Attaching package: 'forecast'
## 
## The following object is masked from 'package:astsa':
## 
##     gas
rainF8 <- forecast.HoltWinters(rainF, h=8)
plot.forecast(rainF8)


where the dark gray band is the 80% prediction interval, the light gray band is the 95% prediction interval, and the blue line shows the point forecasts.

To test the validity of this model we can again compute the SSE; however, we should also examine the correlations between the forecast errors. If correlation exists in the error terms, it is likely that the simple exponential smoothing forecasts could be improved upon by another technique.

acf(rainF8$residuals)


We can see that lag 3 is just about touching the significance bounds. To test whether there is significant evidence for non-zero correlations we can carry out a Ljung-Box test. In R we can use the Box.test() function.

Box.test(rainF8$residuals, lag=20, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  rainF8$residuals
## X-squared = 17.4, df = 20, p-value = 0.6268

Here we see that the p-value is ~0.6, so there is little evidence of non-zero autocorrelation in the forecast errors.

To be sure the predictive model cannot be improved upon, we can check whether the forecast errors are normally distributed with mean zero and constant variance.

plot.ts(rainF8$residuals)


While the mean is roughly constant, the variance does not seem to be obviously constant, but it is close enough.
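To eyeball normality of the forecast errors, we can also overlay a normal curve on a histogram of the residuals; this is a generic sketch rather than a helper used in the original tutorial.

# Histogram of the forecast errors with a fitted normal curve for comparison
res <- na.omit(rainF8$residuals)
hist(res, freq = FALSE, breaks = 10, main = "Forecast errors", xlab = "Residual")
curve(dnorm(x, mean = mean(res), sd = sd(res)), add = TRUE, col = "blue")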

Holt's Exponential Smoothing

If you have a time series that can be described using an additive model with a trend and no seasonality, you can use Holt's exponential smoothing to describe the series. The smoothing is controlled by two parameters: alpha, for the estimate of the level at the current time point, and beta, for the estimate of the slope b of the trend component at the current time point. Both parameters range from 0 to 1, and, as with simple smoothing, values close to 0 mean that little weight is placed on the most recent observations when making forecasts.
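For intuition, here is a simplified hand-rolled sketch of the level and slope updates Holt's method uses (a crude initialisation, not the actual HoltWinters() implementation):

# Holt's linear trend method, simplified:
#   level[t] = alpha * x[t] + (1 - alpha) * (level[t-1] + slope[t-1])
#   slope[t] = beta * (level[t] - level[t-1]) + (1 - beta) * slope[t-1]
holt_by_hand <- function(x, alpha, beta) {
  n <- length(x)
  level <- slope <- numeric(n)
  level[1] <- x[1]
  slope[1] <- x[2] - x[1]                   # crude initialisation of the slope
  for (t in 2:n) {
    level[t] <- alpha * x[t] + (1 - alpha) * (level[t - 1] + slope[t - 1])
    slope[t] <- beta * (level[t] - level[t - 1]) + (1 - beta) * slope[t - 1]
  }
  level[n] + slope[n]                       # one-step-ahead forecast
}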

Here we apply this model to a time series of the annual diameter of women's skirts at the hem, from 1866 to 1911.

skirts <- ts(scan("http://robjhyndman.com/tsdldata/roberts/skirts.dat",skip=5), start=c(1866))
plot.ts(skirts)


To make forecasts we will again use the HoltWinters() function. For Holt's exponential smoothing, we only need to set gamma = FALSE.

skirtsF <- HoltWinters(skirts, gamma=F)
skirtsF; skirtsF$SSE
## Holt-Winters exponential smoothing with trend and without seasonal component.
## 
## Call:
## HoltWinters(x = skirts, gamma = F)
## 
## Smoothing parameters:
##  alpha: 0.8383
##  beta : 1
##  gamma: FALSE
## 
## Coefficients:
##     [,1]
## a 529.31
## b   5.69
## [1] 16954

The high estimates of alpha and beta tell us that the estimates of the current level and slope are based mostly on recent observations. This makes sense, since the data change rapidly. To examine the fit visually, we can plot the forecasts against the original data.

plot(skirtsF)


# Forecast into the future
skirtsF19 <- forecast.HoltWinters(skirtsF, h=19)
plot.forecast(skirtsF19)


To validate the predictive model, we can check that the residuals are not correlated.

par(mfrow=c(3, 1))
acf(skirtsF19$residuals, lag.max=20)
Box.test(skirtsF19$residuals, lag=20, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  skirtsF19$residuals
## X-squared = 19.73, df = 20, p-value = 0.4749
plot.ts(skirtsF19$residuals)


Here, we see that the ACF shows significant correlation at lag 5. However, the Ljung-Box test indicates little evidence of non-zero autocorrelations over the first 20 lags, which suggests the autocorrelation at lag 5 could be expected by chance. Further, the residuals appear well behaved (roughly constant variance around zero), so we can probably conclude that this model is an adequate representation of the data.

Example 2

The beach-side gift shop dataset is an example of a time series with a trend component and seasonality.

giftLog <- log(gift) # take natural log
plot.ts(giftLog)


giftLogF <- HoltWinters(giftLog)
giftLogF
## Holt-Winters exponential smoothing with trend and additive seasonal component.
## 
## Call:
## HoltWinters(x = giftLog)
## 
## Smoothing parameters:
##  alpha: 0.4134
##  beta : 0
##  gamma: 0.9561
## 
## Coefficients:
##         [,1]
## a   10.37662
## b    0.02996
## s1  -0.80952
## s2  -0.60576
## s3   0.01103
## s4  -0.24161
## s5  -0.35934
## s6  -0.18077
## s7   0.07789
## s8   0.10147
## s9   0.09649
## s10  0.05198
## s11  0.41794
## s12  1.18088
giftLogF$SSE
## [1] 2.011

From this output we see that alpha = 0.41, which is closer to 0 than to 1 and indicates that the estimate of the level at the current time point is based on both recent and more distant observations. We also see beta = 0, which means that the estimate of the slope of the trend component is not updated over the time series and is instead kept equal to its initial value. This makes sense, since the trend component maintains a nearly constant slope over the series. Finally, gamma = 0.96 is very high, which indicates that the estimate of the seasonal component at the current time point is based on very recent observations.

Plot our results…

plot(giftLogF)


It is impressive how well we can predict the peaks in November each year.

Make forecasts into the future..

giftLogF48 <- forecast.HoltWinters(giftLogF, h=48) # predict 48 months ahead
plot.forecast(giftLogF48)


Test the model…

Again, we utilize various types of residual analysis to make sure our model cannot be improved upon.

acf(giftLogF48$residuals, lag.max=20)


Box.test(giftLogF48$residuals, lag=20, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  giftLogF48$residuals
## X-squared = 17.53, df = 20, p-value = 0.6183
plot.ts(giftLogF48$residuals)


Everything checks out!

ARIMA Models

While exponential smoothing methods are useful for forecasting, they make no assumptions about the correlations between successive values of the time series. We can sometimes make better models by utilizing these correlations in the data, using Autoregressive Integrated Moving Average (ARIMA) models.

Differencing a Time Series

ARIMA models are defined only for stationary time series. Therefore, if the raw data are non-stationary, you will need to difference the series until you obtain stationarity. Once you know the order of differencing you can use it as the d parameter in an ARIMA(p, d, q) model.

In R you can difference a time series using the diff() function. For example, the time series of annual skirt diameter is not stationary, since the mean level changes considerably over time.

plot.ts(skirts)


We can take the first difference and plot the differenced series…

skirtsDiff <- diff(skirts, differences = 1)
plot.ts(skirtsDiff)


Again, the mean does not appear quite stationary. Therefore we can take the first difference of the first difference computed above and see how that looks.

skirtsDiff2 <- diff(skirts, differences = 2)
plot.ts(skirtsDiff2)


There we go; the mean and variance now appear constant and stable over time. So far, our model definition is ARIMA(p, 2, q). The next step is to figure out the p and q values of this model.
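As a cross-check, the forecast package's ndiffs() function estimates the number of differences needed for stationarity using a unit-root test (KPSS by default); it is worth comparing its suggestion against the visual assessment above. This is only a sketch, and the result may differ from our choice.

library(forecast)
ndiffs(skirts)   # estimated differencing order d for the skirts series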

Another Example of Differencing

Kings death dataset…

plot.ts(kings)


# First differencing
kingsDiff <- diff(kings, differences = 1)
plot.ts(kingsDiff)


In this case, we can see that an ARIMA(p, 1, q) model would be appropriate.

Next we can examine the correlation between successive terms of this irregular differenced component.

Selecting a Candidate ARIMA Model

Once we have a stationary time series, the next step is to select the best ARIMA model. To do this we will inspect the autocorrelation function (ACF) and partial autocorrelation function (PACF). You can find a good collection of the ‘rules’ for selecting ARIMA models from Duke University here.

Example: Ages at Death of the English Kings

I find it easiest to use some of the functions provided with this textbook.

# Install the tools
source(url("http://lib.stat.cmu.edu/general/tsa2/Rcode/itall.R"))
# Plot ACF and PACF
acf2(kingsDiff, max.lag = 20)


We can see that the ACF is significant at lag 1 and has an alternating pattern. The PACF depicts significant autocorrelation at the first three lags.

The two most basic rules are:

  • If the ACF shows a sharp cutoff, the q component is equal to the last significant lag. This type of model also often has a PACF with a tapering pattern.
  • If the PACF shows a sharp cutoff, the p component is equal to the last significant lag. Again, the ACF may exhibit a tapering pattern.

Understanding these observations, we would probably test these 2 ARIMA models first.

  • ARIMA(3, 1, 0), since the PACF cuts off sharply after lag 3, and the ACF exhibits a tapering pattern. AR(3), MA(0)

  • ARIMA(0, 1, 1), since the ACF is zero after lag 1 and the PACF also tapers off to some degree. AR(0), MA(1)

When we have a few potential models, the convention is to pick the simplest one. For now, we will run the diagnostics for both models, examine the residual plots, and compare AIC to determine the best model.

sarima(kingsDiff, 0, 1, 1)
## initial  value 3.400079 
## iter   2 value 3.130716
## iter   3 value 3.082518
## iter   4 value 3.031836
## iter   5 value 3.014638
## iter   6 value 2.996626
## iter   7 value 2.996427
## iter   8 value 2.993689
## iter   9 value 2.992911
## iter  10 value 2.992715
## iter  11 value 2.992675
## iter  12 value 2.992675
## iter  13 value 2.992672
## iter  14 value 2.992672
## iter  14 value 2.992672
## iter  14 value 2.992672
## final  value 2.992672 
## converged
## initial  value 2.981319 
## iter   2 value 2.967670
## iter   3 value 2.959423
## iter   4 value 2.956145
## iter   5 value 2.956067
## iter   6 value 2.956066
## iter   6 value 2.956066
## final  value 2.956066 
## converged


## $fit
## Series: xdata 
## ARIMA(0,1,1) with non-zero mean 
## 
## Coefficients:
##          ma1  constant
##       -1.000     0.020
## s.e.   0.065     0.245
## 
## sigma^2 estimated as 345:  log likelihood=-170.6
## AIC=347.2   AICc=347.9   BIC=352.3
## 
## $AIC
## [1] 6.942
## 
## $AICc
## [1] 7.007
## 
## $BIC
## [1] 6.026
Box.test(kingsDiff, lag=20, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  kingsDiff
## X-squared = 25.1, df = 20, p-value = 0.1975

While there is an autocorrelated residual at lag 1, this model appears valid.

sarima(kingsDiff, 3, 1, 0)
## initial  value 3.381661 
## iter   2 value 3.206938
## iter   3 value 3.160710
## iter   4 value 3.067232
## iter   5 value 3.049600
## iter   6 value 3.007382
## iter   7 value 2.951981
## iter   8 value 2.933148
## iter   9 value 2.931406
## iter  10 value 2.931149
## iter  11 value 2.931132
## iter  12 value 2.931075
## iter  13 value 2.931003
## iter  14 value 2.930991
## iter  15 value 2.930974
## iter  15 value 2.930974
## iter  15 value 2.930974
## final  value 2.930974 
## converged
## initial  value 2.950994 
## iter   2 value 2.950920
## iter   3 value 2.950057
## iter   4 value 2.948720
## iter   5 value 2.948683
## iter   6 value 2.948682
## iter   6 value 2.948682
## iter   6 value 2.948682
## final  value 2.948682 
## converged


## $fit
## Series: xdata 
## ARIMA(3,1,0) with non-zero mean 
## 
## Coefficients:
##          ar1     ar2     ar3  constant
##       -1.045  -0.854  -0.525    -0.121
## s.e.   0.135   0.161   0.133     0.902
## 
## sigma^2 estimated as 357:  log likelihood=-170.3
## AIC=350.7   AICc=352.4   BIC=359.1
## 
## $AIC
## [1] 7.074
## 
## $AICc
## [1] 7.164
## 
## $BIC
## [1] 6.241
Box.test(kingsDiff, lag=20, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  kingsDiff
## X-squared = 25.1, df = 20, p-value = 0.1975

The diagnostics are quite similar here. To choose the best model we can compare the AIC values and see that the first model (also the simplest) performs marginally better.
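Equivalently, we can fit both candidates on the undifferenced kings series with base R's arima(), letting the model handle d = 1, and compare information criteria directly. This is a sketch; the absolute values will differ from sarima's per-observation AIC shown above.

# Fit both candidate models on the raw series and compare information criteria
fit011 <- arima(kings, order = c(0, 1, 1))   # MA(1) on the differenced series
fit310 <- arima(kings, order = c(3, 1, 0))   # AR(3) on the differenced series
AIC(fit011, fit310)                          # smaller is better
BIC(fit011, fit310)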

We can also use the auto.arima() function to select a model with an automated approach and see how it compares to our manual model selection.

auto.arima(kings)
## Series: kings 
## ARIMA(0,1,1) with drift         
## 
## Coefficients:
##          ma1  drift
##       -0.746  0.388
## s.e.   0.129  0.660
## 
## sigma^2 estimated as 234:  log likelihood=-165.8
## AIC=337.5   AICc=338.2   BIC=342.7

Example of Volcanic Dust in the Northern Hemisphere

This dataset contains the volcanic dust veil index in the Northern Hemisphere from 1500 to 1969. It is a measure of the impact of the dust and aerosols released into the environment by volcanic eruptions.

volc <- ts(scan("http://robjhyndman.com/tsdldata/annual/dvi.dat", skip=1), start=c(1500))
plot.ts(volc, ylab = 'VDI')


We can see that this time series may be stationary, since the mean is constant and the variance appears relatively constant over time. Therefore, we do not need to difference this series. Let’s investigate the ACF and PACF.

acf2(volc)


##         ACF  PACF
##  [1,]  0.67  0.67
##  [2,]  0.37 -0.13
##  [3,]  0.16 -0.06
##  [4,]  0.05  0.00
##  [5,]  0.02  0.04
##  [6,] -0.01 -0.04
##  [7,]  0.02  0.06
##  [8,]  0.02 -0.02
##  [9,]  0.01 -0.03
## [10,]  0.01  0.03
## [11,]  0.00 -0.01
## [12,]  0.02  0.04
## [13,]  0.08  0.08
## [14,]  0.08 -0.02
## [15,]  0.06 -0.01
## [16,]  0.04  0.01
## [17,]  0.00 -0.02
## [18,]  0.03  0.07
## [19,]  0.11  0.13
## [20,]  0.18  0.06
## [21,]  0.16 -0.07
## [22,]  0.14  0.07
## [23,]  0.10 -0.01
## [24,]  0.06 -0.01
## [25,] -0.01 -0.05
## [26,] -0.02  0.03
## [27,]  0.00  0.02
## [28,]  0.00 -0.03
## [29,]  0.03  0.06
## [30,]  0.07  0.05
## [31,]  0.10  0.06
## [32,]  0.13  0.02

The ACF depicts a tapering pattern in which the first 3 lags are autocorrelated. Lags 19 - 23 are also significant, but we expect that these are due to chance, since the autocorrelations for lags 4 - 18 are not significant.

The PACF depicts a sharp cutoff where the first two lags are significant.

Given this information, we have a few possible models…

* ARIMA(2,0,0), since the PACF is zero after lag 2 and the ACF tapers off toward zero.
* ARIMA(0,0,3), since the ACF is zero after lag 3 and the PACF sort of tails off to zero (though probably too abruptly for this to be appropriate).
* ARIMA(1,0,2), as estimated by `auto.arima`.

sarima(volc, 2,0,0)
## initial  value 4.547990 
## iter   2 value 4.404559
## iter   3 value 4.272216
## iter   4 value 4.247505
## iter   5 value 4.245129
## iter   6 value 4.245081
## iter   7 value 4.245079
## iter   8 value 4.245078
## iter   9 value 4.245078
## iter  10 value 4.245076
## iter  11 value 4.245074
## iter  12 value 4.245074
## iter  12 value 4.245074
## iter  12 value 4.245074
## final  value 4.245074 
## converged
## initial  value 4.246070 
## iter   2 value 4.246066
## iter   3 value 4.246062
## iter   4 value 4.246058
## iter   5 value 4.246051
## iter   6 value 4.246049
## iter   7 value 4.246048
## iter   7 value 4.246048
## iter   7 value 4.246048
## final  value 4.246048 
## converged


## $fit
## Series: xdata 
## ARIMA(2,0,0) with zero mean     
## 
## Coefficients:
##         ar1     ar2   xmean
##       0.753  -0.127  57.527
## s.e.  0.046   0.046   8.596
## 
## sigma^2 estimated as 4870:  log likelihood=-2663
## AIC=5333   AICc=5333   BIC=5350
## 
## $AIC
## [1] 9.504
## 
## $AICc
## [1] 9.508
## 
## $BIC
## [1] 8.53
sarima(volc, 1,0,2)
## initial  value 4.547952 
## iter   2 value 4.410381
## iter   3 value 4.251298
## iter   4 value 4.246934
## iter   5 value 4.244047
## iter   6 value 4.243760
## iter   7 value 4.243019
## iter   8 value 4.242747
## iter   9 value 4.242674
## iter  10 value 4.242670
## iter  11 value 4.242670
## iter  12 value 4.242670
## iter  13 value 4.242670
## iter  14 value 4.242669
## iter  14 value 4.242669
## iter  14 value 4.242669
## final  value 4.242669 
## converged
## initial  value 4.244570 
## iter   2 value 4.244568
## iter   3 value 4.244565
## iter   4 value 4.244554
## iter   5 value 4.244553
## iter   6 value 4.244553
## iter   7 value 4.244553
## iter   8 value 4.244553
## iter   8 value 4.244553
## iter   8 value 4.244553
## final  value 4.244553 
## converged


## $fit
## Series: xdata 
## ARIMA(1,0,2) with zero mean     
## 
## Coefficients:
##         ar1    ma1    ma2   xmean
##       0.472  0.269  0.128  57.518
## s.e.  0.094  0.097  0.075   8.488
## 
## sigma^2 estimated as 4855:  log likelihood=-2662
## AIC=5334   AICc=5334   BIC=5354
## 
## $AIC
## [1] 9.505
## 
## $AICc
## [1] 9.509
## 
## $BIC
## [1] 8.54

Even though auto.arima() selects ARIMA(1,0,2) as the optimal model, we prefer a simpler model and choose ARIMA(2,0,0). If we change our selection criterion from AIC to BIC, which penalizes extra parameters more heavily, we find that the simpler model is indeed the better description of the time series.
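For example, a quick way to compare the two candidates by BIC uses base R's arima(); this is a sketch, and the absolute numbers will not match the scaled criteria reported by sarima above.

# Compare the two candidate models for the volcanic dust series by BIC
fit200 <- arima(volc, order = c(2, 0, 0))
fit102 <- arima(volc, order = c(1, 0, 2))
BIC(fit200, fit102)   # the model with the smaller BIC is preferred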

Forecasting Using an ARIMA Model

The Kings Death Dataset

Recall that this dataset can be described using an ARIMA(0,1,1) model.

kingsARIMA <- arima(kings, order=c(0,1,1))
kingsARIMA
## Series: kings 
## ARIMA(0,1,1)                    
## 
## Coefficients:
##          ma1
##       -0.722
## s.e.   0.121
## 
## sigma^2 estimated as 230:  log likelihood=-170.1
## AIC=344.1   AICc=344.4   BIC=347.6
library(forecast) 

kingsF <- forecast.Arima(kingsARIMA)
kingsF
##    Point Forecast Lo 80 Hi 80 Lo 95  Hi 95
## 43          67.75 48.30 87.20 38.00  97.50
## 44          67.75 47.56 87.94 36.87  98.63
## 45          67.75 46.84 88.66 35.78  99.72
## 46          67.75 46.16 89.35 34.72 100.78
## 47          67.75 45.49 90.01 33.70 101.80
## 48          67.75 44.84 90.66 32.71 102.79
## 49          67.75 44.21 91.29 31.75 103.76
## 50          67.75 43.59 91.91 30.81 104.70
## 51          67.75 42.99 92.51 29.89 105.61
## 52          67.75 42.41 93.09 29.00 106.51
plot.forecast(kingsF)


# Examine the residuals
acf(kingsF$residuals)


Box.test(kingsF$residuals, lag = 20, type='Ljung-Box')
## 
##  Box-Ljung test
## 
## data:  kingsF$residuals
## X-squared = 13.58, df = 20, p-value = 0.8509