In this workshop we learn how to design/calibrate a seasonal ARIMA model (SARIMA) for a time series.
An ARIMA/SARIMA model is applied to monthly or quarterly time series. It shows:
1 Historical monthly or quarterly sales data 2 Historical monthly or quarterly price data 3 Historical monthly or quarterly data related to any variable of any financial statement. For example: cost of good sold, sales and general administrative expenses, income tax, EBIT, etc. 4 Historical monthly or quarterly data related to any macro economic variable. For example: inflation indexes, GDP indexes, exchange rates, etc.
This model help in the fact that nobody can predict the stock prices, with this you get intuitive guesses
Each ARIMA-SARIMA model needs to be CALIBRATED. The general steps to calibrate a model are the following:
1 Check if the series has seasonality. If the series shows a seasonal pattern, start with the seasonal difference of the log of the series, which is the % annual growth of the series. Usually monthly and quarterly data of many economic and financial series have seasonal pattern.
I recommend that if you have monthly or quarterly data of any financial or economic variable, start testing whether the seasonal difference of the log of the series is stationary. If you have daily data, it is hard to model for seasonality. In the case of daily data, start testing whether the first difference of the log is stationary.
In this step you start with a version of the series. Assume that you decide to start with the seasonal difference of the log.
2 Test whether the seasonal difference is stationary. If so, continue to next step. If not, then change the variable to first difference of the log of the series, which is the % change for each period.
3 Run the AC and PAC plots to identify the ARIMA-SARIMA parameters p, d, q, P, D, Q:
An ARIMA/SARIMA model has the following “parameters” that need to be defined:
arima(p,d,q) sarima(P,D,Q,#Periods)
*p: refers to the number of autoregressive (AR) terms
*d: refers to how many differences where needed to the series in order to make the series stationary. Usually this parameter is either 0 or 1.
*q: refers to the number of moving average (MA) terms in the model
*P: refers to the number of SEASONAL autoregressive terms
*D: refers to how many SEASONAL differences were needed to the series to make the series stationary. Usually this parameter is either 0 or 1.
*Q: refers to the number of SEASONAL moving average terms
*#periods: referes to the number of periods in the year. For example, if the data is monthly, then #periods=12.
These parameters are usually expressed as: ARIMA(p,d,q) SARIMA(P,D,Q, #periods).
When you identify the first values of these paremeter, then you will have your first CALIBRATION of the model.
4 Estimate/Run the ARIMA-SARIMA model
5 Run the ACF and PACF of the residuals/errors of the model to check whether the errors is a white noise series. In other words, if there is no significant autocorrelations of the errors.
If you have significant autocorrelations, you can include other term(s) in the ARIMA-SARIMA model, and go back to step 4.
If there the errors seem like a white noise (no signficant autocorrelations), then you can continue and finish the calibration process.
5 Interpret the model
6 Run a forecast using the model
Download the excel file from a web site:
download.file("http://www.apradie.com/ec2004/air2.xlsx", "air2.xlsx", mode="wb")
library(readxl)
dataset <- read_excel("air2.xlsx")
# I familiarize with the data
head(dataset)In order to perform time-series analysis, it is better to use xts objects. We can construct this by using the xts() method of the xts package. Objects of class xts are composed by 2 elements: the core data and the index (class Date, POSIX,etc).
## Loading required package: zoo
## Warning: package 'zoo' was built under R version 4.0.3
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
library(forecast)
# I get the date of the dataset:
date <- dataset$month
# I join both objects using xts() and assign it to a new object called air.xts
air.xts<- xts(x=dataset$air, frequency = "monthly", order.by=date)
names(air.xts)<-c("Passengers")
class(air.xts)## [1] "xts" "zoo"
This dataset contains air passengers by month from an airline. Do the following:
Does the series look stationary? Obviously not, it looks with a clear growing trend, and a seasonality pattern. The mean of the series GROWS OVER TIME.
Get the natural logarithm of the series:
Since we see a clear seasonality pattern, it is strongly recommended to use the seasonal difference of the log of the series. So, we will check whether we can treat the seasonal difference of the log as STATIONARY.
We start by plotting the seasonal difference of the log, which is the annual % growth (in cc) of the air passengers:
It is hard to see whether this is stationary or not, so we have to run the Dicky-Fuller test. Run the Dicky-Fuller test using the adf.test function of the tseries packages.
## Warning in adf.test(na.omit(diff(lnair.xts, lag = 12)), alternative =
## "stationary", : p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: na.omit(diff(lnair.xts, lag = 12))
## Dickey-Fuller = -4.988, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
We got a p-value<0.05, so we can treat the seasonal difference (the annual % growth) as stationary and continue with this series to calibrate an ARIMA-SARIMA model.
Once we confirm that our series is stationary, then the first step is to do the AC and PAC plots to identify how many AR and MA terms we can include in our ARIMA/SARIMA model.
Remember that our series is the seasonal difference of the log of air passengers, which is the annual % growth of air passengers month by month.
## Warning: package 'astsa' was built under R version 4.0.3
##
## Attaching package: 'astsa'
## The following object is masked from 'package:forecast':
##
## gas
# I create a ts object from the xts; the acf2 works better with ts objects
lnair =ts(lnair.xts$Passengers)
acf2(diff(lnair,lag=12))## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF 0.71 0.62 0.48 0.44 0.39 0.32 0.24 0.19 0.15 -0.01 -0.11 -0.24 -0.14
## PACF 0.71 0.23 -0.06 0.10 0.05 -0.06 -0.06 0.01 -0.01 -0.29 -0.16 -0.14 0.29
## [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
## ACF -0.14 -0.10 -0.15 -0.10 -0.11 -0.14 -0.16 -0.11 -0.08
## PACF 0.06 0.05 -0.04 0.13 -0.06 -0.15 -0.02 0.12 -0.18
We used the ts function to transform the dataset into a ts R dataset since this function works better for time-series (ts) class datasets.
The ACF shows the autocorrelations (AC) between the series and its lags, and the PACF shows the partial autocorrelations (PAC) between the series and its lags.
The VERTICAL lines show the MAGNITUD of the AUTOCORRELATION between a LAG and the current value of the series.
The blue horizontal dotted lines in the plots cover the 95% confidence interval for the autocorrelations. So, if the vertical line croses the blue dotted line, it means that that specific autocorrelation is SIGNIFICANTLY DIFFERENT THAN ZERO (it could be positive or negative).
Now what is the general interpretation of the these plots?
First, always remember what are we modelling. In this case is the seasonal difference of the log, which is the ANNUAL % GROWTH of the series.
In the ACF plot I see that the first 8 ACs are positive and significant, and the gradually decay. For example, the AC of LAG 1 is about 70% (almost 0.70). This means that ANNUAL % growth of air passengers is strongly correlated (with a correlation of 70%) with ITS OWN ANNUAL % GROWTH of the PREVIOUS month (its LAG 1).
The AC of LAG 2 is about 62%, so the ANNUAL % GROWTH of passengers is strongly correlated with ITS OWN ANNUAL % GROWTH of 2 MONTHS AGO (its LAG 2)… and so on.
It is important to mention that the ACF plot shows autocorrelations independently of each other.
In the case of PACF, we see PARTIAL autocorrelations, which measure HOW MUCH ELSE the series is correlated with each its own LAGS, AFTER CONSIDERING the effect of LOWER-ORDER LAG AUTOCORRELATIONS.
I see that the PACF shows only the first 2 autocorrelations to be positive AND SINGIFICANT, and the magnitude of the following autocorrelations goes down to zero or negative very quiclky.
When the ACF plot shows a SLOW DECLINE of autocorrelations, and the PACF shows a FAST DECLINE of autocorrelatinos, this is a clear PATTERN OF AN AR(p) MODEL. In this case, the # of AR terms is determined by the SIGNIFICANT LAGS in the PACF plot.
We see that also the AC of LAG 10 is statistically significant and negative, but it is recommended first to start with the AR(2) model, and then check the residuals of the model to see whether there is another significant lag.
Then, in this case we start with an ARIMA with p=2 and q=0.
We can use the log series, and leave the model to automatically calculate the SEASONAL difference. Then, we will use the log of the series (log of air passengers) and define my model as:
arima(2,0,0) sarima(0,1,0,12)
What this model tells us?
You first have to look at the # of periods. In this case, you have 12 periods in a year, so your series should be monthly.
Then you look at d and D (first differences and the seasonal differences). In this case, d=0 and D=1. What does this mean? this means that we are modeling the SEASONAL difference (D=1) of the log of the variable, which is the annual % growth (month by month). Since d=0 this means that we are NOT using the first difference of the series.
Then you look at the p, q, P, Q. p tells you how many AR terms are included in your model. In this case, p=2, so your model includes the first 2 AR terms. Since q, P and Q are zero, then your model does not include any of these type of terms.
How this model can be expressed mathematically?
Since you are modeling the seasonal difference, you can express this model as the following equation:
△12lnair=ϕ0+ϕ1(L1△12lnairt)+ϕ2(L2△12lnairt)+εt
Using a simpler notation, this equation can be expressed as:
s12.lnairt=ϕ0+ϕ1L1.s12.lnairt+ϕ2L2.s12.lnairt+εt
Remember that s12.lnair is the seasonal difference of the log of air passengers, which is the annual %growth of air passengers month by month. Then we can read this mathematical expression as: “The annual %growth (month by month) of air passengers at time t can be determined by its own annual % growth of air passen
In this case, instead of creating a variable for the seasonal difference of the log (s12.lnair), we will configure the parameters of the Arima function so that the log transformation and the seasonal difference will be calculated before estimating the arima coefficients.
In our first calibration we discussed above, we defined the following values of the ARIMA-SARIMA model for the log of the number of air passengers:
arima(2,0,0) sarima(0,1,0,12)
This means that D=1, so R will calculate the seasonal difference, and then estimate the arima model and include 2 AR terms and no MA terms.
We can run this model in R as follows:
m1 <- Arima(air.xts$Passengers, order = c(2,0,0),
seasonal = list(order=c(0,1,0),period=12),
include.constant = TRUE,
lambda = 0)
m1## Series: air.xts$Passengers
## ARIMA(2,0,0)(0,1,0)[12] with drift
## Box Cox transformation: lambda= 0
##
## Coefficients:
## ar1 ar2 drift
## 0.5540 0.2378 0.0096
## s.e. 0.0845 0.0848 0.0014
##
## sigma^2 estimated as 0.001742: log likelihood=233.13
## AIC=-458.26 AICc=-457.95 BIC=-446.73
We can alternatively use the function sarima from the astsa model to do the same. This function provides more detail in the output, but is very quick to do predictions as the Arima function. We run sarima for the same model as:
## initial value -2.794390
## iter 2 value -2.940651
## iter 3 value -3.175869
## iter 4 value -3.186870
## iter 5 value -3.189461
## iter 6 value -3.189605
## iter 7 value -3.189606
## iter 8 value -3.189607
## iter 9 value -3.189607
## iter 10 value -3.189608
## iter 11 value -3.189609
## iter 12 value -3.189609
## iter 12 value -3.189609
## iter 12 value -3.189609
## final value -3.189609
## converged
## initial value -3.184486
## iter 2 value -3.184596
## iter 3 value -3.184959
## iter 4 value -3.185009
## iter 5 value -3.185059
## iter 6 value -3.185073
## iter 7 value -3.185076
## iter 8 value -3.185076
## iter 8 value -3.185076
## final value -3.185076
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ar2 constant
## 0.5540 0.2378 0.0096
## s.e. 0.0845 0.0848 0.0014
##
## sigma^2 estimated as 0.001701: log likelihood = 233.13, aic = -458.26
##
## $degrees_of_freedom
## [1] 129
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.5540 0.0845 6.5597 0.0000
## ar2 0.2378 0.0848 2.8037 0.0058
## constant 0.0096 0.0014 6.8655 0.0000
##
## $AIC
## [1] -3.204618
##
## $AICc
## [1] -3.203411
##
## $BIC
## [1] -3.12398
As we see, we need to take the log first and then specify the model. The result is exactly the same as the Arima result, but we can see the coefficients, their p-values and other important plots of the model.
## Estimate SE t.value p.value
## ar1 0.5540 0.0845 6.5597 0.0000
## ar2 0.2378 0.0848 2.8037 0.0058
## constant 0.0096 0.0014 6.8655 0.0000
I see that the constant (phi0) and both lags are positive and SIGNIFICANT. What does this mean? First, I have to check which is the variable I am modelling. I am modelling the s12.lnair, which is the annual % growth of air passengers month by month. Then, I can interpret this arima model as follow:
The annual % growth of air passengers is positively and significantly related with the annual % growth of air passengers of last month (lag1). In a similar way, the annual % growth of air passengers is positively and significantly related with the annual % growth of two months ago. More specifically, 55.4% of annual growth of last month is passed to this month’s annual growth, andonly 23.78% of the annual growth of 2 months ago is passed to this month. In other words, for each percentual point of annual growth of previous month, it is expected that the current annual growth will grow about 0.55 percentual points
It is interesting to see that the sum of phi1 and phi2 (0.55+0.23) MUST be < 1 in order for the series to be stationary. If you include more arima terms, the sum of the phi’s must be less than one, and the sum of the thetas also must be less than one. In addition, the constant or phi0 is positive and significant, indicating that the annual growth of air passengers has a positive tendency to grow over time.
The mathematical model is: E[s12.lnair]=0.1150+0.5540∗L1.s12.lnair+0.23∗L2.s12.lnair or s12.lnair=0.1150+0.5540∗L1.s12.lnair+0.23∗L2.s12.lnair+e
We need to generate a series for the residuals of the ARIMA-SARIMA model. To do this, get the residuals as follows:
## Time Series:
## Start = 1
## End = 432001
## Frequency = 1.15740740740741e-05
## [1] 0.004708908 0.004751501 0.004854026 0.004821445 0.004747833 0.004847728
Test whether the residual series (the air_res variable) is a white noise running the ac and pac plots. A white noise is a series that has no signicant autocorrelations with any of its own lags.
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF 0.01 0.01 -0.13 0.02 0.09 0.05 -0.03 0.05 0.20 -0.01 -0.09 -0.40 0.04
## PACF 0.01 0.01 -0.13 0.02 0.10 0.03 -0.03 0.08 0.21 -0.04 -0.10 -0.37 0.01
## [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
## ACF 0.01 0.11 -0.11 0.07 0.04 -0.05 -0.13 -0.03 -0.03
## PACF -0.04 0.01 -0.09 0.16 0.11 -0.08 -0.08 0.16 -0.09
BY LOOKING AT THE PLOT I CAN SAY THAT THERE´S NO EXISTENT AUTOCORRELATION IN THE RESIDUALS.
Can you add another AR or MA term to improve the previous model and then to make the residual a white noise ?
THE RESIDUALS ALREADY ARE WHITE NOISE
Go to the course site (Material) and download Dr. Nau slides about ARIMA/SARIMA models. Read the slides and focus on his recommendations to select the number of p, d, q, P, D, Q for any ARIMA/SARIMA model. Download the following Dataset:
rm(list=ls())
download.file("http://www.apradie.com/ec2004/salesfab2.xlsx", "salesfabs2.xlsx", mode="wb")
library(readxl)
library(xts)
library(zoo)
library(tseries)
library(forecast)
library(astsa)
library(lmtest)Now, we read the data from the Excel file:
sales_brands.xts <- read_xlsx("salesfabs2.xlsx", sheet = "Sheet1")
# Date variable using the month column:
dates <- seq(as.Date(sales_brands.xts$month[1]),
by="month",
length.out = nrow(sales_brands.xts) )
sales_brands.xts = sales_brands.xts[,-1]
sales_brands.xts<- xts(x=sales_brands.xts, frequency = "monthly", order.by=dates)Create the log variables for price (pfab1) and sales in volume (qfab1) for product 1.
For Price
For Sales
Sales in tons of product 1:
BY LOOKING AT THE GRAPH I CAN SAY THAT IT HAS A GROWING TENDENCY, I WOULD SAY ITS NOT STATIONARY CAUSE ITS MEAN IS CHANGING AND THE SEASONALITY IS MARKABLE
If the series is not stationary, try whether the seasonal difference of the log looks as stationary (s12.lnqfab1). Do a time series plot, run and INTERPRET the Dicky-Fuller test
## Warning in adf.test(na.omit(diff(ln.qfab1.xts, lag = 12)), alternative =
## "stationary", : p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: na.omit(diff(ln.qfab1.xts, lag = 12))
## Dickey-Fuller = -4.8046, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
DUE TO MY P-VALUE OF 0.01, I CAN SAY THAT THE SEASONAL DIFFERENCE IS STATIONARY.
# I create a ts object from the xts; the acf2 works better with ts object
ln.qfab1 <- ts(ln.qfab1.xts$qfab1)
acf2(diff(ln.qfab1,lag = 12))## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF 0.41 0.47 0.31 0.17 0.01 0.04 -0.13 -0.28 -0.29 -0.48 -0.24 -0.49 -0.15
## PACF 0.41 0.37 0.05 -0.13 -0.19 0.05 -0.10 -0.29 -0.14 -0.26 0.26 -0.30 0.21
## [,14] [,15] [,16] [,17] [,18] [,19]
## ACF -0.23 -0.15 -0.14 -0.03 -0.06 0.17
## PACF -0.04 -0.08 -0.09 -0.18 0.00 0.21
I´m going to use the log of the series(log of sales for product 1), my model would be arima(2,0,0)sarima(1,1,0,12)
What this model is telling us?
I have 12 number of periods due to my monthly series
Then the p and P (number of autorregresive terms and seasonal terms) In this case p=2 and P=1 which means how many AR terms are included un your model. Which are 2 AR terms ans 1 AR seasonal term.
Then the d and D (first differenes and seasonal differences). In this case, d=0 and D=1 which means that we are modeling the SEASONAL differnece (D=1) of the lof of the variable.
Then the q and Q which are in zero, then your model does not include any of these type of terms.
Calibration of the log of sales for product 1
m_1 <- Arima(sales_brands.xts$qfab1, order = c(2,0,0), seasonal=list(order=c(1,1,0),period=12),include.constant = TRUE, lambda = 0)
m_1## Series: sales_brands.xts$qfab1
## ARIMA(2,0,0)(1,1,0)[12] with drift
## Box Cox transformation: lambda= 0
##
## Coefficients:
## ar1 ar2 sar1 drift
## 0.3356 0.1918 -0.4171 0.0085
## s.e. 0.1310 0.1391 0.1200 0.0013
##
## sigma^2 estimated as 0.005833: log likelihood=67.71
## AIC=-125.42 AICc=-124.26 BIC=-115.12
## initial value -2.339814
## iter 2 value -2.550462
## iter 3 value -2.580508
## iter 4 value -2.585312
## iter 5 value -2.586270
## iter 6 value -2.586331
## iter 7 value -2.586332
## iter 7 value -2.586332
## iter 7 value -2.586332
## final value -2.586332
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ar2 sar1 constant
## 0.3356 0.1918 -0.4171 0.0085
## s.e. 0.1310 0.1391 0.1200 0.0013
##
## sigma^2 estimated as 0.005425: log likelihood = 67.71, aic = -125.42
##
## $degrees_of_freedom
## [1] 54
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.3356 0.1310 2.5620 0.0132
## ar2 0.1918 0.1391 1.3785 0.1737
## sar1 -0.4171 0.1200 -3.4768 0.0010
## constant 0.0085 0.0013 6.7942 0.0000
##
## $AIC
## [1] -1.548366
##
## $AICc
## [1] -1.541869
##
## $BIC
## [1] -1.421179
## Estimate SE t.value p.value
## ar1 0.3356 0.1310 2.5620 0.0132
## ar2 0.1918 0.1391 1.3785 0.1737
## sar1 -0.4171 0.1200 -3.4768 0.0010
## constant 0.0085 0.0013 6.7942 0.0000
IN WHITE NOISE THE MEAN IS EQUAL TO 0, THE TIME-SERIES IS STATIONARY AND THE AUTOCORRELATIONS BETWEEN THE VARIABLES AND LAGS IS EQUAL TO 0.
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 0.005814540 0.005862934 0.005811224 0.005181171 0.005093136 0.005478768
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
## ACF -0.01 0.02 0.13 -0.1 -0.15 0.17 -0.01 -0.16 -0.06 -0.29 0.06 -0.07 0.17
## PACF -0.01 0.02 0.13 -0.1 -0.16 0.17 0.03 -0.16 -0.15 -0.29 0.19 -0.09 0.18
## [,14] [,15] [,16] [,17] [,18] [,19] [,20]
## ACF 0.02 -0.04 -0.08 -0.04 -0.14 0.20 0.08
## PACF -0.07 -0.07 -0.04 -0.11 -0.17 0.19 -0.05
BECAUSE IF THE MEAN AND THE AUTOCCORRELATIONS BETWEEN VARIABLE AND LAGS IS EQUAL TO 0, THIS WOULD REPRESENT THAT MY MODEL IS WELL CALIBRATED.
Create 24 new observations in the future Management data
# Convert the xts to a ts dataset:
qfab_1.ts <- ts(coredata(sales_brands.xts$qfab1), start=c(2012,1), frequency = 12)
qfab_1.ts## Jan Feb Mar Apr May Jun Jul Aug
## 2012 338.0100 357.8112 342.6911 184.0687 170.0012 252.1363 179.6723 179.9935
## 2013 391.1488 402.2955 403.4620 204.9068 203.2642 261.1563 173.9211 194.2030
## 2014 352.1699 497.8295 450.2763 267.4923 255.3361 324.7467 230.2683 229.0850
## 2015 477.4290 437.6544 427.0321 262.7887 254.4105 320.5292 245.1620 233.3661
## 2016 480.6635 493.3284 429.3796 300.9901 293.1736 349.5101 284.4155 290.8479
## 2017 531.4313 543.0544 451.3241 313.1662 312.9280 372.2951 307.8225 299.9826
## 2018 NA NA NA NA NA NA NA NA
## Sep Oct Nov Dec
## 2012 170.4417 151.3453 156.3783 206.7963
## 2013 175.0611 165.7026 173.4322 216.8979
## 2014 236.9812 218.0010 220.9612 253.1830
## 2015 243.5807 226.4656 225.1428 293.7678
## 2016 273.6415 264.9729 271.2281 306.9521
## 2017 299.2357 276.1388 NA NA
## 2018 NA NA
# Run the Arima model:
m_1 <- Arima(qfab_1.ts, order = c(2,0,0), seasonal = list(order=c(1,1,0),period=12),include.constant = TRUE, lambda = 0)
m_1## Series: qfab_1.ts
## ARIMA(2,0,0)(1,1,0)[12] with drift
## Box Cox transformation: lambda= 0
##
## Coefficients:
## ar1 ar2 sar1 drift
## 0.3356 0.1918 -0.4171 0.0085
## s.e. 0.1310 0.1391 0.1200 0.0013
##
## sigma^2 estimated as 0.005833: log likelihood=67.71
## AIC=-125.42 AICc=-124.26 BIC=-115.12
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.3356360 0.1310070 2.5620 0.0104080 *
## ar2 0.1917799 0.1391192 1.3785 0.1680398
## sar1 -0.4170940 0.1199652 -3.4768 0.0005075 ***
## drift 0.0085315 0.0012557 6.7942 1.089e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Create a forecast for the future observations using your last model
## Warning: Removed 12 row(s) containing missing values (geom_path).
## $names
## [1] "method" "model" "level" "mean" "lower" "upper"
## [7] "x" "series" "fitted" "residuals"
##
## $class
## [1] "forecast"
## Jan Feb Mar Apr May Jun Jul Aug
## 2017
## 2018 649.8982 665.6756 560.0577 389.8095 387.2543 461.0663 379.8071 374.5312
## 2019 719.2199 737.1913 622.4971 433.6641 430.0735 512.1603 421.4283 417.0415
## Sep Oct Nov Dec
## 2017 323.4230 379.3843
## 2018 368.3474 343.9315 355.2466 421.1893
## 2019 408.4264 382.6928
The AIC (Akaike Information Criterios) is a mathematical method that helps to evaluate how well a model fits the data it was generated from. Its used to compare different possible models and determine which one is the best fit for the data. Calculated from:
~The number of independent variables used to build the model. ~The maximum likelihood estimate of the model (how well the model reproduces the data).