by Kunaal Naik YouTube - www.youtube.com/fxexcel GitHub Link - Download Code and Dataset
library(forecast)
library(tseries)
Provide the path where your data is stored
setwd("C:\\Users\\DELL\\Documents\\__Fun_X_Excel_Channel_Videos\\Arima\\R")
sales <- read.csv("sales.csv")
sales_ts <- ts(sales$Sales_k,start=c(1972),frequency=12)
autoplot(sales_ts)
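The `start` and `frequency` arguments decide how each observation is labelled in time. As a minimal sketch, assuming a made-up series of 24 values (`demo_ts` is a hypothetical name, not the sales data), the monthly index can be inspected like this:

```r
# Hypothetical stand-in values; the real data come from sales.csv.
# start = c(1972, 1) is equivalent to start = c(1972) for a monthly series.
demo_ts <- ts(1:24, start = c(1972, 1), frequency = 12)

start(demo_ts)        # first period: year 1972, month 1
end(demo_ts)          # last period: year 1973, month 12
cycle(demo_ts)[1:3]   # position of each observation within its year

# window() selects a date range using the same (year, month) convention
q1_1973 <- window(demo_ts, start = c(1973, 1), end = c(1973, 3))
q1_1973
```

With `frequency = 12`, every twelfth observation starts a new year, which is what lets the forecast output later be labelled by month.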
Stationarity: A stationary process has a mean and variance that do not change over time, and the process has no trend.
Null Hypothesis - Non Stationary (Do NOT reject if p-value > significance level {1%, 5%, 10%}; reject, i.e. conclude stationary, if the p-value is smaller)
adf.test(sales_ts)
## Warning in adf.test(sales_ts): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: sales_ts
## Dickey-Fuller = -8.7644, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
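The decision rule for the ADF test can be sketched as a small helper. `adf_decision` is a hypothetical function written for illustration; it is not part of `tseries`, and the 5% default is an assumption:

```r
# Hypothetical helper: turns an ADF p-value into a decision.
# H0 = non-stationary, so a small p-value is evidence FOR stationarity.
adf_decision <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) {
    "reject H0: evidence of stationarity"
  } else {
    "do not reject H0: treat as non-stationary"
  }
}

adf_decision(0.01)  # the p-value printed above
adf_decision(0.30)  # a made-up large p-value for contrast
```

Applied to the printed p-value of 0.01, the rule rejects the null hypothesis at the 5% level.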
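Since the p-value (0.01 or smaller, per the warning) is below the significance level, we reject the null hypothesis: the ADF test indicates the series is already stationary.
We will still take the first difference and confirm that the differenced series is stationary as well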
sales_ts_d1 <- diff(sales_ts, differences = 1)
adf.test(sales_ts_d1)
## Warning in adf.test(sales_ts_d1): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: sales_ts_d1
## Dickey-Fuller = -10.501, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
autoplot(sales_ts_d1)
Much better and Stationary.
d term will be 1 - since we took the first difference
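Taking a first difference means replacing each observation with its change from the previous one. A minimal sketch on made-up numbers (the vector `x` is hypothetical) shows what `diff()` computes and that the step is reversible:

```r
x <- c(10, 13, 17, 22, 28)        # hypothetical values with an upward trend

d1 <- diff(x, differences = 1)    # change from one observation to the next
manual <- x[-1] - x[-length(x)]   # the same calculation written out by hand

# diffinv() undoes the difference when given the first original value
restored <- diffinv(d1, xi = x[1])

d1
restored
```

The differenced series is one observation shorter than the original, which matters later when counting observations for the information criteria.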
ACF - Correlation between the series and its own lagged values; significant spikes help choose the q (MA) order
Acf(sales_ts)
We will plot the ACF again for the differenced series
Acf(sales_ts_d1)
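To make the ACF concrete, the lag-1 autocorrelation can be computed by hand and compared with base R's `acf()`. This is a sketch on a simulated random walk (an assumption, not the sales data), and it uses `stats::acf` rather than `forecast::Acf`:

```r
set.seed(1)
x <- cumsum(rnorm(100))   # hypothetical random-walk series

n <- length(x)
xbar <- mean(x)

# Lag-1 autocorrelation: covariance of the series with itself shifted by one
# step, divided by the variance (both taken around the overall mean)
r1_manual <- sum((x[-1] - xbar) * (x[-n] - xbar)) / sum((x - xbar)^2)

# stats::acf stores lag 0 first, so lag 1 is the second element
r1_acf <- acf(x, lag.max = 5, plot = FALSE)$acf[2]

c(manual = r1_manual, acf = r1_acf)
```

A slowly decaying ACF like the one a random walk produces is the usual visual sign that differencing may help.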
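PACF - Correlation between the series and a given lag after removing the effect of the shorter lags; significant spikes help choose the p (AR) order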
Pacf(sales_ts)
We will plot the PACF again for the differenced series
Pacf(sales_ts_d1)
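The "partial" in PACF can be illustrated at lag 2, where the partial autocorrelation follows from the first two ordinary autocorrelations as (r2 - r1^2) / (1 - r1^2). The sketch below checks this against base R's `pacf()` on a simulated AR(1) series (an assumption, not the sales data):

```r
set.seed(2)
x <- arima.sim(model = list(ar = 0.7), n = 200)   # hypothetical AR(1) series

# Sample autocorrelations at lags 1 and 2 (element 1 of $acf is lag 0)
r <- acf(x, lag.max = 2, plot = FALSE)$acf[2:3]

# Lag-2 partial autocorrelation: what remains of r2 once the part explained
# through lag 1 has been removed
pacf2_manual <- (r[2] - r[1]^2) / (1 - r[1]^2)

p <- pacf(x, lag.max = 2, plot = FALSE)$acf       # lags 1 and 2
c(manual = pacf2_manual, pacf = p[2])
```

At lag 1 the PACF and ACF coincide, since there are no shorter lags to remove.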
We will use p and q as 6; d will be 1
tsMod <- Arima(y = sales_ts,order = c(6,1,6))
tsMod
## Series: sales_ts
## ARIMA(6,1,6)
##
## Coefficients:
##          ar1      ar2     ar3      ar4     ar5      ar6     ma1      ma2
##       0.0083  -0.0185  0.0170  -0.0218  0.0152  -0.9819  0.0057  -0.1451
## s.e.  0.0161   0.0172  0.0222   0.0190  0.0155   0.0120  0.0560   0.0478
##           ma3      ma4      ma5     ma6
##        -0.3816  -0.1804  -0.0486  0.9697
## s.e.   0.0509   0.0439   0.0560  0.0574
##
## sigma^2 estimated as 223.7: log likelihood=-643.19
## AIC=1312.38 AICc=1314.96 BIC=1351.94
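The three information criteria can be reproduced from the printed log likelihood. This is a sketch under two assumptions: the parameter count is 13 (6 AR coefficients, 6 MA coefficients and the innovation variance), and the effective sample size is 155 (156 monthly observations from 1972 through 1984, less one lost to differencing):

```r
loglik <- -643.19   # log likelihood printed in the model summary
k <- 6 + 6 + 1      # assumption: 6 AR + 6 MA coefficients + variance
n <- 155            # assumption: 156 observations minus 1 for d = 1

aic  <- -2 * loglik + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
bic  <- -2 * loglik + k * log(n)

round(c(AIC = aic, AICc = aicc, BIC = bic), 2)
```

Under these assumptions the results agree with the printed AIC, AICc and BIC to two decimals. Lower values are preferred when comparing candidate orders fitted to the same series.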
forecast(tsMod,h=12)
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 1985       847.1119 827.7200 866.5039 817.4546 876.7693
## Feb 1985       887.8929 860.4025 915.3834 845.8499 929.9360
## Mar 1985       904.2875 872.3911 936.1838 855.5062 953.0688
## Apr 1985       928.0817 894.8973 961.2662 877.3306 978.8329
## May 1985       943.8799 910.2542 977.5055 892.4539 995.3058
## Jun 1985       935.9272 901.9288 969.9257 883.9312 987.9233
## Jul 1985       889.9770 855.6982 924.2557 837.5521 942.4018
## Aug 1985       849.6974 815.1469 884.2480 796.8569 902.5379
## Sep 1985       833.9970 798.6300 869.3639 779.9079 888.0861
## Oct 1985       810.8785 772.6859 849.0711 752.4680 869.2891
## Nov 1985       795.6589 753.4106 837.9072 731.0457 860.2722
## Dec 1985       803.6800 757.5914 849.7686 733.1936 874.1665
autoplot(forecast(tsMod,h=12))
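The 80% and 95% columns are linked through the forecast standard error. As a sketch, assuming the intervals are symmetric normal intervals around the point forecast, the Jan 1985 row can be rebuilt from its point forecast and 80% lower bound:

```r
point <- 847.1119   # Jan 1985 point forecast from the table
lo80  <- 827.7200   # Jan 1985 lower 80% bound from the table

# An 80% interval leaves 10% in each tail, so its half-width is qnorm(0.90) * se
se <- (point - lo80) / qnorm(0.90)

# A 95% interval leaves 2.5% in each tail
lo95 <- point - qnorm(0.975) * se
hi95 <- point + qnorm(0.975) * se

round(c(se = se, Lo95 = lo95, Hi95 = hi95), 4)
```

The rebuilt bounds agree with the printed 817.4546 and 876.7693 up to rounding. The intervals widen with the horizon because the standard error grows as the forecast moves further from the last observation.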
Box.test(tsMod$residuals, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: tsMod$residuals
## X-squared = 0.037092, df = 1, p-value = 0.8473
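In the output above, the p-value of 0.8473 is well above the usual significance levels, so the test does not reject the null hypothesis of no residual autocorrelation. With the default `lag = 1`, however, only the first lag is checked. A sketch of a broader check follows, using base R's `arima()` on a simulated series; the data, the model order and the choice of 10 lags are all assumptions:

```r
set.seed(3)
y <- arima.sim(model = list(ar = 0.5), n = 200)   # hypothetical AR(1) series
fit <- arima(y, order = c(1, 0, 0))

# lag   = number of autocorrelations tested jointly
# fitdf = number of estimated AR and MA coefficients, subtracted from the
#         degrees of freedom
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)

lb$parameter   # degrees of freedom = lag - fitdf
lb$p.value
```

For the ARIMA(6,1,6) model in this walkthrough, `fitdf` would be 12, so `lag` would need to be larger than 12 for the degrees of freedom to stay positive.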