Several modifications can be made to the ARIMA model to account for seasonal and nonstationary behavior. Dependence on the past often tends to occur most strongly at multiples of some underlying seasonal lag \(s\). A purely seasonal ARMA model of orders \(P\) and \(Q\) takes the form

\[ \Phi_{P}\left(B^{s}\right) x_{t}=\Theta_{Q}\left(B^{s}\right) w_{t} \]

where \[ \Phi_{P}\left(B^{s}\right)=1-\Phi_{1} B^{s}-\Phi_{2} B^{2 s}-\cdots-\Phi_{P} B^{P s} \] and \[ \Theta_{Q}\left(B^{s}\right)=1+\Theta_{1} B^{s}+\Theta_{2} B^{2 s}+\cdots+\Theta_{Q} B^{Q s} \]

are the seasonal autoregressive operator and the seasonal moving average operator of orders \(P\) and \(Q\), respectively, with seasonal period \(s\).

Example:

A first-order seasonal autoregressive moving average series could be written as:

\[\left(1-\Phi B^{12}\right) x_{t}=\left(1+\Theta B^{12}\right) w_{t}\] o \[x_{t}=\Phi x_{t-12}+w_{t}+\Theta w_{t-12}\]
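To see what such a series looks like, it can be simulated in R by placing the AR and MA coefficients at lag 12. This is a minimal sketch; the coefficient values (0.8 and 0.5), the seed, and the series length are arbitrary choices for illustration, not part of the analysis below.

set.seed(123)
# Seasonal ARMA(1,1)_12: nonzero coefficients only at lag 12 (illustrative values)
sim <- arima.sim(model = list(ar = c(rep(0, 11), 0.8),
                              ma = c(rep(0, 11), 0.5)),
                 n = 240)
acf(sim, lag.max = 48)  # autocorrelation concentrates at multiples of 12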

Definition: The multiplicative seasonal autoregressive integrated moving average model, or SARIMA model, of Box and Jenkins (1970) is given by

\[\Phi_{P}\left(B^{s}\right) \phi(B) \nabla_{s}^{D} \nabla^{d} x_{t}=\alpha+\Theta_{Q}\left(B^{s}\right) \theta(B) w_{t}\]

where \(w_t\) is the usual white noise process. The general model is denoted \(\mathrm{ARIMA}(p, d, q) \times (P, D, Q)_s\).

Example:

The \(SARIMA(0, 1, 1) \times (0, 1, 1)_{12}\) model is written as:

\[\left(1-B^{12}\right)(1-B) x_{t}=\left(1+\Theta B^{12}\right)(1+\theta B) w_{t}\]
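As a quick illustration of how this notation maps to R's arima() interface, the same airline-type model can be fit to a built-in monthly series. The sketch below uses AirPassengers, a stand-in dataset rather than the series analyzed in this document:

# SARIMA(0,1,1)x(0,1,1)_12 fit to the built-in AirPassengers series (illustration only)
airline <- stats::arima(log(AirPassengers), order = c(0, 1, 1),
                        seasonal = list(order = c(0, 1, 1), period = 12))
airline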

Tempdub

Average monthly temperature (in degrees Fahrenheit) recorded in Dubuque from 1/1964 to 12/1975.

Load the libraries

library(car)          # qqPlot for residual diagnostics
library(urca)         # unit root tests: ur.df, ur.pp
library(forecast)     # autoplot, forecast, fitted, auto.arima
library(tseries)      # adf.test, jarque.bera.test, runs.test
#library(FitAR)
library(highcharter)  # interactive charts: hchart

Load the data

library(rio)
tempdub <-rio::import("https://github.com/Wilsonsr/Series-de-Tiempo/raw/main/bases/tempdub.dat")
head(tempdub)
##   tempdub
## 1    24.7
## 2    25.7
## 3    30.6
## 4    47.5
## 5    62.9
## 6    68.5

Time series format

z1<- ts(tempdub, start = c(1964,1), end=c(1975,12), frequency = 12 )
z1
##       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
## 1964 24.7 25.7 30.6 47.5 62.9 68.5 73.7 67.9 61.1 48.5 39.6 20.0
## 1965 16.1 19.1 24.2 45.4 61.3 66.5 72.1 68.4 60.2 50.9 37.4 31.1
## 1966 10.4 21.6 37.4 44.7 53.2 68.0 73.7 68.2 60.7 50.2 37.2 24.6
## 1967 21.5 14.7 35.0 48.3 54.0 68.2 69.6 65.7 60.8 49.1 33.2 26.0
## 1968 19.1 20.6 40.2 50.0 55.3 67.7 70.7 70.3 60.6 50.7 35.8 20.7
## 1969 14.0 24.1 29.4 46.6 58.6 62.2 72.1 71.7 61.9 47.6 34.2 20.4
## 1970  8.4 19.0 31.4 48.7 61.6 68.1 72.2 70.6 62.5 52.7 36.7 23.8
## 1971 11.2 20.0 29.6 47.7 55.8 73.2 68.0 67.1 64.9 57.1 37.6 27.7
## 1972 13.4 17.2 30.8 43.7 62.3 66.4 70.2 71.6 62.1 46.0 32.7 17.3
## 1973 22.5 25.7 42.3 45.2 55.5 68.9 72.3 72.3 62.5 55.6 38.0 20.4
## 1974 17.6 20.5 34.2 49.2 54.8 63.8 74.0 67.1 57.7 50.8 36.8 25.5
## 1975 20.4 19.6 24.6 41.3 61.8 68.5 72.0 71.1 57.3 52.5 40.6 26.2
summary(tempdub)
##     tempdub     
##  Min.   : 8.40  
##  1st Qu.:27.32  
##  Median :48.60  
##  Mean   :46.27  
##  3rd Qu.:63.12  
##  Max.   :74.00
diff(z1,12)
##        Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
## 1965  -8.6  -6.6  -6.4  -2.1  -1.6  -2.0  -1.6   0.5  -0.9   2.4  -2.2  11.1
## 1966  -5.7   2.5  13.2  -0.7  -8.1   1.5   1.6  -0.2   0.5  -0.7  -0.2  -6.5
## 1967  11.1  -6.9  -2.4   3.6   0.8   0.2  -4.1  -2.5   0.1  -1.1  -4.0   1.4
## 1968  -2.4   5.9   5.2   1.7   1.3  -0.5   1.1   4.6  -0.2   1.6   2.6  -5.3
## 1969  -5.1   3.5 -10.8  -3.4   3.3  -5.5   1.4   1.4   1.3  -3.1  -1.6  -0.3
## 1970  -5.6  -5.1   2.0   2.1   3.0   5.9   0.1  -1.1   0.6   5.1   2.5   3.4
## 1971   2.8   1.0  -1.8  -1.0  -5.8   5.1  -4.2  -3.5   2.4   4.4   0.9   3.9
## 1972   2.2  -2.8   1.2  -4.0   6.5  -6.8   2.2   4.5  -2.8 -11.1  -4.9 -10.4
## 1973   9.1   8.5  11.5   1.5  -6.8   2.5   2.1   0.7   0.4   9.6   5.3   3.1
## 1974  -4.9  -5.2  -8.1   4.0  -0.7  -5.1   1.7  -5.2  -4.8  -4.8  -1.2   5.1
## 1975   2.8  -0.9  -9.6  -7.9   7.0   4.7  -2.0   4.0  -0.4   1.7   3.8   0.7
autoplot(z1)

plot(decompose(z1))

hchart(z1)

Identification

par(mfrow=c(3,3))
# Original series, ACF and PACF
plot(z1,type="o")
acf(z1, lag.max=40)
pacf(z1, lag.max=40)
# First (regular) difference, with +/- 2 standard deviation bands
plot(diff(z1),type="o")
abline(h=2*sqrt(var(diff(z1))),col="red",lty=2)
abline(h=-2*sqrt(var(diff(z1))),col="red",lty=2)
acf(diff(z1), lag.max=40)
pacf(diff(z1), lag.max=40)
# Seasonal difference (lag 12), ACF and PACF
plot(diff(z1,12),type="o")
acf(diff(z1,12), lag.max=40)
pacf(diff(z1,12), lag.max=40)

adf.test(z1)
## Warning in adf.test(z1): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  z1
## Dickey-Fuller = -11.077, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary
adf.test(diff(z1))
## Warning in adf.test(diff(z1)): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff(z1)
## Dickey-Fuller = -10.856, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

DF test on the seasonally differenced series

plot(ur.df(diff(z1,12),type="none",lag=12))

summary(ur.df(diff(z1,12),type="none",lag=12))
## 
## ############################################### 
## # Augmented Dickey-Fuller Test Unit Root Test # 
## ############################################### 
## 
## Test regression none 
## 
## 
## Call:
## lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.8377  -2.5662   0.5987   2.5522  10.8391 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## z.lag.1      -1.64348    0.36600  -4.490 1.82e-05 ***
## z.diff.lag1   0.72942    0.33382   2.185  0.03108 *  
## z.diff.lag2   0.63916    0.31374   2.037  0.04412 *  
## z.diff.lag3   0.61574    0.29625   2.078  0.04008 *  
## z.diff.lag4   0.50539    0.27730   1.823  0.07119 .  
## z.diff.lag5   0.46527    0.25846   1.800  0.07468 .  
## z.diff.lag6   0.56632    0.23824   2.377  0.01924 *  
## z.diff.lag7   0.52325    0.21942   2.385  0.01887 *  
## z.diff.lag8   0.49019    0.19540   2.509  0.01363 *  
## z.diff.lag9   0.44101    0.17154   2.571  0.01153 *  
## z.diff.lag10  0.45063    0.14953   3.014  0.00323 ** 
## z.diff.lag11  0.32356    0.12630   2.562  0.01182 *  
## z.diff.lag12 -0.11132    0.09405  -1.184  0.23925    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.263 on 106 degrees of freedom
## Multiple R-squared:  0.5952, Adjusted R-squared:  0.5456 
## F-statistic: 11.99 on 13 and 106 DF,  p-value: 1.379e-15
## 
## 
## Value of test-statistic is: -4.4904 
## 
## Critical values for test statistics: 
##       1pct  5pct 10pct
## tau1 -2.58 -1.95 -1.62

Dickey-Fuller test from the tseries package

adf.test(diff(z1,12))
## Warning in adf.test(diff(z1, 12)): p-value smaller than printed p-value
## 
##  Augmented Dickey-Fuller Test
## 
## data:  diff(z1, 12)
## Dickey-Fuller = -4.9185, Lag order = 5, p-value = 0.01
## alternative hypothesis: stationary

Phillips-Perron test

plot(ur.pp(diff(z1,12),type="Z-tau",
     model="constant", lags="long"))

summary(ur.pp(diff(z1,12),type="Z-tau",
     model="constant", lags="long"))
## 
## ################################## 
## # Phillips-Perron Unit Root Test # 
## ################################## 
## 
## Test regression with intercept 
## 
## 
## Call:
## lm(formula = y ~ y.l1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.0779  -3.1268   0.3738   2.7058  13.0120 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.03669    0.41033  -0.089    0.929
## y.l1         0.08989    0.08660   1.038    0.301
## 
## Residual standard error: 4.695 on 129 degrees of freedom
## Multiple R-squared:  0.008283,   Adjusted R-squared:  0.000595 
## F-statistic: 1.077 on 1 and 129 DF,  p-value: 0.3012
## 
## 
## Value of test-statistic, type: Z-tau  is: -10.579 
## 
##          aux. Z statistics
## Z-tau-mu           -0.0858
## 
## Critical values for Z statistics: 
##                      1pct      5pct     10pct
## critical values -3.480998 -2.883488 -2.578338

Since the seasonally differenced series is stationary, we will propose some models.

par(mfrow=c(1,3))
plot(diff(z1,12),type="o")
acf(diff(z1,12), lag.max=40)
pacf(diff(z1,12), lag.max=40)

library(TSstudio)
ts_cor(diff(z1,12), lag.max=40)
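As an automated cross-check of the manual identification, auto.arima() from the forecast package can search over candidate orders. The arguments below (forcing one seasonal difference and a full, non-stepwise search) are one possible configuration, and its selection need not coincide with the model fitted next.

# Automated order search, forcing one seasonal difference (D = 1)
auto.arima(z1, D = 1, stepwise = FALSE, approximation = FALSE)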

Model fitting

# ARIMA(0,0,11)x(0,1,1)_12: eleven regular MA terms plus one seasonal MA term;
# the 12 NAs in `fixed` leave every coefficient freely estimated
modelo1<-stats::arima(z1,order=c(0,0,11), 
        seasonal=list(order=c(0,1,1), period=12), 
        fixed = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))
modelo1
## 
## Call:
## stats::arima(x = z1, order = c(0, 0, 11), seasonal = list(order = c(0, 1, 1), 
##     period = 12), fixed = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA))
## 
## Coefficients:
##          ma1      ma2      ma3      ma4      ma5     ma6      ma7      ma8
##       0.0773  -0.0780  -0.0443  -0.1188  -0.0646  0.1583  -0.0557  -0.1154
## s.e.  0.0874   0.0924   0.0897   0.0916   0.0892  0.0913   0.0910   0.0897
##           ma9    ma10     ma11     sma1
##       -0.1283  0.0038  -0.1521  -1.0000
## s.e.   0.0891  0.1036   0.1009   0.1079
## 
## sigma^2 estimated as 10.45:  log likelihood = -357.83,  aic = 741.66
# t statistics for the estimated coefficients and their one-sided p-values
tt <- modelo1$coef[which(modelo1$coef!=0)]/sqrt(diag(modelo1$var.coef))
1 - pt(abs(tt),(modelo1$nobs - length(modelo1$coef[which(modelo1$coef!=0)])))
##          ma1          ma2          ma3          ma4          ma5          ma6 
## 1.888921e-01 2.002158e-01 3.112488e-01 9.855008e-02 2.352154e-01 4.267266e-02 
##          ma7          ma8          ma9         ma10         ma11         sma1 
## 2.708313e-01 1.002882e-01 7.616236e-02 4.854649e-01 6.713506e-02 4.440892e-16
BIC(modelo1)
## [1] 779.1329
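Since most of the regular MA coefficients are individually non-significant, a more parsimonious candidate could be compared by AIC and BIC. The alternative order below, and the name modelo2, are illustrative choices rather than part of the original analysis.

# Parsimonious alternative: no regular MA terms, only the seasonal MA(1)
modelo2 <- stats::arima(z1, order = c(0, 0, 0),
                        seasonal = list(order = c(0, 1, 1), period = 12))
AIC(modelo1, modelo2)
BIC(modelo1, modelo2)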

Diagnostics

et<-residuals(modelo1)
x1.fit <- fitted(modelo1)

par(mfrow=c(3,2))
plot(z1,type="l",lty=2)
lines(x1.fit,type="l",col="red")
plot(scale(et),type="l",main="Residuales")
abline(h=2*sqrt(var(scale(et))),col="red",lty=2)
abline(h=-2*sqrt(var(scale(et))),col="red",lty=2)
acf(et)
pacf(et)
qqPlot(scale(et))
## [1]  25 111
acf(abs(et)) # measures heteroskedastic structure in the residuals
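The forecast package also offers a consolidated residual check (time plot, ACF, histogram, and a Ljung-Box test) in a single call; this is a complementary sketch, not a replacement for the individual tests below.

checkresiduals(modelo1)  # consolidated residual diagnostics from forecast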

Ljung-Box autocorrelation test

\(H_0: \rho_1 = \rho_2 = \rho_3 = \cdots = \rho_{\mathrm{lag}} = 0\)

\(H_a\): At least one is different from zero

Box.test(et,lag=15,type="Ljung-Box")
## 
##  Box-Ljung test
## 
## data:  et
## X-squared = 2.9849, df = 15, p-value = 0.9996
tsdiag(modelo1 , gof.lag = 20)

Normality test based on skewness and kurtosis

\(H_0:\) The data come from a normal distribution

\(H_a:\) The data do not come from a normal distribution

jarque.bera.test(et)
## 
##  Jarque Bera Test
## 
## data:  et
## X-squared = 2.3646, df = 2, p-value = 0.3066

Randomness test

\(H_0:\) The residuals exhibit random behavior

\(H_a:\) The residuals exhibit some structure (a trend or any other predictable behavior)

runs.test(as.factor(sign(et)), alternative="two.sided")
## 
##  Runs Test
## 
## data:  as.factor(sign(et))
## Standard Normal = -1.338, p-value = 0.1809
## alternative hypothesis: two.sided

Out-of-sample forecast

plot(forecast(modelo1,h=10, fan=T))
lines(fitted(modelo1), col="red")
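To inspect the forecasts numerically rather than graphically, the forecast object can be printed or converted to a data frame; the name fc below is an illustrative choice.

fc <- forecast(modelo1, h = 10)   # point forecasts with 80% and 95% intervals
fc
# as.data.frame(fc)               # same information as a data frame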