In macroeconomics, the gross domestic product (GDP, known in Spanish as PIB or PBI) is a macroeconomic aggregate that expresses the monetary value of the production of final-demand goods and services in a country or region over a given period, normally a year or a quarter.
Libraries
rm(list=ls()) # clear the workspace
library(fpp2)
library(TSA)
library(FitAR)
library(urca)
library(fBasics)
Load the data
Reading the data from a text file. The data correspond to Colombia's Gross Domestic Product, observed quarterly and seasonally adjusted, from the first quarter of 2000 to the fourth quarter of 2015.
#library(readr)
#PIB <- read_csv("PIB_real_2000_1_2015_4.txt",col_names = FALSE)
# Another way
z= ts(scan("PIB_real_2000_1_2015_4.txt"))
Read 64 items
print(z)
Time Series:
Start = 1
End = 64
Frequency = 1
[1] 70991 71017 71421 71332 71846 72014 72614 73065 72407 74907 74643 74832
[13] 75449 76724 77633 78612 80107 80093 81037 83629 83449 84900 85395 86412
[25] 87939 89875 91920 93204 94935 95484 97577 99987 99664 100571 101186 100323
[37] 100806 101778 102528 103267 104438 105364 106055 108742 110319 112157 114462 115640
[49] 116747 117778 117403 118952 120113 123305 124435 125978 127706 128211 129404 130168
[61] 131171 132204 133601 134400
print(length(z))
[1] 64
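The series above is read with a generic time index (frequency = 1). Since the data are quarterly, an alternative, not used in the rest of this document, is to attach an explicit quarterly index; a minimal sketch, assuming the same file:
# Hypothetical alternative: the same file read as a quarterly ts starting in 2000Q1.
z_q = ts(scan("PIB_real_2000_1_2015_4.txt"), start = c(2000, 1), frequency = 4)
start(z_q); end(z_q) # 2000 Q1 through 2015 Q4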
Plot of the series
plot.ts(z, type="o", cex=0.6,col="darkblue")
BoxCox.lambda(z, method = c("loglik"), lower = -2, upper = 2)
[1] 0.05
Log transformation
Since the estimated λ is close to 0, a natural-log transformation is a reasonable choice.
par(mfrow=c(2,1))
plot.ts(z, type="o", cex=0.6)# original
plot.ts(log(z), type="o", cex=0.6)# transformed
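As a cross-check on the log-likelihood estimate of λ, the forecast package also implements Guerrero's method; a minimal sketch (the document proceeds with the log transformation either way):
# Guerrero's variance-stabilizing estimate of lambda (forecast package).
BoxCox.lambda(z, method = "guerrero", lower = -2, upper = 2)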
Apply the FitAR library
(p=round(length(z)^(1/3)))
[1] 4
mod1=arima(z, c(p, 1, 0))
BoxCox(mod1)
Test:
H0: there is a unit root (the trend is stochastic). Ha: there is no unit root (the trend is deterministic). A sketch of the same test on the level series follows.
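A minimal sketch of the ADF test applied to the level series, allowing for a deterministic trend under the alternative (the lag limit is recomputed here so the sketch is self-contained; the document's formal test below is run on the differenced series):
# ADF test on the level series with intercept and trend (type = "trend").
maxlag_lvl = floor(12*(length(z)/100)^(1/4)) # same lag rule as used below
ru_z = ur.df(z, type = c("trend"), lags = maxlag_lvl, selectlags = c("AIC"))
summary(ru_z)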
We now difference the series once:
plot.ts(diff(z), type="o")
(maxlag=floor(12*(length(z)/100)^(1/4))) # rule of thumb for the maximum lag suggested in the literature (Schwert)
[1] 10
ru_dif_z=ur.df(diff(z), type = c("drift"), lags=maxlag, selectlags = c("AIC"))
summary(ru_dif_z)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################
Test regression drift
Call:
lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
Residuals:
Min 1Q Median 3Q Max
-1975.01 -342.97 -74.13 442.79 1989.07
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 986.1994 237.6083 4.151 0.000132 ***
z.lag.1 -0.8559 0.1887 -4.537 3.71e-05 ***
z.diff.lag -0.1273 0.1362 -0.935 0.354596
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 787.8 on 49 degrees of freedom
Multiple R-squared: 0.5023, Adjusted R-squared: 0.482
F-statistic: 24.73 on 2 and 49 DF, p-value: 3.766e-08
Value of test-statistic is: -4.5367 10.3262
Critical values for test statistics:
1pct 5pct 10pct
tau2 -3.51 -2.89 -2.58
phi1 6.70 4.71 3.86
The tau2 statistic (-4.54) is below even the 1% critical value (-3.51), so the null hypothesis of a unit root in the differenced series is rejected; the first difference is stationary.
Normality
resid1=ru_dif_z@testreg$residuals
plot(ru_dif_z)
Standardized residuals
resid1_sd=resid1/ru_dif_z@testreg$sigma
qqnorm(resid1, xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
qqline(resid1)
shapiro.test(resid1) # Shapiro-Wilk test
Shapiro-Wilk normality test
data: resid1
W = 0.96353, p-value = 0.1113
jarqueberaTest(resid1) # Jarque-Bera test
Title:
Jarque - Bera Normalality Test
Test Results:
STATISTIC:
X-squared: 1.3737
P VALUE:
Asymptotic p Value: 0.5032
Description:
Wed Jul 14 19:28:45 2021 by user: oscagaal
What are p and q?
"Automatic" model selection using the forecast library
auto.arima(z, d=1, max.p=5, max.q=5, ic=c("aic"))
Series: z
ARIMA(0,1,3) with drift
Coefficients:
ma1 ma2 ma3 drift
0.0514 0.2605 0.2197 995.7527
s.e. 0.1361 0.1507 0.1259 149.1463
sigma^2 estimated as 653706: log likelihood=-509.27
AIC=1028.53 AICc=1029.58 BIC=1039.25
auto.arima(z, d=1, max.p=5, max.q=5, ic=c("bic"))
Series: z
ARIMA(0,1,0) with drift
Coefficients:
drift
1006.492
s.e. 104.338
sigma^2 estimated as 696984: log likelihood=-512.7
AIC=1029.41 AICc=1029.61 BIC=1033.69
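auto.arima uses a stepwise search by default; an exhaustive search over the same (p, q) ranges is available and may or may not select a different model. A minimal sketch:
# Non-stepwise, non-approximated search over the same model space.
auto.arima(z, d=1, max.p=5, max.q=5, ic=c("bic"), stepwise=FALSE, approximation=FALSE)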
# model selection using the EACF: choose the model indicated by the vertex
# of a triangle of zeros in the table
eacf(diff(z)) # found in the TSA package
AR/MA
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 o o o o o o o o o o x o o o
1 o o o o o o o o o o o o o o
2 x o o o o o o o o o o o o o
3 x x o o o o o o o o o o o o
4 x o o o o o o o o o o o o o
5 x o o o o o o o o o o o o o
6 o o o x o o o o o o o o o o
7 x x o o o o o o o o o o o o
ARIMA(1,1,0): exact ML estimation with starting values given by the conditional (CSS) estimation
mod3_CSS_ML=Arima(z, c(1, 1, 0), include.drift=TRUE, lambda=1, method = c("CSS-ML"))
summary(mod3_CSS_ML)
Series: z
ARIMA(1,1,0) with drift
Box Cox transformation: lambda= 1
Coefficients:
ar1 drift
0.0487 1005.5300
s.e. 0.1263 109.4894
sigma^2 estimated as 706717: log likelihood=-512.63
AIC=1031.26 AICc=1031.66 BIC=1037.69
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 1.855449 820.7252 617.8031 -0.05127845 0.6739946 0.5645638 -0.01163188
Extract the residuals
res3_CSS_ML=residuals(mod3_CSS_ML)
res3_est=res3_CSS_ML/(mod3_CSS_ML$sigma2^.5) # standardize the residuals
autoplot(mod3_CSS_ML)
Normality
# normality check
# quantile-quantile plot
qqnorm(res3_est, xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
abline(a=0, b=1)
Normality tests
# normality tests
shapiro.test(res3_est)
Shapiro-Wilk normality test
data: res3_est
W = 0.98498, p-value = 0.6282
Detection of distant outlying observations
plot.ts(res3_est, type="o", ylim=c(-4,4))
abline(a=-3, b=0, col="red", lty=2)
abline(a=3, b=0, col="red", lty=2)
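To list programmatically any observations falling outside the ±3 bands, a small base-R sketch (an empty result means no distant outliers):
# Indices of standardized residuals beyond +/-3 standard deviations.
which(abs(res3_est) > 3)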
Fitted values of the model
ajust=mod3_CSS_ML$fitted
# plot of the fitted and observed values
ts.plot(z,ajust) # plot of the series against time
lines(z, col="black", type="o", cex=.5)
lines(ajust, col="red", type="o", cex=.5)
Evaluation of the statistical significance of the estimated coefficients
summary(mod3_CSS_ML)
Series: z
ARIMA(1,1,0) with drift
Box Cox transformation: lambda= 1
Coefficients:
ar1 drift
0.0487 1005.5300
s.e. 0.1263 109.4894
sigma^2 estimated as 706717: log likelihood=-512.63
AIC=1031.26 AICc=1031.66 BIC=1037.69
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 1.855449 820.7252 617.8031 -0.05127845 0.6739946 0.5645638 -0.01163188
(t=coef(mod3_CSS_ML)/(diag(vcov(mod3_CSS_ML)))^.5)
ar1 drift
0.3854143 9.1838115
# p-values
(val_p=2*pnorm(t, lower.tail=FALSE))
ar1 drift
6.999305e-01 4.160965e-20
ARIMA(0,1,3): exact ML estimation. The ar1 coefficient of the previous model is not significant (p ≈ 0.70), so we now fit the ARIMA(0,1,3) suggested by the AIC.
mod4_CSS_ML=Arima(z, c(0, 1, 3), include.drift=TRUE, lambda=1, method = c("CSS-ML"))
summary(mod4_CSS_ML)
Series: z
ARIMA(0,1,3) with drift
Box Cox transformation: lambda= 1
Coefficients:
ma1 ma2 ma3 drift
0.0514 0.2605 0.2197 995.7527
s.e. 0.1361 0.1507 0.1259 149.1463
sigma^2 estimated as 653706: log likelihood=-509.27
AIC=1028.53 AICc=1029.58 BIC=1039.25
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 10.13972 776.2961 579.4217 -0.02546079 0.6295016 0.5294899 -0.03977131
Normality
res4_CSS_ML=residuals(mod4_CSS_ML)
res4_est=res4_CSS_ML/(mod4_CSS_ML$sigma2^.5) # standardize the residuals
qqnorm(res4_est, xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
abline(a=0, b=1)
# normality tests
shapiro.test(res4_est)
Shapiro-Wilk normality test
data: res4_est
W = 0.96735, p-value = 0.08803
Detection of distant outlying observations
plot.ts(res4_est, type="o", ylim=c(-4,4))
abline(a=-3, b=0, col="red", lty=2)
abline(a=3, b=0, col="red", lty=2)
Model fit plot
ajust=mod4_CSS_ML$fitted
# plot of the fitted and observed values
ts.plot(z,ajust) # plot of the series against time
lines(z, col="black", type="o", cex=.5)
lines(ajust, col="red", type="o", cex=.5)
Evaluation of the statistical significance of the estimated coefficients
summary(mod4_CSS_ML)
Series: z
ARIMA(0,1,3) with drift
Box Cox transformation: lambda= 1
Coefficients:
ma1 ma2 ma3 drift
0.0514 0.2605 0.2197 995.7527
s.e. 0.1361 0.1507 0.1259 149.1463
sigma^2 estimated as 653706: log likelihood=-509.27
AIC=1028.53 AICc=1029.58 BIC=1039.25
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 10.13972 776.2961 579.4217 -0.02546079 0.6295016 0.5294899 -0.03977131
(t=coef(mod4_CSS_ML)/(diag(vcov(mod4_CSS_ML)))^.5)
ma1 ma2 ma3 drift
0.3778552 1.7280337 1.7450149 6.6763469
# p-values
(val_p=2*pnorm(t, lower.tail=FALSE))
ma1 ma2 ma3 drift
7.055381e-01 8.398218e-02 8.098228e-02 2.449718e-11
Since ma1 is not significant (p ≈ 0.71), the model is re-estimated with ma1 constrained to 0; the positions in fixed follow the coefficient order ma1, ma2, ma3, drift.
mod4_CSS_ML=Arima(z, c(0, 1, 3), include.drift=TRUE, lambda=1, method = c("CSS-ML"),
fixed=c(0,NA,NA,NA))
summary(mod4_CSS_ML)
Series: z
ARIMA(0,1,3) with drift
Box Cox transformation: lambda= 1
Coefficients:
ma1 ma2 ma3 drift
0 0.2645 0.1939 996.9528
s.e. 0 0.1517 0.1073 142.2553
sigma^2 estimated as 644431: log likelihood=-509.34
AIC=1026.67 AICc=1027.36 BIC=1035.25
Training set error measures:
ME RMSE MAE MPE MAPE MASE ACF1
Training set 9.77627 777.2735 580.7561 -0.02777907 0.6314878 0.5307093 0.008102302
(res4_CSS_ML=residuals(mod4_CSS_ML))
Time Series:
Start = 1
End = 64
Frequency = 1
[1] 69.99301 -922.61328 -521.29177 -817.34887 -188.16676 -522.66316 -191.28978
[8] -372.50309 -1502.86854 1638.27039 -791.75994 -949.91678 -488.13359 682.76763
[15] 225.30739 -103.90598 306.08908 -1027.14989 -113.76706 1807.38078 -947.73090
[22] -1.93761 -601.67709 204.29423 689.56256 1001.65853 826.05624 -111.56906
[29] 321.37124 -578.58925 1032.67622 1503.77679 -1480.91918 -687.89482 -281.79316
[36] -1390.90642 -306.05963 397.56365 103.65004 -303.77071 69.55762 -10.70181
[43] -265.45897 1679.39277 652.33413 448.32255 809.92874 -63.99785 -191.08908
[50] -106.04492 -1309.00384 617.14143 530.82901 2285.59101 -126.99755 -161.38763
[57] 321.53462 -424.64607 142.29123 -182.97187 50.73730 56.85639 422.09987
[64] -222.82728
res4_est=res4_CSS_ML/(mod4_CSS_ML$sigma2^.5) # standardize the residuals
# quantile-quantile plot
qqnorm(res4_est, xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
abline(a=0, b=1)
shapiro.test(res4_est) # Shapiro-Wilk test
Shapiro-Wilk normality test
data: res4_est
W = 0.96739, p-value = 0.08836
normalTest(res4_est, method=("jb")) # Jarque-Bera test
Title:
Jarque - Bera Normalality Test
Test Results:
STATISTIC:
X-squared: 4.0627
P VALUE:
Asymptotic p Value: 0.1312
Description:
Wed Jul 14 20:11:41 2021 by user: oscagaal
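Besides normality, residual autocorrelation can be checked with a Ljung-Box test; a minimal sketch, assuming res4_CSS_ML as computed above (fitdf = 2 accounts for the two freely estimated MA coefficients):
# Ljung-Box test on the residuals of the restricted ARIMA(0,1,3) model.
Box.test(res4_CSS_ML, lag = 10, type = "Ljung-Box", fitdf = 2)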
Final plot
(ajust=mod4_CSS_ML$fitted)
Time Series:
Start = 1
End = 64
Frequency = 1
[1] 70921.01 71939.61 71942.29 72149.35 72034.17 72536.66 72805.29 73437.50 73909.87
[10] 73268.73 75434.76 75781.92 75937.13 76041.23 77407.69 78715.91 79800.91 81120.15
[19] 81150.77 81821.62 84396.73 84901.94 85996.68 86207.71 87249.44 88873.34 91093.94
[28] 93315.57 94613.63 96062.59 96544.32 98483.22 101144.92 101258.89 101467.79 101713.91
[37] 101112.06 101380.44 102424.35 103570.77 104368.44 105374.70 106320.46 107062.61 109666.67
[46] 111708.68 113652.07 115704.00 116938.09 117884.04 118712.00 118334.86 119582.17 121019.41
[55] 124562.00 126139.39 127384.47 128635.65 129261.71 130350.97 131120.26 132147.14 133178.90
[64] 134622.83
ts.plot(z,ajust) # plot of the series against time
lines(ajust, col="red", type="o", cex=.5)
Out-of-sample evaluation: the model is re-estimated on the first 56 observations, and the last 8 quarters are held out for forecast comparison.
mod_Evalpron=Arima(z[1:56], c(0, 1, 3), include.drift=TRUE, lambda=1, method = c("CSS-ML"),
fixed=c(0,NA,NA,NA))
(z_pred=forecast(mod_Evalpron, h=8, level=c(80, 95), fan=FALSE))
Plot of the forecasts
plot(z_pred)
# or, alternatively,
autoplot(z_pred)
Evaluation of the forecasts
# list of the actual values and the forecasts
cbind(z[57:64], z_pred$mean)
Time Series:
Start = 57
End = 64
Frequency = 1
z[57:64] z_pred$mean
57 127706 127350.1
58 128211 128289.1
59 129404 129258.3
60 130168 130251.5
61 131171 131244.6
62 132204 132237.7
63 133601 133230.9
64 134400 134224.0
ts.plot(z[57:64], z_pred$mean, z_pred$lower[,2], z_pred$upper[,2], type="o")
lines(z_pred$mean, col="red")
lines(z_pred$lower[,2], col="blue")
lines(z_pred$upper[,2], col="blue")
Forecast accuracy
(eamp=100*mean(abs((z[57:64]-ts(z_pred$mean))/z[57:64]))) # mean absolute percentage error
[1] 0.1257365
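As a cross-check on the MAPE above, forecast's accuracy() function returns the same kind of out-of-sample measures; a minimal sketch, assuming z_pred and z as defined earlier:
# Out-of-sample accuracy measures for the 8 held-out quarters.
accuracy(z_pred, z[57:64])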