Each of these series are white noise, investigation into the lag spikes of x1 and x2 would require more research to confirm. The confidence intervals fluctuate significantly between the graphs and this is why x2 may have more cause for concern than any of the other variables. The critical values are various separations because of the fundamental variety in the information.
autoplot(ibmclose)
acf(ibmclose)
pacf(ibmclose)
There is strong autocorrelation, but the PACF plot tells us that the most recent price is driving the autocorrelations.
3.a
Box.test(usnetelec, lag = 10, type = "Ljung-Box")
##
## Box-Ljung test
##
## data: usnetelec
## X-squared = 329.22, df = 10, p-value < 2.2e-16
usnetelec %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 1.464
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
usnetelec %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.1585
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
3.b
Box.test(usgdp, lag =10, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: usgdp
## X-squared = 2078.3, df = 10, p-value < 2.2e-16
usgdp %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 4.6556
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
usgdp %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 1.7909
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
usgdp %>% diff() %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.021
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
3.c
Box.test(mcopper, lag =10, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: mcopper
## X-squared = 3819, df = 10, p-value < 2.2e-16
mcopper %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 5.01
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
mcopper %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 0.1843
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
3.d
Box.test(enplanements, lag =10, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: enplanements
## X-squared = 2122.7, df = 10, p-value < 2.2e-16
enplanements %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 4.4423
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
enplanements %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0086
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
3.e
Box.test(visitors, lag =10, type = 'Ljung-Box')
##
## Box-Ljung test
##
## data: visitors
## X-squared = 1522.6, df = 10, p-value < 2.2e-16
visitors %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 4.6025
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
visitors %>% diff() %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.0191
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
(1-B)^1*y_t
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
ggtsdisplay(myts)
bc.lambda <- BoxCox.lambda(myts)
paste('Best Box-Cox lambda for this time series is:', bc.lambda)
## [1] "Best Box-Cox lambda for this time series is: 0.127636859661548"
myts.s <- myts %>% BoxCox(bc.lambda) %>% diff(1) %>% diff(12)
ggtsdisplay(myts.s)
myts.s %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0138
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The data is stationary. The unit root test indicates that the test statistic is small.
6.a
ar.1 <- function(phi, sd=1, n=100){y <- ts(numeric(n))
e <- rnorm(n, sd=sd)
for(i in 2:n)
y[i] <- phi*y[i-1] + e[i]
return(y)}
6.b
data.list <- list()
i <- 0
phi.vec <- c(0.0000001, 0.0001, 0.1) * 6
for (phi in phi.vec){
i <- i + 1
data.list[[i]] <- ar.1(phi)
}
gen.data <- do.call(cbind, data.list)
colnames(gen.data) <- paste('phi=', phi.vec, sep = '')
autoplot(gen.data) + ylab('Data')
par(mfrow=c(1,3))
acf(gen.data[,1], main='Phi=0.0000006')
acf(gen.data[,2], main='Phi=0.0006')
acf(gen.data[,3], main='Phi=0.6')
6.c
ma.1 <- function(theta, sd=1, n=100){
y <- ts(numeric(n))
e <- rnorm(n, sd=sd)
e[1] <- 0
for(i in 2:n)
y[i] <- theta*e[i-1] + e[i]
return(y)}
6.d
data.list <- list()
i <- 0
theta.vec <- c(0.0000001, 0.0001, 0.1) * 6
for (theta in theta.vec){
i <- i + 1
data.list[[i]] <- ma.1(theta)}
gen.data <- do.call(cbind, data.list)
colnames(gen.data) <- paste('theta=', theta.vec, sep = '')
autoplot(gen.data) + ylab('Data')
par(mfrow=c(1,3))
acf(gen.data[,1], main='theta=0.0000006')
acf(gen.data[,2], main='theta=0.0006')
acf(gen.data[,3], main='theta=0.6')
6.e
y1 <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 2:100)
y1[i] <- 0.6*y1[i-1] + 0.6*e[i-1] + e[i]
autoplot(y1)+ggtitle('ARMA(1,1) Generated Data')
6.f
y2 <- ts(numeric(100))
e <- rnorm(100, sd=1)
for(i in 3:100)
y2[i] <- -0.8*y2[i-1] + 0.3*y2[i-2] + e[i]
autoplot(y2)+ggtitle('AR(2) Generated Data')
7.a
ggtsdisplay(wmurders, main='Annual Female Murder Rate in USA')
wmurders.s <- wmurders %>% diff(1) %>% diff(1)
ggtsdisplay(wmurders.s)
wmurders.s %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.0458
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
7.b no, because a non-seasonal ARIMA model has equivalent constants
7.c (1−ϕ1B)(1−B)2yt=c+(1+θ1B)εt, where c will be 0 if no constant included.
7.d
(fit <- Arima(wmurders, order=c(1,2,1)))
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,2,1)
## Q* = 12.419, df = 8, p-value = 0.1335
##
## Model df: 2. Total lags used: 10
7.e
forecast(fit, h=3)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.470660 2.194836 2.746484 2.048824 2.892496
## 2006 2.363106 1.986351 2.739862 1.786908 2.939304
## 2007 2.252833 1.765391 2.740276 1.507354 2.998313
tail(wmurders, 3)
## Time Series:
## Start = 2002
## End = 2004
## Frequency = 1
## [1] 2.797697 2.662227 2.589383
tail(residuals(fit), 1)
## Time Series:
## Start = 2004
## End = 2004
## Frequency = 1
## [1] 0.03708172
For 2005 forecast: yt=1.7566×2.589383−0.5132×2.662227−0.2434×2.797697−0.8261×0.03708172=2.470663 For 2006 forecast: yt=1.7566×2.470663−0.5132×2.589383−0.2434×2.662227−0.8261×0=2.363109 For 2007 forecast: yt=1.7566×2.363109−0.5132×2.470663−0.2434×2.589383−0.8261×0=2.252837
7.f
autoplot(forecast(fit, h=3))
(fit2 <- auto.arima(wmurders, seasonal=F, stepwise=F, approximation=F))
## Series: wmurders
## ARIMA(0,2,3)
##
## Coefficients:
## ma1 ma2 ma3
## -1.0154 0.4324 -0.3217
## s.e. 0.1282 0.2278 0.1737
##
## sigma^2 estimated as 0.04475: log likelihood=7.77
## AIC=-7.54 AICc=-6.7 BIC=0.35
forecast(fit2, h=3)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.408817 2.137718 2.679916 1.994206 2.823428
## 2006 2.365555 1.985092 2.746018 1.783687 2.947423
## 2007 2.290976 1.753245 2.828706 1.468588 3.113363
autoplot(forecast(fit2, h=3))
8.a
autoplot(austa)
fc_austa_autoarima <- forecast(
auto.arima(austa), h = 10
)
fc_austa_autoarima$model
## Series: austa
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## 0.3006 0.1735
## s.e. 0.1647 0.0390
##
## sigma^2 estimated as 0.03376: log likelihood=10.62
## AIC=-15.24 AICc=-14.46 BIC=-10.57
checkresiduals(fc_austa_autoarima)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
##
## Model df: 2. Total lags used: 7
autoplot(fc_austa_autoarima)
8.b
fc_austa_arima.0.1.1 <- forecast(Arima(austa, order = c(0, 1, 1)), h = 10)
autoplot(fc_austa_arima.0.1.1)
fc_austa_arima.0.1.0 <- forecast(Arima(austa, order = c(0, 1, 0)), h = 10)
autoplot(fc_austa_arima.0.1.0)
fc_austa_arima.0.1.1$upper - fc_austa_arima.0.1.0$upper
## Time Series:
## Start = 2016
## End = 2025
## Frequency = 1
## fc_austa_arima.0.1.1$upper.80% fc_austa_arima.0.1.1$upper.95%
## 2016 0.1138117 0.09101057
## 2017 0.2039418 0.22885273
## 2018 0.2527212 0.30345426
## 2019 0.2886706 0.35843413
## 2020 0.3180942 0.40343373
## 2021 0.3434784 0.44225553
## 2022 0.3660754 0.47681466
## 2023 0.3866116 0.50822207
## 2024 0.4055495 0.53718500
## 2025 0.4232034 0.56418442
fc_austa_arima.0.1.0$lower - fc_austa_arima.0.1.1$lower
## Time Series:
## Start = 2016
## End = 2025
## Frequency = 1
## fc_austa_arima.0.1.0$lower.80% fc_austa_arima.0.1.0$lower.95%
## 2016 -0.199956384 -0.22275751
## 2017 -0.109826244 -0.08491535
## 2018 -0.061046922 -0.01031382
## 2019 -0.025097519 0.04466605
## 2020 0.004326137 0.08966565
## 2021 0.029710347 0.12848745
## 2022 0.052307350 0.16304658
## 2023 0.072843552 0.19445399
## 2024 0.091781392 0.22341692
## 2025 0.109435361 0.25041634
8.c
fc_austa_arima.2.1.3.drift <- forecast(Arima(austa, order = c(2, 1, 3), include.drift = TRUE),h = 10)
autoplot(fc_austa_arima.2.1.3.drift)
drift_austa <- fc_austa_arima.2.1.3.drift$model$coef[6]
fc_austa_arima.2.1.3.nodrift <- fc_austa_arima.2.1.3.drift$mean - drift_austa*seq_len(10)
autoplot(fc_austa_arima.2.1.3.drift)+autolayer(fc_austa_arima.2.1.3.nodrift)
8.d
fc_austa_arima.0.0.1.const <- forecast(Arima(austa, order = c(0, 0, 1), include.constant = TRUE),h = 10)
autoplot(fc_austa_arima.0.0.1.const)
fc_austa_arima.0.0.0.const <- forecast(Arima(austa, order = c(0, 0, 0), include.constant = TRUE),h = 10)
autoplot(fc_austa_arima.0.0.0.const)
8.e
fc_austa_arima.0.2.1 <- forecast(Arima(austa, order = c(0, 2, 1)),h = 10)
autoplot(fc_austa_arima.0.2.1)
9.a
autoplot(usgdp)
autoplot(BoxCox(usgdp, BoxCox.lambda(usgdp)))
lambda_usgdp <- BoxCox.lambda(usgdp)
9.b
usgdp_autoarima <- auto.arima(usgdp,lambda = lambda_usgdp)
autoplot(usgdp, series = "Data")+autolayer(usgdp_autoarima$fitted, series = "Fitted")
usgdp_autoarima
## Series: usgdp
## ARIMA(2,1,0) with drift
## Box Cox transformation: lambda= 0.366352
##
## Coefficients:
## ar1 ar2 drift
## 0.2795 0.1208 0.1829
## s.e. 0.0647 0.0648 0.0202
##
## sigma^2 estimated as 0.03518: log likelihood=61.56
## AIC=-115.11 AICc=-114.94 BIC=-101.26
9.c
ndiffs(BoxCox(usgdp, lambda_usgdp))
## [1] 1
ggtsdisplay(diff(BoxCox(usgdp, lambda_usgdp)))
usgdp_arima.1.1.0 <- Arima(usgdp, lambda = lambda_usgdp, order = c(1, 1, 0))
usgdp_arima.1.1.0
## Series: usgdp
## ARIMA(1,1,0)
## Box Cox transformation: lambda= 0.366352
##
## Coefficients:
## ar1
## 0.6326
## s.e. 0.0504
##
## sigma^2 estimated as 0.04384: log likelihood=34.39
## AIC=-64.78 AICc=-64.73 BIC=-57.85
autoplot(usgdp, series = "Data")+autolayer(usgdp_arima.1.1.0$fitted, series = "Fitted")
usgdp_arima.1.1.0.drift <- Arima(usgdp, lambda = lambda_usgdp, order = c(1, 1, 0),include.drift = TRUE)
usgdp_arima.1.1.0.drift
## Series: usgdp
## ARIMA(1,1,0) with drift
## Box Cox transformation: lambda= 0.366352
##
## Coefficients:
## ar1 drift
## 0.3180 0.1831
## s.e. 0.0619 0.0179
##
## sigma^2 estimated as 0.03555: log likelihood=59.83
## AIC=-113.66 AICc=-113.56 BIC=-103.27
autoplot(usgdp, series = "Data")+autolayer(usgdp_arima.1.1.0.drift$fitted, series = "Fitted")
9.d
accuracy(usgdp_autoarima)
## ME RMSE MAE MPE MAPE MASE
## Training set 1.195275 39.2224 29.29521 -0.01363259 0.6863491 0.1655687
## ACF1
## Training set -0.03824844
accuracy(usgdp_arima.1.1.0)
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 15.45449 45.49569 35.08393 0.3101283 0.7815664 0.198285 -0.3381619
accuracy(usgdp_arima.1.1.0.drift)
## ME RMSE MAE MPE MAPE MASE
## Training set 1.315796 39.90012 29.5802 -0.01678591 0.6834509 0.1671794
## ACF1
## Training set -0.08544569
checkresiduals(usgdp_autoarima)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,0) with drift
## Q* = 6.5772, df = 5, p-value = 0.254
##
## Model df: 3. Total lags used: 8
checkresiduals(usgdp_arima.1.1.0.drift)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,1,0) with drift
## Q* = 10.274, df = 6, p-value = 0.1136
##
## Model df: 2. Total lags used: 8
9.e
fc_usgdp_autoarima <- forecast(usgdp_autoarima)
autoplot(fc_usgdp_autoarima)
9.f
fc_usgdp_ets <- forecast(ets(usgdp))
autoplot(fc_usgdp_ets)
10.a
autoplot(austourists)
The plots have a strong seasonality and increasing trend.
10.b
ggAcf(austourists)
The values at the lags of multiple of 4 were big compared to the others.
10.c
ggPacf(austourists)
There are 5 significant spikes and no spikes like these afterwards.
10.d
ggtsdisplay(diff(austourists, lag = 4))
ggtsdisplay(diff(diff(austourists, lag = 4)))
10.e
fc_austourists_autoarima <- forecast(auto.arima(austourists))
fc_austourists_autoarima$model
## Series: austourists
## ARIMA(1,0,0)(1,1,0)[4] with drift
##
## Coefficients:
## ar1 sar1 drift
## 0.4705 -0.5305 0.5489
## s.e. 0.1154 0.1122 0.0864
##
## sigma^2 estimated as 5.15: log likelihood=-142.48
## AIC=292.97 AICc=293.65 BIC=301.6
fc_austourists_arima.1.1.0.1.1.0.4 <- forecast(
Arima(austourists, order = c(1, 1, 0), seasonal = c(1, 1, 0)))
autoplot(fc_austourists_autoarima)
autoplot(fc_austourists_arima.1.1.0.1.1.0.4)
accuracy(fc_austourists_autoarima)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.02200144 2.149384 1.620917 -0.7072593 4.388288 0.5378929
## ACF1
## Training set -0.06393238
accuracy(fc_austourists_arima.1.1.0.1.1.0.4)
## ME RMSE MAE MPE MAPE MASE
## Training set 0.07547467 2.288429 1.661709 -0.3718168 4.49153 0.5514295
## ACF1
## Training set -0.03831574
checkresiduals(fc_austourists_autoarima)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,0)(1,1,0)[4] with drift
## Q* = 4.0937, df = 5, p-value = 0.536
##
## Model df: 3. Total lags used: 8
10.f
fc_austourists_autoarima$model
## Series: austourists
## ARIMA(1,0,0)(1,1,0)[4] with drift
##
## Coefficients:
## ar1 sar1 drift
## 0.4705 -0.5305 0.5489
## s.e. 0.1154 0.1122 0.0864
##
## sigma^2 estimated as 5.15: log likelihood=-142.48
## AIC=292.97 AICc=293.65 BIC=301.6
11.a
usmelec_ma2x12 <- ma(usmelec, order = 12, centre = TRUE)
autoplot(usmelec, series = "Data")+
autolayer(usmelec_ma2x12, series = "2X12-MA")+
ylab(expression(paste("Electricity(x", 10^{9}, "KWh)")))+
ggtitle("Monthly total net generation of electricity")+
scale_color_discrete(breaks = c("Data", "2X12-MA"))
## Warning: Removed 12 row(s) containing missing values (geom_path).
Total net generation amount increased at first but stopped increasing since 2008.
11.b
lambda_usmelec <- BoxCox.lambda(usmelec)
11.c
ndiffs(usmelec)
## [1] 1
nsdiffs(usmelec)
## [1] 1
11.d
lambda_usmelec <- BoxCox.lambda(usmelec)
ggtsdisplay(diff(BoxCox(usmelec, lambda_usmelec),lag = 12))
ggtsdisplay(diff(diff(BoxCox(usmelec, lambda_usmelec),lag = 12)))
usmelec_arima.0.1.2.0.1.1.12 <- Arima(usmelec,lambda = lambda_usmelec,order = c(0, 1, 2),seasonal = c(0, 1, 1))
usmelec_arima.0.1.3.0.1.1.12 <- Arima(usmelec,lambda =lambda_usmelec,order = c(0, 1, 3),seasonal = c(0, 1, 1))
usmelec_arima.0.1.2.0.1.1.12$aic
## [1] -5081.506
usmelec_arima.0.1.3.0.1.1.12$aic
## [1] -5080.441
11.e
usmelec_arima.0.1.2.0.1.1.12
## Series: usmelec
## ARIMA(0,1,2)(0,1,1)[12]
## Box Cox transformation: lambda= -0.5738331
##
## Coefficients:
## ma1 ma2 sma1
## -0.4317 -0.2552 -0.8536
## s.e. 0.0439 0.0440 0.0261
##
## sigma^2 estimated as 1.284e-06: log likelihood=2544.75
## AIC=-5081.51 AICc=-5081.42 BIC=-5064.87
checkresiduals(usmelec_arima.0.1.2.0.1.1.12)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,2)(0,1,1)[12]
## Q* = 32.525, df = 21, p-value = 0.05176
##
## Model df: 3. Total lags used: 24
usmelec_autoarima <- auto.arima(usmelec,lambda = lambda_usmelec)
usmelec_autoarima
## Series: usmelec
## ARIMA(1,1,3)(2,1,1)[12]
## Box Cox transformation: lambda= -0.5738331
##
## Coefficients:
## ar1 ma1 ma2 ma3 sar1 sar2 sma1
## 0.3952 -0.8194 -0.0468 0.0414 0.0403 -0.0934 -0.8462
## s.e. 0.3717 0.3734 0.1707 0.1079 0.0560 0.0533 0.0343
##
## sigma^2 estimated as 1.278e-06: log likelihood=2547.75
## AIC=-5079.5 AICc=-5079.19 BIC=-5046.23
checkresiduals(usmelec_autoarima)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,1,3)(2,1,1)[12]
## Q* = 25.995, df = 17, p-value = 0.07454
##
## Model df: 7. Total lags used: 24
11.f
lambda <- BoxCox.lambda(usmelec)
fit <- Arima(usmelec, order=c(2,1,3), seasonal=c(2,1,3), lambda=lambda)
fcast <- forecast(fit, h=12*15)
plot(fcast, ylab="kwh in Billions", main="15-year Prediction of Electricity Generation")
actual <- read.csv(paste0("https://raw.githubusercontent.com/jzuniga123/","SPS/master/DATA%20624/electricity-overview.csv"), stringsAsFactors=FALSE)
actual <- head(actual, -1) # remove last row with descriptive text
actual_0 <- as.integer(unlist(strsplit(as.character(head(actual["Month"], 1)), "-")))
actual_N <- as.integer(unlist(strsplit(as.character(tail(actual["Month"], 1)), "-")))
usmelec_test <- ts(actual[2], start=actual_0, end=actual_N, frequency=12)
buffer <- 5
usmelec_N <- tail(time(usmelec), 1)
fcast_0 <- c(floor(usmelec_N), as.numeric(substr(usmelec_N, 5, 7)))
12.a
lam <- BoxCox.lambda(mcopper)
mcopper.bcx <- BoxCox(mcopper, lambda = lam)
tsdisplay(mcopper.bcx)
12.b
fit <- auto.arima(mcopper, trace = TRUE, ic ="aic", lambda = lam)
##
## Fitting models using approximations to speed things up...
##
## ARIMA(2,1,2)(1,0,1)[12] with drift : -66.32021
## ARIMA(0,1,0) with drift : -13.82248
## ARIMA(1,1,0)(1,0,0)[12] with drift : -62.06849
## ARIMA(0,1,1)(0,0,1)[12] with drift : -82.25982
## ARIMA(0,1,0) : -12.82906
## ARIMA(0,1,1) with drift : -83.13952
## ARIMA(0,1,1)(1,0,0)[12] with drift : -73.91207
## ARIMA(0,1,1)(1,0,1)[12] with drift : -71.92506
## ARIMA(1,1,1) with drift : -80.30584
## ARIMA(0,1,2) with drift : -81.16609
## ARIMA(1,1,0) with drift : -71.41152
## ARIMA(1,1,2) with drift : -78.27629
## ARIMA(0,1,1) : -83.33279
## ARIMA(0,1,1)(1,0,0)[12] : -74.05872
## ARIMA(0,1,1)(0,0,1)[12] : -82.603
## ARIMA(0,1,1)(1,0,1)[12] : -72.07316
## ARIMA(1,1,1) : -80.55931
## ARIMA(0,1,2) : -81.33929
## ARIMA(1,1,0) : -71.94456
## ARIMA(1,1,2) : -80.62448
##
## Now re-fitting the best model(s) without approximations...
##
## ARIMA(0,1,1) : -86.0969
##
## Best model: ARIMA(0,1,1)
fit
## Series: mcopper
## ARIMA(0,1,1)
## Box Cox transformation: lambda= 0.1919047
##
## Coefficients:
## ma1
## 0.3720
## s.e. 0.0388
##
## sigma^2 estimated as 0.04997: log likelihood=45.05
## AIC=-86.1 AICc=-86.08 BIC=-77.43
13.a
autoplot(qauselec)
lambda_qauselec <- BoxCox.lambda(qauselec)
13.b
nsdiffs(qauselec)
## [1] 1
ndiffs(qauselec)
## [1] 1
kpss.test(diff(qauselec, lag = 4))
##
## KPSS Test for Level Stationarity
##
## data: diff(qauselec, lag = 4)
## KPSS Level = 0.39382, Truncation lag parameter = 4, p-value = 0.07982
13.c
ggtsdisplay(diff(BoxCox(qauselec, lambda_qauselec), lag = 4))
ggtsdisplay(diff(diff(BoxCox(qauselec, lambda_qauselec), lag = 4)))
qauselec_arima0.0.1.0.1.1.4 <- Arima(qauselec, lambda = lambda_qauselec,order = c(0, 0, 1), seasonal = c(0, 1, 1))
qauselec_arima0.1.1.0.1.1.4 <- Arima(qauselec, lambda = lambda_qauselec,order = c(0, 1, 1), seasonal = c(0, 1, 1))
qauselec_arima0.1.1.0.1.2.4 <- Arima(qauselec, lambda = lambda_qauselec,order = c(0, 1, 1), seasonal = c(0, 1, 2))
qauselec_autoarima <- auto.arima(qauselec, lambda = lambda_qauselec)
13.d
checkresiduals(qauselec_autoarima)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,1,1)(1,1,2)[4]
## Q* = 10.605, df = 3, p-value = 0.01407
##
## Model df: 5. Total lags used: 8
checkresiduals(qauselec_arima0.0.1.0.1.1.4)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,1)(0,1,1)[4]
## Q* = 84.595, df = 6, p-value = 4.441e-16
##
## Model df: 2. Total lags used: 8
checkresiduals(qauselec_arima0.1.1.0.1.1.4)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1)(0,1,1)[4]
## Q* = 15.102, df = 6, p-value = 0.01948
##
## Model df: 2. Total lags used: 8
checkresiduals(qauselec_arima0.1.1.0.1.2.4)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1)(0,1,2)[4]
## Q* = 13.684, df = 5, p-value = 0.01775
##
## Model df: 3. Total lags used: 8
13.e
fc_qauselec_autoarima <- forecast(qauselec_autoarima, h = 8)
autoplot(fc_qauselec_autoarima)
13.f
fc_qauselec_ets <- forecast(ets(qauselec), h = 8)
autoplot(fc_qauselec_ets)
fc_qauselec_stlf <- stlf(qauselec, lambda = BoxCox.lambda(qauselec),
s.window = 5, robust = TRUE, method = "arima",
h = 8)
autoplot(fc_qauselec_stlf)+scale_x_continuous(limits = c(2005, 2012))+scale_y_continuous(limits = c(50, 70))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 196 row(s) containing missing values (geom_path).
autoplot(fc_qauselec_ets)+scale_x_continuous(limits = c(2005, 2012))+scale_y_continuous(limits = c(50, 70))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 196 row(s) containing missing values (geom_path).
autoplot(fc_qauselec_autoarima)+scale_x_continuous(limits = c(2005, 2012))+scale_y_continuous(limits = c(50, 70))
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
## Warning: Removed 196 row(s) containing missing values (geom_path).
15.a
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts <- ts(retaildata[,"A3349873A"], frequency=12, start=c(1982,4))
fc_retail_autoarima <- forecast(auto.arima(myts), h = 36)
autoplot(fc_retail_autoarima)
15.b
fc_retail_snaive <- snaive(myts, h = 36)
autoplot(fc_retail_snaive)
fc_retail_ets <- forecast(ets(myts, lambda=BoxCox.lambda(myts)), h = 36)
autoplot(fc_retail_ets)
16.a
autoplot(sheep)
There is a decreasing trend, but no definite seasonality.
16.b (yt - yt-1) - phi1(yt-1 - yt-2) - phi2(yt-2 - yt-3) - phi3(yt-3 - yt-4) = et (1 - B)yt - phi1B(1- B)yt - phi2B^2(1- B)yt - phi3B^3(1- B)yt = et (1 - phi1B - phi2B^2 - phi3B^3)(1 - B)yt = et
16.c
ggtsdisplay(diff(sheep))
16.d sheep.1940 = 1797 + 0.42(1797 - 1791) -0.20(1791 - 1627) - 0.30(1627 - 1665) sheep.1941 = sheep.1940 + 0.42(sheep.1940 - 1797) -0.20(1797 - 1791) - 0.30(1791 - 1627) sheep.1942 = sheep.1941 + 0.42(sheep.1941 - sheep.1940) -0.20(sheep.1940 - 1797) - 0.30*(1797 - 1791) c(sheep.1940, sheep.1941, sheep.1942)
16.e
fc_sheep_arima.3.1.0 <- forecast(Arima(sheep, order = c(3, 1, 0)),h = 3)
fc_sheep_arima.3.1.0$mean
## Time Series:
## Start = 1940
## End = 1942
## Frequency = 1
## [1] 1777.996 1718.869 1695.985
ar1 <- fc_sheep_arima.3.1.0$model$coef[1]
ar2 <- fc_sheep_arima.3.1.0$model$coef[2]
ar3 <- fc_sheep_arima.3.1.0$model$coef[3]
sheep.1940.new = 1797 + ar1*(1797 - 1791) + ar2*(1791 - 1627) + ar3*(1627 - 1665)
sheep.1941.new = sheep.1940.new + ar1*(sheep.1940.new - 1797) + ar2*(1797 - 1791) + ar3*(1791 - 1627)
sheep.1942.new = sheep.1941.new + ar1*(sheep.1941.new - sheep.1940.new) + ar2*(sheep.1940.new - 1797) + ar3*(1797 - 1791)
c(sheep.1940.new, sheep.1941.new, sheep.1942.new)
## ar1 ar1 ar1
## 1777.996 1718.869 1695.985
17.a
autoplot(bicoal)
17.b (1 - phi1B - phi2B^2 - phi3B^3 - phi4B^4)yt = c + et if mu is the mean of yt, c = mu(1 - phi1B - phi2B^2 - phi3B^3 - phi4B^4) This model is ARIMA(4, 0, 0) or AR(4).
17.c
ggAcf(bicoal, lag.max = 36)
ggPacf(bicoal, lag.max = 36)
17.d The estimated parameters are c = 162.00, phi1 = 0.83, phi2 = -0.34, phi3 = 0.55 and phi4 = -0.38. Without using the forecast function, calculate forecasts for the next three years (1969-1971). c = 162.00 phi1 = 0.83 phi2 = -0.34 phi3 = 0.55 phi4 = -0.38 bicoal.1969 <- c + phi1545 + phi2552 + phi3534 + phi4512 bicoal.1970 <- c + phi1bicoal.1969 + phi2545 + phi3552 + phi4534 bicoal.1971 <- c + phi1bicoal.1970 + phi2bicoal.1969 + phi3545 + phi4552 c(bicoal.1969, bicoal.1970, bicoal.1971)
17.e
fc_bicoal_ar4 <- forecast(ar(bicoal, 4), h = 3)
fc_bicoal_ar4$mean
## Time Series:
## Start = 1969
## End = 1971
## Frequency = 1
## [1] 526.2057 514.0658 500.0111
phi1 <- fc_bicoal_ar4$model$ar[1]
phi2 <- fc_bicoal_ar4$model$ar[2]
phi3 <- fc_bicoal_ar4$model$ar[3]
phi4 <- fc_bicoal_ar4$model$ar[4]
c <- fc_bicoal_ar4$model$x.mean*(1 - phi1 - phi2 - phi3 - phi4)
bicoal.1969.new <- c + phi1*545 + phi2*552 + phi3*534 + phi4*512
bicoal.1970.new <- c + phi1*bicoal.1969.new + phi2*545 + phi3*552 + phi4*534
bicoal.1971.new <- c + phi1*bicoal.1970.new + phi2*bicoal.1969.new + phi3*545 + phi4*552
c(bicoal.1969.new, bicoal.1970.new, bicoal.1971.new)
## [1] 526.2057 514.0658 500.0111