library(fpp3)
## ── Attaching packages ──────────────────────────────────────────── fpp3 0.4.0 ──
## ✔ tibble 3.1.8 ✔ tsibble 1.1.2
## ✔ dplyr 1.0.9 ✔ tsibbledata 0.4.0
## ✔ tidyr 1.2.0 ✔ feasts 0.2.2
## ✔ lubridate 1.8.0 ✔ fable 0.3.1
## ✔ ggplot2 3.3.6
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(readxl)
The difference between the plots is that the critical values (the dashed blue lines) move closer to zero as the sample gets larger. Each series is white noise, but the range within which autocorrelations are deemed insignificant narrows.
The critical values lie at $\pm 1.96/\sqrt{T}$, which shrinks as $T$ (the length of the series) grows. All three series are white noise; the significance bounds simply tighten as more data become available.
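A minimal sketch of this effect (my own snippet, not part of the exercise): simulate white noise of increasing length and compare the ACF plots; the bounds tighten as T grows.
# Simulate white noise of three lengths; the significance bounds shrink as T grows
wn <- function(n) tsibble(idx = seq_len(n), y = rnorm(n), index = idx)
wn(36) %>% ACF(y) %>% autoplot() + labs(title = "T = 36")
wn(360) %>% ACF(y) %>% autoplot() + labs(title = "T = 360")
wn(1000) %>% ACF(y) %>% autoplot() + labs(title = "T = 1000")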
amzn <- gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  mutate(n = row_number()) %>%
  as_tsibble(index = n) %>%
  select(n, Close)
amzn %>%
autoplot(Close)
A stationary series is one whose properties do not depend on the time at which it is observed. This series has a clear trend, and possibly seasonality, so it is not stationary. Purely cyclic behaviour would still be consistent with stationarity, but that is not what we see here.
amzn %>% ACF(Close) %>%
autoplot() + labs(subtitle = "Amazon closing stock price")
## Warning: Provided data has an irregular interval, results should be treated with
## caution. Computing ACF by observation.
The ACF of a stationary series, where significant at all, should drop to zero quickly. This one barely drops at all, so the data exhibit strong autocorrelation.
amzn %>%
PACF(Close) %>%
autoplot()+ labs(subtitle = "Amazon closing stock price")
## Warning: Provided data has an irregular interval, results should be treated with
## caution. Computing ACF by observation.
Even with the intermediate correlations removed, the PACF still shows significant autocorrelation at several lags.
amzn %>%
gg_tsdisplay(Close, plot_type='partial')
## Warning: Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.
## Provided data has an irregular interval, results should be treated with caution. Computing ACF by observation.
Thus, it can be concluded that this data is not stationary.
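A unit-root test backs up the visual conclusion (my own check; the exact statistic is illustrative, since it was not part of the original output). A small KPSS p-value would reject stationarity:
# KPSS test on the closing price; a p-value at the 0.01 floor rejects stationarity
amzn %>%
  features(Close, unitroot_kpss)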
trky <- global_economy %>%
filter(Country=="Turkey")
trky %>%
autoplot(GDP)
This is clearly not stationary: it has a pronounced upward trend.
trky %>%
gg_tsdisplay(GDP,plot_type="partial")
This does not resemble stationary data.
trky %>%
features(GDP, features=guerrero)
## # A tibble: 1 × 2
## Country lambda_guerrero
## <fct> <dbl>
## 1 Turkey 0.157
trky %>%
features(box_cox(GDP,.1572187), unitroot_kpss)
## # A tibble: 1 × 3
## Country kpss_stat kpss_pvalue
## <fct> <dbl> <dbl>
## 1 Turkey 1.50 0.01
This p-value is small (at the 0.01 reporting floor), so the null hypothesis of stationarity is rejected and differencing is required.
trky %>%
mutate(one_diff=difference(box_cox(GDP,.1572187))) %>%
features(one_diff,unitroot_kpss)
## # A tibble: 1 × 3
## Country kpss_stat kpss_pvalue
## <fct> <dbl> <dbl>
## 1 Turkey 0.0889 0.1
The p-value is above .05, so we cannot reject stationarity; the differenced series appears stationary.
trky %>%
features(box_cox(GDP,.1572187), unitroot_ndiffs)
## # A tibble: 1 × 2
## Country ndiffs
## <fct> <int>
## 1 Turkey 1
The automatic function supports one difference.
$(1 - B)y_t = y_t - By_t = y_t - y_{t-1}$
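As a quick sanity check (my own snippet, not part of the exercise), tsibble's difference() computes exactly this identity:
# difference(y) returns y_t - y_{t-1}, i.e. (1 - B) applied to y
y <- c(2, 5, 9, 14)
all.equal(difference(y), y - lag(y)) # TRUE; the first element is NA in both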
trky %>%
gg_tsdisplay(difference(box_cox(GDP,.1572187)),plot_type="partial")
## Warning: Removed 1 row(s) containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_point).
This is certainly more stationary now: no significant spikes remain in the ACF or PACF.
tk <- aus_accommodation %>%
filter(State=="Tasmania") %>%
select(Takings)
tk %>%
autoplot(Takings)
There is heteroskedasticity, trend, and seasonality.
tk %>%
features(Takings, features=guerrero)
## # A tibble: 1 × 1
## lambda_guerrero
## <dbl>
## 1 -0.0488
tk %>%
features(box_cox(Takings,-.04884781), unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 1.83 0.01
This p-value is small, so differencing is required; given the quarterly seasonality, I apply a lag-4 seasonal difference.
tk %>%
mutate(one_diff=difference(box_cox(Takings,-.04884781),4)) %>%
features(one_diff,unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 0.210 0.1
The p-value is above .05, so we cannot reject stationarity for the differenced series.
tk %>%
features(box_cox(Takings,-.04884781), unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
The automatic function supports one difference.
$(1 - B^4)y_t = y_t - B^4 y_t = y_t - y_{t-4}$ (the lag-4 seasonal difference applied below)
tk %>%
gg_tsdisplay(difference(box_cox(Takings,-.04884781),4),plot_type="partial")
## Warning: Removed 4 row(s) containing missing values (geom_path).
## Warning: Removed 4 rows containing missing values (geom_point).
souv <- souvenirs
souv %>%
autoplot(Sales)
There is heteroskedasticity, trend, and seasonality.
souv %>%
features(Sales, features=guerrero)
## # A tibble: 1 × 1
## lambda_guerrero
## <dbl>
## 1 0.00212
souv %>%
features(box_cox(Sales,0.002118221), unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 1.79 0.01
This p-value is small, so seasonal differencing is required.
souv %>%
mutate(one_diff=difference(box_cox(Sales,0.002118221),12)) %>%
features(one_diff,unitroot_kpss)
## # A tibble: 1 × 2
## kpss_stat kpss_pvalue
## <dbl> <dbl>
## 1 0.240 0.1
The p-value is above .05, so we cannot reject stationarity for the differenced series.
souv %>%
features(box_cox(Sales,0.002118221), unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
The automatic function supports one difference.
$(1 - B^{12})y_t = y_t - B^{12} y_t = y_t - y_{t-12}$ (the lag-12 seasonal difference applied below)
souv %>%
gg_tsdisplay(difference(box_cox(Sales,0.002118221),12),plot_type="partial")
## Warning: Removed 12 row(s) containing missing values (geom_path).
## Warning: Removed 12 rows containing missing values (geom_point).
# Simulate an AR(1) process: y_t = phi * y_{t-1} + e_t, with y[1] = 0
ar1 <- function(phi, n = 100) {
  y <- numeric(n)
  e <- rnorm(n)
  for (i in 2:n) {
    y[i] <- phi * y[i - 1] + e[i]
  }
  tsibble(idx = seq_len(n), y = y, index = idx)
}
ar1(0.6)
## # A tsibble: 100 x 2 [1]
## idx y
## <int> <dbl>
## 1 1 0
## 2 2 1.29
## 3 3 1.67
## 4 4 0.479
## 5 5 -2.45
## 6 6 -2.29
## 7 7 -2.21
## 8 8 -0.519
## 9 9 -0.543
## 10 10 0.672
## # … with 90 more rows
## # ℹ Use `print(n = ...)` to see more rows
ar1(.6) %>%
autoplot(y)+labs(title="Phi=.6")
ar1(.7) %>%
autoplot(y)+labs(title="Phi=.7")
ar1(-.6) %>%
autoplot(y)+labs(title="Phi=-.6")
ar1(-.7) %>%
autoplot(y)+labs(title="Phi=-.7")
ar1(0) %>%
autoplot(y)+labs(title="Phi=0")
A negative phi makes the series oscillate rapidly around its mean, so the plot looks much more tightly packed. As phi increases towards 1, the series wanders further from its mean and looks less and less like white noise.
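One way to see this more directly (my own addition, not required by the exercise) is through the ACF: positive phi gives a slowly decaying positive ACF, negative phi gives an alternating ACF, and phi = 0 gives white noise.
# Compare ACFs: decay rate and sign pattern depend on phi
ar1(0.9) %>% ACF(y) %>% autoplot() + labs(title = "Phi = 0.9")
ar1(-0.9) %>% ACF(y) %>% autoplot() + labs(title = "Phi = -0.9")
ar1(0) %>% ACF(y) %>% autoplot() + labs(title = "Phi = 0")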
# Simulate an MA(1) process: y_t = theta * e_{t-1} + e_t, with y[1] = 0
ma1 <- function(theta, n = 100) {
  y <- numeric(n)
  e <- rnorm(n)
  for (i in 2:n) {
    y[i] <- theta * e[i - 1] + e[i]
  }
  tsibble(idx = seq_len(n), y = y, index = idx)
}
ma1(0.6)
## # A tsibble: 100 x 2 [1]
## idx y
## <int> <dbl>
## 1 1 0
## 2 2 -1.91
## 3 3 -0.646
## 4 4 -0.465
## 5 5 -1.13
## 6 6 0.312
## 7 7 1.64
## 8 8 -0.524
## 9 9 -1.08
## 10 10 0.358
## # … with 90 more rows
## # ℹ Use `print(n = ...)` to see more rows
ma1(.6) %>%
autoplot(y)+labs(title="Theta=.6")
ma1(.7) %>%
autoplot(y)+labs(title="Theta=.7")
ma1(-.6) %>%
autoplot(y)+labs(title="Theta=-.6")
ma1(-.7) %>%
autoplot(y)+labs(title="Theta=-.7")
ma1(0) %>%
autoplot(y)+labs(title="Theta=0")
A negative theta makes the series oscillate more rapidly, so it looks more tightly packed. As theta grows in magnitude, the lag-1 dependence strengthens and the series looks less and less like white noise.
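Unlike the AR(1) case, the ACF of an MA(1) process should cut off after lag 1, which a quick plot (my own addition) can confirm:
# For an MA(1), only the lag-1 autocorrelation should be significant
ma1(0.7) %>% ACF(y) %>% autoplot() + labs(title = "MA(1), Theta = 0.7")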
# Simulate an ARMA(1,1) process: y_t = phi * y_{t-1} + theta * e_{t-1} + e_t
arma11 <- function(phi, theta, n = 100) {
  y <- numeric(n)
  e <- rnorm(n)
  for (i in 2:n) {
    y[i] <- (theta * e[i - 1]) + (phi * y[i - 1]) + e[i]
  }
  tsibble(idx = seq_len(n), y = y, index = idx)
}
arma11(0.6,0.6)
## # A tsibble: 100 x 2 [1]
## idx y
## <int> <dbl>
## 1 1 0
## 2 2 1.04
## 3 3 0.919
## 4 4 -0.0820
## 5 5 -0.844
## 6 6 -0.926
## 7 7 -1.26
## 8 8 -0.247
## 9 9 1.81
## 10 10 1.62
## # … with 90 more rows
## # ℹ Use `print(n = ...)` to see more rows
# Simulate an AR(2) process: y_t = phi1 * y_{t-1} + phi2 * y_{t-2} + e_t
ar2 <- function(phi1, phi2, n = 100) {
  y <- numeric(n)
  e <- rnorm(n)
  for (i in 3:n) {
    y[i] <- (phi1 * y[i - 1]) + (phi2 * y[i - 2]) + e[i]
  }
  tsibble(idx = seq_len(n), y = y, index = idx)
}
ar2(-.8,.3)
## # A tsibble: 100 x 2 [1]
## idx y
## <int> <dbl>
## 1 1 0
## 2 2 0
## 3 3 -1.03
## 4 4 1.34
## 5 5 -0.820
## 6 6 -0.343
## 7 7 0.0553
## 8 8 0.276
## 9 9 -0.655
## 10 10 0.120
## # … with 90 more rows
## # ℹ Use `print(n = ...)` to see more rows
arma11(0.6,0.6) %>%
autoplot(y)
ar2(-.8,.3) %>%
autoplot(y)
The ARMA(1,1) series could arguably be called stationary, but the AR(2) is definitely not: it oscillates with ever-growing amplitude, so its variance explodes over time.
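This is expected: an AR(2) is stationary only if phi2 + phi1 < 1, phi2 - phi1 < 1, and |phi2| < 1, and the chosen coefficients violate the second condition. A small check (my own, using the textbook constraints):
# Stationarity conditions for an AR(2); the second is violated here
phi1 <- -0.8
phi2 <- 0.3
c(phi2 + phi1 < 1, # TRUE
  phi2 - phi1 < 1, # FALSE: 0.3 - (-0.8) = 1.1
  abs(phi2) < 1) # TRUE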
ap <- aus_airpassengers
ap %>%
autoplot(Passengers)
apfit <- ap %>%
model(arima=ARIMA(Passengers))
report(apfit)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8963
## s.e. 0.0594
##
## sigma^2 estimated as 4.308: log likelihood=-97.02
## AIC=198.04 AICc=198.32 BIC=201.65
The selected model is ARIMA(0,2,1): no autoregressive terms, two differences, and one moving-average term.
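In backshift notation, with the estimated coefficient $\theta_1 = -0.8963$, the fitted model can be written as
$(1 - B)^2 y_t = (1 + \theta_1 B)\varepsilon_t$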
gg_tsresiduals(apfit)
The residuals appear to be white noise: no significant autocorrelation, a roughly normal distribution, and mean zero.
augment(apfit) %>%
features(.innov, ljung_box, lag = 10, dof = 1)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 arima 6.70 0.669
The large p-value confirms there is no significant autocorrelation in the residuals.
apfc <- apfit %>%
forecast(h=10)
apfc %>%
autoplot(ap)
apfit2 <- ap %>%
model(arima010=ARIMA(Passengers~pdq(0,1,0)))
apfc2 <- apfit2 %>%
forecast(h=10)
apfc2 %>%
autoplot(ap)
apfc2
## # A fable: 10 x 4 [1Y]
## # Key: .model [1]
## .model Year Passengers .mean
## <chr> <dbl> <dist> <dbl>
## 1 arima010 2017 N(74, 4.3) 74.0
## 2 arima010 2018 N(75, 8.5) 75.4
## 3 arima010 2019 N(77, 13) 76.9
## 4 arima010 2020 N(78, 17) 78.3
## 5 arima010 2021 N(80, 21) 79.7
## 6 arima010 2022 N(81, 26) 81.1
## 7 arima010 2023 N(83, 30) 82.5
## 8 arima010 2024 N(84, 34) 84.0
## 9 arima010 2025 N(85, 38) 85.4
## 10 arima010 2026 N(87, 43) 86.8
apfc
## # A fable: 10 x 4 [1Y]
## # Key: .model [1]
## .model Year Passengers .mean
## <chr> <dbl> <dist> <dbl>
## 1 arima 2017 N(75, 4.3) 74.8
## 2 arima 2018 N(77, 9.6) 77.0
## 3 arima 2019 N(79, 16) 79.2
## 4 arima 2020 N(81, 23) 81.3
## 5 arima 2021 N(84, 32) 83.5
## 6 arima 2022 N(86, 42) 85.7
## 7 arima 2023 N(88, 53) 87.9
## 8 arima 2024 N(90, 66) 90.1
## 9 arima 2025 N(92, 80) 92.3
## 10 arima 2026 N(94, 97) 94.5
The ARIMA(0,1,0) forecasts increase at a much slower rate than those from the automatically selected ARIMA(0,2,1).
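Overlaying the point forecasts makes the difference in slope easier to see (a sketch of my own; level = NULL drops the prediction intervals for readability):
# Plot both sets of point forecasts on the same axes
ap %>%
  model(
    auto = ARIMA(Passengers),
    arima010 = ARIMA(Passengers ~ pdq(0, 1, 0))
  ) %>%
  forecast(h = 10) %>%
  autoplot(ap, level = NULL)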
apfit3 <- ap %>%
model(arima212=ARIMA(Passengers~1+pdq(2,1,2)))
apfc3 <- apfit3 %>%
forecast(h=10)
apfc3 %>%
autoplot(ap)
apfc3
## # A fable: 10 x 4 [1Y]
## # Key: .model [1]
## .model Year Passengers .mean
## <chr> <dbl> <dist> <dbl>
## 1 arima212 2017 N(73, 4.2) 73.2
## 2 arima212 2018 N(76, 8.6) 75.6
## 3 arima212 2019 N(77, 15) 77.1
## 4 arima212 2020 N(78, 20) 77.7
## 5 arima212 2021 N(80, 25) 79.5
## 6 arima212 2022 N(81, 30) 81.3
## 7 arima212 2023 N(82, 36) 82.2
## 8 arima212 2024 N(84, 40) 83.6
## 9 arima212 2025 N(85, 46) 85.4
## 10 arima212 2026 N(87, 51) 86.6
These forecasts are only slightly lower than those from the ARIMA(0,1,0) model.
report(apfit3)
## Series: Passengers
## Model: ARIMA(2,1,2) w/ drift
##
## Coefficients:
## ar1 ar2 ma1 ma2 constant
## -0.5518 -0.7327 0.5895 1.0000 3.2216
## s.e. 0.1684 0.1224 0.0916 0.1008 0.7224
##
## sigma^2 estimated as 4.031: log likelihood=-96.23
## AIC=204.46 AICc=206.61 BIC=215.43
apfit4 <- ap %>% model(ar=ARIMA(Passengers~pdq(2,1,2)))
## Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
## You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.
## Warning: 1 error encountered for ar
## [1] Could not find an appropriate ARIMA model.
## This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
## For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots
report(apfit4)
## Series: Passengers
## Model: NULL model
## NULL model
When the constant is left unspecified (or omitted), no model can be estimated: the error explains that the characteristic roots would be numerically unstable.
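The link in the warning describes plotting the characteristic roots; feasts provides gg_arma() for this, which could be applied to the (2,1,2)-with-constant fit above (my own illustrative call):
# gg_arma() plots the inverse AR/MA roots; points near the unit circle signal instability
apfit3 %>%
  gg_arma()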
apfit5 <- ap %>%
model(
arima021=ARIMA(Passengers~1+pdq(0,2,1))
)
## Warning: Model specification induces a quadratic or higher order polynomial trend.
## This is generally discouraged, consider removing the constant or reducing the number of differences.
apfc5 <- apfit5 %>%
forecast(h=10)
apfc5 %>%
autoplot(ap)
With a constant and two differences, the forecasts follow a quadratic trend, exactly what the warning cautions against.
usa <- global_economy %>%
filter(Country=="United States") %>%
select(GDP)
usa %>%
autoplot(GDP)
usa %>%
features(GDP,features=guerrero)
## # A tibble: 1 × 1
## lambda_guerrero
## <dbl>
## 1 0.282
usas <- usa %>%
mutate(TGDP=box_cox(GDP,.2819443)) %>%
select(TGDP)
usas %>%
autoplot(TGDP)
tfit <- usas %>%
model(
ar1=ARIMA(TGDP)
)
report(tfit)
## Series: TGDP
## Model: ARIMA(1,1,0) w/ drift
##
## Coefficients:
## ar1 constant
## 0.4586 118.1823
## s.e. 0.1198 9.5047
##
## sigma^2 estimated as 5479: log likelihood=-325.32
## AIC=656.65 AICc=657.1 BIC=662.78
usas %>%
gg_tsdisplay(difference(TGDP),plot_type="partial")
## Warning: Removed 1 row(s) containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_point).
The first difference of the transformed data appears to be enough.
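The automatic unit-root check agrees (my own confirmation, mirroring the earlier exercises):
# unitroot_ndiffs should recommend a single first difference for the transformed series
usas %>%
  features(TGDP, unitroot_ndiffs)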
tfit3 <- usas %>%
model(
arima011=ARIMA(TGDP~pdq(0,1,1)),
arima111=ARIMA(TGDP~pdq(1,1,1)),
arima=ARIMA(TGDP)
)
report(tfit3)
## Warning in report.mdl_df(tfit3): Model reporting is only supported for
## individual models, so a glance will be shown. To see the report for a specific
## model, use `select()` and `filter()` to identify a single model.
## # A tibble: 3 × 8
## .model sigma2 log_lik AIC AICc BIC ar_roots ma_roots
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
## 1 arima011 5689. -326. 659. 659. 665. <cpl [0]> <cpl [1]>
## 2 arima111 5580. -325. 659. 659. 667. <cpl [1]> <cpl [1]>
## 3 arima 5479. -325. 657. 657. 663. <cpl [1]> <cpl [0]>
The automatic arima has a slightly lower AICc.
gg_tsresiduals(tfit)
The residuals have mean zero, no significant autocorrelation, and are roughly normally distributed. The automatic ARIMA is the best choice.
tfitfc <- tfit %>%
forecast(h=10)
tfitfc %>%
autoplot(usas)
They look reasonable and follow the trend of the transformed data closely.
usafit <- usa %>%
model(ets=ETS(GDP~error("M")+trend("A")))
usafitfc <- usafit %>%
forecast(h=10)
usafitfc %>%
autoplot(usa)
This forecast seems less reasonable. I believe GDP will continue increasing at a higher rate.
arr <- aus_arrivals %>%
filter(Origin=="US") %>%
select(Arrivals, Quarter)
arr %>%
autoplot(Arrivals)
There is clear heteroskedasticity and seasonality.
arr %>%
features(Arrivals, features=guerrero)
## # A tibble: 1 × 1
## lambda_guerrero
## <dbl>
## 1 0.598
tj <- arr %>%
mutate(ta=box_cox(Arrivals,.3916736)) %>%
select(ta)
tj %>%
autoplot(ta)
tj %>%
autoplot(difference(ta))
## Warning: Removed 1 row(s) containing missing values (geom_path).
tj %>%
gg_tsdisplay(difference(ta),plot_type="partial")
## Warning: Removed 1 row(s) containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_point).
The ACF is roughly sinusoidal and the last significant PACF spike is at lag 4, so p = 4 is a plausible choice (a sketch of fitting it follows the unit-root checks below).
tj %>%
features(ta, unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
tj %>%
features(ta, unitroot_nsdiffs)
## # A tibble: 1 × 1
## nsdiffs
## <int>
## 1 1
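Under the p = 4 reading above, an ARIMA(4,1,0) (with the seasonal part chosen automatically) would be the hand-picked candidate. A sketch of how it could be fitted (my own snippet; it is not one of the models compared below):
# Fit the ARIMA(4,1,0) suggested by the PACF, leaving the seasonal PDQ to automatic selection
tj %>%
  model(arima410 = ARIMA(ta ~ pdq(4, 1, 0))) %>%
  report()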
jfit <- tj %>%
model(
arima=ARIMA(ta),
arimanew=ARIMA(ta~1+pdq(0,1,2)+PDQ(0,1,1))
)
## Warning: Model specification induces a quadratic or higher order polynomial trend.
## This is generally discouraged, consider removing the constant or reducing the number of differences.
glance(jfit)
## # A tibble: 2 × 8
## .model sigma2 log_lik AIC AICc BIC ar_roots ma_roots
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
## 1 arima 57.2 -423. 861. 862. 880. <cpl [3]> <cpl [8]>
## 2 arimanew 62.9 -428. 867. 867. 881. <cpl [0]> <cpl [6]>
The automatically selected model is better: it has a lower AICc.
arrfit <- jfit %>%
select(arima)
report(arrfit)
## Series: ta
## Model: ARIMA(3,0,0)(0,1,2)[4] w/ drift
##
## Coefficients:
## ar1 ar2 ar3 sma1 sma2 constant
## 0.4999 0.0700 0.3353 -0.6843 -0.2330 0.3119
## s.e. 0.0858 0.1047 0.0879 0.1018 0.1021 0.0792
##
## sigma^2 estimated as 57.22: log likelihood=-423.3
## AIC=860.59 AICc=861.57 BIC=880.28
In backshift notation the fitted model is $(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3)(1 - B^4)y_t = c + (1 + \Theta_1 B^4 + \Theta_2 B^8)\varepsilon_t$.
usgdp <- readxl::read_excel("C:\\Users\\PythonAcct\\Downloads\\gdp.xlsx")
dt <- usgdp %>%
mutate(year=year(Date)) %>%
as_tsibble(index=year) %>%
select(Value)
dt %>%
autoplot(Value)
dt %>%
autoplot(difference(Value))
## Warning: Removed 1 row(s) containing missing values (geom_path).
dt %>%
gg_tsdisplay(difference(Value),plot_type="partial")
## Warning: Removed 1 row(s) containing missing values (geom_path).
## Warning: Removed 1 rows containing missing values (geom_point).
One difference should be enough, and the last significant PACF spike is at lag 1, so I expect an ARIMA(1,1,0).
dt %>%
features(Value,unitroot_ndiffs)
## # A tibble: 1 × 1
## ndiffs
## <int>
## 1 1
dfit <- dt %>%
model(
arima=ARIMA(Value),
mine=ARIMA(Value~pdq(1,1,0))
)
glance(dfit)
## # A tibble: 2 × 8
## .model sigma2 log_lik AIC AICc BIC ar_roots ma_roots
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
## 1 arima 13855. -136. 277. 278. 280. <cpl [1]> <cpl [0]>
## 2 mine 13855. -136. 277. 278. 280. <cpl [1]> <cpl [0]>
I actually picked the right one. Let's go!
dfit %>%
forecast(h=4) %>%
autoplot(dt)
I do not believe multiplicative errors or a multiplicative trend are called for here, and there is no seasonality. I'll check multiplicative errors anyway, though.
bingbong1 <- dt %>%
model(
mod=ETS(Value~error("A")+trend("A")+season("N")))
bingbong2 <- dt %>%
model(
mod1=ETS(Value~error("M")+trend("A")+season("N"))
)
gg_tsresiduals(bingbong1)
gg_tsresiduals(bingbong2)
report(bingbong1)
## Series: Value
## Model: ETS(A,A,N)
## Smoothing parameters:
## alpha = 0.9999
## beta = 0.9904094
##
## Initial states:
## l[0] b[0]
## 9185.682 666.2916
##
## sigma^2: 15397.37
##
## AIC AICc BIC
## 299.4870 303.0164 305.1645
report(bingbong2)
## Series: Value
## Model: ETS(M,A,N)
## Smoothing parameters:
## alpha = 0.9998999
## beta = 0.9998999
##
## Initial states:
## l[0] b[0]
## 9224.503 599.5141
##
## sigma^2: 1e-04
##
## AIC AICc BIC
## 297.9423 301.4717 303.6198
The model with multiplicative errors is better: it has a lower AICc.
augment(bingbong2) %>%
features(.resid, ljung_box, lag = 10, dof = 0)
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 mod1 13.5 0.195
The p-value indicates the residuals are not significantly different from white noise.
bingbong2 %>%
forecast(h=4) %>%
autoplot(dt)
dt %>%
  stretch_tsibble(.init = 10) %>%
  model(
    mine = ARIMA(Value ~ pdq(1, 1, 0)),
    mod1 = ETS(Value ~ error("M") + trend("A") + season("N"))
  ) %>%
  forecast(h = 1) %>%
  accuracy(dt)
## Warning: The future dataset is incomplete, incomplete out-of-sample data will be treated as missing.
## 1 observation is missing at 2023
## # A tibble: 2 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 mine Test 69.0 154. 128. 0.318 0.678 0.204 0.238 -0.116
## 2 mod1 Test 15.6 180. 144. -0.0173 0.774 0.229 0.278 0.457
Cross-validation favours the ARIMA model (lower RMSE and MAE), so I will trust that method more.