library(fpp3)
library(tsibble)
library(cowplot)

9.1

a.

  All three plots have autocorrelations that remain within the blue dashed lines. While the spread of the values differs across the three plots, none of them exceed this boundary, so we can conclude that each plot indicates the data are white noise.

b.

  We need to keep in mind that the critical values lie at \(\pm 1.96/\sqrt{T}\), so as the sample size \(T\) increases they move closer to 0. Small samples can exhibit large autocorrelations through pure coincidence, so the significance region must be wider for smaller data sets to compensate, which is why the first (smallest) sample has the widest bounds.
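  We can compute the bounds directly; a minimal sketch, assuming the three series have T = 36, 360, and 1000 observations as in the textbook figure:
# Critical bounds at +/- 1.96/sqrt(T) for each sample size
tibble(T = c(36, 360, 1000), bound = 1.96 / sqrt(T))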

9.2

  We can see below that none of the values in the ACF plot of the closing price fall within the bounds, and the PACF shows one large spike at lag 1. This indicates we need to apply at least one difference to make the data stationary. Doing so, as shown in the second display below, brings the autocorrelations within the bounds.
gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  gg_tsdisplay(Close, plot_type = "partial")

gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  gg_tsdisplay(difference(Close), plot_type = "partial")
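  We can also confirm numerically that a single difference suffices; a small addition using the unit-root feature:
gafa_stock %>%
  filter(Symbol == "AMZN") %>%
  features(Close, unitroot_ndiffs)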

9.3

a.

Turkey <- global_economy %>% filter(Country=="Turkey")
  We find that the data vary quite a bit, with the variation growing as the level of the series increases.
autoplot(Turkey,GDP)

  The Box-Cox transformation appears to successfully stabilise the variation in our data.
lam <- Turkey %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

Turkey <- Turkey %>%
  mutate(lam_gdp = box_cox(GDP, lam))


 Turkey %>% autoplot(lam_gdp) 

Turkey %>% gg_tsdisplay(lam_gdp, plot_type = "partial")

  We find the required number of differences to be one, and plot the corresponding displays after calculating the difference. Our ACF and PACF plots now indicate that the data are stationary.
Turkey %>%
  features(lam_gdp, unitroot_ndiffs)

Turkey %>%
  mutate(dif = difference(lam_gdp, 1)) %>%
  gg_tsdisplay(dif, plot_type = "partial")
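  As a cross-check (mirroring what we do for Tasmania below), the KPSS test on the differenced series should fail to reject stationarity; a small addition:
Turkey %>%
  mutate(dif = difference(lam_gdp, 1)) %>%
  features(dif, unitroot_kpss)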

b.

Tas <- aus_accommodation %>% filter(State == "Tasmania")
  We can see below that accommodation takings in the state of Tasmania grow at a fairly steady rate, with little change in the size of the variation. This suggests the Box-Cox transformation is not appropriate here; indeed, the Box-Cox transformed plot below is hardly any different, so we will ignore it for this data.
autoplot(Tas,Takings)

Tas_lam <- Tas %>%
  features(Takings, features = guerrero) %>%
  pull(lambda_guerrero)

autoplot(Tas, box_cox(Takings,Tas_lam))

  We confirm below that our data is indeed non-stationary.
Tas %>% gg_tsdisplay(Takings, plot_type = "partial")

Tas %>%
  features(Takings, unitroot_nsdiffs)
  We find that one seasonal difference is required. Given that we have quarterly data, we difference at lag 4. The ACF and PACF plots are greatly improved; a few early values stray slightly outside the bounds, but double-checking with the KPSS test confirms that the data are now stationary.
Tas %>%
  mutate(dif = difference(Takings, 4)) %>%
  gg_tsdisplay(dif, plot_type = "partial")

Tas %>%
  mutate(dif = difference(Takings, 4)) %>%
  features(dif, unitroot_kpss)

c.

autoplot(souvenirs, Sales)

lam_sales <- souvenirs  %>%
  features(Sales, features = guerrero) %>%
  pull(lambda_guerrero)
souvenirs <-  souvenirs %>% 
  mutate(box_souvenirs = box_cox(Sales,lam_sales))
  We find that the Box-Cox transformation levels out the variation in this data considerably.
autoplot(souvenirs, box_souvenirs)

  We find our data to again be non-stationary.
souvenirs %>% gg_tsdisplay(box_souvenirs, plot_type = "partial")

  Applying one seasonal difference appears to fix the data, as shown below. We double-check with the KPSS test and confirm that the data are now stationary.

souvenirs %>%
  features(box_souvenirs, unitroot_nsdiffs)

souvenirs %>%
  mutate(dif = difference(box_souvenirs, 12)) %>%
  gg_tsdisplay(dif, plot_type = "partial")

souvenirs %>%
  mutate(dif = difference(box_souvenirs, 12)) %>%
  features(dif, unitroot_kpss)
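  As a further cross-check (paralleling the approach in 9.5 below), we can also ask whether an additional first difference is needed after the seasonal one; a small addition:
souvenirs %>%
  mutate(dif = difference(box_souvenirs, 12)) %>%
  features(dif, unitroot_ndiffs)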

9.5

set.seed(1221)
myseries <- aus_retail %>%
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
autoplot(myseries, Turnover)

lam_my <- myseries  %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)
myseries <-  myseries %>% 
  mutate(box_Turnover = box_cox(Turnover,lam_my))
autoplot(myseries,box_Turnover)

  We find our data is very clearly non-stationary.
myseries %>% gg_tsdisplay(box_Turnover, plot_type = "partial")

myseries %>%
  features(box_Turnover, unitroot_nsdiffs)
  We first apply one seasonal difference at lag 12. It does not appear to be enough, as the plots and test result below show.
myseries %>%
  mutate(dif = difference(box_Turnover, 12)) %>%
  gg_tsdisplay(dif, plot_type = "partial")

myseries %>%
  mutate(dif = difference(box_Turnover, 12)) %>%
  features(dif, unitroot_kpss)
  We see that after applying the seasonal difference, the non-seasonal unit-root test suggests an additional first difference.
myseries %>%
  mutate(dif = difference(box_Turnover, 12)) %>%
  features(dif, unitroot_ndiffs)
  After applying this second difference, a few autocorrelations still fall outside the boundaries, but the majority now stay in bounds. The KPSS test confirms as much, indicating that the data are now stationary.
dd_series <- myseries %>%
  mutate(dif = difference(box_Turnover, 12),
         second_dif = difference(dif, 1))

dd_series %>% gg_tsdisplay(second_dif, plot_type = "partial")

dd_series  %>% 
  features(second_dif, unitroot_kpss)

9.6

a.

set.seed(123)
y <- numeric(100)
e <- rnorm(100)                    # errors with sigma^2 = 1
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]        # AR(1) with phi1 = 0.6
sim <- tsibble(idx = seq_len(100), y = y, index = idx)

sim
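  As a quick sanity check (a small addition), we plot the simulated series:
sim %>% autoplot(y)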

b.

# Generate an AR(1) series of length 100 for a given phi
make_AR <- function(phi){
  y <- numeric(100)
  e <- rnorm(100)
  for(i in 2:100)
    y[i] <- phi*y[i-1] + e[i]
  tsibble(idx = seq_len(100), y = y, index = idx)
}
  We can see below that as the value of \(\phi_1\) increases, the series looks less like pure noise and shows longer, more persistent swings, making it much easier to see local trends occurring.
set.seed(123)

plot_grid(autoplot(make_AR(.05)) + labs(title = "Phi .05"),
          autoplot(make_AR(.1)) + labs(title = "Phi .1"),
          autoplot(make_AR(.6)) + labs(title = "Phi .6"),
          autoplot(make_AR(1)) + labs(title = "Phi 1"))

c. 

# Generate an MA(1) series of length 100 for a given theta
make_MA <- function(theta){
  y <- numeric(100)
  e <- rnorm(100)
  for(i in 2:100)
    y[i] <- theta*e[i-1] + e[i]
  tsibble(idx = seq_len(100), y = y, index = idx)
}

d.

  We see a similar pattern as with \(\phi_1\): as \(\theta_1\) increases, the series becomes smoother and less noisy, though the effect is weaker since an MA(1) process only carries one period of past noise.
set.seed(123)

plot_grid(autoplot(make_MA(.05)) + labs(title = "Theta .05"),
          autoplot(make_MA(.1)) + labs(title = "Theta .1"),
          autoplot(make_MA(.6)) + labs(title = "Theta .6"),
          autoplot(make_MA(1)) + labs(title = "Theta 1"))

e.

set.seed(123)
y <- numeric(100)
e <- rnorm(100)

for(i in 2:100)
  y[i] <- 0.6*y[i-1] + 0.6*e[i-1] + e[i]   # ARMA(1,1): phi1 = 0.6, theta1 = 0.6
ARMA <- tsibble(idx = seq_len(100), y = y, index = idx)

f.

set.seed(123)
y <- numeric(100)
e <- rnorm(100)

for(i in 3:100)
  y[i] <- -0.8*y[i-1] + 0.3*y[i-2] + e[i]  # AR(2): phi1 = -0.8, phi2 = 0.3
AR2 <- tsibble(idx = seq_len(100), y = y, index = idx)

g.

  We see below that the two generated series are very different. Where the ARMA(1,1) plot wanders up and down with no clear pattern, the AR(2) plot oscillates with an amplitude that grows at an almost exponential rate; the chosen parameters violate the stationarity conditions (see the root check after the displays below).
plot_grid(autoplot(ARMA) + labs(title = "ARMA(1,1) Plot"),
          autoplot(AR2) + labs(title = "AR(2) Plot"))

AR2 %>% gg_tsdisplay(y, plot_type = "partial")

ARMA %>% gg_tsdisplay(y, plot_type = "partial")
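  We can confirm that the AR(2) parameters are non-stationary by checking the roots of the characteristic polynomial \(1-\phi_1B-\phi_2B^2 = 1+0.8B-0.3B^2\); for stationarity, all roots must lie outside the unit circle. A small addition using base R:
# One root has modulus ~0.93 < 1, so the process is explosive
abs(polyroot(c(1, 0.8, -0.3)))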

9.7

a.

aus_fit <- aus_airpassengers %>%
  model(ARIMA(Passengers, stepwise = FALSE))
  We can see that by default, an ARIMA(0,2,1) model was selected.
aus_fit %>% report()
## Series: Passengers 
## Model: ARIMA(0,2,1) 
## 
## Coefficients:
##           ma1
##       -0.8963
## s.e.   0.0594
## 
## sigma^2 estimated as 4.308:  log likelihood=-97.02
## AIC=198.04   AICc=198.32   BIC=201.65
  Our residuals look like white noise: they stay within the bounds and don’t show any odd patterns.
aus_fit %>% gg_tsresiduals()
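  As a more formal check (a small addition; lag = 10 and dof = 1 reflect the single estimated MA coefficient), a Ljung-Box test should return a large p-value if the residuals are white noise:
augment(aus_fit) %>%
  features(.innov, ljung_box, lag = 10, dof = 1)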

aus_fit %>% forecast(h=10) %>%
  autoplot(aus_airpassengers)

b.

  Our model, written with the backshift operator, is: \((1-B)^2y_t=(1+\theta_1B)e_t\). There is no constant term, since the report above shows none was estimated.
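  Substituting the estimated coefficient from the report above, \(\theta_1 = -0.8963\), gives \((1-B)^2y_t=(1-0.8963B)e_t\).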

c.

aus_fit2 <- aus_airpassengers %>%
  model(ARIMA(Passengers ~ pdq(0,1,0)))
report(aus_fit2)
## Series: Passengers 
## Model: ARIMA(0,1,0) w/ drift 
## 
## Coefficients:
##       constant
##         1.4191
## s.e.    0.3014
## 
## sigma^2 estimated as 4.271:  log likelihood=-98.16
## AIC=200.31   AICc=200.59   BIC=203.97
aus_fit2 %>% forecast(h=10) %>%
  autoplot(aus_airpassengers) 

d.

  The ARIMA(2,1,2) model with drift fits and forecasts sensibly, as shown below. Removing the constant, however, raises a warning and returns a NULL model (see the sketch after the plot).
aus_fit3 <- aus_airpassengers %>%
  model(ARIMA(Passengers ~ 1 + pdq(2,1,2)))
aus_fit3 %>% forecast(h=10) %>%
  autoplot(aus_airpassengers) 
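  For reference, this is the variant that removes the constant via a `0` in the model formula; a sketch of the call that produces the NULL model:
aus_airpassengers %>%
  model(ARIMA(Passengers ~ 0 + pdq(2,1,2)))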

e.

  Including a constant with two orders of differencing produces the warning shown in the screenshot below, since it induces a quadratic polynomial trend, and the resulting forecasts curve aggressively upwards in our plot.
aus_fit4 <- aus_airpassengers %>%
  model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
# Screenshot of the warning raised by the model call above
ggdraw() + draw_image("https://i.gyazo.com/ed572e798d7bbf48a21c73f44244320e.png")

aus_fit4 %>% forecast(h=10) %>%
  autoplot(aus_airpassengers) 

9.8

a.

US <- global_economy %>% filter(Code == "USA")
autoplot(US, GDP)   # global_economy has several variables, so specify GDP

  We do find our Box-Cox transformed data to be more level.
US_lam <- US  %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)


US <- US %>%
  mutate(Box_GDP = box_cox(GDP, US_lam))

autoplot(US, Box_GDP)

b.

US_fit <- US %>%
  model(
    ARIMA(Box_GDP, stepwise = FALSE, approximation = FALSE)
  )
  We find that an ARIMA(1,1,0) with drift was chosen automatically.
report(US_fit)
## Series: Box_GDP 
## Model: ARIMA(1,1,0) w/ drift 
## 
## Coefficients:
##          ar1  constant
##       0.4586  118.1822
## s.e.  0.1198    9.5047
## 
## sigma^2 estimated as 5479:  log likelihood=-325.32
## AIC=656.65   AICc=657.1   BIC=662.78

c.

US_fit2 <- US %>%
  model(
    auto = ARIMA(Box_GDP, stepwise = FALSE, approximation = FALSE),
    `ARIMA 0,0,0` = ARIMA(Box_GDP ~ pdq(0,0,0)),
    `ARIMA 0,1,0` = ARIMA(Box_GDP ~ pdq(0,1,0)),
    `ARIMA 2,1,2` = ARIMA(Box_GDP ~ pdq(2,1,2)),
    `ARIMA 0,1,2` = ARIMA(Box_GDP ~ pdq(0,1,2)),
    `ARIMA 0,2,2` = ARIMA(Box_GDP ~ pdq(0,2,2)),
    `ARIMA 0,2,0` = ARIMA(Box_GDP ~ pdq(0,2,0))
  )

d.

  We choose the ARIMA 0,2,2 model, as it has the lowest AIC (648.29) in the comparison below.
US_fit2 %>% glance() %>% arrange(AIC)
  We see the residuals of our model below. While the histogram is a bit skewed, the residuals appear to be white noise.
US_fit2 %>%
  select(`ARIMA 0,2,2`) %>%
  gg_tsresiduals()
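  As with the earlier model, a Ljung-Box test gives a more formal check (a small addition; dof = 2 for the two estimated MA coefficients):
US_fit2 %>%
  select(`ARIMA 0,2,2`) %>%
  augment() %>%
  features(.innov, ljung_box, lag = 10, dof = 2)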

e.

  The forecast looks very reasonable, not too extreme in either direction.
US_fit2 %>%
  select(Country, `ARIMA 0,2,2`) %>%
  forecast(h=10) %>%
  autoplot(US)

f.

  We see below that the ETS model also appears reasonable. It is very difficult to say for certain how the two models compare. As our textbook notes, information criteria sometimes do not even help in picking between ARIMA models: “It is important to note that these information criteria tend not to be good guides to selecting the appropriate order of differencing (d) of a model, but only for selecting the values of p and q”. Moreover, AIC values cannot be compared between ETS and ARIMA models at all, since the model classes differ, so we keep that in mind when picking the proper model and compare the two classes on out-of-sample accuracy instead (see the sketch at the end of this section).
US_ETS_FIT <- US %>%
  model(ETS(GDP))
US_ETS_FIT %>% 
  forecast(h=10) %>% 
  autoplot(US)

US %>%
  model(ETS(Box_GDP)) %>%
  forecast(h=10) %>%
  autoplot(US)

US %>%
  model(ETS(Box_GDP)) %>%
  report()
## Series: Box_GDP 
## Model: ETS(M,A,N) 
##   Smoothing parameters:
##     alpha = 0.9998985 
##     beta  = 0.3731217 
## 
##   Initial states:
##      l[0]     b[0]
##  7045.398 198.9275
## 
##   sigma^2:  0
## 
##      AIC     AICc      BIC 
## 744.1762 745.3300 754.4784
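
  Since information criteria cannot settle the ETS-versus-ARIMA question, one way to compare the two classes is on out-of-sample accuracy. A minimal sketch, assuming we hold out the last 10 years (2008-2017) of the series as a test set:
# Fit both models to data through 2007, then compare
# forecast accuracy on the holdout period
US_train <- US %>% filter(Year <= 2007)
US_train %>%
  model(
    arima = ARIMA(Box_GDP ~ pdq(0,2,2)),
    ets = ETS(Box_GDP)
  ) %>%
  forecast(h = 10) %>%
  accuracy(US)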