Data624

Exercise 1.

Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

a. Explain the differences among these figures. Do they all indicate that the data are white noise?

Series x1 has one significant spike, at lag 12; series x2 and x3 have no significant spikes. The band between the critical values narrows from x1 to x3. Yes, they all indicate that the data are white noise. A time series is white noise if its values are independent and identically distributed with a mean of zero, which means every value has the same variance (sigma^2) and zero correlation with all other values in the series. With 95% bounds, about 1 in 20 spikes is expected to fall outside the band by chance, so the single spike in x1 does not contradict white noise. None of the figures shows a clear pattern.

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

The critical values are ±1.96/√T, where T is the length of the time series, so as T gets bigger the band gets narrower. That explains why the critical values sit at different distances from zero. The autocorrelations differ because each figure shows sample autocorrelations computed from a different finite set of random numbers: they are estimates that scatter randomly around zero, and the scatter shrinks as T grows.
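The effect of T on the bounds can be checked numerically. The following is a minimal base-R sketch; the names `series_lengths` and `crit` are illustrative and not part of the original analysis.

```r
# Approximate 95% critical values for the sample ACF of white noise:
# +/- 1.96 / sqrt(T), where T is the series length.
series_lengths <- c(x1 = 36, x2 = 360, x3 = 1000)
crit <- qnorm(0.975) / sqrt(series_lengths)
print(round(crit, 3))
# x1 ~ 0.327, x2 ~ 0.103, x3 ~ 0.062
```

The bound for 36 observations is more than five times wider than the bound for 1,000 observations, which matches the narrowing band seen across the three figures.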

Exercise 2.

A classic example of a non-stationary series are stock prices. Plot the daily closing prices for Amazon stock (contained in gafa_stock), along with the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

## Warning: Provided data has an irregular interval, results should be treated with
## caution. Computing ACF by observation.

The plot of the daily closing price of Amazon stock (2014-2019) shows a trend and changing levels, so the series is non-stationary. On the ACF plot, the autocorrelations do not drop quickly to zero but decay slowly, which is typical of non-stationary data; for such a series the PACF usually has a first-lag value close to 1. If we look at the daily closing price for 2014 alone, the price moves up and down with no predictable pattern. Since the series is non-stationary, it should be differenced.
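The slow ACF decay and the effect of differencing can be illustrated on simulated data. This is a base-R sketch only: `price` is a hypothetical random walk with drift standing in for a stock price, not the Amazon series from gafa_stock, and the seed is arbitrary.

```r
set.seed(624)  # arbitrary seed, for reproducibility only

# Hypothetical stand-in for a price series: random walk with drift
price <- 100 + cumsum(rnorm(1000, mean = 0.1, sd = 1))

# Sample autocorrelations at lags 1..20 (lag 0 dropped)
acf_level <- acf(price, lag.max = 20, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(price), lag.max = 20, plot = FALSE)$acf[-1]

print(round(head(acf_level, 5), 2))  # stays close to 1: non-stationary
print(round(head(acf_diff, 5), 2))   # close to 0 after one difference
```

On the level series the autocorrelations stay near 1 across many lags, while after first differencing they fall inside the white-noise band.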

Exercise 3

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data. a. Turkish GDP from global_economy.

The ACF drops quickly to zero, which suggests that the series can be made stationary.

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.0889 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

Lambda = 0.1571804; order of differencing = 2. The KPSS unit root test has stationarity as its null hypothesis. The test statistic is 0.0889, which is below the 10% critical value of 0.347, so stationarity is not rejected and Turkey_GDP is treated as stationary.

Accommodation takings in the state of Tasmania from aus_accommodation.

The ACF has many spikes, and the time series plot has an upward trend.

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.2573 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

Lambda = -0.005076712; order of differencing = 1. Based on the KPSS test, the test statistic is 0.2573, which is below the 10% critical value of 0.347, so stationarity is not rejected and the accommodation takings data for Tasmania is treated as stationary.
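To make the role of lambda concrete, here is a minimal base-R sketch of the Box-Cox transformation for positive data. The function name `box_cox` and the vector `y` are hypothetical; the original analysis presumably used a package function, which is not shown in the rendered output.

```r
# Box-Cox transformation for strictly positive data:
# log(y) when lambda is 0, otherwise (y^lambda - 1) / lambda.
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

y <- c(10, 100, 1000)               # hypothetical positive values
bc_tas <- box_cox(y, -0.005076712)  # lambda reported for Tasmania
bc_tur <- box_cox(y, 0.1571804)     # lambda reported for Turkish GDP

print(round(cbind(log = log(y), tasmania = bc_tas, turkey = bc_tur), 3))

# Differencing the transformed values, e.g. twice:
print(diff(bc_tur, differences = 2))
```

Because the Tasmania lambda is so close to zero, the transformation is practically a log transformation.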

Monthly sales from souvenirs.

The souvenirs time series moves roughly horizontally with a peak at the end of each cycle, which looks like stationary data. However, the peaks seem to grow in magnitude from year to year, so the data may be non-stationary (not 100% sure).

## Warning: Unknown or uninitialised column: `Sale`.

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.0615 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

Lambda = -0.2444328; order of differencing = 0. Based on the KPSS output above, the test statistic is 0.0615, which is below the 10% critical value of 0.347, so stationarity is not rejected and the souvenirs data is treated as already stationary. After the Box-Cox transformation, the souvenirs series looks more like a regular cyclic progression.
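The KPSS decision rule used in the three cases above can be written out explicitly. This is a small base-R sketch; `kpss_decision` and `kpss_stats` are hypothetical names, and the statistics and the 5% critical value are copied from the test outputs in this exercise.

```r
# KPSS: the null hypothesis is stationarity, so a statistic ABOVE the
# critical value rejects stationarity. 0.463 is the 5% critical value
# printed in the test output.
kpss_decision <- function(stat, crit = 0.463) {
  ifelse(stat > crit, "reject stationarity", "fail to reject stationarity")
}

kpss_stats <- c(turkish_gdp = 0.0889,
                tasmania_takings = 0.2573,
                souvenirs = 0.0615)
decisions <- kpss_decision(kpss_stats)
print(decisions)
```

All three statistics fall below the critical value, so none of the tests rejects stationarity.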

Exercise 5

For your retail data (from Exercise 8 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.0141 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

Exercise 6

Simulate and plot some data from simple ARIMA models.

Use the following R code to generate data from an AR(1) model with ϕ1=0.6 and σ2=1. The process starts with y1=0.

The plot looks stationary.
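The generating code is not shown in the rendered output, so the following is a base-R sketch of an AR(1) generator under the stated settings (ϕ1 = 0.6, σ2 = 1, y1 = 0). The function name `sim_ar1` and the seed are assumptions.

```r
# Simulate y[t] = phi1 * y[t-1] + e[t], with e[t] ~ N(0, 1) and y[1] = 0.
sim_ar1 <- function(phi1, n = 100) {
  e <- rnorm(n)
  y <- numeric(n)              # y[1] starts at 0
  for (i in 2:n) {
    y[i] <- phi1 * y[i - 1] + e[i]
  }
  y
}

set.seed(624)                  # arbitrary seed
y_ar1 <- sim_ar1(0.6)

# Draw the time plot only in an interactive session
if (interactive()) plot(y_ar1, type = "l", xlab = "t", ylab = "y")
```

Calling `sim_ar1()` with different values of `phi1` shows the effect on the plot: values near 0 give a series close to white noise, while values near 1 give a smoother, slowly wandering series.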

Produce a time plot for the series. How does the plot change as you change ϕ1?

When we change ϕ1 to 0.001, the series looks more like pure noise, fluctuating mostly within the band [-2, 2].

Write your own code to generate data from an MA(1) model with θ1=0.6 and σ2=1.
Produce a time plot for the series. How does the plot change as you change θ1?

The plots change little as θ1 varies, apart from looking noisier.
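A base-R sketch of an MA(1) generator with θ1 = 0.6 and σ2 = 1 follows. The function name `sim_ma1`, the seed, and the convention that the error before the first observation is zero are assumptions, not taken from the original code.

```r
# Simulate y[t] = e[t] + theta1 * e[t-1], with e[t] ~ N(0, 1).
# Assumption: the error before the first observation is 0, so y[1] = e[1].
sim_ma1 <- function(theta1, n = 100) {
  e <- rnorm(n)
  y <- numeric(n)
  y[1] <- e[1]
  for (i in 2:n) {
    y[i] <- e[i] + theta1 * e[i - 1]
  }
  y
}

set.seed(624)                  # arbitrary seed
y_ma1 <- sim_ma1(0.6)

# Draw the time plot only in an interactive session
if (interactive()) plot(y_ma1, type = "l", xlab = "t", ylab = "y")
```

For an MA(1) process the variance is (1 + θ1^2)σ2 and only the lag-1 autocorrelation is non-zero, which is why changing θ1 alters the plot far less visibly than changing ϕ1 does in the AR(1) case.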

Generate data from an ARMA(1,1) model with ϕ1=0.6, θ1=0.6 and σ2=1.
Generate data from an AR(2) model with ϕ1=−0.8, ϕ2=0.3 and σ2=1. (Note that these parameters will give a non-stationary series.)
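Both series can be generated with the same loop pattern. The sketch below is in base R; the variable names and the seed are assumptions, and starting values are set to zero. It also checks the AR(2) stationarity condition through the roots of the polynomial 1 − ϕ1 z − ϕ2 z^2.

```r
set.seed(624)                  # arbitrary seed
n <- 100
e <- rnorm(n)                  # errors with sigma^2 = 1

# ARMA(1,1): y[t] = 0.6 * y[t-1] + e[t] + 0.6 * e[t-1], with y[1] = 0
arma11 <- numeric(n)
for (i in 2:n) {
  arma11[i] <- 0.6 * arma11[i - 1] + e[i] + 0.6 * e[i - 1]
}

# AR(2): y[t] = -0.8 * y[t-1] + 0.3 * y[t-2] + e[t], with y[1] = y[2] = 0
ar2 <- numeric(n)
for (i in 3:n) {
  ar2[i] <- -0.8 * ar2[i - 1] + 0.3 * ar2[i - 2] + e[i]
}

# Roots of 1 - phi1*z - phi2*z^2 = 1 + 0.8*z - 0.3*z^2.
# Stationarity requires every root to lie outside the unit circle.
ar2_roots <- polyroot(c(1, 0.8, -0.3))
print(Mod(ar2_roots))

# Draw the time plots only in an interactive session
if (interactive()) {
  plot(arma11, type = "l", main = "ARMA(1,1)")
  plot(ar2, type = "l", main = "AR(2), non-stationary")
}
```

One root has modulus below 1, which confirms the note in the exercise: the AR(2) series oscillates with growing amplitude, while the ARMA(1,1) series stays around zero.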

Exercise 7

Consider aus_airpassengers, the total number of passengers (in millions) from Australian air carriers for the period 1970-2011. a. Use ARIMA() to find an appropriate ARIMA model. What model was selected. Check that the residuals look like white noise. Plot forecasts for the next 10 periods.

## # A tibble: 3 x 8
##   .model   sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
##   <chr>     <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
## 1 arima      4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 2 stepwise   4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 3 search     4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>

All three fits (the default, stepwise and full search) have identical AICc values (198), so the full search finds nothing better than the stepwise selection.

## # A tibble: 1 x 3
##   .model lb_stat lb_pvalue
##   <chr>    <dbl>     <dbl>
## 1 arima     6.70     0.461

b. Write the model in terms of the backshift operator.
c. Plot forecasts from an ARIMA(0,1,0) model with drift and compare these to part a.
d. Plot forecasts from an ARIMA(2,1,2) model with drift and compare these to parts a and c. Remove the constant and see what happens.
e. Plot forecasts from an ARIMA(0,2,1) model with a constant. What happens?
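For part b, the rendered output does not print the selected orders directly. Assuming the automatic selection is ARIMA(0,2,1) without a constant (an inference from the single MA root and from the arima021 row, whose fit statistics are identical to the automatic fit), the model in backshift notation would be as follows, with the value of θ1 left symbolic because it is not reported:

```latex
(1 - B)^2 y_t = (1 + \theta_1 B)\,\varepsilon_t
\quad\Longleftrightarrow\quad
y_t = 2 y_{t-1} - y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1}
```

Here B is the backshift operator, B y_t = y_{t-1}, and ε_t is white noise.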

## Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
## You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.

## Warning: 1 error encountered for arima212
## [1] Could not find an appropriate ARIMA model.
## This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
## For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots

## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning -
## Inf

## Warning: Removed 10 row(s) containing missing values (geom_path).

## # A tibble: 5 x 8
##   .model   sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
##   <chr>     <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
## 1 arima      4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 2 arima021   4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 3 stepwise   4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 4 search     4.31   -97.0  198.  198.  202. <cpl [0]> <cpl [1]>
## 5 arima010   4.27   -98.2  200.  201.  204. <cpl [0]> <cpl [0]>

## # A tibble: 1 x 3
##   .model   lb_stat lb_pvalue
##   <chr>      <dbl>     <dbl>
## 1 arima010    6.77     0.453

## # A tibble: 1 x 3
##   .model   lb_stat lb_pvalue
##   <chr>      <dbl>     <dbl>
## 1 arima212      NA        NA

Among all the fitted models, ARIMA(2,1,2) returned a null model (no fit was found), which is why its Ljung-Box statistics are NA.

Exercise 8

For the United States GDP series (from global_economy): a. if necessary, find a suitable Box-Cox transformation for the data;

## [1] 0.3434343

The Box-Cox transformation (lambda ≈ 0.343) made the trend more linear, although it was already close to linear.

fit a suitable ARIMA model to the transformed data using ARIMA();
try some other plausible models by experimenting with the orders chosen;

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 0.0141 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

choose what you think is the best model and check the residual diagnostics;
produce forecasts of your fitted model. Do the forecasts look reasonable?
compare the results with what you would obtain using ETS() (with no transformation).

## # A tibble: 6 x 9
##   Country       .model    sigma2 log_lik   AIC  AICc   BIC ar_roots  ma_roots 
##   <fct>         <chr>      <dbl>   <dbl> <dbl> <dbl> <dbl> <list>    <list>   
## 1 United States arima222 2.28e22  -1521. 3052. 3053. 3062. <cpl [2]> <cpl [2]>
## 2 United States arima022 2.61e22  -1524. 3054. 3055. 3060. <cpl [0]> <cpl [2]>
## 3 United States stepwise 2.61e22  -1524. 3054. 3055. 3060. <cpl [0]> <cpl [2]>
## 4 United States search   2.61e22  -1524. 3054. 3055. 3060. <cpl [0]> <cpl [2]>
## 5 United States arima021 2.92e22  -1528. 3059. 3059. 3063. <cpl [0]> <cpl [1]>
## 6 United States arima212 3.10e22  -1530. 3061. 3061. 3063. <cpl [0]> <cpl [0]>

## [1] "we choose  ARIMA(2,2,2) model because of the lowest AICc"

## [1] "Looks like a normal residual plot"

## [1] "Forecast of fitted ARIMA(0,2,2) model ...plot below"

## [1] "Forecast of fitted ARIMA(2,2,2) model looks reasonable...plot below"

## [1] "Forecast of fitted ARIMA(0,2,1) model....plot below"

## # A tibble: 1 x 4
##   Country       .model   lb_stat lb_pvalue
##   <fct>         <chr>      <dbl>     <dbl>
## 1 United States arima222    10.7     0.155

## [1] "Let's see the results with ETS"

## ETS(M,A,N) 
## 
## Call:
##  ets(y = .) 
## 
##   Smoothing parameters:
##     alpha = 0.9991 
##     beta  = 0.5012 
## 
##   Initial states:
##     l = 448093333333.703 
##     b = 64917355686.8708 
## 
##   sigma:  0.026
## 
##      AIC     AICc      BIC 
## 3190.787 3191.941 3201.089

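## [1] "Comparing ETS() and ARIMA(): the ARIMA AICc (3052.273) is lower than the ETS AICc (3191.941), but AICc values are not directly comparable between the two model classes, because the ARIMA likelihood is computed on differenced data; forecast accuracy is the safer basis for comparison"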

Data624_HW6

Alexis Mekueko

10/23/2021