Load libraries to be used

library(astsa)
library(forecast)

## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo

## 
## Attaching package: 'forecast'

## The following object is masked from 'package:astsa':
## 
##     gas

library(fGarch)

## Loading required package: timeDate

## Loading required package: timeSeries

## Loading required package: fBasics

## 
## Attaching package: 'fBasics'

## The following object is masked from 'package:astsa':
## 
##     nyse

Question 1

(a) By visually checking time series, ACF, and PACF plots of the train set, decide what ARIMA (or SARIMA) model seems most appropriate

load("cregist.RData")

train.cregist = ts(cregist[1:223], frequency = 4, start = c(1960,1))
test.cregist = ts(cregist[224:235], frequency = 4)

plot.ts(train.cregist)

Time series plot seems to be non stationary and shows some trend may be additive or multiplicative. We will plot for log transformation and log difference as well.

we can also check the acf

acf(train.cregist, lag=60)

ACF signals show significant high bars and nothing below zero, mostly above the blue line.

Lets try with difference data. We will apply first order difference

diff.cregist = diff(train.cregist, lag=1)
plot.ts(diff.cregist)

Time series looks much better and stationary now.

we will check acf/pacf for differenced data

par(mfrow=c(1,2))
acf(diff.cregist, lag=60)
pacf(diff.cregist, lag=60)

From ACF we see cut off at lag 1 and PACF tails off. So our model selected will be ARIMA(0,1,1)

2. Fit the model which you chose in part (a) on the train set using sarima function in R. Make comments on diagnostics plots.

cregist.sarima = sarima(train.cregist, p=0, d=1, q=1)

## initial  value 10.967596 
## iter   2 value 10.903519
## iter   3 value 10.903111
## iter   4 value 10.903081
## iter   4 value 10.903081
## iter   4 value 10.903081
## final  value 10.903081 
## converged
## initial  value 10.903344 
## iter   2 value 10.903343
## iter   2 value 10.903343
## iter   2 value 10.903343
## final  value 10.903343 
## converged

cregist.sarima

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), xreg = constant, transform.pars = trans, fixed = fixed, 
##     optim.control = list(trace = trc, REPORT = 1, reltol = tol))
## 
## Coefficients:
##           ma1   constant
##       -0.3366   352.0726
## s.e.   0.0577  2425.2920
## 
## sigma^2 estimated as 2.953e+09:  log likelihood = -2735.55,  aic = 5477.09
## 
## $degrees_of_freedom
## [1] 220
## 
## $ttable
##          Estimate        SE t.value p.value
## ma1       -0.3366    0.0577 -5.8283  0.0000
## constant 352.0726 2425.2920  0.1452  0.8847
## 
## $AIC
## [1] 24.67159
## 
## $AICc
## [1] 24.67184
## 
## $BIC
## [1] 24.71757

We note the AIC and BIC values as 24.67 and 24.71 respectively.

From diagnostics plot we see ACF of residuals are mostly within the threshold. As per p values for Ljung-Box statistic they are are all above 0 which is good.

(c) By visually checking the time series plot of the train set, decide which ESM model seems most appropriate. Explain your choice.

plot.ts(train.cregist)

From the time series plot of train set we see a downward trend and it is additive or A. There appears no seasonality so third parameter is N. For the error term we will try with both Additive and Multiplicative A and M. So ESM model we can try is MAN and AAN.

(d) Fit the model which you chose in part (c) on the train set. Use ets function with the specification of model input with three letters. Report autoplot() and interpret how your chosen model decomposes the data.

train.cregist.ets = ets(train.cregist, model="AAN")
train.cregist.ets

## ETS(A,Ad,N) 
## 
## Call:
##  ets(y = train.cregist, model = "AAN") 
## 
##   Smoothing parameters:
##     alpha = 0.6619 
##     beta  = 1e-04 
##     phi   = 0.8004 
## 
##   Initial states:
##     l = 529603.7364 
##     b = 9332.689 
## 
##   sigma:  54876.48
## 
##      AIC     AICc      BIC 
## 6079.869 6080.258 6100.312

train.cregist.ets1 = ets(train.cregist, model="MAN")
train.cregist.ets1

## ETS(M,Ad,N) 
## 
## Call:
##  ets(y = train.cregist, model = "MAN") 
## 
##   Smoothing parameters:
##     alpha = 0.6098 
##     beta  = 1e-04 
##     phi   = 0.98 
## 
##   Initial states:
##     l = 529603.7159 
##     b = 7965.6703 
## 
##   sigma:  0.0759
## 
##      AIC     AICc      BIC 
## 6071.791 6072.180 6092.234

We notice the AIC and BIC values as 6071.79 and 6092.23 respectively.

train.cregist.ets2 = ets(train.cregist)
train.cregist.ets2

## ETS(M,N,N) 
## 
## Call:
##  ets(y = train.cregist) 
## 
##   Smoothing parameters:
##     alpha = 0.6207 
## 
##   Initial states:
##     l = 544687.6876 
## 
##   sigma:  0.0761
## 
##      AIC     AICc      BIC 
## 6068.329 6068.438 6078.550

We notice AIC and BIC values as 6068.32 and 6078.55 respectively.

When we have run the two models AAN, MAN and see that auto ETS selects MNN model. So making a comparison three AAN, MAN and MNN models in terms of BIC and AIC, we observe that MAN model has lowest AIC and BIC. Hence, our final model selected for forecasting is MAN.

(e) Make a prediction for next 12 values based on your

(i) ARIMA (or SARIMA) model you chose in part (a) and

(ii) ESM you chose in part (c). Calculate their prediction errors using the test set.

pred.cregist = sarima.for(train.cregist,p=0,q=1,d=1,n.ahead=12)

pred.cregist

## $pred
##          Qtr1     Qtr2     Qtr3     Qtr4
## 2015                            635571.6
## 2016 635923.6 636275.7 636627.8 636979.9
## 2017 637331.9 637684.0 638036.1 638388.1
## 2018 638740.2 639092.3 639444.4         
## 
## $se
##           Qtr1      Qtr2      Qtr3      Qtr4
## 2015                                54343.06
## 2016  65215.24  74517.70  82781.33  90291.79
## 2017  97223.80 103693.43 109782.46 115551.06
## 2018 121045.06 126300.31 131345.45

pred.cregist$pred

##          Qtr1     Qtr2     Qtr3     Qtr4
## 2015                            635571.6
## 2016 635923.6 636275.7 636627.8 636979.9
## 2017 637331.9 637684.0 638036.1 638388.1
## 2018 638740.2 639092.3 639444.4

print(mean((pred.cregist$pred - as.vector(test.cregist))))

## [1] 116690.1

Now prediction using ETS model MAN selected

pred.cregist1 = forecast(train.cregist.ets1, h=12)
plot(pred.cregist1)

For sarima forecast we see the predicted next 12 values follow a little upward trend. And when make prediction with ESM selected model we see the next 12 predicted values as a flat line.

print(pred.cregist1$mean)

##          Qtr1     Qtr2     Qtr3     Qtr4
## 2015                            635186.6
## 2016 635267.3 635346.3 635423.8 635499.7
## 2017 635574.0 635646.9 635718.4 635788.4
## 2018 635857.0 635924.2 635990.1

print(mean((pred.cregist1$mean - as.vector(test.cregist))))

## [1] 114784

(f) Compare prediction performance between ESM and ARIMA.

As per the prediction MSE error values we see that it is higher for SARIMA model compared to ESM model selected.

Question 2

data("UnempRate")
train.UnempRate = ts(UnempRate[1:791], frequency = 12, start = c(1948,1))
test.UnempRate = ts(UnempRate[792:827], frequency = 12)

(a) By visually checking time series, ACF, and PACF plots of the train set, decide what ARIMA (or SARIMA) model seems most appropriate.

plot.ts(train.UnempRate)

The first eye ball look suggests that this series is non stationary and trend is upwards and seasonal as well.

we can see the acf plot as well

acf(train.UnempRate)

This has high values for ACF and indicates non stationary.

We will do log transformation of the data and check the time series again

log.UnempRate = log(train.UnempRate)
plot.ts(log.UnempRate)

It shows little change in time series and is still non stationary and trend

We will take first order difference

diff.UnempRate = diff(train.UnempRate, lag=1)
plot.ts(diff.UnempRate)

This is now stationary time series data. So first order difference data is stationary.

We will now look at ACF and PACF Plots

par(mfrow=c(1,2))
acf(diff.UnempRate, lag=60)
pacf(diff.UnempRate, lag=60)

Here we see ACF tailoff and PACF cutoff. Seasonal will have AR pattern every 12 lags So we take ARIMA(1,1,1) and seasonal ARIMA(1,1,1) with S=12

(b)Fit the model which you chose in part (a) on the train set using sarima function in R. Make comments on diagnostics plots.

sarima.UnempRate = sarima(train.UnempRate,p=1,d=1,q=1,P=1,D=1,Q=1,S=12)

## initial  value -1.159264 
## iter   2 value -1.315096
## iter   3 value -1.356353
## iter   4 value -1.369544
## iter   5 value -1.377989
## iter   6 value -1.380405
## iter   7 value -1.382290
## iter   8 value -1.390321
## iter   9 value -1.403696
## iter  10 value -1.405629
## iter  11 value -1.408597
## iter  12 value -1.409855
## iter  13 value -1.411677
## iter  14 value -1.412737
## iter  15 value -1.413065
## iter  16 value -1.413100
## iter  17 value -1.413102
## iter  18 value -1.413102
## iter  19 value -1.413102
## iter  19 value -1.413102
## final  value -1.413102 
## converged
## initial  value -1.415083 
## iter   2 value -1.418933
## iter   3 value -1.422674
## iter   4 value -1.423501
## iter   5 value -1.423713
## iter   6 value -1.423736
## iter   7 value -1.423746
## iter   8 value -1.423794
## iter   9 value -1.423801
## iter  10 value -1.423801
## iter  10 value -1.423801
## final  value -1.423801 
## converged

sarima.UnempRate

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed, 
##     optim.control = list(trace = trc, REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1      ma1     sar1     sma1
##       0.8235  -0.6547  -0.0208  -0.7505
## s.e.  0.0449   0.0570   0.0507   0.0370
## 
## sigma^2 estimated as 0.05722:  log likelihood = 3.78,  aic = 2.43
## 
## $degrees_of_freedom
## [1] 774
## 
## $ttable
##      Estimate     SE  t.value p.value
## ar1    0.8235 0.0449  18.3570  0.0000
## ma1   -0.6547 0.0570 -11.4961  0.0000
## sar1  -0.0208 0.0507  -0.4094  0.6824
## sma1  -0.7505 0.0370 -20.2579  0.0000
## 
## $AIC
## [1] 0.003084067
## 
## $AICc
## [1] 0.003148732
## 
## $BIC
## [1] 0.03259437

ACF of residuals are within the threshold. But p values for Ljun Box statistics are seen the be hovering around 0. This is may be due to some issues.

sarima.UnempRate

## $fit
## 
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D, 
##     Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed, 
##     optim.control = list(trace = trc, REPORT = 1, reltol = tol))
## 
## Coefficients:
##          ar1      ma1     sar1     sma1
##       0.8235  -0.6547  -0.0208  -0.7505
## s.e.  0.0449   0.0570   0.0507   0.0370
## 
## sigma^2 estimated as 0.05722:  log likelihood = 3.78,  aic = 2.43
## 
## $degrees_of_freedom
## [1] 774
## 
## $ttable
##      Estimate     SE  t.value p.value
## ar1    0.8235 0.0449  18.3570  0.0000
## ma1   -0.6547 0.0570 -11.4961  0.0000
## sar1  -0.0208 0.0507  -0.4094  0.6824
## sma1  -0.7505 0.0370 -20.2579  0.0000
## 
## $AIC
## [1] 0.003084067
## 
## $AICc
## [1] 0.003148732
## 
## $BIC
## [1] 0.03259437

Notice AIC and BIC values as 0.003 and 0.325 respectively.

So we will try auto arima model.

sarima.UnempRate_auto= auto.arima(train.UnempRate)
sarima.UnempRate_auto

## Series: train.UnempRate 
## ARIMA(3,0,1)(2,1,2)[12] 
## 
## Coefficients:
##          ar1      ar2      ar3      ma1     sar1    sar2    sma1     sma2
##       1.6794  -0.5682  -0.1238  -0.6012  -0.7889  0.0490  0.0529  -0.6234
## s.e.  0.0919   0.1416   0.0545   0.0831   0.0772  0.0583  0.0689   0.0503
## 
## sigma^2 estimated as 0.05549:  log likelihood=18.47
## AIC=-18.93   AICc=-18.7   BIC=22.99

We notice AIC and BIC values as -18.93 and 22.99 respectively.

We see that auto arima function chooses ARIMA(3,0,1)(2,1,2)[12] and our manual selection is ARIMA(1,1,1)(1,1,1) S=12.

We will now compare AIC and BIC values and find that AIC and BIC are lower for our manual model selected, and hence we will go with this model.

(c) By visually checking the time series plot of the train set, decide which ESM model seems the most appropriate. Explain your choice.

plot.ts(train.UnempRate)

When we check the time series plot has a trend and it is additive or A. There appears no seasonality so third parameter is M. For the error term we will try both Additive A and Multiplicative M. So ESM model should be MAM and AAM

(d)Fit the model which you chose in part (c) on the train set. Use ets function with the specification of model input with three letters. Report autoplot() and interpret how your chosen model decomposes the data.

UnempRate.ets = ets(train.UnempRate, model = "MAM" )
UnempRate.ets

## ETS(M,Ad,M) 
## 
## Call:
##  ets(y = train.UnempRate, model = "MAM") 
## 
##   Smoothing parameters:
##     alpha = 0.7164 
##     beta  = 0.0532 
##     gamma = 0.2835 
##     phi   = 0.8026 
## 
##   Initial states:
##     l = 3.2054 
##     b = 0.1779 
##     s = 0.9184 0.8635 0.7918 0.8505 0.8699 0.9988
##            0.9781 0.9238 1.0662 1.2055 1.3237 1.2098
## 
##   sigma:  0.0539
## 
##      AIC     AICc      BIC 
## 3394.344 3395.230 3478.463

# UnempRate.ets1 = ets(train.UnempRate, model = "AAM" ) # error forbidden model combination

autoplot(UnempRate.ets)

Checking the diagnostics plot, the observed trend is towards up and seasonality can also be seen. So we will keep our model as MAM.

(e) Make a prediction for next 36 values based on your (i) ARIMA (or SARIMA) model you chose in part (a) and (ii) ESM you chose in part (c). Calculate prediction errors using the test set.

pred.UnempRate1 = sarima.for(train.UnempRate, n.ahead=36, p=0, d=1, q=2, P=0, D=1, Q=1, S=12)

pred.UnempRate1

## $pred
##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 2013                                                                        
## 2014 7.401420 7.231322 6.957408 6.439000 6.554403 7.041390 7.092784 6.822220
## 2015 7.233736 7.063638 6.789724 6.271316 6.386719 6.873706 6.925100 6.654536
## 2016 7.066052 6.895954 6.622040 6.103632 6.219036 6.706022 6.757417 6.486852
##           Sep      Oct      Nov      Dec
## 2013                            6.699899
## 2014 6.524507 6.420845 6.347168 6.463708
## 2015 6.356824 6.253161 6.179485 6.296024
## 2016 6.189140 6.085478 6.011801         
## 
## $se
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 2013                                                                      
## 2014 0.3614176 0.4803960 0.5752721 0.6565787 0.7288712 0.7946137 0.8553178
## 2015 1.1887986 1.2507842 1.3098398 1.3663453 1.4206050 1.4728671 1.5233373
## 2016 1.8235957 1.8810271 1.9367563 1.9909260 2.0436605 2.0950680 2.1452440
##            Aug       Sep       Oct       Nov       Dec
## 2013                                         0.2413038
## 2014 0.9119902 0.9653413 1.0158944 1.0640485 1.1270938
## 2015 1.5721882 1.6195662 1.6655971 1.7103897 1.7670957
## 2016 2.1942729 2.2422299 2.2891826 2.3351913

pred.UnempRate1

## $pred
##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 2013                                                                        
## 2014 7.401420 7.231322 6.957408 6.439000 6.554403 7.041390 7.092784 6.822220
## 2015 7.233736 7.063638 6.789724 6.271316 6.386719 6.873706 6.925100 6.654536
## 2016 7.066052 6.895954 6.622040 6.103632 6.219036 6.706022 6.757417 6.486852
##           Sep      Oct      Nov      Dec
## 2013                            6.699899
## 2014 6.524507 6.420845 6.347168 6.463708
## 2015 6.356824 6.253161 6.179485 6.296024
## 2016 6.189140 6.085478 6.011801         
## 
## $se
##            Jan       Feb       Mar       Apr       May       Jun       Jul
## 2013                                                                      
## 2014 0.3614176 0.4803960 0.5752721 0.6565787 0.7288712 0.7946137 0.8553178
## 2015 1.1887986 1.2507842 1.3098398 1.3663453 1.4206050 1.4728671 1.5233373
## 2016 1.8235957 1.8810271 1.9367563 1.9909260 2.0436605 2.0950680 2.1452440
##            Aug       Sep       Oct       Nov       Dec
## 2013                                         0.2413038
## 2014 0.9119902 0.9653413 1.0158944 1.0640485 1.1270938
## 2015 1.5721882 1.6195662 1.6655971 1.7103897 1.7670957
## 2016 2.1942729 2.2422299 2.2891826 2.3351913

We see that the next 36 value prediction follows the trend and is little downwards.

print(pred.UnempRate1$mean)

## NULL

print(mean((pred.UnempRate1$mean - as.vector(test.UnempRate)^2)))

## [1] NaN

Prediction for the ESM model

pred.UnempRate = forecast(UnempRate.ets, h=36)
plot(pred.UnempRate)

The next 36 values follow the trend on the plot as we can see above.

print(pred.UnempRate$mean)

##           Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## 2013                                                                        
## 2014 7.382915 7.232364 7.069825 6.714668 6.910437 7.372853 7.437188 7.196614
## 2015 7.334351 7.195252 7.041777 6.694317 6.894695 7.360520 7.428364 7.190886
## 2016 7.336407 7.198021 7.045079 6.697907 6.898765 7.365185 7.433331 7.195895
##           Sep      Oct      Nov      Dec
## 2013                            6.747176
## 2014 6.895069 6.756338 6.526279 6.690677
## 2015 6.891745 6.754784 6.526097 6.691680
## 2016 6.896700 6.759762 6.531002

We will now calculate prediction error using test set

Prediction MSE error value

print(mean((pred.UnempRate$mean - as.vector(test.UnempRate)^2)))

## [1] -23.69653

(f) Compare prediction performance between ESM and ARIMA.

When we compare MSE error rate we see that ESM model has very low MSE error.

Question 3

The “nyse.RData” contains returns of the New York Stock Exchange (NYSE) from February 2, 1984 to December 31, 1991.

load("nyse.RData")
nyse=ts(nyse, frequency = 365)

(a) Generate the time series plot on returns of NYSE. Generate ACF and PACF plots from both raw data and squared data, respectively. Make comments based on your observation.

plot.ts(nyse)

This looks like a stationary time series.

Now acf on returned and squared return

par(mfrow=c(1,2))
acf(nyse);pacf(nyse)

In ACF plot for raw data we see few significant signals and also in PACF signficant signals. So raw data is correlated in the mean part.

par(mfrow=c(1,2))
acf(nyse^2);pacf(nyse^2)

In ACF plot for squared data we see some more significant signals and also in PACF there is about the same signficant signals as raw data acf plot. So sqaured data is also correlated in the mean part.

(b) Fit any four times series models of Heteroscedasticity, i.e., choose any four models from ARCH(q) or GARCH(q,p) with your specification of q and p. Then choose the best one based on diagnostics tests on residuals and AIC criteria. There is no one correct model and choose the one that you think the best.

ARCH(1) Model

nyse.m1 = garchFit(~garch(1,0), data = nyse, trace=F)

## Warning: Using formula(x) is deprecated when x is a character vector of length > 1.
##   Consider formula(paste(x, collapse = " ")) instead.

summary(nyse.m1)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 0), data = nyse, trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 0)
## <environment: 0x7fc31894b468>
##  [data = nyse]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1  
## 6.3533e-04  6.5228e-05  2.6983e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     6.353e-04   1.968e-04    3.228  0.00125 ** 
## omega  6.523e-05   2.484e-06   26.254  < 2e-16 ***
## alpha1 2.698e-01   3.301e-02    8.175 2.22e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  6594.697    normalized:  3.297349 
## 
## Description:
##  Thu May 13 00:46:14 2021 by user:  
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value     
##  Jarque-Bera Test   R    Chi^2  9019.18   0           
##  Shapiro-Wilk Test  R    W      0.9212042 0           
##  Ljung-Box Test     R    Q(10)  38.38662  3.249771e-05
##  Ljung-Box Test     R    Q(15)  42.17843  0.0002109731
##  Ljung-Box Test     R    Q(20)  45.27366  0.001012957 
##  Ljung-Box Test     R^2  Q(10)  400.3309  0           
##  Ljung-Box Test     R^2  Q(15)  416.9455  0           
##  Ljung-Box Test     R^2  Q(20)  422.2616  0           
##  LM Arch Test       R    TR^2   308.6684  0           
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.591697 -6.583296 -6.591702 -6.588612

ARCH(2) Model

nyse.m2 = garchFit(~garch(2,0), data = nyse, trace=F)

## Warning: Using formula(x) is deprecated when x is a character vector of length > 1.
##   Consider formula(paste(x, collapse = " ")) instead.

summary(nyse.m2)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(2, 0), data = nyse, trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(2, 0)
## <environment: 0x7fc31a2a3b20>
##  [data = nyse]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1      alpha2  
## 6.7759e-04  5.5752e-05  1.8776e-01  1.4201e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     6.776e-04   1.889e-04    3.587 0.000334 ***
## omega  5.575e-05   2.352e-06   23.707  < 2e-16 ***
## alpha1 1.878e-01   2.840e-02    6.612 3.79e-11 ***
## alpha2 1.420e-01   2.584e-02    5.497 3.87e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  6660.251    normalized:  3.330126 
## 
## Description:
##  Thu May 13 00:46:14 2021 by user:  
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value     
##  Jarque-Bera Test   R    Chi^2  3242.832  0           
##  Shapiro-Wilk Test  R    W      0.9441303 0           
##  Ljung-Box Test     R    Q(10)  25.50612  0.004464403 
##  Ljung-Box Test     R    Q(15)  27.22277  0.02697281  
##  Ljung-Box Test     R    Q(20)  29.02819  0.08720581  
##  Ljung-Box Test     R^2  Q(10)  84.49595  6.561418e-14
##  Ljung-Box Test     R^2  Q(15)  94.69486  1.308953e-13
##  Ljung-Box Test     R^2  Q(20)  99.9624   1.27931e-12 
##  LM Arch Test       R    TR^2   76.41465  1.983314e-11
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.656251 -6.645049 -6.656259 -6.652138

GARCH(1,1)

nyse.m3 = garchFit(~garch(1,1), data = nyse, trace=F)

## Warning: Using formula(x) is deprecated when x is a character vector of length > 1.
##   Consider formula(paste(x, collapse = " ")) instead.

summary(nyse.m3)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(1, 1), data = nyse, trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(1, 1)
## <environment: 0x7fc31804bf58>
##  [data = nyse]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1       beta1  
## 7.3695e-04  6.5416e-06  1.1408e-01  8.0608e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     7.369e-04   1.786e-04    4.126 3.69e-05 ***
## omega  6.542e-06   1.455e-06    4.495 6.94e-06 ***
## alpha1 1.141e-01   1.604e-02    7.114 1.13e-12 ***
## beta1  8.061e-01   2.973e-02   27.112  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  6723.005    normalized:  3.361502 
## 
## Description:
##  Thu May 13 00:46:14 2021 by user:  
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value     
##  Jarque-Bera Test   R    Chi^2  3628.415  0           
##  Shapiro-Wilk Test  R    W      0.9515562 0           
##  Ljung-Box Test     R    Q(10)  29.69242  0.0009616813
##  Ljung-Box Test     R    Q(15)  30.50938  0.01021164  
##  Ljung-Box Test     R    Q(20)  32.81143  0.03538324  
##  Ljung-Box Test     R^2  Q(10)  3.510505  0.9667405   
##  Ljung-Box Test     R^2  Q(15)  4.408852  0.9960585   
##  Ljung-Box Test     R^2  Q(20)  6.68935   0.9975864   
##  LM Arch Test       R    TR^2   3.967784  0.9840107   
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.719005 -6.707803 -6.719013 -6.714891

GARCH(2,2)

nyse.m4 = garchFit(~garch(2,2), data = nyse, trace=F)

## Warning: Using formula(x) is deprecated when x is a character vector of length > 1.
##   Consider formula(paste(x, collapse = " ")) instead.

summary(nyse.m4)

## 
## Title:
##  GARCH Modelling 
## 
## Call:
##  garchFit(formula = ~garch(2, 2), data = nyse, trace = F) 
## 
## Mean and Variance Equation:
##  data ~ garch(2, 2)
## <environment: 0x7fc3003a2ab0>
##  [data = nyse]
## 
## Conditional Distribution:
##  norm 
## 
## Coefficient(s):
##         mu       omega      alpha1      alpha2       beta1       beta2  
## 7.3368e-04  7.9497e-06  1.4136e-01  1.0000e-08  3.7725e-01  3.8309e-01  
## 
## Std. Errors:
##  based on Hessian 
## 
## Error Analysis:
##         Estimate  Std. Error  t value Pr(>|t|)    
## mu     7.337e-04   1.791e-04    4.098 4.18e-05 ***
## omega  7.950e-06   3.367e-06    2.361   0.0182 *  
## alpha1 1.414e-01   2.408e-02    5.872 4.32e-09 ***
## alpha2 1.000e-08   6.478e-02    0.000   1.0000    
## beta1  3.772e-01   2.871e-01    1.314   0.1889    
## beta2  3.831e-01   2.067e-01    1.853   0.0639 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Log Likelihood:
##  6726.978    normalized:  3.363489 
## 
## Description:
##  Thu May 13 00:46:14 2021 by user:  
## 
## 
## Standardised Residuals Tests:
##                                 Statistic p-Value    
##  Jarque-Bera Test   R    Chi^2  3283.62   0          
##  Shapiro-Wilk Test  R    W      0.9532771 0          
##  Ljung-Box Test     R    Q(10)  28.66327  0.001412368
##  Ljung-Box Test     R    Q(15)  29.53388  0.01371783 
##  Ljung-Box Test     R    Q(20)  31.75336  0.04599833 
##  Ljung-Box Test     R^2  Q(10)  2.732222  0.9870426  
##  Ljung-Box Test     R^2  Q(15)  3.796005  0.9983323  
##  Ljung-Box Test     R^2  Q(20)  6.058174  0.9988166  
##  LM Arch Test       R    TR^2   3.330788  0.9927238  
## 
## Information Criterion Statistics:
##       AIC       BIC       SIC      HQIC 
## -6.720978 -6.704175 -6.720996 -6.714808

For ARCH(1) Model1 we see small p values for Ljung and even for R squared it is zero. So R and R Squared both are correlated; AIC is -6.591697 For ARCH(2) Model2 we see small p values for Ljung and even for R squared it is zero. So R and R Squared both are correlated; AIC is -6.656251 For GARCH(1,1) Model3 we see small p values for Ljung less than 0.05 and for R squared it is more than 0.05. So R is correlated, R Squared is uncorrelated; AIC is -6.719005

For GARCH(2,2) Model4 we see small p values for Ljung less than 0.05 and for R squared it is more than 0.05. So R is correlated, R Squared is uncorrelated; AIC is -6.720978

All models have R correlated and when compare AIC we see that GARCH(2,2) has the least AIC value -6.720 and GARCH(2,2) has AIC value of -6.719. The difference between both AIC is very small but GARCH(1,1) has smaller number of coefficients making it less complex model. Hence we will select GARCH(1,1) as our final/best model.

(c) Make a prediction for the next 10 volatility based on your final model

(nyse.pred = predict(nyse.m3, n.ahead=10))

##    meanForecast   meanError standardDeviation
## 1  0.0007369498 0.010351669       0.010351669
## 2  0.0007369498 0.010253926       0.010253926
## 3  0.0007369498 0.010163156       0.010163156
## 4  0.0007369498 0.010078912       0.010078912
## 5  0.0007369498 0.010000767       0.010000767
## 6  0.0007369498 0.009928317       0.009928317
## 7  0.0007369498 0.009861183       0.009861183
## 8  0.0007369498 0.009799002       0.009799002
## 9  0.0007369498 0.009741435       0.009741435
## 10 0.0007369498 0.009688162       0.009688162

predict(nyse.m3,n.ahead=10,plot=TRUE, conf = .9)

##    meanForecast   meanError standardDeviation lowerInterval upperInterval
## 1  0.0007369498 0.010351669       0.010351669   -0.01629003    0.01776393
## 2  0.0007369498 0.010253926       0.010253926   -0.01612926    0.01760316
## 3  0.0007369498 0.010163156       0.010163156   -0.01597995    0.01745385
## 4  0.0007369498 0.010078912       0.010078912   -0.01584138    0.01731528
## 5  0.0007369498 0.010000767       0.010000767   -0.01571285    0.01718675
## 6  0.0007369498 0.009928317       0.009928317   -0.01559368    0.01706758
## 7  0.0007369498 0.009861183       0.009861183   -0.01548325    0.01695715
## 8  0.0007369498 0.009799002       0.009799002   -0.01538097    0.01685487
## 9  0.0007369498 0.009741435       0.009741435   -0.01528628    0.01676018
## 10 0.0007369498 0.009688162       0.009688162   -0.01519866    0.01667256

Final Exam

Shams Tabrez

5/7/2021

Question 1

(a) By visually checking time series, ACF, and PACF plots of the train set, decide what ARIMA (or SARIMA) model seems most appropriate

2. Fit the model which you chose in part (a) on the train set using sarima function in R. Make comments on diagnostics plots.

(c) By visually checking the time series plot of the train set, decide which ESM model seems most appropriate. Explain your choice.

(d) Fit the model which you chose in part (c) on the train set. Use ets function with the specification of model input with three letters. Report autoplot() and interpret how your chosen model decomposes the data.

(e) Make a prediction for next 12 values based on your

(i) ARIMA (or SARIMA) model you chose in part (a) and

(ii) ESM you chose in part (c). Calculate their prediction errors using the test set.

(f) Compare prediction performance between ESM and ARIMA.

Question 2

(a) By visually checking time series, ACF, and PACF plots of the train set, decide what ARIMA (or SARIMA) model seems most appropriate.

(b)Fit the model which you chose in part (a) on the train set using sarima function in R. Make comments on diagnostics plots.

(c) By visually checking the time series plot of the train set, decide which ESM model seems the most appropriate. Explain your choice.

(d)Fit the model which you chose in part (c) on the train set. Use ets function with the specification of model input with three letters. Report autoplot() and interpret how your chosen model decomposes the data.

(e) Make a prediction for next 36 values based on your (i) ARIMA (or SARIMA) model you chose in part (a) and (ii) ESM you chose in part (c). Calculate prediction errors using the test set.

(f) Compare prediction performance between ESM and ARIMA.

Question 3

The “nyse.RData” contains returns of the New York Stock Exchange (NYSE) from February 2, 1984 to December 31, 1991.

(a) Generate the time series plot on returns of NYSE. Generate ACF and PACF plots from both raw data and squared data, respectively. Make comments based on your observation.

(c) Make a prediction for the next 10 volatility based on your final model