Predictive Analytics

Dan Wigodsky

Data 624 Homework 6

March 14, 2019

Question 8:1

a) Graph 1 has a significant value at a lag of 12. If the data are monthly, that could suggest an annual pattern, but the caption says it is a white noise series, so at 95% confidence this spike could simply be an anomaly arising from a white noise process: with about 20 lags plotted, we would expect roughly one spurious significant lag. The second graph shows significant lags at 2 and 6. That can also happen under white noise, though we might suspect an MA(2) or MA(6) process and would check for one. The last plot shows no significant lags. Across the 3 plots, with about 60 lags in total, the 3 significant values we saw are roughly what we would expect on average.
b) In each successive simulation, a longer series is created. The critical bounds are \(\pm \frac{ 1.96 }{ \sqrt { n } }\), so as n gets larger, the bounds shrink toward zero.
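
A quick sketch of how those bounds tighten with series length (the lengths below are illustrative, not necessarily the ones used in the exercise):

# minimal sketch: the ACF bounds are +/- 1.96 / sqrt(n), so they shrink as n grows
set.seed(123)
for (n in c(36, 360, 1000)) {
  x <- ts(rnorm(n))                                   # white noise of length n
  cat("n =", n, " bound = +/-", round(1.96 / sqrt(n), 3), "\n")
  acf(x, main = paste("white noise, n =", n))
}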

Question 8:2

The plots of IBM’s closing price show a non-stationary series. The time plot has a general downward trend, the ACF decays slowly, and in the PACF the first value is large and then drops off quickly. These are all signs of non-stationarity, so the series should be differenced before we try to predict from it.
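
A minimal sketch of that differencing step, assuming ibmclose is available through the fpp2/fma packages:

# difference the IBM closing prices and re-check the ACF/PACF
library(fpp2)
ndiffs(ibmclose)        # suggested number of first differences
acf(diff(ibmclose))     # should no longer decay slowly
pacf(diff(ibmclose))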

Question 8:3

All 5 of our datasets have an upward trend, so each needs differencing to obtain a stationary series. mcopper, enplanements and visitors also appear to have increasing variance and could use a Box-Cox transformation.

Optimal lambdas for Box-Cox transformation

series          optimal lambda
usnetelec              0.5168
usgdp                  0.3664
mcopper                0.1919
enplanements          -0.2269
visitors               0.2775
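
These lambdas come from BoxCox.lambda(); a minimal sketch of estimating and applying one of them, using usnetelec as the example:

library(fpp2)
lambda <- BoxCox.lambda(usnetelec)              # about 0.52 for usnetelec
usnetelec_boxcox <- BoxCox(usnetelec, lambda)
autoplot(usnetelec_boxcox)                      # variance should now be more even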
Looking at the ACF and PACF, usnetelec could be AR(1). The Ljung-Box test shows significant autocorrelation, however, so the series is not white noise and will need differencing before we can evaluate it.

## 
##  Box-Ljung test
## 
## data:  usnetelec
## X-squared = 329.22, df = 10, p-value < 2.2e-16
ndiffs() suggests 2 differences. After 1 difference, the Ljung-Box test is no longer significant. The KPSS unit root test is still significant at the 10% level after 1 difference, but not after 2. A plot of our twice-differenced data appears stationary.
## [1] "number of differences for usnetelec series:"
## [1] 2
## 
##  Box-Ljung test
## 
## data:  diff(usnetelec_boxcox)
## X-squared = 7.9451, df = 10, p-value = 0.6342
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.4315 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.072 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
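
A condensed sketch of those checks (the full versions are in the appendix):

library(fpp2)
library(urca)
usnetelec_boxcox <- BoxCox(usnetelec, BoxCox.lambda(usnetelec))
ndiffs(usnetelec_boxcox)                                        # suggested number of differences
Box.test(diff(usnetelec_boxcox), lag = 10, type = "Ljung-Box")  # after one difference
summary(ur.kpss(diff(usnetelec_boxcox)))                        # KPSS, one difference
summary(ur.kpss(diff(diff(usnetelec_boxcox))))                  # KPSS, two differences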

Our usgdp series is much more stubborn. ndiffs() returns 1 difference, but the Ljung-Box test stays significant even after several differences, including a seasonal difference. The KPSS unit root test does pass after 1 difference. The differenced series could still carry an upward trend, so this series does not lend itself to simple analysis.

## [1] "number of differences for usgdp series:"
## [1] 1
## 
##  Box-Ljung test
## 
## data:  usgdp
## X-squared = 2078.3, df = 10, p-value < 2.2e-16
## 
##  Box-Ljung test
## 
## data:  diff(usgdp_boxcox)
## X-squared = 42.683, df = 10, p-value = 5.665e-06
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 4 lags. 
## 
## Value of test-statistic is: 0.2013 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

## 
##  Box-Ljung test
## 
## data:  diff(diff(usgdp_boxcox, 12))
## X-squared = 38.319, df = 10, p-value = 3.339e-05
## 
##  Box-Pierce test
## 
## data:  diff(diff(diff(diff(usgdp_boxcox, 8)), lag = 10, type = "Ljung-Box"))
## X-squared = 86.346, df = 1, p-value < 2.2e-16
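
Since the differenced usgdp series could still carry a trend, one option (a suggestion, not something run in the original analysis) is the trend-stationarity form of the KPSS test, type = "tau" in urca::ur.kpss:

library(fpp2)
library(urca)
usgdp_boxcox <- BoxCox(usgdp, BoxCox.lambda(usgdp))
summary(ur.kpss(diff(usgdp_boxcox), type = "tau"))   # null hypothesis allows a deterministic trend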

For our mcopper series, ndiffs() shows that 1 difference is the most appropriate. The Ljung-Box test again shows a problem, even after differencing, but the differenced plot looks like white noise and the ACF now falls off sharply. The KPSS unit root test passes: there are no unit roots.

## 
##  Box-Ljung test
## 
## data:  mcopper
## X-squared = 3819, df = 10, p-value < 2.2e-16
## 
##  Box-Ljung test
## 
## data:  diff(mcopper)
## X-squared = 64.819, df = 10, p-value = 4.39e-10
## [1] "number of differences for mcopper series:"
## [1] 1

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 6 lags. 
## 
## Value of test-statistic is: 0.0573 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
##  Box-Ljung test
## 
## data:  diff(mcopper_boxcox)
## X-squared = 67.115, df = 10, p-value = 1.594e-10

The enplanements series appears to be seasonal, so we applied a seasonal difference first. Without differencing, the KPSS test indicates a unit root; with differencing, it goes away. After differencing, the ACF no longer decays slowly and the differenced series looks like white noise.

## 
##  Box-Ljung test
## 
## data:  enplanements
## X-squared = 2122.7, df = 10, p-value < 2.2e-16
## [1] "number of differences for enplanements series:"
## [1] 1
## [1] "number of seasonal differences for enplanements series:"
## [1] 1

## 
##  Box-Ljung test
## 
## data:  diff(enplanements_boxcox)
## X-squared = 159.09, df = 10, p-value < 2.2e-16
## 
##  Box-Ljung test
## 
## data:  diff(diff(enplanements_boxcox, 12))
## X-squared = 45.529, df = 10, p-value = 1.745e-06
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 5 lags. 
## 
## Value of test-statistic is: 4.3785 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0151
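
A minimal sketch of the seasonal-then-first differencing used here (lag 12 for monthly data); the full code is in the appendix:

library(fpp2)
library(urca)
enplanements_boxcox <- BoxCox(enplanements, BoxCox.lambda(enplanements))
enp_diff <- diff(diff(enplanements_boxcox, lag = 12))       # seasonal then first difference
Box.test(enp_diff, lag = 10, type = "Ljung-Box")
summary(ur.kpss(diff(enplanements_boxcox)))                 # KPSS after the first difference
acf(enp_diff)                                               # should no longer decay slowly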

Our visitors series appears to act like our enplanements series. It has seasonality.

## 
##  Box-Ljung test
## 
## data:  visitors
## X-squared = 1522.6, df = 10, p-value < 2.2e-16
## [1] "number of differences for visitors series:"
## [1] 1
## [1] 1

The ACF still shows seasonality after a first difference, so we have to apply a seasonal difference as well. The KPSS unit root test is no longer significant after differencing, and the differenced series appears to be white noise.

## 
##  Box-Ljung test
## 
## data:  diff(visitors_boxcox, 3)
## X-squared = 176.62, df = 10, p-value < 2.2e-16
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 4.5233
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0508

Question 8:5

The appropriate lambda for the Box-Cox transformation of our retail series is -0.0518. ndiffs() shows that 1 difference is appropriate, and we also take a seasonal difference: retail data is seasonal, and we can see that in the ACF as well as the original plot. After differencing, the unit root disappears and our data looks like a white noise process.

## [1] -0.05182901

## 
##  Box-Ljung test
## 
## data:  turnover
## X-squared = 2566.8, df = 10, p-value < 2.2e-16
## [1] 1
## [1] 1

## 
##  Box-Ljung test
## 
## data:  diff(diff(turnover_boxcox, 12))
## X-squared = 54.291, df = 10, p-value = 4.282e-08
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 6.1781
## 
## ####################################### 
## # KPSS Unit Root / Cointegration Test # 
## ####################################### 
## 
## The value of the test statistic is: 0.0215

Question 8:6

For our generated AR series, when alpha is high the value stays close to the previous value, and a directional change builds on previous changes. With a low alpha, the white noise overwhelms any carry-over from before and the series appears more random.
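
A minimal sketch of that AR(1) comparison (simulate_ar1 is just a helper written for this sketch; the full loops are in the appendix):

set.seed(1)
simulate_ar1 <- function(alpha, n = 100) {
  e <- rnorm(n)
  y <- numeric(n)
  for (i in 2:n) y[i] <- alpha * y[i - 1] + e[i]
  ts(y)
}
plot(simulate_ar1(0.9), main = "AR(1), alpha = 0.9")   # persistent, trend-like runs
plot(simulate_ar1(0.1), main = "AR(1), alpha = 0.1")   # closer to white noise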

An MA series has a different relationship: each value is built from the current and recent noise terms, and the previous value of the series itself has less effect. After q lags in an MA(q) series, earlier shocks have no effect at all, so the process looks more random. The AR series can reach 9 within our set of values; the MA series does not get that high.
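
The matching MA(1) sketch (again, simulate_ma1 is just an illustrative helper): each value depends only on the current and previous shock, so the series has a short memory:

set.seed(1)
simulate_ma1 <- function(theta, n = 100) {
  e <- rnorm(n)
  y <- numeric(n)
  for (i in 2:n) y[i] <- e[i] + theta * e[i - 1]
  ts(y)
}
plot(simulate_ma1(0.6), main = "MA(1), theta = 0.6")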

For our ARMA(1,1) with alpha 0.6 and beta 0.6, the series still looks fairly random even after 1000 values and never strays far from 0. The AR(2) process with alternating negative and positive coefficients, however, diverges: it grows in absolute value while bouncing back and forth between negative and positive.
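
A sketch of the two simulations this paragraph compares, an ARMA(1,1) with both coefficients 0.6 and the AR(2) with coefficients -0.8 and 0.3:

set.seed(1)
n <- 1000
e <- rnorm(n)
y_arma <- numeric(n)
for (i in 2:n) y_arma[i] <- 0.6 * y_arma[i - 1] + 0.6 * e[i - 1] + e[i]
y_ar2 <- numeric(100)
for (i in 3:100) y_ar2[i] <- -0.8 * y_ar2[i - 1] + 0.3 * y_ar2[i - 2] + e[i]
plot(ts(y_arma), main = "ARMA(1,1): stays near zero")
plot(ts(y_ar2),  main = "AR(2) with -0.8 and 0.3: oscillates and diverges")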

To see the behavior of our AR(2) model, we generated a few short series using only the previous values and no error term. We can see how the negative and positive coefficients make the values oscillate with growing absolute size: a negative value times -0.8 plus an earlier positive value times 0.3 gives a positive term, which in turn produces a negative term at the next step. This oscillation, together with 0.8 + 0.3 > 1, gives us the divergent behavior.
##  [1]  1.0  1.0 -0.5  0.7 -0.7  0.8 -0.8  0.9 -1.0  1.0 -1.1  1.2 -1.3  1.4
## [15] -1.5  1.6 -1.8  1.9 -2.1  2.2 -2.4  2.6 -2.8  3.0 -3.2  3.5 -3.8  4.1
## [29] -4.4  4.7
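
That recursion, written as a minimal sketch with the same starting values of 1 and no noise term, should reproduce the oscillating, growing sequence above:

y <- c(1, 1)                                   # two starting values of 1
for (i in 3:30) y[i] <- -0.8 * y[i - 1] + 0.3 * y[i - 2]
round(y, 1)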

Question 8:7

After looking at the ACF and PACF, an AR(1) model is a possibility. The fitted AR(1) model has an alpha of 0.9666, which is close to 1, and the KPSS and Ljung-Box tests indicate that the series is not stationary. ndiffs() indicates that 2 differences should be applied. After a single difference, the Ljung-Box test is no longer significant, and the KPSS unit root test is no longer significant after 2 differences. The AR(1) model has a non-zero mean near 1, but the ARIMA(1,2,1) model has a mean of 0.

## Series: murder_set_boxcox 
## ARIMA(1,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1    mean
##       0.9666  1.0012
## s.e.  0.0269  0.1594
## 
## sigma^2 estimated as 0.002968:  log likelihood=81.66
## AIC=-157.32   AICc=-156.85   BIC=-151.3
## [1] 2

After differencing, our data now looks stationary, and the inverse AR and MA roots are inside the unit circle.

## Series: murder_set_boxcox 
## ARIMA(1,2,1) 
## 
## Coefficients:
##           ar1     ma1
##       -0.3006  -0.786
## s.e.   0.1529   0.119
## 
## sigma^2 estimated as 0.002851:  log likelihood=80.37
## AIC=-154.74   AICc=-154.25   BIC=-148.83

## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.6745 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.5466 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.0532 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739
## 
##  Box-Ljung test
## 
## data:  murder_set_boxcox
## X-squared = 257.04, df = 10, p-value < 2.2e-16
## 
##  Box-Ljung test
## 
## data:  diff(murder_set_boxcox)
## X-squared = 13.717, df = 10, p-value = 0.1863
##      Point Forecast     Lo 80     Hi 80     Lo 95     Hi 95
## 2005      0.8727642 0.8056456 0.9398828 0.7701151 0.9754133
## 2006      0.8394207 0.7485168 0.9303246 0.7003952 0.9784463
## 2007      0.8050389 0.6856365 0.9244414 0.6224286 0.9876492
## Time Series:
## Start = 2002 
## End = 2004 
## Frequency = 1 
## [1] 0.9799722 0.9348673 0.9095622

Our model has coefficients alpha = -0.3006 and beta = -0.786. In terms of the backshift operator, the ARIMA(1,2,1) equation is \((1 + 0.3006B)(1 - B)^2 y_t = (1 - 0.786B) w_t\).
Predicting the next three values from our model by hand is not a trivial matter. The epsilon terms in a time series model are innovations, not ordinary residuals; for the MA part they are not trivial to recover, and I could not find them in the printed output.
The auto.arima() function returns a (1,2,1) model, indicating that 2 differences and an ARMA(1,1) structure are appropriate. This matches what we decided.
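
One hedged workaround for the missing epsilon terms (a sketch, not the method used above): residuals() on the fitted model returns the estimated innovations, which can stand in for epsilon in a hand-computed one-step forecast of the ARIMA(1,2,1).

library(fpp2)
murder_set_boxcox <- BoxCox(wmurders, BoxCox.lambda(wmurders))
fit <- Arima(murder_set_boxcox, order = c(1, 2, 1))
phi   <- coef(fit)["ar1"]
theta <- coef(fit)["ma1"]
y <- as.numeric(murder_set_boxcox)
z <- diff(y, differences = 2)            # z_t = (1 - B)^2 y_t
e <- as.numeric(residuals(fit))          # estimated innovations (the epsilon terms)
n <- length(y)
z_hat <- phi * tail(z, 1) + theta * tail(e, 1)   # z_{T+1} = phi * z_T + theta * e_T
y_hat <- unname(z_hat + 2 * y[n] - y[n - 1])     # undo the double difference
y_hat
forecast(fit, h = 1)$mean                # should agree up to small numerical differences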

Appendix

exercises are from: https://otexts.com/fpp2/arima-exercises.html

autoplot(ibmclose) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12))
acf(ibmclose)
pacf(ibmclose)

autoplot(usnetelec) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('')
autoplot(usgdp) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('')
autoplot(mcopper) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('')
autoplot(enplanements) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('')
autoplot(visitors) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('')

#------------- Box-Cox lambda table
lambda_table <- c("usnetelec lambda", "usgdp lambda", "mcopper lambda", "enplanements lambda", "visitors lambda")
lambda_table <- cbind(lambda_table, lambda_table)
colnames(lambda_table) <- c('', '')
lambda_table[1, 2] <- BoxCox.lambda(usnetelec)
lambda_table[2, 2] <- BoxCox.lambda(usgdp)
lambda_table[3, 2] <- BoxCox.lambda(mcopper)
lambda_table[4, 2] <- BoxCox.lambda(enplanements)
lambda_table[5, 2] <- BoxCox.lambda(visitors)
kable(lambda_table, "html") %>% kable_styling("striped", full_width = F) %>% column_spec(1, bold = T, color = "white", background = "#73b587") %>% column_spec(2, bold = T, color = "#73b587", background = "white")

#------------- usnetelec difference
lambda <- BoxCox.lambda(usnetelec)
usnetelec_boxcox <- BoxCox(usnetelec, lambda)
acf(usnetelec)
pacf(usnetelec)
Box.test(usnetelec, lag = 10, type = "Ljung-Box")

print("number of differences for usnetelec series:")
ndiffs(usnetelec_boxcox)
Box.test(diff(usnetelec_boxcox), lag = 10, type = "Ljung-Box")
ur.kpss(diff(usnetelec_boxcox)) %>% summary()
ur.kpss(diff(diff(usnetelec_boxcox))) %>% summary()
autoplot(diff(diff(usnetelec_boxcox))) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced usnetelec')
acf(diff(diff(usnetelec)))
pacf(diff(diff(usnetelec)))

#------------- usgdp difference
lambda <- BoxCox.lambda(usgdp)
usgdp_boxcox <- BoxCox(usgdp, lambda)
acf(usgdp)
pacf(usgdp)
print("number of differences for usgdp series:")
ndiffs(usgdp_boxcox)
Box.test(usgdp, lag = 10, type = "Ljung-Box")
Box.test(diff(usgdp_boxcox), lag = 10, type = "Ljung-Box")
ur.kpss(diff(usgdp_boxcox)) %>% summary()
autoplot(diff(usgdp_boxcox, 12)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced usgdp')
Box.test(diff(diff(usgdp_boxcox, 12)), lag = 10, type = "Ljung-Box")
# note: in the original run the lag/type arguments were nested inside diff(),
# which is why the output above shows a default Box-Pierce test with df = 1
Box.test(diff(diff(diff(diff(usgdp_boxcox, 8)))), lag = 10, type = "Ljung-Box")
autoplot(diff(usgdp_boxcox, 12)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced usgdp')
acf(diff(usgdp_boxcox))

#------------- mcopper difference
lambda <- BoxCox.lambda(mcopper)
mcopper_boxcox <- BoxCox(mcopper, lambda)
acf(mcopper)
pacf(mcopper)
Box.test(mcopper, lag = 10, type = "Ljung-Box")
Box.test(diff(mcopper), lag = 10, type = "Ljung-Box")
print("number of differences for mcopper series:")
ndiffs(mcopper_boxcox)
acf(diff(mcopper_boxcox))
pacf(diff(mcopper_boxcox))
ur.kpss(diff(mcopper_boxcox)) %>% summary()
Box.test(diff(mcopper_boxcox), lag = 10, type = "Ljung-Box")
autoplot(diff(mcopper_boxcox)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced mcopper')

#------------- enplanements difference
lambda <- BoxCox.lambda(enplanements)
enplanements_boxcox <- BoxCox(enplanements, lambda)
acf(enplanements)
pacf(enplanements)
Box.test(enplanements, lag = 10, type = "Ljung-Box")
print("number of differences for enplanements series:")
ndiffs(enplanements_boxcox)
print("number of seasonal differences for enplanements series:")
nsdiffs(enplanements_boxcox)
acf(diff(diff(enplanements_boxcox, 12)))
pacf(diff(diff(enplanements_boxcox, 12)))
Box.test(diff(enplanements_boxcox), lag = 10, type = "Ljung-Box")
Box.test(diff(diff(enplanements_boxcox, 12)), lag = 10, type = "Ljung-Box")
ur.kpss(enplanements_boxcox) %>% summary()
ur.kpss(diff(enplanements_boxcox))
autoplot(diff(enplanements_boxcox)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced enplanements')

#------------- visitors difference
lambda <- BoxCox.lambda(visitors)
visitors_boxcox <- BoxCox(visitors, lambda)
acf(visitors)
pacf(visitors)
Box.test(visitors, lag = 10, type = "Ljung-Box")
print("number of differences for visitors series:")
ndiffs(visitors_boxcox)
nsdiffs(visitors_boxcox)
acf(diff(visitors_boxcox))
pacf(diff(visitors_boxcox))

acf(diff(diff(visitors_boxcox, 12)))
pacf(diff(diff(visitors_boxcox, 12)))
Box.test(diff(visitors_boxcox, 3), lag = 10, type = "Ljung-Box")
ur.kpss(visitors_boxcox)
ur.kpss(diff(visitors_boxcox, 3))
autoplot(diff(diff(visitors_boxcox, 12))) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced visitors')

retaildata <- readxl::read_excel("C:/Users/dawig/Desktop/Data624/retail.xlsx", skip = 1)
turnover <- ts(retaildata[, "A3349608L"], frequency = 12, start = c(1982, 4))
autoplot(turnover, ylab = "turnover", xlab = "") + theme(panel.background = element_rect(fill = '#efeae8')) + ggtitle("clothing, footwear and personal accessory turnover series") + theme(text = element_text(family = "corben", color = '#249382', size = 12), panel.background = element_rect(fill = '#f4f4ef'))
lambda <- BoxCox.lambda(turnover)
lambda
turnover_boxcox <- BoxCox(turnover, lambda)
acf(turnover)
pacf(turnover)
Box.test(turnover, lag = 10, type = "Ljung-Box")
ndiffs(turnover_boxcox)
nsdiffs(turnover_boxcox)
acf(diff(turnover_boxcox))
pacf(diff(turnover_boxcox))
Box.test(diff(diff(turnover_boxcox, 12)), lag = 10, type = "Ljung-Box")
ur.kpss(turnover_boxcox)
ur.kpss(diff(turnover_boxcox))
autoplot(diff(turnover_boxcox)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('differenced turnover')

# To generate AR series (the * operators lost in rendering are restored here)
y <- ts(numeric(100))
e <- rnorm(100)
for (i in 2:100) y[i] <- 0.6 * y[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-1 with alpha=.6')
for (i in 2:100) y[i] <- 0.9 * y[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-1 with alpha=.9')
for (i in 2:100) y[i] <- 0.4 * y[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-1 with alpha=.4')
for (i in 2:100) y[i] <- 0.1 * y[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-1 with alpha=.1')
for (i in 2:100) y[i] <- 0.99 * y[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-1 with alpha=.99')
for (i in 2:100) y[i] <- 0.01 * y[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-1 with alpha=.01')

# to generate MA series
y <- ts(numeric(100))
e <- rnorm(100)
for (i in 2:100) y[i] <- e[i] + .6 * e[i-1]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('MA-1 with beta=.6')
for (i in 2:100) y[i] <- e[i] + .9 * e[i-1]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('MA-1 with beta=.9')
for (i in 6:100) y[i] <- e[i] + .99 * e[i-1] + .99 * e[i-2] + .99 * e[i-3] + .99 * e[i-4] + .99 * e[i-5]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('MA-5 with beta=.99')

# To generate ARMA series
y <- ts(numeric(1000))
e <- rnorm(1000)
for (i in 2:1000) y[i] <- 0.6 * y[i-1] + 0.6 * e[i-1] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('ARMA with alpha, beta=.6')
y <- ts(numeric(100))
for (i in 3:100) y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]
autoplot(y) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('AR-2 with alpha1=-.8, alpha2=.3')

options(scipen = 10)
# noise-free AR(2) recursion to show the oscillating, growing pattern
test_series_1 <- c(rep(1, 30))
test_series_2 <- c(rep(1, 30))
test_series_3 <- c(rep(1, 30))
for (i in 3:30) {
  test_series_2[i] <- test_series_1[i-1] * -.8
  test_series_3[i] <- test_series_1[i-2] * .3
  test_series_1[i] <- test_series_2[i] + test_series_3[i]
}
test_series_1 <- sapply(test_series_1, function(x) round(x, 1))
test_series_1

murder_set <- wmurders
lambda <- BoxCox.lambda(murder_set)
murder_set_boxcox <- BoxCox(murder_set, lambda)
autoplot(murder_set_boxcox) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('')

acf(murder_set_boxcox)
pacf(murder_set_boxcox)
roots_vis <- arima(murder_set_boxcox, c(1, 0, 0), include.mean = FALSE, optim.control = list(maxit = 50))
plot(roots_vis)
Arima(murder_set_boxcox, order = c(1, 0, 0))
ndiffs(murder_set_boxcox)

acf(diff(diff(murder_set_boxcox)))
pacf(diff(diff(murder_set_boxcox)))

autoplot(diff(murder_set_boxcox)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('first difference')
autoplot(diff(diff(murder_set_boxcox))) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('second difference')

auto.arima(murder_set_boxcox)

roots_vis <- arima(murder_set_boxcox, c(1, 2, 1), include.mean = FALSE, optim.control = list(maxit = 50))
plot(roots_vis)
plot(murder_set_boxcox)
plot(diff(murder_set_boxcox))
plot(diff(diff(murder_set_boxcox)))
ur.kpss(murder_set_boxcox) %>% summary()
ur.kpss(diff(murder_set_boxcox)) %>% summary()
ur.kpss(diff(diff(murder_set_boxcox))) %>% summary()
Box.test(murder_set_boxcox, lag = 10, type = "Ljung-Box")
Box.test(diff(murder_set_boxcox), lag = 10, type = "Ljung-Box")

predictive_model <- arima(murder_set_boxcox, c(1, 2, 1), include.mean = FALSE, optim.control = list(maxit = 50))
predictive_model %>% forecast(h = 3)
window(murder_set_boxcox, start = c(2002))

autoplot(forecast(predictive_model, h = 3)) + theme(panel.background = element_rect(fill = '#f4f4ef'), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), text = element_text(family = 'corben', color = '#249382', size = 12)) + xlab('') + ylab('')