Generate a time series of 100 observations and obtain the mean, variance, ACVF and ACF of the following models:
AR(1) Model
AR(3) Model
AR(p) Model
Draw the ACF and PACF plots
ACF is the (complete) auto-correlation function, which gives the auto-correlation of a series with its lagged values. We plot these values along with a confidence band and, tada, we have an ACF plot. In simple terms, it describes how well the present value of the series is related to its past values. A time series can have components such as trend, seasonality, cyclic behaviour and residuals. ACF considers all of these components while finding correlations, hence it is a ‘complete’ auto-correlation plot.
PACF is the partial auto-correlation function. Instead of correlating the present value with its lags directly, as ACF does, it correlates the residuals (what remains after removing the effects already explained by earlier lags) with the next lag value; hence ‘partial’ rather than ‘complete’, because already-explained variation is removed before the next correlation is computed. If there is hidden information in the residual that can be modelled by the next lag, we will see a good correlation and keep that lag as a feature when modelling. Remember that while modelling we do not want to keep too many correlated features, as that can create multicollinearity issues; we need to retain only the relevant ones.
Autoregressive (AR) process: a time series is said to be AR when its present value can be obtained from previous values of the same series, i.e. the present value is a weighted average of its past values plus a noise term. Stock prices and global temperature rise can be thought of as AR processes. An AR process of order p can be written as \[y_{t}=c+\phi_{1} y_{t-1}+\phi_{2} y_{t-2}+\cdots+\phi_{p} y_{t-p}+\varepsilon_{t}\]
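To make this recursion concrete, here is a minimal sketch that simulates an AR(p) series directly from the equation above; the coefficients, constant and sample size are arbitrary illustrative choices (the exercise itself uses arima.sim below).
# Sketch: simulate an AR(p) series directly from the defining recursion.
# phi, c0 and n are illustrative choices, not prescribed by the exercise.
set.seed(1)
phi <- c(0.5, 0.2)        # AR coefficients (here p = 2)
c0  <- 0                  # constant term c
n   <- 100
p   <- length(phi)
eps <- rnorm(n + p)       # white-noise innovations
y   <- numeric(n + p)     # first p values act as a burn-in start at zero
for (t in (p + 1):(n + p)) {
  y[t] <- c0 + sum(phi * y[(t - 1):(t - p)]) + eps[t]
}
y <- ts(y[-(1:p)])        # drop the start-up values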
The order p is the lag after which the PACF cuts off, i.e. the last lag at which the PACF spike crosses the confidence band. These p lags act as our features when forecasting the AR time series. We cannot use the ACF plot here because it shows good correlations even for lags far in the past; if we took that many features, we would have multicollinearity issues. This is not a problem with the PACF plot, since it removes the components already explained by earlier lags, so we only get the lags that are correlated with the residual, i.e. the component not explained by earlier lags.
We expect the PACF to fall off sharply after the first few lags, because the lags closest to the present capture the variation so well that we do not need more distant lags to predict the present value.
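As a quick illustration of this cut-off (not part of the exercise output below), one could simulate an AR series of known order and inspect its sample PACF; the order, coefficients and sample size here are arbitrary choices.
# Sketch: the sample PACF of a simulated AR(2) series should cut off after lag 2.
set.seed(2)
x <- arima.sim(model = list(order = c(2, 0, 0), ar = c(0.5, 0.3)), n = 500)
pacf(x, lag.max = 20, main = "PACF of a simulated AR(2) series")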
A \(p\)th-order autoregression, denoted \(AR(p)\), satisfies \[ Y_{t}=c+\phi_{1} Y_{t-1}+\phi_{2} Y_{t-2}+\cdots+\phi_{p} Y_{t-p}+\varepsilon_{t}. \] Provided that the roots of \[ 1-\phi_{1} z-\phi_{2} z^{2}-\cdots-\phi_{p} z^{p}=0 \]
all lie outside the unit circle, it is straightforward to verify that a covariance-stationary representation of the form \[ Y_{t}=\mu+\psi(L) \varepsilon_{t} \]
exists, where \[ \psi(L)=\left(1-\phi_{1} L-\phi_{2} L^{2}-\cdots-\phi_{p} L^{p}\right)^{-1} \] and \(\sum_{j=0}^{\infty}\left|\psi_{j}\right|<\infty\). Assuming that the stationarity condition is satisfied, one way to find the mean is to take expectations of the \(AR(p)\) equation: \[ \mu=c+\phi_{1} \mu+\phi_{2} \mu+\cdots+\phi_{p} \mu, \] or \[ \mu=c /\left(1-\phi_{1}-\phi_{2}-\cdots-\phi_{p}\right). \] Further, \[ Y_{t}-\mu=\phi_{1}\left(Y_{t-1}-\mu\right)+\phi_{2}\left(Y_{t-2}-\mu\right)+\cdots+\phi_{p}\left(Y_{t-p}-\mu\right)+\varepsilon_{t}. \]
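These conditions can be checked numerically. Below is a minimal sketch, assuming an illustrative AR(3) with \(\phi_{1}=\phi_{2}=\phi_{3}=0.1\) and constant \(c=2\) (the simulations below use \(c=0\)); polyroot checks the root condition, and the mean follows from the formula above.
# Sketch: check the stationarity condition and the mean formula for an example AR(3).
phi <- c(0.1, 0.1, 0.1)        # illustrative AR coefficients
c0  <- 2                       # illustrative constant term c

# Roots of 1 - phi1*z - phi2*z^2 - phi3*z^3 = 0 must lie outside the unit circle.
roots <- polyroot(c(1, -phi))
all(Mod(roots) > 1)            # TRUE -> covariance-stationary

# Mean implied by mu = c / (1 - phi1 - ... - phip)
c0 / (1 - sum(phi))            # 2 / 0.7 = 2.857...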
# Simulate 100 observations from an AR(1) process with phi = 0.5 (c = 0, sigma^2 = 1)
AR1<-arima.sim(model=list(order=c(1,0,0),ar=0.5),n=100)
AR1
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] -2.791790352 -1.401859717 -0.528266985 -1.188401065 0.356027833
## [6] 1.140964864 1.903978448 0.024386935 0.784300333 0.127441276
## [11] -0.088827778 1.002242771 -0.871075428 -0.748530318 -0.763960004
## [16] -0.440869048 1.082819723 1.089351030 1.624760348 -1.133917647
## [21] -0.620349163 -1.669107422 -1.445963636 0.431655662 -1.823440263
## [26] -1.129113922 0.125676751 1.922082608 1.025382700 -0.087963834
## [31] -0.172632488 -0.310836050 -0.897717502 -1.349966716 -0.069916423
## [36] -1.002646950 -0.868260932 0.547535052 0.046006431 0.486215845
## [41] -1.218903590 1.289597120 0.913697349 2.156683225 0.003559838
## [46] -0.646026557 -0.587572523 0.703949795 0.635404388 0.862395660
## [51] 0.162516844 0.208570526 -0.354490111 -1.184971223 0.385044069
## [56] 0.098276703 -0.639269834 -0.115830404 -0.809105385 -1.296580586
## [61] -1.832428303 0.095850588 -0.216595681 -1.762141130 0.394028777
## [66] 1.699001932 1.739740081 0.837076438 -2.105747539 -1.125201228
## [71] -2.606053314 -2.902715630 -2.057845062 -1.788791937 0.315227073
## [76] 0.086225272 1.255637295 -0.439053976 0.230405513 -0.084804557
## [81] 0.337370487 0.304868070 1.174913524 1.491710312 0.976733828
## [86] 1.089886516 0.271008019 1.653392771 1.463131618 1.449168647
## [91] 0.612126956 0.226229237 -0.969868197 -0.907305608 -0.677362960
## [96] -0.179862333 0.725104920 2.046835719 1.350540126 0.408005999
plot(AR1,main=" AR1 Plot", col='blue')
mean(AR1)
## [1] -0.04539197
var(AR1)
## [1] 1.290105
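For comparison, here is a sketch of the theoretical moments of this AR(1) process: the simulation uses \(c=0\) and \(\sigma^{2}=1\), so the true mean is 0 and the true variance is \(\sigma^{2}/(1-\phi^{2})\approx 1.33\), close to the sample values above.
# Sketch: theoretical moments of the simulated AR(1) process (phi = 0.5, sigma^2 = 1, c = 0).
phi    <- 0.5
sigma2 <- 1
mu     <- 0                         # c / (1 - phi) with c = 0
gamma0 <- sigma2 / (1 - phi^2)      # theoretical variance = 1.333...
acvf_theo <- gamma0 * phi^(0:20)    # theoretical ACVF: gamma(h) = gamma0 * phi^h
acf_theo  <- phi^(0:20)             # theoretical ACF:  rho(h)  = phi^h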
par(mfrow=c(1,3))
acf<-acf(AR1,lag.max=20,type=c("correlation"),main="acf plot",col='red');acf
##
## Autocorrelations of series 'AR1', by lag
##
## 0 1 2 3 4 5 6 7 8 9 10
## 1.000 0.505 0.253 -0.013 -0.119 -0.154 -0.210 -0.046 -0.048 0.078 0.120
## 11 12 13 14 15 16 17 18 19 20
## 0.103 0.094 -0.023 -0.044 -0.096 -0.148 -0.175 -0.117 -0.072 -0.007
acvf<-acf(AR1,lag.max=20,type=c("covariance"),main="acvf plot",col='green');acvf
##
## Autocovariances of series 'AR1', by lag
##
## 0 1 2 3 4 5 6 7
## 1.27720 0.64561 0.32343 -0.01604 -0.15143 -0.19696 -0.26857 -0.05849
## 8 9 10 11 12 13 14 15
## -0.06142 0.09914 0.15380 0.13193 0.11996 -0.02953 -0.05562 -0.12202
## 16 17 18 19 20
## -0.18915 -0.22400 -0.14944 -0.09172 -0.00865
pacf<-pacf(AR1,lag.max=20,main="pacf plot",col='red');pacf
##
## Partial autocorrelations of series 'AR1', by lag
##
## 1 2 3 4 5 6 7 8 9 10 11
## 0.505 -0.003 -0.187 -0.057 -0.035 -0.136 0.157 -0.086 0.089 0.064 -0.036
## 12 13 14 15 16 17 18 19 20
## 0.029 -0.060 -0.026 0.006 -0.120 -0.078 0.052 -0.087 0.036
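To connect these sample quantities back to the model, one possible sanity check (a sketch, not part of the original exercise) is to compare them with the theoretical ACF/PACF from ARMAacf and to re-estimate the AR coefficient with arima; with only 100 observations the estimate will only be roughly 0.5.
# Sketch: theoretical ACF/PACF for phi = 0.5 and a re-estimated AR(1) fit.
ARMAacf(ar = 0.5, lag.max = 20)               # theoretical ACF: rho(h) = 0.5^h
ARMAacf(ar = 0.5, lag.max = 20, pacf = TRUE)  # theoretical PACF: 0.5 at lag 1, 0 afterwards
arima(AR1, order = c(1, 0, 0))                # estimated AR coefficient should be near 0.5
Note also that the lag-0 value in the ACVF output above (1.277) is slightly smaller than var(AR1) (1.290): acf() divides by n, whereas var() divides by n - 1.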
# Simulate 100 observations from an AR(3) process with phi1 = phi2 = phi3 = 0.1
AR3<-arima.sim(model=list(order=c(3,0,0),ar=c(0.1,0.1,0.1)),n=100)
AR3
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 0.83991143 1.43171282 0.77662409 -0.36850628 -0.31586887 1.06467435
## [7] 0.12539017 -0.36504329 0.78360823 0.06016940 -0.97074668 -0.35349200
## [13] -2.39801773 -0.61655828 0.23739020 -0.75846774 -1.46013503 -0.64051753
## [19] 1.14482246 0.40837025 -0.94497962 1.77383824 0.86208232 -0.42891533
## [25] 1.40479231 0.75633624 -0.40317771 0.23770896 -1.03070940 -0.07933099
## [31] -0.73727152 1.49622413 1.58036858 0.70034270 0.65516851 0.67436318
## [37] 0.74592911 -0.87325272 -0.37917156 0.29808972 -0.28705969 -1.14390148
## [43] -0.56633550 0.66598833 1.15991360 0.11650757 -1.78667048 0.88318545
## [49] -0.80572000 -0.52614710 -0.23876714 -2.45908139 0.62881889 0.95598438
## [55] 2.07675111 -0.40530588 1.51639968 -0.62444580 0.42200172 -0.63768818
## [61] 0.03938523 -1.65892602 0.27329747 -0.63916997 -0.60589146 0.39188766
## [67] -0.51094170 -0.04882894 -0.80858022 1.99077727 -1.68658979 0.14526733
## [73] -0.38176627 -0.04477148 -0.23531055 0.32532546 0.18998084 1.30096251
## [79] -1.34743644 0.44116110 -1.46358938 -2.35276769 -0.52872372 -1.39910571
## [85] -3.00951284 0.13724424 -1.52933144 -0.09863215 1.36633507 -0.47509440
## [91] -0.03937219 -0.92083069 -0.16991617 0.06033894 -0.07336577 1.24942085
## [97] -0.75192204 1.85481166 -1.64000359 -0.60644593
plot(AR3, main=" AR3 Plot", col='blue')
mean(AR3)
## [1] -0.09382448
var(AR3)
## [1] 1.066067
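As with the AR(1) case, a sketch of the theoretical moments provides a sanity check: the true mean is 0, and the variance follows from \(\gamma_{0}=\sigma^{2}/(1-\phi_{1}\rho_{1}-\phi_{2}\rho_{2}-\phi_{3}\rho_{3})\), using the theoretical autocorrelations from ARMAacf.
# Sketch: theoretical variance of the AR(3) process (phi1 = phi2 = phi3 = 0.1, sigma^2 = 1).
phi    <- c(0.1, 0.1, 0.1)
sigma2 <- 1
rho    <- ARMAacf(ar = phi, lag.max = 3)[-1]   # theoretical rho(1), rho(2), rho(3)
gamma0 <- sigma2 / (1 - sum(phi * rho))        # theoretical variance, about 1.04
gamma0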
par(mfrow=c(1,3))
acf<-acf(AR3,lag.max=20,type=c("correlation"),main="acf plot",col='red');acf
##
## Autocorrelations of series 'AR3', by lag
##
## 0 1 2 3 4 5 6 7 8 9 10
## 1.000 0.007 0.167 0.092 0.017 -0.073 -0.060 -0.119 -0.040 -0.017 0.020
## 11 12 13 14 15 16 17 18 19 20
## -0.007 -0.067 0.101 -0.077 -0.058 -0.079 -0.012 0.032 -0.012 0.067
acvf<-acf(AR3,lag.max=20,type=c("covariance"),main="acvf plot",col='green');acvf
##
## Autocovariances of series 'AR3', by lag
##
## 0 1 2 3 4 5 6 7
## 1.05541 0.00762 0.17641 0.09754 0.01841 -0.07687 -0.06310 -0.12603
## 8 9 10 11 12 13 14 15
## -0.04190 -0.01770 0.02134 -0.00727 -0.07047 0.10648 -0.08127 -0.06086
## 16 17 18 19 20
## -0.08368 -0.01279 0.03340 -0.01261 0.07091
pacf<-pacf(AR3,lag.max=20,main="pacf plot",col='red');pacf
##
## Partial autocorrelations of series 'AR3', by lag
##
## 1 2 3 4 5 6 7 8 9 10 11
## 0.007 0.167 0.093 -0.011 -0.108 -0.075 -0.096 -0.002 0.037 0.049 -0.012
## 12 13 14 15 16 17 18 19 20
## -0.110 0.078 -0.063 -0.074 -0.075 0.016 0.087 -0.009 0.052
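For the general AR(p) model in the exercise list, the same steps can be wrapped in a small helper; this is only a sketch, and the function name, default arguments and example coefficients are illustrative choices rather than anything prescribed by the exercise.
# Sketch: a helper wrapping the workflow above for an arbitrary stationary AR(p) model.
simulate_ar <- function(phi, n = 100, lag.max = 20) {
  x <- arima.sim(model = list(order = c(length(phi), 0, 0), ar = phi), n = n)
  par(mfrow = c(1, 3))
  list(
    series = x,
    mean   = mean(x),
    var    = var(x),
    acf    = acf(x, lag.max = lag.max, type = "correlation", main = "acf plot"),
    acvf   = acf(x, lag.max = lag.max, type = "covariance", main = "acvf plot"),
    pacf   = pacf(x, lag.max = lag.max, main = "pacf plot")
  )
}

# Example: an AR(4) model with arbitrary coefficients satisfying the stationarity condition.
out <- simulate_ar(phi = c(0.4, 0.2, 0.1, 0.1))
out$mean; out$var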