ACF and PACF with different values of ϕ.

I am going to simulate four AR(1) series with different strengths of positive autocorrelation. We can look at the plot of the data and then look at the ACF and PACF. We should see the series get rougher, with more step-to-step variation, as the autocorrelation decreases.

set.seed(2008)
n <- 100
epsilon <- rnorm(n = n,mean = 0,sd = 1) # white noise
x1 <- numeric(length = n)
x2 <- numeric(length = n)
x3 <- numeric(length = n)
x4 <- numeric(length = n)
for(i in 2:n){
  x1[i] <- 0.95 * x1[i-1] + epsilon[i]  
  x2[i] <- 0.75 * x2[i-1] + epsilon[i]  
  x3[i] <- 0.50 * x3[i-1] + epsilon[i]  
  x4[i] <- 0.25 * x4[i-1] + epsilon[i]  
}
x1<-ts(x1)
str(x1);str(x2);str(x3);str(x4)
##  Time-Series [1:100] from 1 to 100: 0 -0.089 0.258 0.304 0.612 ...
##  num [1:100] 0 -0.089 0.275 0.266 0.523 ...
##  num [1:100] 0 -0.089 0.298 0.208 0.427 ...
##  num [1:100] 0 -0.089 0.32 0.139 0.358 ...
plot(cbind(x1,x2,x3,x4))

As we were expecting, there is less step-to-step variation in the more autocorrelated simulations. With n = 100, the pattern is not that dramatic to the eye. The ACF should show the pattern more clearly.
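To pin down what "variation" means here, this is a minimal sketch of the textbook AR(1) results, assuming a stationary series with innovation sd of 1. The names sigma, var_level and var_diff are mine, not from the original analysis. The overall variance of the series grows with ϕ, while the variance of the step-to-step changes shrinks, which is the smoothness the eye picks up in the plot.

```r
# Innovation sd and the AR(1) coefficients used in the loop above
sigma <- 1
phi <- c(0.95, 0.75, 0.50, 0.25)

# Stationary variance of an AR(1): sigma^2 / (1 - phi^2)
var_level <- sigma^2 / (1 - phi^2)

# Variance of the first differences x[t] - x[t-1]: 2 * sigma^2 / (1 + phi)
var_diff <- 2 * sigma^2 / (1 + phi)

round(data.frame(phi, var_level, var_diff), 3)

# Sample counterparts, if x1..x4 from the loop above are in the workspace:
# sapply(list(x1, x2, x3, x4), function(x) c(var(x), var(diff(x))))
```

The loop above starts every series at 0 and only runs 100 steps, so the sample values will only roughly match these numbers, especially for ϕ = 0.95.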

par(mfrow=c(2,2))

acf(x1)
acf(x2)
acf(x3)
acf(x4)

You can see the ‘echo’ of the AR(1) pattern in the ACF plots. The PACF should account for that echo completely, so we should only see a spike at lag 1.

par(mfrow=c(2,2))

pacf(x1)
pacf(x2)
pacf(x3)
pacf(x4)

Great. No pattern beyond lag 1, where each spike reflects the value of ϕ we set.
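For comparison with the sample plots, the theoretical ACF and PACF of an AR(1) can be computed with ARMAacf from the stats package. This is a sketch for ϕ = 0.75 only. The object names theo_acf and theo_pacf are mine.

```r
phi <- 0.75

# Theoretical ACF at lags 0..5: decays geometrically as phi^k (the 'echo')
theo_acf <- ARMAacf(ar = phi, lag.max = 5)

# Theoretical PACF at lags 1..5: phi at lag 1, zero afterwards
theo_pacf <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

round(theo_acf, 3)
round(theo_pacf, 3)
```

The sample versions from acf() and pacf() should scatter around these values, more tightly as n grows.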

Now with arima.sim

Now we will use arima.sim to replace the loop from above, which was provided by Dr. Andy Bunn.

We should see the same pattern of increasing step-to-step variation as we decrease ϕ.

set.seed(2008)
ar.sim.1<-arima.sim(model=list(ar=c(.95)),n=100,rand.gen = rnorm,sd=1,mean=0) 

ar.sim.2<-arima.sim(model=list(ar=c(.75)),n=100,rand.gen = rnorm,sd=1,mean=0) 

ar.sim.3<-arima.sim(model=list(ar=c(.5)),n=100,rand.gen = rnorm,sd=1,mean=0) 

ar.sim.4<-arima.sim(model=list(ar=c(.25)),n=100,rand.gen = rnorm,sd=1,mean=0) 


plot(cbind(ar.sim.1,ar.sim.2,ar.sim.3,ar.sim.4))

Wow, the math works. Somehow ar.sim.3 and ar.sim.2 look almost the same, so we should see what the ACF and PACF look like.

par(mfrow=c(2,2))

acf(ar.sim.1)
acf(ar.sim.2)
acf(ar.sim.3)
acf(ar.sim.4)

Looks great. This is the same kind of pattern that we expected and that we saw above. The lag-1 autocorrelation of ar.sim.4 is very low, well below the 0.25 we set. Let’s look at the PACF and look further in.

par(mfrow=c(2,2))

pacf(ar.sim.1)
pacf(ar.sim.2)
pacf(ar.sim.3)
pacf(ar.sim.4)

We see the echo drop away, and the lag-1 value for ar.sim.4 is indeed even smaller than what it was set to. With a small n, a weak signal gets lost in the noise.
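A rough way to see why ϕ = 0.25 struggles at n = 100: the dashed lines that acf() and pacf() draw by default sit at about ±1.96/sqrt(n), the approximate 95% band under white noise. A small sketch (the name band is mine):

```r
n <- 100      # series length used in the simulations above
phi <- 0.25   # weakest AR(1) coefficient used above

# Approximate 95% white-noise band drawn on the acf()/pacf() plots
band <- qnorm(0.975) / sqrt(n)
round(band, 3)   # 0.196

# How far the true lag-1 value sits above the band
phi - band
```

With the true value only about 0.05 above the band, ordinary sampling noise can easily pull the estimate below it.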

Roll your own AR(2)

Now I am going to make an AR(2) model using a loop similar to the one above. I will set ϕ1 and ϕ2 to 0.5 and see how the ACF and PACF parse out what is going on. I have upped n to 1000 as well.

set.seed(2008)
n<-1000
eps<-rnorm(n=n, mean =0, sd=1)
y<-numeric(length = n)
phi.1<-0.5
phi.2<-0.5
for (i in 3:n) {
  y[i]<-phi.1*y[i-1]+phi.2*y[i-2]+eps[i]
}

par(mfrow=c(1,1))
plot(y, type = "l")

acf(y, lag.max = 100)

pacf(y)

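You can see that the series is very regular due to the double layer of autocorrelation. The ACF returns an echo that extends very far out. Oddly, the PACF regularly returns values under 0.5 no matter the set.seed given. One likely reason is that ϕ1 + ϕ2 = 1 puts this model right on the boundary of stationarity (it has a unit root), so the series behaves much like a random walk. The lag-1 term then soaks up nearly all of the correlation, and the sample PACF does not recover the 0.5 values the way it would for a stationary AR(2).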
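One way to check whether a set of AR coefficients gives a stationary model is to look at the roots of the AR polynomial 1 - ϕ1 z - ϕ2 z^2, which must all lie outside the unit circle. This is a sketch using base R's polyroot. The second set of coefficients (0.5 and 0.3) is a hypothetical alternative of mine, not something from the original analysis.

```r
phi.1 <- 0.5
phi.2 <- 0.5

# Roots of 1 - phi.1 * z - phi.2 * z^2 (coefficients in increasing order)
roots <- polyroot(c(1, -phi.1, -phi.2))
Mod(roots)   # stationarity needs every modulus to be > 1

# Hypothetical stationary alternative for comparison
alt_roots <- polyroot(c(1, -0.5, -0.3))
Mod(alt_roots)
```

With 0.5 and 0.5, one root has modulus 1, so the loop above is simulating a unit-root process rather than a stationary AR(2). Rerunning the loop with the alternative values should give a PACF with spikes near the chosen coefficients at lag 2.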

Sockeye

This is escapement of sockeye salmon for Hansen Creek starting in 1950. Some initial knowledge: most salmon that do return will do so 3-5 years after spawning, with most coming back 4 years later. We should expect to see a pattern at lag 4 and possibly at lags 3 and 5 as well.
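As a reference point for what a clean 4-year return would look like, here is a sketch of the theoretical ACF and PACF for a toy model in which each year depends only on the value 4 years earlier. The coefficient 0.6 and the names phi4, cyc_acf and cyc_pacf are hypothetical and chosen for illustration only. This is not a model of the Hansen Creek data.

```r
phi4 <- 0.6   # hypothetical lag-4 coefficient, for illustration only

# AR(4) with zeros at lags 1-3 and phi4 at lag 4
cyc_acf  <- ARMAacf(ar = c(0, 0, 0, phi4), lag.max = 8)
cyc_pacf <- ARMAacf(ar = c(0, 0, 0, phi4), lag.max = 8, pacf = TRUE)

round(cyc_acf, 3)    # nonzero only at lags 0, 4 and 8
round(cyc_pacf, 3)   # single spike at lag 4
```

Real escapement mixes fish returning at 3, 4 and 5 years, so the sample plots should look messier than this idealized case.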

fishies<- readRDS("HansenSockeye.rds");str(fishies)
##  Time-Series [1:64] from 1950 to 2013: 2175 2395 977 1169 8640 11808 1809 667 3200 5296 ...

There is a large difference right off the bat; some years have about 12,000 fish and some have 667. I wonder if we will find anything that an educated person would recognize as changes in fisheries management strategies.

plot(fishies)
acf(fishies)   # acf() and pacf() draw their own plots, so no plot() wrapper is needed
pacf(fishies)

Somehow the magic prevails. The autocorrelation at lag 4 pokes through. But what catches my eye are the negative autocorrelations. The most meaningful to me is at lag 5. It does not stand out on the ACF plot (#frequentist), but once the correlations at lags 1-4 are accounted for, the PACF flags lag 5 as significantly negative. Knowing the ‘truth’ that there should be a ~4 year cycle, and that there is no negative value at lag 2, I may call a bit of bullshit.

The extreme peaks that we see, and really the entire pattern, come from fish usually returning every 4 years. I was pushing my brain a bit too far. It makes sense that there is a valley five years after a huge peak. But I would have expected a more significant negative correlation farthest away from the peak, at lag 3. We see this in the ACF plot, but it becomes positive in the PACF. Again, my brain is churning to make sense of it.

This is an odd mixture of variables that have some independence from each other: escapement in a given year is primarily driven by a different generation of fish each year, but the jacks and geezers also mix the generations. I think this example is showing us the pattern that is there, but the years may be a bit too detached from each other to give the pattern that my brain is searching for.

AKA: These are overlapping cycles, not just one cycle.

I hope this is thorough and interesting enough
