1 About

This is an assignment for Applied Time-Series and Spatial Analysis for Environmental Data, a graduate-level, R-based statistics course offered by the Huxley College of the Environment at Western Washington University.

2 Rolling my own AR(2) series

Here I’ll create my own AR(2) process with some simple looping. I’ll define ϕ1 and ϕ2, and use them in concert with white noise to make a time series in which each value depends on the values at times t-1 and t-2.

n <- 500
epsilon <- rnorm(n = n, mean = 0, sd = 1) # white noise
x <- numeric(length = n) # all zeros; x[1] and x[2] stay at 0
phi <- 0.6   # coefficient on lag 1
phi2 <- -0.2 # coefficient on lag 2
for (i in 3:n) {
  x[i] <- phi * x[i-1] + phi2 * x[i-2] + epsilon[i]
}
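As a cross-check on the loop, the same recursion can be written with stats::filter() in recursive mode. This is a sketch rather than part of the original assignment: the seed and the names e, x_loop and x_filt are assumptions made here so the comparison is reproducible.

```r
# Assumed seed, used only so this comparison is reproducible
set.seed(1)

n <- 500
epsilon <- rnorm(n)
phi <- 0.6
phi2 <- -0.2

# Loop version, as in the text: x_loop[1] and x_loop[2] stay at 0
x_loop <- numeric(n)
for (i in 3:n) {
  x_loop[i] <- phi * x_loop[i - 1] + phi2 * x_loop[i - 2] + epsilon[i]
}

# Recursive filter: y[i] = e[i] + phi * y[i-1] + phi2 * y[i-2],
# with zero initial values. Zeroing the first two innovations
# reproduces the loop's zero start.
e <- epsilon
e[1:2] <- 0
x_filt <- as.numeric(stats::filter(e, filter = c(phi, phi2), method = "recursive"))

# Compare the two series up to floating-point tolerance
all.equal(x_loop, x_filt)
```

The loop is easier to read when learning what an AR(2) is; the filter form is shorter and avoids the explicit index bookkeeping.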

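Now with this plotted, let’s take a look at the autocorrelation. From how the data were generated, x[i] <- phi * x[i-1] + phi2 * x[i-2] + epsilon[i], we know that the direct dependence on the previous value is positive (ϕ1 = 0.6) and the direct dependence on the value two steps back is negative (ϕ2 = -0.2). The partial autocorrelation at lag 2 should therefore be about -0.2. The plain autocorrelations mix direct and propagated effects, so for this AR(2) the expected lag-1 autocorrelation is ϕ1/(1 - ϕ2) = 0.5 rather than 0.6, and the expected lag-2 autocorrelation is ϕ1 × 0.5 + ϕ2 = 0.1, which is positive. We can also see that the first two values in the plot are 0. This is because an AR(2) process needs two values already in place before lag 1 and lag 2 can influence the series, which is why the loop starts at i = 3.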
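The values to expect can be computed from the coefficients rather than read off the recursion. Here is a minimal sketch using stats::ARMAacf(); the names theo_acf and theo_pacf are assumptions introduced for illustration. Note that the lag-1 autocorrelation of an AR(2) is ϕ1/(1 - ϕ2), not ϕ1 itself.

```r
phi <- 0.6
phi2 <- -0.2

# Theoretical autocorrelations at lags 0..3 for the AR(2) defined above
theo_acf <- ARMAacf(ar = c(phi, phi2), lag.max = 3)

# Theoretical partial autocorrelations at lags 1..3
theo_pacf <- ARMAacf(ar = c(phi, phi2), lag.max = 3, pacf = TRUE)

print(round(theo_acf, 3))
print(round(theo_pacf, 3))
```

For these coefficients the theoretical autocorrelation is 0.5 at lag 1 and 0.1 at lag 2, while the partial autocorrelation is 0.5 at lag 1, -0.2 at lag 2, and zero afterwards. Sample estimates from 500 points will scatter around these values.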

2.1 Autocorrelation and Partial Autocorrelation

Now, let’s plot the acf and pacf of series x.
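The plotting code is not shown in this write-up. A minimal sketch, assuming the objects auto and partial that are printed further down were created with acf() and pacf() from the stats package, could look like this. The seed is an assumption added for reproducibility, so the coefficients it produces will not match the printed ones exactly.

```r
# Regenerate the series so this block runs on its own (seed is an assumption)
set.seed(1)
n <- 500
epsilon <- rnorm(n)
x <- numeric(n)
for (i in 3:n) {
  x[i] <- 0.6 * x[i - 1] - 0.2 * x[i - 2] + epsilon[i]
}

# Side-by-side plots; acf() and pacf() return their estimates invisibly,
# so the results can be stored while the plots are drawn
op <- par(mfrow = c(1, 2))
auto <- acf(x, main = "ACF of x")
partial <- pacf(x, main = "PACF of x")
par(op)
```

The dashed horizontal lines in both plots mark the approximate 95% bounds for a white-noise series, which is what "significant" refers to in this section.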

And the first 4 correlation coefficients from each plot, respectively.

print(round(auto$acf[1:4],3))
## [1] 1.000 0.552 0.176 0.033
print(round(partial$acf[1:4],3))
## [1]  0.552 -0.185  0.029 -0.022

Remember that the first correlation coefficient returned by acf() is the correlation at lag 0, and thus will always be 1. The pacf() output, by contrast, starts at lag 1.

We can see that there is strong autocorrelation at lag 1 (0.552) and a weaker one at lag 2 (0.176). With n = 500 the approximate 95% bound is ±1.96/sqrt(500) ≈ ±0.088, so both are significant, while lag 3 (0.033) is not. The lag-2 autocorrelation is positive even though ϕ2 is negative, because it is dominated by correlation propagated through lag 1. Here, there is no physical system creating these data, so we cannot think about mechanism, but this should make sense given the code that generated the data. The pacf plot, which removes the propagated correlations and allows a better examination of the actual generating process, shows significant positive correlation at lag 1 (0.552) and significant negative correlation at lag 2 (-0.185), close to ϕ2 = -0.2, with nothing significant beyond lag 2.
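Whether a coefficient counts as significant can be checked against the approximate 95% white-noise bound that R draws on acf and pacf plots. This is a small sketch with the coefficients copied from the printed output above; the variable names are assumptions introduced here.

```r
n <- 500
bound <- qnorm(0.975) / sqrt(n) # approximate 95% bound under white noise

# Coefficients copied from the printed output (lag 0 of the acf is dropped)
acf_lags_1_to_3 <- c(0.552, 0.176, 0.033)
pacf_lags_1_to_4 <- c(0.552, -0.185, 0.029, -0.022)

round(bound, 3)
abs(acf_lags_1_to_3) > bound   # which acf lags exceed the bound
abs(pacf_lags_1_to_4) > bound  # which pacf lags exceed the bound
```

Under this bound, lags 1 and 2 exceed it in both the acf and the pacf, and the later lags shown do not.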

3 ‘Nino’ Data, from tseries Package

The data I’m working with are from the R package tseries and are loaded with data(nino). I’ve taken a subset of the data, working only with nino[,1]. The values are monthly sea surface temperatures (SST) in the Nino 3 region, bounded by 90W-150W and 5S-5N. See ?nino (after library(tseries)) for more info. The data source is the Climate Prediction Center.

3.1 Data Summary

See below for a quick summary.

library(tseries); data(nino)
##      nino3 nino3.4
## [1,] 23.97   25.01
## [2,] 24.51   24.92
## [3,] 26.65   26.41
## [4,] 26.65   26.75
## [5,] 25.91   26.30
## [1] 1950.000 1950.083 1950.167 1950.250 1950.333 1950.417

I only want to work with one of the two time series datasets within nino, so I’ll set it as its own object: nino3 <- nino[,1]

summary(nino3); str(nino3)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   22.70   24.72   25.80   25.77   26.70   29.16
##  Time-Series [1:598] from 1950 to 2000: 24 24.5 26.6 26.6 25.9 ...

3.2 Zoo vs Ts class

As Andy mentioned in class, ts objects are useful in many cases, but sometimes it’s preferable to attach exact dates and times to the data. To do this, I’ll use the zoo package and convert the data to a zoo object.

library(zoo)
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
tnn <- as.yearmon(1950 + seq(0, 12*50-3)/12)
nino3.zoo <- zoo(nino3, tnn)

3.3 Plotting and autocorrelations

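Similar to series x from above, the nino3 series looks like an AR(2)-type process. SST in the Nino 3 region is strongly positively correlated with the previous month (0.881), and the partial autocorrelation at lag 2 is strongly negative (-0.626). Note that the plain autocorrelation at lag 2 is still positive (0.636); the negative sign only appears once the lag-1 effect is removed. The lag-3 partial autocorrelation (-0.164) also falls outside the approximate 95% bound of ±1.96/sqrt(598) ≈ ±0.080, so an order higher than 2 may be warranted.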
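The code that draws these plots is not shown. A minimal sketch, assuming the objects nino.acf and nino.pacf printed below come from acf() and pacf() applied to nino3, might look like the following. The requireNamespace() guard and the plot titles are additions made here so the block skips quietly when tseries is not installed.

```r
# Runs only if the tseries package is installed
if (requireNamespace("tseries", quietly = TRUE)) {
  data(nino, package = "tseries")
  nino3 <- nino[, 1] # Nino 3 region SST, as in the text

  # Side-by-side plots; the estimates are stored for printing afterwards
  op <- par(mfrow = c(1, 2))
  nino.acf <- acf(nino3, main = "ACF of nino3")
  nino.pacf <- pacf(nino3, main = "PACF of nino3")
  par(op)
}
```

Because nino3 is a monthly ts object, the lag axis on these plots is in years, so one month appears as a lag of 1/12. The elements of nino.acf$acf are still ordered by month: lag 0, 1, 2, and so on.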

print(round(nino.acf$acf[1:3],3))
## [1] 1.000 0.881 0.636
print(round(nino.pacf$acf[1:3],3))
## [1]  0.881 -0.626 -0.164