So what is this dataset like?

Well, I started by loading the dataset in R using data(lynx) and then checking its basic properties with the class(), start(), end(), and frequency() commands.

Or, more simply:

data(lynx); trapped <- lynx
class(trapped)
## [1] "ts"
tsp(trapped)
## [1] 1821 1934    1

One can see that ‘lynx’ is a time series of the number of lynx trapped annually in Canada from 1821 to 1934. The length() command in R confirms that the dataset contains 114 values, one per year. Makes sense.
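The individual accessors mentioned earlier return the same information as tsp(), one piece at a time. A minimal sketch, reusing the variable name `trapped` from the code above (the anyNA() check is an addition, included only to confirm that nothing is missing):

```r
data(lynx)
trapped <- lynx

start(trapped)      # first time point of the series
end(trapped)        # last time point of the series
frequency(trapped)  # observations per unit of time (per year here)
length(trapped)     # number of observations
anyNA(trapped)      # TRUE if any value is missing
```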

Ok, so it is a nice, simple dataset with no missing values. But what about all of the missing lynx?! Glorious though these data may be, the practice of trapping lynx most definitely is not. In 1821, the year that lynx trappin’ data collection began, 269 of these animals were trapped, and by the time that somebody finally grew tired of recording data (and of trapping lynx, let’s hope), they’d trapped a total of 175334 of these poor creatures! Lord only knows what they used these for…fur coats? Anyhow, let’s plot a histogram of the data, throw a normal curve on there, and examine the shape.
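The figures quoted here, and the histogram with a normal curve, can be sketched roughly as follows. The original plotting code is not shown, so the number of breaks, the labels, and the choice to draw the histogram on the density scale are assumptions made for illustration:

```r
data(lynx)
trapped <- lynx

trapped[1]     # trappings in the first recorded year
sum(trapped)   # total trappings over the whole series

# Density-scaled histogram, so the normal curve shares the same y-axis
hist(trapped, freq = FALSE, breaks = 20,
     main = "Annual lynx trappings", xlab = "Lynx trapped per year")

# Normal curve using the sample mean and standard deviation
curve(dnorm(x, mean = mean(trapped), sd = sd(trapped)),
      add = TRUE, lwd = 2)
```

If the data were roughly normal, the bars would follow the curve; a long right tail with most of the mass piled up on the left would instead point to skewness.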

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    39.0   348.2   771.0  1538.0  2567.0  6991.0
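A summary like the one above comes from summary(). Since the original code is not shown, the skewness calculation below is an assumption: it uses the usual moment-based definition, written out in base R so that no extra package is needed.

```r
data(lynx)
trapped <- lynx

summary(trapped)

# Moment-based sample skewness: third central moment divided by
# the second central moment raised to the power 3/2
m <- mean(trapped)
skew <- mean((trapped - m)^3) / mean((trapped - m)^2)^1.5
skew   # positive values indicate a longer right tail
```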

The distribution is strongly right-skewed: the mean of 1538 is about twice the median of 771, so a handful of very large values pull the mean upward. That suggests the median is probably the more informative measure of a typical year. Let’s get a visual to see why this might be the case…
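A plot of the series over time can be produced along these lines. The axis labels and the reference lines for the median and mean are illustrative additions, not taken from the original figure:

```r
data(lynx)
trapped <- lynx

# Time series plot; plot() dispatches to plot.ts() for "ts" objects
plot(trapped, xlab = "Year", ylab = "Lynx trapped",
     main = "Annual lynx trappings")

# Reference lines: median (dashed) and mean (dotted)
abline(h = median(trapped), lty = 2)
abline(h = mean(trapped), lty = 3)
```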

Whoa, interesting! Lots of up and down, possibly with some sort of regular periodicity. However, as the skewness indicates, rather than oscillating cyclically around some central value, it seems that there is some usual number of lynx trappings, a sort of baseline range, and then every 10 or so years the number of trappings spikes. Assuming that the number of lynx trappings is associated with the size of the lynx population, there must be some sort of ecological explanation for the multi-year periodicity of this dataset, perhaps based on longer-term reproduction cycles or resource availability.
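One rough way to check the "every 10 or so years" impression is to locate the major peaks and look at the gaps between them. This is only a sketch: the peak rule (a value higher than both neighbours) and the cutoff at the series mean are arbitrary choices made for illustration.

```r
data(lynx)
x   <- as.numeric(lynx)
yrs <- as.numeric(time(lynx))

# Local maxima: the slope changes from positive to negative
peak_idx <- which(diff(sign(diff(x))) == -2) + 1

# Keep only pronounced peaks, here those above the series mean
# (an arbitrary, illustrative threshold)
big_peaks <- yrs[peak_idx[x[peak_idx] > mean(x)]]

big_peaks         # years of the major peaks
diff(big_peaks)   # years between successive major peaks
```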

Autocorrelation in Lynx

Here are the ACF and PACF plots for lynx:
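Correlograms of this kind come from acf() and pacf(). A minimal sketch, with the maximum lag of 30 and the side-by-side layout chosen here for illustration:

```r
data(lynx)

op <- par(mfrow = c(1, 2))   # two panels side by side

# Both functions draw the plot and invisibly return the estimates
lynx_acf  <- acf(lynx,  lag.max = 30, main = "ACF of lynx")
lynx_pacf <- pacf(lynx, lag.max = 30, main = "PACF of lynx")

par(op)                      # restore the previous plot layout
```

Note that the ACF includes lag 0 (always equal to 1), while the PACF starts at lag 1.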

These correlograms indicate that the autocorrelation for lynx trappings is a higher-order process, in line with what I concluded based on the plot of lynx trappings over time. What order is it? Let’s look at the AIC values and the fitted AR coefficients, as calculated by the ar() function, which fits an AR model to the series and selects the order by AIC:
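The quantities discussed here can be pulled out of an ar() fit as sketched below. The default settings are assumed (Yule-Walker estimation, order chosen by AIC), since the original call is not shown, and the plotting choices are illustrative:

```r
data(lynx)

fit <- ar(lynx)   # default: Yule-Walker estimates, order selected by AIC

fit$order         # selected AR order
fit$ar            # fitted AR coefficients
fit$aic           # AIC of each candidate order minus the minimum AIC

# AIC differences by order; the selected order has a difference of 0
plot(0:(length(fit$aic) - 1), fit$aic, type = "h",
     xlab = "AR order", ylab = "AIC difference")

# Fitted AR coefficients by lag
plot(fit$ar, type = "h", xlab = "Lag", ylab = "AR coefficient")
abline(h = 0)
```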

Relying strictly on the AIC values to determine the order, I would conclude that this is an AR(8) process, as the AIC is minimized at order 8. The plot of the AR coefficients, however, suggests that this may not be a pure AR process but rather an ARMA process, because of the way that the values shift back and forth between negative and positive. That pattern alone is not conclusive, since a pure AR process with complex roots also produces oscillating autocorrelations.
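One way to probe the AR versus ARMA question is to fit both kinds of model with arima() and compare their AIC values. This is a sketch under stated assumptions: the ARMA(2,2) candidate is an arbitrary illustrative choice, `arma_aic` is a hypothetical helper name, and both models are refit by maximum likelihood because the AIC reported by ar() comes from a different estimation method and is not directly comparable.

```r
data(lynx)

# Hypothetical helper: AIC of an ARMA(p, q) fit to lynx,
# or NA if the optimiser fails for that specification
arma_aic <- function(p, q) {
  tryCatch(
    AIC(arima(lynx, order = c(p, 0, q), method = "ML")),
    error = function(e) NA_real_
  )
}

aics <- c(AR8 = arma_aic(8, 0), ARMA22 = arma_aic(2, 2))
aics   # lower AIC indicates the preferred model
```

A lower AIC for the ARMA fit would support the ARMA reading, although a difference of only a point or two is weak evidence either way.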