About

This is an assignment for Applied Time-Series and Spatial Analysis for Environmental Data, a graduate-level, R-based statistics course offered by the Huxley College of the Environment at Western Washington University.

Smoothing

This document focuses mostly on spectral analysis. I have spent a significant amount of time lately smoothing some of my thesis data, which ties in nicely with this week’s content. Click here to see extensive use of the lowess smoother in comparing different temperature datasets.

Spectral Analysis

Here, I’m going to analyze several different datasets using spectral analysis. I will use periodograms to identify the dominant frequencies in these data, and the lowess smoother to remove certain frequencies in order to highlight others.
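As a minimal sketch of what a periodogram does, here is a synthetic example that is not part of the assignment data. A sine wave with an assumed period of 12 samples is buried in noise, and spectrum() is used to recover that period. All variable names here are illustrative.

```r
# Synthetic example (not from the assignment data): a 12-sample cycle plus noise
set.seed(1)
n <- 240
t_idx <- seq_len(n)
x <- sin(2 * pi * t_idx / 12) + rnorm(n, sd = 0.5)

# Raw periodogram; plot = FALSE returns the estimate without drawing it
sp <- spectrum(x, plot = FALSE)

# The dominant frequency is where the spectral density peaks
peak_freq <- sp$freq[which.max(sp$spec)]  # cycles per sample
peak_period <- 1 / peak_freq              # samples per cycle
```

Because x is a plain vector, the frequency axis is in cycles per sample, so the period comes back in samples.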

Packages

I’ll be using the following packages: dplR and tseries.

Data

R and tseries include several great ts datasets that allow for spectral exploration. I’ll look at: ice.river (from tseries), giving flow amounts of two Icelandic rivers, as well as local precipitation and temperature values; and JohnsonJohnson (from the datasets package that ships with R), which shows the quarterly earnings per share of Johnson & Johnson stock.

Icelandic Rivers

Below are time series of mean daily flow from two rivers in Iceland, along with mean daily precipitation and temperature values.

data(ice.river)

The series is only 3 years long, but with daily values we have over 1000 data points to work with and we should be able to pull out the dominant frequencies.

It makes sense that the two rivers have a similar spectral signature, and that this also matches up well with temperature.

See that the green, blue, and yellow vertical lines all fall on f = 1. Even though these data are resolved daily, within the ts object the unit of time is one year, so the frequency axis is in cycles per year and the annual cycle shows up as a spike on the periodograms at f = 1.

The precipitation values seem to have some lower-frequency (longer-period) variability embedded, but with such a short window (3 years), many of these lower frequencies may be ‘invisible’: their periods are longer than the dataset itself, so they cannot be resolved in the periodogram or seen in the raw plot.
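The two points above, the spike at f = 1 and the limit on how long a period can be resolved, can be sketched with a synthetic stand-in for a 3-year daily record. The values below are made up for illustration and are not the ice.river data. Setting fast = FALSE is my choice here so that the frequency grid is not altered by zero-padding.

```r
# Synthetic stand-in for a 3-year daily record (assumed values, not ice.river itself)
set.seed(42)
n_days <- 3 * 365
day <- seq_len(n_days)
annual <- 10 * sin(2 * pi * day / 365) + rnorm(n_days, sd = 2)

# frequency = 365 tells R that 365 observations make up one unit of time (a year)
annual_ts <- ts(annual, start = c(1972, 1), frequency = 365)

# fast = FALSE avoids zero-padding, so the grid is exactly k / 3 cycles per year
sp <- spectrum(annual_ts, fast = FALSE, plot = FALSE)

peak_freq <- sp$freq[which.max(sp$spec)]  # cycles per year
lowest_freq <- sp$freq[1]                 # lowest frequency on the grid
longest_period <- 1 / lowest_freq         # in years
```

In this sketch the lowest frequency on the grid corresponds to one cycle over the full record, so any cycle slower than that has no frequency bin of its own.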

Note

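I’ve been trying to figure these graphs out and was messing around with the units on temperature. It turns out that adding a constant offset (here, converting from Celsius to Kelvin) has no effect on the power of the spectral signature, because spectrum() removes the mean and any linear trend by default before computing the periodogram. See below.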

# temp is the temperature series from ice.river, in degrees Celsius
# Convert to Kelvin by adding a constant offset to every value
vin <- temp + 273.15
spectrum(vin, log="no", xlim=c(0,20), main="", sub="", col="goldenrod")
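To check this without relying on the river data, here is a small sketch on a synthetic temperature-like series (assumed values, illustrative names). It compares the periodogram before and after adding 273.15, and contrasts that with rescaling, which does change the power.

```r
# Synthetic temperature-like series in Celsius (assumed values for illustration)
set.seed(7)
celsius <- 5 + 8 * sin(2 * pi * seq_len(730) / 365) + rnorm(730)
kelvin <- celsius + 273.15

sp_c <- spectrum(celsius, plot = FALSE)
sp_k <- spectrum(kelvin, plot = FALSE)

# Differences, if any, are at the level of floating-point rounding
max_rel_diff <- max(abs(sp_c$spec - sp_k$spec) / sp_c$spec)

# Rescaling is different: multiplying the series by 2 multiplies the power by 4
sp_2 <- spectrum(2 * celsius, plot = FALSE)
power_ratio <- sp_2$spec / sp_c$spec
```

So a shift in units leaves the periodogram alone, while a change in scale (for example Celsius to Fahrenheit, which multiplies by 1.8) would scale the power by the square of the factor without moving any peaks.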

Let’s move on to something more complex.

Johnson & Johnson Shareholder Earnings

This next dataset is the quarterly earnings per share of Johnson & Johnson stock, from 1960 through 1980. Similar to ice.river, these data clearly have an annual cycle, but there are also some other features of this dataset that make it interesting.

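data(JohnsonJohnson)
jj <- JohnsonJohnson

# Drop the ts attributes and build an explicit time axis in decimal years
stock <- jj[1:84]
qtr <- seq(from=1960, to=1980.75, by=.25)

# lowess span f is the proportion of points used in each local fit;
# here it covers 10 years of quarterly values
f10 <- 4*10/length(qtr)
f10.lo <- lowess(x=qtr, y=stock, f=f10)

# Standardize the series by dividing it by the 10-year smoother
detrend <- stock/f10.lo$y

# A second, more flexible smoother with a span of about 5.68 years
f5 <- 4*5.68/length(detrend)
f5.lo <- lowess(x=qtr, y=detrend, f=f5)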

Similar to tree growth, this dataset has a trend that we can correct for to better expose any underlying frequencies in the data. I’ll fit a stiff 10-year smoother, and then standardize the data by dividing it by the smoother. This can be seen in the plot below.

Now, with the growth trend removed, we can see the annual and multi-annual cycles more clearly. Before detrending, the yearly fluctuation was barely visible, but its dominance is much clearer in the detrended plot. Additionally, before detrending, the naked eye (at least mine) could not pick out the longer 5+ year variability. With the growth trend removed, we can use spectrum(detrend) to identify this frequency. Doing so gives a frequency of about 0.0444. Because detrend is a plain vector rather than a ts object, this is in cycles per quarter, which translates into a period of about 22.5 quarters, or roughly 5.6 years.
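One way the low-frequency peak might be located programmatically is sketched below. It repeats the 10-year lowess detrending and then searches only the part of the spectrum well below the annual cycle. The 0.1 cutoff and the names trend, low, peak_freq, and period_years are my assumptions, not part of the original analysis.

```r
# JohnsonJohnson ships with base R (datasets package)
stock <- as.numeric(JohnsonJohnson)
qtr <- seq(from = 1960, to = 1980.75, by = 0.25)

# Detrend with a 10-year lowess smoother, as described in the text
trend <- lowess(x = qtr, y = stock, f = 4 * 10 / length(qtr))
detrend <- stock / trend$y

# detrend is a plain vector, so frequencies are in cycles per quarter
sp <- spectrum(detrend, plot = FALSE)

# Search only below 0.1 cycles per quarter (periods longer than 2.5 years);
# this cutoff is an assumption chosen to exclude the annual peak at 0.25
low <- sp$freq < 0.1
peak_freq <- sp$freq[low][which.max(sp$spec[low])]

# Convert cycles per quarter to a period in years
period_years <- 1 / peak_freq / 4
```

Restricting the search matters because the annual cycle dominates the detrended series, so an unrestricted which.max() would most likely return the annual peak instead.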