Overview

On Thursday in class, we learned how to “de-noise,” or smooth, a time series. Smoothing makes the trend stand out on plots and helps some methods of inference work better. We learned many different ways of smoothing, and we’ll walk through how to do a few of them here.

To do this, we’ll use the jj dataset, which includes Johnson and Johnson’s quarterly earnings over time.

library(astsa)
plot(jj, type="o", ylab="Quarterly Earnings per Share")

We can see from the original plot that as time goes on, the trend gets harder to see because the line becomes more wiggly. This regular up-and-down cycle is likely a seasonal effect in the quarterly data. We’ll use smoothing to filter out the seasonal pattern.

Smoothing Methods

Moving Average

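Moving average is the simplest kind of smoothing, where each of the weights is equal. It replaces each point with the average of that point and the points around it.

We use the filter() function to create the smoother. Setting sides=2 centers the window on each point, so it averages over both sides. Here rep(1/3,3) gives three equal weights: the point itself and one neighbor on each side. (If dplyr is loaded, call stats::filter() explicitly, since dplyr has its own filter().)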

out <- filter(jj, sides=2, filter=rep(1/3,3))
par(mfrow=c(2,1))
plot.ts(jj, main="unsmoothed earnings")
plot.ts(out, main="smoothed earnings")

The trend is much easier to see on the bottom plot with the smoothed earnings!
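To make it concrete what filter() is doing, here is a minimal sketch that computes the same centered 3-point average by hand. The toy values in x are made up for illustration and are not taken from jj.

```r
# Toy series (made-up values, not from jj)
x <- c(2, 4, 6, 8, 10, 12)

# Centered 3-point moving average via stats::filter
ma_filter <- stats::filter(x, filter = rep(1/3, 3), sides = 2)

# The same thing by hand: average each point with its two neighbors
n <- length(x)
ma_manual <- rep(NA_real_, n)
for (t in 2:(n - 1)) {
  ma_manual[t] <- mean(x[(t - 1):(t + 1)])
}

print(as.numeric(ma_filter))
print(ma_manual)
```

The first and last values are NA in both versions, because a centered window needs a neighbor on each side.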

Weighted Moving Average

With weighted moving average, we get to specify the weights. Let’s give the data point twice as much weight as each of the two points around it. The weights should sum to 1 so the smoothed series stays on the same scale as the original, so we divide by 4.

(wgts <- c(1, 2, 1)/4)
## [1] 0.25 0.50 0.25
out <- filter(jj, sides=2, filter=wgts)
par(mfrow=c(2,1))
plot.ts(jj, main="unsmoothed earnings")
plot.ts(out, main="smoothed earnings")

Because the center point counts for more, the smoothed series keeps more of the original variability than the equal-weight version did.
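One property worth checking for any set of weights is that they sum to 1. A quick sketch of the check, using a made-up flat series (flat and wgts_demo are illustrative names, not part of the original analysis): if the weights sum to 1, smoothing a constant series should give back the same constant.

```r
# Illustrative weights: center point counts twice as much as each neighbor
wgts_demo <- c(1, 2, 1) / 4
print(sum(wgts_demo))

# A made-up constant series; a properly scaled smoother should leave it unchanged
flat <- rep(10, 7)
smoothed_flat <- stats::filter(flat, filter = wgts_demo, sides = 2)
print(as.numeric(smoothed_flat))
```

If the weights summed to more than 1, the interior values here would come out above 10, which is the same inflation you would see in a smoothed earnings plot.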

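Normal Kernel Smoother

The normal kernel smoother uses a weighted average where closer points get higher weights, following a normal (Gaussian) curve. To do this, we use ksmooth(). The bandwidth controls how wide that curve is. It is measured in the units of the x variable (here, years), not in number of points: a bigger bandwidth spreads the weight over more neighboring quarters and gives a smoother line. We’ll plot the smoothed line over our original data.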

kernalsmooth <- ksmooth(time(jj), jj, kernel = "normal", bandwidth = 1)
plot(jj)
lines(kernalsmooth, lwd=2, col=5)

This line looks way TOO smooth.
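To get a feel for what the bandwidth means, here is a small sketch. It uses base R’s built-in JohnsonJohnson series as a stand-in for jj (I’m assuming the two hold the same data), and kernel_sd() is a helper name made up for this example. The ksmooth() help page says the kernel is scaled so its quartiles sit at ±0.25*bandwidth, which we can convert into a standard deviation.

```r
# Base R's JohnsonJohnson stands in for astsa's jj (assumed to be the same data)
x  <- JohnsonJohnson
tt <- as.numeric(time(x))
xx <- as.numeric(x)

# Hypothetical helper: standard deviation (in years) of the normal kernel,
# given that its quartiles are at +/- 0.25 * bandwidth
kernel_sd <- function(bandwidth) 0.25 * bandwidth / qnorm(0.75)
print(kernel_sd(1))  # about 0.37 years, roughly 1.5 quarters

# Smooth at the observed time points with a narrow and a wide bandwidth
narrow <- ksmooth(tt, xx, kernel = "normal", bandwidth = 0.5, x.points = tt)
wide   <- ksmooth(tt, xx, kernel = "normal", bandwidth = 2,   x.points = tt)

# How far does each smooth sit from the raw data?
resid_sd <- c(narrow = sd(xx - narrow$y), wide = sd(xx - wide$y))
print(resid_sd)
```

The narrow bandwidth should stay closer to the raw data and the wide one should drift further from it. Overlaying each with lines() on plot(x) is a good way to pick a value in between.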

Lowess

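Lowess smoothing is localized regression: at each point it fits a weighted regression using only the nearest neighbors of that point. The argument f is the span, the proportion of all the points used in each local fit. Here f=.08 means each fit uses about 8% of the 84 quarters.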

lowsmooth <- lowess(jj, f=.08)
plot(jj)
lines(lowsmooth, lwd=2, col=5)
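Here is a rough sketch of how the span changes the result, again using base R’s JohnsonJohnson as a stand-in for jj (assumed to be the same data). The roughness() helper is made up for this example: it sums the squared second differences, so a wigglier line scores higher.

```r
# Base R's JohnsonJohnson stands in for astsa's jj (assumed to be the same data)
x  <- JohnsonJohnson
tt <- as.numeric(time(x))
xx <- as.numeric(x)

# Small span: each local fit sees only a handful of neighboring quarters
low_small <- lowess(tt, xx, f = 0.08)
# Large span: each local fit sees about two thirds of the series
low_large <- lowess(tt, xx, f = 2/3)

# Hypothetical helper: bigger value means a wigglier curve
roughness <- function(v) sum(diff(v, differences = 2)^2)
rough <- c(small_span = roughness(low_small$y),
           large_span = roughness(low_large$y))
print(rough)
```

A small span follows the quarterly wiggles, while a large span mostly shows the long-run trend.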

Spline

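A smoothing spline is like Lowess in that it fits the data locally, but it works differently: it breaks up the x-axis into little intervals, fits a cubic polynomial in each one, joins the pieces smoothly, and penalizes the curve for being too wiggly. spar is the smoothing parameter that sets how strong that penalty is. It is typically between 0 and 1: values near 0 follow the data closely, and larger values give a smoother curve.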

splinesmooth <- smooth.spline(time(jj), jj, spar = .3)
plot(jj)
lines(splinesmooth, lwd=2, col = 5)
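As a quick sketch of what spar does, we can fit two splines and compare the penalty (lambda) and the effective degrees of freedom (df) that smooth.spline() reports. This uses base R’s JohnsonJohnson as a stand-in for jj (assumed to be the same data), and the spar value of 0.9 is only picked for contrast.

```r
# Base R's JohnsonJohnson stands in for astsa's jj (assumed to be the same data)
x  <- JohnsonJohnson
tt <- as.numeric(time(x))
xx <- as.numeric(x)

fit_wiggly <- smooth.spline(tt, xx, spar = 0.3)  # the value used above
fit_smooth <- smooth.spline(tt, xx, spar = 0.9)  # illustrative larger value

# Larger spar -> larger penalty lambda -> fewer effective degrees of freedom
print(c(wiggly = fit_wiggly$lambda, smooth = fit_smooth$lambda))
print(c(wiggly = fit_wiggly$df,     smooth = fit_smooth$df))
```

Fewer effective degrees of freedom means a stiffer, smoother curve, so df is a handy single number for comparing spar settings.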