Time Series Smoothing

Thursday’s class covered different methods of smoothing. Smoothing time series data is generally done to make it easier to identify trends. Each method of smoothing we covered uses averaging around a data point.

The globtemp data from the R package astsa gives the global mean land-ocean temperature deviations from 1880 to 2015. We will examine and compare the different smoothing methods for this time series data.

Without Smoothing

Before we can compare methods, we need to examine the original data using a plot.

library(astsa)
data("globtemp")
plot(globtemp)

We can identify an upward trend in the data, along with a spike in global mean land-ocean temperatures around 1940, but the plot is choppy. Noise makes the trend somewhat harder to read. We can try some smoothing techniques to reveal the trend more clearly.

Smoothing Methods

The four methods we covered in class were weighted average, normal kernel smoother, lowess smoother, and splines.

Weighted Average

The weighted average method replaces each data point with a weighted average of itself and its neighbors. Here we give the point and one neighbor on each side equal weight (1/3 each), which makes this a three-point centered moving average. The filter() command computes moving averages. Its arguments are the dataset, sides (1 for a one-sided average of past values, 2 for an average centered on the point), and filter, the vector of weights, whose length sets the number of points averaged. We will also compare the original plot with the weighted average smoothed plot.

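# stats::filter() avoids a clash with dplyr::filter() if dplyr is loaded
wtavg<-stats::filter(globtemp, sides=2, filter=rep(1/3, 3))
par(mfrow=c(2,1))  # stack the two plots
plot(globtemp, main="Unsmoothed Temperature")
plot(wtavg, main="Smoothed Temperature")
par(mfrow=c(1,1))  # back to one plot per page for the later sections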

The smoothed plot appears less choppy, and it is easier to identify the overall upward trend.
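To make the averaging concrete, here is a minimal sketch that checks the three-point moving average by hand. The short series x is synthetic and only stands in for globtemp (an assumption for illustration), so the block runs with base R alone.

```r
# Synthetic yearly series standing in for globtemp (assumption, not the real data)
set.seed(1)
x <- ts(cumsum(rnorm(20, mean = 0.05, sd = 0.1)), start = 1880)

# Three-point centered moving average, each point weighted 1/3
ma <- stats::filter(x, filter = rep(1/3, 3), sides = 2)

# The same average computed by hand for the interior points
n <- length(x)
by_hand <- c(NA, (x[1:(n - 2)] + x[2:(n - 1)] + x[3:n]) / 3, NA)

# Side-by-side view of the first few values
print(cbind(filter = as.numeric(ma), by_hand = by_hand)[1:4, ])
```

The first and last values are NA because a two-sided average needs a neighbor on both sides, which is why the smoothed plot is one point shorter at each end.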

Kernel Smoothing

Kernel smoothing lets us use points further to the left and right of the main data point. In normal kernel smoothing, the weights follow a normal curve, so the points closest to the main point get the highest weights. Use the ksmooth() command with the x and y points from our dataset, the kernel type, and the bandwidth as arguments. The bandwidth controls how wide the kernel is: ksmooth() scales the kernel so that its quartiles sit at ±0.25*bandwidth. We will use a bandwidth of 2, which on yearly data puts nearly all of the weight within two years on each side of the main point. We compare this method with the original data by plotting the smoothed curve on top of the original plot.

kernsmooth<-ksmooth(time(globtemp), globtemp, kernel="normal", bandwidth=2)
plot(globtemp)
lines(kernsmooth, lwd=2, col=4)

We can better see how this method of smoothing compares with the original data. Again the jaggedness of the data is reduced.
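The weighting behind the normal kernel can be sketched by hand. The block below uses a synthetic series (an assumption, not globtemp), and the scaling constant 0.3706506 and the cutoff at four standard deviations reflect my reading of how ksmooth() is implemented, so treat both as assumptions rather than documented guarantees.

```r
# Synthetic yearly data standing in for globtemp (assumption)
set.seed(2)
yr <- as.numeric(1880:1930)
y <- 0.01 * (yr - 1880) + rnorm(length(yr), sd = 0.1)

bw <- 2
# Quartiles at +/- 0.25 * bandwidth correspond to a normal sd of
# about 0.37 * bandwidth (constant assumed from the implementation)
s <- 0.3706506 * bw

# Weighted average at each year; points beyond 4 sd get zero weight (assumed cutoff)
by_hand <- sapply(yr, function(x0) {
  d <- yr - x0
  w <- ifelse(abs(d) <= 4 * s, exp(-0.5 * (d / s)^2), 0)
  sum(w * y) / sum(w)
})

# ksmooth() evaluated at the observed years for comparison
ks <- ksmooth(yr, y, kernel = "normal", bandwidth = bw, x.points = yr)

# Relative weights for neighbors 0, 1, 2 and 3 years away
print(round(exp(-0.5 * ((0:3) / s)^2), 4))
# Largest disagreement between the two versions
print(max(abs(ks$y - by_hand)))
```

If the two versions agree closely, that supports reading the bandwidth as a kernel width rather than a count of points: the weights fall off quickly with distance instead of stopping after a fixed number of neighbors.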

Lowess Smoothing

Lowess smoothing fits a weighted local regression to the nearest neighbors of each point. We use the lowess() command with the dataset and f as the arguments. f is the span, the proportion of the data used in each local fit. Larger values of f give a smoother curve, so we must be careful not to choose an f that oversmooths the data and hides features of the trend we want to see.

lowsmooth<-lowess(globtemp, f=.04)
plot(globtemp)
lines(lowsmooth, lwd=2, col=4)

This method, with f=.04, smooths out many of the sharp peaks and valleys in the original data while still following its shorter-term movements.
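The effect of the span can be checked numerically. This sketch uses a synthetic trend-plus-noise series (an assumption, not globtemp) and a hypothetical helper, roughness(), that measures how wiggly a fitted curve is by its mean squared second difference.

```r
# Synthetic yearly data standing in for globtemp (assumption)
set.seed(3)
yr <- as.numeric(1880:2015)
y <- 0.008 * (yr - 1880) + rnorm(length(yr), sd = 0.15)

# Same data, two spans: a small one as in the text and a much larger one
small_span <- lowess(yr, y, f = 0.04)
large_span <- lowess(yr, y, f = 0.5)

# Hypothetical helper: mean squared second difference of the fitted values
roughness <- function(fit) mean(diff(fit$y, differences = 2)^2)

print(c(f_0.04 = roughness(small_span), f_0.5 = roughness(large_span)))
```

A much lower roughness for f = 0.5 means a smoother curve, at the cost of flattening shorter-lived features such as the spike around 1940 seen in the original data.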

Spline Smoothing

Spline smoothing is a technique in which the x-axis is broken up into small intervals, a cubic polynomial is fit on each interval, and the pieces are joined into one smooth curve. Use the smooth.spline() command with the x and y values from the data and spar as the smoothing parameter. Larger values of spar give a smoother curve. We will use a smoothing parameter of .2 and again plot the smoothed curve on top of the original time series plot.

splinesmooth<-smooth.spline(time(globtemp), globtemp, spar=.2)
plot(globtemp)
lines(splinesmooth, lwd=2, col=4)

We again see the sharp peaks and valleys reduced in the smoothed curve. It is easier to focus on the trend.
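To see what spar controls, one option is to compare the effective degrees of freedom that smooth.spline() reports for two settings. The series below is synthetic (an assumption, not globtemp), and spar = 1 is just an illustrative contrast to the .2 used above.

```r
# Synthetic yearly data standing in for globtemp (assumption)
set.seed(4)
yr <- as.numeric(1880:2015)
y <- 0.008 * (yr - 1880) + rnorm(length(yr), sd = 0.15)

# A light penalty (spar = 0.2) and a heavy penalty (spar = 1)
wiggly <- smooth.spline(yr, y, spar = 0.2)
stiff  <- smooth.spline(yr, y, spar = 1)

# Effective degrees of freedom: higher means a more flexible curve
print(c(spar_0.2 = wiggly$df, spar_1 = stiff$df))
```

The smaller spar should report more degrees of freedom, meaning the curve bends more to follow the data. If spar is left out, smooth.spline() chooses the amount of smoothing itself by generalized cross-validation.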