Time Series Smoothing

The main focus of class on Thursday was the different techniques we can use to smooth time series data. Time series often show large swings from one observation to the next, so smoothing can make the underlying trend easier to see. We learned about four smoothing methods, and they all share one main idea: take the data surrounding a given point and average those values to create a smoother trend line. This post works through three of them: a moving average, kernel smoothing, and lowess.

To learn more about these, I will use the globtemp data set from the astsa package in R as an example.

Example

library(astsa)
plot(globtemp, type="l", ylab="Global Temperature Deviations")

From the line plot of the original data, we can see how choppy the globtemp series is. This is a great example of a data set that benefits from smoothing, since it can be tough to filter through the noise and see the overarching trend. To smooth this data, we will use the filter command to create a moving average. The sides argument controls which neighbors are used: sides=1 averages the current point with past values only, while sides=2 centers the window on the current point. With filter=rep(1/3,3) and sides=2, each smoothed value is the average of the point itself and the points immediately to its left and right, each with an equal weight of 1/3. After using the filter command, we will plot the original series above the smoothed series to compare.

wt.avg <- filter(globtemp, sides=2, filter=rep(1/3,3)) # centered 3-point moving average
par(mfrow=c(2,1)) # stack the two plots vertically
plot.ts(globtemp, main="Unsmoothed Temp")
plot.ts(wt.avg, main="Smoothed Temp")

Looking at the bottom plot, we can see that much of the jaggedness has been removed, allowing us to see the overall trend more clearly. With this smoothing method, we are only looking at the points immediately to the left and right of each point. Another option is kernel smoothing, which allows us to consider points further out to the left and right.
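Before moving on to kernel smoothing, it may help to check the moving average arithmetic on a small series. The sketch below uses made-up values (not globtemp) and compares the filter command with an average computed by hand. It calls stats::filter explicitly, which is an assumption about the setup: it guards against another loaded package, such as dplyr, masking filter.

```r
# Toy series with made-up values so the arithmetic is easy to follow
x <- ts(c(2, 4, 9, 4, 6, 8, 1))

# Centered 3-point moving average (same weights as the globtemp example)
centered <- stats::filter(x, filter = rep(1/3, 3), sides = 2)

# Same weights, but using only the current and the two previous values
past_only <- stats::filter(x, filter = rep(1/3, 3), sides = 1)

# Centered average computed by hand for the interior points
by_hand <- rep(NA_real_, length(x))
for (i in 2:(length(x) - 1)) {
  by_hand[i] <- mean(x[(i - 1):(i + 1)])
}

# Side-by-side comparison; NA marks points without a full window
print(data.frame(
  x         = as.numeric(x),
  centered  = as.numeric(centered),
  past_only = as.numeric(past_only),
  by_hand   = by_hand
))
```

The centered column should match the by_hand column, and the NA values at the ends show that a moving average cannot be computed where the window runs off the series.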

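For kernel smoothing we will use the ksmooth command, and we have to specify how wide a window to use through the bandwidth argument. The bandwidth is measured in the units of the time axis (years here) rather than in number of points, and ksmooth scales the kernel so that its quartiles sit at plus or minus 0.25 times the bandwidth. For example, bandwidth = 2 on this yearly series concentrates the weight on the nearest couple of years on either side of each point, with closer years weighted more heavily. We use the lines command to draw the smoothed line over the original series.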

plot(globtemp)
lines(ksmooth(time(globtemp), globtemp, "normal", bandwidth=2), lwd=3, col=2)

As we can see, the red line does not have the extreme peaks and valleys that the original series had. If we needed to, we could increase the bandwidth argument to average over a wider window, or decrease it to stay closer to the data. This lets us tune how much smoothing is applied.
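To show what a kernel smoother computes, here is a minimal sketch of a Gaussian-weighted average written by hand. The helper kernel_smooth is hypothetical (it is not part of R), the data are made up, and the standard deviation is derived from the quartile scaling described in the ksmooth documentation. ksmooth may cut off the far tails of the kernel, so the two results are only expected to agree approximately.

```r
# Hypothetical helper: Gaussian-kernel weighted average evaluated at each x
# Assumes ksmooth's documented scaling: kernel quartiles at +/- 0.25 * bandwidth
kernel_smooth <- function(x, y, bandwidth) {
  s <- 0.25 * bandwidth / qnorm(0.75)  # implied standard deviation
  sapply(x, function(x0) {
    w <- dnorm(x - x0, sd = s)         # closer points get larger weights
    sum(w * y) / sum(w)                # weighted average of the y values
  })
}

# Made-up series (not globtemp): a slow wave plus alternating noise
x <- as.numeric(1:20)
y <- sin(x / 3) + rep(c(0.3, -0.3), times = 10)

narrow  <- kernel_smooth(x, y, bandwidth = 2)
wide    <- kernel_smooth(x, y, bandwidth = 6)
builtin <- ksmooth(x, y, kernel = "normal", bandwidth = 2, x.points = x)$y

print(round(cbind(y, narrow, wide, builtin), 3))
```

Because every smoothed value is a weighted average of the data, it always stays within the range of the original values. A larger bandwidth spreads the weights over more neighbors.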

One final example of data smoothing is the lowess method. Lowess is a form of locally weighted regression: it fits a regression to the nearest neighbors of each point and uses those fits to smooth the trend line. One thing to be careful about with this method is oversmoothing. If we use the default span for this command (f = 2/3), our trend line will look like this.

plot(globtemp)
lines(lowess(globtemp), lty=1, lwd=2, col=2) 

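The result is so smooth that it is close to a single regression line, because each local fit uses a large share of the data. Instead, we need a smaller span, which we set by adding f=.05 to the call so that each local fit uses only about 5% of the points. Doing this will cause our trend line to look like this: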

plot(globtemp)
lines(lowess(globtemp, f = .05), lty=1, lwd=2, col=2) 
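To put a number on the effect of the span, the sketch below runs lowess with the default f and with f = 0.05 on a made-up series (not globtemp) and measures how far each smoothed curve sits from the data. The series and the names mse_wide and mse_narrow are illustrative assumptions.

```r
# Made-up series: a slow wave plus alternating noise
x <- 1:100
y <- sin(x / 5) + rep(c(0.3, -0.3), times = 50)

wide   <- lowess(x, y)            # default span, f = 2/3
narrow <- lowess(x, y, f = 0.05)  # each local fit uses about 5% of the points

# Mean squared gap between the data and each smoothed curve
mse_wide   <- mean((y - wide$y)^2)
mse_narrow <- mean((y - narrow$y)^2)
print(c(wide = mse_wide, narrow = mse_narrow))
```

The wide span should leave a larger gap because it flattens the wave along with the noise, which is the oversmoothing described above. The narrow span follows the wave while still damping the point-to-point jumps.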

Summary

Smoothing our time series data is useful because it makes overall trends easier to assess. Each of the methods used here, the moving average, kernel smoothing, and lowess, achieves this by averaging over nearby points, and each has a setting (the filter weights, the bandwidth, or the span f) that controls how much smoothing is applied. These methods will continue to be used as we learn more about time series analysis and visualization.