Visit my website for more like this! I would love to hear your feedback (seriously).
library(TSA, quietly=TRUE, warn.conflicts=FALSE)
require(knitr)
library(ggplot2)
Heavily borrowed from the textbook: Time Series Analysis and Its Applications
The next few examples are representations of common types of time series. We then define some terms and theory that are useful to understand before moving on to real analysis.
Consider some white noise, smoothed by a moving average using filter(). This smoothed representation eliminates some of the faster oscillations and leaves us with a more representative trend.
set.seed(1122)
d<-rnorm(500, 0, 1) # 500 draws from N(0, 1)
D<-filter(d, sides=2, rep(1/3, 3))
par(mfrow=c(2,1))
plot.ts(d, main='White Noise')
plot.ts(D, main='Moving Average')
Create a prediction of the current value \(x_t\) as a function of the previous values \(x_{t-1}\) and \(x_{t-2}\); the recursive filter below generates \[x_t = x_{t-1} - 0.9 x_{t-2} + w_t\] Autoregressive models, and other similar generalizations, can be used as an underlying model for many kinds of time series data.
# Add an extra 50 values to absorb the boundary effect
d = rnorm(550, 0, 1)
D = filter(d, filter=c(1,-0.9), method='recursive')[-(1:50)]
par(mfrow=c(2,1))
plot.ts(d, main='White Noise')
plot.ts(D, main='Autoregression')
A random walk with drift takes the form \[x_t = \delta + x_{t-1} + w_t\] When the drift parameter \(\delta = 0\), the value of the time series at time \(t\) is the value of the series at time \(t - 1\) plus a completely random movement determined by white noise. Here we plot two walks, with the dashed line marking the deterministic trend \(0.2t\),
Black: drift = 0.2
Red: drift = 0
set.seed(154)
d = rnorm(200, 0, 1); x = cumsum(d)
D = d + 0.2; Dsum = cumsum(D)
plot.ts(Dsum, ylim=c(-5,55), main='Random Walk', ylab='y')
lines(x, col='red'); lines(0.2*(1:200), lty='dashed')
Most time series are composed of an underlying signal with some constant periodic variation, and a random error (noise) term. Generally, we are presented with data that show the signal obscured by noise. The purpose of many time series models is to decompose the time series to understand the underlying trend.
# A simple cosine wave
cs = 3*cos(2*pi*1:500/50 + 0.6*pi)
# Some random noise
noise = rnorm(500, 0, 1)
The ratio of the amplitude of the signal to that of the error is called the signal-to-noise ratio (SNR); the larger the SNR, the easier it is to detect the signal. With amplitude 3 and noise standard deviations of 1 and 5, the SNR drops from 3 in the second panel to 0.6 in the third: we can easily understand the signal in the second panel, but would have a hard time confidently explaining the third.
par(mfrow=c(3,1), mar=c(3,2,2,1), cex.main=1.5)
plot.ts(cs, main=expression(3*cos(2*pi*t/50 + 0.6*pi)))
plot.ts(cs+noise, main=expression(3*cos(2*pi*t/50 + 0.6*pi) + N(0, 1)))
plot.ts(cs+noise*5, main=expression(3*cos(2*pi*t/50 + 0.6*pi) + N(0, 25)))
These simple additive models are some of the most common, and take the form
\[x_t = s_t + v_t\]
where \(s_t\) denotes an unknown signal, and \(v_t\) denotes a white noise or correlated error term.
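As a sketch of recovering an unknown signal from such a model, we can smooth the noisy series from above with lowess() (the smoother and its span f=0.05 are assumptions here, not the only reasonable choice) and compare the estimate with the true cosine:
# Estimate s_t from x_t = s_t + v_t by local smoothing
x_obs = cs + noise
fit = lowess(x_obs, f=0.05) # smoothing span is an assumed tuning choice
par(mfrow=c(1,1))
plot.ts(x_obs, main='lowess estimate (red) vs true signal (dashed)')
lines(fit$y, col='red', lwd=2) # smoothed estimate of the signal
lines(cs, lty='dashed') # true cosine signal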
Since correlation is such an essential component of time series analysis, the best descriptive statistics are expressed in terms of covariance and correlation.
The autocovariance function \(\gamma(s,t) = \mathrm{cov}(x_s, x_t)\) measures the linear dependence between values of the series at times \(s\) and \(t\). The autocorrelation function (ACF) is its normalized version: \[\rho(s,t) = \frac{\gamma(s,t)}{\sqrt{\gamma(s,s)\,\gamma(t,t)}}\] This function measures the cross-correlation of a signal with itself. Simply, it is the similarity between observations as a function of the time lag between them.
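As a quick sketch of estimating this in practice, base R's acf() computes and plots the sample ACF; here it is applied to the autoregressive series D simulated above (the choice of series is just for illustration):
# Sample ACF of the autoregression simulated earlier
acf(D, lag.max=30, main='Sample ACF of the autoregression')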
Cross-correlation is a measure of similarity between two time series as a function of a time lag applied to one of them.
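Base R's ccf() estimates this; as a sketch, consider two series constructed here purely for illustration, where s2 is s1 delayed by five steps plus noise, so the sample CCF spikes near lag -5:
set.seed(42) # seed chosen arbitrarily for this illustration
s1 = rnorm(200)
s2 = c(rep(0, 5), s1[1:195]) + rnorm(200, 0, 0.5) # s1 delayed by 5 steps, plus noise
ccf(s1, s2, lag.max=20, main='Sample CCF: spike near lag -5')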
A stationary process is a stochastic process whose joint probability distribution does not change when shifted in time. Consequently, the mean and variance, if present, do not change over time or follow any trends. In time series analysis, we often have to transform raw data to a stationary process to satisfy the assumptions of time series models and functions. This definition of stationarity is known as strict stationarity, and is generally too strong for most modeling applications. Thus most analyses utilize a milder version called weak stationarity.
Weak stationarity only requires the mean to remain constant with respect to time and the covariance between two observations to depend only on the lag between them. From now on, we will refer to weak stationarity as simply, stationary.
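A common way to transform a non-stationary series toward stationarity is differencing; as a quick sketch, taking first differences of the drifted random walk Dsum from above recovers white noise plus the constant drift:
# First differences: Dsum[t] - Dsum[t-1] = 0.2 + white noise
par(mfrow=c(2,1))
plot.ts(Dsum, main='Random walk with drift (non-stationary)')
plot.ts(diff(Dsum), main='First differences (stationary)')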
See the next chapter, here