A very high-level overview. I have a whole course next semester, so until then, this knowledge should hopefully suffice.
Cross-sectional analysis is the analysis of various variables at one instant in time, used to compare the variable of our interest with other variables. Generally, time series analysis is accompanied by cross-sectional analysis.
Differences between TSA and regression:
Extrapolation (TSA) vs. interpolation (normal regression).
Prediction intervals (confidence intervals for predictions) grow wider with the forecast horizon, since uncertainty piles up over time.
Components of a time series:
Trend
Cyclical
Seasonal
Randomness
We’ll use the AirPassengers dataset (built into R), which records monthly international airline passenger totals from 1949 to 1960.
```r
library(tseries)
plot(AirPassengers)
```

Tests:
Visual inspection (normal plot)
Seasonal subseries plots (roughly, local averages within each season)
Box plots
Correlogram
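A minimal sketch of these checks in base R, assuming the AirPassengers series from above; monthplot() gives the seasonal subseries plot, and decompose() separates out the components listed earlier:

```r
plot(decompose(AirPassengers))               # trend + seasonal + random components
monthplot(AirPassengers)                     # seasonal subseries plot (per-month subseries)
boxplot(AirPassengers ~ cycle(AirPassengers),
        xlab = "Month", ylab = "Passengers") # one box per calendar month
```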
Randomness could be in the form of white noise: expectation 0, constant standard deviation, and zero correlation between different lags (so no seasonality either).
If your model residuals are white noise, good job. You’ve captured everything worth capturing in your model.
Data = Signal + Noise
To check whether something is white noise: inspect it visually, check local rolling (convolution-style) averages (they should be roughly constant), and check the ACFs (they should be insignificant).
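As a sanity check, here are the same inspections run on simulated white noise; Box.test() (the Ljung-Box portmanteau test) is an extra formal check, not mentioned above:

```r
set.seed(42)
wn <- rnorm(144)                             # simulated white noise, same length as our data
plot.ts(wn)                                  # visual: no trend, no seasonality
plot(filter(wn, rep(1 / 12, 12)))            # rolling 12-point mean: roughly constant
acf(wn)                                      # all lags should sit inside the bands
Box.test(wn, lag = 12, type = "Ljung-Box")   # large p-value => consistent with white noise
```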
Stationarity requires:
Constant mean and variance
No seasonality
Check visually, or compare several local means or variances (as sketched below).
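One rough way to compare local means and variances, splitting the series into its 12 calendar years:

```r
blocks <- matrix(AirPassengers, nrow = 12)   # one column per year (12 months x 12 years)
apply(blocks, 2, mean)                       # yearly means drift upward
apply(blocks, 2, var)                        # yearly variances grow as well
```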
For our current dataset, safe to say that it’s not stationary.
The maths, though, needs careful reading: adf.test() actually rejects its null hypothesis of a unit root (p < 0.01, alternative: stationary). Since tseries fits the ADF regression with a constant and a linear trend, this only suggests the series may be trend-stationary; it doesn’t contradict the obvious trend and growing variance.
```r
adf.test(AirPassengers)
```

Warning in adf.test(AirPassengers): p-value smaller than printed p-value

	Augmented Dickey-Fuller Test

data:  AirPassengers
Dickey-Fuller = -7.3186, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary
If the mean isn’t constant, take the first difference (\(y_t - y_{t-1}\)).
```r
plot(diff(AirPassengers))
```

Well, we tried. The variance isn’t constant, and there’s still seasonality.
Alternatively, just take the trend to be linear (or some other simple form) and model the residual.
Take the log or square root to stabilise the variance.
```r
# Log first, then difference: log(diff(...)) produces NaNs,
# since the differenced series has negative values.
plot(diff(log(AirPassengers)))
```
The ACF at lag \(p\) is the simple Pearson correlation coefficient between \(A_t\) and \(A_{t-p}\). (It may not reflect actual predictive value, because correlation gets passed on indirectly through intermediate lags.) The number of significant ACF lags is a useful estimate of the number of moving average (MA) coefficients in the model.
```r
acf(AirPassengers)
```

In our case, the data is highly regular, so the ACF is high at most lags. A slowly decaying ACF suggests an autoregressive process.
The PACF at lag \(p\) is the coefficient on \(A_{t-p}\) in a regression of \(A_t\) on all lags up to \(p\). It is more reflective of actual predictive utility, as it accounts for and removes redundant correlation (which is passed on indirectly through intermediate time steps). It helps you identify the number of autoregressive (AR) coefficients in an ARIMA model.
```r
pacf(AirPassengers)
```

Roughly 3 useful lags.
Autoregression (AR): a linear regression of the series on its own past values (see the equation below).
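For reference, the AR(p) model written out in standard notation:
\[ y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \]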
```r
# 10 years of training data
modelo <- arma(AirPassengers[1:120], order = c(2, 0))
summary(modelo)
```

Call:
arma(x = AirPassengers[1:120], order = c(2, 0))

Model:
ARMA(2,0)

Residuals:
    Min      1Q  Median      3Q     Max
-90.334 -18.958  -5.639  14.965  73.367

Coefficient(s):
           Estimate  Std. Error  t value  Pr(>|t|)
ar1         1.24926     0.08706   14.349   < 2e-16 ***
ar2        -0.31255     0.08650   -3.613  0.000303 ***
intercept  16.91794     6.92338    2.444  0.014542 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Fit:
sigma^2 estimated as 732.4, Conditional Sum-of-Squares = 85686.21, AIC = 1138.1
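To actually forecast, here’s a sketch that refits the same AR(2) with base R’s arima() (a substitution on my part; it has a predict() method) and forecasts the held-out two years. Note how the standard errors grow with the horizon, matching the point about prediction intervals above:

```r
fit <- arima(AirPassengers[1:120], order = c(2, 0, 0))        # equivalent AR(2) fit
fc  <- predict(fit, n.ahead = 24)                             # forecast the held-out 2 years
ts.plot(ts(as.numeric(AirPassengers)), fc$pred, lty = c(1, 2))
fc$se                                                         # widens with the horizon
```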
Moving average (MA): we model the series in terms of moving averages of the “errors” left over from past forecasts (see the equation below).
An MA model of order n can only predict n periods ahead.
Not too useful on its own.
MA(1) <-> AR(∞)
MA(∞) <-> AR(1)
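For reference, the MA(q) model in the same notation (the MA(1) <-> AR(∞) duality falls out of recursively substituting for the lagged error terms):
\[ y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} \]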
In a moving average, all samples are smoothed with equal weight, whereas in exponential smoothing, the further back in the past a sample is, the less weight it gets.
\[ F_t = F_{t-1} + \alpha (A_{t-1} - F_{t-1}) \]
(\(F\) = forecast, \(A\) = actual value)
Double exponential smoothing (Holt): also accounts for trend.
Triple exponential smoothing (Holt-Winters): also accounts for seasonality.
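All three flavours live in base R’s HoltWinters(); a minimal sketch with default parameters (my own example, not from the source):

```r
ses  <- HoltWinters(AirPassengers, beta = FALSE, gamma = FALSE)  # simple: level only
holt <- HoltWinters(AirPassengers, gamma = FALSE)                # double: level + trend
hw   <- HoltWinters(AirPassengers)                               # triple: + seasonality
plot(hw)                                     # fitted vs observed
plot(predict(hw, n.ahead = 24))              # two-year forecast
```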
Just use multivariate Box-Jenkins, duh.
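A hypothetical sketch of that route via the vars package (VAR is one standard multivariate option; series1 and series2 are placeholder names, not from the source):

```r
library(vars)                    # assumes the 'vars' package is installed
y   <- cbind(series1, series2)   # hypothetical pair of related series
fit <- VAR(y, p = 2)             # vector autoregression with 2 lags
predict(fit, n.ahead = 12)       # joint 12-step forecast
```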
Time Series Analysis: YouTube playlist by ritvikmath (quite haphazardly ordered)