We pick up from where we left off in Chapter 1. In this Chapter we will cover key concepts of Time Series Analysis:
Revisit Time Series Data Preparation (Transformation, Missing Value Treatment, Splitting into Train and Test)
White Noise Time Series
Random Walk Time Series
Stationarity
Seasonality
Autocorrelation
Partial Autocorrelation
Credits : 365 Data Science Team “Time Series Analysis in Python 2021”
Let us begin by preparing our Time Series data for analysis. We will be analyzing the S&P Index values.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.graphics.tsaplots as sgt
import statsmodels.tsa.stattools as sts
from statsmodels.tsa.seasonal import seasonal_decompose
import seaborn as sns
sns.set()
import warnings
warnings.filterwarnings('ignore')
# load data and make a copy
raw_csv_data = pd.read_csv("index_2018.csv")
df_comp=raw_csv_data.copy()
# convert date column to datetime
df_comp.date = pd.to_datetime(df_comp.date, dayfirst = True)
# set the index as date
df_comp.set_index("date", inplace=True)
# set the frequency as daily by business days
df_comp=df_comp.asfreq('b')
# fill missing values using forward fill
df_comp=df_comp.fillna(method='ffill')
# final view
df_comp.head()
## spx dax ftse nikkei
## date
## 1994-01-07 469.90 2224.95 3445.98 18124.01
## 1994-01-10 475.27 2225.00 3440.58 18443.44
## 1994-01-11 474.13 2228.10 3413.77 18485.25
## 1994-01-12 474.17 2182.06 3372.02 18793.88
## 1994-01-13 472.47 2142.37 3360.01 18577.26
# create a column market value which has copies of S&P only
df_comp['market_value']=df_comp.spx
# delete unwanted columns
del df_comp['spx']
del df_comp['dax']
del df_comp['ftse']
del df_comp['nikkei']
# final view
df_comp.head()
## market_value
## date
## 1994-01-07 469.90
## 1994-01-10 475.27
## 1994-01-11 474.13
## 1994-01-12 474.17
## 1994-01-13 472.47
# split into 80% train and 20% test
# create a variable with 80% size of the data
size = int(len(df_comp)*0.8)
# create the training and testing sets
# we will keep the name df for the training set and df_test for test set
df, df_test = df_comp.iloc[:size], df_comp.iloc[size:]
# view the training and testing data sets
df.tail()
## market_value
## date
## 2013-04-01 1562.173837
## 2013-04-02 1570.252238
## 2013-04-03 1553.686978
## 2013-04-04 1559.979316
## 2013-04-05 1553.278930
df_test.head()
## market_value
## date
## 2013-04-08 1563.071269
## 2013-04-09 1568.607909
## 2013-04-10 1587.731827
## 2013-04-11 1593.369863
## 2013-04-12 1588.854623
White Noise is a special type of time series where the data doesn't follow a pattern. Recall one of the assumptions we made when splitting our data into training and test sets: the patterns found in the past also persist in the future.
Since no pattern can be found in White Noise, we can't predict it.
For a Time Series to be categorized as White Noise, it must satisfy three conditions :
A constant mean μ : mean values are constant across intervals of the Time Series
A constant variance σ² : the variance is constant across intervals of the Time Series
No Autocorrelation : there is no clear relationship between past and present values of the time series
To summarize : White noise is a sequence of random data where every value has a time period associated with it. It behaves sporadically, so there is no way to successfully project it into the future.
In Financial Modeling, it is important to distinguish white noise data from regular time series data. We can easily tell the two apart by comparing their graphs.
We can generate white noise data and plot its values, then we can try plotting the graph of the S&P closing prices and compare the two.
Let us create a White Noise series and store the values in a variable called "WN".
To create WN, we can use the np.random.normal() method of the NumPy package. We want the White Noise sequence to be comparable to the S&P, so we set its mean and standard deviation to those of the actual set.
The np.random.normal() method takes three arguments :
loc (location) : the mean, in our case the mean of market_value
scale (spread) : the standard deviation of the distribution, in our case the standard deviation of market_value
size (how many) : the number of values we want to generate, in our case the length of the Time Series
np.random.seed(42)
# creating a White Noise TS based on the distribution of S&P Market value distribution
WN = (np.random.normal(loc = df.market_value.mean() ,
scale = df.market_value.std() ,
size = len(df)))
# add column WN to existing dataframe df
df['WN'] = WN
# view the dataframe
df.head()
## market_value WN
## date
## 1994-01-07 469.90 1236.970264
## 1994-01-10 475.27 1051.201420
## 1994-01-11 474.13 1281.139223
## 1994-01-12 474.17 1537.228455
## 1994-01-13 472.47 1023.148181
Let us look at the summary statistics of market_value and WN
# summary statistics
df.describe()
## market_value WN
## count 5021.000000 5021.000000
## mean 1091.651926 1093.238344
## std 292.559287 291.504159
## min 438.920000 143.389063
## 25% 914.620000 899.331480
## 50% 1144.650000 1095.405064
## 75% 1316.280729 1286.474250
## max 1570.252238 2240.309230
The mean and standard deviation of the White Noise are similar, but not exactly equal, to those of the S&P. The reason is that the White Noise values we generated are normally distributed around the mean of the S&P; however, since each one is generated individually, the sample statistics do not have to end up identical.
Let us plot the White Noise series.
# create the line chart of the white noise values
sns.lineplot(data = df.WN);
# add the title
plt.title("White-Noise-Time-Series" , size = 24);
# display the plot
plt.show()
We can see that there is no clear pattern. Let us plot both "White Noise" and "S&P" together.
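Below is a minimal sketch of that comparison, assuming the df built above (with the market_value and WN columns); plotting the two series in separate panels is an arbitrary choice that keeps their different behaviour readable.
# plot the S&P closing prices and the generated white noise in two panels
fig, (ax1, ax2) = plt.subplots(2, 1, figsize = (12, 6), sharex = True)
sns.lineplot(data = df.market_value, ax = ax1)
ax1.set_title("S&P Prices", size = 18)
sns.lineplot(data = df.WN, ax = ax2)
ax2.set_title("White Noise", size = 18)
plt.show()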
A random walk is a special type of time series where values tend to persist over time and the differences between periods are white noise.
Suppose we express prices with \(P\) and residuals with \(\epsilon\); then a Random Walk Time Series can be expressed as : \(P_t = P_{t-1} + \epsilon_t\).
The underlying assumption is that the residuals are white noise, so they are arbitrary and cannot be predicted.
This suggests that the best estimate for today's price is yesterday's price, and the best estimate for tomorrow's price is today's price.
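To make the formula concrete, here is a minimal simulation sketch of the recursion \(P_t = P_{t-1} + \epsilon_t\); the starting price of 1000 and the residual scale of 5 are arbitrary assumptions for illustration.
# simulate a random walk: P_t = P_{t-1} + eps_t, where eps_t is white noise
np.random.seed(42)
eps = np.random.normal(loc = 0, scale = 5, size = 1000)   # white noise residuals
prices = np.empty(1000)
prices[0] = 1000                                          # arbitrary starting price
for t in range(1, 1000):
    prices[t] = prices[t - 1] + eps[t]
plt.plot(prices)
plt.title("Simulated Random Walk")
plt.show()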
To get a better idea of what a random walk process looks like, let us load some data and plot it against the S&P prices for comparison. We have a random walk dataset that we will plot against the S&P 500.
We will load the given dataset, convert the date column to datetime type, set it as the index of the dataframe, and set the frequency to business days.
# load the random walk dataset
rw = pd.read_csv("rand_walk.csv")
print(rw.head())
## date price
## 0 7/1/1994 1122.139662
## 1 8/1/1994 1135.238562
## 2 9/1/1994 1109.897831
## 3 10/1/1994 1080.347860
## 4 11/1/1994 1082.095245
print(rw.info())
## <class 'pandas.core.frame.DataFrame'>
## RangeIndex: 7029 entries, 0 to 7028
## Data columns (total 2 columns):
## # Column Non-Null Count Dtype
## --- ------ -------------- -----
## 0 date 7029 non-null object
## 1 price 7029 non-null float64
## dtypes: float64(1), object(1)
## memory usage: 82.4+ KB
## None
# convert date column to datetime type
rw.date = pd.to_datetime(rw.date, dayfirst = True)
rw.info()
## <class 'pandas.core.frame.DataFrame'>
## RangeIndex: 7029 entries, 0 to 7028
## Data columns (total 2 columns):
## # Column Non-Null Count Dtype
## --- ------ -------------- -----
## 0 date 7029 non-null datetime64[ns]
## 1 price 7029 non-null float64
## dtypes: datetime64[ns](1), float64(1)
## memory usage: 109.9 KB
# set the index to date
rw.set_index("date", inplace = True)
rw.head()
## price
## date
## 1994-01-07 1122.139662
## 1994-01-08 1135.238562
## 1994-01-09 1109.897831
## 1994-01-10 1080.347860
## 1994-01-11 1082.095245
# define the frequency in this case business day
rw = rw.asfreq('b')
rw.head()
## price
## date
## 1994-01-07 1122.139662
## 1994-01-10 1080.347860
## 1994-01-11 1082.095245
## 1994-01-12 1083.639265
## 1994-01-13 1067.146255
We will add the RW column to our existing S&P indices dataframe and plot both the Time Series together
# add the rw values to existing dataframe df
df['rw'] = rw.price
df.head()
## market_value WN rw
## date
## 1994-01-07 469.90 1236.970264 1122.139662
## 1994-01-10 475.27 1051.201420 1080.347860
## 1994-01-11 474.13 1281.139223 1082.095245
## 1994-01-12 474.17 1537.228455 1083.639265
## 1994-01-13 472.47 1023.148181 1067.146255
Let us create a dataframe without WN
df_n_wn = df[['market_value' , 'rw']]
df_n_wn.head()
## market_value rw
## date
## 1994-01-07 469.90 1122.139662
## 1994-01-10 475.27 1080.347860
## 1994-01-11 474.13 1082.095245
## 1994-01-12 474.17 1083.639265
## 1994-01-13 472.47 1067.146255
# plot the S&P and the random walk together
sns.lineplot(data = df_n_wn);
plt.show()
The two time series look somewhat similar and both have small variations between consecutive time periods.
Both time series have cyclical increases and decreases in short periods of time.
Technically speaking, time series stationarity implies that consecutive samples of data of the same size have identical covariance, regardless of the starting point: Covariance(S1) = Covariance(S2), where S1 and S2 are intervals of observations of the same length.
Statistically speaking, a time series whose statistical properties, such as mean, variance, etc., remain constant over time is called a stationary time series. The statistical properties of a stationary time series are independent of the point in time at which the observations are recorded. More precisely, if \(\{y_t\}\) is a stationary time series, then for all \(s\), the distribution of \((y_t, \ldots, y_{t+s})\) does not depend on \(t\).
In general, a stationary time series will have no predictable patterns in the long-term.
Time plots will show the series to be roughly horizontal, with constant variance (https://otexts.com/fpp2/stationarity.html).
We can classify a time series as covariance stationary if it satisfies three key assumptions :
Constant Mean \(\mu\)
Constant Variance \(\sigma^2\)
Covariance that depends only on the distance between periods : Covariance(S1) = Covariance(S2), e.g. \(\text{Cov}(x_1, x_4) = \text{Cov}(x_3, x_6)\)
Covariance between the 1st and 4th period = Covariance between the 3rd and 6th period
because the distance between \(x_1\) and \(x_4\) equals the distance between \(x_3\) and \(x_6\)
An example of a Stationary Time Series is "White Noise", which has a constant mean and a constant variance.
Since Covariance = Correlation × \(\sigma_x \sigma_y\), and for white noise the correlation between any two distinct time periods is 0, the covariance between those periods is 0 as well.
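Before turning to a formal test, a quick visual check of these assumptions is to compare rolling statistics of the S&P prices and the White Noise. The sketch below is only an illustration, and the 250-day window (roughly one trading year) is an arbitrary choice.
# rolling mean and standard deviation as an informal stationarity check
window = 250   # ~1 trading year of business days (arbitrary choice)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize = (12, 6), sharex = True)
df.market_value.rolling(window).mean().plot(ax = ax1, label = "rolling mean")
df.market_value.rolling(window).std().plot(ax = ax1, label = "rolling std")
ax1.set_title("S&P: rolling statistics drift over time")
ax1.legend()
df.WN.rolling(window).mean().plot(ax = ax2, label = "rolling mean")
df.WN.rolling(window).std().plot(ax = ax2, label = "rolling std")
ax2.set_title("White Noise: rolling statistics stay roughly constant")
ax2.legend()
plt.show()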
In the 20th century, statisticians David Dickey and Wayne Fuller developed a test to help us determine whether Time Series data comes from a stationary or a non-stationary process.
The Augmented Dickey-Fuller (ADF) test is a type of statistical test applied on the Time Series data to assess stationarity.
Null Hypothesis (H0) : Series is non-stationary {the one-lag autocorrelation coefficient \(\phi_1 = 1\), i.e. a unit root is present}
Alternate Hypothesis (HA) : Series is stationary {the one-lag autocorrelation coefficient \(\phi_1 < 1\)}
If the Test Statistic < Critical Value and the p-value < 0.05, we reject the Null Hypothesis (H0), i.e., the time series data comes from a stationary process.
Let us conduct the ADF test on “market_value” , “white noise” and “random walk” data we have in our data set
The statsmodels package provides a reliable implementation of the ADF test via the adfuller() function in statsmodels.tsa.stattools. It returns the following outputs:
The value of the test statistic
The p-value
The number of lags considered for the test
The number of observations used
The critical value cut-offs
When the test statistic is lower than the critical value shown, you reject H0 and infer that the time series is stationary.
Let us run the ADF test on market value
sts.adfuller(df.market_value)
## (-1.7369847452352445, 0.4121645696770617, 18, 5002, {'1%': -3.431658008603046, '5%': -2.862117998412982, '10%': -2.567077669247375}, 39904.880607487445)
Let us create a small function to print out results in customized manner
# Function to print out results in a customised manner
from statsmodels.tsa.stattools import adfuller

def adf_test(timeseries):
    print('Results of Augmented-Dickey-Fuller Test:')
    adftest = adfuller(timeseries, autolag='AIC')
    adfoutput = pd.Series(adftest[0:4], index=['Test Statistic', 'p-value',
                                               '#Lags Used', 'Number of Observations Used'])
    for key, value in adftest[4].items():
        adfoutput['Critical Value (%s)' % key] = value
    print(adfoutput)
Let us run the ADF Test on "market_value", "WN" and "rw".
# Call the function and run the test on market value
adf_test(df['market_value'])
## Results of Augmented-Dickey-Fuller Test:
## Test Statistic -1.736985
## p-value 0.412165
## #Lags Used 18.000000
## Number of Observations Used 5002.000000
## Critical Value (1%) -3.431658
## Critical Value (5%) -2.862118
## Critical Value (10%) -2.567078
## dtype: float64
Inference : Market Value comes from a “Non-Stationary Time Series”
The Test Statistic (-1.737) > Critical Value (-2.862) at the 5% significance level, and the p-value of the test statistic (0.412) > 0.05.
We fail to reject the Null Hypothesis and conclude that we do not have sufficient evidence that the S&P Index data comes from a "Stationary Time Series". If we rejected H0, there would be roughly a 41% chance of doing so in error.
The number of lags used is 18, implying that there is some autocorrelation going back 18 periods.
Let us test if “White Noise Comes from a Stationary Time Series”
# Call the function and run the test on white noise
adf_test(df['WN'])
## Results of Augmented-Dickey-Fuller Test:
## Test Statistic -71.797823
## p-value 0.000000
## #Lags Used 0.000000
## Number of Observations Used 5020.000000
## Critical Value (1%) -3.431653
## Critical Value (5%) -2.862116
## Critical Value (10%) -2.567077
## dtype: float64
Inference : White Noise comes from Stationary Time Series
The Test Statistic (-71.798) < Critical Value (-2.862) at the 5% significance level (indeed below even the 1% critical value of -3.432), and the p-value of the test statistic (~0) < 0.05.
We reject the Null Hypothesis and conclude that we have sufficient evidence that the White Noise data comes from a "Stationary Time Series". The chance of rejecting H0 in error is essentially 0%.
# Call the function and run the test on random walk
adf_test(df['rw'])
## Results of Augmented-Dickey-Fuller Test:
## Test Statistic -1.328607
## p-value 0.615985
## #Lags Used 24.000000
## Number of Observations Used 4996.000000
## Critical Value (1%) -3.431660
## Critical Value (5%) -2.862119
## Critical Value (10%) -2.567078
## dtype: float64
Inference : Random Walk comes from a “Non-Stationary Time Series”
The Test Statistic (-1.329) > Critical Value (-2.862) at the 5% significance level, and the p-value of the test statistic (0.616) > 0.05.
We fail to reject the Null Hypothesis and conclude that we do not have sufficient evidence that the Random Walk data comes from a "Stationary Time Series". If we rejected H0, there would be roughly a 62% chance of doing so in error.
Seasonality occurs when Time Series data exhibits regular and predictable patterns at time intervals that are smaller than a year. An example of a time series with seasonality is retail sales, which often increase between September and December and decrease between January and February. Another example is the increase in water consumption in summer due to warmer weather. Seasonality is quite common in economic time series but less common in engineering and scientific data.
Seasonal effects are different from cyclical effects, as seasonal cycles are observed within one calendar year, while cyclical effects, such as boosted sales due to low unemployment rates, can span time periods shorter or longer than one calendar year (https://www.investopedia.com/terms/s/seasonality.asp).
There are several ways of testing a Time Series for seasonality. One such method involves decomposing the Time Series. Decomposition procedures are used in time series analysis to describe the trend and seasonal factors in a time series. More extensive decompositions might also include long-run cycles, holiday effects, day-of-week effects and so on.
One of the main objectives for a decomposition is to estimate seasonal effects that can be used to create and present seasonally adjusted values. A seasonally adjusted value removes the seasonal effect from a value so that trends can be seen more clearly. For instance, in many regions of the U.S. unemployment tends to decrease in the summer due to increased employment in agricultural areas. Thus a drop in the unemployment rate in June compared to May doesn’t necessarily indicate that there’s a trend toward lower unemployment in the country. To see whether there is a real trend, we should adjust for the fact that unemployment is always lower in June than in May.
One of the most common methods is to decompose the Time Series into Three Components :
TREND Tt : Trend is defined as the ‘long term’ movement in a time series without calendar related and irregular effects, and is a reflection of a pattern consistent throughout the data
SEASONAL St: A seasonal effect is a systematic and calendar related effect
RESIDUAL Rt : The error of prediction, or the difference between the actual data and the model we fit. This is also the remainder component after we have decomposed into trend and seasonality
The simplest type of decomposition is called naive, wherein we expect a linear relationship between the three parts and the observed Time Series. Furthermore, we have two types of naive decompositions :
Additive \(x_t = T_t + S_t + R_t\) : the additive model assumes that, for any time period, the observed value is the sum of the trend, seasonal and residual components for that period. The additive model is useful when the seasonal variation is relatively constant over time.
Multiplicative \(x_t = T_t \times S_t \times R_t\) : the multiplicative decomposition assumes the original series is the product of the trend, seasonal and residual values. The multiplicative model is useful when the seasonal variation increases over time. When the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series, a multiplicative decomposition is more appropriate. Multiplicative decompositions are common with economic time series.
Look at the plot referenced below, where in the case of "Additive Seasonality" the amplitude of the seasonal variation is independent of the level, while in the case of "Multiplicative Seasonality" the seasonal variation and the level are connected, i.e. the seasonal variations become "wider" (https://kourentzes.com/forecasting/2014/11/09/additive-and-multiplicative-seasonality/). A small simulated illustration follows the legend below.
blue dotted lines : variation in magnitude / amplitude
red solid line : trend component
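The referenced figure is not reproduced here, but the toy simulation below sketches the same idea; the trend, seasonal amplitude and 12-period cycle are arbitrary assumptions chosen purely for illustration.
# simulated illustration of additive vs multiplicative seasonality
t = np.arange(120)                           # 10 "years" of monthly points
trend = 100 + t                              # linear upward trend
cycle = np.sin(2 * np.pi * t / 12)           # 12-period seasonal cycle
additive = trend + 10 * cycle                # seasonal amplitude stays constant
multiplicative = trend * (1 + 0.1 * cycle)   # seasonal amplitude grows with the level
fig, (ax1, ax2) = plt.subplots(1, 2, figsize = (12, 4))
ax1.plot(additive)
ax1.set_title("Additive seasonality")
ax2.plot(multiplicative)
ax2.set_title("Multiplicative seasonality")
plt.show()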
The statsmodels package includes a function called seasonal_decompose that takes a Time Series and splits it up into the three parts.
seasonal_decompose(df.ts, model = "xyz") : ts is the Time Series, and model is either "additive" or "multiplicative".
First, we store the output of the seasonal_decompose method in a variable s_dec.
Second , we apply the .plot() method on the variable which will render a graphical visualization of the decomposition
# import seasonal decompose from statsmodel
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt
# additive decomposition
s_dec_add = seasonal_decompose(df.market_value , model = "additive")
# plot the results
s_dec_add.plot()
plt.show()
The 1st plot is the observed Time Series without decomposition, i.e. the actual Time Series.
The 2nd plot is the Trend component of the Time Series. The trend closely resembles the observed series; that's because the decomposition function uses the previous period values as a trendsetter. The trend part of the decomposition explains most of the variability of the data.
The 3rd plot is the Seasonal component of the Time Series, which looks like a rectangle. This happens when the values are constantly oscillating and the figure size is too small. In our case, the rectangular appearance results from constantly switching up and down between (-0.2) and (+0.1) for every period. Therefore, there is no concrete cyclical pattern determined by using naive decomposition.
The 4th plot is the Residual / Error / Random component of the Time Series, which is the difference between true values and predictions for any time period. The residuals vary greatly around the turn of the century (2000 : dot-com bust) and around 2008 (housing price bubble), which explains the instability in the Time Series.
Inference : The results of the additive decomposition suggest no seasonality in the data
# import seasonal decompose from statsmodel
from statsmodels.tsa.seasonal import seasonal_decompose
import matplotlib.pyplot as plt
# multiplicative decomposition
s_dec_mult = seasonal_decompose(df.market_value , model = "multiplicative")
# plot the results
s_dec_mult.plot()
plt.show()
Inference : The seasonal sequence has no clear pattern, and the trend closely resembles the observed series.
Examining the "Additive" and "Multiplicative" decompositions, we find no evidence of seasonality in the S&P prices.
We know that we avoid shuffling time series data because we want to preserve the chronological order of the set to discover links between past and present values within the Time Series.
We are interested in the relationship between the entries for consecutive periods, T and T-1.
Correlation \(\rho(x, y)\) measures the similarity in the change of values of two series x and y. In the case of Time Series data we have a single variable, so to calculate the similarity in the change through time for a single series we introduce the concept of Autocorrelation. Autocorrelation measures the correlation between the sequence and itself; to be more precise, it measures the level of resemblance between a sequence from several periods ago and the actual data.
Just as correlation measures the extent of a linear relationship between two variables, autocorrelation measures the linear relationship between lagged values of a time series.
Mathematically speaking , autocorrelation is a representation of the degree of similarity between a given time series and a lagged version of itself over successive time intervals. It’s conceptually similar to the correlation between two different time series, but autocorrelation uses the same time series twice: once in its original form and once lagged one or more time periods.
For example, if it’s rainy today, the data suggests that it’s more likely to rain tomorrow than if it’s clear today. When it comes to investing, a stock might have a strong positive autocorrelation of returns, suggesting that if it’s “up” today, it’s more likely to be up tomorrow, too.
If we find the autocorrelation for a Time Series with daily frequency, we're determining how much yesterday's values resemble today's values. If the frequency is instead annual, autocorrelation will measure the similarities from year to year.
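As a quick numerical sketch of this idea (assuming the df built earlier), pandas' Series.autocorr() returns the correlation of a series with a lagged copy of itself.
# lag-1 autocorrelation: correlation of the series with itself shifted by one period
print(df.market_value.autocorr(lag = 1))   # close to 1: prices strongly resemble yesterday's prices
print(df.WN.autocorr(lag = 1))             # close to 0: white noise has no memory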
In Time Series analysis, it is vital to compute and compare autocorrelation values between different lags. To do so, we introduce the autocorrelation function, or ACF for short. The ACF computes the autocorrelation values for all the lags we are interested in simultaneously.
\(\rho(X_t, X_{t-1})\) — Autocorrelation with 1 Lag
\(\rho(X_t, X_{t-2})\) — Autocorrelation with 2 Lags
\(\rho(X_t, X_{t-3})\) — Autocorrelation with 3 Lags
The stats model’s graphics, TSA plots package contains a method plot_acf() for plotting the auto correlation function.The various parameters of the plot_acf() are illustrated below
Let us visualize the ACF plot of the S&P index values over 40 lags
# import the time series graphics package
import statsmodels.graphics.tsaplots as sgt
# plot the ACF (correlogram) over 40 lags
sgt.plot_acf(df.market_value, lags = 40, zero = False)
plt.title("ACF S&P", size = 24)
## Text(0.5, 1.0, 'ACF S&P')
plt.show()
Interpreting the ACF plot : the coefficients for the closing prices are all positive and lie well outside the blue significance region, indicating significant autocorrelation at every one of the 40 lags.
Let us plot the ACF for White Noise
# import the time series graphics package
import statsmodels.graphics.tsaplots as sgt
# plot the ACF (correlogram) over 40 lags
sgt.plot_acf(df.WN, lags = 40, zero = False)
plt.title("ACF White Noise", size = 20)
## Text(0.5, 1.0, 'ACF White Noise')
plt.show()
Interpreting the ACF Plot for White Noise :
Coefficient values exhibit patterns of positive and negative autocorrelation, which contrasts with the ACF for the closing prices, where all values were positive.
All the vertical lines fall within the "Blue Shaded Area", implying that none of the lag coefficients is significant. This suggests that there is no autocorrelation for any lag, which is one of the assumptions of white noise.
The partial autocorrelation function (PACF) is similar to the ACF except that it displays only the correlation between two observations that the shorter lags between those observations do not explain. For example, the partial autocorrelation for lag 3 is only the correlation that lags 1 and 2 do not explain. In other words, the partial correlation for each lag is the unique correlation between those two observations after partialling out the intervening correlations.
https://statisticsbyjim.com/time-series/autocorrelation-partial-autocorrelation/
The partial autocorrelation at lag k is the correlation that results after removing the effect of any correlations due to the terms at shorter lags.
The autocorrelation between an observation (t) and an observation at a prior time step (t-n) comprises both the direct correlation between (t) and (t-n) and indirect correlations that run through the intervening observations (t-1), (t-2), …, (t-n+1). These indirect correlations are a linear function of the correlation of the observation with observations at intervening time steps. It is these indirect correlations that the partial autocorrelation function seeks to remove.
You can put PACF to very effective use for the following use cases:
To determine how many past lags to include in the forecasting equation of an auto-regressive model. This is known as the Auto-Regression (AR) order of the model.We will dwell on this when we build TS forecasting models.
To determine, or to validate, how many seasonal lags to include in the forecasting equation of a moving average based forecast model for a seasonal time series. This is known as the Seasonal Moving Average (SMA) order of the process.We will discuss this later on.
Let us say that prices today (t) are affected by prices 3 days ago, i.e. we are examining the correlation coefficient for the 3rd lag (lag = 3, at t-3). This is the direct correlation. However, indirect effects also exist: prices three days ago (t-3) affect the values at two and one day ago, i.e. (t-2) and (t-1), and the prices at (t-1) and (t-2) can in turn directly affect prices today (t).
To simplify : the partial autocorrelation at lag 3 returns the correlation between the time series at time t and its value 3 periods earlier (t-3), but only after removing all effects attributable to lags 1 and 2.
If we wish to determine only the direct relationship between the Time Series and its lagged version , we need to compute the partial autocorrelation.
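Before plotting, we can also obtain the partial autocorrelation coefficients numerically; below is a minimal sketch using the pacf() function from statsmodels.tsa.stattools (the choice of 5 lags is an arbitrary assumption).
# numeric partial autocorrelation values for the first few lags
from statsmodels.tsa.stattools import pacf
pacf_values = pacf(df.market_value, nlags = 5, method = 'ols')
print(pacf_values)   # element k is the partial autocorrelation at lag k (element 0 is 1)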
The stats model’s graphics, TSA plots package contains a method plot_pacf() for plotting the partial auto correlation function.The various parameters of the plot_pacf() are illustrated below
There are several ways of computing the ACF, we need to define what method we want to use.We will rely on the order of least squares (OLS), hence we set the method argument equal to OLS , the reason being we are performing a regression of time series on lags of it.
If partial autocorrelation values are close to 0, then the observations and lagged observations are not correlated with one another. Inversely, partial autocorrelations with values close to 1 or -1 indicate strong positive or negative correlations between the lagged observations of the time series.
Let us visualize the PACF plot of the S&P index values over 40 lags
# import the time series graphics package
import statsmodels.graphics.tsaplots as sgt
# plot the PACF (correlogram) over 40 lags with 'OLS'
sgt.plot_pacf(df.market_value, lags = 40, zero = False, method = 'ols')
plt.title("PACF S&P", size = 24)
## Text(0.5, 1.0, 'PACF S&P')
plt.show()
Interpreting a PACF Plot :
Values on the X axis represent lags, which go up to 40 since we set lags = 40.
Values on the Y axis indicate possible values for the partial autocorrelation coefficient, which vary between -1 and +1.
Each vertical line represents the partial correlation coefficient between the original Time Series and the corresponding lagged copy of the TS.
The shaded blue area around the x axis represents significance of the Partial Autocorrelation. Coefficient values situated outside the “Blue Shaded Area” are significantly different from zero, which suggests the existence of partial autocorrelation for that specific lag.
Only the first several elements are significantly different from zero
Some of the values, like the 9th lag, are negative; this means that higher values nine periods ago are associated with lower values today, and vice versa.
https://online.stat.psu.edu/stat510/lesson/2/2.2#paragraph–267
Identification of an AR model is often best done with the PACF.
For an AR model, the theoretical PACF "shuts off" past the order of the model. The phrase "shuts off" means that in theory the partial autocorrelations are equal to 0 beyond that point. Put another way, the number of non-zero partial autocorrelations gives the order of the AR model. By the "order of the model" we mean the most extreme lag of x that is used as a predictor.
For such a series, the sample PACF typically shows that the first lag value is statistically significant, whereas the partial autocorrelations for all other lags are not; this suggests a possible AR(1) model for the data.
In the partial autocorrelation plot discussed in the source above, statistically significant partial autocorrelations appear at lag values 0, 1, 4, 5 and 6.
Identification of an MA model is often best done with the ACF rather than the PACF.
For an MA model, the theoretical PACF does not shut off, but instead tapers toward 0 in some manner. A clearer pattern for an MA model is in the ACF. The ACF will have non-zero autocorrelations only at lags involved in the model.
Consider a sample ACF for a simulated MA(1) series: the first lag autocorrelation is statistically significant, whereas all subsequent autocorrelations are not. This suggests a possible MA(1) model for the data. A small simulation illustrating both identification rules follows.
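Since the referenced figures are not reproduced here, the sketch below simulates an AR(1) and an MA(1) series with statsmodels' ArmaProcess and plots their ACF and PACF side by side; the coefficients of 0.7 and the sample size of 1000 are arbitrary choices for illustration.
# simulate AR(1) and MA(1) series to illustrate the identification rules above
from statsmodels.tsa.arima_process import ArmaProcess
np.random.seed(42)
ar1 = ArmaProcess(ar = [1, -0.7], ma = [1]).generate_sample(nsample = 1000)   # AR(1) with phi = 0.7
ma1 = ArmaProcess(ar = [1], ma = [1, 0.7]).generate_sample(nsample = 1000)    # MA(1) with theta = 0.7
fig, axes = plt.subplots(2, 2, figsize = (12, 6))
sgt.plot_pacf(ar1, lags = 20, zero = False, method = 'ols', ax = axes[0, 0], title = "PACF AR(1): cuts off after lag 1")
sgt.plot_acf(ar1, lags = 20, zero = False, ax = axes[0, 1], title = "ACF AR(1): tapers off")
sgt.plot_pacf(ma1, lags = 20, zero = False, method = 'ols', ax = axes[1, 0], title = "PACF MA(1): tapers off")
sgt.plot_acf(ma1, lags = 20, zero = False, ax = axes[1, 1], title = "ACF MA(1): cuts off after lag 1")
plt.tight_layout()
plt.show()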
White noise is a sequence of random data where every value has a time period associated with it. It behaves sporadically, so there is no way to successfully project it into the future.
A Random walk is a special type of time series where values tend to persist over time and the differences between periods are white noise.
A Stationary Time Series is a time series whose statistical properties, such as mean and variance, remain constant over time: constant mean \(\mu\), constant variance \(\sigma^2\), and Covariance(S1) = Covariance(S2) for intervals of the same length.
The Augmented Dickey-Fuller (ADF) test is a statistical test applied to Time Series data to assess stationarity. Null Hypothesis (H0) : the series is non-stationary {the one-lag autocorrelation coefficient \(\phi_1 = 1\)}, with an Alternate Hypothesis (HA) : the series is stationary {\(\phi_1 < 1\)}.
Seasonality occurs when Time Series data exhibits regular and predictable patterns at time intervals that are smaller than a year. Decomposing the Time Series is one way of identifying seasonality. We decompose the Time Series into Trend ("long-term movement"), Seasonal ("a systematic and calendar-related effect") and Residual / Random ("the remainder component after we have decomposed into trend and seasonality").
Autocorrelation measures the correlation between the sequence and itself; to be more precise, it measures the level of resemblance between a sequence from several periods ago and the actual data. Autocorrelation is a representation of the degree of similarity between a given time series and a lagged version of itself over successive time intervals. The autocorrelation function (ACF) computes the autocorrelation values for all the lags we are interested in simultaneously.
The partial autocorrelation function (PACF) is similar to the ACF except that it displays only the correlation between two observations that the shorter lags between those observations do not explain. The partial autocorrelation at lag 3, for example, returns the correlation between the time series at time t and its value 3 periods earlier (t-3), but only after removing all effects attributable to lags 1 and 2.