Basics of time series analysis

Isai Guizar

Disclaimer: This document is intended for educational purposes only. It is part of my Econometrics courses at Tec de Monterrey.


In this document we explore the foundational concepts of time series analysis using Mexico’s quarterly GDP data. We begin by visualizing the series and decomposing it into its fundamental components (trend/cycle, seasonality, and remainder), illustrating the additive structure commonly used in economic data. We then analyze the seasonally adjusted series, which removes regular seasonal fluctuations so that the underlying trend is easier to observe. Next, we introduce lag operations and the transformations used to prepare data for modeling: first differences, seasonal differences (lag 4), arithmetic growth rates, and logarithmic growth rates. These transformations help stabilize the mean and variance, making the series more suitable for further analysis. The simple moving average is also covered as a smoothing technique. Finally, we study autocorrelation through the autocorrelation function (ACF) and the partial autocorrelation function (PACF). Together, these tools lay the groundwork for understanding, transforming, and modeling time series data effectively.


1 Time series components

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
# Requires: pip install pandas-datareader
from pandas_datareader import data as pdr


GDP = pdr.DataReader('NGDPRNSAXDCMXQ', 'fred', start='1990Q1', end = '2024Q4')
GDP = GDP.rename(columns={'NGDPRNSAXDCMXQ': 'Values'})
GDP = GDP.reset_index()
GDP = GDP.rename(columns={'DATE': 'Date'})

GDP['Values'] = GDP['Values']/1000 # Billions (thousand millions) of pesos
GDP['Date'] = pd.to_datetime(GDP['Date'])

decomp = sm.tsa.seasonal_decompose(GDP['Values'], model='additive', period=4)

1.1 Original time series

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Values'])
plt.xlabel('Date')
plt.ylabel('Billions of Mexican Pesos')
plt.title('Mexico Quarterly GDP: 1990 to 2024')
plt.tight_layout()
# plt.grid(True)
plt.show()

1.2 Trend/cycle

plt.figure(figsize=(7, 5))
decomp.trend.plot(title='Trend/Cycle Component', ylabel='Trend')
# plt.grid(True)
plt.tight_layout()
plt.show()

1.3 Seasonality

plt.figure(figsize=(7, 5))
decomp.seasonal.plot(title='Seasonal Component', ylabel='Seasonality')
# plt.grid(True)
plt.tight_layout()
plt.show()

1.4 Remainder

plt.figure(figsize=(7, 5))
decomp.resid.plot(title='Remainder Component', ylabel='Residual')
# plt.grid(True)
plt.tight_layout()
plt.show()

fig, axes = plt.subplots(4, 1, figsize=(7, 7), sharex=True)

GDP['Values'].plot(ax=axes[0], title='Mexico Quarterly GDP', ylabel='Level')
decomp.trend.plot(ax=axes[1], title='Trend/Cycle')
decomp.seasonal.plot(ax=axes[2], title='Seasonality')
decomp.resid.plot(ax=axes[3], title='Remainder')

for ax in axes:
    ax.grid(False)
    
plt.tight_layout()
plt.show()
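
To make the additive structure concrete, the following is a minimal sketch of a classical additive decomposition done by hand on a synthetic quarterly series. The numbers and the names `seasonal_pattern`, `quarter_means`, `trend`, `seasonal`, and `resid` are illustrative assumptions, not GDP data. The steps (a centered 2x4 moving average for the trend, then quarter-by-quarter averages of the detrended series) follow the textbook procedure; `seasonal_decompose` is built on the same idea, although its implementation details may differ.

```python
import numpy as np
import pandas as pd

# Synthetic quarterly series: linear trend plus a fixed seasonal pattern
# (hypothetical numbers chosen for illustration only)
n = 40
t = np.arange(n)
quarter = t % 4
seasonal_pattern = np.array([-3.0, 1.0, -1.0, 3.0])  # sums to zero
y = pd.Series(100 + 0.5 * t + seasonal_pattern[quarter])

# Trend/cycle: centered 2x4 moving average
# (a 4-term average, then a 2-term average, shifted back to center it)
trend = y.rolling(window=4).mean().rolling(window=2).mean().shift(-2)

# Seasonality: average the detrended series by quarter, then center at zero
detrended = y - trend
quarter_means = detrended.groupby(quarter).mean()
quarter_means = quarter_means - quarter_means.mean()
seasonal = pd.Series(quarter_means.to_numpy()[quarter], index=y.index)

# Remainder: whatever trend and seasonality do not explain
resid = y - trend - seasonal

print(quarter_means.round(3).tolist())  # estimated seasonal effect per quarter
print(resid.abs().max())                # close to zero for this noiseless series
```

Because the synthetic series has no noise, the estimated seasonal effects should recover `seasonal_pattern` up to floating-point error, and the three components add back to the original series wherever the trend is defined. The first and last two observations have no trend value because the centered average needs two quarters on each side.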

1.5 Seasonally adjusted

GDP['Deseasonalized'] = GDP['Values'] - decomp.seasonal

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Values'], label='GDP', linewidth=2,  color = 'steelblue')
plt.plot(GDP['Date'], GDP['Deseasonalized'], label='Seasonally Adjusted',color = 'slategray', linewidth=2)
plt.xlabel('Date')
plt.ylabel('GDP (Billions of Pesos)')
plt.title('Original vs. Seasonally Adjusted GDP')
plt.legend()
# plt.grid(True)
plt.tight_layout()
plt.show()

2 Differences, growth, and moving averages

2.1 First differences


GDP['First_Diff'] = GDP['Values'].diff(1)

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['First_Diff'])
plt.title('First Difference of GDP')
plt.xlabel('Date')
plt.ylabel('Billions of Mexican Pesos')
plt.tight_layout()
plt.show()

2.2 Seasonal differences

GDP['Quarter_Diff'] = GDP['Values'].diff(4)

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Quarter_Diff'])
plt.title('Seasonal (Lag-4) Difference of GDP')
plt.xlabel('Date')
plt.ylabel('Billions of Mexican Pesos')
plt.tight_layout()
plt.show()
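
Both differences are built from the lag operation. As a small sketch on a hypothetical six-observation series (the values are made up for illustration), `shift(k)` returns the series lagged by k periods, and subtracting it from the original reproduces `diff(k)`:

```python
import pandas as pd

# Small hypothetical series (not GDP data)
y = pd.Series([10.0, 12.0, 15.0, 11.0, 14.0, 16.0])

lag1 = y.shift(1)               # y_{t-1}: the series lagged one period
first_diff = y - lag1           # y_t - y_{t-1}, same as y.diff(1)
seasonal_diff = y - y.shift(4)  # y_t - y_{t-4}, same as y.diff(4)

table = pd.DataFrame({
    'y': y,
    'lag1': lag1,
    'first_diff': first_diff,
    'seasonal_diff': seasonal_diff,
})
print(table)
```

The first difference loses one observation and the seasonal difference loses four, which is why the corresponding columns start with missing values.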

2.3 Growth rate

GDP['Growth'] = GDP['Values'].pct_change(1) * 100

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Growth'])
plt.title('Quarterly Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()

GDP['Log_Growth'] = np.log(GDP['Values']).diff() * 100

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Log_Growth'])
plt.title('Quarterly Log Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()
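
The two growth measures are close when changes are small and drift apart as changes get larger. A brief sketch with hypothetical levels (chosen so that the last step is a 20 percent jump) shows the size of the gap:

```python
import numpy as np
import pandas as pd

# Hypothetical levels (not GDP data); the last step is a 20% increase
y = pd.Series([100.0, 102.0, 101.0, 121.2])

arith = y.pct_change(1) * 100   # (y_t / y_{t-1} - 1) * 100
log_g = np.log(y).diff() * 100  # (ln y_t - ln y_{t-1}) * 100
gap = arith - log_g             # never negative, since x >= ln(1 + x)

print(pd.DataFrame({
    'arithmetic': arith.round(3),
    'log': log_g.round(3),
    'gap': gap.round(3),
}))
```

For the 2 percent change the two measures differ by about 0.02 percentage points, while for the 20 percent change the log growth rate is roughly 18.2 percent. For quarterly GDP, where changes are usually a few percent, the approximation is typically tight.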

Note that:

\[ \begin{align*}\Delta_4\% y_t &\approx \left(\ln y_t - \ln y_{t-4}\right)\times 100 \\ &= \left(\ln y_{t} - \ln y_{t-1} + \ln y_{t-1} - \ln y_{t-2} + \ln y_{t-2} - \ln y_{t-3} + \ln y_{t-3} - \ln y_{t-4}\right)\times 100 \\ &= \Delta\% y_{t} + \Delta\% y_{t-1} + \Delta\% y_{t-2} + \Delta\% y_{t-3}\end{align*} \]

So the inter-annual growth rate can be calculated as:

GDP['Year_Growth'] = np.log(GDP['Values']).diff(4) * 100

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Year_Growth'])
plt.title('Inter-Annual Growth of GDP')
plt.xlabel('Date')
plt.ylabel('Percent')
plt.tight_layout()
plt.show()
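
The telescoping sum in the derivation can be checked numerically. The sketch below uses a simulated positive series (an exponentiated random walk with arbitrary, assumed parameters) rather than the GDP data, and compares the lag-4 log difference with the rolling sum of four quarterly log growth rates:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical positive series: exponentiated random walk (not GDP data)
y = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.005, 0.02, size=24))))

quarterly = np.log(y).diff() * 100                   # quarter-on-quarter log growth
annual = np.log(y).diff(4) * 100                     # inter-annual log growth
annual_from_sum = quarterly.rolling(window=4).sum()  # sum of the last four quarters

# The two series agree up to floating-point error
print(np.allclose(annual.dropna(), annual_from_sum.dropna()))
```

This additivity holds exactly for log growth rates. For arithmetic growth rates it is only an approximation, because those compound multiplicatively.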

3 Moving average smoothing

\[ y^n_t \equiv \tfrac{1}{n}\left(y_t + y_{t-1} + y_{t-2} + \cdots + y_{t-n+1}\right) \]

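When n equals the number of periods in a year (n = 4 for quarterly data), every window contains each season exactly once, so the average eliminates the seasonality in the data and leaves a smooth trend-cycle component.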

GDP['MA_4'] = GDP['Values'].rolling(window=4).mean()

plt.figure(figsize=(7, 5))
plt.plot(GDP['Date'], GDP['Values'], label='Original GDP', color = 'steelblue', linewidth=2)
plt.plot(GDP['Date'], GDP['MA_4'], label='4-Period Moving Average', color = 'slategray', linewidth=2)
plt.xlabel('Date')
plt.ylabel('GDP (Billions of Pesos)')
plt.title('GDP and 4-Period Moving Average')
plt.legend()
plt.tight_layout()
plt.show()
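
As a sketch of why a four-quarter window removes a quarterly seasonal pattern, consider a synthetic series made of a linear trend plus a seasonal pattern that sums to zero over the year (the numbers are hypothetical). The rolling mean matches the manual four-term average and contains no seasonal swings:

```python
import numpy as np
import pandas as pd

# Synthetic quarterly series: linear trend plus seasonal pattern (not GDP data)
t = np.arange(20)
seasonal_pattern = np.array([-3.0, 1.0, -1.0, 3.0])  # sums to zero over a year
y = pd.Series(100 + 0.5 * t + seasonal_pattern[t % 4])

ma4 = y.rolling(window=4).mean()

# Manual version: average of y_t, y_{t-1}, y_{t-2}, y_{t-3}
manual = (y + y.shift(1) + y.shift(2) + y.shift(3)) / 4

print(pd.DataFrame({'y': y, 'ma4': ma4, 'manual': manual}).head(8))
```

The smoothed series rises by a constant 0.5 per quarter, which is the slope of the trend. Because this is a trailing (not centered) average, it equals the trend evaluated 1.5 quarters earlier, so the smoothed line lags the original series slightly.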

4 Autocorrelation

rho1 = GDP['Values'].autocorr(lag=1)
rho2 = GDP['Values'].autocorr(lag=2)
rho3 = GDP['Values'].autocorr(lag=3)
rho4 = GDP['Values'].autocorr(lag=4)

# Store results
autocorr_values = {
    'rho 1': rho1,
    'rho 2': rho2,
    'rho 3': rho3,
    'rho 4': rho4
}

autocorr_table = pd.DataFrame.from_dict(autocorr_values, orient='index', columns=['Autocorrelation'])
autocorr_table.index.name = 'Lag'

autocorr_table
Lag      Autocorrelation
rho 1    0.978707
rho 2    0.973389
rho 3    0.964187
rho 4    0.967040
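
The pandas method `Series.autocorr(lag=k)` is the Pearson correlation between the series and a copy of itself shifted by k periods, computed over the overlapping observations. The sketch below verifies this on a simulated random walk (hypothetical data, with `lag_corr` as an illustrative helper name):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
y = pd.Series(np.cumsum(rng.normal(size=200)))  # hypothetical random walk

def lag_corr(series, k):
    """Pearson correlation between y_t and y_{t-k} over the overlapping sample."""
    pair = pd.concat([series, series.shift(k)], axis=1).dropna()
    return np.corrcoef(pair.iloc[:, 0], pair.iloc[:, 1])[0, 1]

# Manual correlation next to the pandas result for lags 1 to 4
for k in (1, 2, 3, 4):
    print(k, round(lag_corr(y, k), 6), round(y.autocorr(lag=k), 6))
```

Series with a strong trend, such as GDP in levels, tend to produce autocorrelations close to one at every short lag, which is consistent with the values in the table above.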

4.1 Autocorrelation function (ACF)

from statsmodels.graphics.tsaplots import plot_acf

plot_acf(GDP['Values'], lags=36, zero=False)
plt.title('Autocorrelation Function (ACF) of Mexico GDP')
plt.xlabel('Lag')
plt.ylabel('Autocorrelation')
plt.tight_layout()
plt.show()
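
For reference, a common estimator of the autocorrelation at lag k divides the lag-k sum of cross-products of deviations from the full-sample mean by the total sum of squared deviations. The sketch below implements that formula directly with NumPy; `sample_acf` is an illustrative name, and the exact options `plot_acf` uses (including how its shaded bands are computed) may differ from this simplified version.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_0, ..., r_nlags using the full-sample mean."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    denom = np.sum(dev ** 2)
    return np.array([
        np.sum(dev[k:] * dev[:len(x) - k]) / denom
        for k in range(nlags + 1)
    ])

rng = np.random.default_rng(2)
noise = rng.normal(size=400)  # white noise: no memory
walk = np.cumsum(noise)       # random walk: highly persistent, like GDP in levels

acf_noise = sample_acf(noise, 8)
acf_walk = sample_acf(walk, 8)

# Rough benchmark for white noise: about 95% of r_k fall within +/- 1.96/sqrt(T)
band = 1.96 / np.sqrt(len(noise))

print(acf_noise.round(3))
print(acf_walk.round(3))
print(round(band, 3))
```

The white-noise autocorrelations hover near zero, while those of the random walk start close to one and decay slowly. A slowly decaying ACF of this kind is the typical signature of a trending series.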

4.2 Partial Autocorrelation function (PACF)

from statsmodels.graphics.tsaplots import plot_pacf

plot_pacf(GDP['Values'], lags=36, zero=False, method = 'ols')
plt.title('Partial Autocorrelation Function (PACF) of Mexico GDP')
plt.xlabel('Lag')
plt.ylabel('Partial autocorrelation')
plt.tight_layout()
plt.show()
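
The partial autocorrelation at lag k measures the link between y_t and y_{t-k} after controlling for the intermediate lags. One way to estimate it, in the spirit of the `method = 'ols'` option used above, is to regress y_t on a constant and its first k lags and keep the coefficient on the last lag. The sketch below does this with NumPy on a simulated AR(1) process; `pacf_ols`, the coefficient 0.7, and the sample size are assumptions made for illustration, and the statsmodels implementation may differ in its details.

```python
import numpy as np

def pacf_ols(x, k):
    """Coefficient on y_{t-k} in an OLS regression of y_t on a constant and k lags."""
    x = np.asarray(x, dtype=float)
    target = x[k:]                                         # y_t for t = k, ..., T-1
    lags = [x[k - j:len(x) - j] for j in range(1, k + 1)]  # y_{t-1}, ..., y_{t-k}
    X = np.column_stack([np.ones(len(target))] + lags)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta[-1]

# Simulated AR(1): y_t = 0.7 * y_{t-1} + e_t (hypothetical data)
rng = np.random.default_rng(3)
phi = 0.7
eps = rng.normal(size=5000)
y = np.zeros(5000)
for t in range(1, 5000):
    y[t] = phi * y[t - 1] + eps[t]

for k in (1, 2, 3):
    print(k, round(pacf_ols(y, k), 3))
```

For an AR(1) process the partial autocorrelation at lag 1 should be close to the autoregressive coefficient and close to zero at higher lags, even though the ordinary ACF decays only gradually. This cutoff is what makes the PACF useful for choosing the order of an autoregressive model.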