SARIMAs

CD2010 Introduction to Econometrics

Author
Affiliation

Sergio Castellanos-Gamboa, PhD

Tecnológico de Monterrey

Published

September 9, 2025

1 Introduction

This workshop introduces ARIMA and then moves to SARIMA so that you can build seasonal forecasts straightforwardly in Python.

We will follow this storyline:

  1. ARIMA: what AR, I, and MA mean, and why basic ARIMA typically requires you to know the order of (p), (q), and (d) (stationarity).
  2. Why SARIMA?: seasonality adds a second type of differencing D; fortunately, a loop routine can decide both d and D for you.
  3. Hands-on: load US GDP (NSA, quarterly) from FRED, fit SARIMA, print the chosen model, and forecast with a simple plot.
  4. Diagnostics (lite): quick check of residuals.

2 Load libraries

# Load libraries you will use
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# New libraries
import yfinance as yf # financial data from yahoo finance
from pandas_datareader import data as pdr # Data from FRED (Federal Reserve St. Louis)

# Decomposition, Autocorrelation Functions, and Dickey Fuller (non-sationarity)
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import acf as sm_acf, adfuller
from scipy.stats import norm
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.stats.stattools import durbin_watson
import statsmodels.api as sm
import statsmodels.formula.api as smf

# ARIMAs
from statsmodels.tsa.statespace.sarimax import SARIMAX
import warnings
warnings.filterwarnings("ignore")
from itertools import product

# Utilities
from scipy.stats import norm

# How many decimals in your outputs? Here 2
pd.options.display.float_format = "{:.2f}".format

3 Modelling univariate time series

3.1 Load the data

# US GDP Millions of Dollars, Not Seasonally Adjusted (FRED: NA000334Q)
start = "2000-01-01"
end = "2025-09-08"
gdp = pdr.DataReader("NA000334Q", "fred", start=start, end=end)
gdp.columns = ["GDP"]  

gdp.head()
GDP
DATE
2000-01-01 2448892
2000-04-01 2569266
2000-07-01 2577927
2000-10-01 2655636
2001-01-01 2562488

As usual, first start by plotting the data, first the series in levels and then its logarithmic growth:

gdp.plot(y="GDP", title="US GDP 1980–present (not seasonally adjusted)")
plt.show()

# GDP growth rate (log-diff)
GDPgrowth = np.log(gdp["GDP"]).diff()

GDPgrowth.plot(title="US GDP growth Q-Q")
plt.show()

# GDP growth rate 4th lag for difference
GDPgrowth_4 = np.log(gdp["GDP"]).diff(4)

GDPgrowth_4.plot(title="US GDP growth Y-Y") 
plt.show()

Levels show a strong upward path and visible within-year oscillations ⇒ non-stationary with seasonality. The log growth looks more stable around a mean with variability but no obvious long-term trend ⇒ closer to stationary but still with clear seasonal patterns. This is why ARIMA typically needs differencing (d), and SARIMA will also consider seasonal differencing (D).

Additionally, we now add a series were we take the difference, not with the first lag but with the 4th lag, so we are taking into account seasonal patterns and this is a year-to-year growth rate.

From our previous workshop, we know that GDP in levels (NSA) is non-stationary and that we had to take difference twice in order to find a stationary I(0) series.

3.2 ACF and PACF:

ACF (Autocorrelation Function) and PACF (Partial Autocorrelation Function) are essential tools for identifying the appropriate parameters for ARIMA models.

The ACF measures the correlation between the time series and its lagged values at different lag intervals. It helps to identify the degree of dependency between observations in the series, which is crucial for determining the (q) parameter of the MA part.

The PACF, on the other hand, measures the correlation between the time series and its lagged values, controlling for the values of the intermediate lags. This helps in identifying the (p) parameter of the AR part.

By examining the plots of ACF and PACF, one can make informed decisions about the values of (p) and (q) to use in the ARIMA model. Peaks in the ACF plot suggest the lag order for the MA component, while peaks in the PACF plot suggest the lag order for the AR component. These functions are crucial for diagnosing the structure of time series data and ensuring the chosen ARIMA model accurately captures the underlying patterns. However, nowadays we realy more on the use of AIC and BIC, or other information criteria to select the appropiate lag-length.

3.3 ACF and PACF plots

# Plots of ACF for GDP
fig, ax = plt.subplots(1, 3, figsize=(14,4))
plot_acf(gdp.dropna(), lags=12, ax=ax[0]); ax[0].set_title("ACF — US GDP (levels)")
plot_acf(GDPgrowth.dropna(), lags=12, ax=ax[1]); ax[1].set_title("ACF — US GDP Q-Q")
plot_acf(GDPgrowth_4.dropna(), lags=12, ax=ax[2]); ax[2].set_title("ACF — US GDP Y-Y")
plt.tight_layout(); plt.show()

# Plots of PACF for GDP
fig, ax = plt.subplots(1, 3, figsize=(14,4))
plot_pacf(gdp.dropna(), lags=12, ax=ax[0]); ax[0].set_title("PACF — US GDP (levels)")
plot_pacf(GDPgrowth.dropna(), lags=12, ax=ax[1]); ax[1].set_title("PACF — US GDP (log growth)")
plot_pacf(GDPgrowth_4.dropna(), lags=12, ax=ax[2]); ax[2].set_title("PACF — US GDP Y-Y")
plt.tight_layout(); plt.show()

4 ARIMA

We will now estimate time-series models exploiting the series onw past. Therefore, we are not longer estimating OLS models, but using another method called Log-Likelihood.

  • AR (Autoregressive): today’s value relates to its own lags. The order p says how many lags of the series enter.
  • I (Integrated / differencing): many economic series are non-stationary. Differencing removes trends; d times differencing makes the series stationary (mean/variance not drifting over time).
  • MA (Moving Average): today’s value relates to lags of shocks/errors. The order q says how many lagged errors enter.

An ARIMA(p, d, q) combines these three blocks. In ARIMA you usually study the series’ order of integration to pick d. For example, levels of GDP are often I(1) or I(2) while growth rates are closer to stationary (I(0)).

4.1 Why SARIMA? Seasonality matters

Many series also have repeating seasonal patterns (quarters, months, weeks). A SARIMA adds a seasonal ARMA structure and seasonal differencing D with a known seasonal period m: - Quarterly data: m = 4 - Monthly data: m = 12 - Weekly data: m ≈ 52

A SARIMA is denoted as ARIMA(p, d, q) × (P, D, Q)_m.

5 Manual SARIMA selection by BIC

5.1 AIC & BIC in a nutshell

Information criteria are single-number scores that combine how well a model fits the data (via its likelihood) with a penalty for model complexity (number of parameters). Lower is better.

Why are they useful?
- Give an objective way to compare competing models (even non-nested) on the same dataset.
- Balance fit vs. simplicity, helping avoid overfitting.
- Fast to compute; handy when data are limited (no holdout needed).
- Point you to a small, interpretable model before running residual checks.

AIC vs. BIC (conceptual):
- AIC: lighter penalty → often chooses richer models; good when pure prediction with plenty of data.
- BIC: stronger penalty → prefers parsimonious models; consistent selector as sample grows.

Why BIC for lag-length selection?
In SARIMA, extra lags can soak up noise. BIC’s heavier penalty tends to pick the smallest model that still captures the seasonal and short-run dynamics, improving interpretability and out-of-sample behavior. Fit several candidates on the same sample → compare AIC/BIC → pick the lowest → then check residual diagnostics.

We search a small grid for (p,d,q) and (P,D,Q) with quarterly m=4 and pick the lowest BIC. Keep the grid tight for class speed; widen later if needed.

# The following code creates a loop to estimate many models and then compare the best 
# with the lowest BIC

def sarima_bic_grid(y, m=4,
                    p_range=range(0,4), d_range=range(0,3), q_range=range(0,4),
                    P_range=range(0,2), D_range=range(0,2), Q_range=range(0,2),
                    trend="n"):
    """
    Fit SARIMAX over a small grid and return the best model by BIC.
    Defaults are intentionally small for speed.
    """
    results = []
    best_res = None
    best_bic = np.inf
    best_spec = None

    for p, d, q in product(p_range, d_range, q_range):
        for P, D, Q in product(P_range, D_range, Q_range):
            order = (p, d, q)
            seasonal_order = (P, D, Q, m)
            try:
                model = SARIMAX(y, order=order, seasonal_order=seasonal_order,
                                trend=trend, enforce_stationarity=False, enforce_invertibility=False)
                res = model.fit(disp=False)
                bic = res.bic
                results.append((order, seasonal_order, bic))
                if bic < best_bic:
                    best_bic = bic
                    best_res = res
                    best_spec = (order, seasonal_order)
            except Exception:
                continue

    results_df = pd.DataFrame(results, columns=["order", "seasonal_order", "BIC"]).sort_values("BIC")
    return best_res, best_spec, best_bic, results_df

# Run the grid search on GDP levels (NSA)
best_res, best_spec, best_bic, tried = sarima_bic_grid(
    gdp["GDP"],
    m=4,
    p_range=range(0,4), d_range=range(0,3), q_range=range(0,4),
    P_range=range(0,2), D_range=range(0,2), Q_range=range(0,2),
    trend="n"
)

best_spec, best_bic, tried.head(10)
(((2, 2, 3), (0, 1, 1, 4)),
 np.float64(2305.392337685326),
          order seasonal_order     BIC
 283  (2, 2, 3)   (0, 1, 1, 4) 2305.39
 379  (3, 2, 3)   (0, 1, 1, 4) 2306.58
 287  (2, 2, 3)   (1, 1, 1, 4) 2308.13
 383  (3, 2, 3)   (1, 1, 1, 4) 2309.11
 187  (1, 2, 3)   (0, 1, 1, 4) 2309.12
 191  (1, 2, 3)   (1, 1, 1, 4) 2310.50
 59   (0, 1, 3)   (0, 1, 1, 4) 2314.92
 91   (0, 2, 3)   (0, 1, 1, 4) 2315.75
 63   (0, 1, 3)   (1, 1, 1, 4) 2316.83
 95   (0, 2, 3)   (1, 1, 1, 4) 2317.44)

5.2 Inspect the selected model

We present the model here. Keep in mind that unlike with the OLS, we will not interpret the coefficients here. It gets too complicated. However, we can look for individual and joint significance and OLS tests

print("Selected spec (order, seasonal_order):", best_spec)
print("Selected BIC:", round(best_bic, 2))
print(best_res.summary())
Selected spec (order, seasonal_order): ((2, 2, 3), (0, 1, 1, 4))
Selected BIC: 2305.39
                                      SARIMAX Results                                      
===========================================================================================
Dep. Variable:                                 GDP   No. Observations:                  102
Model:             SARIMAX(2, 2, 3)x(0, 1, [1], 4)   Log Likelihood               -1137.025
Date:                             Tue, 09 Sep 2025   AIC                           2288.051
Time:                                     13:46:27   BIC                           2305.392
Sample:                                 01-01-2000   HQIC                          2295.037
                                      - 04-01-2025                                         
Covariance Type:                               opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1         -1.5589      0.316     -4.934      0.000      -2.178      -0.940
ar.L2         -0.8653      0.314     -2.755      0.006      -1.481      -0.250
ma.L1          0.6155      0.332      1.851      0.064      -0.036       1.267
ma.L2         -0.4594      0.161     -2.855      0.004      -0.775      -0.144
ma.L3         -0.5753      0.343     -1.677      0.094      -1.248       0.097
ma.S.L4       -0.4595      0.099     -4.662      0.000      -0.653      -0.266
sigma2      1.424e+10   3.58e-12   3.98e+21      0.000    1.42e+10    1.42e+10
===================================================================================
Ljung-Box (L1) (Q):                   0.03   Jarque-Bera (JB):               484.10
Prob(Q):                              0.87   Prob(JB):                         0.00
Heteroskedasticity (H):              10.10   Skew:                            -0.37
Prob(H) (two-sided):                  0.00   Kurtosis:                        14.47
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
[2] Covariance matrix is singular or near-singular, with condition number 2.39e+38. Standard errors may be unstable.

5.3 Ljung–Box test (serial correlation of residuals)

While ACF/PACF plots give a visual sense, the Ljung–Box test formally checks whether model residuals are uncorrelated (i.e., resemble white noise).
- Null hypothesis (H0): residuals are not autocorrelated (good model fit).
- Alternative (H1): residuals show serial correlation (model may be missing dynamics).

This is useful after fitting ARIMA/SARIMA: even if the model passes normality and variance checks, autocorrelation in residuals would mean the dynamics aren’t fully captured. In this case we would also try different estimation methods suchs as GARCH (beyond the scope of the course).

5.4 Forecast and plot

# Forecast next 8 quarters with confidence intervals
n_fore = 20
fc_res = best_res.get_forecast(steps=n_fore)
fc_mean = fc_res.predicted_mean
fc_ci = fc_res.conf_int()

# Plot: history + forecast + CI band
ax = gdp["GDP"].plot(title="US GDP (NSA) — SARIMA (BIC) forecast (next 5 years)", figsize=(10,4))
fc_mean.plot(ax=ax)
ax.fill_between(fc_ci.index, fc_ci.iloc[:,0], fc_ci.iloc[:,1], alpha=0.2)
ax.set_xlabel("Date"); ax.set_ylabel("Millions of dollars")
plt.show()

The forecast extends the pattern of GDP levels; the confidence band shows uncertainty. Seasonal peaks/troughs persist if the model includes seasonal components. The further into the future we forecast, the poorer our forecast becoms (greater confindence intervals).

5.5 (Optional) Quick diagnostics

best_res.plot_diagnostics(figsize=(10,6))
plt.show()

Look for residuals centered around zero, little autocorrelation, and roughly normal errors.

6 What to remember

  • Lag length matters: too few lags underfit; too many overfit.
  • BIC balances fit and parsimony; pick the lowest BIC among tried models.
  • SARIMA adds seasonal terms and seasonal differencing (D) with known period m (4 for quarterly).
  • This manual BIC approach is simple, transparent, and avoids external auto-model libraries.