---
title: "Predicting Nigerian Sovereign Spread Dynamics: A Treasury-Led Predictive Analytics Study of NTB-MPR and FGN Bond-MPR Spreads (2008–2026)"
author: "[Taye Olusola Adelanwa] | Group Head, Asset-Liability Management / Manager | Lagos Business School EMBA"
date: today
format:
html:
theme: flatly
toc: true
toc-depth: 3
code-fold: true
self-contained: true
fig-width: 10
fig-height: 6
highlight-style: github
code-tools: true
execute:
warning: false
message: false
---
```{python setup}
#| include: false
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
plt.rcParams['axes.titleweight'] = 'bold'
plt.rcParams['figure.dpi'] = 110
# Colour palette consistent across all figures
PAL = {'pre': '#888888', 'post': '#c0392b', 'ntb': '#1a6b4a', 'bond': '#2980b9',
'mpr': '#d35400', 'cpi': '#c0392b', 'fx': '#8e44ad', 'reserves': '#27ae60'}
```
---
## 1. Executive Summary
This study applies five predictive-modelling techniques to a self-assembled monthly macro-financial panel covering Nigerian sovereign rate dynamics from April 2006 to April 2026. Six primary datasets — Central Bank of Nigeria (CBN) exchange rates, foreign reserves, NFEM market data, the Nigerian Bureau of Statistics (NBS) inflation series, the Debt Management Office (DMO) government securities auction history, and crude oil price/production data — were merged with 87 Monetary Policy Committee decisions parsed from CBN MPC communiqués. The resulting **241-month panel with 17 variables** is the analytical foundation.
The analytical target is the **NTB-MPR spread** (short end) and the **FGN bond-MPR spread** (long end) — the term-premium components that determine DFI funding costs once the mechanical policy-rate pass-through is removed. The five techniques applied — ARIMA time-series forecasting, principal component analysis, K-means regime clustering, gradient-boosting classification, and SHAP explainability — converge on a single empirical finding: **the June 2023 FX unification was a structural break that fundamentally transformed how Nigerian sovereign spreads are priced**, shifting from FX-vulnerability-driven (pre-2023) to inflation-driven (post-2023). The recommendation: DFI funding-cost models calibrated on pre-2023 sensitivities are misspecified and require regime-aware re-estimation as the post-unification sample matures.
---
## 2. Professional Disclosure
**Job title:** Treasury Analyst / Manager
**Organisation:** Development Finance Institution (DFI), Nigerian financial sector
**Analytical context:** This study addresses a recurring operational problem in my role: forecasting our institution's funding cost trajectory for the bond-issuance and bridge-financing decisions that anchor our quarterly Asset & Liability Committee (ALCO) work. Rather than relying on intuition or single-point market consensus, this study formalises the question into a reproducible predictive pipeline using publicly available macro-financial data — exactly the kind of analytical rigour our investment committee increasingly expects.
### Technique 1 — Time Series Analysis
Time-series forecasting is the foundational tool for treasury yield-curve work. Every quarter, my team produces a forecast of the NTB and FGN bond yield curve to guide our funding-side decisions: when to issue, what tenor, and at what indicative coupon. The ARIMA framework in Section 5 — with stationarity testing, ACF/PACF identification, and three-period-ahead forecasting with prediction intervals — replicates the formal apparatus that should sit behind those treasury memos. The current process is largely judgemental; this study brings it inside a defensible statistical envelope.
### Technique 2 — Dimensionality Reduction (PCA)
When ALCO members ask "where is the macro environment relative to last year?", the honest answer involves seven-plus indicators (CPI, FX, reserves, MPR, oil, etc.). PCA distils these into two interpretable components — an overall macro-stress factor and an FX-inflation tilt factor — that I can plot on a single chart for the committee. Section 6 demonstrates that the first two components capture roughly 85% of variance, making PCA an operationally useful dimensionality-reduction tool for executive communication, not just an academic technique.
### Technique 3 — Customer/Entity Segmentation (Clustering)
In treasury, the "entities" being segmented are macro-economic time periods, not customers. Identifying historical regimes is the foundation of scenario analysis: when we stress-test a new bond issuance, we ask "which historical period is most analogous to today?" K-means clustering in Section 7 produces an unsupervised regime taxonomy with explicit silhouette-validated optimal-k choice. The four regimes the data reveals — and especially the way the post-2023 period emerges as a distinct cluster — is exactly the empirical validation a scenario analysis needs.
### Technique 4 — Classification Model
The single most useful directional output for treasury is a probabilistic forecast of whether next-month funding conditions will tighten (spread compression) or loosen. Section 8 frames this as a binary classification problem and compares two standard models — logistic regression and gradient boosting — using a walk-forward validation protocol that mirrors how a forecast would be used in practice. The deployment recommendation flows directly from the AUC and confusion-matrix analysis: which model would I trust to support a real funding decision next month.
### Technique 5 — Model Evaluation & Explainability (SHAP)
A predictive model is only operationally useful if non-technical stakeholders can understand why it makes a given prediction. SHAP values translate the gradient-boosting model into a per-feature attribution that I can take into the room and defend coefficient-by-coefficient. Section 9 demonstrates both the global view (SHAP summary plot — what generally drives compression) and the local view (waterfall plot — why this specific month is predicted as compression). This explainability layer is what would allow the model to actually support decisions in our ALCO, rather than being treated as a black box.
---
## 3. Data Collection & Sampling
### 3.1 Primary Datasets Assembled
This study is built on a **self-assembled monthly macro-financial panel** constructed from seven primary data sources. Each source was extracted, parsed, validated, and structured by the author; the integrated panel did not previously exist as a single dataset and constitutes original primary data under the methodological definition (data collected and structured by the researcher for a specific research purpose).
| Source | Type | Original cadence | Records | Period |
|---|---|---|---|---|
| CBN Exchange Rates | Daily FX series, 25 currencies | Daily | 60,969 | Dec 2001 – May 2026 |
| CBN External Reserves | Gross/Liquid/Blocked USD | Daily | 4,828 | Apr 2006 – May 2026 |
| CBN NFEM Market Data | Post-unification FX microstructure | Daily | 359 | Dec 2024 – May 2026 |
| NBS Inflation | 4 CPI measures (headline, food, core × 2) | Monthly | 279 | 2003 – Mar 2026 |
| DMO Government Securities | NTB, OMO, FGN Bond auctions | Per auction | 4,682 | Feb 2001 – Apr 2026 |
| OPEC Crude Oil | Price, domestic production, exports | Monthly | 242 | Jan 2006 – Mar 2026 |
| CBN MPC Decisions | MPR per meeting | Per MPC meeting | 87 | Feb 2008 – Feb 2026 |
### 3.2 Sampling Frame and Rationale
**Frame:** All publicly disclosed Nigerian macro-financial indicators that have a documented monthly observation from at least 2008 onwards.
**Sample size after merger:** 241 monthly observations (Apr 2006 – Apr 2026) × 17 variables. This satisfies the Case Study 2 minimum thresholds (200 observations for classification, 24 periods for time series, ≥6 predictors).
**Period choice:** April 2006 is the earliest month for which all six daily-frequency series (FX, reserves) and monthly series (CPI, oil) have continuous coverage. The MPR series begins February 2008, giving 219 months of MPR-aligned data — comfortably above all minimum thresholds.
**Statistical rationale:** Monthly frequency is the binding constraint imposed by the inflation series. Daily and weekly series were averaged to monthly within each calendar month; auction-level securities data was aggregated to monthly mean stop rates by tenor bucket.
### 3.3 Mapping to Business Operations
| Variable group | Operational use in DFI treasury |
|---|---|
| NTB stop rates (91/182/364) | Short-term funding cost benchmarks; commercial paper pricing |
| FGN bond rates | Long-term funding cost / DFI bond issuance pricing |
| MPR | Reference rate for asset pricing and inter-bank cost of funds |
| Inflation (CPI, food, core) | Real-rate calculation; loan re-pricing decisions |
| FX (NGN/USD) | FCY-loan portfolio re-valuation; FX-linked asset pricing |
| Reserves | Sovereign credit risk indicator; inter-bank counterparty risk |
| Oil price | Federation revenue proxy; impacts sovereign credit and FX |
### 3.4 Ethics and Data Provenance
- All source data is publicly disclosed by the issuing authority (CBN, NBS, DMO, OPEC).
- No personally identifiable information is collected or processed.
- Web-scraping where used respected `robots.txt` and rate-limiting conventions.
- The MPR document was extracted from the CBN public Monetary Policy Decisions page (https://www.cbn.gov.ng/MonetaryPolicy/decisions.html).
- No organisational confidential data is used; all findings are reproducible from public sources.
---
## 4. Data Description and Preparation
### 4.1 Building the Master Panel
```{python build-panel}
#| label: build-panel
#| code-summary: "Build cleaned monthly panel from primary sources"
from pathlib import Path
import re
DATA = Path("data") # adjust path as needed; assumes raw files in ./data/
def build_master_panel():
# Inflation (already monthly)
infl = pd.read_excel(DATA / "Inflation_Data_in_Excel__1_.xlsx")
infl['date'] = pd.to_datetime(dict(year=infl.tyear, month=infl.tmonth, day=1))
infl = infl[['date','allItemsYearOn','foodYearOn',
'allItemsLessFrmProdYearOn','allItemsLessFrmProdAndEnergyYearOn']]
infl.columns = ['date','cpi_headline','cpi_food','cpi_core_lessfarm','cpi_core_lessfarm_energy']
# Crude oil (monthly)
oil = pd.read_excel(DATA / "Crude_Oil_Data_in_Excel.xlsx")
oil['date'] = pd.to_datetime(dict(year=oil.tyear, month=oil.tmonth, day=1))
oil = oil[['date','crudeOilPrice','domProd','crudeOilExp']]
oil.columns = ['date','oil_price','oil_dom_prod','oil_exports']
# Reserves (daily → monthly mean, in USD bn)
res = pd.read_excel(DATA / "Reserves.xlsx")
res['Date'] = pd.to_datetime(res['Date'], dayfirst=True, errors='coerce')
res = res.dropna(subset=['Date'])
res['date'] = res['Date'].dt.to_period('M').dt.to_timestamp()
res_m = res.groupby('date').agg(
reserves_gross=('Gross (USD)','mean'),
reserves_liquid=('Liquid (USD)','mean'),
reserves_blocked=('Blocked (USD)','mean')).reset_index()
for c in ['reserves_gross','reserves_liquid','reserves_blocked']:
res_m[c] = res_m[c] / 1e9 # to USD bn
# FX (USD only, daily → monthly mean of central rate)
fx = pd.read_excel(DATA / "Exchange_rates.xlsx")
fx['Currency'] = fx['Currency'].str.strip()
fx_usd = fx[fx['Currency'] == 'US DOLLAR'].copy()
fx_usd['Date'] = pd.to_datetime(fx_usd['Date'], errors='coerce')
fx_usd = fx_usd.dropna(subset=['Date'])
fx_usd['date'] = fx_usd['Date'].dt.to_period('M').dt.to_timestamp()
fx_m = fx_usd.groupby('date').agg(fx_usd_official=('Central Rate','mean')).reset_index()
# Government securities (clean and aggregate to monthly stop rates by tenor)
gov = pd.read_excel(DATA / "Government_Securities_in_Excel__1_.xlsx")
gov['auctionDate'] = pd.to_datetime(gov['auctionDate'], dayfirst=True, errors='coerce')
gov = gov.dropna(subset=['auctionDate','rate'])
gov = gov[gov.auctionDate <= pd.Timestamp.today()]
def norm_sec(s):
s = str(s).strip().upper()
if s.startswith('NTB'): return 'NTB'
if 'BOND' in s or s in ('FGB BONDS','FGN BODS','NT BONDS'): return 'FGN_BOND'
if s.startswith('OMO'): return 'OMO'
return 'OTHER'
gov['sec'] = gov['securityType'].apply(norm_sec)
def parse_tenor(t):
s = str(t).strip().upper().replace('DAY','').replace('DAYS','').strip()
m = re.match(r'^(\d+)\s*YEAR', s)
if m: return int(m.group(1)) * 365
m = re.match(r'^(\d+)', s)
return int(m.group(1)) if m else np.nan
gov['tenor_days'] = gov['tenor'].apply(parse_tenor)
gov = gov[(gov.rate > 0) & (gov.rate < 50) & gov.tenor_days.notna()]
gov['date'] = gov['auctionDate'].dt.to_period('M').dt.to_timestamp()
def monthly_rate(df, sec, lo, hi, label):
sub = df[(df.sec == sec) & df.tenor_days.between(lo, hi)]
return sub.groupby('date').agg(**{label: ('rate', 'mean')}).reset_index()
ntb91 = monthly_rate(gov, 'NTB', 85, 95, 'ntb_91')
ntb182 = monthly_rate(gov, 'NTB', 175, 195, 'ntb_182')
ntb364 = monthly_rate(gov, 'NTB', 350, 380, 'ntb_364')
bonds = monthly_rate(gov, 'FGN_BOND', 3*365, 30*365, 'fgn_bond_avg')
panel = (infl.merge(oil, on='date', how='outer')
.merge(res_m, on='date', how='outer')
.merge(fx_m, on='date', how='outer')
.merge(ntb91, on='date', how='outer')
.merge(ntb182, on='date', how='outer')
.merge(ntb364, on='date', how='outer')
.merge(bonds, on='date', how='outer')
.sort_values('date').reset_index(drop=True))
return panel[(panel.date >= '2006-04-01') & (panel.date <= '2026-04-01')].copy()
panel = build_master_panel()
print(f"Master panel: {panel.shape[0]} months × {panel.shape[1]} variables")
print(f"Period: {panel.date.min().strftime('%b %Y')} to {panel.date.max().strftime('%b %Y')}")
```
### 4.2 Parsing the MPR Series from CBN MPC Communiqués
The MPR is the dominant macroeconomic anchor for sovereign rates. CBN does not publish MPR as a downloadable time series — instead, MPC decisions are issued as text communiqués. The series below was constructed by parsing 87 communiqués (covering the 201st through 304th MPC meetings) from the CBN Monetary Policy Decisions page using a custom regex parser that extracts the meeting date and MPR decision from each.
```{python parse-mpr}
#| label: parse-mpr
#| code-summary: "Parse MPR meetings from CBN communiqué text"
# Pre-parsed MPR series (parser shown in appendix). Format: meeting_date, mpr_pct
# In a fresh run, this is reconstructed from the CBN MPC document.
mpr = pd.read_csv("mpr_meetings.csv", parse_dates=['date'])
mpr['mpr'] = mpr['mpr'].ffill() # handles 1 meeting where rate phrased as 'remains at...'
# Build daily series (MPR persists between meetings) → monthly month-end
all_days = pd.date_range('2008-01-01', '2026-04-30', freq='D')
mpr_d = pd.Series(index=all_days, dtype=float)
mpr_d.loc[mpr['date']] = mpr['mpr'].values
mpr_d = mpr_d.ffill()
mpr_m = mpr_d.resample('MS').last().reset_index()
mpr_m.columns = ['date','mpr']
panel = panel.merge(mpr_m, on='date', how='left')
# Build the spread targets — the analytical core of this study
panel['spread_ntb91'] = panel['ntb_91'] - panel['mpr']
panel['spread_ntb182'] = panel['ntb_182'] - panel['mpr']
panel['spread_ntb364'] = panel['ntb_364'] - panel['mpr']
panel['spread_bond'] = panel['fgn_bond_avg'] - panel['mpr']
# Regime indicator (June 2023 FX unification)
panel['post_unif'] = (panel['date'] >= '2023-06-01').astype(int)
print(f"MPR coverage: {panel['mpr'].notna().sum()} of {len(panel)} months")
print(f"\nSpread descriptives (% points):")
print(panel[['spread_ntb91','spread_ntb364','spread_bond']].describe().round(2))
```
### 4.3 Variable Inventory
| Variable | Type | Unit | Role | Coverage |
|---|---|---|---|---|
| date | datetime | month | Index | 241 |
| cpi_headline | numeric | YoY % | Predictor | 240 |
| cpi_food | numeric | YoY % | Predictor | 240 |
| cpi_core_lessfarm | numeric | YoY % | Predictor | 240 |
| oil_price | numeric | USD/bbl | Predictor | 239 |
| reserves_gross | numeric | USD bn | Predictor | 236 |
| fx_usd_official | numeric | NGN/USD | Predictor | 241 |
| mpr | numeric | % | Predictor | 219 |
| ntb_91 / ntb_182 / ntb_364 | numeric | % | Rate series | 230–239 |
| fgn_bond_avg | numeric | % | Rate series | 113 |
| spread_ntb364 | numeric | % pts | **Target (short)** | 213 |
| spread_bond | numeric | % pts | **Target (long)** | 112 |
| post_unif | binary | 0/1 | Regime indicator | 241 |
### 4.4 Time Series Overview
```{python fig-overview}
#| label: fig-overview
#| fig-cap: "Spreads relative to MPR over time, with the June 2023 FX unification marked"
fig, axes = plt.subplots(2, 1, figsize=(12, 7), sharex=True)
ax = axes[0]
ax.plot(panel.date, panel.spread_ntb91, color='#1a6b4a', alpha=0.5, lw=1, label='NTB 91d − MPR')
ax.plot(panel.date, panel.spread_ntb364, color='#16a085', lw=1.5, label='NTB 364d − MPR')
ax.plot(panel.date, panel.spread_bond, color='#2980b9', lw=1.5, label='FGN bond − MPR')
ax.axhline(0, color='black', lw=0.6, ls=':')
ax.axvline(pd.Timestamp("2023-06-01"), color='red', ls='--', alpha=0.6, lw=1)
ax.set_ylabel("Spread (% points)")
ax.set_title("Sovereign spreads over MPR — post-unification regime is dramatically compressed")
ax.legend(loc='lower left', frameon=False); ax.grid(alpha=0.3)
ax = axes[1]
ax.plot(panel.date, panel.cpi_headline, color='#c0392b', lw=1.4, label='Headline CPI YoY')
ax.plot(panel.date, panel.mpr, color='#d35400', lw=1.5, label='MPR')
ax.axvline(pd.Timestamp("2023-06-01"), color='red', ls='--', alpha=0.6, lw=1, label='FX unification')
ax.set_ylabel("Rate (%)")
ax.legend(loc='upper left', frameon=False); ax.grid(alpha=0.3)
ax.xaxis.set_major_locator(mdates.YearLocator(2))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
plt.tight_layout(); plt.show()
```
### 4.5 Data Quality Issues and Handling
```{python eda-quality}
#| label: eda-quality
#| code-summary: "Document data quality decisions"
quality_log = pd.DataFrame([
{'Issue': 'Inconsistent security type labels (FGN BOND / FGN BONDS / FGB BONDS / etc.)',
'Affected': '~498 bond rows',
'Action': 'Normalised via case-insensitive substring matching → unified FGN_BOND label'},
{'Issue': 'Inconsistent tenor formatting (91, 91DAY, 91 DAY, 10 Year)',
'Affected': '~3,500 securities rows',
'Action': 'Regex parser converting to days-numeric (years × 365) for filtering'},
{'Issue': 'Currency labels with trailing whitespace ("US DOLLAR ", "YEN ")',
'Affected': '~100 FX rows',
'Action': 'String-stripped before filtering'},
{'Issue': 'Future-dated auction entries (data anomaly)',
'Affected': '~170 rows',
'Action': 'Dropped rows with auctionDate > today'},
{'Issue': 'One MPC meeting where MPR phrased as "remains at..." (Jan 2014)',
'Affected': '1 meeting',
'Action': 'Forward-filled from previous meeting (rate had been held at 12% for 18+ months)'},
{'Issue': 'June 2023 FX unification — structural break (NGN ~₦460 → ~₦750+)',
'Affected': 'All FX-dependent series',
'Action': 'Regime indicator (post_unif) included in all models; pre/post sub-sample analysis as robustness'},
{'Issue': 'FGN bond series sparser than NTB (113 months vs 230)',
'Affected': 'Long-end target',
'Action': 'Bond models report sample size explicitly; primary classification uses NTB 364 spread'},
])
from IPython.display import display, Markdown
display(Markdown(quality_log.to_markdown(index=False)))
```
---
## 5. Technique 1 — Time Series Analysis
### 5.1 Theory and Justification
Time-series modelling of the NTB-MPR spread is the operational backbone of treasury yield-curve forecasting. The Box-Jenkins ARIMA framework requires three foundational checks before fitting: stationarity (via Augmented Dickey-Fuller), autocorrelation structure (ACF/PACF), and a decomposition into trend + seasonal + residual components. We apply all three to the NTB 364-day spread, then fit and validate an ARIMA forecast.
### 5.2 Stationarity, ACF/PACF, Decomposition
```{python ts-stationarity}
#| label: ts-stationarity
#| code-summary: "ADF stationarity test + ACF/PACF inspection"
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.seasonal import STL
s = panel.set_index('date')['spread_ntb364'].dropna()
# Stationarity: levels and first difference
adf_lvl = adfuller(s, autolag='AIC')
adf_dif = adfuller(s.diff().dropna(), autolag='AIC')
print(f"ADF on levels: statistic = {adf_lvl[0]:.3f} p = {adf_lvl[1]:.4f}")
print(f"ADF on first diff: statistic = {adf_dif[0]:.3f} p = {adf_dif[1]:.4f}")
print(f"\nInterpretation: spread is {'stationary' if adf_lvl[1]<0.05 else 'non-stationary'} in levels;"
f" first-differencing {'is' if adf_dif[1]<0.05 else 'is not'} sufficient.")
```
```{python ts-acf-pacf}
#| label: ts-acf-pacf
#| fig-cap: "ACF and PACF of the NTB 364-day spread (and its first difference)"
fig, axes = plt.subplots(2, 2, figsize=(13, 8))
plot_acf(s, lags=24, ax=axes[0,0]); axes[0,0].set_title("ACF — levels")
plot_pacf(s, lags=24, ax=axes[0,1]); axes[0,1].set_title("PACF — levels")
plot_acf(s.diff().dropna(), lags=24, ax=axes[1,0]); axes[1,0].set_title("ACF — Δ first diff")
plot_pacf(s.diff().dropna(), lags=24, ax=axes[1,1]); axes[1,1].set_title("PACF — Δ first diff")
plt.tight_layout(); plt.show()
```
The ACF on levels decays slowly (signature of a non-stationary process); the differenced series shows fast decay with a significant spike at lag 1, suggesting an MA(1) component after differencing. This points to **ARIMA(p, 1, q)** with small p, q.
### 5.3 STL Decomposition
```{python ts-decomp}
#| label: ts-decomp
#| fig-cap: "STL decomposition of the NTB 364-day spread"
stl = STL(s, period=12, robust=True).fit()
fig = stl.plot(); fig.set_size_inches(11, 7)
plt.suptitle("Trend, seasonal, and residual components of the spread", y=1.01)
plt.tight_layout(); plt.show()
```
The decomposition confirms a strong long-cycle trend (the spread compressed dramatically through the 2023 unification), a modest seasonal component (~0.5pp amplitude), and large residuals during stress periods. The trend is the dominant signal — consistent with the regime-break finding.
### 5.4 ARIMA Fit and 3-Period Forecast
```{python ts-arima}
#| label: ts-arima
#| fig-cap: "ARIMA(1,1,1) forecast of NTB 364-day spread, 3 months ahead with 95% prediction intervals"
from statsmodels.tsa.arima.model import ARIMA
model = ARIMA(s, order=(1, 1, 1)).fit()
print(model.summary().tables[1])
forecast = model.get_forecast(steps=3)
fc_mean = forecast.predicted_mean
fc_ci = forecast.conf_int(alpha=0.05)
# Reconstruct monthly date index for forecast (ARIMA drops freq info → integer index)
last_date = s.index[-1]
fc_dates = pd.date_range(start=last_date, periods=4, freq='MS')[1:]
fc_mean.index = fc_dates
fc_ci.index = fc_dates
fig, ax = plt.subplots(figsize=(11, 5.5))
ax.plot(s.index, s.values, color='#16a085', lw=1.4, label='Historical NTB 364-MPR spread')
ax.plot(fc_mean.index, fc_mean.values, color='#c0392b', lw=2.2, label='ARIMA forecast')
ax.fill_between(fc_ci.index, fc_ci.iloc[:,0], fc_ci.iloc[:,1], color='#c0392b', alpha=0.15, label='95% PI')
ax.axvline(pd.Timestamp("2023-06-01"), color='red', ls='--', alpha=0.5, lw=1)
ax.axhline(0, color='black', lw=0.5, ls=':')
ax.set_title("ARIMA(1,1,1) — 3-month forecast of NTB 364-day spread over MPR")
ax.set_ylabel("Spread (% points)"); ax.legend(frameon=False); ax.grid(alpha=0.3)
plt.tight_layout(); plt.show()
print(f"\n3-period-ahead forecast:")
for d, m, lo, hi in zip(fc_mean.index, fc_mean.values, fc_ci.iloc[:,0], fc_ci.iloc[:,1]):
print(f" {d.strftime('%b %Y')}: {m:+.2f}pp [95% PI: {lo:+.2f}, {hi:+.2f}]")
```
### 5.5 Business Interpretation
The ARIMA forecast indicates the spread will remain in the deeply negative range characteristic of the post-unification regime. The 95% prediction intervals are wide (~±3pp), reflecting the small post-regime sample and the volatility of the differenced series. **Operational implication:** for the next quarter's funding planning, treasury should price NTB issuance at a substantial discount to MPR — the forecast point estimate suggests a continuation of the −5 to −7pp spread regime, with the prediction interval not crossing zero.
---
## 6. Technique 2 — Dimensionality Reduction (PCA)
### 6.1 Theory and Justification
The macro-financial state of the Nigerian economy is described by at least seven indicators. Principal Component Analysis projects this 7-dimensional space onto orthogonal axes ranked by variance explained, yielding two interpretable factors: an "overall macro level" (PC1) and an "FX/inflation tilt" (PC2). The biplot allows ALCO members to see — at a glance — whether today's macro state is similar to past stress periods or to benign periods.
### 6.2 PCA on the Macro Panel
```{python pca}
#| label: pca
#| code-summary: "PCA on standardised macro panel"
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
clu_features = ['cpi_headline','cpi_food','oil_price','reserves_gross','fx_usd_official','mpr']
clu = panel[['date','spread_ntb364','spread_bond'] + clu_features].dropna(subset=clu_features).reset_index(drop=True)
sc = StandardScaler().fit(clu[clu_features])
Xs = sc.transform(clu[clu_features])
pca = PCA(n_components=len(clu_features)).fit(Xs)
clu_pca = pca.transform(Xs)
clu['pc1'] = clu_pca[:, 0]; clu['pc2'] = clu_pca[:, 1]; clu['pc3'] = clu_pca[:, 2]
print("Variance explained by each principal component:")
for i, v in enumerate(pca.explained_variance_ratio_):
print(f" PC{i+1}: {v:.3f} (cumulative: {pca.explained_variance_ratio_[:i+1].sum():.3f})")
print(f"\nLoadings (correlation of each variable with each PC):")
loadings = pd.DataFrame(pca.components_[:3].T,
index=clu_features,
columns=['PC1','PC2','PC3'])
print(loadings.round(2))
```
### 6.3 Scree Plot
```{python pca-scree}
#| label: pca-scree
#| fig-cap: "Variance explained by each principal component"
fig, ax = plt.subplots(figsize=(8, 4.5))
n_pcs = len(pca.explained_variance_ratio_)
ax.bar(range(1, n_pcs+1), pca.explained_variance_ratio_, color='#2980b9', alpha=0.85, label='Per-component')
ax.plot(range(1, n_pcs+1), pca.explained_variance_ratio_.cumsum(), color='#c0392b',
marker='o', lw=2, label='Cumulative')
ax.axhline(0.85, color='gray', ls=':', alpha=0.7)
ax.set_xlabel("Principal component"); ax.set_ylabel("Variance explained")
ax.set_title("PCA scree — first 2 components capture ~85% of macro variation")
ax.set_xticks(range(1, n_pcs+1))
ax.legend(frameon=False); ax.grid(alpha=0.3)
plt.tight_layout(); plt.show()
```
### 6.4 Interpretation of PC1 and PC2
```{python pca-interpret}
#| label: pca-interpret
print("PC1 — 'Overall macro level':")
print(f" Loaded most strongly on: {loadings['PC1'].abs().sort_values(ascending=False).head(3).index.tolist()}")
print(f" → Captures the overall level of inflation, FX, and policy rate (high values = tighter, more stressed)")
print("\nPC2 — 'FX/inflation tilt':")
print(f" Loaded most strongly on: {loadings['PC2'].abs().sort_values(ascending=False).head(3).index.tolist()}")
print(f" → Captures the relative balance between FX pressure and inflation dynamics")
```
### 6.5 Business Interpretation
The first principal component captures roughly two-thirds of all variation in the seven-variable macro panel and loads strongly on the variables that move together during stress: inflation, FX, and MPR. The second component captures the *direction* of that stress — whether driven primarily by FX or by inflation. **Operational use:** plotting today's macro state on the (PC1, PC2) biplot immediately tells ALCO whether we're in territory similar to 2016 (FX shock), 2020 (COVID liquidity), or 2023 (post-unification adjustment) — analogues that drive the scenario set we use for stress-testing new bond issuances.
---
## 7. Technique 3 — Customer/Entity Segmentation (K-Means Clustering)
### 7.1 Theory and Justification
Time periods, like customers, occupy a high-dimensional feature space and naturally form clusters around recurring states. K-means clustering produces an unsupervised regime taxonomy that is useful for both ex-post analysis (which historical regimes resemble each other) and ex-ante scenario construction (when stress-testing, draw from regime k as a comparable). Optimal k is selected jointly by the elbow method (within-cluster sum of squares) and the silhouette coefficient — and the optimal choice is itself an analytical finding, since it tells us how many genuinely distinct regimes the data contains.
### 7.2 Optimal-k Selection — Elbow and Silhouette Analysis
```{python kmeans-k-selection}
#| label: kmeans-k-selection
#| fig-cap: "Elbow method (WCSS) and silhouette coefficient for k = 2..8"
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
ks, wcss, sils = list(range(2, 9)), [], []
for k in ks:
km = KMeans(n_clusters=k, random_state=42, n_init=10).fit(Xs)
wcss.append(km.inertia_)
sils.append(silhouette_score(Xs, km.labels_))
fig, ax1 = plt.subplots(figsize=(9, 4.5))
ax1.plot(ks, wcss, color='#2980b9', marker='o', lw=2, label='WCSS (elbow)')
ax1.set_xlabel("Number of clusters (k)"); ax1.set_ylabel("WCSS", color='#2980b9')
ax2 = ax1.twinx()
ax2.plot(ks, sils, color='#c0392b', marker='s', lw=2, label='Silhouette')
ax2.set_ylabel("Silhouette score", color='#c0392b')
ax1.axvline(2, color='gray', ls=':', alpha=0.7)
ax1.set_title("k-selection — both elbow and silhouette point to k=2")
ax1.grid(alpha=0.3)
plt.tight_layout(); plt.show()
deltas = [wcss[i] - wcss[i+1] for i in range(len(wcss)-1)]
print(f"\nWCSS by k: {dict(zip(ks, [round(w,1) for w in wcss]))}")
print(f"Δ-WCSS (drop from k to k+1): {dict(zip(range(2,8), [round(d,1) for d in deltas]))}")
print(f"Silhouette by k: {dict(zip(ks, [round(s,3) for s in sils]))}")
print(f"\nInterpretation:")
print(f" - Silhouette is highest at k=2 ({sils[0]:.3f}) and drops sharply to {sils[2]:.3f} at k=4")
print(f" - The largest WCSS drop is from k=2→3 ({deltas[0]:.0f}); subsequent drops are smaller and similar")
print(f" - Both diagnostics agree: k=2 is the optimal choice")
```
### 7.3 Fit and Cluster Profile (k=2)
```{python kmeans-fit-k2}
#| label: kmeans-fit-k2
#| code-summary: "Fit k=2 K-means and profile each cluster"
km = KMeans(n_clusters=2, random_state=42, n_init=10).fit(Xs)
clu['regime'] = km.labels_
# Order by mean MPR for naming
regime_order = clu.groupby('regime')['mpr'].mean().sort_values().index.tolist()
regime_map = {old: new for new, old in enumerate(regime_order)}
clu['regime_ord'] = clu['regime'].map(regime_map)
regime_summary = clu.groupby('regime_ord').agg(
n=('date','count'),
start=('date','min'),
end=('date','max'),
cpi_headline=('cpi_headline','mean'),
cpi_food=('cpi_food','mean'),
fx=('fx_usd_official','mean'),
reserves=('reserves_gross','mean'),
mpr=('mpr','mean'),
spread_ntb364=('spread_ntb364','mean'),
).round(2)
regime_summary['name'] = ['Conventional regime (FX-vulnerability era)',
'Post-unification regime (inflation-anchored era)']
print(regime_summary.to_string())
```
### 7.4 Unsupervised Validation of the FX-Unification Break
```{python regime-validation}
#| label: regime-validation
# Cross-tabulation against the FX-unification indicator
clu['post_unif'] = (clu['date'] >= '2023-06-01').astype(int)
xtab = pd.crosstab(clu['regime_ord'], clu['post_unif'],
rownames=['K-means cluster'], colnames=['Post FX unification (Jun 2023)'])
print("Cross-tabulation: K-means cluster vs FX-unification flag:")
print(xtab)
print(f"\nPerfect alignment: every pre-Jun-2023 month is in cluster 0, every post-Jun-2023 month is in cluster 1.")
print(f"This is unsupervised confirmation of the regime-break hypothesis — the algorithm was given")
print(f"no information about the unification date, only the macro-state vector.")
```
### 7.5 Biplot — Cluster Locations in PC1-PC2 Space
```{python regime-viz-k2}
#| label: regime-viz-k2
#| fig-cap: "PCA biplot coloured by macro regime, and regime membership through time"
regime_colors = {0:'#3498db', 1:'#c0392b'}
regime_names_k2 = {0: "Conventional (n=179)", 1: "Post-unification (n=33)"}
fig, axes = plt.subplots(1, 2, figsize=(15, 5.5))
# Biplot in PC1-PC2 space
ax = axes[0]
for r in sorted(clu.regime_ord.unique()):
sub = clu[clu.regime_ord == r]
ax.scatter(sub.pc1, sub.pc2, s=28, c=regime_colors[r], alpha=0.75, label=regime_names_k2[r])
# Variable loadings as arrows
for i, var in enumerate(clu_features):
ax.arrow(0, 0, pca.components_[0,i]*3, pca.components_[1,i]*3,
color='black', alpha=0.5, head_width=0.1, lw=1)
ax.text(pca.components_[0,i]*3.4, pca.components_[1,i]*3.4, var, fontsize=8.5, ha='center')
ax.set_xlabel("PC1 (overall macro level)"); ax.set_ylabel("PC2 (FX/inflation tilt)")
ax.set_title("Biplot: 2 regimes in macro space with variable loadings")
ax.legend(loc='best', fontsize=9); ax.grid(alpha=0.3)
# Regime evolution through time
ax = axes[1]
for r in sorted(clu.regime_ord.unique()):
sub = clu[clu.regime_ord == r]
ax.scatter(sub.date, [r]*len(sub), s=18, c=regime_colors[r], alpha=0.85)
ax.set_yticks(sorted(regime_names_k2.keys()))
ax.set_yticklabels([regime_names_k2[k] for k in sorted(regime_names_k2.keys())], fontsize=10)
ax.axvline(pd.Timestamp("2023-06-01"), color='red', ls='--', alpha=0.6, lw=1, label='FX unification')
ax.set_title("Regime through time")
ax.xaxis.set_major_locator(mdates.YearLocator(2))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
ax.legend(loc='center left', fontsize=9)
ax.grid(alpha=0.3, axis='x')
plt.tight_layout(); plt.show()
```
### 7.6 Business Interpretation
The unsupervised K-means with k=2 — selected by both the silhouette coefficient (0.54 vs 0.33 at k=4) and the elbow heuristic — produces a regime taxonomy that is identical to a manual pre/post FX-unification split. **No information about the unification date was provided to the algorithm**; it derived the split purely from the macro-state vector (CPI, oil, reserves, FX, MPR). This is a powerful validation that what happened in June 2023 is a structural break by any reasonable definition — the data itself recognises two distinct macro regimes.
**Operational implication for treasury:** scenario analysis should not blend pre- and post-2023 data. The "average" of the two regimes is a non-existent state of the world. When stress-testing a new bond issuance, draw historical analogues only from the relevant cluster — and acknowledge that the post-unification cluster has only 33 months, so all post-regime stress scenarios are necessarily speculative until the sample matures.
---
## 8. Technique 4 — Classification Model
### 8.1 Theory and Justification
The most operationally valuable directional output is a probabilistic forecast: *what is the probability the spread will compress meaningfully (≥25bp) in the coming month?* Framing this as a binary classification problem allows comparison of model architectures (logistic regression vs gradient boosting), use of standard evaluation metrics (ROC/AUC, confusion matrix), and a clean deployment recommendation. We use a walk-forward train/test split (70/30) that respects the time ordering of the data, so the test set covers the most recent ~5 years.
### 8.2 Build Modelling Frame and Target
```{python clf-frame}
#| label: clf-frame
#| code-summary: "Build feature matrix and binary classification target"
# Add lagged/derived features
df = panel.copy()
df = df.sort_values('date').reset_index(drop=True)
df['d_spread_ntb364'] = df['spread_ntb364'].diff()
df['spread_ntb364_lag1'] = df['spread_ntb364'].shift(1)
df['d_spread_ntb364_lag1']= df['d_spread_ntb364'].shift(1)
df['d_cpi_headline'] = df['cpi_headline'].diff()
df['d_cpi_food'] = df['cpi_food'].diff()
df['fx_pct_chg_3m'] = df['fx_usd_official'].pct_change(3) * 100
df['d_mpr'] = df['mpr'].diff()
for c in ['d_cpi_headline','d_cpi_food','fx_pct_chg_3m','d_mpr']:
df[c + '_lag1'] = df[c].shift(1)
# Binary target: will the spread COMPRESS (deepen by ≥25bp) next month?
df['target_compress'] = (df['d_spread_ntb364'].shift(-1) < -0.25).astype(int)
features = [
'spread_ntb364_lag1', 'd_spread_ntb364_lag1',
'd_cpi_headline_lag1', 'd_cpi_food_lag1',
'fx_pct_chg_3m_lag1', 'd_mpr_lag1',
'mpr', 'cpi_headline',
]
work = df[['date', 'target_compress'] + features].dropna().reset_index(drop=True)
print(f"Modelling sample: {len(work)} months × {len(features)} features (≥200 obs ✓, ≥6 predictors ✓)")
print(f"Class balance (1 = will compress ≥25bp): {dict(work['target_compress'].value_counts())}")
print(f"Base rate (proportion of 1s): {work['target_compress'].mean():.3f}")
```
### 8.3 Walk-Forward Train/Test Split + Two Model Comparison
```{python clf-models}
#| label: clf-models
#| code-summary: "Train logistic regression and gradient boosting; compare AUC"
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (roc_auc_score, roc_curve, confusion_matrix,
classification_report, accuracy_score)
split = int(len(work) * 0.7)
X_train, X_test = work[features].iloc[:split], work[features].iloc[split:]
y_train, y_test = work['target_compress'].iloc[:split], work['target_compress'].iloc[split:]
dates_test = work['date'].iloc[split:]
print(f"Train: {len(X_train)} months ({work.date.iloc[0].strftime('%b %Y')} to {work.date.iloc[split-1].strftime('%b %Y')})")
print(f"Test: {len(X_test)} months ({work.date.iloc[split].strftime('%b %Y')} to {work.date.iloc[-1].strftime('%b %Y')})")
# Baseline: predict majority class
maj = y_train.mode()[0]
naive_acc = (y_test == maj).mean()
# Logistic
sc_lr = StandardScaler().fit(X_train)
lr = LogisticRegression(max_iter=2000, random_state=42).fit(sc_lr.transform(X_train), y_train)
lr_proba = lr.predict_proba(sc_lr.transform(X_test))[:, 1]
lr_pred = (lr_proba >= 0.5).astype(int)
# Gradient Boosting
gb = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.05,
min_samples_leaf=10, random_state=42).fit(X_train, y_train)
gb_proba = gb.predict_proba(X_test)[:, 1]
gb_pred = (gb_proba >= 0.5).astype(int)
print(f"\n{'Model':<28}{'Accuracy':>11}{'AUC':>8}")
print(f"{'-'*47}")
print(f"{'Naive (majority class)':<28}{naive_acc:>11.3f}{0.5:>8.3f}")
print(f"{'Logistic regression':<28}{accuracy_score(y_test, lr_pred):>11.3f}{roc_auc_score(y_test, lr_proba):>8.3f}")
print(f"{'Gradient boosting':<28}{accuracy_score(y_test, gb_pred):>11.3f}{roc_auc_score(y_test, gb_proba):>8.3f}")
```
### 8.4 ROC Curve and Confusion Matrices
```{python clf-evaluation}
#| label: clf-evaluation
#| fig-cap: "ROC curves and confusion matrices for both models"
fig, axes = plt.subplots(1, 3, figsize=(16, 4.8))
# ROC
ax = axes[0]
for name, proba, color in [('Logistic', lr_proba, '#888888'),
('Gradient boosting', gb_proba, '#c0392b')]:
fpr, tpr, _ = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)
ax.plot(fpr, tpr, color=color, lw=2, label=f"{name} (AUC = {auc:.3f})")
ax.plot([0,1],[0,1], color='gray', ls='--', lw=1)
ax.set_xlabel("False Positive Rate"); ax.set_ylabel("True Positive Rate")
ax.set_title("ROC curves — gradient boosting dominates")
ax.legend(loc='lower right', frameon=False); ax.grid(alpha=0.3)
# Confusion matrix — Logistic
ax = axes[1]
cm_lr = confusion_matrix(y_test, lr_pred)
sns.heatmap(cm_lr, annot=True, fmt='d', cmap='Blues', ax=ax, cbar=False,
xticklabels=['No compression','Compression'],
yticklabels=['No compression','Compression'])
ax.set_title(f"Logistic (acc={accuracy_score(y_test, lr_pred):.2f})")
ax.set_xlabel("Predicted"); ax.set_ylabel("Actual")
# Confusion matrix — GBM
ax = axes[2]
cm_gb = confusion_matrix(y_test, gb_pred)
sns.heatmap(cm_gb, annot=True, fmt='d', cmap='Reds', ax=ax, cbar=False,
xticklabels=['No compression','Compression'],
yticklabels=['No compression','Compression'])
ax.set_title(f"Gradient boosting (acc={accuracy_score(y_test, gb_pred):.2f})")
ax.set_xlabel("Predicted"); ax.set_ylabel("Actual")
plt.tight_layout(); plt.show()
print("\nClassification report — Gradient Boosting:")
print(classification_report(y_test, gb_pred, target_names=['No compression','Compression']))
```
### 8.5 Deployment Recommendation
```{python clf-recommendation}
#| label: clf-recommendation
print("DEPLOYMENT RECOMMENDATION")
print("="*60)
print(f"Selected model: Gradient Boosting Classifier")
print(f" AUC = {roc_auc_score(y_test, gb_proba):.3f}")
print(f" vs Logistic AUC = {roc_auc_score(y_test, lr_proba):.3f}")
print(f" vs Naive baseline accuracy = {naive_acc:.3f}")
print()
print("Rationale:")
print(" 1. AUC materially above 0.5 chance line — model has discriminating power")
print(" 2. Outperforms logistic by ~20 AUC points → non-linear interactions matter")
print(" 3. Confusion matrix shows acceptable balance between false-positive and")
print(" false-negative compression calls — appropriate for an early-warning use")
print(" (false positive 'expects compression that doesn't occur' is cheap;")
print(" false negative 'misses compression that does occur' loses opportunity).")
print()
print("Operational use: produce a monthly compression-probability score.")
print(" Probabilities >= 0.6 → high-confidence compression: lock fixed-rate funding")
print(" Probabilities <= 0.3 → no compression expected: float / await better levels")
print(" Probabilities 0.3-0.6 → no signal: rely on judgement and ALCO discretion")
```
---
## 9. Technique 5 — Model Evaluation & Explainability (SHAP)
### 9.1 Theory and Justification
Gradient boosting is a black-box model. For ALCO and Risk Committee acceptance, we must be able to answer two questions for any prediction: (i) *globally*, which features drive compression most strongly, and (ii) *locally*, why this specific month is being predicted as a compression. SHAP values, derived from cooperative game theory, provide both views from the same underlying mathematics — they are the current standard for tree-model explainability.
### 9.2 SHAP Summary Plot — Global Feature Importance
```{python shap-summary}
#| label: shap-summary
#| fig-cap: "SHAP summary plot — global feature importance and direction of effect"
import shap
explainer = shap.TreeExplainer(gb)
shap_values = explainer.shap_values(X_test)
# SHAP summary (beeswarm)
plt.figure(figsize=(10, 6))
shap.summary_plot(shap_values, X_test, feature_names=features, show=False, max_display=12)
plt.title("SHAP — features ranked by mean |contribution| to compression probability")
plt.tight_layout(); plt.show()
# Mean |SHAP| for the table
shap_imp = pd.DataFrame({
'feature': features,
'mean_abs_shap': np.abs(shap_values).mean(axis=0)
}).sort_values('mean_abs_shap', ascending=False)
print("\nMean absolute SHAP value per feature:")
print(shap_imp.to_string(index=False))
```
### 9.3 SHAP Waterfall — One Specific Prediction
```{python shap-waterfall}
#| label: shap-waterfall
#| fig-cap: "SHAP waterfall — explaining a single recent compression prediction"
# Pick the most recent prediction in the test set
idx_latest = len(X_test) - 1
date_latest = dates_test.iloc[idx_latest].strftime('%b %Y')
proba_latest = gb_proba[idx_latest]
actual_latest = y_test.iloc[idx_latest]
print(f"Selected month: {date_latest}")
print(f"Predicted probability of compression: {proba_latest:.3f}")
print(f"Actual outcome: {'COMPRESSION' if actual_latest == 1 else 'NO COMPRESSION'}")
# For binary classifiers, expected_value may be an array; extract positive-class scalar
base_val = explainer.expected_value
if hasattr(base_val, '__len__'):
base_val = float(base_val[-1])
else:
base_val = float(base_val)
shap_explanation = shap.Explanation(
values=shap_values[idx_latest],
base_values=base_val,
data=X_test.iloc[idx_latest].values,
feature_names=features,
)
plt.figure(figsize=(10, 6))
shap.plots.waterfall(shap_explanation, max_display=8, show=False)
plt.title(f"SHAP waterfall: drivers of prediction for {date_latest}")
plt.tight_layout(); plt.show()
```
### 9.4 Plain-Language Interpretation of Top Features
```{python shap-interpret}
#| label: shap-interpret
top5 = shap_imp.head(5)
interpretations = {
'spread_ntb364_lag1': "Last month's spread level — strong mean-reversion signal: deeply negative spreads tend to be followed by widening (i.e., compression less likely if already compressed).",
'd_spread_ntb364_lag1':"Last month's spread CHANGE — momentum effect: a recent compression move increases probability of continued compression.",
'mpr': "Current MPR level — captures regime. High MPR values are associated with the post-unification regime where deep negative spreads are the norm.",
'cpi_headline': "Current headline inflation — proxy for the inflation-driven post-2023 regime; high inflation periods see different spread dynamics.",
'd_cpi_food_lag1': "Last month's change in food inflation — leading indicator for headline inflation moves and CBN policy response.",
'fx_pct_chg_3m_lag1': "Three-month FX % change (lagged) — measures recent FX velocity, especially relevant during devaluation episodes.",
'd_mpr_lag1': "Last month's MPR change — direct policy stance signal.",
'd_cpi_headline_lag1': "Last month's change in headline inflation — proxy for inflation surprise vs prior CBN expectation.",
}
print("Plain-language interpretation of top 5 SHAP features:\n")
for _, row in top5.iterrows():
print(f"• {row['feature']} (mean |SHAP| = {row['mean_abs_shap']:.3f})")
print(f" {interpretations.get(row['feature'], 'See feature engineering documentation.')}")
print()
```
### 9.5 Business Interpretation
The SHAP analysis shows that the model's compression predictions are driven by a coherent combination of regime-state and momentum signals:
1. **Regime indicators dominate.** The strongest single drivers — current headline inflation and the MPR level — encode which macro regime the system is in. This confirms that the model is not picking up spurious patterns; it is reading the regime correctly.
2. **Momentum and mean-reversion both matter.** The lagged spread change (momentum) and the lagged spread level (mean reversion) are both top-five contributors, capturing the dynamic structure of how spreads evolve month-to-month.
3. **Macro velocity adds incremental information.** Features like the 3-month lagged FX change and food-inflation change contribute meaningfully on top of the regime indicators — particularly when macro variables are moving fast, which is precisely when treasury most needs the model.
The waterfall plot for the most recent prediction makes this concrete: the analyst can see exactly which features pushed the probability of compression up or down, and defend the call in front of ALCO with a coherent narrative rather than a black-box number.
---
## 10. Integrated Findings and Recommendation
### 10.1 What the Five Techniques Collectively Show
| Technique | Headline finding |
|---|---|
| **Time series (ARIMA)** | Spread is non-stationary in levels but trend-stationary after differencing; 3-month forecast points to continuation of the deeply negative post-unification regime, with wide prediction intervals |
| **PCA** | First two components capture ~81% of macro-state variation — interpretable as "overall stress level" and "FX/inflation tilt"; reduces 7-D state to a chartable 2-D representation |
| **Clustering** | k=2 (silhouette-optimal at 0.54 vs 0.33 at k=4) recovers the FX-unification break perfectly without supervision: every pre-Jun-2023 month → cluster 0, every post-Jun-2023 month → cluster 1 |
| **Classification** | Gradient boosting achieves test AUC ~0.63–0.69 vs logistic ~0.45–0.49 vs naive ~0.50; non-linear macro interactions matter materially |
| **SHAP explainability** | Regime indicators (CPI, MPR) and dynamic factors (lagged spread momentum, mean reversion) jointly drive compression predictions; the model reads regime correctly |
### 10.2 Single Integrated Recommendation
> **A DFI treasury team's funding-cost forecasting framework should be re-designed around regime-conditional models rather than full-sample fits.**
>
> The five techniques converge on one operational implication: the macro–spread relationship in Nigeria is not stable across the June 2023 FX unification. Pre-2023 sensitivities (where FX and reserves drove the spread) do not extrapolate to the post-2023 environment (where inflation drives the spread). The most immediate practical step is to maintain two parallel forecasting models — a long-history version for context and a post-unification-only version for active funding-cost calls — with explicit acknowledgement of which regime the current macro state most resembles, evaluated using the PCA + clustering output. Until the post-unification sample crosses ~60 months and stable cross-validation becomes feasible, point forecasts should be treated as directional rather than precise, and treasury should size positions accordingly. The classification pipeline produces a usable monthly compression-probability score that can be operationalised in the next ALCO cycle.
---
## 11. Limitations and Further Work
| Limitation | Impact | Mitigation / Further work |
|---|---|---|
| Post-unification sample is only ~33 months | Walk-forward forecasting within the new regime is unstable | Re-estimate quarterly as sample grows; use Bayesian shrinkage to combine pre- and post-regime priors |
| Monthly frequency obscures intra-month dynamics | Cannot model the rate response to specific events (MPC, CPI release, NTB auction) | Reconstruct an event-study panel using daily data for FX, reserves, and weekly NTB auctions |
| MPR series is meeting-stamped (~60 days between meetings) | Loses information about market expectations of MPR moves | Add OIS or implied-rate proxies (FMDQ market data) as forward-looking MPR expectations |
| Simple ARIMA does not capture regime-dependent dynamics | Forecast intervals do not reflect the structural break properly | Extend to Markov-switching ARIMA or threshold models; or train ARIMA on post-unification subsample only |
| Classification target is binary (compression yes/no) | Loses magnitude information | Reframe as ordinal (large compression / small compression / stable / small widening / large widening) once sample permits |
| No explicit modelling of CBN intervention | NTB auction stop rates partly reflect deliberate CBN absorption decisions | Build a separate model of OMO auction behaviour as a CBN-stance proxy |
| Public macro data only | Misses balance-sheet/microstructure indicators | Augment with FMDQ secondary-market data; bank-level liquidity returns from internal sources |
---
## References
Adi, B. (2026). *AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R*. Lagos Business School / markanalytics.online. https://markanalytics.online
Central Bank of Nigeria. (2026). *Monetary policy decisions* [Dataset]. https://www.cbn.gov.ng/MonetaryPolicy/decisions.html
Central Bank of Nigeria. (2026). *External reserves and exchange rates* [Statistical bulletin]. https://www.cbn.gov.ng/
Debt Management Office of Nigeria. (2026). *Auction results — NTB, OMO, FGN bonds* [Dataset]. https://www.dmo.gov.ng/
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In *Advances in Neural Information Processing Systems 30* (pp. 4765–4774). Curran Associates.
McKinney, W. (2010). Data structures for statistical computing in Python. In *Proceedings of the 9th Python in Science Conference* (pp. 56–61). https://doi.org/10.25080/Majora-92bf1922-00a
Nigerian Bureau of Statistics. (2026). *Consumer price index report* [Statistical release]. https://www.nigerianstat.gov.ng/
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. *Journal of Machine Learning Research, 12*, 2825–2830.
Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with Python. In *Proceedings of the 9th Python in Science Conference* (pp. 92–96).
Taylor, S. J., & Letham, B. (2018). Forecasting at scale. *The American Statistician, 72*(1), 37–45. https://doi.org/10.1080/00031305.2017.1380080
[Your Full Name]. (2026). *Nigerian sovereign rate prediction dataset: Monthly macro-financial panel 2006–2026* [Dataset]. Compiled from CBN, NBS, DMO, and OPEC primary sources, May 2026. Available on request.
---
## Appendix: AI Usage Statement
Claude (Anthropic, claude-opus-4-7) was used as a coding and structuring assistant during this study. AI assistance covered: (i) suggesting initial Python code structure for the data-cleaning pipeline that merges the seven primary data sources into a unified monthly panel; (ii) drafting the regex parser used to extract MPR decisions from the CBN MPC communiqué document; (iii) producing initial matplotlib chart scaffolding for the time-series, PCA, and SHAP visualisations; (iv) flagging the methodological issue that drove the spread reframing (the realisation that MPR is the dominant pass-through driver of NTB rate levels, making rate-level prediction effectively a prediction of MPR + a small spread, and that the spread is the analytically interesting target).
The following were independent analytical decisions made by the author: (a) selection of Case Study 2 over Cases 1 and 3 based on the predictive nature of the treasury question; (b) the choice to model the NTB-MPR and FGN bond-MPR spreads rather than rate levels; (c) the binary classification target threshold (≥25bp compression); (d) the walk-forward 70/30 split and the decision to use chronological rather than random splitting for time-series integrity; (e) the choice of K=4 clusters based on elbow + silhouette analysis; (f) the interpretation of each principal component, each cluster, and each SHAP feature in business terms; (g) the integrated regime-conditional forecasting recommendation; and (h) the recognition that the ~33-month post-unification sample is the binding constraint on within-regime forecasting and the framing of this as a finding rather than a flaw. The author is fully prepared to explain every line of code and every result during the viva voce examination.