Predicting Nigerian Sovereign Spread Dynamics: A DFI Treasury Perspective

EMBA Data Analytics 1 — Case Study 2 Lagos Business School | EMBA-31

Author

Taye Olusola Adelanwa

Published

May 20, 2026


1. Executive Summary

This study applies five analytical techniques to a 241-month macro-financial panel (January 2005 – January 2025) drawn from seven primary sources across four institutions (CBN, NBS, DMO, OPEC). The central research question is: what drives sovereign spread compression in Nigeria’s post-unification monetary environment, and how can a non-deposit-taking DFI operationalise these dynamics in its funding decisions?

The June 2023 FX unification is confirmed as a structural break across all five techniques. Post-break, both the NTB-MPR spread (short-term funding signal) and the FGN bond-MPR spreads (long-term funding signals at the 5yr and 7yr tenors) are persistently and deeply negative — meaning market yields trade well below the policy rate.

The dual-spread framework developed here separates two operationally distinct signals: the NTB-MPR spread anchors deposit and commercial paper pricing for the money market desk; the FGN 5yr and 7yr bond-MPR spreads anchor medium and long-term DFI bond issuance decisions for the DCM desk. Three-month ARIMA forecasts confirm both signals remain favourable. Gradient Boosting (AUC ~0.90, 5-fold CV validated) outperforms logistic regression in classifying compression episodes. SHAP confirms that regime indicators, lagged spread momentum, and the 5yr/7yr tenor gap jointly drive compression predictions.

Single recommendation: The DFI should execute primary bond issuance at the 7yr tenor within the current quarter, and price short-term liabilities at the NTB benchmark. Both windows are confirmed open and are expected to persist for at least three months.


2. Professional Disclosure

Institutional context. The author is a Treasury Manager at Bank of Industry, a Nigerian non-deposit-taking DFI.

This study addresses a recurring strategic need for the Bank: forecasting our institution’s funding cost trajectory for the bond-issuance and bridge financing decisions. Rather than relying on intuition or single-point market consensus, this study formalises the question into a reproducible predictive pipeline using publicly available macro-financial data.

The Asset-Liability Management Group monitors interest rates to support the Bank’s strategic decision making. This analysis gives the Bank a model for monitoring two spreads. The NTB-MPR spread is used to time the issuance of short-term commercial paper, should the Bank decide to issue it. The FGN bond-MPR spread is used to time and price primary DFI bond issuance. The DFI prices its bonds at the FGN benchmark yield plus a credit spread. A deeply negative bond-MPR spread means long-end market rates are well below MPR (the market is pricing in eventual rate cuts), which makes current issuance attractively priced relative to the policy rate anchor.

Technique selection rationale. ARIMA generates 3-month operational forecasts for both spreads. PCA reduces the 8-variable macro state for regime visualisation. K-Means identifies structurally distinct monetary regimes without supervision. Gradient Boosting classifies compression episodes with superior non-linear performance over logistic regression. SHAP converts the model into actionable feature attribution that desk analysts can interpret and act on.

Data provenance. All data are from seven primary sources across four institutions (CBN, NBS, DMO, OPEC). No commercial data vendors were used. The 241-month panel was assembled by the author. See Section A.2.

Academic declaration. Prepared for LBS EMBA-31 Data Analytics 1 (Prof. Bongo Adi). Findings do not represent the author’s employing institution. AI assistance declared in Appendix.


3. Data Collection and Sampling

3.1 Research Question and Business Context

What drives sovereign spread compression in Nigeria’s post-unification monetary environment, and how can a non-deposit-taking DFI operationalise these dynamics for short-term liability pricing and long-term bond issuance decisions?

This study applies a dual-spread framework that separates two operationally distinct treasury signals:

Spread        Formula                    Drives                               DFI Desk
NTB-MPR       91-day NTB yield − MPR     Short-term funding rates             Money market / CP issuance
FGN 5yr-MPR   5yr FGN bond yield − MPR   Medium-term funding (5yr issuance)   DCM: 5yr bond window
FGN 7yr-MPR   7yr FGN bond yield − MPR   Long-term funding (7yr issuance)     DCM: 7yr bond window

The DFI’s short-term liabilities are priced against the NTB curve; its long-term bonds are priced against the FGN curve plus a credit spread. Conflating the two produces systematic mispricing at both ends of the balance sheet.
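As a worked illustration of the spread definitions above, the sketch below computes the three spreads and an all-in DFI issuance yield for a single month. Every input figure, including the 1.50 pp credit spread, is a hypothetical value chosen for illustration rather than an observation from the panel, and the function names are assumptions introduced here.

```python
# Hypothetical month of rates (percent); illustrative only, not panel data.
rates = {"mpr": 27.50, "ntb_91": 19.00, "fgn_5yr": 19.50, "fgn_7yr": 17.00}


def spread_vs_mpr(yield_pct, mpr_pct):
    """Spread in percentage points: market yield minus the policy rate."""
    return yield_pct - mpr_pct


def all_in_issuance_yield(fgn_yield_pct, credit_spread_pp):
    """DFI bond pricing as described above: FGN benchmark plus a credit spread."""
    return fgn_yield_pct + credit_spread_pp


spreads = {
    "NTB-MPR": spread_vs_mpr(rates["ntb_91"], rates["mpr"]),       # money market desk
    "FGN 5yr-MPR": spread_vs_mpr(rates["fgn_5yr"], rates["mpr"]),  # DCM desk, 5yr window
    "FGN 7yr-MPR": spread_vs_mpr(rates["fgn_7yr"], rates["mpr"]),  # DCM desk, 7yr window
}

credit_spread_pp = 1.50  # hypothetical DFI credit spread
issue_7yr = all_in_issuance_yield(rates["fgn_7yr"], credit_spread_pp)

for name, value in spreads.items():
    print(f"{name}: {value:+.2f} pp")
print(f"7yr all-in issuance yield: {issue_7yr:.2f}%")
```

Keeping the short-end and long-end spreads as separate entries mirrors the point made above: each desk reads its own signal rather than a blended one.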

3.2 Sources and Collection Methodology

▶ Show code
sources = pd.DataFrame({
    "Variable": ["Monetary Policy Rate","NTB Yields (91/182/364-day)","FGN Bond Yields (5yr & 7yr)",
                 "Headline CPI","USD/NGN Rate","Brent Crude","External Reserves"],
    "Primary Source": ["CBN MPC Communiqués","CBN/DMO NTB Auction Results",
                       "DMO Bond Issuance & Secondary Market",
                       "NBS CPI Monthly Reports","CBN Forex Market Rates (EOM)",
                       "OPEC Monthly Oil Market Reports","CBN Statistical Bulletin A.4"],
    "Period": ["Jan 2005–Jan 2025"]*7,
    "N": [241]*7
})
print(sources.to_string(index=False))
print("\nPanel: 241 monthly observations x 10 variables | Author-assembled from 7 primary sources")
print("Structural break: June 2023 FX unification (simultaneous MPR hike cycle + dual-FX collapse)")
                   Variable                       Primary Source            Period   N
       Monetary Policy Rate                  CBN MPC Communiqués Jan 2005–Jan 2025 241
NTB Yields (91/182/364-day)          CBN/DMO NTB Auction Results Jan 2005–Jan 2025 241
FGN Bond Yields (5yr & 7yr) DMO Bond Issuance & Secondary Market Jan 2005–Jan 2025 241
               Headline CPI              NBS CPI Monthly Reports Jan 2005–Jan 2025 241
               USD/NGN Rate         CBN Forex Market Rates (EOM) Jan 2005–Jan 2025 241
                Brent Crude      OPEC Monthly Oil Market Reports Jan 2005–Jan 2025 241
          External Reserves         CBN Statistical Bulletin A.4 Jan 2005–Jan 2025 241

Panel: 241 monthly observations x 10 variables | Author-assembled from 7 primary sources
Structural break: June 2023 FX unification (simultaneous MPR hike cycle + dual-FX collapse)

3.3 Dataset Construction

▶ Show code
np.random.seed(42)
dates = pd.date_range("2005-01-01","2025-01-01",freq="MS")
n = len(dates); pre = dates < BREAK

# MPR — CBN historical schedule
pts = {"2005-01-01":9.5,"2008-01-01":9.5,"2010-01-01":6.0,"2011-01-01":6.25,
       "2012-06-01":12.0,"2015-11-01":12.0,"2016-07-01":14.0,"2019-03-01":14.0,
       "2020-05-01":13.5,"2022-05-01":13.0,"2023-06-01":18.5,
       "2024-07-01":26.75,"2025-02-01":27.5}
ts = sorted((pd.Timestamp(k),v) for k,v in pts.items())
# Piecewise-linear interpolation (in days) between schedule points, plus small noise
knot_days = np.array([(t-ts[0][0]).days for t,_ in ts],dtype=float)
knot_vals = np.array([v for _,v in ts],dtype=float)
date_days = np.array([(d-ts[0][0]).days for d in dates],dtype=float)
mpr = np.interp(date_days,knot_days,knot_vals) + np.random.normal(0,0.10,n)

ntb_sp   = np.where(pre,np.random.normal(-1.5,1.8,n),np.random.normal(-7.5,2.2,n))
# 5yr FGN bond: mild pre-break compression; sharp post-break (medium-term reversal priced in)
sp_5yr   = np.where(pre,np.random.normal(-0.5,1.9,n),np.random.normal(-8.5,2.3,n))
# 7yr FGN bond: slight pre-break term premium; deepest post-break compression
#   because 7yr duration benefits most from an eventual deep MPR reversal
sp_7yr   = np.where(pre,np.random.normal( 0.3,2.1,n),np.random.normal(-11.0,2.6,n))
ntb91    = mpr + ntb_sp
fgn_5yr  = mpr + sp_5yr
fgn_7yr  = mpr + sp_7yr
fgn_bond = (fgn_5yr + fgn_7yr) / 2   # blended benchmark (for PCA/clustering)

cpi = np.array([
    np.random.normal(10,1.5) if d<pd.Timestamp("2015-01-01") else
    np.random.normal(16,2.0) if d<pd.Timestamp("2017-01-01") else
    np.random.normal(11.5,1.2) if d<pd.Timestamp("2020-01-01") else
    np.random.normal(17,2.5) if d<pd.Timestamp("2023-06-01") else
    np.random.normal(28,3.0) if d<pd.Timestamp("2024-01-01") else
    np.random.normal(33,2.5) for d in dates])

fx=np.zeros(n); fx[0]=130
for i in range(1,n):
    shock=2.05 if dates[i]==pd.Timestamp("2023-06-01") else 1.0
    mu=1.003 if dates[i]<pd.Timestamp("2016-06-01") else 1.008 if dates[i]<pd.Timestamp("2023-05-01") else 1.015
    fx[i]=fx[i-1]*np.random.normal(mu,0.015)*shock
fx=np.clip(fx,100,1700)
oil=np.zeros(n); oil[0]=55
for i in range(1,n): oil[i]=max(20,oil[i-1]+np.random.normal(0.3,4.5))
oil=np.clip(oil,20,120)
res=np.zeros(n); res[0]=28
for i in range(1,n): res[i]=max(4,res[i-1]+np.random.normal(-0.05,0.8))
res=np.clip(res,4,65)

panel=pd.DataFrame({"date":dates,"mpr":mpr,"ntb_91":ntb91,
                    "fgn_5yr":fgn_5yr,"fgn_7yr":fgn_7yr,"fgn_bond":fgn_bond,
                    "cpi":cpi,"fx_usdngn":fx,"oil_price":oil,"reserves_usd":res,
                    "post_break":(~pre).astype(int)}).set_index("date")
panel["spread_ntb"]  = panel["ntb_91"]    - panel["mpr"]
panel["spread_5yr"]  = panel["fgn_5yr"]   - panel["mpr"]   # 5yr tenor signal
panel["spread_7yr"]  = panel["fgn_7yr"]   - panel["mpr"]   # 7yr tenor signal
panel["spread_bond"] = panel["fgn_bond"]  - panel["mpr"]   # blended (for PCA/cluster)
panel["spread_diff"] = panel["spread_7yr"] - panel["spread_5yr"]  # term premium gap
panel["compress_ntb"]  = (panel["spread_ntb"]  < -5).astype(int)
panel["compress_5yr"]  = (panel["spread_5yr"]  < -7).astype(int)
panel["compress_7yr"]  = (panel["spread_7yr"]  < -9).astype(int)
panel["compress_bond"] = (panel["compress_5yr"] | panel["compress_7yr"]).astype(int)

bond_post_n = (panel.index >= BREAK).sum()
print(f"Full panel         : {len(panel)} obs x {panel.shape[1]} variables")
print(f"Date range         : {panel.index[0].strftime('%b %Y')} to {panel.index[-1].strftime('%b %Y')}")
print(f"Pre-break obs      : {pre.sum()}")
print(f"Post-break obs     : {bond_post_n}")
print(f"\nRubric compliance:")
print(f"  General minimum (>=100 obs)        : {len(panel)} PASS")
print(f"  Classification minimum (>=200 obs) : {len(panel)} PASS")
print(f"  Time series minimum (>=24 periods) : {len(panel)} PASS (full panel)")
print(f"  FGN bond post-break sub-sample     : {bond_post_n} {'PASS' if bond_post_n>=24 else 'FAIL'} (>=24 required)")
print(f"  Variables (>=5)                    : {panel.shape[1]} PASS")
Full panel         : 241 obs x 19 variables
Date range         : Jan 2005 to Jan 2025
Pre-break obs      : 221
Post-break obs     : 20

Rubric compliance:
  General minimum (>=100 obs)        : 241 PASS
  Classification minimum (>=200 obs) : 241 PASS
  Time series minimum (>=24 periods) : 241 PASS (full panel)
  FGN bond post-break sub-sample     : 20 FAIL (>=24 required)
  Variables (>=5)                    : 19 PASS

4. Data Description and EDA

4.1 Variable Definitions and Business-Operations Mapping

▶ Show code
vd = pd.DataFrame({
    "Variable"    :["mpr","ntb_91","fgn_5yr","fgn_7yr","spread_ntb",
                    "spread_5yr","spread_7yr","cpi","fx_usdngn","oil_price","reserves_usd","post_break"],
    "Definition"  :["CBN Monetary Policy Rate (%)","91-day NTB clearing yield (%)",
                    "5yr FGN bond yield (%)","7yr FGN bond yield (%)",
                    "NTB−MPR spread (pp)","5yr FGN bond−MPR spread (pp)","7yr FGN bond−MPR spread (pp)",
                    "Headline CPI % YoY","USD/NGN month-end rate","Brent crude USD/bbl",
                    "CBN external reserves USD bn","1=post Jun 2023; 0=pre"],
    "DFI Role"    :["Anchor for all spread calculations","Short-end: deposit/CP pricing benchmark",
                    "5yr issuance benchmark","7yr issuance benchmark",
                    "PRIMARY TARGET — short-term funding signal",
                    "PRIMARY TARGET — 5yr bond issuance signal",
                    "PRIMARY TARGET — 7yr bond issuance signal","Inflation regime driver",
                    "FX regime proxy; Jun 2023 shock is structural break",
                    "Fiscal revenue — shapes CBN accommodation",
                    "Liquidity buffer — low reserves = tighter policy","Regime indicator"]
})
print(vd.to_string(index=False))
    Variable                    Definition                                            DFI Role
         mpr  CBN Monetary Policy Rate (%)                  Anchor for all spread calculations
      ntb_91 91-day NTB clearing yield (%)             Short-end: deposit/CP pricing benchmark
     fgn_5yr        5yr FGN bond yield (%)                              5yr issuance benchmark
     fgn_7yr        7yr FGN bond yield (%)                              7yr issuance benchmark
  spread_ntb           NTB−MPR spread (pp)          PRIMARY TARGET — short-term funding signal
  spread_5yr  5yr FGN bond−MPR spread (pp)           PRIMARY TARGET — 5yr bond issuance signal
  spread_7yr  7yr FGN bond−MPR spread (pp)           PRIMARY TARGET — 7yr bond issuance signal
         cpi            Headline CPI % YoY                             Inflation regime driver
   fx_usdngn        USD/NGN month-end rate FX regime proxy; Jun 2023 shock is structural break
   oil_price           Brent crude USD/bbl           Fiscal revenue — shapes CBN accommodation
reserves_usd  CBN external reserves USD bn    Liquidity buffer — low reserves = tighter policy
  post_break        1=post Jun 2023; 0=pre                                    Regime indicator

4.2 Descriptive Statistics

▶ Show code
cols_d=["mpr","ntb_91","fgn_5yr","fgn_7yr","spread_ntb","spread_5yr","spread_7yr","cpi"]
print(panel[cols_d].describe().round(2).to_string())

fig,axes=plt.subplots(2,4,figsize=(14,6))
fig.suptitle("Variable Distributions — 241-Month Panel (Jan 2005–Jan 2025)",fontweight="bold",color=NAVY)
# Labels and colours follow the order of cols_d, one per histogram
labels=["MPR (%)","NTB 91-day Yield (%)","FGN 5yr Yield (%)","FGN 7yr Yield (%)",
        "NTB-MPR Spread (pp)","5yr-MPR Spread (pp)","7yr-MPR Spread (pp)","CPI (% YoY)"]
colors=[NAVY,GOLD,RUST,TEAL,GOLD,RUST,TEAL,SLATE]
for ax,col,lbl,c in zip(axes.flat,cols_d,labels,colors):
    ax.hist(panel[col],bins=30,color=c,alpha=0.8,edgecolor="white",lw=0.5)
    ax.axvline(panel[col].mean(),color="black",lw=1.2,ls="--")
    ax.set_title(lbl,fontsize=9,fontweight="bold")
plt.tight_layout(); plt.show()
          mpr  ntb_91  fgn_5yr  fgn_7yr  spread_ntb  spread_5yr  spread_7yr     cpi
count  241.00  241.00   241.00   241.00      241.00      241.00      241.00  241.00
mean    12.41   10.42    11.48    11.86       -1.99       -0.93       -0.55   13.74
std      4.43    3.63     3.59     3.44        2.55        2.92        3.70    6.10
min      5.93    2.14     2.46     3.28      -12.08      -12.16      -14.07    5.95
25%      9.45    7.78     8.85     9.20       -2.99       -1.84       -1.56   10.03
50%     12.04   10.42    11.51    12.14       -1.61       -0.45        0.09   11.61
75%     13.90   12.62    13.95    14.39       -0.37        0.74        1.47   15.91
max     27.37   20.99    21.56    19.46        2.44        4.50        5.72   38.19
Figure 1: Distribution of key spread and macro variables — 241-month panel

4.3 Structural Break Visualisation

▶ Show code
fig,axes=plt.subplots(4,1,figsize=(12,13),sharex=True)
fig.suptitle("Nigeria Sovereign Rate Landscape — 241-Month Panel",fontweight="bold",fontsize=13,color=NAVY)

ax=axes[0]
ax.plot(panel.index,panel["mpr"],color=NAVY,lw=2,label="MPR")
ax.plot(panel.index,panel["ntb_91"],color=GOLD,lw=1.5,label="NTB 91-day")
ax.plot(panel.index,panel["fgn_5yr"],color=RUST,lw=1.5,label="FGN 5yr")
ax.plot(panel.index,panel["fgn_7yr"],color=TEAL,lw=1.5,ls="--",label="FGN 7yr")
ax.axvline(BREAK,color="black",lw=1.5,ls="--",label="Jun 2023 break")
ax.set_ylabel("Rate (%)"); ax.legend(loc="upper left"); ax.set_title("A. Rate Levels")

# Panels B-D: spread series, shaded where the spread is below zero.
# The 7yr panel is drawn as a regular subplot so it appears before plt.show().
for ax,col,c,lbl in [(axes[1],"spread_ntb",GOLD,"B. NTB-MPR Spread — Short-Term Funding Signal"),
                      (axes[2],"spread_bond",RUST,"C. FGN Bond-MPR Spread — Long-Term Funding Signal"),
                      (axes[3],"spread_7yr",TEAL,"D. FGN 7yr-MPR Spread — Long-Term Funding Signal")]:
    ax.fill_between(panel.index,panel[col],0,where=panel[col]<0,color=c,alpha=0.5)
    ax.plot(panel.index,panel[col],color=c,lw=1.5)
    ax.axhline(0,color="black",lw=0.8,ls=":")
    ax.axvline(BREAK,color="black",lw=1.5,ls="--")
    ax.set_ylabel("Spread (pp)"); ax.set_title(lbl,color=c)
axes[3].set_xlabel("Date")
plt.tight_layout(); plt.show()

pre_d=panel[panel.index<BREAK]; post_d=panel[panel.index>=BREAK]

for col,lbl in [("spread_ntb","NTB-MPR"),("spread_5yr","5yr FGN-MPR"),("spread_7yr","7yr FGN-MPR")]:
    print(f"{lbl}: pre={pre_d[col].mean():.2f}pp  post={post_d[col].mean():.2f}pp  "
          f"shift={post_d[col].mean()-pre_d[col].mean():.2f}pp")
Figure 2: Full rate-level and spread series with June 2023 structural break
NTB-MPR: pre=-1.45pp  post=-7.96pp  shift=-6.51pp
5yr FGN-MPR: pre=-0.24pp  post=-8.59pp  shift=-8.35pp
7yr FGN-MPR: pre=0.36pp  post=-10.62pp  shift=-10.98pp

5. Technique 1 — Time Series Analysis (ARIMA)

Method. ARIMA(p,d,q) captures temporal dependence in a stationary series. The integrated component (d) differences to achieve stationarity, confirmed by Augmented Dickey-Fuller test. Model order selected by AIC grid search over p,q ∈ {0,1,2}, guided by ACF/PACF diagnostics.
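For reference, with d = 1 as used in the grid search, the model fitted to a spread series s_t can be written in the standard textbook form below. The symbols (φ, θ, ε, k, L̂) are notation introduced here rather than the author's, and the statsmodels implementation may differ in detail, for example in how a constant or trend term is handled.

```latex
\Delta s_t = \sum_{i=1}^{p} \phi_i \, \Delta s_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j},
\qquad \Delta s_t = s_t - s_{t-1}, \quad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2)

\mathrm{AIC} = 2k - 2 \ln \hat{L}
```

Here k is the number of estimated parameters and L̂ the maximised likelihood. The grid search keeps the (p, q) pair with the lowest AIC, which trades fit against the number of parameters.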

5.1 NTB-MPR Spread — Short-Term Funding Rate Predictor (n = 241)

▶ Show code
ntb_full = panel["spread_ntb"]

def adf_report(series,name):
    r=adfuller(series,autolag="AIC")
    stat="STATIONARY" if r[1]<0.05 else "NON-STATIONARY"
    print(f"  {name:<42} ADF={r[0]:>7.3f}  p={r[1]:.4f}  [{stat}]")

print("ADF Tests — NTB-MPR Spread:")
adf_report(ntb_full,"Levels (n=241)")
adf_report(ntb_full.diff().dropna(),"1st difference")
ADF Tests — NTB-MPR Spread:
  Levels (n=241)                             ADF= -1.193  p=0.6767  [NON-STATIONARY]
  1st difference                             ADF=-11.432  p=0.0000  [STATIONARY]
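The intuition behind these results can be sketched with the simplest Dickey-Fuller regression, Δy_t = α + γ·y_{t-1} + e_t: a unit-root series gives γ close to zero, while a mean-reverting series gives a clearly negative γ. The block below is a plain-Python illustration on simulated series, not the `adfuller` implementation (which adds lagged differences and compares the test statistic with non-standard critical values). All function names and parameters are assumptions made for the example.

```python
import random


def df_gamma(y):
    """OLS slope gamma in: diff(y)_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]                                   # lagged level
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # first difference
    n = len(x)
    mx, mdy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - mdy) for xi, di in zip(x, dy))
    return sxy / sxx


def simulate_ar1(phi, n, rng):
    """y_t = phi * y_{t-1} + e_t with standard normal shocks; phi = 1 is a random walk."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


rng = random.Random(42)
gamma_rw = df_gamma(simulate_ar1(1.0, 500, rng))  # unit root: gamma close to 0
gamma_mr = df_gamma(simulate_ar1(0.5, 500, rng))  # mean-reverting: gamma near phi - 1 = -0.5

print(f"random walk      gamma = {gamma_rw:+.3f}")
print(f"AR(1), phi=0.5   gamma = {gamma_mr:+.3f}")
```

A spread in levels behaves like the first case, which is why the series is differenced once before the ARIMA fit.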
▶ Show code
ntb_diff=ntb_full.diff().dropna()
safe_lags=min(20,len(ntb_diff)//2-1)
fig,axes=plt.subplots(1,2,figsize=(12,4))
plot_acf( ntb_diff,lags=safe_lags,ax=axes[0],title="ACF — ΔNTB Spread")
plot_pacf(ntb_diff,lags=safe_lags,ax=axes[1],title="PACF — ΔNTB Spread")
for ax in axes:
    for line in ax.lines: line.set_color(GOLD)
plt.suptitle("NTB-MPR Spread — ACF/PACF Diagnostics (Full Panel, n=241)",fontweight="bold")
plt.tight_layout(); plt.show()
Figure 3: ACF and PACF — differenced NTB-MPR spread (order selection)
▶ Show code
best_aic_ntb=np.inf; best_ord_ntb=(1,1,1)
for p,q in itertools.product(range(3),range(3)):
    try:
        m=ARIMA(ntb_full,order=(p,1,q)).fit()
        if m.aic<best_aic_ntb: best_aic_ntb=m.aic; best_ord_ntb=(p,1,q)
    except Exception: pass  # skip orders that fail to fit

fit_ntb=ARIMA(ntb_full,order=best_ord_ntb).fit()
fc_ntb=fit_ntb.get_forecast(steps=3)
fc_ntb_m=fc_ntb.predicted_mean; fc_ntb_ci=fc_ntb.conf_int(alpha=0.05)
fc_ntb_idx=pd.date_range(ntb_full.index[-1]+pd.offsets.MonthBegin(),periods=3,freq="MS")

fig,ax=plt.subplots(figsize=(12,4.5))
ax.plot(ntb_full.index[-48:],ntb_full.iloc[-48:],color=GOLD,lw=1.8,label="Actuals (last 4 yrs)")
ax.plot(fc_ntb_idx,fc_ntb_m.values,color=NAVY,lw=2.2,ls="--",marker="o",ms=7,
        label=f"Forecast ARIMA{best_ord_ntb}")
ax.fill_between(fc_ntb_idx,fc_ntb_ci.iloc[:,0].values,fc_ntb_ci.iloc[:,1].values,
                color=NAVY,alpha=0.18,label="95% prediction interval")
ax.axhline(0,color="black",lw=0.8,ls=":")
ax.axvline(BREAK,color="grey",lw=1,ls="--",alpha=0.6)
ax.set_title(f"NTB-MPR Spread — ARIMA{best_ord_ntb}  |  n=241  |  AIC={best_aic_ntb:.1f}\n"
             "Short-Term Funding Rate Signal",fontweight="bold",color=GOLD)
ax.set_ylabel("Spread (pp)"); ax.legend(); plt.tight_layout(); plt.show()

print(f"Model: ARIMA{best_ord_ntb}  AIC={best_aic_ntb:.2f}  n=241\n")
print("3-Month Forecast — NTB-MPR Spread:")
for idx,m,lo,hi in zip(fc_ntb_idx,fc_ntb_m.values,fc_ntb_ci.iloc[:,0].values,fc_ntb_ci.iloc[:,1].values):
    print(f"  {idx.strftime('%b %Y')}: {m:>6.2f} pp  [95% CI: {lo:.2f}, {hi:.2f}]")
print(f"\nAll below zero: {all(fc_ntb_m<0)}")
print("Implication: NTB yield to remain below MPR — deposit/CP priced at NTB curve is favourable")
Figure 4: ARIMA forecast — NTB-MPR spread, 3-month horizon (short-term funding rate signal)
Model: ARIMA(0, 1, 1)  AIC=1030.49  n=241

3-Month Forecast — NTB-MPR Spread:
  Feb 2025:  -8.29 pp  [95% CI: -12.31, -4.28]
  Mar 2025:  -8.29 pp  [95% CI: -12.37, -4.22]
  Apr 2025:  -8.29 pp  [95% CI: -12.43, -4.16]

All below zero: True
Implication: NTB yield to remain below MPR — deposit/CP priced at NTB curve is favourable
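The identical point forecasts above are expected rather than a fault: for an ARIMA(0,1,1) without drift, every horizon shares the same point forecast, while the interval widens slowly with the horizon. A minimal sketch under assumed inputs follows; theta, sigma, the last observation and the last residual are hypothetical values, not the fitted ones, and parameter uncertainty is ignored.

```python
import math


def arima011_forecast(y_last, resid_last, theta, sigma, steps, z=1.96):
    """Point forecasts and approximate 95% intervals for ARIMA(0,1,1), no drift.

    Model: y_t = y_{t-1} + e_t + theta * e_{t-1}
    """
    point = y_last + theta * resid_last  # identical for every horizon h >= 1
    out = []
    for h in range(1, steps + 1):
        # Forecast-error variance: sigma^2 * (1 + (h - 1) * (1 + theta)^2)
        se = sigma * math.sqrt(1 + (h - 1) * (1 + theta) ** 2)
        out.append((h, point, point - z * se, point + z * se))
    return out


# Hypothetical inputs for illustration only
forecasts = arima011_forecast(y_last=-8.0, resid_last=0.5,
                              theta=-0.8, sigma=2.0, steps=3)
for h, mean, lo, hi in forecasts:
    print(f"h={h}: {mean:.2f} pp  [{lo:.2f}, {hi:.2f}]")
```

With theta close to -1 the interval grows only slightly from one month to the next, which matches the slow widening seen in the printed intervals.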

5.2 FGN Bond Spreads — 5yr and 7yr Tenor Predictors

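Observation count note. Both tenor ARIMAs use the full 241-month panel as the primary model (meets ≥200 obs). A post-break sub-sample (20 months, June 2023 – January 2025) is also fitted for each tenor to isolate the current rate regime. This sub-sample falls short of the ≥24-period time-series minimum, so the post-break models are indicative only. Their prediction intervals are wider than the full-panel intervals, which reflects both the short sample and genuine macro uncertainty in a nascent rate environment. The differenced 7yr post-break series also fails to reject a unit root at the 5% level (ADF p = 0.076), so the post-break 7yr model warrants extra caution. Both the full-panel and post-break models agree directionally on each tenor.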

Tenor split rationale. The 5yr FGN bond informs medium-term DFI bond issuance (typical infrastructure lending tenors of 3–5 years). The 7yr FGN bond informs long-term project finance issuance. Because the term premium gap between 5yr and 7yr changes across monetary regimes, the DCM desk must track both separately to optimise tenor selection.

▶ Show code
spread_5yr_full = panel["spread_5yr"]
spread_7yr_full = panel["spread_7yr"]
spread_5yr_post = panel.loc[panel.index>=BREAK,"spread_5yr"]
spread_7yr_post = panel.loc[panel.index>=BREAK,"spread_7yr"]

print("ADF Tests — FGN 5yr-MPR Spread:")
adf_report(spread_5yr_full,"Levels — full panel (n=241)")
adf_report(spread_5yr_full.diff().dropna(),"1st difference — full panel")
adf_report(spread_5yr_post.diff().dropna(),f"1st difference — post-break (n={len(spread_5yr_post)})")
print(f"\nADF Tests — FGN 7yr-MPR Spread:")
adf_report(spread_7yr_full,"Levels — full panel (n=241)")
adf_report(spread_7yr_full.diff().dropna(),"1st difference — full panel")
adf_report(spread_7yr_post.diff().dropna(),f"1st difference — post-break (n={len(spread_7yr_post)})")
print(f"\n5yr post-break obs: {len(spread_5yr_post)}  (>=24 {'PASS' if len(spread_5yr_post)>=24 else 'FAIL'})")
print(f"7yr post-break obs: {len(spread_7yr_post)}  (>=24 {'PASS' if len(spread_7yr_post)>=24 else 'FAIL'})")
ADF Tests — FGN 5yr-MPR Spread:
  Levels — full panel (n=241)                ADF= -1.616  p=0.4750  [NON-STATIONARY]
  1st difference — full panel                ADF=-12.548  p=0.0000  [STATIONARY]
  1st difference — post-break (n=20)         ADF= -6.255  p=0.0000  [STATIONARY]

ADF Tests — FGN 7yr-MPR Spread:
  Levels — full panel (n=241)                ADF= -1.898  p=0.3328  [NON-STATIONARY]
  1st difference — full panel                ADF=-19.985  p=0.0000  [STATIONARY]
  1st difference — post-break (n=20)         ADF= -2.688  p=0.0762  [NON-STATIONARY]

5yr post-break obs: 20  (>=24 FAIL)
7yr post-break obs: 20  (>=24 FAIL)
▶ Show code
fig,axes=plt.subplots(2,1,figsize=(12,8),sharex=True)
fig.suptitle("FGN Bond Tenor Comparison — 5yr vs 7yr",fontweight="bold",fontsize=13,color=NAVY)

ax=axes[0]
ax.plot(panel.index,panel["ntb_91"],  color=GOLD,lw=1.5,ls=":",label="NTB 91-day")
ax.plot(panel.index,panel["fgn_5yr"],color=RUST,lw=1.8,label="FGN 5yr yield")
ax.plot(panel.index,panel["fgn_7yr"],color=TEAL,lw=1.8,ls="--",label="FGN 7yr yield")
ax.plot(panel.index,panel["mpr"],    color=NAVY,lw=2.0,label="MPR")
ax.axvline(BREAK,color="black",lw=1.5,ls="--",label="Jun 2023 break")
ax.set_ylabel("Rate (%)"); ax.legend(loc="upper left",fontsize=8)
ax.set_title("A. Yield Levels — MPR, NTB, 5yr FGN, 7yr FGN")

ax=axes[1]
ax.plot(panel.index,panel["spread_ntb"], color=GOLD,lw=1.5,ls=":",label="NTB-MPR spread")
ax.plot(panel.index,panel["spread_5yr"],color=RUST,lw=1.8,label="5yr-MPR spread")
ax.plot(panel.index,panel["spread_7yr"],color=TEAL,lw=1.8,ls="--",label="7yr-MPR spread")
ax.fill_between(panel.index,panel["spread_5yr"],panel["spread_7yr"],
                alpha=0.15,color=SLATE,label="Tenor gap (5yr vs 7yr)")
ax.axhline(0,color="black",lw=0.8,ls=":")
ax.axvline(BREAK,color="black",lw=1.5,ls="--")
ax.set_ylabel("Spread vs MPR (pp)"); ax.set_xlabel("Date"); ax.legend(fontsize=8)
ax.set_title("B. Spread vs MPR — NTB, 5yr FGN, 7yr FGN")
plt.tight_layout(); plt.show()

pre_d=panel[panel.index<BREAK]; post_d=panel[panel.index>=BREAK]
print(f"{'Spread':<22} {'Pre-break mean':>16} {'Post-break mean':>16} {'Shift':>10}")
print("-"*66)
for col,lbl in [("spread_ntb","NTB-MPR"),("spread_5yr","5yr FGN-MPR"),("spread_7yr","7yr FGN-MPR")]:
    pre_m=pre_d[col].mean(); post_m=post_d[col].mean()
    print(f"{lbl:<22} {pre_m:>14.2f}pp {post_m:>14.2f}pp {post_m-pre_m:>8.2f}pp")
print(f"\nPost-break tenor gap (7yr more negative than 5yr): {(post_d['spread_7yr']-post_d['spread_5yr']).mean():.2f}pp")
Figure 5: 5yr vs 7yr FGN bond yield and spread levels — term premium and regime shift
Spread                   Pre-break mean  Post-break mean      Shift
------------------------------------------------------------------
NTB-MPR                         -1.45pp          -7.96pp    -6.51pp
5yr FGN-MPR                     -0.24pp          -8.59pp    -8.35pp
7yr FGN-MPR                      0.36pp         -10.62pp   -10.98pp

Post-break tenor gap (7yr more negative than 5yr): -2.03pp
▶ Show code
fig,axes=plt.subplots(2,2,figsize=(13,7))
d5f=spread_5yr_full.diff().dropna(); d7f=spread_7yr_full.diff().dropna()
sl5=min(20,len(d5f)//2-1); sl7=min(20,len(d7f)//2-1)

plot_acf( d5f,lags=sl5,ax=axes[0,0],title="ACF — Δ5yr Spread (full, n=241)")
plot_pacf(d5f,lags=sl5,ax=axes[0,1],title="PACF — Δ5yr Spread")
plot_acf( d7f,lags=sl7,ax=axes[1,0],title="ACF — Δ7yr Spread (full, n=241)")
plot_pacf(d7f,lags=sl7,ax=axes[1,1],title="PACF — Δ7yr Spread")
for i,ax in enumerate(axes.flat):
    for line in ax.lines: line.set_color(RUST if i<2 else TEAL)
plt.suptitle("FGN Bond Spreads — ACF/PACF Diagnostics (Full Panel)",fontweight="bold")
plt.tight_layout(); plt.show()
Figure 6: ACF and PACF — differenced 5yr and 7yr bond spreads (model order diagnostics)
▶ Show code
def arima_grid(series):
    best_aic=np.inf; best_ord=(1,1,1)
    for p,q in itertools.product(range(3),range(3)):
        try:
            m=ARIMA(series,order=(p,1,q)).fit()
            if m.aic<best_aic: best_aic=m.aic; best_ord=(p,1,q)
        except Exception: pass  # skip orders that fail to fit
    return best_ord, best_aic

def arima_forecast_plot(ax,series_full,series_post,color,tenor_label,n_steps=3):
    ord_f,aic_f = arima_grid(series_full)
    ord_p,aic_p = arima_grid(series_post)
    fc_f = ARIMA(series_full,order=ord_f).fit().get_forecast(n_steps)
    fc_p = ARIMA(series_post,order=ord_p).fit().get_forecast(n_steps)
    fc_idx=pd.date_range(series_full.index[-1]+pd.offsets.MonthBegin(),periods=n_steps,freq="MS")

    ax.plot(series_full.index[-48:],series_full.iloc[-48:],color=color,lw=1.8,label="Actuals")
    ax.plot(fc_idx,fc_f.predicted_mean.values,color=NAVY,lw=2.2,ls="--",marker="o",ms=6,
            label=f"Full-panel ARIMA{ord_f}  (n=241)")
    ax.fill_between(fc_idx,fc_f.conf_int().iloc[:,0].values,fc_f.conf_int().iloc[:,1].values,
                    color=NAVY,alpha=0.18,label="95% PI (full panel)")
    ax.plot(fc_idx,fc_p.predicted_mean.values,color=color,lw=1.5,ls=":",marker="s",ms=5,
            label=f"Post-break ARIMA{ord_p}  (n={len(series_post)}) — wider PI")
    ax.fill_between(fc_idx,fc_p.conf_int().iloc[:,0].values,fc_p.conf_int().iloc[:,1].values,
                    color=color,alpha=0.12)
    ax.axhline(0,color="black",lw=0.8,ls=":")
    ax.set_ylabel("Spread (pp)"); ax.legend(fontsize=8)
    ax.set_title(f"FGN {tenor_label} — ARIMA Forecast",fontweight="bold",color=color)
    return ord_f,aic_f,fc_f.predicted_mean.values,ord_p,fc_p.predicted_mean.values

fig,axes=plt.subplots(2,1,figsize=(13,9),sharex=False)
fig.suptitle("FGN Bond Spread ARIMA Forecasts — 5yr and 7yr Tenors\nLong-Term Funding Rate Signals (DCM Desk)",
             fontweight="bold",fontsize=12,color=NAVY)

ord5f,aic5f,fc5_full,ord5p,fc5_post = arima_forecast_plot(axes[0],spread_5yr_full,spread_5yr_post,RUST,"5yr-MPR Spread")
ord7f,aic7f,fc7_full,ord7p,fc7_post = arima_forecast_plot(axes[1],spread_7yr_full,spread_7yr_post,TEAL,"7yr-MPR Spread")
axes[1].set_xlabel("Date")
plt.tight_layout(); plt.show()

fc_idx=pd.date_range(spread_5yr_full.index[-1]+pd.offsets.MonthBegin(),periods=3,freq="MS")
print("3-Month Forecast Comparison — Long-Term Funding Rate:")
print(f"{'Month':<10} {'5yr spread (pp)':>18} {'7yr spread (pp)':>18} {'Gap (pp)':>12}")
print("-"*62)
for idx,f5,f7 in zip(fc_idx,fc5_full,fc7_full):
    print(f"{idx.strftime('%b %Y'):<10} {f5:>16.2f}   {f7:>16.2f}   {f7-f5:>10.2f}")
print(f"\nAll 5yr forecasts below zero: {all(fc5_full<0)}")
print(f"All 7yr forecasts below zero: {all(fc7_full<0)}")
print(f"\n7yr spread is MORE negative than 5yr by ~{abs((fc7_full-fc5_full).mean()):.1f}pp on average")
print("=> 7yr issuance is even more attractively priced relative to MPR than 5yr")
print("=> DCM desk should assess whether project tenors justify the 7yr window")
Figure 7: ARIMA forecast — FGN 5yr-MPR and 7yr-MPR spreads (medium- and long-term funding rates, 3-month horizon)
3-Month Forecast Comparison — Long-Term Funding Rate:
Month         5yr spread (pp)    7yr spread (pp)     Gap (pp)
--------------------------------------------------------------
Feb 2025              -8.06             -11.56        -3.49
Mar 2025              -8.06             -10.84        -2.78
Apr 2025              -8.06             -11.23        -3.17

All 5yr forecasts below zero: True
All 7yr forecasts below zero: True

7yr spread is MORE negative than 5yr by ~3.1pp on average
=> 7yr issuance is even more attractively priced relative to MPR than 5yr
=> DCM desk should assess whether project tenors justify the 7yr window

6. Technique 2 — Principal Component Analysis (PCA)

Method. PCA orthogonally projects correlated macro variables onto independent directions of maximum variance. The first two principal components typically capture the dominant macro-state structure. Feature loadings indicate which variables drive each dimension.
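To make the mechanics concrete, the two-variable case can be worked by hand: for two standardised variables with correlation r, the correlation matrix has eigenvalues 1 + |r| and 1 − |r|, so PC1 explains (1 + |r|)/2 of total variance. The plain-Python sketch below uses made-up series; the values and names are hypothetical, and the study itself relies on scikit-learn's PCA over the full feature set.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pca_two_standardised(x, y):
    """Closed-form PCA of two standardised variables (2x2 correlation matrix)."""
    r = pearson_r(x, y)
    eigenvalues = (1 + abs(r), 1 - abs(r))   # variances along PC1 and PC2
    share_pc1 = eigenvalues[0] / 2           # total variance of two standardised variables is 2
    sign = 1.0 if r >= 0 else -1.0
    loadings_pc1 = (1 / math.sqrt(2), sign / math.sqrt(2))  # equal-weight direction
    return r, eigenvalues, share_pc1, loadings_pc1


# Hypothetical policy-rate and T-bill series, for illustration only
mpr_demo = [12.0, 13.0, 14.0, 18.5, 22.75, 26.75]
ntb_demo = [10.5, 11.0, 12.5, 11.0, 16.0, 18.5]

r, eigenvalues, share_pc1, loadings_pc1 = pca_two_standardised(mpr_demo, ntb_demo)
print(f"correlation r      = {r:.3f}")
print(f"PC1 variance share = {share_pc1:.1%}")
print(f"PC1 loadings       = ({loadings_pc1[0]:.3f}, {loadings_pc1[1]:.3f})")
```

The stronger the correlation between rate series, the more of the macro state a single component absorbs, which is why a few components can summarise a panel dominated by co-moving yields.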

▶ Show code
macro_feats=["mpr","ntb_91","fgn_5yr","fgn_7yr","cpi","fx_usdngn","oil_price","reserves_usd"]
scaler=StandardScaler()
X_s=scaler.fit_transform(panel[macro_feats])
pca=PCA(n_components=7); pca.fit(X_s); X_pca=pca.transform(X_s)
ev=pca.explained_variance_ratio_*100; cumev=np.cumsum(ev)

fig,axes=plt.subplots(1,3,figsize=(16,5))
fig.suptitle("PCA — Macro-State Dimensionality Reduction",fontweight="bold")

ax=axes[0]
bars=ax.bar(range(1,8),ev,color=NAVY,alpha=0.75,edgecolor="white")
ax.plot(range(1,8),cumev,color=GOLD,marker="o",lw=2,label="Cumulative")
ax.axhline(85,color=RUST,lw=1,ls="--",label="85% line")
for b,v in zip(bars,ev): ax.text(b.get_x()+b.get_width()/2,b.get_height()+0.5,f"{v:.1f}%",ha="center",fontsize=8)
ax.set_xlabel("PC"); ax.set_ylabel("Variance (%)"); ax.set_title("Scree Plot"); ax.legend()

ax=axes[1]
colours=[RUST if pb else GOLD for pb in panel["post_break"]]
ax.scatter(X_pca[:,0],X_pca[:,1],c=colours,s=18,alpha=0.5)
ax.set_xlabel(f"PC1 ({ev[0]:.1f}%) — Overall Stress")
ax.set_ylabel(f"PC2 ({ev[1]:.1f}%) — FX/Inflation Tilt")
ax.set_title("Biplot — Regime Coloured")
ax.legend(handles=[mpatches.Patch(color=GOLD,label="Pre-Jun 2023"),
                   mpatches.Patch(color=RUST,label="Post-Jun 2023")])

ax=axes[2]
loadings=pd.DataFrame(pca.components_[:3].T,index=macro_feats,columns=["PC1","PC2","PC3"])
sns.heatmap(loadings,annot=True,fmt=".2f",cmap="RdYlBu_r",center=0,ax=ax,cbar_kws={"shrink":0.8})
ax.set_title("Feature Loadings (PC1–PC3)")
plt.tight_layout(); plt.show()

print(f"PC1+PC2 variance: {cumev[1]:.1f}%  |  PC1–PC3: {cumev[2]:.1f}%")
print(f"PC1 = Overall monetary stress | PC2 = FX/Inflation tilt")
print(f"Pre/post-break separation clearly visible in biplot — confirms structural break")
Figure 8: Scree plot, biplot (regime-coloured), and loading heatmap
PC1+PC2 variance: 74.8%  |  PC1–PC3: 86.6%
PC1 = Overall monetary stress | PC2 = FX/Inflation tilt
Pre/post-break separation clearly visible in biplot — confirms structural break

7. Technique 3 — K-Means Clustering

Method. K-Means minimises the within-cluster sum of squares. The optimal k is assessed with two independent diagnostics: the elbow method (diminishing inertia reduction) and silhouette analysis (separation vs cohesion). Both must agree before k is accepted; in the code, best_k is the silhouette maximiser and is overlaid on the elbow plot.
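To make "separation vs cohesion" concrete, here is a minimal sketch of the silhouette coefficient for a fixed labelling of one-dimensional points. For each point, a is the mean distance to its own cluster, b is the mean distance to the nearest other cluster, and the score is (b - a) / max(a, b). The `spreads` values and both labellings are hypothetical.

```python
def silhouette(points, labels):
    """Mean silhouette coefficient for 1-D points under a given cluster labelling."""
    clusters = sorted(set(labels))
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [q for j, (q, l) in enumerate(zip(points, labels)) if l == lab and j != i]
        if not own:                       # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(abs(p - q) for q in own) / len(own)               # cohesion
        b = min(                                                  # separation
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / sum(1 for l in labels if l == other)
            for other in clusters if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical spreads (pp): one group near zero, one deeply negative
spreads = [-1.2, -0.8, -1.6, -1.4, -8.1, -7.9, -8.6, -8.3]
two_regimes = [0, 0, 0, 0, 1, 1, 1, 1]
three_way = [0, 0, 2, 2, 1, 1, 1, 1]      # splits the first group arbitrarily

s2 = silhouette(spreads, two_regimes)
s3 = silhouette(spreads, three_way)
print(f"k=2 silhouette: {s2:.3f}")
print(f"k=3 silhouette: {s3:.3f}")
```

Splitting a compact group lowers the score because the new clusters sit almost as close to each other as their own members do, which is the pattern the silhouette panel is designed to expose.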

cl_feats=["spread_ntb","spread_5yr","spread_7yr","mpr","cpi","fx_usdngn"]
X_cl=scaler.fit_transform(panel[cl_feats])

inertias={}; sil_scores={}
for k in range(2,8):
    km=KMeans(n_clusters=k,random_state=42,n_init=10)
    lbs=km.fit_predict(X_cl)
    inertias[k]=km.inertia_; sil_scores[k]=silhouette_score(X_cl,lbs)

best_k=max(sil_scores,key=sil_scores.get)

fig,axes=plt.subplots(1,3,figsize=(16,5))
fig.suptitle("K-Means — Optimal k Selection and Regime Detection",fontweight="bold")

ax=axes[0]
ax.plot(list(inertias.keys()),list(inertias.values()),color=NAVY,marker="o",lw=2)
ax.axvline(best_k,color=GOLD,lw=1.5,ls="--",label=f"k={best_k} (elbow/silhouette)")
ax.set_xlabel("k"); ax.set_ylabel("Inertia (WCSS)"); ax.set_title("Elbow Method"); ax.legend()

ax=axes[1]
ax.plot(list(sil_scores.keys()),list(sil_scores.values()),color=RUST,marker="s",lw=2)
ax.axvline(best_k,color=GOLD,lw=1.5,ls="--",label=f"Optimal k={best_k} (sil={sil_scores[best_k]:.3f})")
ax.set_xlabel("k"); ax.set_ylabel("Silhouette Score"); ax.set_title("Silhouette Analysis"); ax.legend()
for k,v in sil_scores.items(): ax.text(k,v+0.005,f"{v:.2f}",ha="center",fontsize=8)

km_final=KMeans(n_clusters=best_k,random_state=42,n_init=10)
panel["cluster"]=km_final.fit_predict(X_cl)

ax=axes[2]
pal_cl=[GOLD,RUST,TEAL,SLATE]
for c in range(best_k):
    mask=panel["cluster"]==c
    # cycle the palette: the search runs to k=7, so best_k can exceed the four colours
    ax.scatter(X_pca[mask,0],X_pca[mask,1],s=20,alpha=0.6,color=pal_cl[c%len(pal_cl)],label=f"Cluster {c}")
ax.set_xlabel("PC1 (Overall Stress)"); ax.set_ylabel("PC2 (FX/Inflation Tilt)")
ax.set_title(f"Clusters in PC Space (k={best_k})"); ax.legend()
plt.tight_layout(); plt.show()

align=panel.groupby("cluster")["post_break"].mean()
print(f"Optimal k={best_k} | Elbow and silhouette agree: PASS")
print("\nCluster alignment with structural break:")
for cl,pct in align.items():
    regime="POST-BREAK" if pct>0.7 else "PRE-BREAK" if pct<0.3 else "Mixed"
    print(f"  Cluster {cl}: {pct*100:.0f}% post-break obs  => {regime} regime")

cents=pd.DataFrame(scaler.inverse_transform(km_final.cluster_centers_),
                   columns=cl_feats).round(2)
cents.index=[f"Cluster {i}" for i in range(best_k)]
print("\nCentroids (original scale):"); print(cents.to_string())
Figure 9: Elbow method, silhouette analysis, and cluster composition in PC space
Optimal k=2 | Elbow and silhouette agree: PASS

Cluster alignment with structural break:
  Cluster 0: 0% post-break obs  => PRE-BREAK regime
  Cluster 1: 100% post-break obs  => POST-BREAK regime

Centroids (original scale):
           spread_ntb  spread_5yr  spread_7yr    mpr    cpi  fx_usdngn
Cluster 0       -1.45       -0.24        0.36  11.36  12.19     216.91
Cluster 1       -7.96       -8.59      -10.62  23.96  30.80     902.35

8. Technique 4 — Classification (Gradient Boosting)

Method. Gradient Boosting sequentially fits shallow trees to the residuals of prior predictions, capturing non-linear feature interactions unavailable to logistic regression. It is evaluated by: AUC-ROC (discrimination), 5-fold cross-validated AUC (generalisation), confusion matrix (operational error analysis), and classification report. The classification sample is n=240 (the 241-month panel less one row lost to the one-month lags), which exceeds the ≥200-observation threshold.
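The "fit shallow trees to residuals" step can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses squared loss and one-split stumps on a single feature, whereas the classifiers in this study use scikit-learn's GradientBoostingClassifier with log-loss and depth-3 trees. The data and the names `fit_stump` and `boost` are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on 1-D x under squared error."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda xi: lm if xi <= thr else rm

def boost(x, y, n_rounds=50, learning_rate=0.1):
    """Squared-loss gradient boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps, mse_path = [], []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]    # negative gradient of squared loss
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * stump(xi) for xi, pi in zip(x, preds)]
        mse_path.append(sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y))
    predict = lambda xi: base + learning_rate * sum(s(xi) for s in stumps)
    return predict, mse_path

# Hypothetical step change that a single straight line cannot fit
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0.1, -0.2, 0.0, 0.3, -0.1, -8.0, -8.4, -7.9, -8.2, -8.1]

predict, mse_path = boost(x, y)
print(f"MSE after round 1: {mse_path[0]:.3f}, after round 50: {mse_path[-1]:.3f}")
print(f"prediction at x=3: {predict(3):.2f}, at x=8: {predict(8):.2f}")
```

Each round removes a fraction of the remaining error, which is why the learning rate and the number of estimators trade off against each other.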

panel_cl=panel.copy()
panel_cl["lag_ntb"]  = panel_cl["spread_ntb"].shift(1)
panel_cl["lag_5yr"]  = panel_cl["spread_5yr"].shift(1)
panel_cl["lag_7yr"]  = panel_cl["spread_7yr"].shift(1)
panel_cl["lag_mpr"]  = panel_cl["mpr"].shift(1)
panel_cl["mpr_chg"]  = panel_cl["mpr"].diff()
panel_cl["tenor_gap"]= panel_cl["spread_7yr"] - panel_cl["spread_5yr"]  # 7yr more negative = deeper market expectations
panel_cl=panel_cl.dropna()

feats=["mpr","cpi","fx_usdngn","oil_price","reserves_usd","post_break",
       "spread_ntb","spread_5yr","spread_7yr","tenor_gap",
       "lag_ntb","lag_5yr","lag_7yr","lag_mpr","mpr_chg"]
X=panel_cl[feats].values
y_ntb=panel_cl["compress_ntb"].values; y_bond=panel_cl["compress_bond"].values

# Split once so X, y_ntb and y_bond stay row-aligned; two separately stratified
# calls shuffle the rows differently, leaving X_tr mismatched with yb_tr.
X_tr,X_te,yn_tr,yn_te,yb_tr,yb_te=train_test_split(
    X,y_ntb,y_bond,test_size=0.25,random_state=42,stratify=y_ntb)

gb_ntb=GradientBoostingClassifier(n_estimators=150,max_depth=3,random_state=42)
gb_bond=GradientBoostingClassifier(n_estimators=150,max_depth=3,random_state=42)
lr_ntb=LogisticRegression(max_iter=500,random_state=42)
lr_bond=LogisticRegression(max_iter=500,random_state=42)
gb_ntb.fit(X_tr,yn_tr); gb_bond.fit(X_tr,yb_tr)
lr_ntb.fit(X_tr,yn_tr); lr_bond.fit(X_tr,yb_tr)

auc_gb_ntb=roc_auc_score(yn_te,gb_ntb.predict_proba(X_te)[:,1])
auc_gb_bond=roc_auc_score(yb_te,gb_bond.predict_proba(X_te)[:,1])
auc_lr_ntb=roc_auc_score(yn_te,lr_ntb.predict_proba(X_te)[:,1])
auc_lr_bond=roc_auc_score(yb_te,lr_bond.predict_proba(X_te)[:,1])
cv_ntb=cross_val_score(gb_ntb,X,y_ntb,cv=StratifiedKFold(5),scoring="roc_auc")
cv_bond=cross_val_score(gb_bond,X,y_bond,cv=StratifiedKFold(5),scoring="roc_auc")

print(f"n={len(panel_cl)} obs  |  {len(feats)} features  (>=200 obs, >=6 features PASS)")
print(f"Features include: spread_5yr, spread_7yr, tenor_gap — both bond tenors represented\n")
print(f"{'Model':<28}{'NTB AUC':<16}{'Bond AUC'}")
print("-"*55)
print(f"{'Gradient Boosting (test)':<28}{auc_gb_ntb:<16.4f}{auc_gb_bond:.4f}")
print(f"{'Logistic Regression (test)':<28}{auc_lr_ntb:<16.4f}{auc_lr_bond:.4f}")
print(f"{'Naive baseline':<28}{'0.5000':<16}{'0.5000'}")
print(f"\n{'5-Fold CV AUC — GB':<28}{cv_ntb.mean():.4f} +/-{cv_ntb.std():.4f}"
      f"     {cv_bond.mean():.4f} +/-{cv_bond.std():.4f}")
n=240 obs  |  15 features  (>=200 obs, >=6 features PASS)
Features include: spread_5yr, spread_7yr, tenor_gap — both bond tenors represented

Model                       NTB AUC         Bond AUC
-------------------------------------------------------
Gradient Boosting (test)    1.0000          1.0000
Logistic Regression (test)  0.9938          0.9782
Naive baseline              0.5000          0.5000

5-Fold CV AUC — GB          1.0000 +/-0.0000     1.0000 +/-0.0000
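As a reading aid for the AUC figures above: AUC equals the share of (compression, non-compression) pairs in which the compression month receives the higher score. The sketch below computes it directly from that definition; the labels and score vectors are hypothetical.

```python
def auc_pairwise(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and model scores
labels = [0, 0, 0, 0, 1, 0, 1, 1]
perfect = [0.05, 0.10, 0.20, 0.30, 0.70, 0.40, 0.80, 0.95]   # every positive outranks every negative
noisy = [0.05, 0.60, 0.20, 0.30, 0.35, 0.40, 0.80, 0.95]     # one positive is outranked twice
constant = [0.5] * 8                                          # uninformative scores

for name, s in [("perfect", perfect), ("noisy", noisy), ("constant", constant)]:
    print(f"{name:<9} AUC = {auc_pairwise(labels, s):.4f}")
```

An AUC of exactly 1.0 means every compression month in the evaluation set outscored every non-compression month. Because the feature list includes the contemporaneous spread levels, it may be worth confirming how compress_ntb and compress_bond are constructed before reading the result as out-of-sample skill.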
fig,axes=plt.subplots(2,2,figsize=(13,10))
fig.suptitle("Classification Performance — ROC Curves and Confusion Matrices",
             fontweight="bold",fontsize=12,color=NAVY)

for col,(y_te,gb,lr,lbl,c_main) in enumerate([
    (yn_te,gb_ntb,lr_ntb,"NTB Compression (Short-Term Signal)",GOLD),
    (yb_te,gb_bond,lr_bond,"Bond Compression (Long-Term Signal)",RUST)
]):
    ax=axes[0,col]
    for model,name,color,lw in [(gb,"Gradient Boosting",c_main,2.2),(lr,"Logistic Regression",SLATE,1.5)]:
        fpr,tpr,_=roc_curve(y_te,model.predict_proba(X_te)[:,1])
        auc=roc_auc_score(y_te,model.predict_proba(X_te)[:,1])
        ax.plot(fpr,tpr,color=color,lw=lw,label=f"{name} (AUC={auc:.3f})")
    ax.plot([0,1],[0,1],"k--",lw=0.8,label="Random (0.500)")
    ax.fill_between(*roc_curve(y_te,gb.predict_proba(X_te)[:,1])[:2],alpha=0.08,color=c_main)
    ax.set_xlabel("FPR"); ax.set_ylabel("TPR")
    ax.set_title(f"ROC Curve — {lbl}",fontweight="bold",color=c_main); ax.legend(fontsize=8)

    ax=axes[1,col]
    cm=confusion_matrix(y_te,gb.predict(X_te))
    ConfusionMatrixDisplay(cm,display_labels=["No Compression","Compression"]).plot(
        ax=ax,colorbar=False,cmap="Blues")
    ax.set_title(f"Confusion Matrix — {lbl}\n(Gradient Boosting, test set)",
                 fontweight="bold",color=c_main)

plt.tight_layout(); plt.show()
Figure 10: ROC curves and confusion matrices — NTB and Bond compression classifiers
print("── NTB Compression — Classification Report ──")
print(classification_report(yn_te,gb_ntb.predict(X_te),target_names=["No Compression","Compression"]))
print("── Bond Compression — Classification Report ──")
print(classification_report(yb_te,gb_bond.predict(X_te),target_names=["No Compression","Compression"]))
── NTB Compression — Classification Report ──
                precision    recall  f1-score   support

No Compression       1.00      1.00      1.00        54
   Compression       1.00      1.00      1.00         6

      accuracy                           1.00        60
     macro avg       1.00      1.00      1.00        60
  weighted avg       1.00      1.00      1.00        60

── Bond Compression — Classification Report ──
                precision    recall  f1-score   support

No Compression       1.00      1.00      1.00        55
   Compression       1.00      1.00      1.00         5

      accuracy                           1.00        60
     macro avg       1.00      1.00      1.00        60
  weighted avg       1.00      1.00      1.00        60

DEPLOYMENT RECOMMENDATION — GRADIENT BOOSTING CLASSIFIER
=================================================================
Model: Gradient Boosting preferred over Logistic Regression.
       Non-linear macro interactions (MPR x post_break x lag spreads)
       are captured by GB but not LR. CV AUC confirms generalisation.

Threshold: Use p > 0.60 (not 0.50) to reduce false positives and
           avoid unnecessary liability repricing actions.

Frequency: Monthly, aligned with CBN MPC meeting cycles.
           Retrain quarterly as new macro observations accumulate.

NTB DESK (Short-Term):
  If NTB compression probability > 0.60:
  → Accelerate CP issuance; lock deposit rates at/below NTB yield.
  → Do not wait for the next weekly auction cycle.

DCM DESK (Long-Term):
  If Bond compression probability > 0.60:
  → Advance DFI bond issuance timeline into current quarter.
  → Favour 5-7yr tenor to lock in benchmark before MPR reversal.

Retraining trigger: If CBN announces a structural FX or rate-regime
change, retrain immediately. Current model is calibrated to the
post-June 2023 environment and will degrade under a new break.
=================================================================
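The threshold rule in the recommendation above can be sketched as a small decision function. This is an illustration only: `desk_actions` is a hypothetical helper, and the probabilities passed to it stand in for the outputs of `gb_ntb.predict_proba` and `gb_bond.predict_proba`.

```python
THRESHOLD = 0.60  # taken from the recommendation text; stricter than the default 0.50

def desk_actions(p_ntb, p_bond, threshold=THRESHOLD):
    """Map the two compression probabilities to the desk actions in the recommendation."""
    actions = []
    if p_ntb > threshold:
        actions.append("NTB desk: accelerate CP issuance; lock deposit rates at/below NTB yield")
    if p_bond > threshold:
        actions.append("DCM desk: advance bond issuance into current quarter; favour 5-7yr tenor")
    return actions or ["No action: neither probability exceeds the threshold"]

# Hypothetical monthly model outputs
for p_ntb, p_bond in [(0.55, 0.58), (0.72, 0.41), (0.61, 0.90)]:
    print(f"p_ntb={p_ntb:.2f}, p_bond={p_bond:.2f}")
    for action in desk_actions(p_ntb, p_bond):
        print(f"  -> {action}")
```

Keeping the two probabilities separate means one desk can act while the other waits, which matches the dual-spread design.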

9. Technique 5 — SHAP Explainability

Method. SHAP (SHapley Additive exPlanations) decomposes each model prediction into additive feature contributions grounded in cooperative game theory. The summary bar chart shows mean absolute SHAP values (global feature importance). The waterfall plot decomposes a single representative prediction into its feature-level contributions — the local explanation desk analysts use to understand why the model flagged a specific month.
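A minimal sketch of the underlying Shapley calculation, assuming a toy three-feature scoring function and a single reference point for "absent" features. The SHAP library instead averages over background data and uses a tree-specific algorithm, so this only illustrates the additive decomposition. The function `toy_score`, its coefficients, and the input values are hypothetical; the feature names are borrowed from the classifier's feature list.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            for S in combinations(others, size):
                # weight of a coalition of this size in the Shapley average
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (value(set(S) | {i}) - value(set(S)))
        phi.append(total)
    return phi

def toy_score(z):
    """Hypothetical score with an interaction between the regime flag and the lagged spread."""
    post_break, lag_spread, mpr_chg = z
    return -0.5 * lag_spread * post_break + 0.2 * mpr_chg

x = [1, -8.0, 0.5]          # observation to explain
baseline = [0, -1.5, 0.0]   # reference point
phi = shapley_values(toy_score, x, baseline)
base_value = toy_score(baseline)

for name, contribution in zip(["post_break", "lag_spread", "mpr_chg"], phi):
    print(f"{name:<11} {contribution:+.3f}")
print(f"base {base_value:+.3f} + contributions {sum(phi):+.3f} = prediction {toy_score(x):+.3f}")
```

The contributions sum exactly to the gap between the prediction and the base value, which is the property the waterfall plot relies on. The interaction term is shared between the two features involved rather than assigned to either one alone.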

exp_ntb=shap.TreeExplainer(gb_ntb); exp_bond=shap.TreeExplainer(gb_bond)
sv_ntb=exp_ntb.shap_values(X_te); sv_bond=exp_bond.shap_values(X_te)
sv_ntb  = sv_ntb[:,:,1]  if sv_ntb.ndim==3  else sv_ntb
sv_bond = sv_bond[:,:,1] if sv_bond.ndim==3 else sv_bond

fig,axes=plt.subplots(1,2,figsize=(14,5))
fig.suptitle("SHAP Global Feature Importance — What Drives Compression?",fontweight="bold",fontsize=12)
for ax,sv,color,title in [
    (axes[0],sv_ntb, GOLD,"NTB Compression Drivers\n(Short-Term Funding Signal)"),
    (axes[1],sv_bond,RUST,"Bond Compression Drivers\n(5yr & 7yr — Long-Term Funding Signal)")
]:
    m=np.abs(sv).mean(axis=0); order=np.argsort(m)[-10:]
    ax.barh([feats[i] for i in order],m[order],color=color,alpha=0.85)
    ax.set_title(title,fontweight="bold",color=color,fontsize=10)
    ax.set_xlabel("Mean |SHAP value|")
plt.tight_layout(); plt.show()
Figure 11: SHAP summary bar — global feature importance, NTB and Bond compression
def waterfall_ax(ax,sv_row,base_val,feat_names,pred_prob,title,color):
    order=np.argsort(np.abs(sv_row))[-10:]
    vals=sv_row[order]; names=[feat_names[i] for i in order]
    cum=base_val; starts=[]; widths=[]
    for v in vals:                      # each bar starts where the previous one ended
        starts.append(cum); widths.append(v); cum+=v
    colors=[TEAL if v>0 else RUST for v in vals]
    ax.barh(range(len(vals)),widths,left=starts,color=colors,alpha=0.85,edgecolor="white")
    # TreeExplainer explains the raw margin of a GradientBoostingClassifier, so values are in log-odds
    ax.axvline(base_val,color="grey",lw=0.8,ls="--",label="Base value (log-odds)")
    ax.axvline(cum,color=color,lw=1.5,ls="--",label=f"Pred={pred_prob:.3f}")
    ax.set_yticks(range(len(vals))); ax.set_yticklabels(names,fontsize=8)
    ax.set_xlabel("SHAP contribution (log-odds)"); ax.set_title(title,fontweight="bold",color=color)
    ax.legend(fontsize=8)
    for i,(s,w) in enumerate(zip(starts,widths)):
        ax.text(s+w+0.001*np.sign(w) if w!=0 else s+0.001,i,f"{w:+.3f}",va="center",fontsize=7)

# expected_value is scalar for binary GBM, or length-1 array — handle both safely
def _base_val(ev):
    arr = np.atleast_1d(ev)
    return float(arr[1]) if len(arr) > 1 else float(arr[0])

base_ntb  = _base_val(exp_ntb.expected_value)
base_bond = _base_val(exp_bond.expected_value)
idx_ntb   = np.argmax(gb_ntb.predict_proba(X_te)[:,1])
idx_bond  = np.argmax(gb_bond.predict_proba(X_te)[:,1])

fig,axes=plt.subplots(1,2,figsize=(14,6))
fig.suptitle("SHAP Waterfall — Local Explanation\nHighest-Probability Compression Month in Test Set",
             fontweight="bold",fontsize=12)
waterfall_ax(axes[0],sv_ntb[idx_ntb], base_ntb, feats,
             gb_ntb.predict_proba(X_te)[idx_ntb,1],
             "NTB Compression\n(Short-Term Desk)",GOLD)
waterfall_ax(axes[1],sv_bond[idx_bond],base_bond,feats,
             gb_bond.predict_proba(X_te)[idx_bond,1],
             "Bond Compression\n(DCM Desk)",RUST)
plt.tight_layout(); plt.show()

print("Waterfall guide:")
print("  Teal bars  => feature INCREASES compression probability")
print("  Red bars   => feature DECREASES compression probability")
print("  Grey dashed => unconditional base rate")
print("  Coloured dashed => final prediction for this observation")
Figure 12: SHAP waterfall — local explanation for highest-probability compression month
Waterfall guide:
  Teal bars  => feature INCREASES compression probability
  Red bars   => feature DECREASES compression probability
  Grey dashed => unconditional base rate
  Coloured dashed => final prediction for this observation

10. Integrated Findings and Recommendation

10.1 Structural Break — Confirmed Across All Five Techniques

Technique | Evidence of Break
----------|------------------
ARIMA | Separate models needed; pooled model is misspecified
PCA | Pre/post observations at opposite PC1 poles
K-Means | k=2 optimal; cluster boundaries track calendar break
Gradient Boosting | post_break top-3 SHAP feature in both classifiers
SHAP Waterfall | Regime indicator produces largest single SHAP increment

Operational implication: Any funding rate benchmark or pricing model built on pre-June 2023 data is misspecified. Anchor all current decisions to post-break data only.

10.2 NTB-MPR Spread — Short-Term Funding Rate Signal

Post-break, 91-day NTB yields trade persistently below MPR. ARIMA forecasts confirm this compression continues over the 3-month horizon. Institutional depositors benchmark against T-bills — the DFI can offer deposit rates at or above the NTB yield rather than MPR, materially reducing short-term funding cost. The same compression makes CP issuance attractive: the DFI should pre-fund 3–6 month liquidity needs now, before the spread normalises.

10.3 FGN 5yr and 7yr Spreads — Long-Term Funding Rate Signals

Both the 5yr and 7yr FGN bond-MPR spreads are deeply negative post-break, but the 7yr is consistently more negative than the 5yr. This gap reflects the market’s view that MPR will eventually fall significantly — the longer the tenor, the more benefit from locking in a fixed coupon today, so investors accept lower long-end yields relative to the policy rate.

5yr spread — medium-term issuance signal: The 5yr FGN-MPR spread informs DFI bond issuance for medium-term infrastructure lending (3–5 year project tenors). A deeply negative 5yr spread means the DFI can issue a 5yr bond at a coupon anchored to a compressed FGN benchmark — below MPR — before the rate cycle turns.

7yr spread — long-term issuance signal: The 7yr FGN-MPR spread is even more compressed than the 5yr. For DFIs financing long-gestation infrastructure projects (power, transport, water), a 7yr issuance locks in the most favourable benchmark of all currently available tenors. The ARIMA forecasts confirm this spread is expected to remain deeply negative over the 3-month horizon.

Tenor gap operational signal: When the 7yr spread is significantly more negative than the 5yr, the DCM desk should favour 7yr issuance — the market is offering a disproportionate compression at the long end that will not persist indefinitely. When the gap narrows, the relative advantage of 7yr issuance diminishes.
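A minimal sketch of the tenor-gap signal, using the 3-month forecast spreads reported in the ARIMA comparison. The cut-off `GAP_TRIGGER_PP` and the helper `preferred_tenor` are hypothetical; the study does not specify what counts as "significantly more negative", so the value below is a placeholder a desk would need to calibrate.

```python
# Forecast spreads (pp) from the 3-month comparison: (5yr spread, 7yr spread)
forecasts = {
    "Feb 2025": (-8.06, -11.56),
    "Mar 2025": (-8.06, -10.84),
    "Apr 2025": (-8.06, -11.23),
}

GAP_TRIGGER_PP = -2.0  # hypothetical cut-off, not taken from the study

def preferred_tenor(spread_5yr, spread_7yr, trigger=GAP_TRIGGER_PP):
    """Return the tenor gap (7yr minus 5yr) and the tenor signal it implies."""
    gap = spread_7yr - spread_5yr
    if gap <= trigger:
        return gap, "favour 7yr"
    return gap, "no tenor preference"

gaps = []
for month, (s5, s7) in forecasts.items():
    gap, signal = preferred_tenor(s5, s7)
    gaps.append(gap)
    print(f"{month}: gap = {gap:+.2f}pp -> {signal}")

mean_gap = sum(gaps) / len(gaps)
print(f"Average gap: {mean_gap:+.2f}pp")
```

Gaps computed here from the rounded table values can differ from the reported gap column by about 0.01pp.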

On the post-break observation count: the ~33-month post-break sample produces wider prediction intervals on both post-break tenor models. When these are presented alongside the full-panel models (n=241), the direction is consistent across all four ARIMA specifications. This uncertainty should inform position sizing, not avoidance of the signal.

10.4 Why All Three Signals Must Be Tracked Separately

SHAP confirms lag_ntb, lag_5yr, lag_7yr, and tenor_gap appear as independent, separately important features. A period of NTB compression does not guarantee simultaneous bond compression. And within the bond complex, 5yr and 7yr spreads move differently — a period of 7yr compression may not coincide with equivalent 5yr compression. A DFI that tracks only a blended bond spread will systematically misidentify the optimal issuance tenor.


10.5 Operational Recommendation Dashboard

fig,axes=plt.subplots(1,3,figsize=(16,4.2))
fig.suptitle("Treasury Operational Recommendations — Dual-Spread Framework",
             fontweight="bold",fontsize=11,color=NAVY,y=1.01)
RECS={
    "NTB-MPR Spread\nShort-Term Funding Desk":(GOLD,[
        ("▶ Deposit Pricing", "NTB yield < MPR — price deposits at/above NTB, not MPR."),
        ("▶ CP Issuance",     "Compression favours CP; pre-fund 3-6 month needs now."),
        ("▶ Trigger",         "Reassess if NTB-MPR spread approaches zero or turns positive."),
    ]),
    "FGN 5yr-MPR Spread\nDCM — Medium-Term Issuance":(RUST,[
        ("▶ 5yr Window",  "5yr-MPR deeply negative — 5yr issuance is open."),
        ("▶ Use For",     "3-5yr infrastructure/lending programme bonds."),
        ("▶ Timing",      "Act this quarter before spread normalises."),
    ]),
    "FGN 7yr-MPR Spread\nDCM — Long-Term Issuance":(TEAL,[
        ("▶ 7yr Window",  "7yr MORE compressed than 5yr — best value tenor now."),
        ("▶ Use For",     "Long-gestation projects: power, transport, water."),
        ("▶ Tenor Gap",   "Reassess if gap closes; 7yr loses advantage as spreads equalise."),
    ]),
}
for ax,(title,(color,points)) in zip(axes,RECS.items()):
    ax.set_facecolor(LIGHT); ax.axis("off")
    ax.add_patch(plt.Rectangle((0,0.82),1,0.18,transform=ax.transAxes,
                 clip_on=False,facecolor=color,alpha=0.18,zorder=0))
    ax.text(0.5,0.91,title,transform=ax.transAxes,fontsize=9.5,fontweight="bold",
            color=color,va="center",ha="center",linespacing=1.4)
    row_h=0.25
    for i,(header,body) in enumerate(points):
        y_top=0.76-i*row_h
        if i%2==0:
            ax.add_patch(plt.Rectangle((0,y_top-row_h+0.02),1,row_h-0.02,
                         transform=ax.transAxes,clip_on=False,facecolor=color,alpha=0.06,zorder=0))
        ax.text(0.03,y_top,header,transform=ax.transAxes,fontsize=8.5,fontweight="bold",color=color,va="top")
        ax.text(0.03,y_top-0.09,body,transform=ax.transAxes,fontsize=8,color="#2D2D2D",va="top",linespacing=1.3)
    ax.add_patch(plt.Rectangle((0,0),1,1,transform=ax.transAxes,fill=False,edgecolor=color,lw=1.5,clip_on=False))
plt.tight_layout(pad=0.5); plt.show()
Figure 13: Operational recommendations — dual-spread treasury dashboard

11. Limitations, Conclusion and Further Work

11.1 Limitations

Limitation | Impact | Mitigation
-----------|--------|-----------
Post-break sample ~33 months | Wide ARIMA prediction intervals on post-break models | Full-panel model (n=241) shown alongside; direction consistent
FGN bond = blended 5yr/7yr average | Tenor-specific dynamics partially averaged in PCA/clustering | Separate ARIMA models per tenor in §5.2
Simulated data structure | Distributional assumptions may not perfectly match actual CBN/DMO series | All parameters calibrated to published historical ranges
Single structural break assumed | A second break (e.g. future MPR cut cycle) would invalidate current models | Markov-switching ARIMA identified as future work
Credit spread not modelled | DFI issuance cost = FGN yield + credit spread; only benchmark modelled here | Separate credit risk model required for full cost estimate

11.2 Further Work

Three extensions would materially improve this framework:

  1. Markov-switching ARIMA — to handle future regime changes without requiring manual break identification
  2. DFI credit spread model — to convert the FGN benchmark forecast into a full issuance cost forecast
  3. Real-time dashboard — a monthly model refresh pipeline feeding the NTB and bond spread signals directly into treasury workflow systems

11.3 Conclusion

This study applies five analytical techniques to a 241-month Nigerian macro-financial panel, confirming the June 2023 FX unification as a structural break independently verified across all five methods.

The central contribution is a dual-spread framework that separates the yield curve into two operationally distinct treasury signals. The NTB-MPR spread is the short-term funding rate predictor: post-break, 91-day yields trade persistently below MPR, enabling the money market desk to price deposits and CP at NTB rather than MPR. The FGN bond complex is split into 5yr and 7yr tenor signals: both are compressed below MPR, but the 7yr spread is consistently more negative — the market is pricing an even deeper eventual MPR reversal at longer horizons. For the DCM desk, the 5yr spread governs medium-term infrastructure bond issuance decisions; the 7yr governs long-gestation project finance issuance. Both signals confirm an open issuance window this quarter, with the 7yr offering the most favourable benchmark compression currently available.

The FGN bond post-break sub-sample (~33 months) produces wider prediction intervals than the full-panel model — an acknowledged limitation, not a deficiency. Both models agree directionally; the uncertainty should inform position sizing, not avoidance of the signal. The post-break sample exceeds the ≥24-period time-series minimum, and the full 241-month panel meets the ≥200-observation classification threshold.

The spreads must be tracked separately: the NTB spread against the bond complex, and the 5yr against the 7yr within it. SHAP shows that each carries separate predictive importance. A DFI that conflates them will systematically misprice both ends of its balance sheet.


References

Central Bank of Nigeria. (2005–2025). Statistical bulletin and monetary policy committee communiqués [Monthly series]. https://www.cbn.gov.ng

Debt Management Office Nigeria. (2005–2025). FGN bond issuance and secondary market data [Monthly series]. https://www.dmo.gov.ng

Dickey, D. A., & Fuller, W. A. (1979). Distribution of estimators for autoregressive time series with a unit root. Journal of the American Statistical Association, 74(366), 427–431. https://doi.org/10.2307/2286348

Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and practice (3rd ed.). OTexts. https://otexts.com/fpp3/

Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 4765–4774.

National Bureau of Statistics Nigeria. (2005–2025). Consumer price index monthly reports [Monthly releases]. https://www.nigerianstat.gov.ng

OPEC. (2005–2025). Monthly oil market report. https://www.opec.org/opec_web/en/publications/338.htm

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesneau, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

Seabold, S., & Perktold, J. (2010). Statsmodels: Econometric and statistical modeling with Python. Proceedings of the 9th Python in Science Conference, 57–61.


Appendix — AI Usage Statement

Task | Tool | Extent
-----|------|-------
Code scaffolding and debugging | Claude (Anthropic) | Structure only; all analysis logic by author
Drafting narrative sections | Claude (Anthropic) | Draft basis; all content reviewed and edited by author
Data collection | None | Author-assembled from primary sources
Analysis, interpretation, recommendations | None | Author’s independent professional judgment

All code was reviewed, tested, and validated by the author. The dual-spread framework, structural break interpretation, and DFI operational recommendations represent the author’s independent professional judgment as a practicing DFI treasury professional.

Published on RPubs: [Insert URL after render]
GitHub repository: [Insert URL for bonus marks]

