Advanced & Operational Analytics for Bureau de Change Compliance

Monitoring regulatory compliance and customer SLAs at Sohcahtoa Holdings Limited

Author

Temitope Aribisala — Head of Legal & Compliance, Sohcahtoa Holdings Limited

Published

May 22, 2026

1 Executive Summary

Sohcahtoa Holdings Limited is a Bureau de Change (BDC) licensed by the Central Bank of Nigeria (CBN). This report applies five Advanced & Operational Analytics techniques — Text Analytics, Monte Carlo simulation, Advanced Forecasting, Customer/People Analytics and Optimisation — to a six-month, 200-transaction compliance and service-level monitoring log. The monitoring data shows three exposures that the Board must act on: an SLA-breach rate of roughly one transaction in three, a suspicious-flag rate near 18% (each flag requiring a timely Suspicious-Transaction Report to the NFIU), and incomplete Know-Your-Customer documentation on about one transaction in five. Monte Carlo simulation prices the resulting 12-month exposure at a mean near NGN 15 million under the status quo, with a P95 tail above NGN 35 million driven almost entirely by the risk of a CBN enforcement action. A combined intervention — additional compliance-officer capacity plus first-pass screening automation — shifts the whole exposure distribution leftward and, crucially, compresses the P95 tail. The recommendation is to resource the sanctions-screening and STR desks to capacity, automate first-pass screening, and monitor these KPIs monthly against the thresholds set out in Section 10.

2 Professional Disclosure

I am Temitope Aribisala, Head of Legal & Compliance at Sohcahtoa Holdings Limited, a Bureau de Change operating in the Nigerian financial-services / foreign-exchange sector and licensed by the Central Bank of Nigeria. The five techniques in this report each map to a live obligation on my desk:

Text Analytics lets me convert unstructured officer notes, customer complaints and CBN/NFIU circulars into structured signal — surfacing compliance themes before they surface as regulator findings.
Monte Carlo simulation prices the firm’s regulatory and service-level exposure as a distribution, so I can brief the Board on the bad-but-plausible P95 outcome, not just the average.
Advanced Forecasting sizes the compliance workload — screening checks and STR reviews — that is coming, so the desk is staffed before the queue builds.
Customer/People Analytics (survival analysis) models how long customer complaints take to resolve and which factors slow them down.
Optimisation allocates a fixed pool of compliance-officer hours across competing tasks to maximise risk reduction per hour worked.

3 Data Collection & Sampling

Field	Value
Source	Sohcahtoa Holdings’ transaction-processing and compliance-monitoring systems, exported to `Sohcahtoa_BDC_Compliance_SLA_Data.xlsx`.
Collection method	Automated capture at the FX counter (amount, currency, turnaround) joined to the compliance workflow (KYC status, sanctions-screen result, suspicious-flag, complaint record).
Sampling frame	All retail FX transactions executed across the BDC’s counters in the reporting window.
Sample size	200 transactions.
Time period	January – June 2025 (six months).
Ethics & consent	The dataset is synthetic and calibrated to typical BDC volumes — it contains no real customer identities. In live operation, customer data is processed under the Nigeria Data Protection Act (2023) and CBN data-residency rules; sanctions screening and STR filing follow NFIU guidance.

4 Data Description

Rows: 200
Columns: 13
$ txn_id           <chr> "TX-2025-0001", "TX-2025-0002", "TX-2025-0003", "TX-2…
$ txn_date         <date> 2025-01-02, 2025-01-03, 2025-01-04, 2025-01-04, 2025…
$ customer_type    <fct> Corporate, Individual, Walk-in, Walk-in, Walk-in, Ind…
$ currency_pair    <fct> USD/NGN, GBP/NGN, GBP/NGN, USD/NGN, USD/NGN, GBP/NGN,…
$ amount_usd       <dbl> 5406, 580, 1017, 637, 1197, 710, 1314, 1870, 742, 658…
$ kyc_status       <fct> Expired, Complete, Pending, Complete, Complete, Compl…
$ sanctions_screen <fct> Clear, Clear, Clear, Clear, Clear, Clear, Clear, Flag…
$ turnaround_mins  <dbl> 52.7, 18.7, 35.8, 26.6, 16.7, 15.4, 22.2, 23.6, 35.3,…
$ sla_target_mins  <dbl> 45, 30, 30, 30, 30, 30, 30, 30, 30, 30, 45, 30, 30, 3…
$ sla_status       <fct> Breached, Within SLA, Breached, Within SLA, Within SL…
$ suspicious_flag  <fct> Yes, No, Yes, No, No, No, No, No, No, No, No, No, Yes…
$ case_narrative   <chr> "Delay flagged: expired KYC documents had to be re-co…
$ resolution_hrs   <dbl> 5.2, NA, 21.9, NA, NA, NA, NA, NA, 6.6, NA, NA, 16.3,…

Transaction value is right-skewed (log scale)

The monitoring log holds 200 transactions and 13 fields: a transaction identifier and date; the customer type, currency pair and USD amount; three compliance fields (KYC status, sanctions-screen result, suspicious flag); three service fields (turnaround, SLA target, SLA status); and the case narrative with its resolution time. The amount field is strongly right-skewed — a few large deals dominate — while turnaround clusters near the 30/45-minute SLA targets, with a breach tail to the right.
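
The three headline rates quoted in the Executive Summary are simple shares of transactions, which can be sketched as follows. This is a minimal illustration, not the report's own code: `log_demo` holds only the first five rows visible in the glimpse output above, the rule that a breach means turnaround above target is an assumption (it agrees with those five rows), and treating "Expired" and "Pending" as incomplete KYC is likewise an assumption.

```r
# First five rows of the log, copied from the glimpse output above.
log_demo <- data.frame(
  txn_id          = sprintf("TX-2025-%04d", 1:5),
  kyc_status      = c("Expired", "Complete", "Pending", "Complete", "Complete"),
  turnaround_mins = c(52.7, 18.7, 35.8, 26.6, 16.7),
  sla_target_mins = c(45, 30, 30, 30, 30),
  suspicious_flag = c("Yes", "No", "Yes", "No", "No")
)

# Assumption: a transaction breaches its SLA when turnaround exceeds the target.
log_demo$sla_status <- ifelse(
  log_demo$turnaround_mins > log_demo$sla_target_mins, "Breached", "Within SLA"
)

# Each KPI is the share of transactions that meet a condition.
kpis <- c(
  sla_breach_rate     = mean(log_demo$sla_status == "Breached"),
  suspicious_rate     = mean(log_demo$suspicious_flag == "Yes"),
  kyc_incomplete_rate = mean(log_demo$kyc_status != "Complete")
)
print(round(kpis, 3))
```

Five rows are far too few to reproduce the rates reported for the full 200-row log; the same three expressions applied to the full export would give the monitored KPIs.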

5 Analytical Question

How can Sohcahtoa Holdings monitor — and measurably reduce — its regulatory-compliance and customer-SLA exposure before it triggers a CBN enforcement action?

6 Technique 1 — Text Analytics

6.1 Theory recap

Text analytics converts unstructured words into structured signal. The standard pipeline is tokenise → remove stop-words → score → model. Term-frequency / inverse-document-frequency (TF-IDF) weighting scores how distinctive each term is to a document group, surfacing the themes that matter.

6.2 Business justification

Every breached or suspicious transaction carries an officer narrative. Mining those narratives tells the Head of Legal why exceptions happen — and therefore which control to fix first.

6.3 Code & output

Top terms across the case-narrative corpus
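
The tokenise, stop-word, score pipeline described in the theory recap can be sketched in base R as below. The narratives, the grouping labels and the short stop-word list are all hypothetical stand-ins for the `case_narrative` column, and the sketch avoids add-on packages so that it runs on a plain R installation; it is not the code that produced the chart.

```r
# Hypothetical narratives; the real corpus is the case_narrative column.
narratives <- data.frame(
  group = c("Breached", "Breached", "Suspicious", "Suspicious"),
  text  = c(
    "Expired KYC documents delayed the transaction",
    "Pending KYC documents delayed the customer",
    "Sanctions screen hit escalated to the NFIU",
    "Sanctions match escalated and the transaction was held"
  ),
  stringsAsFactors = FALSE
)

# Illustrative stop-word list (assumption; a real run would use a fuller list).
stop_words <- c("the", "to", "and", "was", "a", "of")

# Tokenise: lower-case, split on non-letters, drop empty tokens and stop-words.
tokenise <- function(x) {
  tokens <- unlist(strsplit(tolower(x), "[^a-z]+"))
  tokens[nzchar(tokens) & !(tokens %in% stop_words)]
}

# Pool the tokens of each group, so that each group acts as one document.
tokens_by_group <- lapply(
  split(narratives$text, narratives$group),
  function(txt) tokenise(paste(txt, collapse = " "))
)
vocab <- sort(unique(unlist(tokens_by_group)))

# Term frequency: share of a group's tokens accounted for by each term.
tf <- sapply(tokens_by_group, function(tok) {
  as.numeric(table(factor(tok, levels = vocab))) / length(tok)
})
rownames(tf) <- vocab

# Inverse document frequency: ln(number of groups / groups containing the term).
idf <- log(ncol(tf) / rowSums(tf > 0))

# TF-IDF: idf has one value per term and recycles down each column of tf.
tf_idf <- tf * idf

# Three highest-scoring terms per group.
top_terms <- apply(tf_idf, 2, function(scores) {
  names(sort(scores, decreasing = TRUE))[1:3]
})
print(top_terms)
```

A term that appears in every group, such as "transaction" here, gets an inverse document frequency of zero and drops out, which is why TF-IDF surfaces group-specific language rather than merely frequent words.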

6.4 Interpretation

The corpus is dominated by documentation and screening language — “kyc”, “sanctions”, “expired”, “escalated”, “nfiu”. Two operational themes emerge: documentation gaps (expired or pending KYC) and screening escalations. Both are controllable: tighter KYC-refresh cadence addresses the first, and resourcing the screening desk (Technique 5) addresses the second.

7 Technique 2 — Monte Carlo Simulation

7.1 Theory recap

Monte Carlo replaces a single point estimate with a distribution of outcomes: sample each uncertain input thousands of times, run every sample through the cost model, and summarise the result with the mean and the P5 / P50 / P95 percentiles.

7.2 Business justification

A BDC’s worst case is not an average year — it is a CBN enforcement action. Monte Carlo lets the Head of Legal quantify that tail and show the Board how much an intervention buys down the P95.

7.3 Code & output

Monte Carlo distributions of 12-month exposure by scenario
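
The sample, cost-model, summarise loop from the theory recap can be sketched as follows. Every parameter here is a hypothetical planning value chosen for illustration (only the one-in-three breach rate comes from the report), the cost model is a deliberately simple assumption, and the output is not expected to reproduce the exposure figures quoted in the Executive Summary.

```r
set.seed(2026)
n_sims <- 10000

# Simple assumed cost model: SLA-breach costs plus a rare enforcement action.
simulate_exposure <- function(n_sims, breach_rate, cost_per_breach,
                              p_enforcement, enforcement_cost_median) {
  annual_txns <- 400  # assumption: 200 transactions per six months, annualised
  # Number of SLA breaches in a year, each with a fixed remediation cost.
  breaches <- rbinom(n_sims, size = annual_txns, prob = breach_rate)
  sla_cost <- breaches * cost_per_breach
  # Enforcement action: a yes/no event with a log-normally distributed cost.
  enforced <- rbinom(n_sims, size = 1, prob = p_enforcement)
  enforcement_cost <- enforced * rlnorm(
    n_sims, meanlog = log(enforcement_cost_median), sdlog = 0.5
  )
  sla_cost + enforcement_cost
}

# Hypothetical scenario parameters (NGN).
status_quo <- simulate_exposure(n_sims, breach_rate = 0.33,
                                cost_per_breach = 50000,
                                p_enforcement = 0.15,
                                enforcement_cost_median = 3e7)
combined <- simulate_exposure(n_sims, breach_rate = 0.15,
                              cost_per_breach = 50000,
                              p_enforcement = 0.04,
                              enforcement_cost_median = 3e7)

# Summarise each distribution by its mean and the P5 / P50 / P95 percentiles.
summarise_sim <- function(x) c(mean = mean(x), quantile(x, c(0.05, 0.50, 0.95)))
print(round(rbind(status_quo = summarise_sim(status_quo),
                  combined   = summarise_sim(combined))))
```

Because the enforcement event is rare but large, it barely moves the median yet dominates the P95, which is the behaviour the interpretation below relies on.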

7.4 Interpretation

The status-quo distribution has a long right tail, which is the CBN-enforcement scenario. Scenario C (screening automation) delivers the lowest mean exposure, while Scenario D, the combined intervention of additional compliance-officer capacity plus screening automation, delivers the lowest P95. For a Head of Legal, buying down the P95 (the licence-threatening tail) is usually worth a slightly higher average cost.

8 Technique 3 — Advanced Forecasting

8.1 Theory recap

Holt’s linear-trend exponential smoothing fits a level and a trend term to a short time series. With only six monthly observations it is an appropriate choice: a seasonal model would need at least two years of history to estimate a yearly pattern. Forecasts are reported with a prediction interval.

8.2 Business justification

The forecast tells the compliance desk how many transactions — and therefore how many screening checks and STR reviews — are coming, so capacity can be sized ahead of demand.

8.3 Code & output

Observed monthly transactions and the six-month Holt forecast
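
The level and trend recursions behind Holt's method can be written out by hand, as in the sketch below. The monthly counts are hypothetical (chosen only to sum to 200), and the smoothing parameters `alpha` and `beta` are fixed illustrative assumptions rather than fitted values. The sketch returns point forecasts only; the `holt()` function in the forecast package cited in the References estimates the parameters and adds prediction intervals.

```r
# Hypothetical monthly transaction counts, January to June 2025 (sum = 200).
monthly_txns <- c(33, 33, 35, 32, 34, 33)

# Holt's linear-trend method with fixed smoothing parameters (assumed values).
holt_forecast <- function(y, alpha = 0.5, beta = 0.2, h = 6) {
  level <- y[1]         # initial level: first observation
  trend <- y[2] - y[1]  # initial trend: first difference
  for (t in 2:length(y)) {
    prev_level <- level
    # Level: blend the new observation with the previous one-step forecast.
    level <- alpha * y[t] + (1 - alpha) * (prev_level + trend)
    # Trend: blend the latest change in level with the previous trend.
    trend <- beta * (level - prev_level) + (1 - beta) * trend
  }
  # h-step-ahead point forecasts extend the last level along the last trend.
  level + seq_len(h) * trend
}

point_forecast <- holt_forecast(monthly_txns)

# Implied STR workload at the roughly 18% suspicious-flag rate.
str_reviews <- 0.18 * point_forecast
print(round(cbind(transactions = point_forecast, str_reviews = str_reviews), 1))
```

On a perfectly linear series the recursions reproduce the line exactly, which is a quick sanity check on the implementation.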

8.4 Interpretation

The forecast runs broadly flat at the six-month average of about 33 transactions per month (200 transactions over six months). At the observed suspicious-flag rate of roughly 18%, that volume implies about six to seven STR reviews per month (33 × 0.18 ≈ 6 at the central forecast), which is the workload figure that feeds the optimisation in Technique 5. The compliance desk should staff to the upper edge of the forecast interval, not the central line.

9 Technique 4 — Customer / People Analytics

9.1 Theory recap

Survival analysis models time to an event. The Kaplan-Meier estimator describes the share of cases still open over time; the Cox proportional-hazards model estimates how covariates speed up or slow down resolution, via hazard ratios.

9.2 Business justification

Complaint-resolution speed is a customer-SLA metric and a conduct-risk signal. Survival analysis identifies which transaction characteristics are associated with slow resolution.

9.3 Code & output

Concordance index: 0.582

Resolved cases analysed: 86

Kaplan-Meier curves — share of complaints still open, by KYC status
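
A Kaplan-Meier fit, a Cox model and its concordance index can be obtained with the survival package cited in the References, as sketched below. The data frame `cases` is simulated and entirely hypothetical: it borrows only the count of 86 resolved cases from the output above, and its single covariate is random noise, so the hazard ratios it prints carry no meaning for the business.

```r
library(survival)

set.seed(42)
n <- 86  # number of resolved cases reported above

# Hypothetical stand-in data: resolution times and one unrelated covariate.
cases <- data.frame(
  kyc_status     = factor(sample(c("Complete", "Expired", "Pending"),
                                 n, replace = TRUE)),
  resolution_hrs = round(rexp(n, rate = 1 / 12), 1) + 0.1,
  resolved       = 1  # every case is resolved, so there is no censoring
)

# Kaplan-Meier: share of cases still open over time, by KYC status.
km_fit <- survfit(Surv(resolution_hrs, resolved) ~ kyc_status, data = cases)

# Cox proportional hazards: how KYC status shifts the resolution hazard.
cox_fit <- coxph(Surv(resolution_hrs, resolved) ~ kyc_status, data = cases)

# Hazard ratios relative to the baseline level ("Complete").
hazard_ratios <- exp(coef(cox_fit))

# Concordance index: 0.5 is chance-level ordering, 1 is perfect ordering.
c_index <- summary(cox_fit)$concordance["C"]

print(hazard_ratios)
print(c_index)
```

A hazard ratio below 1 means slower resolution than the baseline group. Because the covariate here is unrelated to the simulated times, the concordance should sit close to 0.5, which shows what an uninformative model looks like.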

9.4 Interpretation

The Cox model returns a low concordance index of 0.582, only modestly above the 0.5 that random ordering would give: the structural fields captured today (customer type, KYC status, sanctions result) only weakly predict resolution speed. This is itself a finding. The next monitoring cycle must capture the missing drivers (the assigned officer, the officer’s workload at the time, and a structured complaint category) before resolution speed can be modelled reliably. A model that is honestly under-powered tells the Head of Legal exactly which data to start collecting.

10 Technique 5 — Optimisation

10.1 Theory recap

Linear programming maximises a weighted objective subject to linear constraints. Decision variables are the hours allocated to each task; the objective weights reflect each task’s compliance value; the constraints are total capacity and per-task minimums and maximums.

10.2 Business justification

The compliance team has a fixed weekly capacity. Optimisation finds the hour allocation that maximises total compliance value — the defensible, auditable basis for the resourcing decision.

10.3 Code & output

Total objective value: 1000  | Total hours used: 200 of 200

Optimal allocation of the 200 weekly compliance-officer hours
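
The allocation problem can be set up as a linear programme with the lpSolve package cited in the References, as sketched below. The five task names come from the interpretation that follows and the 200-hour capacity from the output above; the value-per-hour weights and the per-task minimum and maximum hours are hypothetical, so the objective value will not match the reported 1000.

```r
library(lpSolve)

tasks <- c("Sanctions screening", "STR filing", "KYC refresh",
           "Complaint resolution", "Audit preparation")

# Hypothetical compliance value per hour and per-task hour bounds.
value_per_hour <- c(8, 7, 4, 3, 2)
min_hours      <- c(20, 20, 20, 20, 10)
max_hours      <- c(70, 60, 60, 50, 40)
capacity       <- 200  # weekly compliance-officer hours

n <- length(tasks)

# Constraint rows: total capacity, then one minimum and one maximum per task.
const_mat <- rbind(rep(1, n), diag(n), diag(n))
const_dir <- c("<=", rep(">=", n), rep("<=", n))
const_rhs <- c(capacity, min_hours, max_hours)

solution <- lp(direction = "max",
               objective.in = value_per_hour,
               const.mat = const_mat,
               const.dir = const_dir,
               const.rhs = const_rhs,
               compute.sens = 1)  # also compute sensitivity output

allocation <- setNames(solution$solution, tasks)
print(allocation)
print(solution$objval)

# With compute.sens set, solution$duals lists the constraint duals first;
# the first entry corresponds to the capacity row defined above.
print(solution$duals[1])
```

With these assumed weights the solver fills the two highest-value tasks to their maximums and spends the remaining hours on the next most valuable task, the same pattern the interpretation describes.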

10.4 Interpretation

The solver loads sanctions screening and STR filing to their maximum allowances: they carry the highest compliance value per hour and, from the Monte Carlo, are precisely the activities that buy down the catastrophic CBN-enforcement tail. KYC refresh, complaint resolution and audit preparation are held near their minimums. The shadow price on the capacity constraint gives the compliance value gained per additional hour of capacity, and so quantifies the value of a sixth compliance officer.

11 Integrated Findings

Read together, the five analyses point to a single recommendation: resource the sanctions-screening and STR desks to capacity, automate first-pass screening, and govern the programme against the P95 of the Monte Carlo — not the average. The worst case for a Bureau de Change is loss of its CBN licence, and the analysis shows that the two activities which most reduce that risk are exactly the two the optimisation prioritises.

12 Limitations & Further Work

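Synthetic data. The monitoring log is synthetic, calibrated to typical BDC patterns. The techniques transfer unchanged to the live transaction system, but the rates and exposure figures reported here must be re-estimated on live data before they inform a decision.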
Short time series. Six monthly observations support only a Holt forecast. With 24+ months, a seasonal model (CBN reporting cycles, festive-season FX demand) would be fitted and back-tested.
Under-powered survival model. The Cox model’s low concordance shows the current fields do not explain resolution speed. The next data cycle must capture assigned officer, officer workload and a structured complaint category.
Monte Carlo assumptions. The enforcement-cost and enforcement-probability parameters are planning estimates. They should be re-calibrated against the firm’s own audit history and the latest CBN enforcement guidance.
Causality. All findings are observational. A controlled pilot of the screening-automation intervention would give the Monte Carlo effect sizes a causal foundation.

References

Box, G. E. P., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time Series Analysis: Forecasting and Control (5th ed.). Wiley.
Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts.
Hyndman, R. J. et al. (2024). forecast: Forecasting Functions for Time Series and Linear Models. R package. citation("forecast")
Therneau, T. M. (2024). survival: Survival Analysis. R package. citation("survival")
Silge, J., & Robinson, D. (2017). Text Mining with R: A Tidy Approach. O’Reilly. citation("tidytext")
Berkelaar, M. et al. (2024). lpSolve: Interface to ‘Lp_solve’ to Solve Linear/Integer Programs. R package. citation("lpSolve")
Wickham, H. et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686.
Central Bank of Nigeria. (2024). Operational Guidelines for Bureaux de Change in Nigeria. CBN.
Federal Republic of Nigeria. (2023). Nigeria Data Protection Act. National Assembly.
Adi, B. (2026). AI-powered business analytics: a practical textbook for data-driven decision making. Mark Analytics. https://markanalytics.online/ai-powered-data-analytics/

Appendix — AI Usage Statement

The author used Claude (Anthropic) for two tasks: drafting the Quarto scaffold and section headings to the required rubric, and checking R syntax for the forecast, survival, lpSolve and tidytext packages. The analytical question, the scenario design, the Monte Carlo assumptions, the interpretation of every result (including the decision to report the survival model as honestly under-powered) and the recommendation to the Board are the independent professional judgement of Temitope Aribisala, Head of Legal & Compliance, Sohcahtoa Holdings Limited. Every figure is computed live in this document from `Sohcahtoa_BDC_Compliance_SLA_Data.xlsx` and is reproducible end-to-end via `quarto render`.