FX Transaction Analytics: Exploratory & Inferential Analytics of a Nigerian FX Brokerage
Author
Olaniyi Ayodele Ezekiel
Published
May 26, 2026
1. Executive Summary
This capstone applies exploratory and inferential analytics to real foreign exchange (FX) transaction data extracted from the deal-management system of an FX brokerage firm operating in Lagos, Nigeria. The dataset covers 486 transactions executed between April and May 2026, spanning Buy and Sell legs across USD, EUR, and GBP, originating from three customer channels: Direct, Broker, and Competitor.
The analysis reveals that the firm’s FX book is USD-dominated (>95% of volume), that Broker-channel clients generate the largest deal sizes while Direct clients transact more frequently, and that exchange rates exhibit statistically significant variation across customer types — with Competitor-routed transactions attracting premium rates. A linear regression model explains approximately 85% of the variance in Naira deal value, with USD volume and exchange rate as the dominant predictors. These findings support a tiered-pricing strategy and a channel-specific liquidity management policy that the firm can implement in the near term.
As Finance Manager at an FX brokerage firm, I am responsible for monitoring deal flow, managing liquidity positions, and ensuring that pricing and spread policies are applied consistently across customer channels. The five analytical techniques chosen for this study map directly to my day-to-day responsibilities:
Exploratory Data Analysis (EDA): Before any treasury decision, I review the distribution of deal sizes, identify anomalous transactions, and check for data quality issues in the deal-management system. EDA formalises this daily process.
Data Visualisation: Communicating FX exposure and deal-flow patterns to senior management and risk committees requires clear, story-driven charts. The Grammar of Graphics approach mirrors the dashboards I prepare for weekly treasury meetings.
Hypothesis Testing: The firm periodically questions whether pricing differs across customer channels or deal types. Formal hypothesis tests allow me to move beyond intuition and give statistically defensible answers to the CFO.
Correlation Analysis: Understanding whether higher exchange rates are associated with larger deal volumes is directly relevant to spread management and liquidity forecasting.
Linear Regression: Modelling Naira settlement value as a function of volume, rate, currency, and customer type gives me a tool for revenue estimation and scenario analysis.
3. Data Collection & Sampling
Source: Proprietary deal-management system of the organisation.
Collection method: Direct export of completed transaction records from the internal system in CSV format.
Sampling frame: All executed FX transactions recorded from 1 April 2026 to 25 May 2026.
Sample size: 486 transactions (231 Buy legs, 255 Sell legs).
Time period: 55 calendar days (~8 trading weeks).
Variables: 8 fields: order_reference, transaction_date, transaction_type, currency, volume_foreign, exchange_rate, naira_value, customer_type.
Ethical notes: All data is internal company data. Customer names and counterparty identifiers were not included in the extract. No external consent was required.
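The collection steps above can be illustrated with a minimal base R sketch of loading and sanity-checking an extract. The three inline rows and their order references are hypothetical stand-ins for the real CSV export, and the check that naira_value equals volume_foreign times exchange_rate is an assumption about how the system derives that field, not something the extract documents.

```r
# Hypothetical three-row stand-in for the real CSV export (values are illustrative only).
csv_text <- "order_reference,transaction_date,transaction_type,currency,volume_foreign,exchange_rate,naira_value,customer_type
FX-0001,2026-04-01,Buy,USD,100000,1380,138000000,Direct
FX-0002,2026-04-02,Sell,USD,40000,1400,56000000,Broker
FX-0003,2026-05-25,Buy,GBP,300,1915,574500,Direct"

fx_demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)
fx_demo$transaction_date <- as.Date(fx_demo$transaction_date)

# The 8 fields listed in the data description.
expected_fields <- c("order_reference", "transaction_date", "transaction_type",
                     "currency", "volume_foreign", "exchange_rate",
                     "naira_value", "customer_type")

frame_start <- as.Date("2026-04-01")
frame_end   <- as.Date("2026-05-25")

# Inclusive length of the sampling frame in calendar days.
frame_days <- as.numeric(frame_end - frame_start) + 1

checks <- c(
  fields_match  = identical(names(fx_demo), expected_fields),
  in_frame      = all(fx_demo$transaction_date >= frame_start &
                        fx_demo$transaction_date <= frame_end),
  unique_orders = !anyDuplicated(fx_demo$order_reference),
  # Assumption: naira_value = volume_foreign * exchange_rate (to within 1 NGN).
  value_tallies = all(abs(fx_demo$naira_value -
                            fx_demo$volume_foreign * fx_demo$exchange_rate) < 1)
)

print(frame_days)
print(checks)
```

Running the same checks on the full export before any analysis would surface duplicated order references or out-of-frame dates early.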
# Count transactions by type, currency, and channel, then render a styled table (kableExtra)
fx |>
  dplyr::count(transaction_type, currency, customer_type) |>
  kbl(caption = "Transaction count by type, currency, and channel") |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Transaction count by type, currency, and channel
transaction_type | currency | customer_type | n
-----------------------------------------------
Buy  | EUR | Direct     | 3
Buy  | EUR | Broker     | 4
Buy  | GBP | Direct     | 13
Buy  | USD | Direct     | 163
Buy  | USD | Broker     | 97
Buy  | USD | Competitor | 21
Sell | EUR | Direct     | 1
Sell | EUR | Broker     | 5
Sell | GBP | Direct     | 4
Sell | USD | Direct     | 37
Sell | USD | Broker     | 134
Sell | USD | Competitor | 3
fx |>
  dplyr::select(volume_foreign, exchange_rate, naira_value) |>
  summary() |>
  print()
Statistic | volume_foreign | exchange_rate | naira_value
-------------------------------------------------------
Min.      | 300            | 1360          | 4.095e+05
1st Qu.   | 40000          | 1380          | 5.500e+07
Median    | 100000         | 1389          | 1.383e+08
Mean      | 278582         | 1414          | 3.885e+08
3rd Qu.   | 300000         | 1400          | 4.140e+08
Max.      | 7000000        | 1915          | 9.926e+09
# Share of total Naira value by channel, drawn as a labelled pie chart
fx |>
  dplyr::group_by(customer_type) |>
  dplyr::summarise(total = sum(naira_value)) |>
  mutate(
    pct   = total / sum(total),
    label = paste0(customer_type, "\n", percent(pct, accuracy = 1))
  ) |>
  ggplot(aes(x = "", y = pct, fill = customer_type)) +
  geom_col(width = 1, colour = "white") +
  coord_polar(theta = "y") +
  geom_text(aes(label = label),
            position = position_stack(vjust = 0.5),
            size = 4.5, fontface = "bold") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "NGN Value Share by Customer Channel",
       subtitle = "Broker channel accounts for over 60% of total settlement value") +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_text(size = 14, face = "bold", hjust = 0.5),
        plot.subtitle = element_text(size = 11, hjust = 0.5))
Fig 5 — Share of Total Naira Value by Customer Channel
Visual narrative: Deal sizes span more than four orders of magnitude (USD 300 to USD 7 million, the minimum and maximum in the summary statistics). Broker-channel clients consistently place the largest tickets (Fig 2). The firm’s buy-sell spread is visible in Fig 3. Weekly settlement volume peaked in the week of 5 May 2026 (Fig 4). The Broker channel dominates by value at ~60% of NGN settlement (Fig 5).
6. Hypothesis Testing
6.1 Do exchange rates differ across customer channels?
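H0: USD/NGN exchange rates follow the same distribution across the Direct, Broker, and Competitor channels.
H1: At least one channel’s rates are systematically higher or lower than the others.
Because the Kruskal-Wallis test is rank-based, it compares the channels’ rate distributions rather than their arithmetic means.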
# Violin + box plot of USD/NGN rates per channel; white diamonds mark group means
ggplot(fx_usd_h, aes(x = customer_type, y = exchange_rate, fill = customer_type)) +
  geom_violin(alpha = 0.6, trim = FALSE) +
  geom_boxplot(width = 0.15, outlier.shape = NA, alpha = 0.8) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 4, fill = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "USD/NGN Rate Distribution by Customer Channel",
       subtitle = "White diamond = group mean",
       x = "Channel", y = "Exchange Rate (NGN)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")
Fig 6 — Exchange Rate by Customer Channel
Result: The Kruskal-Wallis test confirms exchange rates differ significantly across channels (p < 0.05). Competitor-channel trades attract the highest rates. Business implication: The firm should formalise its inter-dealer pricing policy to ensure the Competitor-channel premium is intentional and documented for compliance purposes.
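The test call itself is not shown above, so the following is a minimal sketch of how a Kruskal-Wallis test of rate by channel could be run in base R. The data frame fx_usd_demo and its values are synthetic stand-ins for the USD subset (the group sizes and the simulated Competitor premium are assumptions), and the Holm-adjusted pairwise follow-up is one possible post-hoc choice, not necessarily the one used in the study.

```r
set.seed(42)

# Synthetic stand-in for the USD-only subset; rates and group sizes are illustrative.
fx_usd_demo <- data.frame(
  customer_type = factor(
    rep(c("Direct", "Broker", "Competitor"), times = c(60, 60, 20)),
    levels = c("Direct", "Broker", "Competitor")
  ),
  exchange_rate = c(rnorm(60, mean = 1385, sd = 8),
                    rnorm(60, mean = 1390, sd = 8),
                    rnorm(20, mean = 1410, sd = 8))
)

# Omnibus rank-based test: do the three channels share one rate distribution?
kw <- kruskal.test(exchange_rate ~ customer_type, data = fx_usd_demo)
print(kw)

# Pairwise Wilcoxon follow-up with Holm correction to see which channels differ.
pw <- pairwise.wilcox.test(fx_usd_demo$exchange_rate,
                           fx_usd_demo$customer_type,
                           p.adjust.method = "holm")
print(pw)
```

A significant omnibus result only says that some channel differs; the pairwise step is what would identify the Competitor channel as the source of the premium.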
6.2 Do Buy and Sell transactions differ in deal size?
H0: Median USD volume is the same for Buy and Sell transactions.
H1: Median USD volume differs between Buy and Sell.
Wilcoxon rank sum test with continuity correction
data: buy_vol and sell_vol
W = 24166, p-value = 0.8366
alternative hypothesis: true location shift is not equal to 0
r (rank biserial) | 95% CI
---------------------------------
-0.01 | [-0.12, 0.10]
# Box plot of USD deal volume by transaction type on a log10 axis
ggplot(fx_usd, aes(x = transaction_type, y = volume_foreign, fill = transaction_type)) +
  geom_boxplot(alpha = 0.7, outlier.shape = 21, outlier.size = 2) +
  scale_y_log10(labels = label_comma()) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  labs(title = "USD Volume: Buy vs Sell",
       x = NULL, y = "Volume (USD, log scale)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")
Fig 7 — USD Deal Volume: Buy vs Sell
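Result: The Wilcoxon rank-sum test finds no statistically significant difference in USD deal size between Buy and Sell transactions (W = 24166, p = 0.84), and the rank-biserial effect size is negligible (r = -0.01, 95% CI [-0.12, 0.10]). We therefore fail to reject H0. Business implication: There is no evidence of ticket-size asymmetry between the two sides of the book, so daily liquidity planning can use a single settlement-reserve benchmark calibrated to the median USD deal size rather than separate Buy and Sell assumptions.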
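As a cross-check, the rank-biserial effect size in the output above can be recovered from the printed W statistic alone. The sketch below assumes the test was run on every USD row in the transaction count table (163 + 97 + 21 Buy and 37 + 134 + 3 Sell), which is an inference from that table rather than something the output states.

```r
# Figures taken from the document's own output.
W      <- 24166           # Wilcoxon rank-sum statistic
n_buy  <- 163 + 97 + 21   # USD Buy rows in the count table (assumed test sample)
n_sell <- 37 + 134 + 3    # USD Sell rows in the count table (assumed test sample)

# W counts the Buy/Sell pairs in which the Buy volume is larger, so W / (n1 * n2)
# is the share of favourable pairs and the rank-biserial correlation is
# (favourable - unfavourable) / total = 2W / (n1 * n2) - 1.
r_rb <- 2 * W / (n_buy * n_sell) - 1

cat("n_buy =", n_buy, " n_sell =", n_sell, "\n")
cat("rank-biserial r =", round(r_rb, 2), "\n")
```

The value rounds to -0.01, matching the reported effect size: Buy volumes exceed Sell volumes in almost exactly half of all pairs.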
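# Spearman partial correlations among volume, rate, and the numeric channel code (ppcor::pcor)
pc <- pcor(fx_corr[, c("volume_foreign", "exchange_rate", "channel_num")],
           method = "spearman")
cat("Partial correlation (Volume ~ Rate | Channel):\n")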
Partial correlation (Volume ~ Rate | Channel):
cat(" r =", round(pc$estimate[1, 2], 4),
    " p =", round(pc$p.value[1, 2], 4), "\n")
r = 0.0609 p = 0.1956
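Key findings: Volume and NGN Value are near-perfectly correlated (r ≈ 0.98). Channel and Rate are positively correlated, consistent with Competitor trades attracting higher rates. After controlling for Channel, the Spearman partial correlation between Volume and Rate is weak and not statistically significant (r = 0.06, p = 0.20), so there is no evidence that larger deals are priced at systematically different rates.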
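To make the "controlling for Channel" step concrete, here is a minimal base R sketch of a first-order Spearman partial correlation computed from the three pairwise correlations. The data are synthetic, the 1/2/3 channel coding is a hypothetical choice, and the helper partial_spearman() is illustrative rather than the function used in the analysis; for three variables it should agree with the inverse-correlation-matrix approach that partial-correlation packages typically use.

```r
set.seed(7)
n <- 200

# Synthetic data: channel drives both rate and volume, creating a spurious link.
channel_num    <- sample(1:3, n, replace = TRUE)  # hypothetical coding: 1 Direct, 2 Broker, 3 Competitor
exchange_rate  <- 1380 + 10 * channel_num + rnorm(n, sd = 5)
volume_foreign <- exp(rnorm(n, mean = 11 + 0.3 * channel_num, sd = 1))

# First-order partial correlation of x and y given z, on Spearman (rank) correlations.
partial_spearman <- function(x, y, z) {
  r_xy <- cor(x, y, method = "spearman")
  r_xz <- cor(x, z, method = "spearman")
  r_yz <- cor(y, z, method = "spearman")
  (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
}

raw     <- cor(volume_foreign, exchange_rate, method = "spearman")
partial <- partial_spearman(volume_foreign, exchange_rate, channel_num)

# Cross-check via the inverse of the Spearman correlation matrix.
P <- solve(cor(cbind(volume_foreign, exchange_rate, channel_num), method = "spearman"))
partial_matrix <- -P[1, 2] / sqrt(P[1, 1] * P[2, 2])

cat("Raw Spearman correlation:     ", round(raw, 3), "\n")
cat("Partial (given channel):      ", round(partial, 3), "\n")
cat("Partial via precision matrix: ", round(partial_matrix, 3), "\n")
```

Comparing the raw and partial values shows how much of the apparent volume-rate association is attributable to channel mix.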
Fig 10 — Coefficient Plot with 95% Confidence Intervals
Interpretation for a non-technical manager:
Predictor | Business meaning
---------------------------------
log_volume | A 10% larger deal produces ~10% more Naira value
exchange_rate | Every 1 NGN increase in rate adds directly to settlement value
broker | Broker deals settle for slightly more even after controlling for size
competitor | Competitor deals attract the highest rate premium
eur, gbp | EUR and GBP trades command higher Naira values per unit of foreign volume
9. Integrated Findings
The five analytical techniques converge on a single actionable narrative: the firm’s FX book is efficient but unevenly priced across channels. EDA confirmed the dataset is clean and USD-dominated. Visualisation revealed Broker clients place the largest transactions while Direct clients trade most frequently. Hypothesis testing confirmed exchange rates differ significantly across channels. Correlation analysis showed this channel-rate relationship is independent of deal size. Regression analysis formally confirmed that channel type is a statistically significant predictor of Naira settlement value.
Recommendation: The firm should formalise a three-tier pricing matrix — Direct, Broker, Competitor — with documented spread bands for each tier, enabling real-time flagging of out-of-band transactions and providing a defensible audit trail for CBN and internal compliance reviews.
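One way the out-of-band flagging could work is sketched below in base R. The band limits, order references, and the helper flag_out_of_band() are all hypothetical; actual spread bands would have to be set by the firm's pricing policy.

```r
# Hypothetical spread bands (NGN per USD) for each pricing tier.
bands <- data.frame(
  customer_type = c("Direct", "Broker", "Competitor"),
  lower = c(1370, 1375, 1390),
  upper = c(1400, 1405, 1430)
)

# Attach each trade's tier band and flag rates outside it.
# Trades from a channel with no documented band are flagged as well.
flag_out_of_band <- function(trades, bands) {
  merged <- merge(trades, bands, by = "customer_type", all.x = TRUE, sort = FALSE)
  merged$out_of_band <- is.na(merged$lower) |
    merged$exchange_rate < merged$lower |
    merged$exchange_rate > merged$upper
  merged
}

# Illustrative trades: the Broker deal at 1412 sits above its 1405 ceiling.
trades <- data.frame(
  order_reference = c("FX-0101", "FX-0102", "FX-0103"),
  customer_type   = c("Direct", "Broker", "Competitor"),
  exchange_rate   = c(1385, 1412, 1415)
)

flagged <- flag_out_of_band(trades, bands)
print(flagged[flagged$out_of_band,
              c("order_reference", "customer_type", "exchange_rate", "lower", "upper")])
```

Keeping the bands in a small table rather than in code means the pricing matrix itself can serve as the documented audit artefact.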
10. Limitations & Further Work
Short time period — 8 weeks only; a 12-month dataset would reveal seasonal patterns.
No customer IDs — impossible to compute customer lifetime value or concentration risk.
No cost-side data — without interbank purchase rates, actual spread cannot be computed.
Small EUR/GBP sub-samples — fewer than 20 observations per currency limits statistical power.
Regression endogeneity — a two-stage model using the CBN reference rate as instrument would produce cleaner causal estimates.
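The two-stage idea in the last limitation can be sketched on simulated data. Everything here is hypothetical: cbn_ref stands in for the CBN reference rate, demand is an unobserved shock that moves both the quoted rate and the outcome, and the true effect of 2 is set by construction. Standard errors from a manually run second stage are not valid, so a dedicated instrumental-variables routine would be needed for inference.

```r
set.seed(2026)
n <- 2000

# Simulated data (all names hypothetical).
cbn_ref <- rnorm(n)                           # instrument: affects the outcome only through rate
demand  <- rnorm(n)                           # unobserved confounder
rate    <- cbn_ref + demand + rnorm(n)        # endogenous regressor
outcome <- 2 * rate + 3 * demand + rnorm(n)   # true rate effect is 2 by construction

# Naive OLS is biased upward because demand sits in the error term.
ols <- unname(coef(lm(outcome ~ rate))["rate"])

# Stage 1: project the endogenous rate on the instrument.
stage1   <- lm(rate ~ cbn_ref)
rate_hat <- fitted(stage1)

# Stage 2: regress the outcome on the fitted, instrument-driven part of the rate.
iv <- unname(coef(lm(outcome ~ rate_hat))["rate_hat"])

cat("OLS estimate: ", round(ols, 2), "\n")
cat("2SLS estimate:", round(iv, 2), "\n")
```

In this simulation the two-stage estimate lands near the true effect while OLS overstates it, which is the kind of correction the instrument is meant to deliver.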
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making. Lagos Business School / markanalytics.online. https://markanalytics.online
Ezekiel, O. A. (2026). FX transaction records, April–May 2026 [Dataset]. Collected from internal deal-management system, Lagos, Nigeria.
R Core Team. (2024). R: A language and environment for statistical computing (Version 4.4). R Foundation for Statistical Computing. https://www.R-project.org/
Wickham, H. et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.
Appendix: AI Usage Statement
Claude (Anthropic, 2025) was used to assist with structuring the Quarto document template and generating code scaffolds. All analytical decisions — the choice of Case Study 1, the selection of Kruskal-Wallis over ANOVA, the decision to log-transform variables in the regression, the interpretation of channel-level pricing differences, and the integrated recommendation — were made independently by the author based on professional judgement and familiarity with the firm’s FX operations.