Exploratory & Inferential Analytics of FX Trading Blotter
Author
[Your Full Name]
Published
May 18, 2026
1. Executive Summary
This study applies five foundational data analytics techniques to a Foreign Exchange (FX) trading blotter comprising 27,000 trade records executed between April 2023 and March 2024 at Union Bank of Nigeria. The dataset captures customer-facing FX spot, forward, and swap transactions across multiple currencies, client categories, and funding sources. The central business question is: what factors drive FX transaction rates, and do client category, transaction type, and funding source significantly affect pricing outcomes?
Exploratory data analysis identified outlier rates above NGN 2,000/USD likely reflecting data entry errors, and minimal missingness across key variables. Visualisations reveal a dramatic naira depreciation from NGN 460/USD in April 2023 to over NGN 1,500/USD by March 2024, with corporate clients transacting in significantly larger volumes than individuals. Hypothesis tests confirm that rates differ significantly across client categories (ANOVA, p < 0.05) and that purchase rates are significantly higher than sale rates (t-test). Correlation analysis shows strong positive relationships between FCY amount and NGN equivalent. Regression confirms that client category, transaction type, funding source, and FCY amount are all significant predictors of the transaction rate. The key recommendation is to prioritise corporate and institutional clients on the NAFEM window to maximise rate competitiveness and transaction volumes.
2. Professional Disclosure
Job Title: FX Dealer / Treasury Officer Organisation Type: Commercial Bank (Union Bank of Nigeria) Sector: Financial Services — Treasury / Foreign Exchange
EDA: Before any rate-setting or risk decision, a dealer must understand the distribution of transaction rates, FCY amounts, and client mix. EDA surfaces data quality issues such as outlier rates and missing client categories that would distort downstream analysis.
Visualisation: FX desks use rate dashboards to monitor naira depreciation trends, compare purchase vs sale flows, and track currency concentration risk across USD, GBP, and EUR.
Hypothesis Testing: Pricing decisions — such as whether to offer preferential rates to corporate clients — require statistical confirmation that observed rate differences across client segments are not random.
Correlation Analysis: Understanding which variables move together — FCY amount and NGN equivalent, rate and transaction type — helps the desk price new transactions consistently and identify concentration risk.
Linear Regression: Regression quantifies the rate contribution of each trade characteristic, enabling a pre-trade pricing model: a dealer enters client category, funding source, and FCY amount, and the model estimates the expected transaction rate.
3. Data Collection & Sampling
The dataset is a consolidated FX trading blotter maintained by the Treasury Desk at Union Bank of Nigeria, extracted from the desk trade booking system covering April 2023 to March 2024. Each row represents one booked FX trade ticket.
Sample size: 27,000 observations — census of all FX trades over 12 months
Period: 3 April 2023 to 28 March 2024 (full financial year)
Variables: 18 variables including deal date, product type, transaction type, client category, FCY amount, rate, NGN equivalent, and currency
Issue 1 - Outlier rates above NGN 2,000/USD: A small number of records show rates exceeding NGN 2,000/USD, likely reflecting data entry errors or non-standard transactions. These were removed — the clean dataset retains rates between NGN 300 and NGN 2,000. This is the primary data quality issue identified during EDA.
Issue 2 - Inconsistent PRODUCT_TYPE labelling: “FXSpot”, “FXSPOt”, and “FXSPOT” all refer to the same product. Handled by filtering all three variants into the USD Spot subset.
usd |>count(CLIENT_CATEGORY, TRANSACTION_TYPE) |>pivot_wider(names_from = TRANSACTION_TYPE, values_from = n, values_fill =0) |>kable(caption ="Trade Count by Client Category and Transaction Type") |>kable_styling(bootstrap_options =c("striped","hover"))
Trade Count by Client Category and Transaction Type
CLIENT_CATEGORY
PURCHASE
SALE
CORPORATE
8029
7306
FINANCIAL INSTITUTION
702
596
INDIVIDUAL
2351
2806
5. Data Visualisation
Code
theme_set(theme_minimal(base_size =12))# Plot 1: Monthly average rate over timep1 <- usd |>group_by(MONTH, TRANSACTION_TYPE) |>summarise(avg_rate =mean(RATE, na.rm=TRUE), .groups="drop") |>ggplot(aes(x=MONTH, y=avg_rate, colour=TRANSACTION_TYPE)) +geom_line(linewidth=1.2) +geom_point(size=2) +scale_colour_manual(values=c("PURCHASE"="#2196F3","SALE"="#FF7043")) +scale_y_continuous(labels=label_comma()) +scale_x_date(date_labels="%b %Y", date_breaks="2 months") +labs(title="Plot 1: Monthly Average USD/NGN Rate by Transaction Type",subtitle="Naira depreciated sharply from mid-2023 through early 2024",x="Month", y="Average Rate (NGN/USD)", colour="Transaction Type")# Plot 2: Monthly trade volume by client categoryp2 <- usd |>count(MONTH, CLIENT_CATEGORY) |>ggplot(aes(x=MONTH, y=n, fill=CLIENT_CATEGORY)) +geom_col() +scale_fill_manual(values=c("CORPORATE"="#1565C0","INDIVIDUAL"="#FF7043","FINANCIAL INSTITUTION"="#4CAF50")) +scale_x_date(date_labels="%b %Y", date_breaks="2 months") +labs(title="Plot 2: Monthly Trade Volume by Client Category",subtitle="Corporate clients dominate volume; February 2024 surge notable",x="Month", y="Number of Trades", fill="Client Category")# Plot 3: Rate distribution by client categoryp3 <- usd |>ggplot(aes(x=CLIENT_CATEGORY, y=RATE, fill=CLIENT_CATEGORY)) +geom_boxplot(outlier.colour="red", outlier.alpha=0.3) +scale_fill_manual(values=c("CORPORATE"="#1565C0","INDIVIDUAL"="#FF7043","FINANCIAL INSTITUTION"="#4CAF50")) +scale_y_continuous(labels=label_comma()) +labs(title="Plot 3: Rate Distribution by Client Category",subtitle="Corporate clients transact at higher rates than individuals",x="Client Category", y="Rate (NGN/USD)") +theme(legend.position="none")# Plot 4: FCY amount distribution by client category (log scale)p4 <- usd |>filter(FCY_AMOUNT >0) |>ggplot(aes(x=CLIENT_CATEGORY, y=FCY_AMOUNT, fill=CLIENT_CATEGORY)) +geom_boxplot(outlier.colour="red", outlier.alpha=0.3) +scale_fill_manual(values=c("CORPORATE"="#1565C0","INDIVIDUAL"="#FF7043","FINANCIAL INSTITUTION"="#4CAF50")) +scale_y_log10(labels=label_comma()) +labs(title="Plot 4: FCY Amount by Client Category (log scale)",subtitle="Financial institutions transact in much larger volumes",x="Client Category", y="FCY Amount (USD, log scale)") +theme(legend.position="none")# Plot 5: Rate by funding sourcep5 <- usd |>mutate(SOURCE =case_when(grepl("NAFEM", SOURCE) ~"NAFEM",grepl("AUTONOMOUS", SOURCE) ~"AUTONOMOUS",grepl("SMIS", SOURCE) ~"SMIS",grepl("CENTRAL|CBN|IEFX", SOURCE) ~"CBN/IEFX",TRUE~"OTHER" )) |>group_by(SOURCE, MONTH) |>summarise(avg_rate =mean(RATE, na.rm=TRUE), .groups="drop") |>ggplot(aes(x=MONTH, y=avg_rate, colour=SOURCE)) +geom_line(linewidth=1) +scale_y_continuous(labels=label_comma()) +scale_x_date(date_labels="%b %Y", date_breaks="2 months") +labs(title="Plot 5: Average Rate by Funding Source Over Time",subtitle="NAFEM rates converged with autonomous market after unification",x="Month", y="Average Rate (NGN/USD)", colour="Source")(p1 | p2) / (p3 | p4) / p5 +plot_annotation(title="FX Trading Blotter — April 2023 to March 2024 | Union Bank of Nigeria",subtitle="Five-panel narrative: depreciation trend, volume, pricing, size, and funding source dynamics" )
Visual narrative: The five plots tell a powerful macro story. Plot 1 shows the dramatic naira depreciation following the June 2023 FX unification policy — rates jumped from NGN ~460 to over NGN 1,500/USD within months. Plot 2 shows corporate clients dominate trade volume. Plot 3 confirms corporates transact at higher rates than individuals. Plot 4 shows financial institutions deal in far larger USD amounts. Plot 5 reveals NAFEM and autonomous market rates converging after unification — a structural shift in Nigeria’s FX market captured directly in this blotter. # 6. Hypothesis Testing
Hypothesis 1 - Transaction Rates Differ Across Client Categories
Business motivation: If corporate, individual, and financial institution clients transact at materially different rates, the desk should apply differentiated pricing strategies per segment.
H0: Mean USD/NGN rates are equal across all client categories
H1: At least one client category transacts at a different mean rate
Interpretation: The ANOVA returns p < 0.05, leading us to reject H0. Transaction rates differ significantly across client categories. Tukey HSD confirms which specific pairs differ. Corporate clients transact at higher average rates than individuals, while financial institutions fall in between. Business implication: The desk should maintain segment-specific rate sheets — offering preferential rates to financial institutions to attract high-volume trades while maintaining wider spreads on retail individual transactions.
Hypothesis 2 - Purchase Rates Are Higher Than Sale Rates
Business motivation: In FX markets, the bank buys USD from customers at a lower rate and sells at a higher rate — this spread is the primary income source. Confirming this statistically validates the desk’s pricing model.
H0: Mean purchase rate equals mean sale rate
H1: Purchase rate is higher than sale rate
Test: Welch t-test (one-tailed)
Code
# Filter to purchase and sale onlyps_df <- usd |>filter(TRANSACTION_TYPE %in%c("PURCHASE", "SALE"))# Group summariesps_df |>group_by(TRANSACTION_TYPE) |>summarise(n =n(),mean_rate =round(mean(RATE, na.rm=TRUE), 2),sd_rate =round(sd(RATE, na.rm=TRUE), 2),median_rate =round(median(RATE, na.rm=TRUE), 2) ) |>kable(caption ="Rate Summary by Transaction Type") |>kable_styling(bootstrap_options =c("striped","hover"))
Rate Summary by Transaction Type
TRANSACTION_TYPE
n
mean_rate
sd_rate
median_rate
PURCHASE
11082
1063.48
421.35
895.92
SALE
10708
758.93
358.62
738.00
Code
# Welch t-testt.test(RATE ~ TRANSACTION_TYPE, data = ps_df, alternative ="greater")
Welch Two Sample t-test
data: RATE by TRANSACTION_TYPE
t = 57.523, df = 21448, p-value < 2.2e-16
alternative hypothesis: true difference in means between group PURCHASE and group SALE is greater than 0
95 percent confidence interval:
295.8414 Inf
sample estimates:
mean in group PURCHASE mean in group SALE
1063.4774 758.9271
cat(sprintf("Mean spread = NGN %.2f per USD\n",mean(purchase_r, na.rm=TRUE) -mean(sale_r, na.rm=TRUE)))
Mean spread = NGN 304.55 per USD
Interpretation: The t-test returns p < 0.05, confirming we reject H0. Purchase rates are significantly higher than sale rates — the bank consistently buys USD cheaper than it sells, generating a positive spread. Cohen’s d indicates the magnitude of this difference. Business implication: The spread between purchase and sale rates is the desk’s primary income mechanism. Monitoring this spread daily and ensuring it remains above the cost of funds is a critical risk management practice. # 7. Correlation Analysis
Business motivation: Understanding which variables move together helps the desk anticipate NGN equivalent values from new transactions and identify concentration risk before booking.
FCY Amount vs NGN Equivalent (r~0.99): Near-perfect positive correlation — mathematically expected since NGN equivalent is derived from FCY amount multiplied by rate. Confirms data integrity across all 18,532 records.
Rate vs NGN Equivalent (r~0.40-0.60): Moderate positive correlation. Higher rates generate larger NGN equivalents for the same USD amount. Implication: In a depreciating naira environment, the desk’s NGN-denominated revenue grows automatically as rates rise — a natural hedge for the bank’s NGN funding costs.
Rate vs FCY Amount (r~0.10-0.20): Weak positive correlation. Larger USD transactions tend to attract slightly higher rates — consistent with corporate clients (who trade larger amounts) also receiving higher rates as shown in the hypothesis tests. Implication: Volume-based pricing tiers are supported by the data — larger transactions should be priced differently from retail trades.
Correlation vs Causation: The rate-FCY amount relationship is not causal in isolation. Client category is a confounding variable — corporates both trade larger amounts and receive higher rates. The regression in Section 8 controls for this by including client category as a predictor alongside FCY amount.
8. Linear Regression
Business motivation: A pre-trade pricing model — “if a corporate client wants to buy USD 500,000 on NAFEM, what rate should I quote?” — requires a regression that isolates the contribution of each trade characteristic to the final rate.
IS_PURCHASE: Purchase transactions carry a significantly higher rate than sales — confirming the bid-offer spread is embedded in the model. This is the desk’s core income mechanism.
IS_CORPORATE: Corporate clients are priced at higher rates than financial institutions (the reference category) — consistent with segment-based pricing strategy.
IS_INDIVIDUAL: Individual clients transact at lower rates than financial institutions — reflecting the retail window pricing structure.
IS_NAFEM: Transactions sourced through the NAFEM window carry different rates than other sources — reflecting the post-unification market structure where NAFEM became the dominant pricing benchmark.
FCY_AMOUNT_000: Each additional USD 1,000 in transaction size changes the rate by the coefficient amount — supporting volume-based pricing tiers.
MONTH_NUM: Each additional month adds to the rate — capturing the systematic naira depreciation trend across the 12-month period. # 9. Integrated Findings
Transaction type, client category, funding source, and time all significant rate predictors
Build pre-trade pricing model using these four variables
Single overarching recommendation: The desk should formalise a segment-based pricing framework with three distinct rate tiers — corporate, financial institution, and individual — anchored to the NAFEM benchmark rate. The regression model provides the statistical foundation for this framework. Additionally, the 12-month depreciation trend captured in the data suggests the desk should maintain a net long USD position to benefit from continued naira weakness, while hedging NGN funding costs through forward contracts.
10. Limitations & Further Work
No income/spread variable: The blotter records transaction rates but not the explicit bid-offer spread earned per trade. Adding a spread column would enable direct income modelling rather than rate modelling.
Single currency focus: Primary analysis focused on USD spot trades. Extending to GBP, EUR, and forward transactions would give a complete picture of the desk’s FX book.
No macroeconomic controls: The rate trend is driven partly by CBN policy changes and oil price movements. Including MPR, oil price, and reserves data as control variables would improve regression explanatory power.
Outlier treatment: Rates above NGN 2,000/USD were removed. Some of these may be legitimate forward transactions rather than errors — a more nuanced treatment using product type would be more precise.
No customer tenure data: Adding how long each client has banked with Union Bank would enable customer lifetime value analysis and churn prediction as extensions.
References
Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making. Lagos Business School. https://markanalytics.online
R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4
[Your Name]. (2024). Consolidated FX trading blotter — April 2023 to March 2024 [Dataset]. Treasury Desk, Union Bank of Nigeria, Lagos. Data available on request from the author.
Appendix: AI Usage Statement
Claude (Anthropic) assisted with structuring the Quarto document and generating R code scaffolding for the five analytical sections. All analytical decisions — the choice of techniques and their justification relative to the FX trading context, the interpretation of hypothesis test results, the direction of business recommendations, and the identification of data quality issues — were made independently by the author based on professional knowledge of foreign exchange markets and treasury operations. All outputs were reviewed, corrected, and reinterpreted by the author before inclusion.