FX Transaction Analytics: Exploratory & Inferential Analytics of a Nigerian FX Brokerage

Author

Olaniyi Ayodele Ezekiel

Published

May 26, 2026

1. Executive Summary

This capstone applies exploratory and inferential analytics to real foreign exchange (FX) transaction data extracted from the deal-management system of an FX brokerage firm operating in Lagos, Nigeria. The dataset covers 485 transactions executed between 27 February and 25 May 2026, spanning Buy and Sell legs across USD, EUR, and GBP, originating from three customer channels: Direct, Broker, and Competitor.

The analysis shows that the firm’s FX book is USD-dominated (about 94% of transactions and >95% of foreign-currency volume), that Broker-channel clients place the largest deals and also the most transactions (240 versus 221 for Direct) while Direct clients dominate the Buy side, and that USD/NGN exchange rates do not differ significantly across customer types (Kruskal-Wallis p = 0.25), so the apparent Competitor-channel premium is not statistically supported. A log-linear regression reproduces Naira deal value almost exactly (R² ≈ 1.00) because Naira value is the product of volume and exchange rate; once these two are controlled for, the channel indicators are not significant. These findings support documenting explicit pricing bands per channel and a liquidity management policy sized around the small number of very large tickets.

2. Professional Disclosure

Name: Olaniyi Ayodele Ezekiel
Job Title: Finance Manager
Organisation Type: FX Brokerage Services (Lagos, Nigeria)

As Finance Manager at an FX brokerage firm, I am responsible for monitoring deal flow, managing liquidity positions, and ensuring that pricing and spread policies are applied consistently across customer channels. The five analytical techniques chosen for this study map directly to my day-to-day responsibilities:

Exploratory Data Analysis (EDA): Before any treasury decision, I review the distribution of deal sizes, identify anomalous transactions, and check for data quality issues in the deal-management system. EDA formalises this daily process.
Data Visualisation: Communicating FX exposure and deal-flow patterns to senior management and risk committees requires clear, story-driven charts. The Grammar of Graphics approach mirrors the dashboards I prepare for weekly treasury meetings.
Hypothesis Testing: The firm periodically questions whether pricing differs across customer channels or deal types. Formal hypothesis tests allow me to move beyond intuition and give statistically defensible answers to the CFO.
Correlation Analysis: Understanding whether higher exchange rates are associated with larger deal volumes is directly relevant to spread management and liquidity forecasting.
Linear Regression: Modelling Naira settlement value as a function of volume, rate, currency, and customer type gives me a tool for revenue estimation and scenario analysis.

3. Data Collection & Sampling

Source: Proprietary deal-management system of the organisation.
Collection method: Direct export of completed transaction records from the internal system in CSV format.
Sampling frame: All executed FX transactions recorded from 27 February 2026 to 25 May 2026.
Sample size: 485 transactions (301 Buy legs, 184 Sell legs).
Time period: 88 calendar days (~13 weeks).
Variables: 8 fields — order_reference, transaction_date, transaction_type, currency, volume_foreign, exchange_rate, naira_value, customer_type.
Ethical notes: All data is internal company data. Customer names and counterparty identifiers were not included in the extract. No external consent was required.

4. Data Description

Code

library(tidyverse)
library(lubridate)
library(knitr)
library(kableExtra)
library(scales)
library(car)
library(effectsize)
library(broom)
library(ppcor)
library(dunn.test)

fx <- read_csv(
  "C:/Users/Ayodele Olaniyi/Downloads/FX_Capstone Folder/FX_Transactions_Cleaned.csv",
  show_col_types = FALSE
) |>
  mutate(
    transaction_date = dmy(transaction_date),
    transaction_type = factor(transaction_type, levels = c("Buy","Sell")),
    currency         = factor(currency),
    customer_type    = factor(customer_type, levels = c("Direct","Broker","Competitor")),
    week             = floor_date(transaction_date, "week"),
    naira_bn         = naira_value / 1e9,
    vol_k            = volume_foreign / 1000
  )

cat("Total transactions:", nrow(fx), "\n")

Total transactions: 485

Code

cat("Date range:", as.character(min(fx$transaction_date)),
    "to", as.character(max(fx$transaction_date)), "\n")

Date range: 2026-02-27 to 2026-05-25
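As a rough cross-check of the sampling window, the span implied by the printed date range can be computed in base R. This is a minimal sketch: the two dates are copied from the output above, and the variable names are illustrative only.

```r
# Dates copied from the printed date range above
start_date <- as.Date("2026-02-27")
end_date   <- as.Date("2026-05-25")

# Inclusive number of calendar days, and the approximate number of weeks
span_days  <- as.integer(end_date - start_date) + 1L
span_weeks <- span_days / 7

cat("Calendar days:", span_days, "\n")
cat("Approximate weeks:", round(span_weeks, 1), "\n")
```

The window works out to 88 calendar days, or a little under 13 weeks.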

Code

cat("Columns:", paste(names(fx), collapse=", "), "\n")

Columns: order_reference, transaction_date, transaction_type, currency, volume_foreign, exchange_rate, naira_value, customer_type, week, naira_bn, vol_k

Code

fx |>
  dplyr::count(transaction_type, currency, customer_type) |>
  kbl(caption = "Transaction count by type, currency, and channel") |>
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

Transaction count by type, currency, and channel
transaction_type	currency	customer_type	n
Buy	EUR	Direct	3
Buy	EUR	Broker	4
Buy	GBP	Direct	13
Buy	USD	Direct	163
Buy	USD	Broker	97
Buy	USD	Competitor	21
Sell	EUR	Direct	1
Sell	EUR	Broker	5
Sell	GBP	Direct	4
Sell	USD	Direct	37
Sell	USD	Broker	134
Sell	USD	Competitor	3
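The marginal totals behind the table above can be recovered with a few lines of base R. This sketch re-types the twelve printed counts rather than re-reading the source file, so it only checks the arithmetic of the table; the object names are illustrative.

```r
# Counts transcribed from the table above (not re-read from the source file)
counts <- data.frame(
  transaction_type = c(rep("Buy", 6), rep("Sell", 6)),
  currency         = rep(c("EUR", "EUR", "GBP", "USD", "USD", "USD"), 2),
  customer_type    = rep(c("Direct", "Broker", "Direct",
                           "Direct", "Broker", "Competitor"), 2),
  n                = c(3, 4, 13, 163, 97, 21, 1, 5, 4, 37, 134, 3)
)

total <- sum(counts$n)

# Marginal totals by each dimension
by_type     <- tapply(counts$n, counts$transaction_type, sum)
by_currency <- tapply(counts$n, counts$currency, sum)
by_channel  <- tapply(counts$n, counts$customer_type, sum)

print(by_type)
print(round(100 * by_currency / total, 1))  # percentage share by currency
print(round(100 * by_channel / total, 1))   # percentage share by channel
```

The totals are 301 Buy and 184 Sell legs (485 in all), with 455 USD, 17 GBP, and 13 EUR transactions, and 240 Broker, 221 Direct, and 24 Competitor transactions.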

Code

fx |>
  dplyr::select(volume_foreign, exchange_rate, naira_value) |>
  summary() |>
  print()

 volume_foreign    exchange_rate   naira_value       
 Min.   :    300   Min.   :1360   Min.   :4.095e+05  
 1st Qu.:  40000   1st Qu.:1380   1st Qu.:5.500e+07  
 Median : 100000   Median :1389   Median :1.383e+08  
 Mean   : 278582   Mean   :1414   Mean   :3.885e+08  
 3rd Qu.: 300000   3rd Qu.:1400   3rd Qu.:4.140e+08  
 Max.   :7000000   Max.   :1915   Max.   :9.926e+09

Code

cat("Missing values per column:\n")

Missing values per column:

Code

colSums(is.na(fx)) |> print()

 order_reference transaction_date transaction_type         currency 
               0                0                0                0 
  volume_foreign    exchange_rate      naira_value    customer_type 
               0                0                0                0 
            week         naira_bn            vol_k 
               0                0                0

Code

q   <- quantile(fx$volume_foreign, c(0.25, 0.75))
iqr <- q[2] - q[1]
outliers <- fx |> dplyr::filter(volume_foreign > q[2] + 3 * iqr)
cat("\nExtreme-volume outliers (retained):", nrow(outliers), "\n")


Extreme-volume outliers (retained): 21

Code

outliers |>
  dplyr::select(order_reference, transaction_date, transaction_type,
         currency, volume_foreign, customer_type) |>
  kbl(caption = "Extreme-volume transactions") |>
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)

Extreme-volume transactions
order_reference	transaction_date	transaction_type	currency	volume_foreign	customer_type
P/2026/0784	2026-05-13	Buy	USD	2000000	Broker
P/2026/0778	2026-05-11	Buy	USD	2200000	Broker
P/2026/0742	2026-05-07	Buy	USD	2530000	Competitor
P/2026/0675	2026-04-28	Buy	USD	1450000	Direct
P/2026/0668	2026-04-24	Buy	USD	2050000	Direct
P/2026/0648	2026-04-22	Buy	USD	2420000	Direct
P/2026/0622	2026-04-20	Buy	USD	2000000	Direct
P/2026/0511	2026-03-27	Buy	USD	2000000	Direct
P/2026/0478	2026-03-23	Buy	USD	1100000	Competitor
P/2026/0429	2026-03-13	Buy	USD	2000000	Direct
P/2026/0354	2026-03-04	Buy	USD	1300000	Direct
P/2026/0342	2026-03-02	Buy	USD	1164218	Competitor
S/2026/0382	2026-03-17	Sell	USD	7000000	Broker
S/2026/0343	2026-03-06	Sell	USD	3000000	Broker
S/2026/0550	2026-04-28	Sell	USD	2000000	Broker
S/2026/0494	2026-04-17	Sell	USD	2500000	Broker
S/2026/0619	2026-05-13	Sell	USD	4000000	Broker
S/2026/0625	2026-05-14	Sell	USD	3000000	Broker
S/2026/0594	2026-05-06	Sell	USD	1946000	Broker
S/2026/0598	2026-05-07	Sell	USD	3990000	Broker
S/2026/0573	2026-05-04	Sell	USD	2500000	Broker

Data quality findings:

No missing values — the dataset is complete across all 8 variables.
Extreme-volume outliers exist (21 deals above the 3 × IQR fence, including a single deal of USD 7 million) but are legitimate large institutional trades and have been retained.
Three currencies are present. By transaction count: USD (~94%), GBP (~3.5%), and EUR (~2.7%).
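The outlier rule used above can be worked through by hand from the printed quartiles. The sketch below assumes the first and third quartiles shown in the summary output (40,000 and 300,000) are exact; the 1.5 × IQR fence is included only for comparison and was not used in the analysis.

```r
# Quartiles copied from the summary() output for volume_foreign
q1 <- 40000
q3 <- 300000

iqr_vol     <- q3 - q1              # interquartile range
upper_fence <- q3 + 3 * iqr_vol     # "extreme" rule used above (3 x IQR)
mild_fence  <- q3 + 1.5 * iqr_vol   # conventional 1.5 x IQR rule, for comparison

cat("IQR:", iqr_vol, "\n")
cat("Extreme fence (3 x IQR):", upper_fence, "\n")
cat("Conventional fence (1.5 x IQR):", mild_fence, "\n")
```

Under these assumptions the extreme fence is 1,080,000, which is consistent with the smallest flagged deal in the table (1,100,000). The stricter multiplier is a reasonable choice for heavily right-skewed deal sizes, since the conventional fence of 690,000 would flag many more routine institutional trades.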

5. Data Visualisation

Code

fx_usd <- fx |> dplyr::filter(currency == "USD")

ggplot(fx_usd, aes(x = volume_foreign)) +
  geom_histogram(aes(fill = transaction_type), bins = 40,
                 alpha = 0.8, position = "identity") +
  scale_x_log10(labels = label_comma()) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  labs(
    title    = "USD Deal Volume Distribution (log scale)",
    subtitle = "Most deals cluster between USD 10,000 and USD 500,000",
    x = "Deal Volume (USD, log scale)", y = "Count", fill = "Type"
  ) +
  theme_minimal(base_size = 13)

Code

ggplot(fx_usd, aes(x = customer_type, y = volume_foreign, fill = customer_type)) +
  geom_boxplot(outlier.shape = 21, outlier.size = 2, alpha = 0.75) +
  geom_jitter(width = 0.15, alpha = 0.3, size = 1.2) +
  scale_y_log10(labels = label_comma()) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title    = "USD Deal Volume by Customer Channel",
    subtitle = "Broker channel drives the largest individual deal sizes",
    x = "Customer Channel", y = "Volume (USD, log scale)"
  ) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

Code

fx_usd |>
  dplyr::group_by(transaction_date, transaction_type) |>
  dplyr::summarise(avg_rate = mean(exchange_rate), .groups = "drop") |>
  ggplot(aes(x = transaction_date, y = avg_rate,
             colour = transaction_type, group = transaction_type)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2, alpha = 0.7) +
  scale_colour_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  scale_x_date(date_labels = "%d %b", date_breaks = "1 week") +
  labs(
    title    = "Daily Average USD/NGN Exchange Rate",
    subtitle = "Sell rates consistently exceed Buy rates — the firm's spread is visible",
    x = NULL, y = "NGN per USD", colour = "Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

Fig 3 — Daily Average USD/NGN Exchange Rate Over Time

Code

fx |>
  dplyr::group_by(week, transaction_type) |>
  dplyr::summarise(total_ngn = sum(naira_value) / 1e9, .groups = "drop") |>
  ggplot(aes(x = week, y = total_ngn, fill = transaction_type)) +
  geom_col(position = "dodge", alpha = 0.85) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  scale_y_continuous(labels = label_comma(suffix = " B")) +
  scale_x_date(date_labels = "%d %b", date_breaks = "1 week") +
  labs(
    title = "Weekly NGN Settlement Volume",
    subtitle = "Peak Sell week: 5 May 2026",
    x = NULL, y = "NGN (billions)", fill = "Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

Code

fx |>
  dplyr::group_by(customer_type) |>
  dplyr::summarise(total = sum(naira_value)) |>
  mutate(
    pct   = total / sum(total),
    label = paste0(customer_type, "\n", percent(pct, accuracy = 1))
  ) |>
  ggplot(aes(x = "", y = pct, fill = customer_type)) +
  geom_col(width = 1, colour = "white") +
  coord_polar(theta = "y") +
  geom_text(aes(label = label),
            position = position_stack(vjust = 0.5),
            size = 4.5, fontface = "bold") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "NGN Value Share by Customer Channel",
       subtitle = "Broker channel accounts for over 60% of total settlement value") +
  theme_void() +
  theme(legend.position = "none",
        plot.title    = element_text(size = 14, face = "bold", hjust = 0.5),
        plot.subtitle = element_text(size = 11, hjust = 0.5))

Fig 5 — Share of Total Naira Value by Customer Channel

Visual narrative: Deal sizes span more than four orders of magnitude (from 300 units of foreign currency to USD 7 million). Broker-channel clients consistently place the largest tickets (Fig 2). The firm’s buy-sell spread is visible in Fig 3. Weekly settlement volume peaked in the week of 5 May 2026 (Fig 4). Broker channel dominates by value at ~60% of NGN settlement (Fig 5).

6. Hypothesis Testing

6.1 Do exchange rates differ across customer channels?

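H0: The distribution of USD/NGN exchange rates is the same across Direct, Broker, and Competitor channels.
H1: At least one channel has systematically higher or lower rates than another. (Because normality is rejected for the Direct and Broker channels below, the rank-based Kruskal-Wallis test is used rather than a comparison of means.)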

Code

fx_usd_h <- fx_usd |>
  dplyr::filter(customer_type %in% c("Direct","Broker","Competitor"))

fx_usd_h |>
  dplyr::group_by(customer_type) |>
  dplyr::summarise(
    n       = dplyr::n(),
    W       = shapiro.test(exchange_rate)$statistic,
    p_value = shapiro.test(exchange_rate)$p.value,
    .groups = "drop"
  ) |>
  mutate(across(where(is.numeric), ~ round(.x, 4))) |>
  kbl(caption = "Shapiro-Wilk normality test by channel") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")

Shapiro-Wilk normality test by channel
customer_type	n	W	p_value
Direct	200	0.6911	0.0000
Broker	231	0.9263	0.0000
Competitor	24	0.9363	0.1347

Code

kw <- kruskal.test(exchange_rate ~ customer_type, data = fx_usd_h)
print(kw)


    Kruskal-Wallis rank sum test

data:  exchange_rate by customer_type
Kruskal-Wallis chi-squared = 2.7786, df = 2, p-value = 0.2493

Code

n_total <- nrow(fx_usd_h)
k       <- length(unique(fx_usd_h$customer_type))
eta_sq  <- (kw$statistic - k + 1) / (n_total - k)
cat("\nEta-squared:", round(eta_sq, 4), "(small=0.01, medium=0.06, large=0.14)\n")


Eta-squared: 0.0017 (small=0.01, medium=0.06, large=0.14)
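The p-value and effect size above can be reproduced from the printed test statistic alone. This sketch copies H, the number of groups, and the USD sample size from the outputs above; because H is rounded to four decimals, the recomputed p-value may differ from the printed one in the last digit.

```r
# Values copied from the Kruskal-Wallis output and the Shapiro-Wilk table above
H <- 2.7786   # Kruskal-Wallis chi-squared statistic
k <- 3        # number of channels
n <- 455      # USD transactions (200 + 231 + 24)

# p-value: upper tail of a chi-squared distribution with k - 1 degrees of freedom
p_value <- pchisq(H, df = k - 1, lower.tail = FALSE)

# Eta-squared based on H, using the same formula as the code above
eta_sq <- (H - k + 1) / (n - k)

cat("p-value:", round(p_value, 4), "\n")
cat("Eta-squared:", round(eta_sq, 4), "\n")
```

A p-value of roughly 0.25 is well above the 0.05 threshold, and an eta-squared of about 0.0017 falls below even the "small" benchmark of 0.01.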

Code

dunn.test(fx_usd_h$exchange_rate, fx_usd_h$customer_type,
          method = "bonferroni", kw = FALSE, label = TRUE)

Code

ggplot(fx_usd_h, aes(x = customer_type, y = exchange_rate, fill = customer_type)) +
  geom_violin(alpha = 0.6, trim = FALSE) +
  geom_boxplot(width = 0.15, outlier.shape = NA, alpha = 0.8) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 4, fill = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "USD/NGN Rate Distribution by Customer Channel",
       subtitle = "White diamond = group mean",
       x = "Channel", y = "Exchange Rate (NGN)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

Fig 6 — Exchange Rate by Customer Channel

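Result: The Kruskal-Wallis test does not reject H0 (chi-squared = 2.78, df = 2, p = 0.249), and the effect size is negligible (eta-squared = 0.0017). In this sample there is no statistical evidence that USD/NGN rates differ by channel, so a Competitor-channel premium is not supported; with only 24 Competitor trades, the test also has limited power to detect a small premium. Because the omnibus test is not significant, the Dunn post-hoc comparisons should be read as descriptive only. Business implication: The firm should still formalise its inter-dealer pricing policy so that any channel-specific rate, or the absence of one, is intentional and documented for compliance purposes.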

6.2 Do Buy and Sell transactions differ in deal size?

H0: Median USD volume is the same for Buy and Sell transactions.
H1: Median USD volume differs between Buy and Sell.

Code

fx_usd |>
  dplyr::group_by(transaction_type) |>
  dplyr::summarise(
    n      = dplyr::n(),
    mean   = comma(round(mean(volume_foreign), 0)),
    median = comma(round(median(volume_foreign), 0)),
    sd     = comma(round(sd(volume_foreign), 0)),
    .groups = "drop"
  ) |>
  kbl(caption = "USD Volume Summary: Buy vs Sell") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")

USD Volume Summary: Buy vs Sell
transaction_type	n	mean	median	sd
Buy	281	249,291	100,000	405,651
Sell	174	364,038	102,494	806,911

Code

buy_vol  <- fx_usd |> dplyr::filter(transaction_type == "Buy")  |> dplyr::pull(volume_foreign)
sell_vol <- fx_usd |> dplyr::filter(transaction_type == "Sell") |> dplyr::pull(volume_foreign)

wt <- wilcox.test(buy_vol, sell_vol, alternative = "two.sided")
print(wt)


    Wilcoxon rank sum test with continuity correction

data:  buy_vol and sell_vol
W = 24166, p-value = 0.8366
alternative hypothesis: true location shift is not equal to 0

Code

r_rb <- rank_biserial(buy_vol, sell_vol)
print(r_rb)

r (rank biserial) |        95% CI
---------------------------------
-0.01             | [-0.12, 0.10]

Code

ggplot(fx_usd, aes(x = transaction_type, y = volume_foreign, fill = transaction_type)) +
  geom_boxplot(alpha = 0.7, outlier.shape = 21, outlier.size = 2) +
  scale_y_log10(labels = label_comma()) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  labs(title = "USD Volume: Buy vs Sell",
       x = NULL, y = "Volume (USD, log scale)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

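Result: The Wilcoxon rank-sum test finds no statistically significant difference in USD deal size between Buy and Sell transactions (W = 24166, p = 0.837), and the rank-biserial effect size is essentially zero (r = -0.01, 95% CI [-0.12, 0.10]). The medians are almost identical (USD 100,000 vs USD 102,494); the higher Sell mean (USD 364,038 vs USD 249,291) reflects a handful of very large Sell tickets. Business implication: Daily liquidity planning does not need to assume a systematic Buy/Sell ticket-size asymmetry, but settlement reserves should be sized to cover occasional multi-million-dollar Sell deals rather than the median deal.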
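The rank-biserial correlation reported above can be recovered directly from the Wilcoxon statistic, which makes its meaning easier to see. The sketch below uses the standard relationship r = 2W / (n1 × n2) - 1, with W and the group sizes copied from the outputs above; the variable names are illustrative.

```r
# Values copied from the Wilcoxon output and the Buy vs Sell summary table
W  <- 24166   # rank-sum statistic reported by wilcox.test(buy_vol, sell_vol)
n1 <- 281     # number of USD Buy deals
n2 <- 174     # number of USD Sell deals

# Share of all Buy/Sell pairs in which the Buy deal is the larger one
# (ties count as one half)
p_buy_larger <- W / (n1 * n2)

# Rank-biserial correlation: rescales that share from [0, 1] to [-1, 1]
r_rb_manual <- 2 * p_buy_larger - 1

cat("P(Buy > Sell):", round(p_buy_larger, 3), "\n")
cat("Rank-biserial r:", round(r_rb_manual, 3), "\n")
```

A Buy deal is larger than a randomly paired Sell deal just under half the time, which is why the effect size sits so close to zero.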

7. Correlation Analysis

Code

fx_corr <- fx_usd |>
  mutate(
    type_num    = as.integer(transaction_type),
    channel_num = as.integer(customer_type)
  ) |>
  dplyr::select(volume_foreign, exchange_rate, naira_value, type_num, channel_num)

cor_mat <- cor(fx_corr, method = "spearman")
colnames(cor_mat) <- rownames(cor_mat) <-
  c("Volume","Rate","NGN Value","Txn Type","Channel")

round(cor_mat, 3) |>
  kbl(caption = "Spearman Correlation Matrix") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")

Spearman Correlation Matrix
	Volume	Rate	NGN Value	Txn Type	Channel
Volume	1.000	0.071	0.999	0.010	0.194
Rate	0.071	1.000	0.099	0.266	0.059
NGN Value	0.999	0.099	1.000	0.017	0.196
Txn Type	0.010	0.266	0.017	1.000	0.300
Channel	0.194	0.059	0.196	0.300	1.000

Code

cor_df <- as.data.frame(as.table(cor_mat)) |>
  dplyr::rename(value = Freq) |>
  dplyr::filter(as.integer(Var1) >= as.integer(Var2))

ggplot(cor_df, aes(x = Var2, y = Var1, fill = value)) +
  geom_tile(colour = "white", linewidth = 0.5) +
  geom_text(aes(label = round(value, 2)), size = 4, colour = "black") +
  scale_fill_gradient2(
    low      = "#E53935",
    mid      = "white",
    high     = "#1E88E5",
    midpoint = 0,
    limits   = c(-1, 1),
    name     = "Spearman r"
  ) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0)) +
  labs(
    title = "Spearman Correlation Matrix — USD FX Transactions",
    x = NULL, y = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x      = element_text(angle = 30, hjust = 1),
    panel.grid       = element_blank()
  )

Code

pc <- pcor(fx_corr[, c("volume_foreign","exchange_rate","channel_num")],
           method = "spearman")
cat("Partial correlation (Volume ~ Rate | Channel):\n")

Partial correlation (Volume ~ Rate | Channel):

Code

cat("  r =", round(pc$estimate[1,2], 4),
    "  p =", round(pc$p.value[1,2], 4), "\n")

  r = 0.0609   p = 0.1956

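Key findings: Volume and NGN Value are almost perfectly rank-correlated (Spearman rho = 0.999), as expected because Naira value is volume multiplied by rate. Rate is only weakly related to Volume (rho = 0.071) and to Channel (rho = 0.059), so the matrix gives no support for a Competitor-channel rate premium. The strongest non-trivial associations are Txn Type with Channel (rho = 0.300) and Txn Type with Rate (rho = 0.266), the latter consistent with Sell rates sitting above Buy rates. The partial correlation between Volume and Rate, controlling for Channel, remains small and not significant (r = 0.061, p = 0.196). Channel is a nominal variable coded 1 to 3, so its rank correlations depend on the coding order and should be read with caution.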
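The partial correlation can be approximated from the three pairwise Spearman coefficients in the matrix above using the first-order partial correlation formula. This is a sketch under the assumption that the rounded matrix entries are accurate enough, so the result may differ from the printed value in the last digit; the significance test uses the usual t approximation with n - 3 degrees of freedom.

```r
# Pairwise Spearman coefficients copied from the correlation matrix above
r_vol_rate  <- 0.071   # Volume vs Rate
r_vol_chan  <- 0.194   # Volume vs Channel
r_rate_chan <- 0.059   # Rate vs Channel
n           <- 455     # USD transactions

# First-order partial correlation of Volume and Rate, controlling for Channel
r_partial <- (r_vol_rate - r_vol_chan * r_rate_chan) /
  sqrt((1 - r_vol_chan^2) * (1 - r_rate_chan^2))

# t statistic and two-sided p-value with n - 2 - 1 degrees of freedom
df_resid <- n - 3
t_stat   <- r_partial * sqrt(df_resid / (1 - r_partial^2))
p_value  <- 2 * pt(-abs(t_stat), df = df_resid)

cat("Partial r:", round(r_partial, 3), "\n")
cat("p-value:", round(p_value, 3), "\n")
```

Controlling for Channel barely changes the Volume and Rate correlation (0.071 before, about 0.061 after), which indicates that channel mix is not masking a relationship between deal size and rate.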

8. Regression Analysis

Code

fx_model <- fx |>
  mutate(
    log_naira  = log(naira_value),
    log_volume = log(volume_foreign),
    buy        = as.integer(transaction_type == "Buy"),
    broker     = as.integer(customer_type == "Broker"),
    competitor = as.integer(customer_type == "Competitor"),
    eur        = as.integer(currency == "EUR"),
    gbp        = as.integer(currency == "GBP")
  )

m1 <- lm(log_naira ~ log_volume + exchange_rate + buy +
            broker + competitor + eur + gbp,
          data = fx_model)

tidy(m1, conf.int = TRUE) |>
  mutate(across(where(is.numeric), ~ round(.x, 4))) |>
  kbl(caption = "OLS Regression: log(Naira Value)") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")

OLS Regression: log(Naira Value)
term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	6.2934	0.0039	1622.0973	0.0000	6.2858	6.3010
log_volume	1.0000	0.0000	29819.6724	0.0000	0.9999	1.0000
exchange_rate	0.0007	0.0000	244.8280	0.0000	0.0007	0.0007
buy	0.0000	0.0001	0.4343	0.6643	-0.0002	0.0003
broker	0.0001	0.0001	0.8897	0.3741	-0.0001	0.0003
competitor	0.0000	0.0002	0.0882	0.9298	-0.0004	0.0005
eur	-0.0032	0.0007	-4.5245	0.0000	-0.0045	-0.0018
gbp	-0.0287	0.0013	-21.3034	0.0000	-0.0313	-0.0260

Code

glance(m1) |>
  dplyr::select(r.squared, adj.r.squared, sigma, statistic, p.value, nobs) |>
  mutate(across(where(is.numeric), ~ round(.x, 4))) |>
  kbl(caption = "Model Fit Statistics") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")

Model Fit Statistics
r.squared	adj.r.squared	sigma	statistic	p.value	nobs
1	1	0.001	149255863	0	485

Code

par(mfrow = c(2,2))
plot(m1, which = 1:4, cex.id = 0.7)

Code

par(mfrow = c(1,1))

Code

tidy(m1, conf.int = TRUE) |>
  dplyr::filter(term != "(Intercept)") |>
  ggplot(aes(x = estimate, y = reorder(term, estimate))) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey50") +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
                 height = 0.3, colour = "#1E88E5") +
  geom_point(size = 3.5, colour = "#E53935") +
  labs(title    = "Regression Coefficients — log(Naira Value)",
       subtitle = "Horizontal bars are 95% confidence intervals",
       x = "Coefficient estimate", y = NULL) +
  theme_minimal(base_size = 13)

Fig 10 — Coefficient Plot with 95% Confidence Intervals

Interpretation for a non-technical manager:

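Predictor	Business meaning
log_volume	A 10% larger deal produces ~10% more Naira value (coefficient = 1.00)
exchange_rate	Each 1 NGN increase in the rate raises settlement value by about 0.07%
buy	No measurable difference between Buy and Sell deals once size and rate are held constant (p = 0.66)
broker	No statistically significant difference from Direct deals after controlling for size and rate (p = 0.37)
competitor	No statistically significant difference from Direct deals after controlling for size and rate (p = 0.93)
eur, gbp	The higher Naira price of EUR and GBP is already captured by exchange_rate; the small negative coefficients (about -0.3% and -2.8%) most likely correct for the linear rate term overshooting at those higher rate levels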
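The near-perfect fit reported above is what one would expect if naira_value is simply volume_foreign multiplied by exchange_rate, because log(naira) = log(volume) + log(rate) is then an identity. The sketch below illustrates this on simulated data; all values are hypothetical and are not drawn from the firm's records, and the distributions are assumptions chosen only to resemble the scale of the summary statistics.

```r
set.seed(42)

# Hypothetical deals: log-normal volumes and rates in a narrow band near 1,390
n      <- 500
volume <- exp(rnorm(n, mean = log(1e5), sd = 1.5))
rate   <- runif(n, min = 1360, max = 1420)

# Settlement value defined as an exact product, with no noise added
naira <- volume * rate

# Same functional form as m1, without the channel and currency dummies
fit <- lm(log(naira) ~ log(volume) + rate)

coef_vol  <- unname(coef(fit)["log(volume)"])
coef_rate <- unname(coef(fit)["rate"])
r2        <- summary(fit)$r.squared

cat("Slope on log(volume):", round(coef_vol, 4), "\n")
cat("Slope on rate:", signif(coef_rate, 3), "\n")
cat("R-squared:", round(r2, 6), "\n")
```

On the simulated data the slope on log(volume) is 1 and the slope on rate is close to 1 divided by the typical rate, which for a rate near 1,400 is roughly 0.0007, in line with the exchange_rate coefficient reported above. A model of this form therefore confirms data integrity more than it explains pricing; modelling the rate or spread itself would be needed to study channel effects.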

9. Integrated Findings

The five analytical techniques converge on a single narrative: the firm’s FX book is concentrated in USD and in a small number of very large tickets, but this sample shows no statistically detectable pricing difference across channels. EDA confirmed the dataset is complete and USD-dominated. Visualisation showed that Broker clients place the largest transactions. Hypothesis testing found no significant difference in USD/NGN rates across channels (p = 0.25) and no significant difference in deal size between Buy and Sell (p = 0.84). Correlation analysis showed that rate is at most weakly related to volume and channel. Regression analysis reproduced the accounting identity linking volume, rate, and Naira value (R² ≈ 1.00), with channel indicators that are not statistically significant.

Recommendation: The firm should formalise a three-tier pricing matrix (Direct, Broker, Competitor) with documented spread bands for each tier. Because current pricing appears uniform across channels, the bands would make that uniformity, or any future differentiation, an explicit policy choice, while enabling real-time flagging of out-of-band transactions and providing a defensible audit trail for CBN and internal compliance reviews.

10. Limitations & Further Work

Short time period: about 13 weeks only; a 12-month dataset would reveal seasonal patterns.
No customer IDs: impossible to compute customer lifetime value or concentration risk.
No cost-side data: without interbank purchase rates, actual spread cannot be computed.
Small EUR/GBP sub-samples: fewer than 20 observations per currency limits statistical power.
Regression specification: because Naira value is the product of volume and rate, the fitted model largely recovers an accounting identity. Modelling the rate or spread directly, for example in a two-stage model using the CBN reference rate as instrument, would produce more informative and cleaner causal estimates.

References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making. Lagos Business School / markanalytics.online. https://markanalytics.online

Ezekiel, O. A. (2026). FX transaction records, February–May 2026 [Dataset]. Collected from internal deal-management system, Lagos, Nigeria.

R Core Team. (2024). R: A language and environment for statistical computing (Version 4.4). R Foundation for Statistical Computing. https://www.R-project.org/

Wickham, H. et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.

Appendix: AI Usage Statement

Claude (Anthropic, 2025) was used to assist with structuring the Quarto document template and generating code scaffolds. All analytical decisions — the choice of Case Study 1, the selection of Kruskal-Wallis over ANOVA, the decision to log-transform variables in the regression, the interpretation of channel-level pricing differences, and the integrated recommendation — were made independently by the author based on professional judgement and familiarity with the firm’s FX operations.