FX Transaction Analytics: Exploratory & Inferential Analytics of a Nigerian FX Brokerage

Author

Olaniyi Ayodele Ezekiel

Published

May 26, 2026

1. Executive Summary

This capstone applies exploratory and inferential analytics to real foreign exchange (FX) transaction data extracted from the deal-management system of an FX brokerage firm operating in Lagos, Nigeria. The dataset covers 485 transactions executed between 27 February and 25 May 2026, spanning Buy and Sell legs across USD, EUR, and GBP, originating from three customer channels: Direct, Broker, and Competitor.

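The analysis shows that the firm’s FX book is USD-dominated (about 94% of transactions) and that Broker-channel clients account for both the largest individual deals and the most transactions overall (240 of 485), while Direct clients dominate the Buy side. USD/NGN exchange rates do not differ significantly across customer channels (Kruskal-Wallis p = 0.25, eta-squared ≈ 0.002), and Buy and Sell deal sizes are statistically indistinguishable (Wilcoxon p = 0.84). A log-linear regression reproduces Naira deal value almost exactly (R-squared rounds to 1), because Naira value is in effect the product of foreign-currency volume and exchange rate; customer channel adds no explanatory power once volume and rate are included. These findings suggest that any tiered-pricing strategy would have to be introduced as an explicit policy, since it is not yet visible in the data, alongside a channel-specific liquidity management policy that the firm can implement in the near term.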


2. Professional Disclosure

Name: Olaniyi Ayodele Ezekiel
Job Title: Finance Manager
Organisation Type: FX Brokerage Services (Lagos, Nigeria)

As Finance Manager at an FX brokerage firm, I am responsible for monitoring deal flow, managing liquidity positions, and ensuring that pricing and spread policies are applied consistently across customer channels. The five analytical techniques chosen for this study map directly to my day-to-day responsibilities:

  1. Exploratory Data Analysis (EDA): Before any treasury decision, I review the distribution of deal sizes, identify anomalous transactions, and check for data quality issues in the deal-management system. EDA formalises this daily process.

  2. Data Visualisation: Communicating FX exposure and deal-flow patterns to senior management and risk committees requires clear, story-driven charts. The Grammar of Graphics approach mirrors the dashboards I prepare for weekly treasury meetings.

  3. Hypothesis Testing: The firm periodically questions whether pricing differs across customer channels or deal types. Formal hypothesis tests allow me to move beyond intuition and give statistically defensible answers to the CFO.

  4. Correlation Analysis: Understanding whether higher exchange rates are associated with larger deal volumes is directly relevant to spread management and liquidity forecasting.

  5. Linear Regression: Modelling Naira settlement value as a function of volume, rate, currency, and customer type gives me a tool for revenue estimation and scenario analysis.


3. Data Collection & Sampling

Source: Proprietary deal-management system of the organisation.
Collection method: Direct export of completed transaction records from the internal system in CSV format.
Sampling frame: All executed FX transactions recorded from 27 February 2026 to 25 May 2026.
Sample size: 485 transactions (301 Buy legs, 184 Sell legs).
Time period: 88 calendar days (about 12.5 weeks).
Variables: 8 fields — order_reference, transaction_date, transaction_type, currency, volume_foreign, exchange_rate, naira_value, customer_type.
Ethical notes: All data is internal company data. Customer names and counterparty identifiers were not included in the extract. No external consent was required.
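
The sampling-frame figures can be cross-checked against the outputs printed in Section 4. The sketch below is a minimal arithmetic check in base R: the two dates are copied from the "Date range" output and the counts from the "Transaction count by type, currency, and channel" table. The variable names are illustrative and not part of the original analysis.

```r
# Dates copied from the "Date range" output in Section 4.
first_deal <- as.Date("2026-02-27")
last_deal  <- as.Date("2026-05-25")

# Inclusive span: both the first and the last day count.
span_days  <- as.integer(last_deal - first_deal) + 1L
span_weeks <- span_days / 7

# Cell counts copied from the count table, in table order
# (EUR Direct, EUR Broker, GBP Direct, USD Direct, USD Broker, USD Competitor).
buy_n  <- c(3, 4, 13, 163, 97, 21)
sell_n <- c(1, 5, 4, 37, 134, 3)

cat("Calendar days:", span_days, "\n")
cat("Calendar weeks:", round(span_weeks, 1), "\n")
cat("Buy legs:", sum(buy_n), " Sell legs:", sum(sell_n),
    " Total:", sum(buy_n) + sum(sell_n), "\n")
```

Running a check like this whenever the extract is refreshed keeps the narrative figures in step with the data actually loaded.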


4. Data Description

Code
library(tidyverse)
library(lubridate)
library(knitr)
library(kableExtra)
library(scales)
library(car)
library(effectsize)
library(broom)
library(ppcor)
library(dunn.test)

fx <- read_csv(
  "C:/Users/Ayodele Olaniyi/Downloads/FX_Capstone Folder/FX_Transactions_Cleaned.csv",
  show_col_types = FALSE
) |>
  mutate(
    transaction_date = dmy(transaction_date),
    transaction_type = factor(transaction_type, levels = c("Buy","Sell")),
    currency         = factor(currency),
    customer_type    = factor(customer_type, levels = c("Direct","Broker","Competitor")),
    week             = floor_date(transaction_date, "week"),
    naira_bn         = naira_value / 1e9,
    vol_k            = volume_foreign / 1000
  )

cat("Total transactions:", nrow(fx), "\n")
Total transactions: 485 
Code
cat("Date range:", as.character(min(fx$transaction_date)),
    "to", as.character(max(fx$transaction_date)), "\n")
Date range: 2026-02-27 to 2026-05-25 
Code
cat("Columns:", paste(names(fx), collapse=", "), "\n")
Columns: order_reference, transaction_date, transaction_type, currency, volume_foreign, exchange_rate, naira_value, customer_type, week, naira_bn, vol_k 
Code
fx |>
  dplyr::count(transaction_type, currency, customer_type) |>
  kbl(caption = "Transaction count by type, currency, and channel") |>
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)
Transaction count by type, currency, and channel
transaction_type currency customer_type n
Buy EUR Direct 3
Buy EUR Broker 4
Buy GBP Direct 13
Buy USD Direct 163
Buy USD Broker 97
Buy USD Competitor 21
Sell EUR Direct 1
Sell EUR Broker 5
Sell GBP Direct 4
Sell USD Direct 37
Sell USD Broker 134
Sell USD Competitor 3
Code
fx |>
  dplyr::select(volume_foreign, exchange_rate, naira_value) |>
  summary() |>
  print()
 volume_foreign    exchange_rate   naira_value       
 Min.   :    300   Min.   :1360   Min.   :4.095e+05  
 1st Qu.:  40000   1st Qu.:1380   1st Qu.:5.500e+07  
 Median : 100000   Median :1389   Median :1.383e+08  
 Mean   : 278582   Mean   :1414   Mean   :3.885e+08  
 3rd Qu.: 300000   3rd Qu.:1400   3rd Qu.:4.140e+08  
 Max.   :7000000   Max.   :1915   Max.   :9.926e+09  
Code
cat("Missing values per column:\n")
Missing values per column:
Code
colSums(is.na(fx)) |> print()
 order_reference transaction_date transaction_type         currency 
               0                0                0                0 
  volume_foreign    exchange_rate      naira_value    customer_type 
               0                0                0                0 
            week         naira_bn            vol_k 
               0                0                0 
Code
q   <- quantile(fx$volume_foreign, c(0.25, 0.75))
iqr <- q[2] - q[1]
outliers <- fx |> dplyr::filter(volume_foreign > q[2] + 3 * iqr)
cat("\nExtreme-volume outliers (retained):", nrow(outliers), "\n")

Extreme-volume outliers (retained): 21 
Code
outliers |>
  dplyr::select(order_reference, transaction_date, transaction_type,
         currency, volume_foreign, customer_type) |>
  kbl(caption = "Extreme-volume transactions") |>
  kable_styling(bootstrap_options = c("striped","hover"), full_width = FALSE)
Extreme-volume transactions
order_reference transaction_date transaction_type currency volume_foreign customer_type
P/2026/0784 2026-05-13 Buy USD 2000000 Broker
P/2026/0778 2026-05-11 Buy USD 2200000 Broker
P/2026/0742 2026-05-07 Buy USD 2530000 Competitor
P/2026/0675 2026-04-28 Buy USD 1450000 Direct
P/2026/0668 2026-04-24 Buy USD 2050000 Direct
P/2026/0648 2026-04-22 Buy USD 2420000 Direct
P/2026/0622 2026-04-20 Buy USD 2000000 Direct
P/2026/0511 2026-03-27 Buy USD 2000000 Direct
P/2026/0478 2026-03-23 Buy USD 1100000 Competitor
P/2026/0429 2026-03-13 Buy USD 2000000 Direct
P/2026/0354 2026-03-04 Buy USD 1300000 Direct
P/2026/0342 2026-03-02 Buy USD 1164218 Competitor
S/2026/0382 2026-03-17 Sell USD 7000000 Broker
S/2026/0343 2026-03-06 Sell USD 3000000 Broker
S/2026/0550 2026-04-28 Sell USD 2000000 Broker
S/2026/0494 2026-04-17 Sell USD 2500000 Broker
S/2026/0619 2026-05-13 Sell USD 4000000 Broker
S/2026/0625 2026-05-14 Sell USD 3000000 Broker
S/2026/0594 2026-05-06 Sell USD 1946000 Broker
S/2026/0598 2026-05-07 Sell USD 3990000 Broker
S/2026/0573 2026-05-04 Sell USD 2500000 Broker

Data quality findings:

  1. No missing values — the dataset is complete across all 8 variables.
  2. Extreme-volume outliers exist (21 deals above the Q3 + 3 × IQR fence, the largest a single USD 7 million Sell) but are legitimate large institutional trades and have been retained.
  3. Three currencies are present: USD (~93.8% of transactions), GBP (~3.5%), and EUR (~2.7%).
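
As a worked example of the outlier rule and the currency mix, the sketch below recomputes the upper fence and the per-currency shares in base R. The quartiles are the values printed by summary() (they may be rounded for display), and the counts are summed from the count table. The shares are by number of transactions, not by value, and the variable names are illustrative.

```r
# Quartiles copied from the summary() output for volume_foreign.
q1  <- 40000
q3  <- 300000
iqr <- q3 - q1

# Same rule as the filter() call above: flag deals above Q3 + 3 * IQR.
upper_fence <- q3 + 3 * iqr

# Transactions per currency, summed from the count table.
n_by_currency <- c(USD = 455, EUR = 13, GBP = 17)
share_pct <- round(100 * n_by_currency / sum(n_by_currency), 1)

cat("Upper fence:", format(upper_fence, big.mark = ","), "\n")
print(share_pct)
```

The smallest deal in the extreme-volume table (1,100,000) sits just above this fence, which is consistent with the rule as coded.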

5. Data Visualisation

Code
fx_usd <- fx |> dplyr::filter(currency == "USD")

ggplot(fx_usd, aes(x = volume_foreign)) +
  geom_histogram(aes(fill = transaction_type), bins = 40,
                 alpha = 0.8, position = "identity") +
  scale_x_log10(labels = label_comma()) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  labs(
    title    = "USD Deal Volume Distribution (log scale)",
    subtitle = "Most deals cluster between USD 10,000 and USD 500,000",
    x = "Deal Volume (USD, log scale)", y = "Count", fill = "Type"
  ) +
  theme_minimal(base_size = 13)

Fig 1 — USD Deal Volume Distribution
Code
ggplot(fx_usd, aes(x = customer_type, y = volume_foreign, fill = customer_type)) +
  geom_boxplot(outlier.shape = 21, outlier.size = 2, alpha = 0.75) +
  geom_jitter(width = 0.15, alpha = 0.3, size = 1.2) +
  scale_y_log10(labels = label_comma()) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title    = "USD Deal Volume by Customer Channel",
    subtitle = "Broker channel drives the largest individual deal sizes",
    x = "Customer Channel", y = "Volume (USD, log scale)"
  ) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

Fig 2 — Deal Volume by Customer Channel
Code
fx_usd |>
  dplyr::group_by(transaction_date, transaction_type) |>
  dplyr::summarise(avg_rate = mean(exchange_rate), .groups = "drop") |>
  ggplot(aes(x = transaction_date, y = avg_rate,
             colour = transaction_type, group = transaction_type)) +
  geom_line(linewidth = 1) +
  geom_point(size = 2, alpha = 0.7) +
  scale_colour_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  scale_x_date(date_labels = "%d %b", date_breaks = "1 week") +
  labs(
    title    = "Daily Average USD/NGN Exchange Rate",
    subtitle = "Sell rates consistently exceed Buy rates — the firm's spread is visible",
    x = NULL, y = "NGN per USD", colour = "Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

Fig 3 — Daily Average USD/NGN Exchange Rate Over Time
Code
fx |>
  dplyr::group_by(week, transaction_type) |>
  dplyr::summarise(total_ngn = sum(naira_value) / 1e9, .groups = "drop") |>
  ggplot(aes(x = week, y = total_ngn, fill = transaction_type)) +
  geom_col(position = "dodge", alpha = 0.85) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  scale_y_continuous(labels = label_comma(suffix = " B")) +
  scale_x_date(date_labels = "%d %b", date_breaks = "1 week") +
  labs(
    title = "Weekly NGN Settlement Volume",
    subtitle = "Peak Sell week: 5 May 2026",
    x = NULL, y = "NGN (billions)", fill = "Type"
  ) +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

Fig 4 — Weekly Naira Settlement Volume
Code
fx |>
  dplyr::group_by(customer_type) |>
  dplyr::summarise(total = sum(naira_value)) |>
  mutate(
    pct   = total / sum(total),
    label = paste0(customer_type, "\n", percent(pct, accuracy = 1))
  ) |>
  ggplot(aes(x = "", y = pct, fill = customer_type)) +
  geom_col(width = 1, colour = "white") +
  coord_polar(theta = "y") +
  geom_text(aes(label = label),
            position = position_stack(vjust = 0.5),
            size = 4.5, fontface = "bold") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "NGN Value Share by Customer Channel",
       subtitle = "Broker channel accounts for over 60% of total settlement value") +
  theme_void() +
  theme(legend.position = "none",
        plot.title    = element_text(size = 14, face = "bold", hjust = 0.5),
        plot.subtitle = element_text(size = 11, hjust = 0.5))

Fig 5 — Share of Total Naira Value by Customer Channel

Visual narrative: Deal sizes span more than four orders of magnitude, from a minimum of 300 units of foreign currency to a single USD 7 million deal. Broker-channel clients place the largest tickets (Fig 2). The firm’s buy-sell spread is visible in Fig 3. Weekly settlement volume peaked in the week of 5 May 2026 (Fig 4). The Broker channel dominates by value at ~60% of NGN settlement (Fig 5).


6. Hypothesis Testing

6.1 Do exchange rates differ across customer channels?

H0: USD/NGN exchange rates follow the same distribution across Direct, Broker, and Competitor channels.
H1: At least one channel has rates that tend to be higher or lower than the others.

Code
fx_usd_h <- fx_usd |>
  dplyr::filter(customer_type %in% c("Direct","Broker","Competitor"))

fx_usd_h |>
  dplyr::group_by(customer_type) |>
  dplyr::summarise(
    n       = dplyr::n(),
    W       = shapiro.test(exchange_rate)$statistic,
    p_value = shapiro.test(exchange_rate)$p.value,
    .groups = "drop"
  ) |>
  mutate(across(where(is.numeric), ~ round(.x, 4))) |>
  kbl(caption = "Shapiro-Wilk normality test by channel") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")
Shapiro-Wilk normality test by channel
customer_type n W p_value
Direct 200 0.6911 0.0000
Broker 231 0.9263 0.0000
Competitor 24 0.9363 0.1347
Code
kw <- kruskal.test(exchange_rate ~ customer_type, data = fx_usd_h)
print(kw)

    Kruskal-Wallis rank sum test

data:  exchange_rate by customer_type
Kruskal-Wallis chi-squared = 2.7786, df = 2, p-value = 0.2493
Code
n_total <- nrow(fx_usd_h)
k       <- length(unique(fx_usd_h$customer_type))
eta_sq  <- (kw$statistic - k + 1) / (n_total - k)
cat("\nEta-squared:", round(eta_sq, 4), "(small=0.01, medium=0.06, large=0.14)\n")

Eta-squared: 0.0017 (small=0.01, medium=0.06, large=0.14)
Code
dunn.test(fx_usd_h$exchange_rate, fx_usd_h$customer_type,
          method = "bonferroni", kw = FALSE, label = TRUE)
Code
ggplot(fx_usd_h, aes(x = customer_type, y = exchange_rate, fill = customer_type)) +
  geom_violin(alpha = 0.6, trim = FALSE) +
  geom_boxplot(width = 0.15, outlier.shape = NA, alpha = 0.8) +
  stat_summary(fun = mean, geom = "point", shape = 23, size = 4, fill = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "USD/NGN Rate Distribution by Customer Channel",
       subtitle = "White diamond = group mean",
       x = "Channel", y = "Exchange Rate (NGN)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

Fig 6 — Exchange Rate by Customer Channel

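Result: Shapiro-Wilk rejects normality for the Direct and Broker channels (p < 0.0001), so the non-parametric Kruskal-Wallis test is used in place of ANOVA. It finds no statistically significant difference in USD/NGN rates across channels (chi-squared = 2.78, df = 2, p = 0.249), and the effect size is negligible (eta-squared = 0.0017). Because the omnibus test is not significant, the Dunn post-hoc comparisons are not interpreted, and the data do not show a Competitor-channel premium. Business implication: Pricing appears to be applied consistently across channels; if the firm intends to charge a Competitor-channel premium, it should formalise that in its inter-dealer pricing policy and document it for compliance purposes.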
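
The reported p-value and effect size can be reproduced from the printed test statistic alone. The sketch below is a base R check that assumes the usual chi-squared approximation for the Kruskal-Wallis statistic. The inputs are copied from the test output and the Shapiro-Wilk table, and the variable names are illustrative.

```r
# Figures copied from the Kruskal-Wallis output and the per-channel n column.
H <- 2.7786               # Kruskal-Wallis chi-squared
k <- 3                    # channels: Direct, Broker, Competitor
n <- 200 + 231 + 24       # USD transactions across the three channels

# Under H0, H is approximately chi-squared with k - 1 degrees of freedom.
p_value <- pchisq(H, df = k - 1, lower.tail = FALSE)

# Rank-based eta-squared, using the same formula as the code above.
eta_sq <- (H - k + 1) / (n - k)

cat("p-value:", round(p_value, 4), "\n")
cat("Eta-squared:", round(eta_sq, 4), "\n")
```

Both values fall well short of the conventional thresholds (p < 0.05, eta-squared of at least 0.01 for a small effect).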


6.2 Do Buy and Sell transactions differ in deal size?

H0: Median USD volume is the same for Buy and Sell transactions.
H1: Median USD volume differs between Buy and Sell.

Code
fx_usd |>
  dplyr::group_by(transaction_type) |>
  dplyr::summarise(
    n      = dplyr::n(),
    mean   = comma(round(mean(volume_foreign), 0)),
    median = comma(round(median(volume_foreign), 0)),
    sd     = comma(round(sd(volume_foreign), 0)),
    .groups = "drop"
  ) |>
  kbl(caption = "USD Volume Summary: Buy vs Sell") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")
USD Volume Summary: Buy vs Sell
transaction_type n mean median sd
Buy 281 249,291 100,000 405,651
Sell 174 364,038 102,494 806,911
Code
buy_vol  <- fx_usd |> dplyr::filter(transaction_type == "Buy")  |> dplyr::pull(volume_foreign)
sell_vol <- fx_usd |> dplyr::filter(transaction_type == "Sell") |> dplyr::pull(volume_foreign)

wt <- wilcox.test(buy_vol, sell_vol, alternative = "two.sided")
print(wt)

    Wilcoxon rank sum test with continuity correction

data:  buy_vol and sell_vol
W = 24166, p-value = 0.8366
alternative hypothesis: true location shift is not equal to 0
Code
r_rb <- rank_biserial(buy_vol, sell_vol)
print(r_rb)
r (rank biserial) |        95% CI
---------------------------------
-0.01             | [-0.12, 0.10]
Code
ggplot(fx_usd, aes(x = transaction_type, y = volume_foreign, fill = transaction_type)) +
  geom_boxplot(alpha = 0.7, outlier.shape = 21, outlier.size = 2) +
  scale_y_log10(labels = label_comma()) +
  scale_fill_manual(values = c("Buy" = "#2196F3", "Sell" = "#FF5722")) +
  labs(title = "USD Volume: Buy vs Sell",
       x = NULL, y = "Volume (USD, log scale)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")

Fig 7 — USD Deal Volume: Buy vs Sell

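Result: The Wilcoxon rank-sum test finds no statistically significant difference in USD deal size between Buy and Sell transactions (W = 24166, p = 0.837), and the rank-biserial effect size is negligible (r = -0.01, 95% CI [-0.12, 0.10]). The medians are almost identical (USD 100,000 for Buy, USD 102,494 for Sell); the higher Sell mean (USD 364,038 versus USD 249,291) reflects a small number of very large Broker sales. Business implication: Daily liquidity planning does not need separate typical-ticket assumptions for Buy and Sell legs, but the settlement reserve should allow for occasional multi-million-dollar Sell deals.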
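
The rank-biserial effect size follows directly from the Wilcoxon statistic and the two sample sizes. The sketch below uses the standard relationship r = 2W / (n1 × n2) - 1, where W is the statistic R reports for the first sample. The inputs are copied from the outputs above, and the variable names are illustrative.

```r
# Figures copied from the Wilcoxon output and the Buy vs Sell summary table.
W  <- 24166   # statistic reported for the first sample (Buy)
n1 <- 281     # USD Buy transactions
n2 <- 174     # USD Sell transactions

# Share of Buy/Sell pairs in which the Buy deal is larger (ties count half).
p_buy_larger <- W / (n1 * n2)

# Rank-biserial correlation: 0 means neither side tends to be larger.
r_rb <- 2 * p_buy_larger - 1

cat("P(Buy larger than Sell):", round(p_buy_larger, 3), "\n")
cat("Rank-biserial r:", round(r_rb, 3), "\n")
```

A value this close to zero means a randomly chosen Buy deal is larger than a randomly chosen Sell deal almost exactly half the time.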


7. Correlation Analysis

Code
fx_corr <- fx_usd |>
  mutate(
    type_num    = as.integer(transaction_type),
    channel_num = as.integer(customer_type)
  ) |>
  dplyr::select(volume_foreign, exchange_rate, naira_value, type_num, channel_num)

cor_mat <- cor(fx_corr, method = "spearman")
colnames(cor_mat) <- rownames(cor_mat) <-
  c("Volume","Rate","NGN Value","Txn Type","Channel")

round(cor_mat, 3) |>
  kbl(caption = "Spearman Correlation Matrix") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")
Spearman Correlation Matrix
Volume Rate NGN Value Txn Type Channel
Volume 1.000 0.071 0.999 0.010 0.194
Rate 0.071 1.000 0.099 0.266 0.059
NGN Value 0.999 0.099 1.000 0.017 0.196
Txn Type 0.010 0.266 0.017 1.000 0.300
Channel 0.194 0.059 0.196 0.300 1.000
Code
cor_df <- as.data.frame(as.table(cor_mat)) |>
  dplyr::rename(value = Freq) |>
  dplyr::filter(as.integer(Var1) >= as.integer(Var2))

ggplot(cor_df, aes(x = Var2, y = Var1, fill = value)) +
  geom_tile(colour = "white", linewidth = 0.5) +
  geom_text(aes(label = round(value, 2)), size = 4, colour = "black") +
  scale_fill_gradient2(
    low      = "#E53935",
    mid      = "white",
    high     = "#1E88E5",
    midpoint = 0,
    limits   = c(-1, 1),
    name     = "Spearman r"
  ) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0)) +
  labs(
    title = "Spearman Correlation Matrix — USD FX Transactions",
    x = NULL, y = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x      = element_text(angle = 30, hjust = 1),
    panel.grid       = element_blank()
  )

Fig 8 — Spearman Correlation Heatmap
Code
pc <- pcor(fx_corr[, c("volume_foreign","exchange_rate","channel_num")],
           method = "spearman")
cat("Partial correlation (Volume ~ Rate | Channel):\n")
Partial correlation (Volume ~ Rate | Channel):
Code
cat("  r =", round(pc$estimate[1,2], 4),
    "  p =", round(pc$p.value[1,2], 4), "\n")
  r = 0.0609   p = 0.1956 

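Key findings: Volume and NGN Value are near-perfectly correlated (Spearman rho = 0.999), as expected when NGN value is volume multiplied by rate. Volume and Rate are essentially uncorrelated (rho = 0.071), and this remains true after controlling for Channel (partial rho = 0.061, p = 0.196). Rate shows a modest positive association with transaction type (rho = 0.266), consistent with Sell rates sitting above Buy rates, but almost none with Channel (rho = 0.059), so the matrix gives no evidence that Competitor trades attract higher rates. Channel is a nominal variable coded 1 to 3, so its rank correlations depend on the level ordering (Direct, Broker, Competitor) and should be read as indicative only.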
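
The partial correlation can be approximated by hand from the three pairwise coefficients in the matrix, using the standard first-order partial correlation formula. The sketch below is an arithmetic check in base R, not a replacement for pcor(). Because the inputs are rounded to three decimals, the result can differ from the printed value in the fourth decimal. The variable names are illustrative.

```r
# Spearman coefficients copied from the correlation matrix above.
r_vol_rate  <- 0.071   # Volume vs Rate
r_vol_chan  <- 0.194   # Volume vs Channel
r_rate_chan <- 0.059   # Rate vs Channel

# First-order partial correlation of Volume and Rate, holding Channel fixed.
partial_r <- (r_vol_rate - r_vol_chan * r_rate_chan) /
  sqrt((1 - r_vol_chan^2) * (1 - r_rate_chan^2))

cat("Partial rho (Volume, Rate | Channel):", round(partial_r, 4), "\n")
```

Controlling for Channel barely moves the coefficient, because Channel is only weakly related to either Volume or Rate.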


8. Regression Analysis

Code
fx_model <- fx |>
  mutate(
    log_naira  = log(naira_value),
    log_volume = log(volume_foreign),
    buy        = as.integer(transaction_type == "Buy"),
    broker     = as.integer(customer_type == "Broker"),
    competitor = as.integer(customer_type == "Competitor"),
    eur        = as.integer(currency == "EUR"),
    gbp        = as.integer(currency == "GBP")
  )

m1 <- lm(log_naira ~ log_volume + exchange_rate + buy +
            broker + competitor + eur + gbp,
          data = fx_model)

tidy(m1, conf.int = TRUE) |>
  mutate(across(where(is.numeric), ~ round(.x, 4))) |>
  kbl(caption = "OLS Regression: log(Naira Value)") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")
OLS Regression: log(Naira Value)
term estimate std.error statistic p.value conf.low conf.high
(Intercept) 6.2934 0.0039 1622.0973 0.0000 6.2858 6.3010
log_volume 1.0000 0.0000 29819.6724 0.0000 0.9999 1.0000
exchange_rate 0.0007 0.0000 244.8280 0.0000 0.0007 0.0007
buy 0.0000 0.0001 0.4343 0.6643 -0.0002 0.0003
broker 0.0001 0.0001 0.8897 0.3741 -0.0001 0.0003
competitor 0.0000 0.0002 0.0882 0.9298 -0.0004 0.0005
eur -0.0032 0.0007 -4.5245 0.0000 -0.0045 -0.0018
gbp -0.0287 0.0013 -21.3034 0.0000 -0.0313 -0.0260
Code
glance(m1) |>
  dplyr::select(r.squared, adj.r.squared, sigma, statistic, p.value, nobs) |>
  mutate(across(where(is.numeric), ~ round(.x, 4))) |>
  kbl(caption = "Model Fit Statistics") |>
  kable_styling(full_width = FALSE, bootstrap_options = "striped")
Model Fit Statistics
r.squared adj.r.squared sigma statistic p.value nobs
1 1 0.001 149255863 0 485
Code
par(mfrow = c(2,2))
plot(m1, which = 1:4, cex.id = 0.7)

Fig 9 — Regression Diagnostic Plots
Code
par(mfrow = c(1,1))
Code
tidy(m1, conf.int = TRUE) |>
  dplyr::filter(term != "(Intercept)") |>
  ggplot(aes(x = estimate, y = reorder(term, estimate))) +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey50") +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high),
                 height = 0.3, colour = "#1E88E5") +
  geom_point(size = 3.5, colour = "#E53935") +
  labs(title    = "Regression Coefficients — log(Naira Value)",
       subtitle = "Horizontal bars are 95% confidence intervals",
       x = "Coefficient estimate", y = NULL) +
  theme_minimal(base_size = 13)

Fig 10 — Coefficient Plot with 95% Confidence Intervals

Interpretation for a non-technical manager:

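Predictor Business meaning
log_volume A 10% larger deal produces ~10% more Naira value (coefficient = 1.00)
exchange_rate Each 1 NGN increase in the rate raises settlement value by about 0.07%
buy Buy and Sell deals of the same size and rate settle for the same Naira value (p = 0.66)
broker, competitor Channel has no effect on settlement value once size and rate are known (p = 0.37 and p = 0.93)
eur, gbp Small negative adjustments (about -0.3% and -2.8%); the higher Naira value per unit of EUR and GBP is already carried by the exchange_rate term

The near-perfect fit (R-squared rounds to 1, residual sigma = 0.001) indicates that naira_value is in effect volume_foreign multiplied by exchange_rate, so the model describes how settlement value is calculated rather than predicting it.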
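
One way to see why the fit statistics report an R-squared of 1 is to fit the same kind of model to synthetic data in which settlement value is, by construction, volume times rate. The sketch below uses simulated values only: nothing in it comes from the firm's records, and the distributions and variable names are assumptions chosen for illustration.

```r
set.seed(42)  # synthetic data only; none of these values come from the firm

n      <- 200
volume <- exp(rnorm(n, mean = log(1e5), sd = 1.5))  # hypothetical deal sizes
rate   <- runif(n, min = 1360, max = 1420)          # hypothetical NGN per USD
naira  <- volume * rate                             # value as an accounting identity

# Same structure as m1: log value on log volume and the rate in levels.
fit_levels <- lm(log(naira) ~ log(volume) + rate)

# Logging the rate as well turns the identity into an exact linear model.
fit_logs <- lm(log(naira) ~ log(volume) + log(rate))

# R-squared computed by hand for the levels model.
y <- log(naira)
r2_levels <- 1 - sum(residuals(fit_levels)^2) / sum((y - mean(y))^2)

print(round(coef(fit_levels), 4))
print(round(coef(fit_logs), 4))
cat("R-squared (rate in levels):", round(r2_levels, 6), "\n")
```

In this setup the log_volume coefficient is 1 and the rate coefficient is close to 1 divided by the typical rate, the same pattern as in the regression table. That pattern is what an identity produces, so it should not be read as evidence of predictive power.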

9. Integrated Findings

The five analytical techniques converge on a single actionable narrative: the firm’s FX book is USD-dominated, led by the Broker channel, and priced consistently across channels. EDA confirmed the dataset is complete and USD-dominated. Visualisation showed that Broker clients place the largest transactions and the most deals overall, while Direct clients dominate the Buy side. Hypothesis testing found no significant difference in exchange rates across channels and no significant difference in deal size between Buy and Sell legs. Correlation analysis showed that rate is essentially unrelated to deal size, with or without controlling for channel. Regression analysis confirmed that Naira settlement value is determined by volume and rate, and that channel type is not a statistically significant predictor once these are included.

Recommendation: Because the data show no systematic channel-level pricing differences, any differentiation the firm wants must be introduced deliberately. The firm should formalise a three-tier pricing matrix (Direct, Broker, Competitor) with documented spread bands for each tier, enabling real-time flagging of out-of-band transactions and providing a defensible audit trail for CBN and internal compliance reviews.


10. Limitations & Further Work

  1. Short time period: about 12 weeks only; a 12-month dataset would reveal seasonal patterns.
  2. No customer IDs: customer lifetime value and concentration risk cannot be computed.
  3. No cost-side data: without interbank purchase rates, the actual spread cannot be computed.
  4. Small EUR/GBP sub-samples: with only 13 EUR and 17 GBP transactions, statistical power is limited.
  5. Regression endogeneity: a two-stage model using the CBN reference rate as an instrument would produce cleaner causal estimates.

References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making. Lagos Business School / markanalytics.online. https://markanalytics.online

Ezekiel, O. A. (2026). FX transaction records, February–May 2026 [Dataset]. Collected from internal deal-management system, Lagos, Nigeria.

R Core Team. (2024). R: A language and environment for statistical computing (Version 4.4). R Foundation for Statistical Computing. https://www.R-project.org/

Wickham, H. et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer.


Appendix: AI Usage Statement

Claude (Anthropic, 2025) was used to assist with structuring the Quarto document template and generating code scaffolds. All analytical decisions — the choice of Case Study 1, the selection of Kruskal-Wallis over ANOVA, the decision to log-transform variables in the regression, the interpretation of channel-level pricing differences, and the integrated recommendation — were made independently by the author based on professional judgement and familiarity with the firm’s FX operations.