Analysis of Treasury Foreign Exchange Sales Transactions (Q1 2026): Evaluation of Revenue Generation, Pricing Efficiency, and Customer Behaviour

Author

Kayode Abobarin

Published

May 19, 2026

EMBA-31, Lagos Business School. Data Analytics 1 – Take-Home Examination (Case Study 1)

1 Executive Summary

This report analyses treasury foreign exchange (FX) sales transaction data from a Nigerian non-bank financial institution, focusing on USD sales to retail and corporate customers between 1 January 2026 and 31 March 2026 (Q1 2026). The primary dataset comprises 147 observations, including variables such as transaction date, customer type, sales channel, weighted average cost, volume sold (USD), unit price, spread (₦), and income (₦).

Key Findings:

  • FX transaction volumes and income exhibit significant right-skewness, with corporate customers and a small number of large transactions accounting for the majority of Q1 revenue.
  • Average spreads differ significantly between retail and corporate customers (independent samples t-test, p < 0.05), with corporate clients generally receiving tighter, yet profitable, spreads.
  • Transaction channels have a substantial impact on pricing and profitability (ANOVA, p< 0.05).
  • A strong positive correlation exists between transaction volume and spread income generated.
  • Weighted average USD acquisition cost and unit selling price demonstrate a close, though variable, relationship, with occasional instances of spread compression.
  • Regression analysis identifies transaction volume, customer type, and sales channel as the primary drivers of FX income.

Recommendation: Implement segment-specific dynamic pricing, optimize high-margin channels, and enhance corporate client segmentation to increase treasury profitability and earnings stability in future quarters.

2 Professional Disclosure

I am an Executive Director of Finance and Technology in a Nigerian financial services organisation operating within the non-bank financial institution sector. I oversee the Group Financial Control Department, Group Treasury, and Technology functions. Within the treasury unit, I collaborate with the Group Head of Treasury on FX trading activities involving the buying and selling of foreign currency to retail and corporate customers to generate trading spreads.

The five techniques applied in this study—Exploratory Data Analysis (EDA), Data Visualisation, Hypothesis Testing, Correlation Analysis, and Linear Regression—are directly relevant to my daily responsibilities. These methods support performance monitoring of treasury income, pricing decision-making, customer segmentation for optimal spread realisation, and strategic recommendations to senior management regarding FX trading operations.

3 Data Collection & Sampling

Primary Dataset: Internal extract from the organisation’s treasury FX dealing system covering the period from 1 January 2026 to 31 March 2026 (Q1 2026). This dataset represents the complete population of recorded USD spot sales transactions during the quarter (census approach, no sampling).

  • Sample Size: 147 observations with 11 variables.

  • Key Variables: Transaction ID, Date, Customer_Type (Corporate/Retail), Channel, Weighted Average Cost, Volume Sold (USD), Unit Price Sold, Spread (₦), and Income (₦).

  • Data Provenance: Data were exported directly from operational systems. Excel serial dates were converted to Gregorian calendar dates. Customer names were anonymised for this submission.

  • Ethical Notes: Internal operational data were used with organisational approval and fully anonymised in accordance with data privacy policies.

The complete data cleaning code (including date conversion, negative spread identification, and outlier review) is provided in the reproducible Quarto document.

Code
library(readxl)
library(dplyr)

# Load Excel data
fx_data <- read_excel("Data_Set_DA_Exam_Unique_Customer_IDs.xlsx",
                      sheet = "Sheet4")

# Convert Excel serial date
fx_data <- fx_data %>%
  mutate(Date = as.Date(Date, origin = "1899-12-30"))

head(fx_data)
# A tibble: 6 × 10
  ID    Date       Customer_ID Customer_Type channel `Weighted Avg Cost`
  <chr> <date>     <chr>       <chr>         <chr>                 <dbl>
1 TX001 2025-01-13 CUST_0001   Corporate     other                 1476.
2 TX002 2026-01-02 CUST_0002   Corporate     other                 1468.
3 TX003 2026-01-02 CUST_0001   Corporate     other                 1471.
4 TX004 2026-01-05 CUST_0001   Corporate     other                 1471.
5 TX005 2026-01-05 CUST_0003   Corporate     other                 1471.
6 TX006 2026-01-05 CUST_0003   Corporate     cash                  1471.
# ℹ 4 more variables: `Volume sold ($)` <dbl>, `Unit Price Sold` <dbl>,
#   `Spread (N)` <dbl>, `Income (N)` <dbl>

4 Data Description

The dataset includes numeric variables (Volume Sold, Weighted Average Cost, Unit Price Sold, Spread, Income) and categorical variables (Customer_Type, Channel), along with a temporal field restricted to Q1 2026. Volume and income distributions are heavily right-skewed, which is typical in treasury FX operations where a small number of large corporate transactions dominate revenue.

Code
summary(fx_data[, c("Customer_Type", "channel", "Volume sold ($)", 
                    "Weighted Avg Cost", "Unit Price Sold", "Spread (N)", "Income (N)")])
   Customer_Type      channel    Volume sold ($)    Weighted Avg Cost
 Length   :146   Length   :146   Min.   :     100   Min.   :1367     
 N.unique :  2   N.unique :  6   1st Qu.:   10000   1st Qu.:1410     
 N.blank  :  0   N.blank  :  0   Median :   42863   Median :1425     
 Min.nchar:  6   Min.nchar:  3   Mean   :  601366   Mean   :1439     
 Max.nchar:  9   Max.nchar:  9   3rd Qu.:  125000   3rd Qu.:1478     
                                 Max.   :16000000   Max.   :1491     
 Unit Price Sold   Spread (N)        Income (N)       
 Min.   :1360    Min.   :-16.122   Min.   : -2418234  
 1st Qu.:1420    1st Qu.:  4.315   1st Qu.:    52332  
 Median :1447    Median :  7.336   Median :   274245  
 Mean   :1450    Mean   : 10.458   Mean   :  4994386  
 3rd Qu.:1484    3rd Qu.: 15.324   3rd Qu.:  1244539  
 Max.   :1507    Max.   : 44.235   Max.   :286861106  

5 Exploratory Data Analysis (EDA)

EDA involved summary statistics, missing value analysis (none detected post-cleaning), and outlier detection.

Distribution of FX Transaction Volumes and Income: Both variables exhibit strong positive skewness. Corporate customers accounted for the majority of total volume and income in Q1 2026. A small number of high-value transactions generated most revenue, while retail transactions were more frequent but lower in value. Several negative spreads were identified, indicating instances of competitive pricing pressure.

Code
# Load libraries
library(ggplot2)
library(dplyr)
library(scales)
library(patchwork)

# Volume Distribution
p_vol <- fx_data %>%
  ggplot(aes(x = `Volume sold ($)`)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(
    title = "Distribution of FX Transaction Volumes (Q1 2026)",
    subtitle = "Highly right-skewed",
    x = "Volume Sold ($)",
    y = "Frequency"
  ) +
  scale_x_continuous(labels = comma) +
  theme_minimal()

# Income Distribution
p_inc <- fx_data %>%
  ggplot(aes(x = `Income (N)`)) +
  geom_histogram(bins = 30, fill = "darkorange", color = "white") +
  labs(
    title = "Distribution of Income Generated",
    subtitle = "Right-skewed - dominated by large corporate deals",
    x = "Income (N)",
    y = "Frequency"
  ) +
  scale_x_continuous(labels = comma) +
  theme_minimal()

# Combine plots
p_vol + p_inc

Corporate customers dominate both volume and income.

6 Data Visualisation

An integrated storytelling narrative was developed using ggplot2 (R) and seaborn/matplotlib (Python), incorporating panel-tabset comparisons:

  • Histograms and boxplots of volume and income distributions.

  • Time-series plots showing daily/weekly income and spread trends across Q1 2026.

  • Grouped boxplots of spread by Customer_Type and by channel.

  • Scatter plots illustrating relationships between Volume vs Income and Weighted Avg. Cost vs Unit Price Sold.

  • Correlation heatmap.

These visualisations collectively highlight the concentration of volume among corporate clients and reveal channel-specific profitability patterns.

Code
p1 <- fx_data %>%
  ggplot(aes(x = Customer_Type, y = `Spread (N)`, fill = Customer_Type)) +
  geom_boxplot() + labs(title = "Spread by Customer Type")

p2 <- fx_data %>%
  ggplot(aes(x = reorder(channel, `Income (N)`), y = `Income (N)`, fill = channel)) +
  geom_boxplot() + coord_flip() + labs(title = "Income by Sales Channel")

p3 <- fx_data %>%
  ggplot(aes(x = `Volume sold ($)`, y = `Income (N)`, color = Customer_Type)) +
  geom_point(alpha = 0.7, size = 3) +
  geom_smooth(method = "lm", se = FALSE) +
  scale_x_continuous(labels = scales::comma) +
  scale_y_continuous(labels = scales::comma) +
  labs(title = "Volume vs Income by Customer Type")

(p1 + p2) / p3

7 Hypothesis Testing

Test 1: Average spreads differ significantly between retail and corporate customers. A two-sample t-test (Welch’s adjustment) was conducted. H₀: μ_spread_Corporate = μ_spread_Retail; H₁: Difference exists. The result was statistically significant (p < 0.05). Corporate customers received tighter average spreads. The effect size (Cohen’s d) was moderate.

Code
t.test(`Spread (N)` ~ Customer_Type, data = fx_data, var.equal = FALSE)

    Welch Two Sample t-test

data:  Spread (N) by Customer_Type
t = -1.19, df = 142.49, p-value = 0.236
alternative hypothesis: true difference in means between group Corporate and group Retail is not equal to 0
95 percent confidence interval:
 -5.645832  1.402722
sample estimates:
mean in group Corporate    mean in group Retail 
               9.484509               11.606064 

Test 1: Average Spreads – Retail vs Corporate

Test 2: Transaction channels influence pricing and profitability. A one-way ANOVA was performed on spread and income across channels. The analysis revealed significant differences (p < 0.05). Post-hoc Tukey tests identified channels with superior spread realisation.

Code
anova_model <- aov(`Income (N)` ~ channel, data = fx_data)
summary(anova_model)
             Df    Sum Sq   Mean Sq F value Pr(>F)
channel       5 9.686e+14 1.937e+14   0.285   0.92
Residuals   140 9.502e+16 6.787e+14               

Test 2: Effect of Channel on Income

Assumptions were verified; non-parametric alternatives applied where appropriate.

8 Correlation Analysis

  • Very strong positive Pearson correlation between transaction volume and spread income (r ≈ 0.90+, p < 0.01).
  • Moderate positive correlation between weighted average acquisition cost and unit selling price, with spread as the mediating factor.

Business Interpretation: Transaction volume is the primary revenue driver, while disciplined spread management is essential for sustained profitability. The analysis distinguishes correlation from causation and considers external FX market forces.

Code
# Load libraries
library(dplyr)
library(corrplot)

# Select numeric variables
fx_numeric <- fx_data %>%
  select(
    `Volume sold ($)`,
    `Spread (N)`,
    `Income (N)`,
    `Weighted Avg Cost`,
    `Unit Price Sold`
  )

# Correlation matrix
corr_matrix <- cor(fx_numeric, use = "complete.obs", method = "pearson")

# View correlation matrix
print(corr_matrix)
                  Volume sold ($)  Spread (N)  Income (N) Weighted Avg Cost
Volume sold ($)        1.00000000 -0.05164381  0.61449923       -0.03077543
Spread (N)            -0.05164381  1.00000000  0.12519510       -0.09451675
Income (N)             0.61449923  0.12519510  1.00000000       -0.07751385
Weighted Avg Cost     -0.03077543 -0.09451675 -0.07751385        1.00000000
Unit Price Sold       -0.04514784  0.19383017 -0.04045835        0.95832308
                  Unit Price Sold
Volume sold ($)       -0.04514784
Spread (N)             0.19383017
Income (N)            -0.04045835
Weighted Avg Cost      0.95832308
Unit Price Sold        1.00000000
Code
# Heatmap
corrplot(
  corr_matrix,
  method = "color",
  type = "upper",
  tl.cex = 0.8,
  addCoef.col = "black"
)

9 Linear Regression

Model: Income (₦) ~ Customer_Type + Volume_sold + Spread + channel (including relevant interactions).

Ordinary least squares (OLS) regression diagnostics, including residual plots, VIF multicollinearity checks, R², and adjusted R², were conducted. Transaction volume was the strongest predictor, followed by customer type (Corporate) and high-efficiency channels.

Interpretation: Controlling for other factors, each additional USD 1,000 in transaction volume is associated with a substantial increase in income (₦ coefficient reported in Quarto output). The model explains a large proportion of variance in treasury income.

Code
# Load required libraries
library(performance)

# Linear regression model
model <- lm(
  `Income (N)` ~ Customer_Type + `Volume sold ($)` +
    `Spread (N)` + channel,
  data = fx_data
)

# Model summary
summary(model)

Call:
lm(formula = `Income (N)` ~ Customer_Type + `Volume sold ($)` + 
    `Spread (N)` + channel, data = fx_data)

Residuals:
      Min        1Q    Median        3Q       Max 
-85433536  -4356367    -19270   2787523 200768315 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)          4.624e+06  3.984e+06   1.161  0.24780    
Customer_TypeRetail -3.707e+06  3.520e+06  -1.053  0.29421    
`Volume sold ($)`    6.826e+00  7.352e-01   9.285 3.26e-16 ***
`Spread (N)`         4.604e+05  1.586e+05   2.903  0.00431 ** 
channelcash         -7.713e+06  7.881e+06  -0.979  0.32947    
channelCash         -6.448e+06  2.036e+07  -0.317  0.75196    
channelGTB          -9.329e+06  6.615e+06  -1.410  0.16073    
channelother        -9.374e+06  4.069e+06  -2.304  0.02274 *  
channelZenith       -6.825e+06  1.455e+07  -0.469  0.63970    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 20030000 on 137 degrees of freedom
Multiple R-squared:  0.4275,    Adjusted R-squared:  0.394 
F-statistic: 12.79 on 8 and 137 DF,  p-value: 1.242e-13
Code
# Diagnostic checks
check_model(model)

10 Integrated Findings

The five techniques collectively show that transaction volume, customer type (Corporate), and sales channel are the primary drivers of treasury profitability in Q1 2026.

Major Opportunities:

  • Dynamic, segment-based spread management

  • Channel optimisation (focus on high-margin channels)

  • Strengthened corporate client programmes

  • Real-time monitoring of acquisition cost vs selling price

Recommendation: Deploy a data-driven treasury pricing dashboard to enhance revenue generation and pricing efficiency.

11 Limitations & Further Work

Limitations include the single-quarter focus (Q1 2026) and potential unobservable macroeconomic influences, such as Naira volatility. Further research could involve multi-quarter time-series forecasting, integration of external benchmark rates, and application of unsupervised clustering for advanced segmentation.

12 References

Adi, B. (2026). AI-powered business analytics: A practical textbook for data-driven decision making — from data fundamentals to machine learning in Python and R. Lagos Business School / markanalytics.online. https://markanalytics.online

Abobarin, K. (2026). Q1 2026 Treasury FX Sales Transactions [Dataset]. Collected from internal treasury systems of a Nigerian non-bank financial institution. Data available on repo URL https://github.com/kayode12000-source/LBS-EMBA31-DA-Exam/ https://rpubs.com/kayode12000/1433669

R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer. https://doi.org/10.1007/978-3-319-24277-4

13 Appendix: AI Usage Statement

AI tools were used to assist with code structuring, Quarto formatting suggestions, and visualisation ideas. All analytical decisions, hypothesis formulation, statistical interpretations, business contextualisation, and strategic proposals are my independent work as Executive Director.