Drivers of Income Generation in Corporate Banking

An evidence-based analysis of Top-100 corporate accounts (2024–2025)

Author

Caroline Edeh — Manager, Zenith Bank PLC

Published

May 17, 2026

Audience: Zenith Bank PLC Board of Directors · Author: Caroline Edeh, Manager · Purpose: evidence-based recommendation on the drivers of income generation in the corporate-banking book. The analysis is fully reproducible: every figure on every page below is produced live from the underlying Top-100 extract (CorporateBankingReport.xlsx).

1 Executive Summary

We applied five exploratory and inferential analytics techniques (EDA, Visualisation, Hypothesis Testing, Correlation Analysis and Regression) to the Top-100 Credit-Turnover account book at Zenith Bank PLC for the 12-month window 2024–2025 (n = 100 accounts, 16 industries, 5 relationship managers, total annual income of ₦8.39bn on a credit-turnover base of ₦1.28tn). The headline finding: Credit Turnover and the presence of a Credit Facility are the dominant drivers of income in this cohort, together explaining 86 % of the variation in ln(Income/Fees) in a log-log OLS specification (F-test p < 10⁻³⁸). The other operational levers in the regression (Transaction Volume, RM Visits per Month and Relationship Years) are not statistically significant once the two dominant drivers are controlled for; RM identity is not modelled directly (see Limitations & Further Work). Customers with a Credit Facility generate 3.3× the income of those without (₦183 mn vs ₦56 mn on average; Welch t = 5.99, p < 10⁻⁵). Income also differs significantly by industry (ANOVA F = 3.39, p = 0.009), led by Construction & Engineering and Oil & Gas. We ask the Board to consider a single two-part recommendation: (i) prioritise origination and renewal of Credit Facilities in the Top-100 cohort and (ii) redeploy RM coverage from low-elasticity activities (visit cadence, tenure management) to credit-conversion campaigns in Construction & Engineering, Oil & Gas, and the long-tail Traders & General Merchants segment.

2 Professional Disclosure

I am Caroline Edeh, Manager at Zenith Bank PLC — a tier-1 Nigerian commercial bank in the Financial Services / Corporate Banking sector. The five techniques in this paper map directly to live operational decisions on my desk:

  • Exploratory Data Analysis (EDA) is the always-on substrate that runs at the start of every corporate-banking review: missing-value scans, distribution checks (income / fees in particular is heavily right-skewed) and outlier flags before any modelling begins. EDA is what separates a hand-cleaned book from a defensible one.

  • Data Visualisation is how findings travel from my team to the Executive Committee and the Board. Pareto charts, boxplots and scatterplots are the lingua franca that lets a non-technical director follow the income narrative in seconds.

  • Hypothesis Testing is how I separate signal from noise. With n = 100 across 16 industries, formally stated null and alternative hypotheses — paired with p-values and effect sizes — keep the Coverage and Credit conversations honest.

  • Correlation Analysis is the first lens I use to identify candidate income drivers and to decide which variables earn a place in the regression model.

  • Regression is the workhorse for the analytical question. Coefficients, partial effects and R² together turn a noisy table of Top-100 corporate accounts into a ranked list of revenue levers I can present to the Board.

3 Data Collection & Sampling

Field | Value
Source | Corporate core banking operations and CRM systems. Account-level credit turnover, transaction volume and income/fees are extracted from the General Ledger; RM visit cadence, relationship years, RM and RSM assignments come from the Relationship Management module.
Collection method | Direct workbook export covering the 12-month period 2024–2025: the Top-100 Credit-Turnover league table that the Corporate Banking Directorate reviews monthly.
Sampling frame | The 100 highest credit-turnover corporate accounts at Zenith Bank for the period, across all 16 industry verticals and all five Relationship Managers in the Top-100 desk.
Sample size | n = 100 accounts (full cohort, not a sample).
Time period | 12 months, 2024–2025.
Ethics & consent | Customer-identifying fields (account names, account numbers) have been replaced with sequential pseudonyms (“CORPORATE 1” … “CORPORATE 100”). The dataset is held under Zenith Bank’s data-protection policy aligned with the Central Bank of Nigeria (CBN) data-residency rules and the Nigeria Data Protection Act (NDPA, 2023). The Head of Corporate Banking and the Chief Compliance Officer have approved use of the dataset for analytics development.

4 Data Description

The dataset contains the Top-100 corporate accounts that generated total credit turnover of ₦1.28tn and total income/fees of ₦8.39bn in the 12-month window. Coverage spans 16 industries (Traders & General Merchants and Oil & Gas dominate), 5 Relationship Managers and 9 Regional Sales Managers (RSMs). Two operational dimensions are of particular interest: whether the customer has a Credit Facility (only 22 of 100 do — a clear under-penetration) and the RM Visit cadence (range 1–5 per month). The income distribution is heavily right-skewed: the top account alone contributes ₦452.39mn.
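As a quick cross-check of the concentration described above, the headline totals can be turned into per-account and yield figures. The sketch below is in Python rather than the report's own rendering pipeline, and it uses only the rounded figures quoted in this section; the variable names are illustrative and are not column names from CorporateBankingReport.xlsx.

```python
# Rounded figures quoted in the Data Description, in NGN millions.
# Variable names are illustrative, not column names from the workbook.
TOTAL_INCOME_MN = 8_390.0          # total income/fees (8.39bn)
TOTAL_TURNOVER_MN = 1_280_000.0    # total credit turnover (1.28tn)
TOP_ACCOUNT_MN = 452.39            # income of the largest account
N_ACCOUNTS = 100

mean_income_mn = TOTAL_INCOME_MN / N_ACCOUNTS
top_share = TOP_ACCOUNT_MN / TOTAL_INCOME_MN
top_vs_mean = TOP_ACCOUNT_MN / mean_income_mn
income_yield = TOTAL_INCOME_MN / TOTAL_TURNOVER_MN

print(f"Mean income per account: NGN {mean_income_mn:.1f}mn")
print(f"Top account share of total income: {top_share:.1%}")
print(f"Top account relative to the mean: {top_vs_mean:.1f}x")
print(f"Income as a share of credit turnover: {income_yield:.2%}")
```

On these rounded inputs the top account earns about 5.4 times the cohort mean of roughly ₦84mn, and the book as a whole earns about 0.66 % of credit turnover as income.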

5 The Analytical Question

What are the drivers of income generation in corporate banking?

Each of the five techniques below contributes one piece of evidence towards this question, and the Integrated Findings section combines them into a single Board-level recommendation.

6 Analysis 1 — Exploratory Data Analysis

6.1 Theory recap

EDA is the disciplined first look: descriptive statistics, missing-value scans, outlier flags and shape diagnostics (skew, kurtosis) before any modelling. Right-skewed financial variables are the rule rather than the exception, and EDA is where we make that explicit before fitting a model.
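To make the skew diagnostic concrete, here is a minimal Python sketch of moment-based skewness, applied to a raw and a log-transformed series. The income values are hypothetical and are not drawn from the Top-100 extract. Statistical packages often apply a small-sample correction, so this estimator may not reproduce the report's skewness figures exactly.

```python
from math import log
from statistics import fmean


def moment_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical income figures (NGN millions), NOT taken from the Top-100 extract.
toy_income = [40, 45, 50, 55, 60, 70, 85, 120, 200, 450]

raw_skew = moment_skewness(toy_income)
log_skew = moment_skewness([log(v) for v in toy_income])

print(f"skewness on the raw scale: {raw_skew:.2f}")
print(f"skewness on the log scale: {log_skew:.2f}")
```

A long right tail gives a large positive value on the raw scale, and taking logs pulls it towards zero, which is the usual motivation for modelling such variables on a log scale.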

6.2 Business justification

Before recommending coverage or product investments to the Board, we need to know whether the Top-100 book is balanced or concentrated, whether outlier accounts dominate the totals, and whether the data quality is adequate to support a formal recommendation.

What the EDA tells the Board. The data is complete — no missing values on any of the 13 columns. Credit Turnover and Income/Fees are both heavily right-skewed (skewness > 3.5) — the Top-5 accounts contribute a disproportionate share of revenue, which is normal for a corporate book and is the operational rationale for managing the cohort separately. RM visit cadence is bounded 1–5 and Relationship Years is bounded 2–10 — neither shows outliers. There are no data-quality issues that would invalidate the downstream analysis.

7 Analysis 2 — Data Visualisation

7.1 Theory recap

A statistic summarises; a chart shows. Five visuals are sufficient to tell the income-driver story coherently: the distribution of income, the income-by-Credit-Facility comparison, income by industry, the income–turnover scatter, and a Pareto view of revenue concentration. All charts below are interactive — hover for exact values, click the legend to filter, drag to zoom.

7.2 Business justification

The Board meets monthly. Interactive charts let any director probe a data point without asking a follow-up question, and the full set fits on a single page of the read-ahead pack.

Visual 1. Distribution of Income / Fees across the Top-100 cohort

Visual 2. Income / Fees by Credit Facility status

Visual 3. Income by Industry (sorted by mean)

Visual 4. Credit Turnover vs Income / Fees — with linear fit

Visual 5. Pareto — cumulative share of total income across the Top-100

Visual story. Income is concentrated — a Pareto pattern — and the two visible levers are Credit Turnover (almost-linear on log-log axes) and Credit Facility (an upward shift of the entire fitted line). Industry mix matters at the mean-per-account level, with Construction & Engineering and Oil & Gas at the top.
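The Pareto view in Visual 5 rests on a simple calculation: sort accounts by income, accumulate, and divide by the total. A minimal Python sketch follows, using hypothetical values rather than the Top-100 extract; the function names are illustrative.

```python
from itertools import accumulate


def cumulative_share(values):
    """Cumulative share of the total, largest contributors first."""
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    return [running / total for running in accumulate(ordered)]


def accounts_needed(values, target_share):
    """Smallest number of top accounts whose combined share reaches target_share."""
    for count, share in enumerate(cumulative_share(values), start=1):
        if share >= target_share:
            return count
    return len(values)


# Hypothetical income figures, NOT taken from the Top-100 extract.
toy_income = [10, 50, 10, 30]
shares = cumulative_share(toy_income)

print(shares)                            # cumulative shares, largest account first
print(accounts_needed(toy_income, 0.8))  # accounts needed to reach 80 % of income
```

Applied to the real income column, the second function answers the question a director is most likely to ask of the Pareto chart: how many accounts it takes to reach a given share of revenue.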

8 Analysis 3 — Hypothesis Testing

8.1 Theory recap

A hypothesis test starts with a null (H₀ — usually “no effect”), an alternative (H₁), an α (typically 0.05) and an appropriate test statistic. For two-group continuous comparisons we use Welch’s two-sample t-test; for three-or-more groups we use one-way ANOVA.
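The two test statistics named above can be sketched in a few lines of standard-library Python. This illustrates the formulas only and is not the code behind the reported results. It stops at the statistic and its degrees of freedom, because converting these to p-values needs t and F distribution functions from a statistics package. The toy samples are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (fmean(a) - fmean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def one_way_anova_f(groups):
    """One-way ANOVA F statistic with its (k - 1, N - k) degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return f_stat, (k - 1, n_total - k)


# Hypothetical samples, NOT taken from the Top-100 extract.
t_stat, t_df = welch_t([2, 4, 6, 8], [1, 2, 3, 4])
f_stat, f_df = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

print(f"Welch t = {t_stat:.3f} on {t_df:.2f} df")
print(f"ANOVA F = {f_stat:.2f} on {f_df} df")
```

Welch's version is preferred over the pooled t-test when the two groups differ in size and spread, as groups of 22 and 78 accounts are likely to.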

8.2 Business justification

The Board needs binary “is this real, or chance?” answers on two questions: (1) does income differ between customers with and without a Credit Facility? and (2) does income differ across industries? Testing them formally — rather than eyeballing the visuals — is what justifies any follow-on coverage or credit-conversion recommendation.

Question 1 (Credit Facility). Welch t = 5.99, p < 10⁻⁵: strongly significant, so we reject H₀ of equal mean income. Customers with a Credit Facility generate roughly 3.3× the income of those without. Only 22 of the Top-100 accounts currently hold a facility, so there is clear, quantifiable upside in Credit Facility origination.

Question 2 (Industry). F = 3.39, p = 0.009: significant, so we reject H₀ of equal mean income across industries. Mean income per account is highest in Construction & Engineering (₦155 mn) and Oil & Gas (₦133 mn), and lowest in Real Estate and Consultancy (~₦46 mn). Industry mix is a secondary, but real, driver of income.
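The Credit Facility comparison can be cross-checked against the cohort totals using only figures quoted in this report: the group means from the Executive Summary (₦183 mn and ₦56 mn), the 22 / 78 split, and the ₦8.39bn total. The Python sketch below is a consistency check on rounded numbers, not a re-run of the test.

```python
# Figures quoted in the report, in NGN millions (rounded as published).
N_WITH, N_WITHOUT = 22, 78                    # accounts with / without a facility
MEAN_WITH_MN, MEAN_WITHOUT_MN = 183.0, 56.0   # mean income per account
REPORTED_TOTAL_MN = 8_390.0                   # total income/fees

ratio = MEAN_WITH_MN / MEAN_WITHOUT_MN
implied_total_mn = N_WITH * MEAN_WITH_MN + N_WITHOUT * MEAN_WITHOUT_MN
facility_income_share = N_WITH * MEAN_WITH_MN / implied_total_mn

print(f"Income ratio, facility vs none: {ratio:.2f}x")
print(f"Total implied by group means: NGN {implied_total_mn:,.0f}mn")
print(f"Share of income from facility holders: {facility_income_share:.0%}")
```

The group means reproduce the reported total to within rounding (₦8,394 mn against ₦8,390 mn), and they imply that the 22 facility holders account for roughly 48 % of cohort income.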

9 Analysis 4 — Correlation Analysis

9.1 Theory recap

Pearson’s correlation matrix summarises pairwise linear relationships among numeric variables, with values in [−1, +1]. A heatmap renders the matrix at a glance. Significance for each pair is tested with t = r √(n − 2) / √(1 − r²), referred to a t distribution with n − 2 degrees of freedom.

9.2 Business justification

Before fitting a regression we want a ranked list of candidate predictors and a check for redundancy among them.

Correlation heatmap (interactive — hover for exact values)

Reading the matrix.

  • Strongest relationship: Income/Fees ↔︎ Credit Turnover at r = +0.57, p < 10⁻⁹ — by far the strongest single signal.
  • Weakest relationship: Income/Fees ↔︎ Relationship Years at r ≈ 0.00 — tenure alone does not earn revenue.
  • Income/Fees ↔︎ RM Visits per Month at r = +0.13 (p ≈ 0.20) — visit cadence has no statistically meaningful association with income in this cohort. The current RM visit playbook is not driving the income outcome the way the playbook implies.
  • Managerial implication. The two operational levers that earn income — Credit Turnover and Credit Facility — should be the centrepiece of the Board’s coverage strategy. Visit cadence and tenure-based pricing should be re-examined.
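The significance statements in the list above follow from the test statistic t = r √(n − 2) / √(1 − r²) with n = 100. The Python sketch below plugs in the two reported correlations. The critical value of about 1.98 (two-sided, 5 % level, 98 degrees of freedom) is an approximation supplied here for illustration, not a figure from the report.

```python
from math import sqrt


def correlation_t(r, n):
    """t statistic for H0: rho = 0, on n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


N = 100
T_CRIT_APPROX = 1.98  # approximate two-sided 5 % critical value, 98 df (assumption)

for label, r in [("Credit Turnover", 0.57), ("RM Visits per Month", 0.13)]:
    t = correlation_t(r, N)
    verdict = "significant" if abs(t) > T_CRIT_APPROX else "not significant"
    print(f"Income/Fees vs {label}: r = {r:+.2f}, t = {t:.2f} ({verdict} at 5 %)")
```

The turnover correlation gives t of about 6.9, far beyond the threshold, while the visit-cadence correlation gives t of about 1.3, consistent with the reported p ≈ 0.20.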

10 Analysis 5 — Regression

10.1 Theory recap

OLS regression fits y = β₀ + β₁x₁ + β₂x₂ + … + ε by minimising the residual sum of squares. Each β̂ⱼ is the partial effect of xⱼ on y holding the other predictors constant. R² is the share of variance explained; the F-test asks whether the model collectively explains more variance than a mean-only model. Because income, credit turnover and transaction volume are heavily right-skewed, we also fit a log-log specification in which the coefficients on logged predictors have a direct elasticity interpretation.
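Written out for this dataset, the two specifications might look as follows. This is a sketch under stated assumptions: the variable names are shorthand introduced here. We also assume that Model 2 logs only the three right-skewed variables and leaves the Credit Facility dummy, RM visits and relationship years in levels; the report does not state the exact transformation of each predictor.

```latex
\begin{align*}
\text{Model 1:}\quad
  \mathrm{Income}_i &= \beta_0 + \beta_1\,\mathrm{Turnover}_i + \beta_2\,\mathrm{Facility}_i
    + \beta_3\,\mathrm{TxnVol}_i + \beta_4\,\mathrm{Visits}_i + \beta_5\,\mathrm{Years}_i + \varepsilon_i \\[4pt]
\text{Model 2:}\quad
  \ln \mathrm{Income}_i &= \gamma_0 + \gamma_1 \ln \mathrm{Turnover}_i + \gamma_2\,\mathrm{Facility}_i
    + \gamma_3 \ln \mathrm{TxnVol}_i + \gamma_4\,\mathrm{Visits}_i + \gamma_5\,\mathrm{Years}_i + u_i \\[4pt]
R^2 &= 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\end{align*}
```

In Model 2, \gamma_1 is an elasticity (the percentage change in income per 1 % change in turnover), while the dummy coefficient \gamma_2 corresponds to a multiplicative factor of e^{\gamma_2} on income. Each R² is measured on its own response scale, so the two values describe fit to Income and to ln(Income) respectively.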

10.2 Business justification

The Board’s question, “what are the drivers of income generation in corporate banking?”, is exactly what regression answers. We fit two models: a linear OLS on the raw scale (for direct ₦ effects) and a log-log model on the transformed scale (for elasticities). Together they give interpretable coefficients and a quantified R² that tells us how much of the income story we actually capture.

Variance explained by each model

Residuals vs fitted — log-log model (linearity / homoscedasticity check)

Direct answer to the analytical question.

  • Model 1 (linear OLS) explains R² ≈ 62 % of the variation in income (F-test p < 10⁻¹⁷). The two significant drivers are Credit Turnover (β̂ ≈ 0.001 — every additional ₦1 of credit turnover adds ≈ 0.1 kobo of income, p < 0.001) and Credit Facility (β̂ ≈ +₦112 mn vs no facility, p < 0.001). Transaction Volume, RM Visits and Relationship Years are not statistically significant.

  • Model 2 (log-log OLS) explains R² ≈ 86 % of variance in ln(Income) (F-test p < 10⁻³⁸). The elasticity of income with respect to credit turnover is +0.68: a 10 % increase in turnover translates into a ≈ 6.8 % increase in income (p < 0.001). Credit Facility lifts log-income by +1.51, equivalent to a ~4.5× multiplier on income controlling for turnover. Transaction Volume, RM Visits and Relationship Years remain non-significant.

  • Therefore the drivers of income generation in corporate banking at Zenith are, in order: (1) Credit Turnover, (2) Credit Facility, and (3) Industry mix. RM Visits, Transaction Volume and Relationship Years individually contribute little once the dominant two are controlled for.
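The coefficient interpretations in the list above can be reproduced with a few lines of arithmetic. The Python sketch below uses only the reported coefficients; the constant and function names are illustrative.

```python
from math import exp

# Coefficients as reported in the two models.
ELASTICITY_TURNOVER = 0.68  # Model 2: coefficient on ln(Credit Turnover)
FACILITY_LOG_LIFT = 1.51    # Model 2: coefficient on the Credit Facility dummy
SLOPE_NAIRA = 0.001         # Model 1: naira of income per naira of turnover


def pct_income_change(turnover_change, elasticity=ELASTICITY_TURNOVER):
    """Exact proportional income change implied by a log-log elasticity."""
    return (1 + turnover_change) ** elasticity - 1


facility_multiplier = exp(FACILITY_LOG_LIFT)  # dummy in a log model is multiplicative
exact_10pct = pct_income_change(0.10)         # exact effect of +10 % turnover
approx_10pct = ELASTICITY_TURNOVER * 0.10     # first-order approximation
kobo_per_naira = SLOPE_NAIRA * 100            # 1 naira = 100 kobo

print(f"Facility multiplier: {facility_multiplier:.2f}x")
print(f"+10 % turnover: exact {exact_10pct:.1%}, approximation {approx_10pct:.1%}")
print(f"Model 1 slope: {kobo_per_naira:.1f} kobo per naira of turnover")
```

The 6.8 % figure is the first-order approximation (0.68 × 10 %); the exact compounding figure is about 6.7 %, and the difference widens for larger turnover changes.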

11 Integrated Findings

Single recommendation to the Board. Reposition the Top-100 coverage playbook around the two proven income drivers: (a) launch a Credit-Facility-origination campaign — only 22 of the Top-100 currently hold a facility, and converting an additional 10 accounts (at the average uplift observed in §7) is worth approximately ₦1.27bn of incremental annual income; and (b) redeploy RM coverage from low-elasticity activities (visit cadence, tenure management) to credit conversion and turnover-growth campaigns in Construction & Engineering, Oil & Gas and the long-tail Traders & General Merchants segment.
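The ₦1.27bn figure is ten times the raw difference in group means (₦183 mn minus ₦56 mn). As a sensitivity check, the Python sketch below sets it beside the same calculation using the Model 1 Credit Facility coefficient (₦112 mn), which controls for turnover. Both are observational estimates built from figures reported above, and neither is a forecast; the variable names are illustrative.

```python
# Figures quoted in the report, in NGN millions.
CONVERSIONS = 10                              # additional accounts converted
MEAN_WITH_MN, MEAN_WITHOUT_MN = 183.0, 56.0   # raw group means
OLS_FACILITY_EFFECT_MN = 112.0                # Model 1 Credit Facility coefficient
TOTAL_INCOME_MN = 8_390.0                     # current total income/fees

raw_uplift_mn = CONVERSIONS * (MEAN_WITH_MN - MEAN_WITHOUT_MN)
adjusted_uplift_mn = CONVERSIONS * OLS_FACILITY_EFFECT_MN

for label, uplift in [("raw mean difference", raw_uplift_mn),
                      ("turnover-adjusted (Model 1)", adjusted_uplift_mn)]:
    share = uplift / TOTAL_INCOME_MN
    print(f"{label}: NGN {uplift / 1000:.2f}bn ({share:.1%} of current income)")
```

On the adjusted basis the opportunity is about ₦1.12bn rather than ₦1.27bn, so the recommendation holds under either estimate.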

12 Limitations & Further Work

  • Snapshot-only data. The dataset is a 12-month view of the Top-100 cohort with no time axis. Adding monthly observations would let us decompose income into trend / seasonality and model the causal effect of new credit-facility activations (a regression-discontinuity or interrupted-time-series design).
  • Selection on credit turnover. The Top-100 cohort is selected precisely on credit turnover, so the Income–Turnover correlation and elasticity are estimated on a truncated range of turnover and may differ, in either direction, from those of the full book. A repeat analysis on the full corporate book (including small / dormant accounts) is needed before generalising the elasticity to the franchise.
  • Causality. Every coefficient here is observational. The Credit-Facility effect could partly reflect endogenous selection (riskier customers are denied facilities, and they happen to have lower income). The correct next step is a quasi-experimental design, for example comparing accounts that fell just above and just below the credit-approval threshold.
  • Latent confounders. Industry mix is captured but client size, multi-bank wallet share, and product holdings (FX, trade-finance, cash-management) are not. A future iteration should fold these variables in before recommending coverage-budget reallocations.
  • RM identity vs RM coverage. The five RMs are not directly modelled in the headline regression because their effects are confounded with their customer portfolios. A mixed-effects model (RM as a random intercept) is the right tool to disentangle RM skill from book quality.

References

  • Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.

  • Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.

  • Wickham, H., Averick, M., Bryan, J., et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

  • Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag.

  • Sievert, C. (2020). Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC.

  • Robinson, D., Hayes, A., & Couch, S. (2024). broom: Convert Statistical Objects into Tidy Tibbles. R package version 1.0.x. citation("broom")

  • Revelle, W. (2024). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University, Evanston, Illinois. citation("psych")

  • Xie, Y., Cheng, J., & Tan, X. (2024). DT: A wrapper of the JavaScript library “DataTables”. R package version 0.30. citation("DT")

  • Central Bank of Nigeria. (2024). Risk-based cybersecurity framework and guidelines for deposit money banks and payment service providers (revised). CBN Banking Supervision Department.

  • Federal Republic of Nigeria. (2023). Nigeria Data Protection Act. National Assembly.

  • Mark Analytics. (2025). AI-Powered Data Analytics: a reproducible reporting workflow. https://markanalytics.online/ai-powered-data-analytics/

Appendix — AI Usage Statement

The author used Claude (Anthropic) for two specific tasks: (1) drafting the boilerplate scaffold of the Quarto YAML and section headings to match the required submission rubric, and (2) double-checking R syntax for ggplot2, plotly, lm(), aov() and DT. The analytical question, the choice of techniques, the interpretation of the Credit-Facility coefficient as a quantifiable origination opportunity (rather than a causal claim), the recommendation to redeploy RM coverage from low-elasticity activities to credit conversion, and the narrative throughout are all the independent professional judgement of Caroline Edeh, Manager at Zenith Bank PLC. Every numerical result is computed live in this document on the 100-row Top-100 corporate-banking extract (CorporateBankingReport.xlsx) and is reproducible end-to-end via quarto render.