Mobile Money Fraud Detection: A Predictive Modelling & Segmentation Study
Author
GRACE KALU
Published
May 10, 2026
1. Executive Summary
Mobile money fraud represents one of the most pressing operational risks facing financial services providers across sub-Saharan Africa. This study analyses 199,999 mobile money transactions from a simulated but operationally realistic payment system (PaySim), containing transaction type, amount, origin and destination account balances, and fraud labels. The objective is to build a comprehensive, reproducible fraud detection pipeline that supports both automated real-time screening and strategic risk management.
Five analytical techniques were applied: (1) classification models (Logistic Regression and Random Forest) to predict fraudulent transactions; (2) model explainability via variable importance and a contribution plot to identify the key fraud drivers; (3) K-Means clustering to segment transactions into behavioural risk profiles; (4) PCA dimensionality reduction to visualise cluster separation; and (5) time series analysis of hourly fraud counts to detect temporal patterns and produce a 3-step forecast.
Key findings show that fraud occurs exclusively in CASH_OUT and TRANSFER transaction types and is driven by accounts being drained to exactly zero. The Random Forest classifier achieves near-perfect discrimination (AUC ≈ 1.0). Clustering identifies distinct high-risk and low-risk transaction segments. The time series reveals irregular, weakly autocorrelated fraud spikes (fitted AR(1) coefficient ≈ 0.12). The integrated recommendation is a two-stage real-time screening system combining transaction-type gating with balance-anomaly classification.
Classification Model: Predicting whether a transaction is fraudulent or legitimate is the core operational need of any fraud team. In my role, I am responsible for due diligence and compliance on financial transactions. A binary classifier provides the real-time scoring engine that flags suspicious transactions before they are processed, directly reducing financial losses.
Model Explainability: Fraud decisions must be justifiable to compliance officers, regulators, and customers who dispute declined transactions. Variable importance analysis allows me to communicate to non-technical stakeholders which transaction characteristics triggered a fraud flag, which is essential for regulatory compliance and customer relations.
Clustering: Understanding natural groupings of transactions enables differentiated risk controls. High-risk clusters receive mandatory secondary authentication while low-risk clusters get straight-through processing, balancing security with customer experience.
Dimensionality Reduction (PCA): With many correlated financial features, PCA allows us to visualise the transaction landscape, validate that risk clusters are genuinely distinct, and identify outliers representing novel fraud patterns.
Time Series Analysis: Fraud volumes fluctuate over time. Forecasting hourly fraud counts enables the operations team to staff review queues proactively, allocate fraud loss provisions, and detect emerging fraud waves early.
3. Data Collection & Sampling
Source: PaySim, a synthetic mobile money transaction simulator developed by Lopez-Rojas et al. (2016) and calibrated against real transaction logs from a mobile money operator. It is used here as a benchmark dataset that mirrors the transaction patterns of the analyst’s operational environment.
Collection Method: Obtained from a publicly available research repository, because raw internal transaction data from the analyst’s professional environment cannot be published due to confidentiality obligations.
Sampling Frame: All transaction records in the PaySim simulation covering approximately 30 days of mobile money activity across five transaction types: CASH_IN, CASH_OUT, DEBIT, PAYMENT, and TRANSFER.
Time Period: 741 hourly time steps representing approximately 30 days of continuous transaction activity.
Ethical Notes: This dataset contains no personally identifiable information. All account identifiers are synthetic. No consent was required.
Key EDA Findings: Fraud occurs exclusively in CASH_OUT and TRANSFER transactions. The dominant fraud signal is accounts drained to exactly zero (exact_drain, zero_orig). Transaction amount alone does not reliably distinguish fraud from legitimate activity.
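The drain features named above are not defined in this section. The following is a minimal sketch of how they could be derived from the PaySim balance columns. The definitions of `exact_drain`, `zero_orig`, and `balance_diff_orig` are assumptions made for illustration, and the toy values are invented.

```r
# Toy transactions using PaySim column names; values are made up for illustration.
tx <- data.frame(
  type           = c("CASH_OUT", "TRANSFER", "PAYMENT", "CASH_IN"),
  amount         = c(5000, 12000, 250, 800),
  oldbalanceOrg  = c(5000, 12000, 1000, 300),
  newbalanceOrig = c(0, 0, 750, 1100),
  stringsAsFactors = FALSE
)

# ASSUMED definitions (not confirmed by the source):
#   zero_orig         = origin balance is zero after the transaction
#   exact_drain       = amount equals the full pre-transaction balance and leaves zero
#   balance_diff_orig = change in origin balance (negative when money leaves)
tx$zero_orig         <- as.integer(tx$newbalanceOrig == 0)
tx$exact_drain       <- as.integer(tx$oldbalanceOrg > 0 &
                                     tx$amount == tx$oldbalanceOrg &
                                     tx$newbalanceOrig == 0)
tx$balance_diff_orig <- tx$newbalanceOrig - tx$oldbalanceOrg

print(tx[, c("type", "zero_orig", "exact_drain", "balance_diff_orig")])
```

Requiring `oldbalanceOrg > 0` in `exact_drain` is a design choice in this sketch: it keeps accounts that were already empty from being counted as drained.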
5. Classification Model
5.1 Business Justification
A binary classifier is the operational core of real-time fraud screening. We compare Logistic Regression (interpretable, regulatory-friendly) against Random Forest (captures non-linear balance interactions). The winning model feeds into the transaction screening pipeline, scoring every CASH_OUT and TRANSFER before processing.
Random Forest is recommended for deployment. It achieves near-perfect AUC, correctly identifying virtually all fraud cases with minimal false positives. In a mobile money context, missing fraud causes direct financial loss — making high recall the primary objective. A decision threshold of 0.30 is recommended for production to maximise fraud recall while keeping manual review volumes manageable.
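The effect of lowering the decision threshold can be checked directly. The sketch below uses synthetic data and a logistic regression as a stand-in scorer (not the study's Random Forest) to compare recall, precision, and review volume at cut-offs of 0.30 and 0.50. The helper `threshold_metrics` is hypothetical.

```r
set.seed(42)

# Synthetic stand-in data: one noisy feature and a rare positive class.
n <- 5000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-4 + 1.5 * x))

fit  <- glm(y ~ x, family = binomial())
prob <- predict(fit, type = "response")

# Recall and precision of the positive class at a given cut-off.
threshold_metrics <- function(prob, y, threshold) {
  pred <- as.integer(prob >= threshold)
  tp <- sum(pred == 1 & y == 1)
  fp <- sum(pred == 1 & y == 0)
  fn <- sum(pred == 0 & y == 1)
  c(threshold = threshold,
    recall    = tp / (tp + fn),
    precision = if (tp + fp > 0) tp / (tp + fp) else NA_real_,
    flagged   = tp + fp)
}

metrics <- rbind(threshold_metrics(prob, y, 0.30),
                 threshold_metrics(prob, y, 0.50))
print(round(metrics, 3))
```

A lower threshold can only flag more transactions, so recall never falls, while the `flagged` column shows the extra manual review load that the 0.30 setting has to be weighed against.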
6. Model Explainability
6.1 Business Justification
In a regulated financial environment, fraud models cannot be black boxes. Compliance officers, auditors, and customers who dispute decisions require clear explanations. Variable importance identifies which transaction characteristics drive predictions, supporting model governance and regulatory compliance.
# Pick one true-positive fraud case from the test set
test_with_pred <- test_df %>%
  mutate(prob = rf_prob, pred = rf_pred)

tp_case <- test_with_pred %>%
  filter(isFraud == 1, pred == 1) %>%
  slice(1) %>%
  select(all_of(FEATURES))

# Predicted fraud probability for the selected case
case_prob <- predict(rf_model, newdata = tp_case, type = "prob")[, "1"]

# Baseline: prediction for a reference row with every feature at its test-set mean
mean_vals <- test_df %>%
  select(all_of(FEATURES)) %>%
  summarise(across(everything(), mean))
baseline_prob <- predict(rf_model, newdata = mean_vals, type = "prob")[, "1"]

# Approximate contribution per feature: swap one feature at a time into the
# reference row and record the change in predicted probability
contribs <- sapply(FEATURES, function(feat) {
  row_mod <- mean_vals
  row_mod[[feat]] <- tp_case[[feat]]
  predict(rf_model, newdata = row_mod, type = "prob")[, "1"] - baseline_prob
})

contrib_df <- data.frame(Feature = FEATURES,
                         Contribution = as.numeric(contribs)) %>%
  arrange(Contribution)

# Named fill values keep the colours correct even if all contributions share a sign
contrib_df %>%
  ggplot(aes(x = reorder(Feature, Contribution), y = Contribution,
             fill = Contribution > 0)) +
  geom_col(show.legend = FALSE) +
  geom_hline(yintercept = 0, color = "black", linewidth = 0.8) +
  scale_fill_manual(values = c(`FALSE` = GREEN, `TRUE` = RED)) +
  coord_flip() +
  labs(title = sprintf("Waterfall Explanation: Fraud Prob %.2f%% (Baseline %.2f%%)",
                       case_prob * 100, baseline_prob * 100),
       x = "", y = "Contribution to Fraud Probability") +
  theme_minimal(base_size = 12)
Feature contribution waterfall for one representative fraud transaction
6.4 Plain-Language Interpretation
The five most important fraud signals:
exact_drain: Fraudsters drain origin accounts to precisely zero in one transaction. This is the strongest single fraud signal.
newbalanceOrig: A post-transaction balance of zero is a strong fraud indicator in CASH_OUT and TRANSFER transactions.
balance_diff_orig: A large negative origin balance change flags potential account takeover.
oldbalanceOrg: Fraudsters target accounts with substantial pre-transaction balances.
type_enc: Transaction type is structurally decisive: no fraud is observed in PAYMENT, CASH_IN, or DEBIT transactions in this dataset.
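One way to cross-check a ranking like the one above is permutation importance: shuffle one feature, re-score, and measure how much discrimination is lost. The sketch below is a base-R illustration on synthetic data with a logistic regression stand-in. It is not the importance measure used in the study, and the feature names `signal` and `noise` are hypothetical.

```r
set.seed(1)

# Synthetic data: 'signal' drives the label, 'noise' does not.
n  <- 2000
df <- data.frame(signal = rnorm(n), noise = rnorm(n))
df$y <- rbinom(n, 1, plogis(-2 + 2.5 * df$signal))

fit <- glm(y ~ signal + noise, data = df, family = binomial())

# AUC via the rank-sum (Mann-Whitney) identity; base R only.
auc <- function(score, y) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

base_auc <- auc(predict(fit, newdata = df, type = "response"), df$y)

# Importance = drop in AUC after shuffling one column and re-scoring.
perm_importance <- sapply(c("signal", "noise"), function(feat) {
  shuffled <- df
  shuffled[[feat]] <- sample(shuffled[[feat]])
  base_auc - auc(predict(fit, newdata = shuffled, type = "response"), df$y)
})

print(round(perm_importance, 3))
```

A feature whose shuffling barely moves the AUC contributes little to the model's ranking of transactions, which gives auditors a model-agnostic check on the stated drivers.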
7. Transaction Segmentation (Clustering)
7.1 Business Justification
Not all transactions carry the same risk. K-Means clustering groups transactions by natural behavioural patterns — independent of the fraud label — enabling differentiated controls: high-risk clusters receive mandatory secondary authentication, low-risk clusters get straight-through processing.
Low-Risk Transactions: Predominantly PAYMENT and CASH_IN. Normal balance flows, no drain patterns. Recommended for straight-through processing.
High-Risk / Suspicious Transactions: Predominantly CASH_OUT and TRANSFER with high exact-drain and zero-balance rates. All transactions in this segment should trigger secondary authentication before processing.
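The segmentation step can be sketched as follows: cluster on standardised behavioural features without the fraud label, then bring the label back only to profile each cluster. The data are synthetic and the feature names (`amount`, `drain_rate`) and the choice of two clusters are illustrative assumptions.

```r
set.seed(7)

# Two synthetic behaviour groups (values are illustrative, not PaySim).
n_each <- 100
feats <- rbind(
  cbind(amount = rnorm(n_each, 0, 1),  drain_rate = rnorm(n_each, 0, 1)),
  cbind(amount = rnorm(n_each, 10, 1), drain_rate = rnorm(n_each, 10, 1))
)
is_fraud_like <- rep(c(0, 1), each = n_each)  # held out from clustering

# Standardise so no single feature dominates the Euclidean distance.
scaled <- scale(feats)

# Several random starts reduce the risk of a poor local optimum.
km <- kmeans(scaled, centers = 2, nstart = 10)

# Only now bring the label back in to profile each cluster.
profile <- table(cluster = km$cluster, label = is_fraud_like)
print(profile)
```

Cluster numbering is arbitrary, so the high-risk segment should be identified from the profile table rather than from the cluster index.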
8. Dimensionality Reduction (PCA)
8.1 Business Justification
With several correlated balance features, PCA reduces the space to uncorrelated principal components for visualisation and outlier detection. A biplot confirms cluster separation and communicates the transaction risk landscape to non-technical stakeholders.
Interpretation: PC1 separates high-value from low-value transactions (driven by balance and amount features). PC2 separates suspicious drain-pattern transactions from normal ones. The biplot confirms genuine cluster separation, validating the clustering result.
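A minimal sketch of the PCA step, assuming standardised numeric features and base R's `prcomp`. The data are synthetic; the column names echo PaySim, but the values and the correlation structure are invented for illustration.

```r
set.seed(3)

# Synthetic correlated balance-style features.
n <- 300
old_bal <- rexp(n, rate = 1 / 1000)
feats <- data.frame(
  oldbalanceOrg  = old_bal,
  newbalanceOrig = old_bal * runif(n, 0, 1),
  amount         = old_bal * runif(n, 0, 1) + rnorm(n, 0, 50)
)

# Centre and scale so each feature contributes on a comparable footing.
pca <- prcomp(feats, center = TRUE, scale. = TRUE)

var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))   # share of variance per component
print(round(pca$rotation, 2))    # loadings: which features drive each PC

scores <- pca$x[, 1:2]           # coordinates for a 2-D plot coloured by cluster
```

Reading the loadings in `pca$rotation` is what supports statements such as "PC1 is driven by balance and amount features"; the variance shares indicate how much of the picture a two-component plot retains.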
9. Time Series Analysis
9.1 Business Justification
Fraud volumes fluctuate hour by hour. Forecasting fraud frequency enables the operations team to staff review queues proactively, set dynamic transaction limits, and detect emerging fraud waves before they peak.
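The construction of the hourly series is not shown in this section. A minimal sketch, assuming the series holds fraud counts per PaySim `step` (one step per hour), is given below. The toy log is invented, and `hourly_fraud` is a hypothetical name.

```r
# Toy transaction log: 'step' is the hour index, 'isFraud' the label.
tx <- data.frame(
  step    = c(1, 1, 2, 4, 4, 4, 6),
  isFraud = c(0, 1, 0, 1, 1, 0, 0)
)

# Count fraud per hour over the full hour range so that quiet hours
# appear as 0 instead of being dropped from the series.
all_steps    <- seq_len(max(tx$step))
fraud_steps  <- factor(tx$step[tx$isFraud == 1], levels = all_steps)
hourly_fraud <- as.integer(table(fraud_steps))

print(hourly_fraud)

# A 24-hour seasonal frequency lets decomposition look for a daily cycle.
fraud_ts <- ts(hourly_fraud, frequency = 24)
```

Filling in zero-count hours matters: dropping them would compress the time axis and distort both the autocorrelation structure and any daily pattern.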
# Decomposition using decompose()
ts_obj <- ts(y_ts, frequency = 24)
decomp <- decompose(ts_obj, type = "additive")
autoplot(decomp) +
  labs(title = "Time Series Decomposition: Hourly Fraud Counts") +
  theme_minimal(base_size = 12)
cat(ifelse(t_result$p.value > 0.05,
           "-> Stationary in mean. d=0 (no differencing needed).\n",
           "-> Differencing recommended (d=1).\n"))
-> Stationary in mean. d=0 (no differencing needed).
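The object `t_result` is not defined in this section. One check that would be consistent with the printed message is a Welch t-test comparing the mean of the first half of the series with the second half; this is an assumption, not a confirmed description of the author's test. The helper `mean_shift_test` and both toy series are hypothetical.

```r
# ASSUMPTION: "stationary in mean" is checked by comparing the two halves.
mean_shift_test <- function(y) {
  half <- floor(length(y) / 2)
  t.test(y[seq_len(half)], y[(half + 1):length(y)])  # Welch test by default
}

stable  <- rep(c(0, 1), times = 100)                              # same level throughout
shifted <- c(rep(c(0, 1), times = 50), rep(c(5, 6), times = 50))  # level jumps midway

p_stable  <- mean_shift_test(stable)$p.value
p_shifted <- mean_shift_test(shifted)$p.value

cat(sprintf("stable: p = %.3f, shifted: p = %.3g\n", p_stable, p_shifted))
```

Under this reading, a large p-value means no detectable level shift, which matches the `p.value > 0.05` branch above. A t-test assumes independent observations, so a formal unit-root or stationarity test (ADF or KPSS) would be a stricter check on an autocorrelated series.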
# ACF and PACF
par(mfrow = c(1, 2))
acf(y_ts, lag.max = 36, main = "ACF: Hourly Fraud Count", col = BLUE, lwd = 2)
pacf(y_ts, lag.max = 36, main = "PACF: Hourly Fraud Count", col = ORANGE, lwd = 2)
ACF and PACF plots
par(mfrow =c(1, 1))
9.4 ARIMA Forecast — 3 Steps Ahead
# Fit ARIMA automatically
arima_model <- auto.arima(ts(y_ts, frequency = 24),
                          stepwise = TRUE, approximation = TRUE)
cat("=== ARIMA Model ===\n")
=== ARIMA Model ===
print(summary(arima_model))
Series: ts(y_ts, frequency = 24)
ARIMA(1,0,0) with non-zero mean

Coefficients:
         ar1    mean
      0.1205  0.5381
s.e.  0.0434  0.0325

sigma^2 = 0.4309:  log likelihood = -521.94
AIC=1049.89   AICc=1049.94   BIC=1062.67

Training set error measures:
                     ME      RMSE      MAE  MPE MAPE      MASE         ACF1
Training set 0.00013115 0.6551608 0.576782 -Inf  Inf 0.8900957 -0.002900605
3-step ahead ARIMA fraud forecast with prediction intervals
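The forecasting call itself is not shown here. The sketch below produces a 3-step forecast with approximate 95% intervals using base R's `arima` and `predict` on a simulated AR(1) series. It is a stand-in for the study's `auto.arima` workflow, not a reproduction of it, and the simulation parameters are only loosely based on the coefficients printed above.

```r
set.seed(11)

# Simulated AR(1) series standing in for hourly fraud counts (not the study data).
y <- as.numeric(arima.sim(model = list(ar = 0.12), n = 500)) + 0.54

# ARIMA(1,0,0) with a mean term, the structure reported in the model summary.
fit <- arima(y, order = c(1, 0, 0), include.mean = TRUE)

fc <- predict(fit, n.ahead = 3)

# Approximate 95% interval; counts cannot be negative, so the lower
# bound is floored at zero.
forecast_tbl <- data.frame(
  step_ahead = 1:3,
  point      = as.numeric(fc$pred),
  lower95    = pmax(0, as.numeric(fc$pred - 1.96 * fc$se)),
  upper95    = as.numeric(fc$pred + 1.96 * fc$se)
)
print(round(forecast_tbl, 3))
```

With a mean near 0.54 and a residual standard deviation near 0.66 (the square root of sigma^2 = 0.4309), a symmetric Gaussian interval extends below zero, which is why the sketch floors the lower bound. A count model would be a more principled alternative for staffing decisions.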
10. Integrated Findings
The five techniques collectively build a coherent and actionable fraud detection strategy. Classification demonstrated that fraud is almost perfectly predictable using balance-drain features, with Random Forest achieving near-perfect AUC. Explainability confirmed that exact_drain and newbalanceOrig dominate all other features, which suggests that even a simple business rule flagging accounts drained to zero would catch the majority of fraud. Clustering confirmed that transactions naturally segment into distinct risk groups based on type and balance behaviour, supporting a tiered authentication strategy. PCA confirmed genuine structural separation between segments in the reduced feature space. Time series analysis revealed irregular, weakly autocorrelated fraud spikes (AR(1) coefficient ≈ 0.12), which supports short-term operational forecasting but limits how much lead time the forecast can provide.
Single Integrated Recommendation: Deploy a two-stage real-time fraud screening system. Stage 1 — gate by transaction type: route PAYMENT, CASH_IN, and DEBIT to straight-through processing. Stage 2 — apply the Random Forest classifier to every CASH_OUT and TRANSFER: transactions with predicted fraud probability above 0.30 are held for review or auto-declined. Fraud operations staffing should be adjusted dynamically using the ARIMA hourly forecast.
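The two-stage routing described above can be sketched as a single vectorised function. The function name `screen_transaction`, the decision labels, and the choice to hold (not auto-decline) flagged transactions are illustrative assumptions; the fraud probability is taken as an input that would come from the deployed classifier.

```r
# Types routed straight through in Stage 1, per the recommendation above.
LOW_RISK_TYPES <- c("PAYMENT", "CASH_IN", "DEBIT")

# Stage 1: gate by type. Stage 2: compare the classifier score to the threshold.
screen_transaction <- function(type, fraud_prob, threshold = 0.30) {
  ifelse(type %in% LOW_RISK_TYPES, "straight_through",
         ifelse(fraud_prob > threshold, "hold_for_review", "approve"))
}

decisions <- screen_transaction(
  type       = c("PAYMENT", "CASH_OUT", "TRANSFER", "DEBIT"),
  fraud_prob = c(NA, 0.85, 0.10, NA)   # low-risk types are never scored
)
print(decisions)
```

Gating first means the classifier only has to score CASH_OUT and TRANSFER traffic, which reduces scoring load and keeps the model focused on the transaction types where fraud was observed.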
11. Limitations & Further Work
Synthetic data: PaySim is calibrated against real transactions but is not actual transaction data. Production deployment requires retraining on live labelled logs.
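Class imbalance: The 0.14% fraud rate was handled with class weights. SMOTE oversampling is an alternative that may improve the precision-recall trade-off and should be evaluated on held-out data.
Model complexity: XGBoost may match Random Forest performance and could offer faster inference for real-time scoring at scale.
SHAP values: The contribution plot changes one feature at a time against a mean-valued reference row and ignores feature interactions, so it only roughly approximates Shapley values. Production systems should compute Shapley-based explanations with a dedicated package such as shapr.
Time series: The auto.arima model is a reasonable baseline, but it selected a non-seasonal ARIMA(1,0,0) with a small AR coefficient (0.12), so it captures little intraday structure. A model with explicit seasonal terms (seasonal ARIMA) or Prophet may capture intraday fraud patterns better.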
Network features: Graph-based features (shared destination accounts, transaction velocity) would substantially improve recall on coordinated fraud rings.
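For the class-imbalance point above, the exact weighting scheme used in the study is not stated. A common choice is inverse-frequency weighting, sketched below on toy labels; the counts are illustrative and the scheme is an assumption.

```r
# Toy labels with a rare positive class (counts are illustrative).
y <- c(rep(0, 995), rep(1, 5))

# Inverse-frequency weights: each class contributes the same total weight.
n_total <- length(y)
n_pos   <- sum(y == 1)
n_neg   <- sum(y == 0)
w <- ifelse(y == 1, n_total / (2 * n_pos), n_total / (2 * n_neg))

cat("weight per fraud row:", unique(w[y == 1]), "\n")
cat("weight per legit row:", round(unique(w[y == 0]), 4), "\n")
```

Weights like these can be passed to a model's case-weight or class-weight option, for example the `weights` argument of `glm`. Unlike SMOTE, this leaves the training rows untouched and only changes how much each row counts in the loss.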
References
Lopez-Rojas, E. A., Elmir, A., & Axelsson, S. (2016). PaySim: A financial mobile money simulator for fraud detection. EUROPAM 2016.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning (2nd ed.). Springer.
Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd ed.). OTexts.
Kuhn, M. (2008). Building predictive models in R using the caret package. Journal of Statistical Software, 28(5).
Appendix: AI Usage Statement
Claude (Anthropic) assisted with structuring the Quarto document and providing R code templates. All analytical decisions — feature selection, model hyperparameters, cluster interpretation, time series specification, and business recommendations — were reviewed, validated, and adapted independently by the author. The author accepts full responsibility for the content and conclusions of this submission.