Ensemble-based Predictive Model for Financial Fraud Detection

V.O. Olaleye; O.A. Odeniyi; B.K. Alese

Research Article

International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 12 - Number 42

Year of Publication: 2024

DOI: 10.5120/ijais2024451961

V.O. Olaleye, O.A. Odeniyi, B.K. Alese. Ensemble-based Predictive Model for Financial Fraud Detection. International Journal of Applied Information Systems. 12, 42 (Jan 2024), 54-62. DOI=10.5120/ijais2024451961

@article{10.5120/ijais2024451961,
  author = {V.O. Olaleye and O.A. Odeniyi and B.K. Alese},
  title = {Ensemble-based Predictive Model for Financial Fraud Detection},
  journal = {International Journal of Applied Information Systems},
  issue_date = {Jan 2024},
  volume = {12},
  number = {42},
  month = {Jan},
  year = {2024},
  issn = {2249-0868},
  pages = {54-62},
  numpages = {9},
  url = {https://www.ijais.org/archives/volume12/number42/ensemble-based-predictive-model-for-financial-fraud-detection/},
  doi = {10.5120/ijais2024451961},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-01-27T22:32:21.391180+05:30
%A V.O. Olaleye
%A O.A. Odeniyi
%A B.K. Alese
%T Ensemble-based Predictive Model for Financial Fraud Detection
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 12
%N 42
%P 54-62
%D 2024
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The financial industry remains a persistent target for fraudulent activity. Research in this area is hampered by data privacy concerns and by the scarcity of publicly available datasets that contain instances of fraud. Researchers and practitioners have proposed various fraud detection techniques, applying diverse algorithms to uncover fraudulent patterns. To address the data scarcity, this study introduces a synthetic fraud dataset of about 2.5 million transactions covering five distinct fraud scenarios. The primary objective is to analyze account transaction behaviour in a financial dataset. The authors propose an ensemble of three gradient boosting algorithms: CatBoost, Extreme Gradient Boosting (XGBoost), and LightGBM. The models developed demonstrate promising results, with several achieving an average Area Under the Curve (AUC) exceeding 0.9 and the ensemble reaching a predictive accuracy of 98.60%. Further evaluation through an application programming interface indicates a response time of less than 300 milliseconds and efficient memory usage, which suggests the approach is practical for real-world use.
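
The abstract does not state how the three models are combined, so the following is only a minimal sketch, assuming soft voting (averaging each model's predicted fraud probability). The probability lists and labels are hypothetical toy values, not results from the paper, and the helper names soft_vote, accuracy, and roc_auc are introduced here for illustration. It shows how an averaged score can be evaluated with the two metrics the abstract reports, accuracy and AUC.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average per-transaction fraud probabilities across models (soft voting)."""
    return [mean(sample) for sample in zip(*prob_lists)]


def accuracy(y_true, y_prob, threshold=0.5):
    """Share of transactions whose thresholded score matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random fraud case outscores a random
    legitimate one, with ties counted as half a win."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = fraud) and per-model probabilities for six
# transactions; in practice these would come from the trained models.
y_true = [0, 0, 1, 0, 1, 0]
catboost_probs = [0.10, 0.40, 0.80, 0.20, 0.45, 0.05]
xgboost_probs = [0.20, 0.30, 0.90, 0.10, 0.60, 0.10]
lightgbm_probs = [0.15, 0.55, 0.70, 0.30, 0.75, 0.15]

ensemble_probs = soft_vote([catboost_probs, xgboost_probs, lightgbm_probs])
print("ensemble accuracy:", accuracy(y_true, ensemble_probs))
print("ensemble AUC:", roc_auc(y_true, ensemble_probs))
```

In this toy data the hypothetical CatBoost and LightGBM scores each misclassify one transaction at the 0.5 threshold, while the averaged score classifies all six correctly. This illustrates why an ensemble can outperform its members, but it says nothing about the figures reported in the study.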

Index Terms

Computer Science

Information Sciences

Data mining

Fraud Detection

Financial Industry

Keywords

Machine Learning, Synthetic Data, Financial Fraud, Ensemble Learning, Gradient Boosting