Addressing Class Imbalance in Credit Card Fraud Detection with Ensemble Learning and Domain-Specific Feature Engineering

DOI: https://doi.org/jobasr

Olaniran, S. F.

Lawal, M. A.

Abstract
Credit card fraud detection remains a critical challenge due to severe class imbalance, evolving attack strategies, and the trade-off between recall and precision. This study evaluates the performance of supervised algorithms and ensemble methods (Random Forest, Gradient Boosting Machines (GBM), and Stacking) on a real-world transaction dataset enhanced with temporal, behavioral, and geographic features. A quantitative experimental design was employed, incorporating domain-specific feature engineering and the Synthetic Minority Oversampling Technique (SMOTE) to address imbalance. Models were assessed using precision, recall, F1-score, balanced accuracy, and ROC-AUC. Results show that ensemble models consistently outperformed single classifiers. GBM achieved the highest recall (89.37%), balanced accuracy (94.47%), and ROC-AUC (99.52%) on the imbalanced dataset with engineered features, making it highly effective for minimizing undetected fraud, while Stacking delivered the highest precision (95.58%), accuracy (98.90%), and F1-score (92.07%), highlighting its value in reducing false positives. Feature engineering substantially improved recall and balanced accuracy in imbalanced scenarios, while SMOTE enhanced recall for simpler models but sometimes reduced precision. Overall, GBM with engineered features is best suited for real-time fraud screening where recall is critical, whereas Stacking is more appropriate for contexts requiring equal emphasis on recall and precision. These findings underscore the operational value of combining ensemble learning, targeted feature engineering, and imbalance handling to strengthen fraud detection in highly skewed datasets, offering practical guidance for financial institutions seeking more reliable fraud prevention systems.
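As an illustration of why the abstract reports recall and balanced accuracy alongside plain accuracy, the following minimal sketch computes the threshold-based metrics from confusion-matrix counts. The counts, the `ConfusionCounts` class, and the `fraud_metrics` helper are hypothetical and chosen only for illustration; they are not taken from the study's data or code. ROC-AUC is omitted because it requires ranked model scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical confusion-matrix counts, with fraud as the positive class."""
    tp: int  # fraudulent transactions correctly flagged
    fp: int  # legitimate transactions wrongly flagged
    fn: int  # fraudulent transactions missed
    tn: int  # legitimate transactions correctly passed


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def fraud_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall, F1, and balanced accuracy."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)        # true positive rate
    specificity = safe_div(c.tn, c.tn + c.fp)   # true negative rate
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Illustrative counts only: 10,000 transactions, 100 of them fraudulent.
always_legit = ConfusionCounts(tp=0, fp=0, fn=100, tn=9900)  # flags nothing
detector = ConfusionCounts(tp=80, fp=20, fn=20, tn=9880)     # catches 80 of 100

for name, counts in [("always_legit", always_legit), ("detector", detector)]:
    summary = ", ".join(f"{k}={v:.4f}" for k, v in fraud_metrics(counts).items())
    print(f"{name}: {summary}")
```

With these assumed counts, a model that never flags fraud still reaches 99% accuracy while its recall is 0 and its balanced accuracy is 0.5, which is why accuracy alone is a weak yardstick on highly skewed data.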