Data Pre-processing challenges with generic and domain specific solutions: a critical review

Usman Abdullahi Adam; Baffa Sani Mahmoud; Ismaila Ibrahim Adamu

doi:10.4314/jobasr.v4i2.42

Keywords:

Data preprocessing, Machine learning, Generic and Domain-specific Solutions

Abstract

Data pre-processing is a critical phase in the machine learning and data analysis lifecycle, significantly influencing model accuracy, efficiency, and reliability. While numerous standard techniques such as normalization, encoding, and missing value imputation are widely used, existing literature provides limited guidance on how to address complex, context-dependent challenges that require non-generic solutions. This gap creates uncertainty for practitioners when selecting appropriate preprocessing strategies across diverse data scenarios. This study aims to critically review and systematically categorize data pre-processing challenges by distinguishing between those effectively addressed using generic techniques and those requiring domain-specific or context-aware approaches. A systematic literature review methodology was adopted, synthesizing findings from academic research and industry practices across multiple data modalities, including tabular, textual, image, and time-series data. The findings reveal that generic techniques are effective for routine data issues but are insufficient for handling semantic inconsistencies, complex feature interactions, and context-driven anomalies. To address this, the study proposes a structured, decision-oriented framework that guides practitioners in evaluating data characteristics, identifying preprocessing challenges, and selecting appropriate strategies. This work contributes a practical and unified approach that enhances decision-making in data pre-processing, ultimately improving the robustness, interpretability, and performance of machine learning models.
