Multiobjective Optimization Solution for the Selection of Quasi Equally Informative Subsets in Classification Models
Abubakar I. Safyan
Zaharaddeen Sani
Mukhtar Abubakar
Abstract
Feature selection is crucial in machine learning, particularly for high-dimensional data. This study presents two multi-objective techniques, Improved Wrapper QEISS (IW-QEISS) and Improved Filter QEISS (IF-QEISS), designed to identify multiple quasi-equally informative feature subsets. Unlike traditional methods, which focus solely on accuracy and subset size, our approach enhances robustness and interpretability. Using a four-objective NSGA-II framework with a population of 100 individuals evolved over 100 generations, we optimize objectives that include accuracy, redundancy, and feature importance (threshold = 0.05). In our experiments, IW-QEISS identified seven subsets of cardinality four on the Heart dataset, achieving an accuracy of 0.836, on par with W-MOSS. IF-QEISS offered similar accuracy at reduced computational cost. These results support the efficiency and effectiveness of the proposed methods.
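The two ideas the abstract relies on, Pareto non-dominance across several objectives and the selection of quasi-equally informative subsets, can be illustrated with a minimal sketch. This is not the authors' IW-QEISS or IF-QEISS implementation. The candidate names, the objective values, the assumed fourth objective (subset cardinality), and the tolerance `delta` are all hypothetical and chosen only for illustration; in particular, `delta` is not claimed to be the 0.05 threshold mentioned above.

```python
from typing import List, Sequence, Tuple

# Each candidate subset is scored on objectives that are ALL minimized.
# Assumed (hypothetical) ordering: (error, cardinality, redundancy, -importance).
Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: Sequence[Objectives]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def quasi_equally_informative(accuracies: Sequence[float], delta: float) -> List[int]:
    """Return indices whose accuracy is within delta of the best accuracy."""
    best = max(accuracies)
    return [i for i, acc in enumerate(accuracies) if best - acc <= delta]


# Toy candidates with made-up scores (not results from the study).
names = ["A", "B", "C", "D"]
points = [
    (0.16, 4, 0.30, -0.40),  # A
    (0.17, 4, 0.25, -0.42),  # B: slightly worse error, lower redundancy
    (0.20, 5, 0.35, -0.30),  # C: worse than A on every objective
    (0.30, 2, 0.10, -0.20),  # D: smallest subset, much higher error
]

front = non_dominated(points)  # indices on the Pareto front
front_accuracy = [1.0 - points[i][0] for i in front]
keep = quasi_equally_informative(front_accuracy, delta=0.05)  # delta is hypothetical
selected = [names[front[i]] for i in keep]

print("Pareto front:", [names[i] for i in front])
print("Quasi-equally informative:", selected)
```

In this toy run, C drops out because A is at least as good on every objective, and D stays on the front but is excluded from the final set because its accuracy falls more than `delta` below the best. A full NSGA-II run would generate and evolve the candidate subsets rather than take them as a fixed list.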