Multiobjective Optimization Solution for the Selection of Quasi Equally Informative Subsets in Classification Models
DOI: https://doi.org/10.33003/jobasr
Abubakar I. Safyan
Zaharaddeen Sani
Mukhtar Abubakar
Abstract
Feature selection is crucial in machine learning, particularly for high-dimensional data. This study presents two multi-objective techniques, Improved Wrapper QEISS (IW-QEISS) and Improved Filter QEISS (IF-QEISS), designed to identify multiple quasi-equally informative feature subsets. Whereas traditional methods focus solely on accuracy and subset size, our approach also improves robustness and interpretability. Using a four-objective NSGA-II framework with a population of 100 run for 100 generations, we optimize objectives that include accuracy, redundancy, and feature importance (threshold = 0.05). In experiments on the Heart dataset, IW-QEISS identified seven subsets of cardinality four, achieving an accuracy of 0.836, on par with W-MOSS. IF-QEISS achieved similar accuracy at lower computational cost. These results support the efficiency and effectiveness of the proposed methods.
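The idea of keeping several near-equivalent subsets rather than one winner can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy numbers, and the choice of four objective columns (classification error, subset size, redundancy, negated importance, all minimized) are assumptions made for illustration, since the abstract names only accuracy, redundancy, and feature importance. The 0.05 tolerance is likewise a placeholder and is not claimed to play the same role as the threshold quoted in the abstract.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: Sequence[Sequence[float]]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def quasi_equal(accuracies: Sequence[float], tolerance: float) -> List[int]:
    """Indices whose accuracy lies within `tolerance` of the best accuracy."""
    best = max(accuracies)
    return [i for i, acc in enumerate(accuracies) if best - acc <= tolerance]


# Hypothetical candidate subsets as (error, size, redundancy, -importance).
candidates = [
    (0.20, 4, 0.20, -0.30),
    (0.21, 4, 0.15, -0.28),
    (0.30, 6, 0.30, -0.20),  # worse than the first candidate on every objective
    (0.18, 7, 0.40, -0.35),
]

front = non_dominated(candidates)
front_accuracies = [1.0 - candidates[i][0] for i in front]
# Map positions within the front back to indices in `candidates`.
kept = [front[k] for k in quasi_equal(front_accuracies, tolerance=0.05)]
print("non-dominated:", front)
print("quasi-equally accurate:", kept)
```

In a full evolutionary run, the non-dominated filter would be applied by the optimizer itself; the sketch only shows how a set of trade-off solutions can be narrowed to those with comparable accuracy.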