Crop Recommendation Predictive Analysis using Ensembling Techniques

DOI: https://doi.org/10.33003/jobasr-2023-v1i1-19

Muhammad Umar Abdullahi

Morufu Olalere

Gilbert I. O. Aimufua

Kene Tochukwu Anyachebelu

Bako Halilu Egga

Abstract
Crop recommendation systems play a crucial role in modern agriculture by helping farmers make well-informed choices that optimize crop yield and resource utilization. Ensemble learning, which combines the predictions of multiple models, can significantly improve the effectiveness of such systems. In this paper, a complete machine learning pipeline is used to evaluate the performance of ensemble learning models on crop recommendation tasks. Four ensemble learning methods, Bagging, Voting, Stacking, and One-Vs-Rest (OVR), are trained as separate classifiers on a diverse dataset that includes agricultural factors such as soil characteristics, meteorological conditions, and past crop productivity. Each model is evaluated using accuracy, precision, recall, F1-score, and support. Bagging proves the most effective technique, reaching an accuracy of 99.32% with precision, recall, and F1-score values of 0.99, 1.00, and 1.00, respectively. The support value, which represents the number of instances used for evaluation, is 141. These results offer guidance on choosing appropriate ensemble learning models for crop recommendation tasks and can help farmers and other agricultural stakeholders make well-informed, data-driven choices, resulting in enhanced agricultural output and sustainability.
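To make the idea of combining forecasts and the reported metrics concrete, the following is a minimal sketch in plain Python, not the authors' pipeline. It illustrates hard (majority) voting, one of the four methods named above, together with accuracy, precision, recall, F1-score, and support for a single class. The crop labels, the three model outputs, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by hard (majority) voting."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_report(y_true, y_pred, label):
    """Precision, recall, F1-score, and support for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Support is the number of true instances of the class
    return {"precision": precision, "recall": recall,
            "f1": f1, "support": tp + fn}


# Hypothetical toy data: true crops and the outputs of three imperfect models
y_true = ["rice", "maize", "rice", "cotton", "maize", "rice"]
model_a = ["rice", "maize", "maize", "cotton", "maize", "rice"]
model_b = ["rice", "rice", "rice", "cotton", "maize", "maize"]
model_c = ["maize", "maize", "rice", "cotton", "cotton", "rice"]

y_vote = majority_vote([model_a, model_b, model_c])

for name, pred in [("A", model_a), ("B", model_b),
                   ("C", model_c), ("vote", y_vote)]:
    print(name, round(accuracy(y_true, pred), 3),
          class_report(y_true, pred, "rice"))
```

In this toy case each model makes at least one mistake, but the mistakes fall on different instances, so the majority vote recovers every true label. Real ensembles rarely behave this cleanly; the sketch only shows the mechanism and how each metric is derived from the counts of true positives, false positives, and false negatives.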