Crop Recommendation Predictive Analysis using Ensembling Techniques
DOI: https://doi.org/10.33003/jobasr-2023-v1i1-19
Muhammad Umar Abdullahi
Morufu Olalere
Gilbert I. O. Aimufua
Kene Tochukwu Anyachebelu
Bako Halilu Egga
Abstract
Crop recommendation systems play a crucial role in modern agriculture by helping
farmers make well-informed choices that optimize crop yield and resource
utilization. Ensemble learning, which combines the predictions of multiple
models, can significantly improve the effectiveness of such systems. In this
paper, a complete machine learning pipeline is used to evaluate the performance
of ensemble learning models on crop recommendation tasks. Four ensemble learning
methods, Bagging, Voting, Stacking, and One-Vs-Rest (OVR), are selected and
trained as separate classifiers on a diverse dataset. The dataset includes
agricultural factors such as soil characteristics, meteorological conditions,
and past crop productivity. Each model is evaluated using accuracy, precision,
recall, and F1-score, reported together with support. Bagging emerges as the
most effective ensemble learning technique: it reaches an accuracy of 99.32%,
with precision, recall, and F1-score values of 0.99, 1.00, and 1.00,
respectively. The support value, which represents the number of instances used
for evaluation, is 141. This study offers guidance on choosing appropriate
ensemble learning models for crop recommendation tasks, enabling farmers and
other agricultural stakeholders to make data-driven decisions that support
higher agricultural output and sustainability.
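The comparison described in the abstract can be sketched with scikit-learn. This is a minimal illustration, not the authors' pipeline: the data is a synthetic stand-in generated with `make_classification`, and the base learners (decision tree, k-nearest neighbours, logistic regression), the hyperparameters, and the 80/20 split are all assumptions chosen for the example. It trains the four classifiers named above and prints accuracy together with per-class precision, recall, F1-score, and support.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for a crop dataset (7 numeric features,
# 4 "crop" classes). The real dataset is not reproduced here.
X, y = make_classification(
    n_samples=600, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def base_learners():
    """Hypothetical base learners; the paper's choices may differ."""
    return [
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("logreg", LogisticRegression(max_iter=1000)),
    ]


models = {
    # Bagging: many trees, each fit on a bootstrap sample of the training set.
    "Bagging": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=50, random_state=0
    ),
    # Voting: average the predicted class probabilities of the base learners.
    "Voting": VotingClassifier(estimators=base_learners(), voting="soft"),
    # Stacking: a meta-learner is trained on the base learners' predictions.
    "Stacking": StackingClassifier(
        estimators=base_learners(),
        final_estimator=LogisticRegression(max_iter=1000),
    ),
    # One-Vs-Rest: one binary classifier per class.
    "One-Vs-Rest": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = accuracy_score(y_test, y_pred)
    print(f"== {name}: accuracy = {results[name]:.4f}")
    # Per-class precision, recall, F1-score, and support (instances per class).
    print(classification_report(y_test, y_pred, digits=2))
```

In the report, the support column counts the test instances in each class, so its total equals the size of the held-out set. The scores printed for the synthetic data are not comparable to the figures reported in the abstract.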