Predicting grain yield in a novel soft winter wheat (Triticum aestivum L.) population using spike traits and machine learning approaches


Citation: Predicting grain yield in a novel soft winter wheat (Triticum aestivum L.) population using spike traits and machine learning approaches. Res. Crop. 26: 429-437.
GHEBRIEL O DEKIN, VALERY A BURLUTSKY AND TUMUZGHI TESFAY ghebrielokba@gmail.com
Address : RUDN University, 6 Miklukho-Maklaya St, 117198, Moscow, Russian Federation
Submitted Date : 28-06-2025
Accepted Date : 30-07-2025

Abstract

Accurate yield prediction in soft winter wheat breeding requires precise quantification of spike architecture traits, which are key determinants of grain production. This study aimed to predict wheat grain yield with machine learning, using spike attributes as input variables. It was conducted on a novel hybrid population of winter wheat comprising 6,999 spikes, cultivated at Kaluga, Moscow, Russia, during the 2021-22 study period. Spike morphology traits (length, weight, spikelet counts) and derived metrics (density distributions, fertility percentage, attributive value, and grain yield coefficient) were used as inputs to three machine learning models: partial least squares (PLS), random forest (RF), and gradient boosting (GB). Modelling was done in Python 3.11.7. The dataset was randomly split into training (70%) and testing (30%) subsets; the models were fitted on the training subset and then used to make predictions. Model performance was evaluated using mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), R², and ratio of performance to deviation (RPD). Grain weight was positively correlated with density-based traits (spike weight, r = 0.594; grain number, r = 0.353; grain weight density, r = 0.617) and with physiological traits such as fertility percentage (r = 0.523) and attributive value (r = 0.674). In predictive modelling, PLS performed best (RMSE = 0.1110, R² = 0.8957), followed by RF and then GB. RF prioritised structural traits such as spike length and spikelet number, whereas GB emphasised physiological traits such as attributive value. The findings highlight the importance of integrating both structural and physiological trait optimisation in wheat breeding.
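
The evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the random seed, and the four example values are hypothetical, and RPD is assumed here to be the sample standard deviation of the observed values divided by RMSE, which is one common convention. Model fitting (PLS, RF, GB) is deliberately left out; the sketch only shows a 70/30 random split and the five reported metrics.

```python
import math
import random
import statistics


def train_test_split_indices(n, test_fraction=0.30, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE, R2 and RPD for paired observed/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot
    # Assumed RPD definition: sample SD of observed values divided by RMSE.
    rpd = statistics.stdev(y_true) / rmse
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "R2": r2, "RPD": rpd}


# Hypothetical grain weights (g per spike), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)

# Split sizes for a dataset of 6,999 spikes, as in the abstract.
train_idx, test_idx = train_test_split_indices(6999)
```

With 6,999 spikes, a 30% test fraction rounds to 2,100 test spikes and leaves 4,899 for training. RPD and R² both compare prediction error with the spread of the observed values, so they tend to rank models in the same order on a given test set.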

Keywords

Crop yield prediction, machine learning, partial least squares, winter wheat, spike morphology

