Stacking Ensemble Machine Learning Modelling for Milk Yield Prediction Based on Biological Characteristics and Feeding Strategies

Ruiming Xing; Baihua Li; Shirin Dora; Michael Whittaker; Janette Mathie

DOI: http://dx.doi.org/10.15439/2024F9318

Citation: Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 39, pages 701–706 (2024)

Abstract. Knowing the expected milk yield helps dairy farmers make better decisions and manage their herds. The objective of this study was to build and compare predictive models that forecast daily milk yield over a long duration. A machine-learning pipeline was presented, and five baseline models together with a novel stacking model were developed to predict milk yield on the CowNflow dataset, using 414 Holstein cattle records collected from 1983 to 2019. Four feature selection methods were applied to identify the biological characteristics and feeding-related features that most affect milk yield. The overall performance of the predictive models improved after proper feature selection, with $R^{2}$ increasing to 0.811 and the root mean squared error (RMSE) decreasing to 3.627. The stacking model achieved the best performance, with an $R^{2}$ of 0.85, a mean absolute error (MAE) of 2.537 and an RMSE of 3.236. This research provides benchmark results for milk yield prediction on the CowNflow dataset and identifies useful factors, such as dry matter (DM) intake and lactation month, for long-term milk yield prediction.
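
The three metrics reported in the abstract ($R^{2}$, MAE and RMSE) have standard definitions, and a small worked example may help in reading the reported values. The sketch below is plain Python and is not the authors' evaluation code; the function names and the four observed and predicted yields are made up for illustration and do not come from the CowNflow dataset.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical daily milk yields (illustrative numbers only).
observed = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 29.0, 37.0]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R^2  = {r2(observed, predicted):.3f}")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same predictions, which is consistent with the stacking model's reported RMSE of 3.236 exceeding its MAE of 2.537.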
