## An Approach for Predicting the Costs of Forwarding Contracts using Gradient Boosting

### Haitao Xiao, Yuling Liu, Dan Du, Zhigang Lu

DOI: http://dx.doi.org/10.15439/2022F292

Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 451–454 (2022)

Abstract. Predicting the cost of a forwarding contract is a major challenge for road transport management systems. The transportation cost of a forwarding contract depends on many factors, and it is hard for humans to weigh all of them and calculate the cost. In this paper, we address this problem with a sequence of machine learning steps: data analysis, feature engineering, and model construction. First, we conduct a detailed analysis of the given data. Then, we generate effective features that characterize the cost of a forwarding contract and eliminate redundant ones. Finally, in the model construction phase, we propose a gradient boosting decision tree (GBDT) based method to train on the data and predict the cost of forwarding contracts. The proposed approach achieves an RMSE of 0.1391 on the test set, which ranks 2nd among the final scores in the competition.
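To make the model construction step and the RMSE metric concrete, the following is a minimal sketch of gradient boosting for regression under squared error. It is not the authors' implementation: it uses depth-1 trees (stumps) instead of full decision trees, runs on synthetic data rather than the competition data, and every name in it (`fit_gbdt`, `fit_stump`, the two toy features) is a hypothetical chosen for illustration.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree: the best single-feature threshold split."""
    n, d = len(X), len(X[0])
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_value, right_value)
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Maximizing this quantity is equivalent to minimizing squared error.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    if best is None:  # all features constant: fall back to predicting the mean
        mean = total / n
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    feature, threshold, left_value, right_value = stump
    return left_value if x[feature] <= threshold else right_value


def fit_gbdt(X, y, n_rounds=100, learning_rate=0.1):
    """Boosting loop: each stump is fit to the residuals of the current ensemble."""
    base = sum(y) / len(y)  # initial prediction is the mean target
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, X)]
    return base, learning_rate, stumps


def predict_gbdt(model, X):
    base, learning_rate, stumps = model
    return [base + learning_rate * sum(predict_stump(s, x) for s in stumps) for x in X]


# Synthetic stand-in data (assumption): two numeric features and a noisy target.
random.seed(0)
X = [[random.uniform(0, 10), random.uniform(0, 5)] for _ in range(200)]
y = [2.0 * a + (3.0 if b > 2.5 else 0.0) + random.gauss(0, 0.1) for a, b in X]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

model = fit_gbdt(X_train, y_train)
baseline_rmse = rmse(y_test, [model[0]] * len(y_test))  # always predict the train mean
model_rmse = rmse(y_test, predict_gbdt(model, X_test))
print(f"baseline RMSE: {baseline_rmse:.4f}, boosted RMSE: {model_rmse:.4f}")
```

In practice a library implementation of gradient boosted trees would replace the hand-written loop; the sketch only shows how each round fits a new tree to the residuals of the current ensemble and how the held-out RMSE is computed.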
