Application of Diversified Ensemble Learning in Real-life Business Problems: The Case of Predicting Costs of Forwarding Contracts

Milena Trajanoska; Pavel Gjorgovski; Eftim Zdravevski

DOI: http://dx.doi.org/10.15439/2022F297

Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 437–446 (2022)

Abstract. Finding an optimal machine learning model that can be applied to a business problem is a complex challenge that requires balancing multiple requirements, including high predictive performance, continuous learning and deployment, and explainability of the predictions. The topic of the FedCSIS 2022 Challenge, 'Predicting the Costs of Forwarding Contracts', is related to the challenges that logistics and transportation companies are facing. To tackle this challenge, we used the provided datasets to establish a complete machine learning framework that includes domain-specific feature engineering and enrichment, generic feature transformation and extraction, model hyper-parameter tuning, and the creation of ensembles of traditional and deep learning models. Our contributions additionally include an analysis of the types of models that are suitable for predicting a multimodal continuous target variable, as well as an explainability analysis of the features that have the largest impact on predicting these costs. We further show that ensembles created by combining multiple different models trained with different algorithms can improve performance on unseen data. On this particular dataset, the experiments showed that such a combination improves the score by 3% compared to the best-performing individual model.
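The idea of combining models trained with different algorithms can be illustrated with a minimal sketch. It assumes a simple (optionally weighted) average of per-model predictions scored with RMSE; the abstract does not state the combination rule or the metric, so both are assumptions here. The function names and the toy numbers are hypothetical and are not taken from the paper or the challenge data.

```python
import math
from typing import List, Optional, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_ensemble(
    predictions: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Combine per-model predictions with a (weighted) average.

    predictions[m][i] is the prediction of model m for sample i.
    """
    if weights is None:
        weights = [1.0] * len(predictions)  # plain average by default
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Toy, made-up values (assumption: NOT data from the paper).
y_true = [100.0, 200.0, 300.0, 400.0]
tree_preds = [110.0, 190.0, 320.0, 390.0]  # hypothetical tree-based model
nn_preds = [95.0, 215.0, 290.0, 405.0]     # hypothetical neural network

blend = average_ensemble([tree_preds, nn_preds])

print("tree RMSE :", rmse(y_true, tree_preds))
print("nn RMSE   :", rmse(y_true, nn_preds))
print("blend RMSE:", rmse(y_true, blend))
```

In this toy case the two models err in different directions on most samples, so their average has a lower error than either model alone. That is the intuition behind diversifying the members of an ensemble; it is not a guarantee, since models with strongly correlated errors gain little from averaging.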
