Exploring Stability and Performance of hybrid Gradient Boosting Classification and Regression Models in Sectors Stock Trend Prediction: A Tale of Preliminary Success and Final Challenge
Ming Liu, Ling Cen, Dymitr Ruta, Quang Hieu Vu
DOI: http://dx.doi.org/10.15439/2024F2428
Citation: Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 39, pages 761–766 (2024)
Abstract. In the dynamic field of financial analytics, the ability to predict stock market trends is crucial for effective trading strategies; this is the task of the FedCSIS 2024 Data Science Challenge: Predicting Stock Trends. This paper presents a comprehensive study on the use of hybrid gradient boosting models, incorporating both classification and regression approaches, to forecast stock trends across different sectors of the S&P 500. Utilizing a rich dataset comprising key financial indicators for 300 companies over a decade, our research aims to unravel the complexities of sector-specific trend predictions. The model leverages 58 financial indicators per company, along with their 1-year change metrics, to predict future stock movements. In the preliminary phase of the competition, our hybrid model demonstrated promising results, achieving a score of 0.5941 and ranking first among competitors. However, despite this initial success, the final phase of the model evaluation revealed a decline in performance, with a score of 0.841500. This discrepancy highlights potential issues in model stability and generalizability when transitioning from a controlled to a more varied testing environment. This work not only underscores the complexities of predictive modeling in finance but also sets the stage for future research into creating more resilient AI-driven trading systems.