Searching Stable Solutions For Stock Predictions: A Stacking Approach
Ty Gross, Arthur Allebrandt Werlang, Apeksha Poudel, Julian Roß
DOI: http://dx.doi.org/10.15439/2024F6695
Citation: Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 39, pages 745–749 (2024)
Abstract. The goal of the competition is to predict one of three positions (hold, sell, or buy) for stocks of companies in the S&P 500. First, the data is read in and missing values are imputed with the median; categorical features are one-hot encoded. A classification approach relying mainly on tree-based methods is used. The models are HistGradientBoosting, XGBoost, an MLP, and an SVC, whose hyperparameters are tuned via grid search. For stacking, the models' predictions are summed and the sum is mapped back onto the three positions. The result turns out to be somewhat overfitted to the competition's test data, which is unsurprising in a competition setting. Stacking improves the score considerably. In conclusion, machine learning models can point in the right direction when handling stocks, but they fall short of giving sound financial advice.
References
- A. M. Rakicevic, P. D. Milosevic, I. T. Dragovic, A. M. Poledica, M. M. Zukanovic, A. Janusz, and D. Slezak, “Predicting stock trends using common financial indicators: A summary of FedCSIS 2024 Data Science Challenge held on KnowledgePit.ai platform,” in Proceedings of FedCSIS 2024, 2024.
- The pandas development team, “pandas-dev/pandas: Pandas (latest version),” doi: 10.5281/zenodo.3509134, Apr. 2024.
- F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
- Y. S. Y. Hindy, “MissForest (version 2.5.5),” https://pypi.org/project/MissForest/, Mar. 2024.
- FedCSIS, “Data science challenge: Predicting stock trends,” 2024.
- G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A highly efficient gradient boosting decision tree,” in Advances in Neural Information Processing Systems (I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds.), vol. 30, Curran Associates, Inc., 2017.
- T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, ACM, Aug. 2016.
- G. E. Hinton, “Connectionist learning procedures,” Artif. Intell., vol. 40, pp. 185–234, 1989.
- J. Platt, “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods,” Advances in Large Margin Classifiers, vol. 10, June 2000.
- P. Liashchynskyi and P. Liashchynskyi, “Grid search, random search, genetic algorithm: A big comparison for NAS,” arXiv preprint arXiv:1912.06059, 2019.
- B. Pavlyshenko, “Using stacking approaches for machine learning models,” in 2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), pp. 255–258, 2018.