Experimenting with manual and automated data mining pipelines on the FedCSIS 2024 Data Science Challenge
Max Lautenbach, Jusztina Judák, Luisa Buck, Marc Furier, Okan Mert Göktepe, Gregor Münker
DOI: http://dx.doi.org/10.15439/2024F6884
Citation: Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 39, pages 751–754 (2024)
Abstract. This paper reviews the 5th-best solution and results of the FedCSIS 2024 Data Science Challenge, which aimed to predict stock trends using financial indicators. It details the preprocessing, modelling, and tuning approaches, as well as the methods and techniques used to address the prediction problem effectively. Subsequently, the results of different experiments, including hyperparameter optimization on preprocessing steps and switching between different prediction targets, are compared to manual experiments. Overall, a model found through manual experimentation outperformed the hyperparameter-tuned pipelines.