Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS)

Annals of Computer Science and Information Systems, Volume 39

Experimenting with manual and automated data mining pipelines on the FedCSIS 2024 Data Science Challenge

DOI: http://dx.doi.org/10.15439/2024F6884

Citation: Proceedings of the 19th Conference on Computer Science and Intelligence Systems (FedCSIS), M. Bolanowski, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 39, pages 751–754

Abstract. This paper reviews the fifth-best solution to the FedCSIS 2024 Data Science Challenge, which aimed to predict stock trends from financial indicators. It details the preprocessing, modelling, and tuning approaches, as well as the methods and techniques used to address the prediction problem. The results of automated experiments, including hyperparameter optimization over preprocessing steps and switching between different prediction targets, are then compared with those of manual experiments. Overall, a model developed through manual experimentation was found to outperform the hyperparameter-tuned pipelines.
