
Proceedings of the 17th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 30

KnowledgePit Meets BrightBox: A Step Toward Insightful Investigation of the Results of Data Science Competitions


DOI: http://dx.doi.org/10.15439/2022F309

Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 393–398


Abstract. We discuss the benefits of integrating the KnowledgePit data mining competition platform with BrightBox, a technology for diagnosing machine learning models. We also show how to combine solutions submitted by competition participants to obtain more accurate predictions.
