KnowledgePit Meets BrightBox: A Step Toward Insightful Investigation of the Results of Data Science Competitions
Andrzej Janusz, Dominik Ślęzak
DOI: http://dx.doi.org/10.15439/2022F309
Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 393–398 (2022)
Abstract. We discuss the benefits of integrating the KnowledgePit data mining competition platform with the BrightBox technology aimed at diagnostics of machine learning models. We also show how to combine solutions submitted by the competition participants in order to obtain more accurate predictions.
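As a rough illustration of the idea of combining participants' solutions, the sketch below blends two prediction vectors by a weighted average and compares root-mean-square errors. It is a minimal sketch under stated assumptions, not the combination method used in the paper: the function names `rmse` and `blend`, the uniform default weights, and all numeric values are hypothetical and chosen only for demonstration.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def blend(submissions, weights=None):
    """Weighted average of several prediction vectors (uniform weights by default)."""
    n = len(submissions)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    return [
        sum(w * s[i] for w, s in zip(weights, submissions)) / total
        for i in range(len(submissions[0]))
    ]


# Hypothetical ground truth and two hypothetical submissions (not real competition data).
y_true = [10.0, 20.0, 30.0, 40.0]
sub_a = [12.0, 18.0, 33.0, 38.0]
sub_b = [8.0, 23.0, 28.0, 41.0]

blended = blend([sub_a, sub_b])

print("RMSE A:      ", round(rmse(y_true, sub_a), 3))
print("RMSE B:      ", round(rmse(y_true, sub_b), 3))
print("RMSE blended:", round(rmse(y_true, blended), 3))
```

In this toy example the errors of the two submissions partly cancel, so the blend scores better than either input. Real submissions are often correlated, in which case the gain from simple averaging may be much smaller.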