Federated Conference on Computer Science and Intelligence Systems

September 17–20, 2023. Warsaw, Poland

Preproceedings of the 2023 Federated Conference on Computer Science and Intelligence Systems

Complete FedCSIS Preproceedings (PDF, 75.102 M)

Preface

Dear Reader, it is our pleasure to present to you

FedCSIS 2023 consisted of the following events:

Furthermore,

Each event constituting FedCSIS had its own Organizing and Program Committee. We would like to express our warmest gratitude to members of all of them for their hard work attracting and later refereeing 379 submissions.

Co-Chairs of the FedCSIS Conference Series

Maria Ganzha, Warsaw University of Technology, Poland and Systems Research Institute Polish Academy of Sciences, Warsaw, Poland.

Leszek Maciaszek, Wrocław University of Economics and Business, Wrocław, Poland and Macquarie University, Sydney, Australia.

Marcin Paprzycki, Systems Research Institute Polish Academy of Sciences, Warsaw, Poland and Management Academy, Warsaw, Poland.

Dominik Ślęzak, Institute of Informatics, University of Warsaw, Poland and QED Software, Poland and DeepSeas, USA.

Main Track

Main Track Invited Contributions

Main Track Regular Papers

Main Track Short Papers

Technical Tracks

Technical Tracks Regular Papers

Technical Tracks Short Papers

Competitions

PolEval

Preface to PolEval

PolEval is an annual NLP challenge organized since 2017. The choice of the name of the challenge was deliberate: as most research concentrates on the most popular languages (especially English), the aim of PolEval was to promote work on processing Polish. By focusing on Polish, it actively promotes the creation of new resources in this language, facilitates further research and contributes to creating new and improved methods and models for Polish.

The goal of PolEval is thus to:

  • develop established procedures for evaluating systems solving a wide range of tasks in NLP,
  • create annotated datasets that can be used for training and evaluation of systems,
  • objectively compare systems performing various tasks in the field of natural language processing,
  • bring researchers from the scientific and business communities closer together and exchange knowledge between them,
  • facilitate popularization of NLP issues in the context of the Polish language.

To achieve these goals, PolEval proposes a well-formulated task framework, in which the scope, input data, expected output data, evaluation methods, training and test data are prepared by the organizers. This way the challenge aims to be a platform for objective comparison of methods, models and systems for processing Polish.

Cybersecurity Threat Detection in the Behavior of IoT Devices

Preface to Cybersecurity Threat Detection in the Behavior of IoT Devices

Cybersecurity Threat Detection in the Behavior of IoT Devices was the 9th competition organized in association with the FedCSIS conference series at KnowledgePit.ai. The goal was to detect attacks on IoT devices on the basis of data provided by the Efigo company, describing the changing behavior of devices and the known moments of cyberattack attempts. The data set was generated as part of the SPINET project, which aims to improve the cybersecurity of IoT device networks.

The growing importance of IoT device applications motivates scientists to develop a variety of techniques for improving cybersecurity. In our case, the expectation was that the profile of processes running on a device would change during attack attempts.

The competition data was collected in a simulated environment: IoT devices were emulated in a separate network, to which the attacking servers were also connected. The scenario of attacks was known, so it was possible to tag the behavioral data as “normal” and “unusual”. Based on that information, participants tried to develop classification models to predict whether a device is being attacked or not.

The top four competitor groups were invited to submit a paper describing their solutions to our special event at the FedCSIS 2023 conference. These papers are included in this chapter of the conference proceedings and are preceded by a paper, authored by the organizers, that describes the competition in detail. Most of the presented approaches were based on gradient-boosting algorithms. That is not surprising, as such models play an essential role in different fields of application. However, they are not easy to interpret, which may make it difficult to better understand how the behavior of IoT devices changes during cyberattacks.

Andrzej Janusz

Marcin Michalak

Center for Artificial Intelligence Challenge on Conversational AI Correctness

Preface to Center for Artificial Intelligence Challenge on Conversational AI Correctness

The Center for Artificial Intelligence Challenge on Conversational AI Correctness was organized as part of the 1st Symposium on Challenges for Natural Language Processing. The goal of this competition was to develop Natural Language Understanding models that are robust against speech recognition errors.

Despite the near-human accuracy of Automatic Speech Recognition (ASR) in general-purpose transcription tasks, speech recognition errors can significantly degrade the performance of a Natural Language Understanding (NLU) model that follows the speech-to-text module in a virtual assistant. The problem is even more apparent when an ASR system from an external vendor is used as an integral part of a conversational system without any further adaptation. The contestants were expected to develop NLU models that maintain satisfactory performance despite the presence of ASR errors in the input.

The data for the competition consist of natural language utterances along with semantic frames that represent the commands targeted at a virtual assistant. The approach used to prepare the data for the challenge was meant to promote models robust to various types of errors in the input, making it impossible to solve the task by simply learning a shallow mapping from incorrectly recognized words to the correct ones. It reflects real-world scenarios where the NLU system is presented with inputs that exhibit various disturbances due to changes in the ASR model, acoustic conditions, speaker variation, and other causes.

This chapter includes the paper discussing the objectives, evaluation rules and results of the competition, authored by the organizers, followed by a detailed description of the leading solution contributed by the winners of the challenge.

Temporal Image Caption Retrieval Competition

Preface to Temporal Image Caption Retrieval Competition

The Temporal Image Caption Retrieval Competition was organized as part of the 1st Symposium on Challenges for Natural Language Processing. The goal of the competition was, given a picture from a newspaper and the newspaper's publication date (given to the day), to retrieve the picture's caption from a given caption set.

Multimodal models, especially combining vision and text, are gaining great recognition. One such multimodal challenge is Text-Image retrieval, which is to retrieve an image for a text query or retrieve a text for a given image. In this challenge, we introduce a task in the Text-Image retrieval setup, additionally extending the modalities with temporal data.

Language models rarely use any input information other than text, although additional data such as a text domain, document timestamp, website URL, or other metadata could be supplied. Models trained solely on text data may therefore be limited in usage. Additional temporal information is useful when factual knowledge is required but the facts change over time.

The presented task is based on the Chronicling America [Lee, B. C. G., Mears, J., Jakeway, E., Ferriter, M., Adams, C., Yarasavage, N., … & Weld, D. S. (2020). The Newspaper Navigator Dataset: Extracting Headlines and Visual Content from 16 Million Historic Newspaper Pages in Chronicling America. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management (CIKM '20). Association for Computing Machinery, New York, NY, USA, 3055–3062.] and Challenging America [Pokrywka, J., Gralinski, F., Jassem, K., Kaczmarek, K., Jurkiewicz, K., & Wierzchoń, P. (2022, July). Challenging America: Modeling language in longer time scales. In Findings of the Association for Computational Linguistics: NAACL 2022 (pp. 737-749).] projects. Chronicling America is an open database of over 16 million pages of digitized historic American newspapers covering 274 years. Challenging America is a set of temporal challenges built from the Chronicling America dataset.

This chapter includes the paper discussing the objectives, evaluation rules and results of the competition, authored by the organizers, followed by a detailed description of the leading solution contributed by the winners of the challenge.

TeXnical Editor: Aleksander Denisiuk
Phone/fax: +48-89-5246089