Proceedings of the 18th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 35

Boosting conversational AI correctness by accounting for ASR errors using a sequence to sequence model

DOI: http://dx.doi.org/10.15439/2023F9627

Citation: Proceedings of the 18th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 35, pages 1325–1328

Abstract. This paper describes the winning submission to the CAICCAIC challenge (Center for Artificial Intelligence Challenge on Conversational AI Correctness). The aim of the challenge was to design a natural language understanding mechanism capable of interpreting user prompts. Because the prompts were the output of an automatic speech recognition (ASR) system, they contained recognition errors, and the solution had to account for them. According to the challenge results, the most effective technique was an original use of a sequence-to-sequence model.
