Developing keyword spotting method for the Polish language

Łukasz Laszko

DOI: http://dx.doi.org/10.15439/2018F178

Citation: Communication Papers of the 2018 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 17, pages 123–127 (2018)

Abstract. The paper presents the application of an unsupervised method to word detection in recorded speech for spoken Polish. The method relies on a similarity measure between the analyzed speech and a pattern synthesized from plain text. A dynamic time warping algorithm performs the time alignment, and the resulting alignment path serves as the input to the classifier. Classification involves calculating a cost function and extracting the projected sequence of Human-Factor Cepstral Coefficients, both of which are compared with threshold values. The results obtained by applying the method to the CLARIN-PL Mobile Corpus are encouraging and motivate further development of the method for the Polish language.
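The alignment-and-threshold idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the feature vectors below are toy values standing in for Human-Factor Cepstral Coefficients, and the Euclidean local distance, the symmetric step pattern, the path-length normalization, and the names `dtw` and `is_keyword` are all assumptions made for illustration.

```python
import math


def dtw(pattern, speech):
    """Align two sequences of feature vectors with dynamic time warping.

    Returns (cost, path): the accumulated distance divided by the path
    length (an assumed normalization) and the list of aligned index pairs.
    """
    n, m = len(pattern), len(speech)
    # acc[i][j] = cheapest accumulated cost of aligning the first i pattern
    # frames with the first j speech frames.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(pattern[i - 1], speech[j - 1])  # Euclidean (assumed)
            acc[i][j] = local + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (acc[i - 1][j], i - 1, j),          # advance pattern only
            (acc[i][j - 1], i, j - 1),          # advance speech only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m] / len(path), path


def is_keyword(pattern, speech, threshold):
    """Accept the segment when the normalized DTW cost is below a threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    cost, _ = dtw(pattern, speech)
    return cost <= threshold


# Toy one-dimensional "features": the speech repeats the first pattern frame.
pattern = [(0.0,), (1.0,), (2.0,)]
speech = [(0.0,), (0.0,), (1.0,), (2.0,)]
cost, path = dtw(pattern, speech)
print(cost, path)
print(is_keyword(pattern, speech, threshold=0.1))
```

In this sketch the decision uses only the alignment cost; the method in the abstract additionally compares a projected sequence of coefficients against a second threshold, which is not reproduced here.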
