Comparative Study of Multi-stage Classification Scheme for Recognition of Lithuanian Speech Emotions

Tatjana Liogiene, Gintautas Tamulevičius

DOI: http://dx.doi.org/10.15439/2016F316

Citation: Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, M. Ganzha, L. Maciaszek, M. Paprzycki (eds). ACSIS, Vol. 8, pages 483–486 (2016)


Abstract. This paper presents an experimental study of multi-stage classification for the recognition of emotions in Lithuanian speech. Three feature selection criteria were compared for this purpose: the maximal efficiency criterion, the minimal cross-correlation criterion, and sequential feature selection (SFS). A large database of emotional spoken Lithuanian was used in the experiment: each of 5 emotions was represented by 1000 utterances. The results of the speaker-independent emotion recognition experiment show that multi-stage classification using the SFS technique for feature selection is superior by 0.7–8%. This classification scheme gave the highest recognition accuracy and the smallest feature set. Nevertheless, increasing the number of analyzed emotions and emotional utterances expands the size of the required feature set.
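As an illustration of the sequential feature selection idea named in the abstract, the following is a minimal sketch of greedy forward selection. The abstract does not specify the classifier, the selection criterion, or the stopping rule used in the paper, so everything here is an assumption made for illustration: the function names `sequential_forward_selection` and `loo_nearest_neighbour_accuracy`, the leave-one-out 1-nearest-neighbour scorer, and the invented toy data.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[List[int]], float],
) -> List[int]:
    """Greedily add the feature that most improves `score`; stop when none does."""
    selected: List[int] = []
    best = float("-inf")
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        # Score the current subset extended by each remaining candidate.
        scored = [(score(selected + [f]), f) for f in candidates]
        # Ties are broken in favour of the lower feature index.
        top_score, top_feature = max(scored, key=lambda t: (t[0], -t[1]))
        if top_score <= best:
            break  # no candidate improves the criterion
        selected.append(top_feature)
        best = top_score
    return selected


def loo_nearest_neighbour_accuracy(
    X: Sequence[Sequence[float]], y: Sequence[int], features: List[int]
) -> float:
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`.

    This scorer is an assumption for the sketch, not the paper's criterion.
    """
    correct = 0
    for i in range(len(X)):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[i][f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


# Invented toy data: feature 0 separates the two classes,
# feature 1 is misleading, feature 2 is constant.
X = [
    [0.0, 5.0, 1.0],
    [0.1, 1.0, 1.0],
    [0.2, 4.0, 1.0],
    [1.0, 1.2, 1.0],
    [1.1, 4.9, 1.0],
    [1.2, 4.1, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

selected = sequential_forward_selection(
    n_features=3,
    score=lambda feats: loo_nearest_neighbour_accuracy(X, y, feats),
)
print(selected)  # [0]
```

On this toy data the search stops after one feature, because adding either remaining feature does not raise the score. In a multi-stage scheme, a selection of this kind would presumably be run separately for each classification stage, though the abstract does not state how the stages are organized.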
