The Use of Deep Learning in Speech Enhancement

Rashmirekha Ram; Mihir Narayan Mohanty

DOI: http://dx.doi.org/10.15439/2017KM40

Citation: Proceedings of the 2017 International Conference on Information Technology and Knowledge Management, Ajay Jaiswal, Vijender Kumar Solanki, Zhongyu (Joan) Lu, Nikhil Rajput (eds). ACSIS, Vol. 14, pages 107–111 (2017)

Abstract. Deep learning is a rapidly growing research area. Convolutional Neural Networks (CNN) and Deep Belief Networks (DBN) are the models most commonly used in deep learning; such models are termed Deep Neural Networks (DNN). DNNs are used in many applications, chiefly for detection and classification. In this paper, the authors use a DNN for signal enhancement instead. The input is a speech signal corrupted by noise. A DNN with two layers is used, and it is compared with the ADALINE model to demonstrate its efficacy.
