Speech enhancement using multiple deep neural networks

Karjol, P.; Kumar, M.A.; Ghosh, P.K.

Please use this identifier to cite or link to this item: https://idr.l3.nitk.ac.in/jspui/handle/123456789/6616

Title:	Speech enhancement using multiple deep neural networks
Authors:	Karjol, P.; Kumar, M.A.; Ghosh, P.K.
Issue Date:	2018
Citation:	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2018, Vol.2018-April, pp.5049-5052
Abstract:	In this work, we present a variant of the multiple deep neural network (DNN) based speech enhancement method. We directly estimate the clean speech spectrum as a weighted average of the outputs of multiple DNNs. The weights are provided by a gating network. The multiple DNNs and the gating network are trained jointly. The objective function is the mean square logarithmic error between the target clean spectrum and the estimated spectrum. We conduct experiments with two and four DNNs on the TIMIT corpus with nine noise types (four seen noises and five unseen noises) taken from the AURORA database at four different signal-to-noise ratios (SNRs). We also compare the proposed method with a single DNN based speech enhancement scheme and existing multiple DNN schemes, using segmental SNR, perceptual evaluation of speech quality (PESQ) and short-term objective intelligibility (STOI) as the evaluation metrics. These comparisons show the superiority of the proposed method over the baseline schemes in both seen and unseen noises. Specifically, we observe an absolute improvement of 0.07 and 0.04 in the PESQ measure compared to a single DNN when averaged over all noises and SNRs, for the seen and unseen noise cases respectively. © 2018 IEEE.
URI:	http://idr.nitk.ac.in/jspui/handle/123456789/6616
Appears in Collections:	2. Conference Papers
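
The estimation step described in the abstract (a gating network weighting the outputs of several DNNs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The layer sizes, the ReLU hidden layers, the softmax gate, the use of log-spectral features (so that the squared error is taken in the log domain), and all function names here are assumptions made for illustration. Only the forward pass and the loss are shown; joint training would backpropagate this loss through both the expert networks and the gate.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax; makes the gate weights sum to 1 (assumption).
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def init_mlp(rng, sizes):
    # Hypothetical helper: small random weights and zero biases per layer.
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    # Feed-forward network: ReLU hidden layers, linear output layer.
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def enhance(experts, gate, noisy_log_spec):
    # Each expert DNN produces its own estimate of the clean log spectrum.
    outputs = np.stack([mlp(p, noisy_log_spec) for p in experts], axis=1)  # (frames, K, bins)
    # The gating network produces one weight per expert for every frame.
    weights = softmax(mlp(gate, noisy_log_spec), axis=-1)                  # (frames, K)
    # Final estimate: weighted average of the expert outputs.
    estimate = np.einsum("fk,fkb->fb", weights, outputs)                   # (frames, bins)
    return estimate, weights

def log_spectral_mse(estimate_log_spec, clean_log_spec):
    # Mean squared error between log spectra (assumed form of the objective).
    return float(np.mean((estimate_log_spec - clean_log_spec) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bins, num_experts, frames = 8, 4, 5      # toy sizes, not taken from the paper
    experts = [init_mlp(rng, [bins, 16, bins]) for _ in range(num_experts)]
    gate = init_mlp(rng, [bins, 16, num_experts])
    noisy = rng.normal(size=(frames, bins))  # stand-in for noisy log-spectral frames
    clean = rng.normal(size=(frames, bins))  # stand-in for clean targets
    estimate, weights = enhance(experts, gate, noisy)
    print("loss:", log_spectral_mse(estimate, clean))
```

Setting `num_experts` to 2 or 4 mirrors the two configurations mentioned in the abstract. The per-frame gate weights returned by `enhance` can be inspected to see which expert dominates for a given frame.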
