Using Separate Losses for Speech and Noise in Mask-Based Speech Enhancement

Abstract

Estimating time-frequency domain masks for speech enhancement with deep learning has recently become a popular research topic. In this paper, we propose a novel components loss (CL) for training neural networks for speech enhancement. During training, the proposed CL offers separate control over suppression of the noise component and preservation of the speech component. We illustrate the potential of the proposed CL using a convolutional neural network (CNN) for mask-based speech enhancement. We show improvement in almost all employed instrumental quality metrics over the baseline losses, which comprise the conventional mean squared error (MSE) loss and a perceptual evaluation of speech quality (PESQ) loss. On average, an SNR improvement more than 0.3 dB higher and a PESQ score on the speech component at least 0.1 points higher are obtained. In addition, more natural-sounding residual noise and a consistently best PESQ on the enhanced speech are obtained. All improvements are more pronounced at low SNR conditions.
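
The idea of separate control over the two components can be illustrated with a small sketch. This is not the loss defined in the paper, whose exact formulation is not given in the abstract. It assumes the clean speech and noise magnitude spectra are both available at training time, that the same mask is applied to each, and that a hypothetical weight `alpha` trades speech preservation against noise suppression. The function name `components_loss` and all array shapes are illustrative.

```python
import numpy as np


def components_loss(mask, speech_mag, noise_mag, alpha=0.5):
    """Illustrative two-term loss (an assumption, not the paper's definition).

    mask, speech_mag, noise_mag: arrays of shape (frames, frequency_bins).
    alpha: hypothetical weight in [0, 1]; larger values favor noise suppression.
    """
    filtered_speech = mask * speech_mag  # speech component after masking
    filtered_noise = mask * noise_mag    # residual noise after masking

    # Speech preservation: masked speech should stay close to clean speech.
    speech_term = np.mean((filtered_speech - speech_mag) ** 2)
    # Noise suppression: residual noise power should be small.
    noise_term = np.mean(filtered_noise ** 2)

    return (1.0 - alpha) * speech_term + alpha * noise_term


# Toy data standing in for magnitude spectra (random, for illustration only).
rng = np.random.default_rng(0)
speech_mag = np.abs(rng.normal(size=(4, 8)))
noise_mag = np.abs(rng.normal(size=(4, 8)))

all_pass = np.ones((4, 8))   # keeps speech intact but also keeps all noise
all_stop = np.zeros((4, 8))  # removes all noise but also removes all speech

for name, mask in (("all-pass", all_pass), ("all-stop", all_stop)):
    print(name, components_loss(mask, speech_mag, noise_mag, alpha=0.5))
```

In this sketch the all-pass mask incurs only the noise term and the all-stop mask incurs only the speech term, so changing `alpha` shifts which of the two errors is penalized more heavily.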

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2020 | pp. 7519-7523 | 5 pages
Authors
Ziyi Xu; Samy Elshamy; Tim Fingscheidt
Original format: PDF
Keywords
Mask-based speech enhancement; noise reduction; components loss function; CNN

Similar literature

1. Mechanisms of Localization and Speech Perception with Colocated and Spatially Separated Noise and Speech Maskers Under Single-Sided Deafness with a Cochlear Implant [J]. Dirks Coral, Nelson Peggy B., Sladen Douglas P. Ear and Hearing. 2019, No. 6
2. Minimum mean square error estimator for speech enhancement in additive noise assuming Weibull speech priors and speech presence uncertainty [J]. Mojtaba Bahrami, Neda Faraji. International Journal of Speech Technology. 2021, No. 1
3. Combination of GMM-Based Speech Estimation Method and Temporal Domain SVD-Based Speech Enhancement for Noise Robust Speech Recognition [J]. Masakiyo Fujimoto, Yasuo Ariki. Systems and Computers in Japan. 2007, No. 3
4. Using Separate Losses for Speech and Noise in Mask-Based Speech Enhancement [C]. Ziyi Xu, Samy Elshamy, Tim Fingscheidt. IEEE International Conference on Acoustics, Speech and Signal Processing. 2020
5. THE EXTRACTION OF FEATURES FROM A SPEECH SIGNAL CORRUPTED BY ADDITIVE NOISE AND THEIR USE FOR SPEECH ENHANCEMENT [D]. WALICKI, JACEK STANISLAW. 1981
6. Congruent Visual Speech Enhances Cortical Entrainment to Continuous Auditory Speech in Noise-Free Conditions [O]. Michael J. Crosse, John S. Butler, Edmund C. Lalor. 2015
7. Analysis of very low quality speech for mask-based enhancement [O]. Gonzalez Sira. 2013
