Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Using Separate Losses for Speech and Noise in Mask-Based Speech Enhancement

Abstract

Estimating time-frequency domain masks for speech enhancement using deep learning approaches has recently become a popular field of research. In this paper, we propose a novel components loss (CL) for the training of neural networks for speech enhancement. During the training process, the proposed CL offers separate control over suppression of the noise component and preservation of the speech component. We illustrate the potential of the proposed CL using the example of a convolutional neural network (CNN) for mask-based speech enhancement. We show improvement in almost all employed instrumental quality metrics over the baseline losses, which comprise the conventional mean squared error (MSE) loss and a perceptual evaluation of speech quality (PESQ) loss. On average, an SNR improvement more than 0.3 dB higher and a PESQ score on the speech component at least 0.1 points higher are obtained. In addition, more natural-sounding residual noise and a consistently best PESQ on the enhanced speech are obtained. All improvements are more distinct at low SNR conditions.
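
The abstract does not give the formula for the components loss, so the following is only a minimal sketch of the general idea it describes: during training, the estimated mask is applied separately to the speech and noise components of the noisy input, one term penalizes distortion of the speech component, and another penalizes the residual noise. The function name `components_loss`, the weighting parameter `alpha`, and the specific squared-error terms are assumptions made for illustration and may differ from the paper's actual definition.

```python
import numpy as np


def components_loss(mask, speech_mag, noise_mag, alpha=0.5):
    """Hypothetical two-term loss with separate speech and noise components.

    mask       : estimated time-frequency mask (same shape as the spectra)
    speech_mag : magnitude spectrum of the clean speech component
    noise_mag  : magnitude spectrum of the noise component
    alpha      : assumed trade-off weight in [0, 1]; larger values put more
                 emphasis on noise suppression, smaller values on speech
                 preservation
    """
    # Apply the same mask to each component separately.
    filtered_speech = mask * speech_mag
    filtered_noise = mask * noise_mag

    # Speech preservation: the filtered speech should stay close to the
    # clean speech component.
    speech_term = np.mean((filtered_speech - speech_mag) ** 2)

    # Noise suppression: the filtered noise should carry little power.
    noise_term = np.mean(filtered_noise ** 2)

    return (1.0 - alpha) * speech_term + alpha * noise_term


# Toy usage with random spectra (frames x frequency bins).
rng = np.random.default_rng(0)
speech = np.abs(rng.normal(size=(4, 8)))
noise = np.abs(rng.normal(size=(4, 8)))
mask = rng.uniform(0.0, 1.0, size=(4, 8))
loss = components_loss(mask, speech, noise, alpha=0.5)
```

In this sketch an all-ones mask leaves the speech untouched and is penalized only through the noise term, while an all-zeros mask removes all noise and is penalized only through the speech term; `alpha` sets the balance between the two.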