Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

A Regression Approach to Speech Enhancement Based on Deep Neural Networks

Abstract

In contrast to conventional minimum mean square error (MMSE)-based noise reduction techniques, we propose a supervised method that enhances speech by learning a mapping function between noisy and clean speech signals with deep neural networks (DNNs). To handle a wide range of additive noises in real-world situations, we first design a large training set that encompasses many possible combinations of speech and noise types. A DNN architecture is then employed as a nonlinear regression function to ensure a powerful modeling capability. We also propose several techniques to improve the DNN-based speech enhancement system: global variance equalization, which alleviates the over-smoothing problem of the regression model, and dropout and noise-aware training strategies, which further improve the generalization of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework achieves significant improvements in both objective and subjective measures over the conventional MMSE-based technique. The proposed DNN approach also suppresses highly nonstationary noise well, which is generally difficult to handle. Furthermore, the resulting DNN model, trained on artificially synthesized data, is also effective on noisy speech recorded in real-world scenarios, without generating the annoying musical artifacts commonly observed in conventional enhancement methods.
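
Two of the steps named in the abstract can be illustrated with a minimal Python sketch: building a noisy training example from additive noise, and rescaling an over-smoothed estimate so that its variance matches a reference. This is an illustration under stated assumptions, not the authors' implementation. The function names (`power`, `mix_at_snr`, `equalize_global_variance`), the use of a target SNR to combine speech and noise, and the single-factor variance rescaling are all assumptions that do not come from the abstract, and the paper's exact formulation may differ.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Return clean + g * noise, with g chosen to give the target SNR in dB.

    Assumption: training pairs are built by additive mixing at a chosen SNR.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise is silent; SNR is undefined")
    # SNR_dB = 10 * log10(P_clean / (g^2 * P_noise)), solved for g.
    gain = math.sqrt(power(clean) / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]


def equalize_global_variance(estimated, reference_variance):
    """Rescale values around their mean so their variance equals the reference.

    Assumption: one scalar factor, sqrt(reference / estimated variance), is
    applied to all values; this is a simplified stand-in for the technique.
    """
    mean = sum(estimated) / len(estimated)
    variance = sum((x - mean) ** 2 for x in estimated) / len(estimated)
    if variance == 0.0:
        return list(estimated)  # nothing to rescale
    factor = math.sqrt(reference_variance / variance)
    return [mean + factor * (x - mean) for x in estimated]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical one-second signals at 16 kHz: a tone as "speech", white noise.
    clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    noisy = mix_at_snr(clean, noise, snr_db=5.0)
    residual = [y - c for y, c in zip(noisy, clean)]
    # Measured SNR of the mixture, in dB.
    print(round(10 * math.log10(power(clean) / power(residual)), 2))
```

In a regression setup of the kind the abstract describes, features of `noisy` would serve as the network input and features of `clean` as the target. The variance rescaling would be applied to the network output as a post-processing step, with the reference variance estimated from clean training data.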