Journal: International Journal of Speech Technology

Mask-based blind source separation and MVDR beamforming in ASR


Abstract

This paper presents a front-end enhancement system for automatic speech recognition (ASR) that addresses the cocktail party problem: recognizing the target speech when multiple speakers talk in noisy real-world environments. Many conventional techniques have been proposed for this problem. In this work, we propose a new framework that integrates conventional blind source separation (BSS) with a minimum variance distortionless response (MVDR) beamformer for the speech enhancement and source separation tasks of the recent CHiME-5 challenge. In our experiments, we found that the time-frequency (T-F) mask estimation strategy based on the BSS algorithm should differ between speech enhancement and source separation. The main difference is whether background noise needs to be modeled as an additional class during T-F mask estimation. Experimental results show that the proposed framework improves speech recognition performance on the single-array track of CHiME-5: by improving only the front-end speech enhancement framework, we obtained a 13.5% relative word error rate (WER) reduction over the official baseline system.
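To make the mask-to-beamformer step concrete, the following is a minimal NumPy sketch of a mask-driven MVDR beamformer. It is an illustration under stated assumptions and not the authors' implementation: the abstract does not say which MVDR formulation is used, so the sketch assumes the common reference-channel form w(f) = Φ_i⁻¹(f) Φ_t(f) u / tr(Φ_i⁻¹(f) Φ_t(f)), where Φ_t and Φ_i are mask-weighted spatial covariance matrices of the target and the interference and u selects a reference microphone. All function names, array shapes, and the diagonal-loading constant are hypothetical. The T-F masks are taken as given (for example, from a BSS front end); the class list may or may not contain a background-noise class, which is the design choice the abstract highlights.

```python
import numpy as np


def interference_mask(masks, target):
    """Sum the masks of every class except `target`.

    masks: (K, F, T) real array of T-F masks in [0, 1], one per class.
    Whether one of the K classes models background noise is decided by
    whoever estimated the masks (an assumption of this sketch).
    """
    others = np.delete(masks, target, axis=0).sum(axis=0)
    return np.clip(others, 0.0, 1.0)


def masked_covariance(stft, mask, eps=1e-10):
    """Mask-weighted spatial covariance per frequency bin.

    stft: (F, T, C) complex multichannel STFT; mask: (F, T).
    Returns an (F, C, C) array: sum_t m(t,f) y y^H / sum_t m(t,f).
    """
    weighted = stft * mask[..., None]
    cov = np.einsum("ftc,ftd->fcd", weighted, stft.conj())
    return cov / (mask.sum(axis=1)[:, None, None] + eps)


def mvdr_weights(cov_target, cov_interf, ref_channel=0, diag_load=1e-6):
    """Reference-channel MVDR weights, shape (F, C).

    Diagonal loading (an assumed regularizer) keeps the interference
    covariance invertible when few frames are available.
    """
    n_ch = cov_target.shape[-1]
    trace = np.trace(cov_interf, axis1=1, axis2=2).real
    load = (diag_load * trace / n_ch + 1e-12)[:, None, None] * np.eye(n_ch)
    numerator = np.linalg.solve(cov_interf + load, cov_target)
    denominator = np.trace(numerator, axis1=1, axis2=2)
    return numerator[:, :, ref_channel] / (denominator[:, None] + 1e-12)


def apply_beamformer(weights, stft):
    """Beamformer output w(f)^H y(t, f), shape (F, T)."""
    return np.einsum("fc,ftc->ft", weights.conj(), stft)
```

A typical call sequence under these assumptions would be: build `masks` with shape `(K, F, T)`, compute `cov_t = masked_covariance(stft, masks[k])` and `cov_i = masked_covariance(stft, interference_mask(masks, k))` for the target class `k`, then pass `apply_beamformer(mvdr_weights(cov_t, cov_i), stft)` to the recognizer's feature extraction.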
