Mask-based blind source separation and MVDR beamforming in ASR

Renke He; Yanhua Long; Yijie Li; Jiaen Liang

Abstract

This paper presents a front-end enhancement system for automatic speech recognition (ASR) that addresses the cocktail party problem: recognizing the target speech when multiple speakers talk in noisy real-world environments. Many conventional techniques have been proposed for this problem. In this work, we propose a new framework that integrates conventional blind source separation (BSS) with a minimum variance distortionless response (MVDR) beamformer for speech enhancement and source separation on the recent CHiME-5 challenge. In our experiments, we found that the time-frequency (T-F) mask estimation strategy based on the BSS algorithm should differ between speech enhancement and source separation. The main difference is whether background noise needs to be accounted for as an additional class during T-F mask estimation. Experimental results show that the proposed framework improves speech recognition performance on the single-array track of CHiME-5: we obtained a 13.5% relative word error rate (WER) reduction over the official baseline system by improving only the front-end speech enhancement framework.
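The abstract does not give the beamformer equations, so the following is only a minimal sketch of one widely used mask-based MVDR formulation (the reference-microphone form, w = (Φ_n⁻¹ Φ_s) u / tr(Φ_n⁻¹ Φ_s)), not necessarily the authors' implementation. It assumes the T-F masks have already been estimated, for example by a BSS algorithm; the function names, array shapes, and the diagonal-loading constant are all illustrative assumptions. Depending on the task, the "noise" mask could come from a dedicated background-noise class or from the interfering speakers, which is the distinction the abstract draws.

```python
import numpy as np


def spatial_covariance(stft, mask):
    """Mask-weighted spatial covariance matrix for each frequency bin.

    stft: complex array, shape (F, T, M) = (frequency bins, frames, microphones)
    mask: real array in [0, 1], shape (F, T)
    returns: complex array, shape (F, M, M)
    """
    weighted = stft * mask[..., None]
    # Sum over frames of mask(f, t) * x(f, t) x(f, t)^H
    cov = np.einsum("ftm,ftn->fmn", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)  # avoid division by zero
    return cov / norm[:, None, None]


def mvdr_weights(phi_s, phi_n, ref_mic=0, diag_load=1e-6):
    """MVDR weights per frequency bin, reference-microphone formulation.

    phi_s, phi_n: target and noise covariance matrices, shape (F, M, M)
    ref_mic: index of the reference microphone (an assumption of this sketch)
    returns: complex array, shape (F, M)
    """
    n_mics = phi_n.shape[-1]
    # Diagonal loading keeps the noise covariance well conditioned.
    load = diag_load * np.trace(phi_n, axis1=1, axis2=2).real / n_mics
    phi_n = phi_n + load[:, None, None] * np.eye(n_mics)
    numerator = np.linalg.solve(phi_n, phi_s)  # Phi_n^{-1} Phi_s
    trace = np.trace(numerator, axis1=1, axis2=2) + 1e-10
    return numerator[:, :, ref_mic] / trace[:, None]


def apply_beamformer(weights, stft):
    """Apply y(f, t) = w(f)^H x(f, t); returns an array of shape (F, T)."""
    return np.einsum("fm,ftm->ft", weights.conj(), stft)
```

In use, one would compute `phi_s = spatial_covariance(stft, speech_mask)` and `phi_n = spatial_covariance(stft, noise_mask)`, then pass both to `mvdr_weights` and the result to `apply_beamformer`. For a rank-1 target covariance, these weights leave the target component at the reference microphone undistorted, which is the "distortionless" constraint in MVDR.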

Bibliographic details

Source: International Journal of Speech Technology, 2020, Issue 1, pp. 133-140 (8 pages)
Authors: Renke He; Yanhua Long; Yijie Li; Jiaen Liang
Affiliations:

Department of Electronical and Information Engineering, Shanghai Normal University, Shanghai 200234, China;

Unisound AI Technology Co. Ltd., Beijing 100089, China

Format: PDF
Language: English
Keywords: Cocktail party problem; MVDR; BSS; T-F masking; Speech enhancement
