IEICE Transactions on Information and Systems

Improved HMM Separation for Distant-Talking Speech Recognition


Abstract

In distant-talking speech recognition, recognition accuracy is seriously degraded by reverberation and environmental noise. A robust speech recognition technique for such environments, HMM separation and composition, has been described in [1]. HMM separation estimates the model parameters of the acoustic transfer function using adaptation data uttered from an unknown position in a noisy and reverberant environment, and HMM composition builds an HMM of noisy and reverberant speech using the acoustic transfer function estimated by HMM separation. Previously, HMM separation was applied to an acoustic transfer function modeled by a single Gaussian distribution. However, the improvement was smaller than expected for impulse responses with long reverberation. This is because the impulse response of the room reverberation is longer than the spectral analysis window, so the variance of the acoustic transfer function in each frame increases. In this paper, HMM separation is extended to estimate an acoustic transfer function modeled by Gaussian mixture components, in order to compensate for the greater variability of the acoustic transfer function, and the re-estimation formulae are derived. In addition, this paper introduces a technique that adapts the noise weight for each mel-spaced frequency in order to improve the performance of HMM separation in the linear-spectral domain, since HMM separation in that domain sometimes produces a negative mean output because of the subtraction operation. The extended HMM separation is evaluated on distant-talking speech recognition tasks, and the experimental results show the effectiveness of the proposed method.
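The negative-mean problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified mean-only model in the linear-spectral domain, where the observed mean in each channel is the speech mean times the transfer-function mean plus a weighted noise mean. This model, the function names `compose_mean` and `separate_mean`, and all numeric values are assumptions made for illustration; they are not the re-estimation formulae derived in the paper.

```python
# Illustrative sketch only: a simplified mean-only model in the
# linear-spectral domain, not the paper's re-estimation formulae.
# All names and values below are hypothetical.

def compose_mean(speech_mean, transfer_mean, noise_mean, noise_weight):
    """Compose the mean of noisy, reverberant speech per channel: S * H + w * N."""
    return [
        s * h + w * n
        for s, h, n, w in zip(speech_mean, transfer_mean, noise_mean, noise_weight)
    ]


def separate_mean(observed_mean, speech_mean, noise_mean, noise_weight):
    """Invert the composition per channel: H = (O - w * N) / S.

    The subtraction O - w * N is what can drive the estimate negative.
    """
    return [
        (o - w * n) / s
        for o, s, n, w in zip(observed_mean, speech_mean, noise_mean, noise_weight)
    ]


# Hypothetical means for three mel-spaced channels.
speech = [4.0, 2.0, 1.0]
noise = [1.0, 1.0, 1.0]
observed = [3.0, 1.5, 0.8]  # third channel lies below the noise estimate

# With a unit noise weight, the third channel's estimate is negative.
unit_weight = [1.0, 1.0, 1.0]
h_unit = separate_mean(observed, speech, noise, unit_weight)

# Lowering the noise weight in that channel keeps the estimate positive.
adapted_weight = [1.0, 1.0, 0.5]
h_adapted = separate_mean(observed, speech, noise, adapted_weight)

print(h_unit)
print(h_adapted)
```

In this sketch the per-channel weight is set by hand; the paper adapts it from data, and the procedure for doing so is not given in the abstract.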
