Improved HMM Separation for Distant-Talking Speech Recognition

Tetsuya TAKIGUCHI; Masafumi NISHIMURA

Abstract

In distant-talking speech recognition, recognition accuracy is seriously degraded by reverberation and environmental noise. A robust speech recognition technique for such environments, HMM separation and composition, has been described in [1]. HMM separation estimates the model parameters of the acoustic transfer function using adaptation data uttered from an unknown position in a noisy and reverberant environment, and HMM composition builds an HMM of noisy and reverberant speech using the acoustic transfer function estimated by HMM separation. Previously, HMM separation has been applied to an acoustic transfer function modeled by a single Gaussian distribution. However, the improvement was smaller than expected for impulse responses with long reverberation. This is because the variance of the acoustic transfer function in each frame increases when the impulse response of the room reverberation is longer than the spectral analysis window. In this paper, HMM separation is extended to estimate the acoustic transfer function based on Gaussian mixture components in order to compensate for the greater variability of the acoustic transfer function, and the re-estimation formulae are derived. In addition, this paper introduces a technique to adapt the noise weight for each mel-spaced frequency in order to improve the performance of HMM separation in the linear-spectral domain, since HMM separation in the linear-spectral domain sometimes produces a negative mean output due to the subtraction operation. The extended HMM separation is evaluated on distant-talking speech recognition tasks, and the experimental results confirm the effectiveness of the proposed method.
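
The negative-mean problem mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly used linear-spectral model O = S * H + k * N per frequency band (observed speech O, clean speech S, acoustic transfer function H, noise N, noise weight k); the abstract itself does not state this model, and the function names, the toy numbers, and the hand-picked weights below are hypothetical. The sketch works on mean values only and is not the paper's re-estimation procedure.

```python
def compose_mean(speech_mean, transfer_mean, noise_mean, noise_weight):
    """Per-band mean of noisy reverberant speech: S * H + k * N (assumed model)."""
    return [s * h + k * n
            for s, h, n, k in zip(speech_mean, transfer_mean, noise_mean, noise_weight)]


def separate_mean(observed_mean, speech_mean, noise_mean, noise_weight):
    """Invert the assumed model for H per band: (O - k * N) / S."""
    return [(o - k * n) / s
            for o, s, n, k in zip(observed_mean, speech_mean, noise_mean, noise_weight)]


# Toy linear-spectral means for three frequency bands (hypothetical values).
speech = [4.0, 2.0, 1.0]
transfer = [0.5, 0.8, 0.6]
true_noise = [1.0, 0.5, 0.2]
observed = compose_mean(speech, transfer, true_noise, [1.0, 1.0, 1.0])

# A noise model that overestimates the noise in the third band.
noise_model = [1.0, 0.5, 1.0]

# With a fixed weight of 1 the subtraction goes negative in band 3.
h_fixed = separate_mean(observed, speech, noise_model, [1.0, 1.0, 1.0])

# A smaller weight in band 3 (chosen by hand here) keeps the estimate non-negative.
h_weighted = separate_mean(observed, speech, noise_model, [1.0, 1.0, 0.2])

print("fixed weight:   ", h_fixed)
print("per-band weight:", h_weighted)
```

In this toy case the per-band weight recovers the transfer-function means used to generate the data, whereas the fixed weight yields a physically meaningless negative value in the third band. How the weights are actually adapted is described in the paper, not here.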

Bibliographic details

Source: IEICE Transactions on Information and Systems, 2004, No. 5, pp. 1127-1137 (11 pages)
Authors: Tetsuya TAKIGUCHI; Masafumi NISHIMURA
Affiliation: IBM Tokyo Research Laboratory, Yamato-shi, 242-8502 Japan
Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
CLC classification: Radio electronics, telecommunications technology
Keywords: distant-talking speech recognition; HMM separation; reverberation; noise

Similar literature

1. Improved Noise Robustness of Word HMMs Based on Weighted Variance Expansion for Noisy Speech Recognition [J]. Sukeyasu Kanno, Tetsuo Funada. Systems and Computers in Japan. 2005, No. 13
2. HMM-based stressed speech modeling with application to improved synthesis and recognition of isolated speech under stress [J]. Bou-Ghazale S.E., Hansen J.H.L. IEEE Transactions on Speech and Audio Processing. 1998, No. 3
3. An improvement of HMM separation and composition for noisy reverberant speech recognition [J]. Tetsuya Takiguchi, Masafumi Nishimura. 電子情報通信学会技術研究報告. 応用音響 (Engineering Acoustics). 2003, No. 24
4. IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition [C]. Jun Du, Qiang Huo. Annual Conference of the International Speech Communication Association. 2012
5. HMM-based non-intrusive speech quality and implementation of Viterbi score distribution and hiddenness based measures to improve the performance of speech recognition [D]. Talwar, Gaurav. 2006
6. Using blind source separation techniques to improve speech recognition in bilateral cochlear implant patients [O]. Kostas Kokkinakis, Philipos C. Loizou
7. Improved Bimodal Speech Recognition using Tied-mixture HMMs and 5000 Word audio-visual Synchronous Database [O]. Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano. 1997
8. Improved HMM Models for High Performance Speech Recognition [R]. Austin, S., Barry, C., Chow, Y. 1989
