Journal: Cognitive Neurodynamics

Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting

Abstract

Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today’s automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for “Sensitive Artificial Listening”.
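
The histogram equalization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard normal reference distribution and a rank-based empirical CDF estimated over a single utterance, and the function names `histogram_equalize` and `equalize_features` are hypothetical. Each feature component is mapped through its own empirical CDF and then through the inverse CDF of the reference, so the whole distribution (and therefore all of its moments) is matched to the reference, not just the mean and variance.

```python
from statistics import NormalDist


def histogram_equalize(values):
    """Map one feature component onto a standard normal reference.

    Assumption: rank-based CDF matching over the given frames; the
    source abstract does not specify the reference distribution.
    """
    n = len(values)
    if n == 0:
        return []
    reference = NormalDist(mu=0.0, sigma=1.0)
    # Indices sorted by value give each frame's rank
    # (tied values are ordered by their position in the input).
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, idx in enumerate(order):
        # Empirical CDF estimate; the 0.5 offset keeps it strictly
        # inside (0, 1) so the inverse CDF stays finite.
        cdf = (rank + 0.5) / n
        equalized[idx] = reference.inv_cdf(cdf)
    return equalized


def equalize_features(frames):
    """Equalize each component of a list of feature vectors independently."""
    if not frames:
        return []
    columns = zip(*frames)  # one tuple per feature component
    eq_columns = [histogram_equalize(list(col)) for col in columns]
    return [list(row) for row in zip(*eq_columns)]


if __name__ == "__main__":
    clean = [0.1, 0.5, 0.9, 1.3, 2.0]
    # Hypothetical monotonic distortion standing in for a noisy channel.
    noisy = [3.0 * v + 7.0 for v in clean]
    print(histogram_equalize(clean))
    print(histogram_equalize(noisy))
```

Because the mapping depends only on the rank order of the frames, any strictly increasing distortion of a component yields the same equalized values, which is the sense in which the technique reduces the mismatch between clean and noisy conditions. Real additive noise is not a purely monotonic transformation, so in practice the mismatch is reduced rather than removed.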
