Conference: IEEE Global Conference on Signal and Information Processing

An investigation into instantaneous frequency estimation methods for improved speech recognition features

Abstract

Several recent studies point to the importance of the analytic phase of the speech signal in human perception, especially in noisy conditions. However, phase information is still not used in state-of-the-art speech recognition systems. In this paper, we illustrate the importance of the analytic phase of the speech signal for automatic speech recognition. Because computing the analytic phase suffers from the inevitable phase-wrapping problem, we extract features from its time derivative, referred to as the instantaneous frequency (IF). We highlight the issues involved in IF extraction from speech-like signals and propose suitable modifications for IF extraction from speech signals. We used a deep neural network (DNN) framework to build a speech recognition system using features extracted from the IF of speech signals. The speech recognition system based on IF features delivered a phoneme error rate of 21.8% on the TIMIT database, while the baseline system based on mel-frequency cepstral coefficients (MFCCs) delivered a phoneme error rate of 18.4%. Combining the IF-based and MFCC-based systems using minimum Bayes risk (MBR) decoding provided a relative improvement of 8.7% over the baseline system, illustrating the significance of the analytic phase for speech recognition.
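To make the idea of IF as the time derivative of the analytic phase concrete, here is a minimal sketch of one generic, textbook way to estimate it. It is not the modified extraction procedure the paper proposes, which the abstract does not describe. The function names `analytic_signal` and `instantaneous_frequency` are hypothetical, the naive DFT is chosen only to keep the example dependency-free, and the phase difference is taken from the conjugate product of consecutive analytic samples so that the wrapped phase never has to be unwrapped explicitly.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence, using a naive O(N^2) DFT.

    Illustrative only: a real system would use an FFT library.
    """
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue
        if k < (n + 1) // 2:
            spectrum[k] *= 2
        else:
            spectrum[k] = 0
    # Inverse DFT gives the complex analytic signal.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_frequency(x, fs):
    """Estimate IF in Hz as the discrete derivative of the analytic phase.

    The phase of z[t+1] * conj(z[t]) is the phase increment between
    consecutive samples, already confined to (-pi, pi], so no explicit
    phase unwrapping is required.
    """
    z = analytic_signal(x)
    return [
        cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2 * math.pi)
        for t in range(len(z) - 1)
    ]


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate in Hz, for illustration only
    tone = [math.cos(2 * math.pi * 1000 * t / fs) for t in range(64)]
    estimates = instantaneous_frequency(tone, fs)
    # For a steady 1000 Hz tone, every estimate should sit close to 1000 Hz.
    print(min(estimates), max(estimates))
```

A single sinusoid is the easy case. Speech contains many components at once, so a scheme like this would presumably be applied per narrow frequency band, and the abstract indicates that further modifications are needed for speech-like signals.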