Journal: 数据采集与处理 (Journal of Data Acquisition and Processing)

Chinese Whispered-Speech Speaker Identification Based on Instantaneous Frequency Estimation and Feature Mapping

Abstract

Whispered speech is a weak speech signal produced in a mode distinct from neutral (normally phonated) speech. When a speaker identification system (SIS) trained mainly on neutral speech is tested with whispered speech, its performance declines sharply. Based on the amplitude modulation-frequency modulation (AM-FM) model of speech production, this paper uses multiband demodulation analysis (MDA) and the energy separation algorithm (ESA) to compute the instantaneous frequency of the speech signal, and uses the resulting instantaneous frequency estimate (IFE) as a speech feature. Then, under the assumption that whispered speech and neutral speech come from different channels, feature mapping is applied to the speech parameters before training and identification in order to reduce the effect of the channel on the system. Experiments show that, compared with conventional MFCC parameters, adding feature mapping improves the system's identification rate, and that the IFE feature outperforms MFCCs in both identification rate and robustness.
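
To make the energy separation step concrete, the following is a minimal sketch of one common ESA variant, DESA-2, built on the discrete Teager-Kaiser energy operator. The abstract does not say which ESA variant the authors use, so the choice of DESA-2 is an assumption, and the function names `teager_energy` and `desa2` are hypothetical. The multiband (MDA) stage, which would first split the speech into narrow bands with a bandpass filterbank, is omitted here; the sketch assumes its input is already a single narrowband AM-FM component.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    The output is two samples shorter than the input (no value at the edges).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2(x, fs):
    """Estimate instantaneous frequency (Hz) and amplitude envelope with DESA-2.

    Assumes x is one narrowband AM-FM component sampled at fs Hz.
    Returns two arrays of length len(x) - 4, aligned to samples 2 .. len(x) - 3.
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_energy(x)            # valid for samples 1 .. N-2
    y = x[2:] - x[:-2]                  # symmetric difference y(n) = x(n+1) - x(n-1)
    psi_y = teager_energy(y)            # valid for samples 2 .. N-3
    psi_x = psi_x[1:-1]                 # trim so both energies cover samples 2 .. N-3

    eps = np.finfo(float).eps           # guards against division by zero
    arg = 1.0 - psi_y / (2.0 * np.maximum(psi_x, eps))
    omega = 0.5 * np.arccos(np.clip(arg, -1.0, 1.0))   # radians per sample
    freq_hz = omega * fs / (2.0 * np.pi)
    amp = np.abs(2.0 * psi_x / np.sqrt(np.maximum(psi_y, eps)))
    return freq_hz, amp
```

For a pure tone such as `0.5 * np.cos(2 * np.pi * 1000 * np.arange(256) / 8000)`, calling `desa2(x, 8000)` should return a frequency track close to 1000 Hz and an envelope close to 0.5. DESA-2 only resolves frequencies below a quarter of the sampling rate, and real speech bands are noisy, so in practice the per-band estimates are usually smoothed or averaged over a frame before being used as features.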
