Conference: Pacific Rim International Conference on Artificial Intelligence; 2004-08-09 to 2004-08-13; Auckland, NZ

Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech (Speech Understanding / Interaction)


Abstract

Log area ratio (LAR) coefficients, derived from linear prediction coefficients (LPC), are a well-known feature used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system: instead of mel-frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker identification system. An F-ratio feature analysis conducted on both the LAR and MFCC feature vectors showed that the lower-order LAR coefficients are superior to their MFCC counterparts. The text-independent, closed-set speaker identification rate, as tested on a down-sampled version of the TIMIT database, improved from 96.73% using the MFCC feature to 98.81% using the LAR feature.
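
To make the feature pipeline in the abstract concrete, here is a minimal sketch of how LAR coefficients can be obtained from one speech frame, plus one common form of the F-ratio for a single feature dimension. It is not the authors' implementation. The function names, the Levinson-Durbin route through reflection coefficients, the sign convention for the reflection coefficients, and the use of population variances in the F-ratio are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def reflection_coefficients(r, order):
    """Levinson-Durbin recursion on autocorrelation r.

    Returns the reflection coefficients k_1..k_order for the
    prediction-error filter A(z) = 1 + a_1 z^-1 + ... (assumed convention).
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy; LPC analysis is undefined")
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)  # remaining prediction-error energy
    return ks

def log_area_ratios(ks):
    """LAR_i = log((1 + k_i) / (1 - k_i)); requires |k_i| < 1."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in ks]

def f_ratio(values_per_speaker):
    """Variance of the speaker means divided by the mean within-speaker
    variance, for one feature dimension (population variances assumed)."""
    means = [sum(v) / len(v) for v in values_per_speaker]
    grand = sum(means) / len(means)
    between = sum((m - grand) ** 2 for m in means) / len(means)
    within = sum(
        sum((x - m) ** 2 for x in v) / len(v)
        for v, m in zip(values_per_speaker, means)
    ) / len(values_per_speaker)
    return between / within
```

In a system like the one described, each frame's LAR vector would be fed to per-speaker GMMs in place of MFCC vectors, and a higher F-ratio for a coefficient would indicate that it separates speakers better. Note that some texts define the reflection coefficients with the opposite sign, which flips the sign of every LAR value.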
