Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech: Speech Understanding / Interaction

Abstract

Log area ratio (LAR) coefficients derived from linear prediction coefficients (LPC) are a well-known feature used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system. Instead of the mel-frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker identification system. An F-ratio feature analysis conducted on both the LAR and MFCC feature vectors showed that the lower-order LAR coefficients are superior to their MFCC counterparts. The text-independent, closed-set speaker identification rate, tested on a down-sampled version of the TIMIT database, improved from 96.73% using the MFCC feature to 98.81% using the LAR feature.
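The pipeline named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the sketch uses the common textbook definitions (reflection coefficients from the Levinson-Durbin recursion, LAR_i = log((1 + k_i) / (1 - k_i)), F-ratio as the variance of speaker means over the mean within-speaker variance, and closed-set identification as the arg-max of the GMM log-likelihood). All function names are hypothetical, the GMMs are assumed to be already trained with diagonal covariances, and the LAR sign convention differs between references.

```python
import math

def autocorrelation(frame, order):
    """Autocorrelation values r[0..order] of one (windowed) speech frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(order + 1)]

def reflection_coefficients(r):
    """Levinson-Durbin recursion on r[0..p]; returns reflection coefficients k_1..k_p."""
    order = len(r) - 1
    a = [1.0] + [0.0] * order      # prediction polynomial, a[0] = 1
    err = r[0]                     # prediction error energy
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return ks

def log_area_ratios(ks):
    """Assumed definition: LAR_i = log((1 + k_i) / (1 - k_i)), valid for |k_i| < 1."""
    return [math.log((1.0 + k) / (1.0 - k)) for k in ks]

def f_ratio(values_by_speaker):
    """F-ratio of one scalar feature: variance of speaker means / mean within-speaker variance."""
    means = [sum(v) / len(v) for v in values_by_speaker]
    grand = sum(means) / len(means)
    between = sum((m - grand) ** 2 for m in means) / len(means)
    within = sum(sum((x - m) ** 2 for x in v) / len(v)
                 for v, m in zip(values_by_speaker, means)) / len(means)
    return between / within

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of feature frames under a diagonal-covariance GMM.

    gmm is a list of (weight, mean_vector, variance_vector) tuples.
    """
    total = 0.0
    for x in frames:
        comps = []
        for weight, mean, var in gmm:
            ll = math.log(weight)
            for xd, md, vd in zip(x, mean, var):
                ll -= 0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
            comps.append(ll)
        top = max(comps)           # log-sum-exp over mixture components
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total

def identify(frames, models):
    """Closed-set identification: the enrolled speaker whose GMM scores highest."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, models[name]))
```

In use, each frame of speech would be mapped to a LAR vector via `log_area_ratios(reflection_coefficients(autocorrelation(frame, order)))`, and the resulting vectors passed to `identify` together with one GMM per enrolled speaker. GMM training (typically by expectation-maximization) is left out of the sketch.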


Bibliographic details

Source: Pacific Rim International Conference on Artificial Intelligence; 9-13 August 2004; Auckland, NZ | 2004 | pp. 901-908 | 8 pages
Conference location: Auckland (NZ)
Authors: David Chow; Waleed H. Abdulla
Affiliation: Electrical and Electronic Engineering Department, The University of Auckland, Auckland, New Zealand
Original format: PDF
Language: English
CLC classification: Artificial intelligence theory

Similar literature

1. SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements [J]. Tran Huy Dat, Hiroshi Fujimura, Kazuya Takeda, 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2004, No. 539

2. SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements [J]. Tran HUY DAT, Hiroshi FUJIMURA, Kazuya TAKEDA, 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2004, No. 539

3. SNR and sub-band SNR estimation based on Gaussian mixture modeling in the log power domain with application for speech enhancements [J]. Tran HUY DAT, Hiroshi FUJIMURA, Kazuya TAKEDA, 電子情報通信学会技術研究報告. 音声. Speech. 2004, No. 542

4. Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech Speech Understanding / Interaction [C] . David Chow, Waleed H. Abdulla, Lecture Notes in Artificial Intelligence 3157 Pacific Rim International Conference on Artificial Intelligence . 2004

5. A software based speaker identification system using Gaussian mixture model classification. [D] . Reynolds, Ryan M. 2005

6. Detecting Manic State of Bipolar Disorder Based on Support Vector Machine and Gaussian Mixture Model Using Spontaneous Speech [O] . Zhongde Pan, Chao Gui, Jing Zhang, 2018

7. Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech [O] . David Chow, Waleed H. Abdulla 2008

8. Automatic Detection of Depression in Speech Using Gaussian Mixture Modeling with Factor Analysis. [R] . Sturim, D., Torres-Carrasquillo, P., Quatieri, T. F., 2015

