MLP-based isolated phoneme classification using likelihood features extracted from reconstructed phase space

Yasser Shekofteh; Farshad Almasganj; Ayoub Daliri


Abstract

Nonlinear properties of a complex signal can be represented in reconstructed phase space (RPS). Researchers have previously developed RPS-based feature extraction approaches to capture these properties. Typically, such approaches are more computationally demanding (higher run-time) and less accurate than traditional techniques such as Mel-frequency cepstral coefficients (MFCCs), which do not capture nonlinear properties of signals. To overcome these issues, we propose a new RPS-based feature extraction approach that builds on a previously reported one. The proposed approach calculates the similarities between the embedded speech signals and a set of predefined speech attractor models in the RPS, and uses these similarities as input features for a final phonetic classifier. A set of Gaussian mixture models (GMMs) is trained to represent the variety of all phoneme attractors in the RPS. Using the trained GMMs, a feature vector consisting of log-likelihoods is calculated for each embedded out-of-sample speech signal. An MLP-based classifier is then used to estimate posterior probabilities for the phoneme classes. To test the proposed approach, we apply it to a Persian speech corpus (FARSDAT). Results show a 1.89% absolute improvement in classification accuracy over a baseline system that uses MFCC features. Combining different classifiers that use the proposed RPS-based features and MFCC features yields the highest phoneme classification accuracy, 68.85%, an absolute improvement of 4.78% over a baseline system.
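The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The embedding dimension and lag, the diagonal covariances, the averaging of log-likelihoods over embedded points, and all function names are assumptions made for illustration. The GMM parameters are supplied by hand rather than trained, and the MLP stage is omitted.

```python
import math

def embed(signal, dim, lag):
    """Time-delay embedding: row n is [x[n], x[n+lag], ..., x[n+(dim-1)*lag]]."""
    span = (dim - 1) * lag
    return [[signal[n + k * lag] for k in range(dim)]
            for n in range(len(signal) - span)]

def log_gauss_diag(point, mean, var):
    """Log density of a diagonal-covariance Gaussian at one point."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (p - m) ** 2 / v)
               for p, m, v in zip(point, mean, var))

def gmm_avg_loglik(points, gmm):
    """Average per-point log-likelihood under one GMM.

    gmm is a list of (weight, mean, var) components (assumed layout).
    """
    total = 0.0
    for pt in points:
        terms = [math.log(w) + log_gauss_diag(pt, mu, var)
                 for w, mu, var in gmm]
        top = max(terms)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total / len(points)

def likelihood_features(signal, gmms, dim=3, lag=2):
    """One log-likelihood feature per attractor model (dim, lag are assumed)."""
    points = embed(signal, dim, lag)
    return [gmm_avg_loglik(points, g) for g in gmms]

# Toy example: two hypothetical single-component "attractor" models.
gmm_a = [(1.0, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])]
gmm_b = [(1.0, [5.0, 5.0, 5.0], [1.0, 1.0, 1.0])]
signal = [math.sin(0.3 * n) for n in range(200)]
features = likelihood_features(signal, [gmm_a, gmm_b])
```

In this toy case the first feature is larger than the second, because the embedded sine wave lies near the origin. In the setting the abstract describes, a vector of this kind, with one entry per phoneme model, would be the input to the MLP classifier.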


Bibliographic details

Source
Engineering Applications of Artificial Intelligence | 2015, Issue 9 | pp. 1-9 | 9 pages
Authors
Yasser Shekofteh; Farshad Almasganj; Ayoub Daliri
Author affiliations

Biomedical Engineering Department, Amirkabir University of Technology, Hafez Avenue, PO Box 15875-4413, Tehran, Iran; Research Center for Development of Advanced Technologies (RCDAT), Tehran, Iran

Biomedical Engineering Department, Amirkabir University of Technology, Tehran, Iran

Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA; Department of Speech, Language, and Hearing Sciences, Boston University, Boston, MA, USA

Original format: PDF
Language: English
Keywords
Isolated phoneme classification; Nonlinear speech processing; Phoneme attractor; Reconstructed phase space; Gaussian mixture models;


