Published in: International Conference on Asian Language Processing
Classification of phonemes using modulation spectrogram based features for Gujarati language

Abstract

In this paper, features extracted from the modulation spectrogram are used to classify phonemes in the Gujarati language. The modulation spectrogram, which is a two-dimensional (2-D) feature representation, is reduced to a smaller feature dimension by the proposed feature extraction method. A Gujarati database was manually segmented into 31 phoneme classes, and the phonemes are classified using a support vector machine (SVM) classifier. The proposed feature set yields a phoneme classification accuracy of 94.5 %, compared with 92.74 % for the state-of-the-art Mel frequency cepstral coefficients (MFCC) feature set. Classification accuracy is also determined for the broad phoneme classes, viz., vowels, stops, nasals, semivowels, affricates, and fricatives: with the proposed feature set, phonemes are assigned to their respective broad classes with 95.03 % accuracy. Fusing MFCC with the proposed feature set performs better still, giving a phoneme classification accuracy of 95.7 %. With the fused features, classification into sonorant and obstruent classes is 97.01 % accurate.
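The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the proposed feature extraction method, so everything below is an assumption made for illustration: the frame length and hop, the block-averaging reduction (`band_average`), the synthetic test signal, and the treatment of "fusion" as simple concatenation with an MFCC vector (here a zero-filled placeholder). A naive DFT from the standard library is used to keep the example self-contained; it is not meant to represent the authors' implementation.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the non-negative-frequency DFT bins of a real sequence."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def modulation_spectrogram(signal, frame_len=32, hop=16):
    """Return a 2-D list indexed as [acoustic_bin][modulation_bin].

    frame_len and hop are illustrative values, not taken from the paper.
    """
    # Step 1: short-time magnitude spectrum, one row per frame.
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    stft = [dft_magnitudes(f) for f in frames]
    # Step 2: for each acoustic bin, take a second DFT along the time axis
    # of that bin's magnitude trajectory (its temporal envelope).
    n_bins = frame_len // 2 + 1
    return [dft_magnitudes([row[k] for row in stft]) for k in range(n_bins)]


def band_average(mod_spec, n_acoustic=4, n_modulation=2):
    """Hypothetical reduction: mean magnitude over a coarse grid of blocks.

    This stands in for the paper's (unspecified) feature extraction step.
    """
    rows, cols = len(mod_spec), len(mod_spec[0])
    feats = []
    for a in range(n_acoustic):
        r0, r1 = a * rows // n_acoustic, (a + 1) * rows // n_acoustic
        for m in range(n_modulation):
            c0, c1 = m * cols // n_modulation, (m + 1) * cols // n_modulation
            block = [mod_spec[r][c]
                     for r in range(r0, r1) for c in range(c0, c1)]
            feats.append(sum(block) / len(block))
    return feats


# Synthetic amplitude-modulated tone standing in for one phoneme segment.
signal = [
    (1 + 0.5 * math.cos(2 * math.pi * t / 64)) * math.sin(2 * math.pi * 4 * t / 32)
    for t in range(256)
]
mod_feats = band_average(modulation_spectrogram(signal))

# Assumed feature-level fusion: concatenate with an MFCC vector.
# The 13 zeros are a placeholder, not real MFCCs.
mfcc_placeholder = [0.0] * 13
fused = mfcc_placeholder + mod_feats
print(len(mod_feats), len(fused))
```

In a full experiment, one such vector per manually segmented phoneme would be passed, together with its class label, to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`); that step is omitted here because it needs labeled data.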
