Source: International Journal of Speech Technology

Improving the performance of the speaker emotion recognition based on low dimension prosody features vector

Abstract

Speaker emotion recognition is an important research problem with many applications in human-robot interaction, human-computer interaction, and related areas. This work addresses the recognition of a speaker's emotion from a speech utterance. For this purpose, features such as pitch, log energy, zero-crossing rate, and the first three formant frequencies are used. Feature vectors are constructed from 11 statistical parameters of each feature. An artificial neural network (ANN) is chosen as the classifier owing to its universal function approximation capability. In an ANN-based classifier, the time required for training the network and for classification depends on the dimension of the feature vector. This work therefore focuses on developing a speaker emotion recognition system that uses prosody features while reducing the dimensionality of the feature vectors. Principal component analysis (PCA) is used for the dimensionality reduction. The Emotional Prosody Speech and Transcripts corpus from the Linguistic Data Consortium (LDC) and the Berlin emotional database are used to evaluate the proposed approach on the recognition of seven types of emotion. The proposed method is compared with existing approaches and performs better. Experimental results show recognition rates of 75.32% and 84.5% for the Berlin emotional database and the LDC emotional speech database, respectively.
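
The feature pipeline described in the abstract (frame-level contours, 11 statistics per contour, then PCA) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions: the abstract does not list the 11 statistical parameters, so the set in `eleven_stats` is a hypothetical choice; the frame length, hop size, and the number of retained components (`k=5`) are arbitrary; random noise stands in for speech; and only log energy and zero-crossing rate are computed, since pitch and formant tracking need a dedicated estimator. The ANN classifier is left out.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_energy(frames, eps=1e-10):
    """Log of the summed squared amplitude of each frame."""
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign differs, per frame."""
    signs = np.sign(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def eleven_stats(track):
    """Summarize one contour with 11 statistics.

    ASSUMPTION: the abstract does not name its 11 parameters;
    this particular set is hypothetical.
    """
    q25, q50, q75 = np.percentile(track, [25, 50, 75])
    mu, sd = track.mean(), track.std()
    z = (track - mu) / (sd + 1e-12)
    return np.array([
        mu, q50, sd,
        track.min(), track.max(), track.max() - track.min(),
        q25, q75, q75 - q25,
        np.mean(z ** 3),   # skewness
        np.mean(z ** 4),   # kurtosis
    ])

def pca_fit(X, k):
    """Return the column means and the top-k principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def pca_transform(X, mean, components):
    """Project centered data onto the principal directions."""
    return (X - mean) @ components.T

# Synthetic stand-in data: 20 "utterances" of 1 s of noise at 16 kHz.
rng = np.random.default_rng(0)
utterances = [rng.standard_normal(16000) for _ in range(20)]

X = np.stack([
    np.concatenate([
        eleven_stats(log_energy(frames)),
        eleven_stats(zero_crossing_rate(frames)),
    ])
    for frames in map(frame_signal, utterances)
])

mean, components = pca_fit(X, k=5)
Z = pca_transform(X, mean, components)
print(X.shape, Z.shape)
```

With all six features from the abstract, the same construction would give a 66-dimensional vector (6 features × 11 statistics) before PCA. In practice the columns would usually be standardized before the projection, because the statistics are on very different scales; the sketch omits that step for brevity.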
