INTERSPEECH 2012

Emotion Recognition using Acoustic and Lexical Features



Abstract

In this paper we present an innovative approach for utterance-level emotion recognition by fusing acoustic features with lexical features extracted from automatic speech recognition (ASR) output. The acoustic features are generated by combining: (1) a novel set of features derived from segmental Mel Frequency Cepstral Coefficients (MFCC) scored against emotion-dependent Gaussian mixture models, and (2) statistical functionals of low-level feature descriptors such as intensity, fundamental frequency, jitter, and shimmer. These acoustic features are fused with two types of lexical features extracted from the ASR output: (1) presence/absence of word stems, and (2) bag-of-words sentiment categories. The combined feature set is used to train support vector machines (SVM) for emotion classification. We demonstrate the efficacy of our approach by performing four-way emotion recognition on the University of Southern California's Interactive Emotional Dyadic Motion Capture (USC-IEMOCAP) corpus. Our experiments show that the fusion of acoustic and lexical features delivers an emotion recognition accuracy of 65.7%, outperforming the previously reported best results on this challenging dataset.
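The feature-fusion pipeline the abstract outlines can be sketched roughly as follows, using synthetic data in place of real MFCC frames and ASR output. Everything here is an illustrative assumption rather than the authors' implementation: scikit-learn's `GaussianMixture` and `SVC` stand in for their models, the feature dimensions are arbitrary, and the lexical features are random placeholders for the word-stem and sentiment indicators.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
N_EMOTIONS = 4  # four-way recognition, as in the paper

# Synthetic stand-in for an utterance's MFCC frames: (n_frames, 13).
# Real features would come from a front end such as MFCC extraction.
def fake_utterance(emo_idx, n_frames=50):
    return rng.normal(loc=emo_idx, scale=1.0, size=(n_frames, 13))

# Labeled training "utterances" (20 per emotion class).
train = [(fake_utterance(i), i) for i in range(N_EMOTIONS) for _ in range(20)]

# (1) Fit one emotion-dependent GMM on each class's pooled frames.
gmms = []
for i in range(N_EMOTIONS):
    frames = np.vstack([u for u, y in train if y == i])
    gmms.append(GaussianMixture(n_components=2, random_state=0).fit(frames))

# (2) Acoustic features per utterance: average log-likelihood under each
#     emotion GMM, plus simple statistical functionals (mean and std of
#     each coefficient as a toy substitute for the paper's functionals).
def acoustic_features(frames):
    gmm_scores = [g.score(frames) for g in gmms]                 # 4 dims
    functionals = np.concatenate([frames.mean(0), frames.std(0)])  # 26 dims
    return np.concatenate([gmm_scores, functionals])

# (3) Lexical features would be binary word-stem presence and bag-of-words
#     sentiment indicators from ASR output; random placeholders here.
def lexical_features():
    return rng.integers(0, 2, size=10).astype(float)

# (4) Fuse acoustic and lexical features and train an SVM classifier.
X = np.array([np.concatenate([acoustic_features(u), lexical_features()])
              for u, _ in train])
y = np.array([label for _, label in train])
clf = SVC(kernel="linear").fit(X, y)
print("training accuracy:", clf.score(X, y))
```

The fusion here is plain early (feature-level) concatenation; the well-separated synthetic classes make the linear SVM fit nearly perfectly, which real emotional speech of course would not.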
