IEEE Workshop on Spoken Language Technology

Effective data-driven feature learning for detecting name errors in automatic speech recognition

Abstract

This paper addresses the problem of detecting name errors in automatic speech recognition (ASR) output. Highly skewed label distributions (i.e., name errors are infrequent), sparse training data, and a large number of potential lexical features pose significant challenges for training name error classification systems. Data-driven feature learning is needed to handle multiple languages but is sensitive to overfitting. We address the problem by designing aggregate features using a related task (sentence-level name detection) and by reducing the dimensionality of the lexical features using word classes. Experiments on conversational-domain data in both English and Iraqi Arabic show that the best results are obtained using all feature mapping methods plus feature selection using L1 regularization.
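The two dimensionality-control ideas named in the abstract, mapping words to word classes and selecting features with L1 regularization, can be illustrated with a small sketch. This is not the authors' system: the abstract does not specify the classifier, the word classes, or the solver. The `WORD_CLASS` table, the toy windows, the hyperparameters, and the choice of logistic regression trained by proximal gradient descent are all assumptions made for illustration only.

```python
import math

# Hypothetical word-to-class table (an assumption; the abstract does not
# say how the word classes are obtained).
WORD_CLASS = {
    "mister": "TITLE", "doctor": "TITLE",
    "called": "INTRO", "named": "INTRO",
    "the": "FUNC", "a": "FUNC", "is": "FUNC",
}

def class_features(words):
    """Map a word window to a set of word-class indicator features."""
    return {WORD_CLASS.get(w, "OTHER") for w in words}

def soft_threshold(x, t):
    """Proximal operator of t*|x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def train_l1_logreg(examples, labels, lam=0.05, lr=0.5, epochs=200):
    """L1-regularized logistic regression via proximal gradient descent."""
    names = sorted({f for ex in examples for f in ex})
    w = {f: 0.0 for f in names}
    b = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad = {f: 0.0 for f in names}
        grad_b = 0.0
        for ex, y in zip(examples, labels):
            p = 1.0 / (1.0 + math.exp(-(b + sum(w[f] for f in ex))))
            for f in ex:
                grad[f] += (p - y) / n
            grad_b += (p - y) / n
        # Gradient step, then soft-thresholding; the bias is not penalized.
        for f in names:
            w[f] = soft_threshold(w[f] - lr * grad[f], lr * lam)
        b -= lr * grad_b
    return w, b

def predict(words, w, b):
    """Probability that the window contains a name error."""
    z = b + sum(w.get(f, 0.0) for f in class_features(words))
    return 1.0 / (1.0 + math.exp(-z))

# Toy, made-up context windows; label 1 marks a name error (the rarer class).
windows = [
    (["mister", "the"], 1), (["doctor", "called"], 1), (["named", "a"], 1),
    (["the", "is"], 0), (["a", "is"], 0), (["the", "a"], 0),
    (["is", "table"], 0), (["a", "chair"], 0),
]
examples = [class_features(ws) for ws, _ in windows]
labels = [y for _, y in windows]
weights, bias = train_l1_logreg(examples, labels)

# Features whose weight survived the L1 penalty.
selected = [f for f, v in weights.items() if v != 0.0]
print("selected word-class features:", selected)
```

The word-class mapping collapses nine distinct word types into four features, and the soft-thresholding step sets a weight to exactly zero whenever a gradient step leaves it within the penalty threshold. That is what makes L1 regularization act as feature selection rather than only shrinkage. With realistic data, the penalty `lam` would need tuning on held-out data, and the skewed labels would likely call for class weighting or a tuned decision threshold.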
