Journal: Multimedia Tools and Applications

Spoken keyword search system using improved ASR engine and novel template-based keyword scoring

Abstract

Keyword search in spoken documents has grown in importance as the volume of spoken data increases. A typical system combines an Automatic Speech Recognition (ASR) system with information retrieval methods. Although many studies have sought to optimize system performance, KeyWord Search (KWS) systems still suffer from two main drawbacks. First, system performance depends strongly on the ASR transcripts, which are inherently inexact: because of speech signal variability, ASR systems remain far from perfect. Second, KWS systems make detection decisions based on lattice-based posterior probabilities, which are not comparable across keywords. In addition, the posterior probabilities of true detections usually fall into different ranges, which degrades spotting performance. This paper addresses both problems: the accuracy of ASR transcriptions and keyword detection decisions based on posterior probabilities. More specifically, we propose to improve ASR transcript accuracy by introducing a new ASR architecture that integrates data augmentation and ensemble learning techniques into a single framework. We also propose a novel keyword rescoring method that provides scores from a new perspective. Specifically, inspired by the template-based KWS approach, similarity scores between the detected keywords are computed from the distance between their acoustic features and are used as new scores for the detection decision. Experiments on French and English datasets show that the proposed KWS system has the potential to yield more accurate keyword results than conventional systems.
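
The rescoring idea in the abstract (scoring each detected keyword by its acoustic similarity to the other detections) can be illustrated with a minimal sketch. The abstract does not state which distance measure, features, or score mapping the authors use, so everything here is an assumption: dynamic time warping (DTW) over frame-level feature vectors stands in for the distance, `exp(-distance)` stands in for the similarity score, and the function names `dtw_distance` and `rescore_detections` are hypothetical.

```python
import numpy as np


def dtw_distance(a, b):
    """Length-normalized DTW distance between two feature sequences.

    a, b: arrays of shape (frames, dims), e.g. MFCC-like vectors (assumed).
    """
    n, m = len(a), len(b)
    # Euclidean distance between every pair of frames.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # acc[i, j] = cheapest alignment cost of a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    return acc[n, m] / (n + m)


def rescore_detections(features):
    """Assign each detection a similarity score in (0, 1].

    features: list of (frames, dims) arrays, one per detection of the
    same keyword. The score is exp(-mean DTW distance to the other
    detections); this mapping is an illustrative choice, not the paper's.
    """
    scores = []
    for i, feat in enumerate(features):
        dists = [dtw_distance(feat, other)
                 for j, other in enumerate(features) if j != i]
        # A lone detection has nothing to compare against.
        scores.append(float(np.exp(-np.mean(dists))) if dists else 0.0)
    return scores
```

Under these assumptions, a detection whose acoustics resemble the other hypotheses for the same keyword receives a higher score than an outlier, which gives a decision score that does not rely on the lattice posterior alone.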
