Combining Word and Phonetic-Code Representations for Spoken Document Retrieval

Abstract

The traditional approach to spoken document retrieval (SDR) uses an automatic speech recognizer (ASR) in combination with a word-based information retrieval method. This approach has shown only limited accuracy, partly because ASR systems tend to produce transcriptions of spontaneous speech with a significant word error rate. To overcome this limitation, we propose a method that uses word and phonetic-code representations in combination. The idea is to reduce the impact of transcription errors on the processing of some (presumably complex) queries by representing words with similar pronunciations through the same phonetic code. Experimental results on the CLEF-CLSR-2007 corpus are encouraging: the proposed hybrid method improved the mean average precision and the number of retrieved relevant documents over the traditional word-based approach by 3% and 7%, respectively.
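
The idea of mapping similar-sounding words to one phonetic code can be illustrated with a minimal sketch. The abstract does not name the phonetic coding scheme or how the two representations are merged, so the sketch assumes classic Soundex codes and a simple concatenation of word terms and code terms. The names `soundex`, `hybrid_terms`, and the `w:` / `ph:` prefixes are hypothetical and are not taken from the paper.

```python
import string

# Soundex digit for each consonant group (standard American Soundex table).
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of `word` ("" if it has no letters)."""
    letters = [c for c in word.lower() if c in string.ascii_lowercase]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same digit
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated digit is kept
    return (code + "000")[:4]


def hybrid_terms(text):
    """Assumed combination: index both the words and their phonetic codes."""
    words = text.lower().split()
    word_terms = ["w:" + w for w in words]
    code_terms = ["ph:" + soundex(w) for w in words if soundex(w)]
    return word_terms + code_terms


if __name__ == "__main__":
    transcript = hybrid_terms("meet rupert")  # ASR output with a misrecognized name
    query = hybrid_terms("robert")            # what the user actually asked for
    print(set(transcript) & set(query))
```

In this sketch the misrecognized "rupert" and the query term "robert" share no word term, but both map to the same Soundex code, so the phonetic terms still overlap. How the word and phonetic evidence should be weighted is a design choice the abstract does not specify.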


Bibliographic record

Source
Annual International Conference on Computational Linguistics and Intelligent Text Processing, 2011, 9 pages
Authors
Alejandro Reyes-Barragán; Manuel Montes-y-Gómez; Luis Villaseñor-Pineda
Original format: PDF
Chinese Library Classification: TP3-53

