Discriminative Approach to Build Hybrid Vocabulary for Conversational Telephone Speech Recognition of Agglutinative Languages

Xin LI; Jielin PAN; Qingwei ZHAO; Yonghong YAN

Abstract

Morphemes, obtained by morphological parsing, and statistical sub-words, derived by data-driven splitting, are commonly used as recognition units for speech recognition of agglutinative languages. In this letter, we propose a discriminative approach that selects, for each distinct word type, the splitting result that is more likely to improve the recognizer's performance. We define and minimize an objective function that involves the unigram language model (LM) probability and the count of misrecognized phones on the acoustic training data. After determining the splitting result for each word in the text corpus, we select the frequent units to build a hybrid vocabulary that includes both morphemes and statistical sub-words. Compared with a statistical sub-word based system, the hybrid system achieves a 0.8% reduction in letter error rate (LER) on the test set.
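The selection step described in the abstract can be illustrated with a small sketch. The abstract does not give the form of the objective function, so the weighted sum used here (negative unigram log-probability plus a weighted count of misrecognized phones) is an assumption, not the authors' formulation. The function names `choose_split` and `build_vocabulary`, the `weight` parameter, the example word, and all numbers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def choose_split(candidates, unigram_logprob, phone_errors, weight=1.0):
    """Return the candidate split with the lowest illustrative cost.

    Assumed cost (not taken from the paper):
        -sum(log P_unigram(unit)) + weight * misrecognized_phone_count
    """
    def cost(split):
        lm_cost = -sum(unigram_logprob[unit] for unit in split)
        return lm_cost + weight * phone_errors[split]

    return min(candidates, key=cost)


def build_vocabulary(corpus_words, chosen_split, size):
    """Count units over the split corpus and keep the `size` most frequent."""
    counts = Counter()
    for word in corpus_words:
        counts.update(chosen_split[word])
    return [unit for unit, _ in counts.most_common(size)]


# Hypothetical toy data: two candidate splits for one word type.
morph_split = ("ev", "ler", "de")   # stands in for a morphological parse
stat_split = ("evler", "de")        # stands in for a data-driven split
logprob = {"ev": -2.0, "ler": -1.5, "de": -1.0, "evler": -3.0}
errors = {morph_split: 3, stat_split: 5}  # made-up phone error counts

best = choose_split([morph_split, stat_split], logprob, errors)
lm_only = choose_split([morph_split, stat_split], logprob, errors, weight=0.0)

splits = {"evlerde": best, "evde": ("ev", "de")}
vocab = build_vocabulary(["evlerde", "evde", "evlerde"], splits, size=2)
```

With these made-up numbers the phone error term tips the choice toward the morphological split, while the LM term alone (`weight=0.0`) prefers the statistical one. This is the kind of trade-off a combined objective is meant to capture.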

Bibliographic details

Source: IEICE Transactions on Information and Systems, 2013, No. 11, 5 pages
Authors: Xin LI; Jielin PAN; Qingwei ZHAO; Yonghong YAN
Original format: PDF
CLC classification: Radio electronics, telecommunications technology
