IEICE Transactions on Information and Systems

One-Pass Semi-Dynamic Network Decoding Using a Subnetwork Caching Model for Large Vocabulary Continuous Speech Recognition


Abstract

This paper presents a new decoding framework for large vocabulary continuous speech recognition that can handle a static search network dynamically. Generally, a static network decoder can use a search space that is globally optimized in advance, and therefore it can run at high speed during decoding. However, its large memory requirement due to the large network size or the spatial complexity of the optimization algorithm often makes it impractical. Our new one-pass semi-dynamic network decoding scheme aims at incorporating such an optimized search network with memory efficiency, but without losing speed. In this framework, a complete search network is organized on the basis of self-structuring subnetworks and is nearly minimized using a modified tail-sharing algorithm. While the decoder runs, it caches subnetworks needed for decoding in memory, whereas static network decoders keep the complete network in memory. The subnetwork caching model is controlled by two levels of caches: local cache obtained by subnetwork caching operations and global cache obtained by subnetwork preloading operations. The model can also be controlled adaptively by using subnetwork profiling operations. Furthermore, it is made simple and fast with compactly designed self-structuring subnetworks. Experimental results on a 25 k-word Korean broadcast news transcription task show that the semi-dynamic decoder can run almost as fast as an equivalent static network decoder under various memory configurations by using the subnetwork caching model.
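The two-level cache described in the abstract can be made concrete with a short sketch. The code below is a minimal illustration, not the paper's implementation: the class name SubnetworkCache, the loader callable, and the LRU eviction policy are assumptions made for the example; the paper defines its own subnetwork caching, preloading, and profiling operations over self-structuring subnetworks.

```python
from collections import OrderedDict

class SubnetworkCache:
    """Illustrative two-level cache for search subnetworks (hypothetical API).
    Subnetworks are loaded on demand into a bounded local cache; frequently
    used ones, identified by profiling counts, are preloaded into a global
    cache that is never evicted."""

    def __init__(self, loader, local_capacity, preload_ids=()):
        self.loader = loader                      # callable: subnetwork id -> network object
        self.local_capacity = local_capacity      # max subnetworks held in the local cache
        self.local = OrderedDict()                # LRU local cache (caching operations)
        self.global_cache = {i: loader(i) for i in preload_ids}  # preloading operations
        self.profile = {}                         # access counts (profiling operations)

    def get(self, subnet_id):
        self.profile[subnet_id] = self.profile.get(subnet_id, 0) + 1
        if subnet_id in self.global_cache:        # hit in the preloaded global cache
            return self.global_cache[subnet_id]
        if subnet_id in self.local:               # hit in the local cache; refresh LRU order
            self.local.move_to_end(subnet_id)
            return self.local[subnet_id]
        net = self.loader(subnet_id)              # miss: load the subnetwork on demand
        self.local[subnet_id] = net
        if len(self.local) > self.local_capacity: # evict the least recently used subnetwork
            self.local.popitem(last=False)
        return net

    def hot_subnetworks(self, top_n):
        """Most frequently accessed subnetwork ids, e.g. to decide what to
        preload in a later run (adaptive control via profiling)."""
        return sorted(self.profile, key=self.profile.get, reverse=True)[:top_n]
```

A decoder would construct something like SubnetworkCache(load_subnet, local_capacity=64, preload_ids=hot_ids) and call get() for each subnetwork it enters, so that only the local cache and the preloaded set reside in memory rather than the complete search network.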
