
DESIGN IMPLEMENTATION OF A REAL-TIME, SPEAKER-INDEPENDENT, CONTINUOUS SPEECH RECOGNITION SYSTEM WITH VLIW DIGITAL SIGNAL PROCESSOR ARCHITECTURE

Abstract

This thesis explores the feasibility of mapping a real-time, continuous speech recognition system onto a multi-core Digital Signal Processor (DSP) architecture. A pure hardware solution can implement the entire recognition process in real time, but its design process can be lengthy and inflexible to changes. A low-end embedded processor such as the ARM7, on the other hand, is not powerful enough to run the recognition process in real time. As a result, a more flexible and powerful DSP solution based on Texas Instruments' C6713 multi-core DSP is used to exploit the instruction-level parallelism within the speech recognition process. By exploiting this parallelism with seven optimization techniques, the recognition process can run in real time on a 300 MHz DSP for a 1000-word vocabulary. At its core, continuous speech recognition is a matching problem. The recognition process can be divided into four major phases: Feature Extraction, Acoustic Modeling, Phone Modeling and Word Modeling. Each phase is analyzed in detail to identify performance issues. In short, the major issues are the heavy computational load and the high memory bandwidth required. After the various optimizations are applied, the overall computational performance improves from about 15 times slower than real time to 1.6 times faster than real time on the hardware. The memory bandwidth problem can be solved through the use of Direct Memory Access and a larger cache memory. The conclusion is that a multi-core DSP running at 300 MHz is sufficient to implement a 1000-word Command & Control type application using the optimization techniques described in this thesis.
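The two performance figures quoted in the abstract can be combined into a single overall speedup. The sketch below is not taken from the thesis: the function name `real_time_factor` and the convention that the real-time factor is processing time divided by audio duration are assumptions made for illustration. Under that convention, "about 15 times slower than real time" corresponds to a factor of roughly 15 and "1.6 times faster than real time" to roughly 0.625, which implies an overall speedup of about 24 times.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration (hypothetical helper).

    A value below 1.0 means the recognizer runs faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures quoted in the abstract, expressed per one second of audio.
rtf_before = real_time_factor(15.0, 1.0)      # about 15x slower than real time
rtf_after = real_time_factor(1.0 / 1.6, 1.0)  # 1.6x faster than real time

# Ratio of the two factors: how much faster the optimized system is overall.
overall_speedup = rtf_before / rtf_after

print(f"RTF before optimization: {rtf_before:.3f}")
print(f"RTF after optimization:  {rtf_after:.3f}")
print(f"Implied overall speedup: {overall_speedup:.1f}x")
```

Since the abstract gives the starting point only as "about 15 times", the resulting figure should be read as approximate.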

Bibliographic details

  • Author: Ng Wai-Ting
  • Year: 2007
  • Original format: PDF
  • Language: en
