A speech recognition IC with an efficient MFCC extraction algorithm and multi-mixture models.

Abstract

Automatic speech recognition (ASR) by machine has received a great deal of attention in past decades. Speech recognition algorithms based on Mel-frequency cepstral coefficients (MFCC) and the hidden Markov model (HMM) achieve better recognition performance than other speech recognition algorithms and are widely used in many applications. This thesis presents a speech recognition system with an efficient MFCC extraction algorithm and multi-mixture models. It is composed of two parts: an MFCC feature extractor and an HMM-based speech decoder.

In the conventional MFCC feature extraction algorithm, speech is divided into short overlapping frames. The existing extraction algorithm requires a large amount of computation and is not suitable for hardware implementation. We have developed a hardware-efficient MFCC feature extraction algorithm. The new algorithm reduces the computational power by 54% compared with the conventional algorithm, with only a 1.7% reduction in recognition accuracy.

For the HMM-based decoder of the speech recognition system, it is advantageous to use models with multiple mixtures, but the calculation becomes more complicated as the number of mixtures grows. Using a table look-up method proposed in this thesis, the new design can handle up to 16 states and 8 mixtures, and it can easily be extended to handle models with more states and mixtures. We implemented the new algorithm on an Altera FPGA chip using fixed-point calculation and tested the FPGA chip with speech data from the AURORA 2 database, a well-known database designed to evaluate the performance of speech recognition algorithms in noisy conditions [27]. The recognition accuracy of the new system is 91.01%. For comparison, a conventional software recognition system running on a PC with 32-bit floating-point calculation has a recognition accuracy of 94.65%.
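The abstract does not describe how the thesis's table look-up method works, so the following is only a sketch of one common way a look-up table can simplify multi-mixture scoring: combining per-mixture log-likelihoods with a tabulated "log-add" correction instead of evaluating exponentials at run time. The table size, the cut-off `MAX_DIFF`, and all function names are illustrative assumptions, and floating-point arithmetic is used here rather than the fixed-point arithmetic of the FPGA design.

```python
import math

# Hypothetical table parameters (assumptions, not taken from the thesis).
TABLE_SIZE = 256
MAX_DIFF = 16.0                  # beyond this gap the smaller term is ignored
STEP = MAX_DIFF / TABLE_SIZE

# Precomputed correction log(1 + exp(-d)) for d = 0, STEP, 2*STEP, ...
LOG_ADD_TABLE = [math.log1p(math.exp(-i * STEP)) for i in range(TABLE_SIZE)]


def log_add(a, b):
    """Approximate log(exp(a) + exp(b)) with one table look-up."""
    hi, lo = (a, b) if a >= b else (b, a)
    d = hi - lo
    if d >= MAX_DIFF:
        return hi                # correction term is negligible
    return hi + LOG_ADD_TABLE[int(d / STEP)]


def diag_gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return total


def mixture_loglik(x, weights, means, variances):
    """Log-likelihood of x under a Gaussian mixture, accumulated with log_add."""
    acc = None
    for w, m, v in zip(weights, means, variances):
        term = math.log(w) + diag_gaussian_logpdf(x, m, v)
        acc = term if acc is None else log_add(acc, term)
    return acc
```

In this sketch each additional mixture costs one comparison, one subtraction, one table read, and one addition, which is the kind of operation mix that maps naturally onto hardware. The approximation error per `log_add` is bounded by the table step, so a finer table trades memory for accuracy.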

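Bibliographic details

  • Author

    Han, Wei

  • Author affiliation

    The Chinese University of Hong Kong (Hong Kong)

  • Degree-granting institution The Chinese University of Hong Kong (Hong Kong)
  • Subject Engineering, Electronics and Electrical
  • Degree Ph.D.
  • Year 2006
  • Pages 255 p.
  • Total pages 255
  • Original format PDF
  • Language of text English
  • Chinese Library Classification Radio electronics, telecommunications technology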