Front-end of Wake-Up-Word Speech Recognition System Design on FPGA.

Abstract

A typical speech recognition system is push-button operated (push-to-talk), which requires hand movement and hence a mixed multi-modal interface. However, for disabled patients and for users of hands-busy applications (e.g., where the user has objects to manipulate or a device to control while asking for assistance from another device), movement may be restricted or impossible. The only alternative is a speech-only interface. The proposed method is called Wake-Up-Word Speech Recognition (WUW-SR). A WUW-SR system would allow the user to operate (activate) many systems (cell phone, computer, elevator, etc.) with speech commands instead of hand movements.

This work defines a new front-end paradigm for Wake-Up-Word Speech Recognition on Field Programmable Gate Arrays (FPGA). The state-of-the-art front-end of the WUW-SR system is based on three subsystems that can produce three sets of features simultaneously: Mel-frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPC), and Enhanced Mel-frequency Cepstral Coefficients (ENH_MFCC). The extracted features are then compressed and transmitted over a dedicated channel to the server, where they are decoded with the corresponding Hidden Markov Models (HMMs) in the back-end stage of the WUW-SR.

In the WUW-SR system, the front-end processor is located at the terminal (e.g., a hand-held device), which is typically connected over a data network to a remote back-end recognizer (e.g., a server). The WUW front-end can be added to any hand-held electronic device compatible with WUW-SR, so that the device can be commanded (activated) by voice alone, with no push-to-talk as is presently done. The WUW front-end is designed and implemented on an Altera DSP development kit with a Cyclone III FPGA as a portable system acting as a processor capable of computing three different sets of features at a much faster rate than software. It is cost-effective, consumes very little power, and is not limited by having to operate on a general-purpose computer, so it can be used on any portable device.
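Of the three feature sets named in the abstract, LPC is the simplest to illustrate in software. The sketch below is not the thesis's FPGA design, which the abstract does not describe; it is the textbook autocorrelation method with the Levinson-Durbin recursion. The function names, the Hamming window, and the prediction order are illustrative assumptions.

```python
import math


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err), where a[0] == 1.0 and the predictor is
    x[n] ~= -(a[1]*x[n-1] + ... + a[order]*x[n-order]);
    err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def lpc_coefficients(frame, order, window=True):
    """LPC analysis of one frame (assumed, illustrative front-end step)."""
    n = len(frame)
    if n < 2 or n <= order:
        raise ValueError("frame must be longer than the prediction order")
    if window:
        # Hamming window; the choice of window is an assumption of this sketch.
        frame = [
            s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, s in enumerate(frame)
        ]
    return levinson_durbin(autocorrelation(frame, order), order)
```

A front-end of this kind would call `lpc_coefficients` once per short speech frame. A frame of 240 samples (30 ms at 8 kHz) with order 10 is a common choice in the speech literature, not a figure taken from this thesis.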

Bibliographic record

  • Author

    Eljhani, Mohamed Muftah

  • Author affiliation

    Florida Institute of Technology

  • Degree-granting institution Florida Institute of Technology
  • Subject Computer engineering; Computer science; Electrical engineering
  • Degree Ph.D.
  • Year 2015
  • Pages 178 p.
  • Total pages 178
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification Agriculture (Agronomy)
