IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition

Abstract

Generating high-precision sub-phonetic attribute (also known as phonological feature) and phone lattices is a key front-end component of detection-based, bottom-up speech recognition. In this paper we employ deep neural networks (DNNs) to improve detection accuracy over conventional shallow multi-layer perceptrons (MLPs) with one hidden layer. We explore a range of DNN architectures with five to seven hidden layers and up to 2048 hidden units per layer. Trained on SI84 and tested on the Nov92 WSJ data, the proposed DNNs achieve significant improvements over the shallow MLPs, producing frame-level attribute estimation accuracies above 90% for all 21 attributes tested for the full system. On the phone detection task, we also obtain a frame-level accuracy of 86.6%. With this level of high-precision detection of basic speech units, we open the door to a new family of flexible speech recognition system designs for both top-down and bottom-up, lattice-based search strategies and knowledge integration.
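The abstract reports its results as frame-level accuracies. As a rough illustration of what such a metric measures, the following is a minimal sketch, not the authors' scoring code. It assumes that a phone frame counts as correct when the highest-scoring class matches the reference label, and that each attribute detector emits a per-frame presence probability that is thresholded at 0.5. The function names, the threshold, and the toy values are hypothetical.

```python
from typing import Sequence


def frame_accuracy(posteriors: Sequence[Sequence[float]],
                   labels: Sequence[int]) -> float:
    """Fraction of frames whose arg-max class equals the reference label.

    Assumption: one row of class scores per frame, one integer label per frame.
    """
    if len(posteriors) != len(labels) or not labels:
        raise ValueError("need the same non-zero number of frames and labels")
    correct = 0
    for scores, ref in zip(posteriors, labels):
        # Index of the highest score (first one wins on ties).
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == ref)
    return correct / len(labels)


def attribute_accuracy(presence_probs: Sequence[float],
                       labels: Sequence[bool],
                       threshold: float = 0.5) -> float:
    """Frame-level accuracy of one binary attribute detector.

    Assumption: the detector outputs P(attribute present) for each frame,
    and a frame is marked "present" when that value reaches the threshold.
    """
    if len(presence_probs) != len(labels) or not labels:
        raise ValueError("need the same non-zero number of frames and labels")
    correct = sum(int((p >= threshold) == bool(ref))
                  for p, ref in zip(presence_probs, labels))
    return correct / len(labels)


# Toy data (hypothetical): four frames scored over three phone classes.
phone_posteriors = [[0.7, 0.2, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.3, 0.3, 0.4],
                    [0.6, 0.3, 0.1]]
phone_labels = [0, 1, 2, 1]
phone_acc = frame_accuracy(phone_posteriors, phone_labels)

# Toy data (hypothetical): five frames for a single attribute detector.
attr_probs = [0.9, 0.2, 0.6, 0.4, 0.8]
attr_labels = [True, False, False, False, True]
attr_acc = attribute_accuracy(attr_probs, attr_labels)
```

Under these assumptions, a figure such as "above 90% for all 21 attributes" would correspond to running the second function once per attribute detector and checking each result separately.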
