VOICE DATA ENHANCING METHOD AND DEVICE IN VOICE RECOGNITION BASED ON RECURRENT NEURAL NETWORK

Abstract

A voice data enhancement method based on a recurrent neural network, in the field of speech recognition processing, aims to solve the problem that a recurrent neural network used for speech recognition models word dependencies excessively when it encounters irregular grammatical phenomena. The method comprises: extracting, from input voice data, acoustic features that capture the energy values of the speech at various frequencies, and generating acoustic feature vectors (201); obtaining a sentence label sequence for the voice data according to a preset labeling file and the acoustic feature vectors (202); obtaining an alignment file, after a decision clustering operation, by means of the labeling file preset for decision clustering and the sentence label sequence (203); generating a first random number γ in the interval [0, 1] and comparing it with a preset adjustment proportion α (204); and, if γ is greater than α, performing enhancement processing on the voice data at the positions indicated by a boundary file (205). The method allows irregular spoken-language phenomena in the training data to be increased quickly and conveniently.
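The random gating in steps (204) and (205) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say what the enhancement processing does or how the boundary file is laid out, so the names `maybe_enhance` and `insert_silence`, the representation of boundaries as sample indices, and the choice of silence insertion as the enhancement are all assumptions made purely for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Samples = List[float]


def insert_silence(samples: Samples, pos: int, length: int = 2) -> Samples:
    """Hypothetical enhancement: insert `length` zero samples at index `pos`.

    The abstract does not define the enhancement; this is a stand-in.
    """
    return samples[:pos] + [0.0] * length + samples[pos:]


def maybe_enhance(
    samples: Sequence[float],
    boundaries: Sequence[int],
    alpha: float,
    enhance: Callable[[Samples, int], Samples] = insert_silence,
    rng: Optional[random.Random] = None,
) -> Tuple[Samples, bool]:
    """Steps (204)-(205): draw gamma, enhance only if gamma > alpha.

    `boundaries` stands in for the positions a boundary file would indicate
    (assumed here to be sample indices). Returns the possibly modified
    samples and a flag saying whether enhancement was applied.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    rng = rng or random.Random()
    gamma = rng.random()  # uniform in [0.0, 1.0)
    out = list(samples)
    if gamma <= alpha:
        return out, False  # leave this utterance untouched
    # Walk boundaries from last to first so earlier indices stay valid
    # even when the enhancement changes the length of the signal.
    for pos in sorted(boundaries, reverse=True):
        out = enhance(out, pos)
    return out, True


if __name__ == "__main__":
    audio = [1.0, 2.0, 3.0, 4.0]
    enhanced, applied = maybe_enhance(audio, [1, 3], alpha=0.5,
                                      rng=random.Random(0))
    print(applied, enhanced)
```

Under this reading, a larger α means fewer utterances are enhanced, since enhancement happens only when γ exceeds α; with α = 1 no utterance is ever modified.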
