Conference on Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments; Society of Photo-Optical Instrumentation Engineers

Improvement of the learning process of the automated speaker recognition system for critical use with HMM-DNN component


Abstract

The article presents the results of adapting the hybrid HMM-DNN speech synthesis model for use in an automated speaker recognition system for critical use (ASRSCU). In particular, it improves the process of training the HMM-DNN speech synthesis model with estimation of the difference between the posterior probability distributions of all HMM states and the actual posterior probability distribution calculated by the DNN, as well as the use of semantic information in the speaker recognition process. This information is described by the features observed in the sequence of frames into which the input phonogram is divided. The results improve the efficiency of text-dependent speaker recognition when the ASRSCU is used in a noisy acoustic environment. The article formulates measures for the structural integration of the HMM-DNN component into the ASRSCU and describes the practical aspects of this process. In particular, it substantiates the choice of the type and normalization method for the frame-level vectors of basic informative features, determines the number of HMM states and the GMM parameters depending on the parameters of the chosen formation model, and describes the procedure for interpreting the recognition results. The paper also formulates measures to optimize the learning process of an ASRSCU with the HMM-DNN component that will be operated in noisy environments.
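The two frame-level operations named in the abstract, feature normalization and the comparison of HMM state posteriors with the DNN output, can be illustrated with a minimal sketch. The abstract does not say which normalization or which difference measure the authors use, so this example assumes per-utterance mean and variance normalization and the Kullback-Leibler divergence. All function names and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def normalize_frames(frames):
    """Mean and variance normalization of frame-level feature vectors.

    Assumption: statistics are computed per utterance, per feature dimension.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = []
    for d in range(dim):
        var = sum((f[d] - means[d]) ** 2 for f in frames) / n
        # A constant dimension has zero variance; use 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) if var > 0 else 1.0)
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in frames]


def softmax(logits):
    """Convert raw DNN output scores into a posterior distribution over HMM states."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions over HMM states."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


def mean_frame_divergence(target_posteriors, dnn_logits):
    """Average per-frame divergence between reference posteriors and DNN posteriors."""
    total = 0.0
    for target, logits in zip(target_posteriors, dnn_logits):
        total += kl_divergence(target, softmax(logits))
    return total / len(target_posteriors)
```

In a training loop, a quantity like `mean_frame_divergence` would serve as the loss to minimize. With one-hot reference posteriors from a forced alignment it reduces to the usual frame-level cross-entropy.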
