LPC-10e DNN Voiced/Unvoiced Decision Method Using Deep Neural Network for Linear Predictive Coding-10e Vocoder

Abstract

The present invention relates to improving the sound quality of a vocoder. More specifically, it relates to a voiced/unvoiced discrimination method that reduces distortion in synthesized speech by discriminating voiced and unvoiced sound more accurately, using machine learning on a sound source in which each frame is labeled as voiced or unvoiced. According to the invention, sound synthesized from accurately discriminated voiced/unvoiced decisions has significantly less synthesis noise and higher sound quality. The voiced/unvoiced discrimination method comprises: step (a) of generating, using the TIMIT database (DB), voiced/unvoiced label information and a discrimination reference value for that label information, and inputting them; step (b) of extracting a first sound feature value, inputting the first sound feature value and the discrimination reference value to a deep neural network (DNN) model, and training its weights and biases; step (c) of extracting a second sound feature value, inputting it to the DNN model by a DNN model part, and calculating two output values; and step (d) of comparing the two output values with the discrimination reference value and discriminating an arbitrary input sound source as voiced or unvoiced.
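
Steps (a) to (d) can be illustrated with a small sketch. This is not the patented implementation; it only shows the general shape of a two-output DNN voiced/unvoiced classifier under several assumptions that do not come from the abstract: synthetic sine and white-noise frames stand in for labeled TIMIT frames, the features are log energy and zero-crossing rate, the network has a single hidden layer, the "discrimination reference value" is read as a one-hot target per frame, and the final decision simply picks the larger of the two outputs. All function names, the 8 kHz sampling rate, and the 180-sample frame length are likewise illustrative choices.

```python
import numpy as np

FS = 8000         # assumed sampling rate in Hz (not stated in the abstract)
FRAME_LEN = 180   # assumed frame length: 22.5 ms at 8 kHz


def frame_features(frame):
    """Illustrative features (an assumption): log energy and zero-crossing rate."""
    log_energy = np.log(np.mean(frame ** 2) + 1e-10)
    zcr = np.mean(np.diff(np.sign(frame)) != 0)
    return np.array([log_energy, zcr])


def synth_frames(rng, n):
    """Synthetic stand-ins for labeled speech: sines as voiced, white noise as unvoiced."""
    t = np.arange(FRAME_LEN) / FS
    voiced = [rng.uniform(0.3, 0.8)
              * np.sin(2 * np.pi * rng.uniform(80, 250) * t + rng.uniform(0, 2 * np.pi))
              for _ in range(n)]
    unvoiced = [rng.normal(0.0, rng.uniform(0.05, 0.15), FRAME_LEN) for _ in range(n)]
    return voiced, unvoiced


def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def forward(X, params):
    """One tanh hidden layer followed by a two-way softmax output."""
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)
    return h, softmax(h @ W2 + b2)


def train(X, y, hidden=8, lr=0.1, epochs=500, seed=0):
    """Step (b): fit weights and biases by full-batch gradient descent on cross-entropy."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 2))
    b2 = np.zeros(2)
    targets = np.eye(2)[y]  # one-hot targets: column 0 = unvoiced, column 1 = voiced
    for _ in range(epochs):
        h, p = forward(X, (W1, b1, W2, b2))
        dz2 = (p - targets) / len(X)              # gradient at the output logits
        dz1 = (dz2 @ W2.T) * (1.0 - h ** 2)       # backpropagate through tanh
        W2 -= lr * (h.T @ dz2)
        b2 -= lr * dz2.sum(axis=0)
        W1 -= lr * (X.T @ dz1)
        b1 -= lr * dz1.sum(axis=0)
    return W1, b1, W2, b2


def classify(frame, params):
    """Steps (c) and (d): compute the two outputs and pick the larger one."""
    _, p = forward(frame_features(frame)[None, :], params)
    return "voiced" if p[0, 1] > p[0, 0] else "unvoiced"


# Step (a), approximated: build labeled frames (1 = voiced, 0 = unvoiced).
rng = np.random.default_rng(0)
voiced, unvoiced = synth_frames(rng, 200)
X = np.array([frame_features(f) for f in voiced + unvoiced])
y = np.array([1] * len(voiced) + [0] * len(unvoiced))
params = train(X, y)

_, probs = forward(X, params)
accuracy = np.mean(probs.argmax(axis=1) == y)
print(f"training accuracy: {accuracy:.2f}")
print(classify(voiced[0], params), classify(unvoiced[0], params))
```

A real system would replace `synth_frames` with frames and labels derived from TIMIT and would likely use a richer feature set and a deeper network; the abstract does not specify either.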

Bibliographic data

  • Publication number: KR101862982B1
  • Publication date: 2018-05-30
  • Original format: PDF
  • Applicant/Assignee: AGENCY FOR DEFENSE DEVELOPMENT
  • Application number: KR20170021568
  • Inventors: KIM HONG KOOK; LEE JUNG HYUK
  • Filing date: 2017-02-17
  • Classification: G10L25/93; G10L15/02; G10L19; G10L25/30
  • Country: KR
  • Date added to database: 2022-08-21 12:37:48
