SPEECH SYNTHESIS DEVICE, SPEECH SYNTHESIS METHOD, SPEECH SYNTHESIS PROGRAM, SPEECH SYNTHESIS MODEL LEARNING DEVICE, SPEECH SYNTHESIS MODEL LEARNING METHOD, AND SPEECH SYNTHESIS MODEL LEARNING PROGRAM

Abstract

The purpose of the invention is to prevent degradation of speech quality and unnatural phoneme durations. A speech synthesis device according to an embodiment comprises a storage unit, a creation unit, a determination unit, a generation unit, and a waveform generation unit. For each state of a statistical model having a plurality of states, the storage unit stores, as statistical model information, an output distribution of acoustic characteristic parameters, including pitch characteristic parameters, and a duration distribution based on time parameters. The creation unit creates a statistical model series from the statistical model information and from context information that corresponds to an input text. The determination unit determines the number of pitch waveforms for each state from two quantities: the duration given by the duration distribution in each state of each statistical model in the series, and pitch information given by the output distribution of the pitch characteristic parameters. The generation unit generates an output distribution sequence of the acoustic characteristic parameters on the basis of the number of pitch waveforms, and then generates acoustic characteristic parameters on the basis of that sequence. The waveform generation unit generates a speech waveform from the generated acoustic characteristic parameters.
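The determination and generation steps can be illustrated with a small sketch. The abstract does not give the formula that combines duration and pitch, so the code below assumes one plausible reading: the number of pitch waveforms in a state is the state's mean duration in seconds multiplied by its mean fundamental frequency in Hz, rounded to an integer. The names `State`, `pitch_waveform_counts`, and `expand_distribution_sequence` are hypothetical, and every state is assumed to be voiced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class State:
    """One state of a statistical model (hypothetical container)."""
    mean_duration_s: float  # mean of the duration distribution, in seconds (assumed unit)
    mean_f0_hz: float       # mean pitch from the pitch output distribution, in Hz (assumed unit)


def pitch_waveform_counts(states: List[State]) -> List[int]:
    """Assumed rule: count = duration [s] * F0 [Hz], rounded, at least 1."""
    counts = []
    for state in states:
        n = round(state.mean_duration_s * state.mean_f0_hz)
        counts.append(max(1, n))
    return counts


def expand_distribution_sequence(counts: List[int]) -> List[int]:
    """Repeat each state's index once per pitch waveform.

    The result stands in for the output distribution sequence: element k
    names the state whose output distribution applies to pitch waveform k.
    """
    sequence = []
    for state_index, n in enumerate(counts):
        sequence.extend([state_index] * n)
    return sequence


if __name__ == "__main__":
    states = [State(0.05, 200.0), State(0.03, 100.0)]
    counts = pitch_waveform_counts(states)
    print(counts)  # [10, 3]
    print(expand_distribution_sequence(counts))
```

In this sketch a 50 ms state at 200 Hz spans 10 pitch periods, while a 30 ms state at 100 Hz spans 3. Keeping the duration in time units and deriving the waveform count from pitch means the realized duration does not stretch or shrink when the pitch changes.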

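Bibliographic data

  • Publication number: WO2017046887A1
  • Patent type:
  • Publication date: 2017-03-23
  • Original format: PDF
  • Applicant/patentee: KABUSHIKI KAISHA TOSHIBA
  • Application number: WO2015JP76269
  • Inventors: TAMURA MASATSUNE; MORITA MASAHIRO
  • Filing date: 2015-09-16
  • Classification: G10L13/06
  • Country: WO
  • Date added to database: 2022-08-21 13:31:41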
