Journal: IEICE Transactions on Information and Systems

Fundamental Frequency Modeling for Speech Synthesis Based on a Statistical Learning Technique

Abstract

This paper proposes a novel multi-layer approach to fundamental frequency modeling for concatenative speech synthesis, based on a statistical learning technique called additive models. We define an additive F_0 contour model consisting of a long-term, intonational phrase-level component and a short-term, accentual phrase-level component, along with a least-squares error criterion that includes a regularization term. A back-fitting algorithm derived from this error criterion estimates both components simultaneously by iteratively applying cubic spline smoothers. When this method is applied to a 7,000-utterance Japanese speech corpus, it achieves F_0 RMS errors of 28.9 and 29.8 Hz on the training and test data, respectively, with corresponding correlation coefficients of 0.806 and 0.777. The automatically determined intonational and accentual phrase components turn out to behave smoothly, systematically, and intuitively under a variety of prosodic conditions.
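
The back-fitting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no equations, so the sketch assumes each F_0 sample is an intercept plus a long-term function of a normalized position in the intonational phrase plus a short-term function of a normalized position in the accentual phrase. The predictors `x_long` and `x_short`, the function names, the bandwidth, and the synthetic data are all hypothetical. To stay within the standard library, a Gaussian kernel smoother stands in for the cubic spline smoothers named in the abstract, with the bandwidth loosely playing the role of the regularization term.

```python
import math
import random


def kernel_smooth(x, r, bandwidth):
    """Gaussian-kernel weighted average of r, evaluated at every point of x.

    Stand-in for a cubic spline smoother (an assumption of this sketch).
    """
    fitted = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        fitted.append(sum(wj * rj for wj, rj in zip(w, r)) / sum(w))
    return fitted


def center(values):
    """Subtract the mean so each component is identifiable next to the intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def backfit(x_long, x_short, y, bandwidth=0.15, n_iter=10):
    """Fit y ~ alpha + f_long(x_long) + f_short(x_short) by back-fitting."""
    n = len(y)
    alpha = sum(y) / n
    f_long = [0.0] * n
    f_short = [0.0] * n
    for _ in range(n_iter):
        # Smooth what the short-term component leaves unexplained.
        resid = [y[i] - alpha - f_short[i] for i in range(n)]
        f_long = center(kernel_smooth(x_long, resid, bandwidth))
        # Then smooth what the long-term component leaves unexplained.
        resid = [y[i] - alpha - f_long[i] for i in range(n)]
        f_short = center(kernel_smooth(x_short, resid, bandwidth))
    return alpha, f_long, f_short


def rmse(y, fitted):
    """Root-mean-square error between observed and fitted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, fitted)) / len(y))


if __name__ == "__main__":
    rng = random.Random(0)
    x_long = [rng.random() for _ in range(150)]
    x_short = [rng.random() for _ in range(150)]
    # Synthetic F_0 values in Hz: a declining phrase trend plus an accent bump.
    y = [150 - 40 * a + 20 * math.sin(math.pi * b) + rng.gauss(0, 5)
         for a, b in zip(x_long, x_short)]
    alpha, f_long, f_short = backfit(x_long, x_short, y)
    fitted = [alpha + a + b for a, b in zip(f_long, f_short)]
    print("mean-only RMSE:", round(rmse(y, [alpha] * len(y)), 2))
    print("additive-fit RMSE:", round(rmse(y, fitted), 2))
```

Each pass refits one component to the residual left by the other, which is what lets the two layers be estimated jointly rather than one after the other. Swapping `kernel_smooth` for a penalized cubic spline fit would bring the sketch closer to the method the abstract describes.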
