Conference: IEEE International Conference on Signal Processing

A new method for F0 tracking errors fix and generation in HMM-based Mandarin speech synthesis using generation process model



Abstract

HMM-based text-to-speech (TTS) systems can produce high-quality synthetic speech with flexible modeling of spectral and prosodic parameters. However, the quality of the synthetic speech degrades when the feature vectors used in training are noisy. Among all noisy features, pitch tracking errors and the corresponding flawed voiced/unvoiced (VU) decisions are the two key factors behind voice quality problems. These errors also increase the RMSE of phoneme duration. In HMM-based TTS, durations are typically modeled statistically, using state duration probability distributions and duration prediction for unseen contexts. The use of rich context features enables synthesis without high-level linguistic knowledge. In this paper, an F0 generation process model is used to re-estimate F0 values in regions of pitch tracking errors as well as in unvoiced regions. Prior knowledge of VU is imposed on each Mandarin phoneme and used for the VU decision. We also design two sets of syntax features to improve Mandarin phone duration and pause duration prediction, respectively.
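The re-estimation idea in the abstract can be illustrated with a minimal sketch. It assumes that the "F0 generation process model" is the Fujisaki command-response model, which the abstract does not state explicitly. The constants (alpha = 3.0, beta = 20.0, gamma = 0.9) are commonly quoted typical values, not values taken from the paper, and the function names (`phrase_response`, `accent_response`, `log_f0`, `fill_f0`) are hypothetical. The sketch also assumes the phrase and tone commands have already been estimated; how the paper estimates them is not described here.

```python
import math

# Assumed typical constants for the Fujisaki model (not taken from the paper).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent/tone control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent/tone component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent/tone control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, fb, phrase_cmds, tone_cmds):
    """ln F0(t) = ln Fb + phrase components + tone components.

    phrase_cmds: list of (onset_time, magnitude)
    tone_cmds:   list of (onset_time, offset_time, amplitude);
                 amplitude may be negative.
    """
    value = math.log(fb)
    for t0, ap in phrase_cmds:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in tone_cmds:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def fill_f0(times, observed_f0, fb, phrase_cmds, tone_cmds):
    """Keep observed F0 values; replace frames marked None (unvoiced or
    flagged as tracking errors) with values generated by the model."""
    return [
        f0 if f0 is not None else math.exp(log_f0(t, fb, phrase_cmds, tone_cmds))
        for t, f0 in zip(times, observed_f0)
    ]
```

With hypothetical inputs, `fill_f0([0.0, 0.1], [120.0, None], 100.0, [(0.0, 0.4)], [(0.05, 0.3, 0.5)])` leaves the first frame unchanged and fills the second from the model contour. The result is a continuous F0 track that covers the unvoiced and error regions.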
