Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Structural maximum a posteriori speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS

Abstract

This paper discusses a structural maximum a posteriori speaker adaptation method that adjusts an existing speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) to a new speaker's data, so that a new voice can be realized at any given SR. The adaptive SR-HPM is formulated by MAP estimation, with a reference SR-HPM serving as an informative prior. The prior information provided by the reference SR-HPM is organized hierarchically by decision trees. Objective and subjective evaluations showed that the proposed method not only performed slightly better than the maximum likelihood-based model within the SR range observed in the target speaker's data, but also performed much better in the unseen SR range.
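
The abstract gives no formulas, so the following is only a minimal sketch of the general idea of MAP estimation with a hierarchically organized prior, not the paper's actual SR-HPM formulation. It assumes a single Gaussian mean per tree node, a conjugate Gaussian prior, and a hypothetical prior weight `tau`; the names `map_mean`, `Node`, and `adapt_tree` are illustrative. Each node's MAP estimate serves as the prior for its children, and the reference model supplies the prior at the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def map_mean(prior_mean: float, samples: List[float], tau: float) -> float:
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    tau is a hypothetical prior weight (an assumption of this sketch): it
    acts like the number of pseudo-observations carried by the prior mean.
    """
    n = len(samples)
    if n == 0:
        return prior_mean  # no adaptation data: fall back to the prior
    sample_mean = sum(samples) / n
    return (tau * prior_mean + n * sample_mean) / (tau + n)


@dataclass
class Node:
    """One node of an illustrative tree; samples are the target speaker's data."""
    samples: List[float]
    children: List["Node"] = field(default_factory=list)
    mean: Optional[float] = None  # filled in by adapt_tree


def adapt_tree(node: Node, prior_mean: float, tau: float) -> None:
    """Top-down adaptation: a node's MAP mean becomes its children's prior."""
    node.mean = map_mean(prior_mean, node.samples, tau)
    for child in node.children:
        adapt_tree(child, node.mean, tau)


# Toy usage: the reference model's mean (0.0) is the root prior.
unseen_leaf = Node(samples=[])      # no target-speaker data for this condition
seen_leaf = Node(samples=[4.0])
root = Node(samples=[1.0, 3.0], children=[unseen_leaf, seen_leaf])
adapt_tree(root, prior_mean=0.0, tau=2.0)
```

In this toy setup, a leaf with no adaptation data inherits its parent's adapted mean instead of staying at the reference value, which is one plausible reading of why a structured prior could help in conditions the target speaker's data does not cover.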
