JMLR: Workshop and Conference Proceedings

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron



Abstract

We present an extension to the Tacotron speech synthesis architecture that learns a latent embedding space of prosody, derived from a reference acoustic representation containing the desired prosody. We show that conditioning Tacotron on this learned embedding space results in synthesized audio that matches the prosody of the reference signal with fine time detail even when the reference and synthesis speakers are different. Additionally, we show that a reference prosody embedding can be used to synthesize text that is different from that of the reference utterance. We define several quantitative and subjective metrics for evaluating prosody transfer, and report results with accompanying audio samples from single-speaker and 44-speaker Tacotron models on a prosody transfer task.
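As a rough illustration of the conditioning idea described above, the sketch below collapses a variable-length reference representation into a fixed-length prosody vector and appends it to every text encoder state. It is not the paper's implementation: the abstract does not say how the embedding is computed or how it is combined with the text representation, so the mean pooling, the tanh projection, the concatenation, and all names and dimensions (`reference_embedding`, `condition_on_prosody`, `n_mels`, `embed_dim`, `text_dim`) are assumptions made for illustration only.

```python
import math
import random


def reference_embedding(ref_frames, weights):
    """Collapse a variable-length reference (frames x n_mels) into a fixed-length vector.

    Hypothetical stand-in for a learned encoder: mean-pool over time,
    then apply a linear projection followed by tanh.
    """
    n_frames = len(ref_frames)
    n_mels = len(ref_frames[0])
    pooled = [sum(frame[m] for frame in ref_frames) / n_frames for m in range(n_mels)]
    return [math.tanh(sum(w * x for w, x in zip(row, pooled))) for row in weights]


def condition_on_prosody(text_states, prosody):
    """Append the same prosody vector to every text encoder state (assumed scheme)."""
    return [list(state) + list(prosody) for state in text_states]


rng = random.Random(0)
n_mels, embed_dim, text_dim = 8, 4, 6  # illustrative sizes, not from the paper

# Random placeholders standing in for learned weights and real features.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_mels)] for _ in range(embed_dim)]
ref_frames = [[rng.random() for _ in range(n_mels)] for _ in range(50)]
text_states = [[rng.random() for _ in range(text_dim)] for _ in range(12)]

prosody = reference_embedding(ref_frames, weights)
conditioned = condition_on_prosody(text_states, prosody)
print(len(conditioned), len(conditioned[0]))  # 12 10
```

Because the prosody vector has a fixed length regardless of how many reference frames there are, the same vector can be attached to a text sequence of any length, which is what makes it possible in principle to pair a reference with text that differs from the reference utterance.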
