Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
Prosody-aware subword embedding considering Japanese intonation systems and its application to DNN-based multi-dialect speech synthesis

Abstract

This paper presents a prosody-aware subword embedding that takes Japanese intonation systems into account, and applies it to multi-dialect speech synthesis based on deep neural networks (DNNs). As speech synthesis for resource-rich languages has improved, research is shifting toward more challenging targets, such as Japanese dialects, whose prosodic contexts remain undefined. Conventional prosody-aware word embedding extracts these contexts without supervision, in a data-driven manner, from words and F0 (fundamental frequency) sequences. However, it is difficult to generate accurate contexts for unknown words. To solve this problem, we propose a prosody-aware subword embedding that considers Japanese intonation systems. The unsupervised subword model, which is trained with both linguistic and acoustic characteristics in mind, can tokenize an unknown word into known subwords suitable for prosody-aware embedding. We also propose a modulation filtering method that considers the moras within each subword to improve embedding accuracy. We apply the methods to speech synthesis for Japanese and for multiple Japanese dialects. In the multi-dialect case, we propose subword models shared among dialects and embedding models conditioned on dialect information. The experimental evaluation shows that the proposed multi-dialect methods can improve speech quality for some Japanese dialects.
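
To illustrate how an unknown word can be tokenized into known subwords, here is a minimal sketch using a generic unigram subword model with Viterbi search. It is a standard stand-in, not the model proposed in the paper, which is additionally trained on acoustic characteristics. The names `segment` and `TOY_LOGP` and all log-probability values are hypothetical.

```python
import math


def segment(word, logp):
    """Return the highest-scoring split of `word` into known subwords.

    `logp` maps each known subword to a log-probability (a hand-made,
    hypothetical unigram model). Returns None if no split exists.
    """
    n = len(word)
    best = [-math.inf] * (n + 1)  # best[i]: best score for word[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last subword
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in logp and best[start] + logp[piece] > best[end]:
                best[end] = best[start] + logp[piece]
                back[end] = start
    if best[n] == -math.inf:
        return None
    # Walk the back-pointers from the end of the word to recover the split.
    pieces = []
    i = n
    while i > 0:
        pieces.append(word[back[i]:i])
        i = back[i]
    return pieces[::-1]


# Toy vocabulary with assumed scores (not taken from the paper).
TOY_LOGP = {
    "おお": -2.0, "さか": -2.5, "べん": -3.0,
    "お": -4.0, "さ": -4.0, "か": -4.0, "べ": -4.5, "ん": -4.0,
}

print(segment("おおさかべん", TOY_LOGP))  # ['おお', 'さか', 'べん']
```

The word "おおさかべん" is not in the toy vocabulary, yet it is covered by three known subwords, each of which could then be given its own embedding. A prosody-aware variant would need scores that also reflect F0 behaviour, which this sketch does not attempt.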
