Structural maximum a posteriori speaker adaptation of speaking rate-dependent hierarchical prosodic model for Mandarin TTS

Abstract

This paper discusses a structural maximum a posteriori (MAP) speaker adaptation method that adjusts an existing speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) to a new speaker's data, so that a new voice can be realized at any given SR. The adaptive SR-HPM is formulated by MAP estimation, with a reference SR-HPM serving as an informative prior. The prior information provided by the reference SR-HPM is organized hierarchically by decision trees. Objective and subjective evaluations showed that the proposed method not only performed slightly better than the maximum likelihood-based model within the SR range observed in the target speaker's data, but also performed much better in the unseen SR range.
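
The idea of a tree-organized MAP prior can be illustrated with a minimal sketch. It is not the paper's formulation, which this record does not give. It assumes a single scalar Gaussian mean per tree node, a conjugate prior with a hypothetical weight `tau`, and a hypothetical `Node` class. Each node's MAP estimate serves as the prior for its children, so a node without target-speaker data falls back to its parent's estimate.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node; `samples` holds the target speaker's data."""
    samples: List[float] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)
    mean: float = 0.0  # filled in by adapt()

    def pooled(self) -> List[float]:
        # Pool this node's samples with those of all its descendants.
        data = list(self.samples)
        for child in self.children:
            data.extend(child.pooled())
        return data


def map_mean(prior_mean: float, data: List[float], tau: float) -> float:
    """MAP estimate of a Gaussian mean under a conjugate prior of weight tau."""
    return (tau * prior_mean + sum(data)) / (tau + len(data))


def adapt(node: Node, prior_mean: float, tau: float = 10.0) -> None:
    """Top-down adaptation: each node's estimate is the prior for its children."""
    node.mean = map_mean(prior_mean, node.pooled(), tau)
    for child in node.children:
        adapt(child, node.mean, tau)


# Toy example: one leaf has adaptation data, the other has none.
leaf_seen = Node(samples=[1.0, 1.0])
leaf_unseen = Node()
root = Node(children=[leaf_seen, leaf_unseen])
adapt(root, prior_mean=0.0, tau=2.0)  # 0.0 stands in for the reference model
print(root.mean, leaf_seen.mean, leaf_unseen.mean)
```

In this toy setup the leaf without data inherits the root's adapted estimate instead of the raw reference value, which is one way to read the abstract's claim about the unseen SR range.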

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2016 | pp. 5625-5629 | 5 pages
Authors
I-Bin Liao; Chen-Yu Chiang; Sin-Horng Chen
Original format: PDF
Keywords
Mandarin TTS; hierarchical prosodic model; prosodic-acoustic features; speaker adaptation


