LSTM-Based Speech Segmentation for TTS Synthesis

Abstract

This paper describes experiments on speech segmentation for text-to-speech (TTS) synthesis. We used one bidirectional LSTM neural network for framewise phone classification and a second bidirectional LSTM network to predict the duration of individual phones. The proposed segmentation procedure combines both outputs and finds the optimal speech-phoneme alignment using dynamic programming. We introduced two modifications to increase the robustness of phoneme classification. Experiments were performed on two professional and two amateur voices. The results were compared with a reference HMM-based segmentation that included additional manual corrections. Preference listening tests showed that the reference and experimental segmentations are equivalent when used in a unit selection TTS system.
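
The abstract does not give the cost function used to combine the two network outputs, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the classifier output is available as framewise log-posteriors, that the duration network yields a mean and standard deviation (in frames) per phone, and that the two are combined by adding a Gaussian-shaped duration score with a hypothetical weight `dur_weight`. The function name `align` and all parameter names are illustrative.

```python
import numpy as np

def align(log_post, phones, dur_mean, dur_std, dur_weight=1.0):
    """Segment T frames into len(phones) consecutive phones.

    log_post : (T, P) array of framewise log-posteriors (assumed classifier output)
    phones   : phone class indices in utterance order
    dur_mean, dur_std : assumed per-phone duration predictions, in frames
    Returns N + 1 boundary frame indices, from 0 to T.
    """
    T, N = log_post.shape[0], len(phones)
    if T < N:
        raise ValueError("need at least one frame per phone")

    # cum[i, t] = sum of log-posteriors of phone i over frames [0, t)
    cum = np.zeros((N, T + 1))
    for i, p in enumerate(phones):
        cum[i, 1:] = np.cumsum(log_post[:, p])

    # best[i, t] = best score for the first i phones covering frames [0, t)
    best = np.full((N + 1, T + 1), -np.inf)
    back = np.zeros((N + 1, T + 1), dtype=int)
    best[0, 0] = 0.0

    for i in range(1, N + 1):
        # leave at least one frame for each remaining phone
        for t in range(i, T - (N - i) + 1):
            starts = np.arange(i - 1, t)          # candidate start frames
            d = t - starts                        # candidate durations
            acoustic = cum[i - 1, t] - cum[i - 1, starts]
            # Gaussian log-likelihood up to a constant (an assumption)
            duration = -0.5 * ((d - dur_mean[i - 1]) / dur_std[i - 1]) ** 2
            scores = best[i - 1, starts] + acoustic + dur_weight * duration
            k = int(np.argmax(scores))
            best[i, t] = scores[k]
            back[i, t] = starts[k]

    # Backtrack from the final frame to recover the boundaries
    bounds = [T]
    for i in range(N, 0, -1):
        bounds.append(int(back[i, bounds[-1]]))
    return bounds[::-1]
```

Calling `align` with a `(T, P)` matrix and a phone sequence of length N returns N + 1 frame indices; consecutive pairs delimit each phone. The search is quadratic in the number of frames, which is acceptable for a sketch but would normally be bounded by a maximum phone duration.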

Bibliographic record

Source: International Conference on Text, Speech, and Dialogue | 2019 | xix 414 p. | 12 pages
Authors: Zdenek Hanzlicek; Jakub Vit; Daniel Tihelka
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords: Speech segmentation; Speech synthesis; LSTM neural networks

Similar literature

1. LSTM-Based Robust Voicing Decision Applied to DNN-Based Speech Synthesis [J]. R. Pradeep, M. Kiran Reddy, K. Sreenivasa Rao. Automatic Control and Computer Sciences. 2019, No. 4
2. On combining acoustic and modulation spectrograms in an attention LSTM-based system for speech intelligibility level classification [J]. Gallardo-Antolin Ascension, Montero Juan M. Neurocomputing. 2021, No. Octa7
3. Speech enhancement by LSTM-based noise suppression followed by CNN-based speech restoration [J]. Strake Maximilian, Defraene Bruno, Fluyt Kristoff. EURASIP Journal on Advances in Signal Processing. 2020, No. a
4. LSTM-Based Speech Segmentation for TTS Synthesis [C]. Zdenek Hanzlicek, Jakub Vit, Daniel Tihelka. International Conference on Text, Speech, and Dialogue. 2019
5. Speech analysis and synthesis based on ARMA lattice model [D]. Wang, Min. 2003
6. Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference [O]. Byeongwook Lee, Kwang-Hyun Cho
7. A fusion approach for automatic speech segmentation of large corpora with application to speech synthesis [O]. Jarifi, Safaa, Pastor, Dominique, Rosec, Olivier. 2007
