International Conference on Language Resources and Evaluation

Burmese Speech Corpus, Finite-State Text Normalization and Pronunciation Grammars with an Application to Text-to-Speech

Abstract

This paper introduces an open-source, crowd-sourced, multi-speaker speech corpus along with a comprehensive set of finite-state transducer (FST) grammars for performing text normalization for the Burmese (Myanmar) language. We also introduce open-source finite-state grammars for performing grapheme-to-phoneme (G2P) conversion for Burmese. These three components are necessary (but not sufficient) for building a high-quality text-to-speech (TTS) system for Burmese, a tonal Southeast Asian language from the Sino-Tibetan family which presents several linguistic challenges. We describe the corpus acquisition process and provide the details of our finite-state approach to Burmese text normalization and G2P. Our experiments involve building a multi-speaker TTS system based on long short-term memory (LSTM) recurrent neural network (RNN) models, which were previously shown to perform well for other languages in a low-resource setting. Our results indicate that the data and grammars that we are announcing are sufficient to build reasonably high-quality models comparable to other systems. We hope these resources will facilitate speech and language research on the Burmese language, which is considered by many to be low-resource due to the limited availability of free linguistic data.
