Conference: International Conference on Text, Speech and Dialogue

Unified Simplified Grapheme Acoustic Modeling for Medieval Latin LVCSR



Abstract

A large vocabulary continuous speech recognition (LVCSR) system designed for dictation of medieval Latin language documents is introduced. Such a language technology tool can be of great help in preserving Latin language charters from this era, as optical character recognition systems are often challenged by these historical materials. As the corresponding historical research focuses on the Visegrad region, our primary aim is to make medieval Latin dictation available for texts and speakers of this region, concentrating on Czech, Hungarian and Polish. The baseline acoustic models we start with are monolingual grapheme-based ones. On the one hand, applying a knowledge-based medieval Latin grapheme-to-phoneme (G2P) mapping from the source language to the target language resulted in a significant improvement, reducing the Word Error Rate (WER) by 13.3%. On the other hand, applying a Unified Simplified Grapheme (USG) inventory set for the three-language acoustic data set, complemented with Romanian speech data, resulted in a further 0.7% WER reduction, without using any target or source language G2P rules.
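The abstract reports its results as Word Error Rate but does not spell the metric out. As a minimal sketch, assuming the standard definition (substitutions, deletions and insertions divided by the number of reference words), WER can be computed with a word-level edit distance. The function name `word_error_rate` and the Latin example sentence are illustrative assumptions and are not taken from the paper or its data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: (substitutions + deletions + insertions) / reference length.

    Illustrative helper only; the paper's own scoring tool is not specified.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word out of four reference words.
example_wer = word_error_rate("in nomine domini amen", "in nomine domine amen")
```

Here `example_wer` is 0.25, since one of the four reference words is misrecognized. The abstract does not state whether its 13.3% and 0.7% figures are absolute or relative reductions, so this sketch makes no assumption about that.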
