Unified Simplified Grapheme Acoustic Modeling for Medieval Latin LVCSR

Abstract

A large vocabulary continuous speech recognition (LVCSR) system designed for dictation of medieval Latin documents is introduced. Such a language technology tool can be of great help in preserving Latin charters from this era, because optical character recognition systems often struggle with these historical materials. As the corresponding historical research focuses on the Visegrad region, our primary aim is to make medieval Latin dictation available for texts and speakers of this region, concentrating on Czech, Hungarian and Polish. The baseline acoustic models are monolingual and grapheme-based. On the one hand, applying a knowledge-based medieval Latin grapheme-to-phoneme (G2P) mapping from the source language to the target language yielded a significant improvement, reducing the Word Error Rate (WER) by 13.3%. On the other hand, applying a Unified Simplified Grapheme (USG) inventory to the three-language acoustic data set, complemented with Romanian speech data, yielded a further 0.7% WER reduction without using any target or source language G2P rules.
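The results above are stated as WER reductions. For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that conventional definition, not the authors' scoring tool; the function name and the Latin sample strings are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper only (hypothetical name); it is not taken from
    the paper and does no text normalization beyond whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sample: one substituted word out of four reference words.
    print(word_error_rate("in nomine domini amen", "in nomine domino amen"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.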

Bibliographic record

Source: International Conference on Text, Speech and Dialogue | 2017 | pp. 420-428 | 9 pages
Authors: Lili Szabó; Péter Mihajlik; András Balog; Tibor Fegyó
Original format: PDF
Keywords: G2P; Medieval Latin; Under-resourced speech recognition; Unified simplified grapheme modeling

Similar literature

1. LVCSR Based on Context-Dependent Syllable Acoustic Models [J]. Jian ZHANG, Longbiao WANG, Seiichi NAKAGAWA. 電子情報通信学会技術研究報告, 2008, No. 551.
2. LVCSR Based on Context-Dependent Syllable Acoustic Models [J]. Jian ZHANG, Longbiao WANG, Seiichi NAKAGAWA. 電子情報通信学会技術研究報告. 音声. Speech, 2007, No. 551.
3. Training Deep Bidirectional LSTM Acoustic Model for LVCSR by a Context-Sensitive-Chunk BPTT Approach [J]. Kai Chen, Qiang Huo. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, No. 7.
4. Unified Simplified Grapheme Acoustic Modeling for Medieval Latin LVCSR [C]. Lili Szabo, Peter Mihajlik, Andras Balog. International Conference on Text, Speech and Dialogue, 2017.
5. Search and decoding strategies for complex lexical modeling in LVCSR [D]. Deoras, Anoop. 2011.
6. Grapheme learning and grapheme-color synesthesia: toward a comprehensive model of grapheme-color association [O]. Michiko Asano, Kazuhiko Yokosawa. 2013.
7. Comparison of Grapheme-to-Phoneme Methods on Large Pronunciation Dictionaries and LVCSR Tasks [O]. Hahn Stefan, Vozila Paul, Bisani Maximilian. 2012.
