Journal: ACM Transactions on Asian and Low-Resource Language Information Processing

Punjabi to ISO 15919 and Roman Transliteration with Phonetic Rectification


Abstract

Transliteration removes script barriers. Punjabi is written in four different scripts: Gurmukhi, Shahmukhi, Devanagari, and Latin. Of these, the Latin script is understood by nearly all sections of the Punjabi community, so the objective of our work is to transliterate Punjabi written in Gurmukhi into Latin script. There has been considerable progress in Punjabi-to-Latin transliteration, but the accuracy of present-day systems is below 50% (Google Translate achieves approximately 45%). No rich parallel corpus is available for Punjabi, so the corpus-based machine learning techniques currently in vogue cannot be used. Existing transliteration systems follow a grapheme-based approach, which cannot handle many phenomena such as tones, the inherent schwa, glottal stops, nasalization, and gemination. In this article, grapheme-based transliteration is augmented with phonetic rectification: the Punjabi text is rectified phonetically before character-to-character mapping is applied. Handling the inherent short vowel schwa was the major challenge in phonetic rectification. Instead of following a fixed syllabic pattern, we devised a generic finite state transducer to insert schwa. The accuracy of our transliteration system is approximately 96.82%.
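To make the idea of schwa insertion ahead of character-to-character mapping concrete, here is a minimal Python sketch. It is not the authors' transducer: the mapping tables are a tiny hypothetical subset in an ISO 15919-like romanization, the function name `transliterate_word` is invented for illustration, and the only rules assumed are "insert schwa between a bare consonant and the next consonant" and "drop the schwa at the end of a word". Tones, nasalization, gemination, and medial schwa deletion are deliberately left out.

```python
# Hypothetical, deliberately tiny mapping tables (NOT the paper's tables).
# Keys are written as escapes so combining marks stay unambiguous.
CONSONANTS = {
    "\u0A15": "k",  # GURMUKHI LETTER KA
    "\u0A2E": "m",  # GURMUKHI LETTER MA
    "\u0A32": "l",  # GURMUKHI LETTER LA
    "\u0A28": "n",  # GURMUKHI LETTER NA
    "\u0A30": "r",  # GURMUKHI LETTER RA
    "\u0A38": "s",  # GURMUKHI LETTER SA
}
VOWEL_SIGNS = {
    "\u0A3E": "ā",  # VOWEL SIGN AA
    "\u0A3F": "i",  # VOWEL SIGN I
    "\u0A40": "ī",  # VOWEL SIGN II
    "\u0A41": "u",  # VOWEL SIGN U
    "\u0A42": "ū",  # VOWEL SIGN UU
}
VIRAMA = "\u0A4D"   # GURMUKHI SIGN VIRAMA, suppresses the inherent vowel
SCHWA = "a"         # romanization assumed for the inherent short vowel


def transliterate_word(word: str) -> str:
    """Two-state transducer: the state records whether the previous symbol
    was a bare consonant that may still need its inherent schwa."""
    out = []
    pending_schwa = False
    for ch in word:
        if ch in CONSONANTS:
            if pending_schwa:
                out.append(SCHWA)      # consonant + consonant: insert schwa
            out.append(CONSONANTS[ch])
            pending_schwa = True
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])  # explicit vowel replaces the schwa
            pending_schwa = False
        elif ch == VIRAMA:
            pending_schwa = False        # virama cancels the schwa
        else:
            if pending_schwa:
                out.append(SCHWA)
            out.append(ch)               # unknown symbols pass through
            pending_schwa = False
    # Assumed rule: a schwa still pending at the end of the word is dropped.
    return "".join(out)


if __name__ == "__main__":
    print(transliterate_word("\u0A15\u0A2E\u0A32"))  # KA MA LA
```

A purely character-to-character mapping would turn the three letters KA, MA, LA into "kml"; the pending-schwa state is what recovers the vowels. A rule this simple will over-insert schwa in words where a medial schwa is not pronounced, which is the kind of case a more general transducer has to address.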
