Conference: International Conference on Speech and Computer

Algorithms for Automatic Accentuation and Transcription of Russian Texts in Speech Recognition Systems



Abstract

This paper presents an overview of a rule-based system for automatic accentuation and phonemic transcription of Russian texts for speech-related tasks such as Automatic Speech Recognition (ASR). The two parts of the system, accentuation and transcription, use different approaches to obtain correct phonemic representations of input phrases. Accentuation is based on A.A. Zaliznyak's "Grammatical Dictionary of the Russian Language" and the Wiktionary corpus. To distinguish homographs, the accentuation system also uses morphological information about the sentence, obtained with Recurrent Neural Networks (RNN). The transcription algorithms apply the rules presented in the monograph "Computer Synthesis and Voice Cloning" by B.M. Lobanov and L.I. Tsirulnik. The rules described in the paper are implemented in an open-source module that can be used in any study related to ASR or Speech-to-Text (STT) tasks. Automatically annotated transcripts of the Russian VoxForge database were used as training data for an acoustic model in CMU Sphinx. The resulting acoustic model was evaluated by cross-validation, with a mean word accuracy of 71.2%. The toolkit is written in Python and is available on GitHub.
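
The dictionary-based accentuation step with homograph disambiguation can be illustrated with a minimal Python sketch. This is not the authors' implementation. The lexicon contents, the tag strings ("gen,sg", "nom,pl"), the function name `accentuate`, and the convention of marking stress with "+" after the stressed vowel are all assumptions made for illustration. The morphological tag is passed in by hand here, whereas the abstract says the real system derives it with an RNN.

```python
from typing import Dict, Optional

# Hypothetical toy lexicon: word -> {morphological tag -> 0-based index of
# the stressed vowel}. A real system would load a full stress dictionary.
STRESS_LEXICON: Dict[str, Dict[str, int]] = {
    "молоко": {"*": 5},                  # single stress variant
    "руки": {"gen,sg": 3, "nom,pl": 1},  # homograph: stress depends on the form
}


def accentuate(word: str, morph_tag: Optional[str] = None) -> str:
    """Return the word with '+' after the stressed vowel, if it can be resolved."""
    variants = STRESS_LEXICON.get(word.lower())
    if variants is None:
        return word  # unknown word: leave it unmarked
    if len(variants) == 1:
        index = next(iter(variants.values()))
    elif morph_tag in variants:
        index = variants[morph_tag]  # homograph resolved by its morphological tag
    else:
        return word  # ambiguous homograph and no usable tag
    return word[: index + 1] + "+" + word[index + 1:]


if __name__ == "__main__":
    print(accentuate("молоко"))
    print(accentuate("руки", "gen,sg"))
    print(accentuate("руки", "nom,pl"))
```

In this sketch an unresolved homograph is returned unmarked rather than guessed, which keeps a wrong stress from propagating into the transcription stage.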
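
For readers unfamiliar with the reported metric, the following sketch shows one common way to compute word accuracy, as one minus the word-level edit distance divided by the reference length. The abstract does not say which exact definition was used in the evaluation, so this formula and the function names are assumptions.

```python
from typing import List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - (S + D + I) / N, with N the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    errors = word_edit_distance(ref_words, hypothesis.split())
    return 1.0 - errors / len(ref_words)


def mean_word_accuracy(fold_scores: List[float]) -> float:
    """Average of per-fold accuracies, as in a cross-validation summary."""
    return sum(fold_scores) / len(fold_scores)
```

Under this definition insertions count as errors, so the value can drop below zero for very long hypotheses.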
