Journal: 《计算机工程与设计》 (Computer Engineering and Design)

Research on OOV in the Uyghur Speech Recognition Corpus (维吾尔语语音识别语料库中的OOV研究)

Abstract

The rich morphology of Uyghur generates a very large number of word forms, which causes a serious out-of-vocabulary (OOV) problem. To quantify the effect of OOV words on Uyghur speech recognition, this work proposes an algorithm that controls the OOV rate of the test sets in a Uyghur speech corpus and an algorithm that selects optimal text; both are implemented in Python. Experiments are run on test sets with different OOV rates. The algorithms are also applied to the text transcription for a telephone speech database, from which a Uyghur telephone speech database is built. The experimental results show that controlling the OOV rate of the test set effectively improves the Uyghur speech recognition rate.
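The abstract does not describe the two algorithms in detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes whitespace tokenization, a token-level OOV rate (OOV tokens divided by all tokens), and a simple greedy selection; the names `oov_rate` and `select_test_set` and the parameters `max_oov_rate` and `size` are hypothetical.

```python
def oov_rate(sentences, vocabulary):
    """Return the fraction of tokens in `sentences` that are not in `vocabulary`.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    tokens = [word for sentence in sentences for word in sentence.split()]
    if not tokens:
        return 0.0
    oov_count = sum(1 for word in tokens if word not in vocabulary)
    return oov_count / len(tokens)


def select_test_set(candidates, vocabulary, max_oov_rate, size):
    """Greedily pick up to `size` sentences whose combined OOV rate
    stays at or below `max_oov_rate`.

    Hypothetical strategy: try sentences with the lowest individual
    OOV rate first, and skip any sentence that would push the
    running OOV rate over the limit.
    """
    ranked = sorted(candidates, key=lambda s: oov_rate([s], vocabulary))
    selected = []
    for sentence in ranked:
        if len(selected) == size:
            break
        if oov_rate(selected + [sentence], vocabulary) <= max_oov_rate:
            selected.append(sentence)
    return selected


if __name__ == "__main__":
    # Toy data standing in for a recognizer lexicon and candidate transcripts.
    vocab = {"a", "b", "c"}
    texts = ["a b", "a x", "x y", "b c"]
    chosen = select_test_set(texts, vocab, max_oov_rate=0.25, size=3)
    print(chosen, oov_rate(chosen, vocab))
```

Sweeping `max_oov_rate` over several values would give test sets with different OOV rates, which is the kind of comparison the abstract reports. A real Uyghur system would likely need morphological segmentation rather than plain whitespace splitting.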
