Conference: International Conference on Contemporary Computing and Informatics

Web based and voice enabled IVRS for large scale Malayalam speech data collection


Abstract

Speech corpora are a vital resource in the development and evaluation of automatic speech recognition systems, as well as for acoustic-phonetic studies. Collecting a large corpus is not an easy task, and the lack of such resources is one of the reasons for the absence of good-quality speech recognition systems for Indian languages. Here we automate this process by developing a web-based tool for collecting broadband speech data and an IVR system with speech recognition for collecting narrowband speech data. The main features include full support for the typical recording, annotation, and project administration workflow, easy editing of the speech content, and a fully localizable user interface. This paper describes in detail the development of the web-based speech collection tool and the IVR system, which together enable end-to-end building of a speech corpus with minimal manual effort.
