Conference: European Conference on Ambient Intelligence

Spoken Language Identification Using ConvNets

Abstract

Language Identification (LI) is an important first step in several speech processing systems. With a growing number of voice-based assistants, spoken LI has emerged as a widely researched field. The problem of identifying languages can be approached either implicitly, where only the speech for a language is available, or explicitly, where the speech is accompanied by its corresponding transcript. This paper focuses on the implicit approach because transcribed data is not available. It benchmarks existing models and proposes a new attention-based model for language identification that uses log-Mel spectrogram images as input. We also assess the effectiveness of raw waveforms as input features to neural network models for LI tasks. Models were trained and evaluated on data from the VoxForge dataset: we classified six languages (English, French, German, Spanish, Russian and Italian) with an accuracy of 95.4% and four languages (English, French, German, Spanish) with an accuracy of 96.3%. This approach can be scaled further to incorporate more languages.
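
To make the input representation concrete, the following is a minimal NumPy sketch of how a log-Mel spectrogram can be computed from a raw waveform. It is an illustration only, not the paper's pipeline: the function names, the 16 kHz sample rate, the 512-point FFT, the 160-sample hop, the 40 Mel bands and the HTK-style Mel formula are all assumptions chosen for the example, since the abstract does not state these parameters.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style Mel scale (assumed; the abstract does not name a variant).
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are equally spaced on the Mel scale.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(wave, sr=16000, n_fft=512, hop=160, n_mels=40):
    # All default values are illustrative assumptions, not the paper's settings.
    wave = np.asarray(wave, dtype=float)
    if len(wave) < n_fft:
        wave = np.pad(wave, (0, n_fft - len(wave)))
    n_frames = 1 + (len(wave) - n_fft) // hop
    # Slice the waveform into overlapping frames and apply a Hann window.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(n_fft)
    # Power spectrum per frame, then projection onto the Mel filterbank.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Convert to decibels; the small offset avoids log(0).
    return 10.0 * np.log10(mel + 1e-10).T  # shape: (n_mels, n_frames)

# Example input: one second of a 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
spec = log_mel_spectrogram(np.sin(2.0 * np.pi * 440.0 * t))
```

The resulting two-dimensional array (Mel bands by time frames) can be treated as a single-channel image, which is the kind of input a convolutional network consumes. A raw-waveform model would instead take the sample array itself and skip this step.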