An unsupervised deep domain adaptation approach for robust speech recognition

Sun Sining; Zhang Binbin; Xie Lei; Zhang Yanning


Abstract

This paper addresses the robust speech recognition problem as a domain adaptation task. Specifically, we introduce an unsupervised deep domain adaptation (DDA) approach to acoustic modeling in order to eliminate the training-testing mismatch that is common in real-world use of speech recognition. Under a multi-task learning framework, the approach jointly learns two discriminative classifiers using one deep neural network (DNN). As the main task, a label predictor predicts phoneme labels and is used both during training and at test time. As the second task, a domain classifier discriminates between the source and the target domains during training only. The network is optimized by minimizing the loss of the label predictor while simultaneously maximizing the loss of the domain classifier. The proposed approach is easy to implement by modifying a common feed-forward network. Moreover, this unsupervised approach needs only labeled training data from the source domain and some unlabeled raw data from the new domain. Speech recognition experiments on noise/channel distortion and domain shift confirm the effectiveness of the proposed approach. For instance, on the Aurora-4 corpus, the DDA approach achieves a 37.8% relative word error rate (WER) reduction compared with an acoustic model trained only on clean data. © 2017 Elsevier B.V. All rights reserved.
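The objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar toy model, the names `dda_step`, `lam`, and `lr`, and the choice to subtract the scaled domain gradient in the shared-feature update (a gradient-reversal style update) are all assumptions made for illustration. The abstract only states that the label loss is minimized while the domain loss is maximized.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, t):
    """Binary cross-entropy for one prediction p and target t."""
    eps = 1e-12
    return -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))


def dda_step(params, source, target, lam=0.5, lr=0.1):
    """One adversarial multi-task update on scalar toy data (hypothetical helper).

    params: dict with "w" (shared feature weight), "a" (label head),
            "c" (domain head)
    source: list of (x, y) labeled source examples, y in {0, 1}
    target: list of x values, unlabeled target examples
    lam:    assumed trade-off weight for the domain term
    """
    w, a, c = params["w"], params["a"], params["c"]
    gw_label = ga = gw_domain = gc = 0.0
    label_loss = domain_loss = 0.0

    # Label predictor: only labeled source data contributes.
    for x, y in source:
        f = w * x                       # shared feature
        p = sigmoid(a * f)
        label_loss += bce(p, y) / len(source)
        err = (p - y) / len(source)     # d(loss)/d(logit)
        ga += err * f
        gw_label += err * a * x

    # Domain classifier: source = 0, target = 1; no phoneme labels needed.
    domain_data = [(x, 0) for x, _ in source] + [(x, 1) for x in target]
    for x, d in domain_data:
        f = w * x
        q = sigmoid(c * f)
        domain_loss += bce(q, d) / len(domain_data)
        err = (q - d) / len(domain_data)
        gc += err * f
        gw_domain += err * c * x

    new_params = {
        # Shared features: descend on label loss, ascend on domain loss.
        "w": w - lr * (gw_label - lam * gw_domain),
        # Each head simply minimizes its own loss.
        "a": a - lr * ga,
        "c": c - lr * gc,
    }
    return new_params, label_loss, domain_loss
```

In this sketch only the shared weight `w` sees the sign-flipped domain gradient, so the features are pushed toward being uninformative about the domain while staying useful for the label predictor. Setting `lam=0` reduces the update to ordinary supervised training on the source data.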


Bibliographic record

Source
Neurocomputing | 2017, No. 27 | pp. 79-87 | 9 pages
Authors
Sun Sining; Zhang Binbin; Xie Lei; Zhang Yanning
Author affiliations

Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian, Peoples R China (listed for all four authors)

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Domain adaptation; Robust speech recognition; Deep neural network; Deep learning

Similar literature

1. Robust Speech Recognition Based on Structured Modeling, Irrelevant Variability Normalization and Unsupervised Online Adaptation [J]. Qiang Huo. 電子情報通信学会技術研究報告. 2008, No. 551
2. Unsupervised Speaker Adaptation for Robust Speech Recognition in Real Environments [J]. Shingo Yamade, Akira Baba, Shinichi Yoshikawa, Electronics and Communications in Japan. Part 2, Electronics. 2005, No. 8
3. Noise robust speech recognition applied to unsupervised speaker adaptation [J]. Shingo Yamade, Akinobu Lee, Hiroshi Saruwatari, 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2002, No. 527
4. Coupled Unsupervised Deep Convolutional Domain Adaptation for Speech Emotion Recognition [C]. Ocquaye Elias Nii Noi, Mao Qirong, Guopeng Xu, International Conference on Multimedia Big Data. 2018
5. Joint-space adaptation technique for robust continuous speech recognition [D]. Wang, Chien-Jen. 1997
6. Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition [O]. Sankaranarayanan Ananthakrishnan, Shrikanth Narayanan
7. Unsupervised Adaptation with Domain Separation Networks for Robust Speech Recognition [O]. Meng, Zhong, Chen, Zhuo, Mazalov, Vadim, 2017
