Annual Conference of the International Speech Communication Association

A New Pre-training Method for Training Deep Learning Models with Application to Spoken Language Understanding


Abstract

We propose a simple and efficient approach for pre-training deep learning models, with application to slot-filling tasks in spoken language understanding. The proposed approach leverages unlabeled data to train the models and is generic enough to work with any deep learning model. In this study, we consider the CNN2CRF architecture, which combines a Convolutional Neural Network (CNN) with a Conditional Random Field (CRF) as the top layer, since it has shown great potential for learning useful representations for supervised sequence-learning tasks. With this architecture, the proposed pre-training approach learns feature representations from both labeled and unlabeled data at the CNN layer, covering features that would not be observed in limited labeled data. At the CRF layer, the unlabeled data contributes predicted word classes as latent sequence labels, which are used together with the labeled sequences. Latent labeled sequences, in principle, have a regularization effect on the labeled sequences, yielding a better-generalized model. This allows the network to learn representations that are useful not only for slot tagging on labeled data but also for learning dependencies both within and between latent clusters of unseen words. The proposed pre-training method with the CNN2CRF architecture achieves significant gains over the strongest semi-supervised baseline.
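
To make the described architecture concrete, below is a minimal sketch (not the authors' code) of a CNN2CRF slot tagger in PyTorch: a one-dimensional convolution over word embeddings produces per-token emission scores, and a linear-chain CRF on top scores whole tag sequences. The CRF layer here comes from the third-party pytorch-crf package, all hyperparameters (embedding size, hidden width, kernel size) are illustrative assumptions, and the pseudo-labeling step at the end is only one plausible reading of how predicted word classes could serve as the latent sequence labels the abstract mentions.

```python
import torch
import torch.nn as nn
from torchcrf import CRF  # third-party: pip install pytorch-crf


class CNN2CRF(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=128, kernel=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Same-length 1-D convolution: one feature vector per token.
        self.conv = nn.Conv1d(emb_dim, hidden, kernel, padding=kernel // 2)
        self.proj = nn.Linear(hidden, num_tags)  # per-token emission scores
        self.crf = CRF(num_tags, batch_first=True)

    def emissions(self, tokens):
        x = self.embed(tokens).transpose(1, 2)        # (B, emb, T)
        h = torch.relu(self.conv(x)).transpose(1, 2)  # (B, T, hidden)
        return self.proj(h)                           # (B, T, num_tags)

    def loss(self, tokens, tags, mask):
        # Negative log-likelihood of the tag sequence under the CRF.
        return -self.crf(self.emissions(tokens), tags, mask=mask)

    def decode(self, tokens, mask):
        # Viterbi decoding; for unlabeled data these predictions stand in
        # for the latent sequence labels described in the abstract.
        return self.crf.decode(self.emissions(tokens), mask=mask)


model = CNN2CRF(vocab_size=1000, num_tags=10)
tokens = torch.randint(1, 1000, (2, 7))               # labeled batch
tags = torch.randint(0, 10, (2, 7))                   # gold slot tags
mask = torch.ones(2, 7, dtype=torch.bool)

# Pre-training sketch: decode an unlabeled batch to obtain latent tag
# sequences, then train on labeled and latent-labeled data together.
unlabeled = torch.randint(1, 1000, (2, 7))
latent_tags = torch.tensor(model.decode(unlabeled, mask))
total_loss = model.loss(tokens, tags, mask) + model.loss(unlabeled, latent_tags, mask)
total_loss.backward()
```

In this reading, the shared CNN encoder is what carries features learned from unlabeled text into the supervised slot-tagging task, while the CRF transition parameters absorb the regularization effect of the latent label sequences.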
