Cross-lingual projection for class-based language models

Abstract

This paper presents a cross-lingual projection technique for training class-based language models. We borrow from previous success in projecting POS tags and NER mentions to that of a trained class-based language model. We use a CRF to train a model to predict when a sequence of words is a member of a given class and use this to label our language model training data. We show that we can successfully project the contextual cues for these classes across pairs of languages and retain a high quality class model in languages with no supervised class data. We present empirical results that show the quality of the projected models as well as their effect on the down-stream speech recognition objective. We are able to achieve over 70% of the WER reduction when using the projected class models as compared to models trained on human annotations.
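The labeling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the tagger (a CRF in the paper) emits one BIO-style tag per word and that each labeled span is replaced by a single class token in the language model training text. The tag scheme, the `$ADDRESS` class name, and the function `collapse_to_class_tokens` are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def collapse_to_class_tokens(tokens: List[str], tags: List[str]) -> List[str]:
    """Replace each BIO-tagged span with one class token (hypothetical scheme).

    tokens: words of one training sentence.
    tags:   one tag per word, e.g. "O", "B-ADDRESS", "I-ADDRESS".
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    out: List[str] = []
    prev = "O"
    for tok, tag in zip(tokens, tags):
        if tag == "O":
            # Words outside any class span are kept unchanged.
            out.append(tok)
        elif tag.startswith("B-") or prev == "O" or prev[2:] != tag[2:]:
            # A new span starts here: emit the class token once.
            out.append("$" + tag[2:])
        # Otherwise the word continues the current span and emits nothing.
        prev = tag
    return out


sentence = "directions to 10 main street please".split()
predicted = ["O", "O", "B-ADDRESS", "I-ADDRESS", "I-ADDRESS", "O"]
print(" ".join(collapse_to_class_tokens(sentence, predicted)))
```

Text rewritten this way lets an n-gram model learn the contexts around a class token rather than around each individual member of the class, which is the general idea behind class-based language models.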

Bibliographic details

Source
Annual Meeting of the Association for Computational Linguistics | 2016 | pp. 83-88 | 6 pages
Authors
Beat Gfeller; Vlad Schogol; Keith Hall
Original format: PDF

