Filled translation for bootstrapping language understanding of low-resourced languages

Abstract

Annotated training data (e.g., sentences) in a first language is used to generate annotated training data for a second language. For example, annotated sentences in English are first collected manually and are then used to generate annotated sentences in Chinese. The annotated training data includes slot labels, slot values, and carrier phrases. The carrier phrases are the portions of the training data that fall outside of any slot. Each carrier phrase is translated from the first language into one or more translations in the second language. The translations may include machine translations as well as human translations. Entities for the slot values are determined for the translated sentences using content sources that include locale-dependent entities. The determined entities are used to fill the slots in the second-language translations. All or a portion of the resulting sentences may be used to train models in the second language.
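
The pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Annotated` class, the `CARRIER_TRANSLATIONS` table (a stand-in for machine or human translation of carrier phrases), the `LOCALE_ENTITIES` table (a stand-in for a locale-dependent content source), and the sample sentences. The sketch pairs every translated carrier phrase with every combination of locale entities, which is only one possible filling strategy.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Annotated:
    """An annotated sentence: a carrier phrase plus labeled slot values."""
    template: str  # carrier phrase with {slot_label} placeholders
    slots: dict    # slot label -> slot value


# Hypothetical stand-in for machine/human translation of carrier phrases.
# One source carrier phrase may map to several translations.
CARRIER_TRANSLATIONS = {
    "find a {cuisine} restaurant in {city}": [
        "在{city}找一家{cuisine}餐厅",
        "{city}有什么{cuisine}餐厅",
    ],
}

# Hypothetical locale-dependent content source: entities per slot label.
LOCALE_ENTITIES = {
    "zh-CN": {"city": ["北京", "上海"], "cuisine": ["川菜"]},
}


def fill_translations(source, locale):
    """Generate annotated target-language sentences from one source sentence."""
    entities = LOCALE_ENTITIES[locale]
    labels = sorted(source.slots)  # fixed order so output is deterministic
    results = []
    for translated in CARRIER_TRANSLATIONS.get(source.template, []):
        # Fill the slots with every combination of locale entities.
        for values in product(*(entities[label] for label in labels)):
            results.append(Annotated(translated, dict(zip(labels, values))))
    return results


def render(sentence):
    """Produce the surface sentence by substituting slot values."""
    return sentence.template.format(**sentence.slots)


source = Annotated(
    "find a {cuisine} restaurant in {city}",
    {"cuisine": "Italian", "city": "Seattle"},
)
generated = fill_translations(source, "zh-CN")
for sentence in generated:
    print(render(sentence), sentence.slots)
```

Because each generated sentence keeps its slot labels alongside the filled values, the output remains annotated and could in principle be used directly as training data for the second language.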

Bibliographic details

  • Publication number: US9613027B2

    Patent type:

  • Publication date: 2017-04-04

    Original format: PDF

  • Applicant/Assignee: MICROSOFT TECHNOLOGY LICENSING LLC

    Application number: US201314074358

  • Inventors: MEI-YUH HWANG; YONG NI

    Filing date: 2013-11-07

  • Classification: G06F17/28; G06F17; G06F17/20; G06F17/21; G06F9/44; G06Q10; G06Q50; G10L15; G10L13

  • Country: US

  • Date added to database: 2022-08-21 13:41:39
