Venue: Annual meeting of the Association for Computational Linguistics

Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces

Abstract

Recent work on bilingual lexicon induction (BLI) has frequently depended either on aligned bilingual lexicons or on distribution matching, often assuming that the two embedding spaces are isometric. We propose a technique to quantitatively estimate how well this isometry assumption holds between two embedding spaces, and we show empirically that the assumption weakens as the languages in question become increasingly etymologically distant. We then propose Bilingual Lexicon Induction with Semi-Supervision (BLISS), a semi-supervised approach that relaxes the isometry assumption while leveraging both limited aligned bilingual lexicons and a larger set of unaligned word embeddings, together with a novel hubness filtering technique. Our proposed method obtains state-of-the-art results on 15 of 18 language pairs on the MUSE dataset, and does particularly well when the embedding spaces do not appear to be isometric. We also show that adding supervision stabilizes the learning procedure and is effective even with minimal supervision.
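To make the notion of an isometry assumption concrete, the following Python sketch compares pairwise distances among a few seed translation pairs in two embedding spaces. If one space were an exact rotation or reflection of the other, the matched distances would agree and the gap would be close to zero. This is only an illustration under simplifying assumptions: it is not the estimator proposed in the paper (the abstract does not describe it), and the function names `pairwise_distances` and `isometry_gap`, as well as the toy vectors, are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair of vectors, in a fixed order."""
    return [math.dist(u, v) for u, v in combinations(vectors, 2)]


def isometry_gap(src_vectors, tgt_vectors):
    """Mean absolute difference between matched pairwise distances.

    Assumption: src_vectors[i] and tgt_vectors[i] embed a known translation
    pair. The result is approximately 0 when the two point sets are related
    by a distance-preserving map, and grows as the geometry diverges.
    """
    if len(src_vectors) != len(tgt_vectors):
        raise ValueError("both spaces need the same number of seed words")
    if len(src_vectors) < 2:
        raise ValueError("at least two seed pairs are required")
    d_src = pairwise_distances(src_vectors)
    d_tgt = pairwise_distances(tgt_vectors)
    return sum(abs(a - b) for a, b in zip(d_src, d_tgt)) / len(d_src)


# Toy 2-D "embeddings" (hypothetical values, for illustration only).
source = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
rotated = [(0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]    # source rotated by 90 degrees
stretched = [(2.0, 0.0), (0.0, 1.0), (-2.0, 0.0)]   # source stretched along x

print("gap for rotated copy:  ", isometry_gap(source, rotated))
print("gap for stretched copy:", isometry_gap(source, stretched))
```

A rotated copy preserves all distances, whereas the stretched copy does not, so a larger gap signals a weaker fit for methods that rely on a purely orthogonal mapping between the two spaces.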
