Conference: China National Conference on Computational Linguistics; International Symposium on Natural Language Processing Based on Naturally Annotated Big Data

Combining Lexical Context with Pseudo-alignment for Bilingual Lexicon Extraction from Comparable Corpora

Abstract

Only a few studies have made use of alignment information in bilingual lexicon extraction from comparable corpora, and those studies require the comparable corpora to be divided into 1-1 aligned document pairs. They have not been able to show that extracted lexicons benefit from the embedding of alignment information. Moreover, strict 1-1 alignments are not widely available in comparable corpora. In this paper we develop a language-independent approach to lexicon extraction that combines the classic lexical context with pseudo-alignment information. Experiments on an English-French comparable corpus demonstrate that pseudo-alignment in comparable corpora is an essential feature leading to a significant improvement over the standard method of lexicon extraction, a perspective that has never been investigated in a similar way by previous studies.
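
For readers unfamiliar with the "classic lexical context" baseline that the abstract refers to, the following is a minimal sketch of the generic context-vector approach to lexicon extraction, not the authors' implementation. It assumes tokenized sentences, raw co-occurrence counts, a small seed dictionary, and cosine similarity; all function names, the window size, and the toy data are hypothetical. The pseudo-alignment component is not shown, because the abstract does not specify how it is computed or combined.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, window=2):
    """Count, for each word, the words that occur within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def translate_vector(vector, seed_dict):
    """Project a source-language context vector into the target language.

    Context words missing from the (hypothetical) seed dictionary are dropped.
    """
    translated = Counter()
    for word, weight in vector.items():
        for target in seed_dict.get(word, ()):
            translated[target] += weight
    return translated


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(w * v.get(k, 0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(source_word, src_vecs, tgt_vecs, seed_dict):
    """Rank target-language words by context similarity to `source_word`."""
    projected = translate_vector(src_vecs.get(source_word, Counter()), seed_dict)
    scored = [(cosine(projected, vec), cand) for cand, vec in tgt_vecs.items()]
    # Highest similarity first; ties broken alphabetically for determinism.
    return [cand for _, cand in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Toy data, invented purely for illustration.
english = [["the", "cat", "drinks", "milk"], ["the", "dog", "drinks", "water"]]
french = [["le", "chat", "boit", "lait"], ["le", "chien", "boit", "eau"]]
seed = {"the": ["le"], "drinks": ["boit"], "milk": ["lait"], "water": ["eau"]}

ranking = rank_candidates(
    "cat", context_vectors(english), context_vectors(french), seed
)
print(ranking[:3])
```

In practice the raw counts are usually replaced by an association measure, and the seed dictionary is far larger than this toy example; both choices are left open here.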
