Journal: ACM Transactions on Asian Language Information Processing

Statistical Query Translation Models for Cross-Language Information Retrieval


Abstract

Query translation, an important task in cross-language information retrieval (CLIR), aims to determine the best translation words and weights for a query. This article presents three statistical query translation models that focus on resolving query translation ambiguities. All three models assume that the choice of translation for a query term depends on the translations of the other terms in the query. They differ in how linguistic structures are detected and exploited. The co-occurrence model treats a query as a bag of words and uses all the other terms in the query as the context for translation disambiguation. The other two models exploit linguistic dependencies among terms. The noun phrase (NP) translation model detects NPs in a query and translates each NP as a unit, assuming that the translation of a term depends only on other terms within the same NP. Similarly, the dependency translation model detects and translates dependency triples, such as verb-object, as units. The evaluations show that linguistic structures always lead to more precise translations. CLIR experiments on TREC Chinese collections show that all three models have a positive impact on query translation and lead to significant improvements in CLIR performance over the simple dictionary-based translation method. The best results are obtained by combining the three models.
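
The co-occurrence idea described in the abstract can be illustrated with a small sketch. This is not the article's actual model: the dictionary `CANDIDATES`, the scores in `COHESION`, the fallback value, and the exhaustive search in `translate_query` are all hypothetical choices made for illustration. The sketch only shows the general principle that a term's translation is chosen jointly with the translations of the other query terms.

```python
from itertools import combinations, product

# Hypothetical toy dictionary: English query term -> candidate Chinese translations.
CANDIDATES = {
    "bank": ["银行", "河岸"],  # financial bank vs. river bank
    "loan": ["贷款"],
    "rate": ["利率", "速率"],  # interest rate vs. speed
}

# Hypothetical cohesion scores between target-language words. In a real system
# these would be estimated from co-occurrence counts in a target-language corpus.
COHESION = {
    frozenset(["银行", "贷款"]): 0.9,
    frozenset(["银行", "利率"]): 0.8,
    frozenset(["贷款", "利率"]): 0.9,
    frozenset(["河岸", "贷款"]): 0.05,
}

DEFAULT_COHESION = 0.01  # assumed fallback for unseen word pairs


def cohesion(a, b):
    """Return the (symmetric) cohesion score of two target-language words."""
    return COHESION.get(frozenset((a, b)), DEFAULT_COHESION)


def translate_query(terms):
    """Pick one translation per term so that total pairwise cohesion is maximal.

    The whole query is treated as a bag of words: every other term's
    translation serves as context. Exhaustive search is used for clarity
    and is only practical for short queries.
    """
    options = [CANDIDATES[t] for t in terms]
    best = max(
        product(*options),
        key=lambda combo: sum(cohesion(a, b) for a, b in combinations(combo, 2)),
    )
    return dict(zip(terms, best))


print(translate_query(["bank", "loan", "rate"]))
```

With these toy scores, the context terms "loan" and "rate" push "bank" toward its financial sense. Restricting the context to terms inside the same noun phrase or dependency triple, as the other two models do, would amount to calling a function like this on each detected unit rather than on the whole query.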
