Journal: ACM SIGIR Forum

Beyond Bags of Words: Effectively Modeling Dependence and Features in Information Retrieval


Abstract

Current state-of-the-art information retrieval models treat documents and queries as bags of words. There have been many attempts to go beyond this simple representation. Unfortunately, few have shown consistent improvements in retrieval effectiveness across a wide range of tasks and data sets. Here, we propose a new statistical model for information retrieval based on Markov random fields. The proposed model goes beyond the bag-of-words assumption by allowing dependencies between terms to be incorporated into the model. This allows a variety of textual and non-textual features to be combined easily within a single model. Within this framework, we explore the theoretical issues involved, parameter estimation, feature selection, and query expansion. We give experimental results from a number of information retrieval tasks, such as ad hoc retrieval and web search.
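To make the idea of term dependence concrete, here is a minimal sketch of a ranking function that adds term-pair features to a plain bag-of-words score. It is not the model from the article: the abstract does not specify the features or weights, so the function names (`mrf_score`, `count_ordered`, `count_unordered`), the `log(1 + count)` feature form, the window size, and the default weights are all illustrative assumptions.

```python
import math
from typing import List


def count_ordered(doc: List[str], a: str, b: str) -> int:
    """Count positions where term a is immediately followed by term b."""
    return sum(1 for i in range(len(doc) - 1) if doc[i] == a and doc[i + 1] == b)


def count_unordered(doc: List[str], a: str, b: str, window: int = 8) -> int:
    """Count pairs (i, j) with doc[i] == a, doc[j] == b and 0 < |i - j| < window."""
    pos_a = [i for i, t in enumerate(doc) if t == a]
    pos_b = [j for j, t in enumerate(doc) if t == b]
    return sum(1 for i in pos_a for j in pos_b if 0 < abs(i - j) < window)


def mrf_score(
    query: List[str],
    doc: List[str],
    w_term: float = 0.8,       # assumed weight for single-term features
    w_ordered: float = 0.1,    # assumed weight for exact adjacent pairs
    w_unordered: float = 0.1,  # assumed weight for pairs co-occurring in a window
    window: int = 8,
) -> float:
    """Weighted sum of three feature groups over the query terms.

    With w_ordered = w_unordered = 0 this reduces to a bag-of-words score.
    """
    pairs = list(zip(query, query[1:]))  # adjacent query terms only
    term = sum(math.log1p(doc.count(q)) for q in query)
    ordered = sum(math.log1p(count_ordered(doc, a, b)) for a, b in pairs)
    unordered = sum(math.log1p(count_unordered(doc, a, b, window)) for a, b in pairs)
    return w_term * term + w_ordered * ordered + w_unordered * unordered


query = "information retrieval".split()
doc_a = "markov random fields for information retrieval".split()
doc_b = "retrieval fields random for markov information".split()

# Same words, different order: the term-only score cannot tell them apart.
bag_a = mrf_score(query, doc_a, 1.0, 0.0, 0.0)
bag_b = mrf_score(query, doc_b, 1.0, 0.0, 0.0)

# With the pair features switched on, the document containing the phrase ranks higher.
dep_a = mrf_score(query, doc_a)
dep_b = mrf_score(query, doc_b)
```

The two documents contain exactly the same words, so a bag-of-words score treats them as identical; only the pair features distinguish the one in which the query terms appear as a phrase. Non-textual features could be added as further weighted terms in the same sum.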
