Journal: ACM Transactions on Asian Language Information Processing

Flexible Pseudo-Relevance Feedback via Selective Sampling


Abstract

Although Pseudo-Relevance Feedback (PRF) is a widely used technique for enhancing average retrieval performance, it may actually hurt performance for around one-third of a given set of topics. To enhance the reliability of PRF, Flexible PRF has been proposed, which adjusts the number of pseudo-relevant documents and/or the number of expansion terms for each topic. This paper explores a new, inexpensive Flexible PRF method, called Selective Sampling, which is unique in that it can skip documents in the initial ranked output to look for more "novel" pseudo-relevant documents. While Selective Sampling is only comparable to Traditional PRF in terms of average performance and reliability, per-topic analyses show that Selective Sampling outperforms Traditional PRF almost as often as Traditional PRF outperforms Selective Sampling. Thus, treating the top P documents as relevant is often not the best strategy. However, predicting when Selective Sampling outperforms Traditional PRF appears to be as difficult as predicting when a PRF method fails. For example, our per-topic analyses show that even the proportion of truly relevant documents in the pseudo-relevant set is not necessarily a good performance predictor.
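The contrast between treating the top P documents as relevant and skipping documents to find more "novel" ones can be illustrated with a small sketch. The abstract does not give the actual selection rule, so everything here is an assumption made for illustration: the Jaccard overlap test, the `max_overlap` threshold, the frequency-based term picker, and the toy ranking are all hypothetical and should not be read as the paper's method.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard overlap between the term sets of two documents."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def traditional_prf(ranked_docs, p):
    """Traditional PRF: treat the top-p ranked documents as relevant."""
    return ranked_docs[:p]


def selective_sample(ranked_docs, p, max_overlap=0.5):
    """Hypothetical novelty filter (an assumption, not the paper's rule):
    walk down the ranking and skip any document whose term overlap with
    an already chosen document exceeds max_overlap."""
    chosen = []
    for doc in ranked_docs:
        if all(jaccard(doc, c) <= max_overlap for c in chosen):
            chosen.append(doc)
        if len(chosen) == p:
            break
    return chosen


def expansion_terms(pseudo_relevant, query, k):
    """Pick k non-query terms by document frequency, ties broken alphabetically."""
    counts = Counter(
        t for doc in pseudo_relevant for t in set(doc) if t not in query
    )
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ordered[:k]]


# Toy initial ranking: each document is a list of terms (made-up data).
ranked = [
    ["jaguar", "car", "speed"],
    ["jaguar", "car", "speed", "engine"],
    ["jaguar", "cat", "jungle"],
    ["jaguar", "car", "dealer"],
]
query = {"jaguar"}

top_p = traditional_prf(ranked, 2)
sampled = selective_sample(ranked, 2)

print(expansion_terms(top_p, query, 2))    # terms from the top-2 documents
print(expansion_terms(sampled, query, 2))  # terms after skipping a near-duplicate
```

In this toy ranking the second document is nearly identical to the first, so the sketch skips it and takes the third instead, which changes the expansion terms. Whether such a change helps or hurts a given topic is exactly what the abstract reports as hard to predict.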
