Journal: Journal of Information Science

Generating query suggestions by exploiting latent semantics in query logs


Abstract

Search engines assist users in expressing their information needs more accurately by automatically reformulating the issued queries and suggesting the generated reformulations to the users. Many approaches to query suggestion draw on the information stored in query logs, recommending recorded queries that are textually similar to the current user's query or that frequently co-occurred with it in the past. In this paper, we propose an approach that concentrates on deducing the actual information need from the user's query. The challenge lies not only in processing keyword queries, which are often short and possibly ambiguous, but especially in handling the complexity of natural language, which allows users to express the same or similar information needs in many different ways. We expect a higher-level semantic representation of a user's query to reflect the information need more accurately than the explicit query terms alone can. To this end, we employ latent Dirichlet allocation as a probabilistic topic model to reveal latent semantics in the query log. Our evaluations show that purely topic-based query suggestion performs worst, whereas interpolating our proposed topic-based model with the baseline word-based model, which generates suggestions by matching query terms, significantly improves suggestion quality over the already well-performing purely word-based approach.
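
The interpolation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring functions, so everything here is an assumption made for illustration: the word-based baseline is stood in for by Jaccard overlap of query terms, the topic-based score by cosine similarity between per-query topic distributions (which would come from a fitted LDA model but are hard-coded below), and the two are combined linearly with a hypothetical weight `lam`. None of the function names or values are taken from the paper.

```python
import math


def word_score(query, candidate):
    """Jaccard overlap of query terms (assumed stand-in for the word-based baseline)."""
    q, c = set(query.split()), set(candidate.split())
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def topic_score(p, q):
    """Cosine similarity between two topic distributions (assumed topic-based score)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def interpolated_score(query, candidate, topics, lam=0.5):
    """Linear interpolation: lam weights the topic score, (1 - lam) the word score."""
    t = topic_score(topics[query], topics[candidate])
    w = word_score(query, candidate)
    return lam * t + (1 - lam) * w


def suggest(query, log_queries, topics, lam=0.5, k=3):
    """Rank logged queries (excluding the query itself) by interpolated score."""
    candidates = [c for c in log_queries if c != query]
    ranked = sorted(
        candidates,
        key=lambda c: interpolated_score(query, c, topics, lam),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical topic distributions over two topics; in practice these would be
# inferred by an LDA model trained on the query log.
TOPICS = {
    "cheap flights": [0.9, 0.1],
    "low cost airline tickets": [0.85, 0.15],
    "cheap laptops": [0.1, 0.9],
}

if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        print(lam, suggest("cheap flights", list(TOPICS), TOPICS, lam=lam, k=2))
```

With `lam=0.0` the ranking falls back to term overlap alone, so "cheap laptops" is ranked first for "cheap flights" because the two share a word. With a nonzero topic weight, "low cost airline tickets" moves to the top despite sharing no terms with the query, which is the kind of paraphrase the abstract says term matching alone cannot capture.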
