Exploration on Efficient Similar Sentences Extraction

Abstract

Measuring semantic similarity between sentences is essential to many applications, such as natural language processing, Web page retrieval, and question-answering models. Although a few studies have explored this issue, most of them focus on improving effectiveness. In this paper, we address the efficiency issue: for a given sentence collection, how to efficiently discover the top-k sentences most semantically similar to a query. This issue is important for real applications because data volumes have become huge and the existing state-of-the-art strategies cannot satisfy users' performance requirements. We propose efficient strategies to tackle this problem based on a general framework. Extensive experimental evaluations conducted on two real datasets demonstrate that our proposal is more efficient than the state-of-the-art approach.

Bibliographic details

  • Source
    電子情報通信学会技術研究報告 (IEICE Technical Report), 2012, No. 172, pp. 17-22 (6 pages)
  • Author affiliations

    Institute of Industrial Science, the University of Tokyo 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505 Japan;

    Institute of Industrial Science, the University of Tokyo 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505 Japan;

    Institute of Industrial Science, the University of Tokyo 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505 Japan;

    Institute of Industrial Science, the University of Tokyo 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505 Japan;

  • Original format: PDF
  • Language: English
  • Keywords

    semantic similarity; query aggregation; top-k
