SumCR: A new subtopic-based extractive approach for text summarization

Abstract

In text summarization, relevance and coverage are the two main criteria that determine the quality of a summary. In this paper, we propose SumCR, a new multi-document summarization approach based on sentence extraction. A novel feature called Exemplar is introduced to address both criteria simultaneously during sentence ranking. Unlike conventional approaches, in which the relevance value of each sentence is calculated over the whole collection of sentences, the Exemplar value of each sentence in SumCR is obtained within a subset of similar sentences. A fuzzy medoid-based clustering approach is used to produce sentence clusters, or subsets, each of which corresponds to a subtopic of the related topic. This subtopic-based feature captures the relevance of each sentence within different subtopics and thus improves the chance that SumCR produces a summary with wider coverage and less redundancy. Another feature incorporated in SumCR is Position, i.e., the position at which each sentence appears in its document. The final score of each sentence is a combination of the subtopic-level feature Exemplar and the document-level feature Position. Experimental studies on DUC benchmark data show the good performance of SumCR and its potential in summarization tasks.
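
The pipeline described in the abstract (cluster sentences into subtopics, score each sentence within its subtopic, then combine with a position score) can be illustrated with a rough sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption made for illustration: the bag-of-words cosine similarity, the simplified fuzzy k-medoids update, the definition of the Exemplar value as mean similarity to same-cluster sentences, the position score `1 / (1 + pos)`, the weight `alpha`, and the function names `cosine`, `fuzzy_medoids`, and `score_sentences`. It is not the authors' implementation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(cnt * b[w] for w, cnt in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fuzzy_medoids(dist, k, m=2.0, iters=20):
    """Simplified fuzzy k-medoids on a distance matrix (illustrative only).

    Returns (u, medoids), where u[i][c] is the membership of item i in cluster c.
    """
    n = len(dist)

    def memberships(medoids):
        # Inverse-distance weights, normalised so each row sums to 1.
        u = []
        for i in range(n):
            w = [(1.0 / (dist[i][c] + 1e-9)) ** (1.0 / (m - 1.0)) for c in medoids]
            total = sum(w)
            u.append([x / total for x in w])
        return u

    # Farthest-first initialisation: a deterministic choice made for this sketch.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][c] for c in medoids)))

    for _ in range(iters):
        u = memberships(medoids)
        # New medoid of cluster c: the item minimising membership-weighted distance.
        new = [min(range(n), key=lambda j: sum(u[i][c] ** m * dist[i][j] for i in range(n)))
               for c in range(k)]
        if new == medoids:
            break
        medoids = new
    return memberships(medoids), medoids


def score_sentences(docs, k=2, alpha=0.7):
    """Score every sentence; docs is a list of documents, each a list of sentences.

    alpha weights the assumed Exemplar value against the assumed Position value.
    """
    sents = [(s, pos) for doc in docs for pos, s in enumerate(doc)]
    bows = [Counter(s.lower().split()) for s, _ in sents]
    n = len(sents)
    sim = [[cosine(bows[i], bows[j]) for j in range(n)] for i in range(n)]
    dist = [[max(0.0, 1.0 - sim[i][j]) for j in range(n)] for i in range(n)]
    u, _ = fuzzy_medoids(dist, k)
    # Harden the fuzzy memberships: each sentence joins its strongest subtopic.
    label = [max(range(k), key=lambda c: u[i][c]) for i in range(n)]
    scores = []
    for i, (_, pos) in enumerate(sents):
        peers = [j for j in range(n) if j != i and label[j] == label[i]]
        # Assumed Exemplar value: mean similarity to the other sentences in the subtopic.
        exemplar = sum(sim[i][j] for j in peers) / len(peers) if peers else 0.0
        # Assumed Position value: earlier sentences in a document score higher.
        position = 1.0 / (1 + pos)
        scores.append(alpha * exemplar + (1 - alpha) * position)
    return scores
```

Calling `score_sentences(docs, k=2)` on a small list of tokenisable documents returns one score per sentence, in document order; a summary would then be built by taking the top-ranked sentences. Computing the Exemplar value inside each cluster rather than over the whole collection is what lets a sentence that is central to a minor subtopic still rank well, which is the coverage argument the abstract makes.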

Bibliographic record

  • Source
    Knowledge and Information Systems | 2012, Issue 3 | pp. 527-545 | 19 pages
  • Authors

    Jian-Ping Mei; Lihui Chen;

  • Author affiliations

    Division of Information Engineering, School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, Republic of Singapore;

    Division of Information Engineering, School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, Republic of Singapore;

  • Indexing information
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Text summarization; Clustering; Subtopic; Sentence extractive; Sentence position;
