Published in: Information Processing & Management

SRL-ESA-TextSum: A text summarization approach based on semantic role labeling and explicit semantic analysis


Abstract

Automatic text summarization attempts to provide an effective solution to today's unprecedented growth of textual data. This paper proposes an innovative graph-based text summarization framework for generic single- and multi-document summarization. The summarizer benefits from two well-established text semantic representation techniques, Semantic Role Labelling (SRL) and Explicit Semantic Analysis (ESA), as well as the constantly evolving collective human knowledge in Wikipedia. SRL is used to parse each sentence semantically, and the word tokens of the parsed sentence are represented as vectors of weighted Wikipedia concepts using the ESA method. The essence of the developed framework is to construct a unique concept graph representation whose vertices are semantic role-based multi-node units below the sentence level. We have empirically evaluated the summarization system on the standard, publicly available dataset from the Document Understanding Conference 2002 (DUC 2002). Experimental results indicate that the proposed summarizer outperforms all state-of-the-art related comparators in single-document summarization on the ROUGE-1 and ROUGE-2 measures, and ranks second on the ROUGE-1 and ROUGE-SU4 scores for multi-document summarization. The testing also demonstrates the scalability of the system: varying the evaluation data size has little impact on summarizer performance, particularly for the single-document summarization task. In short, the findings demonstrate the power of role-based and vectorial semantic representation when combined with the crowd-sourced knowledge base in Wikipedia.
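The pipeline outlined in the abstract (role-labelled sentences, concept vectors, graph-based ranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The concept index `TOY_ESA`, the hand-labelled roles in `SENTENCES`, the role-averaged similarity, and the PageRank-style scoring are all assumptions made for illustration; the abstract does not specify the ranking algorithm, and a real system would use an SRL parser and an ESA index built from Wikipedia.

```python
import math
from collections import defaultdict

# Hypothetical stand-in for an ESA index: word -> {Wikipedia concept: weight}.
TOY_ESA = {
    "court": {"Law": 0.9, "Government": 0.3},
    "judge": {"Law": 0.8},
    "ruling": {"Law": 0.7, "Government": 0.2},
    "storm": {"Weather": 0.9},
    "flood": {"Weather": 0.6, "Disaster": 0.7},
}

# Hand-labelled semantic roles (assumed SRL output): role -> argument words.
SENTENCES = [
    {"A0": ["judge"], "V": ["ruling"], "A1": ["court"]},
    {"A0": ["court"], "A1": ["ruling"]},
    {"A0": ["storm"], "A1": ["flood"]},
]


def concept_vector(words):
    """Sum the concept weights of every word found in the toy index."""
    vec = defaultdict(float)
    for word in words:
        for concept, weight in TOY_ESA.get(word, {}).items():
            vec[concept] += weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sentence_similarity(s1, s2):
    """Average concept-vector cosine over the roles both sentences share."""
    shared = s1.keys() & s2.keys()
    if not shared:
        return 0.0
    total = sum(
        cosine(concept_vector(s1[r]), concept_vector(s2[r])) for r in shared
    )
    return total / len(shared)


def rank(sentences, damping=0.85, iterations=50):
    """PageRank-style power iteration over the sentence similarity graph."""
    n = len(sentences)
    sim = [
        [sentence_similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = 0.0
            for j in range(n):
                out_weight = sum(sim[j])
                if out_weight > 0:
                    incoming += sim[j][i] / out_weight * scores[j]
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


if __name__ == "__main__":
    for sentence, score in zip(SENTENCES, rank(SENTENCES)):
        print(round(score, 3), sentence)
```

In this toy setup the two law-related sentences reinforce each other and score higher than the unrelated weather sentence. A summarizer built along these lines would select the top-scoring sentences up to a length budget.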
