International Conference on Bangla Speech and Language Processing

Opinion Summarization of Bangla Texts using Cosine Simillarity Based Graph Ranking and Relevance Based Approach

Abstract

The main idea of automatic extractive text or opinion summarization is to find a small, representative subset of the original document that preserves its most important information. Many methods exist for summarizing text in English, Turkish, Arabic, and other languages, but very few attempts have been made for Bangla because of its rich morphology and multifaceted structure. In this paper, we propose a joint approach for summarizing Bangla text that combines cosine similarity based graph ranking with relevance based scoring and ranking. We developed a stemming algorithm based on part-of-speech (POS) tagging, using around two lakh (200,000) POS tags for Bangla texts. A redundancy removal algorithm is also proposed so that each sentence in the summary represents exactly the most important information in the document. The performance of the proposed approach is evaluated by measuring recall, precision, and F-score based on the ROUGE metric, and the results show that it outperforms other existing summarization methods for Bangla texts.
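
The graph ranking and redundancy removal steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes whitespace tokenization and raw term-frequency vectors, uses a PageRank-style iteration as a stand-in for the paper's graph ranking, and omits the POS-based stemmer and the relevance based scoring entirely. The function names, the damping factor of 0.85, and the redundancy threshold of 0.5 are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(vectors, damping=0.85, iterations=50):
    """PageRank-style power iteration over a cosine-similarity sentence graph."""
    n = len(vectors)
    # Edge weights are pairwise cosine similarities; self-loops are excluded.
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional
            # to the weight of the edge j -> i.
            incoming = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def summarize(sentences, k=2, redundancy_threshold=0.5):
    """Pick up to k high-ranking sentences, skipping near-duplicates."""
    if not sentences:
        return []
    # Assumption: lowercase whitespace tokens, with no stemming applied.
    vectors = [Counter(s.lower().split()) for s in sentences]
    scores = rank_sentences(vectors)
    chosen = []
    for i in sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True):
        # Redundancy filter: reject a candidate too similar to any pick so far.
        if all(cosine(vectors[i], vectors[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(list_of_sentences, k=3)` returns at most three sentences in document order. Because tokenization is plain whitespace splitting, the sketch runs on Bangla text as well, although inflected forms of the same word are treated as distinct terms without a stemmer.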