Graph-based Density Peaks Ranking Approach for Extracting KeyPhrases (GDREK)

Abstract

Surprisingly, a Google Scholar search on keyphrase extraction (KE) returns more than 1,500,000 recently published articles, 21,000 of them in the current year alone. This large number implies that researchers still need more accurate and better-performing models for KE from text, a subtask of text mining and summarization. This paper presents a novel design for KE. The model, GDREK, combines graph-based representation, sentence clustering, and ranking based on density peaks for KE in single or multiple documents, and it can also be used for extractive text summarization. The principle of GDREK is to represent the text with a graph model and then group and rank the sentences in a mutual manner. Sentence grouping and ranking proceed incrementally by discovering the main topics of the text and finding the central sentences of each topic. In this incremental step, as the sentences are grouped by the Graph-based Growing Self-Organizing Map (G-GSOM), they are ranked using the Density Peaks (DP) concept according to a measure of similarity between sentences. Our similarity measure is based on shared phrases and the cosine function. Sentences are scored under the assumption that a sentence with more similar sentences is more important (higher density) and more representative. Finally, the most frequent words or phrases in the sentences are selected as the key phrases of the text. Experimental results on two datasets show that our technique extracts most of the key phrases and words, drawn from most sub-topics of the text, with over 75% accuracy.
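The scoring idea in the abstract (a sentence with more similar sentences has higher density) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it omits the graph representation, the G-GSOM clustering, and the shared-phrase component of the similarity measure, and uses plain cosine similarity over word counts. The function names, the `cutoff` threshold, the tiny stopword list, and the choice to count keywords only in the top-ranked sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "on", "for", "it", "this", "that", "with", "as", "by"}

def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def density_scores(sentences, cutoff=0.2):
    """Density of a sentence: how many other sentences reach `cutoff` similarity."""
    vectors = [Counter(tokenize(s)) for s in sentences]
    return [
        sum(1 for j, v in enumerate(vectors) if j != i and cosine(u, v) >= cutoff)
        for i, u in enumerate(vectors)
    ]

def extract_keywords(sentences, top_sentences=2, top_words=3, cutoff=0.2):
    """Take the densest sentences, then return their most frequent words."""
    scores = density_scores(sentences, cutoff)
    # Stable sort: sentences with equal density keep their document order.
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    counts = Counter()
    for i in ranked[:top_sentences]:
        counts.update(tokenize(sentences[i]))
    return [word for word, _ in counts.most_common(top_words)]

if __name__ == "__main__":
    text = [
        "Graph models represent text as nodes and edges.",
        "Graph models rank sentences in text.",
        "Sentences in text are ranked by graph density.",
        "The weather was sunny yesterday.",
    ]
    print(density_scores(text))
    print(extract_keywords(text))
```

In this toy input the three graph-related sentences are mutually similar and receive the same density, while the unrelated sentence receives zero, so it contributes nothing to the selected keywords. A fuller version would compute densities per topic cluster rather than over the whole document.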

Bibliographic record

Source
2019 IEEE 7th Palestinian International Conference on Electrical and Computer Engineering | 2019 | pp. 1-6 | 6 pages
Conference location: Gaza (PS)
Authors
Mahmoud Alfarra; Abdalfattah M. Alfarra; Ahmed Salahedden
Author affiliations

University College of Science and Technology, Khan Younis, Palestine;

University College of Science and Technology, Khan Younis, Palestine;

Open University Sudan, Khartoum, Sudan;

Original format: PDF
Language: English (eng)

Similar literature

1. SemGraph: Extracting Keyphrases Following a Novel Semantic Graph-Based Approach [J]. Juan Martinez-Romo, Lourdes Araujo, Andres Duque Fernandez. Journal of the American Society for Information Science. 2016, No. 1
2. A Graph-based Approach of Automatic Keyphrase Extraction [J]. Yan Ying, Tan Qingping, Xie Qinzheng. Procedia Computer Science. 2017, No. 1
3. Ranking Sentences for Keyphrase Extraction: A Relational Data Mining Approach [J]. Michelangelo Ceci, Corrado Loglisci, Lucrezia Macchia. Procedia Computer Science. 2014, No. 1
4. Graph-based Density Peaks Ranking Approach for Extracting KeyPhrases (GDREK) [C]. Mahmoud Alfarra, Abdalfattah M. Alfarra, Ahmed Salahedden. Palestinian International Conference on Electrical and Computer Engineering. 2019
5. Evaluation techniques and graph-based algorithms for automatic summarization and keyphrase extraction [D]. Hamid, Fahmida. 2016
6. Graph Peak Caller: Calling ChIP-seq peaks on graph-based reference genomes [O]. Ivar Grytten, Knut D. Rand, Alexander J. Nederbragt. 2019
7. TopicRank: Graph-Based Topic Ranking for Keyphrase Extraction [O]. Bougouin Adrien, Boudin Florian, Daille Béatrice. 2013
