Is Community Detection Fully Unsupervised? The Case of Weighted Graphs

Abstract

In the field of NLP, word embeddings have recently attracted a lot of attention. A textual corpus is represented as a sparse word co-occurrence matrix. The matrix can then be factorized, for example using SVD, which yields a smaller matrix of dense, continuous vectors. To help SVD, the PMI measure is applied to the initial co-occurrence matrix, assigning a relevant weight to each co-occurrence by normalizing it with the frequencies of both words involved. In this paper, we follow this idea to study whether weighted networks can benefit from pre-processing that helps community detection. We first design a benchmark using LFR networks. Then, we consider PMI and another NLP-inspired measure as a preprocessing of the link weights, and show that PMI worsens the results while the other measure improves them. By separating links inside communities and links between communities into two classes, we show that this is due to the weight distributions of these links. Links between communities are on average heavier, leading to larger PMI values. From this analysis, we design another set of experiments showing that links can be classified efficiently into these two classes using a small set of features. Finally, we introduce the Supervised Label Propagation (SLP) algorithm, which takes the classification results into account during propagation. This algorithm clearly improves the results, leading us to a major question: is community detection on weighted networks a fully unsupervised task? We conclude with our thoughts on this topic.
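
The PMI preprocessing mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used on graphs, so the code below assumes the standard definition carried over from co-occurrence matrices: PMI(i, j) = log(w_ij * W / (s_i * s_j)), where s_i and s_j are the row and column weight sums and W is the total weight. The function name `pmi_reweight` and the example matrix are hypothetical and not taken from the paper.

```python
import numpy as np

def pmi_reweight(weights):
    """Apply a PMI transform to a non-negative weight matrix.

    Assumed formulation (not confirmed by the source):
    PMI(i, j) = log(w_ij * W / (s_i * s_j)), with s_i the row sum,
    s_j the column sum and W the sum of all weights.
    Entries with zero weight (no link) are left at 0.
    """
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    pmi = np.zeros_like(weights)
    if total == 0:
        return pmi  # empty graph: nothing to reweight

    row_sums = weights.sum(axis=1)
    col_sums = weights.sum(axis=0)
    # Weight expected for each pair if weights were spread independently.
    expected = np.outer(row_sums, col_sums) / total

    # Only transform existing links; a positive weight implies
    # positive row and column sums, so the division is safe.
    mask = weights > 0
    pmi[mask] = np.log(weights[mask] / expected[mask])
    return pmi

# Hypothetical 3-node weighted graph given as a symmetric adjacency matrix.
example = np.array([
    [0.0, 4.0, 1.0],
    [4.0, 0.0, 2.0],
    [1.0, 2.0, 0.0],
])
print(pmi_reweight(example))
```

The transformed matrix could then be passed to any community detection method that accepts link weights. Note that PMI can be negative for links lighter than expected, which some algorithms do not accept without further processing such as clipping at zero.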

Bibliographic record

Source
International Workshop on Complex Networks and Their Applications | 2018 | pp. 256-266 | 11 pages
Authors
Victor Connes; Nicolas Dugue; Adrien Guille
Original format: PDF

Similar literature

1. Unsupervised learning for community detection in attributed networks based on graph convolutional network [J]. Wang Xiaofeng, Li Jianhua, Yang Li. Neurocomputing, 2021, Oct. 7 issue
2. Predicting protein complexes from weighted protein-protein interaction graphs with a novel unsupervised methodology: Evolutionary enhanced Markov clustering [J]. Theofilatos Konstantinos, Pavlopoulou Niki, Papasavvas Christoforos. Artificial Intelligence in Medicine, 2015, no. 3
3. Weighted Graph Clustering for Community Detection of Large Social Networks [J]. Ruifang Liu, Shan Feng, Ruisheng Shi. Procedia Computer Science, 2014, no. 1
4. Three-Way Decisions Community Detection Model Based on Weighted Graph Representation [C]. Jie Chen, Yang Li, Shu Zhao. International Joint Conference on Rough Sets, 2020
5. Unsupervised Classification and Network Analysis of the Reddit Communities with Spiking Neural Network and Exponential-Family Random Graph Model [D]. He, Jie. 2021
6. A Mixed Application of Geographically Weighted Regression and Unsupervised Classification for Analyzing Latex Yield Variability in Yunnan China [O]. Oh Seok Kim, Jeffrey B. Nugent, Zhuang-Fang Yi
7. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities [O]. Lancichinetti, Andrea; Fortunato, Santo. 2009
