Mining frequent subgraphs over uncertain graph databases under probabilistic semantics

Jianzhong Li; Zhaonian Zou; Hong Gao

Abstract

Frequent subgraph mining has been extensively studied on certain graph data. However, uncertainty is intrinsic to graph data in practice, and there is very little work on mining uncertain graph data. This paper focuses on mining frequent subgraphs over uncertain graph data under the probabilistic semantics. Specifically, a measure called φ-frequent probability is introduced to evaluate the degree of recurrence of subgraphs. Given a set of uncertain graphs and two real numbers 0 < φ, τ < 1, the goal is to quickly find all subgraphs with φ-frequent probability at least τ. Because the problem is NP-hard and computing the φ-frequent probability of a subgraph is #P-hard, an approximate mining algorithm is proposed to produce an (ε, δ)-approximate set Π of "frequent subgraphs", where 0 < ε < τ is the error tolerance and 0 < δ < 1 is a confidence bound. The algorithm guarantees that (1) any frequent subgraph S is contained in Π with probability at least ((1 - δ)/2)^s, where s is the number of edges in S; (2) any infrequent subgraph with φ-frequent probability less than τ - ε is contained in Π with probability at most δ/2. The theoretical analysis shows that to obtain any frequent subgraph with probability at least 1 - Δ, the input parameter δ of the algorithm must be set to at most 1 - 2(1 - Δ)^(1/e_max), where 0 < Δ < 1 and e_max is the maximum number of edges in frequent subgraphs. Extensive experiments on real uncertain graph data verify that the proposed algorithm is efficient in practice and has very high approximation quality. Moreover, this paper discusses, for the first time, the difference between the probabilistic semantics and the expected semantics for mining frequent subgraphs over uncertain graph data.
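
As a small numeric illustration of the two bounds quoted in the abstract, the following Python sketch evaluates the inclusion guarantee ((1 - δ)/2)^s and the corresponding upper limit on δ, 1 - 2(1 - Δ)^(1/e_max). The function names `max_delta` and `inclusion_lower_bound` are hypothetical helpers introduced here for illustration; they are not part of the paper's algorithm, and the formulas are taken only from the abstract as stated.

```python
def max_delta(big_delta: float, e_max: int) -> float:
    """Upper limit on delta from the abstract: 1 - 2 * (1 - Delta)^(1 / e_max).

    big_delta is the abstract's Delta (0 < Delta < 1); e_max is the maximum
    number of edges in a frequent subgraph. Both names are illustrative.
    """
    if not 0.0 < big_delta < 1.0:
        raise ValueError("Delta must lie strictly between 0 and 1")
    if e_max < 1:
        raise ValueError("e_max must be a positive integer")
    return 1.0 - 2.0 * (1.0 - big_delta) ** (1.0 / e_max)


def inclusion_lower_bound(delta: float, s: int) -> float:
    """Guarantee (1) from the abstract: ((1 - delta) / 2)^s for a subgraph with s edges."""
    return ((1.0 - delta) / 2.0) ** s


# Example: target inclusion probability 1 - Delta = 0.1 for subgraphs of up to 3 edges.
limit = max_delta(0.9, 3)
print(limit, inclusion_lower_bound(limit, 3))
```

Setting δ to the returned limit makes the lower bound for a subgraph with e_max edges equal to 1 - Δ. Under the formula as quoted, the returned value can be zero or negative for some (Δ, e_max) pairs, in which case no δ in (0, 1) satisfies the condition.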

Bibliographic record

Source
The VLDB Journal | 2012, Issue 6 | pp. 753-777 | 25 pages
Authors
Jianzhong Li; Zhaonian Zou; Hong Gao
Affiliations

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China (all three authors)

Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language: English
Keywords
uncertain graph; frequent subgraph mining; probabilistic semantics; φ-frequent probability; #P

Similar literature

1. Mining Frequent Subgraph Patterns from Uncertain Graph Data [J]. Zou Zhaonian, Li Jianzhong, Gao Hong. IEEE Transactions on Knowledge and Data Engineering, 2010, Issue 9.
2. Efficient weighted probabilistic frequent itemset mining in uncertain databases [J]. Li Zhiyang, Chen Fengjuan, Wu Junfeng. Expert Systems, 2021, Issue 5.
3. Probabilistic maximal frequent itemset mining methods over uncertain databases [J]. Li Haifeng, Hai Mo, Zhang Ning. Intelligent Data Analysis, 2019, Issue 6.
4. Discovering Frequent Subgraphs over Uncertain Graph Databases under Probabilistic Semantics [C]. Zhaonian Zou, Hong Gao, Jianzhong Li. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; KDD 10. 2011.
5. A relational database approach for frequent subgraph mining [D]. Pradhan, Subhesh Kumar. 2006.
6. Frequent Subgraph Mining of Personalized Signaling Pathway Networks Groups Patients with Frequently Dysregulated Disease Pathways and Predicts Prognosis [O]. Arda Durmaz, Tim A. D. Henderson, Douglas Brubaker.
7. Mining Frequent Subgraph Patterns from Uncertain Graphs [O]. Zou Zhao-nian, Li Jian-zhong, Gao Hong. 2015.
8. Discovering Frequent Geometric Subgraphs [R]. Kuramochi, M., Karypis, G. 2004.
