Information theoretic, probabilistic and maximum partial substructure algorithms for discovering graph-based anomalies.

Abstract

The ability to mine data represented as a graph has become important in several domains for detecting structural patterns. One important area of data mining is anomaly detection, particularly for fraud. However, comparatively little work has addressed detecting anomalies in graph-based data. Previous work using statistical metrics and conditional entropy measurements has produced results limited to certain types of anomalies and specific domains.

In this work we present graph-based approaches to uncovering anomalies in domains where the anomalies consist of unexpected entity/relationship alterations that closely resemble non-anomalous behavior. We have developed three algorithms for detecting anomalies in all three possible types of graph change: label modifications, vertex/edge insertions, and vertex/edge deletions. Each algorithm focuses on one of these anomaly types and first uses the minimum description length principle to discover the normative substructure. Once the common pattern is known, each algorithm uses a different approach to discover its particular anomaly type. The first algorithm uses the minimum description length to find substructures that closely compress the graph. The second algorithm uses a probabilistic approach to examine substructure extensions and their likelihood of existence. The third algorithm analyzes substructures that come close to matching the normative pattern but are unable to make some of the final extensions.

Using synthetic and real-world data, we evaluate the effectiveness of each algorithm on each type of anomaly. Each algorithm demonstrates the usefulness of examining a graph-based representation of data for detecting fraud, where some individual or entity cloaks illegal activities by attempting to closely resemble legitimate transactions.
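The probabilistic idea mentioned in the abstract can be illustrated with a toy sketch. This is not the thesis's implementation: it assumes the normative pattern has already been found and that each observed one-edge extension of a pattern instance has been reduced to a hashable tuple. The function name `rank_extensions`, the tuple layout, and the sample labels are all hypothetical. The sketch only estimates how often each extension occurs and lists the rarest first, as candidate anomalies.

```python
from collections import Counter


def rank_extensions(extensions):
    """Return (extension, probability) pairs, least likely first.

    `extensions` holds one hashable item per observed extension of a
    normative-pattern instance, e.g. (edge_label, new_vertex_label).
    The probability is simply the relative frequency of the extension.
    """
    counts = Counter(extensions)
    total = sum(counts.values())
    ranked = [(ext, n / total) for ext, n in counts.items()]
    # Sort ascending so the rarest (most suspicious) extensions come first.
    return sorted(ranked, key=lambda pair: pair[1])


# Hypothetical data: nine instances extend the same way, one differs.
observed = [("pays", "Vendor")] * 9 + [("pays", "Employee")]
ranked = rank_extensions(observed)
print(ranked[0])  # (('pays', 'Employee'), 0.1)
```

A real system would also need the subgraph matching that produces these extensions and a threshold for deciding when a low probability counts as anomalous. Both are outside the scope of this sketch.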


Bibliographic record

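Author: Eberle, William Fred
Author affiliation: The University of Texas at Arlington
Degree-granting institution: The University of Texas at Arlington
Discipline: Computer Science
Degree: Ph.D.
Year: 2007
Pages: 165
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology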

