
Information theoretic, probabilistic and maximum partial substructure algorithms for discovering graph-based anomalies.


Abstract

The ability to mine data represented as a graph has become important in several domains for detecting structural patterns. One important area of data mining is anomaly detection, particularly for fraud. However, comparatively little work has addressed anomaly detection in graph-based data. Previous work has used statistical metrics and conditional entropy measurements, but its results have been limited to certain types of anomalies and specific domains.

In this work we present graph-based approaches to uncovering anomalies in domains where the anomalies consist of unexpected entity/relationship alterations that closely resemble non-anomalous behavior. We have developed three algorithms that together cover the three possible types of graph change: label modifications, vertex/edge insertions, and vertex/edge deletions. Each algorithm focuses on one of these anomaly types and first uses the minimum description length principle to discover the normative substructure. Once the common pattern is known, each algorithm uses a different approach to discover its particular anomaly type. The first algorithm uses the minimum description length to find substructures that closely compress the graph. The second algorithm uses a probabilistic approach to examine substructure extensions and their likelihood of existence. The third algorithm analyzes substructures that come close to matching the normative pattern but are unable to make some of the final extensions.

Using synthetic and real-world data, we evaluate the effectiveness of each algorithm on each type of anomaly. Each algorithm demonstrates the usefulness of examining a graph-based representation of data for detecting fraud, where some individual or entity cloaks illegal activities by making them closely resemble legitimate transactions.
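
As a rough illustration of the idea behind the second algorithm (examining extensions of the normative substructure and how likely each one is), the Python sketch below ranks extensions by relative frequency. It is not the dissertation's implementation: the toy data, the `(edge_label, vertex_label)` encoding of an extension, and the function names `extension_probabilities` and `rank_anomalies` are all assumptions made for this example, and the normative pattern is taken as already known.

```python
from collections import Counter

# Hypothetical toy data: one entry per instance of the (already discovered)
# normative pattern, listing the one-edge extensions observed around it.
# Each extension is encoded as (edge_label, vertex_label); this is an assumption.
extensions_per_instance = [
    [("pays", "vendor")],
    [("pays", "vendor")],
    [("pays", "vendor")],
    [("pays", "vendor"), ("pays", "employee")],  # instance with an unusual extra edge
]


def extension_probabilities(instances):
    """Return each extension's share of all observed extensions."""
    counts = Counter(ext for exts in instances for ext in exts)
    total = sum(counts.values())
    return {ext: n / total for ext, n in counts.items()}


def rank_anomalies(instances):
    """Sort extensions from least to most likely; rare ones are anomaly candidates."""
    probs = extension_probabilities(instances)
    return sorted(probs.items(), key=lambda item: item[1])


ranked = rank_anomalies(extensions_per_instance)
for extension, probability in ranked:
    print(extension, round(probability, 2))
```

In this toy input the `("pays", "employee")` extension appears once among five observed extensions, so it is ranked first as the least likely. A real system would also need the substructure discovery step and a principled threshold, both of which are outside this sketch.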

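Bibliographic details

  • Author

    Eberle, William Fred

  • Author affiliation

    The University of Texas at Arlington

  • Degree-granting institution: The University of Texas at Arlington
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 165 p.
  • Total pages: 165
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification: Automation technology, computer technology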
