Efficient Density Clustering Method for Spatial Data

Abstract

Data mining for spatial data has become increasingly important as more and more organizations are exposed to spatial data from sources such as remote sensing, geographical information systems, astronomy, computer cartography, and environmental assessment and planning. Recently, density-based clustering methods such as DENCLUE, DBSCAN, and OPTICS have been published and recognized as powerful clustering methods for data mining. These approaches have a run-time complexity of O(n log n) when they use spatial index techniques such as the R+-tree and grid cells. However, these methods are known to lack scalability with respect to dimensionality. In this paper, a unique approach to efficient neighborhood search and a new efficient density-based clustering algorithm using EIN-rings are developed. Our approach exploits compressed vertical data structures, Peano Trees (P-trees), and fast P-tree logical operations to accelerate the calculation of the density function within EIN-rings. This approach stands in contrast to the ubiquitous approach of vertically scanning horizontal data structures (records). The average run-time complexity of our algorithm for d-dimensional spatial data is O(dn√n). For small and medium-sized data sets, our proposed method scales with cardinality comparably to other density methods, while offering superior speed and dimensional scalability.
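The abstract does not spell out how P-trees or EIN-rings are built, so the following is only a minimal sketch of the general idea of answering neighborhood queries with logical operations on vertically stored data. It assumes uncompressed bit slices held in Python integers (not actual P-trees), a max-norm ring of the form r1 < distance <= r2, and a hypothetical weight of 1/(i+1) for the i-th ring. All function names are illustrative and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def bit_slices(column: Sequence[int], nbits: int) -> List[int]:
    """Vertical layout: slice j is an int whose bit i equals bit j of column[i]."""
    slices = []
    for j in range(nbits):
        s = 0
        for i, v in enumerate(column):
            s |= ((v >> j) & 1) << i
        slices.append(s)
    return slices


def leq_mask(slices: List[int], c: int, n: int) -> int:
    """Bit mask of rows whose value is <= c, using only AND/OR/NOT on slices."""
    full = (1 << n) - 1
    if c < 0:
        return 0
    if c >= (1 << len(slices)):
        return full
    lt, eq = 0, full  # rows already known smaller / still equal on high bits
    for j in reversed(range(len(slices))):
        if (c >> j) & 1:
            lt |= eq & (full ^ slices[j])
            eq &= slices[j]
        else:
            eq &= full ^ slices[j]
    return lt | eq


def box_mask(all_slices: List[List[int]], center: Sequence[int], r: int, n: int) -> int:
    """Rows within max-norm distance r of center (AND of per-dimension intervals)."""
    full = (1 << n) - 1
    mask = full
    for slices, x in zip(all_slices, center):
        below_hi = leq_mask(slices, x + r, n)
        below_lo = leq_mask(slices, x - r - 1, n)
        mask &= below_hi & (full ^ below_lo)
    return mask


def ring_count(all_slices, center, r1: int, r2: int, n: int) -> int:
    """Number of rows y with r1 < max-norm distance(center, y) <= r2."""
    full = (1 << n) - 1
    outer = box_mask(all_slices, center, r2, n)
    inner = box_mask(all_slices, center, r1, n)
    return bin(outer & (full ^ inner)).count("1")


def ring_density(all_slices, center, eps: int, k: int, n: int) -> float:
    """Weighted sum of k ring counts; the 1/(i+1) weights are an assumption."""
    return sum(
        ring_count(all_slices, center, i * eps, (i + 1) * eps, n) / (i + 1)
        for i in range(k)
    )


# Toy data: six 2-D points with coordinates in 0..7 (3 bits per dimension).
points: List[Tuple[int, int]] = [(1, 1), (2, 1), (1, 3), (6, 6), (7, 6), (2, 2)]
n_points = len(points)
slices_by_dim = [bit_slices([p[d] for p in points], 3) for d in range(2)]

print(ring_count(slices_by_dim, (1, 1), 0, 1, n_points))    # 2
print(ring_density(slices_by_dim, (1, 1), 1, 3, n_points))  # 2.5
```

In this sketch every ring query touches one bit vector per bit position per dimension instead of visiting the records one by one, which is the contrast with record scanning that the abstract draws. The compression and tree structure that P-trees add are not modeled here.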
