《中国科学技术大学学报》 (Journal of University of Science and Technology of China)

A Distributed Parallel Clustering Method Based on Density Partitioning of Trajectory Data

Abstract

Advances in global positioning technology and location-based services have driven the growth of trajectory big data. Trajectory clustering is one of the most important trajectory analysis tasks and has been studied extensively. Most current clustering methods, however, run in single-processor mode, so processing large-scale trajectory data takes a long time and cannot meet the needs of time-sensitive trajectory analysis tasks. To address this, a distributed parallel clustering method based on density partitioning of trajectory data is proposed. First, the whole trajectory dataset is abstracted into a rectangular region, and the data are divided, through a transformation along the rectangle's longest dimension, into several partitions with roughly equal workloads, which form the local datasets for distributed parallel clustering. Each worker server then runs the DBSCAN clustering algorithm on its local partition, and the manager server merges and integrates the local clustering results. Experimental results confirm the effectiveness of the method, which improves the computational efficiency of clustering analysis to a certain extent.
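
The partitioning step can be illustrated with a minimal sketch. The abstract does not give the exact splitting rule, so the code below assumes one plausible reading: recursively bisect the points along the longest side of their bounding rectangle, placing each cut so that both sides hold a share of points proportional to the number of partitions they will contain. The function name `split_by_longest_dimension` and the 2-D tuple representation of points are hypothetical, chosen only for illustration; the local DBSCAN runs and the merge on the manager server are not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def split_by_longest_dimension(points: List[Point],
                               num_partitions: int) -> List[List[Point]]:
    """Assumed scheme: recursive bisection along the longest bounding-box side.

    Cuts are placed by point count, not by distance, so dense areas
    end up in spatially narrower partitions with similar workloads.
    """
    if num_partitions <= 1 or len(points) <= 1:
        return [points]

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Pick the axis along which the bounding rectangle is longest.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(points, key=lambda p: p[axis])

    # Each side receives points in proportion to the partitions it will hold.
    left_parts = num_partitions // 2
    right_parts = num_partitions - left_parts
    cut = len(ordered) * left_parts // num_partitions

    return (split_by_longest_dimension(ordered[:cut], left_parts)
            + split_by_longest_dimension(ordered[cut:], right_parts))


if __name__ == "__main__":
    # A dense cluster near x = 0 plus a sparse tail along x.
    data = [(i * 0.01, 0.0) for i in range(6)] + [(10.0, 0.0), (20.0, 0.0)]
    for part in split_by_longest_dimension(data, 4):
        print(len(part), part)
```

Because the cuts follow point counts, every partition in the example holds two points even though the dense cluster occupies a tiny fraction of the rectangle. A full pipeline would also need to handle points that lie within the DBSCAN neighbourhood radius of a cut, since clusters spanning a boundary must be reconciled during the merge; that handling is left out here.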
