Conference: 2012 7th International Conference on Computing and Convergence Technology

A partitioning technique for improving the performance of PageRank on Hadoop


Abstract

Large-scale graph analysis on Hadoop has produced many research results. The performance of Hadoop-based graph analysis is affected by data partitioning. How effective a partitioning scheme is depends on how well it maintains data locality on each node of the cluster, and this varies with the problem at hand. One partitioning approach known to be effective is partitioning data by domain. For instance, this technique can be very useful in areas such as web graph analysis. However, the improvement gained from this kind of partitioning is limited to specific problems. In this paper, we propose a data partitioning technique based on semi-clustering for analyzing web graphs with the PageRank algorithm on Hadoop. In our experiments, PageRank computation with our partitioning technique improves performance as the number of iterations increases. This method can be very effective for large-scale graph processing.
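
To make the locality argument concrete, the following is a minimal single-machine sketch in Python. It is not the semi-clustering technique proposed in the paper, which the abstract does not describe in detail. It only contrasts hashing whole URLs with hashing the host name (the domain-based partitioning mentioned above) and counts how many edges would cross partitions in each case. All names here (`hash_partition`, `domain_partition`, `cross_partition_edges`, `pagerank`, `EDGES`) and the toy URLs are assumptions made for illustration.

```python
import zlib
from collections import defaultdict
from urllib.parse import urlparse


def hash_partition(url, num_partitions):
    """Assign a page to a partition by hashing its full URL (illustrative)."""
    return zlib.crc32(url.encode("utf-8")) % num_partitions


def domain_partition(url, num_partitions):
    """Assign a page by hashing only its host, so one site stays together."""
    host = urlparse(url).netloc
    return zlib.crc32(host.encode("utf-8")) % num_partitions


def cross_partition_edges(edges, partition_fn, num_partitions):
    """Count links whose source and target land in different partitions."""
    return sum(
        1
        for src, dst in edges
        if partition_fn(src, num_partitions) != partition_fn(dst, num_partitions)
    )


def pagerank(edges, damping=0.85, iterations=20):
    """Plain iterative PageRank on an edge list (single machine, no Hadoop)."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        contrib = defaultdict(float)
        dangling = 0.0
        for n in nodes:
            targets = out_links.get(n)
            if targets:
                share = rank[n] / len(targets)
                for t in targets:
                    contrib[t] += share
            else:
                # Pages without out-links spread their rank over all pages.
                dangling += rank[n]
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * (contrib[n] + dangling / len(nodes))
            for n in nodes
        }
    return rank


# Hypothetical toy web graph: three intra-site links, one inter-site link.
EDGES = [
    ("http://a.example/1", "http://a.example/2"),
    ("http://a.example/2", "http://a.example/1"),
    ("http://b.example/x", "http://b.example/y"),
    ("http://b.example/y", "http://a.example/1"),
]
```

Under `domain_partition`, links within a site never cross partitions, so in this toy graph at most the single inter-site link can. In a MapReduce setting each crossing edge corresponds to a rank contribution that must be shuffled between nodes on every iteration, which is one plausible reason a locality-preserving partitioning pays off more as the iteration count grows.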
