首页> 中文期刊> 《中文信息学报》 >基于幂律分布的网络用户快速排序算法

基于幂律分布的网络用户快速排序算法

         

摘要

随着网络论坛、博客、微博的发展,引出社会网络中的用户排序问题.将在线网络论坛中用户映射为节点,用户评论过程中形成的回复关系映射为有向关联图,其节点度符合幂律分布.且论坛中用户的主题发布行为和回复关系符合Pagerank算法的互增强和随机游走特性,因此选用Pagerank算法排序用户影响力.该文提出的研究问题:如何提高用户排序应用中数据的存储和运行效率.天涯网络论坛中80%以上用户入度为0,据此,根据入度是否为0划分为两个集合,对入度为0集合按出度构造链接表,设计了基于集合划分的高效排序算法SD-Rank.SD-Rank时空复杂性为O(V'),V'为入度非0节点集.对天涯网络论坛真实用户数据的实验结果表明:SD-Rank算法时空复杂性优于Pagerank算法.%With the wide application of BBS, blog and micro blog etc, how to rank the user becomes a well-recognized research issue,especially in the social network area. The paper analyzes the users relational graph in network BBS, representing the correlation graph by users and replies between users, and reveals its power law distribution in in/out degree. Owing the fact that the user's behavior of posting and replying is in accordance with Pagerank's characteristics of random walking and mutual-enforcement, the user's influence is ranked with Pagerank algorithm. This paper further addresses the issue of time-space ratio in computation resulted by the exponent growth of user's number. Based on the observation that over 80 percent user's indegree is 0. Using list structure designs the efficient set-division-ranking arithmetic (SD-Rank) is designed by via list structure after dividing users into two sets: 0-indegree in setO and non-0-indegree in setl. Through set partitions according to degree distribution, the time-space complexity of SD-Rank is decreased from CXV+E) to O(V'), in which V' is the size of setl. Experiment on TIANYA BBS dataset shows that SD-Rank is more efficient than Pagerank.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号