
A Teapot Graph and Its Hierarchical Structure of the Chinese Web



Abstract

The shape of the Web, in terms of its graph structure, has long been a topic of wide interest. Two graphs, the Bow Tie and the Daisy, have stood out in previous research. In this work, we take a different approach, viewing the Web as a hierarchy of three levels: page level, host level, and domain level. These structures are analyzed and compared on a snapshot of the Chinese Web from early 2006, involving 830 million pages, 17 million hosts, and 0.8 million domains. Some interesting results have emerged. For example, at the page level the Chinese Web looks more like a teapot (a large SCC, a medium-sized IN, and a small OUT) than the classic bow tie or daisy shape. Some puzzling phenomena are also observed. For example, the IN components become much smaller than the OUT components at the host and domain levels. Future work will tackle these puzzles.
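To make the SCC, IN, and OUT terminology concrete, the following is a minimal sketch of a bow-tie style decomposition on a toy link graph, together with one way to collapse page-level links to host-level links. It is an illustration under stated assumptions, not the authors' method: the function names `reachable`, `bow_tie`, and `collapse_to_hosts` are hypothetical, pages are assumed to be identified by full URLs, and the reachability-intersection approach is only practical for small graphs, not for a crawl of hundreds of millions of pages. Domain-level collapsing is omitted because it would need a registered-domain lookup that the abstract does not describe.

```python
from collections import deque
from urllib.parse import urlsplit


def reachable(start, adj):
    """Return every node reachable from `start` by following `adj` (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(edges):
    """Split a directed graph into SCC, IN, OUT, and OTHER node sets.

    SCC is the largest strongly connected component, IN holds nodes that
    can reach the SCC, OUT holds nodes reachable from the SCC, and OTHER
    is everything else. Illustrative only: quadratic in the worst case.
    """
    fwd, rev, nodes = {}, {}, set()
    for src, dst in edges:
        fwd.setdefault(src, set()).add(dst)
        rev.setdefault(dst, set()).add(src)
        nodes.update((src, dst))

    # A node's strongly connected component is the intersection of what it
    # can reach and what can reach it. Sorting keeps tie-breaking stable.
    core, assigned = set(), set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        comp = reachable(node, fwd) & reachable(node, rev)
        assigned |= comp
        if len(comp) > len(core):
            core = comp

    if not core:
        return {"SCC": set(), "IN": set(), "OUT": set(), "OTHER": set()}

    seed = next(iter(core))
    out_part = reachable(seed, fwd) - core
    in_part = reachable(seed, rev) - core
    other = nodes - core - in_part - out_part
    return {"SCC": core, "IN": in_part, "OUT": out_part, "OTHER": other}


def collapse_to_hosts(page_edges):
    """Map page-to-page links to host-to-host links, dropping self-links."""
    host_edges = set()
    for src, dst in page_edges:
        src_host = urlsplit(src).hostname
        dst_host = urlsplit(dst).hostname
        if src_host and dst_host and src_host != dst_host:
            host_edges.add((src_host, dst_host))
    return host_edges
```

Calling `bow_tie` on a list of `(source_url, target_url)` pairs gives the page-level split, and calling `bow_tie(collapse_to_hosts(pairs))` gives the host-level split. Comparing the relative sizes of the returned sets at each level is the kind of comparison the abstract describes.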
