
The multi-tree cubing algorithm for computing iceberg cubes.



Abstract

The computation of data cubes is one of the major operations in on-line analytical processing (OLAP), but it is also one of the most expensive. To improve efficiency, an iceberg cube represents only the cells whose aggregate value is above a given threshold (the minimum support). Two existing approaches, top-down and bottom-up, are used to compute the iceberg cube for a data set, but both have performance limitations.

In this thesis, a new algorithm, called Multi-Tree Cubing (MTC), is proposed and implemented for computing an iceberg cube. Overall, the Multi-Tree Cubing algorithm is a top-down approach, so it features shared computation. By processing the orderings in the opposite order from that of an existing top-down approach called Top-Down Computation, the MTC algorithm is able to prune attributes at the partition level, which is more accurate than the pruning in a bottom-up approach called Bottom-Up Computation (BUC). The MTC algorithm is based on a specialized type of prefix tree data structure, called an AP-tree, which facilitates fast sorting and Apriori-like pruning. Once a tree is constructed, MTC traverses the tree in a depth-first manner and checks each partition against the minimum support to determine whether it should output a result or prune the current branch. The results of five series of experiments comparing MTC with BUC on real and synthetic databases are presented. The experimental results show that MTC performed the same as or better than BUC in all five series of experiments.
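The traversal step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's AP-tree or the MTC algorithm itself: it uses a plain count-annotated prefix tree built for a single attribute ordering, and the names `Node`, `build_tree`, and `iceberg_cells` are hypothetical. It also assumes that the aggregate is a tuple count and that a cell qualifies when its count is at least `min_support`.

```python
class Node:
    """Prefix-tree node holding the number of rows that pass through it."""

    def __init__(self):
        self.count = 0
        self.children = {}  # attribute value -> child Node


def build_tree(rows, order):
    """Insert every row into a prefix tree, following the attribute ordering `order`."""
    root = Node()
    for row in rows:
        root.count += 1
        node = root
        for dim in order:
            node = node.children.setdefault(row[dim], Node())
            node.count += 1
    return root


def iceberg_cells(root, order, min_support):
    """Depth-first walk that outputs qualifying cells and prunes the rest.

    Assumption: count aggregate, cell kept when count >= min_support.
    """
    results = {}

    def visit(node, prefix):
        if node.count < min_support:
            # Apriori-like pruning: a child partition is never larger than
            # its parent, so nothing below this node can qualify.
            return
        results[prefix] = node.count
        depth = len(prefix)
        for value, child in node.children.items():
            visit(child, prefix + ((order[depth], value),))

    visit(root, ())
    return results


if __name__ == "__main__":
    rows = [("a1", "b1"), ("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
    order = (0, 1)  # attribute 0 first, then attribute 1
    for cell, count in iceberg_cells(build_tree(rows, order), order, 2).items():
        print(cell, count)
```

With a minimum support of 2, the demo keeps three cells (the empty group-by, `a1`, and `a1, b1`) and prunes the `a2` branch without visiting its children. A single ordering only covers the group-bys that are prefixes of that ordering; a full iceberg cube needs further orderings, and how MTC chooses and shares work across them is not specified in the abstract.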

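Bibliographic record

  • Author: Li, Xing
  • Author affiliation: The University of Regina (Canada)
  • Degree-granting institution: The University of Regina (Canada)
  • Subject: Computer Science
  • Degree: M.Sc.
  • Year: 2005
  • Pages: 72 p.
  • Total pages: 72
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology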
