Journal: IEEE Transactions on Knowledge and Data Engineering

Fast algorithms for frequent itemset mining using FP-trees


Abstract

Efficient algorithms for mining frequent itemsets are crucial for mining association rules as well as for many other data mining tasks. Methods for mining frequent itemsets have been implemented using a prefix-tree structure, known as an FP-tree, for storing compressed information about frequent itemsets. Numerous experimental results have demonstrated that these algorithms perform extremely well. In this paper, we present a novel FP-array technique that greatly reduces the need to traverse FP-trees, thus obtaining significantly improved performance for FP-tree-based algorithms. Our technique works especially well for sparse data sets. Furthermore, we present new algorithms for mining all, maximal, and closed frequent itemsets. Our algorithms use the FP-tree data structure in combination with the FP-array technique efficiently and incorporate various optimization techniques. We also present experimental results comparing our methods with existing algorithms. The results show that our methods are the fastest for many cases. Even though the algorithms consume much memory when the data sets are sparse, they are still the fastest ones when the minimum support is low. Moreover, they are always among the fastest algorithms and consume less memory than other methods when the data sets are dense.
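As a rough illustration of the prefix-tree structure the abstract refers to, the following Python sketch builds a basic FP-tree in two scans of a small transaction list. It is not the authors' implementation and does not include the FP-array technique. The names `FPNode`, `build_fp_tree`, `item_support`, and the absolute-count parameter `min_support` are assumptions made for this example, as is the sample data.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, a count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support):
    """Build an FP-tree and a header table from a list of transactions.

    min_support is an absolute transaction count (hypothetical parameter).
    """
    # First scan: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode(None)
    header = {item: [] for item in frequent}  # item -> nodes carrying it

    # Second scan: insert each transaction's frequent items,
    # most frequent first, so that common prefixes share nodes.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


def item_support(header, item):
    """Support of a single item: the sum of counts over its nodes."""
    return sum(n.count for n in header.get(item, []))


# Made-up sample data for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
]
root, header = build_fp_tree(transactions, min_support=2)
```

In this sketch the infrequent item `d` is dropped after the first scan, and the three transactions that contain `a` share a single `a` node under the root. That sharing is the compression the abstract attributes to the FP-tree.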
