Efficient weighted probabilistic frequent itemset mining in uncertain databases

Li Zhiyang; Chen Fengjuan; Wu Junfeng; Liu Zhaobin; Liu Weijiang

Abstract

Uncertain data mining has attracted considerable interest in many emerging applications over the past decade. An issue of particular interest is discovering the frequent itemsets in uncertain databases. Because the presence of an item in a transaction of such a database is not certain, several probability models have been proposed to measure the frequency of an itemset, and a frequent itemset over probabilistic data generally has one of two definitions: the expected support-based frequent itemset and the probabilistic frequent itemset. Frequency alone, however, cannot identify useful or meaningful patterns in some scenarios; other measures, such as the importance of items, should also be taken into account. To this end, some recent studies have addressed weighted (importance-aware) frequent itemset mining in uncertain databases. However, they are designed only for the expected support-based frequent itemset, and they suffer from low efficiency because they generate too many frequent itemset candidates. To address this issue, we propose a novel algorithm for mining weighted probabilistic frequent itemsets (w-PFIs). We derive a probability model for the support of a w-PFI candidate and present three pruning techniques that narrow the search space and remove unpromising candidates immediately. Extensive experiments on both real and synthetic datasets evaluate the performance of our w-PFI algorithm in terms of runtime, accuracy and scalability. The results show that our algorithm yields the best performance among the existing algorithms.
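
As a rough illustration of the two frequency definitions named in the abstract, the Python sketch below computes the expected support of an itemset and the probability that its support reaches a minimum threshold. It is not the authors' algorithm. It assumes that items occur independently, so the support follows a Poisson binomial distribution, and it assumes that the weight of an itemset is the mean of its item weights; the abstract does not state either definition. All names (`itemset_probs`, `frequentness_probability`, `weighted_frequentness`, `msup`) and the toy data are hypothetical.

```python
from math import prod

def itemset_probs(db, itemset):
    """Per-transaction probability that every item of `itemset` is present.
    Assumption: items occur independently within a transaction."""
    return [prod(t.get(i, 0.0) for i in itemset) for t in db]

def expected_support(probs):
    """Expected support: the sum of the per-transaction probabilities."""
    return sum(probs)

def frequentness_probability(probs, msup):
    """Pr(support >= msup), via dynamic programming over transactions.
    dist[k] holds Pr(support == k) for the transactions seen so far."""
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1 - p)      # itemset absent from this transaction
            nxt[k + 1] += q * p        # itemset present in this transaction
        dist = nxt
    return sum(dist[msup:])

def weighted_frequentness(db, itemset, weights, msup):
    """Assumed weighting: mean item weight times Pr(support >= msup)."""
    w = sum(weights[i] for i in itemset) / len(itemset)
    return w * frequentness_probability(itemset_probs(db, itemset), msup)

# Toy uncertain database: each transaction maps an item to its existence probability.
db = [
    {"a": 0.5, "b": 0.5},
    {"a": 1.0, "b": 0.5},
    {"a": 0.5},
]
weights = {"a": 0.75, "b": 0.25}  # hypothetical item importances

pa = itemset_probs(db, ["a"])
print(expected_support(pa))                            # 2.0
print(frequentness_probability(pa, 2))                 # 0.75
print(weighted_frequentness(db, ["a", "b"], weights, 1))  # 0.3125
```

In this toy example the itemset {a} has expected support 2.0 but reaches a support of 2 with probability only 0.75, which shows why the two definitions can disagree about whether an itemset is frequent.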

Bibliographic details

Source
Expert Systems | 2021, Issue 5 | e12551.1-e12551.17 | 17 pages
Authors
Li Zhiyang; Chen Fengjuan; Wu Junfeng; Liu Zhaobin; Liu Weijiang
Author affiliations

Dalian Maritime Univ Informat Sci & Technol Coll Dalian Peoples R China;

Dalian Maritime Univ Informat Sci & Technol Coll Dalian Peoples R China;

Dalian Ocean Univ Sch Informat Engn Dalian Peoples R China;

Dalian Maritime Univ Informat Sci & Technol Coll Dalian Peoples R China;

Dalian Maritime Univ Informat Sci & Technol Coll Dalian Peoples R China;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
probability model; pruning; uncertain database; weighted probabilistic frequent itemset

Similar literature

1. An Efficient Method for Mining Frequent Weighted Closed Itemsets from Weighted Item Transaction Databases [J]. Bay Vo. Journal of Information Recording. 2017, Issue 1
2. Probabilistic maximal frequent itemset mining methods over uncertain databases [J]. Li Haifeng, Hai Mo, Zhang Ning. Intelligent Data Analysis. 2019, Issue 6
3. Mining Weighted Frequent Itemsets without Candidate Generation in Uncertain Databases [J]. Lin Jerry Chun-Wei, Gan Wensheng, Fournier-Viger Philippe. International Journal of Information Technology & Decision Making. 2017, Issue 6
4. Efficient Mining of Weighted Frequent Itemsets in Uncertain Databases [C]. Jerry Chun-Wei Lin, Wensheng Gan, Philippe Fournier-Viger. International Conference on Machine Learning and Data Mining. 2016
5. New algorithms for frequent sequential pattern and itemset data mining in certain and uncertain databases [D]. Peterson, Erich Allen. 2012
6. An efficient pattern growth approach for mining fault tolerant frequent itemsets [O]. Shariq Bashir
7. Probabilistic Frequent Pattern Growth for Itemset Mining in Uncertain Databases [O]. Thomas Bernecker, Hans-Peter Kriegel, Matthias Renz. 2012
