Techniques for Missing Value Recovering in Imbalanced Databases: Application in a Marketing Database with Massive Missing Data

Abstract

Missing data in databases is considered one of the biggest problems faced in data mining applications. The problem can be aggravated when massive missing data occurs in imbalanced databases. Several techniques, such as imputation, classifiers, and pattern approximation, have been proposed and compared, but these comparisons do not consider the adverse conditions found in real databases. This work presents a comparison of techniques used to classify records from a real imbalanced database with massive missing data, where the main objective is to pre-process the database to recover and select completely filled records for subsequent application of the techniques. The algorithms compared include clustering, decision trees, artificial neural networks, and a Bayesian classifier. The results show that problem characterization and database understanding are essential steps for a correct comparison of techniques on a real problem. Artificial neural networks were observed to be an interesting alternative for this kind of problem, since they are capable of obtaining satisfactory results even when dealing with real-world problems.
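
The pre-processing idea described in the abstract (selecting completely filled records and checking class balance before any technique is applied) can be illustrated with a minimal sketch. This is not the authors' procedure: the toy records, the field names (`age`, `income`, `region`, `label`), and the helper functions are all hypothetical, and `None` is assumed to mark a missing value.

```python
from collections import Counter

# Hypothetical toy records (not from the paper); None marks a missing value.
records = [
    {"age": 34, "income": 52000, "region": "south", "label": "buyer"},
    {"age": None, "income": 48000, "region": "north", "label": "non_buyer"},
    {"age": 45, "income": None, "region": None, "label": "non_buyer"},
    {"age": 29, "income": 61000, "region": "east", "label": "non_buyer"},
    {"age": 51, "income": 75000, "region": "west", "label": "non_buyer"},
]


def missing_fraction(rows):
    """Return the fraction of all cells that are missing (None)."""
    total = sum(len(r) for r in rows)
    missing = sum(v is None for r in rows for v in r.values())
    return missing / total if total else 0.0


def complete_records(rows):
    """Keep only the records in which every field is filled."""
    return [r for r in rows if all(v is not None for v in r.values())]


def class_distribution(rows, target="label"):
    """Count records per class so that imbalance is visible."""
    return Counter(r[target] for r in rows)


complete = complete_records(records)
print("missing fraction:", missing_fraction(records))
print("complete records:", len(complete), "of", len(records))
print("class distribution:", dict(class_distribution(complete)))
```

Inspecting the class distribution after filtering matters because dropping incomplete records can change how imbalanced the remaining data is, which in turn affects any later comparison of classifiers.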
