首页> 外文学位 >Data mining and knowledge discovery: A guided approach based on monotone Boolean functions.
【24h】

Data mining and knowledge discovery: A guided approach based on monotone Boolean functions.

机译:数据挖掘和知识发现:一种基于单调布尔函数的指导方法。

获取原文
获取原文并翻译 | 示例

摘要

This dissertation deals with an important problem in Data Mining and Knowledge Discovery (DM & KD), and Information Technology (IT) in general. It addresses the problem of efficiently learning monotone Boolean functions via membership queries to oracles. The monotone Boolean function can be thought of as a phenomenon, such as breast cancer or a computer crash, together with a set of predictor variables. The oracle can be thought of as an entity that knows the underlying monotone Boolean function, and provides a Boolean response to each query. In practice, it may take the shape of a human expert, or it may be the outcome of performing tasks such as running experiments or searching large databases.; Monotone Boolean functions have a general knowledge representation power and are inherently frequent in applications. A key goal of this dissertation is to demonstrate the wide spectrum of important real-life applications that can be analyzed by using the new proposed computational approaches. The applications of breast cancer diagnosis, computer crashing, college acceptance policies, and record linkage in databases are here used to demonstrate this point and illustrate the algorithmic details. Monotone Boolean functions have the added benefit of being intuitive. This property is perhaps the most important in learning environments, especially when human interaction is involved, since people tend to make better use of knowledge they can easily interpret, understand, validate, and remember.; The main goal of this dissertation is to design new algorithms that can minimize the average number of queries used to completely reconstruct monotone Boolean functions defined on a finite set of vectors V = {lcub}0,1{rcub}n. The optimal query selections are found via a recursive algorithm in exponential time (in the size of V). The optimality conditions are then summarized in the simple form of evaluative criteria, which are near optimal and only take polynomial time to compute. Extensive unbiased empirical results show that the evaluative criterion approach is far superior to any of the existing methods. In fact, the reduction in average number of queries increases exponentially with the number of variables n, and faster than exponentially with the oracle's error rate.
机译:本文主要涉及数据挖掘和知识发现(DM&KD)以及信息技术(IT)中的一个重要问题。它通过对oracle的成员资格查询解决了有效学习单调布尔函数的问题。单调布尔函数可以与一组预测变量一起被认为是一种现象,例如乳腺癌或计算机崩溃。可以将oracle视为知道基础单调布尔函数并为每个查询提供布尔响应的实体。在实践中,它可能会像人类专家一样,或者可能是执行诸如运行实验或搜索大型数据库之类的任务的结果。单调布尔函数具有一般知识的表示能力,并且在应用程序中固有地很频繁。本文的主要目标是证明可以通过使用新提出的计算方法进行分析的重要的现实生活应用的广泛范围。乳腺癌诊断,计算机崩溃,大学录取政策以及数据库中的记录链接的应用在这里用于证明这一点并说明算法细节。单调布尔函数具有直观的附加优点。该属性可能是学习环境中最重要的属性,尤其是在涉及人机交互时,因为人们倾向于更好地利用知识,因此他们可以轻松地解释,理解,验证和记忆。本文的主要目的是设计一种新算法,该算法可以最大程度地减少用于完全重构有限矢量集 V = {lcub} 0,1 {rcub上定义的单调布尔函数的平均查询数量。 } n 。最佳查询选择是通过递归算法以指数时间(以 V 的大小)找到的。然后,以评估标准的简单形式总结最佳条件,这些条件接近最佳且仅需多项式时间即可计算。大量的无偏经验结果表明,评估标准方法远远优于任何现有方法。实际上,平均查询数量的减少随着变量 n 的数量呈指数增长,并且比使用oracle的错误率呈指数增长更快。

著录项

  • 作者

    Torvik, Vetle Ingvald.;

  • 作者单位

    Louisiana State University and Agricultural & Mechanical College.;

  • 授予单位 Louisiana State University and Agricultural & Mechanical College.;
  • 学科 Operations Research.; Statistics.; Computer Science.
  • 学位 Ph.D.
  • 年度 2002
  • 页码 102 p.
  • 总页数 102
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类 运筹学;统计学;自动化技术、计算机技术;
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号