Efficient case-based reasoning through feature weighting, and its application in protein crystallography.

Abstract

Data preprocessing is critical for machine learning, data mining, and pattern recognition. In particular, selecting relevant and non-redundant features in high-dimensional data is important for efficiently constructing models that accurately describe the data. In this work, I present SLIDER, an algorithm that weights features to reflect their relevance in determining similarity between instances. Accurate weighting of features improves the similarity measure, which is useful in learning algorithms such as nearest neighbor and case-based reasoning. SLIDER performs a greedy search for optimum weights in an exponentially large space of weight vectors. Because exhaustive search is intractable, the algorithm reduces the search space by focusing on pivotal weights at which representative instances are equidistant to truly similar and different instances in Euclidean space. SLIDER then evaluates those weights heuristically, based on their effectiveness in properly ranking pre-determined matches of a set of cases relative to mismatches.

I analytically show that choosing feature weights that minimize the mean rank of matches relative to mismatches increases the separation between the distributions of Euclidean distances for matches and mismatches. This leads to a better distance metric, and consequently increases the probability of retrieving true matches from a database. I also discuss how SLIDER is used to improve the efficiency and effectiveness of case retrieval in a case-based reasoning system that automatically interprets electron density maps to determine the three-dimensional structures of proteins. Electron density patterns for regions in a protein are represented by numerical features, which are used in a distance metric to efficiently retrieve matching patterns by searching a large database. These pre-selected cases are then evaluated by more expensive methods to identify truly good matches. This strategy speeds up the retrieval of matching density regions, thereby enabling fast and accurate protein model-building. This two-phase case retrieval approach is potentially useful in many case-based reasoning systems, especially those with computationally expensive case matching and large case libraries.
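The ideas in the abstract (a weighted Euclidean distance, the mean rank of known matches as a quality measure for a weight vector, and two-phase retrieval) can be illustrated with a minimal Python sketch. This is not SLIDER itself: the pivotal-weight search is not reproduced, and all function names, the toy data, and the `expensive_score` callback are assumptions made for illustration only.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def mean_match_rank(queries, library, matches, weights):
    """Average rank (1 = nearest) of each query's known match in the library.

    matches[i] is the library index of the pre-determined match for query i;
    every other library case is treated as a mismatch (an assumption).
    """
    total = 0
    for query, match_idx in zip(queries, matches):
        d_match = weighted_distance(query, library[match_idx], weights)
        # Rank = 1 + number of mismatches strictly closer than the true match.
        closer = sum(
            1
            for j, case in enumerate(library)
            if j != match_idx
            and weighted_distance(query, case, weights) < d_match
        )
        total += 1 + closer
    return total / len(queries)

def two_phase_retrieve(query, library, weights, k, expensive_score):
    """Phase 1: cheap weighted-distance shortlist of k cases.
    Phase 2: re-rank the shortlist with a costlier scoring function."""
    shortlist = sorted(
        range(len(library)),
        key=lambda j: weighted_distance(query, library[j], weights),
    )[:k]
    return max(shortlist, key=lambda j: expensive_score(query, library[j]))

# Hypothetical toy data: feature 0 is relevant, feature 1 is noise.
library = [(0.0, 5.0), (1.0, 0.0), (2.0, 9.0)]
queries = [(0.1, 0.0), (1.9, 0.5)]
matches = [0, 2]

uniform_rank = mean_match_rank(queries, library, matches, (1.0, 1.0))
tuned_rank = mean_match_rank(queries, library, matches, (1.0, 0.0))

# Stand-in for an expensive comparison (hypothetical scoring function).
best = two_phase_retrieve(
    queries[0], library, (1.0, 0.0), k=2,
    expensive_score=lambda q, c: -abs(q[0] - c[0]),
)
```

On this toy data, zeroing the weight of the noisy feature lowers the mean rank of the true matches from 2.5 to 1.0, which is the kind of improvement a weight search would aim for. The shortlist size `k` trades retrieval speed against the risk of discarding a true match before the expensive phase.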

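Bibliographic record

Author: Gopal, Kreshna
Author affiliation: Texas A&M University
Degree-granting institution: Texas A&M University
Subjects: Biology, Bioinformatics; Artificial Intelligence; Computer Science
Degree: Ph.D.
Year: 2007
Pages: 137
Original format: PDF
Language: English
CLC classification: Artificial intelligence theory; Automation technology, computer technology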

Similar literature

1. A learning algorithm for feature weighting in case-based reasoning [J]. Maryam Ghadami, Seyed-Mohammad Seyedhosseini, Ahmad Makui. International Journal of Knowledge Engineering and Data Mining. 2012, No. 2a3
2. A personalized counseling system using case-based reasoning with neural symbolic feature weighting (CANSY) [J]. Kwang Hyuk Im, Sung Ho Ha. Applied Intelligence. 2008, No. 3
3. A personalized counseling system using case-based reasoning with neural symbolic feature weighting (CANSY) [J]. Sungho Ha. Applied Intelligence. 2008, No. 3
4. Twin-Systems to Explain Artificial Neural Networks using Case-Based Reasoning: Comparative Tests of Feature-Weighting Methods in ANN-CBR Twins for XAI [C]. Eoin M. Kenny, Mark T. Keane. International Joint Conference on Artificial Intelligence. 2020
5. New chemical methods for the synthesis of proteins and their application to the elucidation of protein structure by racemic protein crystallography [D]. Pentelute, Brad L. 2008
6. An efficient heuristic method for active feature acquisition and its application to protein-protein interaction prediction [O]. Mohamed Thahir, Tarun Sharma, Madhavi K Ganapathiraju. 2012
7. Hepatitis Diagnosis Using Case-Based Reasoning with Gradient Descent as Feature Weighting Method [O]. Yufika Sari Bagi, Suprapto Suprapto. 2018
8. Spatio-Temporal Case-Based Reasoning for Efficient Reactive Robot Navigation [R]. Likhachev, M., Kaess, M., Kira, Z. 2005