A Novel Clustering Method Suitable for Clustering of Biological Signal Datasets Containing Batched Outliers

Selahaddin B. Akben

Abstract

During clustering analyses, batched outliers of one class that fall close to another class can be a significant problem. Such outliers may be absorbed into the wrong class or cause spurious classes to be identified, either of which can lead to false localization of the cluster centers. Here we propose a novel method for accurate classification of outliers in batched clustering analyses, aimed specifically at the type of outliers most often encountered in biological signals. The recommended divisive hierarchical clustering method is based on how much each element in the dataset is unwanted by the other elements. In this method, the reluctance vectors applied to each element by the other elements are determined first. According to the common features of the reluctance vectors (horizontal and vertical components), two initial classes are obtained from some of the elements. All remaining elements are then assigned to these classes according to their proximity to them. Then, using the reluctance vectors developed between the two established classes, classes that might be re-divided are identified, and further classes are constituted using the same splitting method. To validate this approach, which we named the selfish data clustering (SDC) method, a real dataset was analyzed using the SDC method and other commonly applied clustering methods. We found that our clustering method outperformed the conventional approaches by up to 12% (6% on average) in datasets with low silhouette values.
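
The abstract reports its gains on datasets with low silhouette values. As a rough illustration of what that measure captures, the sketch below computes the standard mean silhouette value in plain Python and shows how a single point of one class lying next to another class lowers it. The function name `silhouette_score`, the Euclidean distance, and the toy coordinates are assumptions made for illustration; none of this is taken from the paper, and it is not the SDC method itself.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette value over all points, using Euclidean distance.

    Illustrative helper (not from the paper): for each point, a is the mean
    distance to the other members of its own cluster and b is the smallest
    mean distance to the members of any other cluster; s = (b - a) / max(a, b).
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            continue  # singleton cluster: s(i) is taken as 0
        a = mean_dist(i, own)
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != label)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical toy data: two well-separated classes.
clean_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
clean_labels = [0, 0, 1, 1]

# Same data plus one class-1 point sitting next to class 0.
noisy_points = clean_points + [(1.0, 0.0)]
noisy_labels = clean_labels + [1]

clean_score = silhouette_score(clean_points, clean_labels)
noisy_score = silhouette_score(noisy_points, noisy_labels)
print(clean_score, noisy_score)
```

On this toy data the second score comes out lower than the first, which is the kind of low-silhouette situation the abstract refers to.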

Bibliographic details

Source: Kuwait Journal of Science, 2017, No. 4
Author: Selahaddin B. Akben
Original format: PDF
Chinese Library Classification: Science and natural philosophy
Keywords: batched outliers; clustering; data mining; force fields; sparse data
