Kuwait Journal of Science

A Novel Clustering Method Suitable for Clustering of Biological Signal Datasets Containing Batched Outliers

Abstract

During clustering analyses, batched outliers of one class that fall close to another class can be a significant problem. Such outliers may be assigned to the wrong class or lead to the identification of spurious classes, which in turn can lead to false localization of the cluster centers. Here we propose a novel method for accurate classification of outliers in batched clustering analyses, aimed specifically at the type of outliers most often encountered in biological signals. The proposed divisive hierarchical clustering method is based on how much each element in the dataset is unwanted by the other elements. In this method, the reluctance vectors applied to each element by the other elements are first determined. According to the common features of the reluctance vectors (horizontal and vertical components), two initial classes are obtained from a subset of the elements. All remaining elements are then assigned to these classes according to their proximity to them. Then, using the reluctance vectors developed between the two established classes, classes that might be divided again are identified, and further classes are formed using the same splitting method. To validate this approach, which we named the selfish data clustering (SDC) method, a real dataset was analyzed using the SDC method and other commonly applied clustering methods. We found that our clustering method outperformed the conventional approaches by up to 12% (6% on average) in datasets with low silhouette values.
机译:在聚类分析中,一个类别的批次异常值接近另一个类别的实例可能是一个重大问题。这样的离群值可能被合并到一个错误的类中,或者导致对虚幻的类的错误识别,这可能导致聚类中心的错误定位。在这里,我们提出了一种新的方法,用于在批量聚类分析中对异常值进行准确分类,特别针对生物学信号中最常遇到的异常值类型。推荐的划分层次聚类方法基于数​​据集中每个元素被其他元素所不希望的数量。在这种方法中,首先确定由其他元素施加到每个元素的磁阻矢量。根据磁阻矢量的共同特征(水平和垂直分量),可以从某些元素中获得两个初始类别。然后,根据所有剩余元素与这些类的接近程度将它们包括在类中。然后,使用在两个已建立类别之间开发的磁阻矢量,识别可能被重新划分的类别,并使用相同的拆分方法构成其他类别。为了验证这种称为自私数据聚类(SDC)方法的方法,使用SDC方法和其他常用聚类方法对面数据集进行了分析。我们发现,在低轮廓值的数据集中,我们的聚类方法比常规方法高出12%(平均为6%)。
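The abstract reports its gains on datasets with low silhouette values. As a reminder of what that quantity measures, the following is a minimal sketch of the standard mean silhouette coefficient in plain Python. It does not implement the SDC method or its reluctance vectors, which the abstract does not specify. The function name `mean_silhouette` and the two toy datasets are assumptions made only for illustration.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient of labelled points (Euclidean distance).

    Illustrative helper, not taken from the paper.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated classes.
clean_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
clean_labels = [0, 0, 1, 1]

# Same classes, plus a batch of class-1 elements lying close to class 0.
batched_points = clean_points + [(2, 0), (2, 1)]
batched_labels = clean_labels + [1, 1]

clean_score = mean_silhouette(clean_points, clean_labels)
batched_score = mean_silhouette(batched_points, batched_labels)
print(clean_score > batched_score)
```

In this toy setup the batch of class-1 points near class 0 pulls the mean silhouette down, which is the kind of low-silhouette situation the abstract describes.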
