CLAG: an unsupervised non hierarchical clustering algorithm handling biological data

Linda Dib; Alessandra Carbone

Abstract

Background: Searching for similarities in a set of biological data is intrinsically difficult because some data points should not be clustered at all, while others should belong to several clusters. Under these hypotheses, hierarchical agglomerative clustering is not appropriate. Moreover, if too little is known about the dataset, as is often the case, supervised classification is not appropriate either.

Results: CLAG (for CLusters AGgregation) is an unsupervised, non-hierarchical clustering algorithm designed to cluster a large variety of biological data and to provide a clustered matrix and numerical values indicating cluster strength. CLAG clusters correlation matrices for residues in protein families, gene-expression and miRNA data related to various cancer types, sets of species described by multidimensional vectors of characters, and binary matrices. It does not require all data points to cluster, and it converges, yielding the same result at each run. Its simplicity and speed allow it to run on reasonably large datasets.

Conclusions: CLAG can be used to investigate the cluster structure present in biological datasets and to identify its underlying graph. It proved more informative and accurate than several known clustering methods, such as hierarchical agglomerative clustering, k-means, fuzzy c-means, model-based clustering, and affinity propagation clustering, and it does not suffer from the convergence problem of the latter.
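
The abstract does not describe how CLAG works internally, so the following Python sketch is not the CLAG algorithm. It is a minimal, hypothetical illustration of three properties the abstract names: some points may stay unclustered, a point may belong to several groups, and the result is the same at each run. The function name `threshold_groups`, the similarity threshold, and the toy matrix are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def threshold_groups(
    sim: List[List[float]], threshold: float
) -> Tuple[List[Tuple[int, ...]], List[int]]:
    """Group points by thresholded similarity (illustrative only, NOT CLAG).

    Each point proposes one group: itself plus every point whose similarity
    to it is at least `threshold`. Groups of size 1 are discarded, so a
    point with no close neighbor stays unclustered.
    """
    n = len(sim)
    groups: Set[Tuple[int, ...]] = set()
    for i in range(n):
        members = tuple(j for j in range(n) if j == i or sim[i][j] >= threshold)
        if len(members) >= 2:
            groups.add(members)  # the set removes identical proposals

    clustered = {j for group in groups for j in group}
    unclustered = [i for i in range(n) if i not in clustered]
    # Sorting makes the output order independent of set iteration order.
    return sorted(groups), unclustered


# Hypothetical symmetric similarity matrix for five points.
SIM = [
    [1.0, 0.9, 0.1, 0.0, 0.0],
    [0.9, 1.0, 0.85, 0.0, 0.0],
    [0.1, 0.85, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, 0.0, 0.2, 1.0],
]

groups, unclustered = threshold_groups(SIM, threshold=0.8)
print("groups:", groups)
print("unclustered:", unclustered)
```

On this toy matrix, point 1 appears in every group, while points 3 and 4 join none. Because the procedure involves no random initialization, repeated runs return the same output.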


Bibliographic details

Source: BMC Bioinformatics, 2012, issue 1
Authors: Linda Dib; Alessandra Carbone
Original format: PDF
Classification (CLC): Biological sciences

