Journal of Bioinformatics and Computational Biology

EVALUATION OF NORMALIZATION AND PRE-CLUSTERING ISSUES IN A NOVEL CLUSTERING APPROACH: GLOBAL OPTIMUM SEARCH WITH ENHANCED POSITIONING


Abstract

We study how different normalization and pre-clustering techniques affect clustering quality for a novel mixed-integer nonlinear optimization-based clustering algorithm, the Global Optimum Search with Enhanced Positioning (EP_GOS_Clust). Both issues matter in practice. DNA microarray experiments are informative tools for elucidating gene regulatory networks, but for gene expression levels to be comparable across microarrays, normalization procedures must be properly undertaken. The aim of pre-clustering is to use an adequate number of discriminatory characteristics to form rough information profiles, so that data with similar features can be pre-grouped and outliers deemed insignificant to the clustering process can be removed. Using experimental DNA microarray data from the yeast Saccharomyces cerevisiae, we study the merits of pre-clustering genes based on distance/correlation comparisons and on symbolic representations such as {+, o, -}. As performance metrics, we look at the intra- and inter-cluster error sums, two generic but intuitive measures of clustering quality. We also use publicly available Gene Ontology resources to assess the clusters' level of biological coherence. Our analysis indicates that normalization and pre-clustering methods have a significant effect on the clustering results. Hence, the outcome of this study is significant for fine-tuning the EP_GOS_Clust clustering approach.
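The abstract names three ingredients without defining them: normalization, a symbolic {+, o, -} representation, and intra-/inter-cluster error sums. The Python sketch below illustrates one common reading of each and is not the authors' implementation. Every definition in it is an assumption: per-gene z-scoring as the normalization, thresholded successive differences for the symbolic profile, squared Euclidean distance to the cluster mean for the intra-cluster sum, and squared distances between cluster means for the inter-cluster sum. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np

def zscore_rows(X):
    """Assumed normalization: scale each gene (row) to zero mean, unit variance."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # leave constant profiles at zero instead of dividing by 0
    return (X - mu) / sd

def symbolic_profile(x, threshold=0.5):
    """Assumed symbolic encoding: '+' for a rise, '-' for a fall, 'o' for
    little change between consecutive measurements (threshold is hypothetical)."""
    diffs = np.diff(np.asarray(x, dtype=float))
    return "".join(
        "+" if d > threshold else "-" if d < -threshold else "o" for d in diffs
    )

def cluster_error_sums(X, labels):
    """Return (intra, inter) under the assumed definitions:
    intra = sum of squared distances from each point to its cluster mean,
    inter = sum of squared distances between every pair of cluster means."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    centers = np.array([X[labels == k].mean(axis=0) for k in ids])
    intra = sum(((X[labels == k] - c) ** 2).sum() for k, c in zip(ids, centers))
    inter = sum(
        ((centers[i] - centers[j]) ** 2).sum()
        for i in range(len(ids))
        for j in range(i + 1, len(ids))
    )
    return float(intra), float(inter)

if __name__ == "__main__":
    X = [[0, 0], [0, 2], [10, 0], [10, 2]]
    print(cluster_error_sums(X, [0, 0, 1, 1]))
    print(symbolic_profile([0, 1, 1, 0]))
```

Under these assumed definitions, a tighter and better-separated clustering gives a smaller intra-cluster sum and a larger inter-cluster sum. Genes sharing the same symbolic string could be pre-grouped before the main clustering step.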
