Published in: Genes (listed under U.S. National Institutes of Health literature)

Multi-Objective Optimized Fuzzy Clustering for Detecting Cell Clusters from Single-Cell Expression Profiles



Abstract

Rapid advances in single-cell RNA sequencing (scRNA-seq) allow gene expression to be measured at single-cell resolution in complex diseases and tissues. Although many methods have been developed to detect cell clusters from scRNA-seq data, the task remains a major challenge. We propose a multi-objective optimization-based fuzzy clustering approach for detecting cell clusters from scRNA-seq data. First, we performed initial filtering and SCnorm normalization. We then considered a series of case studies, one for each cluster number (cl = 2 up to a user-defined number), and applied the fuzzy c-means clustering algorithm to each case separately. For each case, we evaluated four cluster validity indices: Partition Entropy (PE), Partition Coefficient (PC), Modified Partition Coefficient (MPC), and Fuzzy Silhouette Index (FSI). Next, we set the first index (PE) as a minimization objective (↓) and the remaining three as maximization objectives (↑), and applied a multi-objective decision-making technique, TOPSIS, to rank the cases. The case study with the highest TOPSIS score was selected as the final optimal clustering. Finally, we obtained differentially expressed genes (DEGs) using Limma by comparing the expression of the samples in each resulting cluster against those in the remaining clusters. We applied our approach to a scRNA-seq dataset for the rare intestinal cell type in mice [GEO ID: , 23,630 features (genes) and 288 cells]. The optimal clustering (TOPSIS score = 0.858) comprised two clusters, one with 115 cells and the other with 91 cells. The scores of the four cluster validity indices FSI, PE, PC, and MPC for the optimized fuzzy clustering were 0.482, 0.578, 0.607, and 0.215, respectively. The Limma analysis identified 1240 DEGs (cluster 1 vs. cluster 2). The top ten gene markers were Rps21, Slc5a1, Crip1, Rpl15, Rpl3, Rpl27a, Khk, Rps3a1, Aldob, and Rps17. Among these, Khk (encoding ketohexokinase) is a novel marker for the rare intestinal cell type. In summary, this method is useful for detecting cell clusters from scRNA-seq data.
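The selection step described in the abstract (score each candidate cluster number with the validity indices, then rank the candidates with TOPSIS) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard textbook definitions of PC, PE, and MPC, and a common TOPSIS variant with vector normalization and equal criterion weights, since the abstract specifies neither. FSI depends on pairwise distances between cells, so it is treated here as a precomputed input. The function names and all numbers in the candidate table are hypothetical, made up for illustration only.

```python
import math

def partition_coefficient(u):
    """PC = (1/N) * sum of squared memberships; u has one row per cell."""
    return sum(m * m for row in u for m in row) / len(u)

def partition_entropy(u):
    """PE = -(1/N) * sum of u * ln(u); lower means a crisper partition."""
    return -sum(m * math.log(m) for row in u for m in row if m > 0) / len(u)

def modified_partition_coefficient(u):
    """MPC = 1 - c/(c-1) * (1 - PC), which rescales PC to the range [0, 1]."""
    c = len(u[0])
    return 1.0 - c / (c - 1) * (1.0 - partition_coefficient(u))

def topsis(matrix, benefit, weights=None):
    """Return a TOPSIS closeness score for each row (alternative).

    matrix  -- rows are alternatives, columns are criteria
    benefit -- benefit[j] is True if criterion j is maximized
    weights -- criterion weights; equal weights are an assumption here
    """
    n_crit = len(matrix[0])
    if weights is None:
        weights = [1.0 / n_crit] * n_crit
    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    cols = [[row[j] for row in v] for j in range(n_crit)]
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, ideal)))
        d_neg = math.sqrt(sum((x - y) ** 2 for x, y in zip(row, anti)))
        total = d_pos + d_neg
        scores.append(d_neg / total if total > 0 else 0.0)
    return scores

# Validity indices for a toy membership matrix (4 cells, 2 clusters).
toy_u = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
print("PC :", partition_coefficient(toy_u))
print("PE :", partition_entropy(toy_u))
print("MPC:", modified_partition_coefficient(toy_u))

# Hypothetical index values per candidate cluster number (not from the paper).
# Column order: PE (minimize), PC, MPC, FSI (maximize).
candidates = {
    2: [0.58, 0.61, 0.22, 0.48],
    3: [0.95, 0.45, 0.18, 0.35],
    4: [1.20, 0.36, 0.15, 0.30],
}
benefit = [False, True, True, True]
scores = topsis(list(candidates.values()), benefit)
best_cl = max(zip(candidates, scores), key=lambda pair: pair[1])[0]
print("TOPSIS scores:", dict(zip(candidates, scores)))
print("Selected cluster number:", best_cl)
```

In practice the membership matrix for each cluster number would come from a fuzzy c-means run on the normalized expression matrix, and a different weighting or normalization scheme would change the TOPSIS scores.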
