Frontiers in Genetics

A Tool for Visualization and Analysis of Single-Cell RNA-Seq Data Based on Text Mining


Abstract

Gene expression in individual cells can now be measured for thousands of cells in a single experiment thanks to innovative sample-preparation and sequencing technologies. State-of-the-art computational pipelines for single-cell RNA-sequencing data, however, still employ computational methods that were developed for traditional bulk RNA-sequencing data, thus not accounting for the peculiarities of single-cell data, such as sparseness and zero-inflated counts. Here, we present a ready-to-use pipeline named gf-icf (gene frequency–inverse cell frequency) for normalization of raw counts, feature selection, and dimensionality reduction of scRNA-seq data for their visualization and subsequent analyses. Our work is based on a data transformation model named term frequency–inverse document frequency (TF-IDF), which has been extensively used in the field of text mining where extremely sparse and zero-inflated data are common. Using benchmark scRNA-seq datasets, we show that the gf-icf pipeline outperforms existing state-of-the-art methods in terms of improved visualization and ability to separate and distinguish different cell types.
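
To make the TF-IDF analogy concrete, the following is a minimal sketch of how such a transformation could be applied to a cells-by-genes count matrix, with cells playing the role of documents and genes the role of terms. It is not the authors' gf-icf implementation: the function name `gf_icf_sketch`, the smoothed logarithmic weighting (the same form as the smoothed IDF commonly used in text mining), and the final per-cell L2 normalization are all assumptions made for illustration, and the feature-selection and dimensionality-reduction steps mentioned in the abstract are not shown.

```python
import numpy as np


def gf_icf_sketch(counts):
    """Illustrative TF-IDF-style transform of a cells x genes count matrix.

    Hypothetical helper, not the published gf-icf code.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    # Gene frequency (analogue of term frequency): each cell's counts
    # divided by that cell's total count. Empty cells stay all-zero.
    totals = counts.sum(axis=1, keepdims=True)
    gf = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    # Inverse cell frequency (analogue of inverse document frequency):
    # genes detected in fewer cells receive a larger weight.
    # Assumed smoothed form: ln((1 + n) / (1 + df)) + 1.
    cells_with_gene = (counts > 0).sum(axis=0)
    icf = np.log((1.0 + n_cells) / (1.0 + cells_with_gene)) + 1.0

    weighted = gf * icf

    # Assumed final step: scale each cell's profile to unit L2 norm.
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    return np.divide(weighted, norms, out=np.zeros_like(weighted), where=norms > 0)


# Toy example: 4 cells x 3 genes, including one empty cell.
demo_counts = np.array([
    [4, 0, 1],
    [0, 3, 1],
    [2, 0, 0],
    [0, 0, 0],
])
demo_scores = gf_icf_sketch(demo_counts)
print(demo_scores.round(3))
```

Zero counts remain zero after the transformation, so the sparsity of the input is preserved, which is one reason this family of weightings suits sparse, zero-inflated data.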