Journal: Frontiers in Bioengineering and Biotechnology

GeTallele: A Method for Analysis of DNA and RNA Allele Frequency Distributions

Abstract

Variant allele frequencies (VAFs) are an important measure of genetic variation that can be estimated at single-nucleotide variant (SNV) sites. RNA and DNA VAFs are used as indicators of a wide range of biological traits, including tumor purity and ploidy changes, allele-specific expression, and gene-dosage transcriptional response. Here we present a novel methodology to assess gene and chromosomal allele asymmetries and to aid in identifying genomic alterations in RNA and DNA datasets. Our approach is based on analysis of the VAF distributions in chromosomal segments (continuous multi-SNV genomic regions). In each segment we estimate the variant probability, a parameter of a random process that can generate synthetic VAF samples that closely resemble the observed data. In this study, we show that variant probability is a biologically interpretable quantitative descriptor of the VAF distribution in chromosomal segments and that it is consistent with other approaches. To this end, we apply the proposed methodology to data from 72 samples obtained from patients with breast invasive carcinoma (BRCA) from The Cancer Genome Atlas (TCGA). We compare DNA and RNA VAF distributions from matched RNA and whole exome sequencing (WES) datasets and find that both genomic signals give very similar segmentation and estimated variant probability profiles. We also find a correlation between variant probability and copy number alterations (CNA). Finally, to demonstrate a practical application of variant probabilities, we use them to estimate tumor purity. Tumor purity estimates based on variant probabilities show good concordance with other approaches (Pearson's correlation between 0.44 and 0.76). Our evaluation suggests that variant probabilities can serve as a dependable descriptor of VAF distributions, further enabling the statistical comparison of matched DNA and RNA datasets. They also provide conceptual and mechanistic insights into the relations between the structure of VAF distributions and genetic events. The methodology is implemented in a Matlab toolbox that provides a suite of functions for analysis, statistical assessment, and visualization of Genome and Transcriptome allele frequency distributions. GeTallele is available at: https://github.com/SlowinskiPiotr/GeTallele.
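The idea of a variant probability that generates synthetic VAF samples resembling the observed ones can be illustrated with a small Python sketch. This is not the GeTallele Matlab code. The abstract does not give the model details, so the binomial read-sampling model, the random choice between p and 1 - p at each site, the grid search, and every function name below are assumptions made for illustration only. The distance used is the mean absolute difference of sorted values, which for two samples of equal size equals the one-dimensional earth mover's distance between them.

```python
import random


def vaf(variant_reads, total_reads):
    """VAF at one SNV site: reads carrying the variant / all reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def synthetic_vafs(variant_probability, depths, rng):
    """Draw one synthetic VAF per site (assumed model, not the toolbox's).

    At each site the variant allele is assumed to sit on either homolog with
    equal chance, so each read carries it with probability p or 1 - p.
    """
    sample = []
    for depth in depths:
        p = variant_probability if rng.random() < 0.5 else 1.0 - variant_probability
        variant_reads = sum(rng.random() < p for _ in range(depth))
        sample.append(vaf(variant_reads, depth))
    return sample


def mean_abs_quantile_gap(a, b):
    """Mean absolute difference of sorted values of two equal-length samples."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def estimate_variant_probability(observed, depths, grid, seed=0):
    """Return the grid value whose synthetic sample lies closest to the observed one."""
    best_p, best_distance = None, float("inf")
    for p in grid:
        rng = random.Random(seed)  # same random stream for every candidate
        distance = mean_abs_quantile_gap(observed, synthetic_vafs(p, depths, rng))
        if distance < best_distance:
            best_p, best_distance = p, distance
    return best_p


if __name__ == "__main__":
    # Hypothetical segment: 300 SNV sites at read depth 50, simulated with p = 0.7.
    depths = [50] * 300
    observed = synthetic_vafs(0.7, depths, random.Random(1))
    grid = [0.5 + 0.05 * i for i in range(11)]
    print(estimate_variant_probability(observed, depths, grid))
```

In this toy model a segment with balanced alleles would give an estimate near 0.5, while allelic imbalance pushes the estimate toward 1. The grid is restricted to values from 0.5 to 1 because p and 1 - p generate the same distribution under the assumed model.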
