Journal: Bioinformatics

Genomic sweeping for hypermethylated genes

Abstract

Motivation: Genes silenced by the aberrant methylation of nearby CpG islands can contribute to the onset or progression of cancer and represent potential biomarkers for diagnosis and prognosis. Of more than 14 000 candidate genes with promoter-region CpG islands, relatively few have so far been validated as hypermethylated in cancer. A descriptive set of genes known to be unmethylated in cancer does not exist. The lack of such a negative set, combined with the large number of candidates, necessitated a new approach to identifying novel genes hypermethylated in cancer.

Results: We developed a general method, cluster_boost, that predicts new minority-class members in an imbalanced data setting, given a limited number of known samples and a large set of unlabeled samples. Synthetic datasets modeled after the hypermethylated-gene data show that cluster_boost can successfully identify minority samples within unlabeled data. Using genome sequence features, cluster_boost predicted candidate hypermethylated genes among 14 000 genes of unknown status. In primary ovarian cancers, we determined the methylation status of 15 genes with different levels of support for being hypermethylated. The results indicate that cluster_boost can accurately identify novel genes hypermethylated in cancer.
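The abstract does not describe the internals of cluster_boost, so the following is not that algorithm. It is a minimal sketch of the problem setting only: a few known positives, a large unlabeled pool, and no negative set. As a stand-in technique it uses a generic positive-unlabeled bagging baseline with a nearest-centroid rule. All names (`pu_bagging_scores`, `make_synthetic`), the two-dimensional Gaussian toy data, and the parameter values are assumptions made for illustration.

```python
import random


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, rounds=200, seed=0):
    """Score each unlabeled sample by how often it looks positive.

    Each round draws a random bag of unlabeled samples as provisional
    negatives, fits a nearest-centroid rule, and votes only on the
    unlabeled samples left out of the bag. (Generic baseline, not
    cluster_boost.)
    """
    rng = random.Random(seed)
    pos_center = centroid(positives)
    bag_size = min(len(positives), len(unlabeled) - 1)
    votes = [0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(rounds):
        bag = set(rng.sample(range(len(unlabeled)), bag_size))
        neg_center = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in bag:
                continue  # out-of-bag samples only
            counts[i] += 1
            if sq_dist(x, pos_center) < sq_dist(x, neg_center):
                votes[i] += 1
    return [v / c if c else 0.0 for v, c in zip(votes, counts)]


def make_synthetic(n_known=10, n_hidden=10, n_negative=190, seed=1):
    """Toy imbalanced data (assumed): two Gaussian blobs in 2-D."""
    rng = random.Random(seed)

    def blob(center, n):
        return [[rng.gauss(center, 0.5), rng.gauss(center, 0.5)] for _ in range(n)]

    known = blob(3.0, n_known)
    # Hidden positives come first in the unlabeled pool, then negatives.
    unlabeled = blob(3.0, n_hidden) + blob(0.0, n_negative)
    hidden = set(range(n_hidden))
    return known, unlabeled, hidden


known, unlabeled, hidden = make_synthetic()
scores = pu_bagging_scores(known, unlabeled)
ranked = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
top = ranked[:len(hidden)]
print("hidden positives among top-ranked:", len(hidden & set(top)))
```

Ranking the unlabeled pool by score, rather than assigning hard labels, mirrors the abstract's framing of genes with "different levels of support" that can then be checked experimentally. The well-separated toy blobs make this far easier than real genome sequence features would be.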