Source: Scientific Reports (U.S. National Institutes of Health literature collection)

Translational utility of a hierarchical classification strategy in biomolecular data analytics

Abstract

Hierarchical classification (HC) stratifies and classifies data from broad classes into more specific classes. Unlike commonly used data classification strategies, HC enables the probabilistic prediction of unknown classes at different levels, minimizing the burden of incomplete databases. Despite these advantages, its translational application in biomedical sciences has been limited. We describe and demonstrate the implementation of an HC approach for “omics-driven” classification of 15 bacterial species at various taxonomic levels, achieving 90–100% accuracy, and of 9 cancer types into morphological types and 35 subtypes with 99% and 76% accuracy, respectively. Unknown bacterial species were probabilistically assigned with 100% accuracy to their respective genus or family using mass spectra (n = 284). Cancer types were predicted from mRNA data (n = 1960) for most subtypes with 95–100% accuracy. This has high relevance in clinical practice, where complete datasets are difficult to compile given the continuous evolution of diseases and the emergence of new strains, yet prediction of unknown classes, such as bacterial species, at upper hierarchy levels may be sufficient to initiate antimicrobial therapy. The algorithms presented here can be directly translated into clinical use with any quantitative data, and have broad application potential, from unlabeled sample identification to hierarchical feature selection and the discovery of new taxonomic variants.
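
The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea: a top-down classifier that descends a label hierarchy and stops at the deepest level where it is still confident. It is not the authors' method. It assumes a nearest-centroid model at each node and a Gaussian similarity score with a fixed cutoff. The names `train`, `predict`, `similarity`, the threshold of 0.5, and the toy two-feature data with labels such as `GenusA` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(samples):
    """Build one nearest-centroid model per parent node.

    samples: list of (features, path), where path is a tuple of labels
    from the broadest level to the most specific, e.g. ("GenusA", "species_a1").
    Returns {parent_path: {child_label: centroid}}.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for features, path in samples:
        for depth, label in enumerate(path):
            groups[path[:depth]][label].append(features)
    return {
        parent: {label: centroid(pts) for label, pts in children.items()}
        for parent, children in groups.items()
    }


def similarity(x, c, scale=1.0):
    """Gaussian similarity in (0, 1]; 1.0 means x sits exactly on the centroid."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-d2 / (2 * scale ** 2))


def predict(model, x, threshold=0.5):
    """Descend the hierarchy; stop as soon as the best child scores below threshold.

    Returns the (possibly partial) label path. A partial path means the sample
    was only resolved down to that level; an empty tuple means no level matched.
    """
    path = ()
    while path in model:
        label, score = max(
            ((lab, similarity(x, c)) for lab, c in model[path].items()),
            key=lambda pair: pair[1],
        )
        if score < threshold:
            break
        path += (label,)
    return path


# Toy training data (hypothetical): two genera, two species each.
samples = [
    ((0.0, 0.0), ("GenusA", "species_a1")), ((0.2, 0.0), ("GenusA", "species_a1")),
    ((0.0, 2.0), ("GenusA", "species_a2")), ((0.2, 2.0), ("GenusA", "species_a2")),
    ((10.0, 0.0), ("GenusB", "species_b1")), ((10.2, 0.0), ("GenusB", "species_b1")),
    ((10.0, 2.0), ("GenusB", "species_b2")), ((10.2, 2.0), ("GenusB", "species_b2")),
]
model = train(samples)

print(predict(model, (0.1, 0.1)))  # close to a known species
print(predict(model, (1.0, 1.0)))  # near GenusA, but not close to either species
print(predict(model, (5.0, 5.0)))  # far from everything
```

In this toy setup the second query resolves only to the genus level, which mirrors the situation the abstract describes: a species absent from the database can still be assigned to a higher level of the hierarchy. A real system would replace the centroid model and the fixed cutoff with calibrated probabilistic classifiers.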
