Application of Taxonomic Modeling to Microbiota Data Mining for Detection of Helminth Infection in Global Populations

Abstract

Human microbiome data from genomic sequencing technologies are accumulating rapidly, giving us insight into the bacterial taxa that contribute to health and disease. Predictive modeling of such microbiota count data to classify human infection by parasitic worms (helminths) can help detect and manage these infections across global populations. Real-world datasets from microbiome experiments are typically sparse: they contain hundreds of measurements for bacterial species, of which only a few are detected in the bio-specimens analyzed. This sparsity means that accurate predictive modeling needs more observations, a challenge that has previously been addressed with various feature reduction methods. To our knowledge, integrative methods such as transfer learning have not yet been explored in the microbiome domain as a way to deal with data sparsity by incorporating knowledge from different but related datasets. One way to incorporate this knowledge is through a meaningful mapping among the features of these datasets. We claim that such a mapping exists among the members of each individual cluster, where clusters are formed from the phylogenetic dependency among taxa and their association with the phenotype. We validate this claim by showing that models incorporating associations in such a grouped feature space show no performance deterioration on the given classification task. We test our hypothesis with classification models that detect helminth infection from the microbiota of human fecal samples obtained in Indonesia and Liberia. In our experiments, we first learn binary classifiers for helminth infection detection using Naive Bayes, Support Vector Machine, Multilayer Perceptron, and Random Forest methods. Next, we add taxonomic modeling by using the SMART-scan module to group the data, and we learn classifiers with the same four methods to test the validity of the resulting groupings. We observed performance improvements of 6% to 23% in the Area Under the receiver operating characteristic (ROC) Curve (AUC) and 7% to 26% in Balanced Accuracy (Bacc), over 10 runs of 10-fold cross-validation. These results show that grouping our microbiota data by phylogenetic dependency yields a noticeable improvement in classification performance for helminth infection detection. The promising results of this feasibility study demonstrate that methods such as SMART-scan can be used in the future for knowledge transfer between different but related microbiome datasets through phylogenetically related functional mapping, enabling novel integrative biomarker discovery.
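
To make the two ideas in the abstract more concrete, the following minimal sketch shows (1) collapsing sparse per-taxon counts into per-group counts and (2) computing the two reported measures, AUC and Balanced Accuracy, for a binary classifier. It is an illustration under stated assumptions, not the SMART-scan method: the grouping here is a plain sum over a hand-written, hypothetical taxon-to-group map, and the names `group_counts`, `balanced_accuracy`, `roc_auc`, `taxon_a`, and `group_1` are all invented for this example.

```python
from collections import defaultdict


def group_counts(sample, taxon_to_group):
    """Sum per-taxon counts into per-group counts for one sample.

    NOTE: a simple stand-in for taxonomic grouping; SMART-scan itself
    derives groups from phylogeny and phenotype association.
    """
    grouped = defaultdict(int)
    for taxon, count in sample.items():
        grouped[taxon_to_group[taxon]] += count
    return dict(grouped)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == 0)
    return 0.5 * (tp / n_pos + tn / n_neg)


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical sparse sample: most taxa have zero counts.
sample = {"taxon_a": 3, "taxon_b": 0, "taxon_c": 5, "taxon_d": 0}
# Hypothetical mapping from taxa to phylogenetic groups.
taxon_to_group = {
    "taxon_a": "group_1",
    "taxon_b": "group_1",
    "taxon_c": "group_2",
    "taxon_d": "group_2",
}
grouped = group_counts(sample, taxon_to_group)

# Hypothetical labels (1 = infected), classifier scores, and thresholded predictions.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.6 else 0 for s in scores]

print(grouped)
print(balanced_accuracy(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Grouping shrinks the feature space (four taxa become two groups here), which is one way sparse count data can be made denser before classifiers are trained. In a full experiment, the two measures would be averaged over the folds of each cross-validation run.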
