Functional Biogeography of Ocean Microbes Revealed through Non-Negative Matrix Factorization



Abstract

The direct “metagenomic” sequencing of genomic material from complex assemblages of bacteria, archaea, viruses and microeukaryotes has yielded new insights into the structure of microbial communities. For example, analysis of metagenomic data has revealed the existence of previously unknown microbial taxa whose spatial distributions are limited by environmental conditions, ecological competition, and dispersal mechanisms. However, differences in genotypes that might lead biologists to designate two microbes as taxonomically distinct need not necessarily imply differences in ecological function. Hence, there is a growing need for large-scale analysis of the distribution of microbial function across habitats. Here, we present a framework for investigating the biogeography of microbial function by analyzing the distribution of protein families inferred from environmental sequence data across a global collection of sites. We map over 6,000,000 protein sequences from unassembled reads from the Global Ocean Survey dataset to protein families, generating a protein family relative abundance matrix that describes the distribution of each protein family across sites. We then use non-negative matrix factorization (NMF) to approximate these protein family profiles as linear combinations of a small number of ecological components. Each component has a characteristic functional profile and site profile. Our approach identifies common functional signatures within several of the components. We use our method as a filter to estimate functional distance between sites, and find that an NMF-filtered measure of functional distance is more strongly correlated with environmental distance than a comparable PCA-filtered measure. We also find that functional distance is more strongly correlated with environmental distance than with geographic distance, in agreement with prior studies. 
We identify similar protein functions in several components and suggest that functional co-occurrence across metagenomic samples could lead to future methods for de novo functional prediction. We conclude by discussing how NMF and other dimension reduction methods can help enable a macroscopic functional description of marine ecosystems.
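To make the factorization step concrete, the following is a minimal sketch of NMF on a toy protein family by site abundance matrix. It is not the authors' implementation: the abstract does not state which NMF algorithm or distance measure was used, so the Lee-Seung multiplicative updates, the Euclidean site distance, the toy matrix values, and every function name here (`nmf`, `matmul`, `site_distance`, and so on) are assumptions chosen for illustration. Rows of `v` stand for protein families and columns for sites; `w` holds one functional profile per component and `h` one site profile per component.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, the reconstruction error."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5

def nmf(v, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (families x sites) as W (families x k) times H (k x sites).

    Uses multiplicative updates (an assumed choice, not taken from the source).
    """
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(m)]
    return w, h

def site_distance(matrix, a, b):
    """Euclidean distance between the profiles (columns) of sites a and b."""
    return sum((row[a] - row[b]) ** 2 for row in matrix) ** 0.5

# Hypothetical toy data: 4 protein families x 5 sites mixed from 2 components.
true_w = [[1.0, 0.0], [0.8, 0.1], [0.0, 1.0], [0.1, 0.9]]
true_h = [[0.9, 0.7, 0.1, 0.0, 0.5], [0.0, 0.2, 0.8, 1.0, 0.5]]
v = matmul(true_w, true_h)

w0, h0 = nmf(v, k=2, n_iter=0)   # random starting point only
w, h = nmf(v, k=2)               # same seed, after the updates
initial_error = frobenius_error(v, w0, h0)
error = frobenius_error(v, w, h)

# "NMF-filtered" distance: compare sites on the low-rank reconstruction WH.
filtered = matmul(w, h)
d_similar = site_distance(filtered, 0, 1)     # sites dominated by component 1
d_different = site_distance(filtered, 0, 3)   # sites dominated by different components
```

In this sketch the filtered distance is simply computed on the reconstruction `WH` rather than on the raw matrix. On real data one would more likely use an established library implementation and choose the number of components `k` by a model selection criterion, neither of which is specified in the abstract.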
