Journal of Bioinformatics and Computational Biology

A WORKFLOW FOR MUTATION EXTRACTION AND STRUCTURE ANNOTATION



Abstract

Rich information on point mutation studies is scattered across heterogeneous data sources. This paper presents an automated workflow for mining mutation annotations from full-text biomedical literature using natural language processing (NLP) techniques as well as for their subsequent reuse in protein structure annotation and visualization. This system, called mSTRAP (Mutation extraction and STRucture Annotation Pipeline), is designed for both information aggregation and subsequent brokerage of the mutation annotations. It facilitates the coordination of semantically related information from a series of text mining and sequence analysis steps into a formal OWL-DL ontology. The ontology is designed to support application-specific data management of sequence, structure, and literature annotations that are populated as instances of object and data type properties. mSTRAPviz is a subsystem that facilitates the brokerage of structure information and the associated mutations for visualization. For mutated sequences without any corresponding structure available in the Protein Data Bank (PDB), an automated pipeline for homology modeling is developed to generate the theoretical model. With mSTRAP, we demonstrate a workable system that can facilitate automation of the workflow for the retrieval, extraction, processing, and visualization of mutation annotations — tasks which are well known to be tedious, time-consuming, complex, and error-prone. The ontology and visualization tool are available at http://datam.i2r.a-star.edu.sg/mstrap.
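
The abstract does not say which NLP techniques mSTRAP uses, so the following is only a minimal sketch of the general idea: find point mutation mentions in text, then check them against a protein sequence before using them for structure annotation. It assumes mentions follow the common "R273H" or "Gly12Asp" patterns and uses a plain regular expression in place of a full text mining pipeline. All names here (`PointMutation`, `extract_mutations`, `is_consistent`) are hypothetical and are not part of mSTRAP.

```python
import re
from dataclasses import dataclass

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = "".join(sorted(THREE_TO_ONE.values()))
THREE_LETTER = "|".join(THREE_TO_ONE)

# Assumed mention formats: "R273H" (one-letter) or "Gly12Asp" (three-letter).
MUTATION_RE = re.compile(
    rf"\b(?:(?P<w1>[{ONE_LETTER}])(?P<p1>[1-9]\d*)(?P<m1>[{ONE_LETTER}])"
    rf"|(?P<w3>{THREE_LETTER})(?P<p3>[1-9]\d*)(?P<m3>{THREE_LETTER}))\b"
)


@dataclass(frozen=True)
class PointMutation:
    wild_type: str  # one-letter code of the original residue
    position: int   # 1-based position in the protein sequence
    mutant: str     # one-letter code of the substituted residue


def extract_mutations(text):
    """Return unique point mutation mentions in order of first appearance."""
    found = []
    for m in MUTATION_RE.finditer(text):
        if m.group("w1"):
            wt, pos, mt = m.group("w1"), m.group("p1"), m.group("m1")
        else:
            wt = THREE_TO_ONE[m.group("w3")]
            pos = m.group("p3")
            mt = THREE_TO_ONE[m.group("m3")]
        mutation = PointMutation(wt, int(pos), mt)
        # Skip "substitutions" to the same residue and repeated mentions.
        if wt != mt and mutation not in found:
            found.append(mutation)
    return found


def is_consistent(mutation, sequence):
    """True if the sequence has the wild-type residue at the stated position."""
    return (mutation.position <= len(sequence)
            and sequence[mutation.position - 1] == mutation.wild_type)


if __name__ == "__main__":
    sentence = "The R3H and Gly4Asp substitutions reduced activity."
    toy_sequence = "MKRGA"  # made-up sequence, for illustration only
    for mut in extract_mutations(sentence):
        print(mut, is_consistent(mut, toy_sequence))
```

A pattern this simple will also match non-mutation tokens that happen to look like one (a construct name such as "T4L", for instance), which is one reason a consistency check against the target sequence is a useful filter before any annotation is mapped onto a structure.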
