
KAnalyze: a fast versatile pipelined K-mer toolkit



Abstract

Motivation: Converting nucleotide sequences into short overlapping fragments of uniform length, k-mers, is a common step in many bioinformatics applications. While existing software packages count k-mers, few are optimized for speed, offer an application programming interface (API) or a graphical interface, or contain features that make them extensible and maintainable. We designed KAnalyze to compete with the fastest k-mer counters, to produce reliable output and to support future development efforts through well-architected, documented and testable code. Currently, KAnalyze can output k-mer counts in a sorted tab-delimited file or stream k-mers as they are read. KAnalyze can process large datasets with 2 GB of memory. This project is implemented in Java 7, and the command line interface (CLI) is designed to integrate into pipelines written in any language.

Results: As a k-mer counter, KAnalyze outperforms Jellyfish, DSK and a pipeline built on Perl and Linux utilities. Through extensive unit and system testing, we have verified that KAnalyze produces the correct k-mer counts over multiple datasets and k-mer sizes.

Availability and implementation: KAnalyze is available on SourceForge:

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.
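To make the terms in the abstract concrete, the following is a minimal Python sketch of what "converting a sequence into overlapping k-mers" and "sorted tab-delimited counts" can look like. It is not KAnalyze's code or API: the function names (`iter_kmers`, `count_kmers`, `format_counts`) are hypothetical, the k-mer-then-count column order is an assumption, and skipping windows that contain non-ACGT characters is one possible policy rather than the tool's documented behavior. It also holds all counts in memory, so it does not reflect how KAnalyze handles large datasets within 2 GB.

```python
from collections import Counter


def iter_kmers(sequence, k):
    """Yield every overlapping k-mer of `sequence`, left to right."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            yield kmer


def count_kmers(sequences, k):
    """Count k-mers across an iterable of sequences, entirely in memory."""
    counts = Counter()
    for sequence in sequences:
        counts.update(iter_kmers(sequence, k))
    return counts


def format_counts(counts):
    """Return 'kmer<TAB>count' lines sorted by k-mer (assumed layout)."""
    return [f"{kmer}\t{count}" for kmer, count in sorted(counts.items())]


if __name__ == "__main__":
    for line in format_counts(count_kmers(["ACGTACGT", "ACGNACG"], 3)):
        print(line)
```

Because `iter_kmers` is a generator, it can also be consumed directly to stream k-mers as a sequence is read, while `count_kmers` followed by `format_counts` corresponds to the sorted-count output mode.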
