Source: Genomics & Informatics (U.S. National Institutes of Health literature)
An Efficient Approach to Mining Maximal Contiguous Frequent Patterns from Large DNA Sequence Databases

Abstract

Mining interesting patterns from DNA sequences is one of the most challenging tasks in bioinformatics and computational biology. Maximal contiguous frequent patterns are preferable for expressing the function and structure of DNA sequences and hence can capture the common data characteristics among related sequences. Biologists are interested in finding frequent orderly arrangements of motifs that are responsible for similar expression of a group of genes. In order to reduce mining time and complexity, however, most existing sequence mining algorithms either focus on finding short DNA sequences or require explicit specification of sequence lengths in advance. The challenge is to find longer sequences without specifying sequence lengths in advance. In this paper, we propose an efficient approach to mining maximal contiguous frequent patterns from large DNA sequence datasets. The experimental results show that our proposed approach is memory-efficient and mines maximal contiguous frequent patterns within a reasonable time.
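
To make the term concrete, the following is a minimal brute-force sketch of what a maximal contiguous frequent pattern is. It is not the approach proposed in the paper, which the abstract does not detail. The function names, the toy sequences, and the definition of support as the number of sequences containing a pattern are assumptions made for illustration. The loop grows the pattern length one step at a time, so no length has to be specified in advance, and it stops at the first length with no frequent pattern, since every substring of a frequent contiguous pattern is itself frequent.

```python
from collections import defaultdict


def contiguous_frequent_patterns(sequences, min_support):
    """Return {pattern: support} for every contiguous substring that
    occurs in at least `min_support` sequences (hypothetical helper)."""
    frequent = {}
    length = 1
    while True:
        support = defaultdict(int)
        for seq in sequences:
            # Collect distinct substrings so each sequence counts once.
            seen = {seq[i:i + length] for i in range(len(seq) - length + 1)}
            for pattern in seen:
                support[pattern] += 1
        level = {p: c for p, c in support.items() if c >= min_support}
        if not level:
            # No frequent pattern of this length, so none can be longer.
            break
        frequent.update(level)
        length += 1
    return frequent


def maximal_patterns(frequent):
    """Keep only patterns not contained in a longer frequent pattern."""
    return sorted(
        p for p in frequent
        if not any(p != q and p in q for q in frequent)
    )


# Toy data, assumed for illustration only.
sequences = ["ACGTAC", "TACGTT", "GGACGT"]
frequent = contiguous_frequent_patterns(sequences, min_support=2)
print(maximal_patterns(frequent))  # ['ACGT', 'TAC']
```

Here `ACGT` appears in all three sequences and `TAC` in two; shorter frequent patterns such as `ACG` or `TA` are dropped because they are contained in one of these. This enumeration is far too slow and memory-hungry for large DNA databases, which is the gap the paper addresses.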