...
首页> 外文期刊>Scientific Research and Essays >Efficient bit-parallel multi-patterns approximate string matching algorithms
【24h】

Efficient bit-parallel multi-patterns approximate string matching algorithms

机译:高效的位并行多模式近似字符串匹配算法

获取原文
           

摘要

Multi-patterns approximate string matching (MASM) problem is to find all the occurrences of set of patterns P0, P1, P2...Pr-1, r≥1, in the given text T[0…n-1], allowing limited number of errors in the matches. This problem has many applications in computational biology viz. finding DNA subsequences after possible mutations, locating positions of a disease(s) in a genome etc. The MASM problem has been previously solved by Baeza-Yates and Navarro by extending the bit-parallel automata (BPA) of approximate matching and using the concept of classes of characters. The drawbacks of this approach are: (a) It requires verification for the potential matches and, (b) It can handle patterns of length less than or equal to word length (w) of computer used. In this paper, we propose two new bit-parallel algorithms to solve the same problem. These new algorithms requires no verification and can handle patterns of length & w. These two techniques also use the same BPA of approximate matching and concatenation to form a single pattern from the set of r patterns. We compare the performance of new algorithms with existing algorithms and found that our algorithms have better running time than the previous algorithms.
机译:多模式近似字符串匹配(MASM)问题是在给定文本T [0… n-1]中找到所有模式集P0,P1,P2 ... Pr-1,r≥ 1的出现,从而允许比赛中的错误数量有限。这个问题在计算生物学中有许多应用。在可能的突变后发现DNA子序列,确定疾病在基因组中的位置等。MASM问题先前已由Baeza-Yates和Navarro通过扩展近似匹配的位平行自动机(BPA)并使用该概念来解决。字符类别。这种方法的缺点是:(a)需要验证潜在的匹配,并且(b)它可以处理长度小于或等于所用计算机的字长(w)的模式。在本文中,我们提出了两种新的位并行算法来解决同一问题。这些新算法不需要验证,并且可以处理长度> 1的模式。 w。这两种技术还使用相同的BPA进行近似匹配和串联,以从r个模式集中形成一个模式。我们将新算法与现有算法的性能进行了比较,发现我们的算法比以前的算法具有更好的运行时间。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号