
SET-BASED SIMILAR SEQUENCE MATCHING APPARATUS AND METHOD


Abstract

The present invention relates to a set-based similar sequence matching apparatus and method. The method includes three steps. First, a candidate set table is generated: using a set inverted index built over the data set sequences, the sets of the data set sequences that are determined to be similar to sets of the query set sequences are identified as candidate sets. Second, a candidate window table is generated: with reference to the candidate set table, similarity is calculated between the windows of the query set sequences and the windows of the data set sequences that contain candidate sets, and the data windows determined to be similar to query windows are identified as candidate windows. Third, with reference to the candidate window table, similarity is calculated between the query set sequences and the data set sequences that contain candidate windows, and the data set sequences similar to the query set sequences are identified. By making the calculation of intersection sizes more efficient, the invention can match similar sequences quickly.
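The three filtering steps in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation; the abstract does not give the similarity measure, the window definition, or any thresholds, so everything below is an assumption made for illustration: Jaccard similarity between sets, windows as runs of `w` consecutive sets, window and sequence similarity as the mean of position-wise Jaccard values, and the hypothetical names `build_inverted_index`, `candidate_set_table`, `candidate_window_table`, and `match_sequences`.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure, not stated in the abstract)."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def build_inverted_index(data_sequences):
    """Map each element to the (sequence id, position) of every set containing it."""
    index = defaultdict(set)
    for seq_id, seq in enumerate(data_sequences):
        for pos, s in enumerate(seq):
            for element in s:
                index[element].add((seq_id, pos))
    return index


def candidate_set_table(query, index, data_sequences, set_threshold):
    """Step 1: data sets similar to a query set, found through the inverted index."""
    table = {}  # (query position, sequence id, data position) -> similarity
    for q_pos, q_set in enumerate(query):
        overlap = defaultdict(int)  # intersection sizes accumulated from postings
        for element in q_set:
            for key in index.get(element, ()):
                overlap[key] += 1
        for (seq_id, d_pos), inter in overlap.items():
            d_set = data_sequences[seq_id][d_pos]
            sim = inter / (len(q_set) + len(d_set) - inter)
            if sim >= set_threshold:
                table[(q_pos, seq_id, d_pos)] = sim
    return table


def candidate_window_table(query, data_sequences, set_table, w, window_threshold):
    """Step 2: compare only window pairs that are aligned on a candidate set."""
    table, seen = {}, set()
    for (q_pos, seq_id, d_pos) in set_table:
        seq = data_sequences[seq_id]
        # Every query window of length w that covers q_pos.
        for q_off in range(max(0, q_pos - w + 1), min(q_pos, len(query) - w) + 1):
            d_off = d_pos - (q_pos - q_off)  # data window start under this alignment
            key = (q_off, seq_id, d_off)
            if d_off < 0 or d_off + w > len(seq) or key in seen:
                continue
            seen.add(key)
            sim = sum(jaccard(query[q_off + i], seq[d_off + i]) for i in range(w)) / w
            if sim >= window_threshold:
                table[key] = sim
    return table


def match_sequences(query, data_sequences, window_table, seq_threshold):
    """Step 3: full comparison only at alignments backed by a candidate window."""
    results, n = {}, len(query)  # results: (sequence id, shift) -> similarity
    tried = set()
    for (q_off, seq_id, d_off) in window_table:
        shift, seq = d_off - q_off, data_sequences[seq_id]
        if shift < 0 or shift + n > len(seq) or (seq_id, shift) in tried:
            continue
        tried.add((seq_id, shift))
        sim = sum(jaccard(query[i], seq[shift + i]) for i in range(n)) / n
        if sim >= seq_threshold:
            results[(seq_id, shift)] = sim
    return results


# Toy data: two data set sequences and one query set sequence.
data = [
    [{1, 2, 3}, {4, 5}, {6, 7, 8}, {9}],
    [{10, 11}, {12}, {13, 14}, {15}],
]
query = [{1, 2, 3}, {4, 5}, {6, 7}]

index = build_inverted_index(data)
sets = candidate_set_table(query, index, data, set_threshold=0.5)
windows = candidate_window_table(query, data, sets, w=2, window_threshold=0.6)
matches = match_sequences(query, data, windows, seq_threshold=0.8)
print(matches)
```

In this sketch the inverted index yields intersection sizes directly by counting shared postings, so sets with no common element are never compared. Each later step then touches only the windows and sequences that survived the previous one, which is the filtering idea the abstract describes.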
