Journal of Heilongjiang Bayi Agricultural University (《黑龙江八一农垦大学学报》)

Focused Crawler Search Strategy Based on an Adaptive Immune Evolutionary Algorithm

         

Abstract

The focused crawler is the core component of a topic-specific search engine. To address the shortcomings of current focused-crawler search strategies, this paper proposes a comprehensive relevance measure that combines topic relevance with page importance to judge whether a page is on topic, and uses an adaptive immune evolutionary algorithm as the search strategy that guides the crawl. Experimental results show that the proportion of topic-relevant pages among the pages downloaded by this algorithm is markedly higher than the proportion obtained with best-first search and breadth-first search, which indicates higher search efficiency.
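The abstract does not give the formula for the comprehensive relevance or for the evaluation metric, so the following is only a minimal sketch under stated assumptions: topic relevance is taken to be cosine similarity between term-frequency vectors, page importance is assumed to be a precomputed score in [0, 1], and the two are combined linearly with a hypothetical weight `alpha`. The function names, the weight, and the relevance threshold are illustrative and are not taken from the paper. The adaptive immune evolutionary search itself is not sketched here.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def comprehensive_relevance(topic_score: float, importance: float,
                            alpha: float = 0.7) -> float:
    """Linear blend of topic relevance and page importance.

    ASSUMPTION: the linear form and alpha = 0.7 are illustrative only;
    the abstract says the two signals are combined but not how.
    """
    return alpha * topic_score + (1 - alpha) * importance


def harvest_rate(scores: list[float], threshold: float = 0.5) -> float:
    """Fraction of downloaded pages whose score reaches the threshold.

    ASSUMPTION: the threshold of 0.5 is a placeholder, not a value
    reported in the paper.
    """
    if not scores:
        return 0.0
    return sum(1 for s in scores if s >= threshold) / len(scores)


if __name__ == "__main__":
    topic = Counter("focused crawler topic search".split())
    page = Counter("focused crawler search strategy".split())
    sim = cosine_similarity(topic, page)
    print(comprehensive_relevance(sim, importance=0.4))
    print(harvest_rate([0.9, 0.1, 0.6, 0.2]))
```

The harvest rate computed here corresponds to the quantity the abstract compares across algorithms: the share of downloaded pages that are topic-relevant.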
