Journal: Journal of Xihua University (Natural Science Edition) (《西华大学学报(自然科学版)》)

A Multi-Agent Intelligent Crawler System Based on Market Matching

         

Abstract

As the volume of text, images, video, and audio on the web keeps growing, web crawlers return increasingly poor results, chiefly low precision, low recall, and a high repetition rate in the pages they fetch. To address these problems, this paper combines the basic principles of market matching with the characteristics of web crawlers and proposes a multi-Agent intelligent crawler system based on a market-matching algorithm. The main components of the system are analyzed and designed around that algorithm, and the precision, recall, and repetition rate of the crawled pages are evaluated using 12 topics from the Yahoo top-level directory as test data. The results show that, compared with a system that does not use the market-matching algorithm, the proposed system raises precision by 9% and recall by 8% and lowers the repetition rate by 5%, a considerable improvement in crawler performance.
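The abstract reports three metrics but does not give their formulas. The sketch below shows one common way to compute them for a single topic. It rests on assumptions: precision is taken as relevant unique pages over all unique pages fetched, recall as relevant unique pages over all known relevant pages, and repetition rate as duplicate fetches over total fetches. The function name `crawl_metrics` and the sample URLs are hypothetical, and the paper's own definitions may differ.

```python
from typing import Dict, Iterable, Set


def crawl_metrics(crawled: Iterable[str], relevant: Set[str]) -> Dict[str, float]:
    """Compute precision, recall, and repetition rate for one crawl.

    crawled  -- URLs in the order they were fetched (duplicates allowed)
    relevant -- URLs judged relevant to the topic (assumed ground truth)
    """
    fetched = list(crawled)
    unique = set(fetched)
    hits = unique & relevant  # relevant pages the crawler actually found

    # Guard every division so an empty crawl or empty ground truth yields 0.0.
    precision = len(hits) / len(unique) if unique else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    repetition = (len(fetched) - len(unique)) / len(fetched) if fetched else 0.0
    return {"precision": precision, "recall": recall, "repetition": repetition}


# Hypothetical crawl: five fetches, one of them a repeat of page "a".
metrics = crawl_metrics(["a", "b", "a", "c", "d"], {"a", "b", "e", "f"})
print(metrics)
```

In this example two of the four unique pages are relevant and two of the four relevant pages were found, so precision and recall are both 0.5; one of the five fetches is a duplicate, giving a repetition rate of 0.2.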
