Journal: International Journal of Intelligent Information and Database Systems

Web toolkit: an agent based scalable search engine using cellular automata based classification for duly ranked retrieved data

Abstract

Web page classification is central to categorising web documents so that a search engine can index, search and retrieve them. Different crawling techniques are used to accumulate web pages from different domains in separate databases, depending on the practical scenario. Downloaded web pages are then parsed for further processing. A classifier is designed dynamically using single cycle multiple attractor cellular automata (SMACA) to map downloaded web pages of different domains into a specific structure. This paper proposes an alternative technique for automatically categorising web pages into different domains. Retrieved web pages are ranked automatically at the time the classifier is formed. The system consists of crawling, ranking and storage parts, each created in a different way: a hierarchical concept is applied over a parallel crawler, the GF(2^p) concept is introduced in ranking, and SMACA is used for index storage. Overall, a search engine module has been created using an agent-based method.
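To make the attractor idea concrete, the following is a minimal sketch of how a multiple attractor cellular automaton can act as a classifier: every input pattern is iterated until it reaches a fixed point (a cycle of length one), and that fixed point serves as the class label. The 4-cell rule matrix `T` and the function names `step`, `attractor` and `classify` are hypothetical choices made for illustration; they are not taken from the paper, and the abstract does not describe how web pages are encoded as bit patterns.

```python
# Hypothetical 4-cell linear cellular automaton over GF(2); not the paper's rules.
# Cell updates: x0' = x0 XOR x1, x1' = 0, x2' = x2 XOR x3, x3' = 0.
T = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]


def step(state):
    """Apply one synchronous update: next = T * state (mod 2)."""
    return tuple(sum(t * s for t, s in zip(row, state)) % 2 for row in T)


def attractor(state, max_steps=16):
    """Iterate until the state stops changing and return that fixed point."""
    for _ in range(max_steps):
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt
    raise ValueError("no single-length-cycle attractor reached")


def classify(pattern):
    """Use the attractor of a 4-bit pattern as its class label."""
    return attractor(tuple(pattern))
```

With this particular `T`, the 16 possible 4-bit patterns fall into four basins, one for each fixed point in which cells 1 and 3 are zero. A real classifier would need a rule matrix chosen so that the basins match the desired page categories.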
