Conference: Joint IFSA World Congress and NAFIPS Annual Meeting

Clustering of Web Search Results based on an Iterative Fuzzy C-means Algorithm and Bayesian Information Criterion


Abstract

Clustering of web search results has become an active research area within the information retrieval community. Web search result clustering systems, also called Web Clustering Engines, seek to increase the coverage of documents presented to the user for review while reducing the time spent reviewing them. Several algorithms for web document clustering already exist, but results show that there is room for improvement. This paper introduces a new description-centric algorithm for clustering web search results called IFCWR. IFCWR first selects a maximum estimated number of clusters using Forgy's strategy, then iteratively merges clusters until the results can no longer be improved. Each merge operation involves running Fuzzy C-Means to cluster the web search results and calculating the Bayesian Information Criterion to automatically evaluate the best solution and number of clusters. IFCWR was compared against other established web document clustering algorithms, among them Suffix Tree Clustering and Lingo. The comparison was carried out on the AMBIENT and MORESQUE datasets using precision, recall, F-measure, SSL_k and other metrics. Results show a considerable improvement in clustering quality and performance.
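The loop described in the abstract (Forgy initialization, Fuzzy C-Means, BIC scoring, iterative merging) can be illustrated with a minimal sketch. This is not the authors' IFCWR implementation. The abstract does not give the merge rule, the BIC formula, or the document representation, so the sketch assumes generic numeric vectors, merges the two closest centers, and uses a common SSE-based BIC form, `n * ln(SSE / n) + k * ln(n)`, where lower is better. All function names are hypothetical.

```python
import numpy as np

def forgy_init(X, k, rng):
    """Forgy's strategy: use k randomly chosen data points as initial centers."""
    idx = rng.choice(len(X), size=k, replace=False)
    return X[idx].copy()

def memberships(X, centers, m):
    """Fuzzy memberships: u_ij proportional to d_ij^(-2/(m-1)), rows sum to 1."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_c_means(X, centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard Fuzzy C-Means starting from the given centers."""
    for _ in range(max_iter):
        W = memberships(X, centers, m) ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(X, centers, m)

def bic(X, centers, U):
    """Assumed SSE-based BIC (lower is better); the paper's exact form may differ."""
    n, k = len(X), len(centers)
    labels = U.argmax(axis=1)  # harden memberships to compute the SSE
    sse = ((X - centers[labels]) ** 2).sum()
    return n * np.log(max(sse, 1e-12) / n) + k * np.log(n)

def iterative_merge(X, k_max, seed=0):
    """Start from k_max clusters and merge while the BIC keeps improving."""
    rng = np.random.default_rng(seed)
    centers, U = fuzzy_c_means(X, forgy_init(X, k_max, rng))
    best_score = bic(X, centers, U)
    while len(centers) > 1:
        # Assumed merge rule: replace the two closest centers by their midpoint.
        d = ((centers[:, None] - centers[None]) ** 2).sum(axis=2)
        np.fill_diagonal(d, np.inf)
        i, j = np.unravel_index(d.argmin(), d.shape)
        rest = np.delete(centers, [i, j], axis=0)
        merged = np.vstack([rest, (centers[i] + centers[j]) / 2])
        # Each merge is followed by a Fuzzy C-Means run and a BIC evaluation.
        new_centers, new_U = fuzzy_c_means(X, merged)
        score = bic(X, new_centers, new_U)
        if score >= best_score:
            break  # no improvement: keep the previous solution
        centers, U, best_score = new_centers, new_U, score
    return centers, U
```

Calling `iterative_merge(X, k_max=6)` on an `(n, d)` array returns the final centers and the `n x k` membership matrix. In a web clustering setting, `X` would typically hold term-weight vectors built from result snippets, which is an assumption here rather than something the abstract states.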
