Journal: International Journal of Cloud Computing

An efficient document clustering using hybridised harmony search K-means algorithm with multi-view point


Abstract

Document clustering is a much-needed process in data mining, where large numbers of documents with different methodologies are scattered. Meaningful information can be extracted from a collection of documents by grouping them effectively. Several earlier studies concentrate on clustering real-world documents. In these previous works, document clustering is performed with methods called term weight-based hybridised harmony K-means search (TW HHKM), coverage factor-based hybridised harmony K-means search (CF HHKM), and the concept-based, kernel and weighted feature-based clustering algorithm (CKW HHKM). Clustering is normally done with the K-means algorithm, and the cluster centroids are found optimally with the harmony search algorithm. The problem with these existing methods is poor clustering accuracy: unrelated documents are grouped together. To overcome this problem, a multi-view point HHKM (MP HHKM) approach is introduced, with which clustering can be done accurately. In this work, multi-point analysis is performed based on the similarity measurement. Experiments were conducted on newsgroup and TREC datasets, and the results show that the proposed MP HHKM technique outperforms the existing techniques with better accuracy values.
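
The abstract does not give the details of MP HHKM, so the following is only a minimal sketch of the general idea it describes: K-means assigns documents to clusters while harmony search looks for good centroids. Everything specific here is an assumption made for illustration: the function names, the parameter values (`hms`, `hmcr`, `par`, `bw`, `iters`), the use of plain cosine similarity as the objective, and the single K-means refinement step per candidate. It does not implement the paper's multi-view point similarity measure.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def assign(docs, centroids):
    """Label each document with the index of its most similar centroid."""
    return [max(range(len(centroids)), key=lambda k: cosine(d, centroids[k]))
            for d in docs]

def fitness(docs, centroids):
    """Mean similarity of each document to its best centroid (higher is better)."""
    return sum(max(cosine(d, c) for c in centroids) for d in docs) / len(docs)

def kmeans_step(docs, centroids):
    """One K-means update: reassign documents, then average each cluster."""
    labels = assign(docs, centroids)
    updated = []
    for k, old in enumerate(centroids):
        members = [d for d, lab in zip(docs, labels) if lab == k]
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(list(old))  # empty cluster: keep the old centroid
    return updated

def harmony_kmeans(docs, k, hms=8, hmcr=0.9, par=0.3, bw=0.05, iters=200, seed=0):
    """Hypothetical hybrid: harmony search over centroid sets, each refined
    by one K-means step. Parameter defaults are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(docs[0])
    lo = [min(d[j] for d in docs) for j in range(dim)]
    hi = [max(d[j] for d in docs) for j in range(dim)]
    # Harmony memory: hms candidates, each a list of k centroids drawn from docs.
    memory = [[list(d) for d in rng.sample(docs, k)] for _ in range(hms)]
    scores = [fitness(docs, h) for h in memory]
    for _ in range(iters):
        candidate = []
        for c in range(k):
            centroid = []
            for j in range(dim):
                if rng.random() < hmcr:
                    # Memory consideration: reuse a value from a stored harmony.
                    v = rng.choice(memory)[c][j]
                    if rng.random() < par:
                        # Pitch adjustment: small move scaled to the feature range.
                        v += rng.uniform(-1, 1) * bw * (hi[j] - lo[j])
                else:
                    # Random selection within the observed feature range.
                    v = rng.uniform(lo[j], hi[j])
                centroid.append(min(max(v, lo[j]), hi[j]))
            candidate.append(centroid)
        candidate = kmeans_step(docs, candidate)
        score = fitness(docs, candidate)
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:
            # Replace the worst stored harmony with the better candidate.
            memory[worst], scores[worst] = candidate, score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], assign(docs, memory[best])

# Toy term-weight vectors: two documents per topic.
docs = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]]
centroids, labels = harmony_kmeans(docs, k=2)
```

In this sketch the harmony memory holds whole centroid sets rather than single values, so each improvisation proposes a complete clustering. Swapping `cosine` for another similarity function is the natural place to experiment with a different measure.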
