Journal: Procedia Computer Science

An Efficient Association Rule Based Clustering of XML Documents

Abstract

Mining web data is an emerging research area in data mining. HTML can be used to maintain web data, but it is hard to obtain accurate web mining results from HTML documents. XML documents make it more convenient to find properties in web mining. Association rule based mining discovers temporal associations among XML documents, but on its own it is not sufficient to retrieve the properties of every XML document. Finding the properties of a set of similar documents is a better idea than finding the properties of a single document. Hence, the key contribution of this work is to find meaningful cluster-based associations through association rule based clustering. The paper proposes a hybrid approach that first discovers frequent XML documents by association rule mining and then clusters the XML documents with the classical k-means algorithm. The proposed approach was tested on real data from Wikipedia. A comparative study and result analysis are presented to show the importance of the proposed work.
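
The abstract does not give implementation details, so the following is only a minimal sketch of how a two-stage pipeline of this kind could look, not the authors' method. It assumes each XML document is reduced to its set of element tag names, that "frequent" means tag pairs whose support meets a threshold, and that k-means runs on binary vectors over those pairs. The function names, the `min_support` parameter, and the toy documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import xml.etree.ElementTree as ET


def tag_set(xml_text):
    """Return the set of element tag names found in one XML document."""
    return {el.tag for el in ET.fromstring(xml_text).iter()}


def frequent_pairs(docs, min_support):
    """Tag pairs co-occurring in at least min_support (a fraction) of docs."""
    counts = Counter()
    for tags in docs:
        counts.update(combinations(sorted(tags), 2))
    n = len(docs)
    return sorted(p for p, c in counts.items() if c / n >= min_support)


def encode(docs, pairs):
    """One binary vector per document: 1.0 where it contains both tags of a pair."""
    return [[1.0 if set(p) <= tags else 0.0 for p in pairs] for tags in docs]


def kmeans(vectors, k, iterations=20):
    """Plain k-means (Lloyd's algorithm), seeded with the first k distinct vectors."""
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(v, centroids[j])),
            )
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy input (hypothetical, not the Wikipedia data used in the paper).
docs_xml = [
    "<article><title/><author/><body/></article>",
    "<article><title/><author/><body/></article>",
    "<recipe><name/><ingredient/><step/></recipe>",
    "<recipe><name/><ingredient/><step/></recipe>",
]
docs = [tag_set(x) for x in docs_xml]
pairs = frequent_pairs(docs, min_support=0.5)
labels = kmeans(encode(docs, pairs), k=2)
print(labels)
```

On this toy input the two article-like documents land in one cluster and the two recipe-like documents in the other. A real pipeline would need a richer document representation (for example tag paths or content terms) and a more careful choice of k and initial centroids.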
