International Conference on Fuzzy Systems and Knowledge Discovery

Study on frequent term set-based hierarchical clustering algorithm

Abstract

In this paper, we present Frequent Term Set-based Clustering (FTSC), a text-clustering algorithm that uses frequent term sets to cluster texts. The algorithm efficiently reduces the dimensionality of the text data, which improves both the accuracy and the running speed of clustering. However, the clusters produced by FTSC cannot reflect the overlap between text classes. Building on FTSC, we therefore give an improved algorithm, Frequent Term Set-based Hierarchical Clustering (FTSHC). FTSHC determines the overlap between text classes from the overlap of their frequent term sets, and it uses those frequent term sets to provide an understandable description of the discovered clusters. Experimental results show that the FTSC and FTSHC algorithms are more efficient than the K-Means algorithm in clustering performance.
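The abstract does not spell out the steps of FTSC or FTSHC, so the following is only a minimal Python sketch of the underlying idea, not the authors' algorithm. It assumes that a term set is "frequent" when at least `min_support` documents contain all of its terms, that the documents covered by a frequent term set form a candidate cluster described by that term set, and that overlap between two candidate clusters is measured with the Jaccard ratio of their covers. The function names, the whitespace tokenizer, the brute-force enumeration (a real miner would use something like Apriori), and the toy documents are all illustrative assumptions.

```python
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to its cover (indices of docs containing all its terms).

    Assumption: brute-force enumeration up to max_size terms, fine for toy data only.
    """
    doc_terms = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_terms))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(vocab, size):
            cover = {i for i, terms in enumerate(doc_terms) if terms.issuperset(combo)}
            if len(cover) >= min_support:
                # The term set doubles as a readable description of the cluster.
                result[frozenset(combo)] = cover
    return result


def cluster_overlap(cover_a, cover_b):
    """Jaccard overlap of two covers (an assumed overlap measure, not from the paper)."""
    union = cover_a | cover_b
    return len(cover_a & cover_b) / len(union) if union else 0.0


# Hypothetical toy corpus.
docs = [
    "fuzzy clustering text",
    "text clustering algorithm",
    "fuzzy logic control",
    "fuzzy control system",
]
term_sets = frequent_term_sets(docs, min_support=2)
for terms, cover in sorted(term_sets.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(terms), sorted(cover))

fuzzy_cover = term_sets[frozenset({"fuzzy"})]
clustering_cover = term_sets[frozenset({"clustering"})]
print(cluster_overlap(fuzzy_cover, clustering_cover))
```

In this toy corpus, document 0 falls in both the "fuzzy" and the "clustering" candidate clusters, which is the kind of class overlap the abstract says FTSHC is meant to expose. Clustering over a handful of frequent term sets rather than the full vocabulary is also where the dimensionality reduction comes from.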
