Foreign-language journal: Journal of Systems Engineering and Electronics

Tag clustering algorithm LMMSK: Improved K-means algorithm based on latent semantic analysis

Abstract

With the wide application of Web 2.0 and social software, tag-related studies and applications are becoming increasingly common. Because users tag resources in arbitrary and personalized ways, tag research continues to run into data-space and semantic obstacles. First, the traditional K-means clustering algorithm is improved into the MMSK-means clustering algorithm by using min-max similarity (MMS) to establish the initial centroids, and the superiority of the improved algorithm is tested. Second, a new tag clustering algorithm, LMMSK, is proposed by combining MMSK-means with latent semantic analysis (LSA). Finally, three tag clustering algorithms, namely MMSK-means, tag clustering based on LSA (the LSA-based algorithm), and LMMSK, are run in Matlab on a real tag-resource dataset collected from the Delicious social bookmarking system between 2004 and 2009. Of the three, LMMSK produces the most effective and most accurate clustering results. It therefore offers a better tag clustering algorithm for wider application of social tags in personalized search, topic identification or knowledge community discovery. In addition, to make clustering results easier to compare, the clustering corresponding results matrix (CCR matrix) is proposed, which is expected to be an effective tool for capturing the evolution of social tagging systems.
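
The pipeline the abstract describes (LSA projection, similarity-based seeding, then K-means) can be illustrated with a minimal sketch. The abstract does not define MMS, so the seeding rule below is an assumption: start from the least similar pair of tags, then repeatedly add the tag whose maximum similarity to the seeds already chosen is smallest. The function names, the cosine similarity measure, the toy matrix, and the use of Python with NumPy (the paper reports Matlab) are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np


def lsa_embed(tag_resource, rank):
    """Project tag rows into a rank-dimensional latent space via truncated SVD."""
    u, s, _ = np.linalg.svd(tag_resource, full_matrices=False)
    return u[:, :rank] * s[:rank]


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T


def min_max_seeds(sim, k):
    """Assumed MMS rule: pick k seed indices that are mutually dissimilar."""
    masked = sim.copy()
    np.fill_diagonal(masked, np.inf)  # a point is never paired with itself
    i, j = np.unravel_index(np.argmin(masked), masked.shape)
    seeds = [int(i), int(j)]
    while len(seeds) < k:
        max_sim = sim[:, seeds].max(axis=1)  # closeness to the nearest seed
        max_sim[seeds] = np.inf              # never re-pick an existing seed
        seeds.append(int(np.argmin(max_sim)))
    return seeds[:k]


def kmeans(x, seeds, iters=100):
    """Standard Lloyd iterations started from the rows of x indexed by seeds."""
    centroids = x[seeds].astype(float)
    labels = np.full(len(x), -1)
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centroids)):
            members = x[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return labels, centroids


# Hypothetical toy tag-resource counts: rows are tags, columns are resources.
counts = np.array([
    [5, 4, 0, 0],  # "python"
    [4, 5, 0, 0],  # "programming"
    [0, 0, 6, 5],  # "recipe"
    [0, 1, 5, 6],  # "cooking"
], dtype=float)

latent = lsa_embed(counts, rank=2)
seeds = min_max_seeds(cosine_similarity_matrix(latent), k=2)
labels, _ = kmeans(latent, seeds)
```

On this toy matrix the two programming tags and the two cooking tags end up in separate clusters. Seeding from mutually dissimilar tags is meant to keep the initial centroids from landing in the same group, which is a common failure of random initialization.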
