Journal: 《系统工程学报》 (Journal of Systems Engineering)

A Web Usage Mining Method Based on a Session Clustering Algorithm

         

Abstract

Web usage mining is an important data mining task: it helps characterize user groups so that personalized services can be provided. This paper proposes a Web usage mining algorithm based on user session clustering. First, Web logs are preprocessed with a time-window-based user session identification method. A 3-tuple data structure is designed to represent Web sessions, and on this basis a session dimensionality reduction method based on Web page semantic similarity is proposed, which effectively reduces session length while retaining the user's interests. Second, a new session similarity measure based on both time and frequency is designed. Finally, a two-stage PS-KM session clustering algorithm is proposed: it first performs a global search with the PSO (particle swarm optimization) method and then switches to a local clustering process based on the K-means method. Simulation experiments demonstrate the effectiveness of the algorithm.
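
The time-based session identification step named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract gives neither the window length nor the log format, so the 30-minute gap, the `(user_id, timestamp, url)` record layout, and the function name `split_sessions` are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed inactivity threshold (30 minutes); the abstract does not state
# the value actually used in the paper.
SESSION_GAP_SECONDS = 30 * 60


def split_sessions(records, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp, url) records into per-user sessions.

    A new session starts whenever the time between two consecutive
    requests from the same user exceeds `gap` seconds.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, url in records:
        by_user[user_id].append((timestamp, url))

    sessions = []
    for user_id, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > gap:
                # Gap too large: close the current session, start a new one.
                sessions.append((user_id, current))
                current = []
            current.append(hit)
        sessions.append((user_id, current))
    return sessions


# Hypothetical log records: (user_id, unix_timestamp_seconds, url)
log = [
    ("u1", 0, "/index"),
    ("u1", 600, "/news"),
    ("u1", 4000, "/index"),
    ("u2", 100, "/shop"),
]
sessions = split_sessions(log)
```

Here the third request from `u1` arrives 3400 seconds after the second, which exceeds the assumed gap, so it opens a new session. Each resulting session could then be encoded in whatever session representation is chosen (the paper uses a 3-tuple whose fields are not given in the abstract) before similarity computation and clustering.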
