Journal: International Journal of Information System Modeling and Design

Clustering Model for Microblogging Sites using Dimension Reduction Techniques

Abstract

In recent times, microblogging sites such as Twitter have become popular platforms for exchanging information. A reasonably active Twitter user can easily receive hundreds of microblogs (tweets) in their timeline every day. Moreover, many of these tweets contain essentially the same information because of retweeting and re-posting. This volume of repetitive data can cause information overload, since no user can effectively process so much data, and methods for managing the overload are therefore needed. One effective method is to cluster semantically similar tweets into groups, so that a user needs to see only a few tweets from each group. In this work, various graph clustering approaches based on dimension reduction are proposed for clustering microblogs. Through experiments on several microblog datasets, the authors demonstrate that the proposed techniques perform better than several classical text clustering algorithms.
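
The abstract does not specify which dimension reduction or graph clustering algorithms the authors use, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a bag-of-words representation, a truncated SVD for dimension reduction, a cosine-similarity graph with a fixed threshold, and connected components as clusters. The function name `cluster_tweets` and the parameters `k`, `threshold`, and `stopwords` are hypothetical.

```python
import numpy as np


def cluster_tweets(tweets, k=2, threshold=0.5, stopwords=("rt",)):
    """Group tweets by (1) reducing a term-count matrix with truncated SVD,
    (2) linking tweets whose reduced vectors have cosine similarity >= threshold,
    (3) taking connected components of that graph as clusters.
    Illustrative assumption only; assumes at least one non-stopword token."""
    # Tokenize on whitespace and drop the (assumed) stopwords such as "rt".
    docs = [[w for w in t.lower().split() if w not in stopwords] for t in tweets]
    vocab = sorted({w for d in docs for w in d})
    index = {w: j for j, w in enumerate(vocab)}

    # Tweet-by-term count matrix.
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d:
            X[i, index[w]] += 1.0

    # Dimension reduction: keep the top-k singular directions.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(S))
    Z = U[:, :k] * S[:k]

    # Normalize rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    Z = Z / np.where(norms == 0, 1.0, norms)

    # Similarity graph: an edge wherever cosine similarity passes the threshold.
    adj = (Z @ Z.T) >= threshold

    # Connected components via depth-first search; labels follow first appearance.
    n = len(tweets)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if adj[i, j] and labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```

Calling `cluster_tweets` on a list of tweet strings returns one integer label per tweet; a reader could then show only one or two representative tweets per label. The threshold and the number of retained dimensions would need tuning on real data.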
