Conference: Advances in Knowledge Discovery and Data Mining

A Social Spam Detection Framework via Semi-supervised Learning

Abstract

With the growing popularity of social networking websites such as Twitter, Facebook, Sina Weibo and MySpace, spammers on these platforms have become increasingly rampant. Social spammers often create large numbers of compromised or fake accounts to deceive users and lead them to malicious websites that contain illegal, pornographic or dangerous content. Most studies on social spam detection rely on supervised machine learning, which requires large annotated datasets. Unfortunately, labeling large datasets manually is a complex, error-prone and tedious task that can cost considerable human effort and time. In this paper, we propose a novel semi-supervised classification framework for social spam detection that combines co-training with k-medoids. First, we use the k-medoids clustering algorithm to select informative and representative samples for labeling as our initial seed set. Then we exploit the content features and behavior features of users in our co-training classification framework. To illustrate the effectiveness of k-medoids, we compare its performance with a random selection strategy. Finally, we evaluate the proposed detection framework against several classical supervised algorithms.
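The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how seeds are drawn from the clusters or which base classifiers are used, so the sketch assumes that the medoids themselves serve as the seed set, that the content and behavior features form the two co-training views, and that a toy nearest-centroid classifier stands in for the real learners. All names (`k_medoids`, `CentroidClassifier`, `co_train`, `predict`) and the synthetic data are hypothetical.

```python
import numpy as np

def k_medoids(X, k, n_iter=100):
    """Return row indices of k medoids of X (alternating assign/update)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Deterministic farthest-first initialisation (a choice made for this sketch).
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        assign = np.argmin(D[:, medoids], axis=1)
        updated = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(assign == c)
            if members.size:
                # New medoid: the member with the smallest total distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                updated[c] = members[np.argmin(within)]
        if np.array_equal(updated, medoids):
            break
        medoids = updated
    return medoids

class CentroidClassifier:
    """Toy stand-in learner: nearest class centroid for labels 0 (ham) / 1 (spam)."""
    def fit(self, X, y):
        if set(np.unique(y)) != {0, 1}:
            raise ValueError("the labeled set must contain both classes")
        self.centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
        return self

    def decision(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids[None], axis=2)
        return d[:, 0] - d[:, 1]  # positive: closer to the spam centroid

def co_train(X_content, X_behavior, seed_idx, seed_labels, rounds=10, per_round=2):
    """Grow a shared labeled set from both views, then fit one model per view."""
    labels = {int(i): int(y) for i, y in zip(seed_idx, seed_labels)}
    views = (X_content, X_behavior)

    def labeled_arrays():
        idx = np.array(list(labels))
        return idx, np.array([labels[i] for i in idx])

    for _ in range(rounds):
        for V in views:
            pool = np.array([i for i in range(len(V)) if i not in labels])
            if pool.size == 0:
                break
            idx, y = labeled_arrays()
            score = CentroidClassifier().fit(V[idx], y).decision(V[pool])
            # Pseudo-label the examples this view is most confident about.
            for j in np.argsort(-np.abs(score))[:per_round]:
                labels[int(pool[j])] = int(score[j] > 0)
    idx, y = labeled_arrays()
    return [CentroidClassifier().fit(V[idx], y) for V in views]

def predict(models, X_content, X_behavior):
    """Combine both views by summing their decision scores."""
    score = models[0].decision(X_content) + models[1].decision(X_behavior)
    return (score > 0).astype(int)

# Synthetic two-view data: 30 legitimate users (0) and 30 spammers (1).
rng = np.random.default_rng(0)
y_true = np.repeat([0, 1], 30)
X_content = rng.normal(0, 0.5, (60, 3)) + 5 * y_true[:, None]
X_behavior = rng.normal(0, 0.5, (60, 2)) + 5 * y_true[:, None]

# Only the medoids are "manually" labeled; y_true[seeds] plays the annotator.
seeds = k_medoids(np.hstack([X_content, X_behavior]), k=4)
models = co_train(X_content, X_behavior, seeds, y_true[seeds])
accuracy = (predict(models, X_content, X_behavior) == y_true).mean()
print(f"labeled seeds: {len(seeds)}, accuracy: {accuracy:.2f}")
```

Replacing the `k_medoids` call with a uniform random draw of the same size gives the random selection baseline that the abstract compares against.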
