Published in: International Computer Science and Engineering Conference

Identifying Influencers with Ensemble Classification Approach on Twitter

Abstract

Popular tweets often originate from ordinary people who are involved in an event and produce interesting or intriguing content, which is then picked up by so-called influencers, users with a large number of followers. Pinpointing the users who cause a viral burst on social media, among the enormous number of users producing messages every second, is difficult. If such influencers can be identified in real time while trends are happening, some incidents can be prevented or guarded against, and others can even be turned to advantage. Influencers can be identified with data classification techniques. Our research proposes a technique for identifying influencers on the Twitter social media platform using features extracted from six months of tweets collected from November 2016 to March 2017. The proposed method uses two types of features: static and dynamic. The static features include the weekday and time period of the first retweet, together with its author's interaction and relation counts, which consist of retweets of the hashtag, followers, followings, friends, favorites, and statuses. The dynamic features are the changes in the author's interaction and relation counts from the first retweet to a target time. The features were extracted over a period of time determined by the Bursting State and bursting durations of 30, 60, and 120 minutes. Four ensemble classification algorithms were applied: Bagging, Random Forest, Extra Trees, and Voting. The results show that the Bagging algorithm generally yields the best results for a duration of 120 minutes, and that it also yields better predictive results when influencers are identified within separate topic domains.
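
The static and dynamic features described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Snapshot` record, the field names in `COUNT_FIELDS`, the six-hour `time_period` buckets, and the rule of using the latest snapshot at or before the target time are all assumptions, since the abstract lists the counts but gives no schema or extraction procedure.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical field names for the author's interaction and relation counts.
COUNT_FIELDS = ("hashtag_retweets", "followers", "followings",
                "friends", "favorites", "statuses")


@dataclass
class Snapshot:
    """Author counts observed at one moment (assumed record layout)."""
    taken_at: datetime
    counts: dict[str, int]  # one integer per name in COUNT_FIELDS


def time_period(hour: int) -> str:
    """Bucket an hour of day into four six-hour periods (assumed bucketing)."""
    return ("night", "morning", "afternoon", "evening")[hour // 6]


def extract_features(first: Snapshot, later: list[Snapshot], minutes: int) -> dict:
    """Static features at the first retweet plus count changes `minutes` later."""
    target = first.taken_at + timedelta(minutes=minutes)
    # Assumption: use the latest snapshot at or before the target time,
    # falling back to the first snapshot when nothing newer is available.
    eligible = [s for s in later if first.taken_at <= s.taken_at <= target]
    at_target = max(eligible, key=lambda s: s.taken_at, default=first)

    features = {
        "weekday": first.taken_at.weekday(),  # Monday is 0
        "period": time_period(first.taken_at.hour),
    }
    for name in COUNT_FIELDS:
        features[name] = first.counts[name]                                  # static
        features[f"delta_{name}"] = at_target.counts[name] - first.counts[name]  # dynamic
    return features


# Made-up example data: only the follower count changes over time.
base = dict.fromkeys(COUNT_FIELDS, 0)
start = datetime(2016, 11, 1, 9, 0)
first = Snapshot(start, {**base, "followers": 100})
later = [
    Snapshot(start + timedelta(minutes=m), {**base, "followers": f})
    for m, f in ((20, 150), (50, 400), (110, 900))
]

# One feature row per bursting duration named in the abstract.
rows = {m: extract_features(first, later, m) for m in (30, 60, 120)}
```

Each entry of `rows` is one feature vector for a given bursting duration. Rows like these could then be passed to ensemble classifiers, for example scikit-learn's `BaggingClassifier`, `RandomForestClassifier`, `ExtraTreesClassifier`, and `VotingClassifier`, although the abstract does not say which library the authors used.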
