Conference: Joint International Conference on Semantic Technology

Semi-supervised Stance-Topic Model for Stance Classification on Social Media

Abstract

Stance detection aims to automatically determine from a text whether its author is in favor of, against, or neutral towards an issue. Social media, such as Sina Weibo, reflect the general public's stances towards different issues. Detecting and summarizing stances towards specific issues from social media is an important and challenging task. Although stance detection on social media has been studied before, most previous work is based on supervised learning and may not work well because of its heavy dependence on training data. Weakly supervised methods use heuristic rules to select posts with specific stances as training data, but the selected posts often concentrate on a few subtopics of the issue, so these methods can only train a biased stance classifier. To better detect stances toward specific issues, we detect stances using a small amount of labeled training data and a large amount of unlabeled data. In this paper, we propose a semi-supervised topic model, the Semi-Supervised Stance Topic Model (SSTM), that models the stances and topics of posts on social media. To integrate the supervised information into our model, we combine a discriminative maximum entropy (Max-Ent) component with the generative component. The Max-Ent component leverages hand-crafted features from labeled data to separate different stances. Since posts on social media are short texts, we also incorporate the structural information of the posts, i.e., gender, location, and time information, to aggregate posts and thereby alleviate their context sparsity. The model has been evaluated on selected Sina Weibo posts that discuss "the verbal battle of Han Han and Fang Zhouzi", classifying the stance of each post. Preliminary experiments have shown promising results for SSTM. We also analyze the common difficulties in stance detection on social media. Finally, we visualize the subtopics of the given issue generated by SSTM.
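
As a rough illustration of the aggregation idea mentioned in the abstract, the sketch below groups short posts that share the same gender, location, and time values into longer pseudo-documents. It is not the authors' implementation: the function name `aggregate_posts`, the field names (`text`, `gender`, `location`, `day`), and the toy posts are all assumptions made for this example, and the abstract does not say how SSTM actually uses the aggregated posts.

```python
from collections import defaultdict


def aggregate_posts(posts, keys=("gender", "location", "day")):
    """Merge the tokens of posts that share the same metadata values.

    Hypothetical helper: each post is a dict with a "text" field plus
    the metadata fields named in `keys`. Returns a dict that maps a
    metadata tuple to the combined token list (a pseudo-document).
    """
    groups = defaultdict(list)
    for post in posts:
        group_key = tuple(post.get(k) for k in keys)
        # Whitespace tokenization is a simplification; Chinese text
        # would need a proper word segmenter.
        groups[group_key].extend(post["text"].split())
    return dict(groups)


# Toy posts, invented purely for illustration.
posts = [
    {"text": "support the writer", "gender": "f", "location": "Beijing", "day": "d1"},
    {"text": "evidence looks solid", "gender": "f", "location": "Beijing", "day": "d1"},
    {"text": "not convinced", "gender": "m", "location": "Shanghai", "day": "d2"},
]

pseudo_docs = aggregate_posts(posts)
for key, tokens in pseudo_docs.items():
    print(key, tokens)
```

Grouping on all three fields at once is only one possible choice; aggregating on each field separately would give larger but less specific pseudo-documents.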
