Social Media Text Classification under Negative Covariate Shift

Abstract

In a typical social media content analysis task, the user is interested in analyzing posts of a particular topic. Identifying such posts is often formulated as a classification problem. However, this problem is challenging. One key issue is covariate shift. That is, the training data is not fully representative of the test data. We observed that the covariate shift mainly occurs in the negative data because topics discussed in social media are highly diverse and numerous, but the user-labeled negative training data may cover only a small number of topics. This paper proposes a novel technique to solve the problem. The key novelty of the technique is the transformation of document representation from the traditional n-gram feature space to a center-based similarity (CBS) space. In the CBS space, the covariate shift problem is significantly mitigated, which enables us to build much better classifiers. Experiment results show that the proposed approach markedly improves classification.
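
The abstract does not say how the CBS features are computed, so the following is only a minimal sketch of the general idea, under stated assumptions: documents are represented as unigram count vectors, the "center" is the mean vector of the positive training documents, and each document is re-expressed as its cosine similarity to that center. The function names (`unigram_vector`, `positive_center`, `to_cbs`) are hypothetical, and the paper's actual feature construction may differ, for example by using several n-gram representations and several similarity measures.

```python
import math
from collections import Counter


def unigram_vector(text):
    """Bag-of-words count vector for one document (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def positive_center(positive_docs):
    """Mean unigram vector of the positive training documents."""
    totals = Counter()
    for doc in positive_docs:
        totals.update(unigram_vector(doc))
    n = len(positive_docs)
    return {term: count / n for term, count in totals.items()}


def to_cbs(doc, center):
    """Map a document to a one-dimensional similarity-to-center feature."""
    return [cosine(unigram_vector(doc), center)]
```

In this sketch, a negative document from a topic never seen in training shares few or no terms with the positive center, so it lands near zero similarity, just like the negative topics that were seen. That is one way to read the claim that the shift in the negative data matters less in a similarity space than in the n-gram space.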

Bibliographic details

Source
Conference on Empirical Methods in Natural Language Processing | 2015 | pp. 2347-2356 | 10 pages
Authors
Geli Fei; Bing Liu
Original format
PDF

Similar literature

1. Social Media Text Classification by Enhancing Well-Formed Text Trained Model [J]. Phat Jotikabukkana, Virach Sornlertlamvanich, Okumura Manabu. Journal of ICT Research and Applications, 2016, No. 2.

2. Social Media Text Classification by Enhancing Well-Formed Text Trained Model [J]. Phat Jotikabukkana, Virach Sornlertlamvanich, Okumura Manabu. ITB Journal of Information and Communication Technology, 2016, No. 2.

3. Text classification models for the automatic detection of nonmedical prescription medication use from social media [J]. Mohammed Ali Al-Garadi, Yuan-Chi Yang, Haitao Cai. BMC Medical Informatics and Decision Making, 2021, No. 1.

4. Social Media Text Classification under Negative Covariate Shift [C]. Geli Fei, Bing Liu. Conference on Empirical Methods in Natural Language Processing, 2015.

5. Cyberbullying Classification: Analysis of Text in Social Media Memes [D]. Gomez, Christopher. 2020.

6. Text classification models for the automatic detection of nonmedical prescription medication use from social media [O]. Mohammed Ali Al-Garadi, Yuan-Chi Yang, Haitao Cai. 2021.

7. Social Media Text Classification by Enhancing Well-Formed Text Trained Model [O]. Phat Jotikabukkana, Virach Sornlertlamvanich, Okumura Manabu. 2016.
