Text Sentiment Analysis Based on IDSSL

Wang Gang; Li Ningning; Yang Shanlin

Abstract

With the growing popularity of social media, a large volume of user-generated text is posted on the Internet. These texts carry users' views, opinions, and attitudes, which are valuable to Internet users and have attracted increasing research attention. Many supervised text sentiment analysis methods have been proposed to exploit this data. In practice, however, sentiment analysis involves a large number of unlabeled samples, and how to learn from many unlabeled samples together with a small number of labeled ones has become one of the pressing problems in the field. This paper therefore proposes an Improved Disagreement-based Semi-Supervised Learning (IDSSL) method for text sentiment analysis, built on the framework of disagreement-based semi-supervised learning. First, a sentiment analysis model based on disagreement-based semi-supervised learning is constructed. A theoretical analysis of disagreement-based semi-supervised learning finds that the multiple-classifier method is better than the original disagreement-based method, and that diversity among the classifiers is the key to the multiple-classifier method. Because the Random Subspace method can produce diverse classifiers in sentiment analysis, the model combines the multiple-classifier method with classifiers generated by Random Subspace; this combination is the IDSSL method.
The IDSSL method consists of three steps: (1) multiple initial classifiers are built with the Random Subspace method; (2) the classifiers are trained on unlabeled instances under the rule of "majority help minority"; and (3) the base classifiers are combined by majority vote. Second, experiments were carried out on classic sentiment analysis datasets, and the established standard measure in sentiment analysis was adopted to evaluate performance. IDSSL is compared with several semi-supervised learning methods: Self-training, Co-training, Tri-training, and Co-forest. Self-training, Co-training, Tri-training, and IDSSL use SVM as the base learner. To reduce the influence of variability in the training set, 10-fold cross-validation was performed five times on the sentiment analysis datasets. The experimental results demonstrate the effectiveness of the proposed method, which obtained better results than the other semi-supervised learning methods compared. The paper also discusses the results of the different semi-supervised learning methods, the influence of the label rate on these methods, and the influence of the add-number on the IDSSL method.
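The three steps listed in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `train_centroids`, `predict`, and `idssl_sketch` are hypothetical; a nearest-centroid learner stands in for the SVM base learner named in the abstract so that the sketch needs only the standard library; and the selection rule in step 2 (rank unlabeled instances by agreement, then pass the majority label only to the classifiers that disagreed) is one plausible reading of "majority help minority". The `add_number` parameter echoes the abstract's "add-number", but its exact meaning in the paper is assumed here to be the number of unlabeled instances used per round.

```python
import random
from collections import Counter

def train_centroids(X, y, feats):
    """Stand-in base learner (not SVM): per-class mean over a feature subset."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def predict(model, feats, row):
    """Return the class whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, model[c]))
    return min(sorted(model), key=dist)

def idssl_sketch(X_lab, y_lab, X_unl, n_clf=5, subspace=2,
                 add_number=2, rounds=3, seed=0):
    rng = random.Random(seed)
    n_feat = len(X_lab[0])
    # Step 1: one random feature subset per classifier (Random Subspace).
    subspaces = [sorted(rng.sample(range(n_feat), subspace)) for _ in range(n_clf)]
    train_sets = [(list(X_lab), list(y_lab)) for _ in range(n_clf)]
    models = [train_centroids(X, y, f) for (X, y), f in zip(train_sets, subspaces)]
    pool = list(X_unl)
    for _ in range(rounds):
        # Step 2 (assumed rule): keep unlabeled rows on which a strict
        # majority agrees while at least one classifier disagrees.
        scored = []
        for idx, row in enumerate(pool):
            votes = [predict(m, f, row) for m, f in zip(models, subspaces)]
            label, agree = Counter(votes).most_common(1)[0]
            if n_clf / 2 < agree < n_clf:
                scored.append((agree, idx, label, votes))
        if not scored:
            break
        scored.sort(key=lambda t: (-t[0], t[1]))  # strongest agreement first
        chosen = scored[:add_number]
        for _, idx, label, votes in chosen:
            for i, vote in enumerate(votes):
                if vote != label:  # only minority classifiers receive the row
                    train_sets[i][0].append(pool[idx])
                    train_sets[i][1].append(label)
        used = {idx for _, idx, _, _ in chosen}
        pool = [row for idx, row in enumerate(pool) if idx not in used]
        models = [train_centroids(X, y, f) for (X, y), f in zip(train_sets, subspaces)]

    # Step 3: the final label is a majority vote over the base classifiers.
    def classify(row):
        votes = [predict(m, f, row) for m, f in zip(models, subspaces)]
        return Counter(votes).most_common(1)[0][0]
    return classify
```

Calling `idssl_sketch(X_lab, y_lab, X_unl)` with lists of numeric feature vectors and string labels returns a function that classifies a single feature vector. With an odd `n_clf` and two classes, the final vote cannot tie.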

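Bibliographic details

Source
《管理工程学报》 (Journal of Industrial Engineering and Engineering Management), 2018, No. 3, pp. 126-133 (8 pages)
Authors
Wang Gang; Li Ningning; Yang Shanlin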
Author affiliations
School of Management, Hefei University of Technology, Hefei, Anhui 230009
Key Laboratory of Process Optimization and Intelligent Decision-Making, Ministry of Education, Hefei, Anhui 230009
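Original format: PDF
Full-text language: Chinese
CLC classification: Information processing
Keywords
Text sentiment analysis; semi-supervised learning; multiple classifiers; Random Subspace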

