Journal: Open Access Library Journal

Research on Chinese Text Feature Extraction and Sentiment Analysis Based on Combination Network


Abstract

The complexity of the Chinese language poses a major challenge for sentiment analysis. Traditional manual feature selection tends to produce inaccurate word segmentation and, with it, inaccurate semantics. High-quality preprocessing is therefore important for the subsequent learning of the network model. To extract the key features of sentences effectively, retaining feature words while removing irrelevant noise and reducing vector dimensionality, a feature-engineering module that combines a sentiment lexicon with incremental Word2vec training is proposed. First, the data set is cleaned and sentences are segmented with Jieba after loading a custom sentiment lexicon. Second, stop words are removed and the remaining tokens are used to train a word vector model with the Skip-gram algorithm. Third, the model is incrementally trained on a large corpus to obtain more accurate word vectors. Finally, the word vectors are fed through an embedding layer into a neural network model, which learns the features and classifies the text. Comparison experiments across multiple models show that the combined model (CNN-BiLSTM-Attention) achieves better classification performance and is better suited to practical application.
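The first two preprocessing steps can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: forward maximum matching stands in for Jieba's segmenter, and the lexicon, base vocabulary, stop-word list, and function names are all hypothetical. The sketch only shows why loading a sentiment lexicon before segmentation keeps a sentiment phrase intact, and what (center, context) pairs a Skip-gram model would then be trained on.

```python
from typing import List, Set, Tuple

# Hypothetical word lists, for illustration only (not taken from the paper).
SENTIMENT_LEXICON: Set[str] = {"不好看", "好看", "非常", "喜欢"}
BASE_VOCAB: Set[str] = {"这", "部", "电影", "的", "我"}
STOP_WORDS: Set[str] = {"的", "这", "部"}


def segment(sentence: str, vocab: Set[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest vocab entry.

    This is a simple stand-in for a real segmenter such as Jieba.
    """
    max_len = max(map(len, vocab))
    tokens: List[str] = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character when nothing in the vocab matches.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def remove_stop_words(tokens: List[str], stop_words: Set[str]) -> List[str]:
    """Drop tokens that carry no sentiment information."""
    return [t for t in tokens if t not in stop_words]


def skip_gram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build the (center, context) pairs that Skip-gram training predicts."""
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


# Segment with the sentiment lexicon merged into the vocabulary,
# then filter stop words and generate training pairs.
tokens = segment("这部电影非常不好看", BASE_VOCAB | SENTIMENT_LEXICON)
kept = remove_stop_words(tokens, STOP_WORDS)
pairs = skip_gram_pairs(kept, window=1)
```

With the lexicon merged in, the negative phrase 不好看 survives as a single token instead of being split into characters, which is the effect the custom lexicon is meant to have. In practice the word vectors themselves would be trained with a Word2vec library rather than from hand-built pairs.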
