Conference: International Joint Conference on Neural Networks

A character-based convolutional neural network for language-agnostic Twitter sentiment analysis


Abstract

Most work on tweet sentiment analysis is monolingual, and the models produced by machine learning strategies do not generalize across languages. Cross-language sentiment analysis is usually performed through machine translation, which translates a given source language into a chosen target language. Machine translation is expensive, and the results of these strategies are limited by the quality of the translation. In this paper, we propose a language-agnostic, translation-free method for Twitter sentiment analysis that uses deep convolutional neural networks with character-level embeddings to identify the polarity of tweets that may be written in distinct (or multiple) languages. The proposed method is more accurate than several other deep neural architectures while requiring substantially fewer learnable parameters. The resulting model learns latent features from all languages used during training in a straightforward fashion and requires no translation step. We empirically evaluate the efficiency and effectiveness of the proposed approach on tweet corpora covering four different languages, showing that our approach comfortably outperforms the baselines. Moreover, we visualize the knowledge learned by our method to qualitatively validate its effectiveness for tweet sentiment classification.
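To make the idea of a character-level convolutional classifier concrete, here is a minimal sketch of a forward pass in plain Python. It is not the architecture from the paper, whose layer sizes and input encoding are not given in the abstract. The names `encode_chars` and `CharCNN`, the use of UTF-8 bytes as the "character" vocabulary, and all dimensions are assumptions chosen for illustration. The weights are random and untrained, so the printed scores carry no meaning; the point is only that the same pipeline accepts text in any language without translation.

```python
import math
import random

PAD_ID = 256  # hypothetical padding id, one past the largest byte value


def encode_chars(text, max_len=16):
    """Turn a tweet into UTF-8 byte ids (0-255), truncated or padded to max_len."""
    ids = list(text.encode("utf-8"))[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


class CharCNN:
    """Toy model: embedding -> 1D convolution -> ReLU -> max-over-time -> sigmoid."""

    def __init__(self, vocab=257, emb_dim=4, n_filters=3, width=3, seed=0):
        rng = random.Random(seed)

        def u():
            return rng.uniform(-0.5, 0.5)

        self.width = width
        self.emb_dim = emb_dim
        # One embedding vector per byte id (plus padding).
        self.emb = [[u() for _ in range(emb_dim)] for _ in range(vocab)]
        # Each filter spans `width` consecutive character embeddings.
        self.filters = [
            [[u() for _ in range(emb_dim)] for _ in range(width)]
            for _ in range(n_filters)
        ]
        self.out_w = [u() for _ in range(n_filters)]
        self.out_b = 0.0

    def forward(self, ids):
        x = [self.emb[i] for i in ids]
        pooled = []
        for f in self.filters:
            best = 0.0  # starting at 0 applies ReLU before the max over positions
            for t in range(len(x) - self.width + 1):
                s = sum(
                    f[k][d] * x[t + k][d]
                    for k in range(self.width)
                    for d in range(self.emb_dim)
                )
                best = max(best, s)
            pooled.append(best)
        logit = sum(w * p for w, p in zip(self.out_w, pooled)) + self.out_b
        return 1.0 / (1.0 + math.exp(-logit))  # score in (0, 1)


model = CharCNN()
for tweet in ["I love this!", "Je déteste ça", "¡Qué bueno!"]:
    print(tweet, model.forward(encode_chars(tweet)))
```

Because the input is a sequence of bytes rather than words, no tokenizer or per-language vocabulary is needed, which is one plausible reason a character-level model can have fewer parameters than a word-embedding model. A usable classifier would additionally need training on labeled tweets.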
