Conference: International conference of the Italian Association for Artificial Intelligence

Evaluating Industrial and Research Sentiment Analysis Engines on Multiple Sources

Abstract

Sentiment analysis plays a fundamental role in analyzing users' opinions in all kinds of textual sources. Accurately computing the sentiment expressed in huge amounts of textual data is a key task in high market demand, and industrial engines now offer ready-to-use APIs for sentiment analysis-related tasks. However, building sentiment engines that achieve high accuracy on structurally different textual sources (e.g. reviews, tweets, blogs) is not a trivial task. Papers on cross-source evaluation lack a comparison with industrial engines, which are instead specifically designed to deal with multiple sources. In this paper, we compare research and industrial engines in an extensive experimental evaluation, considering the document-level polarity detection task on different textual sources: tweets, app reviews and general product reviews, in both English and Italian. The results help the reader quantify the performance gap between industrial and research sentiment engines when both are tested on heterogeneous textual sources and on different languages (English/Italian). Finally, we present the results of our multi-source solution X2Check. Considering an overall cross-source average F-score over all of the results, X2Check performs 9.1% and 5.1% better than Google CNL on the Italian and English benchmarks, respectively. Compared to the research engines, X2Check shows an F-score that is always higher than that of tools not specifically trained on the test set under evaluation; compared to the best research tools specifically trained on the target source, it is at most 3.4% lower on Italian and 11.6% lower on English benchmarks.
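To make the "cross-source average F-score" concrete, the following is a minimal sketch of one plausible way to compute it. The abstract does not say how the F-score is averaged, so the choices here are assumptions: F1 is macro-averaged over the two polarity classes within each source, and the per-source scores are then averaged with equal weight. The function names, the label strings and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence, Tuple


def f1_for_class(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 score of a single class, computed from true/false positives and false negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("positive", "negative")) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


def cross_source_average(
    per_source: Dict[str, Tuple[Sequence[str], Sequence[str]]]
) -> float:
    """Equal-weight mean of the macro-F1 obtained on each textual source."""
    scores = [macro_f1(gold, pred) for gold, pred in per_source.values()]
    return sum(scores) / len(scores)


# Hypothetical toy data: (gold labels, predicted labels) per source.
toy_results = {
    "tweets": (
        ["positive", "positive", "negative", "negative"],
        ["positive", "negative", "negative", "negative"],
    ),
    "app_reviews": (
        ["positive", "negative"],
        ["positive", "negative"],
    ),
}

print(round(cross_source_average(toy_results), 4))
```

Averaging per source with equal weight keeps a large benchmark from dominating the overall figure; a size-weighted average would be an equally reasonable reading of the abstract.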
