Review Spam Detection Using Word Embeddings and Deep Neural Networks

Abstract

Review spam (fake review) detection is increasingly important taking into consideration the rapid growth of internet purchases. Therefore, sophisticated spam filters must be designed to tackle the problem. Traditional machine learning algorithms use review content and other features to detect review spam. However, as demonstrated in related studies, the linguistic context of words may be of particular importance for text categorization. In order to enhance the performance of review spam detection, we propose a novel content-based approach that considers both bag-of-words and word context. More precisely, our approach utilizes n-grams and the skip-gram word embedding method to build a vector model. As a result, high-dimensional feature representation is generated. To handle the representation and classify the review spam accurately, a deep feed-forward neural network is used in the second step. To verify our approach, we use two hotel review datasets, including positive and negative reviews. We show that the proposed detection system outperforms other popular algorithms for review spam detection in terms of accuracy and area under ROC. Importantly, the system provides balanced performance on both classes, legitimate and spam, irrespective of review polarity.
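The abstract describes the feature-building step only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It counts n-grams, lists the (target, context) pairs a skip-gram model would train on, and concatenates n-gram counts with a mean word vector. The function names, the toy vocabulary, the toy embeddings, and the choice to average word vectors are all assumptions made for illustration. The deep feed-forward classifier that would consume the resulting vector is not shown.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens, max_n=2):
    """Bag of n-grams: count every 1..max_n-gram in one review."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def skipgram_pairs(tokens, window=2):
    """(target, context) pairs that a skip-gram model would be trained on."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((target, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def review_vector(tokens, vocab, embeddings, dim):
    """Concatenate n-gram counts over a fixed vocabulary with the mean
    embedding of the known words (averaging is an assumption here)."""
    counts = ngram_counts(tokens)
    bow = [float(counts[g]) for g in vocab]
    known = [embeddings[t] for t in tokens if t in embeddings]
    if known:
        mean = [sum(v[k] for v in known) / len(known) for k in range(dim)]
    else:
        mean = [0.0] * dim
    return bow + mean


# Toy data, hypothetical and for illustration only.
tokens = "the room was clean".split()
vocab = [("room",), ("clean",), ("was", "clean")]
toy_embeddings = {"room": [1.0, 0.0], "clean": [0.0, 1.0]}

vec = review_vector(tokens, vocab, toy_embeddings, dim=2)
pairs = skipgram_pairs(["a", "b", "c"], window=1)
```

In practice the embeddings would come from a trained skip-gram (Word2vec) model rather than a hand-written dictionary, and the vocabulary would cover far more n-grams, which is what makes the representation high-dimensional.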

Bibliographic record

Source: IFIP WG 12.5 International workshops on artificial intelligence applications and innovations; Mining humanistic data workshop; Workshop on 5G-putting intelligence to the network edge; Workshop on emerging trends in AI, 2019, pp. 340-350 (11 pages)
Conference location: Hersonissos (GR)
Authors: Aliaksandr Barushka; Petr Hajek
Author affiliation: Institute of System Engineering and Informatics, Faculty of Economics and Administration, University of Pardubice, Studentska 84, 532 10 Pardubice, Czech Republic

Original format: PDF
Keywords: Review spam; Skip-gram; Word2vec; Word embedding; Neural network

Similar literature

1. Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks [J]. Onan Aytug. Concurrency and computation: practice and experience. 2021, issue 23
2. Spam detection on social networks using cost-sensitive feature selection and ensemble-based regularized deep neural networks [J]. Neural computing & applications. 2020, issue 9
3. Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification [J]. Physica, A. Statistical mechanics and its applications. 2020
4. Review Spam Detection Using Word Embeddings and Deep Neural Networks [C]. Aliaksandr Barushka, Petr Hajek. IFIP WG 12.5 International workshops on artificial intelligence applications and innovations. 2019
5. Spam Review Detection Using Self-Organizing Maps and Convolutional Neural Networks [D]. Neisari, Ashraf. 2020
6. Identifying antimicrobial peptides using word embedding with deep recurrent neural networks [O]. Md-Nafiz Hamid, Iddo Friedberg
7. Spam review detection using self-organizing maps and convolutional neural networks [O]. Ashraf Neisari, Luis Rueda, Sherif Saad. 2021
