Journal of computer sciences

Sentiment Analysis of Arabic Reviews for Saudi Hotels Using Unsupervised Machine Learning


Abstract

Virtual worlds such as social networking sites, blogs and content communities are rapidly becoming one of the most powerful sources of information for news, markets, industries, etc. These virtual worlds can be used for many purposes because they are rich platforms full of feedback, emotions, thoughts and reviews. The main objective of this paper is to cluster Arabic reviews of Saudi hotels into positive and negative clusters for sentiment analysis. We used web scraping to collect Arabic reviews associated only with Saudi hotels from the tourism website TripAdvisor, obtaining 4604 Arabic reviews in total. TF-IDF was then applied to extract relevant features. An unsupervised learning approach was applied, specifically the K-means and hierarchical clustering algorithms with two distance metrics: cosine and Euclidean. Our manually labelled test data show that the K-means algorithm with cosine distance performed well when all of our preprocessing steps were applied. We conclude that the suggested preprocessing steps play a critical role in Arabic language processing and sentiment analysis.
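The pipeline described in the abstract (TF-IDF features followed by K-means with cosine distance) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation. The smoothed IDF formula, the fixed seed indices, the function names and the English toy tokens (standing in for already preprocessed Arabic review tokens) are all assumptions made for illustration. The paper's preprocessing steps and its hierarchical clustering variant are not shown.

```python
import math
from collections import Counter, defaultdict


def tfidf(docs):
    """Return L2-normalised TF-IDF vectors (term -> weight dicts).

    docs is a list of token lists. The smoothed IDF used here,
    ln((1 + n) / (1 + df)) + 1, is an assumption, not the paper's formula.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def kmeans_cosine(vectors, seeds, iters=20):
    """K-means that assigns each vector to its most cosine-similar centroid.

    seeds lists the indices of the vectors used as initial centroids
    (fixed here so the toy run is deterministic).
    """
    centroids = [dict(vectors[i]) for i in seeds]
    labels = None
    for _ in range(iters):
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if not members:  # keep the old centroid if a cluster empties
                continue
            total = defaultdict(float)
            for v in members:
                for t, w in v.items():
                    total[t] += w
            centroids[k] = {t: w / len(members) for t, w in total.items()}
    return labels


# Hypothetical toy reviews (English stand-ins for preprocessed Arabic tokens).
docs = [
    ["clean", "room", "great", "staff"],
    ["great", "location", "clean"],
    ["dirty", "room", "rude", "staff"],
    ["rude", "dirty", "noisy"],
]
labels = kmeans_cosine(tfidf(docs), seeds=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

With two clusters, the cluster indices carry no polarity by themselves. As the abstract indicates, a manually labelled test set is what lets each cluster be mapped to "positive" or "negative" and evaluated.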
