Fast text classification with Naive Bayes method on Apache Spark


Abstract

With the transition to the Internet of Things (IoT), the growing number of online devices and users is increasing the volume of data exponentially. Classifying this growing data, deleting irrelevant data, and extracting meaning from it have become vitally important. Text data can be analyzed in various ways, such as text classification, spam analysis, and personality analysis. In this study, fast text classification was performed with machine learning on Apache Spark using the Naive Bayes method. To provide fast storage and analysis of data, the Spark architecture uses a distributed in-memory data collection instead of the distributed data structure of the Hadoop architecture. The Naive Bayes method was used to analyze comment data from Reddit, an open source social news site. The results are presented in tables and graphs.
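
The abstract does not describe the implementation, so the following is only a minimal single-machine sketch of the general technique it names: a multinomial Naive Bayes text classifier with Laplace smoothing. It does not use Spark and is not the authors' code. The function names `train` and `predict`, the smoothing parameter `alpha`, and the toy training comments are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect the counts a multinomial Naive Bayes model needs.

    `docs` is an iterable of (text, label) pairs.
    """
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # token counts per class
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()   # naive whitespace tokenizer (assumption)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text, alpha=1.0):
    """Return the class with the highest log-posterior for `text`."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        # Laplace-smoothed denominator: class token total + alpha * |vocab|
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # skip tokens never seen during training
            score += math.log((word_counts[label][tok] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy comments, not data from the study.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("spark cluster memory tuning", "tech"),
    ("naive bayes on spark cluster", "tech"),
]
model = train(training)
```

Calling `predict(model, "free money")` scores each class by its log prior plus the smoothed log likelihood of every known token and returns the best-scoring label. On a cluster, the counting step in `train` is the part that would be distributed across workers.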

Bibliographic details

Source
Signal Processing and Communications Applications Conference, 2017, pp. 1-4 (4 pages)
Authors
İskender Ülgen Oğul; Caner Özcan; Özlem Hakdağlı;
Original format: PDF
Keywords
Sparks; Java; Internet of Things; Standards; Text categorization; Art;

Similar literature

1. Improved feature size customized fast correlation-based filter for Naive Bayes text classification [J] . Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology . 2020, no. 3
2. Comparing Results of Sentiment Analysis Using Naive Bayes and Support Vector Machine in Distributed Apache Spark Environment [J] . Tomasz Szandala . Computer Science & Information Technology . 2018, no. 14
3. Integrating associative rule-based classification with Naive Bayes for text classification [J] . Hadi Wael, Al-Radaideh Qasem A., Alhawari Samer . Applied Soft Computing . 2018
4. Fast text classification with Naive Bayes method on Apache Spark [C] . İskender Ülgen Oğul, Caner Özcan, Özlem Hakdağlı . Signal Processing and Communications Applications Conference . 2017
5. Modern Considerations for the Use of Naive Bayes in the Supervised Classification of Genetic Sequence Data [D] . Lakin, Steven M. 2021
6. Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences [O] . Michal Ziemski, Treepop Wisanwanichthan, Nicholas A. Bokulich, 2021
7. Comparing Results of Sentiment Analysis Using Naive Bayes and Support Vector Machine in Distributed Apache Spark Environment [O] . Tomasz Szandala 2018
