An overview and comparison of free Python libraries for data mining and big data analysis

Abstract

The popularity of Python is growing, especially in the field of data science. Consequently, an increasing number of free libraries are available for use. The aim of this review paper is to describe and compare the characteristics of different data mining and big data analysis libraries in Python. There is currently no paper that deals with this subject and describes the pros and cons of all these libraries. Here we consider more than 20 libraries and separate them into six groups: core libraries, data preparation, data visualization, machine learning, deep learning and big data. Besides the functionality of a given library, important factors for comparison are the number of contributors developing and maintaining the library and the size of its community. A bigger community means a better chance of easily finding a solution to a given problem. We currently recommend: pandas for data preparation; Matplotlib, seaborn or Plotly for data visualization; scikit-learn for machine learning; TensorFlow, Keras and PyTorch for deep learning; and Hadoop Streaming and PySpark for big data.

Bibliographic record

Source
International Convention on Information and Communication Technology, Electronics and Microelectronics, 2019, pp. 977-982 (6 pages)
Authors
I. Stančin; A. Jović
Original format: PDF
Keywords
Big Data; data analysis; data mining; data visualisation; learning (artificial intelligence); neural nets; public domain software; Python; software libraries

Similar literature

1. The Python ARM Radar Toolkit (Py-ART), a Library for Working with Weather Radar Data in the Python Programming Language [J]. Jonathan J Helmus, Scott M Collis. Journal of Open Research Software. 2016, No. 1
2. Methodology Of Analysis And Interrelation Of Data About Quality Indexes Of Library Services By Using Data- And Knowledge-mining Techniques [J]. Aristeidis Meletiou. Library Management. 2009, No. 3
3. SWIGLAL: Python and Octave interfaces to the LALSuite gravitational-wave data analysis libraries [J]. Karl Wette. SoftwareX. 2020, No. 2
4. An overview and comparison of free Python libraries for data mining and big data analysis [C]. I. Stancin, A. Jovic. International Convention on Information and Communication Technology, Electronics and Microelectronics. 2019
5. Data mining analysis of digital library database usage patterns as a tool facilitating efficient user navigation [D]. Gibson, Ian Eric. 2001
6. rstoolbox - a Python library for large-scale analysis of computational protein design data and structural bioinformatics [O]. Jaume Bonet, Zander Harteveld, Fabian Sesterhenn. 2019
7. Analysis of interactions between inflammatory and vasoregulatory pathways in chronic heart failure: application of logical analysis of data, a novel data-mining tool [O]. Prohászka Zoltán, Aladzsity István, Cervenak László. 2012
