Conference: International Conference on Big Data and Artificial Intelligence

XData: A General-purpose Unified Processing System for Data Analysis and Machine Learning

Abstract

With the rapid development and widespread application of data science, users need to mine value from large amounts of data. Many new data processing systems have been proposed and put into practice. Each system usually targets a specialized domain, but in real data analysis scenarios it is hard to separate the different techniques, and several kinds of processing often need to be unified. This paper proposes XData, a new general-purpose unified processing system for data analysis and machine learning. It is designed to support a variety of operators in a uniform statement, such as data retrieval, data aggregation, and machine learning modeling. With its pipeline mechanism and data processing language (DPL), developers can easily combine multiple processing steps in an SQL-like form to implement end-to-end applications. XData also takes advantage of the intermediate results in the pipeline and automatically optimizes performance at the logical plan, physical topology, and task executor levels. It has been put into use in practical engineering, and the evaluations show that it can improve performance compared with traditional processing techniques.
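The pipeline idea described in the abstract can be illustrated with a small sketch. This is not XData's DPL or API, neither of which is shown on this page. The `Pipeline` class, the operator helpers (`retrieve`, `aggregate_mean`, `fit_line`), and the sample rows are all hypothetical names and data, chosen only to show retrieval, aggregation, and a simple model chained in one statement, with each stage's intermediate result kept for reuse.

```python
from collections import defaultdict
from statistics import mean


class Pipeline:
    """Hypothetical pipeline: chains operators and keeps each stage's output."""

    def __init__(self):
        self.stages = []        # ordered list of (stage name, operator function)
        self.intermediate = {}  # stage name -> output produced by that stage

    def pipe(self, name, fn):
        self.stages.append((name, fn))
        return self  # returning self allows the stages to be chained

    def run(self, data):
        for name, fn in self.stages:
            data = fn(data)
            self.intermediate[name] = data  # keep the intermediate result
        return data


def retrieve(min_x):
    """Data retrieval: keep rows whose x is at least min_x."""
    return lambda rows: [r for r in rows if r["x"] >= min_x]


def aggregate_mean(key, value):
    """Data aggregation: mean of `value` per distinct `key`, sorted by key."""
    def op(rows):
        groups = defaultdict(list)
        for r in rows:
            groups[r[key]].append(r[value])
        return sorted((k, mean(v)) for k, v in groups.items())
    return op


def fit_line(points):
    """Modeling: ordinary least squares fit of y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in points) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Made-up sample data for illustration only.
rows = [
    {"x": 0, "y": 5.0},
    {"x": 1, "y": 1.0},
    {"x": 1, "y": 3.0},
    {"x": 2, "y": 4.0},
    {"x": 3, "y": 5.0},
    {"x": 3, "y": 7.0},
]

pipeline = (
    Pipeline()
    .pipe("retrieve", retrieve(1))
    .pipe("aggregate", aggregate_mean("x", "y"))
    .pipe("model", fit_line)
)
slope, intercept = pipeline.run(rows)
```

After the run, `pipeline.intermediate["aggregate"]` still holds the aggregated points, so a later stage or a second model could reuse them without repeating retrieval. A system like the one the abstract describes would presumably exploit such intermediate results for optimization, but how XData does so is not stated here.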
