Conference: IEEE International Parallel and Distributed Processing Symposium Workshops

Workflow-Driven Distributed Machine Learning in CHASE-CI: A Cognitive Hardware and Software Ecosystem Community Infrastructure

Abstract

Advances in data, computing, and networking over the last two decades have shifted many application domains toward using machine learning on big data as part of the scientific process, which requires new capabilities for integrated and distributed hardware and software infrastructure. This paper contributes a workflow-driven approach for dynamic data-driven application development on top of a new kind of networked cyberinfrastructure called CHASE-CI. In particular, we present: 1) the architecture of CHASE-CI, a network of distributed fast GPU appliances for machine learning and storage, managed through Kubernetes on the high-speed (10-100 Gbps) Pacific Research Platform (PRP); 2) a machine learning software containerization approach and the libraries required to turn such a network into a distributed computer for big data analysis; 3) an atmospheric science case study that can only be made scalable with an infrastructure like CHASE-CI; 4) capabilities for virtual cluster management that support data communication and analysis in a dynamically scalable fashion, as well as near-real-time visualization across the network in specialized visualization facilities; and 5) a step-by-step workflow and performance measurement approach that takes advantage of the dynamic architecture of the CHASE-CI network and its container management infrastructure.
