Annual International Symposium on Computer Architecture

DjiNN and Tonic: DNN as a service and its implications for future warehouse scale computers



Abstract

As applications such as Apple Siri, Google Now, Microsoft Cortana, and Amazon Echo continue to gain traction, web-service companies are adopting large deep neural networks (DNN) for machine learning challenges such as image processing, speech recognition, and natural language processing, among others. A number of open questions arise as to the design of a server platform specialized for DNN and how modern warehouse scale computers (WSCs) should be outfitted to provide DNN as a service for these applications. In this paper, we present DjiNN, an open infrastructure for DNN as a service in WSCs, and Tonic Suite, a suite of 7 end-to-end applications that span image, speech, and language processing. We use DjiNN to design a high-throughput DNN system based on massive GPU server designs and provide insights into the varying characteristics across applications. After studying the throughput, bandwidth, and power properties of DjiNN and Tonic Suite, we investigate several design points for future WSC architectures. We investigate the total cost of ownership implications of a WSC with a disaggregated GPU pool versus a WSC composed of homogeneous integrated GPU servers. We improve DNN throughput by over 120× for all but one application (40× for Facial Recognition) on an NVIDIA K40 GPU. On a GPU server composed of 8 NVIDIA K40s, we achieve near-linear scaling (around 1000× throughput improvement) for 3 of the 7 applications. Through our analysis, we also find that GPU-enabled WSCs improve total cost of ownership over CPU-only designs by 4–20×, depending on the composition of the workload.
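As a rough consistency check on these figures (an illustrative calculation, not a separate result reported in the abstract): with roughly a 120× per-GPU throughput improvement on a single K40, ideal linear scaling across an 8-GPU server would give

8 GPUs × ~120× per GPU ≈ 960×,

which lines up with the reported "around 1000×" improvement and supports the near-linear-scaling claim for 3 of the 7 applications.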
