GreenHDFS: Data-centric and cyber-physical energy management system for Big Data clouds.

Abstract

The explosion in Big Data has led to a rapid increase in the popularity of Big Data analytics. With the increase in the sheer volume of data that needs to be stored and processed, the storage and computing demands of Big Data analytics workloads are growing exponentially, leading to a surge in extremely large-scale Big Data cloud platforms and resulting in burgeoning energy costs and environmental impact.

The sheer size of Big Data lends it significant data movement inertia, and that, coupled with the network bandwidth constraints inherent in the cloud's cost-efficient, scale-out economic paradigm, makes data locality a necessity for high performance in Big Data environments. Instead of sending data to the computations, as has been the norm, computations are sent to the data to take advantage of the higher data-local performance. The state-of-the-art run-time energy management techniques are job-centric in nature and rely on thermal- and energy-aware job placement, job consolidation, or job migration to derive energy cost savings. Unfortunately, the data-locality requirement of the compute model limits the applicability of these techniques, as they are inherently data-placement-agnostic and provide energy savings only at a significant performance cost in the Big Data environment.

GreenHDFS is based on the observation that data needs to be a first-class object in energy management in Big Data environments to allow high data access performance. GreenHDFS takes a novel data-centric, cyber-physical approach to reduce compute (i.e., server) and cooling operating energy costs. On the physical side, GreenHDFS is cognizant that not all servers are alike in the Big Data analytics cloud and is aware of the variations in the thermal profiles of the servers. On the cyber side, GreenHDFS is aware that not all data is alike and knows the differences in the data semantics (i.e., computational job arrival rate, size, popularity, and evolution life spans) of the Big Data placed in the Big Data analytics cloud. Armed with this cyber-physical knowledge, and coupled with its insights, predictive data models, and run-time information, GreenHDFS performs proactive, cyber-physical, thermal- and energy-aware file placement and data-classification-driven scale-down, which implicitly results in thermal- and energy-aware job placement in the Big Data analytics cloud compute model. GreenHDFS's data-centric energy- and thermal-management approach reduces energy costs without any associated performance impact, allows scale-down of a subset of servers in spite of the unique challenges that the Big Data analytics cloud poses to scale-down, and ensures the thermal reliability of the servers in the cluster.

Free cooling, or air- and water-side economization (i.e., the use of outside air or natural water resources to cool the data center), is gaining popularity because it can yield significant savings in cooling energy costs. There is also a drive towards raising the cooling set point of the cooling systems to make them more efficient. If the ambient temperature of the outside air or the cooling set point temperature is high, the inlet temperatures of the servers rise, which reduces their ability to dissipate computational heat and results in an increase in server temperatures. The servers are rated to operate safely only within a certain temperature range, beyond which failure rates increase. GreenHDFS considers the differences in the thermal-reliability-driven load-tolerance upper bound of the servers in its predictive thermal-aware file placement and places file chunks in a manner that ensures that server temperatures do not exceed their upper bound. Thus, by ensuring thermal reliability at all times and by lowering the overall temperature of the servers, GreenHDFS enables data centers to run in the energy-saving economizer mode for longer periods of time and also enables an increase in the cooling set point.

A substantial number of data centers still rely fully on traditional air conditioning. These data centers cannot always be retrofitted with economizer modes or hot- and cold-aisle air containment, as incorporating the economizer and air containment may require space for ductwork and heat exchangers that may not be available in the data center. Existing data centers may also not be favorably located geographically; air-side economization is more viable in geographic locations where ambient air temperatures are low for most of the year and humidity is in the tolerable range. GreenHDFS provides a software-based approach to enhance the cooling efficiency of such traditional data centers, as it lowers the overall temperature in the cluster, makes the thermal profile much more uniform, and reduces hot air recirculation, resulting in lower cooling energy costs. (Abstract shortened by UMI.)

Bibliographic record

  • Author

    Kaushik, Rini.

  • Author affiliation

    University of Illinois at Urbana-Champaign.

  • Degree-granting institution: University of Illinois at Urbana-Champaign.
  • Discipline: Computer science.
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 227 p.
  • Total pages: 227
  • Original format: PDF
  • Language of text: English
