

Model-driven coordinated management of data centers
Authors:Tridib Mukherjee  Ayan Banerjee  Georgios Varsamopoulos  Sandeep K.S. Gupta
Affiliation:1. University of Sannio, DING – Department of Engineering, Piazza Roma 21, 82100 Benevento, Italy;2. Second University of Naples, Built Environment Control Laboratory, via San Lorenzo, 81031 Aversa, Italy;1. TCS Innovation Labs-TRDDC, Tata Consultancy Services Limited, Pune, India;2. TCS Innovation Labs-Chennai, Tata Consultancy Services Limited, Chennai, India;3. Department of Computer Science and Engineering, The Pennsylvania State University, PA, USA;1. DACYA, Universidad Complutense de Madrid, Madrid 28040, Spain;2. CCS – Center for Computational Simulation, Campus de Montegancedo UPM, 28660, Spain;3. LSI – Integrated Systems Lab., Universidad Politécnica de Madrid, Madrid 28040, Spain
Abstract: Management of computing infrastructure in data centers is an important and challenging problem that needs to: (i) ensure availability of services conforming to the Service Level Agreements (SLAs); and (ii) reduce the Power Usage Effectiveness (PUE), i.e., the ratio of total power (up to half of which is attributed to data center cooling) to the computing power used to service the workloads. The cooling energy consumption can be reduced by allowing higher-than-usual thermostat set temperatures while keeping the ambient temperature in the data center room within the manufacturer-specified server redline temperatures required for reliable operation. This paper proposes: (i) a Coordinated Job, Power, and Cooling Management (JPCM) policy, which performs: (a) job management so as to allow for an increase in the thermostat setting of the cooling unit while meeting the SLA requirements, (b) power management to reduce the produced thermal load, and (c) cooling management to dynamically adjust the thermostat setting; and (ii) a Model-driven coordinated Management Architecture (MMA), which uses a state-based model to dynamically decide the correct management policy to handle events, such as new workload arrival or failure of a cooling unit, that can trigger an increase in the ambient temperature. Each event is associated with a time window, referred to as the window-of-opportunity, after which the temperature at the inlet of one or more servers can go beyond the redline temperature if proper management policies are not enforced. This window-of-opportunity decreases monotonically as the incoming workload increases. The selection of a management policy depends on its potential energy benefit and on whether its actuation delay fits within the window-of-opportunity. Simulations based on actual job traces from the ASU HPC data center show that the JPCM can achieve up to 18% energy savings over separate power or job management policies. However, the long delay in reaching a stable ambient temperature (when cooling is managed through dynamic thermostat setting) can cause the server redline temperatures to be violated. A management decision chart is developed as part of MMA to autonomically employ the management policy with maximum energy savings without violating the window-of-opportunity, and hence the redline temperatures. Further, a prototype of the JPCM is developed by configuring the widely used Moab cluster manager to dynamically change the server priorities for job assignment.
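The selection rule described in the abstract (pick the policy with the largest energy savings whose actuation delay still fits inside the window-of-opportunity) can be illustrated with a minimal Python sketch. This is not the paper's decision chart or its state-based model: the `Policy` class, the `select_policy` and `pue` functions, and every numeric value below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Policy:
    """A candidate management policy (hypothetical structure)."""
    name: str
    energy_savings: float     # estimated fractional savings, illustrative only
    actuation_delay_s: float  # seconds until the policy takes effect


def pue(total_power_kw: float, computing_power_kw: float) -> float:
    """PUE as defined in the abstract: total power over computing power."""
    if computing_power_kw <= 0:
        raise ValueError("computing power must be positive")
    return total_power_kw / computing_power_kw


def select_policy(policies: Sequence[Policy],
                  window_s: float) -> Optional[Policy]:
    """Return the feasible policy with the highest energy savings.

    A policy is feasible when its actuation delay does not exceed the
    window-of-opportunity. Returns None when no policy is feasible.
    """
    feasible = [p for p in policies if p.actuation_delay_s <= window_s]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p.energy_savings)


# Illustrative values only; they are NOT taken from the paper.
CANDIDATES = [
    Policy("job management", energy_savings=0.05, actuation_delay_s=10.0),
    Policy("power management", energy_savings=0.08, actuation_delay_s=60.0),
    Policy("coordinated (JPCM-like)", energy_savings=0.18,
           actuation_delay_s=300.0),
]

if __name__ == "__main__":
    for window in (5.0, 30.0, 120.0, 600.0):
        chosen = select_policy(CANDIDATES, window)
        print(window, chosen.name if chosen else "no feasible policy")
```

With a short window, the sketch falls back to faster but less beneficial policies; with a long window, it picks the slowest, highest-savings option. A real controller would also have to estimate the window itself from the thermal state and incoming workload, which this sketch takes as a given input.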
This article is indexed in ScienceDirect and other databases.
