A total of 20 similar documents were retrieved (search time: 15 ms)
1.
Complex web applications are usually served by multi-tier web clusters. With the growing cost of energy, the importance of reducing power consumption in server systems is now well known and has become a major research topic. However, most existing research has focused solely on homogeneous clusters. This paper addresses the challenge of power management in heterogeneous multi-tier web clusters. We apply Generalized Benders Decomposition (GBD) to decompose the global optimization problem into small sub-problems; the algorithm reaches the optimal solution iteratively. Evaluation results show that our algorithm conserves more energy than previous work.
2.
This work uses the lattice Boltzmann method (LBM) on graphics processing units (GPUs) to compute incompressible fluid flow in a periodic unit cell of an agglomerate composed of a stationary array of cylinders. A direct bounce-back scheme is applied at the fluid-solid interface to enforce the no-slip boundary, and the drag force on each cylinder is obtained directly by accumulating the momentum exchange. Based on the fluid velocity obtained from the LBM, the average drag coefficient for a single cylinder in the agglomerate is computed according to the energy-minimization multi-scale (EMMS) model, and the critical condition under which the agglomerate can be approximated as a uniform suspension is examined. Simulation results for 80 solid fractions at particle Reynolds numbers Re_p between 0 and 10 show that the dense-phase voidage can characterize this critical condition. At a fixed solid fraction, the critical voidage decreases as Re_p increases; at a fixed Re_p, the critical voidage decreases as the solid fraction increases.
3.
Three-dimensional Networks-on-Chip (3D NoCs) have recently been proposed to address the on-chip communication demands of future highly dense 3D multi-core systems. Homogeneous 3D NoC topologies require many Through-Silicon Vias (TSVs), which have a costly and complex manufacturing process. In addition, 3D routers use more memory and are more power-hungry than conventional 2D routers. Alternatively, heterogeneous 3D NoCs combine the area and performance benefits of 2D and 3D static router architectures by using a limited number of TSVs. To improve the performance of heterogeneous 3D NoCs, we propose an adaptive router architecture that balances the traffic in such NoCs. Experimental results show that our proposed architecture improves performance by up to 75% when 2D static routers are replaced with adaptive 2D routers in heterogeneous 3D NoCs, while keeping the maximum clock frequency, power, and energy consumption of the adaptive router at nearly the same level as the static router.
4.
To reduce environmental impact, it is essential to make data centers green by turning off servers and tuning their speeds to the instantaneous offered load, that is, by determining the dynamic configuration of a web server cluster. We model the problem of selecting which servers will be on and finding their speeds as a mixed integer program, and we also show how to combine such solutions with control theory. As a proof of concept, we implemented this dynamic configuration scheme in a web server cluster running Linux, with soft real-time requirements and QoS control, in order to guarantee both energy efficiency and a good user experience. In this paper, we compare the performance of our scheme with other schemes, compare centralized and distributed approaches for QoS control, and compare schemes for choosing server speeds.
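As a rough illustration of the configuration problem described above (not the paper's actual formulation), the sketch below enumerates per-server speed levels and picks the cheapest combination whose total capacity covers the offered load; the server table, capacities, and power numbers are invented.

```python
from itertools import product

# Hypothetical speed levels per server: (capacity in req/s, power in W); level (0, 0) = server off.
SERVERS = [
    [(0, 0), (100, 80), (150, 120)],
    [(0, 0), (120, 90), (180, 140)],
    [(0, 0), (200, 160)],
]

def cheapest_configuration(load):
    """Brute-force stand-in for the mixed integer program: choose one speed level
    per server so that total capacity covers `load` at minimum total power."""
    best = None
    for choice in product(*SERVERS):
        capacity = sum(c for c, _ in choice)
        power = sum(p for _, p in choice)
        if capacity >= load and (best is None or power < best[0]):
            best = (power, choice)
    return best

print(cheapest_configuration(250))   # e.g. serve 250 req/s with the least power
```

A real deployment would hand the integer program to a solver and re-solve it as the load estimate changes.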
5.
Executing heterogeneous workloads with different priorities, resource demands and performance objectives is one of the key operations for today's data centers to increase both resource and energy efficiency. In order to meet the performance objectives of diverse workloads, schedulers rely on evictions, even though evictions waste resources through the lost executions of evicted tasks. It is not straightforward to design priority schedulers that capture the key aspects of workloads and systems while also striking a balance between resource (in)efficiency and application performance. To explore the large design space of such schedulers, we propose a trace-driven cluster management framework that models a comprehensive set of system configurations and general priority-based scheduling policies. In particular, we focus on the impact of task evictions on resource inefficiency and on the task response times of multiple priority classes, driven by a Google production cluster trace. Moreover, we propose a system design as a use case that exploits workload heterogeneity and introduces workload-awareness into the system configuration and task assignment.
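As a toy sketch of the eviction mechanism discussed above (not the paper's framework), the snippet below models a machine with a fixed number of slots on which a higher-priority task can evict the lowest-priority running task; the slot count, priorities, and task ids are illustrative.

```python
class Machine:
    """Toy machine: higher-priority tasks may evict a running lower-priority task;
    the evicted task's work is lost, which is the resource waste being studied."""
    def __init__(self, slots):
        self.slots = slots
        self.running = []                          # list of (priority, task_id)

    def submit(self, priority, task_id):
        if len(self.running) < self.slots:
            self.running.append((priority, task_id))
            return None                            # started without eviction
        victim = min(self.running)                 # lowest-priority running task
        if victim[0] < priority:
            self.running.remove(victim)
            self.running.append((priority, task_id))
            return victim[1]                       # id of the evicted (wasted) task
        return task_id                             # new task must wait or be rejected

m = Machine(slots=2)
for prio, tid in [(1, "a"), (2, "b"), (3, "c"), (1, "d")]:
    print(tid, "->", m.submit(prio, tid))          # "c" evicts "a"; "d" cannot start
```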
6.
The Journal of Supercomputing - This article presents a set of linear regression models to predict the impact of task migration on different objectives, like performance and energy consumption. It...
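A minimal sketch of how such regression models can be fitted, assuming made-up features (memory footprint, dirty-page rate, link bandwidth) and made-up measured migration costs; this is a generic ordinary-least-squares fit, not the article's models.

```python
import numpy as np

# Toy training data: [memory footprint (MB), dirty-page rate, link bandwidth (MB/s)]
X = np.array([[512, 0.10, 100],
              [1024, 0.30, 100],
              [2048, 0.20, 1000],
              [4096, 0.50, 1000]], dtype=float)
y = np.array([5.6, 12.1, 3.4, 9.8])              # observed migration cost (s), invented

A = np.hstack([X, np.ones((X.shape[0], 1))])     # add an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)     # ordinary least squares

new_task = np.array([1536, 0.25, 1000, 1.0])     # features of a candidate migration
print("predicted migration cost:", new_task @ coef)
```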
7.
The performance of a conventional parallel application is often degraded by load-imbalance on heterogeneous clusters. Although it is simple to invoke multiple processes on fast processing elements to alleviate load-imbalance, the optimal process allocation is not obvious. Kishimoto and Ichikawa presented performance models for High-Performance Linpack (HPL), with which sub-optimal configurations of heterogeneous clusters were actually estimated. Their results on HPL are encouraging, but their approach had not been verified with other applications. This study presents several enhancements of Kishimoto's scheme, which are evaluated with four typical scientific applications: computational fluid dynamics (CFD), the finite-element method (FEM), HPL (a linear algebraic system), and the fast Fourier transform (FFT). According to our experiments, our new models (NP-T models) are superior to Kishimoto's models, particularly when the non-negative least squares method is used for parameter extraction. The average errors of the derived models were 0.2% for the CFD benchmark, 2% for the FEM benchmark, 1% for HPL, and 28% for the FFT benchmark. This study also emphasizes the importance of predictability in clusters, listing practical examples derived from our study. Copyright © 2008 John Wiley & Sons, Ltd.
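For the parameter-extraction step mentioned above, non-negative least squares is available in SciPy; the sketch below fits non-negative coefficients of a performance model to made-up benchmark timings (the matrix, timings, and term names are assumptions, not the paper's data).

```python
import numpy as np
from scipy.optimize import nnls

# Rows: benchmark runs; columns: model terms (e.g. compute, memory, network), invented.
A = np.array([[1.0, 0.5, 0.2],
              [2.0, 1.0, 0.4],
              [1.5, 2.0, 0.1],
              [3.0, 1.5, 0.8]])
t = np.array([1.9, 3.8, 3.6, 5.9])   # measured execution times (s), invented

params, residual = nnls(A, t)        # coefficients are constrained to be >= 0
print("non-negative model parameters:", params, "residual norm:", residual)
```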
8.
Loop partitioning on parallel and distributed systems has been a critical problem, and it becomes even more difficult to deal with on the emerging heterogeneous PC cluster environments. In the past, several loop self-scheduling schemes have been proposed for heterogeneous cluster environments. In this paper, we propose a performance-based approach that partitions loop iterations according to the performance ratios of the cluster nodes. To verify the proposed approach, a heterogeneous cluster is built, and three types of application programs are implemented and executed on this testbed. Experimental results show that the proposed approach performs better than traditional schemes.
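A minimal sketch of the performance-ratio idea above (not the paper's exact scheme): iterations are split in proportion to each node's measured performance score, and leftovers go to the fastest nodes; the scores below are invented.

```python
def partition_iterations(n_iters, perf):
    """Split n_iters loop iterations among nodes in proportion to their performance scores."""
    total = float(sum(perf))
    shares = [int(n_iters * p / total) for p in perf]          # integer share per node
    leftover = n_iters - sum(shares)
    for i in sorted(range(len(perf)), key=lambda i: perf[i], reverse=True)[:leftover]:
        shares[i] += 1                                          # fastest nodes absorb the remainder
    chunks, start = [], 0
    for s in shares:                                            # contiguous [start, end) chunks
        chunks.append((start, start + s))
        start += s
    return chunks

print(partition_iterations(1000, [3.0, 1.5, 1.0]))   # fastest node receives the largest chunk
```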
10.
Scalable storage architectures allow digital libraries and archives to add or remove storage devices in order to increase storage capacity and bandwidth or to retire older devices. Past work in this area has mainly focused on statically scaling homogeneous storage devices. However, heterogeneous devices are quickly being adopted for storage scaling since they are usually faster, larger, more widely available, and more cost-effective. We propose BroadScale, an algorithm based on Random Disk Labeling, to dynamically scale heterogeneous storage systems by distributing data objects according to their device weights. Assuming a random placement of objects across a group of heterogeneous storage devices, our optimization objectives when scaling are to ensure a uniform distribution of objects, to redistribute a minimum number of objects, and to maintain fast data access with low computational complexity. We show through experimentation that BroadScale achieves these requirements when scaling heterogeneous storage.
11.
Data distribution and load balancing become increasingly important in large-scale distributed storage systems. This paper focuses on the problem of designing an optimal, self-adaptive strategy for the balanced distribution and reorganization of replicated objects among a dynamically changing set of heterogeneous nodes, and presents a novel decentralized algorithm, Dynamic Interval Mapping, which maps replicated objects to a scalable collection of nodes. It distributes objects to nodes optimally and redistributes a minimum number of objects when new nodes are added or existing nodes are removed, so as to maintain the balanced distribution. It supports weighted allocation and guarantees that replicas of a particular object are not placed on the same node. Its time complexity and storage requirements are superior to those of previous methods.
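As a naive illustration of weighted interval mapping (only the placement idea; the algorithm described above additionally guarantees minimal remapping and replica separation), the sketch below hashes an object id into [0, 1) and looks up the node whose weighted interval contains it; node names and weights are invented.

```python
import hashlib
from bisect import bisect_right

def build_intervals(nodes):
    """nodes: dict of name -> weight.  Returns node names and cumulative interval bounds on [0, 1)."""
    total = float(sum(nodes.values()))
    names, bounds, acc = [], [], 0.0
    for name, weight in nodes.items():
        acc += weight / total
        names.append(name)
        bounds.append(acc)
    return names, bounds

def place(obj_id, names, bounds):
    h = int(hashlib.md5(obj_id.encode()).hexdigest(), 16) / 16.0 ** 32   # hash -> [0, 1)
    return names[min(bisect_right(bounds, h), len(names) - 1)]

names, bounds = build_intervals({"node-a": 2.0, "node-b": 1.0, "node-c": 1.0})
print(place("object-42", names, bounds))   # node-a receives roughly half of all objects
```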
12.
Multi-tenancy promises high utilization of available system resources and helps maintain cost-effective operations for service providers. However, multi-tenant high-performance computing (HPC) infrastructures, like dynamic HPC clouds, bring unique challenges, both in providing performance isolation to the tenants and in achieving efficient load-balancing across the network fabric. Each tenant should experience predictable network performance, unaffected by the workload of other tenants. At the same time, it is equally important that the network links are balanced, avoiding network saturation. Network saturation can lead to unpredictable application performance and a potential loss of profit for the cloud service providers. In this paper, we present two significant extensions to our previously proposed partition-aware fat-tree routing algorithm, pFTree, for InfiniBand-based HPC systems. First, we extend pFTree to incorporate provider-defined partition-wise policies that govern how the nodes in different partitions are allowed to share network resources with each other. Second, we present a weighted version of the pFTree routing algorithm that, besides partitions, also takes node traffic characteristics into account to balance load across the network links more evenly. A comprehensive evaluation comprising both real-world experiments and simulations confirms the correctness and feasibility of the proposed extensions.
13.
This paper addresses high-level synthesis for real-time digital signal processing (DSP) architectures using heterogeneous functional units (FUs). For such special-purpose architecture synthesis, an important problem is how to assign a proper FU type to each operation of a DSP application and generate a schedule in such a way that all requirements are met and the total cost is minimized. We propose a two-phase approach to solve this problem. In the first phase, we solve the heterogeneous assignment problem, i.e., how to assign proper FU types to operations so that the total cost is minimized while the timing constraint is satisfied. In the second phase, based on the assignments obtained in the first phase, we propose a minimum-resource scheduling algorithm that generates a schedule and a feasible configuration using as few resources as possible. We prove that the heterogeneous assignment problem is NP-complete. Efficient algorithms are proposed to find an optimal solution when the given DFG is a simple path or a tree, and three other algorithms are proposed to solve the general problem. The experiments show that our algorithms can effectively reduce the total cost compared with previous work.
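For intuition on the path case mentioned above, here is a small dynamic program: each operation in a chain picks one FU type with a (time, cost) pair, and we minimize total cost under a time budget. It is a pedagogical sketch with invented numbers, not the paper's algorithm.

```python
from math import inf

def min_cost_assignment(ops, deadline):
    """ops[i] is a list of (time, cost) options, one per candidate FU type for operation i.
    dp[t] holds the minimum cost of the operations seen so far when they take total time t."""
    dp = [0.0] + [inf] * deadline
    for options in ops:
        ndp = [inf] * (deadline + 1)
        for t in range(deadline + 1):
            if dp[t] == inf:
                continue
            for dt, cost in options:
                if t + dt <= deadline and dp[t] + cost < ndp[t + dt]:
                    ndp[t + dt] = dp[t] + cost
        dp = ndp
    best = min(dp)
    return None if best == inf else best

# Three chained operations, each with a fast/expensive and a slow/cheap FU type (invented):
ops = [[(1, 9), (3, 2)], [(2, 5), (4, 1)], [(1, 8), (2, 3)]]
print(min_cost_assignment(ops, deadline=7))   # cheapest mix that still fits 7 time units -> 10
```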
14.
Wireless Sensor Networks (WSNs) are useful for a wide range of applications from different domains. Recently, new features and design trends have emerged in the WSN field, making these networks appealing not only to the scientific community but also to industry. One such trend is running different applications on heterogeneous sensor nodes deployed in multiple WSNs in order to better exploit the expensive physical network infrastructure. Another trend deals with the capability of accessing sensor-generated data from the Web, fitting WSNs into the novel paradigms of the Internet of Things (IoT) and the Web of Things (WoT). Using well-known and broadly accepted Web standards and protocols enables the interoperation of heterogeneous WSNs and the integration of their data with other Web resources, in order to provide the final user with value-added information and applications. Such emergent scenarios, where multiple networks and applications interoperate to meet high-level user requirements, will pose several challenges for the design and execution of WSN systems. One of these challenges regards the fact that applications will probably compete for the resources offered by the underlying sensor nodes through the Web. Thus, it is crucial to design mechanisms that effectively and dynamically coordinate the sharing of the available resources to optimize resource utilization while meeting application requirements. However, it is likely that the Quality of Service (QoS) requirements of different applications cannot be simultaneously met while efficiently sharing the scarce network resources, bringing the need to manage an inherent tradeoff. In this paper, we argue that a middleware platform is required to manage heterogeneous WSNs and efficiently share their resources while satisfying user needs in the emergent WoT scenarios. Such middleware should provide several services to control running applications as well as to distribute and coordinate nodes in the execution of submitted sensing tasks in an energy-efficient and QoS-enabled way. As part of the services provided by the middleware, we present the Resource Allocation in Heterogeneous WSNs (SACHSEN) algorithm. SACHSEN is a new resource allocation heuristic for systems composed of heterogeneous WSNs that effectively deals with the tradeoff between possibly conflicting QoS requirements and exploits the heterogeneity of multiple WSNs.
15.
Once broadband, high-volume data acquisition feeds into a parallel computer network, cluster computing can provide high-gain, low-latency processing of severely attenuated communication signals, enabling effective real-time interpretation of the communication data. A new dynamic heuristic scheduling algorithm, MDS, is proposed. The algorithm jointly considers task timing requirements, system throughput, and load balance. Even when task deadlines are tight, MDS maintains a high scheduling success ratio; at the same time, subject to meeting task deadlines, the system achieves high throughput and balanced load. Experiments analyze the influence of several task parameters on MDS and compare it with other algorithms. The results show that MDS outperforms the other algorithms.
16.
This paper presents a load balancing algorithm specifically designed for heterogeneous clusters composed of nodes with different computational capabilities. The method is based on a new index that takes into consideration two levels of processor heterogeneity: the number of cores per node and the computational power of each core. The experimental results show that this index allows balanced workload distributions to be achieved even on clusters where heterogeneity cannot be neglected.
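A minimal sketch of such a two-level index, assuming a node's capacity is simply (cores per node) x (relative core speed) and work is shared in proportion to it; the node names and numbers are invented, and the paper's index may differ.

```python
# name -> (cores, relative core speed); values invented for illustration.
nodes = {"n1": (8, 1.0), "n2": (4, 1.6), "n3": (16, 0.5)}

index = {name: cores * speed for name, (cores, speed) in nodes.items()}   # two-level capacity index
total = sum(index.values())

work_items = 1000
shares = {name: round(work_items * idx / total) for name, idx in index.items()}
print(index)    # {'n1': 8.0, 'n2': 6.4, 'n3': 8.0}
print(shares)   # work assigned in proportion to the index
```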
17.
Developing energy-efficient clusters not only reduces the cost of electric power but can also improve system reliability. Existing scheduling strategies developed for energy-efficient clusters conserve energy at the cost of performance, and the performance problem becomes especially apparent when cluster computing systems are heavily loaded. To address this issue, we propose in this paper a novel scheduling strategy, adaptive energy-efficient scheduling (AEES), for aperiodic and independent real-time tasks on heterogeneous clusters with dynamic voltage scaling. The AEES scheme aims to adaptively adjust voltages according to the workload conditions of a cluster, thereby making the best trade-off between energy conservation and schedulability. When the cluster is heavily loaded, AEES considers the voltage levels of both new tasks and running tasks to meet task deadlines. Under light load, AEES aggressively reduces voltage levels to conserve energy while maintaining higher guarantee ratios. We conducted extensive experiments to compare AEES with an existing algorithm (MEG) as well as two baseline algorithms (MELV and MEHV). Experimental results show that AEES significantly improves on the scheduling quality of MELV, MEHV and MEG.
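The basic trade-off behind voltage scaling can be sketched as follows: among the voltage/frequency levels that still meet a task's deadline, pick the one with the least energy. This is a generic illustration with invented operating points, not the AEES algorithm itself.

```python
# Hypothetical (voltage V, frequency Hz, power W) operating points of a node.
LEVELS = [(0.8, 1.0e9, 4.0), (1.0, 1.5e9, 7.5), (1.2, 2.0e9, 13.0)]

def pick_level(cycles, slack_s):
    """Return the least-energy level that finishes `cycles` within `slack_s` seconds."""
    feasible = [(watts * cycles / hz, v, hz) for v, hz, watts in LEVELS
                if cycles / hz <= slack_s]
    if not feasible:
        return None                       # deadline unreachable: reject or migrate the task
    energy, v, hz = min(feasible)         # least energy among the feasible settings
    return v, hz, energy

print(pick_level(cycles=3e9, slack_s=2.5))   # -> (1.0, 1500000000.0, 15.0): slowest feasible level
```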
18.
Fairness of multi-resource allocation is one of the most important metrics for evaluating the resource scheduling subsystem of a cloud computing cluster. DRF, the widely used multi-resource fair allocation algorithm, may lose fairness in heterogeneous clusters. Building on a study of the DRF multi-resource fair allocation algorithm in the Mesos framework, we design and implement meDRF, an allocation algorithm that adds a machine-performance factor: the machine performance score of each compute node enters the computation of the DRF dominant share, so that tasks have an equal opportunity to obtain high-quality and low-quality computing resources. Experiments with K-means, Bayes and PageRank jobs show that meDRF reflects the fairness of multi-resource allocation better than DRF, provides more stable allocations, and effectively improves system resource utilization.
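One plausible reading of the dominant-share computation described above, sketched minimally: plain DRF takes the maximum of used/capacity over resource types, and the meDRF variant scales that share by the node's machine performance score. The direction and form of the factor here are assumptions for illustration only.

```python
def dominant_share(used, capacity, perf_score=1.0):
    """used/capacity are per-resource lists (e.g. [CPUs, memory]); perf_score is the
    machine performance factor (1.0 reproduces classic DRF)."""
    share = max(u / c for u, c in zip(used, capacity))
    return share * perf_score

# A framework holding 9 CPUs and 18 GB on a cluster with 36 CPUs and 72 GB in total:
print(dominant_share([9, 18], [36, 72]))                  # classic DRF dominant share: 0.25
print(dominant_share([9, 18], [36, 72], perf_score=0.8))  # discounted when the resources sit on slower nodes
```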
19.
Remotely sensed hyperspectral sensors provide image data containing rich information in both the spatial and the spectral domain, and this information can be used to address detection tasks in many applications. One of the most widely used and successful algorithms for anomaly detection in hyperspectral images is the RX algorithm. Despite its wide acceptance and its high computational complexity when applied to real hyperspectral scenes, few approaches have been developed for the parallel implementation of this algorithm. In this paper, we evaluate the suitability of using a hybrid parallel implementation with a high-dimensional hyperspectral scene. A general strategy to automatically map parallel hybrid anomaly detection algorithms for hyperspectral image analysis has been developed, and parallel RX has been tested on a heterogeneous cluster using this routine. The considered approach is quantitatively evaluated using hyperspectral data collected by NASA's Airborne Visible/Infrared Imaging Spectrometer system over the World Trade Center in New York, five days after the terrorist attacks. The numerical effectiveness of the algorithms is evaluated by means of their capacity to automatically detect the thermal hot spots of fires (anomalies). The speedups achieved show that a cluster of multi-core nodes can greatly accelerate the RX algorithm.
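For reference, the serial global RX detector that such parallel implementations accelerate scores each pixel by its Mahalanobis distance from the scene's mean spectrum; the sketch below runs on a small synthetic cube (the data and sizes are invented).

```python
import numpy as np

def rx_scores(cube):
    """Global RX detector: Mahalanobis distance of every pixel spectrum from the scene mean.
    `cube` has shape (rows, cols, bands)."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    mu = X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))      # pseudo-inverse for numerical safety
    d = X - mu
    scores = np.einsum("ij,jk,ik->i", d, cov_inv, d)       # per-pixel d^T C^{-1} d
    return scores.reshape(rows, cols)

# Synthetic 3-band scene with one injected anomalous pixel:
scene = np.random.default_rng(0).normal(size=(50, 50, 3))
scene[10, 10] += 8.0
print(np.unravel_index(np.argmax(rx_scores(scene)), (50, 50)))   # -> (10, 10)
```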
20.
Heterogeneous cluster systems consisting of CPUs and different kinds of accelerators have become mainstream in HPC. Programming such systems is a difficult task and requires addressing manifold challenges that stem from the intricate composition of such systems and peculiarities of scientific applications.
A broad range of obstacles preventing efficient execution has to be considered and dealt with properly. In this paper, we propose a systematic approach and a framework that provides comprehensive support for running data-parallel applications in heterogeneous asymmetric clusters. Our implementation provides work partitioning and distribution that ensures workload balance across the cluster, while handling partitioning-induced communication and synchronization transparently. In our experimental section, we choose 11 representative scientific applications from different domains to evaluate our approach. Experimental results show strong speedups and workload balance for different cluster configurations.