Similar Articles
20 similar articles found (search time: 93 ms)
1.
Cloud computing has become a new computing paradigm with huge potential in enterprise and business. Green cloud computing is also becoming increasingly important in a world with limited energy resources and ever-rising demand for computational power. To maximize utilization and minimize the total cost of the cloud computing infrastructure and its running applications, resources must be managed properly, and virtual machines must be allocated to appropriate host nodes to perform the computation. In this paper, we propose a performance-analysis-based resource allocation scheme for the efficient allocation of virtual machines on the cloud infrastructure. We evaluated the proposed resource allocation algorithm using CloudSim and compared its performance with two other existing models.

2.
In this paper, we investigate Cloud computing resource provisioning to extend the computing capacity of local clusters in the presence of failures. We consider three steps in resource provisioning: resource brokering, dispatch sequences, and scheduling. The proposed brokering strategy is based on stochastic analysis of routing in distributed parallel queues and takes into account the response times of the Cloud provider and the local cluster, as well as the computing costs of both sides. Moreover, we propose dispatching with probabilistic and deterministic sequences to redirect requests to the resource providers. We also incorporate checkpointing into some well-known scheduling algorithms to provide a fault-tolerant environment. We propose two cost-aware and failure-aware provisioning policies that can be utilized by an organization that operates a cluster managed by virtual machine technology and seeks to use resources from a public Cloud provider. Simulation results demonstrate that the proposed policies improve the response time of users' requests by a factor of 4.10 under a moderate load, at a limited cost on a public Cloud.
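The checkpointing idea these fault-tolerant policies rely on can be illustrated with a toy simulation, assuming a simple unit-work failure model. The workload size, checkpoint interval, and failure probability below are hypothetical, not the paper's actual brokering or scheduling policies:

```python
import random

def run_with_checkpoints(total_work, checkpoint_interval, fail_prob_per_unit, rng):
    """Simulate executing `total_work` units of work on a failure-prone node.

    A checkpoint is taken every `checkpoint_interval` units; after a failure
    the job resumes from the last checkpoint instead of restarting from zero.
    Returns the total number of work units actually executed (including
    redone work), a proxy for completion time.
    """
    committed = 0   # progress saved up to the last checkpoint
    progress = 0    # uncommitted progress since the last checkpoint
    executed = 0    # total units executed, including wasted work
    while committed + progress < total_work:
        executed += 1
        if rng.random() < fail_prob_per_unit:
            progress = 0                 # lose uncommitted work, resume from checkpoint
        else:
            progress += 1
            if progress == checkpoint_interval:
                committed += progress    # commit a checkpoint
                progress = 0
    return executed

with_cp = run_with_checkpoints(200, checkpoint_interval=10,
                               fail_prob_per_unit=0.005, rng=random.Random(42))
# A huge interval approximates "no checkpointing": any failure restarts the job.
no_cp = run_with_checkpoints(200, checkpoint_interval=10**9,
                             fail_prob_per_unit=0.005, rng=random.Random(42))
print(with_cp, no_cp)  # with checkpoints, at most one interval is ever redone
```

With checkpointing, each failure wastes at most `checkpoint_interval` units; without it, a failure discards the entire run so far.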

3.
A variety of research problems exist that require considerable time and computational resources to solve. Attempting to solve these problems produces long‐running applications that require a reliable and trustworthy system upon which they can be executed. Cluster systems provide an excellent environment for running these applications because of their low cost-to-performance ratio; however, being built from commodity components, they are prone to failures. This report surveyed and reviewed the issues currently relating to providing fault tolerance for long‐running applications. Several fault tolerance approaches were investigated; however, it was found that rollback‐recovery provides a favourable approach for user applications in cluster systems. Two facilities are required to provide fault tolerance using rollback‐recovery: checkpointing and recovery. It was shown here that a multitude of work has been done on enhancing checkpointing; however, the intricacies of providing recovery have been neglected. The problems associated with providing recovery include: providing transparent and autonomic recovery, selecting appropriate recovery computers, and maintaining a consistent observable behaviour when an application fails. Copyright © 2009 John Wiley & Sons, Ltd.

4.
We review fast networking technologies for both wide-area and high performance cluster computer systems. We describe our experiences in constructing asynchronous transfer mode (ATM)-based local- and wide-area clusters and the tools and technologies this experience led us to develop. We discuss our experiences using Internet Protocol on such systems as well as native ATM protocols and the problems facing wide-area integration of cluster systems. We are presently constructing Beowulf-class computer clusters using a mix of Fast Ethernet and Gigabit Ethernet technology and we anticipate how such systems will integrate into a new local-area Gigabit Ethernet network and what technologies will be used for connecting shared HPC resources across wide-areas. High latencies on wide-area cluster systems led us to develop a metacomputing problem-solving environment known as distributed information systems control world (DISCWorld). We summarize our main developments in this project as well as the key features and research directions for software to exploit computational services running on fast networked cluster systems.

5.
Cloud computing is an essential part of today's computing world. A continuously increasing amount of computation, with varying resource requirements, is placed in large data centers. The variation among computing tasks, both in their resource requirements and time of processing, makes it possible to optimize the usage of physical hardware by applying cloud technologies. In this work, we develop a prototype system for load-based management of virtual machines in an OpenStack computing cluster. Our prototype is based on the idea of 'packing' idle virtual machines into special park servers optimized for this purpose. We evaluate the method by running real high-energy physics analysis software in an OpenStack test cluster and by simulating the same principle using the CloudSim simulator. The results show a clear improvement, 9–48%, in total energy efficiency when using our method together with resource overbooking and heterogeneous hardware.

6.
The distribution of computational resources in a Cloud Computing platform is a complex process with several parameters to consider, such as the demand for services, available computational resources, and service level agreements with end users. Currently, the state of the art presents centralized approaches derived from earlier technologies for clusters of servers. These approaches allocate computational resources by adding or removing (physical/virtual) computational nodes. However, virtualization technology now enables research into new techniques that make it possible to allocate at a lower level: not only can nodes be added or removed, but the resources of each virtual machine can also be modified (low-level resource allocation). Agent theory is thus a key technology in this field, allowing decentralized resource allocation. This innovative approach offers undeniable improvements such as computational load distribution and reduced computation time. The evaluation was carried out through experiments in a real Cloud environment, thus proving the validity of the proposed approach.

7.
The widespread adoption of traditional heterogeneous systems has substantially improved the available computing power and, at the same time, raised optimisation issues related to processing task streams across both CPU and GPU cores in heterogeneous systems. Mirroring the heterogeneous improvement gained in traditional systems, cloud computing has started to add heterogeneity support, typically through GPU instances, to the conventional CPU-based cloud resources. This optimisation of cloud resources will arguably have a real impact when running on-demand computationally-intensive applications. In this work, we investigate the scaling of pattern-based parallel applications from physical, "local" mixed CPU/GPU clusters to a public cloud CPU/GPU infrastructure. Specifically, such parallel patterns are deployed via algorithmic skeletons to exploit a particular parallel behaviour while hiding implementation details. We propose a systematic methodology for exploiting approximated analytical performance/cost models, and an integrated programming framework suitable for targeting both local and remote resources, to support offloading computations from structured parallel applications to heterogeneous cloud resources, so that performance levels not attainable on local resources alone may be achieved with remote resources. The amount of remote resources necessary to reach a given performance target is calculated through the performance models, allowing any user to hire the amount of cloud resources needed to achieve that target. It is therefore expected that such models can be used to devise the optimal proportion of computations to allocate to different remote nodes for Big Data computations. We present experiments run with a proof-of-concept implementation based on FastFlow on small departmental clusters as well as on a public cloud infrastructure with CPUs and GPUs using the Amazon Elastic Compute Cloud.
In particular, we show how CPU-only and mixed CPU/GPU computations can be offloaded to remote cloud resources with predictable performance, and how data-intensive applications can be mapped to a mix of local and remote resources to guarantee optimal performance.

8.
Research on Job Management for Cluster Systems in a Grid Environment   (cited: 2, self-cited: 4, cited by others: 2)
Grid computing has gradually become an important new field. Compared with traditional distributed computing, its distinguishing feature is the ability to share resources of all kinds across the network, including geographically distributed computing resources. PBS is a job management system widely used on parallel computers; it can allocate system resources to each job relatively fairly according to user-defined configuration parameters. However, managing cluster systems within a Grid environment remains an open research topic. Using Grid system software and cluster management software, we implemented a method for managing cluster-system jobs in a Grid environment.

9.
Cluster architectures are increasingly used to solve high‐performance computing applications. To obtain more computational power, sets of clusters interconnected by high‐speed networks can be combined to form a cluster grid. In this type of architecture it is difficult to exploit all the internal resources of a cluster, because each cluster may be shielded by a firewall and is usually configured with a single visible IP front‐end node that hides all of its internal nodes from the external world. The exploitation of resources is even more complicated in the general case where each internal node of a cluster can itself be the front‐end node of another cluster. This type of architecture has been defined as a multilayer cluster grid. In this paper, a Parallel Virtual Machine (PVM) extension is presented which, through a middleware solution based on the H2O distributed metacomputing framework, permits the building of a parallel virtual machine in a multilayer cluster grid environment. In addition, existing code written for PVM can be executed in this environment without modification. Copyright © 2007 John Wiley & Sons, Ltd.

10.
Cloud computing can reduce power consumption by using virtualized computational resources to provision an application’s computational resources on demand. Auto-scaling is an important cloud computing technique that dynamically allocates computational resources to applications to match their current loads precisely, thereby removing resources that would otherwise remain idle and waste power. This paper presents a model-driven engineering approach to optimizing the configuration, energy consumption, and operating cost of cloud auto-scaling infrastructure to create greener computing environments that reduce emissions resulting from superfluous idle resources. The paper provides four contributions to the study of model-driven configuration of cloud auto-scaling infrastructure by (1) explaining how virtual machine configurations can be captured in feature models, (2) describing how these models can be transformed into constraint satisfaction problems (CSPs) for configuration and energy consumption optimization, (3) showing how optimal auto-scaling configurations can be derived from these CSPs with a constraint solver, and (4) presenting a case study showing the energy consumption/cost reduction produced by this model-driven approach.
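Contributions (1)–(3) above can be sketched in miniature: a toy feature model of a VM configuration is encoded as a small constraint satisfaction problem, and the lowest-energy valid configuration is found by exhaustive search (a stand-in for a real constraint solver). The feature names, cross-tree constraints, and energy model are hypothetical, not the paper's actual encoding:

```python
from itertools import product

# Hypothetical feature model: each feature has a set of allowed values.
features = {
    "cpu_cores": [1, 2, 4, 8],
    "ram_gb":    [2, 4, 8, 16],
    "turbo":     [False, True],
}

def satisfies(cfg):
    """Cross-tree constraints (illustrative): RAM must be at least
    2 GB per core, and turbo mode requires at least 4 cores."""
    if cfg["ram_gb"] < 2 * cfg["cpu_cores"]:
        return False
    if cfg["turbo"] and cfg["cpu_cores"] < 4:
        return False
    return True

def energy(cfg):
    """Toy energy model: watts as a function of the selected features."""
    return 10 * cfg["cpu_cores"] + 2 * cfg["ram_gb"] + (15 if cfg["turbo"] else 0)

def optimal_config(min_cores):
    """Solve the CSP by brute force: the lowest-energy valid configuration
    that still provides the requested number of cores."""
    names = list(features)
    best = None
    for values in product(*(features[n] for n in names)):
        cfg = dict(zip(names, values))
        if cfg["cpu_cores"] >= min_cores and satisfies(cfg):
            if best is None or energy(cfg) < energy(best):
                best = cfg
    return best

print(optimal_config(min_cores=4))
# → {'cpu_cores': 4, 'ram_gb': 8, 'turbo': False}
```

A real deployment would hand the same constraints to a constraint solver rather than enumerating, but the shape of the problem is the same.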

11.
Traditionally, complex engineering applications (CEAs), which consist of numerous components (software) and require large amounts of computing resources, usually run in dedicated clusters or high performance computing (HPC) centers. Nowadays, Cloud computing systems, with their ability to provide massive computing resources and customizable execution environments, are becoming an attractive option for CEAs. As a new type of Cloud application, CEAs also bring challenges in dealing with Cloud resources. In this paper, we provide a comprehensive survey of Cloud resource management research for CEAs. The survey puts forward two important questions: 1) what are the main challenges for CEAs to run in Clouds? and 2) what are the prior research topics addressing these challenges? We summarize and highlight the main challenges and prior research topics. Our work can be helpful to scientists and engineers who are interested in running CEAs in Cloud environments.

12.
Design and Implementation of Web-Based Cluster Management   (cited: 1, self-cited: 0, cited by others: 1)
In the high-performance computing platform environment of a supercomputing center, we design and implement a Web-based cluster management system (Cluster Management Based on Web, CMBW). It brings convenience to cluster administrators in managing cluster resources, and provides end users with a unified interactive interface for accessing cluster resources and a convenient model for using them.

13.
Research and Implementation of Resource Allocation in Cloud Virtual Environments   (cited: 3, self-cited: 1, cited by others: 3)
郑伟伟  邹华  林荣恒 《软件》2012,(1):46-48,54
The Infrastructure-as-a-Service (IaaS) layer of cloud computing manages the underlying physical resources through virtualization technology and provides users with usable computer clusters. How to allocate resources effectively so as to maximize their utilization, that is, to minimize the fragmentation of the various resources on the physical machines, has become a key problem in cloud computing. To address it, a genetic algorithm can be applied to this multi-objective, multi-constraint combinatorial optimization problem to realize resource allocation in a cloud virtual environment. Simulation experiments show that the algorithm can effectively reduce fragmentation on physical machines and improve resource utilization.
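The genetic-algorithm idea described above can be sketched roughly as follows: an individual assigns each VM (here reduced to a CPU demand) to a physical machine, and the fitness function penalizes overloaded hosts and leftover resource fragments. The VM demands, host capacity, fitness function, and all GA parameters are illustrative assumptions, not the paper's actual encoding:

```python
import random

VM_CPU = [2, 3, 3, 4, 4, 5, 7]   # CPU demand of each VM (illustrative)
HOST_CAP = 10                     # capacity of each physical machine
N_HOSTS = 3

def fitness(assign):
    """Lower is better: heavy penalty for overload, plus total fragments
    (unused capacity) on hosts that are in use but not full."""
    loads = [0] * N_HOSTS
    for vm, host in enumerate(assign):
        loads[host] += VM_CPU[vm]
    penalty = sum(1000 * max(0, l - HOST_CAP) for l in loads)
    frag = sum(HOST_CAP - l for l in loads if 0 < l < HOST_CAP)
    return penalty + frag

def evolve(generations=300, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(N_HOSTS) for _ in VM_CPU] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(VM_CPU))   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:                # point mutation
                child[rng.randrange(len(child))] = rng.randrange(N_HOSTS)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

The real problem is multi-dimensional (CPU, memory, disk per VM), but the same chromosome-and-fitness structure carries over.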

14.
Computational methods for protein structure prediction allow us to determine the three-dimensional structure of a protein based on its amino acid sequence alone. These methods are an important alternative to costly and slow experimental methods, like X-ray crystallography or Nuclear Magnetic Resonance. However, conventional calculations of protein structure are time-consuming and require ample computational resources, especially when carried out with ab initio methods that rely on the physical forces and interactions between atoms in a protein. Fortunately, at the present stage of the development of computer science, such huge computational resources are available from public cloud providers on a pay-as-you-go basis. We have designed and developed a scalable and extensible system, called Cloud4PSP, which enables prediction of 3D protein structures in the Microsoft Azure commercial cloud. The system uses the Warecki-Znamirowski method as a sample procedure for protein structure prediction, and this method was used to test the scalability of the system. The efficiency tests showed good acceleration of predictions when scaling the system both vertically and horizontally. In the paper, we present the system architecture that allowed us to achieve these results, the Cloud4PSP processing model, and the results of the scalability tests. At the end of the paper, we try to answer which of the two scaling techniques, scaling out or scaling up, is better for solving such computational problems with Cloud computing.

15.
For HPC users, computing cost is one of the key factors in migrating to the cloud. The preemptible (spot) instances offered on Alibaba Cloud are an on-demand instance type designed to lower the cost of public cloud computing resources. The market price of preemptible instances fluctuates and is usually far below the regular on-demand price, sometimes as low as one tenth of it. A preemptible instance is generally guaranteed a minimum holding period after creation and may be reclaimed afterwards, so it is typically suited to stateless workloads. We propose an auto-scaling strategy for the public cloud that targets general HPC cluster schedulers: based on the user's application type, job submission patterns, and requirements on performance and cost, it automatically provisions and scales compute resources in the cloud while controlling cost, so that users "only pay for what you want and what you use". Leveraging the public cloud's rich instance types and pricing models, a low-cost HPC auto-scaling scheme can be configured by combining the auto-scaling service, preemptible instances, and checkpoint/restart: a user can specify a cost ceiling when submitting a job; the auto-scaling service then finds and scales out preemptible compute resources below this ceiling, while checkpoint/restart ensures that the job continues computing when resources are switched. Finally, the feasibility and effectiveness of the strategy are validated with two high-performance applications, LAMMPS and GROMACS.
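The cost-ceiling scale-out decision described above can be sketched as follows, assuming fixed per-hour prices known at decision time. The function, parameters, and prices are hypothetical and do not reflect Alibaba Cloud's actual API or pricing:

```python
def plan_scale_out(nodes_needed, cost_ceiling, spot_price, on_demand_price,
                   spot_available):
    """Prefer cheap preemptible (spot) nodes, fill the rest with on-demand
    nodes, and refuse to scale out if the hourly cost exceeds the user's
    ceiling (the job then stays queued).

    Returns a plan dict, or None when the ceiling cannot be met.
    """
    spot = min(nodes_needed, spot_available)
    on_demand = nodes_needed - spot
    cost = spot * spot_price + on_demand * on_demand_price
    if cost > cost_ceiling:
        return None  # keep the job queued; retry when spot capacity/prices improve
    return {"spot": spot, "on_demand": on_demand, "hourly_cost": cost}

# 8 nodes wanted, 6 spot nodes available at 0.5/h, on-demand at 2.0/h:
print(plan_scale_out(8, cost_ceiling=10.0, spot_price=0.5,
                     on_demand_price=2.0, spot_available=6))
# → {'spot': 6, 'on_demand': 2, 'hourly_cost': 7.0}
```

In the full scheme, checkpoint/restart is what makes the preemptible portion safe: if a spot node is reclaimed, the job resumes from its last checkpoint on replacement capacity.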

16.
A fundamental goal of Cloud computing is to group resources to accomplish tasks that may require strong computing or communication capability. In this paper we design a resource sharing technology under which IO peripherals can be shared among Cloud members. In particular, in a personal Cloud built up from a number of personal devices, the IO peripherals of any device can be used to support an application running on another device. We call this IO sharing composable IO, because it is equivalent to composing IOs from different devices for an application. We design composable USB and achieve pro-migration USB access, namely a migrated application running on the target host can still access the USB peripherals at the source host. This is complementary to traditional VM migration, under which an application can only use resources from the device where it runs. We address reliability issues by keeping a backup VM. In addition, we define a security framework to ensure operating-environment security when using composable IO in a personal environment. Experimental results show that, through composable IO, applications in a personal Cloud can achieve a much better user experience.

17.
Cloud computing has become an essential part of the global digital economy due to its extensibility, flexibility and reduced operating costs. Nowadays, data centers (DCs) contain thousands of different machines running a huge number of diverse applications over extended periods. Resource management in the Cloud is an open issue, since efficient resource allocation can reduce the infrastructure's running cost. In this paper, we propose a snapshot-based solution to the server consolidation problem from the Cloud infrastructure provider (CIP) perspective. Our mathematical formulation aims at reducing power cost through efficient server consolidation, while also considering (i) mapping incoming and failing virtual machines (VMs), (ii) reducing the total number of VM migrations, and (iii) consolidating running server workloads. We also compare the performance of our model to the well-known Best Fit heuristic and its extension that includes server consolidation via VM migration, denoted Best Fit with Consolidation (BFC). Our mathematical formulation allows us to measure solution quality in absolute terms, and it is also applicable in practice. In our simulations, we show that relevant improvements (from 6% to 15%) over the widely adopted Best Fit algorithm are achieved in a reasonable computing time.
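The Best Fit with Consolidation (BFC) baseline mentioned above can be sketched as follows: Best Fit places each new VM on the active server with the least remaining capacity that still fits it, and a consolidation pass then tries to migrate every VM off the least-loaded server so it can be powered down. The server capacity and VM demands are illustrative assumptions, not the paper's benchmark setup:

```python
CAPACITY = 16  # capacity of each physical server (illustrative)

def best_fit(vms):
    """Place each VM demand on the fullest server that can still host it;
    open a new server only when none fits (Best Fit heuristic)."""
    servers = []  # each server is a list of the VM demands placed on it
    for d in vms:
        candidates = [s for s in servers if sum(s) + d <= CAPACITY]
        if candidates:
            max(candidates, key=sum).append(d)  # fullest = least remaining capacity
        else:
            servers.append([d])
    return servers

def consolidate(servers):
    """Try to empty the least-loaded server by migrating its VMs onto the
    fullest servers that can absorb them (Best Fit again)."""
    servers = [list(s) for s in servers]        # work on a copy
    servers.sort(key=sum)
    source, rest = servers[0], servers[1:]
    for d in list(source):
        for s in sorted(rest, key=sum, reverse=True):
            if sum(s) + d <= CAPACITY:
                s.append(d)
                source.remove(d)                # migration succeeded
                break
    return [s for s in [source] + rest if s]    # drop the server if emptied

placement = best_fit([8, 7, 3, 9, 2, 4])        # uses 3 servers
placement[0].remove(7)                          # one VM finishes and departs
after = consolidate(placement)                  # migrations free a server
print(len(placement), len(after))
# → 3 2
```

The paper's snapshot-based formulation solves the same packing problem exactly; BFC is the greedy heuristic it is measured against.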

18.
A problem commonly faced in Computer Science research is the lack of real usage data for validating algorithms. This situation is particularly acute in Cloud Computing: the privacy of the data managed by commercial Cloud infrastructures, together with their massive scale, means such data are rarely made available to the research community. Moreover, because of that scale, designing resource allocation algorithms for Cloud infrastructures requires many assumptions to make the problem tractable. This paper provides a deep analysis of a cluster data trace recently released by Google and focuses on a number of questions not addressed in previous studies. In particular, we describe the characteristics of job resource usage in terms of dynamics (how it varies with time), of correlation between jobs (daily and/or weekly patterns), and of correlation inside jobs between different resources (e.g., the dependence of memory usage on CPU usage). From this analysis, we propose a way to formalize the allocation problem on such platforms which encompasses most job features from the trace with a small set of parameters.

19.
Bhojan Anand, Ng Siang Ping, Ng Joel, Ooi Wei Tsang. Multimedia Tools and Applications (2020) 79(43-44): 32503-32523
Cloud gaming has emerged as a new computer game delivery paradigm that promises gaming anywhere, anytime, on any device, by running the computer game on a cloud...

20.
Today, almost everyone is connected to the Internet and uses different Cloud solutions to store, deliver and process data. Cloud computing assembles large networks of virtualized services such as hardware and software resources. The new era in which ICT has penetrated almost all domains (healthcare, aged care, social assistance, surveillance, education, etc.) creates the need for new multimedia content-driven applications. These applications generate huge amounts of data and require gathering, processing and then aggregation in a fault-tolerant, reliable and secure heterogeneous distributed system created by a mixture of Cloud systems (public/private), mobile device networks, desktop-based clusters, etc. In this context, dynamic resource provisioning for Big Data application scheduling has become a challenge in modern systems. We propose a resource-aware hybrid scheduling algorithm for different types of applications: batch jobs and workflows. The proposed algorithm performs hierarchical clustering of the available resources into groups in the allocation phase. Task execution is performed in two phases: in the first, tasks are assigned to groups of resources; in the second, a classical scheduling algorithm is used within each group. The proposed algorithm is suitable for Heterogeneous Distributed Computing, especially for modern High-Performance Computing (HPC) systems in which applications are modeled with various requirements (both IO- and computationally intensive), with emphasis on data from multimedia applications. We evaluate its performance in a realistic setting using the CloudSim tool with respect to load balancing, cost savings, dependency assurance for workflows, and computational efficiency, and investigate how these performance metrics are computed at runtime.
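The two-phase execution described above can be sketched as follows. Phase 1 maps each task to a resource group matching its dominant requirement; phase 2 runs a classical list scheduler inside each group (earliest completion time is used here as a stand-in for the unspecified "classical scheduling algorithm"). The group labels, node speeds, and task mix are illustrative assumptions:

```python
# Resource groups from the (hypothetical) hierarchical clustering phase:
# node speeds in work units per second.
GROUPS = {
    "compute": [8.0, 8.0],
    "io":      [2.0, 2.0, 2.0],
}

def hybrid_schedule(tasks):
    """tasks: list of (work, kind) with kind naming a group in GROUPS.
    Returns the makespan (last finish time) of each group."""
    finish = {g: [0.0] * len(nodes) for g, nodes in GROUPS.items()}
    for work, kind in sorted(tasks, reverse=True):  # longest tasks first
        speeds = GROUPS[kind]                       # phase 1: task -> group
        # phase 2: pick the node in the group that finishes this task earliest
        i = min(range(len(speeds)),
                key=lambda j: finish[kind][j] + work / speeds[j])
        finish[kind][i] += work / speeds[i]
    return {g: max(f) for g, f in finish.items()}

tasks = [(16, "compute"), (8, "compute"), (24, "compute"),
         (4, "io"), (6, "io"), (2, "io")]
print(hybrid_schedule(tasks))
# → {'compute': 3.0, 'io': 3.0}
```

Splitting allocation (group choice) from scheduling (node choice within the group) is what keeps the algorithm tractable on heterogeneous platforms: each phase works on a much smaller problem.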
