1.

Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers (total citations: 4; self-citations: 0; citations by others: 4)

Saurabh Kumar Garg Chee Shin Yeo 《Journal of Parallel and Distributed Computing》2011,71(6):732-749

The use of High Performance Computing (HPC) in commercial and consumer IT applications is becoming popular. HPC users need the ability to gain rapid and scalable access to high-end computing capabilities. Cloud computing promises to deliver such a computing infrastructure using data centers so that HPC users can access applications and data from a Cloud anywhere in the world on demand and pay based on what they use. However, the growing demand drastically increases the energy consumption of data centers, which has become a critical issue. High energy consumption not only translates to high energy cost, which will reduce the profit margin of Cloud providers, but also to high carbon emissions, which are not environmentally sustainable. Hence, there is an urgent need for energy-efficient solutions that can address the sharp increase in energy consumption from the perspective of both the Cloud provider and the environment. To address this issue, we propose near-optimal scheduling policies that exploit heterogeneity across multiple data centers for a Cloud provider. We consider a number of energy efficiency factors (such as energy cost, carbon emission rate, workload, and CPU power efficiency) which change across different data centers depending on their location, architectural design, and management system. Our carbon/energy based scheduling policies are able to achieve on average up to 25% of energy savings in comparison to profit based scheduling policies, leading to higher profit and less carbon emissions.
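
As a rough illustration of how heterogeneity across data centers might be exploited, the sketch below greedily places a job on the data center with the lowest estimated carbon emission per CPU-hour. It is not the authors' policy: the `DataCenter` fields, the scoring formula (carbon rate times power per CPU), and the function names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCenter:
    name: str
    carbon_rate: float    # kg CO2 per kWh (hypothetical figure)
    power_per_cpu: float  # kW drawn per busy CPU (hypothetical figure)
    free_cpus: int        # CPUs currently available


def carbon_per_cpu_hour(dc: DataCenter) -> float:
    """Estimated kg CO2 emitted by one CPU running for one hour."""
    return dc.carbon_rate * dc.power_per_cpu


def place_job(cpus_needed: int, centers: List[DataCenter]) -> Optional[DataCenter]:
    """Greedily pick the lowest-carbon data center with enough free CPUs."""
    candidates = [dc for dc in centers if dc.free_cpus >= cpus_needed]
    if not candidates:
        return None  # no single data center can host the job
    best = min(candidates, key=carbon_per_cpu_hour)
    best.free_cpus -= cpus_needed  # reserve the capacity
    return best
```

A profit-based variant would only need a different key function (for example, one built on an energy price per kWh), which is one way to compare the two families of policies on the same workload.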

2.

Hybrid shuffled frog leaping algorithm for energy-efficient dynamic consolidation of virtual machines in cloud data centers

《Expert systems with applications》2014,41(13):5804-5816

Cloud computing aims to provide dynamic leasing of server capabilities as scalable virtualized services to end users. However, data centers hosting cloud applications consume vast amounts of electrical energy, thereby contributing to high operational costs and carbon footprints. Green cloud computing solutions that can not only minimize the operational costs but also reduce the environmental impact are necessary. This study focuses on the Infrastructure as a Service model, where custom virtual machines (VMs) are launched on appropriate servers available in a data center. A complete data center resource management scheme is presented in this paper. The scheme can not only ensure user quality of service (through service level agreements) but can also achieve maximum energy saving and green computing goals. Because a data center usually contains tens of thousands of hosts, solving the resource allocation problem with an exact algorithm is difficult; the modified shuffled frog leaping algorithm and improved extremal optimization are therefore employed in this study to solve the dynamic allocation problem of VMs. Experimental results demonstrate that the proposed resource management scheme exhibits excellent performance in green cloud computing.

3.

Task scheduling for parallel sparse Cholesky factorization

G. A. Geist E. Ng 《International journal of parallel programming》1989,18(4):291-314

This paper presents a solution to the problem of partitioning the work for sparse matrix factorization to individual processors on a multiprocessor system. The proposed task assignment strategy is based on the structure of the elimination tree associated with the given sparse matrix. The goal of the task scheduling strategy is to achieve load balancing and a high degree of concurrency among the processors while reducing the amount of processor-to-processor data communication, even for arbitrarily unbalanced elimination trees. This is important because popular fill-reducing ordering methods, such as the minimum degree algorithm, often produce unbalanced elimination trees. Results from the Intel iPSC/2 are presented for various finite-element problems using both nested dissection and minimum degree orderings. Research supported by the Applied Mathematical Sciences Research Program, Office of Energy Research, U.S. Department of Energy under contract DE-AC05-84OR21400 with Martin Marietta Energy Systems Inc.
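
Since the strategy is driven by the elimination tree, a minimal sketch of how such a tree can be computed from the sparsity pattern may help. The code below uses the standard ancestor-compression approach usually attributed to Liu; it is not the paper's task assignment strategy, and the input convention (`lower_pattern[k]` lists the columns `i < k` where row `k` of the symmetric matrix is nonzero) is an assumption chosen for this example.

```python
from typing import List


def elimination_tree(n: int, lower_pattern: List[List[int]]) -> List[int]:
    """Return parent[] of the elimination tree; roots have parent -1.

    lower_pattern[k] holds the column indices i < k with A[k][i] != 0
    for a symmetric n-by-n matrix A.
    """
    parent = [-1] * n
    ancestor = [-1] * n  # compressed pointers toward the current root
    for k in range(n):
        for i in lower_pattern[k]:
            # Walk from i up to the root of its current subtree,
            # redirecting every visited node to k on the way.
            while i != -1 and i < k:
                next_i = ancestor[i]
                ancestor[i] = k
                if next_i == -1:
                    parent[i] = k  # i was a root, so k becomes its parent
                i = next_i
    return parent
```

For a tridiagonal pattern the result is a single chain (a maximally unbalanced tree), which is the kind of shape the abstract says a scheduling strategy must cope with.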

4.

Optimizing virtual machine allocation for parallel scientific workflows in federated clouds

《Future Generation Computer Systems》2015

Cloud computing has established itself as an interesting computational model that provides a wide range of resources such as storage, databases and computing power for several types of users. Recently, the concept of cloud computing was extended with the concept of federated clouds, where several resources from different cloud providers are inter-connected to perform a common action (e.g. execute a scientific workflow). Users can benefit from both single-provider and federated cloud environments to execute their scientific workflows since they can get the necessary amount of resources on demand. In several of these workflows, there is a demand for high performance and parallelism techniques since many activities are data and computing intensive and can execute for hours, days or even weeks. There are some Scientific Workflow Management Systems (SWfMS) that already provide parallelism capabilities for scientific workflows in single-provider clouds. Most of them rely on creating a virtual cluster to execute the workflow in parallel. However, they also rely on the user to estimate the number of virtual machines to be allocated to create this virtual cluster. Most SWfMS use this initial virtual cluster configuration made by the user for the entire workflow execution. Dimensioning the virtual cluster to execute the workflow in parallel is then a top priority task, since an under- or over-dimensioned virtual cluster can degrade workflow performance or (unnecessarily) increase financial costs. This dimensioning is far from trivial in a single-provider cloud and especially in federated clouds due to the huge number of virtual machine types to choose from in each location and provider. In this article, we propose an approach named GraspCC-fed to produce the optimal (or near-optimal) estimation of the number of virtual machines to allocate for each workflow. GraspCC-fed extends a previously proposed heuristic based on GRASP for executing standalone applications to consider scientific workflows executed in both single-provider and federated clouds. For the experiments, GraspCC-fed was coupled to an adapted version of the SciCumulus workflow engine for federated clouds. This way, we believe that GraspCC-fed can be an important decision support tool for users and can help determine an optimal configuration for the virtual cluster for parallel cloud-based scientific workflows.

5.

Near real-time parallel processing and advanced data management of SAR images in grid environments

Massimo Cafaro Italo Epicoco Sandro Fiore Daniele Lezzi Silvia Mocavero Giovanni Aloisio 《Journal of Real-Time Image Processing》2009,4(3):219-227

In this paper, we describe the process of parallelizing an existing, production level, sequential Synthetic Aperture Radar (SAR) processor based on the Range-Doppler algorithmic approach. We show how, taking into account the constraints imposed by the software architecture and related software engineering costs, it is still possible with a moderate programming effort to parallelize the software and present a Message Passing Interface (MPI) implementation whose speedup is about 8 on 9 processors, achieving near real-time processing of raw SAR data even on a moderately aged parallel platform. Moreover, we discuss a hybrid two-level parallelization approach that involves the use of both MPI and OpenMP. We also present GridStore, a novel data grid service to manage raw, focused and post-processed SAR data in a grid environment. Indeed, another aim of this work is to show how the processed data can be made available in a grid environment to a wide scientific community, through the adoption of a data grid service providing both metadata and data management functionalities. In this way, along with near real-time processing of SAR images, we provide a data grid-oriented system for data storing, publishing, management, etc.

6.

Distributed data organization and parallel data retrieval methods for huge laser scanner point clouds (total citations: 1; self-citations: 0; citations by others: 1)

Ma Hongchao Zongyue Wang 《Computers & Geosciences》2011,37(2):193-201

This paper proposes a novel method for distributed data organization and parallel data retrieval from huge volume point clouds generated by airborne Light Detection and Ranging (LiDAR) technology under a cluster computing environment, in order to allow fast analysis, processing, and visualization of the point clouds within a given area. The proposed method is suitable for both grid and quadtree data structures. As for distribution strategy, cross distribution of the dataset would be more efficient than serial distribution in terms of non-redundant datasets, since a dataset is more uniformly distributed in the former arrangement. However, redundant datasets are necessary in order to meet the frequent need of input and output operations in multi-client scenarios: the first copy would be distributed by a cross distribution strategy while the second (and later) would be distributed by an iterated exchanging distribution strategy. Such a distribution strategy would distribute datasets more uniformly to each data server. In data retrieval, when the data block to be retrieved is stored on multiple data servers, a greedy algorithm allocates the query task to the data server with the lightest computing load. Experiments show that the method proposed in this paper can satisfy the demands of frequent and fast data query.
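
The greedy retrieval step can be pictured with a small sketch: each requested block goes to whichever server holding a replica currently has the lightest load. This is only an illustration under simple assumptions (load is counted as the number of assigned requests, ties go to the lower server id), and the names `assign_queries` and `replicas` are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Tuple


def assign_queries(
    block_requests: List[str],
    replicas: Dict[str, List[int]],
    num_servers: int,
) -> Tuple[List[Tuple[str, int]], List[int]]:
    """Assign each requested block to the least-loaded server storing it."""
    load = [0] * num_servers
    assignment = []
    for block in block_requests:
        # Among servers that store this block, pick the lightest one;
        # break ties by server id so the result is deterministic.
        server = min(replicas[block], key=lambda s: (load[s], s))
        load[server] += 1
        assignment.append((block, server))
    return assignment, load
```

With two copies of every block, repeated requests for a popular block alternate between its two servers instead of piling up on one.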

7.

SHadoop: Improving MapReduce performance by optimizing job execution mechanism in Hadoop clusters

Rong Gu Xiaoliang Yang Jinshuang Yan Yuanhao Sun Bing Wang Chunfeng Yuan Yihua Huang 《Journal of Parallel and Distributed Computing》2014

As a widely-used parallel computing framework for big data processing today, the Hadoop MapReduce framework puts more emphasis on high throughput of data than on low latency of job execution. However, today more and more big data applications developed with MapReduce require quick response time. As a result, improving the performance of MapReduce jobs, especially for short jobs, is of great significance in practice and has attracted more and more attention from both academia and industry. A lot of effort has been made to improve the performance of Hadoop at the job scheduling or job parameter optimization level. In this paper, we explore an approach to improve the performance of the Hadoop MapReduce framework by optimizing the job and task execution mechanism. First of all, by analyzing the job and task execution mechanism in the MapReduce framework we reveal two critical limitations to job execution performance. Then we propose two major optimizations to the MapReduce job and task execution mechanisms: first, we optimize the setup and cleanup tasks of a MapReduce job to reduce the time cost during the initialization and termination stages of the job; second, instead of adopting the loose heartbeat-based communication mechanism to transmit all messages between the JobTracker and TaskTrackers, we introduce an instant messaging communication mechanism for accelerating performance-sensitive task scheduling and execution. Finally, we implement SHadoop, an optimized and fully compatible version of Hadoop that aims at shortening the execution time cost of MapReduce jobs, especially for short jobs. Experimental results show that compared to the standard Hadoop, SHadoop can achieve stable performance improvement by around 25% on average for comprehensive benchmarks without losing scalability and speedup. Our optimization work has passed a production-level test in Intel and has been integrated into the Intel Distributed Hadoop (IDH). To the best of our knowledge, this work is the first effort that explores optimizing the execution mechanism inside map/reduce tasks of a job. The advantage is that it can complement job scheduling optimizations to further improve the job execution performance.

8.

Towards building a cloud for scientific applications

Lizhe Wang Marcel Kunze Jie Tao Gregor von Laszewski 《Advances in Engineering Software》2011,42(9):714-722

Cloud computing has become an innovative computing paradigm, which aims to provide reliable, customized and QoS guaranteed computing infrastructures for users. This paper presents our early experience of Cloud computing based on the Cumulus project for compute centers. In this paper, we give the Cloud computing definition and Cloud computing functionalities. This paper also introduces the Cumulus project with its various aspects, such as design pattern, infrastructure, and middleware. This paper delivers the state of the art for Cloud computing with theoretical definition and practical experience.

9.

MRPC: A high performance RPC system for MPMD parallel computing

Chi‐Chao Chang Grzegorz Czajkowski Thorsten Von Eicken 《Software》1999,29(1):43-66

MRPC is an RPC system that is designed and optimized for MPMD parallel computing. Existing systems based on standard RPC incur an unnecessarily high cost when used on high‐performance multi‐computers, limiting the appeal of RPC‐based languages in the parallel computing community. MRPC combines the efficient control and data transfer provided by Active Messages (AM) with a minimal multithreaded runtime system that extends AM with the features required to support MPMD. This approach introduces only the necessary RPC overheads for an MPMD environment. MRPC has been integrated into Compositional C++ (CC++), a parallel extension of C++ that offers an MPMD programming model. Basic performance in MRPC is within a factor of two from those of Split‐C, a highly tuned SPMD language, and other messaging layers. CC++ applications perform within a factor of two to six from comparable Split‐C versions, which represent an order of magnitude improvement over previous CC++ implementations. Copyright © 1999 John Wiley & Sons, Ltd.

10.

Impact of platform heterogeneity on the design of parallel algorithms for morphological processing of high-dimensional image data

Antonio Plaza Javier Plaza David Valencia 《The Journal of supercomputing》2007,40(1):81-107

The main objective of this paper is to describe a realistic framework to understand parallel performance of high-dimensional image processing algorithms in the context of heterogeneous networks of workstations (NOWs). As a case study, this paper explores techniques for mapping hyperspectral image analysis techniques onto fully heterogeneous NOWs. Hyperspectral imaging is a new technique in remote sensing that has gained tremendous popularity in many research areas, including satellite imaging and aerial reconnaissance. The automation of techniques able to transform massive amounts of hyperspectral data into scientific understanding in valid response times is critical for space-based Earth science and planetary exploration. Using an evaluation strategy which is based on comparing the efficiency achieved by a heterogeneous algorithm on a fully heterogeneous NOW with that evidenced by its homogeneous version on a homogeneous NOW with the same aggregate performance as the heterogeneous one, we develop a detailed analysis of parallel algorithms that integrate the spatial and spectral information in the image data through mathematical morphology concepts. For comparative purposes, performance data for the tested algorithms on Thunderhead (a large-scale Beowulf cluster at NASA’s Goddard Space Flight Center) are also provided. Our detailed investigation of the parallel properties of the proposed morphological algorithms provides several intriguing findings that may help image analysts in the selection of parallel techniques and strategies for specific applications.

11.

Open-source simulators for Cloud computing: Comparative study and challenging issues

《Simulation Modelling Practice and Theory》2015

Resource scheduling in infrastructure as a service (IaaS) is one of the keys for large-scale Cloud applications. Extensive research on all issues in a real environment is extremely difficult because it requires developers to consider network infrastructure and the environment, which may be beyond their control. In addition, the network conditions cannot be controlled or predicted. Performance evaluations of workload models and Cloud provisioning algorithms in a repeatable manner under different configurations are difficult. Therefore, simulators are developed. To better understand and apply the state of the art in Cloud computing simulators, and to improve them, we study four known open-source simulators. They are compared in terms of architecture, modeling elements, simulation process, performance metrics and scalability in performance. Finally, a few challenging issues as future research trends are outlined.

12.

Probabilistic performance analysis for parallel search techniques

Wei-Ming Lin Bo Yang 《International journal of parallel programming》1995,23(2):161-189

This paper discusses the performance analysis of two generic fundamental parallel search techniques on shared memory multi-processor systems in solving the constraint satisfaction problem (CSP). Probabilistic analysis of their expected number of computation steps and their inherent load-balancing capability is performed. Corresponding experimental results are also provided to verify the correctness of the proposed analysis. This fundamental analysis approach can be further applied to various advanced parallel search techniques or various problem solving techniques on parallel platforms. This research was supported in part by the University of Texas at San Antonio under the Faculty Research Award program.

13.

A case study on expressiveness and performance of component-oriented parallel programming

Francisco Heron de Carvalho Junior Cenez Araújo de Rezende 《Journal of Parallel and Distributed Computing》2013

Component-oriented programming has been applied to address the requirements of large-scale applications from computational sciences and engineering that present high performance computing (HPC) requirements. However, parallelism continues to be a challenging requirement in the design of CBHPC (Component-Based High Performance Computing) platforms. This paper presents strong evidence about the efficacy and the efficiency of HPE (Hash Programming Environment), a CBHPC platform that provides full support for parallel programming, on the development, deployment and execution of numerical simulation code onto cluster computing platforms.

14.

A security and cost aware scheduling algorithm for heterogeneous tasks of scientific workflow in clouds

《Future Generation Computer Systems》2016

Security is increasingly critical for various scientific workflows that are big data applications and typically take a considerable amount of time to execute on large-scale distributed infrastructures. A Cloud computing platform is such an infrastructure that can enable dynamic resource scaling on demand. Nevertheless, under a pay-per-use, hourly-based pricing model, users should pay attention to the cost incurred by renting virtual machines (VMs) from cloud data centers. Meanwhile, workflow tasks are generally heterogeneous and require different instance series (i.e., computing optimized, memory optimized, storage optimized, etc.). In this paper, we propose a security and cost aware scheduling (SCAS) algorithm for heterogeneous tasks of scientific workflow in clouds. Our proposed algorithm is based on the meta-heuristic optimization technique, particle swarm optimization (PSO), the coding strategy of which is devised to minimize the total workflow execution cost while meeting the deadline and risk rate constraints. Extensive experiments using three real-world scientific workflow applications, as well as the CloudSim simulation framework, demonstrate the effectiveness and practicality of our algorithm.

15.

Preface: Security and privacy in big data clouds

《Future Generation Computer Systems》2017

This special issue assembles a set of twelve papers, which provide new insights on the security and privacy technology of big data in cloud computing environments. This preface provides an overview of all articles in the viewpoint set.

16.

iPACS: Power-aware covering sets for energy proportionality and performance in data parallel computing clusters

Jinoh Kim Jerry Chou Doron Rotem 《Journal of Parallel and Distributed Computing》2014

Energy consumption in datacenters has recently become a major concern due to the rising operational costs and scalability issues. Recent solutions to this problem propose the principle of energy proportionality, i.e., the amount of energy consumed by the server nodes must be proportional to the amount of work performed. For data parallelism and fault tolerance purposes, most common file systems used in MapReduce-type clusters maintain a set of replicas for each data block. A covering subset is a group of nodes that together contain at least one replica of the data blocks needed for performing computing tasks. In this work, we develop and analyze algorithms to maintain energy proportionality by discovering a covering subset that minimizes energy consumption while placing the remaining nodes in low-power standby mode in a data parallel computing cluster. Our algorithms can also discover a covering subset in heterogeneous computing environments. In order to allow more data parallelism, we generalize our algorithms so that they can discover a k-covering subset, i.e., a set of nodes that contain at least k replicas of the data blocks. Our experimental results show that we can achieve substantial energy saving without significant performance loss in diverse cluster configurations and working environments.
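
To make the notion of a covering subset concrete, the sketch below applies the classic greedy set-cover heuristic: repeatedly activate the node that stores replicas of the most still-uncovered blocks. This is a generic approximation shown for illustration, not the iPACS algorithms; it ignores per-node power draw and the k-covering generalization, and `node_blocks` is a hypothetical input format.

```python
from typing import Dict, List, Set


def greedy_covering_subset(node_blocks: Dict[str, Set[int]]) -> List[str]:
    """Pick nodes until every block has a replica on some chosen node."""
    uncovered = set().union(*node_blocks.values())
    chosen = []
    while uncovered:
        # Node covering the most uncovered blocks; sorting the names first
        # makes tie-breaking deterministic (first name in sorted order wins).
        best = max(sorted(node_blocks),
                   key=lambda n: len(node_blocks[n] & uncovered))
        chosen.append(best)
        uncovered -= node_blocks[best]
    return chosen
```

Nodes left out of the returned list are the candidates for standby mode; an energy-aware variant could divide each node's coverage count by an assumed power cost before taking the maximum.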

17.

CEA: A Cyclic Expansion Algorithm for data migration in parallel video servers

Mingfu Li Hsun-Hao Yang 《Journal of Parallel and Distributed Computing》2012

Parallel video servers can achieve high storage savings and fine-grained load balancing, but they suffer from a system expansion problem. As the number of users continuously increases, the system inevitably needs to expand the number of video servers. However, the expansion of a parallel video server system is not as simple as that of a replicated video server system. Hence, this work develops an efficient expansion algorithm, called the Cyclic Expansion Algorithm (CEA), for parallel video servers. The proposed CEA algorithm has several good features. First, the data layout of each video content exhibits periodicity. Consequently, the meta-data size of each video and the complexity of the CEA algorithm are reduced. Second, the number of required data movements during a system expansion is optimized. Third, the total number of required XOR recomputations for updating parity blocks during an expansion is also minimized. Additionally, the new CEA can be applied to a variety of distributed storage systems, such as cloud-based storage systems using striping and parity check techniques.

18.

Related-task scheduling algorithm based on task hierarchy and time constraints in cloud computing

Chen Xi Mao Yingchi Jie Qing Zhu Lili 《Journal of Computer Applications》2014,34(11):3069-3072

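To address the problem of task execution delays that arise when related tasks are scheduled in cloud computing, a related-task scheduling algorithm based on task hierarchy and time constraints (RTS-THTC) is proposed. The algorithm represents the execution order of related tasks by constructing a directed acyclic graph (DAG), improves task parallelism by partitioning the DAG into layers, computes a completion-time constraint for the tasks in each layer, and schedules the tasks in each layer simultaneously onto the resources with the minimum completion time. Comparative experiments against the Heterogeneous Earliest Finish Time (HEFT) algorithm show that RTS-THTC achieves a shorter completion time than HEFT and effectively reduces delays in related tasks.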
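
The layering step can be sketched as follows: tasks with no unfinished predecessors form the first layer, and each later layer contains the tasks whose last predecessor sits in the layer before it. This covers only the DAG partitioning, not the completion-time constraints or resource selection of RTS-THTC, and the function name `layer_dag` and its input format are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def layer_dag(tasks: List[str], edges: List[Tuple[str, str]]) -> List[List[str]]:
    """Split a task DAG into layers whose tasks can run in parallel.

    edges holds (predecessor, successor) pairs.
    """
    successors: Dict[str, List[str]] = defaultdict(list)
    indegree = {t: 0 for t in tasks}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    layers = []
    ready = [t for t in tasks if indegree[t] == 0]
    while ready:
        layers.append(ready)
        next_ready = []
        for u in ready:
            for v in successors[u]:
                indegree[v] -= 1
                if indegree[v] == 0:  # all predecessors are now placed
                    next_ready.append(v)
        ready = next_ready

    if sum(len(layer) for layer in layers) != len(tasks):
        raise ValueError("task graph contains a cycle")
    return layers
```

All tasks within one returned layer are mutually independent, so a scheduler can dispatch them at the same time and move on once the layer's deadline has passed or its tasks have finished.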

19.

Energy-Aware Scheduling for Tasks with Target-Time in Blockchain based Data Centres

I. Devi G.R. Karpagam 《Computer Systems Science and Engineering》2022,40(2):405-419

Cloud computing infrastructures are intended to provide computing services to end-users through the internet in a pay-per-use model. The extensive deployment of the Cloud and the continuous increase in the capacity and utilization of data centers (DC) lead to massive power consumption. This intensifying scale of DCs has made energy consumption a critical concern. This paper emphasizes the task scheduling algorithm by formulating the system model to minimize the makespan and energy consumption incurred in a data center. Also, an energy-aware task scheduling in the Blockchain-based data center was proposed to offer an optimal solution that minimizes makespan and energy consumption. The established model was analyzed with a target-time responsive precedence scheduling algorithm. The observations were analyzed and compared with the traditional scheduling algorithms. The outcomes showed that the developed solution performs better with respect to resource utilization and reduced energy consumption. The investigation revealed that the applied strategy considerably enhanced the effectiveness of the designed schedule.

20.

A multiple parallel download scheme with server throughput and client bandwidth considerations for data grids (total citations: 1; self-citations: 0; citations by others: 1)

Ruay-Shiung Ming-Huang Hau-Chin 《Future Generation Computer Systems》2008,24(8):798-805

Though computer technology advances quickly, the computing speed and storage capacity of a single computer still cannot satisfy the requirements of many applications. As a result, grids have emerged to utilize the collective power of many computers. One of them, the data grid, provides a mechanism for handling a large amount of data. One of the characteristics of a data grid is to replicate files to many different computers so that a popular file is more available. If a grid site does not have a file, it will have to download it from other grid sites. Thus, the parallel download method, which allows a user to download different parts of a file from various computers simultaneously, is used to decrease download time. However, multiple parallel downloads will affect one another. Thus, if all jobs in the grid system use parallel download, resource competition and conflicts will arise. In this paper, we propose a parallel download scheme considering the server output throughput limits and client input bandwidth constraints. Experimental results show that the proposed download scheme outperforms static and dynamic parallel download schemes.
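
One simple way to picture a download plan that respects both constraints is sketched below: server rates are scaled down uniformly if their sum exceeds the client's input bandwidth, and the file is then split into byte ranges proportional to the resulting rates. This is an illustrative assumption, not the scheme proposed in the paper; `split_download` and its parameters are hypothetical names, and rates are taken to be in bytes per second.

```python
from typing import List, Tuple


def split_download(
    file_size: int,
    server_rates: List[float],
    client_bandwidth: float,
) -> Tuple[List[Tuple[int, int]], float]:
    """Return per-server byte ranges [start, end) and the estimated time."""
    total = sum(server_rates)
    if not server_rates or total <= 0 or client_bandwidth <= 0:
        raise ValueError("rates and bandwidth must be positive")

    # The client cannot receive faster than its own input bandwidth.
    scale = min(1.0, client_bandwidth / total)
    rates = [r * scale for r in server_rates]
    effective_total = sum(rates)

    ranges = []
    start = 0
    for idx, rate in enumerate(rates):
        if idx == len(rates) - 1:
            end = file_size  # last server takes the rounding remainder
        else:
            end = start + int(file_size * rate / effective_total)
        ranges.append((start, end))
        start = end
    return ranges, file_size / effective_total
```

Because every server finishes its share at roughly the same moment under this split, the estimated time is the file size divided by the effective aggregate rate.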