Similar Documents
20 similar documents found (search time: 46 ms)
1.
Volunteer computing uses the free resources in Internet and Intranet environments for large-scale computation and storage. Currently, 70 applications draw over 12 PetaFLOPS of computing power from such platforms. However, these platforms are currently limited to embarrassingly parallel applications. In an effort to broaden the set of applications that can leverage volunteer computing, we focus on the problem of predicting whether a group of resources will be continuously available for a relatively long time period. Ensuring the collective availability of volunteer resources is challenging due to their inherent volatility and autonomy, yet it is essential for enabling parallel applications and workflows on volunteer computing platforms. We evaluate our predictive methods using real availability traces gathered from hundreds of thousands of hosts in the SETI@home volunteer computing project. We show that our prediction methods can reliably guarantee the availability of collections of volunteer resources, and that this is particularly useful for service deployments over volunteer computing environments.
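
As a rough illustration of the idea (not the paper's actual predictor), the sketch below flags a host as dependable when its recent availability trace stays above a threshold, and declares a collection usable when enough hosts pass the test; the trace format, window sizes and threshold are assumptions:

    # Minimal sketch of per-host availability prediction from binary
    # availability traces (1 = host available in that hour, 0 = not).
    # The trace format, window sizes and threshold are illustrative
    # assumptions, not the paper's exact method.

    def predict_available(trace, history=24, threshold=0.9):
        """Predict whether a host will stay available, based on its
        average availability over the last `history` hours."""
        recent = trace[-history:]
        return sum(recent) / len(recent) >= threshold

    def collective_availability(traces, needed, **kw):
        """Select hosts predicted to remain available; the collection is
        usable if at least `needed` hosts pass the per-host test."""
        chosen = [h for h, t in traces.items() if predict_available(t, **kw)]
        return chosen if len(chosen) >= needed else None

    hosts = {"h1": [1] * 24, "h2": [1] * 20 + [0] * 4, "h3": [0, 1] * 12}
    print(collective_availability(hosts, needed=2))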

2.
CodiP2P and DisCoP are two peer-to-peer (P2P) computing overlays aimed at sharing computing resources (CPU, memory, etc.) to execute parallel applications. Their component nodes are ordinary PCs and a wide range of servers, desktops and laptops. This paper merges the two platforms into a new one, DisCoP2P, that combines the features of both overlays: CodiP2P is highly scalable, while DisCoP has an efficient search mechanism and the ability to classify computing resources. The new platform takes advantage of these features to offer new facilities for scheduling and executing parallel applications efficiently. This is accomplished at no cost because the platform is made up of nodes that share resources for free. This research field can also be classified as desktop computing. The success of the platform depends greatly on the added overhead, which arises mainly from resource searching and system administration. The results obtained with a preliminary prototype, although not fully conclusive, demonstrate the applicability of DisCoP2P in the real world, i.e. the Internet.

3.
Software cost models and effort estimates help project managers allocate resources, control costs, schedule work and improve current practices, leading to projects finished on time and within budget. These issues are also crucial in Web development, and very challenging given that Web projects have short schedules and highly fluid scope. In Web engineering, few studies have compared the accuracy of different types of cost estimation techniques, with emphasis placed on linear and stepwise regression and case-based reasoning (CBR). To date, only one type of CBR technique has been employed in Web engineering. We believe the results obtained in that study may have been biased, given that other CBR techniques can also be used for effort prediction. Consequently, the first objective of this study is to compare the prediction accuracy of three CBR techniques for estimating the effort to develop Web hypermedia applications, and to choose the one with the best estimates. The second objective is to compare the prediction accuracy of the best CBR technique against two commonly used prediction models, namely stepwise regression and regression trees. One dataset was used in the estimation process, and the results showed that the best predictions were obtained with stepwise regression.
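
To make the CBR idea concrete, here is a minimal analogy-based estimator: the effort of a new project is the mean effort of its k most similar past projects. The feature names and data are invented for illustration and are not from the study's dataset:

    # A minimal case-based reasoning (CBR) effort estimator: estimate a
    # new Web project's effort as the mean effort of its k nearest past
    # projects in feature space.
    import math

    past = [  # (features: pages, images, reused components) -> effort (person-hours)
        ({"pages": 20, "images": 30, "reuse": 5}, 120.0),
        ({"pages": 45, "images": 80, "reuse": 2}, 310.0),
        ({"pages": 10, "images": 12, "reuse": 8}, 60.0),
    ]

    def distance(a, b):
        """Euclidean distance between two projects' feature vectors."""
        return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))

    def cbr_estimate(new_project, cases, k=2):
        ranked = sorted(cases, key=lambda c: distance(new_project, c[0]))
        return sum(effort for _, effort in ranked[:k]) / k

    print(cbr_estimate({"pages": 25, "images": 40, "reuse": 4}, past))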

4.
The execution times of large-scale parallel applications on today's multi-/many-core systems are usually longer than the mean time between failures. Parallel applications must therefore tolerate hardware failures so that not all completed computation is lost when a machine fails. Checkpointing and rollback recovery is one of the most popular techniques for implementing fault-tolerant applications. However, checkpointing parallel applications is expensive in terms of computing time, network utilization and storage resources, so current checkpoint-recovery techniques must minimize these costs to be useful for large-scale systems. In this paper, three different and complementary techniques to reduce the size of the checkpoints generated by application-level checkpointing are proposed and implemented. Detailed experimental results obtained on a multicore cluster show the effectiveness of the proposed methods in reducing checkpointing cost.
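
The abstract does not spell out its three techniques, but a common way to shrink application-level checkpoints is incremental checkpointing combined with compression; the sketch below illustrates that general idea under assumed block size and hashing choices:

    # Sketch of incremental checkpointing: only blocks that changed since
    # the previous checkpoint are written, and each block is compressed.
    # Block size and hashing are illustrative choices, not necessarily
    # the paper's techniques.
    import hashlib, zlib

    BLOCK = 4096
    _last_hashes = {}

    def checkpoint(state: bytes):
        """Return a compressed delta checkpoint: {block_offset: data}."""
        delta = {}
        for i in range(0, len(state), BLOCK):
            block = state[i:i + BLOCK]
            h = hashlib.sha1(block).digest()
            if _last_hashes.get(i) != h:          # block changed since last time
                delta[i] = zlib.compress(block)   # compression further cuts size
                _last_hashes[i] = h
        return delta

    full = bytes(1_000_000)                 # initial state: all zeros
    print(len(checkpoint(full)))            # first checkpoint stores every block
    mutated = b"x" * BLOCK + full[BLOCK:]   # change a single block
    print(len(checkpoint(mutated)))         # second checkpoint stores one block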

5.
Conventional performance evaluation mechanisms focus on dedicated systems. Grid computing infrastructure, on the other hand, is a shared collaborative environment built on virtual organizations, each with its own resource management policy and usage pattern. This non-dedicated characteristic of Grid computing prevents conventional performance evaluation systems from being applied directly. In this study, we introduce the Grid Harvest Service (GHS), a performance evaluation and task scheduling system for solving large-scale applications in a shared environment. GHS is based on a novel performance prediction model and a set of task scheduling algorithms, and supports three classes of task scheduling: single task, parallel processing and meta-task. Experimental results show that GHS provides a satisfactory solution for performance prediction and task scheduling of large applications and has real potential.

6.
This article presents a parallel self-verified solver for dense linear systems of equations. Such solvers are commonly used in many kinds of real applications that deal with large matrices. Nevertheless, two key problems limit the use of linear system solvers in a wider range of real applications: solution correctness and high computational cost. To address the first, verified computing is an attractive choice: an algorithm built on this concept finds a highly accurate and automatically verified result, providing greater reliability. However, the performance of these algorithms quickly becomes a drawback. Aiming at better performance, parallel computing techniques were employed, and two main parts of the method were parallelized: the computation of the approximate inverse of matrix A and the preconditioning step. The results obtained show that these optimizations significantly increase the overall performance.
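
A minimal sketch of the self-verification structure, assuming the classical approximate-inverse approach: compute R ≈ A⁻¹, refine the solution with preconditioned residual iteration, and bound the error via ‖I − RA‖. Plain numpy cannot deliver a rigorous bound (that requires interval arithmetic with directed rounding), so this only illustrates the shape of the computation:

    # Structure of a self-verified solve: approximate inverse, residual
    # refinement, and an a-posteriori error bound. Illustrative only; a
    # real verified solver replaces floating point with intervals.
    import numpy as np

    def self_verified_solve(A, b):
        R = np.linalg.inv(A)          # approximate inverse (parallelizable step)
        x = R @ b                     # approximate solution
        for _ in range(5):            # preconditioned residual iteration
            x = x + R @ (b - A @ x)
        defect = np.linalg.norm(np.eye(len(b)) - R @ A, np.inf)
        assert defect < 1, "R too inaccurate to verify the result"
        # Banach-type bound: ||x* - x|| <= ||R(b - Ax)|| / (1 - ||I - RA||)
        err = np.linalg.norm(R @ (b - A @ x), np.inf) / (1 - defect)
        return x, err                 # solution plus an error bound

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    x, err = self_verified_solve(A, A @ np.array([1.0, 2.0]))
    print(x, err)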

7.
In this paper we present a new environment called MERPSYS that allows simulation of parallel application execution time on cluster-based systems. The environment offers a modeling language based on Java, extended with methods representing message-passing communication routines, and a graphical interface for building a system model that incorporates various hardware components such as CPUs, GPUs and interconnects, making it easy to attach formulas that model the execution and communication times of particular blocks of code. A simulator engine within MERPSYS simulates execution of an application consisting of processes with various codes, to which distinct labels are assigned. The simulator runs one Java thread per label and scales computation and communication times accordingly. This approach allows fast coarse-grained simulation of large applications on large-scale systems. We have performed tests and verification of the simulator's results for three real parallel applications implemented with C/MPI and run on real HPC clusters: a master-slave code computing similarity measures of points in a multidimensional space, a geometric single-program multiple-data application computing heat distribution, and a divide-and-conquer application performing merge sort. In all cases the simulator gave results very similar to the real ones on configurations tested up to 1000 processes. Furthermore, it allowed us to predict execution times on configurations beyond the hardware resources available to us.
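
The following toy simulator illustrates the coarse-grained approach of scaling computation and communication times analytically; the formulas and hardware parameters are illustrative assumptions, not MERPSYS's actual models:

    # Coarse-grained simulation in the spirit described above: predict the
    # time of a compute/exchange loop from simple analytical formulas,
    # assuming perfect load balance across processes.

    def simulate(n_procs, work_flops, msg_bytes, flops_per_s=2e9,
                 bandwidth=1e9, latency=5e-6, steps=100):
        t_comp = work_flops / n_procs / flops_per_s    # scaled by process count
        t_comm = latency + msg_bytes / bandwidth       # per message exchange
        return steps * (t_comp + t_comm)

    for p in (16, 64, 256, 1024):
        print(p, f"{simulate(p, work_flops=1e12, msg_bytes=1e6):.2f} s")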

8.
In recent years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. They are becoming increasingly complex and more demanding in terms of computational resources, thus requiring parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm where resources are virtualized and provided on demand; by using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was high throughput computing, they are already being used to provide an HPC environment where elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open, yet important, challenges, such as scheduling workflow activities. Scheduling parallel scientific workflows in the cloud is a very complex task, since one must take many different criteria into account and exploit elasticity to optimize workflow execution. In this paper, we introduce an adaptive scheduling heuristic for parallel execution of scientific workflows in the cloud based on three criteria: total execution time (makespan), reliability and financial cost. Besides scheduling workflow activities based on a 3-objective cost model, this approach also scales resources up and down according to restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution.
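
One simple way to realize a 3-objective decision is sketched below, under assumed weights, VM data and a linear aggregation; the paper's actual cost model may differ:

    # Score each candidate VM for a workflow activity by weighted
    # makespan, financial cost and failure risk, then greedily pick the
    # cheapest score. All numbers are invented for illustration.

    vms = [  # (name, est. runtime in h, $ per h, reliability in [0,1])
        ("small", 4.0, 0.10, 0.99),
        ("large", 1.5, 0.45, 0.98),
        ("spot",  1.5, 0.15, 0.90),
    ]

    def score(runtime, price, reliability, w_time=0.5, w_cost=0.3, w_rel=0.2):
        return (w_time * runtime / 4.0            # rough runtime normalization
                + w_cost * runtime * price / 0.4  # rough cost normalization
                + w_rel * (1.0 - reliability))    # penalize failure risk

    best = min(vms, key=lambda v: score(*v[1:]))
    print("schedule activity on:", best[0])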

9.
Grids consist of both dedicated and non-dedicated clusters. For effective mapping of parallel applications onto grid resources, a grid metascheduler has to evaluate different sets of resources in terms of the predicted execution times of the applications on those sets. In this work, we have developed a comprehensive set of performance modeling strategies for predicting the execution times of parallel applications in both dedicated and non-dedicated environments. Our strategies adapt to changing network and CPU loads on the grid resources. We have evaluated them on 8-, 16-, 24- and 32-node clusters with random loads and with load traces from a grid system. Our strategies give less than 30% average prediction error in all cases, which, to our knowledge, is the best reported for non-dedicated environments. We also found that grid scheduling using execution-time predictions from our performance modeling techniques leads to perfect mapping of applications to resources in many cases.
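
As an illustration of load-adaptive prediction (a much simpler model than the paper's strategies, which also track network load), each node's effective speed can be discounted by its current CPU load:

    # Illustrative execution-time prediction for a non-dedicated cluster:
    # a bulk-synchronous application finishes when its slowest node does,
    # and each node's speed is reduced by its external CPU load.

    def predict_time(work_per_node_flops, nodes):
        """nodes: list of (peak_flops, cpu_load in [0,1)) pairs."""
        return max(w / (peak * (1.0 - load))
                   for w, (peak, load) in zip(work_per_node_flops, nodes))

    cluster = [(2e9, 0.10), (2e9, 0.65), (3e9, 0.30)]
    print(predict_time([1e11] * 3, cluster), "seconds")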

10.
The widespread adoption of traditional heterogeneous systems has substantially improved the available computing power and, at the same time, raised optimisation issues related to processing task streams across both CPU and GPU cores. Mirroring the heterogeneity gains in traditional systems, cloud computing has started to add heterogeneity support, typically through GPU instances, to the conventional CPU-based cloud resources. This optimisation of cloud resources will arguably have a real impact when running on-demand computationally-intensive applications.

In this work, we investigate the scaling of pattern-based parallel applications from physical, “local” mixed CPU/GPU clusters to a public cloud CPU/GPU infrastructure. Such parallel patterns are deployed via algorithmic skeletons to exploit a specific parallel behaviour while hiding implementation details. We propose a systematic methodology for exploiting approximated analytical performance/cost models, together with an integrated programming framework suitable for targeting both local and remote resources, to support offloading computations from structured parallel applications to heterogeneous cloud resources, so that performance values not attainable with local resources alone may actually be achieved with remote resources. The amount of remote resources necessary to achieve a given performance target is calculated through the performance models, allowing any user to hire just the cloud resources needed to reach that target. It is therefore expected that such models can be used to devise the optimal proportion of computations to allocate to different remote nodes for Big Data computations.

We present experiments run with a proof-of-concept implementation based on FastFlow, on small departmental clusters as well as on a public cloud infrastructure with CPUs and GPUs using the Amazon Elastic Compute Cloud. In particular, we show how CPU-only and mixed CPU/GPU computations can be offloaded to remote cloud resources with predictable performance, and how data-intensive applications can be mapped to a mix of local and remote resources to guarantee optimal performance.
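
To illustrate how a performance model can be inverted to size a cloud hire, the sketch below assumes a simple fixed-serial-fraction scaling model; it is not the approximated analytical models used with FastFlow:

    # Given a target completion time, invert a linear-scaling model with
    # a fixed serial fraction to get the number of remote instances to
    # hire. Model and parameters are illustrative assumptions.
    import math

    def instances_needed(total_work_s, serial_fraction, target_s):
        serial_work = total_work_s * serial_fraction
        parallel_work = total_work_s * (1.0 - serial_fraction)
        if target_s <= serial_work:
            raise ValueError("target below the serial lower bound")
        return math.ceil(parallel_work / (target_s - serial_work))

    # e.g. 3 hours of single-node work, 5% serial, finish within 15 minutes
    print(instances_needed(10800, 0.05, 900))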

11.
Dynamically allocating computing nodes to parallel applications is a promising technique for improving the utilization of cluster resources. Detailed simulations can help identify allocation strategies and problem decomposition parameters that increase the efficiency of parallel applications. We describe a simulation framework supporting dynamic node allocation which, given a simple cluster model, predicts the running time of parallel applications while taking CPU and network sharing into account. Simulations can be carried out without modifying the application code. Thanks to partial direct execution, simulation times and memory requirements are reduced: the application's parallel behavior is retrieved via direct execution, while the durations of individual operations are obtained from a performance prediction model or from prior measurements. Simulations may then vary cluster model parameters, operation durations and problem decomposition parameters to analyze their impact on application performance and identify the limiting factors. We implemented the proposed techniques by adding direct-execution simulation capabilities to the Dynamic Parallel Schedules parallelization framework. We also introduce the concept of dynamic efficiency, which expresses resource utilization efficiency as a function of time. We verify the accuracy of our simulator by comparing the measured running time and dynamic efficiency of parallel program executions with those predicted by the simulator under different parallelization and dynamic node allocation strategies.
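
A minimal reading of dynamic efficiency, under the assumption that it is the ratio of useful node time to allocated node time per interval (the paper's precise definition may differ):

    # Dynamic efficiency as a time series: useful node-seconds divided by
    # allocated node-seconds in each interval. Trace format is assumed.

    def dynamic_efficiency(useful, allocated):
        """useful[t]: node-seconds of useful work in interval t;
        allocated[t]: node-seconds allocated in interval t."""
        return [u / a if a else 0.0 for u, a in zip(useful, allocated)]

    # nodes are added at t=2, but the application ramps up work slowly,
    # so efficiency dips right after the allocation grows
    print(dynamic_efficiency(useful=[4, 4, 5, 7, 8], allocated=[4, 4, 8, 8, 8]))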

12.
The addition of reconfigurable hardware (FPGAs) to the nodes of Beowulf-style clusters has the potential to accelerate a variety of parallel applications through a combination of parallel programming and reconfigurable computing techniques. However, making efficient use of the available computational resources places a significant burden on the application developer, due to the lack of support for reconfigurable computing and task heterogeneity in standard message-passing libraries. This paper describes Accessible Reconfigurable Computing (ARC), a metacomputing environment designed to address these issues. The architecture, implementation and operation of the system are described in detail.

13.
This paper addresses efficient mapping and reconfiguration of advanced video applications onto a general-purpose multi-core platform. By accurately modeling an application's resource usage, the allocation of processing resources on the platform can be based on the resources actually needed instead of a worst-case estimate, thereby improving Quality-of-Service (QoS). We exploit a new and rapidly emerging class of dynamic video applications, based on image and content analysis, for resource management and control. Such applications are characterized by irregular computing behavior and memory usage. It is shown that with linear models and statistical techniques based on Markov modeling, rather good accuracy (94–97%) in predicting resource usage can be obtained. This accuracy is good enough to allow resource prediction at runtime, leading to an actively controlled system management.
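
A first-order Markov predictor over quantized load levels captures the core of the statistical technique mentioned above; the three-level quantization and trace are illustrative:

    # Minimal first-order Markov predictor for quantized resource usage:
    # count state transitions in a training trace, then predict the most
    # likely next state at runtime.
    from collections import Counter, defaultdict

    def train(states):
        counts = defaultdict(Counter)
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1          # transition a -> b observed once more
        return counts

    def predict(counts, current):
        nxt = counts.get(current)
        return nxt.most_common(1)[0][0] if nxt else current

    trace = "LLMMHHMMLLMMHHMM"          # L/M/H = low/medium/high usage
    model = train(trace)
    print(predict(model, "H"))          # most likely level after 'H'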

14.

Workload prediction is an essential prerequisite for allocating resources efficiently and maintaining service level agreements in cloud computing environments. However, the best solution for a prediction task may not be a single model, given the varied characteristics of different systems. In this work, we therefore propose an ensemble model, ESNemble, based on echo state networks (ESN) for workload time series forecasting. ESNemble consists of four main steps: feature selection using ESN reservoirs, dimensionality reduction using kernel principal component analysis, feature aggregation using matrix concatenation, and regression using the least absolute shrinkage and selection operator (LASSO) for the final predictions. The necessary hyperparameters for ESNemble are optimized using a genetic algorithm. For experimental evaluation, we used ESNemble to combine five different prediction algorithms on three recent logs extracted from real-world web servers. Our experimental results show that ESNemble outperforms all component models in terms of accuracy and resource allocation, and we report the running time of our model to show its feasibility for real-world applications.
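
The pipeline structure described above can be sketched as follows with numpy and scikit-learn; reservoir size, spectral radius, component forecasts and all data are toy assumptions, and the genetic-algorithm hyperparameter search is omitted:

    # Structural sketch of the ESNemble pipeline: component forecasts ->
    # ESN reservoir features -> kernel PCA -> concatenation -> LASSO.
    import numpy as np
    from sklearn.decomposition import KernelPCA
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)

    def reservoir_states(u, size=50, rho=0.9):
        """Drive a random echo-state reservoir with input series u."""
        W = rng.normal(size=(size, size))
        W *= rho / max(abs(np.linalg.eigvals(W)))   # scale spectral radius
        w_in = rng.normal(size=size)
        x, states = np.zeros(size), []
        for u_t in u:
            x = np.tanh(W @ x + w_in * u_t)
            states.append(x.copy())
        return np.array(states)

    # toy workload series and two "component model" forecasts of it
    y = np.sin(np.linspace(0, 20, 300)) + 0.1 * rng.normal(size=300)
    forecasts = [np.roll(y, 1), 0.9 * np.roll(y, 1)]

    feats = np.hstack([reservoir_states(f) for f in forecasts])   # aggregate
    feats = KernelPCA(n_components=10, kernel="rbf").fit_transform(feats)
    model = Lasso(alpha=1e-3).fit(feats[:-1], y[1:])              # one step ahead
    print("last prediction:", model.predict(feats[-1:])[0])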

15.
16.
Nowadays, high-performance computing (HPC) clusters are increasingly popular, and large volumes of job logs recording many years of operation traces have been accumulated. At the same time, the HPC cloud makes it possible to access HPC services remotely. To execute applications, both HPC end-users and cloud users need to request specific resources for different workloads by themselves. Since users are usually unfamiliar with the hardware details, the software layers and the performance behavior of the underlying HPC systems, it is hard for them to select optimal resource configurations in terms of performance, cost and energy efficiency. Hence, how to provide on-demand services with intelligent resource allocation is a critical issue in the HPC community, and the prediction of job characteristics plays a key role in enabling it. This paper presents a survey of existing work and future directions for the prediction of job characteristics for intelligent resource allocation in HPC systems. We first review existing techniques for obtaining performance and energy consumption data of jobs. Then we survey techniques for single-objective predictions of runtime, queue time, power and energy consumption, cost and optimal resource configuration for input jobs, as well as multi-objective predictions. We conclude by discussing future trends, research challenges and possible solutions towards intelligent resource allocation in HPC systems.

17.
This paper reports an application-dependent network design for extreme-scale high performance computing (HPC) applications. Traditional scalable network designs focus on fast point-to-point transmission of generic data packets; the proposed network instead focuses on the sustainability of HPC applications by statistically multiplexing semantic data objects. For HPC applications using data-driven parallel processing, a tuple is such a semantic object. We report the design and implementation of a tuple switching network for data-parallel HPC applications that gains performance and reliability at the same time as computing and communication resources are added. We describe a sustainability model and a simple computational experiment to demonstrate an extreme-scale application's sustainability under decreasing system mean time between failures (MTBF). Assuming a threefold slowdown from statistical multiplexing and a 35% time loss per checkpoint, a two-tier tuple switching framework would produce sustained performance and energy savings for extreme-scale HPC applications using more than 1024 processors or with an MTBF below six hours. Higher processor counts or higher checkpoint overheads accelerate the benefits.
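
A back-of-the-envelope version of that trade-off: compare a checkpoint/restart baseline against a multiplexed design running at one third speed, and see where the crossover lies. The efficiency formulas below are illustrative assumptions, not the authors' sustainability model:

    # Compare useful-work fractions: checkpointing pays a 35% overhead
    # per interval plus rework after each failure; multiplexing pays a
    # flat 3x slowdown but rides through failures.

    def ckpt_efficiency(interval_h, mtbf_h, loss=0.35):
        overhead = 1.0 / (1.0 + loss)            # checkpoint cost per interval
        rework = (interval_h / 2.0) / mtbf_h     # expected recomputed fraction
        return max(0.0, overhead * (1.0 - rework))

    MUX_EFFICIENCY = 1.0 / 3.0   # threefold slowdown of statistical multiplexing

    for mtbf in (24.0, 12.0, 6.0, 3.0):
        e = ckpt_efficiency(interval_h=4.0, mtbf_h=mtbf)
        winner = "multiplexing" if MUX_EFFICIENCY > e else "checkpointing"
        print(f"MTBF {mtbf:4.1f} h: checkpoint efficiency {e:.2f} -> {winner}")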

18.
While the research community has already studied a considerable number of techniques for achieving high bandwidth, good reliability, low power consumption and certain quality-of-service guarantees in networks on chip (NOC), especially with artificial communication patterns, little attention has been paid to the effect of memory organization on the performance of computing engines that employ NOCs under real parallel workloads. In this paper we compare the performance of several shared memory organizations for chip multiprocessors (CMP) employing advanced homogeneous 2D-mesh-like NOCs and making use of emulated shared memory and non-uniform memory access models. The evaluated techniques range from different hashing functions to methods for eliminating the speed difference between processing resources and memories, and from access methods to latency-hiding and concurrent memory access support techniques. Tests are performed on our CMP/NOC framework with simple but real parallel programs that can be used directly as building blocks of larger explicitly parallel applications.

19.
In cloud systems, a clear need emerges for efficient and scalable use of computing resources, and accurate predictions of computing resource load are key to achieving it. With accurate predictions, reduced power consumption and enhanced revenue can be achieved, since resources can be ready when users need them and shut down when they are no longer needed. This work presents an architecture to manage web applications based on cloud computing, which combines both local and public cloud resources, together with the algorithms needed to manage such an architecture efficiently. Among them, a load forecasting algorithm based on exponential smoothing has been developed. A use case drawn from our university's e-learning services has been evaluated through a series of simulations to illustrate the behaviour of the architecture. Among the most remarkable results, power consumption is reduced by 32% at a cost of US$367.31 a month compared with the current architecture.
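
For concreteness, here is a one-step-ahead forecaster using simple exponential smoothing, the family of methods named above; the smoothing constant and series are illustrative, and the paper's variant and tuning may differ:

    # Simple exponential smoothing: the next-interval forecast is an
    # exponentially weighted average of past observations.

    def exp_smooth_forecast(series, alpha=0.3):
        level = series[0]
        for x in series[1:]:
            level = alpha * x + (1 - alpha) * level
        return level

    requests_per_min = [120, 130, 128, 150, 170, 165, 180]
    forecast = exp_smooth_forecast(requests_per_min)
    print(f"expected load next interval: {forecast:.1f} req/min")
    # An autoscaler could boot VMs when the forecast exceeds capacity and
    # shut them down when it falls below, cutting idle power consumption.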

20.
A PTS-PGATS based approach for data-intensive scheduling in data grids
Grid computing is the combination of computer resources in a loosely coupled, heterogeneous and geographically dispersed environment. Grid data are the data used in grid computing, which consists of large-scale data-intensive applications that produce and consume huge amounts of data distributed across a large number of machines. Data grid computing composes sets of independent tasks, each of which requires massive distributed data sets that may each be replicated on different resources. To reduce the completion time of an application and improve the performance of the grid, appropriate computing resources should be selected to execute the tasks, and appropriate storage resources selected to serve the files the tasks require. The problem can thus be broken into two sub-problems: selection of storage resources and assignment of tasks to computing resources. This paper proposes a scheduler that is broken into three parts that can run in parallel, using both parallel tabu search and a parallel genetic algorithm. Finally, the proposed algorithm is evaluated by comparing it with other related algorithms that target minimizing makespan. Simulation results show that the proposed approach can be a good choice for scheduling large data grid applications.
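
A toy genetic algorithm for the task-assignment sub-problem conveys the flavor of the evolutionary half of the approach; data, operators and rates are invented, and the actual PTS-PGATS also runs parallel tabu search and selects storage resources:

    # Toy GA for assigning tasks to computing resources: a chromosome
    # maps each task to a resource, fitness is the predicted makespan.
    import random

    random.seed(1)
    task_cost = [8, 3, 5, 9, 2, 7, 4]   # task runtimes on a unit-speed node
    speeds = [1.0, 2.0, 1.5]            # relative speeds of 3 resources

    def makespan(assign):
        loads = [0.0] * len(speeds)
        for t, r in enumerate(assign):
            loads[r] += task_cost[t] / speeds[r]
        return max(loads)

    def evolve(pop_size=30, gens=100, mut=0.2):
        pop = [[random.randrange(len(speeds)) for _ in task_cost]
               for _ in range(pop_size)]
        for _ in range(gens):
            pop.sort(key=makespan)                 # best makespan first
            survivors = pop[:pop_size // 2]
            children = []
            while len(survivors) + len(children) < pop_size:
                a, b = random.sample(survivors, 2)
                cut = random.randrange(1, len(task_cost))   # one-point crossover
                child = a[:cut] + b[cut:]
                if random.random() < mut:                   # random mutation
                    child[random.randrange(len(child))] = \
                        random.randrange(len(speeds))
                children.append(child)
            pop = survivors + children
        return min(pop, key=makespan)

    best = evolve()
    print(best, makespan(best))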
