1.

Scalable dynamic Monitoring, Analysis and Tuning Environment for parallel applications

P. Caymes-Scutari A. Morajko T. Margalef E. Luque 《Journal of Parallel and Distributed Computing》2010

Parallel/distributed systems are continuously growing. This enables applications to scale, either by tackling bigger problems in the same period of time or by solving the same problem in a shorter time. Consequently, the methodologies, approaches and tools related to the parallel paradigm should be brought up to date to support the increasing requirements of applications and users. MATE (Monitoring, Analysis and Tuning Environment) provides automatic and dynamic tuning for parallel/distributed applications. The tuning decisions are made according to performance models, which provide a fast means of deciding what to improve in the execution. However, MATE presents some bottlenecks as the application grows, because the analysis process is carried out in a fully centralized manner. In this work, we propose a new approach to make MATE scalable. In addition, we present experimental results and analysis to validate the proposed approach against the original one.

2.

A constrained, globalized, and bounded Nelder–Mead method for engineering optimization

M.A. Luersen R. Le Riche F. Guyon 《Structural and Multidisciplinary Optimization》2004,27(1-2):43-54

One of the fundamental difficulties in engineering design is the multiplicity of local solutions. This has triggered much effort in the development of global search algorithms. Globality, however, often has a prohibitively high numerical cost for real problems. A fixed cost local search, which sequentially becomes global, is developed in this work. Globalization is achieved by probabilistic restarts. A spatial probability of starting a local search is built based on past searches. An improved Nelder–Mead algorithm is the local optimizer. It accounts for variable bounds and nonlinear inequality constraints. It is additionally made more robust by reinitializing degenerated simplexes. The resulting method, called the Globalized Bounded Nelder–Mead (GBNM) algorithm, is particularly adapted to tackling multimodal, discontinuous, constrained optimization problems, for which it is uncertain that a global optimization can be afforded. Numerical experiments are given on two analytical test functions and two composite laminate design problems. The GBNM method compares favorably with an evolutionary algorithm, both in terms of numerical cost and accuracy.

3.

A restarted and modified simplex search for unconstrained optimization

Qiu Hong Zhao Dragan Urošević Nenad Mladenović Pierre Hansen 《Computers & Operations Research》2009,36(12):3263

In this paper we propose a simple but efficient modification of the well-known Nelder–Mead (NM) simplex search method for unconstrained optimization. Instead of moving all n simplex vertices at once in the direction of the best vertex, our “shrink” step moves them in the same direction but one by one until an improvement is obtained. In addition, for solving non-convex problems, we simply restart the so-modified NM (MNM) method by constructing an initial simplex around the solution obtained in the previous phase. We repeat restarts until there is no improvement in the objective function value. Thus, our restarted modified NM (RMNM) is a descent and deterministic method and may be seen as an extended local search for continuous optimization. In order to improve computational complexity and efficiency, we use the heap data structure for storing and updating simplex vertices. Extensive empirical analysis shows that our modified method outperforms, on average, the original version as well as some other recent successful modifications; in solving global optimization problems, it is comparable with state-of-the-art heuristics.
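The restart idea described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the local optimizer below is a simplified textbook Nelder–Mead with the standard shrink step, not the authors' modified one-by-one shrink, and it does not use a heap. The names `nelder_mead` and `restarted_search`, the step size, and the tolerances are all hypothetical choices made for this sketch.

```python
def nelder_mead(f, x0, step=0.5, tol=1e-12, max_iter=5000):
    """Simplified Nelder-Mead (reflect, expand, inside-contract, shrink)."""
    n = len(x0)
    # Initial simplex: x0 plus one vertex displaced along each coordinate axis.
    simplex = [list(x0)]
    for i in range(n):
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if f(worst) - f(best) < tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point on the line through the worst vertex and the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Standard shrink: pull every vertex halfway toward the best.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)


def restarted_search(f, x0, eps=1e-12, max_restarts=20):
    """Restart the local search from its own result until no improvement."""
    best = nelder_mead(f, x0)
    for _ in range(max_restarts):
        # A fresh simplex is built around the previous solution.
        candidate = nelder_mead(f, best)
        if f(best) - f(candidate) <= eps:
            break
        best = candidate
    return best


if __name__ == "__main__":
    def shifted_sphere(p):
        return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

    print(restarted_search(shifted_sphere, [0.0, 0.0]))
```

Because each restart includes the previous solution as a vertex of the new simplex, the objective value never gets worse from one phase to the next, which mirrors the descent property claimed in the abstract.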

4.

Hybrid parallel chaos optimization algorithm with harmony search algorithm

《Applied Soft Computing》2014

The application of chaotic sequences can be an interesting alternative to provide search diversity in an optimization procedure, named chaos optimization algorithm (COA). Since chaotic motion is pseudo-random and chaotic sequences are sensitive to the initial conditions, the search ability of COA is usually affected by the starting values. Considering this weakness, a parallel chaos optimization algorithm (PCOA) is studied in this paper. To obtain the optimum solution accurately, the harmony search algorithm (HSA) is integrated with PCOA to form a novel hybrid algorithm. Different chaotic maps are compared and the impacts of the parallel parameter on the hybrid algorithm are discussed. Several simulation results are used to show the effective performance of the proposed hybrid algorithm.

5.

Optimization of the parallel frequent pattern growth algorithm based on a distributed coordination system

王洁 戴清濒 李环 《计算机科学》2012,39(3):174-182

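Frequent pattern mining discovers patterns that occur frequently in data and is an important step in association rule mining. Parallel frequent pattern algorithms apply it in a parallel environment in order to mine massive data sets. Building on the implementation in the Apache Software Foundation's Mahout project, this paper proposes new optimization strategies for the counting and sorting phases and for the execution order of the algorithm. The optimized design stores the counting information in a distributed coordination system, making full use of that system's high availability and its suitability for storing metadata. The design reduces the overhead of small files on the distributed file system (HDFS) while retaining its advantages, and it also allows the counting and sorting processes to run in parallel, which reduces the memory overhead of the compute nodes. The paper compares the file system I/O overhead and analyzes the difficulties in implementing the design, laying a foundation for future work.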

6.

A genetic algorithm and a particle swarm optimizer hybridized with Nelder–Mead simplex search

Shu-Kai S. Fan Yun-Chia Liang Erwie Zahara 《Computers & Industrial Engineering》2006,50(4):401-425

This paper integrates the Nelder–Mead simplex search method (NM) with the genetic algorithm (GA) and particle swarm optimization (PSO), respectively, in an attempt to locate the global optimal solutions of nonlinear continuous variable functions, focusing mainly on response surface methodology (RSM). Both the hybrid NM–GA and NM–PSO algorithms incorporate concepts from NM, GA or PSO, which are easy to implement in practice, and the computation of functional derivatives is not necessary. The hybrid methods were first illustrated through four test functions from the RSM literature and were compared with the original NM, GA and PSO algorithms. In each test scheme, the effectiveness, efficiency and robustness of these methods were evaluated via associated performance statistics, and the proposed hybrid approaches prove to be very suitable for solving RSM-type optimization problems. The hybrid methods were then tested on ten difficult nonlinear continuous functions and were compared with the best known heuristics in the literature. The results show that both hybrid algorithms were able to reach the global optimum in all runs at a comparable computational expense.

7.

Efficient task scheduling for budget constrained parallel applications on heterogeneous cloud computing systems

《Future Generation Computer Systems》2017

As cost-driven public cloud services emerge, the budget constraint is one of the primary design issues in large-scale scientific applications executed on heterogeneous cloud computing systems. Minimizing the schedule length while satisfying the budget constraint of an application is one of the most important quality of service requirements for cloud providers. A directed acyclic graph (DAG) can be used to describe an application consisting of multiple tasks with precedence constraints. Previous DAG scheduling methods tried to presuppose the minimum cost assignment for each task to minimize the schedule length of budget constrained applications on heterogeneous cloud computing systems. However, our analysis revealed that the preassignment of tasks with the minimum cost does not necessarily lead to the minimization of the schedule length. In this study, we propose an efficient algorithm for minimizing the schedule length using the budget level (MSLBL), which selects processors so as to satisfy the budget constraint and minimize the schedule length of an application. The problem is decomposed into two sub-problems, namely, satisfying the budget constraint and minimizing the schedule length. The first sub-problem is solved by transferring the budget constraint of the application to that of each task, and the second sub-problem is solved by heuristically scheduling each task with low time complexity. Experimental results on several real parallel applications validate that the proposed MSLBL algorithm can obtain shorter schedule lengths than existing methods while satisfying the budget constraint of an application in various situations.
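To make the two sub-problems in this abstract concrete, here is a minimal sketch. It rests on an assumption that is not stated in the abstract: the application budget is spread over tasks with a single "budget level" between 0 and 1 that interpolates between each task's minimum and maximum execution cost. The function names `task_budgets` and `pick_processor` are hypothetical, and the sketch ignores DAG precedence and any carry-over of unused budget.

```python
def task_budgets(min_costs, max_costs, budget):
    """Split an application budget into per-task budgets (assumed rule)."""
    total_min = sum(min_costs)
    total_max = sum(max_costs)
    if budget < total_min:
        raise ValueError("budget is below the minimum possible cost")
    if total_max == total_min:
        level = 0.0
    else:
        # Budget level: 0 means minimum cost everywhere, 1 means maximum.
        level = min(1.0, (budget - total_min) / (total_max - total_min))
    return [lo + level * (hi - lo) for lo, hi in zip(min_costs, max_costs)]


def pick_processor(costs, finish_times, task_budget):
    """Pick the affordable processor with the earliest finish time."""
    affordable = [i for i, c in enumerate(costs) if c <= task_budget]
    if not affordable:
        raise ValueError("no processor fits the task budget")
    return min(affordable, key=lambda i: finish_times[i])


if __name__ == "__main__":
    budgets = task_budgets([2.0, 4.0], [6.0, 8.0], budget=10.0)
    print(budgets)
    print(pick_processor([2.0, 4.0, 6.0], [9.0, 5.0, 3.0], budgets[0]))
```

The first function addresses the budget sub-problem and the second the schedule-length sub-problem for a single task; a full scheduler would also order tasks by priority and track processor availability.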

8.

IMC based Robust PID design: Tuning guidelines and automatic tuning (total citations: 4; self-citations: 1; citations by others: 3)

R. Vilanova 《Journal of Process Control》2008,18(1):61-70

This communication addresses the problem of tuning a PID controller for step response. The tuning is based upon a First Order Plus Time Delay (FOPTD) model and aims to achieve a step response specification while taking into account robustness considerations. The industrial ISA-PID formulation is chosen. A tuning rule is derived first where the four parameters of the ISA-PID are determined by means of two new parameters: one parameter is related to the desired closed-loop time constant and the other one to the robustness level. In a second step, these two parameters are set to a fixed value in order to get a simple and automatic rule that directly gives the controller parameters in terms of the process model parameters. The proposed automatic tuning rule is compared with other known tunings.
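As a rough illustration of how a tuning rule maps FOPTD model parameters to controller parameters, the sketch below implements the classic IMC-derived rule for an ideal (non-ISA) PID, obtained with a first-order Padé approximation of the delay. It is a swapped-in textbook rule, not the ISA-PID rule derived in this paper, and the function name `imc_pid_foptd` and its argument names are assumptions made for the example. The model is K·exp(−L·s)/(T·s + 1) and `lam` is the desired closed-loop time constant.

```python
def imc_pid_foptd(K, T, L, lam):
    """Classic IMC-based ideal PID settings for a FOPTD model.

    K: process gain, T: time constant, L: time delay,
    lam: desired closed-loop time constant (the single tuning knob).
    Returns (Kc, Ti, Td) for C(s) = Kc * (1 + 1/(Ti*s) + Td*s).
    """
    if K == 0 or T <= 0 or L < 0 or lam <= 0:
        raise ValueError("expected K != 0, T > 0, L >= 0, lam > 0")
    Kc = (T + L / 2.0) / (K * (lam + L / 2.0))  # proportional gain
    Ti = T + L / 2.0                            # integral time
    Td = T * L / (2.0 * T + L)                  # derivative time
    return Kc, Ti, Td


if __name__ == "__main__":
    print(imc_pid_foptd(K=1.0, T=10.0, L=2.0, lam=3.0))
```

A larger `lam` gives a slower but more robust loop, which is the same trade-off between closed-loop speed and robustness that the abstract describes.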

9.

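An application server optimization method based on parameter modules and attribute reduction (total citations: 1; self-citations: 0; citations by others: 1)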

刘岩 王正方 朱云龙 董晓梅 申德荣 《小型微型计算机系统》2010,31(3)

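In practice, optimization methods and strategies usually rely on tuning staff analyzing the practical meaning of each parameter from the official technical documentation supplied by the server vendor. The optimization process is slow, lacks a systematic and regular approach, and makes it hard to quickly identify the key parameters that need adjusting. For commonly used application servers, this paper analyzes the causes of performance degradation, proposes modularizing the tuning parameters, and applies an attribute reduction algorithm to the parameter modules. This quantitatively identifies, from practice, the main parameters that affect system performance so that tuning can focus on them and improve system performance quickly. The result is a new server optimization method.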

10.

Application of the competitive algorithm to PID tuning

单亚锋 唐毅 荆晓亮 《计算机系统应用》2011,20(12):193-196,188

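To address the shortcomings of traditional PID tuning, the Colonial Competitive Algorithm (CCA), a socially inspired algorithm, is introduced. Using the quadratic performance index of the controlled plant as the measure of tuning quality, a new PID tuning method is proposed; the algorithm converges quickly to the optimal solution in the search space. Simulation results show that the tuning algorithm saves memory, has a short optimization time, and does not require...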

11.

Managing complex data and geometry in parallel structured AMR applications (total citations: 2; self-citations: 0; citations by others: 2)

Richard D. Hornung Andrew M. Wissink Scott R. Kohn 《Engineering with Computers》2006,22(3-4):181-195

Adaptive mesh refinement (AMR) is an increasingly important simulation methodology for many science and engineering problems. AMR has the potential to generate highly resolved simulations efficiently by dynamically refining the computational mesh near key numerical solution features. AMR requires more complex numerical algorithms and programming than uniform fixed mesh approaches. Software libraries that provide general AMR functionality can ease these burdens significantly. A major challenge for library developers is to achieve adequate flexibility to meet diverse and evolving application requirements. In this paper, we describe the design of software abstractions for general AMR data management and parallel communication operations in SAMRAI, an object-oriented C++ structured AMR (SAMR) library developed at Lawrence Livermore National Laboratory (LLNL). The SAMRAI infrastructure provides the foundation for a variety of diverse application codes at LLNL and elsewhere. We illustrate SAMRAI functionality by describing how its unique features are used in these codes which employ complex data structures and geometry. We highlight capabilities for moving and deforming meshes, coupling multiple SAMR mesh hierarchies, and immersed and embedded boundary methods for modeling complex geometrical features. We also describe how irregular data structures, such as particles and internal mesh boundaries, may be implemented using SAMRAI tools without excessive application programmer effort. This work was performed under the auspices of the US Department of Energy by University of California Lawrence Livermore National Laboratory under contract number W-7405-Eng-48 and is released under UCRL-JRNL-214559.

12.

Dynamic workload balancing of parallel applications with user-level scheduling on the Grid (total citations: 1; self-citations: 0; citations by others: 1)

Vladimir V. Jakub T. Valeria V. 《Future Generation Computer Systems》2009,25(1):28-34

This paper suggests a hybrid resource management approach for efficient parallel distributed computing on the Grid. It operates on both application and system levels, combining user-level job scheduling with a dynamic workload balancing algorithm that automatically adapts a parallel application to the heterogeneous resources, based on the actual resource parameters and estimated requirements of the application. The hybrid environment and the algorithm for automated load balancing are described, the influence of the resource heterogeneity level is measured, and the speedup achieved with this technique is demonstrated for different types of applications and resources.

13.

Managing complexity in massively parallel, adaptive, multiphysics applications

H. Carter Edwards 《Engineering with Computers》2006,22(3-4):135-155

A new generation of scientific and engineering applications is being developed to support multiple coupled physics, adaptive meshes, and scaling in massively parallel environments. The capabilities required to support multiphysics, adaptivity, and massively parallel execution are individually complex and are especially challenging to integrate within a single application. Sandia National Laboratories has managed this challenge by consolidating these complex physics-independent capabilities into the Sierra Framework, which is shared among a diverse set of application codes. The success of the Sierra Framework has been predicated on managing the integration of complex capabilities through a conceptual model based upon formal mathematical abstractions. Set theory is used to express and analyze the data structures, operations, and interactions of these complex capabilities. This mathematically based, conceptual modeling approach to managing complexity is not specific to the Sierra Framework; it is generally applicable to any scientific and engineering application framework.

14.

Constructing the Voronoi diagram of a set of line segments in parallel (total citations: 1; self-citations: 1; citations by others: 0)

Michael T. Goodrich Colm Ó'Dúnlaing Chee K. Yap 《Algorithmica》1993,9(2):128-141

In this paper we give a parallel algorithm for constructing the Voronoi diagram of a polygonal scene, i.e., a set of line segments in the plane such that no two segments intersect except possibly at their endpoints. Our algorithm runs in O(log² n) time using O(n) processors in the CREW PRAM model. The research of M. T. Goodrich was supported by NSF under Grants CCR-8810568 and CCR-9003299 and by NSF/DARPA under Grant CCR-8908092. C. K. Yap's research was supported in part by NSF Grants DCR-8401898 and CCR-9002819.

15.

Design and kinetostatic analysis of a new parallel manipulator

Dan Zhang Zhuming Bi Beizhi Li 《Robotics and Computer》2009,25(4-5):782-791

This paper proposes an innovative design for a parallel manipulator that can be applied to a machine tool. The proposed parallel manipulator has three degrees of freedom (DOFs), including the rotations of a moving platform about the x and y axes and a translation of this platform along the z-axis. A passive link is introduced into this new parallel manipulator in order to increase the stiffness of the system and eliminate any unexpected motion. Both direct and inverse kinematic problems are investigated, and a dynamic model using a Newton–Euler approach is implemented. The global system stiffness of the proposed parallel manipulator, which considers the compliance of links and joints, is formulated and the kinetostatic analysis is conducted. Finally, a case study is presented to demonstrate the applications of the kinematic and dynamic models and to verify the concept of the new design.

16.

Output-sensitive algorithms for optimally constructing the upper envelope of straight line segments in parallel

N. Gupta S. Chopra 《Journal of Parallel and Distributed Computing》2007

17.

A parallel two-level method for simulating blood flows in branching arteries with the resistive boundary condition

Yuqi Wu Xiao-Chuan Cai 《Computers & Fluids》2011,45(1):92-102

Computer modeling of blood flows in the arteries is an important and very challenging problem. In order to understand, computationally, the sophisticated hemodynamics in the arteries, it is essential to couple the fluid flow and the elastic wall structure effectively and specify physiologically realistic boundary conditions. The computation is expensive and the parallel scalability of the solution algorithm is a key issue of the simulation. In this paper, we introduce and study a parallel two-level Newton–Krylov–Schwarz method for simulating blood flows in compliant branching arteries by using a fully coupled system of linear elasticity equation and incompressible Navier–Stokes equations with the resistive boundary condition. We first focus on the accuracy of the resistive boundary condition by comparing it with the standard pressure type boundary condition. We then show the parallel scalability results of the two-level approach obtained on a supercomputer with a large number of processors and on problems with millions of unknowns.

18.

Restart strategies in optimization: parallel and serial cases

Oleg V. Shylo Timothy Middelkoop Panos M. Pardalos 《Parallel Computing》2011,37(1):60-68

This paper addresses the problem of minimizing the average running time of Las Vegas type algorithms, in both serial and parallel setups. The necessary conditions for the existence of an effective restart strategy are presented. We clarify the counter-intuitive empirical observations of superlinear speedup and relate parallel speedup to the restart properties of serial algorithms. A general property of restart distributions is derived. Computational experiments involving a state-of-the-art optimization algorithm are provided.
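A small worked example helps show why restarting can shorten the average running time of a Las Vegas algorithm. The sketch below assumes a fixed-cutoff strategy (stop and rerun independently whenever a run exceeds `cutoff`) and a known discrete running-time distribution; both are simplifying assumptions for illustration, and the function name `expected_time_with_restart` is hypothetical. Under that strategy the expected total time is E[min(T, cutoff)] / P(T ≤ cutoff).

```python
def expected_time_with_restart(pmf, cutoff):
    """Expected total running time under a fixed-cutoff restart strategy.

    pmf: dict mapping a run's duration to its probability.
    cutoff: each run is abandoned and restarted after this much time.
    """
    p_success = sum(p for t, p in pmf.items() if t <= cutoff)
    if p_success == 0:
        raise ValueError("no run can finish within the cutoff")
    # Time spent per attempt, whether it finishes or is cut off.
    time_per_attempt = sum(min(t, cutoff) * p for t, p in pmf.items())
    # The number of attempts is geometric with mean 1 / p_success.
    return time_per_attempt / p_success


if __name__ == "__main__":
    # Hypothetical algorithm: half the runs take 1 unit, half take 100.
    pmf = {1: 0.5, 100: 0.5}
    print(expected_time_with_restart(pmf, cutoff=100))  # no effective restart
    print(expected_time_with_restart(pmf, cutoff=1))    # restart after 1 unit
```

For this heavy-tailed example the cutoff of 1 reduces the expected time from 50.5 to 2 time units, which is the kind of effect that makes restart strategies worthwhile.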

19.

Constrained multi-objective trajectory planning of parallel kinematic machines

Amar Luc Marek 《Robotics and Computer》2009,25(4-5):756-769

This paper presents a new approach to multi-objective dynamic trajectory planning of parallel kinematic machines (PKM) under task, workspace and manipulator constraints. The robot kinematic and dynamic model (including actuators) is first developed. Then the proposed trajectory planning system is introduced. It minimizes electrical and kinetic energy, robot traveling time separating two sampling periods, and maximizes a measure of manipulability allowing singularity avoidance. Several technological constraints, such as actuator, link length and workspace limitations, and some task requirements, such as passing through imposed poses, are simultaneously satisfied. The discrete augmented Lagrangean technique is used to solve the resulting strongly nonlinear constrained optimal control problem. A decoupled formulation is proposed in order to cope with some difficulties arising from the computation of dynamic parameters. A systematic implementation procedure is provided along with some numerical issues. Simulation results proving the effectiveness of the proposed approach are given and discussed.

20.

A case study of different task implementations for multioutput stages in non-trivial parallel pipeline applications

Angeles Navarro Rafael Asenjo Francisco Corbera Antonio J. Dios Emilio L. Zapata 《Parallel Computing》2014

Task-based libraries, such as Intel’s Threading Building Blocks (TBB), are promising tools that help programmers to develop parallel code in a productive way, thanks to high-level constructs which simplify the chore of efficiently exploiting system resources. In this paper we focus on one type of task parallelism, pipeline parallelism, which is becoming an increasingly popular parallel programming pattern for streaming applications in the domain of digital signal processing, graphics, compression and encryption. Specifically, TBB provides a high-level template to express pipeline parallelism, but it is limited to representing simple pipeline structures. We address the issue of non-trivial parallel pipeline structures in which one or more stages in the pipeline have more items leaving than arriving, a problem for which the current TBB pipeline template does not provide support. In this work, we describe a new Multioutput filter that we have incorporated into the TBB pipeline framework to deal with these multioutput stages. Using real world streaming applications from different computational domains (dedup and scenerecog), we also compare the performance of our implementation using the Multioutput filter in the TBB pipeline template to other more complex TBB task-based implementations that only use the standard filters. We also develop new analytical models for each implementation to better understand the resource utilization in each case. Performance evaluation and analysis show that the implementation based on the Multioutput filter outperforms the other solutions because it promotes finer task parallelism, which is better suited to the TBB task-stealing mechanism and so better exploits the resources, and because it reduces the overheads related to memory and task management.