Similar Literature
20 similar documents found (search time: 687 ms)
1.
In this paper, we introduce MRMOGA (Multiple Resolution Multi-Objective Genetic Algorithm), a new parallel multi-objective evolutionary algorithm based on an injection island approach. The approach is characterized by an encoding of solutions that uses a different resolution for each island. This allows the decision variable space to be divided into well-defined overlapping regions so that multiple processors are used efficiently, and it guarantees that each processor only generates solutions within its assigned region. To assess the performance of our proposed approach, we compare it against a parallel version of an algorithm representative of the state of the art in the area, using standard test functions and performance measures reported in the specialized literature. Our results indicate that the proposed approach is a viable alternative for solving multi-objective optimization problems in parallel, particularly when dealing with large search spaces. Copyright © 2006 John Wiley & Sons, Ltd.
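To make the resolution idea concrete, the toy sketch below (in Python; the bit lengths, bounds and bit patterns are illustrative assumptions, not values from MRMOGA) decodes the same style of binary chromosome at two different per-island resolutions, so a coarse island samples the variable range sparsely while a fine island can refine within it.

```python
# Toy illustration of the per-island resolution idea: each island decodes the
# same kind of binary chromosome with a different number of bits per decision
# variable, so coarse islands explore widely while fine islands refine.
# Bit lengths, bounds and bit patterns are illustrative, not taken from MRMOGA.
def decode(bits, lower, upper):
    """Map a bit string to a real value in [lower, upper] at its own resolution."""
    value = int("".join(map(str, bits)), 2)
    return lower + (upper - lower) * value / (2 ** len(bits) - 1)

coarse = [1, 0, 1, 1]                 # 4-bit island: step ~ (upper - lower) / 15
fine   = [1, 0, 1, 1, 0, 1, 0, 0]     # 8-bit island: step ~ (upper - lower) / 255
print(decode(coarse, -5.0, 5.0), decode(fine, -5.0, 5.0))
```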

2.
Branch-and-bound (B&B) algorithms are attractive methods for solving combinatorial optimization problems to optimality by implicitly enumerating a dynamically built tree-based search space. Nevertheless, they are time-consuming when dealing with large problem instances. Therefore, pruning tree nodes (subproblems) is traditionally used as a powerful mechanism to reduce the size of the explored search space. Pruning requires performing the bounding operation, which consists of applying a lower bound function to the subproblems generated during the exploration process. Preliminary experiments performed on the Flow-Shop scheduling problem (FSP) have shown that the bounding operation consumes over 98% of the execution time of the B&B algorithm. In this paper, we investigate the use of graphics processing unit (GPU) computing as a major complementary way to speed up the search. We revisit the design and implementation of the parallel bounding model on GPU accelerators. The proposed approach enables data access optimization. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared with a CPU-based single-core execution on an Intel Core i7-970 processor without a GPU, speedups of more than 100x are achieved for large problem instances. At an equivalent peak performance, GPU-accelerated B&B is twice as fast as its multi-core counterpart. Copyright © 2013 John Wiley & Sons, Ltd.
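The structural idea of batching the bounding operation can be illustrated independently of the flow-shop problem and of GPUs. The sketch below is a minimal Python B&B for a toy 0/1 knapsack in which the bounding of all freshly generated subproblems is performed as one batched step, the analogue of the operation the paper offloads to the GPU; the toy problem is a maximization, so the bound is an upper bound rather than the flow-shop lower bound, and all data are illustrative.

```python
# Toy sketch of the batched-bounding structure of a branch-and-bound search:
# all subproblems generated in one round are bounded together in a single step,
# which is the step the paper offloads to the GPU. A tiny 0/1 knapsack stands
# in for the flow-shop problem; all data are illustrative.
import numpy as np

values   = np.array([60.0, 100.0, 120.0, 80.0])
weights  = np.array([10.0, 20.0, 30.0, 15.0])
capacity = 50.0
order = np.argsort(-(values / weights))        # best value density first
values, weights = values[order], weights[order]
n = len(values)

def fractional_bound(level, profit, load):
    """Greedy fractional bound on the best profit reachable from a node."""
    bound, room = profit, capacity - load
    for i in range(level, n):
        take = min(weights[i], room)
        bound += values[i] * take / weights[i]
        room -= take
        if room <= 0:
            break
    return bound

best = 0.0
frontier = [(0, 0.0, 0.0)]                     # node = (next item, profit, load)
while frontier:
    children = []
    for level, profit, load in frontier:
        if level == n:
            continue
        if load + weights[level] <= capacity:  # branch 1: take the item
            children.append((level + 1, profit + values[level], load + weights[level]))
        children.append((level + 1, profit, load))   # branch 2: skip the item
    # bounding operation applied to the whole batch of new subproblems at once
    bounds = [fractional_bound(*child) for child in children]
    best = max([best] + [child[1] for child in children])
    frontier = [c for c, b in zip(children, bounds) if b > best]   # pruning
print("optimal profit:", best)                 # -> 240.0 for this instance
```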

3.
The lattice Boltzmann method (LBM) is a widely used computational fluid dynamics method for flow problems with complex geometries and various boundary conditions. Large-scale LBM simulations with increasing resolution and extended temporal range require massive high-performance computing (HPC) resources, which motivates us to port the method onto modern many-core heterogeneous supercomputers such as Tianhe-2. Although many-core accelerators such as graphics processing units and the Intel MIC offer a dramatic advantage in floating-point performance and power efficiency over CPUs, they also pose a tough challenge for parallelizing and optimizing computational fluid dynamics codes on large-scale heterogeneous systems. In this paper, we parallelize and optimize the open-source 3D multi-phase LBM code openlbmflow on the Intel Xeon Phi (MIC) accelerated Tianhe-2 supercomputer using a hybrid and heterogeneous MPI+OpenMP+Offload+single instruction, multiple data (SIMD) programming model. With cache blocking and a SIMD-friendly data structure transformation, we dramatically improve the SIMD and cache efficiency of the single-thread performance on both the CPU and the Phi, achieving speedups of 7.9X and 8.8X, respectively, compared with the baseline code. To make the CPUs and Phi processors collaborate efficiently, we propose a load-balance scheme that distributes workloads among the two CPUs and three Phi processors within a node, and we use an asynchronous model to overlap the collaborative computation and communication as much as possible. The collaborative approach with two CPUs and three Phi processors improves performance by around 3.2X compared with the CPU-only approach. Scalability tests show that openlbmflow can achieve a parallel efficiency of about 60% on 2048 nodes, with about 400K cores in total. To the best of our knowledge, this is the largest-scale CPU-MIC collaborative LBM simulation for 3D multi-phase flow problems. Copyright © 2015 John Wiley & Sons, Ltd.

4.
This paper describes a speaker verification system based on multi-resolution classifiers, designed to cope with performance degradation due to natural variations of the excitation source and of the vocal tract. The different resolution representations of the speaker are obtained by considering multiple frame lengths in the feature extraction process, and from these representations a single Pseudo-Multi Parallel Branch (P-MPB) Hidden Markov Model is obtained. In the verification process, the different resolution representations of the speech signal are classified by multiple P-MPB systems, and the final decision is obtained by means of different combination techniques. The system based on the Weighted Majority Vote technique considerably outperforms the baseline systems, with improvements between 15% and 38%. The execution time of the verification process is also evaluated and proves acceptable, allowing the approach to be used in real-time applications.
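As a minimal sketch of the final combination step, the snippet below implements a generic weighted majority vote over accept/reject decisions; the number of branches, the votes and the reliability weights are illustrative assumptions and are not taken from the paper's P-MPB system.

```python
# Minimal sketch (not the paper's system) of a weighted-majority-vote rule:
# several classifiers, e.g. branches trained on different frame lengths, cast
# accept/reject votes that are weighted by an estimate of their reliability.
import numpy as np

def weighted_majority_vote(decisions, weights):
    """decisions: array of +1 (accept) / -1 (reject) votes, one per branch.
    weights: per-branch reliability weights (assumed, e.g. validation accuracy)."""
    score = np.dot(weights, decisions)
    return 1 if score >= 0 else -1

# hypothetical votes from three multi-resolution branches and assumed weights
votes   = np.array([+1, -1, +1])
weights = np.array([0.45, 0.25, 0.30])
print(weighted_majority_vote(votes, weights))   # -> 1 (the claim is accepted)
```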

5.
A distributed system consists of a collection of autonomous heterogeneous resources that provide resource sharing and a common platform for running parallel compute-intensive applications. The different application characteristics, combined with the heterogeneity and performance variations of the distributed system, make it difficult to find the optimal set of needed resources. When deployed, user applications are usually handled by application domain experts or system administrators who, depending on the infrastructure, provide a scheduling strategy for selecting the best candidate resource from a set of available resources. However, the provided strategy is usually generic, aimed at handling a wide array of applications, and does not take specific application resource requirements into consideration. As such, an intelligent method for selecting the best resources based on expert knowledge is needed. In this paper, we propose a neural network-based multi-agent resource selection technique capable of mimicking the services of an expert user. In addition, to cope with the geographical distribution of the underlying system, we employ a multi-agent coordination mechanism. The proposed neural network-based scheduling framework combined with multi-agent intelligence is a unique approach to dealing efficiently with the resource selection problem. Results obtained in a simulated environment show the efficiency of our proposed method. Several scheduling simulations were conducted to compare the performance of some conventional resource selection methods against the proposed agent-based neural network technique. The results obtained indicate that the agent-based approach outperformed the classical algorithms by reducing the amount of time required to search for suitable resources, irrespective of the resource pool size. Copyright © 2016 John Wiley & Sons, Ltd.

6.
Probabilistic interval-valued hesitant fuzzy sets (PIV-HFSs) are suitable for aggregating information from different groups because the probabilistic information of all the groups can be captured using interval values. Moreover, decision makers (DMs) prefer to use interval values to provide evaluation information. Furthermore, the traditional multi-criteria group decision-making (MCGDM) approach has some limitations, such as obtaining the DMs' weights with inappropriate methods and neglecting the interactions among the criteria and the psychological characteristics of the DMs. Motivated by this research background, the main contributions of this study are as follows. First, PIV-HFSs are proposed, and the convex combination operation is extended to PIV-HFSs. Second, a hybrid MCGDM approach with PIV-HFSs is suggested, based on the maximizing deviation method, the fuzzy analytic network process (FANP) and TODIM (an acronym in Portuguese for interactive and multi-criteria decision-making model). Third, an evaluation case of health management centres based on service-specific failure mode and effect analysis (FMEA) is considered. The results show that the most crucial secondary factor is frequency (0.35775) and that the most serious failure mode is inaccurate check-in. The results demonstrate that the proposed model can evaluate service quality effectively and that it performs better than other methods.

7.
Symbolic computation has underpinned a number of key advances in Mathematics and Computer Science. Applications are typically large and potentially highly parallel, making them good candidates for parallel execution at a variety of scales from multi-core to high-performance computing systems. However, much existing work on parallel computing is based around numeric rather than symbolic computations. In particular, symbolic computing presents particular problems in terms of varying granularity and irregular task sizes that do not match conventional approaches to parallelisation. It also presents problems in terms of the structure of the algorithms and data. This paper describes a new implementation of the free open-source GAP computational algebra system that places parallelism at the heart of the design, dealing with the key scalability and cross-platform portability problems. We provide three system layers that deal with the three most important classes of hardware: individual shared-memory multi-core nodes, mid-scale distributed clusters of (multi-core) nodes and full-blown high-performance computing systems comprising large-scale, tightly connected networks of multi-core nodes. This requires us to develop new cross-layer programming abstractions in the form of new domain-specific skeletons that allow us to seamlessly target different hardware levels. Our results show that, using our approach, we can achieve good scalability and speedups for two realistic exemplars, on high-performance systems comprising up to 32000 cores, as well as on ubiquitous multi-core systems and distributed clusters. The work reported here paves the way towards full-scale exploitation of symbolic computation by high-performance computing systems, and we demonstrate the potential with two major case studies. © 2016 The Authors. Concurrency and Computation: Practice and Experience Published by John Wiley & Sons Ltd.

8.
In this paper, we present a primal-dual interior-point algorithm to solve a class of multi-objective network flow problems. More precisely, our algorithm is an extension of the single-objective primal-infeasible dual-feasible inexact interior point method to multi-objective linear network flow problems. Our algorithm is contrasted with standard interior point methods, and experimental results on bi-objective instances are reported. The multi-objective instances are converted into single-objective problems with the aid of an achievement function, which is particularly well suited to interactive decision-making methods.
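A common form of achievement (scalarizing) function, of the kind used to convert a multi-objective instance into a single-objective one, is the Wierzbicki-style function sketched below; the reference point, weights and objective values are illustrative, and this is not necessarily the exact function used by the authors.

```python
# Sketch of a Wierzbicki-style achievement scalarizing function for turning a
# bi-objective problem into a single-objective one. Weights, reference point
# and rho are illustrative, not the authors' exact choices.
import numpy as np

def achievement(f, z_ref, w, rho=1e-4):
    """f: objective vector of a candidate solution (minimization),
    z_ref: reference (aspiration) point supplied by the decision maker,
    w: positive scaling weights."""
    diff = np.asarray(w) * (np.asarray(f) - np.asarray(z_ref))
    return diff.max() + rho * diff.sum()

# two candidate flows scored on (cost, delay); a smaller value is preferred
print(achievement([120.0, 8.0], z_ref=[100.0, 5.0], w=[1.0, 10.0]))
print(achievement([150.0, 4.0], z_ref=[100.0, 5.0], w=[1.0, 10.0]))
```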

9.
The Block Conjugate Gradient algorithm (Block-CG) was developed to solve sparse linear systems of equations that have multiple right-hand sides. We have adapted it for use in heterogeneous, geographically distributed, parallel architectures. Once the main operations of the Block-CG (tasks) have been collected into smaller groups (subjobs), each subjob is matched by the middleware MJMS (MPI Jobs Management System) with a suitable resource selected among those available. Moreover, within each subjob, concurrency is introduced at two different levels and with two different granularities: coarse-grained parallelism to perform independent tasks and fine-grained parallelism within the execution of a task. We refer to this algorithm as the multi-grained distributed implementation of the parallel Block-CG. We compare the performance of a parallel implementation with that of the distributed implementation running on a variety of Grid computing environments. The middleware MJMS, developed by some of the authors and built on top of Globus Toolkit and Condor-G, was used for co-allocation, synchronization, scheduling and resource selection. Copyright © 2010 John Wiley & Sons, Ltd.
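For reference, the snippet below is a plain, sequential, textbook block conjugate gradient iteration for A X = B with several right-hand sides; it illustrates the kernel being distributed but none of the MJMS-based task grouping, co-allocation or Grid scheduling described in the abstract.

```python
# Textbook block conjugate gradient for A X = B with several right-hand sides,
# as a sequential reference sketch (not the authors' distributed MJMS version).
import numpy as np

def block_cg(A, B, tol=1e-10, max_iter=200):
    X = np.zeros_like(B)
    R = B - A @ X
    P = R.copy()
    for _ in range(max_iter):
        AP = A @ P
        alpha = np.linalg.solve(P.T @ AP, R.T @ R)   # small s-by-s systems
        X = X + P @ alpha
        R_new = R - AP @ alpha
        if np.linalg.norm(R_new) < tol:
            break
        beta = np.linalg.solve(R.T @ R, R_new.T @ R_new)
        P = R_new + P @ beta
        R = R_new
    return X

n, s = 100, 4                       # s right-hand sides solved together
M = np.random.rand(n, n)
A = M @ M.T + n * np.eye(n)         # symmetric positive definite test matrix
B = np.random.rand(n, s)
X = block_cg(A, B)
print(np.linalg.norm(A @ X - B))    # residual should be close to zero
```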

10.
In recent years, the application of metaheuristic techniques to solve multi-objective optimization problems has become an active research area. Solving this kind of problem involves obtaining a set of Pareto-optimal solutions such that the corresponding Pareto front fulfils the requirements of convergence to the true Pareto front and uniform diversity. Most studies on metaheuristics for multi-objective optimization focus on evolutionary algorithms, and some of the state-of-the-art techniques belong to this class of algorithms. Our goal in this paper is to study open research lines related to metaheuristics, focusing on less explored areas in order to provide new perspectives to those researchers interested in multi-objective optimization. In particular, we focus on non-evolutionary metaheuristics, hybrid multi-objective metaheuristics, parallel multi-objective optimization and multi-objective optimization under uncertainty. We analyze these issues and discuss open research lines.

11.
A Parallel Gauss-Seidel Algorithm with Interleaved Iteration-Space Tiles
胡长军, 张纪林, 王珏, 李建江. Journal of Software (《软件学报》), 2008, 19(6): 1274-1282
To address the poor data locality and the high synchronization and communication overhead of parallel GS (Gauss-Seidel) iterative algorithms, this paper first improves the traditional GS iteration and proposes a multi-layer symmetric GS iterative algorithm. It then presents a serial execution model that uses the order of iteration-space tiles as the execution order. By applying a "time-skewed" partitioning of the iteration space and performing multiple iterations within each tile, this model improves the data locality of the algorithm. Finally, a parallel execution model based on iteration-space tiles is proposed. This model improves the partitioning of the iteration-space grid and, by reordering the grid tiles, reduces the cache miss rate and the number of communication startups and synchronizations. Experimental results show that the interleaved iteration-space tile parallel algorithm achieves better parallel efficiency and scalability than the traditional domain decomposition method and the red-black ordering parallel algorithm.
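For orientation, the snippet below is the plain sequential Gauss-Seidel sweep that such schemes start from; the iteration-space tiling, time-skewing and tile reordering that constitute the paper's contribution are not reproduced in this sketch.

```python
# Plain sequential Gauss-Seidel sweeps for A x = b, shown only as the baseline
# that iteration-space tiled/time-skewed variants reorganize for better cache
# locality and parallelism; this sketch is not the paper's algorithm.
import numpy as np

def gauss_seidel(A, b, sweeps=100):
    n = len(b)
    x = np.zeros(n)
    for _ in range(sweeps):
        for i in range(n):           # in-place updates immediately reuse fresh values
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
b = np.array([2.0, 4.0, 10.0])
print(gauss_seidel(A, b))            # converges: A is strictly diagonally dominant
```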

12.
The unequal area facility layout problem (UA-FLP) has been addressed by many methods, most of which take into account only aspects that can be quantified. This contribution presents a novel approach that considers both quantitative aspects and subjective features. To this end, a multi-objective interactive genetic algorithm is proposed with the aim of allowing interaction between the algorithm and the human expert designer, normally called the decision maker (DM) in the field of UA-FLP. The DM's knowledge contributed to the approach guides the complex search process, adjusting it to the DM's preferences. The entire population associated with facility layout designs is evaluated by quantitative criteria in combination with an assessment prepared by the DM, who gives a subjective evaluation for a set of representative individuals of the population in each iteration. In order to choose these individuals, a soft computing clustering method is used. Two interesting real-world data sets are analysed to empirically test the robustness of these models. The first UA-FLP case study describes an ovine slaughterhouse plant and the second a design-for-recycling carton plant. Relevant results are obtained, and interesting conclusions are drawn from the application of this novel intelligent framework.

13.
Because of environmental and monetary concerns, it is increasingly important to reduce energy consumption in all areas, including parallel and high performance computing. In this article, we propose an approach to reduce the energy consumed by the execution of a set of tasks computed in parallel in a fork-join fashion. The approach consists of an analytical model for the energy consumption of a parallel computation in fork-join form on dynamic voltage frequency scaling (DVFS) processors, a theoretical specification of an energy-optimal frequency-scaled state, and the minimization of energy by computing optimal scaling factors. For larger numbers of tasks, the approach is extended by scheduling algorithms that exploit the analytical result and aim at reducing the energy consumption. Energy measurements of a complex numerical method and the SPEC CPU2006 benchmarks, as well as simulations for a large number of randomly generated tasks, illustrate and validate the energy modeling, the minimization, and the scheduling results. Copyright © 2014 John Wiley & Sons, Ltd.
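The flavor of such an analytical model can be sketched with a common simplification: assume a task's dynamic power grows cubically with frequency and its static power is constant. For a frequency scaling factor s >= 1 (i.e. f = f_max / s), the energy of a task with unscaled run time t is then E(s) = t * (P_static * s + P_dyn * s^-2), which is minimized at s_opt = (2 * P_dyn / P_static)^(1/3). The sketch below, with made-up power values, only illustrates this style of model and is not the paper's exact formulation.

```python
# Minimal sketch of a DVFS energy model of the kind the abstract builds on.
# Assumptions: dynamic power scales cubically with frequency, static power is
# constant, and the task's work is fixed. With scaling factor s >= 1
# (f = f_max / s) and base run time t:
#   E(s) = t * (P_static * s + P_dyn * s**-2)
# which is minimized at s_opt = (2 * P_dyn / P_static) ** (1/3).
def energy(s, t, p_static, p_dyn):
    return t * (p_static * s + p_dyn * s ** -2)

def optimal_scaling(p_static, p_dyn):
    return (2.0 * p_dyn / p_static) ** (1.0 / 3.0)

# illustrative numbers (watts, seconds), not measurements from the paper
p_static, p_dyn, t = 20.0, 80.0, 10.0
s_opt = optimal_scaling(p_static, p_dyn)            # = 2.0 here
print(s_opt, energy(1.0, t, p_static, p_dyn), energy(s_opt, t, p_static, p_dyn))
# full speed costs 1000 J, the optimally scaled run only 600 J in this toy setting
```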

14.
This paper deals with leader-following consensus for nonlinear stochastic multi-agent systems. To save communication resources, a new centralized/distributed hybrid event-triggered mechanism (HETM) is proposed for nonlinear multi-agent systems. HETMs can be regarded as a synthesis of continuous event-triggered and time-driven mechanisms, which can effectively avoid Zeno behavior. To model the multi-agent systems under the centralized HETM, the switched system method is applied. By utilizing the properties of the communication topology, low-dimensional consensus conditions are obtained. For the distributed hybrid event-triggered mechanism, because of the asynchronous event-triggering instants, the time-varying system method is applied. Meanwhile, the effect of network-induced time delay on the consensus is also considered. To further reduce the computational resources consumed by constantly testing whether the broadcast condition has been violated, self-triggered implementations of the proposed event-triggered communication protocols are also derived. A numerical example is given to show the effectiveness of the proposed method.
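As a toy illustration of the general event-triggered idea (not the paper's hybrid centralized/distributed scheme, and without the nonlinear stochastic dynamics or delay analysis), the sketch below simulates single-integrator followers that rebroadcast their state only when the deviation from the last broadcast value exceeds a fixed threshold, and still settle near the leader.

```python
# Toy event-triggered leader-following simulation: single-integrator followers
# on a chain, a constant leader state, and a fixed broadcast threshold. This
# only illustrates the event-triggered idea; the paper's hybrid mechanism,
# nonlinear stochastic dynamics and network delays are not reproduced here.
import numpy as np

dt, steps, threshold = 0.01, 2000, 0.05
leader = 1.0                       # leader state (constant)
x = np.array([0.0, 0.5, 2.0])      # follower states
x_hat = x.copy()                   # last broadcast values

for _ in range(steps):
    # event check: an agent rebroadcasts only when its error grows too large
    events = np.abs(x - x_hat) > threshold
    x_hat[events] = x[events]
    u = np.empty_like(x)
    u[0] = -(x_hat[0] - leader)            # follower 0 hears the leader
    u[1:] = -(x_hat[1:] - x_hat[:-1])      # follower i hears follower i-1
    x = x + dt * u

print(x)   # all followers settle in a small neighborhood of the leader state 1.0
```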

15.
The field of computational biology encompasses a wide range of optimization problems with non-deterministic polynomial-time (NP)-hard complexity. Nowadays, phylogeneticians are dealing with a growing amount of biological data that must be analyzed to explain the origins of modern species. Evolutionary relationships among organisms are often described by means of tree-shaped structures known as phylogenetic trees. When inferring phylogenies, two main challenges must be addressed. The first is the inference of reliable evolutionary trees on data sets where different optimality principles support conflicting evolutionary hypotheses. The second is the processing of enormous tree search spaces where traditional sequential strategies cannot be applied. In this sense, phylogenetic inference can benefit from the combination of high performance computing and evolutionary computation to carry out the reconstruction of complex evolutionary histories in reduced execution times. In this paper, we introduce multiobjective phylogenetics, a hybrid OpenMP/MPI approach to parallelizing a well-known multiobjective metaheuristic, the fast non-dominated sorting genetic algorithm (NSGA-II). This algorithm has been designed to conduct phylogenetic analyses on multi-core clusters in accordance with two principles: maximum parsimony and maximum likelihood. The main goal is to combine the benefits of the shared-memory and distributed-memory programming paradigms to efficiently infer a set of high-quality Pareto solutions. Experiments on six real nucleotide data sets and comparisons with other hybrid parallel approaches show that multiobjective phylogenetics is able to achieve significant performance in terms of parallel, multiobjective, and biological results. Copyright © 2014 John Wiley & Sons, Ltd.
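The core relation behind the NSGA-II ranking that is being parallelized is Pareto dominance. The short sketch below extracts the first non-dominated front for a set of bi-objective scores (both minimized); the objective values are made up, and the sketch does not reproduce the authors' hybrid OpenMP/MPI implementation or the full fast non-dominated sorting procedure.

```python
# Minimal sketch of Pareto dominance and first-front extraction, the relation
# underlying NSGA-II's ranking (two objectives, both minimized, e.g. parsimony
# score and negative log-likelihood). Scores are made up; this is not the
# authors' hybrid OpenMP/MPI code.
import numpy as np

def dominates(a, b):
    """a dominates b (minimization): no worse everywhere, strictly better somewhere."""
    return bool(np.all(a <= b) and np.any(a < b))

def first_front(points):
    pts = [np.asarray(p, dtype=float) for p in points]
    return [p for i, p in enumerate(pts)
            if not any(dominates(q, p) for j, q in enumerate(pts) if j != i)]

scores = [(3.0, 9.0), (2.0, 7.0), (4.0, 4.0), (5.0, 6.0), (6.0, 3.0)]
print(first_front(scores))   # (2,7), (4,4) and (6,3) remain; (3,9) and (5,6) are dominated
```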

16.
We propose a new framework design for exploiting multi-core architectures in the context of visualization dataflow systems. Recent hardware advancements have greatly increased the levels of parallelism available, and all indications are that this trend will continue. Existing visualization dataflow systems have attempted to take advantage of these new resources, though they still have a number of limitations when deployed on shared-memory multi-core architectures. Ideally, visualization systems should be built on top of a parallel dataflow scheme that can optimally utilize CPUs and assign resources adaptively to pipeline elements. We propose the design of a flexible dataflow architecture aimed at addressing many of the shortcomings of existing systems, including a unified execution model for both demand-driven and event-driven models; a resource scheduler that can automatically make decisions on how to allocate computing resources; and support for more general streaming data structures that include unstructured elements. We have implemented our system on top of VTK with backward compatibility. In this paper, we provide evidence of performance improvements on a number of applications.

17.
The entire computer hardware industry has embraced multi-core processors. Extreme optimisation of sequential algorithms is therefore no longer sufficient to extract the full power of the machine, which can only be exploited via thread-level parallelism. Decision tree algorithms exhibit natural concurrency that makes them suitable for parallelisation. This paper presents an in-depth study of the parallelisation of an implementation of the C4.5 algorithm for multi-core architectures. We characterise lower bounds on the elapsed time for the forms of parallelisation adopted and achieve close to optimal performance. Our implementation is based on the FastFlow parallel programming environment, and it requires minimal changes to the original sequential code. Copyright © 2013 John Wiley & Sons, Ltd.
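One natural source of the concurrency mentioned above is attribute-level parallelism: at each tree node, the information gain of every candidate attribute can be computed independently. The Python sketch below scores two toy attributes in parallel with a process pool; it only illustrates that decomposition and is not the authors' FastFlow-based C4.5 implementation.

```python
# Sketch of the attribute-level parallelism a decision-tree learner exposes:
# the information gain of each candidate attribute at a node is computed
# independently, here by a process pool. Toy data; not the authors'
# FastFlow-based C4.5 implementation.
from concurrent.futures import ProcessPoolExecutor
from collections import Counter
import math

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(args):
    column, labels = args
    base, n = entropy(labels), len(labels)
    split = 0.0
    for v in set(column):                          # weighted entropy after the split
        subset = [lab for val, lab in zip(column, labels) if val == v]
        split += len(subset) / n * entropy(subset)
    return base - split

if __name__ == "__main__":
    labels = ["yes", "yes", "no", "no", "yes", "no"]
    attributes = {
        "outlook": ["sun", "sun", "rain", "rain", "sun", "rain"],
        "windy":   ["n",   "y",   "n",    "y",    "y",   "n"],
    }
    with ProcessPoolExecutor() as pool:
        gains = dict(zip(attributes,
                         pool.map(info_gain, [(col, labels) for col in attributes.values()])))
    print(gains)   # the attribute with the largest gain ('outlook' here) is chosen for the split
```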

18.
In this paper, we propose a new hybrid control algorithm to achieve leader-follower flocking in multi-agent systems. In the algorithm, the position is transmitted continuously, whereas the velocity is used discretely, governed by a distributed event-triggered mechanism, and an agent does not need its neighbors' velocities to check its event-triggered condition. It is shown that stable flocking is achieved asymptotically while the connectivity of the network is preserved. A numerical example is provided to illustrate the theoretical results. Copyright © 2015 John Wiley & Sons, Ltd.

19.
In this paper, the controllability of multi-agent systems is studied under a leader-follower framework, where the interconnection between agents is directed. The concept of leader-follower connectedness is first introduced for directed graphs to check controllability, and some graph-theoretic conditions are derived for constructively designed topologies. Then, distance partitions and almost equitable partitions are employed to quantitatively study the controllable subspaces. Moreover, the relationship between the state invariance of multi-agent systems and the existence of almost equitable partitions is discussed. Copyright © 2017 John Wiley & Sons, Ltd.
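For a concrete handle on what such controllability conditions certify, recall the standard algebraic test used in this literature: with follower dynamics x_f' = -L_ff x_f - L_fl x_l (the leaders acting as inputs), the system is controllable exactly when the Kalman controllability matrix of (-L_ff, -L_fl) has full rank. The sketch below checks this for a small path graph with one leader; the example is illustrative and undirected, unlike the directed topologies studied in the paper.

```python
# Sketch of the standard algebraic check behind leader-follower controllability:
# followers evolve as x_f' = -L_ff x_f - L_fl x_l, so the pair
# (A, B) = (-L_ff, -L_fl) is tested with the Kalman rank condition.
# The 3-node path graph with one leader below is an illustrative example only.
import numpy as np

def controllable(A, B):
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

# path graph 1-2-3 with node 1 as the leader; Laplacian blocks for followers 2, 3
L_ff = np.array([[ 2.0, -1.0],
                 [-1.0,  1.0]])
L_fl = np.array([[-1.0],
                 [ 0.0]])
print(controllable(-L_ff, -L_fl))   # True: a path driven from one end is controllable
```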

20.
A simple and robust algorithm for computationally efficient design optimization of microwave filters is presented. Our approach exploits a trust-region (TR)-based algorithm that utilizes a linear approximation of the filter response obtained using adjoint sensitivities. The algorithm is executed sequentially on a family of electromagnetic (EM)-simulated models of different fidelities, starting from a coarse-discretization one and ending at the original, high-fidelity filter model to be optimized. Switching between the models is determined by suitably defined convergence criteria. This arrangement allows for a substantial cost reduction in the initial stages of the optimization process without compromising the accuracy and resolution of the final design. The performance of our technique is illustrated through the design of a fifth-order waveguide filter and a coupled-iris waveguide filter. We also demonstrate that the multi-fidelity approach allows for considerable computational savings compared to TR-based optimization of the high-fidelity EM model (also utilizing adjoint sensitivities). © 2014 Wiley Periodicals, Inc. Int J RF and Microwave CAE 25:178-183, 2015.
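A bare-bones version of a trust-region iteration with a first-order (linear) surrogate is sketched below; the expensive EM simulations and adjoint sensitivities are replaced by a cheap analytic test function and its exact gradient, and the multi-fidelity model switching is omitted, so this is only the skeletal TR logic rather than the authors' algorithm.

```python
# Bare-bones trust-region loop with a linear (first-order) surrogate model.
# The expensive high-fidelity EM model and its adjoint sensitivities are
# replaced by an analytic test function and its exact gradient; the
# multi-fidelity switching of the paper is not reproduced here.
import numpy as np

def f(x):                       # stand-in for the (expensive) high-fidelity model
    return (x[0] - 1.0) ** 2 + 3.0 * (x[1] + 2.0) ** 2

def grad(x):                    # stand-in for the adjoint-based sensitivities
    return np.array([2.0 * (x[0] - 1.0), 6.0 * (x[1] + 2.0)])

x, radius = np.array([4.0, 4.0]), 1.0
for _ in range(50):
    g = grad(x)
    if np.linalg.norm(g) < 1e-8:
        break
    step = -radius * g / np.linalg.norm(g)     # minimizer of the linear model in the ball
    rho = (f(x) - f(x + step)) / (-g @ step)   # actual vs predicted reduction
    if rho > 0.1:                              # accept the step, possibly enlarge the region
        x, radius = x + step, (2.0 * radius if rho > 0.75 else radius)
    else:                                      # reject the step and shrink the region
        radius *= 0.5
print(x)                                       # approaches the optimum (1, -2)
```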
