共查询到20条相似文献,搜索用时 0 毫秒
1.
2.
通过分析原有遗传算法解决剖分问题时,存在早熟现象的本质原因,对选择算子、交叉算子、变异算子提出了新的实现方法.为进一步提高算法的性能,将退火算法有机融合到遗传算法中,并采用多种群不同策略协同搜索机制,有效地避免过早收敛,对于参数采用构造模糊控制器自适应控制,加快了搜索速度、提高了搜索能力.仿真试验结果表明,该算法能够精确收敛到最优解或次优解. 相似文献
3.
Parallel simulated annealing using speculative computation 总被引:1,自引:0,他引:1
Witte E.E. Chamberlain R.D. Franklin M.A. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):483-494
A parallel simulated annealing algorithm that is problem-independent, maintains the serial decision sequence, and obtains speedup which can exceed log2P on P processors is discussed. The algorithm achieves parallelism by using the concurrency technique of speculative computation. Implementation of the parallel algorithm on a hypercube multiprocessor and application to a task assignment problem are described. The simulated annealing solutions are shown to be, on average, 28% better than the solutions produced by a random task assignment algorithm and 2% better than the solutions produced by a heuristic 相似文献
4.
5.
Simulated annealing is known to be an efficient method for combinatorial optimization problems. Its usage for realistic problem size, however, has been limited by the long execution time due to its sequential nature. This report presents a practical approach to synchronous simulated annealing for massively parallel distributed-memory multiprocessors. We use an n-ary speculative tree to execute n different iterations in parallel on n processors, called generalized speculative computation (GSC). Execution results of the 100- to 500-city traveling salesman problems on the AP1000 massively parallel multiprocessor demonstrate that the GSC approach can be an effective method for parallel simulated annealing as it gave over 20-fold speedup on 100 processors 相似文献
6.
Gonçalves-e-Silva Kayo Aloise Daniel Xavier-de-Souza Samuel 《The Journal of supercomputing》2018,74(6):2841-2869
The Journal of Supercomputing - We propose a parallel synchronous and asynchronous implementation of the coupled simulated annealing (CSA) algorithm in a shared-memory architecture. The original... 相似文献
7.
Nabhan T.M. Zomaya A.Y. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(12):1226-1233
In this paper, we propose a parallel simulated annealing algorithm based on the technique presented by Witte et al. (1991) but with low communication overhead. The performance of our proposed algorithm is significantly better than the method presented by Witte et al., particularly for optimization problems where the time required to communicate the solution is comparable to the evaluation time. The efficiency of the technique is demonstrated using two case studies with good results 相似文献
8.
The techniques that researchers have used to control error in VLSI placement are surveyed. The author discusses the application of parallelism, synchronization with serial subsets, combining algorithms, periodic synchronization, shared-memory implementation, local-memory implementation, and connection Machine implementation. The issues of temporary versus cumulative error, task allocation, and error measurements are examined 相似文献
9.
This paper presents a parallel simulated annealing algorithm that is able to achieve 90% parallel efficiency in iteration on up to 192 processors and up to 40% parallel efficiency in time when applied to a 5000-dimension Rastrigin function. Our algorithm breaks scalability barriers in the method of Chu et al. (1999) by abandoning adaptive cooling based on variance. The resulting gains in parallel efficiency are much larger than the loss of serial efficiency from lack of adaptive cooling. Our algorithm resamples the states across processors periodically. The resampling interval is tuned according to the success rate for each specific number of processors. We further present an adaptive method to determine the resampling interval based on the adoption rate. This adaptive method is able to achieve nearly identical parallel efficiency but higher success rates compared to the fixed interval one using the best interval found. 相似文献
10.
基于模拟退火的并行粒子群优化研究 总被引:17,自引:0,他引:17
针对粒子群优化(PSO)容易陷入局部极小,提出将模拟退火(SA)引入并行PSO算法.这种模拟退火并行粒子群算法,结合了并行粒子群算法的快速寻优能力和SA的概率突跳特性,保持了群体多样性,从而避免了种群退化.针对转炉提钒过程是一个复杂非线性反应过程而难以建立终点控制模型的问题,提出了基于模拟退火的并行粒子群RBF网络的辨识模型,优化了RBF核中心个数,从而克服了随机性选择.将该模型用于预测提钒吹氧时间,仿真结果表明预测误差不超过真实值的20%. 相似文献
11.
12.
It is argued that although it is not yet clear which of the two wafer scale integration (WSI) forms, monolithic or hybrid, will gain the lead to an enabling technology for second-generation massively parallel computers (MPCs), there are noticeably more backers for hybrid-WSI. The application requirements, implementation problems, and engineering issues of MPCs are discussed. In particular, the associative string processor (ASP) modules, which comprise building blocks for second-generation MPC configurations, are described. The progress reported in developing ASP modules is quantitatively extrapolated to other MPC implementations 相似文献
13.
We present a new method for predicting RNA secondary structure based on a genetic algorithm. The algorithm is designed to run on a massively parallel SIMD computer. Statistical analysis shows that the program performs well when compared to a dynamic programming algorithm used to solve the same problem. The program has also pointed out a long-standing simplification in the implementation of the original dynamic programming algorithm that sometimes causes it not to find the optimal secondary structure. 相似文献
14.
Soo-Young Lee Kyung Geun Lee 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(10):993-1008
Simulated annealing is a general-purpose optimization technique capable of finding an optimal or near-optimal solution in various applications. However, the long execution time required for a good quality solution has been a major drawback in practice. Extensive studies have been carried out to develop parallel algorithms for simulated annealing. Most of them were not very successful, mainly because multiple processing elements (PEs) were required to follow a single Markov chain and, therefore, only a limited parallelism was exploited. In this paper, we propose new parallel simulated annealing algorithms which allow multiple Markov chains to be traced simultaneously by PEs which may communicate with each other. We have considered both synchronous and asynchronous implementations of the algorithms. Their performance has been analyzed in detail and also verified by extensive experimental results. It has been shown that for graph partitioning the proposed parallel simulated annealing schemes can find a solution of equivalent (or even better) quality up to an order of magnitude faster than the conventional parallel schemes. Among the proposed schemes, the one where PEs exchange information dynamically (not with a fixed period) performs best 相似文献
15.
A. V. Kalashnikov V. A. Kostenko 《Journal of Computer and Systems Sciences International》2008,47(3):455-463
Algorithms of simulated annealing for solving problems of multiprocessor scheduling are considered, an approach to parallelization is proposed, and the results of comparisons between the classical sequential, sequential, and parallel algorithms of simulated annealing using a partition of the initial space of solutions into regions are presented. 相似文献
16.
提出了三种新的GPU并行的自适应邻域模拟退火算法,分别是GPU并行的遗传-模拟退火算法,多条马尔可夫链并行的退火算法,基于BLOCK分块的GPU并行模拟退火算法,并通过对GPU端的程序采取合并内存访问,避免bank冲突,归约法等方式进一步提升了性能。实验中选取了11个典型的基准函数,实验结果证明这三种GPU并行退火算法比nonu-SA算法具有更好的精度和更快的收敛速度。 相似文献
17.
18.
Parallel implementation of simulated annealing to reproduce multiple-point statistics 总被引:2,自引:0,他引:2
Oscar Peredo 《Computers & Geosciences》2011,37(8):1110-1121
This paper shows an innovative implementation of simulated annealing in the context of parallel computing. Details regarding the use of parallel computing through a cluster of processors, as well as the implementation decisions, are provided. Simulated annealing is presented aiming at the generation of stochastic realizations of categorical variables reproducing multiple-point statistics.The procedure starts with the use of a training image to determine the frequencies of occurrence of particular configurations of nodes and values. These frequencies are used as target statistics that must be matched by the stochastic images generated with the algorithm. The simulation process considers an initial random image of the spatial distribution of the categories. Nodes are perturbed randomly and after each perturbation the mismatch between the target statistics and the current statistics of the image is calculated. The perturbation is accepted if the statistics are closer to the target, or conditionally rejected if not, based on the annealing schedule.The simulation was implemented using parallel processes with C++ and MPI. The message passing scheme was implemented using a speculative computation framework, by which prior to making the decision of acceptance or rejection of a proposed perturbation, processes already start calculating the next possible perturbation at a second level; one as if the perturbation on level one is accepted, and another process as if the proposed perturbation is rejected. Additional levels can start their calculation as well, conditional to the second level processes. Once a process reaches a decision as to whether accept or reject the suggested perturbation, all processes within the branch incompatible with that decision are dropped. This allows a speed up of up to logn(p+1), where n is the number of categories and p the number of processes simultaneously active.Examples are provided to demonstrate improvements and speed ups that can be achieved. 相似文献
19.
Banerjee P. Jones M.H. Sargent J.S. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(1):91-106
A discussion is presented of two ways of mapping the cells in a two-dimensional area of a chip onto processors in an n -dimensional hypercube such that both small and large cell moves can be applied. Two types of move are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support such a parallel cost evaluation. A novel tree broadcasting strategy is presented for the hypercube that is used extensively in the algorithm for updating cell locations in the parallel environment. A dynamic parallel annealing schedule is proposed that estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control. The performance on an Intel iPSC-2/D4/MX hypercube is reported 相似文献
20.
Csaba M. Jzsa Gza Kolumbn Antonio M. Vidal Francisco J. Martínez‐Zaldívar Alberto Gonzlez 《Concurrency and Computation》2015,27(17):4993-5018
Multiple‐input multiple‐output (MIMO) systems have attracted considerable attention in wireless communications because they offer a significant increase in data throughput and link coverage without additional bandwidth requirement or increased transmit power. The price that has to be paid is the increased complexity of hardware components and algorithms. The sphere detector (SD) algorithm solves the problem of maximum likelihood (ML) detection for MIMO channels by significantly reducing the search space of possible solutions. The main drawback of the SD algorithm is in its sequential nature, consequently, running it on massively parallel architectures (MPAs) is very inefficient. In order to overcome the drawbacks of the SD algorithm, a new parallel sphere detector (PSD) algorithm is proposed. It implements a novel hybrid tree search method, where the algorithm parallelism is assured by the efficient combination of depth‐first search and breadth‐first search algorithms. A path metric‐based parallel sorting is employed at each intermediate stage. The PSD algorithm is able to adjust its memory requirements and extent of parallelism to fit a wide range of parallel architectures. Mapping details for MPAs are proposed by giving the details of thread dependent, highly parallel building blocks of the algorithm. Based on the building blocks proposed, a mapping to general‐purpose graphics processing unit is provided, and its performance is evaluated. In order to achieve high‐throughput, several levels of parallelism are introduced, and different scheduling strategies are considered. Copyright © 2015 John Wiley & Sons, Ltd. 相似文献