首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
This paper presents a parallel formulation of depth-first search which retains the storage efficiency of sequential depth-first search and can be mapped on to anyMIMD architecture. To study its effectiveness it has been implemented to solve the 15-puzzle problem on three commercially available multiprocessors—Sequent Balance 21000, the Intel Hypercube and BBN Butterfly. We have been able to achieve fairly linear speedup on Sequent up to 30 processors (the maximum configuration available) and on the Intel Hypercube andBBN Butterfly up to 128 processors (the maximum configurations available). Many researchers considered the ring architecture to be quite suitable for parallel depth-first search. Our experimental results show that hypercube and sharedmemory architectures are significantly better.At the heart of our parallel formulation is a dynamic work distribution scheme that divides the work between different processors. The effectiveness of the parallel formulation is strongly influenced by the work distribution scheme and architectural features such as presence/absence of shared memory, the diameter of the network, relative speed of the communication network, etc. In a companion paper,(1) we analyze the effectiveness of different load-balancing schemes and architectures, and also present new improved work distribution schemes.This work was supported by Army Research Office Grant No. DAAG29-84-K-0060 to the Artificial Intelligence Laboratory, and Office of Naval Research Grant N00014-86-K-0763 to the Computer Science Department at the University of Texas ay Austin  相似文献   

2.
Consider the problem of exploring a large state-space for a goal state where although many such states may exist in the state-space, finding any one state satisfying the requirements is sufficient. All the methods known until now for conducting such search in parallel using multiprocessors fail to provide consistent linear speedups over sequential execution. The speedups vary between sublinear to superlinear and from one execution to another. Further, adding more processors may sometimes lead to a slow-down rather than speedup, giving rise to speedup anomalies reported in literature. We present a prioritizing strategy which yields consistent speedups that are close toP withP processors, and that monotonically increase with the additon of processors. This is achieved by keeping the total number of nodes expanded during parallel search very close to that of a sequential search. In addition, the strategy requires substantially smaller memory relative to other methods. The performance of this strategy is demonstrated on a multiprocessor with several state-space search problems.This research has been supported in part by the National Science Foundation under Contract No. CCR-89-02496.  相似文献   

3.
针对多核CPU和GPU环境下图的深度优先搜索问题,提出多核CPU中实现并行DFS的新算法,通过有效利用内存带宽来提高性能,且当图增大时优势越明显.在此基础上提出一种混合方法,为DFS每一分支动态地选择最佳的实现:顺序执行;两种不同算法的多核执行;GPU执行.混合算法为每种大小的图提供相对更好的性能,且能避免高直径图上的最坏情况.通过比较多CPU和GPU系统,分析底层架构对DFS性能的影响.实验结果表明,一个高端single-socket GPU系统的DFS执行性能相当于一个高端4-socket CPU系统.  相似文献   

4.
通过八数码问题比较搜索算法的性能   总被引:1,自引:0,他引:1  
搜索算法的核心在于搜索策略的制定.一般的搜索算法采用无信息指导的搜索策略,如深度优先搜索(DFS)和宽度优先搜索(BFS),还有一些搜索算法采用了启发式信息指导的搜索策略,如A*算法.不同的搜索策略会使得搜索算法的性能有很大的差异.使用以上3种搜索算法实现八教码问题的求解,分析和比较三者所表现出来的性能,同时指出3种搜索算法的特点和应用范围,最后给出分析结论以指导开发和使用更加高效的搜索策略.  相似文献   

5.
The paper deals with the parallel variant of the scheduling algorithm dedicated to the hybrid flow shop problem. The problem derives from practice of automated manufacturing lines, e.g. for printed packages. The overall goal is to design a new algorithm which merges the performance of the best known sequential approach with the efficient exploitation of parallel calculation environments. In order to fulfill the above aim, there are two methods proposed in this paper: the original fast method of parallel calculation of the criterion function and the local neighborhood parallel search method embedded in the tabu search approach. The theoretical analysis, as well as the original implementation, with the use of vector processing instructions SSE2 supported by suitable data organization, are presented below. Numerical properties of the proposed algorithm are empirically verified on the multi-core processor.  相似文献   

6.
对指纹图像的细化算法进行了较深入地研究,分析了两种常用细化算法--快速细化算法和改进的OPTA算法各自的优缺点.针对其中存在的迭代次数多、细化速度慢、图像局部细化不彻底等问题,提取了一种无回溯深度优先搜索的快速指纹细化算法.实验结果表明,该算法在保证对图像完全细化的同时,也具有较快的细化处理速度.  相似文献   

7.
《国际计算机数学杂志》2012,89(1-4):255-268
Parallel Breadth-First Search (BFS) algorithms for ordered trees and graphs on a shared memory model of a Single Instruction-stream Multiple Data-stream computer are proposed. The parallel BFS algorithm for trees computes the BFS rank of eachnode of an ordered tree consisting of n nodes in time of 0(β log n) when 0(n 1+1/β) processors are used, β being an integer greater than or equal to 2. The parallel BFS algorithm for graphs produces Breadth-First Spanning Trees (BFSTs) of a directedgraph G having n nodes in time 0(log d.log n) using 0(n 3) processors, where d is the diameter of G If G is a strongly connected graph or a connected undirected graph the BFS algorithm produces n BFSTs, each BFST having a different start node.  相似文献   

8.
The availability of commodity multiprocessors and high-speed networks of workstations offer significant opportunities for addressing the increasing computational requirements of optimization applications. To leverage these potential benefits, it is important, however, to make parallel and distributed processing easily accessible to a wide audience of optimization programmers. This paper addresses this challenge by proposing parallel and distributed programming abstractions that keep the distance from sequential local search algorithms as small as possible. The abstractions, including parallel loops, interruptions, thread pools, and shared objects, are compositional and cleanly separate the optimization program and the parallel instructions. They have been evaluated experimentally on a variety of applications, including warehouse location and coloring, for which they provide significant speedups.  相似文献   

9.
袁泉  石昭祥 《微计算机信息》2006,22(34):271-273
二次时频分布(TFD)是一种常用的非平稳信号分析方法,以最常用的Wigner-Ville分布(WVD)为例,分析了离散时频分布算法实现,针对该算法的复杂性和计算量大的特点,实现了离散信号时频分布的并行算法,通过采用缓冲池和计算结果的压缩传输,同时减小了并行计算中传输数据总量和传输次数,将通信开销从O(N2)降低到O(N),解决了并行时频分布计算中数据传输“瓶颈”问题。  相似文献   

10.
《国际计算机数学杂志》2012,89(3-4):113-121
Given n items, a parallel algorithm for generating all the n! permutations is presented. The computational model used is a linear array which consists of n identical processing elements with a simple structure. One permutation is produced at each other time step. The elapsed time to produce a permutation is independent of the integer n. The basic idea used is the iterative method and the exchange of two consecutive components in an existing permutation. The design procedures of this algorithm are considered in detail. The ranking and unranking functions of the required permutations are also discussed.  相似文献   

11.
虽然G.729中采用的集中搜索和G.729a中采用的深度优先树搜索可以有效减少固定码本搜索复杂度,但固定码本搜索在整个语音编码算法中仍占有较大比重.为了在基本维持语音质量的前提下,减少搜索运算量,研究了几种快速搜索算法,脉冲替代和预选替代一个脉冲搜索算法可以大大减少搜索次数,但语音质量明显下降,因此提出每次替代两个脉冲搜索算法,得到比替代一个脉冲较为完整的搜索,产生较好的语音质量.仿真结果表明,该算法可以大大减少搜索运算量,并且保持了和G.729a深度优先树搜索算法相同的语音质量.  相似文献   

12.
并行遗传算法分析   总被引:16,自引:1,他引:16  
在科学计算机领域,并行遗传算法开始受到关注。分析了遗传算法并行化的同和实现模型,讨论了遗传算法隐含的并行性,对于灵活应用并行遗传算法有指导意义。  相似文献   

13.
This paper presents an improved analysis of a randomized parallel backtrack search algorithm (RPBS). Our analysis uses the single-node-donation model that each donation contains a single tree node. It is shown that with high probability the total number of messages generated by RPBS is O(phd) where p is the number of processors, and h and d are the height and degree of the backtrack search tree. Under the assumption of unit-time message delivery, it is shown that with high probability the execution time of RPBS is n/p + O(hd) where n is the number of nodes of the backtrack search tree and the leading term n/p has no constant factor. As the result of limited communication requirement, RPBS can be efficiently implemented in message-passing or shared-memory multiprocessor systems. A general analysis of network implementation of RPBS is presented. The concept of total routing time, the sum of routing times of all messages, is introduced as a measure of communication cost. It is shown that the overall effect of message delay to the execution time of RPBS is small if the total routing time is small. Some experimental data on a shared-memory machine are reported. Received November 23, 1996; revised February 15, 1998.  相似文献   

14.
Networks are a class of general systems represented by becomes a weighted graph visualizing the constraints imposed their UC-structure. Suppressing the nature of elements the network by interconnections rather than the elements themselves. These constraints follow generalized Kirchhoff's laws derived from physical constraints. Once we have a graph; then the working environment becomes the graph-theory. An algorithm derived from graph theory is developed within the paper in order to analyze general networks. The algorithm is based on computing all the spanning trees in the graph G with an associated weight. This weight is the product ofadmittance's of the edges forming the spanning tree. In the first phase this algorithm computes a depth first spanning tree together with its cotree. Both are used as parents for controlled generation of off-springs. The control is represented in selecting the off-springs that were not generated previously. While the generation of off-springs, is based on replacement of one or more tree edges by cycle edges corresponding to cotree edges. The algorithm can generate a frequency domain analysis of the network.  相似文献   

15.
为了解决布谷鸟搜索算法后期收敛速度慢、求解精度不高、易陷入局部最优等缺陷,提出了一种基于Powell局部搜索策略的全局优化布谷鸟搜索算法.算法将布谷鸟全局搜索能力与Powell方法的局部寻优性能有机地结合,并根据适应度值逐步构建精英种群候选解池在迭代后期牵引Powell搜索的局部优化,在保证求解速度、尽可能找到全局极值点的同时提高算法的求解精度.对52个典型测试函数实验结果表明,该算法相比于传统的布谷鸟搜索算法不仅寻优精度和寻优率有所提高,并且适应能力强、鲁棒性好,与最新提出的其他改进算法相比也具有一定的竞争优势.  相似文献   

16.
吴磊  芦东昕  方马 《计算机工程》2003,29(22):89-90,150
在并行算法中,涉及指针的算法是很重要的。该文讨论了一种称为指针转移的技术,这一技术提供了一种并行地控制表操作的快速方法。文绍了如何运用指针转移技术对表执行前缀计算,如何尽快把表的算法改写为适用于树的算法。  相似文献   

17.
《国际计算机数学杂志》2012,89(3-4):205-226
Ghosh and Bhattacharjee propose [2] (Intern. J. Computer Math., 1984, Vol. 15, pp. 255-268) an algorithm of determining breadth first spanning trees for graphs, which requires that the input graphs contain some vertices, from which every other vertex in the input graph can be reached. These vertices are called starting vertices. The complexity of the GB algorithm is O(log2 n) using O{n 3) processors. In this paper an algorithm, named BREADTH, also computing breadth first spanning trees, is proposed. The complexity is O(log2 n) using O{n 3/logn) processors. Then an efficient parallel algorithm, named- BREADTHFOREST, is proposed, which generalizes algorithm BREADTH. The output of applying BREADTHFOREST to a general graph, which may not contain any starting vertices, is a breadth first spanning forest of the input graph. The complexity of BREADTHFOREST is the same as BREADTH.  相似文献   

18.
This paper presents a number of novel metaheuristic approaches that can efficiently map stream graphs on multicores. A stream graph consists of a set of actors performing different functions communicating through edges. Orchestrating stream graphs on multicores can be formulated as an Integer Linear Programming (ILP) problem but ILP solver takes exponential time to provide an optimal solution. We propose metaheuristic algorithms to achieve near optimal solutions within a reasonable amount of time. We employ six different variants of the Hill-Climbing (HC) algorithm employing different tweak operators that produce excellent result extremely quickly. We also propose six different variants of Genetic Algorithm (GA) to examine how effective these variants can be in escaping the local optima. We finally combine HC and GA techniques (which is also known as ‘memetic algorithm’) to produce hybrid techniques that outperform the individual performance of HC and GA techniques. We compare our results with the results generated by the CPLEX optimization tool. Our best technique has achieved a geometric mean speedup of 7.42× across a range of StreamIt benchmarks on an eight-core processor.  相似文献   

19.
Optimal algorithms for the online time series search problem   总被引:1,自引:0,他引:1  
In the problem of online time series search introduced by El-Yaniv et al. (2001) [1], a player observes prices one by one over time and shall select exactly one of the prices on its arrival without the knowledge of future prices, aiming to maximize the selected price. In this paper, we extend the problem by introducing profit function. Considering two cases where the search duration is either known or unknown beforehand, we propose two optimal deterministic algorithms respectively. The models and results in this paper generalize those of El-Yaniv et al. (2001) [1].  相似文献   

20.
This two-part paper presents modelling and scheduling approaches of flexible manufacturing systems using Petri nets (PNs) and artificial intelligence (AI)-based heuristic search methods. In Part I, PN-based modelling approaches and basic AI-based heuristic search algorithms were presented. In Part II, a new heuristic function that exploits PN information is proposed. Heuristic information obtained from the PN model is used to dramatically reduce the search space. This heuristic is derived from a new concept, the resource cost reachability matrix, which builds on the properties of B-nets proposed in Part I. Two hybrid search algorithms, (1) an approach to model dispatching rules using analysis information provided by the PN simulation and (2) an approach of the modified stage-search algorithm, are proposed to reduce the complexity of large systems. A random problem generator is developed to test the proposed methods. The experimental results show promising results.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号