首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
For pt.I. see ibid., p. 170-80. In pt.I, we presented a binding environment for the AND and OR parallel execution of logic programs. This environment was instrumental in rendering a compiler for the AND and OR parallel execution of logic programs machine independent. In this paper, we describe a compiler based on the Reduce-OR process model (ROPM) for the parallel execution of Prolog programs, and provide performance of the compiler on five parallel machines: the Encore Multimax, the Sequent Symmetry, the NCUBE 2, the Intel i860 hypercube and a network of Sun workstations. The compiler is part of a machine independent parallel Prolog development system built on top of a run time environment for parallel programming called the Chare kernel, and runs unchanged on these multiprocessors. In keeping with the objectives behind the ROPM, the compiler supports both on and independent AND parallelism in Prolog programs and is suitable for execution on both shared and nonshared memory machines. We discuss the performance of the Prolog compiler in some detail and describe how grain size can be used to deliver performance that is within 10% of the underlying sequential Prolog compiler on one processor, and scale linearly with increasing number of processors on problems exhibiting sufficient parallelism. The loose coupling between parallel and sequential components makes it possible to use the best available sequential compiler as the sequential component of our compiler  相似文献   

2.
A specification of the OR-parallel execution of Prolog programs, using CHOCS (calculus of higher order communicating systems) [24], is presented in the paper. A translation is defined from Prolog programs and goals to CHOCS processes: the execution of the CHOCS process corresponding to a goal mimics the OR-parallel execution of the original Prolog goal. In the translation, clauses and predicate definitions of a Prolog program correspond to processes. To model OR-parallelism, the processes , corresponding to clauses (having the same head predicate ) start their execution concurrently, but, in order to respect the depth-first search rule, each is guarded by the termination of the executions of processes 's, . The computational model is proved correct with respect to the semantics of Prolog, as given in [4, 5]. Our model, because of its algebraic specification, can be easily used to prove properties of the parallel execution of Prolog programs. Moreover, the model exploits the maximum degree of parallelism, by giving the Prolog solutions in parallel, without any order among them. However, this model, being close to the Prolog semantics definition, contains sources of inefficiency which make it unpractical as a guide for the implementation. To overcome these problems, a new computational model is defined. This model is obtained by modifications of the basic one and thus its correctness can be easily proved. Finally, we show how to obtain models of different real implementations of OR-parallel Prolog by slight modification of the new model. The relations among all these models, in terms of parallelism degree, are studied by using the concepts of bisimulation and simulation, developed for concurrent calculi. Received: 5 May 1995 / 28 May 1996  相似文献   

3.
The restricted and-parallelism (RAP) execution model of logic programs is described. This model uses a compile-time data-dependence analysis to generate execution graph expressions for the clauses in a Prolog program. These expressions use simple run-time tests to determine the possibilities of parallelism and produce multiple parallel execution graphs from a single execution graph expression. A simple algorithm is then presented which automatically produces these execution graphs for Prolog clauses. Several ways in which the algorithm can be significantly improved by using the results of program-level data-dependence analysis are discussed.  相似文献   

4.
This paper presents a system for parallel execution of Prolog supporting both independent conjunctive and disjunctive parallelism. The system is intended for distributed memory architecture and is composed of a set of workers with a hierarchical structure scheduler. The execution model has been designed in such a way that each worker's environment does not contain references to terms in other environments, thus reducing communication overhead. In order to guarantee the improvement of the performance by the parallelism exploitation, a granularity control has been introduced for each kind of parallelism. For conjunctive parallelism PDP applies a control based on the estimation provided by CASLOG. The features of the system allow to introduce this control without adding overhead. For disjunctive parallelism PDP controls granularity by applying a heuristic-based method, which can be adapted to other parallel Prolog systems. Different scheduling policies have also been tested. The system has been implemented on a transputer network and performance results show that it provides a high speedup for coarse grain parallel programs.  相似文献   

5.
When two or more literals in the body of a Prolog clause are solved in (AND) parallel, their solutions need to bejoined to compute solutions for the clause. This is often a difficult problem in parallel Prolog systems that exploit OR and independent AND parallelism in Prolog programs. In several AND/OR parallel systems proposed recently, this problem is side-stepped at the cost of unexploited OR parallelism in the program, in part due to the complexity of the backtracking algorithm beneath AND parallel branches. In some cases, the data dependency graphs used by these systems cannot represent all the exploitable indenpendent AND parallelism known at compile time.In this paper, we describe the compile time analysis for an optimizedjoin algorithm for supporting independent AND parallelism in logic programs efficiently without leaving any OR parallelism unexploited. We then discuss how this analysis can be used to yield very efficient runtime behavior. We also discuss problems associated with a tree representation of the search space when arbitrarily complex data dependency graphs are permitted. We describe how these problems can be resolved by mapping the search space onto the data dependency graphs themselves. The algorithm has been implemented in a compiler for parallel Prolog based on the Reduce-OR process model. The algorithm is suitable for the implementation of AND/OR systems on both shared and nonshared memory machines. Performance on benchmark programs exhibiting AND and OR parallelism on one shared memory machine and one message passing machine is presented.This work was supported in part by NSF Grants CCR-87-00988 and CCR-89-02496.A shorter version of this paper appears in theProceedings of NACLP 1990.  相似文献   

6.
An architecture is presented for the parallel execution of sequential Prolog. The architecture is based on a pipeline of unification processors and designed to work as a co-processor to a more conventional, UNIX based workstation. The unification processors execute highly optimized compiled Prolog code; however, the basic concept of the architecture could also increase the performance of interpreter based systems. It will be shown that even programs that do not exhibit any of the ‘classical’ forms of parallelism (i.e. AND-, OR-parallelism, etc.) can be effectively mapped onto the proposed architecture. The presented architecture should also prove very effective as a multi-user Prolog machine executing several independent Prolog programs in parallel. In contrast to other attempts to execute sequential Prolog in parallel we do not restrict the use of any of the standard Prolog language features such as dynamic assert/retract, CUT, etc. Simulation results show that peak execution rates of over 1000 KLIPS can be obtained.  相似文献   

7.
This paper concerns the exploitation of user transparent inherent parallelism of pure Prolog programs using program transformation. We describe a novel paradigmenumerate-and-filter for transforming generate-and-test programs for execution under the committed-choice model extended to incorporate multiple solutions based on set enumeration. The paradigm simulates OR-parallelism by stream AND-parallelism integrating OR-parallelism, AND-parallelism, and stream parallelism. Generate-and-test programs are classified into three categories:simple generate-and-test, recursively embedded generate-and-test, and deeply intertwined generate-and-test. The intermediate programs are further transformed to reduce structure copying and metacalls. Algorithms are presented and demonstrated by transforming the representative examples of different classes of generate-and-test programs to Flat Concurrent Prolog equivalents. Statistics show that the techniques are efficient.Funded in part by Cleveland Advanced Manufacturing Program through the State of Ohio as a part of its core research program grant to Center of Automation and Intelligent Systems Research, Case Western Reserve University and NSF equipment grant CDA-8820390 to Kent State University.  相似文献   

8.
We present an annotation language well-suited for rendering aspects of Prolog execution. Our annotations are special Prolog goals that act as executable comments, performing debugging at run-time. No restrictions are placed upon the object language, the concern being verification of (full) Standard Prolog programs. Here we discuss the merits of the annotations for Prolog debugging. All the examples are actual runs of our system, Nope.  相似文献   

9.
Logic programs offer many opportunities for the exploitation of parallelism.But the parallel execution of a task incurs various overheads.This paper focuses on the issues relevant to parallelizing Prolog on shared-memory multiprocessors efficiently.  相似文献   

10.
We present an overview of the ACE system, a sound and complete parallel implementation of Prolog that exploits parallelism transparently (i.e., without any user intervention) from AI programs and symbolic applications coded in Prolog. ACE simultaneously exploits all the major forms of parallelism – Or-parallelism, Independent And-parallelism, and Dependent And-parallelism – found in Prolog programs. These three varieties of parallelism are discussed in detail, along with the problems encountered in their practical exploitation. Our solutions to these problems, incorporated in the ACE system, are presented. The ACE system has been implemented on Sequent Symmetry and Sun Sparc Multiprocessors; performance results from this implementation for several AI programs are presented, which confirm the effectiveness of the choices made. This revised version was published online in June 2006 with corrections to the Cover Date.  相似文献   

11.
Efficient reordering of Prolog programs   总被引:1,自引:0,他引:1  
Prolog programs are often inefficient: execution corresponds to a depth-first traversal of an AND/OR graph; traversing subgraphs in another order can be less expensive. It is shown how the reordering of clauses within Prolog predicates, and especially of goals within clauses, can prevent unnecessary search. The characterization and detection of restrictions on reordering is discussed. A system of calling modes for Prolog, geared to reordering, is proposed, and ways to infer them automatically are discussed. The information needed for safe reordering is summarized, and which types can be inferred automatically and which must be provided by the user are considered. An improved method for determining a good order for the goals of Prolog clauses is presented and used as the basis for a reordering system  相似文献   

12.
The Muse (multiple sequential Prolog engines) approach has been used to make a simple and efficient OR-parallel implementation of the full Prolog language. The performance results of the Muse system on bus-based multiprocessor machines have been presented in previous papers. This paper discusses the implementation and performance results of the Muse system on switch-based multiprocessors (the BBN Butterfly GP1000 and TC2000). The results of Muse execution show that high real speedups can be achieved for Prolog programs that exhibit coarse-grained parallelism. The scheduling overhead is equivalent to around 8–26 Prolog procedure calls per task on the TC2000. The paper also compares the Muse results with corresponding results for the Aurora OR-parallel Prolog system. For a large set of benchmarks, the results are in favor of the Muse system.  相似文献   

13.
This paper proposes a predicate nameddosim which provides a new function for parallel execution of logic programs. The parallelism achieved by this predicate is a simultaneous mapping operation such as bagof and setof predicates. However, the degree of parallelism can be easily decided by arranging the arguments of the dosim goal. The parallel processing system with dosim was realized on a tight-coupled multiprocessor machine. To control the degree of parallelism and reduce the amount of memory required for execution, we introduce the grouping method for the goals executed in parallel and some variations of the dosim predicate. The effectiveness of the proposed method is demonstrated by the results of the execution of several applications.  相似文献   

14.
Muse (Multi-sequential Prolog engines) is a simple and efficient approach to Or-parallel execution of Prolog programs. It is based on having several sequential Prolog engines, each with its local address space, and some shared memory space. It is currently implemented on a 7-processors machine with local/shared memory constructed at SICS, a 16-processors Sequent Symmetry, a 96-processors BBN Butterfly I, and a 45-processors BBN Butterfly II. The sequential SICStus Prolog system has been adapted to Or-parallel implementation. Extra overhead associated with this adaptation is very low in comparison with the other approaches. The speed-up factor is very close to the number of processors in the system for a large class of problems.The goal of this paper is to present the Muse execution model, some of its implementation issues, a variant of Prolog suitable for multiprocessor implementations, and some experimental results obtained from two different multiprocessor systems.  相似文献   

15.
This paper begins by describing BSL, a new logic programming language fundamentally different from Prolog. BSL is a nondeterministic Algol-class language whose programs have a natural translation to first order logic; executing a BSL program without free variables amounts to proving the corresponding first order sentence. A new approach is proposed for parallel execution of logic programs coded in BSL, that relies on advanced compilation techniques for extracting fine grain parallelism from sequential code. We describe a new “Very Long Instruction Word” (VLIW) architecture for parallel execution of BSL programs. The architecture, now being designed at the IBM Thomas J. Watson Research Center, avoids the synchronization and communication delays (normally associated with parallel execution of logic programs on multiprocessors), by determining data dependences between operations at compile time, and by coupling the processing elements very tightly, via a single central shared register file. A simulator for the architecture has been implemented and some simulation results are reported in the paper, which are encouraging.  相似文献   

16.
A major reason for the lack of practical use of parallel computers has been the absence of a suitable model of parallel computation. Many existing models are either theoretical or are tied to a particular architecture. A more general model must be architecture independent, must realistically reflect execution costs, and must reduce the cognitive overhead of managing massive parallelism. A growing number of models meeting some of these goals have been suggested. We discuss their properties and relative strengths and weaknesses. We conclude that data parallelism is a style with much to commend it, and discuss the Bird-Meertens formalism as a coherent approach to data parallel programming.This work was supported by the Natural Sciences and Engineering Research Council of Canada.  相似文献   

17.
This paper presents some fundamental properties of independent and- parallelism and extends its applicability by enlarging the class of goals eligible for parallel execution. A simple model of (independent) and-parallel execution is proposed and issues of correctness and efficiency are discussed in the light of this model. Two conditions, “strict” and “nonstrict” independence, are defined and then proved sufficient to ensure correctness and efficiency of parallel execution: If goals which meet these conditions are executed in parallel, the solutions obtained are the same as those produced by standard sequential execution. Also, in the absence of failure, the parallel proof procedure does not generate any additional work (with respect to standard SLD resolution), while the actual execution time is reduced. Finally, in case of failure of any of the goals, no slowdown will occur. For strict independence, the results are shown to hold independently of whether the parallel goals execute in the same environment or in separate environments. In addition, a formal basis is given for the automatic compile-time generation of independent and-parallelism: Compiletime conditions to efficiently check goal independence at run time are proposed and proved sufficient. Also, rules are given for constructing simpler conditions if information regarding the binding context of the goals to be executed in parallel is available to the compiler.  相似文献   

18.
19.
We propose an extension to Prolog called the count term for controlling Prolog execution. The purpose is to allow the programmers as well as the users to have greater flexibility in controlling the execution behavior of Prolog programs and for limiting the number of answers or proofs retrieved when Prolog is used as a database query language. Both syntax and operational semantics of the count term are defined. An implementation strategy based on WAM (Warren Abstract Machine) is provided. We analyze the possible meanings one might associate with the count term. The possible replacement of cut and fail by the count term is presented. The ease of analysis of programs with count terms is discussed.  相似文献   

20.
《Computer Languages》1996,22(2-3):115-142
We present the design and implementation of the and-parallel component of ACE. ACE is a computational model for the full Prolog language that simultaneously exploits both or-parallelism and independent and-parallelism. A high-performance implementation of the ACE model has been realized and its performance reported in this paper. We discuss how some of the standard problems which appear when implementing and-parallel systems are solved in ACE. We then propose a number of optimizations aimed at reducing the overheads and the increased memory consumption which occur in such systems when using previously proposed solutions. Finally, we present results from an implementation of ACE which includes the optimizations proposed. The results show that ACE exploits and-parallelism with high efficiency and high speedups. Furthermore, they also show that the proposed optimizations, which are applicable to many other and-parallel systems, significantly decrease memory consumption and increase speedups and absolute performance both in forward execution and during backtracking.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号