Similar literature
20 similar documents found (search time: 0 ms)
1.
2.
Brig Elliott 《Software》1982,12(4):331-340
Dartmouth College has implemented a single debugger for several languages sharing a common runtime environment: PL/I, Basic and Fortran. The debugger is fairly powerful; users set breakpoints and traces which occur whenever the values of given variables change, or whenever certain relational expressions become true, for example. All debugging is carried on in a syntax similar to that of a high-level language. This debugger was implemented in about a month. It should be fairly easy to implement on most timesharing systems. This paper describes the debugger's user interface and gives a rough sketch of its implementation.

3.
One of the problems in the development of multiprocessor systems for image analysis is the selection and efficient utilization of an interconnection network between the multiple processing units. This paper proposes a system organization centered around a class of interconnection networks and a global bus. Control schemes are developed for realizing the intertask communication requirements typically encountered in the parallel formulation of problems for image analysis. These schemes are simple, distributed and efficient. The utility of this organization is demonstrated by evaluating the performance of two applications.

4.
We present a parallel algorithm for computing an optimal sequence alignment in efficient space. The algorithm is intended for a message-passing architecture with one-dimensional-array topology. The algorithm computes an optimal alignment of two sequences of lengths M and N in O((M+N)^2/P) time and O((M+N)/P) space per processor, where the number of processors is P >= max(M, N). Thus, when P = max(M, N) it achieves linear speedup and requires constant space per processor. Some experimental results on an Intel hypercube are provided. This research was supported by NIH Grant LM05110 from the National Library of Medicine.
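
As a point of reference for the space bound above, the following is a minimal sequential sketch of global alignment scoring that keeps only two rows of the dynamic-programming table, so its memory use is linear in the length of the second sequence. It is not the parallel algorithm from the abstract, and it returns only the score, not the alignment itself. The function name `alignment_score` and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b.

    Only the previous and current rows of the DP table are stored,
    so memory use is O(len(b)) rather than O(len(a) * len(b)).
    The scoring parameters are illustrative assumptions.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
        prev = curr
    return prev[len(b)]
```

For example, `alignment_score("ACGT", "AGT")` aligns three matching pairs and one gap under the default scores assumed here.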

5.
Multiple resource-sharing is a common situation in parallel and complex manufacturing processes and may lead to deadlock states. To alleviate this issue, this paper presents a method of modeling parallel processing flows that share a limited number of resources in flexible manufacturing systems (FMSs). A new class of Petri net called parallel process net with resources (PPNRs) is introduced for modeling such FMSs. PPNRs have the capacity to model more complex resource-sharing among parallel manufacturing processes. Furthermore, this paper presents simple diagnostic and remedial procedures for deadlocks in PPNRs. The proposed technique for deadlock detection and recovery is based on transition vectors, which can determine the structural aspects as well as the process flow condition in PPNRs. Moreover, the proposed technique for dealing with deadlocks is not siphon-based; thus large-scale PPNRs for real-life FMSs can be tackled. Finally, the proposed method of modeling and deadlock analysis in an FMS with parallel processing is demonstrated by a practical example.

6.
A message-passing class library C++ for portable parallel programming   Total citations: 1, self-citations: 1, citations by others: 0
An object-oriented message-passing class library in C++, called PPI++, for portable parallel programming has been developed. PPI++ (parallel portability interface in C++) is designed to serve as a stable (unchanging) interface between the client parallel code and the rapidly evolving distributed computing environments. By taking advantage of encapsulation, inheritance, and polymorphism supported by C++, PPI++ provides a clean and consistent programming interface, which helps improve the clarity and expressiveness of client parallel codes and hides implementation details and complexity from the user to ease parallel programming tasks. In addition, the use of strong type-checking in C++ allows the detection of potential misuses of the library at compile time, and thus promotes code reliability. This paper describes the object-oriented design and implementation of PPI++. Evaluation of PPI++ on important performance issues, such as portability, ease-of-use, extensibility, and efficiency, is also discussed.

7.
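MAPE: A simulation environment for performance evaluation of parallel system architectures   Total citations: 1, self-citations: 0, citations by others: 1
1 Introduction. The concept of the MIMD (Multiple Instruction, Multiple Data) computer comes from Flynn's traditional taxonomy of computer systems. In this architecture, the system has multiple processors, each of which independently executes its own instructions on its own data. This architecture supports highly parallel problems well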

8.
This paper describes an efficient implementation and evaluation of a parallel eigensolver for computing all eigenvalues of dense symmetric matrices. Our eigensolver uses a Householder tridiagonalization method, which has higher parallelism and performance than conventional methods when the problem size is relatively small, e.g., of order 10,000. This is very important for practical applications in which many diagonalizations of such matrices are required. The routine was evaluated on a 1024-processor HITACHI SR2201, giving speedup ratios of about 2–5 times compared with the ScaLAPACK library on the same 1024 processors.

9.
As parallel machines become more widely available, many existing algorithms are being converted to take advantage of the improved speed offered by such computers. However, the method by which the algorithm is distributed is crucial to obtaining the speed-ups required for many real-time tasks. This paper presents three parallel implementations of the Douglas-Peucker line simplification algorithm on a Sequent Symmetry computer and compares the performance of each with the original sequential algorithm.
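
For orientation, here is a minimal sketch of the sequential Douglas-Peucker algorithm that such parallel versions start from. It does not reproduce any of the three parallel implementations mentioned in the abstract, which are not described here. The function names and the use of point-to-segment distance (rather than distance to the infinite line) are assumptions made for this illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from 2-D point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping points farther than tolerance
    from the chord joining the endpoints of their sub-polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first-last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every interior point is close enough: keep only the endpoints.
        return [first, last]
    # Otherwise split at the farthest point and simplify each half.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point
```

The two recursive calls operate on disjoint halves of the polyline and do not depend on each other, which is one natural place to look for parallelism.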

10.
Numerical methods for second order differential equations with two-point boundary conditions are incorporated into a three-part method for the solution of a second order nonlinear Fredholm integro-differential equation. The interest in this paper is the development of an algorithm for parallel processing the discrete nonlinear system. Numerical examples are given.

11.
We consider two general precedence-constrained scheduling problems that have wide applicability in the areas of parallel processing, high performance compiling, and digital system synthesis. These problems are intractable, so it is important to be able to compute tight bounds on their solutions. A tight lower bound on makespan scheduling can be obtained by replacing precedence constraints with release and due dates, giving a problem that can be efficiently solved. We demonstrate that recursively applying this approach yields a bound that is provably tighter than other known bounds, and experimentally shown to achieve the optimal value at least 90.3% of the time over a synthetic benchmark. We compute the best known lower bound on weighted completion time scheduling by applying the recent discovery of a new algorithm for solving a related scheduling problem. Experiments show that this bound significantly outperforms the linear programming-based bound. We have therefore demonstrated that combinatorial algorithms can be a valuable alternative to linear programming for computing tight bounds on large scheduling problems.

12.
13.
An increasing awareness of the need for high speed parallel processing systems for image analysis has stimulated a great deal of interest in the design and development of such systems. Efficient processing schemes for several specific problems have been developed, providing some insight into the general problems encountered in designing efficient image processing algorithms for parallel architectures. However, it is still not clear what architecture or architectures are best suited for image processing in general, or how one may go about determining those which are. An approach that would allow application requirements to specify architectural features would be useful in this context. Working towards this goal, general principles are outlined for formulating parallel image processing tasks by exploiting parallelism in the algorithms and data structures employed. A synchronous parallel processing model is proposed which governs the communication and interaction between these tasks. This model presents a uniform framework for comparing and contrasting different formulation strategies. In addition, techniques are developed for analyzing instances of this model to determine a high level specification of a parallel architecture that best ‘matches’ the requirements of the corresponding application. It is also possible to derive initial estimates of the component capabilities that are required to achieve predefined performance levels. Such analysis tools are useful in the design stage, in the selection of a specific parallel architecture, and in efficiently utilizing an existing one. In addition, the architecture independent specification of application requirements makes it a useful tool for benchmarking applications.

14.
In this paper, we present a software tool, RTS (real time simulator), that analyses the time cost behaviour of parallel computations through simulation. RTS assumes that the computer system which supports the execution of parallel computations has a limited number of processors, that all processors have the same speed, and that they communicate with each other through a shared memory. In RTS, the time cost of a parallel computation is defined as a function of the input, the algorithm, the data structure, the processor speed, the number of processors, the processor power allocation, the communication and the execution environment. How RTS models the time cost is first discussed in the paper. In the model, a locking technique is used to manipulate the access to the shared memory, processing power is equally allocated among all the operations that are currently being performed in parallel in the computer system, and the number of operations in the execution environment of a parallel computation changes from time to time. How RTS works and how the simulation is used to do time cost analysis are also discussed.

15.
This paper discusses postulated advantages of parallel job designs compared with sequential designs in business processes. Until now, the literature has suggested an overall dominance of parallel designs. An analysis of relevant application domains, in which parallel tasks are prevalent in order to accelerate processes, offers two remarkable insights: (1) parallel tasks can but do not have to speed up business processes; (2) coordination efforts may reduce or even invert potential performance gains. Thus, this paper thoroughly evaluates interdependencies between parallel designs and coordination efforts through simulation experiments. Several process patterns for order processing in a real-world example are examined. The results offer noteworthy characteristics of parallel designs which extend the existing knowledge: (a) gains from parallelization are reduced with increasing process variability, (b) small increments in coordination efforts neutralize gains from parallelization, (c) in high work load situations, the resource capacity for the coordination task is a severe bottleneck, and (d) if the findings (a)–(c) are carefully considered, only the parallelization of multiple tasks leads to significant performance gains.

16.
Parallel processing plays an important role in sensor-based control of intelligent mobile robots. This paper describes the design and implementation of a parallel processing architecture used for real-time, sensor-based control of mobile robots. This architecture takes the form of a network of sensing and control nodes, based on a novel module that we call Locally Intelligent Control Agent (LICA). It is a hybrid control architecture containing low-level feedback control loops and high-level decision making components. All the sensing, planning, and control tasks for intelligent control of a mobile robot are distributed across such a network, and operate in parallel. It has been used successfully in many experiments to perform planning and navigation tasks in real time. Such a generic architecture can be readily applied to many diverse applications.

17.
The paper analyzes and selects an appropriate interconnection network for a compliant multiprocessor. The multiprocessor is compliant to the tasks assigned to it in the sense that it can be reconfigured to provide a more efficient fit to the tasks to be executed. A number of candidate networks for the multiprocessor are considered: Omega, ADM, Hypercube and Torus. The potential applicability of these networks to the multiprocessor is analyzed from the points of view of partitionability, inter-PE delay, fault impact, and cost. After the individual analyses of these points are completed, a weighted network factor is formed, and the optimal type of network is selected under different performance criteria. The overall results point to the selection of the Torus or Hypercube network for most cases under consideration.

18.
Increasing acceptance of the necessity for high-order parallelism in order to progress digital processing still leaves open the large question of what machine architectures are best for which class of problem.

To help answer this, we are investigating and comparing the use of both SIMD and MIMD architectures for programmable processing in real-time systems. A distributed array machine, Mil-DAP (derived from the original ICL DAP), has been developed and benchmarked on radar, image processing, and terrain modelling problems. Multi-transputer arrays have been applied to an overlapping set of problems in image processing, FFT and terrain-based computation.

The results are compared and preliminary conclusions drawn.


19.
The cgmCUBE project: Optimizing parallel data cube generation for ROLAP   Total citations: 5, self-citations: 0, citations by others: 5
On-line Analytical Processing (OLAP) has become one of the most powerful and prominent technologies for knowledge discovery in VLDB (Very Large Database) environments. Central to the OLAP paradigm is the data cube, a multi-dimensional hierarchy of aggregate values that provides a rich analytical model for decision support. Various sequential algorithms for the efficient generation of the data cube have appeared in the literature. However, given the size of contemporary data warehousing repositories, multi-processor solutions are crucial for the massive computational demands of current and future OLAP systems. In this paper we discuss the cgmCUBE Project, a multi-year effort to design and implement a multi-processor platform for data cube generation that targets the relational database model (ROLAP). More specifically, we discuss new algorithmic and system optimizations relating to (1) a thorough optimization of the underlying sequential cube construction method and (2) a detailed and carefully engineered cost model for improved parallel load balancing and faster sequential cube construction. These optimizations were key in allowing us to build a prototype that is able to produce data cube output at a rate of over one terabyte per hour. Research supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
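
To make the notion of a data cube concrete, the sketch below computes one aggregate table for every subset of the dimensions, from the grand total up to the full group-by. It is a naive illustration only and is not the cgmCUBE method: it rescans the input for each subset, which is exactly the cost that optimized sequential and parallel cube algorithms aim to avoid. The function name `data_cube` and the sample column names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Return {group-by tuple: {key tuple: summed measure}}.

    rows    -- list of dicts, one per fact record
    dims    -- tuple of dimension column names
    measure -- name of the numeric column to sum
    """
    cube = {}
    # Enumerate every subset of the dimensions (2**len(dims) group-bys).
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


# Hypothetical sample data for illustration.
sales = [
    {"region": "east", "product": "a", "amount": 10},
    {"region": "east", "product": "b", "amount": 5},
    {"region": "west", "product": "a", "amount": 7},
]
cube = data_cube(sales, ("region", "product"), "amount")
```

With two dimensions the cube holds four group-bys: the empty one (grand total), one per single dimension, and the full `(region, product)` table.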

20.
A truly parallel logic programming system is proposed. The system is based on the commercially available parallel logic programming language STRAND, which has been extended in order to overcome the inherent limitations of such systems, like AND-type of parallelism, lack of backtracking, limited unification, etc. The system has been tested using an example from the area of natural language processing.
