首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
We consider efficiently routing permutations in a class of switch-based interconnects. Permutation is an important communication pattern in parallel and distributed computing systems. We present a generic approach to realizing arbitrary permutations in a class of unique-path, self-routable interconnects. It is well-known that this type of interconnect has low hardware cost, but can realize only a small portion of all possible permutations between its inputs and outputs in a single pass. We consider routing arbitrary permutations with link-disjoint paths and node-disjoint paths in such interconnects in a minimum number of passes. In particular, routing with node-disjoint paths has important applications in emerging optical interconnects. We employ and further expand the Latin square technique used in all-to-all personalized exchange algorithms for this class of interconnects for general permutation routing. Our implementation of permutation routing is optimal in terms of the number of passes that messages are transmitted through the network, and it is near-optimal in network transmission time for sufficiently long messages. The possibility of adopting a single-stage interconnect is also discussed.  相似文献   

2.
Recently, Chinn et al. [10] presented lower bounds for store-and-forward permutation routing algorithms on the n \times n mesh with bounded buffer size and where a packet must take a shortest (or minimal ) path to its destination. We extend their analysis to algorithms that are nearly minimal. We also apply this technique to the domain of hot potato algorithms, where there is no storage of packets and the shortest path to a destination is not assumed (and is in general impossible). We show that ``natural' variants and ``improvements' of several algorithms in the literature perform poorly in the worst case. As a result, we identify algorithmic features that are undesirable for worst-case hot potato permutation routing. Recent works in hot potato routing have tried to define simple and greedy classes of algorithms. We show that when an algorithm is too simple and too greedy, its performance in routing permutations is poor in the worst case. Specifically, the technique of [10] is also applicable to algorithms that do not necessarily send packets in minimal or even nearly minimal paths: it may be enough that they naively attempt to do so when possible. In particular, our results show that a certain class of greedy algorithms that was suggested recently by Ben-Dor et al. [6] contains algorithms that have poor performance in routing worst-case permutations. Received August 24, 1995; revised May 27, 1997.  相似文献   

3.
Permuting a vector is a fundamental primitive which arises in many applications. In particular, rational permutations, which are defined by permutations of the bits of the binary representations of the vector indices, are widely used. Matrix transposition and bit-reversal are notable examples of rational permutations. In this paper we contribute a number of results regarding the execution of these permutations in cache hierarchies, with particular emphasis on the cache-oblivious setting. We first bound from below the work needed to execute a rational permutation with an optimal cache complexity. Then, we develop a cache-oblivious algorithm to perform any rational permutation, which exhibits optimal work and cache complexities under the tall cache assumption. We finally show that for certain families of rational permutations (including matrix transposition and bit reversal) no cache-oblivious algorithm can exhibit optimal cache complexity for all values of the cache parameters. This latter result specializes the one proved by Brodal and Fagerberg for general permutations to the case of rational permutations, and provides further evidence that the tall cache assumption is often necessary to attain cache optimality in the context of cache-oblivious algorithms.  相似文献   

4.
In this paper we study the problem of the whole mirror duplication-random loss model in terms of pattern avoiding permutations. We prove that the class of permutations obtained with this model after a given number p of duplications of the identity is the class of permutations avoiding the alternating permutations of length p2+1. We also compute the number of duplications necessary and sufficient to obtain any permutation of length n. We provide two efficient algorithms to reconstitute a possible scenario of whole mirror duplications from identity to any permutation of length n. One of them uses the well-known binary reflected Gray code (Gray, 1953) [10]. Other relative models are also considered.  相似文献   

5.
We study a class of recursive permutation generation methods which construct a sequence of alln! permutations ofn elements by repeatedly generating all permutations of the elements in the firstn?1 positions and exchanging one of them with the element in then-th position. We give a general principle which enables us to obtain a whole class of permutation generation methods. This class includes the well-known algorithms of Wells and Heap as special cases, but contains also many new simple algorithms. Moreover, we are able to produce permutation generation methods with prescribed properties concerning the change that should be made in order to skip a block ofm! permutations with fixed elements in positionsm+1, …,n. For any method in our class, this change is a single transposition for any oddm>1, and a cyclic shift along a cycle of lengthm for any evenm.  相似文献   

6.
The so-called permutation separability criteria are simple operational conditions that are necessary for separability of mixed states of multipartite systems: (1) permute the indices of the density matrix and (2) check if the trace norm of at least one of the resulting operators is greater than one. If it is greater than one then the state is necessarily entangled. A shortcoming of the permutation separability criteria is that many permutations give rise to equivalent separability criteria. Therefore, we introduce a necessary condition for two permutations to yield independent criteria called combinatorial independence. This condition basically means that the map corresponding to one permutation cannot be obtained by concatenating the map corresponding to the second permutation with a norm-preserving map. We characterize completely combina-torially independent criteria, and determine simple permutations that represent all independent criteria. The representatives can be visualized by means of a simple graphical notation. They are composed of three basic operations: partial transpose, and two types of so-called reshufflings. In particular, for a four-partite system all criteria except one are composed of partial transpose and only one type of reshuffling; the exceptional one requires the second type of reshuffling. Furthermore, we show how to obtain efficiently a simple representative for every permutation. This method allows to check easily if two permutations are Combinatorially equivalent or not.  相似文献   

7.
In a multistage interconnection network (MIN) the calculation of the number of permutations of the input terminals into the output terminals is a classic difficult problem. In this paper, we introduce an innovative technique based on Colored Petri Nets (known as CP-nets or CPNs) that will allow us to analyze the permutation capability of arbitrary MINs. We show how to verify whether a MIN is rearrangeable through the state space analysis of the associated CP-net and we measure the permutation capability of non-rearrangeable MINs in terms of the permutations that can be generated. The proposed approach takes advantage of powerful existing software tools, particularly, CPNTools, which is used to explore the occurrence graphs of CP-nets and determine the set of permutations performed by the modeled MINs. This new technique is easy to use and can be efficiently applied to MINs made of cross-bar switches.  相似文献   

8.
We consider using third-order equational methods to formally verify that an infinite systolic algorithm correctly implements a family of convolution functions. The detailed case study we present illustrates the use of third-order algebra as a formal framework for developing families of computing systems. It also provides an interesting insight into the use of infinite algorithms as a means of verifying a family of finite algorithms. We consider using purely equational reasoning in our verification proofs and in particular, using the rule of free variable induction. We conclude by considering how our verification proofs can be automated using rewriting techniques.  相似文献   

9.
This paper describes deterministic communication-efficient algorithms for performing dynamic permutations on a coarse-grained parallel machine. Our analysis shows that the general permutation operation can be completed inCμn/p(+ lower order terms) time and is optimal and scalable providednp3+p2τ/μ (nis the size of the permutation or the number of elements distributed across thepprocessors, τ is the start-up overhead, and 1/μ is the data transfer rate).Cis a small constant typically between 2 and 3 for write permutations, slightly higher for read permutations. Modifications to exploit locality of access are presented. Special classes of permutations that are optimal for smaller sizes are also described. A companion paper [22] deals with the problem of random data accesses with hot spots.  相似文献   

10.
Abstract Congruence Closure   总被引:3,自引:0,他引:3  
We describe the concept of an abstract congruence closure and provide equational inference rules for its construction. The length of any maximal derivation using these inference rules for constructing an abstract congruence closure is at most quadratic in the input size. The framework is used to describe the logical aspects of some well-known algorithms for congruence closure. It is also used to obtain an efficient implementation of congruence closure. We present experimental results that illustrate the relative differences in performance of the different algorithms. The notion is extended to handle associative and commutative function symbols, thus providing the concept of an associative-commutative congruence closure. Congruence closure (modulo associativity and commutativity) can be used to construct ground convergent rewrite systems corresponding to a set of ground equations (containing AC symbols). This revised version was published online in August 2006 with corrections to the Cover Date.  相似文献   

11.
In the early 1980s, there was a number of papers on what should be called proofs by consistency. They describe how to perform inductive proofs, without using an explicit induction scheme, in the context of equational specifications and ground-convergent rewrite systems. The method was explicitly stated as a first-order consistency proof in the case of pure equational, constructor-based specifications. In this paper, we show how, in general, inductive proofs can be reduced to first-order consistency and hence be performed by a first-order theorem prover. Moreover, we extend previous methods, allowing nonequational specifications (even non-Horn specifications) and designing some specific strategies. Finally, we also show how to drop the ground convergence requirement (which is called Saturatedness for general clauses).  相似文献   

12.
With the advent of new routing methods, the distance that a message is sent is becoming relatively less and less important. Thus, assuming no link contention, permutation seems to be an efficient collective communication primitive. In this paper, we present several algorithms for decomposing all-to-many personalized communication into a set of disjoint partial permutations. We discuss several algorithms and study their effectiveness from the view of static scheduling as well as run-time scheduling. An approximate analysis shows that with n processors, and assuming that every processor sends and receives d messages to random destinations, our algorithm can perform the scheduling in O(dn In d) time, on average, and can use an expected number of d+log d partial permutations to carry out the communication. We present experimental results of our algorithms on the CM-5  相似文献   

13.
Permutation is a frequently-used communication pattern in parallel and distributed computing systems and telecommunication networks. Node-disjoint routing has important applications in guided wave optical interconnects where the optical "crosstalk" between messages passing the same switch should be avoided. In this paper, we consider routing arbitrary permutations on an optical baseline network (or reverse baseline network) with node-disjoint paths. We first prove the equivalence between the set of admissible permutations (or semipermutations) of a baseline network and that of its reverse network based on a step-by-step permutation routing. We then show that an arbitrary permutation can be realized in a baseline network (or a reverse baseline network) with node-disjoint paths in four passes, which beats the existing results [M. Vaez et al., (2000)], [G. Maier et al., (2001)] that a permutation can be realized in an n /spl times/ n banyan network with node-disjoint paths in O(n/sup 1/2/) passes. This represents the currently best-known result for the number of passes required for routing an arbitrary permutation with node-disjoint paths in unique-path multistage networks. Unlike other unique path MINs (such as omega networks or banyan networks), only baseline networks have been found to possess such four-pass routing property. We present routing algorithms in both self-routing style and central-controlled style. Different from the recent work in [Y. Yang et al., (2003)], which also gave a four-pass node-disjoint routing algorithm for permutations, the new algorithm is efficient in transmission time for messages of any length, while the algorithm in [Y. Yang et al., (2003)] can work efficiently only for long messages. Comparisons with previous results demonstrate that routing in a baseline network proposed in this paper could be a better choice for routing permutations due to its lowest hardware cost and near-optimal transmission time.  相似文献   

14.
Many common machine learning methods such as support vector machines or Gaussian process inference make use of positive definite kernels, reproducing kernel Hilbert spaces, Gaussian processes, and regularization operators. In this work these objects are presented in a general, unifying framework and interrelations are highlighted.With this in mind we then show how linear stochastic differential equation models can be incorporated naturally into the kernel framework. And vice versa, many kernel machines can be interpreted in terms of differential equations. We focus especially on ordinary differential equations, also known as dynamical systems, and it is shown that standard kernel inference algorithms are equivalent to Kalman filter methods based on such models.In order not to cloud qualitative insights with heavy mathematical machinery, we restrict ourselves to finite domains, implying that differential equations are treated via their corresponding finite difference equations.  相似文献   

15.
In-situ permutation algorithms permute an array of n items without using a second array. Little space is needed when one proceeds permuting along a cycle. Thus, those algorithms first search for one element called the leader on each cycle of the permutation. Then they move the items cycle by cycle. A great fraction of the runtime is spent with testing whether elements are leaders. We present a simple-to-implement heuristic to accelerate the search for leaders and present performance gains for randomly chosen permutations on three algorithms.  相似文献   

16.
Most of the work on the combination of unification algorithms for the union of disjoint equational theories has been restricted to algorithms that compute finite complete sets of unifiers.Thus the developed combination methods usually cannot be used to combine decision procedures,i.e., algorithms that just decide solvability of unification problems without computing unifiers.In this paper we describe a combination algorithm for decision procedures that works for arbitrary equational theories, provided that solvability of so–called unification problems with constant restrictions—a slight generalization of unification problems with constants—is decidable for these theories.As a consequence of this new method, we can, for example, show that generalA-unifiability, i.e.,solvability ofA-unification problems with free function symbols, is decidable. HereAstandsfor the equational theory of one associative function symbol.Our method can also be used to combine algorithms that compute finite complete sets of unifiers. Manfred Schmidt–Schauß' combination result, the until now most general result in this direction, can be obtained as a consequence of this fact.We also obtain the new result that unification in the union of disjoint equational theories is finitary, if general unification—i.e., unification of terms with additional free function symbols—is finitary in the single theories.  相似文献   

17.
We define an equational relation as the union of some components of the least solution of a system of equations of tree transformations in a pair of algebras. We focus on equational tree transformations which are equational relations obtained by considering the least solutions of such systems in pairs of term algebras. We characterize equational tree transformations in terms of tree transformations defined by different bimorphisms. To demonstrate the robustness of equational tree transformations, we give equational definitions of some well-known tree transformation classes for which bimorphism characterizations also exist. These are the class of alphabetic tree transformations, the class of linear and nondeleting extended top-down tree transformations, and the class of bottom-up tree transformations and its linear and linear and nondeleting subclasses. Finally, we prove that a relation is equational if and only if it is the morphic image of an equational tree transformation.  相似文献   

18.
As the VLSI technology makes large crossbar switches affordable, Clos networks have become a feasible option of large interconnection networks. However, to make these networks practical and useful, efficient routing algorithms need to be developed. This paper will develop and study several randomized routing algorithms for Clos networks. The algorithms are based on the idea that if the first column of Clos is set to some configuration somehow, then the resulting network becomes self-routed using the destination addresses. Each of the randomized algorithms sets the first column to a configuration selected by a random process. The algorithms are then self-routed and take no computation time to set the switches. Probabilistic analysis and simulation measurements of the communication delay of permutation routing are conducted. It is shown that the communication delay of any permutation is 3–6 cycles in networks of up to 1024 processors. Although other routing algorithms route arbitrary permutations in one cycle over Clos/Benes networks and 2 cycles over δ networks, these algorithms take prohibitively large times to compute the appropriate switch settings, while our randomized algorithms are self-routed and spend NO time on computing the switch settings. This makes our algorithms superior to any universal nonrandomized routing algorithm for Clos/Benes networks or δ networks. The speed, universality and ease of implementation of our randomized algorithms make Clos networks highly attractive for large parallel computer systems.  相似文献   

19.
We develop a general study of the algebraic specification practice, originating from the OBJ tradition, which encodes atomic sentences in logical specification languages as Boolean terms. This practice originally motivated by operational aspects, but also leading to significant increase in expressivity power, has recently become important within the context of some formal verification methodologies mainly because it allows the use of simple equational reasoning for frameworks based on logics that do not have an equational nature. Our development includes a generic rigorous definition of the logics underlying the above mentioned practice, based on the novel concept of ‘quasi-Boolean encoding’, a general result on existence of initial semantics for these logics, and presents a general method for employing Birkhoff calculus of conditional equations as a sound calculus for these logics. The high level of generality of our study means that the concepts are introduced and the results are obtained at the level of abstract institutions (in the sense of Goguen and Burstall [12]) and are therefore applicable to a multitude of logical systems and environments.  相似文献   

20.
An ILS algorithm is proposed to solve the permutation flowshop sequencing problem with total flowtime criterion. The effects of different initial permutations and different perturbation strengths are studied. Comparisons are carried out with three constructive heuristics, three ant-colony algorithms and a particle swarm optimization algorithm. Experiments on benchmarks and a set of random instances show that the proposed algorithm is more effective. The presented ILS improves the best known permutations by a significant margin.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号