首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Aperiodic sorteris a sorting network that is a cascade of a number of identicalblocks, where outputiof each block is inputiof the next block. Previously, Dowd, Perl, Rudolph, and Saks introduced thebalancedmerging network, withN=2kinputs/outputs and log Nstages of comparators. Using a very intricate proof, they showed that a cascade of log Nsuch blocks constitutes a sorting network. In this paper, we introduce a large class of merging networks with the same periodic property. This class contains 2N/2−1networks, whereN=2kis the number of inputs. The balanced merger is one network in this class. Other networks use fewer comparators. We provide a very simple and elegant correctness proof based on the recursive structure of the networks.  相似文献   

2.
We consider the problem of merging two sorted sequences on constant degree networks performing compare—exchange operations only. The classical solution to this problem is given by the networks based on Batcher's Odd—Even Merge and Bitonic Merge running in log(2n ) time. Due to the obvious log n lower bound for the runtime, this is time-optimal. We present a new family of merging networks working in a completely different way than the previously known algorithms. They have a novel property of being periodic: this means that for some (small) constant k , each processing unit of the network performs the same operations at steps t and t+k (as long as t+k does not exceed the runtime). The only operations executed are compare—exchange operations, just like in the case of Batcher's networks. The architecture of the networks is very simple and easy to lay out. We show that even for period 3 there is a network in our family merging two n -element sorted sequences in time O(log n ). Since each network of period 2 requires steps to merge such sequences, 3 is the smallest period for which we may achieve a fast runtime. In order to improve constants standing in front of log n we increment the period and tune the construction using additional techniques. We achieve the runtime 9 . . . log_3 n 5.7 . . . log n for a network of period 4, and 2.25 . . . ((k+3)/(k-1+log 3))log n 2.25 . . . log n for a network of period k+3 , for . Due to the periodic design, our networks have small area complexity. For instance, if each processing unit requires O(1) area and a comparator uses a single wire of width O(1) connecting the processing elements, then our networks require area. This compares well with Batcher's networks that require area . Received February 1997, and in revised form September 1997, and in final form February 1998.  相似文献   

3.
With the popularity of parallel database machines based on the shared-nothing architecture, it has become important to find external sorting algorithms which lead to a load-balanced computation, i.e., balanced execution, communication and output. If during the course of the sorting algorithm each processor is equally loaded, parallelism is fully exploited. Similarly, balanced communication will not congest the network traffic. Since sorting can be used to support a number of other relational operations (joins, duplicate elimination, building indexes etc.) data skew produced by sorting can further lead to execution skew at later stages of these operations. In this paper we present a load-balanced parallel sorting algorithm for shared-nothing architectures. It is a multiple-input multiple-output algorithm with four stages, based on a generalization of Batcher's odd-even merge. At each stage then keys are evenly distributed among thep processors (i.e., there is no final sequential merge phase) and the distribution of keys between stages ensures against network congestion. There is no assumption made on the key distribution and the algorithm performs equally well in the presence of duplicate keys. Hence our approach always guarantees its performance, as long asn is greater thanp 3, which is the case of interest for sorting large relations. In addition, processors can be added incrementally. Recommended by: Patrick Valduriez  相似文献   

4.
A lower bound theorem is established for the number of comparators in a merging network. Let M(m, n) be the least number of comparators required in the (m, n)-merging networks, and let C(m, n) be the number of comparators in Batcher's (m, n)-merging network, respectively. We prove for n≥1 that M(4, n)=C(4, n) for n≡0, 1, 3 mod 4, M(4, n)≥C(4, n)−1 for n≡2 mod 4, and M(5, n)=C(5, n) for n≡0, 1, 5 mod 8. Furthermore Batcher's (6, 8k+6)-, (7, 8k+7)-, and (8, 8k+8)-merging networks are optimal for k≥0. Our lower bound for (m, n)-merging networks, mn, has the same terms as C(m, n) has as far as n is concerned. Thus Batcher's (m, n)-merging network is optimal up to a constant number of comparators, where the constant depends only on m. An open problem posed by Yao and Yao (Lower bounds on merging networks, J. Assoc. Comput. Mach.23, 566–571) is solved: limn→∞M(m, n)/n=log m/2+m/2log m.  相似文献   

5.
The interconnection network considered in this paper is the k-ary n-cube that is an attractive variance of the well-known hypercube. Many interconnection networks can be viewed as the subclasses of the k-ary n-cubes include the cycle, the torus and the hypercube. A bipartite graph is Hamiltonian laceable if there exists a Hamiltonian path joining every two vertices which are in distinct partite sets. A bipartite graph G is strongly Hamiltonian laceable if it is Hamiltonian laceable and there exists a path of length N − 2 joining each pair of vertices in the same partite set, where N = |V(G)|. We prove that the k-ary n-cube is strongly Hamiltonian laceable for k is even and n  2.  相似文献   

6.
分组排序算法   总被引:3,自引:0,他引:3       下载免费PDF全文
提出了分组排序算法,详细分析了算法的原理及其时间与空间复杂度,得出了在最坏情况下的时间复杂度是θmn);最好情况和平均情况下的时间复杂度均是θnlog(n/mk));在最坏情况下的空间复杂度是O(mn-m2m);最好情况和平均情况下的空间复杂度均是O(mklog(n/mk));并用多组随机数据与效率较高的快速算法进行仿真对比实验,试验结果说明了文中结论的正确性。这一结果,将有助于进一步设计高效的海量数据分析方法。  相似文献   

7.
The computational complexity of a parallel algorithm depends critically on the model of computation. We describe a simple and elegant rule-based model of computation in which processors apply rules asynchronously to pairs of objects from a global object space. Application of a rule to a pair of objects results in the creation of a new object if the objects satisfy the guard of the rule. The model can be efficiently implemented as a novel MIMD array processor architecture, the Intersecting Broadcast Machine. For this model of computation, we describe an efficient parallel sorting algorithm based on mergesort. The computational complexity of the sorting algorithm isO(nlog2 n), comparable to that for specialized sorting networks and an improvement on theO(n 1.5) complexity of conventional mesh-connected array processors.  相似文献   

8.
In [C.H. Tsai, S.Y. Jiang, Path bipancyclicity of hypercubes, Inform. Process. Lett. 101 (2007) 93–97], the authors showed that any path in an n-cube with length of k, 2k2n−4, lies on a cycle of every even length from 2k to 2n inclusive. Base on Lemma 5 of that paper, they proved the subcase 2.2.1 of the main theorem of that paper. However, the lemma is false, therefore, we propose a lemma to replace that lemma. Therefore, the main result of [C.H. Tsai, S.Y. Jiang, Path bipancyclicity of hypercubes, Inform. Process. Lett. 101 (2007) 93–97] is still correct.  相似文献   

9.
We consider the problem of exploring an anonymous line by a team of k identical, oblivious, asynchronous deterministic mobile robots that can view the environment but cannot communicate. We completely characterize sizes of teams of robots capable of exploring an n-node line. For k<n, exploration by k robots turns out to be possible, if and only if either k=3, or k?5, or k=4 and n is odd. For all values of k for which exploration is possible, we give an exploration algorithm. For all others, we prove an impossibility result.  相似文献   

10.
Parallel merge sort is useful for sorting a large quantity of data progressively. The merge sort should be parallelized carefully since the conventional algorithm has poor performance due to the successive reduction of the number of participating processors by half, and down to one in the last merging stage. The proposed load-balanced merge sort utilizes all processors throughout the computation. It evenly distributes data to all processors in each stage. Thus every processor is forced to work in all phases. Significant performance enhancement has been achieved up to a speedup of (P–1)/log P where P is the number of processors. Experimental results demonstrate a speedup of 9.6 (upper bound of 10.7) on 32-processor Cray T3E when sorting 4M 32-bit integers, and a speed up of 2.3 (upper bound of 2.8) on an 8-node PC cluster.  相似文献   

11.
Adaptivity in sorting algorithms is sometimes gained at the expense of practicality. We give experimental results showing that Splaysort — sorting by repeated insertion into a Splay tree — is a surprisingly efficient method for in-memory sorting. Splaysort appears to be adaptive with respect to all accepted measures of presortedness, and it outperforms Quicksort for sequences with modest amounts of existing order. Although Splaysort has a linear space overhead, there are many applications for which this is reasonable. In these situations Splaysort is an attractive alternative to traditional comparison-based sorting algorithms such as Heapsort, Mergesort, and Quicksort.  相似文献   

12.
Building on previous works, this paper establishes that the minimal depth of a Bitonic sorter of n keys is 2⌈log(n)⌉−⌊log(n)⌋.  相似文献   

13.
The sorting network described by Ajtaiet al. was the first to achieve a depth ofO(logn). The networks introduced here are simplifications and improvements based strongly on their work. While the constants obtained for the depth bound still prevent the construction being of practical value, the structure of the presentation offers a convenient basis for further development.The author was supported by a Senior Fellowship from the Science and Engineering Research Council during the course of this work.  相似文献   

14.
A modified counter-propagation (CP) algorithm with supervised learning vector quantizer (LVQ) and dynamic node allocation has been developed for rapid classification of molecular sequences. The molecular sequences were encoded into neural input vectors using an n–gram hashing method for word extraction and a singular value decomposition (SVD) method for vector compression. The neural networks used were three-layered, forward-only CP networks that performed nearest neighbor classification. Several factors affecting the CP performance were evaluated, including weight initialization, Kohonen layer dimensioning, winner selection and weight update mechanisms. The performance of the modified CP network was compared with the back-propagation (BP) neural network and the k–nearest neighbor method. The major advantages of the CP network are its training and classification speed and its capability to extract statistical properties of the input data. The combined BP and CP networks can classify nucleic acid or protein sequences with a close to 100% accuracy at a rate of about one order of magnitude faster than other currently available methods.  相似文献   

15.
The traditional zero-one principle for sorting networks states that “if a network with n input lines sorts alln2 binary sequences into nondecreasing order, then it will sort any arbitrary sequence of n numbers into nondecreasing order”. We generalize this to the situation when a network sorts almost all binary sequences and relate it to the behavior of the sorting network on arbitrary inputs. We also present an application to mesh sorting.  相似文献   

16.
A bipartite graph is vertex-bipancyclic (respectively, edge-bipancyclic) if every vertex (respectively, edge) lies in a cycle of every even length from 4 to |V(G)| inclusive. It is easy to see that every connected edge-bipancyclic graph is vertex-bipancyclic. An n-dimensional hypercube, or n-cube denoted by Qn, is well known as bipartite and one of the most efficient networks for parallel computation. In this paper, we study a stronger bipancyclicity of hypercubes. We prove that every n-dimensional hypercube is (2n−4)-path-bipancyclic for n?3. That is, for any path P of length k with 1?k?2n−4 and any integer l with max{2,k}?l?2n−1, an even cycle C of length 2l can be found in Qn such that the path P is included in C for n?3.  相似文献   

17.
We study the problem of dynamic tree embedding ink-partite networksGkand analyze the performance on interpartition load distribution of the embedding. We show that, for ring-connectedGk, if the embedding proceeds by taking a unidirectional random walk at a length randomly chosen from [0, Δ − 1], where Δ is a multiple ofk, the best-case performance is achievable at probability ek, which is much higher than the asymptotically zero probability at which the worst-case performance may appear. We also show that the same probabilities hold for fully connectedGkif the embedding proceeds by taking a random walk at a length randomly chosen from [2, ∞). Whenk= 2 (bipartite networks), our results show that if we do the embedding under the above random-walk schemes in their corresponding networks, we will have a 50% chance to achieve the best-case performance. We also analyze the performances for embedding in these networks in the expected case and observe the interesting fact that they match the performances in the best case when the network isk-partitionable into partitions of equal size.  相似文献   

18.
We consider the problem of determining constructions with an asymptotically optimal oblivious diameter in small world graphs under the Kleinberg’s model. In particular, we give the first general lower bound holding for any monotone distance distribution, that is induced by a monotone generating function. Namely, we prove that the expected oblivious diameter is Ω(log 2 n) even on a path of n nodes. We then focus on deterministic constructions and after showing that the problem of minimizing the oblivious diameter is generally intractable, we give asymptotically optimal solutions, that is with a logarithmic oblivious diameter, for paths, trees and Cartesian products of graphs, including d-dimensional grids for any fixed value of d. The research was partially funded by the European project COST Action 293, “Graphs and Algorithms in Communication Networks” (GRAAL).  相似文献   

19.
We prove an (lg2 n/lg lgn) lower bound for the depth ofn-input sorting networks based on the shuffle permutation. The best previously known lower bound was the trivial (lgn) bound, while the best upper bound is given by Batcher's (lg2 n)-depth bitonic sorting network. The proof technique employed in the lower bound argument may be of independent interest.C. G. Plaxton was supported by NSF Research Initiation Award CCR-9111591, and the Texas Advanced Research Program under Grant No. 003658-480. T. Suel was supported by the Texas Advanced Research Program under Grant No. 003658-480, and by an MCD Fellowship of the University of Texas at Austin.  相似文献   

20.
Squares are strings of the form ww where w is any nonempty string. Main and Lorentz proposed an O(nlogn)-time algorithm for finding the positions of all squares in a string of length n. Based on their result, we show how to find the positions of all squares in a run-length encoded string in time O(NlogN) where N is the number of runs in this string, provided that we do not explicitly compute at all “trivial squares” occurring within runs. The algorithm is optimal and its time complexity is independent of the length of the original uncompressed string.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号