1.

《Journal of Parallel and Distributed Computing》1994,21(2):237-245

A class of recursive sorting networks based on mod(s) m-merging, a generalization of Batcher's odd-even merging, is studied in this paper. Iterative decomposition of a sorter into smaller comparator modules, an equivalent problem but in a different setting, is also resolved based on the scheme proposed here. We first show that it is impossible to generalize the principle of odd-even merging in a three-stage network. Then the condition on generalized mod(s) m-merging with four stages is proven. This scheme degenerates to the three-stage odd-even merging network when m = 2 and s = 2. An algorithm for computing the optimal configuration is also given. Applications of this sorting network include parallel relational database computers and modular construction of sorting networks for large packet switches.
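
As a point of reference for the abstract above, the following is a minimal Python sketch of the classical two-way Batcher odd-even merge sorting network, not the mod(s) m-merging scheme of the paper. The function names (`oddeven_merge`, `oddeven_merge_sort`, `apply_network`, `is_sorting_network`) are illustrative choices, and the generator assumes the number of inputs is a power of two. Correctness is checked with the zero-one principle.

```python
from itertools import product


def oddeven_merge(lo, hi, r):
    """Yield comparators merging the two sorted halves of wires lo..hi (inclusive),
    looking only at every r-th wire."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)       # even-indexed subsequence
        yield from oddeven_merge(lo + r, hi, step)   # odd-indexed subsequence
        for i in range(lo + r, hi - r, step):        # final clean-up stage
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def oddeven_merge_sort_range(lo, hi):
    """Yield comparators that sort wires lo..hi (inclusive)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort_range(lo, mid)
        yield from oddeven_merge_sort_range(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)


def oddeven_merge_sort(n):
    """Comparator list for n inputs; n is assumed to be a power of two."""
    yield from oddeven_merge_sort_range(0, n - 1)


def apply_network(network, values):
    """Run a comparator network on a sequence and return the result."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def is_sorting_network(network, n):
    """Zero-one principle: a network sorts all inputs iff it sorts all 0/1 inputs."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product([0, 1], repeat=n))


network = list(oddeven_merge_sort(8))
print(len(network), is_sorting_network(network, 8))
```

For four inputs the generator yields the comparators (0,1), (2,3), (0,2), (1,3), (1,2), which is the familiar five-comparator network.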

2.

On Probabilistic Networks for Selection, Merging, and Sorting (total citations: 1; self-citations: 0; citations by others: 1)

T. Leighton Y. Ma T. Suel 《Theory of Computing Systems》1997,30(6):559-582

We study comparator networks for selection, merging, and sorting that output the correct result with high probability, given a random input permutation. We prove tight bounds, up to constant factors, on the size and depth of probabilistic (n,k)-selection networks. In the case of (n, n/2)-selection, our result gives a somewhat surprising bound of […] on the size of networks of success probability in […], where δ is an arbitrarily small positive constant, thus comparing favorably with the best previously known solutions, which have size […]. We also prove tight bounds, up to lower-order terms, on the size and depth of probabilistic merging networks of success probability in […], where δ is an arbitrarily small positive constant. Finally, we describe two fairly simple probabilistic sorting networks of success probability at least […] and nearly logarithmic depth. Received January 22, 1996, and in final form February 14, 1997.

3.

Sorting with near linear speed-up on tightly coupled multiprocessors

Mitchell Wheat 《Concurrency and Computation》1991,3(1):1-13

A new parallel sorting algorithm, called parsort, suitable for implementation on tightly coupled multiprocessors is presented. The algorithm is based upon quicksort and two-way merging. An asynchronous parallel partitioning algorithm is used to distribute work evenly during merging to ensure a good load balance amongst processors, which is crucial if we are to achieve high efficiency. The implementation of this parallel sorting algorithm exhibits theoretical and measured near linear speed-up when compared to sequential quicksort. This is illustrated by the results of experiments carried out on the Sequent Balance 8000 multiprocessor.

4.

Optimized Implementation of Sorting Algorithms on Loongson 3A

Weng Yuping Gu Naijie Li Kai Chen Qiang 《Computer Engineering》2011,37(20):255-257

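This paper analyzes the merge sort and quicksort algorithms and, based on the architectural characteristics of the Chinese-designed Loongson 3A CPU, proposes and implements two optimized algorithms. Exploiting memory-access characteristics, it introduces optimization techniques including copy optimization, loop unrolling, swap-operation optimization, and the mixing of different basic sorts. Test results show that, without affecting sorting stability, both optimized algorithms improve sorting performance by 16.9% to 90.5% compared with the sorting functions in the Glibc 2.11 library.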

5.

Optimization of the Fabric Ordering Mechanism Based on Corresponding Comparison Graphs

Liu Runde Chen Zhide 《Computer Systems & Applications》2023,32(5):323-329

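To address the shortcomings of the HLF (Hyperledger Fabric) blockchain system in its ordering phase, a graph-based ordering optimization scheme built on the corresponding comparison graph is proposed. Exploiting the graph-merging process of the corresponding comparison graph, which preserves the relevant invariants, and the short running time of its algorithm, a topological algorithm based on transaction importance is designed to reduce the serialization conflicts caused by the default sequential ordering. Experiments and analysis show that the scheme effectively resolves the serialization conflicts of the original scheme, reduces the proportion of invalid transactions in the system, improves the system's transaction efficiency, and saves a large amount of computing and storage resources.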

6.

Evolutionary Design of Arbitrarily Large Sorting Networks Using Development

Lukáš Sekanina Michal Bidlo 《Genetic Programming and Evolvable Machines》2005,6(3):319-347

An evolutionary algorithm is combined with an application-specific developmental scheme in order to evolve efficient arbitrarily large sorting networks. First, a small sorting network (that we call the embryo) has to be prepared to solve the trivial instance of a problem. Then the evolved program (the constructor) is applied on the embryo to create a larger sorting network (solving a larger instance of the problem). Then the same constructor is used to create a new instance of the sorting network from the created larger sorting network and so on. The proposed approach allowed us to rediscover the conventional principle of insertion which is traditionally used for constructing large sorting networks. Furthermore, the principle was improved by means of the evolutionary technique. The evolved sorting networks exhibit a lower implementation cost and delay.
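
The conventional insertion principle mentioned in the abstract above can be sketched in a few lines of Python. This is only the textbook construction, not the evolved constructor from the paper; the names `embryo`, `grow`, and `build` are hypothetical and chosen to mirror the abstract's vocabulary. A network that sorts n wires is extended to n + 1 wires by appending a chain of comparators that moves the new element down into place.

```python
from itertools import product


def embryo():
    """The trivial instance: a 2-input sorting network with one comparator."""
    return [(0, 1)]


def grow(network, n):
    """Given a network sorting wires 0..n-1, return one sorting wires 0..n.

    The appended chain (n-1,n), (n-2,n-1), ..., (0,1) inserts the value on the
    new wire n into the already sorted prefix.
    """
    chain = [(i - 1, i) for i in range(n, 0, -1)]
    return network + chain


def build(n):
    """Apply the same constructor repeatedly, starting from the embryo (n >= 2)."""
    network = embryo()
    for size in range(2, n):
        network = grow(network, size)
    return network


def is_sorting_network(network, n):
    """Zero-one principle check over all 2**n binary inputs."""
    for bits in product([0, 1], repeat=n):
        out = list(bits)
        for i, j in network:
            if out[i] > out[j]:
                out[i], out[j] = out[j], out[i]
        if out != sorted(bits):
            return False
    return True


print(len(build(6)), is_sorting_network(build(6), 6))
```

This construction uses n(n-1)/2 comparators, which is the baseline cost that an improved constructor would have to beat.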

7.

A Superlogarithmic Lower Bound for Shuffle-Unshuffle Sorting Networks

C. G. Plaxton T. Suel 《Theory of Computing Systems》2000,33(3):233-254

Shuffle-unshuffle sorting networks are a class of comparator networks whose structure maps efficiently to the hypercube and any of its bounded degree variants. Recently, n-input shuffle-unshuffle sorting networks with depth […] have been discovered. These networks are the only known sorting networks of depth o(lg² n) that are not based on expanders, and their existence raises the question of whether a depth of O(lg n) can be achieved by any shuffle-unshuffle sorting network. In this paper we resolve this question by establishing an Ω(lg n lg lg n / lg lg lg n) lower bound on the depth of any n-input shuffle-unshuffle sorting network. Our lower bound can be extended to certain restricted classes of nonoblivious sorting algorithms on hypercubic machines. Received September 9, 1999, and in final form December 20, 1999.

8.

A Wait-Free Sorting Algorithm

N. Shavit E. Upfal A. Zemach 《Theory of Computing Systems》2001,34(6):519-544

Sorting is one of a set of fundamental problems in computer science. In this paper we present the first wait-free algorithm for sorting an input array of size N using P ≤ N processors to achieve optimal running time. We show two variants of the algorithm, one deterministic and one randomized, and prove that, with high probability, the latter suffers no more than […] contention when run synchronously. Known sorting algorithms, when made wait-free through previously established transformation techniques, have complexity O(log³ N). The algorithm we present here, when run in the CRCW PRAM model, executes with high probability in O(log N) time when P = N, and O((N log N)/P) otherwise, which is optimal amongst comparison-based sorting algorithms. The wait-free property guarantees that the sort will complete despite any delays or failures incurred by the processors. This is a very desirable property from an operating systems point of view, since it allows oblivious thread scheduling as well as thread creation and deletion, without fear of losing the algorithm's correctness. Received May 15, 1998, and in revised form November 17, 1999. Online publication November 19, 2001.

9.

Quantum Complexities of Ordered Searching, Sorting, and Element Distinctness

Hoyer Neerbek Shi 《Algorithmica》2002,34(4):429-448

We consider the quantum complexities of the following three problems: searching an ordered list, sorting an un-ordered list, and deciding whether the numbers in a list are all distinct. Letting N be the number of elements in the input list, we prove a lower bound of (1/π)(ln(N) - 1) accesses to the list elements for ordered searching, a lower bound of Ω(N log N) binary comparisons for sorting, and a lower bound of […] binary comparisons for element distinctness. The previously best known lower bounds are (1/12) log₂(N) - O(1) due to Ambainis, Ω(N), and […], respectively. Our proofs are based on a weighted all-pairs inner product argument. In addition to our lower bound results, we give an exact quantum algorithm for ordered searching using roughly 0.631 log₂(N) oracle accesses. Our algorithm uses a quantum routine for traversing through a binary search tree faster than classically, and it is of a nature very different from a faster exact algorithm due to Farhi, Goldstone, Gutmann, and Sipser.

10.

CPN-Based Modeling of a Dynamic Multiway-Merge External Sorting Algorithm

WU Jian-qiang LUO Wen-jun 《Computer and Modernization》2008,(8):110-112

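This paper introduces the basic concepts of CPN (Colored Petri Nets) and uses CPN to model a dynamic, concurrent multiway-merge external sorting algorithm. The algorithm uses multiple buffers to hide the waiting latency of reading external files; by adjusting the size and number of buffers, the best performance can be obtained on different machines.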
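
The merging step that such a model describes can be illustrated with a small sequential Python sketch. It is not the CPN model from the paper and involves no concurrency: in-memory lists stand in for the sorted runs on disk, and the helper names `read_buffered` and `multiway_merge` as well as the `buffer_size` parameter are assumptions made for illustration.

```python
import heapq

_END = object()  # sentinel marking an exhausted run


def read_buffered(run, buffer_size):
    """Yield the items of one sorted run, fetching buffer_size items at a time.

    Each slice stands in for one read of an external file into a buffer.
    """
    for start in range(0, len(run), buffer_size):
        buffer = run[start:start + buffer_size]
        yield from buffer


def multiway_merge(runs, buffer_size=4):
    """Merge any number of sorted runs into one sorted stream using a min-heap."""
    readers = [read_buffered(run, buffer_size) for run in runs]
    heap = []
    for idx, reader in enumerate(readers):
        first = next(reader, _END)
        if first is not _END:
            heap.append((first, idx))
    heapq.heapify(heap)
    while heap:
        value, idx = heapq.heappop(heap)   # smallest head among all runs
        yield value
        following = next(readers[idx], _END)
        if following is not _END:
            heapq.heappush(heap, (following, idx))


runs = [[1, 4, 7], [2, 5, 8], [0, 3, 6, 9]]
print(list(multiway_merge(runs, buffer_size=2)))
```

In a real external sort the buffer size and the number of runs merged at once would be tuned to the machine's memory and disk behavior, which is the adjustment the abstract refers to.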

11.

Parallel Self-Index Integer Sorting

Hazem M. Bahig Sameh S. Daoud Mahmoud K. A. Khairat 《The Journal of supercomputing》2002,22(3):269-275

We consider the problem of sorting n integers when the elements are drawn from the restricted domain [1...n]. A new deterministic parallel algorithm for sorting n integers is obtained. Its running time is O(log n log(n/log n)) using n/log n processors on EREW (exclusive read exclusive write) PRAM (parallel random access machine). Also, our algorithm was modified to become optimal when we use […] processors. This algorithm belongs to class EP (Efficient, Polynomial fast).
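
To make the problem setting concrete, here is a sequential counting-sort sketch in Python for n integers drawn from [1...n]. It is not the parallel self-index algorithm of the paper; it only shows why the restricted domain allows sorting in O(n) sequential time without comparisons. The function name `sort_restricted_domain` is hypothetical.

```python
def sort_restricted_domain(values):
    """Sort n integers that all lie in the range 1..n using counting."""
    n = len(values)
    counts = [0] * (n + 1)          # counts[v] = number of occurrences of v
    for v in values:
        if not 1 <= v <= n:
            raise ValueError(f"value {v} is outside the domain [1, {n}]")
        counts[v] += 1
    result = []
    for v in range(1, n + 1):       # emit each value as often as it occurred
        result.extend([v] * counts[v])
    return result


print(sort_restricted_domain([3, 1, 4, 1, 5]))
```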

12.

A Framework for Simple Sorting Algorithms on Parallel Disk Systems

S. Rajasekaran 《Theory of Computing Systems》2001,34(2):101-114

In this paper we present a simple parallel sorting algorithm and illustrate its application in general sorting, disk sorting, and hypercube sorting. The algorithm (called the (l,m)-mergesort (LMM)) is an extension of the bitonic and odd-even mergesorts. Literature on parallel sorting is abundant. Many of the algorithms proposed, though being theoretically important, may not perform satisfactorily in practice owing to large constants in their time bounds. The algorithm presented in this paper has the potential of being practical. We present an application to the parallel disk sorting problem. The algorithm is asymptotically optimal (assuming that N is a polynomial in M, where N is the number of records to be sorted and M is the internal memory size). The underlying constant is very small. This algorithm performs better than the disk-striped mergesort (DSM) algorithm when the number of disks is large. Our implementation is as simple as that of DSM (requiring no fancy data structures or prefetch techniques). As a second application, we prove that we can get a sparse enumeration sort on the hypercube that is simpler than that of the classical algorithm of Nassimi and Sahni [16]. We also show that Leighton's columnsort algorithm is a special case of LMM. Online publication December 26, 2000.

13.

The Bit Complexity of Distributed Sorting

O. Gerstel S. Zaks 《Algorithmica》1997,18(3):405-416

We study the bit complexity of the sorting problem for asynchronous distributed systems. We show that for every network with a tree topology T, every sorting algorithm must send at least […] bits in the worst case, where […] is the set of possible initial values, and Δ_T is the sum of distances from all the vertices to a median of the tree. In addition, we present an algorithm that sends at most […] bits for such trees. These bounds are tight if either L = Ω(N^(1+ε)) or Δ_T = Ω(N²). We also present results regarding average distributions. These results suggest that sorting is an inherently nondistributive problem, since it requires an amount of information transfer that is equal to the concentration of all the data in a single processor, which then distributes the final results to the whole network. The importance of bit complexity, as opposed to message complexity, stems also from the fact that, in the lower bound discussion, no assumptions are made as to the nature of the algorithm. Received May 2, 1994; revised December 22, 1995.

14.

Introspective Sorting and Selection Algorithms

DAVID R. MUSSER 《Software》1997,27(8):983-993

Quicksort is the preferred in-place sorting algorithm in many contexts, since its average computing time on uniformly distributed inputs is Θ(N log N), and it is in fact faster than most other sorting algorithms on most inputs. Its drawback is that its worst-case time bound is Θ(N²). Previous attempts to protect against the worst case by improving the way quicksort chooses pivot elements for partitioning have increased the average computing time too much; one might as well use heapsort, which has a Θ(N log N) worst-case time bound, but is on the average 2–5 times slower than quicksort. A similar dilemma exists with selection algorithms (for finding the i-th largest element) based on partitioning. This paper describes a simple solution to this dilemma: limit the depth of partitioning, and for subproblems that exceed the limit switch to another algorithm with a better worst-case bound. Using heapsort as the 'stopper' yields a sorting algorithm that is just as fast as quicksort in the average case, but also has a Θ(N log N) worst-case time bound. For selection, a hybrid of Hoare's FIND algorithm, which is linear on average but quadratic in the worst case, and the Blum–Floyd–Pratt–Rivest–Tarjan algorithm is as fast as Hoare's algorithm in practice, yet has a linear worst-case time bound. Also discussed are issues of implementing the new algorithms as generic algorithms, and accurately measuring their performance in the framework of the C++ Standard Template Library. ©1997 by John Wiley & Sons, Ltd.
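
The depth-limiting idea described in the abstract above can be sketched in Python as follows. This is a simplified illustration rather than Musser's implementation: the depth limit of 2·⌊log₂ N⌋, the small-range cutoff `SMALL = 16`, the median-of-three Lomuto partition, and the use of `heapq` on a temporary copy as the heapsort 'stopper' are all assumptions made to keep the example short.

```python
import heapq
from math import log2

SMALL = 16  # assumed cutoff below which insertion sort takes over


def introsort(a):
    """Sort the list a in place with depth-limited quicksort."""
    if len(a) > 1:
        _sort(a, 0, len(a), 2 * int(log2(len(a))))


def _sort(a, lo, hi, depth_limit):
    """Sort a[lo:hi]; fall back to heapsort once the depth budget is used up."""
    while hi - lo > SMALL:
        if depth_limit == 0:
            _heapsort(a, lo, hi)        # the 'stopper' with an N log N bound
            return
        depth_limit -= 1
        p = _partition(a, lo, hi)
        _sort(a, p + 1, hi, depth_limit)  # recurse on the right part
        hi = p                            # loop on the left part
    _insertion_sort(a, lo, hi)


def _partition(a, lo, hi):
    """Median-of-three Lomuto partition of a[lo:hi]; returns the pivot index."""
    mid, last = (lo + hi) // 2, hi - 1
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[last] < a[lo]:
        a[lo], a[last] = a[last], a[lo]
    if a[last] < a[mid]:
        a[mid], a[last] = a[last], a[mid]
    a[mid], a[last] = a[last], a[mid]   # move the median to the end as pivot
    pivot, store = a[last], lo
    for i in range(lo, last):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[last] = a[last], a[store]
    return store


def _heapsort(a, lo, hi):
    """Heapsort a[lo:hi] via a temporary heap (not in place, for brevity)."""
    heap = a[lo:hi]
    heapq.heapify(heap)
    for i in range(lo, hi):
        a[i] = heapq.heappop(heap)


def _insertion_sort(a, lo, hi):
    for i in range(lo + 1, hi):
        x, j = a[i], i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


data = [5, 2, 9, 1, 5, 6, 0, 3]
introsort(data)
print(data)
```

An input of many equal keys is a useful stress case here: the simple partition above degenerates on it, so the depth limit is reached quickly and the heapsort fallback keeps the total work at O(N log N).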

15.

Sorting and Searching in Faulty Memories (total citations: 1; self-citations: 1; citations by others: 0)

Irene Finocchi Giuseppe F. Italiano 《Algorithmica》2008,52(3):309-332

In this paper we investigate the design and analysis of algorithms resilient to memory faults. We focus on algorithms that, despite the corruption of some memory values during their execution, are nevertheless able to produce a correct output at least on the set of uncorrupted values. In this framework, we consider two fundamental problems: sorting and searching. In particular, we prove that any O(n log n) comparison-based sorting algorithm can tolerate the corruption of at most O((n log n)^(1/2)) keys. Furthermore, we present one comparison-based sorting algorithm with optimal space and running time that is resilient to O((n log n)^(1/3)) memory faults. We also prove polylogarithmic lower and upper bounds on resilient searching. This work has been partially supported by the Sixth Framework Programme of the EU under Contract Number 507613 (Network of Excellence "EuroNGI: Designing and Engineering of the Next Generation Internet") and by MIUR, the Italian Ministry of Education, University and Research, under Project ALGO-NEXT ("Algorithms for the Next Generation Internet and Web: Methodologies, Design and Experiments"). A preliminary version of this work was presented at the 36th ACM Symposium on Theory of Computing (STOC'04).

16.

Fast Four‐Way Parallel Radix Sorting on GPUs

Linh Ha Jens Krüger Cláudio T. Silva 《Computer Graphics Forum》2009,28(8):2368-2378

Efficient sorting is a key requirement for many computer science algorithms. Acceleration of existing techniques as well as developing new sorting approaches is crucial for many real-time graphics scenarios, database systems, and numerical simulations, to name just a few. It is one of the most fundamental operations to organize and filter the ever growing massive amounts of data gathered on a daily basis. While optimal sorting models for serial execution on a single processor exist, efficient parallel sorting remains a challenge. In this paper, we present a hardware-optimized parallel implementation of the radix sort algorithm that results in a significant speed up over existing sorting implementations. We outperform all known Graphics Processing Unit (GPU) based sorting systems by about a factor of two and eliminate restrictions on the sorting key space. This makes our algorithm not only the fastest, but also the first general GPU sorting solution.
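
For orientation, the following is a sequential least-significant-digit radix sort in Python that processes two key bits per pass, so each pass distributes keys into four buckets. Reading "four-way" as a two-bit split is an assumption, and the sketch says nothing about the GPU-specific optimizations of the paper; the name `radix_sort_4way` and the `key_bits` parameter are hypothetical. Keys are assumed to be non-negative integers below 2**key_bits.

```python
def radix_sort_4way(keys, key_bits=32):
    """Stable LSD radix sort using 2-bit digits (four buckets per pass)."""
    keys = list(keys)
    for shift in range(0, key_bits, 2):
        # Histogram of the current 2-bit digit.
        counts = [0, 0, 0, 0]
        for k in keys:
            counts[(k >> shift) & 3] += 1
        # Exclusive prefix sum gives the first output slot of each bucket.
        offsets = [0, 0, 0, 0]
        for d in range(1, 4):
            offsets[d] = offsets[d - 1] + counts[d - 1]
        # Stable scatter into the output array.
        out = [0] * len(keys)
        for k in keys:
            d = (k >> shift) & 3
            out[offsets[d]] = k
            offsets[d] += 1
        keys = out
    return keys


print(radix_sort_4way([170, 45, 75, 90, 802, 24, 2, 66], key_bits=10))
```

The histogram, prefix-sum, and scatter phases are the parts that parallel implementations typically distribute across threads.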

17.

Sorting Unsigned Permutations by Weighted Reversals, Transpositions, and Transreversals

Xiao-Wen Lou Da-Ming Zhu 《Journal of Computer Science and Technology》2010,25(4):853-863

Reversals, transpositions and transreversals are common events in genome rearrangement. The genome rearrangement sorting problem is to transform one genome into another using the minimum number of given rearrangement operations. An integer permutation is used to represent a genome in many cases. It can be divided into disjoint strips, with each strip denoting a block of consecutive integers. A singleton is a strip of one integer. The genome rearrangement problem is then equivalent to sorting a permutation into the identity permutation. Hannenhalli and Pevzner designed a polynomial time algorithm for the unsigned reversal sorting problem on those permutations with O(log n) singletons. In this paper, first we describe one case in which Hannenhalli and Pevzner's algorithm may fail and propose a corrected approach. In addition, we propose a (1+ε)-approximation algorithm for sorting unsigned permutations with O(log n) singletons by reversals of weight 1 and transpositions/transreversals of weight 2.
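
The notions of strip and singleton used above can be illustrated with a short Python sketch. It assumes a strip is a maximal run of adjacent positions whose values differ by exactly 1 (increasing or decreasing), and it omits the usual framing of the permutation with 0 and n+1; the function names `strips` and `singletons` are hypothetical.

```python
def strips(perm):
    """Split an unsigned permutation into maximal strips of consecutive integers."""
    if not perm:
        return []
    result = []
    current = [perm[0]]
    for prev, cur in zip(perm, perm[1:]):
        if abs(cur - prev) == 1:
            current.append(cur)      # still inside the same strip
        else:
            result.append(current)   # a breakpoint ends the strip
            current = [cur]
    result.append(current)
    return result


def singletons(perm):
    """Return the strips that consist of a single integer."""
    return [s for s in strips(perm) if len(s) == 1]


perm = [3, 4, 5, 1, 8, 7, 2, 6]
print(strips(perm))
print(singletons(perm))
```

For the example permutation the strips are [3, 4, 5], [1], [8, 7], [2], and [6], so it has three singletons.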

18.

Data parallel sorting for particle simulation

Leonardo Dagum 《Concurrency and Computation》1992,4(3):241-255

Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analysed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.

19.

Algorithms for Building Annular Structures with Minimalist Robots Inspired by Brood Sorting in Ant Colonies (total citations: 1; self-citations: 1; citations by others: 0)

Matt Wilson Chris Melhuish Ana B. Sendova-Franks Samuel Scholes 《Autonomous Robots》2004,17(2-3):115-136

This study shows that a task as complicated as multi-object 'ant-like annular sorting' can be accomplished with 'minimalist' solutions employing simple mechanisms and minimal hardware. It provides an alternative to 'patch sorting' for multi-object sorting. Three different mechanisms, based on hypotheses about the behaviour of Leptothorax ants, are investigated and comparisons are made. Mechanism I employs a simple clustering algorithm, with objects of different sizes. The mechanism explores the idea that it is the size difference of the object that promotes segregation. Mechanism II is an extension to our earlier two-object segregation mechanism. We test the ability of this mechanism to segregate an increased number of object types. Mechanism III uses a combined leaky integrator, which allows a greater segregation of object types while retaining the compactness of the structure. Its performance is improved by optimizing the mechanism's parameters using a genetic algorithm. We compare the three mechanisms in terms of sorting performance. Comparisons between the results of these sorting mechanisms and the behaviour of ants should facilitate further insights into both biological and robotic research and make a contribution to the further development of swarm robotics.

20.

Reflected min-max heaps

Christos Makris 《Information Processing Letters》2003,86(4):209-214

In this paper we present a simple and efficient implementation of a min-max priority queue, the reflected min-max heap. The main merits of our construction are threefold. First, the space utilization of the reflected min-max heaps is much better than the naive solution of putting two heaps back-to-back. Second, the methods applied in this structure can be easily used to transform ordinary priority queues into min-max priority queues. Third, when considering only the setting of min-max priority queues, we support merging in constant worst-case time, which is a clear improvement over the best worst-case bounds achieved by Høyer.