首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 468 毫秒
1.
Sorting on STAR     
This paper gives timing comparisons for three sorting algorithms written for the CDC STAR computer. One algorithm is Hoare's Quicksort, which is the fastest or nearly the fastest sorting algorithm for most computers. A second algorithm is a vector version of Quicksort that takes advantage of the STAR's vector operations. The third algorithm is an adaptation of Batcher's sorting algorithm, which makes especially good use of vector operations but has a complexity of N (log N)2 as compared to a complexity of N log N for the Quicksort algorithms. In spite of its worse complexity, Batcher's sorting algorithm is competitive with the serial version of Quicksort for vectors up to the largest that can be treated by STAR. Vector Quicksort outperforms the other two algorithms and is generally preferred. These results indicate that unusual instruction sets can introduce biases in program execution time that counter results predicted by worst-case asymptotic complexity analysis.  相似文献   

2.
针对在数据量动态增加的场景下现有的排序算法管理数据导致算法性能大大降低的问题,提出一种16-bit Trie树排序算法.借助邻居节点上存储的链节点指针完成排序,它不仅可以边构建边排序,且引入动态数组可以提高该算法的空间效率.仿真结果表明,传统Trie树支持数据动态更新,但通过遍历Trie树的方式完成排序耗时较多,快速排...  相似文献   

3.
一种非比较分段排序算法的研究   总被引:4,自引:2,他引:4  
非比较分段排序(简称NCSS)算法是建立在模仿人类思想方式基础上的一种非比较排序算法,算法分析和实验结果都表明:NCSS算法的时间复杂度和待排序数据分布无关,为O(N),而附加存储空间极小,排序速率明显优于QuickSort,ProportionSplitSort,分段快速排序等算法,NCSS算法特别适合于数据最大的场合。  相似文献   

4.
随着现实待挖掘数据库规模不断增长,系统可使用的内存成为用FP-GROWTH算法进行关联规则挖掘的瓶颈.为了摆脱内存的束缚,对大规模数据库中的数据进行关联规则挖掘,基于磁盘的关联规则挖掘成为重要的研究方向.对此,改进原始的FP-TREE数据结构,提出了一种新颖的基于磁盘表的DTRFP-GROWTH(disk table resident FP-TREE growth)算法.该算法利用磁盘表存储FP-TREE,降低内存使用,在传统FP-GROWTH算法占用过多内存、挖掘工作无法进行时,以独特的磁盘表存储FP-TREE技术,减少内存使用,能够继续完成挖掘工作,适合空间性能优先的场合.不仅如此,该算法还将关联规则挖掘和关系型数据库整合,克服了基于文件系统相关算法效率较低、开发难度较大等问题.在真实数据集上进行了验证实验以及性能分析.实验结果表明,在内存空间有限的情况下,DTRFP-GROWTH算法是一种有效的基于磁盘的关联规则挖掘算法.  相似文献   

5.
External sorting—the process of sorting a file that is too large to fit into the computer's internal memory and must be stored externally on disks—is a fundamental subroutine in database systems[G], [IBM]. Of prime importance are techniques that use multiple disks in parallel in order to speed up the performance of external sorting. The simple randomized merging (SRM ) mergesort algorithm proposed by Barve et al. [BGV] is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice. Knuth [K,Section 5.4.9] recently identified SRM (which he calls ``randomized striping') as the method of choice for sorting with parallel disks. In this paper we present an efficient implementation of SRM, based upon novel and elegant data structures. We give a new implementation for SRM's lookahead forecasting technique for parallel prefetching and its forecast and flush technique for buffer management. Our techniques amount to a significant improvement in the way SRM carries out the parallel, independent disk accesses necessary to read blocks of input runs efficiently during external merging. Our implementation is based on synchronous parallel I/O primitives provided by the TPIE programming environment[TPI]; whenever our program issues an I/O read (write) operation, one block of data is synchronously read from (written to) each disk in parallel. We compare the performance of SRM over a wide range of input sizes with that of disk-striped mergesort (DSM ), which is widely used in practice. DSM consists of a standard mergesort in conjunction with striped I/O for parallel disk access. SRM merges together significantly more runs at a time compared with DSM, and thus it requires fewer merge passes. We demonstrate in practical scenarios that even though the streaming speeds for merging with DSM are a little higher than those for SRM (since DSM merges fewer runs at a time), sorting using SRM is often significantly faster than with DSM (since SRM requires fewer passes). The techniques in this paper can be generalized to meet the load-balancing requirements of other applications using parallel disks, including distribution sort and multiway partitioning of a file into several other files. Since both parallel disk merging and multimedia processing deal with streams that get ``consumed' at nonuniform and partially predictable rates, our techniques for lookahead based upon forecasting data may have relevance in video server applications. Received June 28, 2000, and in revised form June 5, 2001. Online publication April 8, 2002.  相似文献   

6.
基于光盘库的Hadoop分布式文件系统(HDFS光盘库)在单位存储成本、数据安全性、使用寿命等方面非常符合当前大数据存储要求,但是HDFS不适合存储大量小文件和实时数据读取。为了使HDFS光盘库能更好地运用到更多大数据存储场景,本文提出一种更加适合大数据存储的磁光虚拟存储系统(MOVS, Magneto-optical Virtual Storage System)。系统在HDFS光盘库与用户之间加入磁盘缓存,并在磁盘缓存内通过文件标签分类、虚拟存储、小文件合并等技术将磁盘缓存内小文件合并为适合HDFS光盘库存储的大文件,提高系统的数据传输速度。系统还使用了文件预取、缓存替换等文件调度算法对磁盘缓存内文件进行动态更新,减少用户访问HDFS光盘库次数。实验结果表明,MOVS相对HDFS光盘库在响应时间和数据传输速度方面得到很大改善。  相似文献   

7.
Dalia Motzkin 《Software》1981,11(6):607-611
A sorting algorithm, called Stable Quicksort, is presented. the algorithm is comparable in speed with the Quicksort algorithm, but is stable. The experimental evidence presented support the theoretical evaluation of the performance of Stable Quicksort.  相似文献   

8.
Adaptivity in sorting algorithms is sometimes gained at the expense of practicality. We give experimental results showing that Splaysort — sorting by repeated insertion into a Splay tree — is a surprisingly efficient method for in-memory sorting. Splaysort appears to be adaptive with respect to all accepted measures of presortedness, and it outperforms Quicksort for sequences with modest amounts of existing order. Although Splaysort has a linear space overhead, there are many applications for which this is reasonable. In these situations Splaysort is an attractive alternative to traditional comparison-based sorting algorithms such as Heapsort, Mergesort, and Quicksort.  相似文献   

9.
An algorithm, which asymptotically halves the number of comparisons made by the common Heapsort, is presented and analysed in the worst case. The number of comparisons is shown to be (n+1)(log(n+1)+log log(n+1)+1.82)+O(log n) in the worst case to sort n elements, without using any extra space. Quicksort, which usually is referred to as the fastest in-place sorting method, uses 1.38n log n − O(n) in the average case (see Gonnet (1984)).  相似文献   

10.
External sorting: run formation revisited   总被引:1,自引:0,他引:1  
External mergesort begins with a run formation phase creating the initial sorted runs. Run formation can be done by a load-sort-store algorithm or by replacement selection. A load-sort-store algorithm repeatedly fills available memory with input records, sorts them, and writes the result to a run file. Replacement selection produces longer runs than load-sort-store algorithms and completely overlaps sorting and I/O, but it has poor locality of reference resulting in frequent cache misses and the classical algorithm works only for fixed-length records. This paper introduces batched replacement selection: a cache-conscious version of replacement selection that works also for variable-length records. The new algorithm resembles AlphaSort in the sense that it creates small in-memory runs and merges them to form the output runs. Its performance is experimentally compared with three other run formation algorithms: classical replacement selection, Quicksort, and AlphaSort. The experiments show that batched replacement selection is considerably faster than classic replacement selection. For small records (average 100 bytes), CPU time was reduced by about 50 percent and elapsed time by 47-63 percent. It was also consistently faster than Quicksort, but it did not always outperform AlphaSort. Replacement selection produces fewer runs than Quicksort and AlphaSort. The experiments confirmed that this reduces the merge time whereas the effect on the overall sort time depends on the number of disks available.  相似文献   

11.
R. Geoff Dromey 《Software》1984,14(6):509-518
The widely known Quicksort algorithm does not attempt to actively take advantage of partial order in sorting data. A simple change can be made to the Quicksort strategy to give a bestcase performance of n, for ordered data, with a smooth transition to O(n log n) for random data. This algorithm (Transort) matches the performance of Sedgewick's claimed best implementation of Quicksort for random data.  相似文献   

12.
随着磁盘价格的不断下降,越来越多的容灾备份系统采用将本地数据经过网络传输到服务器磁盘进行备份,但是大数据时代的来,瞄,导致存储压力越来越大。然而在庞大的数据中,有很大一部分数据活跃率并不高。设计并实现了一种基于磁带分级存储的文件备份与恢复系统,采用索引的方式记录并组织备份文件,实现增量备份;该系统将服务器端拆分为两部分,首先将文件数据备份到服务器磁盘,再根据相关时间保留策略,将活跃率较低的数据归档到脏带中。保证了大数据时代服务器磁盘的高效利用。  相似文献   

13.
An efficient external sorting algorithm with minimal space requirement is presented in this article. The average number of passes over the data is approximately 1 +Ln(N + 1)/4B, whereN is the number of records in the file to be sorted, andB is the buffer size. The external storage requirement is only the file itself, no additional disk space is required. The internal storage requirement is four buffers: two for input, and two for output. The buffer size can be adjusted to the available memory space. A stack of size log2 N is also required.This work was partially supported by a fellowship and grant from Western Michigan University.  相似文献   

14.
In this paper, we proposed a new efficient sorting algorithm based on insertion sort concept. The proposed algorithm is called Bidirectional Conditional Insertion Sort (BCIS). It is in-place sorting algorithm and it has remarkably efficient average case time complexity when compared with classical insertion sort (IS). By comparing our new proposed algorithm with the Quicksort algorithm, BCIS indicated faster average case time for relatively small size arrays up to 1500 elements. Furthermore, BCIS was observed to be faster than Quicksort within high rate of duplicated elements even for large size array.  相似文献   

15.
H.D. Mills and R.C. Linger (1986) propose adding the datatype set to existing programming languages. During some investigations using sets, it became apparent that Quicksort can be written without using stacks (or recursion). Using sets can lead to efficient multiprocessor usage, because if the elements of a set can be processed in any order, they can frequently be processed simultaneously. An example of the possibilities is an intelligent disk control unit based on External Quicksort, using four processors and four read/write heads., The control unit can sort a large disk file in about 1/3 of the time taken by the one-processor version  相似文献   

16.
归并方式的多线程快速排序算法   总被引:1,自引:0,他引:1  
本文基于Java平台针对经典快速排序提出改进方案,使用归并的思想对快速排序作了多线程优化,并对单、多线程下的快速排序进行了对比测试和分析。结果表明,通过多线程优化,快速排序在双核主机上对5千万个随机整型数据进行排序的速度是单线程的1.6倍,说明了该优化方法的有效性。该方法思路直观、容易理解,宜作为多核技术教学案例。  相似文献   

17.
Salzberg  Betty 《Acta Informatica》1989,27(3):195-215
Summary External sorting is usually accomplished by first creating sorted runs, then merging the runs. In the merge phase, writing and calculating can be overlapped by reading if two input buffers are used for each sorted run. If the memory is very large, the input buffers will be large and using two input buffers per sorted run will be more efficient than using only one input buffer per run and risking reduced overlap of reading and writing. In many cases, merging time can be cut in half. We derive a formula for estimating the total time for merging for a given memory size, file size, number of merging passes and for a given disk drive. We present an extreme example where in spite of having two buffers per run, significant non-overlap occurs. However, in realistic problems, we show that making one merge pass with two input buffers per run is near optimal. This contradicts earlier results on merging which do not take large memory into account.  相似文献   

18.
Sorting huge amounts of datasets have become essential in many computer applications, such as search engines, database and web-based applications, in order to improve searching performance. Moreover, due to the witnessed prevalence of the commercial Simultaneous Multithreaded architecture (SMT), parallel programming using multithreading becomes a dire need for efficiently using all available hardware resources for one application. In this paper, one of the efficient and quick algorithms, the Quicksort, is applied as a parallel multithreaded algorithm on SMT architecture, where virtual parallelization has been achieved using the POSIX threads (Pthreads) library. The proposed algorithm is evaluated and compared with its sequential counterpart. The obtained analytical and experimental results reveal that multithreading is a viable technique for implementing the parallel Quicksort algorithm efficiently on SMT architecture, where it has been shown both analytically and experimentally that the parallel multithreaded Quicksort algorithm outperforms the sequential Quicksort algorithm in terms of various performance metrics including; time complexity and speedup.  相似文献   

19.
This paper is the second of a two part series in which a general purpose sorting algorithm i.e. Quicksort is adapated for execution on an M.I.M.D. computer system. In this part, the parallel algorithm derived in Part 1 is simulated and qualitative agreement with the results from the run-time analysis was obtained.  相似文献   

20.
Efficient storage techniques for digital continuous multimedia   总被引:4,自引:0,他引:4  
The problem of collocational storage of media strands, which are sequences of continuously recorded audio samples or video frames, on disk to support the integration of storage and transmission of multimedia data with computing is examined. A model that relates disk and device characteristics to the playback rates of media strands and derives storage patterns so as to guarantee continuous retrieval of media strands is presented. To efficiently utilize the disk space, mechanisms for merging storage patterns of multiple media strands by filling the gaps between media blocks of one strand with media blocks of other strands are developed. Both an online algorithm suitable for merging a new media strand into a set of already stored strands and an offline merging algorithm that can be applied a priori to the storage of a set of media strands before any of them have been stored on disk are proposed. As a consequence of merging, storage patterns of media strands may become perturbed slightly. To compensate for this read-ahead and buffering are required so that continuity of retrieval remains satisfied are also presented  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号