首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到19条相似文献,搜索用时 156 毫秒
1.
介绍一种新的并行排序算法,该算法以双调归并排序为基础,运用图形硬件的并行体系结构和二叉排序树数据结构的优点,用部分并行代替所有阶段的顺序执行,对双调排序算法进行优化.对该算法进行分析,在理论上n个序列在P个流处理器上的排序,最优的时间复杂度为O((nlogn)/p).实验测试结果表明,优化后的算法比其它基于图形硬件的双调归并排序算法所用时间短.  相似文献   

2.
划分点定位并行排序算法   总被引:5,自引:0,他引:5  
提出并分析了划分点定位并行排序(parallel sorting by divide-point locating)算法。在算法中,输入数据被平均划分并分配给所有处理机,因此每个处理机具有相同的工作负载。给出了网络分布计算环境下PSDL算法的实验结果,并与PSRS算法进行了对比。理论分析和实验结果表明,PSDL算法是一种高效率、高扩展性的并行排序算法。  相似文献   

3.
结合0-1整数规划的隐式枚举法对目标排序法进行分析.引入PSRS(并行正则采样排序)算法对目标排序法的核心运算进行并行化,并改进PSRS算法的数据收集策略以适应0-1整数规划的并行隐式枚举.最后给出了基于改进的PSRS的并行0-1整数规划的求解算法,并对算法的时间复杂度进行了分析.  相似文献   

4.
基于Shared-Nothing的并行Hash连接算法效率分析   总被引:3,自引:0,他引:3  
李庆华  睢海燕  邓冲 《软件学报》2000,11(3):386-392
该文研究了基于Shared-Nothing结构的几种常用并行连接算法,分析了影响查询响应时间的各种因素.在此基础上,以多种硬件成分作为参数建立一个代价分析模型.使用该模型计算并行Hash算法在每个处理机上的平均任务执行时间和总的查询响应时间,并比较了几种算法在不同硬件配置下的执行效率.所提出的模型和分析方法为评价和选取并行连接算法提供了一种可行的途径.  相似文献   

5.
陈宏建  陈崚  秦玲  徐晓华  屠莉 《计算机工程》2004,30(24):17-18,191
在Y.Pan提出的基于流水光总线阵列模型(LARPBS)上使用N个处理器对N个元素进行排序在最好情况下以O(logN)时间,最坏情况下以O(N)时间完成的并行排序算法的基础上,提出了一种LARPBS模型上的可扩展的快速并行排序算法,对N个元素进行排序,使用p(1≤P≤N)个处理器在最好情况下以O(NlogN/p)时间,最坏情况下以O(N^2/p)时间完成排序。另外还提出了一种LARPBS模型上改进的快速高效并行排序算法,该算法对N个元素进行排序使用N个处理器在最好情况下以O(log√N)时间、最坏情况下以O(√N)时间完成排序。  相似文献   

6.
束俊辉  张武  薛倩斐  谢江 《计算机应用》2014,34(11):3117-3120
为有效降低生物网络比对算法的时间复杂度,提出一种基于可扩展的蛋白质相互作用网络比对(SPINAL)算法的消息传递接口(MPI)并行化实现方法。该方法将MPI并行化思想运用在SPINAL算法中,在多核环境中采用并行排序代替算法原本的排序方式,并结合负载均衡策略合理分配任务。实验结果表明,与未使用并行排序以及负载均衡策略相比,该方法在处理大规模生物网络比对时能有效地缩短计算时间,提高运算效率,对于不同组比对数据都有较为稳定的优化保障,具有良好的可扩展性。  相似文献   

7.
基于倾斜与振荡法多路归并排序算法,提出了纵横多路并行归并算法,与已有方法递归应用两路归并过程不同·该算法直接对m×k的矩阵(m,k为任意整数)进行排序,消除了对两路递归过程的依赖,是一种新的多路归并排序算法·通过和倾斜与振荡法多路归并排序算法和高效的任意路并行归并算法的性能分析比较,当3k40时,该算法的时间复杂性低于同类算法·同时,该算法在专用硬件实现的设计复杂性上也具有明显的优势·  相似文献   

8.
一种新的并行归并排序算法   总被引:5,自引:0,他引:5  
文章提出了一种新的并行归并排序算法。算法充分利用并行系统中各个处理机中数据排序后序列长度相等的特点,计算出归并段对中的一个元素和最后一个元素的位置,然后再从相应的位置进行归并排序。该算法可使排序后的数据分布完全达到平衡,具有较高的负载平衡性、可扩展性和排序稳定性。文章最后给出了基于PC集群的实验结果,并把该结果与PSRS算法作了比较。  相似文献   

9.
NOW系统上的并行快速排序算法   总被引:5,自引:0,他引:5  
介绍了在NOW系统上的并行快速排序算法的设计与实现,分析了影响算法性能的因素及改进方法,最后给出了该算法对字符串排序的并行效率为49.15%。  相似文献   

10.
基于谱理论的特征选择算法FSST优先选择最具有局部信息保持能力和全局区分能力的特征.在实验分析该算法的基础上,采用分治策略对该算法最耗时的部分(规范化数据,构造Laplacian图和计算特征得分)进行并行化,从而提出一种基于谱理论的并行特征选择算法PFSST(Parallel Feature Selection with Spectral Theory),在多核系统上的实验证明了PFSST的并行有效性.  相似文献   

11.
We present a simple and general parallel sorting scheme, ZZ-sort, which can be used to derive a class of efficient in-place sorting algorithms on realistic parallel machine models. We prove a tight bound for the worst case performance of ZZ-sort. We also demonstrate the average performance of ZZ-sort by experimental results obtained on a MasPar parallel computer. Our experiments indicate that ZZ-sort can be incorporated into a distributed memory parallel computer system as a standard routine, and this routine is useful for space critical situations. Finally, we show that ZZ-sort can be used to convert a non-adaptive parallel sorting algorithm into an in-place and adaptive one by considering the problem of sorting an arbitrarily large input on fixed-size reconfigurable meshes.  相似文献   

12.
一种基于和谐理论神经网络的分类算法   总被引:1,自引:0,他引:1  
文中从和谐理论神经网络派生一种新的神经网络模型,并以该模型为基础提出了一种新的并行分类算法。神经网络采用不完全连接方式,网络节点的连接数大大减少,具有计算功能简单便于硬件实现的优点。同时,文中提出的分类算法不同于大多数基于神经网络的分类算法,不需事先对待分类的数据进行任何形式的预处理,分类运算可在有限步长内自动完成,具有结构简单、透明度高、运算速度快等特点。  相似文献   

13.
14.
Many sorting algorithms have been studied in the past, but there are only a few algorithms that can effectively exploit both single‐instruction multiple‐data (SIMD) instructions and thread‐level parallelism. In this paper, we propose a new high‐performance sorting algorithm, called aligned‐access sort (AA‐sort), that exploits both the SIMD instructions and thread‐level parallelism available on today's multicore processors. Our algorithm consists of two phases, an in‐core sorting phase and an out‐of‐core merging phase. The in‐core sorting phase uses our new sorting algorithm that extends combsort to exploit SIMD instructions. The out‐of‐core algorithm is based on mergesort with our novel vectorized merging algorithm. Both phases can take advantage of SIMD instructions. The key to high performance is eliminating unaligned memory accesses that would reduce the effectiveness of SIMD instructions in both phases. We implemented and evaluated the AA‐sort on PowerPC 970MP and Cell Broadband Engine platforms. In summary, a sequential version of the AA‐sort using SIMD instructions outperformed IBM's optimized sequential sorting library by 1.8 times and bitonic mergesort using SIMD instructions by 3.3 times on PowerPC 970MP when sorting 32 million random 32‐bit integers. Also, a parallel version of AA‐sort demonstrated better scalability with increasing numbers of cores than a parallel version of bitonic mergesort on both platforms. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

15.
Efficient sorting is a key requirement for many computer science algorithms. Acceleration of existing techniques as well as developing new sorting approaches is crucial for many real‐time graphics scenarios, database systems, and numerical simulations to name just a few. It is one of the most fundamental operations to organize and filter the ever growing massive amounts of data gathered on a daily basis. While optimal sorting models for serial execution on a single processor exist, efficient parallel sorting remains a challenge. In this paper, we present a hardware‐optimized parallel implementation of the radix sort algorithm that results in a significant speed up over existing sorting implementations. We outperform all known General Processing Unit (GPU) based sorting systems by about a factor of two and eliminate restrictions on the sorting key space. This makes our algorithm not only the fastest, but also the first general GPU sorting solution.  相似文献   

16.
基于散列和归并技术的有效并行排序方法   总被引:1,自引:1,他引:1       下载免费PDF全文
本文提出一个在共享存储多处理机系统上实现的快速、有效的并行排序算法:将长度为n的待排序数据划分成p个长度为n/p的子序列,引入散列技术并行地对这p个子序列的数据进行二次散列排序,这一阶段所需的平均时间为O(n/p);最后并行地将p个有序子序列归并成一个长度为n的有序序列,归并阶段所需的时间为O(n-n/
/p)。整个排序算法的并行执行代价为O(np)。本排序方法可以拓以网络并行机群环境。  相似文献   

17.
在介绍带有宽总线网络的可重构计算模型(RAPWBN)的二进制值的前缀和操作的基础上,提出了该模型上的抽取压缩操作算法,并由此得到了该模型上的并行归并排序算法。在具有N个处理器和N条行总线的RAPWBN模型上,若总线带宽ω>log N字节,对长度为N的序列进行归并排序,在最坏情况下以O(logN·loglogN)时间完成。  相似文献   

18.
With the help of a model of an associative parallel processor with vertical processing (STAR computer), Prim-Dijkstra and Kraskal algorithms for finding a minimal spanning tree of an undirected graph represented in the form of a list of edges and their weights are compared. A relatively simple representation of the Prim-Dijkstra algorithm is constructed in which the initial node is taken into account. The Kraskal algorithm is also presented and the possibility of eliminating the stage of preliminary sorting of edges by their weights is shown. Translated from Kibernetika i Sistemnyi Analiz, No. 2. pp. 19–27, March–April, 2000.  相似文献   

19.
Study on Parallel Computing   总被引:5,自引:0,他引:5       下载免费PDF全文
In this paper, we present a general survey on parallel computing. The main contents include parallel computer system which is the hardware platform of parallel computing, parallel algorithm which is the theoretical base of parallel computing, parallel programming which is the software support of parallel computing. After that, we also introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of "architecture algorithm programming application". Only in this way, parallel computing research becomes continuous development and more realistic.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号