首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到15条相似文献,搜索用时 102 毫秒
1.
在介绍带有宽总线网络的可重构计算模型(RAPWBN)的基本结构及其二进制值的前缀和操作的基础上,提出该模型上的一种并行归并排序算法,在具有N~α(1<α<2)个处理器和N条行总线的RAPWBN模型上,若总线带宽ω>logN字节,对长度为N的序列进行归并排序,可以在O((loglogN)~2)时间完成.  相似文献   

2.
在介绍带有宽总线网络的可重构计算模型(RAPWBN)的基本结构及其二进制值的前缀和操作的基础上,提出了RAPWBN模型上的抽取压缩操作算法,并由此得到了RAPWBN模型上的两种快速高效并行排序算法,对长度为N的序列进行排序,在具有N2个处理器和N条行总线的RAPWBN模型上,若总线带宽ω>logN字节,可以在O(1)时间完成排序.在具有N个处理器和N条行总线的RAPWBN模型上,最好情况下以O(logN)时间、最坏情况下以O(N)时间完成排序.  相似文献   

3.
该文介绍了带有宽总线网络的可重构计算模型(RAPWBN)的基本结构及其二进制值的前缀和操作,提出了一种快速并行排序算法,对长度为N的序列进行排序,在具有N2个处理器和N条行总线的RAPWBN模型上,若总线带宽ω>logN字节,可以在O(1)时间完成排序。该算法的成本达到了最优。  相似文献   

4.
带有宽总线网络的可重构计算模型上的并行排序算法   总被引:1,自引:0,他引:1  
在介绍带有宽总线网络的可重构计算模型(RAPWBN)的基本结构及其二进制值的前缀和操作的基础上,提出了RAPWBN模型上的抽取压缩操作算法,并由此得到了RAPWBN模型上的快速高效并行排序算法,在具有N个处理机和N条行总线的RAPWBN模型上,若总线带宽ω>logN字节,则对元素位数固定的N个元素可以在O(1)时间完成排序,对元素位数不固定的N个元素,可以在O(k)时间完成排序,这里k为元素的最大位数.  相似文献   

5.
陈宏建  陈崚  李开荣  陈莉莉 《计算机工程》2004,30(23):31-33,110
在介绍带有宽总线网络的可重构计算阵列(RAPWBN)的基本结构及其二进制值的前缀和操作的基础上,提出了 RAPWBN 阵列上的整数求和算法,并由此得到了 RAPWBN 阵列上的两种快速高效的矩阵乘法运算并行算法。在具有 N3个处理器和 N2条行总线的 RAPWBN 阵列上,若总线带宽ω>logN 字节,矩阵乘法可以在 O(1)时间完成;在具有 N2个处理器和 N 条行总线的 RAPWBN 阵列上,矩阵乘法可以在 O(N)时间完成。它们的效率都为 O(N3),达到了最优。  相似文献   

6.
基于流水总线的可重构线性阵列系统(LARPBS)是一种建立在光总线上的并行计算模型,许多研究工作者已经在该模型上设计出了一些高效的并行算法。文章提出了一种基于LARPBS模型上Vnliant并行归并的实现算法,利用该法对长度为N的序列进行排序,最坏情况下可以使用N个处理器在O(logNloglogN)时间完成。  相似文献   

7.
文章提出了一种LARPBS模型上的并行归并排序算法,利用该算法对长度为N的序列进行排序,使用N~(1+)着(0<着<1)个处理机可以在O((loglogN)~2)时间完成。  相似文献   

8.
陈宏建  陈崚  秦玲  徐晓华  屠莉 《计算机工程》2004,30(24):17-18,191
在Y.Pan提出的基于流水光总线阵列模型(LARPBS)上使用N个处理器对N个元素进行排序在最好情况下以O(logN)时间,最坏情况下以O(N)时间完成的并行排序算法的基础上,提出了一种LARPBS模型上的可扩展的快速并行排序算法,对N个元素进行排序,使用p(1≤P≤N)个处理器在最好情况下以O(NlogN/p)时间,最坏情况下以O(N^2/p)时间完成排序。另外还提出了一种LARPBS模型上改进的快速高效并行排序算法,该算法对N个元素进行排序使用N个处理器在最好情况下以O(log√N)时间、最坏情况下以O(√N)时间完成排序。  相似文献   

9.
陈宏建  陈崚  罗家奇 《计算机工程》2006,32(17):115-117
提出了RAPWBN模型上的整数前缀和与抽取压缩操作算法,并由此得到了RAPWBN模型上的快速高效Hough变换并行算法,对于大小为n×n的二值数字图像,p个θ参数值。可以使用pn2个处理器在O(1)时间完成。使算法的速度和效率达到了最优。  相似文献   

10.
基于倾斜与振荡法多路归并排序算法,提出了纵横多路并行归并算法,与已有方法递归应用两路归并过程不同·该算法直接对m×k的矩阵(m,k为任意整数)进行排序,消除了对两路递归过程的依赖,是一种新的多路归并排序算法·通过和倾斜与振荡法多路归并排序算法和高效的任意路并行归并算法的性能分析比较,当3k40时,该算法的时间复杂性低于同类算法·同时,该算法在专用硬件实现的设计复杂性上也具有明显的优势·  相似文献   

11.
The computation model on which the algorithms are developed is the reconfigurable array of processors with wider bus networks (abbreviated to RAPWBN). The main difference between the RAPWBN model and other existing reconfigurable parallel processing systems is that the bus width of each network is bounded within the range [2,[/spl radic/(N)]]. Such a strategy not only saves the silicon area of the chip as well as increases the computational power enormously, but the strategy also allows the execution speed of the proposed algorithms to be tuned by the bus bandwidth. To demonstrate the computational power of the RAPWBN, the channel-assignment problem is derived in this paper. For the channel-assignment problem with N pairs of components, we first design an O(T + [N//spl omega/]) time parallel algorithm using 2N processors with a 2N-row by 2N-column bus network, where the bus width of each bus network is /spl omega/-bit for 2 /spl les/ /spl omega/ /spl les/ [/spl radic/N] and T = [log/sub /spl omega//N] + 1. By tuning the bus bandwidth to the natural log N-bit and the extended N/sup 1/c/-bit (N/sup 1/c/ > log N) for any constant c and c /spl ges/ 1, two more results which run in O(log N/log log N) and O(1) time, respectively, are also derived. When compared to the algorithms proposed by Olariu et al. [17] and Lin [14], it is shown that our algorithm runs in the equivalent time complexity while significantly reducing the number of processors to O(N).  相似文献   

12.
A fast algorithm for computing a histogram on reconfigurable mesh   总被引:1,自引:0,他引:1  
The reconfigurable mesh captures salient features from a variety of sources, including the content addressable array parallel processor, the CHiP, the polymorphic-torus network and the bus automaton. It consists of an array of processors interconnected by a reconfigurable bus system. The bus system can be used to dynamically obtain various interconnection patterns between the processors. In this paper, we present a fast algorithm for computing the histogram of an N×N image with h grey levels in O(min{√h+log*(N/h),N}) time on an N×N reconfigurable mesh assuming each PE has a constant amount of local memory. This algorithm runs on the PARBUS and MRN/LRN models. In addition, histogram modification can be performed in O(√h) time on the same model. A variant of out algorithm runs in O(min{√h+log log(N/h),N}) time on an N×N RMESH in which each PE has constant storage. This result improves the known time and memory bounds for histogramming on the RMESH model  相似文献   

13.
The Hough transform is an important problem in image processing and computer vision. An efficient algorithm for computing the Hough transform has been proposed on a reconfigurable array by Kao et al. (1995). For a problem with an √N×√N image and an n×n parameter space, the algorithm runs in a constant time on a three-dimensional (3-D) n×n×N reconfigurable mesh where the data bus is N1c/-bit wide. To our best knowledge, this is the most efficient constant-time algorithm for computing the Hough transform on a reconfigurable mesh. In this paper, an improved Hough transform algorithm on a reconfigurable mesh is proposed. For the same problem, our algorithm runs in constant time on a 3-D n*n×n×√n√n reconfigurable mesh, where the data bus is only log N-bit wide. In most practical situations, n=O(√N). Hence, our algorithm requires much less VLSI area to accomplish the same task. In addition, our algorithm can compute the Radon transform (a generalized Hough transform) in O(1) time on the same model, whereas the algorithm in the above paper cannot be adapted to computing Radon transform easily  相似文献   

14.
可重构流水线总线并行机上图像的聚类算法   总被引:1,自引:1,他引:1  
聚类是图像处理中的一个基本操作。对于分为K个簇的N个模式,且每个模式含有M个特征,该文在N个处理器的一维可重构流水线总线并行机上提出了一个时间复杂度为O(M×K)的聚类操作中一次迭代的并行算法。并对该并行算法进行了扩展,使得当处理器数分别为N×M和N×M×K时,一次迭代的时间复杂度分别为O(K)和O(1)。  相似文献   

15.
A decimation-in-time radix-2 fast Fourier transform (FFT) algorithm is considered here for implementation in multiprocessors with shared bus, multistage interconnection network (MIN), and in mesh connected computers. Results are derived for data allocation, interprocessor communication, approximate computation time, and speedup of an N point FFT on any P available processing elements (PE's). Further generalization is obtained for a radix-r FFT algorithm. An N X N point two-dimensional discrete Fourier transform (DFT) implementation is also considered when one or more rows of the input data matrix are allocated to each PE.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号