首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
We present a simple parallel algorithm for computing the greatest common divisor (gcd) of twon-bit integers in the Common version of the CRCW model of computation. The run-time of the algorithm in terms of bit operations isO(n/logn), usingn 1+? processors, where ? is any positive constant. This improves on the algorithm of Kannan, Miller, and Rudolph, the only sublinear algorithm known previously, both in run time and in number of processors; they requireO(n log logn/logn),n 2 log2 n, respectively, in the same CRCW model. We give an alternative implementation of our algorithm in the CREW model. Its run-time isO(n log logn/logn), usingn 1+? processors. Both implementations can be modified to yield the extended gcd, within the same complexity bounds.  相似文献   

2.
We present a simple parallel algorithm for computing the greatest common divisor (gcd) of twon-bit integers in the Common version of the CRCW model of computation. The run-time of the algorithm in terms of bit operations isO(n/logn), usingn 1+ processors, where is any positive constant. This improves on the algorithm of Kannan, Miller, and Rudolph, the only sublinear algorithm known previously, both in run time and in number of processors; they requireO(n log logn/logn),n 2 log2 n, respectively, in the same CRCW model.We give an alternative implementation of our algorithm in the CREW model. Its run-time isO(n log logn/logn), usingn 1+ processors. Both implementations can be modified to yield the extended gcd, within the same complexity bounds.Supported in part by an IBM Graduate Fellowship and a Bantrell Postdoctoral Fellowship.Supported in part by a Weizmann Postdoctoral Fellowship.4 All logarithms are to base 2.  相似文献   

3.
On parallel integer sorting   总被引:1,自引:0,他引:1  
We present an optimal algorithm for sortingn integers in the range [1,n c ] (for any constantc) for the EREW PRAM model where the word length isn , for any >0. Using this algorithm, the best known upper bound for integer sorting on the (O(logn) word length) EREW PRAM model is improved. In addition, a novel parallel range reduction algorithm which results in a near optimal randomized integer sorting algorthm is presented. For the case when the keys are uniformly distributed integers in an arbitrary range, we give an algorithm whose expected running time is optimal.Supported by NSF-DCR-85-03251 and ONR contract N00014-87-K-0310  相似文献   

4.
In this paper we propose an efficient algorithm to implement parallel integer multiplication by a combination of parallel additions, shifts and reads from a memory-resident lookup table dedicated to squares. Such an operator called PIM (parallel integer multiplication) is in fact microprogrammed at the PROM level. Our theoretical approach is included within the framework of time and space parallel complexity theory. The mathematical relation used defines this multiplication operator in terms of a difference of two quadratic expressions, each being computed in parallel by one addition and one shift. This leads to the CPU time for any pair of numbers being constant. Our contribution is above all of practical interest on any massively parallel architecture in the field of scientific and numerical computing.  相似文献   

5.
提出了一种改进的非支配排序遗传算法。通过扩大第一代种群规模,在初期加速种群的进化;对选择算子引入概率操作来提高种群的多样性;同时引入混合交叉算子,动态调节算法的搜索空间。最后以收敛性和分布性作为性能指标,使用公开的多目标测试函数对其进行测试,并与基本的非支配排序遗传算法和改进的多目标粒子群算法进行比较。实验结果表明,改进后的非支配排序遗传算法在收敛性和分布性两方面均有提升。  相似文献   

6.
When designing an algorithm for circle drawing, W.E. Wright (ibid., vol. 10, no. 5, pp. 60-7, Sept. 1990) chose to divide the circle into equal sub-arcs for rendering by separate processors. This method is simple and computationally tractable. Also, compared with the sequential algorithm, Wright's parallel circle-drawing algorithm achieves a speedup of 90% of P, where P represents the number of processors. We use an equal x-step division in this article to reformulate Wright's algorithm, which he regarded as unsatisfactory and messy. We show that this division method, which achieves a speedup of 100% of P, is not only computationally tractable but also requires no extra calculations  相似文献   

7.
改进的分支定界算法   总被引:1,自引:0,他引:1  
孙树亮  陈忠  刘政连 《软件》2011,(10):32-34
如何进行最优的特征选择是模式识别的研究重点之一。目前比较常用的最优特征选择方法是BAB和BAB+算法,然而此算法搜索时间比较长。在此基础上详细地阐述改进的分支定界的原理以及算法,该算法的基本思想是通过剪切那些肯定不会产生最优解的分支,同时引入了部分路径和父路径的概念,以达到决策树能够快速搜索到最优解的目的。实验结果证明了该算法的有效性及优越性。  相似文献   

8.
9.
A parallel O(log3 vbEvb) algorithm for finding a maximal matching in a graph G(V, E) is presented. The model of computation is the CRCW-PRAM, and vbVvb + vbEvb processors are used.This algorithm is a substantial improvement upon the two previous algorithms known to us. These algorithms by Karp and Wigderson (1984) and Lev (1980) achieve O(log4 vbEvb) depth with vbEvb3/log vbEvb and vbEvb + vbVvb processors respectively. The last one though having a better performance than the first, only applies to bipartite graphs.  相似文献   

10.
《电子技术应用》2017,(9):137-140
为实现无人机航拍图像中图像序列自动排序,提出了一种基于相位相关法改进的图像序列自动排序算法。该算法利用对数极坐标的方式来表示图像间的平移、旋转、尺度缩放的关系,并利用最大相关度准则以及峰值坐标判断相邻图像的位置关系。实验结果表明,此算法能有效地解决全景图像拼接中序列图像混乱的问题,避免了人工干预,增强了算法的应用范围,具有很强的实用价值。  相似文献   

11.
The computational complexity of a parallel algorithm depends critically on the model of computation. We describe a simple and elegant rule-based model of computation in which processors apply rules asynchronously to pairs of objects from a global object space. Application of a rule to a pair of objects results in the creation of a new object if the objects satisfy the guard of the rule. The model can be efficiently implemented as a novel MIMD array processor architecture, the Intersecting Broadcast Machine. For this model of computation, we describe an efficient parallel sorting algorithm based on mergesort. The computational complexity of the sorting algorithm isO(nlog2 n), comparable to that for specialized sorting networks and an improvement on theO(n 1.5) complexity of conventional mesh-connected array processors.  相似文献   

12.
Dr. F. Körner 《Computing》1983,30(3):253-260
The quadratic integer programming problem is considered. It will be shown in which order the variablesx 1, ...,x n should be ramified in order to reduce the number of knots being studied to a minimum. There areO(n 3) operations necessary to determine a favourable ramification. Numerical tests confirm the efficiency of the given algorithm.  相似文献   

13.
A sequential algorithm with complexity O(M2+n) for the integer knapsack problem is presented. M is the capacity of the knapsack, and n the number of objects. The algorithm admits an efficient parallelization on a p-processor ring machine. The corresponding parallel algorithm is O(M2/p+n). The parallel algorithm is compared with a version of the well-known Lee algorithm adapted to the integer knapsack problem. Computational results on both a local area network and a transputer are reported.  相似文献   

14.
In 1975 Fukunaga and Narendra proposed an efficient branch and bound algorithm for computing k-nearest neighbors. Their algorithm, after a hierarchical decomposition of the design set into disjoint subsets, employs two rules in order to eliminate the necessity of calculating many distances. This correspondence discusses the applicability of two additional rules for a further reduction of the number of distance computations. Experimental results using samples from bivariate Gaussian and uniform distributions suggest that the number of distance computations required by the modified is typicaly one fourth of that of the Fukunaga-Narendra algorithm.  相似文献   

15.
16.
Apriori算法是关联规则挖掘中最经典的算法之一,其核心问题是频繁项集的获取。针对经典Apriori算法存在的需多次遍历事务数据库及需产生候选项集等问题,首先通过转换存储结构、消除候选集产生过程等方法对Apriori算法进行优化,同时,随着大数据时代的到来,数据量与日俱增,传统算法面临巨大挑战,因此,又将优化的Apriori与Spark相结合,充分利用Spark的内存计算、弹性分布式数据集等优势,提出了IABS(Improved Apriori algorithm based on Spark)。通过与已有的同类算法进行比较,IABS的数据可扩展性和节点可扩展性得以验证,并且在多种数据集上平均获得了23.88%的性能提升,尤其随着数据量的增长,性能提升更加明显。  相似文献   

17.
High-speed electronic sorting networks are difficult to implement with VLSI technology because of the dense and global connectivity required. Optics eliminates this bottleneck by offering global interconnections, massive parallelism, and noninterfering communications. We present a parallel sorting algorithm and its efficient optical implementation using currently available optical hardware. The algorithm sorts n data elements in a few steps, independent of the number of elements to be sorted. Thus, it is a constant-time sorting algorithm, that is, O(1) time  相似文献   

18.
With the popularity of parallel database machines based on the shared-nothing architecture, it has become important to find external sorting algorithms which lead to a load-balanced computation, i.e., balanced execution, communication and output. If during the course of the sorting algorithm each processor is equally loaded, parallelism is fully exploited. Similarly, balanced communication will not congest the network traffic. Since sorting can be used to support a number of other relational operations (joins, duplicate elimination, building indexes etc.) data skew produced by sorting can further lead to execution skew at later stages of these operations. In this paper we present a load-balanced parallel sorting algorithm for shared-nothing architectures. It is a multiple-input multiple-output algorithm with four stages, based on a generalization of Batcher's odd-even merge. At each stage then keys are evenly distributed among thep processors (i.e., there is no final sequential merge phase) and the distribution of keys between stages ensures against network congestion. There is no assumption made on the key distribution and the algorithm performs equally well in the presence of duplicate keys. Hence our approach always guarantees its performance, as long asn is greater thanp 3, which is the case of interest for sorting large relations. In addition, processors can be added incrementally. Recommended by: Patrick Valduriez  相似文献   

19.
We generalize the well-known odd-even merge sorting algorithm, originally due to Batcher (1968), and show how this generalized algorithm can be applied to sorting on product networks. If G is an arbitrary factor graph with N nodes, its r-dimensional product contains Nr nodes. Our algorithm sorts Nr keys stored in the r-dimensional product of G in O(rrF(N)) time, where F(N) depends on G. We show that, for any factor graph G, F(N) is, at most, O(N), establishing an upper bound of O(r2 N) for the time complexity of sorting Nr keys on any product network. For product networks with bounded r(e.g. for grids), this leads to the asymptotic complexity of O(N) to sort Nr keys, which is optimal for several instances of product networks. There are factor graphs for which F(N)=O(log2 N), which leads to the asymptotic running time of O(log2 N) to sort Nr keys. For networks with bounded N (e.g. in the hypercube N=2, fixed), the asymptotic complexity becomes O(r2). We show how to apply the algorithm to several cases of well-known product networks, as well as others introduced recently. We compare the performance of our algorithm to well-known algorithms developed specifically for these networks, as well as others. The result of these comparisons led us to conjecture that the proposed algorithm is probably the best deterministic algorithm that can be found in terms of the low asymptotic complexity with a small constant  相似文献   

20.
为了解决整数小波变换与传统零树编码(EZW)算法相结合产生的量化阈值的选取问题,有人提出了基于整数平方量化阈值的零树编码(ISZW)算法。但是, 由于ISZW使用连续的整数平方作为量化阈值, 缩短了相邻阈值间的距离,却增加了编码的次数,降低了编码速度。为此设计了基于整数小波变换的零树编码的多位平面并行算法, 其中每个位平面的编码仅需对位平面进行一遍扫描,大大提高了ISZW的编码速度。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号